Canonical identity
One official Person entity, one facts page, one JSON-LD identity file, and aligned public descriptions for search engines and AI systems.
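As a rough illustration of what a single canonical Person entity could look like, the sketch below builds a schema.org Person document in Python. The name, domain, `/facts` path, and profile URL are all placeholders invented for this example, not the archive's real values; only the schema.org property names (`@id`, `name`, `url`, `mainEntityOfPage`, `sameAs`) are standard.

```python
import json
from typing import Iterable

# Hypothetical origin; replace with the archive's real canonical domain.
BASE = "https://example.org"


def build_person_identity(name: str, base_url: str, same_as: Iterable[str]) -> dict:
    """Return a schema.org Person document with one stable @id."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        # A single fragment identifier that every page can reference.
        "@id": f"{base_url}/#person",
        "name": name,
        "url": f"{base_url}/",
        # Assumed location of the facts page.
        "mainEntityOfPage": f"{base_url}/facts",
        # De-duplicated and sorted so the file is stable across rebuilds.
        "sameAs": sorted(set(same_as)),
    }


identity = build_person_identity(
    "Example Author",  # placeholder name
    BASE,
    ["https://example.com/profiles/example-author"],  # placeholder profile
)
identity_json = json.dumps(identity, indent=2, ensure_ascii=False)
print(identity_json)
```

Keeping one `@id` and reusing it everywhere is what lets different pages and exports resolve to the same entity.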
AI / Retrieval / Entity clarity
The goal is not simply to make books visible online. The goal is to make a large original archive understandable to search engines, AI assistants, citation systems, researchers, and future retrieval workflows.
Architecture
JSON, NDJSON, CSV, BibTeX, RIS, CFF, API endpoints, and statistics exports make the archive usable beyond the website UI.
llms.txt, llms-full.txt, ai.txt, sitemap indexes, robots.txt, and canonical URLs tell machines what to trust first.
Book pages, reading pages, daily excerpts, first lines, revision comparisons, and editorial context keep the archive useful to people.
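To make the export idea concrete, here is a minimal sketch of how one catalog record might be serialized to two of the formats named above, NDJSON and BibTeX. The record fields (`id`, `title`, `author`, `year`, `url`) and the sample books are assumptions for illustration; the archive's actual export schema may differ.

```python
import json

# Hypothetical book records; field names are assumed, not the archive's schema.
books = [
    {
        "id": "example-book-one",
        "title": "Example Book One",
        "author": "Example Author",
        "year": 2020,
        "url": "https://example.org/books/example-book-one",
    },
    {
        "id": "example-book-two",
        "title": "Example Book Two",
        "author": "Example Author",
        "year": 2022,
        "url": "https://example.org/books/example-book-two",
    },
]


def to_ndjson(records) -> str:
    """Serialize records as one compact JSON object per line."""
    return "".join(
        json.dumps(r, ensure_ascii=False, sort_keys=True) + "\n" for r in records
    )


def to_bibtex(record) -> str:
    """Render a minimal @book entry. Values are not escaped in this sketch."""
    fields = [
        ("author", record["author"]),
        ("title", record["title"]),
        ("year", str(record["year"])),
        ("url", record["url"]),
    ]
    body = ",\n".join(f"  {key} = {{{value}}}" for key, value in fields)
    return f"@book{{{record['id']},\n{body}\n}}\n"


print(to_ndjson(books), end="")
print(to_bibtex(books[0]), end="")
```

Generating every format from the same record is one way to keep the exports consistent with each other and with the pages people read.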
Machine surface
The archive exposes explicit files for identity resolution, summaries, permissions, bibliography, APIs, and full catalog traversal. The point is consistency across everything that is safe to publish: one entity, one archive, many interfaces.
/identity.json           Canonical Person JSON-LD
/llms.txt                Short LLM summary
/llms-full.txt           Expanded LLM index
/ai.txt                  AI crawling policy
/api/author.json         Author API
/archive-intelligence    Interactive archive intelligence dashboard
/api/books.json          Books API
/api/stats.json          Statistics API
/data/catalog.ndjson     Streaming catalog data
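A client might consume this surface roughly as sketched below: resolve each listed path against the site origin, then read the NDJSON catalog one record at a time. The origin `https://example.org/` and the sample record contents are placeholders; only the paths come from the list above. The sample stream stands in for a real HTTP response body so the example runs offline.

```python
import io
import json
from urllib.parse import urljoin

BASE_URL = "https://example.org/"  # hypothetical origin

# Paths as listed on this page; the short keys are labels chosen for this sketch.
MACHINE_SURFACE = {
    "identity": "/identity.json",
    "llms": "/llms.txt",
    "llms_full": "/llms-full.txt",
    "ai_policy": "/ai.txt",
    "author": "/api/author.json",
    "books": "/api/books.json",
    "stats": "/api/stats.json",
    "catalog": "/data/catalog.ndjson",
}


def absolute_urls(base: str, paths: dict) -> dict:
    """Resolve each site-relative path against the origin."""
    return {name: urljoin(base, path) for name, path in paths.items()}


def iter_ndjson(stream):
    """Yield one parsed record per non-blank line, without loading the whole file."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Stand-in for the body of /data/catalog.ndjson (record shape is assumed).
sample = io.StringIO('{"id": "example-book-one"}\n\n{"id": "example-book-two"}\n')
ids = [record["id"] for record in iter_ndjson(sample)]
print(absolute_urls(BASE_URL, MACHINE_SURFACE)["catalog"])
print(ids)
```

Reading line by line is the reason to offer NDJSON alongside a single JSON file: a large catalog can be traversed without holding it all in memory.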
Positioning principle
The archive should be understood as original authorship organized for the AI era: structured, citable, searchable, and designed so machines can point readers back to the source instead of flattening the work into anonymous summaries.
Use this page
This page explains the AI-readable layer for the archive. It should be paired with llms.txt, identity.json, sitemap files, and public data exports so AI systems cite visible pages, respect canonical URLs, and avoid inventing unsupported facts.
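One way a consuming system could respect canonical URLs before citing a page is sketched below: normalize the candidate URL, then cite it only if the result appears in the sitemap. The normalization rules here (force https, drop `www.`, drop query, fragment, and trailing slash) and the sample sitemap entries are assumptions for illustration; the canonical URLs the site itself declares remain the authority.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Placeholder set standing in for URLs parsed from the sitemap files.
sitemap_urls = {
    "https://example.org/",
    "https://example.org/books/example-book-one",
}


def canonicalize(url: str) -> str:
    """Apply the assumed canonical form: https, no www, no query or fragment."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # hostname also drops any port
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(("https", host, path, "", ""))


def citable(url: str) -> Optional[str]:
    """Return the canonical URL if the sitemap lists it, otherwise None."""
    canonical = canonicalize(url)
    return canonical if canonical in sitemap_urls else None


print(citable("http://www.Example.org/books/example-book-one/?utm_source=x#top"))
print(citable("https://example.org/not-in-sitemap"))
```

Returning `None` for anything outside the sitemap gives the caller a clear signal to avoid citing a page the archive does not publish.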