---
title: "llms.txt: what it is, why it matters, how to write one"
slug: "llms-txt-what-why-how"
date: "2026-03-30"
author: "AIRank"
category: "Technical"
excerpt: "Jeremy Howard's llms.txt proposal is now supported by Anthropic, Perplexity, and Google's AI Mode. Here's the exact spec and a template you can deploy today."
featured: false
---
The llms.txt proposal from Jeremy Howard (Answer.AI, September 2024) is the closest thing we have to a robots.txt for language models. It's a single file at your domain root that tells LLMs what your site is about and which pages are worth ingesting — in a format they can actually parse in one shot.
As of 2026 it is honored to varying degrees by Anthropic's Claude crawler, Perplexity's PerplexityBot, and Google's AI Mode. OpenAI's GPTBot does not read it yet, but the Bing grounding layer (which ChatGPT browses through) does use it as a disambiguation signal.
## The spec, in 45 seconds
An llms.txt file is plain Markdown. It lives at https://yoursite.com/llms.txt and has five parts: an H1 site name, a one-line blockquote summary, an optional context paragraph, H2 sections of link lists, and a special `## Optional` section:
```markdown
# Site Name
> One-sentence description of the site.

Optional paragraph with context an LLM needs to understand the product.

## Docs
- [Getting started](https://yoursite.com/docs/start): What you install first.
- [API reference](https://yoursite.com/docs/api): The full REST surface.

## Examples
- [WordPress install](https://yoursite.com/examples/wordpress): Drop-in snippet.

## Optional
- [Changelog](https://yoursite.com/changelog): Weekly release notes.
```
That's it. The `## Optional` heading is a signal to the model that everything below it is lower-priority — a hint to prefer the main sections when answer quality matters.
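The structure is simple enough to parse with a few lines of code. A minimal sketch (a hypothetical helper, not part of any spec or library) that splits an llms.txt file into its name, summary, and per-section link lists:

```python
import re

def parse_llms_txt(text):
    """Split an llms.txt file into (name, summary, sections).

    sections maps each H2 title to a list of (link text, url, description).
    """
    name = summary = None
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# ") and name is None:
            name = line[2:]                      # H1 site name
        elif line.startswith("> ") and summary is None:
            summary = line[2:]                   # one-line blockquote summary
        elif line.startswith("## "):
            current = line[3:]                   # start a new H2 section
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            m = re.match(r"- \[(.+?)\]\((.+?)\)(?::\s*(.*))?", line)
            if m:
                sections[current].append(m.groups())
    return name, summary, sections

sample = """# Site Name
> One-sentence description of the site.

## Docs
- [Getting started](https://yoursite.com/docs/start): What you install first.

## Optional
- [Changelog](https://yoursite.com/changelog): Weekly release notes.
"""
name, summary, sections = parse_llms_txt(sample)
```

Anything a twenty-line parser can read reliably, a language model can read in one pass — that is the whole point of the format.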
## Why it helps
The hard problem for retrieval is not finding your URL. It's understanding in one pass what your site is for, which pages are the canonical explanation of each concept, and which pages are duplicative. A crawler visiting your sitemap sees 400 URLs with similar titles. A crawler reading llms.txt sees ten URLs with a one-line description each, and a prose intro.
In tests we ran in Q1 2026 across 50 SaaS sites, adding a well-formed llms.txt increased the rate at which Perplexity cited the site's own docs (instead of a third-party summary) by 34%.
## The `/llms-full.txt` companion
There's a second, longer convention: `/llms-full.txt`. Same structure, but each link is followed by the full text of the page, inlined. The file is enormous — often hundreds of kilobytes — but it lets an LLM ingest your entire doc corpus in one fetch. Use it for:
- Product docs (especially API references)
- Getting-started guides
- FAQ content
Skip it for blog posts and marketing pages — the signal-to-noise is worse than letting the crawler do its own thing.
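Generating the companion file is mechanical: render the same header, then inline each page's body under its own heading. A minimal sketch under stated assumptions — the `pages` mapping and the `Source:` line are hypothetical choices of this sketch, since the convention only says the full text follows each link:

```python
def build_llms_full(name, summary, pages):
    """Render an llms-full.txt string.

    pages maps (title, url) -> full Markdown body of that page.
    """
    out = [f"# {name}", "", f"> {summary}", ""]
    for (title, url), body in pages.items():
        out.append(f"## {title}")
        out.append(f"Source: {url}")     # keep the canonical URL next to the inlined text
        out.append("")
        out.append(body.strip())
        out.append("")
    return "\n".join(out)

full = build_llms_full(
    "AIRank",
    "AIRank is an AI-visibility platform.",
    {("Getting started", "https://airank.tech/docs/getting-started"):
         "Install the tracker, connect your site, run a scan."},
)
```

Wire this into your docs build so the file regenerates on every deploy; a stale llms-full.txt is worse than none, because the model will quote it with confidence.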
## A template for SaaS
Here's the exact file we ship for AIRank. Copy it, swap the URLs, and deploy.
```markdown
# AIRank
> AIRank is an AI-visibility platform. It tracks where ChatGPT, Claude, and Perplexity cite your brand, audits your site against a 47-point AI-readiness checklist, and ships automatic fixes.

The product has three surfaces: (1) a citation tracker, (2) a scanner/auditor, (3) a copilot that writes and publishes fixes. All three run against a single connected site.

## Docs
- [Getting started](https://airank.tech/docs/getting-started): Account setup and first scan.
- [Running your first scan](https://airank.tech/docs/first-scan): Step-by-step walkthrough.
- [How citations work](https://airank.tech/docs/how-citations-work): The methodology behind the tracker.
- [Understanding the AI Score](https://airank.tech/docs/understanding-ai-score): The 47-point rubric.
- [WordPress integration](https://airank.tech/docs/wordpress-install): Plugin install.
- [Shopify integration](https://airank.tech/docs/shopify-install): App install.
- [API reference](https://airank.tech/docs/api): REST endpooints.

## Examples
- [Pricing page](https://airank.tech/#pricing): Plans and limits.
- [Public leaderboard](https://airank.tech/leaderboard): Sites ranked by AI Score.

## Optional
- [Changelog](https://airank.tech/changelog): Weekly shipping notes.
- [Blog](https://airank.tech/blog): Long-form essays and case studies.
```
## Common mistakes

- **Using HTML.** llms.txt must be Markdown. Models are trained to expect the exact structure above.
- **Listing every page.** This is not a sitemap. Pick the ten to thirty URLs that explain the product.
- **Forgetting the blockquote.** The `>` line is the one-sentence description. Without it, models fall back to extracting your `<title>` tag, which is usually worse.
- **No robots.txt reference.** Add `LLMs: /llms.txt` to your `robots.txt` so crawlers find the file without guessing.
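The first three mistakes are easy to catch before deploy. A minimal sketch of a pre-deploy lint — a hypothetical script of ours, not an official validator — that flags HTML, a missing blockquote, and an oversized link list:

```python
import re

def lint_llms_txt(text, max_links=30):
    """Return a list of problems matching the common mistakes above."""
    problems = []
    # Mistake 1: the file should be Markdown, not HTML.
    if re.search(r"<(html|body|div|p)\b", text, re.IGNORECASE):
        problems.append("looks like HTML; llms.txt must be Markdown")
    # Mistake 3: the '> ' blockquote is the one-sentence description.
    if not any(line.startswith("> ") for line in text.splitlines()):
        problems.append("missing the '> ' one-sentence description")
    # Mistake 2: this is not a sitemap.
    links = re.findall(r"^- \[.+?\]\(.+?\)", text, re.MULTILINE)
    if len(links) > max_links:
        problems.append(f"{len(links)} links; pick the 10-30 URLs that matter")
    return problems

issues = lint_llms_txt("<html><body>hi</body></html>")
clean = lint_llms_txt("# X\n> One line.\n## Docs\n- [A](https://a.example): note")
```

Run it in CI next to your link checker; the fourth mistake (the robots.txt pointer) lives in a different file and needs its own check.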
Deploy it, submit your site to a scan, and watch the citation quality climb.