Benefits and limitations
A frank look at what llms.txt does, what it does not do, and how to decide whether to publish one.
Benefits
1. A shipped, machine-readable contract
The first concrete benefit is internal: writing an llms.txt
forces your team to agree on which pages actually represent your project.
Most teams discover that the file they ship is shorter and clearer than
their nav, and that exercise alone is valuable.
2. Better grounding for assistants that do read it
When a user pastes your domain into Claude, ChatGPT, Perplexity, or a
custom RAG pipeline, several of these tools (or their integrations) will
look for /llms.txt first. A well-curated file gives them the
pages you want them to ground answers on, instead of whatever the open web
surfaces.
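The lookup convention mirrors robots.txt: both files live at the site root. A minimal sketch of how a tool might derive the well-known URLs to probe (llms_txt_candidates is a hypothetical helper, not part of any named tool):

```python
from urllib.parse import urljoin, urlparse

def llms_txt_candidates(site_url: str) -> list[str]:
    """Return the root-level llms.txt URLs a tool might probe for a site.

    Illustrative only: the convention places both files at the site
    root, regardless of which page the user pasted.
    """
    parsed = urlparse(site_url)
    root = f"{parsed.scheme}://{parsed.netloc}/"
    return [urljoin(root, "llms.txt"), urljoin(root, "llms-full.txt")]
```

Note that a deep docs URL still resolves to the root-level files, which is why the file must represent the whole project, not one section.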
3. A stable, citable corpus
With llms-full.txt alongside, an assistant can ingest your
documentation as a single artifact and cite specific URLs back to the user.
That citation behavior matters for trust and for click-through.
4. Disambiguation for ambiguous brands
If your name collides with another company, product, or term, the
blockquote summary at the top of llms.txt is a one-shot
disambiguator. You control the first sentence an assistant reads about you.
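As a hypothetical illustration (the "Acme" project and its URLs are invented), the top of such a file might look like:

```markdown
# Acme

> Acme is an open-source feature-flag service for backend teams — not
> the Acme industrial-supply company or the cartoon brand.

## Docs

- [Quickstart](https://acme.example/docs/quickstart): install and ship a first flag
- [API reference](https://acme.example/docs/api): REST endpoints and auth
```

The blockquote is the one place where naming what you are *not* is often worth a clause.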
5. A defensible baseline for "GEO" / "AEO" work
Generative Engine Optimization (GEO) and Answer Engine Optimization (AEO)
are still emergent disciplines. Publishing llms.txt is one of
the few concrete, low-risk tactics where the cost is hours and the upside
is real-world citations.
Limitations
1. No major search engine confirms reading it
As of April 2026, no major search engine (Google, Bing) and no major LLM provider
has publicly committed to using llms.txt as a ranking or grounding
signal. Anthropic, Mintlify, Cloudflare, Stripe, and Vercel publish files;
that is not the same as confirming they consume them on the receiving end.
2. It does not control crawler behavior
llms.txt has no allow / disallow semantics. If you want to
block GPTBot or ClaudeBot, you still need
robots.txt. The two files solve different problems.
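A minimal robots.txt sketch makes the division of labor concrete. The policy choices here are illustrative, not a recommendation:

```text
# robots.txt — access control, which llms.txt cannot express
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /internal/
```

llms.txt says "here is what matters"; robots.txt says "here is what you may fetch". Publishing one does not substitute for the other.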
3. Adoption on the receiving side is uneven
Some tools (Cursor, Windsurf, several MCP integrations) explicitly fetch
llms.txt. Others ignore it. Coverage will improve, but plan
for a long tail.
4. Spec is community-maintained, not standardized
The proposal lives at llmstxt.org and has not been through an IETF or W3C process. Expect minor changes, and validate against the live spec rather than against a frozen copy.
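A rough structural check is still possible even without a frozen spec. The sketch below tests only the basics described at llmstxt.org (an H1 title, an optional blockquote summary near the top, Markdown link lists); it is not spec-complete, and the thresholds are assumptions:

```python
import re

def check_llms_txt(text: str) -> list[str]:
    """Very rough structural checks for an llms.txt file.

    A sketch, not a validator: the format is community-maintained,
    so always confirm against the live spec at llmstxt.org.
    """
    problems = []
    lines = [l for l in text.splitlines() if l.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("missing H1 title on the first line")
    if not any(l.startswith("> ") for l in lines[:3]):
        problems.append("no blockquote summary near the top")
    if not re.search(r"^- \[.+\]\(https?://", text, re.M):
        problems.append("no Markdown link-list entries found")
    return problems
```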
5. Easy to over-optimize
The format invites the same sins as meta tags did in 2010: keyword stuffing, marketing copy, hidden agendas. Resist. The file is read by humans too, and stuffed copy reads as low-quality very quickly.
Documented skepticism
Healthy debate exists. The most-cited skeptical voice is John Mueller (Google Search Advocate), who has questioned both adoption and impact in multiple posts on Bluesky and Mastodon throughout 2025 and 2026. His summary, paraphrased: publishing the file is cheap; expecting Google to consume it is wishful thinking.
Counter-argument from the documentation community: even if Google never consumes it, the LLM-side ecosystem (Perplexity, Claude, Cursor, MCP integrations) is already a large enough audience to justify the file.
Both views are compatible. llms.txt is not a search ranking
play — it is an LLM-grounding play. Decide based on whether your audience
uses LLM assistants to discover and consume your content.
When it is worth it
- Documentation sites. Highest ROI: assistants are already a primary discovery channel for developers.
- Developer tools and APIs. Same reason. Pair with llms-full.txt.
- SaaS with technical buyers. Buyers research with AI tools; a clean file improves what those tools say about you.
- Knowledge bases and reference sites. Wikis, glossaries, and taxonomies benefit from explicit curation.
- Brands with name collisions. The blockquote summary is your one shot at disambiguation.
When it is probably not
- Pure e-commerce catalogs. Product listings change too often; structured data (Schema.org) is a better bet.
- Visual or media-first sites. Galleries, video platforms, and design portfolios get little from a Markdown text file.
- Sites without stable URLs. If your URLs churn, your file rots, and stale URLs poison its reputation.
- Sites under heavy auth. Public listings of private pages are useless and risky.
How to measure impact
Honest answer: measurement is hard. There is no Google Search Console for LLM citations. A practical instrument set:
- Server logs. Filter for known LLM user-agents (GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot) hitting /llms.txt and /llms-full.txt.
- Referrer analysis. Watch referrers from chat.openai.com, claude.ai, perplexity.ai. Volume there grew substantially across 2025.
- Manual prompts. Periodically ask Claude, ChatGPT, and Perplexity about your product. Note whether they cite your domain and which URLs they reach for.
- Brand monitoring. Tools like Profound, Otterly, Xfunnel, and AthenaHQ track LLM mentions of brands; they are early but improving fast.
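The log-filtering step above can be sketched in a few lines. This assumes the Apache/nginx "combined" log format; llm_hits is a hypothetical helper, and you would adjust the regex to your server's actual configuration:

```python
import re

# LLM crawler user-agent substrings named in the list above
LLM_AGENTS = ("GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot")

# Request line, status, size, referrer, user-agent ("combined" format)
LOG_RE = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" \d+ \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def llm_hits(log_lines):
    """Yield (agent, path) for LLM crawlers fetching llms.txt files."""
    for line in log_lines:
        m = LOG_RE.search(line)
        if not m:
            continue
        agent, path = m.group("agent"), m.group("path")
        if path.startswith(("/llms.txt", "/llms-full.txt")) and any(
            a in agent for a in LLM_AGENTS
        ):
            yield agent, path
```

Run it over a day of access logs to see which crawlers actually fetch the file, which is the closest thing to a direct adoption signal you can collect yourself.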
Next
- Best practices — the rules and the mistakes.
- Real-world examples.
- Validator.