Best practices

Ten rules, the mistakes we see most often, and concrete patterns for i18n, security, and CI.

Ten rules

  1. Curate. A short list of high-signal pages beats a long list of mediocre ones. Aim for 10–30 links in the root file.
  2. Use absolute URLs. Always https://yourdomain.com/.... Relative URLs are technically allowed but fragile.
  3. Group by product surface. Sections like Product, Pricing, Developers reflect how a user (and an LLM) thinks. Avoid blog/doc/guide buckets unless they map onto your real navigation.
  4. Keep the summary factual. The blockquote after the H1 should read like a Wikipedia opener, not like a landing-page hero.
  5. One sentence per item. The colon-prefixed note is for disambiguation, not for marketing.
  6. Use the Optional section sparingly. It is the right home for press, brand assets, and archives. Do not dump half your sitemap there.
  7. Mirror your stable URLs. If a page in llms.txt moves, update or redirect it. Stale URLs poison the file’s reputation.
  8. Publish llms-full.txt when content is text-heavy. Documentation, tutorials, and reference material benefit. Galleries, interactive tools, and primarily visual content do not.
  9. Run the validator in CI. A content migration that breaks your file should fail the build.
  10. Date your file. A short note like “Last reviewed 2026-04-01” in the body is helpful for both humans and crawlers.
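
Rules 1 and 2 lend themselves to a mechanical check. The sketch below is one possible way to do it, not an official tool: the function name check_links, the simplified link regex, and the default 10 to 30 bounds (taken from rule 1) are all assumptions for illustration.

```python
import re
from urllib.parse import urlparse

# Matches markdown links of the form [title](url). This is a simplification:
# it ignores nested brackets and URLs that contain spaces or parentheses.
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def check_links(text, min_links=10, max_links=30):
    """Return warnings for rule 1 (link count) and rule 2 (absolute URLs)."""
    warnings = []
    urls = [url for _title, url in LINK_RE.findall(text)]
    for url in urls:
        parts = urlparse(url)
        # An absolute URL has both a scheme and a host; require https here.
        if parts.scheme != "https" or not parts.netloc:
            warnings.append(f"not an absolute https URL: {url}")
    if not min_links <= len(urls) <= max_links:
        warnings.append(f"{len(urls)} links; aim for {min_links}-{max_links}")
    return warnings
```

Calling check_links on the text of the root file returns an empty list when both rules hold. The bounds are parameters so that a small site can loosen them.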

Common mistakes

  • No H1. The H1 is the only required element. Without it, the file is invalid.
  • Multiple H1s. Use H2s for sections. There must be exactly one H1.
  • Custom front-matter. No YAML, no JSON header. The spec is strict; clients will not parse extras.
  • Pasted markdown tables / images. Stick to a heading + blockquote + lists. Tables and images add no value to an LLM.
  • Including auth-gated URLs. If a page requires login, do not list it — the LLM will hit a wall.
  • Overlong descriptions. “The world’s most advanced AI-powered platform for next-gen synergistic transformation” helps no one. Keep notes under 15 words.
  • Listing 500 URLs. If you need that many, you need llms-full.txt, per-product variants, or both.
  • Forgetting to update robots.txt. Make sure /llms.txt is not blocked.
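
Several of the mistakes above can be caught with a small lint pass. What follows is a minimal sketch under stated assumptions: lint_structure is a hypothetical helper rather than the official validator, front matter is detected only by a leading "---" or "{" line, and tables are detected only by a leading pipe character.

```python
import re

# A list item of the form "- [title](url): note"; the note is optional.
ITEM_RE = re.compile(r"^\s*[-*]\s+\[[^\]]+\]\([^)]+\)(?::\s*(.*))?$")


def lint_structure(text, note_word_limit=15):
    """Return a list of error strings; an empty list means no issue was found."""
    errors = []
    lines = text.splitlines()
    # Heuristic: YAML front matter starts with "---", a JSON header with "{".
    if lines and (lines[0].strip() == "---" or lines[0].lstrip().startswith("{")):
        errors.append("front matter is not allowed")
    # An H1 is a single "#" followed by whitespace; "##" does not match.
    h1_count = sum(1 for line in lines if re.match(r"#\s+\S", line))
    if h1_count != 1:
        errors.append(f"expected exactly one H1, found {h1_count}")
    for number, line in enumerate(lines, start=1):
        if "![" in line:
            errors.append(f"line {number}: image")
        if line.lstrip().startswith("|"):
            errors.append(f"line {number}: table row")
        match = ITEM_RE.match(line)
        if match and match.group(1):
            words = len(match.group(1).split())
            if words >= note_word_limit:
                errors.append(f"line {number}: note has {words} words")
    return errors
```

The check does not cover auth-gated URLs or robots.txt rules, since both require network access or knowledge of the site.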

Multilingual sites

The spec is silent on internationalization. Two patterns work in practice:

  1. Single English file at the root. The simplest option. Most LLM clients can translate on the fly, which is good enough for most sites.
  2. Per-locale variants. Serve /llms.txt (default), /fr/llms.txt, /es/llms.txt. Link to them from your root file’s body or under an Optional section.

Whichever pattern you pick, do not duplicate URL sets across locales: each variant should point to the localized version of each page.
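
The no-duplication rule can be checked by comparing the URL sets of the variants. This is a rough sketch: shared_urls is a hypothetical helper, and it assumes you have already read each variant into a string keyed by its path.

```python
import re
from itertools import combinations

# Captures the URL part of markdown links of the form [title](url).
LINK_RE = re.compile(r"\[[^\]]+\]\(([^)\s]+)\)")


def shared_urls(variants):
    """Map each pair of variants to the URLs that appear in both files.

    `variants` maps a path such as "/fr/llms.txt" to that file's text.
    Pairs with no URL in common are left out of the result.
    """
    url_sets = {name: set(LINK_RE.findall(text)) for name, text in variants.items()}
    overlaps = {}
    for first, second in combinations(sorted(url_sets), 2):
        common = url_sets[first] & url_sets[second]
        if common:
            overlaps[(first, second)] = common
    return overlaps
```

A non-empty result usually means a locale file still points at the default-language page. Some overlap may be intentional, for example a page that exists in one language only, so treat the output as a review list rather than a hard failure.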

Security and privacy

  • Everything in llms.txt is public. Treat the file as broadcast.
  • Never list staging or preview URLs. They will be picked up by anything that downloads the file.
  • Do not list URLs with secrets in query strings. This sounds obvious; we have seen it happen.
  • If a page exposes user data behind auth, it does not belong here.
  • Audit the file at every release. A leaked draft URL is the most common security mistake.
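
Part of the release audit can be automated. The sketch below flags links whose host looks like a staging or preview environment and links that carry secret-like query parameters. The word list, the parameter names, and the function name audit_urls are assumptions; adjust them to your own naming conventions.

```python
import re
from urllib.parse import parse_qsl, urlparse

# Captures the URL part of markdown links of the form [title](url).
LINK_RE = re.compile(r"\[[^\]]+\]\(([^)\s]+)\)")

# Assumed naming conventions; replace with the ones your organisation uses.
SUSPECT_HOST_WORDS = ("staging", "preview", "localhost")
SECRET_PARAMS = {"token", "key", "api_key", "secret", "signature", "password"}


def audit_urls(text):
    """Return (reason, url) pairs for links that should not be published."""
    findings = []
    for url in LINK_RE.findall(text):
        parts = urlparse(url)
        host = (parts.hostname or "").lower()
        if any(word in host for word in SUSPECT_HOST_WORDS):
            findings.append(("staging-or-preview-host", url))
        # Compare parameter names only; the values are never inspected.
        names = {name.lower() for name, _value in parse_qsl(parts.query, keep_blank_values=True)}
        if names & SECRET_PARAMS:
            findings.append(("secret-like-query-param", url))
    return findings
```

A pattern check like this catches only the obvious cases. It cannot tell whether a page sits behind a login, so it complements the manual audit rather than replacing it.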

Automation in CI

Treat llms.txt like any other artifact: generate, validate, and gate releases on it.

  • Generate it from your content source (CMS, MDX collection, database).
  • Run the validator in CI; fail the build on any error.
  • Diff the file across releases; alert the docs owner on large deletions.
  • Smoke-test the production URL after deploy: curl -fsS https://yourdomain.com/llms.txt | head -1.
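
The generate and diff steps above can be sketched in a few lines. Everything named here is illustrative: render_llms_txt, removed_urls, large_deletion, and the 20% threshold are assumptions, and the input is a plain dictionary standing in for whatever your CMS or content collection returns.

```python
import re

# Captures the URL part of markdown links of the form [title](url).
LINK_RE = re.compile(r"\[[^\]]+\]\(([^)\s]+)\)")


def render_llms_txt(title, summary, sections):
    """Render the file from structured content.

    `sections` maps a section name to a list of
    (link title, absolute URL, one-sentence note) tuples.
    """
    out = [f"# {title}", "", f"> {summary}", ""]
    for name, items in sections.items():
        out.append(f"## {name}")
        out.append("")
        for link_title, url, note in items:
            out.append(f"- [{link_title}]({url}): {note}")
        out.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(out).rstrip() + "\n"


def removed_urls(old_text, new_text):
    """URLs present in the previous release but missing from the new one."""
    return set(LINK_RE.findall(old_text)) - set(LINK_RE.findall(new_text))


def large_deletion(old_text, new_text, max_fraction=0.2):
    """True when more than `max_fraction` of the old URLs were removed."""
    old = set(LINK_RE.findall(old_text))
    if not old:
        return False
    return len(removed_urls(old_text, new_text)) / len(old) > max_fraction
```

In a pipeline, the previous release's file could be fetched from production or from version control, and a True result from large_deletion could notify the docs owner instead of failing the build outright.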
