Most websites are invisible to AI. Not because they lack content, but because they never bothered to introduce themselves. llms.txt fixes that in under 30 minutes.
Here is the problem: when someone asks ChatGPT, Claude, or Perplexity about your industry, the AI does not see your beautiful homepage. It does not parse your navigation menu, read your cookie banner, or admire your hero animation. It needs plain text. Structured. Concise. Machine-readable. And right now, the vast majority of websites offer none of that. They are built for browsers, not for brains.
llms.txt is the antidote -- a single Markdown file at your domain root that tells language models exactly who you are, what you do, and where to find your most important pages. Think of it as robots.txt for AI, except instead of telling crawlers what to avoid, it tells them what to prioritize. This guide covers the full specification, real-world implementation code for 4 frameworks, and the strategic reasoning behind why companies like Anthropic, Stripe, and Cloudflare have already adopted it as part of their AI discoverability strategy.
Why llms.txt Exists
Google crawls HTML. LLMs consume context windows. These are fundamentally different operations, and treating them the same is why most businesses are losing ground in AI-powered search.
Traditional search engines index pages using backlinks, page speed, and structured data. An LLM does none of that. When Claude needs to understand your business, it needs concise, structured, plain-text information it can ingest in a single pass -- ideally under 4,000 tokens. Your JavaScript-rendered SPA with 47 navigation links and a sticky cookie consent banner? That is noise, not signal.
Jeremy Howard (co-founder of Answer.AI and fast.ai) proposed the llms.txt specification in late 2024 to solve exactly this mismatch. The spec lives at llmstxt.org, and adoption has been faster than anyone predicted. Within 12 months, Anthropic, Cloudflare, Stripe, Mintlify, and hundreds of developer-facing companies published their own llms.txt files. The companies building AI are already optimizing for AI discovery. The question is whether your company will catch up before it matters.
The llms.txt Specification
The specification is deliberately minimal -- 5 sections, all optional except the title. That simplicity is the point. An LLM can parse the entire file in milliseconds.
| Section | Required | Purpose |
|---|---|---|
| `# Title` (H1) | Yes | The name of your project or site |
| Blockquote summary | No | A short paragraph summarizing the site |
| Body text | No | Additional context, details, or descriptions |
| `## Section Name` (H2s) | No | Grouped lists of links to key resources |
| `## Optional` | No | Secondary links that LLMs can skip under tight context limits |
- Markdown is not arbitrary. LLMs are trained on billions of tokens of Markdown, and they parse it more reliably than XML or JSON for semi-structured content. The format is the feature.
- Each link follows a strict pattern: `- [Page Title](url): Brief description`. That colon-separated description is what helps the model decide relevance without clicking through.
- The `## Optional` heading is a priority signal. Everything below it can be dropped when token budgets are tight -- giving you explicit control over what an LLM reads first and what it skips.
- The specification also defines `llms-full.txt` -- a companion file containing your entire documentation in a single Markdown document, designed for direct pasting into AI tools.
A Complete llms.txt Example
Here is a production-ready llms.txt for a fictional SaaS company. Notice the structure: identity first, core resources second, company context third, expendable content last.
```markdown
# Launchpad Analytics

> Launchpad Analytics is a product analytics platform for SaaS companies. We help teams track user behavior, measure feature adoption, and reduce churn through real-time dashboards and automated insights.

## Documentation

- [Getting Started](https://launchpad.example/docs/getting-started): Quick start guide for new users
- [API Reference](https://launchpad.example/docs/api): REST API endpoints, authentication, and rate limits
- [SDKs](https://launchpad.example/docs/sdks): Client libraries for JavaScript, Python, Ruby, and Go
- [Webhooks](https://launchpad.example/docs/webhooks): Event-driven integrations and payload formats

## Guides

- [Tracking Events](https://launchpad.example/guides/tracking-events): How to instrument your app for event tracking
- [Building Dashboards](https://launchpad.example/guides/dashboards): Create custom dashboards and reports
- [Cohort Analysis](https://launchpad.example/guides/cohorts): Group users by behavior and measure retention
- [Churn Prediction](https://launchpad.example/guides/churn): Use our ML-powered churn prediction model

## Company

- [About Us](https://launchpad.example/about): Our mission and team
- [Pricing](https://launchpad.example/pricing): Plans and pricing details
- [Blog](https://launchpad.example/blog): Product updates and analytics best practices
- [Contact](https://launchpad.example/contact): Sales and support inquiries

## Optional

- [Changelog](https://launchpad.example/changelog): Version history and release notes
- [Status Page](https://status.launchpad.example): System uptime and incident reports
- [Community Forum](https://community.launchpad.example): User discussions and feature requests
```
The entire file is about 30 lines. An LLM can read it in roughly 400 tokens -- less than 1% of most models' context windows. Yet it contains enough structured information for that model to accurately describe Launchpad Analytics, link to the right documentation page, and distinguish the product from competitors. Density is the design principle. The `## Optional` section lets the model intelligently shed weight when context is scarce.
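To sanity-check that your own file stays within a single-pass budget, you can approximate token counts with the common rule of thumb of roughly four characters per token. This is an illustrative sketch, not the exact tokenizer any particular model uses; the 4,000-token budget is the target mentioned earlier in this article:

```python
# Rough token-budget check for an llms.txt file.
# Uses the ~4 characters-per-token heuristic; a real tokenizer would
# give exact counts for a specific model.

def estimate_tokens(text: str) -> int:
    """Approximate token count using the ~4 chars/token rule of thumb."""
    return max(1, len(text) // 4)

def check_budget(text: str, budget: int = 4000) -> bool:
    """Return True if the file fits comfortably in a single-pass budget."""
    return estimate_tokens(text) <= budget

sample = "# Launchpad Analytics\n\n> Product analytics for SaaS companies.\n"
print(estimate_tokens(sample), check_budget(sample))
```

If your file fails this check, the fastest fix is usually moving low-priority links below `## Optional` or into `llms-full.txt`.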
Why llms.txt Matters for AI Discoverability
Gartner projected that traditional search engine volume would drop 25% by 2026 as users shift to AI assistants. We are now living in that prediction. Every week, millions of product recommendations, vendor comparisons, and "best tool for X" answers flow through ChatGPT, Claude, and Perplexity -- and those models can only recommend what they can read.
If an LLM cannot parse your site, you do not exist in AI-powered search. That is not hyperbole. It is the new default.
Here is what llms.txt specifically changes:
- •Accurate AI recommendations. When someone asks Claude "What tools exist for SaaS analytics?", a model that has ingested your llms.txt returns a sourced, specific answer instead of hallucinating your feature set or omitting you entirely. Inaccurate AI answers are worse than no mention at all -- they erode trust permanently.
- •One-request documentation indexing. Instead of forcing an LLM to crawl 50+ pages, llms.txt provides a structured map of your most important resources in a single HTTP request. For context-limited agents, this is the difference between understanding you and skipping you.
- •Narrative control. Without llms.txt, an LLM stitches your story together from whatever fragments it finds -- outdated blog posts, third-party reviews, competitor comparison pages. With llms.txt, you write the authoritative summary. You decide what matters.
- •SEO amplification, not replacement. llms.txt does not compete with structured data like JSON-LD or a properly configured robots.txt and sitemap. It is a new layer that makes those investments work harder by giving AI models a direct path to your content.
What the Clarvia GEO Checker Checks
Our GEO Checker evaluates your website's readiness for LLM consumption across multiple dimensions. For llms.txt specifically, we run 6 checks that surface problems most teams miss:
- Presence -- Does your site serve a file at `/llms.txt`? (You would be surprised how many teams create one locally and forget to deploy it.)
- Valid format -- Does it follow the specification? Is there an H1 title? Are links formatted correctly with colon-separated descriptions?
- Content completeness -- Does it cover your core pages, documentation, product information, and contact details? A 3-line llms.txt is barely better than none.
- Link accuracy -- Do all the URLs in your llms.txt actually resolve to live pages? We have seen files with 40% broken links -- actively harmful to AI trust.
- Optional section usage -- Are you using the `## Optional` section to help LLMs prioritize under token constraints?
- Companion file -- Do you also provide an `llms-full.txt` for complete documentation ingestion?
The audit also evaluates how your llms.txt works alongside your structured data, robots.txt configuration, sitemap, and overall content architecture to produce a comprehensive AI discoverability score.
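A minimal version of the format checks above can be sketched in a few lines of Python. This is an illustrative validator, not the actual GEO Checker -- the rules it enforces (H1 present, `- [Title](url): description` link lines, an `## Optional` section) come from the specification summary earlier in this article:

```python
import re

# Illustrative llms.txt format checks (a sketch, not the GEO Checker):
# verify an H1 title exists and that link lines follow the
# "- [Title](url): description" pattern from the spec.
LINK_RE = re.compile(r'^- \[[^\]]+\]\([^)]+\): .+')

def validate_llms_txt(text: str) -> list[str]:
    """Return a list of problems found; an empty list means the basics pass."""
    problems = []
    lines = text.splitlines()
    if not any(line.startswith('# ') for line in lines):
        problems.append('missing H1 title')
    link_lines = [line for line in lines if line.startswith('- ')]
    if not link_lines:
        problems.append('no link lists found')
    for line in link_lines:
        if not LINK_RE.match(line):
            problems.append(f'malformed link line: {line!r}')
    if '## Optional' not in text:
        problems.append('no ## Optional section (recommended, not required)')
    return problems
```

Running a check like this in CI catches the most common failure mode: a file that was valid at launch and quietly drifted out of spec as pages were renamed.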
How to Implement llms.txt
Implementation takes 15-30 minutes regardless of your stack. Here are production-ready examples for the 4 most common setups.
Static File (Any Web Server)
The fastest path: drop a file called llms.txt in your site's public root directory.
```markdown
# Your Company Name

> A brief description of what your company does and what visitors will find on this site.

## Documentation

- [Getting Started](https://yoursite.com/docs/start): Setup guide for new users
- [API Reference](https://yoursite.com/docs/api): Complete API documentation

## Company

- [About](https://yoursite.com/about): Company overview and contact details
```
For Nginx, ensure your configuration serves it with the correct content type:
```nginx
location = /llms.txt {
    root /var/www/html;
    default_type text/plain;
}
```
Express.js (Node.js)
Static files go stale. If your blog updates weekly or your docs ship with every release, generate llms.txt dynamically so it always reflects reality:
```javascript
const express = require('express');
const app = express();

// Placeholder: pull recent posts from your CMS or database
function getBlogPosts() {
  return [{ title: 'Post title', excerpt: 'One-line summary' }];
}

app.get('/llms.txt', (req, res) => {
  const content = [
    '# Your Company Name',
    '',
    '> A brief description of your company and what this site offers.',
    '',
    '## Documentation',
    '',
    '- [API Reference](https://yoursite.com/docs/api): REST API endpoints and authentication',
    '- [SDK Guide](https://yoursite.com/docs/sdks): Client library installation and usage',
    '- [Webhooks](https://yoursite.com/docs/webhooks): Event payloads and configuration',
    '',
    '## Blog',
    '',
    ...getBlogPosts().map((post) => `- ${post.title}: ${post.excerpt}`),
    '',
    '## Company',
    '',
    '- [About](https://yoursite.com/about): Our mission and team',
    '- [Pricing](https://yoursite.com/pricing): Plans starting at $29/month',
    '- [Contact](https://yoursite.com/contact): Sales and support',
  ].join('\n');

  res.set('Content-Type', 'text/plain; charset=utf-8');
  res.set('Cache-Control', 'public, max-age=86400');
  res.send(content);
});

app.listen(3000);
```
Python (Flask)
```python
from flask import Flask, Response

app = Flask(__name__)

@app.route('/llms.txt')
def llms_txt():
    content = """# Your Company Name

> A brief description of your company and what this site offers.

## Documentation

- [API Reference](https://yoursite.com/docs/api): REST API endpoints and authentication
- [SDK Guide](https://yoursite.com/docs/sdks): Client library installation and usage

## Products

- [Analytics Dashboard](https://yoursite.com/products/analytics): Real-time product analytics
- [User Insights](https://yoursite.com/products/insights): Behavioral analysis tools

## Company

- [About](https://yoursite.com/about): Our mission and team

## Optional

- [Changelog](https://yoursite.com/changelog): Release notes and version history
- [Status](https://status.yoursite.com): System uptime monitoring
"""
    return Response(content, mimetype='text/plain')
```
Next.js (App Router)
For Next.js projects using the App Router, create a route handler:
```typescript
// app/llms.txt/route.ts
export async function GET() {
  const content = `# Your Company Name

> A brief description of your company and what this site offers.

## Documentation

- [Getting Started](https://yoursite.com/docs/start): Quick start guide
- [API Reference](https://yoursite.com/docs/api): Complete API documentation

## Company

- [About](https://yoursite.com/about): Our mission and team
`;

  return new Response(content, {
    headers: {
      'Content-Type': 'text/plain; charset=utf-8',
      'Cache-Control': 'public, max-age=86400',
    },
  });
}
```
Best Practices
We have implemented llms.txt across dozens of client sites. These 6 patterns separate files that actually improve AI discoverability from ones that sit there doing nothing:
- Ruthless brevity wins. If your llms.txt exceeds 200 lines, you are writing llms-full.txt by accident. The whole point is a quick overview. An LLM that has to scan 2,000 lines of links will extract less value than one that reads 40 carefully chosen ones.
- Stale is worse than absent. An llms.txt full of broken links or discontinued products actively poisons AI responses about your business. Every time your site changes -- new product, deprecated feature, restructured docs -- update the file. Put it in your deploy checklist.
- Descriptions do the heavy lifting. Do not just link to "Docs." The text after the colon is what an LLM uses to determine relevance without making an HTTP request. Be specific: "REST API endpoints, authentication, and rate limits" beats "API documentation" every time.
- Cache, do not slam. Set `Cache-Control: public, max-age=86400` (24 hours). AI crawlers can be aggressive -- Anthropic's ClaudeBot, OpenAI's GPTBot, and others may request your file multiple times per day. Caching keeps your server happy.
- Validate every URL. Broken links erode trust for both LLMs and the humans who audit your work. Run a link checker against your llms.txt monthly, minimum.
- Layer your signals. llms.txt is most powerful when combined with JSON-LD structured data on your pages and a properly configured robots.txt that allows AI crawlers. Each layer answers a different question an AI model asks about your business. Together, they build a complete picture.
llms.txt vs. llms-full.txt
The specification defines two complementary files with very different jobs:
| | llms.txt | llms-full.txt |
|---|---|---|
| Purpose | Navigation map -- tells LLMs what exists and where | Complete content -- gives LLMs everything in one file |
| Length | Concise (typically under 200 lines) | Comprehensive (can be thousands of lines) |
| Use case | Quick lookup, recommendations, understanding your site | Deep research, full documentation ingestion |
| Required | Recommended | Optional |
Get Your Site Audited
llms.txt is one layer of AI discoverability. A critical one, but still just one. Your structured data, crawl configuration, content structure, and schema markup all determine whether AI models can accurately represent your business -- or whether they ignore you entirely.
Run our free GEO Checker to get a comprehensive score across every dimension that matters.
Need hands-on help implementing llms.txt or building a full AI discoverability strategy? Get in touch with our team. The window between early adopters and everyone else is closing fast. The businesses that make themselves readable to AI today will own the recommendations tomorrow.
