AI accessibility — your app in Markdown
Every page of a Pyxle app can serve a clean Markdown rendition of itself — so AI assistants and coding agents (Claude, ChatGPT, Cursor, Copilot, Perplexity) read your app as text instead of scraping HTML. Turn it on with one flag; decide where the Markdown comes from with a few conventions in your project.
{ "llms": true }

That single line gives you:
- Per-page Markdown at each URL with .md appended: /docs/routing → /docs/routing.md.
- Content negotiation: the same URL returns Markdown to any request that sends Accept: text/markdown. Browsers never send it, so humans are unaffected.
- An /llms.txt index, following the llms.txt convention: a Markdown map of your site.
- Discovery headers on every response, Link: </llms.txt>; rel="llms-txt" and X-Llms-Txt: /llms.txt, so an agent finds the index without parsing HTML.
This very page proves it: append .md to its URL to read the Markdown you're looking at.
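From a client's point of view, the two entry points described above can be sketched as follows. This is a minimal illustration using only the standard library; the example.com URL and the markdown_requests helper are assumptions made for the example, and the requests are built but not sent.

```python
from urllib.request import Request


def markdown_requests(page_url: str) -> tuple[Request, Request]:
    """Build the two equivalent requests for a page's Markdown (hypothetical helper)."""
    # 1. The explicit form: append ".md" to the page URL.
    by_suffix = Request(page_url + ".md")
    # 2. The negotiated form: same URL, but ask for text/markdown.
    by_accept = Request(page_url, headers={"Accept": "text/markdown"})
    return by_suffix, by_accept


suffix_req, accept_req = markdown_requests("https://example.com/docs/routing")
print(suffix_req.full_url)
print(accept_req.get_header("Accept"))
```

Passing either request to urllib.request.urlopen would perform the fetch against a running app.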
The mental model
There are two moving parts, and they're cleanly separated:
- Routing is the framework's job. When llms is enabled, Pyxle registers a .md route for every page, wires up Accept negotiation, serves /llms.txt, and adds the discovery headers. You don't configure any of that.
- Content is your job, expressed as files, not config. Where a page's Markdown comes from is resolved from your project on each request. The llms config block is just an on/off switch (plus one opt-in fallback); everything substantive lives in .md files and llms.py handlers next to your routes.
The feature is off by default and adds nothing to the normal page render path — the .md routes are separate and only run for .md (or Accept: text/markdown) requests.
Enabling it
In pyxle.config.json:
{
  "llms": {
    "enabled": true,
    "autoConvert": false
  }
}

| Key | Default | What it does |
|---|---|---|
| enabled | false | Turns the whole feature on. "llms": true is shorthand for { "enabled": true }. |
| autoConvert | false | A last-resort fallback: convert a page's rendered HTML to Markdown when no authored source exists. Off by default because it's lossy; see autoConvert. |
That is the entire configuration surface. See Configuration → AI accessibility.
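To make the shorthand concrete, here is a sketch of how the two accepted shapes of the llms value could be expanded into the full form with the documented defaults. The normalize_llms_config function is hypothetical and is not Pyxle's actual config loader.

```python
def normalize_llms_config(value) -> dict:
    """Expand an `llms` config value into its full form (illustrative only)."""
    defaults = {"enabled": False, "autoConvert": False}
    if value is None:
        # Key absent: the feature is off by default.
        return dict(defaults)
    if isinstance(value, bool):
        # "llms": true is shorthand for { "enabled": true }.
        return {**defaults, "enabled": value}
    if isinstance(value, dict):
        return {**defaults, **value}
    raise TypeError(f"unsupported llms config: {value!r}")


print(normalize_llms_config(True))
```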
How a page's Markdown is resolved
For any <page>.md request (or an Accept: text/markdown request to the page), Pyxle walks this ladder and uses the first source that returns text:
| # | Source | Scope | Best for |
|---|---|---|---|
| 1 | Co-located <page>.md file | one page | static, hand-written pages |
| 2 | to_markdown in the page's own module | one page (a catch-all covers its subtree) | pages that already load their content |
| 3 | to_markdown in the nearest ancestor llms.py | a route subtree (pages/llms.py = app-wide) | one handler for many pages |
| 4 | autoConvert (only if enabled) | any page | a rough fallback when nothing else exists |
| 5 | Redirect /<page>.md → /<page> | n/a | so a guessed .md URL never 404s |
Whatever a rung returns is then passed through the optional wrap_markdown hook before it's sent.
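The "first source that returns text, then wrap" behaviour can be modelled in a few lines. This is only a mental-model sketch under simplifying assumptions: each rung is a zero-argument callable, and wrap takes just the Markdown (the real hook also receives the context). The resolve_markdown name is hypothetical.

```python
from typing import Callable, Iterable, Optional


def resolve_markdown(
    rungs: Iterable[Callable[[], Optional[str]]],
    wrap: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Return the first non-None rung result, optionally passed through wrap."""
    for rung in rungs:
        text = rung()
        if text is None:
            continue  # this rung declined; try the next one
        if wrap is None:
            return text
        wrapped = wrap(text)
        # A wrap hook that returns None leaves the Markdown untouched.
        return text if wrapped is None else wrapped
    return None  # nothing resolved; the caller would fall back to the redirect


rungs = [
    lambda: None,                # e.g. no co-located .md file
    lambda: "# Routing\n",       # e.g. a page-local handler answers
    lambda: "# never reached\n",
]
print(resolve_markdown(rungs, wrap=lambda md: "> for agents\n\n" + md))
```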
1. A co-located .md file
Drop a Markdown file next to the page. Simplest possible option — no code:
pages/
  about.pyxl
  about.md      ← served at /about.md
  index.pyxl
  index.md      ← served at /index.md (i.e. the / page)

Best for static, content-heavy pages you'd rather write by hand than generate: a landing page, a manifesto, a pricing page.
2. A page-local to_markdown handler
Add a to_markdown function to a page's Python (server) section. It receives a MarkdownContext and returns a Markdown string — or None to defer to the next rung:
# pages/products/[id].pyxl
@server
async def load(request):
    return {"product": await get_product(request.path_params["id"])}

async def to_markdown(ctx):
    product = await get_product(ctx.request.path_params["id"])
    return f"# {product.name}\n\n{product.description}\n"

Because a catch-all page ([[...slug]].pyxl) is a single page that handles every sub-path, its to_markdown already covers the whole subtree; read ctx.request.path_params["slug"] to know which page was asked for.
3. A directory llms.py handler (covers a route subtree)
To serve many pages under a directory with one handler, put a to_markdown in an llms.py at that directory. Resolution walks from the page's own directory up to pages/, nearest ancestor first — exactly like layout.pyxl, error.pyxl, and loading.pyxl:
pages/
  llms.py          ← app-wide: to_markdown for any page below
  docs/
    llms.py        ← handles everything under /docs
    intro.pyxl
    routing.pyxl

# pages/docs/llms.py: one handler for the whole /docs subtree
import json
from pathlib import Path

DOCS = Path("public/docs-data")

def to_markdown(ctx):
    slug = ctx.path.removeprefix("/docs/")  # e.g. "guides/routing"
    page = DOCS / f"{slug}.json"
    if not page.is_file():
        return None  # decline → try a broader handler
    return json.loads(page.read_text())["markdown"]

Returning None declines and defers to the next ancestor (and ultimately to autoConvert/redirect). So a /docs handler can answer the slugs it knows and let everything else fall through: handlers compose down the tree. llms.py is also the intended home for any future per-directory AI hooks (it already hosts llms_txt and wrap_markdown at the root).
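The nearest-ancestor-first walk can be pictured as a list of candidate llms.py paths, ordered from the page's own directory up to pages/. The sketch below only computes that order; llms_candidates is a hypothetical helper, not a Pyxle API, and it assumes POSIX-style paths relative to the project root.

```python
from pathlib import PurePosixPath


def llms_candidates(page_file: str, root: str = "pages") -> list[str]:
    """List the llms.py paths to try for a page, nearest ancestor first."""
    root_path = PurePosixPath(root)
    directory = PurePosixPath(page_file).parent
    if directory != root_path and root_path not in directory.parents:
        raise ValueError(f"{page_file!r} is not under {root!r}")
    candidates = []
    while True:
        candidates.append(str(directory / "llms.py"))
        if directory == root_path:
            break  # pages/llms.py is the last (app-wide) candidate
        directory = directory.parent
    return candidates


print(llms_candidates("pages/docs/guides/routing.pyxl"))
```

A handler found at an earlier path gets the first chance to answer; if it returns None, the next path in the list is consulted.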
autoConvert (the lossy fallback)
If nothing above resolves and you've set "autoConvert": true, Pyxle renders the page and converts its HTML to Markdown with a small, dependency-free converter. It's off by default and deliberately best-effort: headings, paragraphs, lists, links, emphasis, and code survive; layout chrome, tables, and rich components may not. Treat it as "something is better than a redirect" — prefer an authored .md or a handler for anything you care about.
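To see why HTML-to-Markdown conversion is inherently lossy, consider a toy converter that understands only headings, paragraphs and links. It is deliberately naive and is not Pyxle's converter; the class and function names are made up for this illustration.

```python
from html.parser import HTMLParser


class TinyMarkdownConverter(HTMLParser):
    """Toy converter: keeps h1-h3, paragraphs and links; drops all other structure."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.parts.append("#" * int(tag[1]) + " ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            if self.href:
                self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3", "p"):
            self.parts.append("\n\n")
        elif tag == "a" and self.href:
            self.parts.append(f"]({self.href})")
            self.href = None

    def handle_data(self, data):
        self.parts.append(data)


def tiny_html_to_markdown(html: str) -> str:
    parser = TinyMarkdownConverter()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip() + "\n"


print(tiny_html_to_markdown('<h1>Routing</h1><p>See the <a href="/docs/intro">intro</a>.</p>'))
# A table loses its structure entirely: only the cell text survives.
print(tiny_html_to_markdown("<table><tr><td>a</td><td>b</td></tr></table>"))
```

Any converter has to pick which elements to understand; whatever falls outside that set degrades the way the table does here, which is the case for authoring Markdown for pages that matter.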
The redirect fallback
With the feature on but no Markdown source and autoConvert off, /<page>.md returns a 307 redirect to /<page>. An agent that guesses a .md URL lands on the real page instead of a 404.
The MarkdownContext (ctx)
Every to_markdown and wrap_markdown handler receives a single argument — a MarkdownContext. It's a small, read-only object with everything you need to produce a page's Markdown:
| Member | Type | Description |
|---|---|---|
| ctx.request | starlette.requests.Request | The incoming request. Use it for route params, query string, headers, and the body. |
| ctx.path | str | The canonical page path, always without .md, e.g. /docs/routing, or / for the home page. Use this, not request.url.path. |
| await ctx.run_loader() | Any | Runs only the page's @server loader and returns its data (the dict the page would receive), skipping the render. The cheap path when you just want the loaded data. Returns {} for a page with no loader. |
| await ctx.render_html() | str | Renders the original page, running its @server loader and full SSR, and returns the body HTML (the component output, without the document shell). Lazy: nothing renders unless you call it. |
ctx.request — the request object
A standard Starlette Request. The members you'll actually reach for:
- ctx.request.path_params: the matched route parameters. For pages/docs/[[...slug]].pyxl, ctx.request.path_params["slug"] is "guides/routing". This is how one handler serves many pages.
- ctx.request.query_params: the query string (?q=…), a multidict.
- ctx.request.headers: request headers.
- await ctx.request.body() / await ctx.request.json(): the request body, if any.
ctx.path vs ctx.request.url.path — an important distinction
Use ctx.path. It is always the canonical page path with no .md suffix, whether the request arrived as /docs/routing.md or as /docs/routing with Accept: text/markdown. In contrast, ctx.request.url.path carries the raw request path — which includes .md on a .md request and omits it on an Accept-negotiated one. Reading ctx.path means your handler behaves identically on both entry points.
def to_markdown(ctx):
    # ctx.path             -> "/docs/routing"  (always canonical)
    # ctx.request.url.path -> "/docs/routing.md" OR "/docs/routing"
    slug = ctx.path.removeprefix("/docs/")
    ...

ctx.render_html() — post-processing the rendered page
When you want to derive Markdown from what the page actually renders — rather than from source data — call await ctx.render_html(). It runs the page's loader and server render and hands you the body HTML, which you can transform:
from pyxle.devserver.llms import html_to_markdown  # the built-in converter

async def to_markdown(ctx):
    html = await ctx.render_html()
    return html_to_markdown(html)  # roughly what autoConvert does, but on your terms

It's lazy and potentially expensive (a full SSR pass), so only call it when you need it. When you want the page's data rather than its rendered HTML, reach for ctx.run_loader() instead; it's much cheaper.
ctx.run_loader() — the loader's data, without the render
Often you don't want rendered HTML at all — you want the same data the page loads, to format as Markdown yourself. await ctx.run_loader() runs just the page's @server loader and returns its result (the dict the page would receive as data), skipping SSR entirely:
async def to_markdown(ctx):
    data = await ctx.run_loader()  # runs the @server loader, no render
    post = data["post"]
    return f"# {post['title']}\n\n{post['body']}\n"

This is the cheap path (a loader call, not a full render), and it reuses the exact data-loading your page already does. A page with no loader returns {}.
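Because a page without a loader yields {}, a handler shared across pages may want to decline when the expected key is missing. The sketch below shows one way to do that and exercises it with a stand-in context; stub_ctx and the post key are assumptions for the example, and the stub mimics only the run_loader member described above.

```python
import asyncio
from types import SimpleNamespace


async def to_markdown(ctx):
    data = await ctx.run_loader()
    post = data.get("post")
    if post is None:
        return None  # no usable data: decline and let the next rung answer
    return f"# {post['title']}\n\n{post['body']}\n"


def stub_ctx(loader_data):
    """Stand-in for a MarkdownContext, for local testing only."""
    async def run_loader():
        return loader_data
    return SimpleNamespace(path="/blog/hello", run_loader=run_loader)


print(asyncio.run(to_markdown(stub_ctx({"post": {"title": "Hello", "body": "First post."}}))))
print(asyncio.run(to_markdown(stub_ctx({}))))
```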
Handler contract
- Sync or async: both work. An async handler is awaited.
- Return str to serve that Markdown.
- Return None to decline and fall through to the next rung.
- Returning anything else raises a TypeError (surfaced in logs); the .md request degrades gracefully to a redirect, and an Accept-negotiated request falls back to HTML.
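The contract above can be summarised as a small caller. This is a sketch of how such a contract might be enforced, not Pyxle's internal code; call_handler is a hypothetical name.

```python
import asyncio
import inspect


async def call_handler(handler, ctx):
    """Call a sync or async handler and check its return value against the contract."""
    result = handler(ctx)
    if inspect.isawaitable(result):
        result = await result  # async handlers are awaited
    if result is not None and not isinstance(result, str):
        raise TypeError(
            f"handler must return str or None, got {type(result).__name__}"
        )
    return result


def sync_handler(ctx):
    return "# Hi\n"


async def declining_handler(ctx):
    return None


print(asyncio.run(call_handler(sync_handler, None)))
print(asyncio.run(call_handler(declining_handler, None)))
```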
Framing every page (wrap_markdown)
To add a consistent header/footer to every .md response — agent instructions, navigation hints, a canonical-URL banner — define a wrap_markdown(ctx, markdown) function in the root pages/llms.py. Pyxle calls it with the MarkdownContext and the already-resolved Markdown, and serves whatever string it returns:
# pages/llms.py
BASE = "https://example.com"

def wrap_markdown(ctx, markdown):
    header = (
        f"> Markdown rendition of {BASE}{ctx.path}, served for AI agents.\n"
        f"> Append `.md` to any URL for its Markdown. Index: {BASE}/llms.txt\n"
    )
    return f"{header}\n{markdown}"

Because it runs on Markdown from every source (co-located files, to_markdown handlers, autoConvert), the framing is defined once and applied everywhere. Return None to leave the Markdown untouched. And because it's applied at serve time, not baked into your source, your /llms-full.txt corpus and the raw .md files stay clean.
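As a variation, a wrap_markdown hook can frame most pages and opt a subtree out by returning None. In this sketch the /legal/ prefix is a made-up example, and SimpleNamespace stands in for the real context so the hook can be tried outside the framework.

```python
from types import SimpleNamespace

BASE = "https://example.com"  # assumption: your site's public origin


def wrap_markdown(ctx, markdown):
    # Hypothetical rule: serve legal pages verbatim, with no banner.
    if ctx.path.startswith("/legal/"):
        return None  # None leaves the Markdown untouched
    header = f"> Markdown rendition of {BASE}{ctx.path}, served for AI agents.\n"
    return f"{header}\n{markdown}"


print(wrap_markdown(SimpleNamespace(path="/docs/routing"), "# Routing\n"))
print(wrap_markdown(SimpleNamespace(path="/legal/terms"), "# Terms\n"))
```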
The /llms.txt index
/llms.txt is a Markdown map of your site that agents (and humans) can read to discover what's available. Pyxle resolves it, first hit wins:
- a static public/llms.txt (served by the static-asset layer before anything else), for full manual control;
- an llms_txt function in the root pages/llms.py, to generate it dynamically;
- a generated default: an H1 plus a ## Pages list linking every concrete (non-parameterised) page's .md.
For a site with dynamic content — docs, a blog, a catalog — the generated default can't enumerate your dynamic routes, so provide a llms_txt hook:
# pages/llms.py
def llms_txt(ctx):
    lines = ["# My App", "", "> One-line summary of the app.", "", "## Docs", ""]
    for slug, title in load_doc_index():
        lines.append(f"- [{title}](https://example.com/docs/{slug}.md): short description")
    return "\n".join(lines) + "\n"

The hook receives an LlmsTxtContext:
| Member | Type | Description |
|---|---|---|
| ctx.request | Request | The incoming request. |
| ctx.pages | tuple[LlmsPageInfo, ...] | Your app's concrete pages (see below). |
| ctx.render_default() | str | The framework's generated index: return it verbatim, extend it, or ignore it. |
Each entry in ctx.pages is an LlmsPageInfo — path (e.g. /about), md_url (/about.md), and title (a humanized label). Return a string, or None to fall back to the generated default.
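A hook does not have to start from scratch: it can take the generated default and append a section for dynamic routes. The sketch below assumes a hypothetical load_doc_index helper and uses a stub in place of LlmsTxtContext; the default index text in the stub is invented for the example and may not match what Pyxle generates.

```python
from types import SimpleNamespace


def load_doc_index():
    """Hypothetical source of dynamic doc pages: (slug, title) pairs."""
    return [("guides/routing", "Routing"), ("guides/data", "Data loading")]


def llms_txt(ctx):
    # Start from the framework's generated index, then add the dynamic docs.
    lines = [ctx.render_default().rstrip("\n"), "", "## Docs", ""]
    for slug, title in load_doc_index():
        lines.append(f"- [{title}](/docs/{slug}.md)")
    return "\n".join(lines) + "\n"


stub = SimpleNamespace(
    render_default=lambda: "# My App\n\n## Pages\n\n- [About](/about.md)\n"
)
print(llms_txt(stub))
```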
llms.txt vs llms-full.txt. /llms.txt is a map (links + descriptions) an agent reads to decide what to fetch. /llms-full.txt is the whole corpus concatenated into one file for one-shot ingestion. Pyxle generates /llms.txt; if you want /llms-full.txt, produce it in your build (Pyxle serves it as a static file). The three formats are complementary: the index for discovery, the full file for bulk reading, the per-page .md for precise pulls.
Content negotiation and discovery headers
Beyond the .md URLs, two things make the feature work for agents that don't append .md:
- Accept: text/markdown negotiation. A request to the canonical URL (/docs/routing) that includes text/markdown in its Accept header gets the Markdown, resolved through the exact same ladder. Browsers never send that header, so this is invisible to human visitors. The response carries Vary: Accept so shared caches key on it correctly.
- Discovery headers. Every response advertises the index: Link: </llms.txt>; rel="llms-txt" and X-Llms-Txt: /llms.txt. An agent can find your llms.txt from any page without parsing the body.
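On the agent side, reading the discovery headers could look like the sketch below. It is a simplified illustration: find_llms_txt is a hypothetical helper, and splitting the Link header on commas is a shortcut that would misparse targets containing commas.

```python
import re
from typing import Optional


def find_llms_txt(headers: dict) -> Optional[str]:
    """Return the llms.txt location advertised by a response, if any."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if "x-llms-txt" in lowered:
        return lowered["x-llms-txt"].strip()
    # Fall back to the Link header: <target>; rel="llms-txt"
    for part in lowered.get("link", "").split(","):
        match = re.match(r"\s*<([^>]*)>\s*;(.*)", part)
        if match and re.search(r'rel\s*=\s*"?llms-txt"?', match.group(2)):
            return match.group(1)
    return None


print(find_llms_txt({"Link": '</style.css>; rel="preload", </llms.txt>; rel="llms-txt"'}))
```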
Deployment
- A page's own to_markdown is compiled into your build and works anywhere pyxle serve runs.
- Co-located .md files and llms.py handlers are source files. They must be present alongside pages/ at runtime, which they are for the common case of deploying your whole project directory. If they're ever absent, resolution simply falls through to the next rung (and ultimately the redirect), so nothing breaks; the page just isn't available as Markdown.
- Caching. .md responses aren't run through the page edge-cache. If you serve heavy handlers under load, cache at your CDN/reverse proxy keyed on the path (and Vary: Accept for the negotiated route).
robots.txt and AI crawlers
The .md/llms.txt endpoints offer clean content; they don't gate crawling. If you want reach, don't block the AI bots in public/robots.txt — being read and cited by them is free distribution:
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: Claude-SearchBot
User-agent: PerplexityBot
User-agent: CCBot
Allow: /
Sitemap: https://example.com/sitemap.xml

Ship a normal XML sitemap alongside it. (OpenAI and Anthropic expose separate tokens for training vs search if you want to allow one and not the other; see their bot docs.)
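Before deploying, it can be worth checking that a robots.txt really admits the bots you care about. The sketch below uses Python's standard urllib.robotparser on an abbreviated copy of the file above; the allowed helper is a name chosen for this example, and Python's parser is only an approximation of how any given crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /

Sitemap: https://example.com/sitemap.xml
"""


def allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Check whether user_agent may fetch path under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


print(allowed(ROBOTS_TXT, "GPTBot", "/docs/routing.md"))
# A blocking rule for comparison:
print(allowed("User-agent: GPTBot\nDisallow: /\n", "GPTBot", "/docs/routing.md"))
```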
Recipes
Serve docs from generated JSON (as pyxle.dev does): one directory handler maps a slug to a pre-built Markdown field. See directory handler above.
Rewrite links for portability. Markdown that travels (pasted into a chat, saved to disk) should carry absolute links. Rewrite relative links to absolute .md URLs when you generate or store the Markdown, so [Routing](../routing.md) becomes [Routing](https://example.com/docs/routing.md).
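One way to do that rewrite is a small regex pass combined with urljoin. This is a sketch, not a full Markdown parser: absolutize_links is a hypothetical helper, it only handles inline links of the form [text](target), and it would also rewrite link-like text inside code blocks.

```python
import re
from urllib.parse import urljoin, urlsplit

# Matches the "](target)" tail of an inline Markdown link.
LINK_TARGET = re.compile(r"\]\(([^)\s]+)\)")


def absolutize_links(markdown: str, page_md_url: str) -> str:
    """Rewrite relative link targets to absolute URLs, resolved against page_md_url."""
    def fix(match):
        target = match.group(1)
        # Leave in-page anchors and already-absolute URLs (http:, mailto:, ...) alone.
        if target.startswith("#") or urlsplit(target).scheme:
            return match.group(0)
        return f"]({urljoin(page_md_url, target)})"

    return LINK_TARGET.sub(fix, markdown)


print(absolutize_links(
    "[Routing](../routing.md)",
    "https://example.com/docs/guides/intro.md",
))
```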
Add an agent search endpoint. Agents can search by fetching /llms-full.txt and scanning it, but a dedicated endpoint is nicer. A plain pages/api/*.py route that ranks your content and returns Markdown links works well — then point to it from wrap_markdown:
# pages/api/search.py → GET /api/search?q=...
from starlette.responses import PlainTextResponse

async def endpoint(request):
    q = request.query_params.get("q", "")
    hits = search_your_index(q)  # your ranking
    lines = [f"# Results for “{q}”", ""]
    lines += [f"- [{h.title}]({h.md_url})" for h in hits]
    return PlainTextResponse("\n".join(lines), media_type="text/markdown; charset=utf-8")

Per-section handlers. Give pages/docs/llms.py and pages/blog/llms.py different to_markdown handlers; each scopes to its subtree, and the root pages/llms.py catches anything else.
FAQ
Does this slow down my pages? No. The .md routes are separate and only run for .md/Accept: text/markdown requests. The normal render path is untouched, and the whole feature is off unless you enable it.
Do I have to write Markdown for every page? No. Enable the feature and pages with no source simply redirect their .md URL to the page. Add .md files or handlers only where clean Markdown is worth it (usually docs and content pages).
What about my interactive pages? A dashboard or playground isn't meaningful as Markdown — either give it a short hand-written .md describing what it is, or let it redirect. autoConvert exists for a rough automatic version if you want one.
Is this the same as Mintlify's .md/llms.txt? Same idea, but built into the framework and applied to your whole app, not just a hosted docs site — and you control exactly where each page's Markdown comes from.
Who reads all this? Right now, the clearest win is direct: point any AI assistant — Claude, Cursor, ChatGPT — at a .md URL or your llms.txt and it gets clean, token-efficient context instead of scraped HTML. Beyond that, .md and llms.txt are the conventions the AI ecosystem is standardizing on — so your app already speaks the format machine readers are moving toward, served to spec (per-page Markdown, Accept negotiation, discovery headers) with nothing more to do as adoption grows.
See also
- Configuration → AI accessibility: the config block.
- Runtime API → AI accessibility hooks: to_markdown, wrap_markdown, llms_txt, and the context objects.
- Pyxle for AI coding agents: the broader case for Pyxle in the AI era.
- The llms.txt specification.