Skip to content

feat: agent engine optimization (AEO) for AI crawler visibility#45

Merged
francisfuzz merged 3 commits intomainfrom
feat/agent-engine-optimization
Apr 16, 2026
Merged

feat: agent engine optimization (AEO) for AI crawler visibility#45
francisfuzz merged 3 commits intomainfrom
feat/agent-engine-optimization

Conversation

@francisfuzz
Copy link
Copy Markdown
Owner

What this does

Implements Agent Engine Optimization (AEO) — the practice of structuring and serving site content so AI agents can discover, parse, and accurately cite it. Five targeted changes, prioritized by hard data.

Why

AI agents (ChatGPT, Claude, Perplexity, Gemini) are increasingly how people find information. They don't use traditional search the same way humans do — they crawl, parse, and reason over structured content. Without explicit signals, agents either miss the site entirely or hallucinate identity details.

Key research driving these changes:

  • Princeton GEO paper (ACM KDD 2024): Structured, well-formatted content boosts visibility in generative engine responses by up to 40%
  • Google + Microsoft (March 2025): Officially confirmed they use schema.org JSON-LD during AI response generation; pages with schema get 3.2× more citations in AI answers
  • Mintlify CDN log data: llms-full.txt receives dramatically more AI agent traffic than llms.txt — ChatGPT accounts for the majority; agents prefer embedding full content upfront over follow-up fetches
  • Anthropic: Explicitly requested Mintlify implement both llms.txt and llms-full.txt for their own docs — first-party signal that ClaudeBot uses these files
  • Addy Osmani's AEO framework: A missing robots.txt silently denies AI agents access; it's a required precondition

Changes

1. _layouts/default.htmlPerson JSON-LD schema (highest ROI)

Adds schema.org/Person structured data to every page's <head>. This tells Google, Bing, and AI agents exactly who Francis is: name, job title, employer, description, and social profile URLs. Confirmed by Google/Microsoft to directly influence AI Overview citations.

2. _layouts/post.htmlBlogPosting JSON-LD schema

Adds schema.org/BlogPosting structured data to each post: headline, publish date, description, author attribution, and canonical URL. Enables accurate authorship attribution when AI agents cite individual posts.

3. llms-full.txt — Full site content for agent consumption

A Jekyll-templated file at /llms-full.txt that outputs clean, stripped text for the About section, Résumé, and all posts (auto-updating via Liquid loop). Mintlify's data shows llms-full.txt gets ~25× more AI traffic than llms.txt because agents prefer full content upfront. This is the file ChatGPT reads most.

4. llms.txt — Structured site index

A root-level markdown file at /llms.txt following the llms.txt standard. Provides a token-efficient index: who Francis is, what each page covers, and direct URLs. Used by Anthropic, Cloudflare, and Stripe for their own properties. 844,000+ sites have adopted it.

5. robots.txt — Explicit AI crawler permissions

Explicitly allows all major AI crawlers: anthropic-ai, ClaudeBot, GPTBot, OAI-SearchBot, PerplexityBot. Per Addy Osmani's AEO framework, a missing robots.txt will silently block agents. Also points crawlers to the RSS feed as a sitemap.

Expected impact

Signal Expected effect
JSON-LD Person + BlogPosting 3.2× citation multiplier in AI answers (Google/MS confirmed)
llms-full.txt ChatGPT and other agents actively crawl this; full content reduces hallucinated summaries
llms.txt Identity signal for Anthropic/OpenAI crawlers; used by Stripe, Cloudflare as standard practice
robots.txt Removes silent access denial; prerequisite for all other signals to work

What was deprioritized and why

  • Token efficiency / skill.md: Addy Osmani's AEO targets API documentation sites. This is a personal blog — no API surface to describe.
  • Meta description overhaul: jekyll-seo-tag already outputs correct <meta name="description"> tags from post front matter description fields. No work needed.

🤖 Generated with Claude Code

Add structured data, llms.txt/llms-full.txt, and robots.txt so AI agents
can reliably discover, parse, and cite this site's content.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@francisfuzz francisfuzz marked this pull request as ready for review April 16, 2026 06:22
francisfuzz and others added 2 commits April 16, 2026 06:56
- Convert llms.txt to Jekyll template so post index auto-updates
- Add /gear/ page to llms.txt and llms-full.txt
- Add claude-web crawler to robots.txt (Anthropic's third agent)
- Fix publisher type to Organization in BlogPosting JSON-LD
- Add jsonify filter to bare Liquid interpolations in JSON-LD
- Fix placeholder channel metadata in public/feed.xml

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@francisfuzz francisfuzz merged commit b5d62db into main Apr 16, 2026
@francisfuzz francisfuzz deleted the feat/agent-engine-optimization branch April 16, 2026 14:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant