feat: agent engine optimization (AEO) for AI crawler visibility by francisfuzz · Pull Request #45 · francisfuzz/francisfuzz.github.io

francisfuzz · 2026-04-16T06:22:27Z

What this does

Implements Agent Engine Optimization (AEO) — the practice of structuring and serving site content so AI agents can discover, parse, and accurately cite it. Five targeted changes, prioritized by hard data.

Why

AI agents (ChatGPT, Claude, Perplexity, Gemini) are increasingly how people find information. They don't use traditional search the same way humans do — they crawl, parse, and reason over structured content. Without explicit signals, agents either miss the site entirely or hallucinate identity details.

Key research driving these changes:

Princeton GEO paper (ACM KDD 2024): Structured, well-formatted content boosts visibility in generative engine responses by up to 40%
Google + Microsoft (March 2025): Officially confirmed they use schema.org JSON-LD during AI response generation; pages with schema get 3.2× more citations in AI answers
Mintlify CDN log data: llms-full.txt receives dramatically more AI agent traffic than llms.txt — ChatGPT accounts for the majority; agents prefer embedding full content upfront over follow-up fetches
Anthropic: Explicitly requested Mintlify implement both llms.txt and llms-full.txt for their own docs — first-party signal that ClaudeBot uses these files
Addy Osmani's AEO framework: A missing robots.txt silently denies AI agents access; it's a required precondition

Changes

1. `_layouts/default.html` — `Person` JSON-LD schema (highest ROI)

Adds schema.org/Person structured data to every page's <head>. This tells Google, Bing, and AI agents exactly who Francis is: name, job title, employer, description, and social profile URLs. Confirmed by Google/Microsoft to directly influence AI Overview citations.

2. `_layouts/post.html` — `BlogPosting` JSON-LD schema

Adds schema.org/BlogPosting structured data to each post: headline, publish date, description, author attribution, and canonical URL. Enables accurate authorship attribution when AI agents cite individual posts.

3. `llms-full.txt` — Full site content for agent consumption

A Jekyll-templated file at /llms-full.txt that outputs clean, stripped text for the About section, Résumé, and all posts (auto-updating via Liquid loop). Mintlify's data shows llms-full.txt gets ~25× more AI traffic than llms.txt because agents prefer full content upfront. This is the file ChatGPT reads most.

4. `llms.txt` — Structured site index

A root-level markdown file at /llms.txt following the llms.txt standard. Provides a token-efficient index: who Francis is, what each page covers, and direct URLs. Used by Anthropic, Cloudflare, and Stripe for their own properties. 844,000+ sites have adopted it.

5. `robots.txt` — Explicit AI crawler permissions

Explicitly allows all major AI crawlers: anthropic-ai, ClaudeBot, GPTBot, OAI-SearchBot, PerplexityBot. Per Addy Osmani's AEO framework, a missing robots.txt will silently block agents. Also points crawlers to the RSS feed as a sitemap.

Expected impact

Signal	Expected effect
JSON-LD Person + BlogPosting	3.2× citation multiplier in AI answers (Google/MS confirmed)
`llms-full.txt`	ChatGPT and other agents actively crawl this; full content reduces hallucinated summaries
`llms.txt`	Identity signal for Anthropic/OpenAI crawlers; used by Stripe, Cloudflare as standard practice
`robots.txt`	Removes silent access denial; prerequisite for all other signals to work

What was deprioritized and why

Token efficiency / skill.md: Addy Osmani's AEO targets API documentation sites. This is a personal blog — no API surface to describe.
Meta description overhaul: jekyll-seo-tag already outputs correct <meta name="description"> tags from post front matter description fields. No work needed.

🤖 Generated with Claude Code

Add structured data, llms.txt/llms-full.txt, and robots.txt so AI agents can reliably discover, parse, and cite this site's content. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Convert llms.txt to Jekyll template so post index auto-updates - Add /gear/ page to llms.txt and llms-full.txt - Add claude-web crawler to robots.txt (Anthropic's third agent) - Fix publisher type to Organization in BlogPosting JSON-LD - Add jsonify filter to bare Liquid interpolations in JSON-LD - Fix placeholder channel metadata in public/feed.xml Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat: add agent engine optimization (AEO) for AI crawler visibility

e8a16e6

Add structured data, llms.txt/llms-full.txt, and robots.txt so AI agents can reliably discover, parse, and cite this site's content. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

francisfuzz marked this pull request as ready for review April 16, 2026 06:22

francisfuzz and others added 2 commits April 16, 2026 06:56

docs: add AEO maintenance guidance to CLAUDE.md

20f26e1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

francisfuzz merged commit b5d62db into main Apr 16, 2026

francisfuzz deleted the feat/agent-engine-optimization branch April 16, 2026 14:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: agent engine optimization (AEO) for AI crawler visibility#45

feat: agent engine optimization (AEO) for AI crawler visibility#45
francisfuzz merged 3 commits intomainfrom
feat/agent-engine-optimization

francisfuzz commented Apr 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

francisfuzz commented Apr 16, 2026

What this does

Why

Changes

1. _layouts/default.html — Person JSON-LD schema (highest ROI)

2. _layouts/post.html — BlogPosting JSON-LD schema

3. llms-full.txt — Full site content for agent consumption

4. llms.txt — Structured site index

5. robots.txt — Explicit AI crawler permissions

Expected impact

What was deprioritized and why

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

1. `_layouts/default.html` — `Person` JSON-LD schema (highest ROI)

2. `_layouts/post.html` — `BlogPosting` JSON-LD schema

3. `llms-full.txt` — Full site content for agent consumption

4. `llms.txt` — Structured site index

5. `robots.txt` — Explicit AI crawler permissions