Commit c7cdc63

timsaucer and claude committed
docs: publish llms.txt at docs site root
Adds `docs/source/llms.txt` in llmstxt.org schema: a short description plus categorized links to the agent skill, user guide pages, DataFrame API reference, and example queries. `html_extra_path` in `conf.py` copies it verbatim to the published site root so it resolves at `https://datafusion.apache.org/python/llms.txt`.

Implements PR 4b of the plan in #1394.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 6e2241b commit c7cdc63

3 files changed

Lines changed: 41 additions & 1 deletion

dev/release/rat_exclude_files.txt

Lines changed: 2 additions & 1 deletion
@@ -49,4 +49,5 @@ benchmarks/tpch/create_tables.sql
 **/.cargo/config.toml
 uv.lock
 examples/tpch/answers_sf1/*.tbl
-SKILL.md
+SKILL.md
+docs/source/llms.txt

docs/source/conf.py

Lines changed: 4 additions & 0 deletions
@@ -129,6 +129,10 @@ def setup(sphinx) -> None:
 # so a file named "default.css" will overwrite the builtin "default.css".
 html_static_path = ["_static"]
 
+# Copy agent-facing files (llms.txt) verbatim to the site root so they
+# resolve at conventional URLs like `https://.../python/llms.txt`.
+html_extra_path = ["llms.txt"]
+
 html_logo = "_static/images/2x_bgwhite_original.png"
 
 html_css_files = ["theme_overrides.css"]
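
The effect of `html_extra_path` can be checked after a build. Sphinx resolves each entry relative to the directory containing `conf.py` and copies it into the HTML output directory, so `llms.txt` should land at the output root byte-for-byte. The following is a minimal sketch, not part of this commit: the helper name `extra_file_published` and the `docs/build/html` output path are assumptions made for illustration.

```python
import filecmp
from pathlib import Path


def extra_file_published(source_dir: Path, output_dir: Path, entry: str) -> bool:
    """Return True if an html_extra_path file entry was copied verbatim.

    Assumes `entry` names a single file (as in this commit), which is
    expected to appear at the output root under its own file name.
    """
    src = source_dir / entry
    dst = output_dir / src.name
    # shallow=False forces a content comparison instead of a stat-only check.
    return src.is_file() and dst.is_file() and filecmp.cmp(src, dst, shallow=False)


if __name__ == "__main__":
    # Hypothetical paths; adjust to the local build layout.
    ok = extra_file_published(Path("docs/source"), Path("docs/build/html"), "llms.txt")
    print("llms.txt published verbatim:", ok)
```

A check like this could run in CI after the docs build to catch a renamed or missing entry before the site is published.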

docs/source/llms.txt

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+# DataFusion in Python
+
+> Apache DataFusion Python is a Python binding for Apache DataFusion, an in-process, Arrow-native query engine. It exposes a SQL interface and a lazy DataFrame API over PyArrow and any Arrow C Data Interface source. This file points agents and LLM-based tools at the most useful entry points for writing DataFusion Python code.
+
+## Agent Guide
+
+- [SKILL.md (agent skill)](https://datafusion.apache.org/python/skill.html): idiomatic DataFrame API patterns, SQL-to-DataFrame mappings, common pitfalls, and the full `functions` catalog. Primary source of truth for writing datafusion-python code.
+
+## User Guide
+
+- [Introduction](https://datafusion.apache.org/python/user-guide/introduction.html): install, the Pokemon quick start, Jupyter tips.
+- [Basics](https://datafusion.apache.org/python/user-guide/basics.html): `SessionContext`, `DataFrame`, and `Expr` at a glance.
+- [Data sources](https://datafusion.apache.org/python/user-guide/data-sources.html): Parquet, CSV, JSON, Arrow, Pandas, Polars, and Python objects.
+- [DataFrame operations](https://datafusion.apache.org/python/user-guide/dataframe/index.html): the lazy query-building interface.
+- [Common operations](https://datafusion.apache.org/python/user-guide/common-operations/index.html): select, filter, join, aggregate, window, expressions, and functions.
+- [SQL](https://datafusion.apache.org/python/user-guide/sql.html): running SQL against registered tables.
+- [Configuration](https://datafusion.apache.org/python/user-guide/configuration.html): session and runtime options.
+
+## DataFrame API reference
+
+- [`datafusion.dataframe.DataFrame`](https://datafusion.apache.org/python/autoapi/datafusion/dataframe/index.html): the lazy DataFrame builder (`select`, `filter`, `aggregate`, `join`, `sort`, `limit`, set operations).
+- [`datafusion.expr`](https://datafusion.apache.org/python/autoapi/datafusion/expr/index.html): expression tree nodes (`Expr`, `Window`, `WindowFrame`, `GroupingSet`).
+- [`datafusion.functions`](https://datafusion.apache.org/python/autoapi/datafusion/functions/index.html): 290+ scalar, aggregate, and window functions.
+- [`datafusion.context.SessionContext`](https://datafusion.apache.org/python/autoapi/datafusion/context/index.html): session entry point, data loading, SQL execution.
+
+## Examples
+
+- [TPC-H queries (GitHub)](https://github.com/apache/datafusion-python/tree/main/examples/tpch): canonical translations of TPC-H Q01–Q22 to idiomatic DataFrame code, each with reference SQL embedded in the module docstring.
+- [Other examples (GitHub)](https://github.com/apache/datafusion-python/tree/main/examples): UDF/UDAF/UDWF, Substrait, Pandas/Polars interop, S3 reads.
+
+## Optional
+
+- [Contributor guide](https://datafusion.apache.org/python/contributor-guide/introduction.html): building from source, extending the Python bindings.
+- [Upgrade guides](https://datafusion.apache.org/python/user-guide/upgrade-guides.html): migration notes between releases.
+- [Upstream Rust `DataFusion`](https://datafusion.apache.org/): the underlying query engine.
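
The added file follows the llmstxt.org layout: one H1 title, a blockquote summary, then H2 sections whose bullets take the form `[title](url): description`. The following is a rough sketch of how a consumer might read that layout using only the standard library. The `parse_llms_txt` helper and its return shape are illustrative assumptions, not part of datafusion-python or of any llms.txt tooling, and the sample text is shortened.

```python
import re

# Matches a bullet of the form "- [title](url): description"; the description is optional.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>.+?)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text: str) -> dict:
    """Parse llms.txt-style markdown into a title, a summary, and per-section links."""
    doc = {"title": None, "summary": None, "sections": {}}
    current = None  # name of the H2 section currently being read
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc["sections"][current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc["sections"][current].append(match.groupdict())
    return doc


# Shortened sample in the same shape as the committed file.
sample = """# DataFusion in Python

> Short summary.

## Examples

- [TPC-H queries (GitHub)](https://github.com/apache/datafusion-python/tree/main/examples/tpch): canonical translations.
"""

parsed = parse_llms_txt(sample)
print(parsed["title"])
print([link["url"] for link in parsed["sections"]["Examples"]])
```

Grouping links by H2 heading keeps the `Optional` section separate, so a consumer with a limited context budget can skip it.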