Commit e461499

timsaucer and claude committed
docs: wire new contributor skills and plan-comparison diagnostic into AGENTS.md
- List the three contributor skills (`check-upstream`, `write-dataframe-code`, `audit-skill-md`) under the Skills section so agents know what tools they have before starting work.
- Document the plan-comparison diagnostic workflow (comparing `ctx.sql(...).optimized_logical_plan()` against a DataFrame's `optimized_logical_plan()` via `LogicalPlan.__eq__`) for translating SQL queries to DataFrame form. Points at the full write-up in the `write-dataframe-code` skill rather than duplicating it.

`CLAUDE.md` is a symlink to `AGENTS.md`, so the change lands in both.

Implements PR 4f of the plan in #1394.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent a3f19a9 commit e461499

1 file changed

Lines changed: 29 additions & 0 deletions

AGENTS.md

@@ -33,6 +33,35 @@ Skills follow the [Agent Skills](https://agentskills.io) open standard. Each ski
 - `SKILL.md` — The skill definition with YAML frontmatter (name, description, argument-hint) and detailed instructions.
 - Additional supporting files as needed.
 
+Currently available skills:
+
+- [`check-upstream`](.ai/skills/check-upstream/SKILL.md) — audit upstream
+Apache DataFusion features (functions, DataFrame ops, SessionContext
+methods, FFI types) not yet exposed in the Python bindings.
+- [`write-dataframe-code`](.ai/skills/write-dataframe-code/SKILL.md)
+contributor-facing guide for writing idiomatic DataFrame code inside this
+repo (TPC-H pattern index, plan-comparison diagnostic, docstring
+conventions). Layers on top of the user-facing [`SKILL.md`](SKILL.md).
+- [`audit-skill-md`](.ai/skills/audit-skill-md/SKILL.md) — cross-reference
+the repo-root `SKILL.md` against the current public Python API and report
+new APIs needing coverage and stale mentions. Run after upstream syncs.
+
+## Plan-comparison diagnostic
+
+When translating a SQL query to a DataFrame — TPC-H, a benchmark, or an
+answer to a user question — correctness is gated by the answer-file
+comparison in `examples/tpch/_tests.py`, but plan-level equivalence is a
+separate question. Two surface-different DataFrame forms that resolve to
+the same optimized logical plan are effectively the same query.
+
+As an ad-hoc check, compare `ctx.sql(reference_sql).optimized_logical_plan()`
+against the DataFrame's `optimized_logical_plan()`. Use `LogicalPlan.__eq__`
+for structural equality and `LogicalPlan.display_indent()` for readable
+diffs. This is a diagnostic, not a gate — a mismatch does not mean the
+DataFrame form is wrong, only that the two forms are not literally the same
+plan. The [`write-dataframe-code`](.ai/skills/write-dataframe-code/SKILL.md)
+skill has the full workflow.
+
 ## Pull Requests
 
 Every pull request must follow the template in
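
The plan-comparison diagnostic described in the added section might look roughly like the sketch below. It is an illustration, not the workflow from the `write-dataframe-code` skill: `compare_plans` and `demo` are hypothetical helper names, and the table `t` and its query are made up for the example. The helper assumes only that plan objects support `==` and `display_indent()`, as the commit message states for `LogicalPlan`; the readable diff comes from the standard-library `difflib`. The `demo` function additionally assumes the `datafusion` package is installed.

```python
import difflib


def compare_plans(reference_plan, candidate_plan):
    """Return (equal, diff_text) for two plan objects.

    Hypothetical helper: it relies only on `==` for structural equality
    and on `display_indent()` for a printable form of each plan.
    """
    if reference_plan == candidate_plan:
        return True, ""
    diff_lines = difflib.unified_diff(
        reference_plan.display_indent().splitlines(),
        candidate_plan.display_indent().splitlines(),
        fromfile="sql",
        tofile="dataframe",
        lineterm="",
    )
    return False, "\n".join(diff_lines)


def demo():
    """Compare a SQL query with a DataFrame translation (needs `datafusion`)."""
    from datafusion import SessionContext, col, lit

    ctx = SessionContext()
    # Made-up in-memory table, registered under the name "t".
    ctx.from_pydict({"a": [1, 2, 3], "b": [10, 20, 30]}, name="t")

    reference_sql = "SELECT a FROM t WHERE b > 10"
    sql_plan = ctx.sql(reference_sql).optimized_logical_plan()
    df_plan = (
        ctx.table("t")
        .filter(col("b") > lit(10))
        .select(col("a"))
        .optimized_logical_plan()
    )

    equal, diff = compare_plans(sql_plan, df_plan)
    print("plans equal:", equal)
    if diff:
        # A mismatch is a prompt to look closer, not a failure.
        print(diff)
```

Calling `demo()` prints whether the two optimized plans match and, if they do not, a unified diff of their indented forms. In keeping with the text in the diff, a non-empty diff only shows that the two forms are not literally the same plan.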
