Commit e0284c6
feat: add AI skill to find and improve the Pythonic interface to functions (#1484)
* feat: accept native Python types in function arguments instead of requiring lit()
Update 47 functions in functions.py to accept native Python types (int, float,
str) for arguments that are contextually literals, eliminating verbose lit()
wrapping. For example, users can now write split_part(col("a"), ",", 2) instead
of split_part(col("a"), lit(","), lit(2)). All changes are backward compatible.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: update alias function signatures to match pythonic primary functions
Update instr and position (aliases of strpos) to accept Expr | str for
the substring parameter, matching the updated primary function signature.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update make-pythonic skill to require alias type hint updates
Alias functions that delegate to a primary function must have their type
hints updated to match, even though coercion logic is only added to the
primary. Added a new Step 3 to the implementation workflow for this.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address review feedback on pythonic skill and function signatures
Update SKILL.md to prevent three classes of issues: clarify that float
already accepts int per PEP 484 (avoiding redundant int | float that
fails ruff PYI041), add backward-compat rule for Category B so existing
Expr params aren't removed, and add guidance for inline coercion with
many optional nullable params instead of local helpers.
Replace regexp_instr's _to_raw() helper with inline coercion matching
the pattern used throughout the rest of the file.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: add coerce_to_expr helpers and replace inline coercion patterns
Introduce coerce_to_expr() and coerce_to_expr_or_none() in expr.py as the
complement to ensure_expr() — where ensure_expr rejects non-Expr values,
these helpers wrap them via Expr.literal(). Replaces ~60 inline isinstance
checks in functions.py with single-line helper calls, and updates the
make-pythonic skill to document the new pattern.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add aggregate function literal detection to make-pythonic skill
Add Technique 1a to detect literal-only arguments in aggregate functions.
Unlike scalar UDFs which enforce literals in invoke_with_args(), aggregate
functions enforce them in accumulator() via get_scalar_value(),
validate_percentile_expr(), or downcast_ref::<Literal>(). Without this
technique, the skill would incorrectly classify arguments like
approx_percentile_cont's percentile as Category A (Expr | float) when they
should be Category B (float only). Updates the decision flow to branch on
scalar vs aggregate before checking for literal enforcement.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add window function literal detection to make-pythonic skill
Add Technique 1b to detect literal-only arguments in window functions.
Window functions enforce literals in partition_evaluator() via
get_scalar_value_from_args() / downcast_ref::<Literal>(), not in
invoke_with_args() (scalar) or accumulator() (aggregate). Updates the
decision flow to branch on scalar vs aggregate vs window.
Known window functions with literal-only arguments: ntile (n), lead/lag
(offset, default_value), nth_value (n).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use explicit None checks, widen numeric type hints, and add tests
Replace 7 fragile truthiness checks (x.expr if x else None) with
explicit is not None checks to prevent silent None when zero-valued
literals are passed. Widen log/power/pow type hints to Expr | int | float
with noqa: PYI041 for clarity. Add unit tests for coerce_to_expr helpers
and integration tests for pythonic calling conventions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: suppress FBT003 in tests and remove redundant noqa comments
Add FBT003 (boolean positional value) to the per-file-ignores for
python/tests/* in pyproject.toml, and remove the 6 now-redundant
inline noqa: FBT003 comments across test_expr.py and test_context.py.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: replace static function lists with discovery instructions in skill
Replace hardcoded "Known aggregate/window functions with literal-only
arguments" lists with instructions to discover them dynamically by
searching the upstream crate source. Keeps a few examples as validation
anchors so the agent knows its search is working correctly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: make interrupt test reliable on Python 3.11
PyThreadState_SetAsyncExc only delivers exceptions when the thread is
executing Python bytecode, not while in native (Rust/C) code. The
previous test had two issues causing flakiness on Python 3.11:
1. The interrupt fired before df.collect() entered the UDF, while the
thread was still in native code where async exceptions are ignored.
2. time.sleep(2.0) is a single C call where async exceptions are not
checked — they're only checked between bytecode instructions.
Fix by adding a threading.Event so the interrupt waits until the UDF is
actually executing Python code, and by sleeping in small increments so
the eval loop has opportunities to check for pending exceptions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>1 parent 0357716 commit e0284c6
8 files changed
Lines changed: 893 additions & 193 deletions
File tree
- .ai/skills/make-pythonic
- python
- datafusion
- tests
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
111 | 111 | | |
112 | 112 | | |
113 | 113 | | |
| 114 | + | |
114 | 115 | | |
115 | 116 | | |
116 | 117 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
243 | 243 | | |
244 | 244 | | |
245 | 245 | | |
| 246 | + | |
| 247 | + | |
246 | 248 | | |
247 | 249 | | |
248 | 250 | | |
| |||
255 | 257 | | |
256 | 258 | | |
257 | 259 | | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
258 | 264 | | |
259 | 265 | | |
260 | 266 | | |
| |||
299 | 305 | | |
300 | 306 | | |
301 | 307 | | |
| 308 | + | |
| 309 | + | |
| 310 | + | |
| 311 | + | |
| 312 | + | |
| 313 | + | |
| 314 | + | |
| 315 | + | |
| 316 | + | |
| 317 | + | |
| 318 | + | |
| 319 | + | |
| 320 | + | |
| 321 | + | |
| 322 | + | |
| 323 | + | |
| 324 | + | |
| 325 | + | |
| 326 | + | |
| 327 | + | |
| 328 | + | |
| 329 | + | |
| 330 | + | |
| 331 | + | |
| 332 | + | |
| 333 | + | |
| 334 | + | |
| 335 | + | |
| 336 | + | |
| 337 | + | |
| 338 | + | |
| 339 | + | |
| 340 | + | |
| 341 | + | |
| 342 | + | |
302 | 343 | | |
303 | 344 | | |
304 | 345 | | |
| |||
0 commit comments