You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
docs: add aggregate function literal detection to make-pythonic skill
Add Technique 1a to detect literal-only arguments in aggregate functions.
Unlike scalar UDFs which enforce literals in invoke_with_args(), aggregate
functions enforce them in accumulator() via get_scalar_value(),
validate_percentile_expr(), or downcast_ref::<Literal>(). Without this
technique, the skill would incorrectly classify arguments like
approx_percentile_cont's percentile as Category A (Expr | float) when they
should be Category B (float only). Updates the decision flow to branch on
scalar vs aggregate before checking for literal enforcement.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There are four concrete techniques to check, in order of signal strength:
62
68
63
69
#### Technique 1: Check `invoke_with_args()` for literal-only enforcement (strongest signal)
64
70
@@ -75,6 +81,48 @@ let granularity_str = if let ColumnarValue::Scalar(ScalarValue::Utf8(Some(v))) =
75
81
76
82
**If you find this pattern:** The argument is **Category B** — accept only the corresponding native Python type (e.g., `str`), not `Expr`. The function will error at runtime with a column expression anyway.
77
83
84
+
#### Technique 1a: Check `accumulator()` for literal-only enforcement (aggregate functions)
85
+
86
+
Technique 1 applies to scalar UDFs. Aggregate functions do not have `invoke_with_args()` — instead, they enforce literal-only arguments in their `accumulator()` (or `create_accumulator()`) method, which runs at planning time before any data is processed.
87
+
88
+
Look for these patterns inside `accumulator()`:
89
+
90
+
-`get_scalar_value(expr)` — evaluates the expression against an empty batch and errors if it's not a scalar
91
+
-`validate_percentile_expr(expr)` — specific helper used by percentile functions
92
+
-`downcast_ref::<Literal>()` — checks that the physical expression is a literal constant
"The second argument of the string_agg function must be a string literal"
111
+
);
112
+
};
113
+
// ...
114
+
}
115
+
```
116
+
117
+
**If you find this pattern:** The argument is **Category B** — accept only the corresponding native Python type, not `Expr`. The function will error at planning time with a non-literal expression.
118
+
119
+
Known aggregate functions with literal-only arguments:
#### Technique 2: Check the `Signature` for data type constraints
79
127
80
128
Each function defines a `Signature::coercible(...)` that specifies what data types each argument accepts, using `Coercion` entries. This tells you the expected **data type** even if it doesn't enforce literal-only.
@@ -125,12 +173,22 @@ let decimal_places: Option<i32> = match args.scalar_arguments.get(1) {
125
173
#### Decision flow
126
174
127
175
```
128
-
Is argument rejected at runtime if not a literal?
129
-
(check invoke_with_args for ColumnarValue::Scalar-only match + exec_err!)
130
-
→ YES: Category B — accept only native type, no Expr
131
-
→ NO: Does the Signature constrain it to a specific data type?
132
-
→ YES: Category A — accept Expr | <native type matching the constraint>
133
-
→ NO: Leave as Expr only
176
+
Is the function a scalar UDF or an aggregate?
177
+
Scalar UDF:
178
+
Is argument rejected at runtime if not a literal?
179
+
(check invoke_with_args for ColumnarValue::Scalar-only match + exec_err!)
180
+
→ YES: Category B — accept only native type, no Expr
181
+
→ NO: continue below
182
+
Aggregate:
183
+
Is argument rejected at planning time if not a literal?
184
+
(check accumulator() for get_scalar_value / validate_percentile_expr /
185
+
downcast_ref::<Literal>() + error)
186
+
→ YES: Category B — accept only native type, no Expr
187
+
→ NO: continue below
188
+
189
+
Does the Signature constrain it to a specific data type?
190
+
→ YES: Category A — accept Expr | <native type matching the constraint>
0 commit comments