Commit c6e10b2

llali and shivsood authored
Sync Repos: Release 170.168.0 (#190)
* Merged PR 1952906: JSON_ARRAYAGG support for OVER clause

# Pull Request Template for ScriptDom

## Description

`JSON_ARRAYAGG` supports windowed aggregate usage via the `OVER (PARTITION BY ...)` clause in SQL Server, but ScriptDOM was missing support for parsing and generating it. This PR adds OVER clause support for `JSON_ARRAYAGG` in the TSql170 parser and script generator.

**SQL syntax now supported:**

```sql
SELECT JSON_ARRAYAGG(name) OVER (PARTITION BY dept) FROM employees;
SELECT JSON_ARRAYAGG(name ABSENT ON NULL) OVER (PARTITION BY dept) FROM employees;
SELECT JSON_ARRAYAGG(name NULL ON NULL) OVER (PARTITION BY dept) FROM employees;
SELECT JSON_ARRAYAGG(name ORDER BY name NULL ON NULL RETURNING JSON) OVER (PARTITION BY dept) FROM employees;
```

* Merged PR 1948594: Fix allocation issue in TSql80ParserBaseInternal.AddAndUpdateTokenInfo

High memory allocations were detected in the `TSql80ParserBaseInternal.AddAndUpdateTokenInfo` method, specifically in the overload that processes collections. The method uses a `foreach` loop to iterate over an `IList<TFragmentType>` collection, which causes the allocation of a boxed `List.Enumerator` object on every invocation. Since this is a hot path in the SQL parser (called frequently during parsing operations), these repeated allocations contribute significantly to GC pressure.

Performance impact: This issue appears in 0.28% of high allocation traces and allocates 3.0 MB/sec at both the 50th and 90th percentiles.

## Solution

Replaced the `foreach` loop with an index-based `for` loop in the `AddAndUpdateTokenInfo` method. The new implementation iterates using an integer index from 0 to `otherCollection.Count`, accessing each element via the indexer. This eliminates the enumerator allocation while preserving identical functionality and behavior.

The change is made in `/SqlScriptDom/Parser/TSql/TSql80ParserBaseInternal.cs` at lines 292-299.

**Before:**

```csharp
foreach (TFragmentType item in otherCollection)
{
    AddAndUpdateTokenInfo(node, collection, item);
}
```

**After:**

```csharp
for (int i = 0; i < otherCollection.Count; i++)
{
    TFragmentType item = otherCollection[i];
    AddAndUpdateTokenInfo(node, collection, item);
}
```

## Benefits

- Eliminates boxing allocation of `List.Enumerator` on hot path
- Reduces GC pressure and improves parser performance
- Maintains identical behavior and semantics
- Minimal code change with no API modifications

* Merged PR 1881483: Add support for 3-part and 4-part identifiers in VECTOR_SEARCH TOP_N parameter

## Summary

This PR enables the VECTOR_SEARCH table-valued function to accept 3-part and 4-part column identifiers in the TOP_N parameter, allowing queries like:

```sql
SELECT qt.qid, src.id, ann.distance
FROM QueryTable qt
CROSS APPLY VECTOR_SEARCH(
    TABLE = graphnode AS src,
    COLUMN = embedding,
    SIMILAR_TO = qt.qembedding,
    METRIC = 'euclidean',
    TOP_N = dbo.qt.top_n -- 3-part identifier now supported
) AS ann
```

Previously, the parser only accepted up to 2-part identifiers (e.g., `qt.top_n`) in the TOP_N parameter, which prevented queries using schema-qualified column references in CROSS APPLY scenarios.

## Changes Made

### Grammar Rule Update

Modified the `vectorSearchColumnReferenceExpression` rule in `SqlScriptDom/Parser/TSql/TSql170.g` to allow up to 4-part identifiers instead of the previous 2-part limit. This change enables the TOP_N parameter to accept:

- Simple identifiers: `@variable` or `columnName`
- 2-part identifiers: `table.column`
- 3-part identifiers: `schema.table.column` (the primary use case from the issue)
- 4-part identifiers: `database.schema.table.column`

The change was minimal: a single parameter was updated from `multiPartIdentifier[2]` to `multiPartIdentifier[4]` in line 33971 of TSql170.g. This aligns with standard SQL Server column reference behavior in other contexts.

### Test Coverage

Added comprehensive test coverage in `Test/SqlDom/TestScripts/VectorSearchCrossApplyTests170.sql` with a test case that validates the exact query pattern from the issue. The test includes:

- A DECLARE statement setting up a VECTOR variable
- A SELECT query with CROSS APPLY using VECTOR_SEARCH
- The TOP_N parameter specified as a 3-part identifier (`dbo.qt.top_n`)

The corresponding baseline file `Test/SqlDom/Baselines170/VectorSearchCrossApplyTests170.sql` captures the expected formatted output from the parser.

Updated `Test/SqlDom/Only170SyntaxTests.cs` to register the new test case with appropriate error counts for older parser versions (TSql80 through TSql160), which still use the 2-part identifier limit and correctly report parse errors for this syntax.

## Impact

This change only affects the TSql170 (SQL Server 2025) parser and is backward compatible. The AST definition already used `ScalarExpression` for the TopN member, so no AST changes were required. Older parser versions continue to enforce the 2-part identifier limit and generate expected parse errors for 3-part identifiers.

The fix enables real-world scenarios where VECTOR_SEARCH is used in CROSS APPLY with outer references that require schema qualification, which is common in production databases with explicit schema naming conventions.

Fixes: #4843961

* Merged PR 1881565: ## Fix VECTOR_SEARCH SIMILAR_TO Parameter to Reject Subqueries

### Summary

The VECTOR_SEARCH function in the TSql170 parser was incorrectly allowing subqueries in the SIMILAR_TO parameter. This fix adds validation to properly reject subqueries and throw an appropriate parse error, aligning with SQL Server 2025 implementation requirements.

### Problem

The SIMILAR_TO parameter was accepting any scalar expression, including subqueries wrapped in parentheses. This allowed invalid syntax like:

```sql
SELECT * FROM VECTOR_SEARCH(
    TABLE = graphnode,
    COLUMN = embedding,
    SIMILAR_TO = (SELECT TOP 1 embedding FROM GTQuery), -- Should error but didn't
    METRIC = 'euclidean',
    TOP_N = 20
) AS ann
```

This syntax should be rejected because subqueries are not supported in the SIMILAR_TO parameter, similar to how they are restricted in other contexts.

### Solution

Added validation logic in the `vectorSearchTableReference` grammar rule (`SqlScriptDom/Parser/TSql/TSql170.g`) that checks if the SIMILAR_TO expression is a `ScalarSubquery` type. When detected, the parser now throws a SQL46098 error with the message "Subqueries are not allowed in this context. Only scalar expressions are allowed."

The validation follows the same pattern used elsewhere in the grammar for similar restrictions and is placed immediately after matching the SIMILAR_TO identifier, before assigning the expression to the result object.

### Testing

Added a comprehensive error test case in `Test/SqlDom/ParserErrorsTests.cs` within the existing `VectorSearchErrorTest170` method. The test verifies that:

- Subqueries in the SIMILAR_TO parameter are properly rejected
- The correct error code (SQL46098) is generated
- The error position is accurately reported

Valid syntax continues to work correctly, including:

- Variables: `SIMILAR_TO = @qv`
- Column references: `SIMILAR_TO = outerref.vector_col`
- Other scalar expressions that are not subqueries

### Impact

- **All 558 existing tests pass** with no regressions
- **Minimal change**: Only 12 lines added across 2 files
- **Consistent behavior**: Uses the same error code and message pattern as other subquery restrictions in the parser
- **Backward compatible**: Only rejects previously invalid syntax that should not have been allowed

Fixes: #4844065

Related work items: #4844065

* Merged PR 1959169: Adding release notes for 170.168.0

# Pull Request Template for ScriptDom

## Description

Please provide a detailed description, include the link to the design specification or SQL feature document for the new TSQL syntaxes. Make sure to add links to the Github or DevDiv issue

Before submitting your pull request, please ensure you have completed the following:

## Code Change

- [ ] The [Common checklist](https://msdata.visualstudio.com/SQLToolsAndLibraries/_git/Common?path=/Templates/PR%20Checklist%20for%20SQLToolsAndLibraries.md&version=GBmain&_a=preview) has been reviewed and followed
- [ ] Code changes are accompanied by appropriate unit tests
- [ ] Identified and included SMEs needed to review code changes
- [ ] Follow the [steps](https://msdata.visualstudio.com/SQLToolsAndLibraries/_wiki/wikis/SQLToolsAndLibraries.wiki/33838/Adding-or-Extending-TSql-support-in-Sql-Dom?anchor=make-the-changes-in) here to make changes in the code

## Testing

- [ ] Follow the [steps](https://msdata.visualstudio.com/SQLToolsAndLibraries/_wiki/wikis/SQLToolsAndLibraries.wiki/33838/Adding-or-Extending-TSql-support-in-Sql-Dom?anchor=to-extend-the-tests-do-the-following%3A) here to add new tests for your feature

## Documentation

- [ ] Update relevant documentation in the [wiki](https://dev.azure.com/msdata/SQLToolsAndLibraries/_wiki/wikis/SQLToolsAndLibraries.wiki) and the README.md file

## Additional Information

Please provide any additional information that might be helpful for the reviewers

Adding release notes for 170.168.0

---------

Co-authored-by: Shiv Prashant Sood <shivsood@microsoft.com>
Co-authored-by: GitHub Copilot <GitHub Copilot>
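The parser changes described in this commit message can be exercised from a small console program. The sketch below is illustrative only: it assumes the `Microsoft.SqlServer.TransactSql.ScriptDom` NuGet package is referenced at a version that contains this release, and the class name `ReleaseProbe` and helper `ParseWith170` are hypothetical names, not part of the repository. It feeds three statements taken from the descriptions above to `TSql170Parser` and prints whatever parse errors come back; the result depends on the package version in use.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.SqlServer.TransactSql.ScriptDom;

// Hypothetical helper class, not part of the ScriptDom repository.
public static class ReleaseProbe
{
    // Parses one batch with the TSql170 parser and returns the reported parse errors.
    public static IList<ParseError> ParseWith170(string sql)
    {
        var parser = new TSql170Parser(true); // true = QUOTED_IDENTIFIER initially on
        using (var reader = new StringReader(sql))
        {
            parser.Parse(reader, out IList<ParseError> errors);
            return errors;
        }
    }

    public static void Main()
    {
        // Statements copied from the PR descriptions in this commit.
        string[] statements =
        {
            // Windowed JSON_ARRAYAGG (PR 1952906).
            "SELECT JSON_ARRAYAGG(name) OVER (PARTITION BY dept) FROM employees;",
            // 3-part identifier in TOP_N (PR 1881483).
            "SELECT qt.qid, src.id, ann.distance FROM QueryTable qt CROSS APPLY VECTOR_SEARCH(TABLE = graphnode AS src, COLUMN = embedding, SIMILAR_TO = qt.qembedding, METRIC = 'euclidean', TOP_N = dbo.qt.top_n) AS ann",
            // Subquery in SIMILAR_TO (PR 1881565); the description says this reports SQL46098.
            "SELECT * FROM VECTOR_SEARCH(TABLE = graphnode, COLUMN = embedding, SIMILAR_TO = (SELECT TOP 1 embedding FROM GTQuery), METRIC = 'euclidean', TOP_N = 20) AS ann"
        };

        foreach (string sql in statements)
        {
            IList<ParseError> errors = ParseWith170(sql);
            Console.WriteLine($"{errors.Count} error(s): {sql}");
            foreach (ParseError error in errors)
            {
                Console.WriteLine($"  SQL{error.Number} at line {error.Line}, column {error.Column}: {error.Message}");
            }
        }
    }
}
```

Swapping `TSql170Parser` for an older parser class such as `TSql160Parser` is one way to observe the version-specific error counts that the test registrations rely on.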
1 parent 5ddf970 commit c6e10b2

12 files changed: +108 -11 lines changed

SqlScriptDom/Parser/TSql/TSql170.g

Lines changed: 15 additions & 1 deletion
@@ -19306,6 +19306,13 @@ vectorSearchTableReference returns [VectorSearchTableReference vResult = Fragmen
         Comma tSimilarTo:Identifier EqualsSign vSimilarTo = expression
         {
             Match(tSimilarTo, CodeGenerationSupporter.SimilarTo);
+
+            // Validate that SIMILAR_TO does not contain a subquery
+            if (vSimilarTo is ScalarSubquery)
+            {
+                ThrowParseErrorException("SQL46098", vSimilarTo, TSqlParserResource.SQL46098Message);
+            }
+
             vResult.SimilarTo = vSimilarTo;
         }
         Comma tMetric:Identifier EqualsSign vMetric = stringLiteral
@@ -33104,6 +33111,7 @@ jsonArrayAggBuiltInFunctionCall [FunctionCall vParent]
 {
     ScalarExpression vExpression;
     OrderByClause vOrderByClause;
+    OverClause vOverClause;
 }
     : (
         vExpression=expression
@@ -33133,6 +33141,12 @@ jsonArrayAggBuiltInFunctionCall [FunctionCall vParent]
         {
             UpdateTokenInfo(vParent, tRParen);
         }
+        (
+            vOverClause=overClauseNoOrderBy
+            {
+                vParent.OverClause = vOverClause;
+            }
+        )?
     ;
 
 jsonObjectBuiltInFunctionCall [FunctionCall vParent]
@@ -33980,7 +33994,7 @@ vectorSearchColumnReferenceExpression returns [ColumnReferenceExpression vResult
     MultiPartIdentifier vMultiPartIdentifier;
 }
     :
-        vMultiPartIdentifier=multiPartIdentifier[2]
+        vMultiPartIdentifier=multiPartIdentifier[4]
         {
             vResult.ColumnType = ColumnType.Regular;
             vResult.MultiPartIdentifier = vMultiPartIdentifier;

SqlScriptDom/Parser/TSql/TSql80ParserBaseInternal.cs

Lines changed: 2 additions & 1 deletion
@@ -292,8 +292,9 @@ protected static void AddAndUpdateTokenInfo<TFragmentType>(TSqlFragment node, IL
         protected static void AddAndUpdateTokenInfo<TFragmentType>(TSqlFragment node, IList<TFragmentType> collection, IList<TFragmentType> otherCollection)
             where TFragmentType : TSqlFragment
         {
-            foreach (TFragmentType item in otherCollection)
+            for (int i = 0; i < otherCollection.Count; i++)
             {
+                TFragmentType item = otherCollection[i];
                 AddAndUpdateTokenInfo(node, collection, item);
             }
         }
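
The effect of this change can be probed in isolation. The following is a minimal sketch, not code from the repository: the class `EnumeratorAllocationDemo` and its methods are hypothetical names, and it assumes a runtime that provides `GC.GetAllocatedBytesForCurrentThread`. It compares bytes allocated when a `List<int>` is walked through the `IList<int>` interface with `foreach` versus an indexed `for` loop. The numbers printed depend on the runtime and JIT, since newer runtimes may optimize the enumerator allocation away.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, not part of the ScriptDom repository.
public static class EnumeratorAllocationDemo
{
    // foreach over the IList<T> interface obtains its enumerator through
    // IEnumerable<T>.GetEnumerator(), which returns an interface reference.
    public static long SumWithForeach(IList<int> items)
    {
        long total = 0;
        foreach (int item in items)
        {
            total += item;
        }
        return total;
    }

    // The indexed loop uses only Count and the indexer, so no enumerator is created.
    public static long SumWithIndexer(IList<int> items)
    {
        long total = 0;
        for (int i = 0; i < items.Count; i++)
        {
            total += items[i];
        }
        return total;
    }

    // Returns the bytes allocated on the current thread while calling sum repeatedly.
    public static long MeasureAllocatedBytes(Func<IList<int>, long> sum, IList<int> items, int iterations)
    {
        sum(items); // warm-up call so first-call JIT work is not measured
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            sum(items);
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        IList<int> items = new List<int> { 1, 2, 3, 4, 5 };
        const int iterations = 100_000;
        Console.WriteLine($"foreach: {MeasureAllocatedBytes(SumWithForeach, items, iterations)} bytes");
        Console.WriteLine($"for:     {MeasureAllocatedBytes(SumWithIndexer, items, iterations)} bytes");
    }
}
```

Both loops visit the same elements in the same order, which is why the change can be made without altering behavior.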

SqlScriptDom/ScriptDom/SqlServer/ScriptGenerator/SqlScriptGeneratorVisitor.FunctionCall.cs

Lines changed: 6 additions & 4 deletions
@@ -94,6 +94,8 @@ public override void ExplicitVisit(FunctionCall node)
                 GenerateSpace();
                 GenerateReturnType(node?.ReturnType);
                 GenerateSymbol(TSqlTokenType.RightParenthesis);
+                // Generate OVER clause for windowed json_arrayagg
+                GenerateSpaceAndFragmentIfNotNull(node.OverClause);
             }
             else if (node.FunctionName.Value.ToUpper(CultureInfo.InvariantCulture) == CodeGenerationSupporter.JsonQuery)
             {
@@ -115,10 +117,10 @@ public override void ExplicitVisit(FunctionCall node)
             else if (node.FunctionName.Value.ToUpper(CultureInfo.InvariantCulture) == CodeGenerationSupporter.JsonValue)
             {
                 GenerateCommaSeparatedList(node.Parameters);
-                if (node.ReturnType?.Count > 0) //If there are return types then generate space and return type clause
-                {
-                    GenerateSpace();
-                    GenerateReturnType(node?.ReturnType);
+                if (node.ReturnType?.Count > 0) //If there are return types then generate space and return type clause
+                {
+                    GenerateSpace();
+                    GenerateReturnType(node?.ReturnType);
                 }
                 GenerateSymbol(TSqlTokenType.RightParenthesis);
             }

Test/SqlDom/Baselines170/JsonArrayAggOrderBy170.sql

Lines changed: 13 additions & 1 deletion
@@ -16,4 +16,16 @@ FROM data;
 SELECT TOP (5) c.object_id,
                JSON_ARRAYAGG(c.name ORDER BY c.column_id) AS column_list
 FROM sys.columns AS c
-GROUP BY c.object_id;
+GROUP BY c.object_id;
+
+SELECT JSON_ARRAYAGG(name) OVER (PARTITION BY dept)
+FROM employees;
+
+SELECT JSON_ARRAYAGG(name ABSENT ON NULL) OVER (PARTITION BY dept)
+FROM employees;
+
+SELECT JSON_ARRAYAGG(name NULL ON NULL) OVER (PARTITION BY dept)
+FROM employees;
+
+SELECT JSON_ARRAYAGG(name ORDER BY name NULL ON NULL RETURNING JSON) OVER (PARTITION BY dept)
+FROM employees;

Test/SqlDom/Baselines170/VectorFunctionTests170.sql

Lines changed: 1 addition & 1 deletion
@@ -45,4 +45,4 @@ WHERE outerref.id IN (SELECT src.id
                           SIMILAR_TO = @qv,
                           METRIC = 'cosine',
                           TOP_N = outerref.max_results
-                      ) AS ann);
+                      ) AS ann);

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+DECLARE @qv AS VECTOR(1536);
+
+SELECT qt.qid,
+       src.id,
+       ann.distance
+FROM QueryTable AS qt CROSS APPLY VECTOR_SEARCH(
+         TABLE = graphnode AS src,
+         COLUMN = embedding,
+         SIMILAR_TO = qt.qembedding,
+         METRIC = 'euclidean',
+         TOP_N = dbo.qt.top_n
+     ) AS ann;

Test/SqlDom/Only170SyntaxTests.cs

Lines changed: 2 additions & 1 deletion
@@ -20,7 +20,7 @@ public partial class SqlDomTests
             new ParserTest170("RegexpTests170.sql", nErrors80: 0, nErrors90: 0, nErrors100: 0, nErrors110: 0, nErrors120: 0, nErrors130: 0, nErrors140: 0, nErrors150: 0, nErrors160: 0),
             new ParserTest170("AiGenerateChunksTests170.sql", nErrors80: 19, nErrors90: 16, nErrors100: 15, nErrors110: 15, nErrors120: 15, nErrors130: 15, nErrors140: 15, nErrors150: 15, nErrors160: 15),
             new ParserTest170("JsonFunctionTests170.sql", nErrors80: 29, nErrors90: 8, nErrors100: 54, nErrors110: 54, nErrors120: 54, nErrors130: 54, nErrors140: 54, nErrors150: 54, nErrors160: 54),
-            new ParserTest170("JsonArrayAggOrderBy170.sql", nErrors80: 6, nErrors90: 6, nErrors100: 6, nErrors110: 6, nErrors120: 6, nErrors130: 6, nErrors140: 6, nErrors150: 6, nErrors160: 6),
+            new ParserTest170("JsonArrayAggOrderBy170.sql", nErrors80: 10, nErrors90: 9, nErrors100: 9, nErrors110: 9, nErrors120: 9, nErrors130: 9, nErrors140: 9, nErrors150: 9, nErrors160: 9),
             new ParserTest170("ComplexJsonObjectFunctionTests170.sql"),
             new ParserTest170("AiGenerateEmbeddingsTests170.sql", nErrors80: 14, nErrors90: 11, nErrors100: 11, nErrors110: 11, nErrors120: 11, nErrors130: 11, nErrors140: 11, nErrors150: 11, nErrors160: 11),
             new ParserTest170("CreateExternalModelStatementTests170.sql", nErrors80: 2, nErrors90: 2, nErrors100: 2, nErrors110: 2, nErrors120: 2, nErrors130: 4, nErrors140: 4, nErrors150: 4, nErrors160: 4),
@@ -32,6 +32,7 @@ public partial class SqlDomTests
             new ParserTest170("RegexpLikeTests170.sql", nErrors80: 15, nErrors90: 15, nErrors100: 15, nErrors110: 18, nErrors120: 18, nErrors130: 18, nErrors140: 18, nErrors150: 18, nErrors160: 18),
             new ParserTest170("OptimizedLockingTests170.sql", nErrors80: 2, nErrors90: 2, nErrors100: 2, nErrors110: 2, nErrors120: 2, nErrors130: 2, nErrors140: 2, nErrors150: 2, nErrors160: 2),
             new ParserTest170("CreateEventSessionNotLikePredicate.sql", nErrors80: 2, nErrors90: 1, nErrors100: 1, nErrors110: 1, nErrors120: 1, nErrors130: 0, nErrors140: 0, nErrors150: 0, nErrors160: 0),
+            new ParserTest170("VectorSearchCrossApplyTests170.sql", nErrors80: 1, nErrors90: 1, nErrors100: 1, nErrors110: 1, nErrors120: 1, nErrors130: 1, nErrors140: 1, nErrors150: 1, nErrors160: 1),
             // Complex query with VECTOR types - parses syntactically in all versions (optimization fix), but VECTOR type only valid in TSql170
             new ParserTest170("ComplexQueryTests170.sql"),
             // Comment preservation tests - basic SQL syntax works in all versions

Test/SqlDom/ParserErrorsTests.cs

Lines changed: 5 additions & 0 deletions
@@ -7544,6 +7544,11 @@ public void VectorSearchErrorTest170()
             ParserTestUtils.ErrorTest170(
                 "SELECT * FROM VECTOR_SEARCH('tbl1', 'col1', 'query_vector', 'dot', 5)",
                 new ParserErrorInfo(28, "SQL46010", "'tbl1'"));
+
+            // Subquery not allowed in SIMILAR_TO parameter
+            ParserTestUtils.ErrorTest170(
+                "SELECT * FROM VECTOR_SEARCH(TABLE = graphnode, COLUMN = embedding, SIMILAR_TO = (SELECT TOP 1 embedding FROM GTQuery), METRIC = 'euclidean', TOP_N = 20) AS ann",
+                new ParserErrorInfo(80, "SQL46098"));
         }
 
         /// <summary>

Test/SqlDom/TestScripts/JsonArrayAggOrderBy170.sql

Lines changed: 13 additions & 1 deletion
@@ -18,4 +18,16 @@ SELECT JSON_ARRAYAGG(value ORDER BY value NULL ON NULL) FROM data;
 -- Real-world example with GROUP BY and system tables
 SELECT TOP(5) c.object_id, JSON_ARRAYAGG(c.name ORDER BY c.column_id) AS column_list
 FROM sys.columns AS c
-GROUP BY c.object_id;
+GROUP BY c.object_id;
+
+-- JSON_ARRAYAGG with OVER clause (PARTITION BY)
+SELECT JSON_ARRAYAGG(name) OVER (PARTITION BY dept) FROM employees;
+
+-- JSON_ARRAYAGG with ABSENT ON NULL and OVER clause
+SELECT JSON_ARRAYAGG(name ABSENT ON NULL) OVER (PARTITION BY dept) FROM employees;
+
+-- JSON_ARRAYAGG with NULL ON NULL and OVER clause
+SELECT JSON_ARRAYAGG(name NULL ON NULL) OVER (PARTITION BY dept) FROM employees;
+
+-- JSON_ARRAYAGG with ORDER BY, NULL ON NULL, RETURNING JSON, and OVER clause
+SELECT JSON_ARRAYAGG(name ORDER BY name NULL ON NULL RETURNING JSON) OVER (PARTITION BY dept) FROM employees;

Test/SqlDom/TestScripts/VectorFunctionTests170.sql

Lines changed: 1 addition & 1 deletion
@@ -52,4 +52,4 @@ WHERE outerref.id IN (
         METRIC = 'cosine',
         TOP_N = outerref.max_results
     ) AS ann
-)
+)
