Updated section headers for consistency and clarity in the documentation regarding the GROUP BY clause in Transact-SQL. Added details about support for ISO and ANSI SQL-2006 features.
Revise GROUP BY options in T-SQL documentation
Updated the documentation for GROUP BY options in T-SQL, including allowed and disallowed statements, and added details on hierarchical aggregation and multidimensional summarization.
Update documentation for GROUP BY syntax in T-SQL
Clarified syntax for Analytics Platform System and updated explanation for WITH (DISTRIBUTED_AGG).
Revise GROUP BY ROLLUP and grouping sets documentation
Updated the documentation for GROUP BY ROLLUP and clarified restrictions on grouping sets in Transact-SQL.
Clarify non-aggregate and compatibility terms
Update documentation for GROUP BY syntax
Update docs/t-sql/queries/select-group-by-transact-sql.md
Co-authored-by: Randolph West MSFT <97149825+rwestMSFT@users.noreply.github.com>
Update docs/t-sql/queries/select-group-by-transact-sql.md
Co-authored-by: Randolph West MSFT <97149825+rwestMSFT@users.noreply.github.com>
Update docs/t-sql/queries/select-group-by-transact-sql.md
Co-authored-by: Randolph West MSFT <97149825+rwestMSFT@users.noreply.github.com>
Update docs/t-sql/queries/select-group-by-transact-sql.md
Co-authored-by: Randolph West MSFT <97149825+rwestMSFT@users.noreply.github.com>
Update docs/t-sql/queries/select-group-by-transact-sql.md
Co-authored-by: Randolph West MSFT <97149825+rwestMSFT@users.noreply.github.com>
Update docs/t-sql/queries/select-group-by-transact-sql.md
Co-authored-by: Randolph West MSFT <97149825+rwestMSFT@users.noreply.github.com>
Update docs/t-sql/queries/select-group-by-transact-sql.md
Co-authored-by: Randolph West MSFT <97149825+rwestMSFT@users.noreply.github.com>
Apply suggestion from @rwestMSFT
Co-authored-by: Randolph West MSFT <97149825+rwestMSFT@users.noreply.github.com>
Fix formatting in GROUP BY restrictions section
Apply suggestions from code review
Co-authored-by: Van To <40007119+VanMSFT@users.noreply.github.com>
docs/t-sql/queries/select-group-by-transact-sql.md (88 additions, 68 deletions)
--- a/docs/t-sql/queries/select-group-by-transact-sql.md
+++ b/docs/t-sql/queries/select-group-by-transact-sql.md
@@ -90,7 +90,7 @@ GROUP BY {
 } [ , ...n ]
 ```
 
-Syntax for Analytics Platform System (PDW):
+Syntax for Analytics Platform System/Parallel Data Warehouse (APS/PDW):
 
 ```syntaxsql
 GROUP BY {
@@ -103,55 +103,32 @@ GROUP BY {
 
 ### *column-expression*
 
-Specifies a column or a non-aggregate calculation on a column. This column can belong to a table, derived table, or view. The column must appear in the `FROM` clause of the `SELECT` statement, but doesn't need to appear in the `SELECT` list.
+Specifies a column or a nonaggregate calculation on a column. This column can belong to a table, derived table, or view. The column must appear in the `FROM` clause of the `SELECT` statement, but doesn't need to appear in the `SELECT` list.
 
 For valid expressions, see [expression](../language-elements/expressions-transact-sql.md).
 
 The column must appear in the `FROM` clause of the `SELECT` statement, but isn't required to appear in the `SELECT` list. However, each table or view column in any nonaggregate expression in the `<select>` list must be included in the `GROUP BY` list.
 
-The following statements are allowed:
-
-```sql
-SELECT ColumnA,
-       ColumnB
-FROM T
-GROUP BY ColumnA, ColumnB;
-
-SELECT ColumnA + ColumnB
-FROM T
-GROUP BY ColumnA, ColumnB;
-
-SELECT ColumnA + ColumnB
-FROM T
-GROUP BY ColumnA + ColumnB;
-
-SELECT ColumnA + ColumnB + constant
-FROM T
-GROUP BY ColumnA, ColumnB;
-```
-
-The following statements aren't allowed:
-
-```sql
-SELECT ColumnA,
-       ColumnB
-FROM T
-GROUP BY ColumnA + ColumnB;
-
-SELECT ColumnA + constant + ColumnB
-FROM T
-GROUP BY ColumnA + ColumnB;
-```
+### GROUP BY options
 
-The column expression can't contain:
+The following options extend the basic `GROUP BY` clause to support hierarchical aggregation, multidimensional summarization, custom grouping combinations, and platform-specific execution behaviors. These options allow queries to produce subtotals and grand totals in a single logical operation.
 
-- A column alias that you define in the `SELECT` list. It can use a column alias for a derived table that is defined in the `FROM` clause.
-- A column of type **text**, **ntext**, or **image**. However, you can use a column of text, ntext, or image as an argument to a function that returns a value of a valid data type. For example, the expression can use `SUBSTRING()` and `CAST()`. This rule also applies to expressions in the `HAVING` clause.
-- xml data type methods. It can include a user-defined function that uses xml data type methods. It can include a computed column that uses xml data type methods.
-- A subquery. Error 144 is returned.
-- A column from an indexed view.
+- **ROLLUP ( <group_by_expression> [ , ...n ] )**
+Generates hierarchical subtotals for the listed columns and a final grand total (for example, `(a,b,c)`, `(a,b)`, `(a)`, `()`). Use for drill-up reports like **year** > **quarter** > **month**.
+- **CUBE ( <group_by_expression> [ , ...n ] )**
+Produces all combinations of the specified columns (the full 2^n lattice), including the grand total. Best suited for multidimensional analysis across every slice.
+- **GROUPING SETS ( <grouping_set> [ , ...n ] )**
+Defines the exact groupings to compute (including `()` for the grand total) in one pass; functionally similar to a `UNION ALL` of multiple `GROUP BY` queries but optimized together.
+- **`()` (empty grouping set)**
+Shorthand for computing only the **grand total** across all rows. Use it alone as `GROUP BY ()` or inside `GROUPING SETS`.
+Older, non-ISO syntax equivalent to `GROUP BY CUBE(...)` or `GROUP BY ROLLUP(...)`. Supported for backward compatibility; use the ISO subclauses when possible.
+- **WITH (DISTRIBUTED_AGG)**
+Hints distributed execution for aggregations when grouping by a single column. It's supported only in Azure Synapse Analytics dedicated SQL pools and Analytics Platform System/Parallel Data Warehouse (APS/PDW).
 
-### GROUP BY *column-expression* [ ,...n ]
+## GROUP BY *column-expression* [ ,...n ]
 
 Groups the `SELECT` statement results according to the values in a list of one or more column expressions.
 
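The backward-compatibility note in the options list above can be illustrated with a minimal sketch. It assumes a hypothetical `@Sales` table variable with made-up rows (neither is part of the changed file). Both queries are expected to produce the same groups: the first uses the ISO `ROLLUP` subclause, the second the older `WITH ROLLUP` form.

```sql
-- Hypothetical sample data; table name, columns, and values are illustrative only.
DECLARE @Sales TABLE (Region varchar(50), Territory varchar(50), Sales int);

INSERT INTO @Sales (Region, Territory, Sales)
VALUES ('Canada', 'Alberta', 100),
       ('Canada', 'British Columbia', 200),
       ('Canada', 'British Columbia', 300),
       ('United States', 'Montana', 100);

-- ISO subclause: preferred for new development.
SELECT Region, Territory, SUM(Sales) AS TotalSales
FROM @Sales
GROUP BY ROLLUP (Region, Territory);

-- Older, non-ISO form: kept for backward compatibility.
SELECT Region, Territory, SUM(Sales) AS TotalSales
FROM @Sales
GROUP BY Region, Territory WITH ROLLUP;
```

Row order isn't guaranteed without an `ORDER BY`, so compare the two result sets as sets rather than line by line.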
@@ -199,9 +176,51 @@ The query result has three rows since there are three combinations of values for
 | Canada | British Columbia | 500 |
 | United States | Montana | 100 |
 
-### GROUP BY ROLLUP
+The column expression in `GROUP BY` can't contain:
+
+- A column alias that you define in the `SELECT` list. It can use a column alias for a derived table that's defined in the `FROM` clause.
+- A column of type **text**, **ntext**, or **image**. However, you can use a column of **text**, **ntext**, or **image** as an argument to a function that returns a value of a valid data type. For example, the expression can use `SUBSTRING()` and `CAST()`. This rule also applies to expressions in the `HAVING` clause.
+- **xml** data type methods. It can include a user-defined function that uses **xml** data type methods. It can include a computed column that uses **xml** data type methods.
+- A subquery. The query returns error 144.
+- A column from an indexed view.
+
+The following statements are allowed:
+
+```sql
+SELECT ColumnA,
+       ColumnB
+FROM T
+GROUP BY ColumnA, ColumnB;
+
+SELECT ColumnA + ColumnB
+FROM T
+GROUP BY ColumnA, ColumnB;
+
+SELECT ColumnA + ColumnB
+FROM T
+GROUP BY ColumnA + ColumnB;
+
+SELECT ColumnA + ColumnB + constant
+FROM T
+GROUP BY ColumnA, ColumnB;
+```
+
+The following statements aren't allowed:
+
+```sql
+SELECT ColumnA,
+       ColumnB
+FROM T
+GROUP BY ColumnA + ColumnB;
+
+SELECT ColumnA + constant + ColumnB
+FROM T
+GROUP BY ColumnA + ColumnB;
+```
+
+## GROUP BY ROLLUP ()
 
-Creates a group for each combination of column expressions. In addition, it "rolls up" the results into subtotals and grand totals. To do this, it moves from right to left, decreasing the number of column expressions over which it creates groups and the aggregations.
+Creates a group for each combination of column expressions. In addition, it *rolls up* the results into subtotals and grand totals. While creating the groups, it moves from right to left, decreasing the number of column expressions over which it creates groups and the aggregations.
 
 The column order affects the `ROLLUP` output and can affect the number of rows in the result set.
 
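The alias restriction in the relocated list can be sketched as follows. The `@Orders` table variable, its columns, and its rows are assumptions made for illustration and don't come from the changed file. The sketch shows two ways to group by a computed value without referring to a `SELECT`-list alias.

```sql
-- Hypothetical sample data; names and values are illustrative only.
DECLARE @Orders TABLE (OrderDate date, Amount int);

INSERT INTO @Orders (OrderDate, Amount)
VALUES ('2023-01-15', 10),
       ('2023-06-01', 20),
       ('2024-02-10', 5);

-- Not allowed: OrderYear is an alias defined in the same SELECT list.
-- SELECT YEAR(OrderDate) AS OrderYear, SUM(Amount) AS Total
-- FROM @Orders
-- GROUP BY OrderYear;

-- Allowed: repeat the expression in the GROUP BY clause.
SELECT YEAR(OrderDate) AS OrderYear, SUM(Amount) AS Total
FROM @Orders
GROUP BY YEAR(OrderDate);

-- Allowed: the alias belongs to a derived table defined in the FROM clause.
SELECT d.OrderYear, SUM(d.Amount) AS Total
FROM (SELECT YEAR(OrderDate) AS OrderYear, Amount FROM @Orders) AS d
GROUP BY d.OrderYear;
```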
@@ -211,7 +230,7 @@ For example, `GROUP BY ROLLUP (col1, col2, col3, col4)` creates groups for each
 - col1, col2, col3, NULL
 - col1, col2, NULL, NULL
 - col1, NULL, NULL, NULL
-- NULL, NULL, NULL, NULL (This is the grand total)
+- NULL, NULL, NULL, NULL (The group with all NULL values is the grand total)
 
 Using the table from the previous example, this code runs a `GROUP BY ROLLUP` operation instead of a simple `GROUP BY`.
 
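The right-to-left rule listed above can also be written out as an explicit `GROUPING SETS` list. This is a minimal sketch that assumes a hypothetical `@Sales` table variable with made-up rows. For two columns, `ROLLUP (Region, Territory)` is expected to produce the same groups as the three grouping sets in the second query.

```sql
-- Hypothetical sample data; names and values are illustrative only.
DECLARE @Sales TABLE (Region varchar(50), Territory varchar(50), Sales int);

INSERT INTO @Sales (Region, Territory, Sales)
VALUES ('Canada', 'Alberta', 100),
       ('Canada', 'British Columbia', 200),
       ('Canada', 'British Columbia', 300),
       ('United States', 'Montana', 100);

-- ROLLUP drops columns from right to left.
SELECT Region, Territory, SUM(Sales) AS TotalSales
FROM @Sales
GROUP BY ROLLUP (Region, Territory);

-- The same groups spelled out: detail rows, per-region subtotals, grand total.
SELECT Region, Territory, SUM(Sales) AS TotalSales
FROM @Sales
GROUP BY GROUPING SETS ((Region, Territory), (Region), ());
```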
@@ -234,7 +253,7 @@ The query result has the same aggregations as the simple `GROUP BY` without the
 | United States | NULL | 100 |
 | NULL | NULL | 700 |
 
-### GROUP BY CUBE ()
+## GROUP BY CUBE ()
 
 `GROUP BY CUBE` creates groups for all possible combinations of columns. For `GROUP BY CUBE (a, b)`, the results have groups for unique values of `(a, b)`, `(NULL, b)`, `(a, NULL)`, and `(NULL, NULL)`.
 
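The four group shapes named for `CUBE (a, b)` map directly onto an explicit grouping-set list. This is a sketch under the assumption of a hypothetical `@Sales` table variable with made-up rows. The second query enumerates every combination that `CUBE (Region, Territory)` is expected to generate.

```sql
-- Hypothetical sample data; names and values are illustrative only.
DECLARE @Sales TABLE (Region varchar(50), Territory varchar(50), Sales int);

INSERT INTO @Sales (Region, Territory, Sales)
VALUES ('Canada', 'Alberta', 100),
       ('Canada', 'British Columbia', 200),
       ('Canada', 'British Columbia', 300),
       ('United States', 'Montana', 100);

-- CUBE over two columns yields 2^2 = 4 grouping sets.
SELECT Region, Territory, SUM(Sales) AS TotalSales
FROM @Sales
GROUP BY CUBE (Region, Territory);

-- The same four grouping sets written explicitly.
SELECT Region, Territory, SUM(Sales) AS TotalSales
FROM @Sales
GROUP BY GROUPING SETS ((Region, Territory), (Region), (Territory), ());
```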
@@ -262,7 +281,7 @@ The query result has groups for unique values of `(Region, Territory)`, `(NULL,
 | Canada | NULL | 600 |
 | United States | NULL | 100 |
 
-### GROUP BY GROUPING SETS ()
+## GROUP BY GROUPING SETS ()
 
 The `GROUPING SETS` option combines multiple `GROUP BY` clauses into one `GROUP BY` clause. The results are the same as using `UNION ALL` on the specified groups.
 
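The `UNION ALL` equivalence stated above can be sketched with two single-column grouping sets. The `@Sales` table variable and its rows are assumptions made for illustration. The explicit `CAST` on the `NULL` placeholders is a defensive choice in this sketch so that both branches of the union agree on column types.

```sql
-- Hypothetical sample data; names and values are illustrative only.
DECLARE @Sales TABLE (Region varchar(50), Territory varchar(50), Sales int);

INSERT INTO @Sales (Region, Territory, Sales)
VALUES ('Canada', 'Alberta', 100),
       ('Canada', 'British Columbia', 200),
       ('Canada', 'British Columbia', 300),
       ('United States', 'Montana', 100);

-- One GROUP BY clause with two grouping sets.
SELECT Region, Territory, SUM(Sales) AS TotalSales
FROM @Sales
GROUP BY GROUPING SETS (Region, Territory);

-- The same groups as two GROUP BY queries combined with UNION ALL.
SELECT Region, CAST(NULL AS varchar(50)) AS Territory, SUM(Sales) AS TotalSales
FROM @Sales
GROUP BY Region
UNION ALL
SELECT CAST(NULL AS varchar(50)) AS Region, Territory, SUM(Sales) AS TotalSales
FROM @Sales
GROUP BY Territory;
```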
@@ -296,6 +315,14 @@ GROUP BY CUBE(Region, Territory);
 
 SQL doesn't consolidate duplicate groups generated for a `GROUPING SETS` list. For example, in `GROUP BY ((), CUBE (Region, Territory))`, both elements return a row for the grand total, and both rows appear in the results.
 
+### Support for ISO and ANSI SQL-2006 GROUP BY features
+
+The `GROUP BY` clause supports all `GROUP BY` features that are included in the SQL-2006 standard with the following syntax exceptions:
+
+- Grouping sets aren't allowed in the `GROUP BY` clause unless they're part of an explicit `GROUPING SETS` list. For example, `GROUP BY Column1, (Column2, ...ColumnN)` is allowed in the standard but not in Transact-SQL. Transact-SQL supports `GROUP BY Column1, GROUPING SETS ((Column2, ...ColumnN))` and `GROUP BY Column1, Column2, ... ColumnN`, which are semantically equivalent to each other and to the previous `GROUP BY` example. This restriction avoids the possibility that `GROUP BY Column1, (Column2, ...ColumnN)` could be misinterpreted as `GROUP BY Column1, GROUPING SETS (Column2, ...ColumnN)`, which isn't semantically equivalent.
+
+- Grouping sets aren't allowed inside grouping sets. For example, `GROUP BY GROUPING SETS (A1, A2,...An, GROUPING SETS (C1, C2, ...Cn))` is allowed in the SQL-2006 standard but not in Transact-SQL. Transact-SQL allows `GROUP BY GROUPING SETS( A1, A2,...An, C1, C2, ...Cn)` or `GROUP BY GROUPING SETS( (A1), (A2), ... (An), (C1), (C2), ... (Cn))`, which are semantically equivalent to the first `GROUP BY` example and have clearer syntax.
+
 ### GROUP BY ()
 
 Specifies the empty group, which generates the grand total. This group is useful as one of the elements of a `GROUPING SET`. For example, this statement gives the total sales for each region and then gives the grand total for all regions.
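The first syntax exception added in this hunk can be sketched with three grouping columns. The `@T` table variable, its column names, and its rows are assumptions chosen to mirror the `Column1 ... ColumnN` placeholders in the text. The two accepted forms are expected to return the same groups.

```sql
-- Hypothetical sample data; names and values are illustrative only.
DECLARE @T TABLE (Column1 int, Column2 int, Column3 int, Amount int);

INSERT INTO @T (Column1, Column2, Column3, Amount)
VALUES (1, 10, 100, 5),
       (1, 10, 100, 7),
       (1, 20, 200, 3),
       (2, 10, 100, 4);

-- Allowed by the standard but not by Transact-SQL, per the text above:
-- GROUP BY Column1, (Column2, Column3)

-- Accepted form 1: the composite set sits inside an explicit GROUPING SETS list.
SELECT Column1, Column2, Column3, SUM(Amount) AS Total
FROM @T
GROUP BY Column1, GROUPING SETS ((Column2, Column3));

-- Accepted form 2: a plain column list that produces the same groups.
SELECT Column1, Column2, Column3, SUM(Amount) AS Total
FROM @T
GROUP BY Column1, Column2, Column3;
```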
@@ -307,7 +334,7 @@ FROM Sales
 GROUP BY GROUPING SETS(Region, ());
 ```
 
-### GROUP BY ALL column-expression [ ,...n ]
+## GROUP BY ALL column-expression [ ,...n ]
 
 **Applies to**: SQL Server and Azure SQL Database
 
@@ -321,21 +348,26 @@ Specifies whether to include all groups in the results, regardless of whether th
 - Isn't supported in queries that access remote tables if there's also a `WHERE` clause in the query.
 - Fails on columns that have the FILESTREAM attribute.
 
-### GROUP BY column-expression [ ,...n ] WITH { CUBE | ROLLUP }
+### Support for ISO and ANSI SQL-2006 GROUP BY Features
+
+The `GROUP BY` clause supports all `GROUP BY` features that are included in the SQL-2006 standard with the following syntax exceptions:
+- `GROUP BY ALL` and `GROUP BY DISTINCT` are only allowed in a simple `GROUP BY` clause that contains column expressions. You can't use them with the `GROUPING SETS`, `ROLLUP`, `CUBE`, `WITH CUBE`, or `WITH ROLLUP` constructs. `ALL` is the default and is implicit. It's also only allowed in the backward compatible syntax.
+
+## GROUP BY column-expression [ ,...n ] WITH { CUBE | ROLLUP }
 
 **Applies to**: SQL Server and Azure SQL Database
 
 > [!NOTE]
 > Use this syntax only for backward compatibility. Avoid using this syntax in new development work, and plan to modify applications that currently use this syntax.
 
-### WITH (DISTRIBUTED_AGG)
+## WITH (DISTRIBUTED_AGG)
 
 **Applies to**: [!INCLUDE [ssazuresynapse-md](../../includes/ssazuresynapse-md.md)] and [!INCLUDE [ssPDW](../../includes/sspdw-md.md)]
 
 The `DISTRIBUTED_AGG` query hint forces the massively parallel processing (MPP) system to redistribute a table on a specific column before performing an aggregation. Only one column in the `GROUP BY` clause can have a `DISTRIBUTED_AGG` query hint. After the query finishes, the redistributed table is dropped. The original table isn't changed.
 
 > [!NOTE]
-> The `DISTRIBUTED_AGG` query hint is provided for backward compatibility with earlier [!INCLUDE [ssPDW](../../includes/sspdw-md.md)] versions and doesn't improve performance for most queries. By default, MPP already redistributes data as necessary to improve performance for aggregations.
+> The `DISTRIBUTED_AGG` query hint provides backward compatibility with earlier [!INCLUDE [ssPDW](../../includes/sspdw-md.md)] versions and doesn't improve performance for most queries. By default, MPP already redistributes data as necessary to improve performance for aggregations.
 
 ## Remarks
 
@@ -362,12 +394,10 @@ The `DISTRIBUTED_AGG` query hint forces the massively parallel processing (MPP)
 
 - If a grouping column contains `NULL` values, all `NULL` values are considered equal, and they're collected into a single group.
 
-## Limitations
+### Limitations
 
 **Applies to**: SQL Server and [!INCLUDE [ssazuresynapse-md](../../includes/ssazuresynapse-md.md)]
 
-### Maximum capacity
-
 For a `GROUP BY` clause that uses `ROLLUP`, `CUBE`, or `GROUPING SETS`, the maximum number of expressions is 32. The maximum number of groups is 4,096 (2<sup>12</sup>). The following examples fail because the `GROUP BY` clause has more than 4,096 groups.
 
 - The following example generates 4,097 (2<sup>12</sup> + 1) grouping sets and then fails.
@@ -389,17 +419,7 @@ For a `GROUP BY` clause that uses `ROLLUP`, `CUBE`, or `GROUPING SETS`, the maxi
 GROUP BY a1, ..., a13 WITH CUBE
 ```
 
-For backward compatible `GROUP BY` clauses that don't contain `CUBE` or `ROLLUP`, the number of `GROUP BY` items is limited by the `GROUP BY` column sizes, the aggregated columns, and the aggregate values involved in the query. This limit originates from the limit of 8,060 bytes on the intermediate worktable that holds intermediate query results. A maximum of 12 grouping expressions is permitted when `CUBE` or `ROLLUP` is specified.
-
-### Support for ISO and ANSI SQL-2006 GROUP BY Features
-
-The `GROUP BY` clause supports all `GROUP BY` features that are included in the SQL-2006 standard with the following syntax exceptions:
-
-- Grouping sets aren't allowed in the `GROUP BY` clause unless they're part of an explicit `GROUPING SETS` list. For example, `GROUP BY Column1, (Column2, ...ColumnN)` is allowed in the standard but not in Transact-SQL. Transact-SQL supports `GROUP BY C1, GROUPING SETS ((Column2, ...ColumnN))` and `GROUP BY Column1, Column2, ... ColumnN`, which are semantically equivalent. These clauses are semantically equivalent to the previous `GROUP BY` example. This restriction avoids the possibility that `GROUP BY Column1, (Column2, ...ColumnN)` might be misinterpreted as `GROUP BY C1, GROUPING SETS ((Column2, ...ColumnN))`, which aren't semantically equivalent.
-
-- Grouping sets aren't allowed inside grouping sets. For example, `GROUP BY GROUPING SETS (A1, A2,...An, GROUPING SETS (C1, C2, ...Cn))` is allowed in the SQL-2006 standard but not in Transact-SQL. Transact-SQL allows `GROUP BY GROUPING SETS( A1, A2,...An, C1, C2, ...Cn)` or `GROUP BY GROUPING SETS( (A1), (A2), ... (An), (C1), (C2), ... (Cn))`, which are semantically equivalent to the first `GROUP BY` example and have clearer syntax.
-
-- `GROUP BY ALL` and `GROUP BY DISTINCT` are only allowed in a simple `GROUP BY` clause that contains column expressions. You can't use them with the `GROUPING SETS`, `ROLLUP`, `CUBE`, `WITH CUBE`, or `WITH ROLLUP` constructs. `ALL` is the default and is implicit. It's also only allowed in the backward compatible syntax.
+For backward compatible `GROUP BY` clauses that don't contain `CUBE` or `ROLLUP`, the `GROUP BY` column sizes, the aggregated columns, and the aggregate values involved in the query limit the number of `GROUP BY` items. This limit originates from the limit of 8,060 bytes on the intermediate worktable that holds intermediate query results. You can use a maximum of 12 grouping expressions when you specify `CUBE` or `ROLLUP`.
 
 ### Comparison of supported `GROUP BY` features
 
@@ -408,7 +428,7 @@ The following table describes the `GROUP BY` features that different SQL Server
 | Feature | SQL Server Integration Services | SQL Server compatibility level 100 or higher |
 | --- | --- | --- |
 |`DISTINCT` aggregates | Not supported for `WITH CUBE` or `WITH ROLLUP`. | Supported for `WITH CUBE`, `WITH ROLLUP`, `GROUPING SETS`, `CUBE`, or `ROLLUP`. |
-| User-defined function with `CUBE` or `ROLLUP` name in the `GROUP BY` clause | User-defined function `dbo.cube(<arg1>, ...<argN>)` or `dbo.rollup(<arg1>, ...<argN>)` in the `GROUP BY` clause is allowed.<br /><br />For example: `SELECT SUM (x) FROM T GROUP BY dbo.cube(y);`| User-defined function `dbo.cube (<arg1>, ...<argN>)` or `dbo.rollup(arg1>, ...<argN>)` in the `GROUP BY` clause isn't allowed.<br /><br />For example: `SELECT SUM (x) FROM T GROUP BY dbo.cube(y);`<br /><br />The following error message is returned: "Incorrect syntax near the keyword 'cube'|'rollup'."<br /><br />To avoid this problem, replace `dbo.cube` with `[dbo].[cube]` or `dbo.rollup` with `[dbo].[rollup]`.<br /><br />The following example is allowed: `SELECT SUM (x) FROM T GROUP BY [dbo].[cube](y);`|
+| User-defined function with `CUBE` or `ROLLUP` name in the `GROUP BY` clause | User-defined function `dbo.cube(<arg1>, ...<argN>)` or `dbo.rollup(<arg1>, ...<argN>)` in the `GROUP BY` clause is allowed.<br /><br />For example: `SELECT SUM (x) FROM T GROUP BY dbo.cube(y);`| User-defined function `dbo.cube(<arg1>, ...<argN>)` or `dbo.rollup(<arg1>, ...<argN>)` in the `GROUP BY` clause isn't allowed.<br /><br />For example: `SELECT SUM (x) FROM T GROUP BY dbo.cube(y);`<br /><br />SQL Server returns the following error message: "Incorrect syntax near the keyword 'cube'|'rollup'."<br /><br />To avoid this problem, replace `dbo.cube` with `[dbo].[cube]` or `dbo.rollup` with `[dbo].[rollup]`.<br /><br />The following example is allowed: `SELECT SUM (x) FROM T GROUP BY [dbo].[cube](y);`|
 |`GROUPING SETS`| Not supported | Supported |
 |`CUBE`| Not supported | Supported |
 |`ROLLUP`| Not supported | Supported |
@@ -472,11 +492,11 @@ HAVING DATEPART(yyyy, OrderDate) >= N'2003'
 ORDER BY DATEPART(yyyy, OrderDate);
 ```
 
-## Examples: Azure Synapse Analytics and Analytics Platform System (PDW)
+## Examples: Azure Synapse Analytics and Analytics Platform System / Parallel Data Warehouse (PDW)
 
 ### E. Basic use of the GROUP BY clause
 
-The following example finds the total amount for all sales on each day. One row containing the sum of all sales is returned for each day.
+The following example finds the total amount for all sales on each day. The query returns one row containing the sum of all sales for each day.