Refresher





The GROUP BY clause can explicitly appear in a SELECT statement, as in this example:


SELECT column1 FROM Table1
  GROUP BY column1

Grouping also happens implicitly if there is a HAVING clause or a set function, as in this example:


SELECT COUNT(*) FROM Table1
  HAVING COUNT(*) = 5
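
One way to see implicit grouping at work is to run a set function against an empty table. The sketch below is only an illustration: it uses Python's sqlite3 module with an in-memory SQLite database (SQLite is not one of the Big Eight, so it stands in purely for convenience), and the empty Table1 is a hypothetical test table.

```python
import sqlite3

# Assumption: SQLite stands in for a real DBMS; Table1 is a hypothetical empty table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (column1 INTEGER)")

# A set function with no GROUP BY: the whole table is treated as one group,
# so one row comes back even though the table is empty.
implicit_rows = conn.execute("SELECT COUNT(*) FROM Table1").fetchall()

# An explicit GROUP BY on the same empty table: there are no groups, so no rows.
explicit_rows = conn.execute(
    "SELECT COUNT(*) FROM Table1 GROUP BY column1"
).fetchall()

print(implicit_rows)  # [(0,)]
print(explicit_rows)  # []
```

The difference in row counts is the visible sign that the first query was grouped implicitly.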

In standard SQL—and with InterBase and Microsoft—a GROUP BY column may be followed by a COLLATE clause:


SELECT column1 FROM Table1
  GROUP BY column1,
           column2 COLLATE SQL_Latin1_General
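
The effect of a COLLATE clause on grouping can be sketched with a small experiment. This is an illustration under assumptions, not a test of any of the DBMSs named above: it uses SQLite through Python's sqlite3 module, SQLite's built-in NOCASE collation in place of SQL_Latin1_General, and made-up sample rows.

```python
import sqlite3

# Assumption: SQLite and its built-in NOCASE collation stand in for a real
# DBMS and collation; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (column1 TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?)",
    [("abc",), ("ABC",), ("Abc",), ("xyz",)],
)

# Default (binary) collation: every distinct spelling is its own group.
default_counts = sorted(
    row[0]
    for row in conn.execute("SELECT COUNT(*) FROM Table1 GROUP BY column1")
)

# Case-insensitive collation: the three spellings of 'abc' fall into one group.
nocase_counts = sorted(
    row[0]
    for row in conn.execute(
        "SELECT COUNT(*) FROM Table1 GROUP BY column1 COLLATE NOCASE"
    )
)

print(default_counts)  # [1, 1, 1, 1]
print(nocase_counts)   # [1, 3]
```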

In a nonstandard SQL extension supported by Ingres, Microsoft, MySQL, Oracle, and Sybase, a GROUP BY column may contain an expression:


SELECT LOWER(column1) FROM Table1
  GROUP BY LOWER(column1)
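
A minimal sketch of grouping by an expression follows. It assumes SQLite (via Python's sqlite3 module) as a stand-in DBMS and uses hypothetical sample names; a COUNT(*) column and an ORDER BY are added so that the groups are easy to inspect.

```python
import sqlite3

# Assumption: SQLite stands in for a real DBMS; the names are sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (column1 TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?)",
    [("Smith",), ("SMITH",), ("smith",), ("Jones",)],
)

# The grouping key is the result of LOWER(column1), not the stored value,
# so all three spellings of 'smith' end up in the same group.
rows = conn.execute(
    "SELECT LOWER(column1), COUNT(*) FROM Table1 "
    "GROUP BY LOWER(column1) "
    "ORDER BY LOWER(column1)"
).fetchall()

print(rows)  # [('jones', 1), ('smith', 3)]
```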

And in ANSI SQL:1999, IBM, Microsoft, and Oracle, a GROUP BY clause may contain CUBE or ROLLUP to indicate another level of grouping, like this:


SELECT column1, column2 FROM Table1
  GROUP BY CUBE (column1, column2)
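
To make the "another level of grouping" concrete, the sketch below spells out by hand what a two-column CUBE asks for: one grouping per subset of the grouping columns, with NULL standing in for the columns that were summarized away. It is an emulation under assumptions. SQLite (used here through Python's sqlite3 module) has no CUBE, so the groupings are written as a UNION ALL; the sample rows and the added COUNT(*) column are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for a real DBMS; it has no CUBE, so the four
# groupings that CUBE (column1, column2) implies are written out by hand.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (column1 TEXT, column2 INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [("a", 1), ("a", 2), ("b", 1)],
)

cube_rows = conn.execute(
    # Detail level: every (column1, column2) combination.
    "SELECT column1, column2, COUNT(*) FROM Table1 GROUP BY column1, column2 "
    "UNION ALL "
    # Subtotals per column1.
    "SELECT column1, NULL, COUNT(*) FROM Table1 GROUP BY column1 "
    "UNION ALL "
    # Subtotals per column2.
    "SELECT NULL, column2, COUNT(*) FROM Table1 GROUP BY column2 "
    "UNION ALL "
    # Grand total.
    "SELECT NULL, NULL, COUNT(*) FROM Table1"
).fetchall()

for row in cube_rows:
    print(row)
```

With these three sample rows the query returns eight rows: three detail rows, two subtotals per column1, two subtotals per column2, and one grand total. ROLLUP would keep only the detail rows, the column1 subtotals, and the grand total.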

The SQL Standard says that the correct name for an aggregate function is "set function," and the required set functions are AVG, COUNT, MAX, MIN, and SUM. In contrast, DBMS vendors prefer the term "aggregate function," and some provide extra built-in functions—for example, for standard deviation (STDEV) and/or variance (VAR)—in addition to the standard set functions. Some DBMSs also allow users to create their own aggregate functions. For example:


SELECT AVG(column1), STDEV(column1), UDF1(column1)
  FROM Table1
  WHERE column1 > 55
  GROUP BY column1
  ORDER BY column1
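
As a rough illustration of a user-created aggregate function, the sketch below registers one with SQLite through Python's sqlite3 module. Everything specific here is an assumption: SQLite stands in for a real DBMS, the SampleStdev class is hypothetical, and the name STDEV is borrowed only to echo the example above (it computes a sample standard deviation). The GROUP BY is left out so that all qualifying rows form a single group.

```python
import sqlite3
import statistics


class SampleStdev:
    """Hypothetical user-defined aggregate: sample standard deviation."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Called once per row in the group; NULLs are skipped,
        # as the standard set functions do.
        if value is not None:
            self.values.append(value)

    def finalize(self):
        # Called once per group; fewer than two values gives NULL.
        if len(self.values) < 2:
            return None
        return statistics.stdev(self.values)


# Assumption: SQLite stands in for a real DBMS; the rows are sample data.
conn = sqlite3.connect(":memory:")
conn.create_aggregate("STDEV", 1, SampleStdev)
conn.execute("CREATE TABLE Table1 (column1 INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?)", [(56,), (57,), (58,), (10,)]
)

# The built-in AVG and the user-defined STDEV are used side by side.
row = conn.execute(
    "SELECT AVG(column1), STDEV(column1) FROM Table1 WHERE column1 > 55"
).fetchone()

print(row)
```

The row with the value 10 is filtered out by the WHERE clause before either function sees it, so both results are computed over 56, 57, and 58.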

Most of the features we've mentioned are supported by most DBMSs. The table below shows the SQL Standard requirements and the level of support the Big Eight have for GROUP BY.

ANSI/DBMS GROUP BY Support

           Basic GROUP BY  Expressions  CREATE VIEW  COLLATE Clause  CUBE/ROLLUP  Max Columns  Max Bytes
ANSI SQL   Yes             No           Yes          Yes             Yes          N/S          N/S
IBM        Yes             No           Yes          No              Yes          >=20         254
Informix   Yes             No           Yes          No              Yes          >=20         >=2000
Ingres     Yes             Yes          Yes          No              No           >=20         >=2000
InterBase  Yes             No           No           Yes             No           >=20         >=2000
Microsoft  Yes             Yes          Yes          Yes             Yes          >=20         >=2000
MySQL      Yes             Yes          No           No              No           >=20         >=2000
Oracle     Yes             Yes          Yes          No              Yes          >=20         1969
Sybase     Yes             Yes          Yes          No              No           >=20         >=2000

Notes on the table:

  • Basic GROUP BY column

    This column is "Yes" if the DBMS supports basic GROUP BY syntax like:

    
    SELECT column1, MIN(column2) FROM Table1
      GROUP BY column1
    
    
  • Expressions column

    This column is "Yes" if the DBMS supports expressions in GROUP BY, like this:

    
    SELECT UPPER(column1) FROM Table1
      GROUP BY UPPER(column1)
    
    
  • CREATE VIEW column

    This column is "Yes" if the DBMS lets you put GROUP BY in a CREATE VIEW statement, like this:

    
    CREATE VIEW View1 AS
       SELECT column1, COUNT(column1) FROM Table1
         GROUP BY column1
    
    
  • COLLATE Clause column

    This column is "Yes" if the DBMS supports ANSI SQL-style COLLATE clauses, or Oracle-style NLSSORT() function calls, or a CAST to a different character set with a different collation in GROUP BY, like this:

    
    SELECT column1, MIN(column2) FROM Table1
      GROUP BY column1 COLLATE SQL_Latin1_General
    
    
  • CUBE/ROLLUP column

    This column is "Yes" if the DBMS supports CUBE and ROLLUP for summarizing, like this:

    
    SELECT column1, column2 FROM Table1
      GROUP BY CUBE (column1, column2)
    
    
  • Max Columns column

    Shows how many columns may be listed in the GROUP BY clause.

    For Sybase, our tests showed it was possible to group at least 20 columns. This differs from Sybase's response to JDBC's getMaxColumnsInGroupBy call, which returns 16.

  • Max Bytes column

    Shows the maximum allowed length, in bytes, of grouped values.
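
The kind of check described in these notes can be sketched as a small probe that submits each sample statement and records whether the DBMS accepts it. This is not the test procedure behind the table, only a hedged illustration: it runs against SQLite through Python's sqlite3 module, the supports helper and the 20-column table Wide are hypothetical, and a statement being accepted says nothing about whether its results are correct.

```python
import sqlite3


def supports(conn, sql):
    """Hypothetical helper: True if the statement runs without error."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.Error:
        return False


# Assumption: SQLite stands in for the DBMS under test.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (column1 TEXT, column2 INTEGER)")

# A hypothetical 20-column table for the Max Columns check.
cols = [f"c{i}" for i in range(1, 21)]
conn.execute(
    "CREATE TABLE Wide (" + ", ".join(c + " INTEGER" for c in cols) + ")"
)

probes = {
    "Basic GROUP BY": "SELECT column1, MIN(column2) FROM Table1 "
                      "GROUP BY column1",
    "Expressions": "SELECT UPPER(column1) FROM Table1 "
                   "GROUP BY UPPER(column1)",
    "CREATE VIEW": "CREATE VIEW View1 AS "
                   "SELECT column1, COUNT(column1) FROM Table1 "
                   "GROUP BY column1",
    "CUBE/ROLLUP": "SELECT column1, column2 FROM Table1 "
                   "GROUP BY CUBE (column1, column2)",
    "20 columns": "SELECT COUNT(*) FROM Wide GROUP BY " + ", ".join(cols),
}

results = {name: supports(conn, sql) for name, sql in probes.items()}
for name, ok in results.items():
    print(f"{name:15} {'Yes' if ok else 'No'}")
```

For a real DBMS the same loop would run over that vendor's driver, and a limit such as Max Columns is better found by raising the column count until a statement fails than by trusting a metadata call alone.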

