Grouping on One Column






Grouping on One Column

The simplest form of the GROUP BY clause is the one in which only one column is grouped. In the previous chapters, we gave several examples of statements with such a GROUP BY clause. For the sake of clarity, we specify several other examples in this section.

Figure. Get all the different town names from the PLAYERS table.

SELECT   TOWN
FROM     PLAYERS
GROUP BY TOWN

The intermediate result from the GROUP BY clause could look like this:

TOWN       PLAYERNO                    NAME
---------  --------------------------  ----------------------
Stratford  {6, 83, 2, 7, 57, 39, 100}  {Parmenter, Hope, ...}
Midhurst   {28}                        {Collins}
Inglewood  {44, 8}                     {Baker, Newcastle}
Plymouth   {112}                       {Bailey}
Douglas    {95}                        {Miller}
Eltham     {27, 104}                   {Collins, Moorman}

Explanation: All rows with the same TOWN form one group. Each row in the intermediate result has one value in the TOWN column, whereas all other columns can contain multiple values. To indicate that these columns are special, the values are placed between brackets. We show those columns in this way for illustrative purposes only; you should realize that SQL probably would solve this internally in a different way. Furthermore, these two columns cannot be presented like this. In fact, a column that is not grouped is completely omitted from the end result, but we return to this later in the chapter.

The end result of the statement is:

TOWN
---------
Stratford
Midhurst
Inglewood
Plymouth
Douglas
Eltham

A frequently used term in this particular context is grouping. The GROUP BY clause in the previous statement has one grouping, which consists of only one column: the TOWN column. In this chapter, we sometimes represent this as follows: The result is grouped by [TOWN]. Later in this chapter, we give examples of groupings with multiple columns and GROUP BY clauses consisting of multiple groupings.

We could have solved the earlier question more easily by leaving out the GROUP BY clause and adding DISTINCT to the SELECT clause (work this out for yourself). Using the GROUP BY clause becomes interesting when we extend the SELECT clause with aggregation functions.

2. For each town, find the number of players.

SELECT   TOWN, COUNT(*)
FROM     PLAYERS
GROUP BY TOWN

The result is:

TOWN       COUNT(*)
---------  --------
Stratford         7
Midhurst          1
Inglewood         2
Plymouth          1
Douglas           1
Eltham            2

Explanation: In this statement, the result is grouped by [TOWN]. The COUNT(*) function is now executed against each grouped row instead of against all rows. In other words, the function COUNT(*) is calculated for each grouped row (for each town).

In this result, it is obvious that the data is aggregated. The individual data of players cannot be displayed anymore, and the data is aggregated by TOWN. Or the aggregation level of this result is TOWN.

3. For each team, get the team number, the number of matches that has been played for that team, and the total number of sets won.

SELECT   TEAMNO, COUNT(*), SUM(WON)
FROM     MATCHES
GROUP BY TEAMNO

The result is:

TEAMNO  COUNT(*)  SUM(WON)
------  --------  --------
     1         8        15
     2         5         9

Explanation: This statement contains one grouping consisting of the TEAMNO column.

4. For each team that is captained by a player resident in Eltham, get the team number and the number of matches that has been played for that team.

SELECT   TEAMNO, COUNT(*)
FROM     MATCHES
WHERE    TEAMNO IN
        (SELECT   TEAMNO
         FROM     TEAMS INNER JOIN PLAYERS
                  ON TEAMS.PLAYERNO = PLAYERS.PLAYERNO
         WHERE    TOWN = 'Eltham')
GROUP BY TEAMNO

The result is:

TEAMNO  COUNT(*)
------  --------
     2         5

The column on which the result has been grouped might also appear in the SELECT clause as a parameter within an aggregation function. This does not happen often, but it is allowed.

Figure. Get each different penalty amount, followed by the number of times that the amount occurs, in the PENALTIES table, and also show the result of that amount multiplied by the number.

SELECT   AMOUNT, COUNT(*), SUM(AMOUNT)
FROM     PENALTIES
GROUP BY AMOUNT

The PENALTIES table is grouped on the AMOUNT column first. The intermediate result could be presented as follows:

PAYMENTNO  PLAYERNO  PAYMENT_DATE              AMOUNT
---------  --------  ------------------------  ------
{5, 6}     {44, 8}   {1980-12-08, 1980-12-08}  25.00
{7}        {44}      {1982-12-30}              30.00
{4}        {104}     {1984-12-08}              50.00
{2, 8}     {44, 27}  {1981-05-05, 1984-11-12}  75.00
{1, 3}     {6, 27}   {1980-12-08, 1983-09-10} 100.00

Again, the values of the columns that are not grouped are placed between brackets, and the AMOUNT column shows only one value. However, that is not entirely correct. Behind the scenes, SQL also creates a group for this column. So, the intermediate result should, in fact, be presented as follows:

PAYMENTNO  PLAYERNO  PAYMENT_DATE              AMOUNT
---------  --------  ------------------------  --------------
{5, 6}     {44, 8}   {1980-12-08, 1980-12-08}  {25.00, 25.00}
{7}        {44}      {1982-12-30}              {30.00}
{4}        {104}     {1984-12-08}              {50.00}
{2, 8}     {44, 27}  {1981-05-05, 1984-11-12}  {75.00, 75.00}
{1, 3}     {6, 27}   {1980-12-08, 1983-09-10}  {100.00, 100.00}

The values in the AMOUNT column are also represented as a group now. Of course, only equal values appear in each group. And because it is a group, aggregation functions can be used.

The result is:

AMOUNT  COUNT(*)  SUM(AMOUNT)
------  --------  -----------
 25.00         2        50.00
 30.00         1        30.00
 50.00         1        50.00
 75.00         2       150.00
100.00         2       200.00

However, in this book, we do not present the values of the grouped columns between brackets.

Exercise 10.1:

Show the different years in which players joined the club; use the PLAYERS table.

Exercise 10.2:

For each year, show the number of players who joined the club.

Exercise 10.3:

For each player who has incurred at least one penalty, give the player number, the average penalty amount, and the number of penalties.

Exercise 10.4:

For each team that has played in the first division, give the team number, the number of matches, and the total number of sets won.



 Python   SQL   Java   php   Perl 
 game development   web development   internet   *nix   graphics   hardware 
 telecommunications   C++ 
 Flash   Active Directory   Windows