# Collections

### Collections

In this book, we have assumed that a column contains only one value for each row. In this section, we introduce a new concept, the cell. A cell is the intersection of a column and a row. So far, we have assumed that a cell can contain only one value. Of course, it is possible to store multiple values in a cell, such as a complete address consisting of a street name, house number, postcode, and so on, separated by commas. We interpret this value as if it consists of several values. SQL, on the other hand, still considers this value as one atomic value.

However, this changes with the adoption of OO concepts into SQL. Now, you can store sets of values in a cell, and SQL will truly regard this set as a set, not as one atomic value. Such a set is called a collection. With a collection, you could, for example, record for one player any number of phone numbers in the column PHONES.

##### Figure. Define the PLAYERS table so that a set of phone numbers can be stored.

 ```CREATE TABLE PLAYERS (PLAYERNO INTEGER PRIMARY KEY, : : PHONES SETOF(CHAR(13)), LEAGUENO CHAR(4)) ```

Explanation: The term SETOF indicates that a set of value can be stored within the column PHONES. The table itself could look as follows. (Just as in the set theory, brackets are used to indicate a set.) It is obvious that some players have two and some even have three phone numbers.

```PLAYERNO ... TOWN      PHONES                               LEAGUENO
-------- --- --------- ------------------------------------ -------
6 ... Stratford {070-476537, 070-478888}             8467
44 ... Inglewood {070-368753}                         1124
83 ... Stratford {070-353548, 070-235634, 079-344757} 1608
2 ... Stratford {070-237893, 020-753756}             2411
27 ... Eltham    {079-234857}                         2513
104 ... Eltham    {079-987571}                         7060
7 ... Stratford {070-347689}                         ?
57 ... Stratford {070-473458}                         6409
39 ... Stratford {070-393435}                         ?
112 ... Plymouth  {010-548745, 010-256756, 015-357347} 1319
8 ... Inglewood {070-458458}                         2983
100 ... Stratford {070-494593}                         6524
28 ... Midhurst  {071-659599}                         ?
95 ... Douglas   {070-867564, 055-358458}             ?
```

Of course, the use of collections in tables affects the other SQL statements. Here are some examples of how data can be entered in this specific column and how it can be queried with the SELECT statement.

##### 16. Add a new player with two phone numbers.

 ```INSERT INTO PLAYERS (PLAYERNO, ... , PHONES, ...) VALUES (213, ..., {'071-475748', '071-198937'}, ...) ```

Explanation: The brackets specify the set of phone numbers. Within the brackets, you are allowed to include zero, one, or more values. Zero can be used when this player has no phone.

##### 17. Assign player 44 a new phone number.

 ```UPDATE PLAYERS SET PHONES = {'070-658347'} WHERE PLAYERNO = 44 ```

##### 18. Get the numbers of the players who can be reached at the phone number 070-476537.

 ```SELECT PLAYERNO FROM PLAYERS WHERE '070-476537' IN (PHONES) ``` The result is: ```PLAYERNO -------- 6 ```

Explanation In this SELECT statement, a new form of the IN operator is used. Usually, a list of literals or expressions (see Section 8.6) or a subquery (see Section 8.7) is specified after the IN operator. What is given between brackets represents a set of values for both forms. The same applies to this new form because the column PHONES also represents a set of values. This form of the IN operator can be used only for collections, not for other columns.

##### 19. Get the numbers of the players who have more than two telephone numbers.

 ```SELECT PLAYERNO FROM PLAYERS WHERE CARDINALITY(PHONES) > 2 ``` The result is: ```PLAYERNO -------- 83 112 ```

Explanation: To determine the number of values in a collection, the CARDINALITY function can be used. When the number of values is determined, the NULL values are not counted and duplicate values count as one.

The statement could have been defined as follows:

```SELECT   PLAYERNO
FROM     PLAYERS
WHERE    2 < (SELECT COUNT(*) FROM TABLE(PLAYERS.PHONES))
```

Explanation: The statement looks like an ordinary statement, except that the FROM clause in the subquery contains a new construct: TABLE(PHONES). This construct transforms the set in a table (consisting of one column) into a number of rows. The number of rows is, of course, equal to the number of values in the collection. For each player, there will be another table.

The reason this more complex solution has been added is that it offers more features than that with the CARDINALITY function.

##### 20. Get the numbers of the players with the largest set of phone numbers.

 ```SELECT PLAYERNO FROM PLAYERS WHERE CARDINALITY(PHONES) >= ALL (SELECT CARDINALITY(PHONES) FROM PLAYERS) ``` The result is: ```PLAYERNO -------- 83 112 ```

##### 21. Find the numbers of the players who have the same set of phone numbers as player 6.

 ```SELECT PLAYERNO FROM PLAYERS WHERE PHONES = (SELECT PHONES FROM PLAYERS WHERE PLAYERNO = 6) ```

Explanation: The statement is obvious. You can also use > and < instead of the comparison operator =. If we would use the comparison operator >, the statement would answer the question: Who has at least the same telephone numbers as player 6? However, this person can have multiple phone numbers. Less than means: Who has at least one phone number that player 6 also has?

##### Figure. Get a list of all phone numbers from the PLAYERS table. The list should be presented in ascending order.

 Unfortunately, this question is not as simple as it seems. The following statement, for example, is not correct. The column PHONES does not return one set of values that can be ordered but returns a set consisting of sets. ```SELECT PHONES FROM PLAYERS ORDER BY 1 ``` First, this column must be "flattened," as it is called. ```SELECT PS.PHONES FROM THE (SELECT PHONES FROM PLAYERS) AS PS ORDER BY 1 ```

Explanation In Section 10.7, in Chapter 10, "SELECTStatement: TheGROUP BYClause," we stated that the FROM clause can contain a subquery. We will make use of that feature again, but now we put the word THE in front of it. The effect is that the result of the subquery, consisting of a set with sets, is transformed into one set consisting of atomic values. Thus, the set is flattened.

The result of the subquery itself can be represented as follows:

```PHONES
------------------------------------
{070-476537, 070-478888}
{070-368753}
{070-353548, 070-235634, 079-344757}
{070-237893, 020-753756}
{079-234857}
{079-987571}
{070-347689}
{070-473458}
{070-393435}
{010-548745, 010-256756, 015-357347}
{070-458458}
{070-494593}
{071-659599}
{070-867564, 055-358458}
```

The result after THE operator has been used looks as follows:

```PHONES
----------
070-476537
070-478888
070-368753
070-353548
070-235634
079-344757
070-237893
020-753756
079-234857
079-987571
070-347689
070-473458
070-393435
010-548745
010-256756
015-357347
070-458458
070-494593
071-659599
070-867564
055-358458
```

Now it is a "normal" table again, consisting of one column with a set of values. This result table is called PS in the FROM clause. We ask for this column in the SELECT statement, and the values are ordered in the ORDER BY clause.

Flattening of collections offers several possibilities, as illustrated in the following examples.

##### 23. Get the number of phone numbers of players 6 and 44 all together.

 ```SELECT COUNT(DISTINCT PS.PHONES) FROM THE (SELECT PHONES FROM PLAYERS WHERE PLAYERNO IN (6, 44)) AS PS ```

Explanation: The subquery itself returns two sets of phone numbers: one for player 6 and one for player 44. The THE operator flattens the two sets to one set of rows, each consisting of one atomic value. With this operator, nonduplicate values are removed automatically. That is why we use DISTINCT in the COUNT function.

##### 24. Get the phone numbers that player 6 and 44 have in common.

 ```SELECT PS1.PHONES FROM THE (SELECT PHONES FROM PLAYERS WHERE PLAYERNO = 6) PS1 INTERSECT SELECT PS2.PHONES FROM THE (SELECT PHONES FROM PLAYERS WHERE PLAYERNO = 44) PS2 ``` In the previous examples, the collection is defined on a column with a base data type. User-defined data types can be used as well. Likewise, user-defined data types can also make use of collections. The following are examples of both.

##### Figure. Define the PLAYERS table so that a set of phone numbers can be stored; use the PHONE data type.

 ```CREATE TYPE PHONE AS (AREA_CODE CHAR(3), SUBSCRIBER_NO CHAR(6)) CREATE TABLE PLAYERS (PLAYERNO INTEGER PRIMARY KEY, : : PHONES SETOF(PHONE), LEAGUENO CHAR(4)) ```

##### Figure. Define the PLAYERS table so that a set of phone numbers can be stored, but define the table in such a way that the set of values is defined with the PHONES data type.

 ```CREATE TYPE PHONES AS (PHONE SETOF(CHAR(13)) CREATE TABLE PLAYERS (PLAYERNO INTEGER PRIMARY KEY, : : PHONES PHONES, LEAGUENO CHAR(4)) ```