May 27, 2011, 6:22 p.m.
posted by novinick
Linking Tables via Row Identifications
In OO databases, all the rows (or their equivalents) have a unique identification. The user does not generate this identification with, for example, an INSERT statement, but it is instead generated by the system itself. These identifications are often called row identifications, object identifiers, or surrogate keys. These unique row identifications can be used to link rows and to have rows refer to each other.
SQL also has adopted this concept. Here, a unique identification is assigned to each row. The row identifications are of no value to the users, but only to the system itself. Although they can be requested and printed, they bear no information. If a row receives an identification, it belongs to that row forever. If the row is removed, the matching identification will never be reused. Note that unique row identifications are not the same as primary keys (even though they do show a resemblance). Later in this chapter, we explain the differences.
Row identifications are stored together with the row, but it is not necessary to define columns for them explicitly. These column values are generated automatically. One could say that each table has a hidden column in which the row identifications are stored.
The row identification (or the value of the hidden column) can be requested with the REF function.
4. Get the Row Identification of the Player with number 6.
Explanation The REF function has the name of a table as its only parameter and returns the row identification. How row identifications actually look (on disk) depends on the product. As an example, a possible Oracle row identification is displayed.
As said, row identifications can be used to "link" rows. The identification of one row is stored within another row. In other words, one row refers or points to another.
5. Define the tables of the tennis club again, but use row identifications this time.
Explanation: Wherever a foreign key occurred, a column is now specified that points to another table. These are called reference columns. The link that is created between, for example, the PLAYERS and TEAMS tables could now be represented as in Figure.
1. Reference columns
6. Add a new team. The captain of this team is player 112.
Explanation: The SELECT statement retrieves the row identification of player 6 and then stores it in the PLAYER column.
7. The captain of team 1 is no longer player 6, but player 44.
The linking of tables with row identifications has a great influence on the way in which joins can be formulated. It becomes much easier to formulate most joins.
8. Find, for each team, the team number and the name of the captain.
Explanation: For each row in the TEAMS table, two values are printed: the value of the TEAMNO column and the value of the expression PLAYER.NAME. This is an expression that we have not discussed yet. Let us deal with it in more detail.
For the sake of convenience, we call this a reference expression. A reference expression always begins with a reference column. In this case, it is the column PLAYER. This column refers to the PLAYERS table. After the reference column is the NAME column from that table. The final result is that the name of the player, who is the captain of the team, is printed.
It looks as if this statement does not execute a join at all, but the join is hidden in the reference expression PLAYER.NAME. For each team, SQL finds the row identification of the player (the captain). This identification is stored in the column PLAYER. Next, SQL looks for the row with this identification in the hidden column of the PLAYERS table. If it is found, the value in the NAME column is retrieved. In other words, this statement does not specify a classical join indeed. Behind the scenes, the previous SELECT statement is converted to the following:
SELECT TEAMS.TEAMNO, PLAYERS.NAME FROM TEAMS, PLAYERS WHERE TEAMS.PLAYER = REF(PLAYERS)
First, the table to which the PLAYER column refers is added to the FROM clause. Then, a join condition is added to the statement. In this join condition, TEAMS.PLAYER is compared to the REF of the table to which the column refers.
Two aspects should be taken into account. First, a reference column can contain a NULL value, of course. In that case, no join will be executed on the other table, and the value of the reference expression is then NULL. Second, the reference column can contain a row identification that does not occur in the other table. Imagine that the row identification in the PLAYER column of the TEAMS table does not occur in the PLAYERS table. In that case, that team would not occur in the result when an inner join is executed. However, for this type of expression, an outer join is always executed. So actually, the previous statement is not executed, but the following one is:
SELECT TEAMS.TEAMNO, PLAYERS.NAME FROM TEAMS LEFT OUTER JOIN PLAYERS ON (TEAMS.PLAYER = REF(PLAYERS))
9. For each match played by someone from Eltham and for a team from the first division, find the match number, the name of the player, and the name of the captain of the team.
Explanation: The statement contains three reference expressions: PLAYER.NAME, TEAM.PLAYER.NAME, and PLAYER.TOWN. The first and last have well-known forms, a reference column followed by a "normal" column. However, the second expression has a new form. Here, the reference column TEAM is followed by another reference column, PLAYER, which is followed by NAME. This expression must be read as: Get, for each row concerned, the NAME of the PLAYER who is captain of the TEAM. In fact, this reference expression replaces a join specification of the MATCHES table by that of TEAMS first and then that of PLAYERS.
No restriction governs the length of reference expressions. The only restriction is that the last column cannot be a reference column.
10. Create two tables with employee and department data.
Explanation: For each department, we want to know the name of the boss of the department and where the boss of each department works.
Reference columns can also refer to the table of which they are a part.
Figure. Create the PLAYERS table with the new columns FATHER and MOTHER. These two columns are used if the father and mother are also members of the tennis club.
12. For each player whose mother also plays for the tennis club, get the player number and the name of the father.
13. Get the player number of each player whose grandfather also plays for the tennis club.
Using references has a number of advantages.
However, the use of references also has a number of disadvantages: