Performing Joins

Performing Joins

Because relational databases are more complexly structured, they sometimes require special query statements to retrieve the information you need most. JoinsSQL queries performed by cross-referencing tablesare used to extract more usable data from relational databases. As you can see from Figure 5.14, the returned data can be the intersection of two sets of records (see the top image) or all of one data set plus the intersection of the other.

14. These Venn diagrams represent the two primary ways the data in two tables can be retrieved using joins.

Several types of joins are conceivable according to SQL, each returning a different set of data. Beginning to intermediate users will find that the two most basic joins (which I'll teach in this chapter) will satisfy almost every need. The most used join is called an inner join. For example, if you want to match up all the clients with their invoices, this could be done using two syntaxes:

SELECT * FROM invoices, clients WHERE invoices.client_id = clients.client_id
SELECT * FROM invoices INNER JOIN clients ON invoices.client_id = clients.client_id

This join will retrieve all of the information from both the invoices and clients tables wherever an invoices.client_id is the same as the clients.client_id. In other words, the query will replace the client_id foreign key in the invoices table with all of the information for that client in the clients table. An inner join like this will only return records wherein a match is made (so if a client record did not match an existing invoice, that client's information would not be returned).

When selecting from multiple tables and columns, you must use the dot syntax (table.column) if there are columns with the same name. This is normally the case when dealing with relational databases because a primary key from one table will have the same name as a foreign key from another.

The second type of join I'll discussan outer joindiffers from an inner join in that it could return records not matched by a conditional. It comes in both left outer join and right outer join varieties (also just called a left join or right join). For example:

SELECT * FROM clients LEFT JOIN invoices ON invoices.client_id = clients.client_id

With that query, all of the clients records will be returned, and if a client has any corresponding invoice records, those will be returned as well. This differs from the inner join in that all the data from the first listed table will be returned whether or not a match is made to the second table.

The right outer join just gives precedence to the second table listed instead of the first. These two queries have the same result:

SELECT * FROM invoices RIGHT JOIN clients ON invoices.client_id = clients.client_id
SELECT * FROM clients LEFT JOIN invoices ON invoices.client_id = clients.client_id

If both tables in an outer join have the same column name, you can simplify your query with

SELECT * FROM invoices LEFT JOIN clients USING (client_id)

Joins are complicated stuff, so hopefully these next examples will help to clarify these points.

To use joins:

Retrieve the invoice amount, the invoice date, and the client names for every invoice (Figure 5.15).

SELECT invoice_amount, invoice_date, client_name FROM invoices, clients
WHERE invoices.client_id = clients.client_id;

15. The basic inner join, albeit wordy, returns more usable information from your database than would a standard query.

This query, which includes an inner join, will return each client's name, invoice amount, and invoice date, if they have been invoiced. As you can see in the figure, the result is one record, with all three pieces of information, for each matching client and invoice. If a client has not been invoiced, they would not be included in the results.

Retrieve the expense category, expense date, and expense amount for every expense (Figure 5.16).

SELECT expense_category, expense_amount, expense_date FROM expense_categories, expenses
 WHERE expense_categories. expense_category_id = expenses. expense_category_id;

Figure 5.16. Inner joins are usable wherever you have a primary keyforeign key relationship, as I do here between expenses and expense_categories.

This query is another application of the same principle as that in Step 1. Because I am again performing an inner join, the order of the tables will not matter (the end result will be the same if you reverse the table listings).

Retrieve all of the client names and all of the invoices for those clients (Figure 5.17).

SELECT client_name, invoice_id, invoice_amount, invoice_date, invoice_description FROM
 clients LEFT JOIN invoices USING (client_id);

Figure 5.17. Left joins are more particular in their wording than inner joins and do not require matches to return records. Compare this result with that in Figure 5.18.

This queryan outer joinwill retrieve every client name and then associate the appropriate invoices with each. Even if a client has no invoices (see Something Clever at bottom), the client will be listed. If I had used an inner join, Something Clever would be omitted from the returned results (see Figure 5.15).

Retrieve all of the invoice IDs, amounts, dates, and descriptions, along with the corresponding client names (Figure 5.18).

SELECT client_name, invoice_id, invoice_amount, invoice_date, invoice_description FROM
 invoices LEFT JOIN clients USING (client_id);

18. The order of the tables in a left join will affect the results returned. Every record from the first table will always be retrieved, regardless of the conditional.

This query only differs from that in Step 3 in that here I am left-joining invoices to clients rather than the other way around. The end result is that two fewer records are returned, as there are no invoices associated with two clients.


  • MySQL differs from standard SQL in how it refers to joins. I've simplified this discussion here, focusing on the two most important types of joins, and stuck to the MySQL implementation.

  • You can perform joins on more than two tables. You can even join a table with itself!

  • Joins can be created using conditionals involving any columns, not just the primary and foreign keys, as I have done here.

  • Joins that do not include a WHERE clause (e.g., SELECT * FROM invoices, clients) are called full joins and will return a Cartesian product: every record from Table A combined with every record from Table B. This construct can have unwieldy results with larger tables.

 Python   SQL   Java   php   Perl 
 game development   web development   internet   *nix   graphics   hardware 
 telecommunications   C++ 
 Flash   Active Directory   Windows