Return a Sample of Records

Return a Sample of Records

Use the Top predicate to return a portion of your records without bias.

Most often, you use a Select query to return all the records that match certain criteria. Usually, this query returns a data set that is smaller than the table or tables upon which the query is built. That is, not all records match the criteria, and the number of records that do match is smaller than the underlying set of table data.

Sometimes, you might need only a sample of records that aren't based on the criteria or in which the criteria are irrelevant. This isn't the same as fine-tuning the criteria to limit the number of returned records. For example, statistical work might require a sample from which to infer facts about the whole data set. Regardless of whether the population data is a table or a filtered data set already based on some criteria, the point is that the next step of getting a sample is completed without any preconceived notion. This is where the SQL Top predicate comes in handy.

The Top predicate works in two ways:

  • Returns a number of records

  • Returns a percentage of records

Using the Top Predicate

The Top predicate allows you to isolate records from the top of a data set. If you want to get records from the bottom, first apply a reverse sort (i.e., descending instead of ascending). Either way, you will continuously get the same set of records each time you run the query. Later in this hack, we'll discuss a method for getting a true random sample.

Figure shows a query in Design mode in which the query property sheet is used to indicate the number of records to return. The Top Values property has a few values from which to select. You can use one of these, or you can enter your own.

Selecting a value for the Top predicate

With 25 as the selected count to return, the query returns, no surprise, 25 records from the top of the data set, as shown in Figure.

Returning the designated number of records

It's interesting to see the SQL the Access query grid generates. Switching to SQL view, here is what you see:

	SELECT TOP 25 Occurrences.Reading
	FROM Occurrences;

The Top predicate sits just after the Select statement and specifies the number of records to return.

To return a percentage of records, simply add the word Percent to the SQL statement, after the number:

	SELECT TOP 25 PERCENT Occurrences.Reading
	FROM Occurrences;

To indicate percent when using the query designer, add the percent sign (%) to the Top Values property, as shown in Figure.

Indicating to return a percentage of records

Hacking the Hack

The Top predicate is great for grabbing a handful of records, but it will always grab the same records. Even when no sort is placed on the source data, the records still sit in the order in which they were placed in the table.

Returning a random set of records requires using the Rnd function. You apply this as a sort. Normally, a sort isn't what you want to use to return an unbiased data set, but sorting on a random value makes this a moot point. To make this work, alter the SQL statement to look like this:

	SELECT TOP 25 Occurrences.Reading
	FROM Occurrences
	ORDER BY RND([Reading]);

Enter the name of the field as the argument for the Rnd function. Each time the query runs, a random selection of records is returned.

     Python   SQL   Java   php   Perl 
     game development   web development   internet   *nix   graphics   hardware 
     telecommunications   C++ 
     Flash   Active Directory   Windows