July 19, 2011, 4:12 a.m.
posted by nfrank
Regression, Release Mechanism, and Tape-out Criteria
The centralized file database must maintain high-quality code. To prevent bugs from being checked in, a set of tests, called check-in tests, is run to qualify code before it is checked in. If the tests pass, the code can be checked in. A check-in test may not detect all bugs; therefore, all files in the centralized file database should be run against a larger suite of tests, called a regression test, at regular intervals (such as weekly). A check-in test can be regarded as a small-scale regression test; its runtime is much shorter, so code can be checked in without much delay.
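The check-in gate described above can be sketched as a small script that runs each test and blocks the check-in on the first failure. The test commands here are stand-ins; a real project would collect them from its verification plan.

```python
import subprocess
import sys

# Hypothetical check-in test list; each entry is a command that must
# exit with status 0 for the check-in to proceed.
CHECKIN_TESTS = [
    [sys.executable, "-c", "assert 1 + 1 == 2"],        # stand-in smoke test
    [sys.executable, "-c", "assert 'ok'.upper() == 'OK'"],
]

def run_checkin_tests(tests):
    """Run every check-in test; return True only if all pass."""
    for cmd in tests:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print("check-in blocked: test failed:", cmd)
            return False
    return True

if __name__ == "__main__":
    # Code may be checked in only when the gate passes.
    sys.exit(0 if run_checkin_tests(CHECKIN_TESTS) else 1)
```

Because the gate must run on every check-in, its tests are chosen to be fast; the slower, more thorough suites run on the regression schedule instead.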
Large regression tests can be layered further. First, the full regression suite is run only occasionally (for example, biweekly or before a major release). Second, a smaller suite can be run for patch releases or for a weekly release. Third, a nightly regression suite is run to catch bugs in newly checked-in files as early as possible. Finally, a check-in test, viewed as a regression test, is run whenever there is a file to check in. The sooner a regression is run, the sooner bugs are detected; however, regression tests place a heavy burden on computing resources. A full regression suite can take as much as a day on a computer farm, and a nightly suite can take as long as 12 hours on a project server. Therefore, regression tests are often run after hours. If a regression suite takes more than 12 hours on the project server, it should be run either on a weekend or on a computer farm to avoid slowing engineering productivity during the day. The frequency of regression runs may increase as a milestone release approaches.
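The layering above amounts to a scheduling decision: given the date and the release situation, pick which suites to launch. A minimal sketch, with tier names and the weekend/release policy chosen here purely for illustration:

```python
import datetime

def suites_to_run(today: datetime.date, release_week: bool) -> list:
    """Select regression tiers for a given day (illustrative policy)."""
    suites = ["nightly"]            # nightly suite runs every night
    if today.weekday() == 5:        # Saturday: run the weekly suite after hours
        suites.append("weekly")
    if release_week:                # full suite before a major release
        suites.append("full")
    return suites
```

A real project would also account for milestone proximity, since the text notes that regression frequency tends to increase as a release approaches.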
A regression suite, whether for a check-in test or the entire code database, collects test cases from diagnostics targeted at specific areas of the code, randomly generated programs, and stimuli that have activated bugs in the past. A well-designed regression suite has good code and functional coverage and a minimum of overlapping tests. To verify a design with a regression test, the output of the regression run is compared with a known-good output. Because code, and even project specifications, can change over time, the known-good output changes too; therefore, regression suites require maintenance. Mismatches from a regression run should be resolved in a timely fashion to prevent errors from proliferating: all errors from the current week's regression run should be resolved by the end of the week so that they do not create secondary errors. An objective of a regression run is to maximize the number of errors detected while minimizing the number of errors per root cause.
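The known-good comparison step can be sketched as a line-by-line diff that reports every mismatch for triage. This is a simplification: real suites typically filter out volatile fields such as timestamps before comparing.

```python
def compare_to_golden(actual_lines, golden_lines):
    """Return (line_no, actual, golden) tuples for every mismatch."""
    mismatches = []
    for i, (a, g) in enumerate(zip(actual_lines, golden_lines), start=1):
        if a != g:
            mismatches.append((i, a, g))
    if len(actual_lines) != len(golden_lines):
        # Extra or missing output lines count as a mismatch, too.
        n = min(len(actual_lines), len(golden_lines))
        mismatches.append((n + 1, "<length differs>", "<length differs>"))
    return mismatches
```

When the specification changes, the golden files themselves are updated, which is the maintenance burden the text refers to.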
Large regression suites are run in a distributed fashion on a computing grid, also called a computer farm, which consists of hundreds or thousands of machines. The individual tests in a regression suite run simultaneously on separate machines. A computing grid has a multiple-client/multiple-server queue as its interface: submitted jobs are queued and served by available machines on the farm. A job submission entry usually contains a script for job execution, a list of input files, and a location for storing output files. When a job finishes, the submitter is notified of the status and the time of completion.
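The job submission entry and queue described above can be modeled roughly as follows. The field and class names are assumptions based on the description, not the API of any real scheduler such as LSF or Grid Engine:

```python
from dataclasses import dataclass, field

@dataclass
class GridJob:
    """One submission entry: script, inputs, and an output location."""
    script: str                                   # script to execute on a farm machine
    input_files: list = field(default_factory=list)
    output_dir: str = "."                         # where results are stored
    status: str = "queued"

class JobQueue:
    """Minimal sketch of the multiple-client/multiple-server queue."""
    def __init__(self):
        self.pending = []

    def submit(self, job: GridJob) -> int:
        self.pending.append(job)
        return len(self.pending) - 1              # job id handed back to the submitter

    def dispatch(self) -> GridJob:
        job = self.pending.pop(0)                 # next free machine takes the oldest job
        job.status = "running"
        return job
```

In a real grid, completion triggers the notification to the submitter with the job's status and finish time.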
The release mechanism refers to the method by which code is delivered to customers. In a hardware design team, products can be RTL code, PLI C programs, CAD tools, and test programs. The most primitive release mechanism is to place files in a specific directory from which customers can download them. A key consideration in choosing a release mechanism is to understand how the code will be used and to minimize the impact on the customers' application environment. For example, if the released product is an executable program that resides in the customer's environment in the directory /application/bin/, one release mechanism is to put the released program in that directory under the customer's revision control system and attach a new version number to it. To use it, the customer simply changes the version number of the product in his view. In this example, copying the released product directly to /application/bin/ might interfere with the customer's operation: the older version of the program may be in use at the time of copying, and a script that invokes the program multiple times could end up using the older version in its first invocation and the new version in later invocations.
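The versioned-release idea can be sketched by placing each release under its own version path rather than overwriting the file in place, so an older copy that is in use is never clobbered mid-run. The paths and file names here are illustrative only:

```python
import os
import tempfile

def release(program_text: str, release_root: str, version: str) -> str:
    """Install a release under release_root/<version>/ without touching
    any previously released version (illustrative sketch)."""
    version_dir = os.path.join(release_root, version)
    os.makedirs(version_dir, exist_ok=True)
    dest = os.path.join(version_dir, "tool")
    with open(dest, "w") as f:
        f.write(program_text)
    return dest

if __name__ == "__main__":
    root = tempfile.mkdtemp()
    path = release("#!/bin/sh\necho hello\n", root, "1.2")
    print(path)
```

The customer then switches versions deliberately (for example, by updating the version selected in a revision control view), which avoids the mixed-version script failure described above.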
Because there is no direct way to know that a design is free of bugs, several indirect measures, such as tape-out criteria, are used in practice. One is the coverage measure: usually, nearly 100% code coverage must be achieved. On the other hand, numerical measures for parameter and functional coverage may not be accurate enough to cover the complete functional spectrum and hence may be subject to interpretation. A practical alternative is to have project architects review all functional tests to determine whether sufficient coverage has been achieved. Another tape-out criterion is bug occurrence frequency or, simply, bug rate, which is the number of bugs found during the past week or weeks. A low bug rate may indicate that the design is relatively stable and bug-free. Of course, it may also mean that the tests are not finding new bugs; therefore, bug rate should be used in conjunction with a coverage measure. A third criterion is the number of cycles simulated, which offers some insight into how deeply the design's state space has been explored, but this measure lacks proven accuracy. Figure 17 plots the three tape-out criteria. The bug rate curve eventually reaches zero, and the coverage metric plateaus at 98% regardless of further increases in simulation cycles. The tape-out time in this example is dictated by coverage, because it reaches its plateau last; the simulation cycle count never plateaus.
Figure 17. Tape-out criteria based on coverage, bug rate, and simulation cycle
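The interplay of the criteria can be illustrated with a small sketch: bug rate is a windowed count of recent bugs, coverage is checked for a plateau, and a candidate tape-out decision requires both, since a low bug rate alone proves little. The thresholds and combination rule below are illustrative assumptions, not a standard:

```python
def bug_rate(weekly_bug_counts, window=1):
    """Bugs found over the most recent `window` weeks."""
    return sum(weekly_bug_counts[-window:])

def coverage_plateaued(weekly_coverage, tolerance=0.5):
    """True if coverage moved less than `tolerance` points last week."""
    return (len(weekly_coverage) >= 2 and
            abs(weekly_coverage[-1] - weekly_coverage[-2]) < tolerance)

def ready_for_tapeout(weekly_bug_counts, weekly_coverage):
    # Bug rate must be zero AND coverage must have plateaued;
    # simulated cycles keep growing, so they cannot gate by themselves.
    return bug_rate(weekly_bug_counts) == 0 and coverage_plateaued(weekly_coverage)
```

With the figure's example, bug rate hits zero first, but tape-out waits until the coverage curve flattens near its 98% plateau.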