May 2, 2011, 11:43 p.m.
posted by nfrank
Clock Generation and Synchronization
Clocks are the main synchronizing signals to which all other signals are referenced. In the majority of situations, clock waveforms are deterministic and periodic. Therefore, if you know how to write RTL code for one period, the complete waveform can be generated by embedding that period inside a loop. There are two methods for creating one period of a clock: explicit and toggle. With the explicit method, you specify the rising and falling times of the clock and assign a value to it at each transition time. With the toggle method, you also specify the rising and falling times of the clock, but invert the clock at the transition times.
Explicit and Toggle Methods
Consider generating the clock waveform in Figure 6. First, one period of the clock is generated. Using the explicit method, one period of the clock is coded in RTL as follows:
    #1 clock = 1'b1;
    #1 clock = 1'b0;
    #2 clock = 1'b1;
    #2 clock = 1'b0;
Figure 6. Generating a clock signal
The delays before the assignments are delay intervals between successive transitions. Putting this period inside a loop and initializing the clock produces the complete waveform, as shown here:
    initial clock = 1'b0;
    always begin
        #1 clock = 1'b1;
        #1 clock = 1'b0;
        #2 clock = 1'b1;
        #2 clock = 1'b0;
    end
The same waveform can be generated with the toggle method:
    initial clock = 1'b0;
    always begin
        #1 clock = ~clock; // rising
        #1 clock = ~clock; // falling
        #2 clock = ~clock; // rising
        #2 clock = ~clock; // falling
    end
With the toggle method, it can be difficult to see the value of clock at a given time. Thus, comments indicating rising or falling transitions are recommended. Furthermore, if the clock is left uninitialized, the clock will not toggle and simply stays at the unknown value x, a potential pitfall that the explicit method avoids. On the other hand, the toggle method makes it easy to change the phase, or the initial value, of the clock by simply initializing the clock to a different value while keeping the toggle statements intact. Changing a clock's initial value is more complicated with the explicit method; all assigned values have to be modified.
Note that we used blocking assignment operator =, but nonblocking operator <= could also be used in this example.
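As a rough illustration of the two methods, the following testbench sketch generates the waveform both ways and counts any disagreement between them. The module name clock_methods_tb and the signal names clk_explicit, clk_toggle, and mismatches are hypothetical and do not come from the text above; the timescale is also an assumption.

```verilog
`timescale 1ns/100ps
module clock_methods_tb;
  reg clk_explicit, clk_toggle;
  integer mismatches;

  // Explicit method: each transition assigns a literal value.
  initial clk_explicit = 1'b0;
  always begin
    #1 clk_explicit = 1'b1;
    #1 clk_explicit = 1'b0;
    #2 clk_explicit = 1'b1;
    #2 clk_explicit = 1'b0;
  end

  // Toggle method: same delays, but the clock is inverted each time.
  initial clk_toggle = 1'b0;
  always begin
    #1 clk_toggle = ~clk_toggle; // rising
    #1 clk_toggle = ~clk_toggle; // falling
    #2 clk_toggle = ~clk_toggle; // rising
    #2 clk_toggle = ~clk_toggle; // falling
  end

  // Sample between transitions (at 0.5, 1.5, ...) to stay clear of races.
  initial begin
    mismatches = 0;
    #0.5;
    repeat (24) begin
      if (clk_explicit !== clk_toggle) mismatches = mismatches + 1;
      #1;
    end
    $display("mismatches = %0d", mismatches);
    $finish;
  end
endmodule
```

Changing the initial value of clk_toggle to 1'b1 inverts that waveform without touching the toggle statements, which is the phase-change convenience described above.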
Absolute Transition Delay
In the previous example, the delays are interval delays between successive transitions, but there are situations in which absolute transition times are desired. For these, nonblocking intra-assignment delays can be used. The following code, representing the waveform shown in Figure 6, uses nonblocking intra-assignment delays:
    initial begin
        clock <= #0 1'b0;
        clock <= #1 1'b1;
        clock <= #2 1'b0;
        clock <= #4 1'b1;
        clock <= #6 1'b0;
        clock <= #7 1'b1;
        ...
    end
Statement clock <= #2 1'b0 assigns 1'b0 to clock two units of time from the current time. Because this statement does not block, the next statement is simulated immediately. The next statement, clock <= #4 1'b1, assigns 1 to clock four units of time from the current time. Therefore, all delays in the nonblocking assignments refer to the current time.
Because the delays are absolute transition times, all transitions have to be explicitly specified, as opposed to embedding one period in a loop. Hence, generating waveforms using absolute times is used only for aperiodic waveforms such as reset signals. It's important to note that if the nonblocking assignments are replaced by blocking assignments, the next statement must wait until the current statement is executed, meaning the delays now have become interval delays.
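The reset use case mentioned above could look roughly like the following sketch. The module name reset_gen_tb, the signal name reset, its active-high polarity, and the times 5 and 20 are all assumptions chosen for illustration.

```verilog
`timescale 1ns/1ns
module reset_gen_tb;
  reg reset;

  // None of these nonblocking assignments blocks the next one, so all
  // three delays are measured from time 0 (absolute transition times).
  initial begin
    reset <= #0  1'b0;
    reset <= #5  1'b1;  // asserted at t = 5
    reset <= #20 1'b0;  // released at t = 20
  end

  initial begin
    $monitor("t=%0t reset=%b", $time, reset);
    #30 $finish;
  end
endmodule
```

If the three assignments were changed to blocking assignments, the same delays would accumulate and the release would move from t = 20 to t = 25.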
Time Zero Clock Transition
Similar to the time zero initialization problem, the very first assignment to clock at time zero may be perceived as a transition because clock has an unknown value before time zero and is assigned a value at time zero. Whether this time zero transition is perceived is simulator dependent, and thus care must be exercised to deal with either scenario. One way to avoid this ambiguity is to initialize clock explicitly to the unknown value x at time zero, hence eliminating the time zero transition, and start the clock at a later time.
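One possible way to code this workaround is sketched below. The parameters START and HALF_PERIOD, the counter posedges, and the module name late_start_clock_tb are hypothetical names introduced for this example only.

```verilog
`timescale 1ns/1ns
module late_start_clock_tb;
  parameter START = 10;       // hypothetical time of the first defined value
  parameter HALF_PERIOD = 2;  // hypothetical half period
  reg clock;
  integer posedges;

  initial begin
    clock = 1'bx;             // explicit x at time zero, so no event at t = 0
    #(START) clock = 1'b0;    // the clock takes its first defined value here
    forever #(HALF_PERIOD) clock = ~clock;
  end

  // Count rising edges to show that nothing is seen before the clock starts.
  initial posedges = 0;
  always @(posedge clock) posedges = posedges + 1;

  initial begin
    #(START - 1) $display("t=%0t clock=%b posedges=%0d", $time, clock, posedges);
    #20          $display("t=%0t clock=%b posedges=%0d", $time, clock, posedges);
    $finish;
  end
endmodule
```

Before START the clock reads x and no rising edge has been counted, so blocks sensitive to the clock behave the same way regardless of how a simulator treats time zero.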
Time Unit and Resolution
During verification, the clock period or duty cycle may change. It is beneficial to use parameters to represent the delays, instead of hard coding them. For example, to generate a clock starting with zero that has a 50% duty cycle, one may code as follows:
    `define PERIOD 4
    initial clock = 1'b0;
    always #(`PERIOD/2) clock = ~clock;
Caution should be exercised when PERIOD is not evenly divided by two. If PERIOD is odd, the result is truncated. If integer division is replaced by real division, the result is rounded off according to the specified resolution.
In Verilog, the unit for delay (for example, #3) is specified using `timescale, which is declared as
    `timescale unit/resolution
where unit is the unit of measurement for time and resolution is the precision of time. For example, with `timescale 1.0ns/100ps, #(4/3.0) clock = 1'b1 means that clock is assigned 1 at 1300 ps. Note that although 4/3.0 gives 1333 ps, it is rounded off to 1300 ps because the resolution is declared to be 100 ps.
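The rounding and truncation effects described above can be observed with a small experiment such as the following sketch. The module name timescale_rounding_tb is hypothetical, and the second delay uses an odd value of 5 purely as an example of integer division.

```verilog
`timescale 1ns/100ps
module timescale_rounding_tb;
  reg clock;

  initial begin
    clock = 1'b0;
    #(4/3.0) clock = 1'b1;  // real division: 1.333 ns is rounded to the 100 ps resolution
    $display("after #(4/3.0): t = %g ns", $realtime);
    #(5/2) clock = 1'b0;    // integer division: 5/2 evaluates to 2, not 2.5
    $display("after #(5/2):   t = %g ns", $realtime);
    $finish;
  end
endmodule
```

Writing the second delay as #(5/2.0) would keep the half unit, because the division is then carried out on real numbers before rounding to the resolution.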
Clock Multiplier and Divider
A complex chip often uses multiple clocks generated from a phase-locked loop (PLL) block. The analog behavior of the PLL is not modeled in HDL, but is abstracted to generate clock waveforms using delays and assignments. When multiple clock waveforms are generated, their relationship needs to be determined (for example, whether their transitions are independent, or whether some clocks are derived from others). A clock can be derived from another via a frequency divider or multiplier. If the frequency ratio is an integer, it is easy to generate a derived clock from the base clock. For frequency division, the derived clock can be generated from the base clock without knowing the base clock's frequency. To divide the frequency of base_clock by N to get derived_clock, trigger a transition on derived_clock for every N transitions of base_clock:
    initial begin
        i = 0;
        derived_clock = 1'b0; // assumed starting value; left at x it would never toggle
    end
    always @(base_clock) begin
        i = i % N;
        if (i == 0) derived_clock = ~derived_clock;
        i = i + 1;
    end
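To check the divider, one might wrap it in a small testbench like the sketch below, which counts rising edges on both clocks. The module name clock_divider_tb, the counters base_rises and derived_rises, the ratio N = 3, and the starting value of derived_clock are assumptions made for this example. The base clock is held at x until t = 1 so that the result does not depend on how the simulator treats time zero.

```verilog
`timescale 1ns/100ps
module clock_divider_tb;
  parameter N = 3;            // hypothetical division ratio
  reg base_clock, derived_clock;
  integer i, base_rises, derived_rises;

  initial begin
    i = 0;
    derived_clock = 1'b0;     // assumed starting value
    base_rises = 0;
    derived_rises = 0;
  end

  // Base clock with a period of 2, started after time zero.
  initial begin
    base_clock = 1'bx;
    #1 base_clock = 1'b0;
    forever #1 base_clock = ~base_clock;
  end

  // Divider from the text: toggle once per N transitions of base_clock.
  always @(base_clock) begin
    i = i % N;
    if (i == 0) derived_clock = ~derived_clock;
    i = i + 1;
  end

  always @(posedge base_clock)    base_rises = base_rises + 1;
  always @(posedge derived_clock) derived_rises = derived_rises + 1;

  initial begin
    #60.5;
    $display("base rising edges = %0d, derived rising edges = %0d",
             base_rises, derived_rises);
    $finish;
  end
endmodule
```

With N = 3 the derived clock should show one rising edge for every three rising edges of the base clock.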
Multiplying a clock frequency by N can be achieved using Verilog's repeat statement once the base clock's frequency is known (for example, for every positive or negative transition of the base clock, repeatedly generate 2N transitions for the derived clock). For example,
    always @(posedge base_clock) begin
        repeat (2*N) clock = #(period/(2*N)) ~clock;
    end
If the period of the base clock is not known or is changing constantly, and/or the ratio is not an integer, a different technique is required. First, the base clock's period is measured on the fly and then the derived clock is generated using
    forever clock = #(period/(2*N)) ~clock;
A sample code to implement this general clock divider/multiplier is as follows:
    // measure the first period of base_clock
    initial begin
        derived_clock = 1'b0; // assume starting 0
        @(posedge base_clock) T1 = $realtime;
        @(posedge base_clock) T2 = $realtime;
        period = T2 - T1;
        T1 = T2;
        ->start; // start generating derived_clock
    end
    // continuously measure base_clock's period
    always @(start)
        forever @(posedge base_clock) begin
            T2 = $realtime;
            period = T2 - T1;
            T1 = T2;
        end
    // generate derived_clock N times the frequency of base_clock
    always @(start)
        forever derived_clock = #(period/(2*N)) ~derived_clock;
Clock Independence and Jitter
If the clocks are independent, each of them should be modeled with its own always statement. For example, the following code generates two independent clocks:
    initial clock1 = 1'b0;
    always clock1 = #1 ~clock1;
    initial clock2 = 1'b0;
    always clock2 = #2 ~clock2;
An incorrect way to produce these two clocks is to use one clock to generate the other, as illustrated here:
    initial clock1 = 1'b0;
    always clock1 = #1 ~clock1;
    initial clock2 = 1'b0;
    always @(negedge clock1) clock2 = ~clock2;
To further emphasize the relative independence of two clocks, jitter can be introduced to one of the clocks to simulate the nondeterministic relative phase. This can be done with a random generator:
    initial clock1 = 1'b0;
    always clock1 = #1 ~clock1;
    // draw a new jitter value at every transition of clock1
    always @(clock1) begin
        jitter = {$random(seed)} % RANGE; // concatenation makes the value unsigned
        clock1_jittered <= #(jitter) clock1;
    end
The modulo operator % returns the remainder after division by RANGE, and thus restricts the range of jitter. Because $random returns a signed value, the call is wrapped in a concatenation so that the remainder is never negative. Clock clock1_jittered is a version of clock1 with edges that are randomly shifted in the range [0, RANGE-1]. All random functions/tasks should be called with a seed so that the result can be reproduced later when errors are found. Verilog offers a variety of random distributions, such as uniform, normal, and Poisson.
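As an illustration of the distribution functions mentioned above, the following sketch draws the jitter from $dist_uniform and records the largest lag it observes. The module name jitter_tb, the variables t_edge and max_lag, the seed value 42, the RANGE of 20, and the choice of a 10 ps step are assumptions made for this example, not values from the text.

```verilog
`timescale 1ns/10ps
module jitter_tb;
  parameter RANGE = 20;       // hypothetical maximum jitter, in 10 ps steps
  reg clock1, clock1_jittered;
  integer seed, jitter;
  real t_edge, max_lag;

  initial begin
    seed = 42;                // fixed seed so that the run is reproducible
    max_lag = 0.0;
    t_edge = 0.0;
    clock1_jittered = 1'b0;
    clock1 = 1'b0;
  end
  always #1 clock1 = ~clock1;

  // Delay each edge of clock1 by a fresh uniform value in [0, RANGE] steps.
  always @(clock1) begin
    t_edge = $realtime;
    jitter = $dist_uniform(seed, 0, RANGE);
    clock1_jittered <= #(jitter * 0.01) clock1;
  end

  // Track the largest observed lag of the jittered clock.
  always @(clock1_jittered)
    if ($realtime - t_edge > max_lag) max_lag = $realtime - t_edge;

  initial begin
    #100.5 $display("largest lag = %g ns", max_lag);
    $finish;
  end
endmodule
```

Keeping the maximum jitter (0.2 ns here) below the half period of clock1 ensures that delayed edges cannot overtake one another.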
Clock Synchronization and Delta Delay
Because independent waveforms are not locked in phase, they can drift relative to each other. Jittering models this phase-drifting phenomenon. When two independent waveforms arrive at the same gate, glitches can come and go depending on the inputs' relative phase, creating intermittent behavior. Therefore, independent waveforms should first be synchronized before propagation. A synchronizer uses a signal, the synchronizing signal, to trigger sampling of another to create a dependency between them, hence removing the uncertainty in their relative phase. The following code is a simple synchronizer. On every transition of signal fast_clock, signal clock1 is sampled. The synchronizer is essentially a latch; thus, a nonblocking assignment is used to avoid a race condition:
    always @(fast_clock) clock_synchronized <= clock1;
If the synchronizing signal's transitions do not align with those of the synchronized signal, some transitions will be missed, as shown in Figure 7. Because transitions will be missed if the synchronizing signal has a lower frequency, the signal with the highest frequency is usually chosen as the synchronizing signal.
Figure 7. Intermittent glitches and synchronizing independent signals
Because of the nonblocking assignment in the synchronizer, the synchronized signal and the synchronizing signal are not exactly aligned but are separated by an infinitesimal amount. This is a characteristic of nonblocking assignments. At a given simulation time, nonblocking assignments are updated after all regular events scheduled for that time are evaluated. For example, in the following RTL code, the value of v is sampled when the nonblocking statement is encountered, but the actual assignment to x happens only after all blocking statements are evaluated. In this case, x gets the value of v after the blocking assignment y = x is executed. Thus, y gets the old value of x (the value of the previous clock cycle). Therefore, even though the two assignments are evaluated at the same simulation time, y always lags x by one clock cycle:
    always @(posedge clock) begin
        x <= v;
        y = x;
    end
In the previous synchronizer example, the synchronized signal, clock_synchronized, always lags the synchronizing signal, fast_clock, by an infinitesimal amount. This amount is from the time clock1 is sampled to the time clock_synchronized is actually assigned. Nevertheless, the two signals' transitions always have identical simulation times. However small this infinitesimal amount is, it sometimes can cause other signals to lag by a cycle, as illustrated in the previous example.
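The one-cycle lag between x and y can be made visible with a testbench along the following lines. The module name nba_lag_tb, the 4-bit width, and the stimulus that increments v on falling edges are assumptions added for illustration; only the always block containing x and y comes from the text.

```verilog
`timescale 1ns/1ns
module nba_lag_tb;
  reg clock;
  reg [3:0] v, x, y;

  initial begin
    v = 4'd0; x = 4'd0; y = 4'd0;
    clock = 1'bx;             // no event at time zero
    #1 clock = 1'b0;
    forever #1 clock = ~clock;
  end

  // Stimulus: v changes on falling edges, so it is stable at rising edges.
  always @(negedge clock) v = v + 4'd1;

  // Code under discussion.
  always @(posedge clock) begin
    x <= v;   // v is sampled now; x is updated after the blocking statements
    y = x;    // reads x before that update, so y receives the old x
  end

  // $strobe prints at the end of the time step, after the update of x.
  always @(posedge clock)
    $strobe("t=%0t v=%0d x=%0d y=%0d", $time, v, x, y);

  initial #11 $finish;
endmodule
```

In each printed line, y should equal the value that x held on the previous rising edge, even though both assignments execute at the same simulation time.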
Clock Generator Organization
A central clock module generates various clock waveforms from the same clock source. During circuit implementation, the clock source is a PLL, and various clock signals are derived from the PLL output. In RTL, clock generation should be encapsulated as a module with parameters to change clock frequencies, phases, and jitter ranges, among others. A typical block diagram for a clock generation network is shown in Figure 8. The primary clock source is a simple waveform generator, modeling a PLL. The secondary clock generation block derives clocks of various frequencies and phases with frequency multipliers, dividers, and phase shifters. Frequency, phase, and random distribution are controlled through parameters and variables. Finally, the clock distributor multiplexes the clocks to appropriate clock domains. In RTL, the clock generation network is coded as follows:
Figure 8. A typical clock generation network
    module clock_generation_network (clk1, clk2,...);
        output clk1, clk2,...;
        parameter DISTRIBUTION = 1;
        parameter JITTER_RANGE = 2;
        ...
        primary_clock pc (.clock_src(csrc));
        clock_gen cg (.in_clock(csrc), .freq1(freq1), .phase1(ph1),...);
        clock_distributor cd (.outclk1(clk1),..., .inclk1(freq1),...);
    endmodule