Timing and Signal Integrity Analysis: Clocked Circuits
Clocked Circuits
As mentioned earlier, combinational circuits have timing checks imposed only at the circuit primary outputs. However, for circuits containing clocked elements such as latches, flip-flops, gated clocks, domino/precharge logic, etc., timing checks must also be enforced at various internal nodes in the circuit to ensure that the circuit operates correctly and at-speed. In circuits containing clocked elements, a separate recognition step is required to detect the clocked elements and to insert constraints. There are two main techniques for detecting clocked elements: pattern recognition and clock propagation.
In pattern recognition-based approaches, commonly used sequential elements are recognized using simple topological rules. For example, back-to-back inverters in the netlist are often an indication of a latch. For more complex topologies, the detection is accomplished using templates supplied by the user. Portions of a circuit are typically recognized in the graph of the original circuit by employing subgraph isomorphism algorithms [9]. Once a subcircuit has been recognized, timing constraints are automatically inserted. Another application of pattern-based subcircuit recognition is to determine logical relationships between signals. For example, in pass-gate multiplexors, the data select lines are typically one-hot. This relationship cannot be obtained from the transistor-level circuit representation without recognizing the subcircuit and imposing the logical relationships for that subcircuit. The logical relationship can then be used by timing analysis tools. However, purely pattern recognition-based approaches can be restrictive and may necessitate a large number of templates from the user for proper functioning.
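The simplest topological rule mentioned above, back-to-back inverters as a hint of a latch, can be sketched as follows. This is only an illustration under strong assumptions: the netlist is assumed to be already abstracted into a list of inverters given as (input, output) node pairs, and the function and node names are hypothetical. A real tool works on the transistor-level graph and uses subgraph isomorphism against templates.

```python
def find_back_to_back_inverters(inverters):
    """Return sorted node pairs (a, b) joined by an inverter a->b and an inverter b->a.

    `inverters` is a hypothetical, simplified netlist view: a list of
    (input_node, output_node) tuples, one per inverter.
    """
    edges = set(inverters)
    pairs = set()
    for a, b in edges:
        # A cross-coupled pair shows up as two inverters driving each other.
        if a != b and (b, a) in edges:
            pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)


# Hypothetical example: one cross-coupled pair plus an ordinary two-inverter buffer.
inverters = [("d_int", "q"), ("q", "d_int"), ("in", "n1"), ("n1", "out")]
latch_candidates = find_back_to_back_inverters(inverters)
print(latch_candidates)
```

Each reported pair is only a latch candidate; as the text notes, more complex topologies need user-supplied templates.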
In clock propagation-based approaches, the recognition is performed automatically by propagating clock signals along the timing graph and determining how these clock signals interact with data signals at various nodes in the circuit. The primary input clocks are identified by the user and are marked as (simple) clock nodes. Starting from the primary clock inputs and traversing the timing arcs in the timing graph, the type of each node is determined based on simple rules. These rules are illustrated in Figure 63.3, where we show the transistor-level subcircuits and the corresponding timing subgraphs for some common sequential elements.
• A node that has only one clock signal incident on it and no feedback is classified as a simple clock node (Figure 63.3(a)).
• A node that has one clock and one or more data signals incident on it, but no feedback, is classified as a gated clock node (Figure 63.3(b)).
• A node that has multiple clock signals (and zero or more data signals) incident on it and no feedback is classified as a merged clock node (Figure 63.3(c)).
• A node that has at least one clock and zero or more data signals incident on it and has a feedback loop of length two (i.e., back-to-back timing arcs) is classified as a latch node (Figure 63.3(d)). The other node in the two-node feedback loop is called the latch output node. A latch node is of type data. The timing arcs from the latch output node to the latch node are broken.
Latches can be of two types: level-sensitive and edge-triggered. To distinguish between edge-triggered and level-sensitive latches, various rules may be applied. These rules are usually design-specific and will not be discussed here. It is assumed that all latches are level-sensitive unless the user has marked certain latches to be edge-triggered.
Note that the domino gates of Figure 63.3(e) also satisfy the conditions for a latch node. For a latch node, both data and clock signals cause rising and falling transitions at the latch node. For domino gates, data inputs a and b cause only falling transitions at the domino node x. This condition can be used to distinguish domino nodes from latch nodes. Footed and footless domino gates can be distinguished from each other by looking at the clock transitions on the domino node. Since the footed gate has the clocked nMOS transistor at the “foot” of the evaluate tree, the clock signal at CK causes both rising and falling transitions at node x. In the footless domino gate, CK causes only a rising transition at node x.
Clock propagation stops when a node has been classified as a data node. This type of detection can be easily performed with a simple breadth-first search on the timing graph.
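The classification rules and the breadth-first traversal described above can be sketched roughly as follows. This is a minimal illustration, not the algorithm of any particular tool: the timing graph is assumed to be a dictionary mapping each node to its fanout nodes, all names are hypothetical, and the transition-based test that separates domino nodes from latch nodes is not modeled (any node with a two-node feedback loop is simply reported as a latch).

```python
from collections import deque

CLOCK_TYPES = {"simple_clock", "gated_clock", "merged_clock"}


def propagate_clocks(graph, clock_inputs):
    """Classify nodes reachable from the user-marked clock inputs.

    graph: dict node -> list of fanout nodes (timing arcs), an assumed format.
    Returns (kind, broken) where kind maps node -> type and broken is the set
    of feedback arcs (latch_output, latch) that would be cut.
    """
    clock_inputs = set(clock_inputs)
    fanin = {}
    for u, succs in graph.items():
        for v in succs:
            fanin.setdefault(v, set()).add(u)

    kind = {n: "simple_clock" for n in clock_inputs}
    broken = set()
    queue = deque(sorted(clock_inputs))
    while queue:
        u = queue.popleft()
        if kind[u] not in CLOCK_TYPES:
            continue  # propagation stops at data (latch) nodes
        for v in graph.get(u, ()):
            if v in clock_inputs:
                continue
            clocks = {p for p in fanin[v] if kind.get(p) in CLOCK_TYPES}
            data = fanin[v] - clocks
            # Feedback of length two: v drives w and w drives v.
            partners = [w for w in graph.get(v, ()) if w != v and v in graph.get(w, ())]
            if partners:
                new = "latch"
                broken.update((w, v) for w in partners)
            elif len(clocks) > 1:
                new = "merged_clock"
            elif data:
                new = "gated_clock"
            else:
                new = "simple_clock"
            if kind.get(v) != new:  # revisit fanout only when the type changes
                kind[v] = new
                queue.append(v)
    return kind, broken


# Hypothetical timing graph: buffer, gated clock, merged clock, and a latch.
graph = {
    "CK": ["ckb", "gck", "mck"],
    "CK2": ["mck"],
    "en": ["gck"],
    "d": ["lat"],
    "ckb": ["lat"],
    "lat": ["q"],
    "q": ["lat", "out"],
}
kinds, broken_arcs = propagate_clocks(graph, ["CK", "CK2"])
print(kinds)
print(broken_arcs)
```

A node is re-queued only when its type changes, which handles the case where a second clock reaches a node after it was first seen with a single clock. Nodes downstream of the latch (here `q` and `out`) are never visited, mirroring the rule that propagation stops at data nodes.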
Once the sequential elements have been recognized, timing constraints must be inserted to ensure that the circuit functions correctly and at-speed [10]. These are described below and illustrated in Figures 63.4 and 63.5.
• Simple clocks: In this case, no timing checks are necessary. The arrival times and slopes at the simple clock node are obtained just as for a normal data node.
• Gated clocks: The basic purpose of a gated clock is to enable or disable clock transitions at the input of the gate from propagating to the output of the gate. This is done by setting the value of the data input. For example, in the gated clock of Figure 63.3(b), setting the data input to 1 will allow the clock waveform to propagate to the output, whereas setting the data input to 0 will disable transitions at the gate output. To make sure that this is indeed the behavior of the gated clock, the timing constraints should be such that transitions at the data input node(s) do not create transitions at the output node. For the gated NAND clock of Figure 63.3(b), we have to ensure that the data can transition (high or low) only when the clock is low, i.e., data can transition after the clock turns low (short path constraint) and before the clock turns high (long path constraint). This is shown in Figure 63.4(a). In addition to imposing this timing constraint, we also break the timing arc from the data node to the gated clock node since data transitions cannot create output clock transitions.
• Merged clocks: Merged clocks are difficult to handle in static TA since the output clock waveform may have a different clock period compared to the input clocks. Moreover, the output clock waveform depends on the logical operation performed by the gate. To avoid these problems, static TA tools typically ask the user to provide the waveform at the merged clock node and the merged clock node is treated as a (simple) clock input node with that waveform. Users can obtain the clock waveform at the merged clock node by using dynamic simulation with the input clock waveforms.
• Edge-triggered latches: An edge-triggered latch has two types of constraints: set-up constraint and hold constraint. The set-up constraint requires that the data input node should be ready (i.e., the rising and falling signals should have stabilized) before the latch turns on. In the latch shown in Figure 63.3(d), the latch is turned on by the rising edge of the clock. Hence, the data should arrive some time before the rising edge of the clock (this time margin is typically referred to as the set-up time of the latch). This constraint imposes a required time on the latest (or maximum) arrival time at the data input of the latch and is therefore a long path constraint. This is shown in Figure 63.4(b). The hold constraint ensures that data meant for the current clock cycle does not accidentally appear during the on-phase of the previous clock cycle. Looking at Figure 63.4(b), this implies that the data should appear some time after the falling edge of the clock (this time margin is called the hold time of the latch). The hold time imposes a required time on the early (or minimum) arrival time at the data input node and is therefore a short path constraint. As the name implies, in edge-triggered latches, the on-edge of the clock causes data to be stored in the latch (i.e., causes transitions at the latch node). Since the data input is ready before the clock turns on, the latest arrival time at the latch node will be determined only by the clock signal. To make sure that this is indeed the behavior of the latch, the timing arc from the data input node to the latch node is broken, as shown in Figure 63.4(b). One additional set of timing constraints is imposed for an edge-triggered latch. Since data is stored at the latch (or latch output) node, we must ensure that the data gets stored before the latch turns off. In other words, signals should arrive at the latch output node before the off-edge of the clock.
• Level-sensitive latches: In the case of level-sensitive latches, the data need not be ready before the latch turns on, as is the case for edge-triggered latches. In fact, the data can arrive after the on-edge of the clock; this is called cycle stealing or time borrowing. The only constraint in this case is that the data gets latched before the clock turns off. Hence, the set-up constraint for a level-sensitive latch is that signals should arrive at the latch output node (not the latch node itself) before the falling edge of the clock, as shown in Figure 63.4(c). The hold constraint is the same as before; it ensures that data meant for the current clock cycle arrives only after the latch was turned off in the previous clock cycle. This is also shown in Figure 63.4(c). Since the latest arriving signal at the latch node may come from either the data or the clock node, timing arcs are not broken for a level-sensitive latch. Since data can flow through the latch, level-sensitive latches are also referred to as transparent latches.
• Domino gates: Domino circuits have two distinct phases of operation: precharge and evaluate [11]. Looking at the domino gate of Figure 63.3(e), we see that in the precharge phase, the clock signal is low and the domino node x is precharged to a high value and the output node y is pre-discharged to a low value. During the evaluate phase, the clock is high and if the values of the gate inputs establish a path to ground, domino node x is discharged and output node y turns high. The difference between footed and footless domino gates is the clocked nMOS transistor at the “foot” of the nMOS evaluate tree. To demonstrate the timing constraints imposed on domino circuits, consider the domino circuit block diagram and the clock waveforms shown in Figure 63.5. The footed domino blocks are labeled FD1 and FD2, and the footless blocks are labeled FLD1 and FLD2. From Figure 63.5(b), note that all three clocks have the same period 2T, but the falling edge of CK2 is 0.25T after the falling edge of CK1, which in turn is 0.5T after the falling edge of CK0. Therefore, the precharge phase for FD1 and FD2 is T, for FLD1 is 0.5T, and for FLD2 is 0.25T. The various timing constraints for domino circuits are illustrated in Figure 63.5 and discussed below.
1. We want the output O to evaluate (rise) before the clock starts falling and to precharge (fall) before the clock starts rising.
2. Consider node N1, which is an output of FD1 and an input of FD2. N1 starts precharging (falling) when CK0 falls, and the constraint on it is that it should finish precharging before CK0 starts rising.
3. Next, consider node N2, which is an input to FLD1 clocked by CK1. Since this block is footless, N2 should be low during the precharge phase to avoid short-circuit current. N2 starts precharging (falling) when CK0 starts falling and should finish falling before CK1 starts falling. Note that the falling edges of CK0 and CK1 are 0.5T apart, and the precharge constraint is on the late or maximum arrival time of N2 (long path constraint). Also, N2 should start rising only after CK1 has finished rising. This is a constraint on the early or minimum arrival time of N2 (short path constraint). In this example, N2 starts rising with the rising edge of CK0 and, since all the clock waveforms rise at the same time, the short path constraint will be satisfied trivially.
4. Finally, consider node N3. Since N3 is an input of FLD2, it must satisfy the short-circuit current constraints. N3 starts precharging (falling) when CK1 starts falling and it should fall completely before CK2 starts falling. Since the two clock edges are 0.25T apart, the precharge constraint on N3 is tighter than the one on N2. As before, the short path constraint on N3 is satisfied trivially.
The above discussion highlights the various types of timing constraints that must be automatically inserted by the static TA tool.
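The precharge windows in the domino example can be worked out numerically from the clock edges. The sketch below assumes the waveforms described for Figure 63.5(b): period 2T, all clocks rising together, CK1 falling 0.5T after CK0 and CK2 falling 0.25T after CK1. Placing the falling edge of CK0 at time 0 is a choice made here for convenience, and the precharge (fall) delays passed in are made-up values used only to exercise the check.

```python
T = 1.0  # half of the clock period, in arbitrary time units

# Edge times within one cycle; CK0 falls at t = 0 and all clocks rise at t = T.
clocks = {
    "CK0": {"fall": 0.0, "rise": T},
    "CK1": {"fall": 0.5 * T, "rise": T},
    "CK2": {"fall": 0.75 * T, "rise": T},
}

# (node, edge that starts the precharge, edge by which it must be finished)
checks = [
    ("N1", ("CK0", "fall"), ("CK0", "rise")),
    ("N2", ("CK0", "fall"), ("CK1", "fall")),
    ("N3", ("CK1", "fall"), ("CK2", "fall")),
]


def precharge_phase(clock):
    """Length of the low (precharge) phase of a clock."""
    return clocks[clock]["rise"] - clocks[clock]["fall"]


def precharge_window(launch, capture):
    """Time available between the launching and the capturing clock edge."""
    return clocks[capture[0]][capture[1]] - clocks[launch[0]][launch[1]]


def precharge_slacks(fall_delay):
    """Long-path slack per node; a negative value is a precharge violation."""
    return {
        node: precharge_window(launch, capture) - fall_delay[node]
        for node, launch, capture in checks
    }


windows = {node: precharge_window(l, c) for node, l, c in checks}
slacks = precharge_slacks({"N1": 0.3, "N2": 0.3, "N3": 0.3})  # hypothetical delays
print(windows)
print(slacks)
```

With the same hypothetical fall delay on every node, N3 is the first to fail, which matches the observation that its 0.25T window is tighter than the 0.5T window on N2.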
Note that each relative timing constraint between two signals is actually composed of two constraints. For example, if signal d must rise before clock CK rises, then (1) there is a required time on the late or maximum rising arrival time at node d (i.e., $A_{d,r} < A_{CK,r}$), and (2) there is a required time on the early or minimum rising arrival time at the clock node CK (i.e., $a_{CK,r} < a_{d,r}$). There is one other point to be noted. Set-up and hold constraints are fundamentally different in nature. If a hold constraint is violated, then the circuit will not function at any frequency. In other words, hold constraints are functional constraints. Set-up constraints, on the other hand, are performance constraints. If a set-up constraint is violated, the circuit will not function at the specified frequency, but it will function at a lower frequency (lower speed of operation). For domino circuits, precharge constraints are functional constraints, whereas evaluate constraints are performance constraints.
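The difference between performance and functional constraints can be illustrated with a deliberately simplified model. It is an assumption made for this sketch, not the exact checks of Figure 63.4: data is launched and captured by the same clock edge one period apart, the set-up check compares the late arrival against the period, and the hold check compares the early arrival against the hold time. All function names and numbers are hypothetical, with times in picoseconds.

```python
def setup_slack(period, late_arrival, setup_time):
    """Long-path slack: depends on the clock period."""
    return period - setup_time - late_arrival


def hold_slack(early_arrival, hold_time):
    """Short-path slack: the clock period does not appear at all."""
    return early_arrival - hold_time


def min_period(late_arrival, setup_time):
    """Smallest period at which the set-up check is just met."""
    return late_arrival + setup_time


# Hypothetical path: late arrival 1200 ps, early arrival 50 ps.
for period in (1000, 1500):
    print(
        period,
        setup_slack(period, late_arrival=1200, setup_time=100),
        hold_slack(early_arrival=50, hold_time=100),
    )
print(min_period(late_arrival=1200, setup_time=100))
```

In this model, lengthening the period from 1000 ps to 1500 ps turns the set-up slack from negative to positive, while the hold slack stays negative at every period, so the hold violation has to be fixed in the circuit itself.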