Technology Scaling and Low-Power Circuit Design: Circuit Design Challenges and Leakage Control Techniques

Circuit Design Challenges and Leakage Control Techniques

We have established that as supply voltage scales down with technology, the transistor threshold voltage must also be reduced to maintain high performance, which results in excessive subthreshold leakage power. This section discusses circuit techniques for leakage avoidance, control, and tolerance that mitigate subthreshold leakage.

Leakage power has become a larger fraction of the total active power of microprocessors (Figure 21.19). This poses serious challenges for heat removal and power delivery in high-performance processors. Excessive leakage power can also cause thermal runaway during burn-in and increase the burn-in cost. Subthreshold leakage dominates at high temperature, and gate oxide leakage is a significant contributor to the burn-in leakage power owing to the higher voltage used. Standby leakage power at room temperature also needs to be kept sufficiently small for battery-operated systems. Consequently, today’s designers treat power and leakage management as part of the design objectives. Researchers have suggested various means of making designs power aware, which are discussed in the following sections.

Dual-Vt and Body Bias

Dual-Vt designs can reduce leakage power during active operation, burn-in, and standby. Two Vt’s are provided by the process technology for each transistor. Performance-critical transistors are made low-Vt to provide the target chip performance. The rest of the transistors are made high-Vt to minimize leakage power [21]. Since the full-chip frequency is dictated by only the fraction of transistors that lie in the critical paths, this selective Vt assignment is possible without degrading the overall chip performance achievable by using a single low-Vt transistor everywhere. Figure 21.20 shows an example circuit block, where an all-low-Vt design provides 24% delay improvement over an all-high-Vt design. Notice that as low-Vt devices are inserted (y-axis), the delay improves (x-axis). Only 34% of the total transistor width needs to be low-Vt in this example to reach the same frequency as using low-Vt everywhere. Typically, low-Vt device leakage is 10X higher than high-Vt. Thus, by carefully employing low-Vt for up to 34% of the total width, 24% delay improvement is possible with ~3X increase in leakage, compared to an all-high-Vt design.
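The ~3X figure can be sanity-checked with simple arithmetic. The sketch below assumes that leakage scales linearly with transistor width and that every low-Vt device leaks exactly 10X as much as a high-Vt device of the same width. Both are simplifying assumptions, and the function name `relative_leakage` is only illustrative.

```python
def relative_leakage(low_vt_width_fraction, low_vt_leakage_ratio=10.0):
    """Total leakage normalized to an all-high-Vt design.

    Assumes leakage is proportional to width and that low-Vt devices
    leak `low_vt_leakage_ratio` times more per unit width (assumption).
    """
    if not 0.0 <= low_vt_width_fraction <= 1.0:
        raise ValueError("width fraction must be between 0 and 1")
    high_vt_width_fraction = 1.0 - low_vt_width_fraction
    return high_vt_width_fraction + low_vt_width_fraction * low_vt_leakage_ratio


mixed = relative_leakage(0.34)   # 0.66 * 1 + 0.34 * 10 = 4.06
all_low = relative_leakage(1.0)  # 10.0

print(f"34% low-Vt: {mixed:.2f}x total, {mixed - 1.0:.2f}x added")
print(f"all low-Vt: {all_low:.2f}x total")
```

Under these assumptions the mixed design leaks about 4.06 times as much as the all-high-Vt design, that is, roughly 3X of added leakage, against 10 times for the all-low-Vt design at the same frequency.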

Another technique to reduce leakage power during burn-in and standby is to apply reverse-body bias (RBB) to the transistors to increase Vt, since high performance is not required during these modes. There is an optimal RBB value that minimizes leakage power, as shown in Figure 21.21 [22]. Using RBB values larger than this optimum causes the junction leakage current to increase and the overall leakage power to go up. In sub-100 nm technology generations, approximately 500 mV RBB is optimal, and a two to three times reduction in leakage current is achievable. However, the effectiveness of RBB reduces as channel lengths become smaller or Vt values are lowered (Figure 21.22) [23]. Essentially, the Vt-modulation capability of RBB weakens as short-channel effects become worse or as the body effect diminishes owing to lower channel doping.

RBB therefore becomes less effective with technology scaling, as leakage currents are pushed higher by shorter L or lower Vt. We must exploit the RBB technique for leakage control while it lasts.
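The existence of an optimal RBB value can be illustrated with a toy model in which subthreshold leakage falls exponentially with the Vt shift while junction leakage rises exponentially with the bias. The sketch below is not fitted to Figure 21.21 or to any process. All parameter values (`body_coeff`, `swing`, `i_junc0`, `junc_slope`) are hypothetical and were picked only so that the minimum lands near the 500 mV quoted above.

```python
import math


def total_leakage(v_rbb, i_sub0=1.0, i_junc0=0.02,
                  body_coeff=0.15, swing=0.1, junc_slope=4.0):
    """Normalized leakage versus reverse body bias (toy model).

    All parameters are illustrative assumptions:
      body_coeff : Vt shift per volt of RBB (V/V)
      swing      : subthreshold swing (V/decade)
      junc_slope : exponential growth rate of junction leakage (1/V)
    """
    delta_vt = body_coeff * v_rbb
    i_sub = i_sub0 * 10 ** (-delta_vt / swing)        # falls with RBB
    i_junc = i_junc0 * math.exp(junc_slope * v_rbb)   # rises with RBB
    return i_sub + i_junc


def optimal_rbb(v_max=1.5, steps=300):
    """Grid search for the bias that minimizes total leakage."""
    candidates = [v_max * k / steps for k in range(steps + 1)]
    return min(candidates, key=total_leakage)


v_opt = optimal_rbb()
print(f"optimal RBB ~ {v_opt:.3f} V, "
      f"leakage reduction ~ {total_leakage(0.0) / total_leakage(v_opt):.1f}x")
```

Lowering `body_coeff` in this model, which mimics a weaker body effect, shrinks the achievable reduction. This is the qualitative trend described above for shorter channels and lower Vt.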

The forward-body bias (FBB) technique is an alternative discussed in Refs. [24,25]. In this technique, we apply FBB between the source and body terminals of a transistor, which reduces Vt and hence improves circuit performance, but also increases source/drain leakage, as shown in Figure 21.23. This figure shows that at 1 GHz operation for the same operating supply voltage, we can achieve 25% power saving with FBB. Alternatively, at 1 V supply voltage, we can achieve 35% higher frequency of operation. To utilize FBB effectively, we should start with a higher-Vt transistor, provided by a process technology with inherently higher channel doping and lower source/drain leakage, and improve performance by applying FBB during operation. Although source/drain leakage is increased during active operation of the circuit, it reverts to its lower value when the FBB is removed. This can be used to provide substantial leakage savings during burn-in operation, as shown in Figure 21.24. Low-Vt devices achieved by FBB can result in 30X saving in standby leakage because we can remove FBB and apply RBB during standby.

RBB and FBB can be combined and applied adaptively in a technique called adaptive body biasing (ABB), either to reduce source/drain leakage at reduced performance or to improve performance at higher source/drain leakage [26]. Figure 21.25 and Figure 21.26 show a test chip and the resulting bin split obtained by employing the ABB technique. In this experiment, a test chip is subdivided into 23 subsites, and each site has circuitry to apply FBB and RBB in small increments. PMOS transistors in each site may be biased individually; however, all NMOS transistors of the test chip share the same body bias. During the test, FBB is applied to slower dies to improve performance, and RBB is applied to faster, leakier dies to reduce source/drain leakage. The experiment shows that 100% yield can be achieved, with a significant boost to the high-frequency bin, by applying ABB at the full-die level (ignoring subsites). Furthermore, by applying selective body bias to PMOS transistors in the different subsites, the high-frequency bin grows even further to 97% [27].
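The die-level decision described above can be sketched as a simple rule. This is a minimal illustration, not the control scheme of the test chip in [26,27]. The function name, the two thresholds, and the units are hypothetical, and a real implementation would step the bias in small increments and re-measure.

```python
def choose_body_bias(freq_ghz, leakage_ma, target_freq_ghz, leakage_limit_ma):
    """Pick a bias direction for one die (illustrative rule only).

    Slow dies get forward body bias to gain speed; dies that meet the
    frequency target but leak too much get reverse body bias.
    Note: a die that is both slow and leaky gets FBB here, which makes
    its leakage worse; this sketch does not handle that trade-off.
    """
    if freq_ghz < target_freq_ghz:
        return "FBB"
    if leakage_ma > leakage_limit_ma:
        return "RBB"
    return "none"


# Hypothetical measurements: (frequency in GHz, leakage in mA)
dies = [(0.92, 4.0), (1.15, 18.0), (1.05, 6.0)]
for freq, leak in dies:
    print(freq, leak, choose_body_bias(freq, leak, 1.0, 10.0))
```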

ABB also has some merit in dealing with increased circuit leakage and frequency variability, a topic that is not discussed here. However, Figure 21.27 shows how leakage, power, and frequency spreads can be modulated by applying body bias. ABB can deal with both die-to-die and within-die parameter variations.

Although the effectiveness of dual-Vt designs for low-power applications has been shown, there are several challenges in implementing them. One issue is the variation that exists in transistor threshold voltage: each target Vt has a spread around it. Accounting for this variation when setting the Vt targets has created design challenges. Making two precise Vt values has become daunting in the presence of variation, thereby reducing the effectiveness of dual-Vt designs [28]. From the transistor design point of view, targeting a low-Vt transistor competes with managing device short-channel effects, and architecting high-Vt transistors makes managing junction leakage more important. If we use halo doping to manage the short-channel effect of low-Vt transistors, then we also need to worry about band-to-band tunneling junction leakage in the low-Vt transistors.

Stack Effect and Sleep Transistors

Leakage current through series-connected transistors or transistor “stacks,” with more than one device “off,” is at least an order of magnitude smaller than that through a single device (Figure 21.28) [29]. This so-called “stack effect” can be exploited for leakage reduction in circuits. The stack effect factor, defined as the ratio of single device leakage to stack leakage, increases as the DIBL factor becomes larger and supply voltage increases. As the rate of supply voltage scaling diminishes and DIBL effects become stronger with technology scaling, the effectiveness of leakage reduction by stacks becomes higher (Figure 21.29).
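The dependence of the stack effect factor on DIBL and supply voltage can be explored numerically. The sketch below uses a simplified subthreshold model (exponential in gate-source voltage, with a DIBL term proportional to drain-source voltage) and ignores the body effect, so it is an illustration under stated assumptions rather than the model behind Figures 21.28 and 21.29. The function names and the default 100 mV/decade swing are assumptions. The intermediate node voltage of a two-transistor "off" stack is found by bisection, using the fact that both devices must carry the same current.

```python
import math

THERMAL_VOLTAGE = 0.0259  # volts, roughly room temperature


def subthreshold_current(vgs, vds, dibl, swing):
    """Normalized subthreshold current of an NMOS (body effect ignored)."""
    return (10 ** ((vgs + dibl * vds) / swing)
            * (1.0 - math.exp(-vds / THERMAL_VOLTAGE)))


def stack_factor(vdd, dibl, swing=0.1):
    """Ratio of single-device leakage to two-transistor stack leakage."""
    single = subthreshold_current(0.0, vdd, dibl, swing)

    # Bisection on the intermediate node voltage vx (both gates at 0 V).
    lo, hi = 0.0, vdd
    for _ in range(100):
        vx = 0.5 * (lo + hi)
        upper = subthreshold_current(-vx, vdd - vx, dibl, swing)
        lower = subthreshold_current(0.0, vx, dibl, swing)
        if upper > lower:
            lo = vx   # node still charging up: raise vx
        else:
            hi = vx
    vx = 0.5 * (lo + hi)
    stack = subthreshold_current(0.0, vx, dibl, swing)
    return single / stack


for dibl in (0.05, 0.10, 0.15):
    print(f"DIBL = {dibl:.2f} V/V, Vdd = 1.0 V -> "
          f"stack factor ~ {stack_factor(1.0, dibl):.1f}")
```

In this model the factor grows with both the DIBL coefficient and the supply voltage, in line with the trend stated above.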

Leakage reduction by stack effect can be exploited by converting a single transistor into a two-transistor stack in a logic circuit. The widths of these transistors can be half of the original size or other combinations can be chosen to preserve the same input capacitance load as the original single device. Leakage versus delay trade-off provided by this “stack forcing” technique applied to both high-Vt and low-Vt devices is illustrated in Figure 21.30. Clearly, stack forcing can be used to emulate additional higher Vt devices without increasing process complexity. Stack forcing can be applied to transistors in noncritical paths in single-Vt or dual-Vt designs to reduce overall chip leakage power without impacting chip performance. Also, robustness of leakage-sensitive circuits can be improved by this technique.

Leakage versus delay trade-offs offered by stack forcing are compared with similar trade-offs achievable by increasing transistor channel lengths (Figure 21.31). Increasing transistor length reduces leakage because of threshold roll-off and the width reduction mandated by preserving the original input capacitance. In sub-100 nm technology, where halo doping is used, reverse Vt roll-off is typically observed for channel lengths longer than nominal. Furthermore, two-dimensional potential distribution effects dictate that doubling the channel length is less effective for leakage reduction than stacking two transistors, especially when the DIBL is high. Simulation results confirm this behavior and show that the channel length has to be made three times as large to get the same leakage as a stack of two transistors, resulting in 60% worse delay. Clearly, then, “stack forcing” is preferred for leakage control if the channel length needs to be more than doubled to achieve the target low leakage.

Typically, large circuit blocks contain some series-connected devices in complex logic gates. These so-called “natural stacks” can be exploited to reduce standby leakage [30]. Leakage power of a large circuit block such as a 32-bit static CMOS Kogge-Stone adder depends strongly on the primary input vector (Figure 21.32). The total “off” device width and the number of transistor stacks with two or more “off” devices change as primary input vectors change. This causes the leakage power to vary with input vector. When a circuit block is “idle,” one can store the input vector that provides the least amount of leakage at the primary input flops. This can reduce the standby leakage power by 2X. There is no performance overhead since this predetermined input vector can be encoded in the feedback path of the input flip-flop. The minimum time required in standby mode, so that the energy overhead for entry and exit into this mode is <10% of the leakage energy saved, is tens of microseconds. This time reduces further with technology scaling as leakage levels increase, making this technique more attractive. Of course, EDA tools will be needed to identify this “lowest leakage” input vector efficiently during design phase for each circuit block.
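The search for a lowest-leakage input vector can be sketched on a toy netlist. Everything below is hypothetical: the three-gate NAND2 netlist, the signal names, and the normalized per-state leakage values are invented for illustration. The values assume that two "off" NMOS devices in series leak about 10X less than one, and that two "off" PMOS devices in parallel leak twice as much as one.

```python
from itertools import product

# Normalized NAND2 leakage per input state (illustrative assumptions):
#   (0, 0): both series NMOS off  -> stack effect, low leakage
#   (0, 1) / (1, 0): one NMOS off -> single-device leakage
#   (1, 1): output low, two parallel PMOS off
NAND2_LEAKAGE = {(0, 0): 0.1, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}

PRIMARY_INPUTS = ["a", "b", "c", "d"]
# (output, input1, input2), listed in topological order
NETLIST = [("g1", "a", "b"), ("g2", "c", "d"), ("g3", "g1", "g2")]


def block_leakage(vector):
    """Sum gate leakage for one primary input vector."""
    values = dict(zip(PRIMARY_INPUTS, vector))
    total = 0.0
    for out, in1, in2 in NETLIST:
        state = (values[in1], values[in2])
        total += NAND2_LEAKAGE[state]
        values[out] = 0 if state == (1, 1) else 1  # NAND function
    return total


def lowest_leakage_vector():
    """Exhaustive search over all primary input vectors."""
    all_vectors = product((0, 1), repeat=len(PRIMARY_INPUTS))
    return min(all_vectors, key=block_leakage)


best = lowest_leakage_vector()
print(best, block_leakage(best))
```

Exhaustive enumeration is only practical for a handful of inputs. The number of vectors doubles with every added primary input, which is why efficient tool support is needed for blocks such as a 32-bit adder.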

Another promising leakage control technique is to employ sleep transistors [31], which act like brute-force switches between the logic and the power rails. They are turned off when the logic is not in use to reduce source/drain leakage, as shown in Figure 21.33. The sleep transistors may be high or low Vt, and may be simply turned off (Vgs = 0) or underdriven (Vgs < 0); the leakage reduction varies accordingly. Switching sleep transistors on and off consumes energy, and this must be taken into account in evaluating the overall leakage reduction benefit. Figure 21.34 shows the potential benefits of sleep transistors when applied to an ALU, and compares them to body bias [32,33]. For the same performance (frequency), the sleep transistor-based ALU must operate at a higher supply voltage, resulting in higher active power when compared to body bias. Yet the leakage savings owing to sleep transistors are higher, and the total power of the ALU is the lowest with sleep transistors for the same performance.
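The switching-energy trade-off can be expressed as a break-even idle time. The sketch below assumes a constant leakage power saving while the block sleeps and a fixed energy cost per sleep/wake cycle. The function names and the numeric values are hypothetical.

```python
def net_energy_saved(idle_time_s, leakage_power_saved_w, switch_energy_j):
    """Leakage energy saved during one idle period, minus the energy
    spent turning the sleep transistor off and on again."""
    return leakage_power_saved_w * idle_time_s - switch_energy_j


def min_idle_time(leakage_power_saved_w, switch_energy_j,
                  max_overhead_fraction=1.0):
    """Shortest idle period for which the switching energy is at most
    `max_overhead_fraction` of the leakage energy saved.

    A fraction of 1.0 gives the break-even point; 0.1 corresponds to a
    10% overhead budget.
    """
    return switch_energy_j / (max_overhead_fraction * leakage_power_saved_w)


# Hypothetical block: 1 mW of leakage saved, 1 nJ per sleep/wake cycle
saved_w, switch_j = 1e-3, 1e-9
print("break-even idle time (s):", min_idle_time(saved_w, switch_j))
print("idle time for 10% overhead (s):", min_idle_time(saved_w, switch_j, 0.1))
```

Idle periods shorter than the break-even time cost more energy than they save, so a power manager would normally gate the block only when the expected idle time exceeds this threshold.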

The use of sleep transistors has not been limited to logic design, but has been extended to static RAMs (SRAMs) [34–36]. Since leakage power is a significant fraction of the total cache power, it is imperative to use such low-leakage techniques in conventional memory designs. This, however, comes at the cost of memory access speed and cell stability. Proper design techniques are necessary to lower the leakage in memories without affecting the data retention capability or read/write stability.

Ultralow-Power Circuit Design

In recent years, the demand for power-sensitive, battery-operated, and hand-held devices has increased substantially, thereby necessitating research into ultralow-power circuits that operate at a few tens to a few hundreds of kilohertz. Subthreshold logic (where the supply voltage, VDD, is below the threshold voltage, Vt) has emerged as a popular choice for realizing extremely low-power digital systems [37,38]. The advantage in power reduction comes not only from the reduced VDD, but also from the reduced gate capacitance in the subthreshold region. Considerable work has already been done in realizing digital signal processors in the subthreshold domain [39] and in optimizing the design techniques for them [40]. Parameter variation poses a challenge for designing circuits that operate in the subthreshold region.
