Power-Aware Architectural Synthesis: Challenges of Low-Power Synchronous System Synthesis and Design

Challenges of Low-Power Synchronous System Synthesis and Design

This section introduces the fundamentals necessary to understand the sources of power consumption in synchronous digital systems (Section 17.2.1). It then describes a number of techniques that may be used during behavioral synthesis and system synthesis to improve power and thermal characteristics (Sections 17.2.2–17.2.5).

Power Overview

With increasing system integration and aggressive technology scaling, power consumption has become a major challenge in digital system design. In high-performance computer systems, power and thermal issues are key design concerns. Power management and optimization techniques are essential for minimizing system temperature to permit reliable operation. For portable devices, prolonging battery lifetimes and minimizing packaging costs are primary design challenges. Power also interacts with other design metrics, such as performance, cost, and reliability, thereby further increasing design complexity. For example, the failure rate of electronic devices is a strong function of system temperature, which is in turn controlled by system power dissipation. Increasing power consumption therefore requires more complicated cooling and packaging solutions to sustain system reliability, which in turn increases costs. As projected by the International Technology Roadmap for Semiconductors (ITRS) [2], power will continue to be a limiting factor in future technologies. There is an increasing need to address power issues in a systematic way at all levels of the design process.

In digital CMOS circuits, power dissipation is the sum of dynamic power, Pdynamic, and static power, Pstatic. Dynamic power results from the charging and discharging of the capacitance of CMOS gates and interconnect during circuit switching, Pswitch, and from transient short-circuit current while inputs are in transition, Pshort circuit. For synchronous CMOS designs, switching power is one of the dominant sources of power consumption. It is a function of physical capacitance, C, switching activity, s, clock frequency, f, and supply voltage, Vdd:

$$P_{switch} = \frac{1}{2}\, s\, C\, V_{dd}^{2}\, f \qquad (17.1)$$

The product of physical capacitance, C, and switching activity, s, is also called switched capacitance.

In CMOS, the other major source of power consumption, static power, Pstatic, results from leakage current. Leakage current has five basic components: reverse-biased PN junction current, subthreshold leakage, gate leakage, punch-through current and gate tunneling current. Of these five components, subthreshold and gate leakage will remain dominant during the next few years. The subthreshold leakage power is given by

$$P_{subthreshold} = V_{dd}\, I_{sub}\, \frac{W}{L}\, e^{-V_{th}/(n V_{T})} \qquad (17.2)$$

where Isub and n are technology parameters, W and L are device geometries, Vth is the threshold voltage, and VT is the thermal voltage [3]. Gate leakage is the current between the gate terminal and any of the other three terminals (drain, source, or body). As a result of technology scaling, gate leakage increases exponentially due to decreasing gate oxide thickness.

From Eqs. (17.1) and (17.2), it can be seen that total power consumption may be reduced by attacking operating voltage, capacitance, switching activity, threshold voltage, transistor size, and temperature. In real designs, the variables upon which total power consumption depends are often closely related: reducing one may increase another. In addition, reducing power consumption may have a negative impact on other design metrics. A synthesis algorithm must simultaneously consider, and trade off, these design metrics.
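The dependence of power on these variables can be explored numerically. The following is a minimal sketch, assuming the common textbook forms P_switch = (1/2) s C Vdd^2 f and P_subthreshold = Vdd Isub (W/L) exp(-Vth / (n VT)); the function names and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def switching_power(s, c, vdd, f):
    """Switching power in watts, assuming P = 0.5 * s * C * Vdd^2 * f."""
    return 0.5 * s * c * vdd ** 2 * f

def subthreshold_power(vdd, i_sub, w_over_l, vth, n, v_t):
    """Subthreshold leakage power, assuming P = Vdd * Isub * (W/L) * exp(-Vth / (n * VT))."""
    return vdd * i_sub * w_over_l * math.exp(-vth / (n * v_t))

# Hypothetical operating point: 1 nF total capacitance, 20% activity, 1 GHz.
base = switching_power(s=0.2, c=1e-9, vdd=1.2, f=1e9)
halved = switching_power(s=0.2, c=1e-9, vdd=0.6, f=1e9)
ratio = halved / base  # halving Vdd quarters switching power

# Lowering Vth (to recover speed at low Vdd) raises subthreshold leakage.
leak_hi_vth = subthreshold_power(1.2, 1e-6, 10, vth=0.4, n=1.5, v_t=0.026)
leak_lo_vth = subthreshold_power(1.2, 1e-6, 10, vth=0.3, n=1.5, v_t=0.026)
```

Comparing `leak_lo_vth` with `leak_hi_vth` illustrates one of the couplings described above: a change that helps performance or dynamic power can increase leakage.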

Operating Voltage-Oriented Techniques

Reducing operating voltage, Vdd, is one of the most promising techniques for reducing dynamic power consumption. As indicated by Eq. (17.1), Pdynamic is quadratically related to Vdd. All other things being equal, halving Vdd reduces Pdynamic to one-fourth of its initial value. However, this reduction has a negative impact on circuit performance [4]:

$$f = k\, \frac{(V_{dd} - V_{th})^{\alpha}}{V_{dd}} \qquad (17.3)$$

where k is a design-specific constant and α a process-specific constant ranging from 1 to 2. As a result, for low values of Vth and α = 2, and all other things being equal, halving Vdd implies halving clock frequency, f . Some of the following sections describe techniques for reducing operating voltage without degrading performance.
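The frequency penalty of voltage reduction can be checked numerically. This sketch assumes the alpha-power-law form f = k (Vdd - Vth)^α / Vdd; the function name and the voltage values are hypothetical.

```python
def max_frequency(vdd, vth, alpha, k=1.0):
    """Achievable clock frequency under the assumed alpha-power-law model."""
    return k * (vdd - vth) ** alpha / vdd

# Idealized case from the text: negligible Vth and alpha = 2.
ideal_ratio = max_frequency(0.6, 0.0, 2.0) / max_frequency(1.2, 0.0, 2.0)

# With a non-negligible threshold voltage the penalty is larger.
real_ratio = max_frequency(0.6, 0.3, 2.0) / max_frequency(1.2, 0.3, 2.0)
```

Under these assumptions `ideal_ratio` is one half, matching the statement above, while `real_ratio` is smaller because the gate overdrive, Vdd - Vth, shrinks faster than Vdd itself.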

Multiple Simultaneous Operating Voltages

ICs contain timing-critical and noncritical combinational logic paths between memory elements (latches and flip-flops). It is possible to selectively decrease the operating voltage(s) of gates on the noncritical paths, thereby reducing Pdynamic without reducing performance. Multiple voltage techniques may be used within architectural synthesis. Although it is not essential, processing elements sharing the same voltage are often placed in contiguous regions called voltage islands to simplify power distribution. Communication between different voltage regions relies on level converters. This physical requirement for contiguous regions dramatically changes the IC floorplan, thereby changing communication power consumption, wire delays, and thermal properties. These changes, in turn, impact the original design properties, e.g., combinational path criticality, optimal clock frequency, and operation cycle times. It is therefore necessary to consider the consequences of using multiple voltages at different design levels, i.e., architectural and physical. Recent multiple voltage behavioral [5] and system [6] synthesis techniques allow the voltage-level assignment problem to be solved concurrently with one or more of the following problems: processing element selection, assignment of tasks to processors, scheduling, and floorplanning.
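As a rough illustration of voltage-level assignment, the sketch below gives each path the lowest supply voltage that still meets the clock period. It is a deliberately simplified stand-in for the synthesis techniques cited above: it assumes an alpha-power-law delay model, treats paths independently, and ignores level converters, voltage islands, and floorplan effects. All names and numbers are hypothetical.

```python
def delay_scale(vdd, v_nom, vth=0.3, alpha=2.0):
    """Delay at vdd relative to delay at v_nom (assumed alpha-power-law model)."""
    def delay(v):
        return v / (v - vth) ** alpha
    return delay(vdd) / delay(v_nom)

def assign_voltages(path_delays_ns, period_ns, levels, v_nom):
    """Pick, per path, the lowest voltage level whose scaled delay fits the period."""
    assignment = {}
    for name, nominal_delay in path_delays_ns.items():
        feasible = [v for v in levels
                    if nominal_delay * delay_scale(v, v_nom) <= period_ns]
        # Fall back to the nominal voltage if no listed level meets timing.
        assignment[name] = min(feasible) if feasible else v_nom
    return assignment

# Path delays measured at the nominal 1.2 V supply (hypothetical values).
paths = {"critical": 10.0, "mid": 7.0, "short": 3.0}
result = assign_voltages(paths, period_ns=10.0, levels=[0.8, 1.0, 1.2], v_nom=1.2)
```

Here the critical path stays at 1.2 V, while the paths with slack drop to 1.0 V and 0.8 V.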

Dynamic Voltage (and Frequency) Scaling

In addition to varying the operating voltages of subcircuits by position, it is possible to vary operating voltages in time. Dynamic voltage scaling is generally carried out in conjunction with frequency scaling to prevent timing violations. It allows an IC to adaptively adjust operating voltage to minimize power consumption without violating timing constraints. Dynamic voltage and frequency scaling (DVFS) interacts closely with scheduling: some schedules allow timing slack to be used for power minimization without the violation of deadlines while others leave little opportunity for power minimization. Synthesis algorithms have been developed for both offline DVFS and online DVFS [7], for which predictions of future system behavior are used to amortize the cost of voltage and frequency changes over longer low-power periods.
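The interaction between slack and DVFS can be sketched as a lookup over discrete operating points. This is a minimal offline example, assuming a hypothetical table of (frequency, voltage) pairs and ignoring the time and energy cost of switching between them.

```python
# Hypothetical (frequency in Hz, supply voltage in V) operating points.
OPERATING_POINTS = [(200e6, 0.8), (400e6, 1.0), (600e6, 1.2)]

def pick_operating_point(cycles, deadline_s, points=OPERATING_POINTS):
    """Return the slowest (f, Vdd) pair that still meets the deadline, or None."""
    for freq, vdd in sorted(points):
        if cycles / freq <= deadline_s:
            return freq, vdd
    return None

def relative_dynamic_energy(cycles, vdd):
    """Dynamic energy up to a constant factor: cycles * Vdd^2."""
    return cycles * vdd ** 2

# A task of 3 million cycles with a 10 ms deadline.
point = pick_operating_point(3e6, 0.010)
saving = relative_dynamic_energy(3e6, point[1]) / relative_dynamic_energy(3e6, 1.2)
```

With these numbers the task runs at 400 MHz and 1.0 V, using roughly 69% of the dynamic energy it would need at the fastest operating point. A tighter deadline leaves less slack and forces a higher voltage.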


Scheduling and Timing

Scheduling is the process of selecting the orders and execution start times of operations and communication events. In some cases, a system’s original schedule may not permit the reduction of operating voltage(s) without performance degradation. For example, some operations may be immediately followed by other operations, leaving little spare time for voltage reduction. Changing operation start times and orders can open up opportunities for greater reductions in power consumption via operating voltage reduction. It is also possible to change the number of clock cycles and frequency for an operation, thereby allowing a decrease in power consumption without degrading computational throughput.

Power for Performance and Area Techniques

Even if it seems that attempts to reduce the operating voltage for tasks on critical timing paths will result in performance degradation, it is sometimes possible to buy back the lost performance at a cost in area. Consider the example in Figure 17.2. In the serial implementation shown on the left, the processing element must operate at a high voltage at all times to meet the performance requirements. By adding another processing element and parallelizing the operations, as shown by the parallel implementation to the right, it is possible to finish execution early. The resulting timing slack permits the operating voltage to be reduced to 3/4 of its initial value, which reduces dynamic power consumption to 9/16 of its initial value. This general technique of buying voltage reduction at the cost of performance and gaining back the lost performance through increased area and design complexity forms the basis of a number of techniques in low-power architectural synthesis [8–10].
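The 3/4 and 9/16 figures follow from the quadratic dependence of dynamic power on Vdd. The sketch below reproduces the arithmetic, assuming the idealized case in which delay is inversely proportional to Vdd (low Vth, α = 2) and assuming the parallel schedule needs only 3/4 of the available time at the original voltage; the latter is an assumption, since the figure's actual timing is not reproduced here.

```python
from fractions import Fraction

def scaled_voltage(time_used_fraction):
    """With delay proportional to 1/Vdd, using only a fraction x of the
    available time lets Vdd drop to x of its initial value."""
    return Fraction(time_used_fraction)

def dynamic_power_ratio(voltage_ratio):
    """Dynamic power scales with Vdd^2 when capacitance, activity,
    and clock frequency are held fixed."""
    return voltage_ratio ** 2

v_ratio = scaled_voltage(Fraction(3, 4))
p_ratio = dynamic_power_ratio(v_ratio)
```

`p_ratio` evaluates to 9/16, matching the value quoted above.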

Switched Capacitance-Oriented Techniques

It is possible to reduce both active device and interconnect capacitance via a number of synthesis techniques. Reducing a CMOS gate in size reduces the capacitance driven by the previous gate. However, this also increases resistance to the power and ground rails, increasing the delay of the subsequent gate. In many cases, several devices are not on the critical timing path of the system. Their sizes may be reduced to reduce driven capacitance. This technique shares properties with operating voltage reduction. However, the potential for improvement to dynamic power consumption is generally smaller because the relationship between power and capacitance-dependent delay is subquadratic, i.e., reducing operating voltage is generally a better choice than reducing capacitance. However, reducing capacitance does come with two additional advantages: area efficiency and no need for multi-voltage support. During architectural synthesis, it is common for libraries to contain functionally equivalent processing elements with different power and performance properties. Many of these differences have their sources in differing internal gate and wire capacitance values. However, in architectural synthesis, this problem is often encompassed by processing element selection and assignment of operations to processing elements.

Interconnect self-capacitance may be decreased by three techniques, one local and two architectural. Decreasing interconnect width reduces capacitance at the cost of increased delay. However, it is also possible to simultaneously reduce interconnect delay and capacitance by decreasing wire length. Finally, one can change the assignment of tasks to processing elements to reduce or eliminate the inter-processing element-switched capacitance necessary for data communication. The lengths of wires are decided by the impact of architectural decisions, such as the allocation of processing elements and the assignment of operations to processing elements, upon the floorplan ultimately produced. The impact of interconnect coupling on effective capacitance is becoming increasingly important. It can be addressed during architectural synthesis via bus planning as well as coding techniques.

Switched capacitance, the product of physical capacitance and switching activity, reflects the actual run-time load of the circuit. Recent studies [11] have demonstrated that switched capacitance minimization is a much more efficient power optimization technique than physical capacitance reduction. Switched capacitance reduction techniques have been developed at all levels of the design hierarchy. Architecture-level techniques [12], such as power management, data encoding, glitch suppression, and architectural transformation, are widely used in low-power behavioral and system synthesis.
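Data encoding is one of the easier of these techniques to illustrate. The sketch below counts bit transitions on a bus carrying a counting sequence, first in plain binary and then in Gray code. It is only an illustration of how encoding changes switching activity, not a method taken from the cited work; the helper names are hypothetical.

```python
def toggles(words):
    """Total number of bit flips between consecutive words on a bus."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))

def gray(n):
    """Standard binary-reflected Gray code of n."""
    return n ^ (n >> 1)

binary_seq = list(range(16))             # 4-bit counter values 0..15
gray_seq = [gray(n) for n in binary_seq]

binary_toggles = toggles(binary_seq)
gray_toggles = toggles(gray_seq)
```

For this sequence the binary encoding causes 26 bit flips and the Gray encoding 15, one per step. Physical capacitance is unchanged, but switching activity, and with it switched capacitance, drops.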

Leakage Power Techniques

Most work in low-power synthesis explicitly targets dynamic power consumption. This is not surprising. Even at the 90-nm process node, dynamic power accounts for over 90% of total power in modern processors. However, research indicates that a half or more of the total power consumption will result from leakage at the 25-nm process node [2]. Subthreshold leakage is an exponential function of chip temperature. As a result, increasing temperature from 25°C to 100°C can result in subthreshold leakage being the dominant source of power consumption.

As indicated in Eq. (17.3), it is necessary to reduce Vth in unison with Vdd to maintain good performance. However, reduction in threshold voltage increases subthreshold leakage. This problem may be addressed with threshold voltage control techniques [13], such as multi-Vth assignment and adaptive body biasing. During synthesis, high threshold voltages can be assigned to functional units along noncritical timing paths to reduce subthreshold leakage while functional units on critical paths operate at lower threshold voltages to maintain performance.
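A simple form of threshold voltage assignment can be sketched as follows. The example assumes each functional unit has a known delay and timing slack and that the high-Vth version of a unit is slower by a fixed factor; the 1.3 penalty, the unit names, and the numbers are hypothetical. It also treats units independently, whereas real flows must account for slack shared along a path.

```python
def assign_vth(units, high_vth_delay_penalty=1.3):
    """Map each unit name to 'high' or 'low' threshold voltage.

    units: name -> (delay_ns, slack_ns) for the low-Vth implementation.
    A unit gets the high-Vth (low-leakage) version only if its slack
    absorbs the extra delay.
    """
    choice = {}
    for name, (delay, slack) in units.items():
        extra_delay = delay * (high_vth_delay_penalty - 1.0)
        choice[name] = "high" if extra_delay <= slack else "low"
    return choice

units = {"mult": (8.0, 0.0), "add": (3.0, 5.0), "shift": (2.0, 0.5)}
vth_choice = assign_vth(units)
```

Only the adder has enough slack here, so it is the only unit moved to the high threshold voltage.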

Power gating reduces subthreshold leakage power consumption by inserting sleep transistors in series with the pull-up or pull-down paths of functional units, so that their leakage can be cut off under control of the sleep transistor inputs [13]. NMOS transistors with high threshold voltages are typically used as sleep transistors. This circuit topology is known as multi-threshold CMOS (MTCMOS). Other techniques, e.g., exploiting the transistor stack effect, transistor sizing, and supply voltage scaling, may also be used to minimize subthreshold leakage power consumption.

Temperature-Oriented Techniques

All other things being equal, increasing IC power consumption increases temperature. Using temperature-aware techniques in architectural synthesis is a complex task. IC temperature is affected by many factors, including IC dynamic and leakage power profile, interconnect power profile, as well as the packaging and cooling solution. Many of these power profiles are only available after physical design, i.e., floorplanning. Although power optimization techniques can reduce average chip temperature, local thermal hotspots due to unbalanced chip power profiles may result in thermal emergencies, e.g., reliability problems due to electromigration. In addition, subthreshold leakage power consumption has an exponential relationship with chip temperature. Without temperature optimization, leakage power can dominate power consumption. To address IC thermal problems, it is critical to integrate architectural synthesis with physical synthesis and thermal analysis to form a complete thermal optimization flow [5]. Thermal modeling and analysis also need to be incorporated into the inner optimization loop to guide IC synthesis. However, detailed thermal characterization requires 3D full chip-package thermal analysis, which may have high computational complexity. Thermal analysis may easily become the performance bottleneck for thermal-aware synthesis.
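The feedback between temperature and leakage can be illustrated with a very coarse model. The sketch below is not the 3D chip-package analysis described above: it assumes a single lumped thermal resistance and an exponential leakage model, and every parameter value is hypothetical. It iterates until temperature and leakage power are mutually consistent.

```python
import math

def leakage_power(temp_c, p_leak_ref=5.0, t_ref_c=25.0, k_per_c=0.02):
    """Assumed model: leakage grows exponentially with temperature."""
    return p_leak_ref * math.exp(k_per_c * (temp_c - t_ref_c))

def steady_state_temperature(p_dyn, t_amb_c=45.0, r_th=0.5,
                             tol=1e-9, max_iter=1000):
    """Fixed-point iteration on T = T_amb + R_th * (P_dyn + P_leak(T)).

    Returns the converged temperature in degrees C, or None if the
    iteration diverges (thermal runaway) or fails to converge.
    """
    temp = t_amb_c
    for _ in range(max_iter):
        new_temp = t_amb_c + r_th * (p_dyn + leakage_power(temp))
        if abs(new_temp - temp) < tol:
            return new_temp
        if new_temp > 500.0:
            return None
        temp = new_temp
    return None

# Estimate that ignores the feedback (leakage fixed at its 25 C value).
t_no_feedback = 45.0 + 0.5 * (60.0 + leakage_power(25.0))
# Estimate that accounts for leakage rising with temperature.
t_coupled = steady_state_temperature(60.0)
```

With these parameters the coupled estimate settles several degrees above the estimate that ignores the feedback. A larger thermal resistance or a steeper leakage curve can make the iteration diverge, which corresponds to thermal runaway in this simplified model.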

Potential of Power Optimization at Different Design Levels

Although power minimization techniques were first developed at the device level, postponing power optimization until this stage of the design process neglects opportunities at higher levels. As indicated in Figure 17.3, considering power minimization at earlier stages of the synthesis or design process has a number of advantages. It yields greater potential for improvement. Moreover, it indirectly improves solution quality because many candidate designs may be considered at higher levels of synthesis due to the use of more abstract (hierarchical) system modeling.
