Power-Aware Architectural Synthesis: Low-Power Behavioral Synthesis

Low-Power Behavioral Synthesis

Behavioral synthesis, or high-level synthesis, is the automatic design of an IC starting from an implementation-independent description of the design’s behavior, a description of the functional units and communication resources available, and constraints on performance and power.

Figure 17.4 gives an overview of a behavioral synthesis optimization flow. Note that, although this flow is representative, other high-level meta-algorithms exist. For example, it would be possible to use a mixed integer linear program (MILP) solver on a unified behavioral synthesis problem formulation, in which case there would be no allocation, binding, and scheduling optimization loop.

Although the input to a behavioral synthesis system can take many forms, the most common are software language, hardware description language, or graph-based specifications. An example C input file for a finite impulse response filtering algorithm is shown in Figure 17.5. As shown in Figure 17.4, regardless of the starting point, behavioral synthesis systems use compilers [14,15] to convert specifications into (possibly synchronous) data flow graphs (DFGs) or control-data flow graphs (CDFGs) for further optimization. Translation and performance optimization of the code in Figure 17.5 results in the control-data flow graph shown in Figure 17.6. In this figure, the graph at the upper-right shows the flow of control among the basic blocks, i.e., straight-line sequences of code that may be represented with DFGs. Within each basic block, the nodes without incoming edges represent variables or constants and the other nodes represent operations on the data arriving on the incoming arcs.

In addition to a description of the algorithm to be implemented, behavioral synthesis tools require models for the hardware resources that may be used in the implementation. For example, the user may provide a library of performance and power models for the available functional units, e.g., adders, multipliers, and registers. These models may be provided as part of the resource library or automatically generated by commercial timing and power analysis tools.
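To make the DFG representation concrete, the following minimal Python sketch builds the data flow graph of a single basic block computing a FIR-style sum of products. The `Node` class, the `fir_tap_dfg` helper, and the node naming scheme are assumptions made for illustration; they are not taken from Figure 17.5 or Figure 17.6.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    op: str                                      # "input", "const", "mul", or "add"
    preds: list = field(default_factory=list)    # names of nodes feeding this one

def fir_tap_dfg(num_taps):
    """Build a DFG for y = sum(c[i] * x[i]) as a chain of multiply-adds."""
    nodes = {}
    acc = None  # name of the node holding the running sum
    for i in range(num_taps):
        nodes[f"x{i}"] = Node(f"x{i}", "input")
        nodes[f"c{i}"] = Node(f"c{i}", "const")
        nodes[f"m{i}"] = Node(f"m{i}", "mul", [f"x{i}", f"c{i}"])
        if acc is None:
            acc = f"m{i}"
        else:
            nodes[f"a{i}"] = Node(f"a{i}", "add", [acc, f"m{i}"])
            acc = f"a{i}"
    return nodes, acc

dfg, out = fir_tap_dfg(4)
# Nodes without incoming edges are variables or constants; the rest are operations.
sources = [n.name for n in dfg.values() if not n.preds]
operations = [n.name for n in dfg.values() if n.preds]
```

For four taps, the graph has eight source nodes (four inputs and four constants) and seven operation nodes (four multiplications and three additions).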

A behavioral synthesis algorithm performs functional unit allocation, operation binding, and scheduling to optimize performance, IC area, and possibly power. Power optimization, e.g., minimizing switched wire capacitance, may require physical information and, therefore, floorplan block placement during behavioral synthesis. The product of behavioral synthesis is a complete RTL description of the synthesized system. This output is generally used as an input to a logic synthesis or RTL synthesis tool as indicated by steps (d) or (e) in Figure 17.1.
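The interaction of allocation, binding, and scheduling can be illustrated with a simple resource-constrained list scheduler. This is only a sketch under strong assumptions: every operation takes one cycle, the allocation (`units`) is given rather than optimized, every unit type has at least one instance, and the dependence graph is acyclic. The function name and data layout are hypothetical.

```python
def list_schedule(ops, deps, units):
    """ops: {name: type}; deps: {name: [predecessor names]}; units: {type: count}.
    Returns {name: (cycle, unit)} assuming each operation takes one cycle."""
    schedule = {}
    cycle = 0
    while len(schedule) < len(ops):
        free = dict(units)       # units still available in this cycle
        done = set(schedule)     # operations finished before this cycle
        for name in sorted(ops):
            if name in schedule:
                continue
            kind = ops[name]
            ready = all(p in done for p in deps.get(name, []))
            if ready and free[kind] > 0:
                unit_index = units[kind] - free[kind]
                schedule[name] = (cycle, f"{kind}{unit_index}")  # binding
                free[kind] -= 1
        cycle += 1
    return schedule

ops = {"m0": "mul", "m1": "mul", "m2": "mul", "m3": "mul",
       "a1": "add", "a2": "add", "a3": "add"}
deps = {"a1": ["m0", "m1"], "a2": ["a1", "m2"], "a3": ["a2", "m3"]}

one_mul = list_schedule(ops, deps, {"mul": 1, "add": 1})
two_mul = list_schedule(ops, deps, {"mul": 2, "add": 1})
length_one = 1 + max(c for c, _ in one_mul.values())
length_two = 1 + max(c for c, _ in two_mul.values())
```

In this example, allocating a second multiplier shortens the schedule from five cycles to four, which shows why allocation and scheduling decisions cannot be made independently.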

Dynamic Power Optimization

Extensive research has been conducted in low-power behavioral synthesis. In the past, IC power consumption was dominated by dynamic power, so most low-power synthesis research has focused on dynamic power optimization. Because dynamic power is a quadratic function of supply voltage, voltage reduction is commonly used to reduce power consumption in behavioral synthesis. However, reducing operating voltage requires global design changes, i.e., changes to functional unit allocation, assignment of operations to functional units, and schedules.
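The quadratic dependence on supply voltage follows from the standard first-order model of dynamic power, P = a * C * Vdd^2 * f, where a is the switching activity, C the switched capacitance, and f the clock frequency. The short sketch below evaluates this model; the numeric values are arbitrary assumptions chosen only to show the scaling.

```python
def dynamic_power(activity, capacitance_f, vdd, freq_hz):
    """First-order dynamic power model: P = a * C * Vdd^2 * f (watts)."""
    return activity * capacitance_f * vdd ** 2 * freq_hz

# Hypothetical operating points: same activity, capacitance, and frequency.
p_high = dynamic_power(0.2, 1e-9, 1.2, 500e6)
p_low = dynamic_power(0.2, 1e-9, 0.6, 500e6)
ratio = p_low / p_high   # halving Vdd cuts dynamic power to one quarter
```

The model holds frequency fixed. In practice a lower supply voltage also slows the circuit, which is why voltage reduction forces the global changes to allocation, binding, and scheduling described above.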

Optimal scheduling using multiple supply voltages is an NP-hard problem. Johnson and Roy [16] developed a behavioral scheduling algorithm, called minimum energy schedule with voltage selection (MESVS), that uses integer linear programming (ILP) to optimize the energy consumption of a DSP datapath by using multiple supply voltages. Voltage scaling may have a negative impact on circuit performance. In this work, timing requirements are enforced via ILP constraints. MESVS is limited to discrete voltage-level selection. Later, Johnson and Roy [17] proposed MOVER, which allows continuous voltage assignment. MOVER also uses an ILP-based method to conduct voltage selection and operation partitioning, and then derives a feasible schedule with minimum area overhead. Optimal ILP-based solutions generally have high computational complexity. Chang and Pedram [18] developed a dynamic programming-based method to address the multiple voltage scheduling problem in datapath circuits. Under timing constraints, this approach reduces supply voltages along noncritical paths to maximize power reduction with low area overhead. Raje and Sarrafzadeh [19] developed a heuristic-based voltage assignment algorithm, with computational complexity O(N²), to minimize power consumption. Although it has been demonstrated that voltage reduction can greatly reduce power, incremental gains decrease with the number of voltage levels. In addition, incorporating multiple on-chip supply voltages complicates IC design.
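The idea of lowering the supply voltage only where timing slack exists can be sketched with a simple greedy pass. This is not MESVS, MOVER, or any of the cited algorithms; it is an illustrative heuristic that assumes two voltage levels, a hypothetical delay of one cycle at the high voltage and two cycles at the low voltage, and hypothetical per-operation energies of 1.0 and 0.25.

```python
def longest_path(deps, delay):
    """Length of the longest path through a DAG given per-node delays."""
    finish = {}
    def visit(n):
        if n not in finish:
            finish[n] = delay[n] + max((visit(p) for p in deps[n]), default=0)
        return finish[n]
    return max(visit(n) for n in delay)

def assign_voltages(deps, deadline, high=(1, 1.0), low=(2, 0.25)):
    """Greedy sketch: move an operation to the low voltage if the deadline holds.
    high and low are (delay in cycles, energy per operation) pairs."""
    ops = sorted(deps)
    delay = {n: high[0] for n in ops}
    level = {n: "high" for n in ops}
    for n in ops:
        delay[n] = low[0]                       # tentatively slow the operation
        if longest_path(deps, delay) <= deadline:
            level[n] = "low"
        else:
            delay[n] = high[0]                  # revert: timing would be violated
    energy = sum(low[1] if level[n] == "low" else high[1] for n in ops)
    return level, energy

# deps maps every operation to its predecessors; a -> b -> c -> e is critical.
deps = {"a": [], "b": ["a"], "c": ["b"], "d": [], "e": ["c", "d"]}
level, energy = assign_voltages(deps, deadline=4)
```

With a deadline equal to the critical path length, only the off-critical operation `d` can be moved to the low voltage, which reduces the total energy from 5.0 to 4.25 in these hypothetical units.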

In addition to voltage scaling, researchers have developed behavioral synthesis algorithms that minimize switching activity and driven capacitance. Chatterjee and Roy [20] designed a behavioral synthesis system for low-power DSPs. In this work, application DFGs were transformed to reduce switching activity, thereby reducing power consumption. Chandrakasan et al. [21] designed HYPER-LP, a behavioral synthesis system. HYPER-LP uses algorithmic transformations to enable voltage scaling and effective capacitance reduction. Kumar et al. [22] developed a profile-driven behavioral synthesis algorithm, using profiling to characterize the run-time activities of DFG-based system models. Low-power behavioral synthesis is then conducted to minimize estimated system switching activity. Chang and Pedram [23] proposed an allocation and binding technique to minimize the switching activity in registers. In this work, statistical methods are used to characterize the switching activities of registers. A max-cost flow algorithm was then proposed to conduct power-optimal register assignment. Chang and Pedram [24] also proposed a low-power binding technique to minimize the power consumption of datapath functional units, in which power optimization is formulated as a max-cost multicommodity flow problem. Dasgupta and Karri [25] proposed simultaneous binding and scheduling techniques to reduce switching activity, and hence the power consumption, of buses. Mehra et al. [26] proposed behavioral synthesis techniques for low-power real-time applications. By preserving locality and regularity in input behavior during resource assignment, this technique reduces the need for global buses, thereby reducing power consumption. Ercegovac et al. [27] proposed a behavioral synthesis system that uses multiple precision arithmetic units to support low-power ASIC synthesis. In this work, system resource allocation is conducted through multigradient search and task assignment is based on a modified Karmarkar–Karp number partitioning heuristic.
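How binding affects switching activity can be seen by counting bit toggles between the consecutive values stored in a shared register. The sketch below compares two hypothetical register bindings by Hamming distance. The values and bindings are assumptions chosen for illustration, and the exhaustive comparison stands in for the flow-based formulations cited above.

```python
def toggles(values):
    """Count bit flips between consecutive values held by one register or bus."""
    return sum(bin(a ^ b).count("1") for a, b in zip(values, values[1:]))

# Four hypothetical 8-bit variables; each register holds two of them in sequence.
v0, v1, v2, v3 = 0b00001111, 0b11110000, 0b00001110, 0b11110001

# Binding A shares registers between similar values, binding B between dissimilar ones.
cost_a = toggles([v0, v2]) + toggles([v1, v3])
cost_b = toggles([v0, v3]) + toggles([v1, v2])
```

Binding A causes 2 bit toggles, whereas binding B causes 14, although both use the same number of registers.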

A few researchers have developed high-level synthesis algorithms that combine numerous power optimization techniques. Musoll and Cortadella [28] proposed several high-level power-optimization techniques, including loop interchange, operand reordering, operand sharing, idle units, and operand correlation, for reducing the activities of functional units. Raghunathan and Jha [29] designed SCALP, an iterative-improvement-based behavioral synthesis system, for low-power data-intensive applications. SCALP provides a rich set of behavioral optimization techniques, including architectural transformation, scheduling, clock selection, module selection, and hardware allocation and assignment. Khouri et al. [30] showed how to perform low-power behavioral synthesis for control-flow intensive algorithms. This work uses an iterative improvement framework to perform design space exploration. Behavioral power optimization techniques, including loop unrolling, module selection, resource sharing, and multiplexer network restructuring, are applied concurrently.

Physical-Aware Power Optimization

In conventional behavioral synthesis, physical implementation details were generally ignored when making architectural decisions. Continued process scaling has required fundamental changes to IC synthesis. At present, physical design details must be considered during all stages of IC synthesis. Many techniques use physical information, e.g., floorplan block placements, to better optimize switched capacitance [31–34], as explained in Section 17.2.3. Although they do not use a floorplan, Lyuh et al. [35] optimize assignment of communication events to interconnect buses, and the order of (capacitively coupled) wires within buses, to reduce effective switched capacitance. Prabhakaran and Banerjee [36] proposed a simultaneous scheduling, binding, and floorplanning algorithm to address the power consumption of interconnect during behavioral synthesis. Zhong and Jha [37] presented an interconnect-aware low-power behavioral synthesis algorithm, called ISCALP, that minimizes power consumption in interconnects through interconnect-aware binding. Recently, Gu et al. [38] designed a fast, high-quality incremental floorplanning and behavioral synthesis system that concurrently optimizes performance, power, and area.
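A floorplan lets a synthesis tool weight communication activity by wire length. The following sketch estimates interconnect switched capacitance as activity times Manhattan distance between block positions. The cost model, the `cap_per_unit` parameter, and the example placements are simplifying assumptions, not the models used in the cited work.

```python
def switched_capacitance(placement, traffic, cap_per_unit=1.0):
    """Sum of activity * wire capacitance over communicating block pairs.
    Wire capacitance is assumed proportional to Manhattan distance."""
    total = 0.0
    for (src, dst), activity in traffic.items():
        (x1, y1), (x2, y2) = placement[src], placement[dst]
        total += activity * cap_per_unit * (abs(x1 - x2) + abs(y1 - y2))
    return total

# Hypothetical switching activity on the wires between three blocks.
traffic = {("mul", "add"): 0.9, ("add", "reg"): 0.5, ("mul", "reg"): 0.1}

plan_a = {"mul": (0, 0), "add": (1, 0), "reg": (2, 0)}  # busiest pair adjacent
plan_b = {"mul": (0, 0), "add": (2, 0), "reg": (1, 0)}  # busiest pair far apart

cost_a = switched_capacitance(plan_a, traffic)
cost_b = switched_capacitance(plan_b, traffic)
```

Placing the most active pair of blocks next to each other (plan A) gives an estimate of about 1.6, compared with about 2.4 for plan B. This kind of comparison is only possible when block positions are known during synthesis.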

Leakage Power Optimization

As a result of technology scaling, leakage power consumption is becoming increasingly significant in digital CMOS circuits. Khouri and Jha [39] were the first to propose a method of reducing leakage power consumption during behavioral synthesis. They proposed an iterative algorithm using dual-Vth technology. In each iteration, a greedy prioritization approach is used to identify the functional unit with maximum leakage power reduction potential, which is then replaced with a higher-Vth functional unit. Gopalakrishnan and Katkoori [40] proposed KnapBind, a leakage-aware resource allocation and binding algorithm to minimize datapath leakage power consumption. This work maximizes the idle time of datapath modules. MTCMOS functional modules with large idle time slots are placed into sleep mode when they are idle. Tang et al. [41] proposed a heuristic to minimize leakage power consumption during behavioral synthesis. The synthesis problem is formulated as the maximum-weight independent set problem. Datapath components with maximum or near-maximum leakage-saving potentials are identified and replaced with low-leakage library modules. Leakage power is a strong function of chip temperature. Mukherjee et al. [42] proposed a temperature-aware resource-binding technique to minimize leakage power consumption during behavioral synthesis. The proposed iterative resource-binding technique minimizes chip peak temperature by balancing the chip power profile, thereby reducing leakage power.
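The greedy dual-Vth idea can be sketched as follows: repeatedly pick the functional unit with the largest leakage saving and replace it with a high-Vth version if its timing slack can absorb the added delay. This is a simplified illustration, not the algorithm of Khouri and Jha. It assumes each unit's slack is independent of the other replacements, and all names and numbers are hypothetical.

```python
def greedy_dual_vth(units):
    """units: {name: {"slack": cycles available,
                      "penalty": extra delay of the high-Vth version,
                      "saving": leakage power saved by the replacement}}.
    Returns the units replaced, in selection order, and the total saving."""
    chosen, total_saving = [], 0.0
    candidates = dict(units)
    while candidates:
        # Prioritize the unit with the largest leakage reduction potential.
        name = max(candidates, key=lambda n: candidates[n]["saving"])
        info = candidates.pop(name)
        if info["penalty"] <= info["slack"]:     # timing is still met
            chosen.append(name)
            total_saving += info["saving"]
    return chosen, total_saving

units = {
    "mul0": {"slack": 0, "penalty": 1, "saving": 5.0},  # on the critical path
    "mul1": {"slack": 2, "penalty": 1, "saving": 4.0},
    "add0": {"slack": 1, "penalty": 1, "saving": 1.5},
}
chosen, total_saving = greedy_dual_vth(units)
```

Here `mul0` offers the largest saving but has no slack, so only `mul1` and `add0` are replaced. A real implementation would recompute slacks after each replacement, because slowing one unit consumes slack shared with others on the same path.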

Thermal Optimization

Increasing performance requirements and system integration are dramatically increasing IC power density, and hence chip temperature. Thermal effects are becoming increasingly important during IC design. Mukherjee et al. [43] addressed thermal issues during behavioral synthesis. They proposed temperature-aware resource allocation and binding algorithms to minimize chip peak temperature. Gu et al. [5] designed TAPHS, a thermal-aware unified physical and behavioral synthesis system. TAPHS incorporates a complete set of integrated behavioral and physical thermal optimization techniques, including voltage assignment, voltage island generation, and thermal-aware floorplanning, to jointly optimize chip temperature, power, performance, and area. Thermal-aware behavioral synthesis algorithms must determine the temperature profiles of a tremendous number of candidate designs. Recently, researchers have developed and publicly released fast and accurate thermal analysis tools specifically for this purpose [44].
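As a rough illustration of balancing the power profile, the sketch below binds operations to functional units so that the accumulated energy per unit is as even as possible, using a greedy largest-first heuristic. Treating the most heavily loaded unit as a proxy for peak temperature is an assumption made here for simplicity. It ignores floorplan positions and heat conduction between blocks, which the thermal-aware tools described above do model.

```python
def balance_binding(op_energy, num_units):
    """Bind each operation to the unit with the least accumulated energy.
    op_energy: {operation name: energy}; returns (binding, per-unit load)."""
    load = [0.0] * num_units
    binding = {}
    # Place the most energy-hungry operations first.
    for op, energy in sorted(op_energy.items(), key=lambda kv: -kv[1]):
        unit = min(range(num_units), key=lambda i: load[i])
        binding[op] = unit
        load[unit] += energy
    return binding, load

# Hypothetical per-operation energies in arbitrary units.
op_energy = {"o1": 4.0, "o2": 3.0, "o3": 3.0, "o4": 2.0}
binding, load = balance_binding(op_energy, num_units=2)
peak = max(load)
```

For this example, both units end up with a load of 6.0, whereas binding `o1` and `o2` to the same unit would concentrate 7.0 in one place.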
