HDL-Based Tools and Environments: Synthesis Tools
Synthesis Tools
A synthesis tool is a computer program that transforms the description of a circuit into a lower-level description that is closer to the hardware, while optimizing objectives such as area, delay, testability, and power consumption [1]. Here we introduce the different types of synthesis tools: system synthesis, behavioral synthesis, RTL synthesis, and logic synthesis.
System Synthesis
Today, most designers implement embedded systems as core-based systems-on-a-chip (SOC) [2]. They use different implementations of cores, provided by different core vendors, to achieve the required functionality. This approach reduces system design time by reusing existing cores. The set of existing cores defines the design space, while the set of tasks defines the system specification. From this design space, the designer must select an implementation that both achieves the required system functionality and satisfies the design objectives (e.g., area, performance, price, testability, and power consumption). This process is called system synthesis. A system synthesis tool therefore takes a specification of the system and a set of cores as inputs, and produces as its output an architecture that is an interconnected network of cores.
The specification of a system is described in a high-level language such as SystemVerilog or SystemC and then converted into a directed acyclic task graph. Each node of the task graph is associated with a task, while each edge, connecting two nodes, is associated with a data transfer link between them. Data dependency edges ensure the correct order of execution: a task can execute only when all its predecessor tasks have completed. Each edge carries a scalar describing the amount of data that must be transferred between the two connected tasks; this value represents the communication time between the tasks if they are mapped to different cores.
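The task graph described above can be sketched in a few lines of Python. This is only an illustration: the task names, the data amounts, and the helper names predecessors and ready_tasks are hypothetical and do not come from any particular tool.

```python
# Hypothetical task graph: (producer, consumer) -> amount of data transferred.
task_graph = {
    ("t1", "t2"): 4,
    ("t1", "t3"): 2,
    ("t2", "t4"): 3,
    ("t3", "t4"): 1,
}

def predecessors(graph):
    """Map every task to the set of tasks whose results it needs."""
    preds = {}
    for src, dst in graph:
        preds.setdefault(src, set())
        preds.setdefault(dst, set()).add(src)
    return preds

def ready_tasks(graph, completed):
    """Tasks not yet run whose predecessors have all completed."""
    preds = predecessors(graph)
    return sorted(task for task, needed in preds.items()
                  if task not in completed and needed <= completed)

print(ready_tasks(task_graph, set()))         # ['t1']
print(ready_tasks(task_graph, {"t1"}))        # ['t2', 't3']
print(ready_tasks(task_graph, {"t1", "t2"}))  # ['t3']
```

Task t4 becomes ready only after both t2 and t3 have completed, which is exactly the ordering constraint the data dependency edges impose.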
There are four major steps in a system synthesis tool: allocation, assignment, scheduling, and cost estimation. In the first step, allocation, we determine how many cores of each type and how many buses are to be used. The next step is assignment, where we assign each task to a core (for execution) and each communication link to a bus (for data transfer). We then perform scheduling to determine the start times of all tasks and communications. Finally, cost estimation reports the price, area, performance, testability, and power consumption of the solution. This step is required in order to select the best solution among the candidates.
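As a rough sketch of the scheduling step, the following Python code computes start times for a given assignment of tasks to cores. It is a simplified illustration, not the algorithm of any specific tool: it assumes one task per core at a time, charges the edge value as communication delay only when the two tasks sit on different cores, and ignores bus assignment and bus contention. All names and numbers are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def schedule(exec_time, edges, assignment):
    """Return (start, finish) times for every task.

    exec_time:  task -> execution time on its assigned core
    edges:      (producer, consumer) -> communication delay across cores
    assignment: task -> core
    """
    preds = {task: set() for task in exec_time}
    for src, dst in edges:
        preds[dst].add(src)

    core_free = {}            # core -> time at which it becomes idle
    start, finish = {}, {}
    for task in TopologicalSorter(preds).static_order():
        core = assignment[task]
        ready = 0
        for p in preds[task]:
            # Communication costs time only between different cores.
            comm = edges[(p, task)] if assignment[p] != core else 0
            ready = max(ready, finish[p] + comm)
        start[task] = max(ready, core_free.get(core, 0))
        finish[task] = start[task] + exec_time[task]
        core_free[core] = finish[task]
    return start, finish

# Hypothetical example: four tasks mapped to two cores.
exec_time = {"t1": 2, "t2": 3, "t3": 1, "t4": 2}
edges = {("t1", "t2"): 4, ("t1", "t3"): 2, ("t2", "t4"): 3, ("t3", "t4"): 1}
assignment = {"t1": "c0", "t2": "c0", "t3": "c1", "t4": "c0"}

start, finish = schedule(exec_time, edges, assignment)
print(start)                  # t1 starts at 0, t2 at 2, t3 at 4, t4 at 6
print(max(finish.values()))   # 8
```

The largest finish time is the schedule length, which a synthesis tool could use as its performance estimate when comparing different allocations and assignments.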
Several design parameters should be optimized during the system synthesis steps, including price, area, performance, testability, and power consumption. These parameters are described below.
• Price: The price of the final solution can be defined as the sum of the prices of all the cores on the chip plus the area-dependent price of the chip.
• Area: At the system level, the area of a system can be estimated as the sum of the areas of the cores and buses used. This area should be minimized.
• Performance: The scheduler provides accurate information on the start and finish times of tasks in the allocated cores and finds the critical-path worst-case execution time. This value is used to estimate the performance of the synthesized system.
• Testability: Since embedded cores are not directly accessible via chip IO pins, a special test access mechanism (TAM) is required to test them after system integration. Testability optimization means that TAM designs should be integrated into the system synthesis process to find the best testable solution.
• Power consumption: The power consumption of the synthesized system equals the sum of the power consumed by all the allocated cores and buses.
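The additive estimates for price, area, and power given in the list above can be written down directly. The sketch below is illustrative only: the field names and the linear price_per_area model for the area-dependent chip price are assumptions, since the text does not specify how that term is computed.

```python
def estimate_cost(cores, buses, price_per_area):
    """Additive system-level estimates for one candidate solution.

    cores: list of dicts with "area", "power", and "price"
    buses: list of dicts with "area" and "power"
    price_per_area: assumed linear factor for the area-dependent chip price
    """
    area = sum(c["area"] for c in cores) + sum(b["area"] for b in buses)
    power = sum(c["power"] for c in cores) + sum(b["power"] for b in buses)
    price = sum(c["price"] for c in cores) + price_per_area * area
    return {"area": area, "power": power, "price": price}

# Hypothetical candidate: two cores connected by one bus.
cores = [{"area": 10, "power": 5, "price": 3},
         {"area": 20, "power": 7, "price": 4}]
buses = [{"area": 2, "power": 1}]

estimate = estimate_cost(cores, buses, price_per_area=2)
print(estimate)  # {'area': 32, 'power': 13, 'price': 71}
```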
Behavioral Synthesis
A behavioral synthesis tool usually translates the behavioral description from VHDL, Verilog, SystemVerilog, or SystemC into a suitable intermediate format, such as a data flow graph (DFG). In a DFG, each node is associated with an operation, while each edge is associated with a variable. Figure 96.3 shows a fragment of VHDL code and its corresponding DFG, called ex1. To generate the RTL architecture, behavioral synthesis performs three major tasks: scheduling, resource allocation, and design optimization.
Scheduling
Scheduling assigns each operation to one or more clock cycles and specifies the cycle-by-cycle behavior of a circuit. The result of scheduling is called a scheduled DFG (SDFG). For an operation o in a DFG, o.earliest (o.latest) denotes the earliest (latest) cycle in which the operation can be executed. In an SDFG, the cycle in which a variable is first defined is called the birth time of the variable, while the cycle in which a variable is last used is called its death time. The lifetime of the variable is defined as the interval [birth, death]. Two variables are compatible if their lifetimes do not overlap; such variables can be mapped to a single register. Two variables are incompatible if their lifetimes overlap. For example, in ex1, variables a and f are compatible, while variables e and f are incompatible. Scheduling algorithms are used to assign operations to specific clock cycles.
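The values o.earliest and o.latest can be obtained with as-soon-as-possible (ASAP) and as-late-as-possible (ALAP) passes over the DFG. The sketch below assumes that every operation takes exactly one cycle. The three-addition graph is a hypothetical stand-in, since Figure 96.3 is not reproduced here, and the function names are illustrative.

```python
def asap(deps):
    """Earliest cycle for each operation (unit-delay operations assumed)."""
    earliest = {}

    def visit(op):
        if op not in earliest:
            # One cycle after the latest predecessor, or cycle 0 if none.
            earliest[op] = max((visit(p) + 1 for p in deps[op]), default=0)
        return earliest[op]

    for op in deps:
        visit(op)
    return earliest

def alap(deps, last_cycle):
    """Latest cycle for each operation so everything ends by last_cycle."""
    succs = {op: [] for op in deps}
    for op, preds in deps.items():
        for p in preds:
            succs[p].append(op)

    latest = {}

    def visit(op):
        if op not in latest:
            # One cycle before the earliest successor, or last_cycle if none.
            latest[op] = min((visit(s) - 1 for s in succs[op]),
                             default=last_cycle)
        return latest[op]

    for op in deps:
        visit(op)
    return latest

# Hypothetical DFG: operation -> operations whose results it consumes.
deps = {"+1": [], "+2": [], "+3": ["+1", "+2"]}

earliest = asap(deps)
latest = alap(deps, last_cycle=2)
mobility = {op: latest[op] - earliest[op] for op in deps}
print(earliest)  # {'+1': 0, '+2': 0, '+3': 1}
print(latest)    # {'+1': 1, '+2': 1, '+3': 2}
print(mobility)  # {'+1': 1, '+2': 1, '+3': 1}
```

The difference between latest and earliest is the freedom a scheduler has when placing an operation; with last_cycle=1 every operation in this small graph would have zero freedom.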
Resource Allocation
Given an SDFG, resource allocation assigns modules to perform operations (module allocation) and registers to store variables (register allocation). In module allocation, operations in an SDFG are mapped to suitable data-path modules available in the given library. Operations used in high-level synthesis can be classified as arithmetic operations (e.g., add, subtract, multiply, divide, and compare) and logical operations (e.g., and, or, nand, and nor). If a data-path library has modules with the same functionality but different characteristics, high-level synthesis can achieve better results. For example, an “add” operation can be mapped to a ripple-carry adder, a carry look-ahead adder, or a carry-save adder. The trade-off among these characteristics enables the synthesized circuit to have smaller area, higher performance, or lower power consumption. In register allocation, a register should be assigned to each module input or output variable. The left-edge algorithm (LEA), which is used for both module and register allocation, finds the minimum number of modules and registers required for the data-path implementation.
To illustrate scheduling and allocation, we apply some of the corresponding algorithms to the example of Figure 96.3. In this example, using a scheduling method called force-directed scheduling, the operations +1 and +2 are scheduled at time frame 0 and the operation +3 is scheduled at time frame 1. Using LEA for module allocation results in the allocation M_LEA = {(+1, +3), (+2)}, which means that operations +1 and +3 are mapped to the same adder, and the operation +2 is mapped to another adder instance. Using LEA for register allocation leads to the allocation R_LEA = {(a, e, g), (b, f), (c), (d)}, which means that variables a, e, and g are mapped to register R0, variables b and f are mapped to R1, and variables c and d are mapped to registers R2 and R3, respectively. The RTL implementation of ex1 is shown in Figure 96.4.
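A minimal sketch of the left-edge algorithm for register allocation follows. The lifetimes are assumed values, since the text does not list them; they were chosen to be consistent with the schedule described above (inputs live in cycle 0, the results of +1 and +2 in cycle 1, and the result of +3 in cycle 2). The function name left_edge is illustrative.

```python
def left_edge(lifetimes):
    """Group variables into registers using the left-edge algorithm.

    lifetimes: variable -> (birth, death), a closed interval of cycles.
    Returns a list of groups; each group shares one register.
    """
    # Sort by left edge (birth time); ties are broken by name for determinism.
    remaining = sorted(lifetimes, key=lambda v: (lifetimes[v][0], v))
    registers = []
    while remaining:
        group, leftover, last_death = [], [], None
        for var in remaining:
            birth, death = lifetimes[var]
            # A variable fits if it is born after the previous one has died.
            if last_death is None or birth > last_death:
                group.append(var)
                last_death = death
            else:
                leftover.append(var)
        registers.append(group)
        remaining = leftover
    return registers

# Assumed lifetimes for the variables of ex1 (not given in the text).
lifetimes = {
    "a": (0, 0), "b": (0, 0), "c": (0, 0), "d": (0, 0),
    "e": (1, 1), "f": (1, 1),
    "g": (2, 2),
}

allocation = left_edge(lifetimes)
print(allocation)  # [['a', 'e', 'g'], ['b', 'f'], ['c'], ['d']]
```

Under these assumed lifetimes the sketch yields four registers with the same grouping as the register allocation quoted for ex1. The same procedure can be applied to module allocation by treating the cycles in which each operation occupies a module as its interval.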
Behavioral Design Optimization
In behavioral synthesis, several parameters are optimized:
• Area: The area of the final circuit should be minimized. At the behavioral level, the area of the circuit can be estimated as the sum of the areas of the functional units and interconnections. In deep submicron technologies, the wiring area should also be considered.
• Delay: The delay of the resulting circuit should be optimized during the synthesis steps. The delay of the circuit is estimated as the sum of the delays of the functional units on the critical path. In deep submicron technologies, wiring delays dominate gate delays, so they should also be considered.
• Testability: The architecture provided by a synthesis tool is ready to be fabricated. However, manufacturing test and debug remain major problems, in which the test sequences provided for the system components must be justified and propagated. In addition, an increase in test time results in an increase in chip cost and time-to-market. Therefore, considering testability during the early stages of behavioral synthesis yields an RTL circuit that is optimized for testability.
• Power consumption: Dynamic power consumption, which is caused by the switching of logic values, is the main source of power consumption in CMOS designs. Scheduling and allocation algorithms can help reduce the power consumption of a circuit.
RTL Synthesis
If the described RTL component already exists in the synthesis library (e.g., an adder), the RTL synthesis tool simply uses the cell and its corresponding expressions to implement the functionality of the RTL component. Otherwise, the RTL synthesis tool determines the functionality of all signals in the circuit and generates Boolean equations that implement it. The output of the RTL synthesis tool is therefore a set of Boolean equations that describe the functionality of all signals. Note that no optimization is performed on the Boolean equations in this process.
During the RTL synthesis steps, different tools (e.g., test tools) provide optimization guidelines that help designers improve several features of the final circuit.
• Testability: At this level of abstraction several design for testability methods, e.g., scan design and built-in self test (BIST) insertion, can be applied.
○ In scan design, all FFs form a shift register called the scan register. This scan register can be accessed from external pins for test inputs and outputs. In test mode, the scan register is loaded by shifting the appropriate data into it. The data are then applied to a specific part of the circuit, and the outputs are loaded into the scan register. Finally, the results are shifted out via the scan register.
○ In BIST, on-chip circuitry is included to generate test vectors and analyze the outputs. For test pattern generation and output analysis, the registers used for BIST must be redesigned. The functions of a BIST normally involve random test pattern generation (RTPG), serial shifting of data into scan registers, collecting and compressing results using a multiple-input signature register (MISR), and analyzing the results.
• Power consumption: The main concern for power reduction at the RT level is to select low-power cells from the library.
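The scan test sequence described in the testability item above (shift in, capture, shift out) can be modeled with a short behavioral sketch. It is a toy model under stated assumptions: the chain is a Python list of bits, the logic under test is an arbitrary function from the current FF values to the captured values, and example_logic is a hypothetical three-bit circuit.

```python
def shift(chain, scan_in):
    """Shift one bit in at the head; return (new_chain, bit shifted out)."""
    return [scan_in] + chain[:-1], chain[-1]

def scan_test(logic, pattern):
    """Load a pattern, capture the logic response, and shift it out."""
    chain = [0] * len(pattern)

    # Shift in: the first bit shifted in ends up at the far end of the chain,
    # so the pattern is fed in reverse order.
    for bit in reversed(pattern):
        chain, _ = shift(chain, bit)

    # Capture: the FFs load the outputs of the logic under test.
    chain = logic(chain)

    # Shift out: the far end of the chain is observed first.
    observed = []
    for _ in range(len(chain)):
        chain, bit = shift(chain, 0)
        observed.append(bit)
    return observed[::-1]

def example_logic(state):
    """Hypothetical combinational logic between three scan FFs."""
    a, b, c = state
    return [a ^ b, b & c, a | c]

response = scan_test(example_logic, [1, 1, 0])
print(response)  # [0, 0, 1]
```

Comparing the shifted-out response with the expected response of a fault-free circuit is what decides whether the test passes.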
Logic Synthesis
A logic synthesizer parses the input design (which is described as Boolean equations) and builds an internal data structure (usually a graph represented by linked lists). Next, logic optimization uses a series of factoring, substitution, and elimination steps to simplify the equations that represent the synthesized network. To make logic optimization tractable, most tools use algorithms based on algebraic factors rather than Boolean factors. Logic optimization attempts to simplify the equations in the hope that this will also minimize area and maximize speed. Following the optimization pass, the technology-mapping pass decides which cells to use for the optimized representation of the circuit. In technology mapping, the algorithms attempt to minimize area (the default constraint) while satisfying other user constraints (delay, testability, and power constraints). Several parameters of the circuit are optimized during logic synthesis:
• Testability: The aim of testability optimization in logic synthesis is mainly to minimize the number of scan FFs and consequently reduce the design area overhead and cost.
• Power consumption: As mentioned before, the main source of power consumption in digital CMOS circuits is switching power, which can be reduced at the logic optimization stage.
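To make the idea of algebraic factoring concrete, the sketch below implements algebraic (weak) division, one common building block of factoring. It is an illustration under assumptions, not the method of any particular tool: an expression is a set of cubes, each cube is a frozenset of single-character literals, and the function name algebraic_divide is hypothetical.

```python
def algebraic_divide(f, d):
    """Divide sum-of-products f by divisor d: f = quotient * d + remainder.

    f, d: sets of cubes; each cube is a frozenset of literals.
    """
    quotient = None
    for d_cube in d:
        # Cubes of f that contain d_cube, with d_cube's literals removed.
        partial = {cube - d_cube for cube in f if d_cube <= cube}
        quotient = partial if quotient is None else quotient & partial
    quotient = quotient or set()

    # Everything of f not covered by quotient * d stays in the remainder.
    product = {q | d_cube for q in quotient for d_cube in d}
    remainder = set(f) - product
    return quotient, remainder

def cubes(*terms):
    """Helper: build a cube set from strings such as 'ac' (a AND c)."""
    return {frozenset(term) for term in terms}

# f = ac + ad + bc + bd + e, divided by d = a + b
f = cubes("ac", "ad", "bc", "bd", "e")
d = cubes("a", "b")

quotient, remainder = algebraic_divide(f, d)
print(sorted("".join(sorted(c)) for c in quotient))   # ['c', 'd']
print(sorted("".join(sorted(c)) for c in remainder))  # ['e']
```

The result corresponds to f = (a + b)(c + d) + e, which needs five literals instead of the nine in the original sum of products. Fewer literals usually, though not always, translates into smaller area after technology mapping.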