Content-Addressable Memory: CAM Cell Circuits
CAM Cell Circuits
In a CAM architecture, most of the circuit consists of a large array of CAM cells, each implemented as a bit-storage circuit together with a bit-comparison circuit. The bit storage holds the input data, and the bit-comparison circuit compares the search data with the stored data. Two kinds of CAM cell are widely used: the binary CAM (BCAM) cell and the ternary CAM (TCAM) cell. In standard CMOS circuit design, the conventional BCAM cell requires nine [12–14] or ten [15] transistors, while the conventional TCAM cell requires seventeen [16]. Since more transistors occupy a larger die area, CAM cells have relatively low memory density compared with standard memory cells such as those of DRAMs and SRAMs, and this low density limits the capacity available to CAM applications. In most CAM applications, however, the required CAM size is smaller than that of ordinary memory devices, because the search tables involved are quite small. The following sections introduce both BCAM and TCAM cell designs.
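As a rough illustration of the density argument, the sketch below compares the transistor counts quoted above with an ordinary six-transistor SRAM cell. It assumes that transistor count is a usable first-order proxy for cell area, which ignores layout and wiring; the dictionary and function names are hypothetical.

```python
# Transistors per stored bit for the cells discussed in the text.
TRANSISTORS_PER_CELL = {
    "6T SRAM": 6,
    "9T BCAM": 9,
    "10T BCAM": 10,
    "17T TCAM": 17,
}


def relative_transistor_count(cell: str, baseline: str = "6T SRAM") -> float:
    """Ratio of a cell's transistor count to that of the baseline cell."""
    return TRANSISTORS_PER_CELL[cell] / TRANSISTORS_PER_CELL[baseline]


for name in ("9T BCAM", "10T BCAM", "17T TCAM"):
    print(f"{name}: {relative_transistor_count(name):.2f}x the transistors of a 6T SRAM cell")
```

Under this assumption, a CAM array needs noticeably more transistors per bit than an SRAM array of the same capacity, which is the density penalty described above.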
BCAM Cells
In CAM circuit design, the BCAM cell is typically built as a nine- or 10-transistor structure, as shown in Figure 56.2. The nine-transistor BCAM cell in Figure 56.2(a) consists of an ordinary six-transistor SRAM cell that stores the input data, an XOR-type comparison circuit (M1 and M2), and a resultant transistor M3 that drives the match-line (ML). During the write cycle, the input data BL and its complement BLB are placed on the bit-lines, and the word-line (WL) is then driven, forcing the stored data Q to be replaced by the input data BL. To read the stored data, WL is driven so that Q and its complement QB are placed on the bit-lines BL and BLB, respectively; a read sense amplifier then detects and amplifies the small level difference between the two bit-lines. As a result, the read and write operations of the BCAM cell are similar to those of ordinary memory cells. In most CAM applications, since the required CAM size is quite small, the parasitic capacitance of each bit-line is smaller than in ordinary memory devices. For this reason, CAM read and write operations are faster than those of ordinary memory devices.
In the nine-transistor BCAM cell, the comparison circuit (M1 and M2) is an XOR logic circuit that performs the bit comparison. During the search cycle, the search data BL is applied to the cell and compared with the stored data Q. If BL equals Q, the comparison circuit outputs logic 0, which turns off the resultant transistor M3, and the match-line ML is left floating, indicating a match. If BL differs from Q, the comparison circuit outputs logic 1, which turns on M3, and ML is pulled down to VSS, indicating a mismatch. Table 56.1 shows the truth table of the nine-transistor BCAM cell: on a mismatch ML is at VSS (M3 is on); otherwise ML is floating (M3 is off).
Because the comparison circuit (M1 and M2) is implemented as a pass-transistor logic (PTL) XOR circuit, its output potential is at most (VDD − VSS − Vtn), where Vtn is the threshold voltage of an NMOS transistor [17]. This output drives M3, which turns on only when its gate voltage exceeds Vtn, so the operating voltage (VDD − VSS) must be greater than 2Vtn. At a high operating voltage, the fall time of ML can be very short, because the cell loads ML with only the drain capacitance of M3 and the longest pull-down path from ML to the ground node VSS contains only the single transistor M3. The cell therefore achieves high-speed data search. At a low operating voltage, however, ML needs a long fall time, since the weakly turned-on M3 limits the discharge current from ML to VSS, and search performance is poor. Consequently, the operating voltage of the nine-transistor BCAM cell cannot be reduced effectively in high-speed data search applications.
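The write, read, and search behaviour of the nine-transistor cell described above can be summarized in a small logic-level sketch. This is only a behavioural model under simplifying assumptions (ideal logic levels, no timing or analog effects); the class name `BcamCell9T` and the helper `m3_can_turn_on` are hypothetical and do not come from the cited designs.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class BcamCell9T:
    """Logic-level model of the nine-transistor BCAM cell (no timing, no analog effects)."""
    q: int = 0  # stored bit Q held by the six-transistor SRAM part

    def write(self, bl: int) -> None:
        """Write cycle: WL is driven and Q is replaced by the bit-line value BL."""
        self.q = bl & 1

    def read(self) -> Tuple[int, int]:
        """Read cycle: returns (Q, QB) as they would appear on BL and BLB."""
        return self.q, 1 - self.q

    def search(self, bl: int) -> str:
        """Search cycle: the XOR of BL and Q drives M3; M3 on pulls ML to VSS."""
        m3_on = (bl & 1) ^ self.q
        return "VSS" if m3_on else "floating"


def m3_can_turn_on(vdd: float, vss: float, vtn: float) -> bool:
    """The PTL output high level is at most VDD - VSS - Vtn; M3 needs more than Vtn."""
    gate_high = vdd - vss - vtn
    return gate_high > vtn  # equivalent to (VDD - VSS) > 2 * Vtn


cell = BcamCell9T()
cell.write(1)
for bl in (0, 1):
    print(f"Q={cell.q} BL={bl} -> ML {cell.search(bl)}")
```

The `m3_can_turn_on` check restates the supply constraint: with an assumed Vtn of 0.5 V, a 1.2 V supply leaves enough gate drive for M3, whereas a 0.9 V supply does not.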
To reduce the required operating voltage, the 10-transistor BCAM cell shown in Figure 56.2(b) is a better choice than the nine-transistor cell. Its function is the same as that of the nine-transistor BCAM cell, but the comparison circuit and resultant transistor are implemented with the NMOS part of a CMOS XNOR gate (M1–M4) whose inputs are BL and Q, as shown in the figure. Since a CMOS logic circuit needs a lower operating voltage than a PTL logic circuit, the 10-transistor BCAM cell can operate at low voltage. However, the parasitic capacitance of the match-line ML in this design includes at least two drain capacitances, those of M1 and M3. This larger capacitance not only slows the switching of ML but also increases dynamic power dissipation during data search. For this reason, the 10-transistor BCAM cell is not suitable for high-speed, low-power data search applications.
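The capacitance penalty can be made concrete with a back-of-the-envelope estimate. The sketch below assumes that the cells of one word share a match-line, that every drain on the line contributes the same capacitance, and that recharging a fully discharged line draws roughly C·V² from the supply. The word width and the 1 fF drain capacitance are illustrative values, not figures from the source.

```python
def match_line_capacitance(n_cells: int, drains_per_cell: int, c_drain: float) -> float:
    """Total drain capacitance loading a match-line shared by n_cells cells."""
    return n_cells * drains_per_cell * c_drain


def recharge_energy(c_ml: float, vdd: float, vss: float = 0.0) -> float:
    """First-order energy drawn from the supply to recharge a discharged match-line."""
    return c_ml * (vdd - vss) ** 2


N_CELLS = 32      # assumed word width
C_DRAIN = 1e-15   # assumed drain capacitance per transistor, in farads
VDD = 1.2         # assumed supply voltage, in volts

c_9t = match_line_capacitance(N_CELLS, 1, C_DRAIN)   # only M3 on the match-line
c_10t = match_line_capacitance(N_CELLS, 2, C_DRAIN)  # at least M1 and M3 on the match-line

print(f"9T:  C_ML = {c_9t:.2e} F, E = {recharge_energy(c_9t, VDD):.2e} J")
print(f"10T: C_ML = {c_10t:.2e} F, E = {recharge_energy(c_10t, VDD):.2e} J")
```

Under these assumptions the 10-transistor cell at least doubles both the match-line capacitance and the energy per mismatch, which is the trade-off against its lower operating voltage.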
TCAM Cells
In BCAM cell designs, the stored data has two states (“0” and “1”), called the binary state. Most data search applications store data in BCAMs as a binary search table. However, some applications, for example Internet routers and pattern recognition, use an extra state “X” (the don't-care state) to improve memory utilization and search performance, and BCAM cells are not suitable for them. The TCAM cell solves this problem by providing three states (“0”, “1”, and “X”), called the ternary state. The TCAM cell shown in Figure 56.3 is typically built from 17 transistors: two ordinary six-transistor SRAM cells (the pattern storage and the mask storage), which store the pattern data P and the mask data M, and a five-transistor comparison circuit (M1–M5) that drives the match-line ML. The write operation requires two write cycles, one to store P in the pattern storage and one to store M in the mask storage. Similarly, the read operation requires two read cycles to read P and M from their respective storage cells.
During the search cycle, if the stored mask data M is “1”, the pull-down transistor M5 in the comparison circuit is turned on. The comparison circuit then operates like that of the 10-transistor BCAM cell, and the state of the match-line ML depends on the comparison of the search data BL with the stored data P. If the stored mask data M is “0”, the pull-down transistor M5 is turned off and ML is floating regardless of BL and P. In other words, the comparison always matches when the stored mask data M is “0”. Table 56.2 shows the truth table of the 17-transistor TCAM cell. As summarized in the table, the TCAM cell stores the states “0”, “1”, and “X”, where the “X” state makes the stored data always match the search data, whatever the values of BL and P.
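The masked comparison described above can be sketched at the logic level as follows. The per-cell rule follows the text (M = 1 enables the comparison, M = 0 forces a match); the word-level function additionally assumes that all cells of a word share one match-line, so that a single mismatching cell pulls it down. The function names are hypothetical.

```python
from typing import Sequence


def tcam_cell_pulls_down(bl: int, p: int, m: int) -> bool:
    """True if the cell discharges ML: M5 is on (M = 1) and BL differs from P."""
    return m == 1 and bl != p


def tcam_word_matches(search: Sequence[int], pattern: Sequence[int], mask: Sequence[int]) -> bool:
    """A word matches when no cell on its (assumed shared) match-line pulls it down."""
    if not (len(search) == len(pattern) == len(mask)):
        raise ValueError("search, pattern and mask must have the same width")
    return not any(
        tcam_cell_pulls_down(bl, p, m) for bl, p, m in zip(search, pattern, mask)
    )


# Stored word 1 0 X X: the two low bits are masked out (M = 0).
pattern = [1, 0, 1, 0]
mask = [1, 1, 0, 0]

print(tcam_word_matches([1, 0, 0, 1], pattern, mask))  # masked bits differ, top bits agree
print(tcam_word_matches([1, 1, 1, 0], pattern, mask))  # second bit differs and is not masked
```

A single stored entry with masked bits therefore covers a whole range of search keys, which is what makes the “X” state useful in applications such as routing tables.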
Since more transistors mean higher hardware cost, the 17-transistor TCAM cell is not suitable for low-cost applications. To reduce hardware cost, a dynamic four-transistor (4-T) TCAM cell has been proposed [18]. This cell, shown in Figure 56.4, consists of dynamic storage (M1 and M2) and a comparison circuit (M3 and M4). During the write cycle, the word-line WL is driven to turn on M1 and M2, and the input data BLa and BLb are stored on the nodes Qa and Qb, respectively, as charge on the node capacitances. When the stored data is “1”, the node capacitance is charged to VDD − VSS − Vthn, where Vthn is the NMOS threshold voltage; when it is “0”, the node capacitance is discharged to VSS. During the search cycle, M3 and M4 act as a PTL-type XOR logic circuit that performs the comparison. Table 56.3 shows how the three states are stored in the dynamic 4-T TCAM cell. Although the cell has only four transistors, which reduces hardware cost and power dissipation, its dynamic operation requires a complex control circuit (e.g., a dynamic refresh circuit) and careful design to avoid dynamic-circuit problems. The dynamic 4-T TCAM cell is therefore suited only to some cost-sensitive CAM applications.