Computer Arithmetic or VLSI Signal Processing:Dividers

Dividers

There are two major types of dividers: digit recurrence dividers that compute a fixed number of bits of the quotient on each iteration and functional approximation dividers that compute an increasingly accurate approximation to the quotient on each iteration.

Digit recurrence dividers use a sequence of shift, subtract, and compare operations to compute the quotient. The comparison operation is significant: it results in an inherently serial process which is not amenable to parallel implementation.

Computer Arithmetic for VLSI Signal Processing-0057

is the divisor. In this section it is assumed that both N and D are positive, see Ref. [10] for details on handling the general case.

Restoring divider. The restoring division algorithm is similar to paper and pencil division. A block diagram of the basic scheme is shown in Figure 80.14. Block 1 initializes the algorithm. In step 2 a trial dividend, TPk+1, is computed based on the assumption that the quotient bit is a 1. In the third step, the polarity of TPk+1, is evaluated. On the figure the diamonds indicate comparisons. The outgoing lines are labeled with the exit conditions (i.e., LT indicates Less Than, GE indicates Greater than or Equal, etc.). If TPk+1 is

Computer Arithmetic for VLSI Signal Processing-0058

Computer Arithmetic for VLSI Signal Processing-0059Computer Arithmetic for VLSI Signal Processing-0060

negative, step 4 “restores” the trial dividend to what it would be if the quotient digit had been assumed to be 0. This restoring operation gives the algorithm its name. Step 5 is performed if TPk+1, is zero or positive. Finally, step 6 tests whether all bits of the quotient have been formed and goes to step 2 if more need to be computed. Each pass through steps 2-6 forms one bit of the quotient. As shown in Figure 80.14, each pass through steps 2–6 requires an addition in step 2, a comparison in step 3, and may require an addition in step 4. If half of the quotient bits are 0 and half are 1, computing an n bit quotient will involve 3n/2 additions and n comparisons. This is the performing version of the restoring division algorithm which means that the restoring operation is actually performed in step 4.

Figure 80.15 shows an example of the restoring division algorithm, where the first four bits of quotient of 5/8 divided by 7/8 is evaluated.

The alternative nonperforming version of the algorithm places 2Pk in a temporary register in step 2 and uses that value for Pk+1 in step 4. The nonperforming version computes an n-bit quotient in n additions and n comparisons.

clip_image100Binary SRT divider. The binary SRT division process (named after its inventors, Sweeney, Robertson and Tocher) is similar to nonrestoring division, but the set of allowable quotient digits is increased to {1, 0, 1} and the divisor is restricted to 0.5 £ D < 1. The digit selection and resulting partial remainder are given for the k th iteration by the following relations:

Computer Arithmetic for VLSI Signal Processing-0061

Computer Arithmetic for VLSI Signal Processing-0062

The basic scheme is shown in Figure 80.16. Block 1 initializes the algorithm. In steps 3 and 5 Pk, is compared to ±0.5. If Pk ³ 0.5, in step 4 the quotient digit is set to 1 and Pk+1 = 2Pk - D. If Pk £ - 0.5, in step 6 the quotient digit is set to -1 and Pk+1 = 2Pk + D. If the value of Pk is between -0.5 and 0.5, step 7 sets Pk+1 = 2Pk. Finally, step 8 tests whether all bits of the quotient have been formed and goes to step 2 if more need to be computed. Each pass through steps 2–8 forms one digit of the quotient. As shown in Figure 80.16, each pass through steps 2–8 requires one or two comparisons in steps 3 and 5, and may require an addition (in step 4 or step 6). Thus computing an n-bit quotient will involve up to n additions and from n to 2n comparisons. In practice, most implementations use logic to inspect the top few bits of Pk so the comparisons that are shown in series in Figure 80.16 are performed in parallel in a single step.

clip_image102clip_image102[1]clip_image103The signed digit number (comprised of ±1 digits) can be converted into a conventional binary number by subtracting, NEG, the number formed by the 1s (with ZEROs where there are +ONEs or ZEROs and ONEs where there are 1s in Q) from, POS, the number formed by the +ONEs ( with ONEs where there are +ONEs and ZEROs where there are 1s or ZEROs in Q). For example:

Computer Arithmetic for VLSI Signal Processing-0063

Higher radix SRT divider. The higher radix SRT division process is similar to the binary SRT algorithms. Radix 4 is the most common higher radix SRT division algorithm with either the minimally redundant digit set of {±2, ±1, 0} or the maximally redundant digit set of {±3, ±2, ±1, 0}. The operation of the algorithm is similar to the binary SRT algorithm shown in Figure 80.16 except that Pk and Q are applied

Computer Arithmetic for VLSI Signal Processing-0064

to a lookup table or a programmable logic array to determine the quotient digit. A research monograph provides a detailed treatment of SRT division [10].

Newton–Raphson divider. A second division technique uses a form of Newton–Raphson iteration to derive a quadratically convergent approximation to the reciprocal of the divisor that is then multiplied by the dividend to produce the quotient. In systems which include a fast multiplier, this process is may be faster than digit recurrence division [10].

The Newton–Raphson division algorithm to compute Q = A consists of three basic steps:

Computer Arithmetic for VLSI Signal Processing-0065Computer Arithmetic for VLSI Signal Processing-0066

Figure 80.17 illustrates the operation of the Newton–Raphson algorithm. With this algorithm, the error decreases quadratically, so that the number of correct bits in each approximation is roughly twice the number of correct bits on the previous iteration. Thus, from a roughly four-bit initial approxima- tion from Eq. (80.32), two iterations of Eq. (80.33) produce a reciprocal estimate accurate to about 16 bits, three iterations produce a reciprocal estimate accurate to about 32 bits, and four iterations produce a reciprocal estimate accurate to about 64 bits.

Computer Arithmetic for VLSI Signal Processing-0067

The efficiency of this process is dependent on the availability of a fast multiplier, since each iteration of Eq. (80.33) requires two multiplications and a subtraction. The complete process for the initial estimate, two iterations, and the final quotient determination requires three subtracts and five multiples to produce a 16-bit quotient. This is faster than a conventional nonrestoring divider if multiplication is roughly as fast as addition, a condition which may be satisfied for systems which include hardware multipliers.

Comments

Popular posts from this blog

SRAM:Decoder and Word-Line Decoding Circuit [10–13].

ASIC and Custom IC Cell Information Representation:GDS2

Timing Description Languages:SDF