Area Efficient Carry Skip Adder Using Ladner-Fischer and CBL Architecture for Fastest Addition

Christeena Mohan

M.Tech Scholar (VLSI & Embedded System)

Dept. of ECE

IIET

M.G University Kottayam

Kerala, India

Dhanya Pushkaran

Assistant Professor

Dept. of ECE

IIET

Nellikuzhy

Kerala, India

Abstract: This paper presents a modified carry skip adder (CSKA) structure that has a higher speed yet lower energy consumption compared with the previously proposed hybrid structure. The speed enhancement is achieved by applying a new adder and an optimized ripple carry adder (RCA) scheme to improve the efficiency of the hybrid CSKA structure. Instead of the Brent-Kung adder and the normal RCA, the modified structure uses a Ladner-Fischer adder and an optimized RCA. The structure is realized in a variable stage size style, which further improves the speed of the adder. Finally, an application of the modified CSKA in a floating point adder unit, which lowers the power consumption without considerably impacting the speed, is presented. In addition, the area-delay product of the modified structure, built around the Ladner-Fischer parallel prefix adder with its considerably smaller area and delay, is the lowest among the structures considered in this paper. Simulations of the modified hybrid variable latency CSKA reveal a reduction in delay compared with the latest works in this field while maintaining a reasonably high speed.

Keywords: Carry skip adder, hybrid structure, modified CSKA, application in the floating point adder.

I. INTRODUCTION

This paper presents a carry skip adder (CSKA) structure that has a higher speed than the proposed hybrid CSKA. The speed improvement is achieved by applying a new adder to the structure, which improves the efficiency of the CSKA structure. A hybrid variable latency extension of the structure, which lowers the area consumption without considerably impacting the speed, is also presented.

Adders are a key building block in arithmetic and logic units (ALUs), and hence increasing their speed and reducing their power/energy consumption strongly affect the speed and power consumption of processors. Many works on optimizing the speed and power of these units have been reported. Obviously, it is highly desirable to achieve higher speeds at low power/energy consumption, which is a challenge for the designers of general purpose processors.

In this paper, a modified CSKA is used, which exhibits a higher speed and lower energy consumption compared with those of the hybrid CSKA. The speed enhancement is achieved by modifying the structure through a new type of parallel prefix adder network in the nucleus stage.

The nucleus stage is the stage with the largest bit width in the variable latency network. Here, a Ladner-Fischer adder is used as the parallel prefix network, and an optimized RCA is also introduced. The optimized RCA uses common Boolean logic (CBL), which works as a full adder with a minimum gate count. Using the modified CSKA, an efficient floating point adder is obtained. The results also suggest that the CSKA structure is a very good adder for applications where both speed and area consumption are critical. Given the attractive features of the CSKA structure, the modified CSKA increases the speed considerably while maintaining the low area and delay of the CSKA.

Hence, the contributions of this paper can be summarized as follows.

1) Proposing a modified CSKA structure that combines a parallel prefix network and an optimized RCA to enhance the speed of the adder. The modification provides the ability to use a simple Ladner-Fischer adder and an optimized RCA network.

2) Proposing a modified variable latency CSKA structure based on the extension of the suggested CSKA, by replacing some of the middle stages in its structure with a PPA, which is modified in this paper.

3) Using an optimized RCA with a minimum gate count.

4) Applying the modified structure to a floating point adder unit.

II. ONE-LEVEL CARRY-SKIP ADDERS

A strategy for designing carry-skip adders has been proposed that distributes the bits into different groups to achieve minimum delay. Adders are used to perform a number of mathematical operations such as subtraction, multiplication, and division, as well as addition. Since the adder is often included in the critical path, many adder architectures have been proposed to increase the speed while satisfying area and power dissipation constraints.

One of the most efficient architectures in terms of area and power dissipation is the carry-skip adder (CSA). A CSA consists of full adders grouped together into blocks, whose configuration strongly affects the overall speed. The blocks are connected by 2:1 multiplexers, which can be placed into one or more level structures. To minimize the delay, many strategies have been proposed to find the optimum distribution of full adders in blocks.

Fig 1: CSA architecture

A. Variable Block Size CSA

Instead of a constant block size, a variable block size can be used to enhance the performance of the CSA. The number of bits can then be increased further without increasing the propagation delay of the CSA. An adder should be considered optimal if it is not possible to add bits without increasing the delay and area.
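The block organization described above can be illustrated with a small bit-level model. This is a minimal sketch, not the authors' implementation: the function name `carry_skip_add` and the block sizes in `BLOCKS` are hypothetical, chosen only to show a one-level carry-skip adder whose blocks have different widths.

```python
def carry_skip_add(a, b, block_sizes, cin=0):
    """Bit-level model of a one-level carry-skip adder.

    block_sizes lists the width of each block, least significant first.
    Returns (sum, carry_out) for an adder of sum(block_sizes) bits.
    """
    total, carry, pos = 0, cin, 0
    for size in block_sizes:
        block_cin = carry
        block_propagate = 1
        for i in range(pos, pos + size):
            ai, bi = (a >> i) & 1, (b >> i) & 1
            p = ai ^ bi                      # bit propagate
            block_propagate &= p
            total |= (p ^ carry) << i        # sum bit
            carry = (ai & bi) | (p & carry)  # ripple inside the block
        # 2:1 multiplexer: if every bit propagates, the block carry-in
        # is forwarded directly instead of waiting for the ripple chain.
        carry = block_cin if block_propagate else carry
        pos += size
    return total, carry


# Hypothetical variable block sizes for a 16-bit adder (small, large, small).
BLOCKS = [2, 3, 6, 3, 2]
print(carry_skip_add(0x0FFF, 0x0001, BLOCKS))  # (4096, 0)
```

The model is functional only: the skip path returns the same carry value as the ripple chain, and in hardware its benefit is that the value arrives earlier. Small blocks at both ends and larger blocks in the middle are the usual shape of a variable block size distribution.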

B. High-Speed and Energy-Efficient Carry Skip Adder

Adders are a key building block in arithmetic and logic units (ALUs) and hence increasing their speed and reducing their power/energy consumption strongly affect the speed and power consumption of processors. There are many adder families with different delays, power consumptions, and area usages. Examples include ripple carry adder (RCA), carry increment adder (CIA), carry skip adder (CSKA), carry select adder (CSLA), and parallel prefix adders (PPAs).

Fig 2: structure of the CSKA

The RCA has the simplest structure with the smallest area and power consumption but with the worst critical path delay. In the CSLA, the speed, power consumption, and area usages are considerably larger than those of the RCA. The PPAs, which are also called carry look-ahead adders, exploit direct parallel prefix structures to generate the carry as fast as possible.

There are different types of parallel prefix algorithms that lead to different PPA structures with different performances. As an example, the Kogge-Stone adder (KSA) is one of the fastest structures but results in large power consumption and area usage. It should be noted that the structural complexity of PPAs is higher than that of other adder schemes. The CSKA is an efficient adder in terms of power consumption and area usage, and its critical path delay is much smaller than that of the RCA.

The proposed modification increases the speed considerably while maintaining the low area and power consumption features of the CSKA. In addition, an adjustment of the structure, based on the variable latency technique, which in turn lowers the area consumption without considerably impacting the CSKA speed, is also presented.

1) Proposing a modified CSKA structure by combining the concatenation and the incrementation schemes with the conventional CSKA structure for enhancing the speed and energy efficiency of the adder. The modification provides the ability to use simpler carry skip logics based on the AOI/OAI compound gates instead of the multiplexer.

2) Providing a design strategy for constructing an efficient CSKA structure based on analytical expressions presented for the critical path delay.

3) Proposing a hybrid variable latency CSKA structure based on the extension of the suggested CSKA, by replacing some of the middle stages in its structure with a PPA, which is modified in this paper.

The rest of this paper is organized as follows. Section III discusses related work on the proposed hybrid structure. In Section IV, the proposed hybrid variable latency CSKA is explained, while Section V describes the modified hybrid CSKA structure.

III. PROPOSED HYBRID STRUCTURE

A. General Description of the Proposed Hybrid Structure

The structure provides the ability to use simpler carry skip logics. The logic replaces 2:1 multiplexers with AOI/OAI compound gates (Fig. 3). These gates, which consist of fewer transistors, have lower delay, area, and power consumption than the 2:1 multiplexer. Note that, in this structure, as the carry propagates through the skip logics, it becomes complemented. Therefore, at the output of the skip logic of even stages, the complement of the carry is generated. The structure has a considerably lower delay and a smaller area than the conventional one. Note that the power consumption of the AOI or OAI gate is smaller than that of the multiplexer, and the delay of the proposed hybrid CSKA is also smaller than that of the conventional one. This is due to the smaller number of gates.
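The complementing behaviour of the compound gates can be checked exhaustively. The sketch below is an illustration under assumptions: the signal names `p` (stage propagate), `cin` (carry from the previous stage) and `c_rca` (carry out of the stage's RCA block), and the skip function `(p AND cin) OR c_rca`, are not taken from the text. It shows that an AOI gate produces the complemented carry, and that an OAI gate fed with complemented inputs restores the true carry.

```python
from itertools import product


def aoi21(a, b, c):
    """AND-OR-Invert gate: NOT((a AND b) OR c)."""
    return 1 - ((a & b) | c)


def oai21(a, b, c):
    """OR-AND-Invert gate: NOT((a OR b) AND c)."""
    return 1 - ((a | b) & c)


def skip_carry(p, cin, c_rca):
    """Reference (non-inverted) skip function assumed for one stage."""
    return (p & cin) | c_rca


for p, cin, c_rca in product((0, 1), repeat=3):
    true_carry = skip_carry(p, cin, c_rca)
    # An AOI stage outputs the complement of the carry.
    assert aoi21(p, cin, c_rca) == 1 - true_carry
    # An OAI stage driven by complemented signals outputs the true carry.
    assert oai21(1 - p, 1 - cin, 1 - c_rca) == true_carry
```

Alternating the two gate types along the skip path therefore keeps the carry polarity known at every stage without inserting extra inverters.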

In this structure, the first stage has only one block, which is an RCA. Stages 2 to Q consist of two blocks: an RCA block and an incrementation block. The incrementation block uses the intermediate results generated by the RCA block and the carry output of the previous stage to calculate the final sum of the stage.

Fig 3: Proposed Hybrid CSKA Structure

As shown in Fig. 3, the skip logic determines the carry output of the jth stage based on the intermediate results of the jth stage and the carry output of the previous stage (C_{O,j−1}) as well as the carry output of the corresponding RCA block (C_O). Both AOI and OAI compound gates are used as the skip logics because these gates are inverting functions in standard cell libraries. The incrementation block internally contains a chain of half-adders (HAs).
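How an RCA block and an incrementation block cooperate inside one stage can be sketched as follows. This is a hedged illustration, not the circuit of Fig. 3: the helper names (`rca_block`, `incrementation_block`, `to_bits`, `from_bits`) are hypothetical, and the RCA carry input is assumed to be tied to 0 so that the carry from the previous stage is added afterwards by the half-adder chain.

```python
def half_adder(x, y):
    """Return (sum, carry) of two bits."""
    return x ^ y, x & y


def rca_block(a_bits, b_bits):
    """Ripple carry block with its carry input tied to 0 (LSB first)."""
    sums, carry = [], 0
    for ai, bi in zip(a_bits, b_bits):
        sums.append(ai ^ bi ^ carry)
        carry = (ai & bi) | (carry & (ai ^ bi))
    return sums, carry


def incrementation_block(sum_bits, stage_cin):
    """Add the previous stage's carry to the intermediate sums via a HA chain."""
    out, carry = [], stage_cin
    for s in sum_bits:
        bit, carry = half_adder(s, carry)
        out.append(bit)
    return out, carry


def to_bits(value, width):
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))


# One 4-bit stage: 9 + 6 with a carry of 1 arriving from the previous stage.
inter, c_rca = rca_block(to_bits(9, 4), to_bits(6, 4))
final, c_inc = incrementation_block(inter, 1)
print(from_bits(final), c_rca | c_inc)  # 0 1
```

The RCA carry and the incrementation carry can never both be 1 for the same operands, so the stage carry output is simply their OR.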

B. Area and Delay of the Proposed Structure

The use of the static AOI and OAI gates instead of the static 2:1 multiplexer decreases the area usage and delay of the skip logic. In addition, the proposed structure utilizes incrementation blocks that do not exist in the conventional one. However, these blocks may be implemented with about the same logic gates (XOR and AND gates) as those used for generating the select signal of the multiplexer in the conventional structure. Therefore, the area usage of the proposed CSKA structure is decreased compared with that of the conventional one.

The critical path of the proposed CI-CSKA structure, which contains three parts, is shown in Fig. 3. These parts include the chain of the FAs of the first stage, the path of the skip logics, and the incrementation block in the last stage. The delay of this path (T_D) is the sum of the delays of these three parts.
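A form consistent with the three parts listed above can be written as follows. This is a sketch under stated assumptions rather than the authors' exact expression: $M_1$ and $M_Q$ are taken to be the bit widths of the first and last stages, $Q$ the number of stages, and $T_{CARRY}$, $T_{SKIP}$, $T_{AND}$ and $T_{XOR}$ the delays of one full-adder carry path, one skip logic, and the AND and XOR gates of a half-adder.

```latex
T_D = M_1 \, T_{CARRY} + (Q - 1) \, T_{SKIP} + (M_Q - 1) \, T_{AND} + T_{XOR}
```

The first term covers the FA chain of the first stage, the second the skip logics between stages, and the last two the half-adder chain of the final incrementation block, where the carry passes through the AND gates before the last XOR produces the most significant sum bit.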

C. Stage Sizes Consideration

The number of stages and the corresponding size of each stage, which are given in Fig. 4, have been determined based on a 45-nm static CMOS technology.

Fig 4: Sizes of the stages for the proposed 32-bit CSKA structure

The stage with the largest bit size is known as the nucleus stage. The parallel prefix adder is placed in this stage to decrease the delay and increase the speed.

IV. PROPOSED HYBRID VARIABLE LATENCY CSKA

In this section, the structure of a variable latency adder is described first.

A. Proposed Hybrid Variable Latency CSKA Structure

The basic idea behind using CSKA structures is to almost balance the delays of the paths so that the delay of the critical path is minimized. Here, some of the middle stages in the proposed structure are replaced with a PPA. The proposed hybrid variable latency CSKA structure is shown in Fig. 5, wherein an Mp-bit modified PPA is used for the pth stage (nucleus stage). Since the nucleus stage has the largest size among the stages, replacing it with the PPA reduces the delay of the longest off-critical paths. Thus, the use of the fast PPA helps increase the available slack time in the variable latency structure.

Fig 5: Structure of the proposed hybrid variable latency CSKA.

B. Parallel Prefix Adders

The PPA is similar to a carry look-ahead adder. The carry generation of prefix adders [1] can be designed in many different ways based on different requirements. A tree structure is used to increase the speed of the arithmetic operation. Parallel prefix adders are faster adders [4] and are used for high performance arithmetic structures in industry. The parallel prefix addition is done in 3 steps.

1. Pre-processing stage

2. Carry generation network

3. Post processing stage

In the pre-processing stage, the generate and propagate signals, which are used to produce the carry input of each adder bit, are computed from the inputs A and B. These signals are given by equations (1) and (2):

$G_i = A_i \cdot B_i$ (1)

$P_i = A_i \oplus B_i$ (2)

In the carry generation network stage, the carry corresponding to each bit is computed. The computation is divided into smaller pieces that are executed in parallel. The carry operator contains two AND gates and one OR gate. It uses the propagate and generate signals as intermediate signals, which are given by equations (3) and (4):

$G_{i:k} = G_{i:j} + P_{i:j} \cdot G_{j-1:k}$ (3)

$P_{i:k} = P_{i:j} \cdot P_{j-1:k}$ (4)

The post-processing stage is the final stage, in which the sum of the input bits is computed. It is the same for all adders, and the sum bit is given by $S_i = P_i \oplus C_{i-1}$, where $C_{i-1}$ is the carry into bit $i$.
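The three steps can be sketched in a few lines. This is a minimal model under assumptions: the function names are hypothetical, and the carry generation step applies the carry operator serially for clarity, whereas an actual prefix network arranges the same operator in a tree so that the carries are produced in parallel.

```python
def preprocess(a, b, width):
    """Step 1: bitwise generate (A AND B) and propagate (A XOR B)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    return g, p


def carry_operator(hi, lo):
    """Step 2 building block: two AND gates and one OR gate.

    hi and lo are (generate, propagate) pairs; hi is the more
    significant group.
    """
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def prefix_add(a, b, width, cin=0):
    g, p = preprocess(a, b, width)
    carries = [cin]            # carries[i] is the carry into bit i
    group = (cin, 0)           # carry-in treated as a generate below bit 0
    for i in range(width):
        group = carry_operator((g[i], p[i]), group)
        carries.append(group[0])
    # Step 3: each sum bit is propagate XOR incoming carry.
    total = sum((p[i] ^ carries[i]) << i for i in range(width))
    return total, carries[width]


print(prefix_add(0b1011, 0b0110, 4))  # (1, 1)
```

Because the carry operator is associative, the order in which groups are combined can be changed freely, which is what allows different prefix networks to trade logic depth against fan-out and wiring.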

C. Brent-Kung Adder

In the proposed hybrid structure, the prefix network of the Brent-Kung adder is used for constructing the nucleus stage (Fig. 6). One of the advantages of this adder compared with other prefix adders is that, using forward paths, the longest carry is calculated sooner than the intermediate carries, which are computed by backward paths. In addition, the fan-out of the adder is less than that of other parallel adders, while the length of its wiring is smaller.
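The forward and backward paths can be made concrete with a small model of a Brent-Kung prefix network. This is a sketch under assumptions, not the netlist of Fig. 6: the function names are hypothetical, the width is assumed to be a power of two, and the carry into bit 0 is taken as 0.

```python
def combine(hi, lo):
    """Carry operator on (generate, propagate) pairs; hi is more significant."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def brent_kung_prefix(gp):
    """Return the group (G, P) over bits 0..i for every position i."""
    gp = list(gp)
    n = len(gp)                        # assumed to be a power of two
    d = 1
    while d < n:                       # forward paths: longest carry first
        for i in range(2 * d - 1, n, 2 * d):
            gp[i] = combine(gp[i], gp[i - d])
        d *= 2
    d = n // 4
    while d >= 1:                      # backward paths: intermediate carries
        for i in range(3 * d - 1, n, 2 * d):
            gp[i] = combine(gp[i], gp[i - d])
        d //= 2
    return gp


def brent_kung_add(a, b, width=16):
    gp = [((a >> i) & (b >> i) & 1, ((a >> i) ^ (b >> i)) & 1)
          for i in range(width)]
    prefix = brent_kung_prefix(gp)
    carries = [0] + [g for g, _ in prefix]   # carries[i] = carry into bit i
    total = sum((gp[i][1] ^ carries[i]) << i for i in range(width))
    return total, carries[width]
```

After the forward loop, the last position already holds the carry out of the whole block, while the positions in between are only completed by the backward loop. Each node result is reused by at most a few later nodes, which is the low fan-out property mentioned above.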

Fig 6: Internal structure of the pth stage of the proposed hybrid variable latency CSKA.

V. MODIFIED HYBRID CSKA

A. Modification of the Brent-Kung Adder

1. Ladner-Fischer Adder

The Ladner-Fischer adder is a parallel prefix adder developed by R. Ladner and M. Fischer in 1980. Its structure is built from carry operator nodes. It has minimum logic depth, but it has a large fan-out requirement that grows with the adder width n. The 3-bit and 32-bit Ladner-Fischer adder figures are shown below.

Fig 7: stage operation

Fig 8: Ladner-Fischer adder structure

In this paper, a 16-bit Ladner-Fischer adder circuit is used instead of the Brent-Kung adder. This type of adder provides a high speed of operation.
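A minimum-depth prefix network of this kind can be sketched as follows. The code is an illustration under assumptions rather than the circuit of Fig. 8: it models the minimum-depth divide-and-conquer form commonly drawn for the Ladner-Fischer adder, the function names are hypothetical, the width is assumed to be a power of two, and the carry into bit 0 is taken as 0.

```python
def combine(hi, lo):
    """Carry operator on (generate, propagate) pairs; hi is more significant."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def ladner_fischer_prefix(gp):
    """Return the group (G, P) over bits 0..i for every position i."""
    gp = list(gp)
    n = len(gp)                 # assumed to be a power of two
    d = 1
    while d < n:                # log2(n) levels in total
        for i in range(n):
            if i & d:           # position lies in the upper half of its group
                # last position of the lower half of the same 2d-wide group
                j = (i // (2 * d)) * (2 * d) + d - 1
                gp[i] = combine(gp[i], gp[j])
        d *= 2
    return gp


def ladner_fischer_add(a, b, width=16):
    gp = [((a >> i) & (b >> i) & 1, ((a >> i) ^ (b >> i)) & 1)
          for i in range(width)]
    prefix = ladner_fischer_prefix(gp)
    carries = [0] + [g for g, _ in prefix]   # carries[i] = carry into bit i
    total = sum((gp[i][1] ^ carries[i]) << i for i in range(width))
    return total, carries[width]
```

At each level, the single node `j` feeds every position in the upper half of its group, so the number of levels stays at log2 of the width while the fan-out of that node doubles from one level to the next. This is the depth versus fan-out trade described above.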

B. Modification of the RCA

1. Optimized RCA

Fig 9: CBL Structure

The RCA based on common Boolean logic (CBL) is area efficient because it uses a minimum number of gates. A normal full adder has 13 gates, whereas the CBL structure offers a simple full adder with 12 gates. The sum and carry signals of the full adder for Cin=1 are generated by an INV gate and an OR gate. Through the multiplexer, the correct output is selected according to the logic state of the carry-in signal. One input of the mux comes from the ripple carry adder block with Cin=0, and the other input comes from the common Boolean logic.
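The selection scheme can be checked against the full adder truth table. The sketch below is a functional illustration only, with a hypothetical function name: it does not model the gate count, and it assumes that the Cin=0 results are the XOR and AND of the two operand bits.

```python
def cbl_full_adder(a, b, cin):
    """One-bit adder built from common Boolean logic plus a multiplexer."""
    s0 = a ^ b        # sum when Cin = 0
    c0 = a & b        # carry when Cin = 0
    s1 = 1 - s0       # sum when Cin = 1: INV of the Cin = 0 sum
    c1 = a | b        # carry when Cin = 1: OR of the operand bits
    # Multiplexer: the actual carry-in selects between the two candidates.
    return (s1, c1) if cin else (s0, c0)


for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            s, c = cbl_full_adder(a, b, cin)
            assert s + 2 * c == a + b + cin
```

Both candidate results are available before the carry-in arrives, so the carry-in only has to pass through the multiplexer.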

VI. APPLICATION OF THE MODIFIED CSKA

A. Application in the Floating Point Adder Unit

Floating point arithmetic is the most widely used way of approximating real number arithmetic for performing numerical calculations on modern computers. The advantage of floating point representation over fixed point and integer representations is that it supports a much wider range of values. Addition/subtraction, multiplication, and division are the common arithmetic operations in these computations.

Fig 10: Floating point number structure

Fig 11: Addition operation of floating point adder

Many fields of science require manipulating real numbers efficiently. Since the first computers appeared, different ways of approximating real numbers on them have been introduced. One of them, floating point arithmetic, is the most efficient way of representing real numbers in computers. It represents an infinite, continuous set (the real numbers) with a finite set. Floating point arithmetic represents a very good compromise for most numerical applications, such as robots, air traffic controllers, and digital computers. Because of its growing range of applications, the main emphasis is on implementing the floating point adder effectively, so that it uses less chip area at a higher clock speed.

VII. EXPERIMENTAL RESULTS

To compare the performance of the modified CSKA design with that of the previous system, we implemented the high-speed CSKA using the Ladner-Fischer adder and the CBL structure, which reduce the delay and area. The simulation results of the modified CSKA top circuit and the modified floating point adder are shown in Figs. 12 and 13.

Fig 12: Modified CSKA top circuit output

Fig 13: Modified floating point adder output

Table 1: Comparison of the 64-bit CSKA using the Brent-Kung adder, the CSKA using the Ladner-Fischer adder, and the modified CSKA

| Design | No. of slices | Delay (ns) |
|---|---|---|
| Hybrid CSKA using Brent-Kung adder | 130 | 65.08 |
| 64-bit CSKA using Ladner-Fischer adder | 126 | 62.372 |
| Modified CSKA | 124 | 59.889 |

The adder features and the comparison with related designs are summarized in Table 1. The delay is reduced by using the Ladner-Fischer adder as the parallel prefix network in the nucleus stage of the variable stage size structure, which improves the overall speed of addition. The area is also reduced, although the other designs remain efficient enough for specific applications. The optimized RCA used here is a full adder with CBL, which has only 12 gates.
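The relative improvements implied by Table 1 can be worked out directly from the reported figures. The short script below is only a convenience calculation on those numbers: the dictionary keys and the helper `reduction_percent` are hypothetical names, and the area-delay product is taken here as slices multiplied by delay.

```python
# (number of slices, delay in ns) as reported in Table 1
results = {
    "Hybrid CSKA using Brent-Kung adder": (130, 65.08),
    "64-bit CSKA using Ladner-Fischer adder": (126, 62.372),
    "Modified CSKA": (124, 59.889),
}


def reduction_percent(baseline, value):
    """Relative reduction of value with respect to baseline, in percent."""
    return 100.0 * (baseline - value) / baseline


base_slices, base_delay = results["Hybrid CSKA using Brent-Kung adder"]
summary = {}
for name, (slices, delay) in results.items():
    summary[name] = {
        "slice_reduction_pct": reduction_percent(base_slices, slices),
        "delay_reduction_pct": reduction_percent(base_delay, delay),
        "area_delay_product": slices * delay,
    }
    print(f"{name}: "
          f"{summary[name]['slice_reduction_pct']:.1f}% fewer slices, "
          f"{summary[name]['delay_reduction_pct']:.1f}% lower delay, "
          f"area-delay product {summary[name]['area_delay_product']:.1f}")
```

Relative to the hybrid CSKA using the Brent-Kung adder, the modified CSKA uses about 4.6% fewer slices and has about 8.0% lower delay, and it has the smallest slices-times-delay product of the three designs.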

Fig 14: Area and delay comparison chart of the proposed and modified schemes

Figure 14 shows that the area of the hybrid CSKA is slightly larger than that of the modified CSKA, because the modified CSKA has fewer gates. Compared with the Brent-Kung adder, the Ladner-Fischer adder requires less area and delay. Using the CBL-based adder gives a minimum gate count.

VIII. CONCLUSION

The modified high-speed carry skip adder allows accurate, high-speed addition with minimum delay. The most important advantage of this work is its ability to add bits with minimum delay. This is done by using a parallel prefix adder network in the nucleus stage, which is the stage with the largest bit width in the variable latency network. Here, a Ladner-Fischer adder is used as the parallel prefix network, and an optimized RCA is also introduced. Using the CBL-based adder gives a minimum gate count. An efficient floating point adder is obtained with the modified CSKA.

In the future, addition can be made more efficient by introducing a more efficient adder in the nucleus stage, which enhances the overall speed and reduces the delay. A static CMOS CSKA structure called CI-CSKA was proposed, which exhibits a higher speed and lower energy consumption compared with those of the conventional one. The speed enhancement was achieved by modifying the structure through the concatenation and incrementation techniques. In addition, AOI and OAI compound gates were exploited for the carry skip logics. The results also suggest the modified CSKA structure as a very good adder for applications where both speed and energy consumption are critical. In addition, a hybrid variable latency extension of the structure was modified.

REFERENCES

[1] M. Bahadori, M. Kamal, A. Afzali-Kusha, and M. Pedram, “High-speed and energy-efficient carry skip adder operating under a wide range of supply voltage levels,” ISSN 1063-8210, © 2015 IEEE.

[2] P. Chaitanya Kumari and R. Nagendra, “Design of 32 bit parallel prefix adders,” IOSR Journal of Electronics and Communication Engineering (IOSR-JECE), e-ISSN: 2278-2834, p-ISSN: 2278-8735, vol. 6, no. 1, May-Jun. 2013.

[3] D. Paradhasaradhi and K. Anusudha, “An area efficient enhanced SQRT carry select adder,” Int. Journal of Engineering Research and Applications, www.ijera.com, ISSN: 2248-9622, vol. 3, no. 6, pp. 876-880, Nov-Dec 2013.

[4] S. K. Mathew, M. A. Anders, B. Bloechel, T. Nguyen, R. K. Krishnamurthy, and S. Borkar, “A 4-GHz 300-mW 64-bit integer execution ALU with dual supply voltages in 90-nm CMOS,” IEEE J. Solid-State Circuits, vol. 40, no. 1, pp. 44–51, Jan. 2005.

[5] V. G. Oklobdzija, B. R. Zeydel, H. Q. Dao, S. Mathew, and R. Krishnamurthy, “Comparison of high-performance VLSI adders in the energy-delay space,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 13, no. 6, pp. 754–758, Jun. 2005.

[6] B. Ramkumar and H. M. Kittur, “Low-power and area-efficient carry select adder,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 2, pp. 371–375, Feb. 2012.

[7] M. Vratonjic, B. R. Zeydel, and V. G. Oklobdzija, “Low- and ultralow-power arithmetic units: Design and comparison,” in Proc. IEEE Int. Conf. Comput. Design, VLSI Comput. Process. (ICCD), Oct. 2005, pp. 249–252.

[8] C. Nagendra, M. J. Irwin, and R. M. Owens, “Area-time-power tradeoffs in parallel adders,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 43, no. 10, pp. 689–702, Oct. 1996.

[9] Y. He and C.-H. Chang, “A power-delay efficient hybrid carry-lookahead/carry-select based redundant binary to two’s complement converter,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 55, no. 1, pp. 336–346, Feb. 2008.

[10] C.-H. Chang, J. Gu, and M. Zhang, “A review of 0.18 μm full adder performances for tree structured arithmetic circuits,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 13, no. 6, pp. 686–695, Jun. 2005.

[11] D. Markovic, C. C. Wang, L. P. Alarcon, T.-T. Liu, and J. M. Rabaey, “Ultralow-power design in near-threshold region,” Proc. IEEE, vol. 98, no. 2, pp. 237–252, Feb. 2010.

[12] R. G. Dreslinski, M. Wieckowski, D. Blaauw, D. Sylvester, and T. Mudge, “Near-threshold computing: Reclaiming Moore’s law through energy efficient integrated circuits,” Proc. IEEE, vol. 98, no. 2, pp. 253–266, Feb. 2010.

[13] S. Jain et al., “A 280 mV-to-1.2 V wide-operating-range IA-32 processor in 32 nm CMOS,” in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers (ISSCC), Feb. 2012, pp. 66–68.

[14] R. Zimmermann, “Binary adder architectures for cell-based VLSI and their synthesis,” Ph.D. dissertation, Dept. Inf. Technol. Elect. Eng., Swiss Federal Inst. Technol. (ETH), Zürich, Switzerland, 1998.

[15] D. Harris, “A taxonomy of parallel prefix networks,” in Proc. IEEE Conf. Rec. 37th Asilomar Conf. Signals, Syst., Comput., vol. 2, Nov. 2003, pp. 2213–2217.

[16] P. M. Kogge and H. S. Stone, “A parallel algorithm for the efficient solution of a general class of recurrence equations,” IEEE Trans. Comput., vol. C-22, no. 8, pp. 786–793, Aug. 1973.

[17] V. G. Oklobdzija, B. R. Zeydel, H. Dao, S. Mathew, and R. Krishnamurthy, “Energy-delay estimation technique for high-performance microprocessor VLSI adders,” in Proc. 16th IEEE Symp. Comput. Arithmetic, Jun. 2003, pp. 272–279.

[18] M. Lehman and N. Burla, “Skip techniques for high-speed carry-propagation in binary arithmetic units,” IRE Trans. Electron. Comput., vol. EC-10, no. 4, pp. 691–698, Dec. 1961.

[19] K. Chirca et al., “A static low-power, high-performance 32-bit carry skip adder,” in Proc. Euromicro Symp. Digit. Syst. Design (DSD), Aug./Sep. 2004, pp. 615–619.

[20] M. Alioto and G. Palumbo, “A simple strategy for optimized design of one-level carry-skip adders,” IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 50, no. 1, pp. 141–148, Jan. 2003.

[21] S. Majerski, “On determination of optimal distributions of carry skips in adders,” IEEE Trans. Electron. Comput., vol. EC-16, no. 1, pp. 45–58, Feb. 1967.
