Low power synchronization module for digital systems
Sagar B. Girawale
National Institute of Electronics and IT, Aurangabad,
Abstract- Technology scaling continues to advance in accordance with Moore's Law. This improves chip performance, but at the price of increased power consumption, so resources must be spent on cooling, packaging and other measures to limit the after-effects. The most direct way to eliminate this additional cost is to reduce the power consumption of the design itself, which also protects the chip from permanent failure caused by excess heat. This paper presents a systematic approach to flip-flop design that uses Integrated Clock Gating (ICG) to minimize dynamic power. The results obtained give insight into the effectiveness of clock gating.
Keywords- Dynamic power dissipation, clock gating, Synopsys tools, SAED90nm library, integrated clock gating.
In recent years, power-sensitive design has grown significantly. This is mainly due to the rapid increase in battery-operated electronic devices such as notebooks, laptops, cellular phones and other communication devices. Semiconductor devices are scaled to new technology nodes that provide higher performance and higher integration density. Because transistor density and operating frequency both increase, power consumption rises with each technology generation. The supply voltage is scaled down to keep power consumption within limits. However, supply voltage scaling alone is not sufficient to keep the power density within the limits that power-sensitive devices require. Circuit-level and system-level techniques are therefore needed alongside supply voltage scaling to achieve low-power designs.
In deeply scaled designs, a major part of the total power consumption of high-performance digital circuits is due to leakage currents. High-performance systems are subject to a predefined power budget, which calls for efficient power reduction in digital circuits, including during standby operation. Leakage reduction techniques are therefore needed to stay within this budget while maintaining high performance. Furthermore, the different components of leakage power are becoming more important with technology scaling, so low-power circuit techniques are used to reduce total power in high-performance scaled circuits.
1.2. POWER DISSIPATION IN VLSI CIRCUITS
There are two types of power dissipation which are:
• Dynamic power dissipation
• Static power dissipation
1.2.1. DYNAMIC POWER
Dynamic power dissipation is further divided into two types:
1. Switching power, which is caused by the charging and discharging of the load capacitance.
2. Short circuit power, which occurs mainly because of the finite rise and fall times of the input waveforms.
The switching power can be expressed as
Dynamic power = α * CL * VDD^2 * f
Where α = switching activity,
f = operation frequency,
CL = load capacitance,
VDD = supply voltage.
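The switching power expression can be evaluated directly. The following is a minimal Python sketch of that formula; the function name and all numeric values are hypothetical and chosen only for illustration, they are not measurements from this work.

```python
def switching_power(alpha, c_load, vdd, freq):
    """Switching power in watts: alpha * CL * VDD^2 * f."""
    return alpha * c_load * vdd ** 2 * freq


# Hypothetical operating point (assumed values, not from the paper):
# 20% switching activity, 10 fF load, 1.2 V supply, 500 MHz clock.
p_nominal = switching_power(alpha=0.2, c_load=10e-15, vdd=1.2, freq=500e6)

# Halving the supply voltage cuts switching power to one quarter.
p_half_vdd = switching_power(alpha=0.2, c_load=10e-15, vdd=0.6, freq=500e6)

# Reducing the switching activity (for example by gating the clock)
# scales the power linearly.
p_low_activity = switching_power(alpha=0.05, c_load=10e-15, vdd=1.2, freq=500e6)

print(f"nominal:      {p_nominal * 1e6:.3f} uW")
print(f"half VDD:     {p_half_vdd * 1e6:.3f} uW")
print(f"low activity: {p_low_activity * 1e6:.3f} uW")
```

The quadratic dependence on VDD and the linear dependence on α are the two levers that the supply scaling and clock gating techniques discussed in this paper act on.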
The short circuit power is given by
Short circuit power = β * (VDD – Vth)^3 * t / (12 * T)
Where β = transistor coefficient,
t = rise/fall time (timing parameters),
T = 1/f = clock period.
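As a rough illustration, the short circuit power expression can be coded exactly as it is written in the text. The function name and the parameter values below are assumptions made for this sketch only.

```python
def short_circuit_power(beta, vdd, vth, t_edge, period):
    """Short circuit power in watts, using the expression from the text:
    beta * (VDD - Vth)^3 * t / (12 * T)."""
    return beta * (vdd - vth) ** 3 * t_edge / (12.0 * period)


# Hypothetical values (assumed, not from the paper):
# beta = 100 uA/V^2, VDD = 1.2 V, Vth = 0.3 V, clock period T = 2 ns.
p_fast_edge = short_circuit_power(100e-6, 1.2, 0.3, t_edge=50e-12, period=2e-9)
p_slow_edge = short_circuit_power(100e-6, 1.2, 0.3, t_edge=100e-12, period=2e-9)

# Doubling the input rise/fall time doubles the short circuit power.
print(p_slow_edge / p_fast_edge)
```

The sketch shows why sharp input edges matter: for a fixed clock period, the short circuit term grows in proportion to the rise/fall time t.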
1.2.2. Leakage power
Leakage in the system has three components:
1. Sub-threshold leakage (Isub) is the leakage current that flows from drain to source.
2. Direct tunneling gate leakage is due to electrons (or holes) tunneling from the bulk silicon to the gate through the gate oxide potential barrier.
3. Reverse-biased p-n junction leakage flows from the source and drain to the substrate.
Fig.1 - Leakage paths in a CMOS inverter
2 SYNCHRONIZATION MODULE
In IC design, variations in voltage, temperature and manufacturing process have an increasingly important effect on the performance of systems on silicon as SoC size increases and process technology advances. Components such as logic circuits and on-chip memories are all affected, and so are the synchronizers used to pass data from one core to another when each core has a different clock. Variation may affect synchronizers to a greater extent than other components because synchronizer performance depends on small-signal rather than large-signal behaviour. Synchronization is a key part of on-chip communication and therefore affects system performance, increasingly so as SoC size increases and device dimensions shrink.
In a globally asynchronous, locally synchronous (GALS) system, different cores operate at different frequencies to achieve low power and maximum performance. Synchronization is needed to pass data between the different clock domains. To understand this, consider a flip-flop. As shown in Fig.3, data from one clock region arrives as an asynchronous signal at the first flip-flop. If it changes close to the clock edge, the setup or hold condition may be violated, and the output of the first flip-flop may become metastable. In the metastable state the output sits at an indeterminate logic level, as shown in Fig.2, which may cause failures in the following circuit blocks, since they are designed only for definite logic levels. A metastable output eventually resolves to either logic 0 or logic 1, at a speed that depends on the circuit. If metastability is not resolved before the next rising edge of the read clock, the indeterminate output of the first flip-flop propagates to the next flip-flop, which can cause the system to fail.
Fig.2- Metastability in flip-flop
Synchronizers are used to retime data passing between different clock regions. They do not prevent metastability; instead they allow time for the metastability to resolve itself before the data is sampled by the following circuit. Fig.3 shows a two flip-flop synchronizer. If the data input changes close to the rising edge of the clock, metastability may occur in the first flip-flop, so a full clock cycle is allowed for it to resolve. The metastability must be resolved before the next clock edge; otherwise an indeterminate level is transferred to the subsequent circuit blocks, potentially resulting in system failure.
Fig.3- Two flip-flops synchronizer
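The retiming behaviour of the two flip-flop synchronizer can be sketched with a small cycle-level Python model. This is only a behavioural illustration under a strong assumption: signals are ideal 0/1 values, so metastability itself is not modelled. The class and method names are hypothetical.

```python
class TwoFlopSynchronizer:
    """Cycle-level model of two D flip-flops in series on the read clock.

    Assumption: ideal logic levels only; the metastable state is not modelled.
    """

    def __init__(self):
        self.ff1 = 0  # first flip-flop, samples the asynchronous input
        self.ff2 = 0  # second flip-flop, drives the receiving logic

    def clock_edge(self, async_in):
        # Both flip-flops update on the same rising edge:
        # ff2 captures the previous value of ff1, ff1 captures the input.
        self.ff2, self.ff1 = self.ff1, async_in
        return self.ff2


sync = TwoFlopSynchronizer()
inputs = [0, 1, 1, 1, 0, 0]
outputs = [sync.clock_edge(bit) for bit in inputs]
print(outputs)
```

In this model a value sampled by the first flip-flop on one edge reaches the output only on the following edge. That intervening clock period is the time the real circuit has for a metastable first stage to settle.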
Synchronization is usually restricted to control signals rather than data signals in order to reduce the number of synchronizers required. Fig.4 shows a simple example of synchronizers in a system. Here Core A has some data to send to Core B. First the data is put onto the bus, and the Req signal is sent to Core B through the on-chip network composed of synchronizers and routers. When Core B receives the Req signal, it samples the data on the bus and sends the Ack signal back to Core A. In this communication architecture, each core needs at least two synchronizers, one for the Req signal and one for the Ack signal.
Fig.4- Synchronizers in system
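The Req/Ack exchange described above can be sketched as a toy Python simulation. Several simplifications are assumed here and are not part of the original design: both cores are stepped on a common cycle counter, routers are omitted, and each synchronizer is an ideal two flip-flop chain. All names are hypothetical.

```python
class TwoFlop:
    """Ideal two flip-flop synchronizer (no metastability modelled)."""

    def __init__(self):
        self.ff1 = 0
        self.ff2 = 0

    def tick(self, d):
        self.ff2, self.ff1 = self.ff1, d
        return self.ff2


def transfer(word, max_cycles=20):
    """Send one word from Core A to Core B using a Req/Ack handshake.

    Assumption: both cores tick in lockstep, which a real GALS system
    does not guarantee. Returns (word received by B, cycles until A sees Ack).
    """
    req_sync, ack_sync = TwoFlop(), TwoFlop()
    bus, req, ack = word, 1, 0  # Core A drives the bus, then raises Req
    received = None
    for cycle in range(1, max_cycles + 1):
        req_at_b = req_sync.tick(req)  # Req as seen in Core B's clock domain
        ack_at_a = ack_sync.tick(ack)  # Ack as seen in Core A's clock domain
        if req_at_b and received is None:
            received = bus  # Core B samples the stable data on the bus
            ack = 1         # and raises Ack
        if ack_at_a:
            return received, cycle
    raise RuntimeError("handshake did not complete")


result = transfer(0xA5)
print(result)
```

Only the two control signals pass through synchronizers; the data bus is sampled while it is held stable, which is why synchronizing Req and Ack is sufficient in this scheme.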
2.1 SYNCHRONIZER ISSUES
Future SoCs are likely to contain many synchronizers on a single chip as the number of integrated IP cores increases. For example, a 64-core processor system needs at least 128 synchronizers, given that each core needs at least two synchronizers for its input and output. In future SoCs, on-chip communication, including synchronization, routing and buffering, is likely to affect system performance more than processing does. The performance of the system therefore depends on the performance of the synchronizers.
The simple synchronizer consists of two flip-flops. Metastability may occur at the first flip-flop, so a full clock cycle is allowed for it to resolve. The mean time between failures (MTBF) can be increased by increasing the clock period, which is the synchronization time. However, the resolution of metastability in a two flip-flop synchronizer is relatively slow, which makes it unsuitable for high-speed applications where clock frequencies are high. Many synchronizers with improved performance have been proposed in the past. However, the main problem is that synchronizer performance degrades as Vdd decreases and Vth increases, because it depends on the small-signal behaviour of the bistable element in the synchronizer. The situation is aggravated by lowering the temperature, which results in a higher threshold voltage. Consequently, synchronizer performance is sensitive to Vdd, Vth and temperature variations. With the wider use of power saving techniques and advances in process technology, Vdd will become lower and lower, to the point where synchronizers may fail to work. In addition, increasing on-chip variability could significantly degrade synchronizer performance. It is therefore necessary to design synchronizers that work at low Vdd and are robust to Vdd, Vth and temperature variations.
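The dependence of MTBF on the synchronization time can be illustrated with the commonly used exponential model MTBF = exp(t_r / τ) / (T_w * f_clock * f_data). This model is not stated in the text above; it is the standard form found in the synchronizer literature, and the parameter values below (τ, T_w, data rate) are hypothetical. The sketch also assumes the whole clock period is available for resolution, ignoring flip-flop setup time.

```python
import math


def synchronizer_mtbf(t_resolve, tau, t_window, f_clock, f_data):
    """MTBF in seconds for the exponential metastability model.

    t_resolve: time allowed for metastability to resolve (s)
    tau:       resolution time constant of the bistable element (s)
    t_window:  metastability window of the flip-flop (s)
    f_clock:   sampling clock frequency (Hz)
    f_data:    rate of asynchronous data transitions (Hz)
    """
    return math.exp(t_resolve / tau) / (t_window * f_clock * f_data)


TAU = 20e-12       # assumed value
T_WINDOW = 30e-12  # assumed value
F_DATA = 100e6     # assumed value

# Assumption: one full clock period is available for resolution.
mtbf_1ghz = synchronizer_mtbf(1.0 / 1e9, TAU, T_WINDOW, 1e9, F_DATA)
mtbf_2ghz = synchronizer_mtbf(1.0 / 2e9, TAU, T_WINDOW, 2e9, F_DATA)

print(f"MTBF at 1 GHz: {mtbf_1ghz:.3e} s")
print(f"MTBF at 2 GHz: {mtbf_2ghz:.3e} s")
```

Because the resolution time sits in the exponent, halving the clock period reduces the MTBF by many orders of magnitude. A larger τ, which is what low Vdd or high Vth tends to produce, has the same effect.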
3 DYNAMIC POWER REDUCTION TECHNIQUES
Leakage power increases considerably with technology scaling. In addition, dynamic power and its variation increase, which can lead to system failure. To avoid this, several techniques can be used to reduce dynamic power:
• Transistor size and interconnect optimization,
• Gated clock,
• Multiple supply voltages
• Dynamic control of supply voltage.
By applying the above methods at IC design time, dynamic power can be minimized.
3.1. Clock gating
Clock gating is an important technique for avoiding excessive dynamic power dissipation in digital integrated circuits. With clock gating, the clock is supplied to a synchronous circuit, such as a microprocessor, only when that circuit is active. Unnecessary power dissipation is thereby avoided, which makes clock gating very effective for synchronous digital circuits. A clock gating circuit added to the synchronous circuit determines which portion is clock gated in a given time interval, as shown in Fig.5.
Clock gating switches the gated circuit between ON and OFF states. The control circuit itself consumes power: if the clock-gating control circuit is as large as the synchronous circuit it gates, the design with clock gating can dissipate more power than the design without it.
Fig.5- Basic implementation of clock gating
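The trade-off between the clock power saved and the power spent in the gating logic can be sketched with a simple estimate. This is a first-order model under stated assumptions: clock power is taken as proportional to the number of clock cycles delivered to the register bank, and the gating logic adds a fixed overhead expressed as a fraction of the ungated clock power. The function name and the 10% overhead figure are hypothetical.

```python
def relative_clock_power(enable_per_cycle, overhead):
    """Clock power of a gated register bank relative to the ungated bank.

    enable_per_cycle: list of 0/1 enable values, one per clock cycle
    overhead:         power of the gating logic as a fraction of the
                      ungated clock power (assumed constant)
    """
    delivered = sum(1 for en in enable_per_cycle if en)
    return delivered / len(enable_per_cycle) + overhead


# A register bank that is active in only 2 of 8 cycles benefits from gating.
mostly_idle = relative_clock_power([1, 0, 0, 0, 1, 0, 0, 0], overhead=0.1)

# A bank that is enabled every cycle only pays the overhead.
always_on = relative_clock_power([1] * 8, overhead=0.1)

print(mostly_idle, always_on)
```

A result below 1 means gating saves power; a result above 1 means the gating logic costs more than it saves, which is the situation described above for control circuits that are large relative to the logic they gate.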
3.2. Setup and hold time requirement
Fig.6- setup and hold time requirement with waveforms
3.3 CLOCK-GATING CELLS
The Power Compiler tool identifies where clock gating gives a power saving and inserts clock gating into the design. The inserted logic may create clock skew, which affects timing. To avoid this, predefined clock-gating cells provided by the library should be used. The SAED90nm library was used in this thesis work. Its integrated clock-gating cell combines the required sequential and combinational elements in a single library cell. Fig.7 shows an integrated clock-gating cell.
Fig.7- integrated clock gating cell
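A behavioural sketch of an integrated clock-gating cell is given below in Python. It assumes the widely used structure of a latch that is transparent while the clock is low, followed by an AND gate. The internal structure of the SAED90nm cell is not described in the text, so this structure and the class name are assumptions.

```python
class IntegratedClockGate:
    """Latch-plus-AND clock gate (assumed structure).

    The enable is captured by a latch that is transparent while the clock
    is low, so changes on enable during the high phase cannot glitch or
    truncate the gated clock pulse.
    """

    def __init__(self):
        self.latched_enable = 0

    def evaluate(self, clk, enable):
        if clk == 0:
            # Latch is transparent only during the low phase of the clock.
            self.latched_enable = enable
        return clk & self.latched_enable


icg = IntegratedClockGate()
# (clk, enable) samples; enable changes twice while clk is high.
stimulus = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0), (1, 1), (0, 1), (1, 1)]
gated = [icg.evaluate(clk, en) for clk, en in stimulus]
print(gated)
```

In the third sample the enable falls while the clock is high, yet the gated pulse stays high until the clock falls. In the sixth sample the enable rises while the clock is high, yet no partial pulse is produced. A plain AND gate without the latch would produce a truncated or glitchy pulse in both cases.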
3.4 Benefits of Clock Gating
Dynamic power is saved because gated registers see a lower effective clock rate, which reduces their internal power. Area can also be saved because the multiplexers otherwise used to hold register values are no longer needed. Clock gating is technology independent. Power Compiler supports clock gating in the following ways:
• Gated clock on unmapped registers (RTL based).
• Clock gating is applied only if the register bank meets a specified minimum bit-width constraint.
• Gated clock on previously mapped and on unmapped registers, so that clock gating can also be applied to mapped IP cores (gate level).
• Power-driven gated clock insertion.
4 SOFTWARE OVERVIEW
4.1 Synopsys EDA Tools
Synopsys provides electronic design automation tools that support the whole standard design flow. The following tools were used to obtain the results:
Design Compiler®: this tool is used for RTL synthesis and synthesis optimization.
Design Compiler has two sub-tools:
• Formality®: an equivalence-checking (EC) tool that checks whether two versions of a design are functionally equivalent.
• Power Compiler™: this tool provides the following features:
1. Power consumption optimization at the RTL and gate level
2. Concurrent optimization of area, power and timing
IC Compiler: this tool provides features such as
2. Low-power design, and
3. Design for manufacturability.
Synopsys VCS®: this tool has the following features:
1. Native test bench (NTB) support
2. Simulation engines
3. Constraint solver engines
4. Broad SystemVerilog support
5 SIMULATION RESULTS
1. Proposed D flip-flop (simulation using IC Compiler)
Fig.8- Proposed D flip-flop
2. Output waveform of proposed D flip-flop (simulation using IC Compiler)
Fig.9- Output waveform of proposed D flip-flop (simulation using IC Compiler)
3. Synchronization module with clock gating (simulation using IC Compiler)
Fig.10- Synchronizer with clock gating
4. Output waveform of synchronization module with clock gating (simulation using IC Compiler)
Fig.11- Output waveform of synchronizer
5.1 THEORETICAL RESULTS
Theoretical results taken from Synopsys Design Compiler are as follows:
• Report_area: 2ff_synchronizer (without clock gating)
Total area: 50.199565 µm^2
• Report_power_analysis, Design: 2ff_synchronizer (without clock gating)
Internal power: 656.6435 nW
Switching power: 35.3279 nW
Total dynamic power: 691.9715 nW
• Report_area: 2ff_synchronizer (with clock gating)
Total area: 82 µm^2
• Report_power_analysis, Design: 2ff_synchronizer (with clock gating)
Results for the single flip-flop and three flip-flop synchronizers are taken from references 2, 3 and 4.
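The reported figures can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the area and power values listed above; the variable names are hypothetical. It does not compare power with and without clock gating, because the gated power figures are not listed in this section.

```python
# Values reported for the two flip-flop synchronizer.
area_without_gating_um2 = 50.199565
area_with_gating_um2 = 82.0
internal_power_nw = 656.6435
switching_power_nw = 35.3279

# Internal plus switching power should match the reported total dynamic power.
total_dynamic_nw = internal_power_nw + switching_power_nw

# Area cost of adding the clock-gating logic, in percent.
area_overhead_pct = (
    (area_with_gating_um2 - area_without_gating_um2)
    / area_without_gating_um2 * 100.0
)

# Share of the dynamic power that is internal (cell) power.
internal_share = internal_power_nw / total_dynamic_nw

print(f"total dynamic power: {total_dynamic_nw:.4f} nW")
print(f"area overhead:       {area_overhead_pct:.1f} %")
print(f"internal share:      {internal_share:.3f}")
```

With these numbers the clock-gating logic increases the cell area by roughly 63%, and internal power accounts for most of the dynamic power of the ungated synchronizer.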
In this work, the goals initially proposed were accomplished. The clock gating technique, and the requirements for implementing it successfully in the selected high-speed digital interface, were studied and understood. The architecture and operating modes of the interface were studied as a way to predict the efficiency and clock gating coverage in every block and each mode. Automatic clock gating insertion using Design Compiler proved to be a well-defined and consistent process that can be applied to any design without special preparation. However, the results may not always be as expected, since the tool does not provide any means to qualify the clock gating implementation.
The author is thankful to the National Institute of Electronics and IT, Aurangabad, Maharashtra, India for providing the necessary facilities to carry out this work.
1. Metastability and Synchronizers, Tutorial by Ran Ginosar, VLSI Systems Research Center, Electrical Engineering and Computer Science Departments, Technion-Israel Institute of Technology, Haifa 32000, Israel, © IEEE 2011
2. A clock-gated, double edge-triggered flip-flop implemented with transmission gates by Xiaowen Wang
3. Design and Measurement of Synchronizers by Jun Zhou, Technical Report Series NCL-EECE-MSD-TR-2008-138, November 2008
4. Clocking & Metastability by Peter Cheung, Department of Electrical & Electronic Engineering Imperial College London URL: www.ee.imperial.ac.uk/pcheung/
5. Device and circuit design challenges in the digital subthreshold region for ultralow-power applications by Ramesh Vaddi,S Dasgupta,R.P. Agarwal,Hindawi publishing corpo. NY, United States,ISSN:1065-514X
6. K.Deepa, K.S.Deepika, Dr.M.Kathirvelu, “Power Efficient Standby Switch Based Domino Logic Circuit”, IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 4, Issue 2, Ver. II (Mar-Apr. 2014), PP 17-22e-ISSN: 2319 –4200, p-ISSN No. : 2319–4197
7. S. M. Kang and Y. Leblebici, CMOS Digital Integrated Circuits Analysis and Design, 3rd ed., Mc Graw Hill, 2003, pp. 482-504.
8. M. Pedram, “Power Minimization in IC Design: Principles and Applications”, ACM Transaction on Design Automation of Electronic Systems, 1996, pp. 3-56.
9. V. Venkatachalam, M. Franz, “Power reduction techniques for microprocessor systems”, ACM Computing Surveys, vol. 37, Issue 3, pp. 195-237, 2005.
10. Pooja Singh, Akshay Sachdeva, Archana Kumari, Clock Gating for Dynamic Power Reduction in Synchronous Circuits, International Journal of Scientific Research Engineering & Technology (IJSRET), ISSN: 2278-0882, IEERET-2014 Conference Proceeding, 3-4 November, 2014
11. Chandrakasan and R. Brodersen, Low-Power CMOS Design, IEEE Press, 1998, pp. 233-238.
12. N. Nedović and V. G. Oklobdzija “Dual-edge triggered storage elements and clocking strategy for low-power systems,” IEEE Transaction on VLSI Systems, vol. 13, pp.577-590, May 2005.
13. V. Zyuban, D. Brooks, V. Srinivasan, M. Gschwind, P. Bose, P. N. Strenski, and P. G. Emma, “Integrated analysis of power and performance for pipelined microprocessors,” IEEE Transactions on Computers, vol. 53, pp. 1004-1016, 2004.
14. V. Tiwari, D. Singh, S. Rajgopal, G. Mehta, R. Patel, and F. Baez, “Reducing power in high-performance microprocessors,” in 35th Annual Design Automation Conference, San Francisco, CA, pp. 732-737, 1998.
15. Jairam S, Madhusudan Rao, Jithendra Srinivas, Parimala Vishwanath, Udayakumar H, Jagdish Rao, Clock Gating for Power Optimization in ASIC Design Cycle: Theory & Practice ,SoC Center of Excellence, Texas Instruments, India.
16. S. Seyedi, S. H. Rasouli, A. Amriabadi, and A. Afzali-Kusha, “Clock gated static pulsed flip-flop (CGSPFF) in sub 100 nm technology,” VLSI Technologies and Architectures, 2006.
17. Yen-Kuang Chen and S. Y. Kung, Trend and Challenge on System-on-a-Chip Designs, Intel Corporation, DOI: 10.1007/s11265-007-0129-7, www.springerlink.com
18. C. C. Yu, “Low-power double edge-triggered flip-flop circuit design,” 3rd Innovative Computing Information and Control International Conference, 2008.
19. Synopsys. Low-Power Flow User Guide. F-2011.09 edition, September 2011.
20. Sung Mo Kang and Yusuf Leblebici. CMOS Digital Integrated Circuits-Analysis and Design, Open University Press, Third edition, 2003.
21. Anantha P. Chandrakasan, Samuel Sheng, and Robert W. Brodersen. Low-power CMOS digital design. IEEE Journal of Solid-State Circuits, vol. 27, no. 4, pages 472–484, April 1992.
22. Xiaodong Zhang. High performance low leakage design using power compiler and multi-vt libraries. Technical report, Synopsys User Group, 2003.
23. José Carlos Alves. Low power cmos design. Technical report, Faculdade de Engenharia da Universidade do Porto, 2012.
24. Clock domain crossing tutorial, https://filebox.ece.vt.edu/~athanas/4514/ledadoc/html/pol_cdc.html
25. Synopsys. Design Compiler User Guide. D-2010.03-sp2 edition, June 2010.
26. Clock and reset tutorial ,www.springer.com
27. Dushyant Kumar Sharma. Effects of different clock gating techniques on design. International Journal of Scientific & Engineering Research Volume 3, May 2012.
28. Design Guidelines and Timing Closure Techniques for Hardcopy ASICs: © July 2010 Altera Corporation AN 545
29. Synopsys. Power Compiler User Guide. E-2010.12-sp2 edition, March 2011.
30. Synopsys. Formality User Guide. Z-2007.06 edition, June 2007.
31. Jun Zhou, Design and Measurement of Synchronizers, Technical Report Series, NCL-EECE-MSD-TR-2008-138, November 2008 ,Newcastle university
32. Narayana Koduri and Kiran Vittal. Power analysis of clock gating at rtl. Atrenta Inc. (San Jose, Calif.), available in http://www.design-reuse.com/articles/23701/power-analysis-clock-gating-rtl.html.
33. Uri Frank, Tsachy Kapshitz, Ran Ginosar, A predictive synchronizer for periodic clock domains, published online: 3 May 2006 Springer Science, Business Media, LLC 2006
34. Padmini G. Kaushik, Sanjay M. Gulhane, Athar Ravish Khan, Dynamic Power Reduction of Digital Circuits by Clock Gating, International Journal of Advancements in Technology, http://ijict.org/, ISSN 0976-4860, Vol. 4, No. 1 (March 2013)
35. Dr. Neelam R. Prakash, Akash, Clock Gating for Dynamic Power Reduction in Synchronous Circuits, International Journal of Engineering Trends and Technology (IJETT), Volume 4, Issue 5, May 2013, ISSN: 223 5381, http://www.ijettjournal.org