CHAPTER 1
At present, low power design has become an urgent and challenging issue in the design of high-performance very large scale integration (VLSI) circuits. Many techniques have therefore been developed to decrease the power consumption of new VLSI designs. However, most of these approaches reduce power consumption during functional operation, while test-mode operation has not been a major emphasis of research.
As the complexity of VLSI circuits constantly increases, there is a need for built-in self-test (BIST). Built-in self-test enables the chip to test itself and to evaluate the circuit's response. Many BIST design methods have been proposed. Most state-of-the-art methods use some kind of pseudorandom pattern generator (PRPG) to produce vectors to test the circuit. These vectors are applied to the circuit either as they are, or after modification by additional circuitry in order to obtain better fault coverage.
Patterns generated by simple LFSRs or cellular automata (CA) often do not provide satisfactory fault coverage, so these patterns have to be modified. One of the best-known approaches is weighted random pattern testing. Here the LFSR code words are modified by weighting logic to produce a test with given probabilities of occurrence of 0s and 1s at the particular circuit under test (CUT) inputs.
We propose a low switching activity test scheme for scan-based BIST. The scheme inserts a multiplexer at each scan input. These multiplexers select one of two data sources: the previous value in the scan chain, or pseudorandom test data generated by the PRPG. In this manner, the scheme applies low-toggle test stimulus bits to adjacent scan cells with a specified probability, resulting in shift power reduction. To improve the tradeoff between the decrease in test effectiveness and the reduction in test power, we allow a fraction of the scan chains to receive fully pseudorandom test data.
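As a behavioral sketch of this idea (the function names and the 75% hold probability are our own assumptions, not values from the scheme), the multiplexer can be modeled as a choice between repeating the previous scan bit and taking a fresh pseudorandom bit:

```python
import random

def scan_stimulus(length, hold_prob, rng):
    """Generate a shift-in sequence: with probability hold_prob the MUX
    selects the previous scan value (no new toggle), otherwise it selects
    a fresh pseudorandom bit from the PRPG."""
    bits = [rng.randint(0, 1)]
    for _ in range(length - 1):
        if rng.random() < hold_prob:
            bits.append(bits[-1])           # MUX: repeat previous scan value
        else:
            bits.append(rng.randint(0, 1))  # MUX: pseudorandom test data
    return bits

def toggles(bits):
    """Count transitions between adjacent scan cells."""
    return sum(a != b for a, b in zip(bits, bits[1:]))

rng = random.Random(7)
low_power = scan_stimulus(10000, hold_prob=0.75, rng=rng)
baseline  = scan_stimulus(10000, hold_prob=0.0,  rng=rng)
# Holding reduces shift toggles compared with fully pseudorandom data.
print(toggles(low_power), toggles(baseline))
```

Raising `hold_prob` trades test effectiveness for lower shift power, which is why the scheme leaves some chains fully pseudorandom.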
Numerous schemes for power reduction during scan testing have been devised. Among them, there are solutions specifically proposed for BIST to keep the average and peak power below a given threshold. For example, the test power can be reduced by preventing transitions at memory elements from propagating to combinational logic during scan shift.
The main challenges in VLSI are performance, cost, testing, area, reliability and power. The demand for portable computing devices and communication systems is increasing rapidly. These applications require low power dissipation in VLSI circuits. The power dissipation during test mode can be far higher than in normal mode. Hence, it is important to optimize power during testing. Power optimization is one of the main challenges. There are various factors that affect the cost of a chip, like packaging, application, testing, etc. In VLSI, as a rule of thumb, a significant share of the total integrated circuit cost is due to testing. During testing, two key challenges are:
• Cost of testing, which cannot be scaled.
• Engineering effort for generating test vectors.
Low-Power Linear feedback shift register (LP-LFSR):
There are two main sources of power dissipation in digital circuits: static and dynamic power dissipation. Static power dissipation is mainly due to leakage current, and its contribution to total power dissipation is very small. Dynamic power dissipation is due to switching, i.e., the power consumed by short-circuit current flow and the charging of load capacitances, and is given by the equation:
P = 0.5 × VDD² × E(sw) × CL × fclk
where VDD is the supply voltage, E(sw) is the average number of output transitions per 1/fclk, fclk is the clock frequency, and CL is the physical capacitance at the output of the gate. Dynamic power dissipation dominates the total power dissipation. From the above equation, the dynamic power depends on the supply voltage, the clock frequency, the switching activity, and the load capacitance.
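The equation can be evaluated directly; the numeric values below are arbitrary examples, not taken from this design:

```python
def dynamic_power(vdd, e_sw, c_load, f_clk):
    """Dynamic power: P = 0.5 * VDD^2 * E(sw) * C_L * f_clk (watts)."""
    return 0.5 * vdd ** 2 * e_sw * c_load * f_clk

# Example: 1.8 V supply, 0.2 average transitions per cycle,
# 10 fF load capacitance, 100 MHz clock.
p = dynamic_power(1.8, 0.2, 10e-15, 100e6)
print(f"{p * 1e6:.3f} uW")  # prints "0.324 uW"
```

Halving the switching activity E(sw) halves the dynamic power, which is the lever that low-power test pattern generators pull.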
Title 1: Reducing Test Power and Improving Test Effectiveness for Logic BIST
Excessive power dissipation is one of the major issues in the testing of VLSI systems. Many techniques have been proposed for scan test, but there are not many for logic BIST because of its unmanageable randomness. This paper presents a novel low switching activity BIST scheme that reduces the toggle frequency at the majority of scan chain inputs while allowing a small portion of the scan chains to receive pseudorandom test data.
Title 2: Predictable, Accurate, and Flexible Fashion by Adapting the PRESTO-Based LBIST Infrastructure
This paper proposes a low-power (LP) programmable generator capable of producing pseudorandom test patterns with desired toggling levels and an enhanced fault coverage gradient compared with the best-to-date built-in self-test (BIST)-based pseudorandom test pattern generators. It comprises a linear finite state machine (a linear feedback shift register or a ring generator) driving an appropriate phase shifter, and it comes with a number of features allowing this device to produce binary sequences with preselected toggling (PRESTO) activity. We introduce a method to automatically select several controls of the generator, offering easy and precise tuning.
Title 3: Design and Verification of Low Power Programmable PRPG Using Universal Verification Methodology.
In the Built-In Logic Block Observation (BILBO) approach, the required test time and power consumption are not desirable. So we introduce a self-test using MISR and parallel SRSG (STUMPS) architecture.
2.1 BIST ARCHITECTURE:
It is very important to choose the proper LFSR architecture for achieving the appropriate fault coverage. Every architecture consumes different power, even for the same polynomial. Another problem associated with choosing an LFSR is the LFSR design issue, which includes LFSR partitioning; here, LFSRs are differentiated on the basis of hardware cost and testing-time cost.
Circuit under Test (CUT): the portion of the circuit tested in BIST mode. It can be sequential, combinational, or a memory. Its primary inputs (PI) and primary outputs (PO) delimit it.
Test pattern generator (TPG): it generates the test patterns for the CUT. It is a dedicated circuit or a microprocessor. The patterns may be generated pseudorandomly or deterministically.
Multiple input signature register (MISR): it is designed for signature analysis, a technique for output data compaction. MISRs are subject to a small probability of aliasing. They are frequently implemented in BIST designs, in which the output responses are compacted by the MISR.
Test Response Analysis (TRA): It analyses the value sequence on PO and compares it with the expected output.
BIST controller Unit (BCU): It controls the test execution; it manages the TPG, TRA and reconfigures the CUT and the multiplexer. It is activated by the Normal/Test signal.
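As a hypothetical illustration of the MISR described above (the 4-bit width, tap positions and response values are invented examples, not taken from this design), response compaction can be sketched in Python:

```python
def misr_step(state, inputs, taps, n):
    """One clock of an n-bit MISR: compute the LFSR feedback from the
    tapped bits, shift left, then XOR the parallel response bits in."""
    fb = 0
    for t in taps:
        fb ^= (state >> t) & 1
    state = ((state << 1) | fb) & ((1 << n) - 1)
    return state ^ inputs

def signature(responses, taps=(3, 2), n=4):
    """Compact a stream of n-bit output responses into one signature."""
    state = 0
    for r in responses:
        state = misr_step(state, r, taps, n)
    return state

good = signature([0b0001, 0b0010, 0b0011])
bad  = signature([0b0001, 0b0010, 0b0100])  # one faulty response word
print(good, bad)
```

Different response streams usually yield different signatures, but because many streams map onto few signatures, a faulty stream can occasionally alias to the fault-free signature.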
2.2 ALGORITHM FOR LOW POWER LFSR
An LFSR is a shift register that consists of a series of flip-flops and is used to generate test patterns for BIST. The initial value of the LFSR is called the seed value; it plays a significant role in power consumption.
The total number of signal transitions occurring among these five vectors is equal to the number of signal transitions between T1 and T2. The figure below shows the proposed algorithm for the low-power LFSR.
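The transition-counting idea can be sketched in Python. The vectors T1, T2 and the three intermediate vectors below are hypothetical examples, chosen only so that each step toggles a single bit; the total across the five vectors then equals the direct T1-to-T2 transition count:

```python
def transitions(v1, v2, width):
    """Number of bit positions that toggle between two test vectors."""
    return bin((v1 ^ v2) & ((1 << width) - 1)).count("1")

def total_transitions(vectors, width):
    """Sum of transitions between consecutive vectors in a sequence."""
    return sum(transitions(a, b, width) for a, b in zip(vectors, vectors[1:]))

T1, T2 = 0b1010, 0b0101
# Hypothetical single-bit-distance intermediate vectors between T1 and T2.
path = [0b1010, 0b1011, 0b1001, 0b1101, 0b0101]
print(transitions(T1, T2, 4), total_transitions(path, 4))  # prints "4 4"
```

Spreading the four toggles over five vectors lowers the peak switching per shift cycle without increasing the total switching activity.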
2.3 APPLICATIONS OF LFSR:
In computing, a linear-feedback shift register (LFSR) is a shift register whose input bit is a linear function of its previous state. The most commonly used linear function of single bits is exclusive-or (XOR).
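As an illustrative sketch (the 4-bit width and tap positions are example choices, not the polynomial used in this design), a Fibonacci-style LFSR can be modeled in Python:

```python
def lfsr_sequence(seed, taps, n, count):
    """Fibonacci LFSR: the new input bit is the XOR of the tapped bits of
    the previous state; the register then shifts left by one."""
    state = seed
    out = []
    for _ in range(count):
        out.append(state)
        fb = 0
        for t in taps:
            fb ^= (state >> t) & 1
        state = ((state << 1) | fb) & ((1 << n) - 1)
    return out

# With taps (3, 2) this 4-bit LFSR is maximal-length: it cycles through
# all 15 nonzero states before repeating.
states = lfsr_sequence(0b0001, (3, 2), 4, 15)
print(len(set(states)))  # prints "15"
```

The all-zero state is excluded because an XOR feedback of zeros stays at zero, which is why the seed must be nonzero.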
Uses as counters:
The repeating sequence of states of an LFSR allows it to be used as a clock divider, or as a counter when a non-binary sequence is acceptable, as is often the case where computer index or framing locations need to be machine-readable.
Uses in Cryptography:
LFSRs have long been used as pseudo-random number generators for use in stream ciphers (especially in military cryptography), due to the ease of construction from simple electromechanical or electronic circuits, long periods, and very uniformly distributed output streams.
Uses in circuit testing:
LFSRs are used in circuit testing, for test-pattern generation (for exhaustive testing, pseudo-random testing or pseudo-exhaustive testing) and for signature analysis.
Integrated circuit (IC) technology is the enabling technology for a whole host of innovative devices and systems that have changed the way we live. Jack Kilby and Robert Noyce received the 2000 Nobel Prize in Physics for their invention of the integrated circuit; without the integrated circuit, neither transistors nor computers would be as important as they are today. VLSI systems are much smaller and consume less power than the discrete components used to build electronic systems before the 1960s.
3.2 APPLICATIONS OF VLSI:
Electronic systems now perform a wide variety of tasks in daily life. Electronic systems in some cases have replaced mechanisms that operated mechanically, hydraulically, or by other means; electronics are usually smaller, more flexible, and easier to service.
Personal entertainment systems such as portable MP3 players and DVD players perform sophisticated algorithms with remarkably little energy.
Electronic systems in cars operate stereo systems and displays; they also control fuel injection systems, adjust suspensions to varying terrain, and perform the control functions required for anti-lock braking (ABS) systems.
Digital electronics compress and decompress video, even at high definition data rates, on-the-fly in consumer electronics.
Low-cost terminals for Web browsing still require sophisticated electronics, despite their dedicated function.
Personal computers and workstations provide word-processing, financial analysis, and games.
3.3 ADVANTAGES OF VLSI:
• Size: Integrated circuits are much smaller—both transistors and wires are shrunk to micrometer sizes, compared to the millimeter or centimeter scales of discrete components.
• Speed: Signals can be switched between logic 0 and logic 1 much quicker within a chip than they can between chips.
• Power consumption: Logic operations within a chip also take much less power.
3.4 VLSI AND SYSTEMS:
These advantages of integrated circuits translate into advantages at the system level:
• Smaller physical size: Smallness is often an advantage in itself; consider portable televisions or handheld cellular telephones.
• Lower power consumption: Replacing a handful of standard parts with a single chip reduces total power consumption.
• Reduced cost: Reducing the number of components, the power supply requirements, cabinet costs, and so on, will inevitably reduce system cost.
3.5 INTEGRATED CIRCUIT MANUFACTURING:
Integrated circuit technology is based on our ability to manufacture huge numbers of very small devices—today, more transistors are manufactured in California each year than raindrops fall on the state. In this section, we briefly survey VLSI manufacturing.
Most manufacturing processes are fairly tightly coupled to the item they are manufacturing. An assembly line built to produce Buicks, for example, would have to undergo moderate reorganization to build Chevys—tools like sheet metal molds would have to be replaced, and even some machines would have to be modified. And either assembly line would be far removed from what is required to produce electric drills.
3.6 MASK-DRIVEN MANUFACTURING:
Integrated circuit manufacturing technology, on the other hand, is remarkably versatile. While there are several manufacturing processes for different circuit types—CMOS, bipolar, etc.—a manufacturing line can make any circuit of that type simply by changing a few basic tools called masks.
3.7 CIRCUITS AND LAYOUTS
We could build a breadboard circuit out of standard parts. To build it on an IC fabrication line, we must go one step further and design the layout, or patterns on the masks. The rectangular shapes in the layout (shown here as a sketch called a stick diagram) form transistors and wires which conform to the circuit in the schematic.
3.8 MANUFACTURING DEFECTS
Because no manufacturing process is perfect, some of the chips on a wafer may not work. Since at least one defect is almost certain to occur on each wafer, wafers are cut into individual chips, which are then tested so that only the working chips are kept.
• Moore’s Law: In the 1960s Gordon Moore predicted that the number of transistors that could be manufactured on a chip would grow exponentially. His prediction, now known as Moore’s Law, was remarkably prescient.
• Terminology: The most basic parameter associated with a manufacturing process is the minimum channel length of a transistor. (In this book, for example, we will use as an example a technology that can manufacture 180 nm transistors.)
3.10 COST OF MANUFACTURING:
IC manufacturing plants are extremely expensive. A single plant costs as much as $4 billion. Given that a new, state-of-the-art manufacturing process is developed every three years, that is a sizeable investment.
3.11 COST OF DESIGN:
• One of the less fortunate consequences of Moore’s Law is that the time and money required to design a chip goes up steadily. The cost of designing a chip comes from several factors:
• A large ASIC, which contains millions of transistors but is not fabricated on a state-of-the-art process, can easily cost $20 million US and as much as $100 million. Designing a large microprocessor costs hundreds of millions of dollars.
Design cost and IP:
We can spread these design costs over more chips if we can reuse all or part of the design in other chips. The high cost of design is the primary motivation for the rise of IP-based design, which creates modules that can be reused in many different designs.
3.12 TYPES OF CHIPS:
The preponderance of standard parts pushed the problems of building customized systems back to the board-level designers who used the standard parts.
More specialized standard parts:
In the 1960s, standard parts were logic gates; in the 1970s they were LSI components. Today, standard parts include fairly specialized components: communication network interfaces, graphics accelerators, floating point processors. All these parts are more specialized than microprocessors but are used in enough volume that designing special-purpose chips is worth the effort.
• Application-specific integrated circuits (ASICs): Rather than build a system out of standard parts, designers can now create a single chip for their particular application.
3.13 CMOS TECHNOLOGY:
CMOS is the dominant integrated circuit technology. In this section we will introduce some basic concepts of CMOS to understand why it is so widespread and some of the challenges introduced by the inherent characteristics of CMOS.
3.14 POWER CONSUMPTION:
3.14.1 Power Consumption Constraints:
The huge chips that can be fabricated today are possible only because of the relatively tiny power consumption of CMOS circuits. Power consumption is critical at the chip level because much of the power is dissipated as heat, and chips have limited heat dissipation capacity.
3.14.2 Design and Testability:
Chip designs are simulated to ensure that the chip’s circuits compute the proper functions for a sequence of inputs chosen to exercise the chip. But each chip that comes off the manufacturing line must also undergo a manufacturing test.
3.14.3 Testability as a design process:
Unfortunately, not all chip designs are equally testable. Some faults may require long input sequences to expose; other faults may not be testable at all because the malfunctions they cause are not covered by the fault model.
3.14.4 Reliability is a lifetime problem:
Earlier generations of VLSI technology were robust enough that testing chips at manufacturing time was sufficient to identify working parts: a chip either worked or it didn’t. In today’s nanometer-scale technologies, the problem of determining whether a chip works is more complex.
3.14.5 Design for manufacturability:
A number of techniques, referred to as design-for-manufacturability or design-for-yield, are in use today to improve the reliability of chips that come off the manufacturing line.
Some of the convenient levels of abstraction that served us well in earlier technologies are no longer entirely appropriate in nanometer technologies. We need to check more thoroughly and be willing to solve reliability problems by modifying design decisions made earlier.
3.14.6 Integrated circuit design techniques:
To make use of the flood of transistors given to us by Moore’s Law, we must design large, complex chips quickly. The obstacle to making large chips work correctly is complexity—many interesting ideas for chips have died in the swamp of details that must be made correct before the chip actually works. Integrated circuit design is hard because designers must juggle several different problems:
Much higher flexibility in forming low-toggling test patterns can be achieved by deploying a scheme presented in Fig. 2.
Two additional parameters, kept in 4-bit Hold and Toggle registers, determine how long the entire generator remains in the hold mode or the toggle mode, respectively. The generator remains in a given mode until another 1 occurs on the weighted logic output. The random occurrence of this event is related to the content of the Hold register, which determines when to terminate the hold mode.
When using the PRESTO generator with an existing DFT flow, all LP registers are loaded either once per test session or once per test pattern. The registers loaded only once act as test data registers or are parts of an IJTAG network, and are initialized by the test setup procedure.
Automatic Selection of Controls:
As shown in the previous sections, performance of the PRESTO generator depends primarily on the following three factors (note that in the BIST mode they are delivered only once, at the very beginning of the entire test session):
1) For each switching code k, k = 1, . . ., 15, determine the corresponding probability pk of injecting a 1 into the shift register. These values are as follows:
Table 1: Probability Values (the probability pk, for each switching code k, of injecting a 1 into the shift register)
2) As can be seen in Fig. 2, the values pk obtained in step 1 also determine the probability of asserting the T flip-flop input for each hold (toggle) code k, and thus the corresponding expected duration hk (tk) of the hold (toggle) duty cycle. Clearly, hk = tk = 1/pk.
3) Given the size n of PRPG, determine, for each switching code k, the average number nk of 1s occurring in the control register. As can be easily verified, nk = pk × n.
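The two relations above (hk = tk = 1/pk and nk = pk × n) can be checked with a small Python sketch; the probability value and register size below are arbitrary examples:

```python
def presto_controls(pk, n):
    """For a switching-code probability pk of injecting a 1:
    - expected hold/toggle duty-cycle length: hk = tk = 1 / pk
    - expected number of 1s in an n-bit control register: nk = pk * n
    """
    hk = 1.0 / pk
    nk = pk * n
    return hk, nk

hk, nk = presto_controls(0.25, 32)
print(hk, nk)  # prints "4.0 8.0"
```

A smaller pk thus lengthens the hold phases (lower toggling) while thinning the 1s in the control register, which is the tuning knob the automatic selection exploits.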
5.1 FIELD-PROGRAMMABLE GATE ARRAYS (FPGA):
A field-programmable gate array (FPGA) is a block of programmable logic that can implement multi-level logic functions. FPGAs are most commonly used as separate commodity chips that can be programmed to implement large functions.
5.1.1 Lookup tables:
The basic method used to build a combinational logic block (CLB), also called a logic element, in an SRAM-based FPGA is the lookup table (LUT). As shown in the figure, the lookup table is an SRAM that is used to implement a truth table.
Each address in the SRAM represents a combination of inputs to the logic element. The value stored at that address represents the value of the function for that input combination. An n-input function therefore requires an SRAM with 2^n locations.
5.1.2 Programming a lookup table:
Unlike a typical logic gate, the function represented by the logic element can be changed by changing the values of the bits stored in the SRAM. As a result, the n-input logic element can represent 2^(2^n) functions (though some of these functions are permutations of each other).
A typical logic element has four inputs. The delay through the lookup table is independent of the bits stored in the SRAM, so the delay through the logic element is the same for all functions. This means that, for example, a lookup table-based logic element will exhibit the same delay for a 4-input XOR and a 4-input NAND.
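A LUT-based logic element can be sketched in Python; the 4-input width matches the text, while the class and variable names are our own:

```python
class LUT:
    """n-input lookup table: a 2**n-entry truth table stored like SRAM;
    the input combination forms the address."""
    def __init__(self, n, truth_bits):
        assert len(truth_bits) == 2 ** n
        self.n = n
        self.table = list(truth_bits)

    def __call__(self, *inputs):
        addr = 0
        for bit in inputs:              # the inputs form the SRAM address
            addr = (addr << 1) | bit
        return self.table[addr]

# The same hardware implements any 4-input function by reprogramming bits:
nand4 = LUT(4, [1] * 15 + [0])
xor4  = LUT(4, [bin(i).count("1") % 2 for i in range(16)])
print(nand4(1, 1, 1, 1), xor4(1, 0, 1, 0))  # prints "0 0"
```

Because evaluation is always one table read, both functions see the same delay, matching the observation that a 4-input XOR and a 4-input NAND are equally fast in a LUT.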
5.1.3 Complex logic element:
Many FPGAs also incorporate specialized adder logic in the logic element. The critical component of an adder is the carry chain, which can be implemented with dedicated fast-carry logic rather than through the general-purpose routing.
5.1.4 Programmable interconnection points:
A simple version of an interconnection point is often known as a connection box.
A programmable connection between two wires is made by a CMOS transistor (a pass transistor). The pass transistor’s gate is controlled by a static memory program bit (shown here as a D register). When the pass transistor’s gate is high, the transistor conducts and connects the two wires; when the gate is low, the transistor is off and the two wires are not connected.
OVERVIEW OF FPGA
The main tools required for this project can be classified into two broad categories.
6.2 HARDWARE REQUIREMENTS:
FPGA KIT (SPARTAN)
In the hardware part, a normal computer on which the Xilinx ISE 10.1 software can run is required, i.e., a minimum configuration of a Pentium III, 1 GB RAM, and a 20 GB hard disk.
6.3 SOFTWARE REQUIREMENTS:
It requires Xilinx ISE 10.1 version of software where Verilog source code can be used for design implementation.
6.3.1 Introduction to Modelsim:
In Modelsim, all designs are compiled into a library. You typically start a new simulation in Modelsim by creating a working library called "work". "Work" is the library name used by the compiler as the default destination for compiled design units.
Compiling Your Design: After creating the working library, you compile your design units into it. The ModelSim library format is compatible across all supported platforms. You can simulate your design on any platform without having to recompile your design.
6.3.2 Introduction to XILINX ISE:
This tool can be used to create, implement, simulate, and synthesize Verilog designs for implementation on FPGA chips.
ISE: Integrated Software Environment
Environment for the development and test of digital systems design targeted to FPGA or CPLD.
• Integrated collection of tools accessible through a GUI.
• Supports all the steps required to complete the design:
Translate, map, place and route
Bit stream generation
6.4 INTRODUCTION TO FPGA:
FPGA stands for Field Programmable Gate Array, which has an array of logic modules, I/O modules and routing tracks (programmable interconnect). An FPGA can be configured by the end user to implement specific circuitry. Speeds were once limited to around 100 MHz, but at present they reach the GHz range.
The main applications are DSP, FPGA-based computers, logic emulation, and ASIC and ASSP prototyping. FPGAs are mainly programmed using SRAM (Static Random Access Memory).
FPGA Design Flow:
FPGA contains a two dimensional arrays of logic blocks and interconnections between logic blocks. Both the logic blocks and interconnects are programmable. Logic blocks are programmed to implement a desired function and the interconnects are programmed using the switch boxes to connect the logic blocks.
FPGAs, an alternative to custom ICs, can be used to implement an entire System on Chip (SoC). The main advantage of an FPGA is the ability to reprogram it: the user can reprogram an FPGA to implement a design after the FPGA is manufactured. This is the origin of the name “field-programmable.”
For example, a 2-LUT can be used to implement any of 16 functions, such as AND, OR, A + NOT B, etc.
A   B   AND   OR
0   0    0    0
0   1    0    1
1   0    0    1
1   1    1    1
Table 2: Logical Operation
A wire segment can be described as two end points of an interconnect with no programmable switch between them. A sequence of one or more wire segments in an FPGA can be termed as a track.
6.6 FPGA DESIGN FLOW:
In this part of the tutorial, we give a short introduction to the FPGA design flow. A simplified version of the design flow is given in the following diagram.
6.7 DESIGN ENTRY:
There are different techniques for design entry: schematic-based, hardware description language (HDL), or a combination of both. Selection of a method depends on the design and the designer. If the designer wants to deal more with hardware, then schematic entry is the better choice. When the design is complex, or the designer thinks of the design in an algorithmic way, then HDL is the better choice. Language-based entry is faster but lags in performance and density.
6.8 SYNTHESIS:
Synthesis is the process that translates VHDL or Verilog code into a device netlist format, i.e., a complete circuit with logical elements (gates, flip-flops, etc.) for the design. If the design contains more than one sub-design (for example, to implement a processor we need a CPU as one design element and a RAM as another), then the synthesis process generates a netlist for each design element. The synthesis process also checks code syntax and analyzes the hierarchy of the design, ensuring that the design is optimized for the architecture the designer has selected.
In this work, the design is implemented using Verilog HDL and is synthesized for the Spartan-3E FPGA family through the XILINX ISE tool. This process includes the following:
• Translate
• Map
• Place and Route
6.9.1 Translate:
The Translate process combines all the input netlists and constraints into a logic design file. This information is saved as an NGD (Native Generic Database) file. This is done using the NGDBuild program.
6.9.2 Map:
The Map process divides the whole circuit into sub-blocks that can fit into the FPGA logic blocks. That is, the Map process fits the logic defined by the NGD file into the targeted FPGA elements (Combinational Logic Blocks (CLB) and Input/Output Blocks (IOB)) and generates an NCD (Native Circuit Description) file, which physically represents the design mapped to the components of the FPGA.
6.9.3 Place and Route:
PAR program is used for this process. The place and route process places the sub blocks from the map process into logic blocks according to the constraints and connects the logic blocks.
6.10 DEVICE PROGRAMMING:
Now the design must be loaded onto the FPGA. But first the design must be converted to a format that the FPGA can accept. The BITGEN program performs this conversion.
6.10.1 Design Verification:
Verification can be done at different stages of the process steps.
6.10.2 Behavioral Simulation (RTL Simulation):
This is the first of the simulation steps encountered throughout the design flow hierarchy. This simulation is performed before the synthesis process to verify the RTL (behavioral) code and to confirm that the design functions as intended.
6.10.3 Functional simulation (Post Translate Simulation):
7.1 SIMULATION RESULTS:
Figure 7.1: Simulation Wave Form of presto
7.2 SYNTHESIS RESULTS:
Figure 7.2: RTL Schematic of PRESTO Generator
Figure 7.3: Technology Schematic of PRESTO Generator
Table 3:Detailed project summary
Total number of paths / destination ports: 32 / 32
Offset: 4.040 ns (Levels of Logic = 1)
Source: m9/out_31 (FF)
Destination: out<31> (PAD)
Source Clock: clk rising
Data Path: m9/out_31 to out<31>
Table 4: Paths / Destination Ports
Cell: in->out   Fanout   Gate Delay (ns)   Net Delay (ns)   Logical Name
FDR:C->Q        1        0.514             0.357             m9/out_31
OBUF:I->O       -        3.169             -                 out_31_OBUF
Total: 4.040 ns (3.683 ns logic, 0.357 ns route; 91.2% logic, 8.8% route)
Table 5: Delay Calculation
Parameter          Full subtractor using CMOS technology   Full subtractor using GDI technology
Power (µW)         12.06                                   9.33
Delay (ps)         73                                      29
Transistor count   46                                      18
Table 6: Comparison of CMOS and GDI Technologies in terms of Power, Delay and Transistor count
The PRESTO LP generator can produce pseudorandom test patterns with scan shift-in switching activity precisely selected through automated programming. The same features can be used to control the generator so that the resultant test vectors either yield a desired fault coverage faster than conventional pseudorandom patterns while still reducing toggling rates to desired levels, or offer visibly higher coverage numbers when run for comparable test times.
Hence, we have designed a low-power programmable PRPG. To reduce power further, the LFSR in the PRPG can be replaced with a low-power LFSR (LP-LFSR).