Implementation of an 8x2 SRAM
Joshua Agarth, Pranay Kumar Reddy Budida, Swastikka Ramasubbu
UFID: 5215-5618, 3461-8172, 6855-4939
Department of Electrical and Computer Engineering
University of Florida, Gainesville, Florida, USA
Abstract— This project aims at designing and simulating an 8x2 Static RAM memory block in Cadence using 240 nanometer process technology. The project uses 6 transistor SRAM cells along with other peripheral components required to design a functional memory block. Several crucial design aspects like size, timing parameters, noise margins and cell stability were considered while designing the circuit. Design choices were made in a way that gave a near optimum and compact circuit.
I. Introduction
Memory is one of the most important components of any chip. According to the ITRS roadmap, the memory unit covers 90% of a chip's area. The earlier ROM and NVRWM units consumed relatively more chip area. SRAMs and DRAMs were then introduced to trade off area against either performance or electrical reliability. Static random access memory (SRAM) retains its stored information as long as power is supplied. This is in contrast to dynamic RAM (DRAM), where periodic refreshes are necessary, and to non-volatile memory such as flash, where no power needs to be supplied for data retention. While SRAM has the advantages of fast access and high power efficiency, it occupies more area per bit than DRAM. SRAMs are widely used today as both on-chip and off-chip memories.
Besides the SRAM cells, a memory unit contains a precharge circuit to charge the bitlines before a read, a sense amplifier to retrieve data from the cell, and control circuitry to enable writing to and reading from the cell. The overall circuit is divided into the following components: 6T SRAM cells, precharge circuitry, row and column decoders, and a sense amplifier.
Fig.1: Overall block diagram of 8x2 SRAM Array
The four modules were designed and implemented individually and later integrated to form the overall circuit. The capacitances added to the wordlines and bitlines for the memory were also calculated.
The constraints used are as follows:
Power Supply (Vdd): 3V
Temperature of Operation: 27°C
Input data/Clock Signal Slopes: 100ps
Clock Phases: Non-overlapping
Clock: CLK1: 40%, CLK2: 60%
Min. Channel Length: 0.24um
Strict design rules were followed and the overall circuit implemented has an area of 4,406.74um^2. All schematics were designed in Cadence Virtuoso and simulated in Cadence Spectre using a 0.25um deep-submicron technology.
II. Design Overview
A. SRAM Cell
Fig.2: A 6T SRAM cell
A conventional 6T cell contains two cross-coupled inverters (the output of each inverter is fed to the input of the other) and two access transistors, as shown in Fig. 2. The output node of each inverter stores charge during the write cycle, and the access transistors provide access to the SRAM cell during the read cycle. Access to the cell is enabled by the word line (WL), which drives the gates of the access transistors. The access transistors connect the cell to BL (bit line) and BL` (bit line bar) and transfer data for both read and write operations.
To ensure read stability, the voltage across N1 should stay below the threshold voltage of 0.4V when the charge on BL` discharges through N1 and N2. This is achieved by making N1 wider than N2. The size of N1 is determined by the Cell Ratio (CR), which is given by:
CR = (WN1/LN1)/(WN2/LN2) = WN1/WN2
(Provided the lengths are equal)
WN1, LN1: Width and Length of N1
WN2, LN2: Width and length of N2
Read stability for an SRAM is ensured by having the CR greater than 1.2. A CR value of 1.5 is chosen for the design of our 6T cell.
To ensure write stability, the voltage across transistor N4 should fall below the threshold voltage when the bitline BL is pulled low to write a ‘0’, so that the data in the 6T SRAM cell can be overwritten. This is achieved by considering the Pull-up Ratio (PR), given by:
PR = (WP2/LP2)/(WN4/LN4) = WP2/WN4
(Provided the lengths are equal)
WP2, LP2: Width and Length of P2
WN4, LN4: Width and Length of N4
The value of PR should be less than 1.8 to ensure write stability. A PR value of 1 is used in the 6T SRAM cell to minimize the cell size.
Based on the above analyses, the transistor sizes used in our design are as follows:
WN2 = WN4 = WP1 = WP2 = 360 nm (minimum width)
WN1 = WN3 = 1.5 WN2 = 540 nm
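The two sizing rules can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, widths are taken in micrometres on the assumption that the minimum width is 0.36um (consistent with the 360nm pre-charge devices), and all lengths are assumed equal to the 0.24um minimum.

```python
L_MIN = 0.24  # minimum channel length in um (from the design constraints)

def ratio(w_a, w_b, l_a=L_MIN, l_b=L_MIN):
    """(W_a / L_a) / (W_b / L_b); reduces to W_a / W_b when the lengths match."""
    return (w_a / l_a) / (w_b / l_b)

# Assumed widths in um: 0.36 is taken as the minimum width, 0.54 = 1.5 x 0.36.
W_N1 = 0.54   # N1 and N3
W_N2 = 0.36   # N2 and N4
W_P2 = 0.36   # P1 and P2

cell_ratio = ratio(W_N1, W_N2)      # CR
pull_up_ratio = ratio(W_P2, W_N2)   # PR (W_N4 equals W_N2 here)

read_ok = cell_ratio > 1.2       # read-stability bound quoted in the text
write_ok = pull_up_ratio < 1.8   # write-stability bound quoted in the text
print(f"CR = {cell_ratio:.2f}, read stable: {read_ok}")
print(f"PR = {pull_up_ratio:.2f}, write stable: {write_ok}")
```

Because both ratios are dimensionless, the same check works for any consistent width unit.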
B. Static Noise Margins
The static noise margin (SNM) is a key stability metric of the SRAM. If the noise voltage in a cell (for example, thermal noise) rises above the SNM, the state of the cell can change and the data will be lost. The voltage transfer functions for the 6T SRAM cell are shown in the figures below.
SNM for Read is the length of the diagonal of the minimum square. It is obtained as 0.85V.
Fig.3: Read Static Noise Margin
SNM for write is the length of the diagonal of the smallest square that can fit in the curve. It is obtained as 1.25V.
Fig.4: Write Static Noise Margin
The length of the diagonal of the largest square that can fit inside the butterfly curve is the Hold Noise Margin. It is obtained as 0.85V.
Fig.5: Hold Static Noise Margin
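The "largest square inside the butterfly curve" construction can be sketched numerically. The example below is a rough illustration under stated assumptions, not the procedure used for the figures: `inverter_vtc` is a hypothetical tanh-shaped transfer curve standing in for the simulated one, and the second curve of the butterfly is taken as its mirror image about the line y = x. Rotating the axes by 45 degrees turns the diagonal of an inscribed square into a vertical distance between the two curves, so the largest such distance gives the diagonal and the side follows by dividing by sqrt(2).

```python
import math
from bisect import bisect_left

VDD = 3.0  # supply voltage from the design constraints (V)

def inverter_vtc(vin, gain=8.0):
    """Hypothetical smooth inverter VTC (tanh model); NOT simulated data."""
    return 0.5 * VDD * (1.0 - math.tanh(gain * (vin - 0.5 * VDD) / VDD))

def interp(xs, ys, x):
    """Linear interpolation on ascending xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)
    t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])

def largest_square(vtc, n=2001):
    """Return (side, diagonal) of the largest square inside one butterfly lobe."""
    s = math.sqrt(2.0)
    xs = [VDD * k / (n - 1) for k in range(n)]
    # Rotate curve 1, y = vtc(x), by 45 degrees: u along (1,-1), v along (1,1).
    u = [(x - vtc(x)) / s for x in xs]   # increasing because vtc is falling
    v = [(x + vtc(x)) / s for x in xs]
    # Curve 2 is the mirror of curve 1 about y = x, so v2(u) = v1(-u).
    diagonal = max(abs(interp(u, v, ui) - interp(u, v, -ui)) for ui in u)
    return diagonal / s, diagonal

side, diagonal = largest_square(inverter_vtc)
print(f"side = {side:.3f} V, diagonal = {diagonal:.3f} V")
```

The numbers printed here depend entirely on the assumed transfer curve and are not expected to match the margins reported above; only the geometric construction carries over.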
C. Gate and Diffusion Capacitances
Cg = Cgn + Coverlap
   = W L Cox + 2 W Co
   = (0.54um)(0.24um)(6fF/um^2) + 2(0.54um)(0.31fF/um)
   = 0.77 fF
Cdiff = Cbottom + Csw + Co
   = Cj W Ls + Cjw (2 Ls + W) + W Co
   = (2fF/um^2)(0.54um)(0.6um) + (0.28fF/um)(2*0.6um + 0.54um) + (0.54um)(0.32fF/um)
Cbl = (NMOS count connected to BL) * Cdiff
   = 8 * Cdiff
Cwl = (NMOS count connected to WL) * Cg
   = 2 * 2 * Cg
   = 3.08 fF
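The capacitance expressions can be evaluated term by term with a short script. This is a sketch under assumptions: the variable names are hypothetical, the sidewall coefficient is treated as a per-length quantity (fF/um), and the two overlap values (0.31 and 0.32 fF/um) are kept exactly as they appear in the respective expressions. The channel term alone comes to about 0.78 fF, close to the 0.77 fF quoted for Cg, which suggests the quoted Cg and Cwl may omit the overlap term; the script prints both variants rather than choosing one.

```python
# Process parameters as written in the expressions above (0.25um technology).
COX = 6.0       # gate capacitance per area, fF/um^2
CO_GATE = 0.31  # overlap capacitance used in the Cg expression, fF/um
CJ = 2.0        # bottom junction capacitance, fF/um^2
CJSW = 0.28     # sidewall junction capacitance, fF/um (assumed per-length)
CO_DIFF = 0.32  # overlap value used in the Cdiff expression, fF/um

W, L, LS = 0.54, 0.24, 0.6  # width, channel length, diffusion length (um)

c_channel = W * L * COX
c_overlap = 2 * W * CO_GATE
c_diff = CJ * W * LS + CJSW * (2 * LS + W) + W * CO_DIFF

c_bl = 8 * c_diff                                    # 8 NMOS devices per bit line
c_wl_channel_only = 2 * 2 * c_channel                # 4 access gates per word line
c_wl_with_overlap = 2 * 2 * (c_channel + c_overlap)

for name, value in [
    ("Cg (channel term)", c_channel),
    ("Cg (channel + overlap)", c_channel + c_overlap),
    ("Cdiff", c_diff),
    ("Cbl", c_bl),
    ("Cwl (channel only)", c_wl_channel_only),
    ("Cwl (with overlap)", c_wl_with_overlap),
]:
    print(f"{name} = {value:.2f} fF")
```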
III. Pre-charge Circuit
The pre-charge circuit charges the bit lines to VDD before the SRAM cell is read. It is driven by clock 2 and built from 3 PMOS transistors: when clock 2 goes low, the transistors turn on and pull BL and BL` high, and when clock 2 goes high, they turn off and release the bit lines. Each PMOS has a width of 360nm.
Fig.6: Pre-Charge Circuit
IV. Decoders and WordLine Driver
The row decoder decodes the address lines to select which of the 8 rows of the SRAM array is read from or written to. Similarly, the column decoder selects which of the 2 columns is used to read or write the bit during a particular clock cycle. Each row decoder output is formed by a 3-input NAND gate followed by an inverter, which is equivalent to a 3-input AND gate. The decoder could alternatively be built from NOR gates with the same fan-in, but NOR gates have longer rise and fall times, so NAND gates were chosen in this project for speed. Using 3-input NAND gates also reduces the number of transistors and the cell delay. The block, a 3:8 decoder, has 8 outputs that go to the word line drivers.
The word line driver passes the signal from the decoder to the word line of the SRAM cells. Two-input AND gates are used as the word line drivers. One input of each AND gate comes from the decoder and the second input comes from CLK1. When the clock is high, the driver passes the decoder outputs to select the correct word line.
The column decoder uses an inverter. Column 1 is connected to the inverter input and column 2 is connected to the inverter output. When the input is low, column 2 is selected, and when the input is high, column 1 is selected.
Fig.7: Row Decoder block diagram
Fig. 8: Row decoder
Fig. 9: Column Decoder
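The decoder and word line logic described in this section can be modelled at gate level in a few lines. This is a behavioural sketch, not the transistor-level design: the function names are hypothetical, signals are plain 0/1 integers, and the row numbering (row 0 for address 000) is an assumption.

```python
def inv(a):
    """Inverter on a 0/1 signal."""
    return 0 if a else 1

def nand3(a, b, c):
    """3-input NAND on 0/1 signals."""
    return 0 if (a and b and c) else 1

def row_decoder(a2, a1, a0):
    """3:8 decoder: one 3-input NAND followed by an inverter per output."""
    outputs = []
    for row in range(8):
        # Use the true or complemented address bit according to the row index.
        bits = [a if (row >> k) & 1 else inv(a)
                for k, a in ((2, a2), (1, a1), (0, a0))]
        outputs.append(inv(nand3(*bits)))
    return outputs

def wordlines(a2, a1, a0, clk1):
    """Word line drivers: 2-input AND of each decoder output with CLK1."""
    return [d & clk1 for d in row_decoder(a2, a1, a0)]

def column_select(col_in):
    """Inverter-based column decoder: returns (column 1, column 2) selects."""
    return col_in, inv(col_in)

print(wordlines(1, 0, 1, clk1=1))  # only row 5 is asserted
print(wordlines(1, 0, 1, clk1=0))  # all word lines held low
print(column_select(0))            # input low selects column 2
```

Gating the decoder outputs with CLK1 keeps every word line low outside the active clock phase, which is the behaviour described for the word line drivers above.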
V. Sense Amplifier
The sense amplifier is part of the read circuitry and is used when data is read from the memory. It is typically a differential amplifier that amplifies the small difference between the bit lines. Its job is to sense the low-power signals on the bit lines, which represent the data bit (1 or 0) stored in a memory cell, and amplify the small voltage swing to recognizable logic levels so that the data can be interpreted properly by logic outside the memory. The design objectives for the sense amplifier are minimum power consumption, restricted layout area, minimum sense delay, the required amplification, high reliability, and tolerance. The sense amplifier we designed is a differential amplifier followed by three inverters in series. One of the inverters re-inverts the inverted output of the differential amplifier. The other two are used to increase the speed.
Fig.10: Sense Amplifier
VI. Integration and Performance
Initially, each of the individual blocks was designed and implemented. All of them passed the LVS and DRC checks and worked as expected. We then integrated them to form the overall memory block as shown below.
Fig. 11: The overall schematic of the 8x2 SRAM
Fig. 12: 8x2 SRAM Schematic with no errors in icfb window
Fig. 13: 8x2 SRAM Layout
Performance parameters of the SRAM are given below.
1. Area:
SRAM cell area = Width * Height = 22.11um^2
Overall SRAM area = Width * Height = 4,406.74um^2
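As a quick sanity check on the two area figures, the share of the total area taken by the storage cells can be estimated. This is a rough sketch that assumes the 16 cells simply tile at the reported cell area, with no shared or extra area between them; the variable names are hypothetical.

```python
CELL_AREA_UM2 = 22.11      # single 6T cell area reported above
TOTAL_AREA_UM2 = 4406.74   # overall SRAM area reported above
ROWS, COLS = 8, 2

cell_array_area = ROWS * COLS * CELL_AREA_UM2
array_efficiency = cell_array_area / TOTAL_AREA_UM2
print(f"cell array: {cell_array_area:.2f} um^2")
print(f"share of total area: {array_efficiency:.1%}")
```

Under that assumption the cells account for roughly 8% of the layout, with the remainder going to the peripheral circuits and routing.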
2. Read Access Time:
The 50% delay from the rising edge of clk1 to the output data transition from 0 to 1, measured for the SRAM cell in column 1 of row 1, is 413.8 ps. Similarly, the 50% delay from the rising edge of clk1 to the output data transition from 1 to 0 is 447.2 ps.
Fig. 14: Read Access Time 0 to 1
Fig. 15: Read Access Time 1 to 0
3. Write access time:
The 50% delay from the rising edge of clk1 to the final writing of the input into the memory from 0 to 1, for the SRAM cell in column 1 of row 1, is 542.4 ps. The 50% delay from the rising edge of clk1 to the writing of the input into the memory from 1 to 0 is 333.9 ps.
Fig. 16: Write Access Time 0 to 1
Fig. 17: Write Access Time 1 to 0
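The 50% delays quoted for the read and write access times are measured between the points where the clock and the observed node cross half the supply voltage. A minimal sketch of that measurement on sampled waveforms is given below; the function names and the sample data are made up for illustration and are not taken from the simulations.

```python
def crossing_time(times, values, level, rising=True):
    """Time of the first crossing of `level`, using linear interpolation."""
    samples = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        crosses_up = rising and v0 < level <= v1
        crosses_down = (not rising) and v0 > level >= v1
        if crosses_up or crosses_down:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")

def delay_50(times, clk, out, vdd=3.0, out_rising=True):
    """50% delay from the rising clock edge to the output transition."""
    half = 0.5 * vdd
    t_out = crossing_time(times, out, half, out_rising)
    t_clk = crossing_time(times, clk, half, rising=True)
    return t_out - t_clk

# Made-up piecewise-linear samples (ps, V), not data from the simulations.
t = [0, 100, 500, 600, 1000]
clk1 = [0.0, 3.0, 3.0, 3.0, 3.0]       # 100 ps rising slope
data_out = [0.0, 0.0, 0.0, 3.0, 3.0]   # output rises between 500 and 600 ps
print(delay_50(t, clk1, data_out))     # 550 - 50 = 500.0
```

The helper returns the first crossing it finds, so for a real waveform the samples should start just before the clock edge of interest.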
4. Functional Timing Diagram
The timing diagram in Fig. 18 shows successful read and write operations of the SRAM array, along with the operation of the decoder circuits. Initial conditions are set on the SRAM cells to show ordinary operation of the device. CLK1 and CLK2 synchronize the device functions. In our addressing, the three least significant bits drive the row decoder and the most significant bit drives the column decoder. Address 0001 is selected by the decoder circuits, and DATA_IN and write enable (WR_EN) are driven high. The SRAM cell Q1, corresponding to address 0001, is written from a 0 to a 1. The process continues and the bits are flipped for cells 0010, 0100, and 1000, corresponding to waveforms Q2, Q4, and Q8 respectively. Next, a read is performed on all of these cells. Read enable (RD_EN) is driven high at each clock edge, and the result of each read cycle appears on the DATA_OUT signal. The cells are read in the same order in which they were written (0001, 0010, 0100, 1000), returning the values previously written.
Fig. 18: Complete SRAM timing diagram
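The write-then-read sequence in the timing diagram can be mirrored by a small behavioural model. This is a sketch under assumptions: the class and method names are hypothetical, clocking is omitted, all cells start at 0, and the mapping of the most significant address bit to a physical column is left abstract.

```python
class Sram8x2:
    """Behavioural model of an 8-row by 2-column SRAM with a 4-bit address."""

    def __init__(self):
        # 8 rows x 2 columns, initial condition 0 in every cell (assumed).
        self.cells = [[0, 0] for _ in range(8)]

    @staticmethod
    def decode(addr):
        row = addr & 0b111      # three least significant bits -> row decoder
        col = (addr >> 3) & 1   # most significant bit -> column decoder
        return row, col

    def write(self, addr, data_in, wr_en=1):
        """Store data_in at addr when write enable is high."""
        if wr_en:
            row, col = self.decode(addr)
            self.cells[row][col] = data_in & 1

    def read(self, addr, rd_en=1):
        """Return the stored bit when read enable is high, else None."""
        row, col = self.decode(addr)
        return self.cells[row][col] if rd_en else None

sram = Sram8x2()
sequence = (0b0001, 0b0010, 0b0100, 0b1000)
for addr in sequence:
    sram.write(addr, 1)
read_back = [sram.read(addr) for addr in sequence]
print(read_back)  # [1, 1, 1, 1]
```

Reading in the same order as the writes reproduces the DATA_OUT pattern described above, while cells that were never addressed keep their initial value.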
VII. Conclusion
An 8x2 SRAM memory block was successfully designed and implemented. The total area of the SRAM is 4,406.74um^2. Near-optimal performance and size were achieved.