# A 200 mV low leakage current subthreshold SRAM bitcell in a 130 nm CMOS process\*

Bai Na(柏娜)<sup>1,2,†</sup> and Lü Baitao(吕白涛)<sup>1,2</sup>

<sup>1</sup>School of Information Science and Engineering, Southeast University, Nanjing 210096, China
<sup>2</sup>School of Electronics and Information Engineering, Anhui University, Hefei 230601, China

**Abstract:** A low leakage current subthreshold SRAM in 130 nm CMOS technology is proposed for ultra low voltage (200 mV) applications. Almost all of the previous subthreshold works ignore the leakage current in both active and standby modes. To minimize leakage, a self-adaptive leakage cut off scheme is adopted in the proposed design without any extra dynamic energy dissipation or performance penalty. Combined with buffering circuit and reconfigurable operation, the proposed design ensures both read and standby stability without deteriorating writability in the subthreshold region. Compared to the referenced subthreshold SRAM bitcell, the proposed bitcell shows: (1) a better critical state noise margin, and (2) smaller leakage current in both active and standby modes. Measurement results show that the proposed SRAM functions well at a 200 mV supply voltage with 0.13  $\mu$ W power consumption at 138 kHz frequency.

**Key words:** subthreshold SRAM; static noise margin; leakage; ultra low power

**DOI:** 10.1088/1674-4926/33/6/065008

**EEACC:** 2570A; 2570D

### 1. Introduction

Subthreshold logic circuits are becoming increasingly popular in ultra-low-power applications<sup>[1,2]</sup>. However, transistor characteristics in the subthreshold region differ significantly from those in the superthreshold region. Conventional 6T SRAMs are prone to functional failure in the subthreshold region due to: (1) weak writability, (2) degraded read static noise margin (SNM), and (3) increased sensitivity to process variations. Several subthreshold SRAMs have been reported to minimize energy-per-access<sup>[3-6]</sup>. However, Reference [2] has confirmed that a further reduction in  $V_{dd}$  results in increased leakage energy consumption. This is because delay increases exponentially with decreasing supply voltage in the subthreshold region, so the leakage current is integrated over a longer time per operation. Reducing SRAM leakage energy is particularly important, as an SRAM is required to retain its data for an arbitrarily long time, regardless of its own access delay. Several leakage reduction techniques have been proposed for the SRAM standby mode<sup>[7]</sup>. Unfortunately, almost all of the previous subthreshold designs ignore the leakage current in both active and standby modes.
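The trend can be sketched with a first-order relation. The symbols $E_{\rm leak}$, $I_{\rm leak}$, $t_{\rm d}$, $C$ and $I_{\rm on}$ are introduced here for illustration only, and $n$ and $V_{\rm T}$ denote the subthreshold swing factor and the thermal voltage:

$$E_{\rm leak} \approx I_{\rm leak} V_{dd} t_{\rm d}, \qquad t_{\rm d} \propto \frac{C V_{dd}}{I_{\rm on}}, \qquad I_{\rm on} \propto \exp\left(\frac{V_{dd} - V_{\rm th}}{n V_{\rm T}}\right).$$

Because $t_{\rm d}$ grows exponentially as $V_{dd}$ drops while $I_{\rm leak}$ falls much more slowly, the leakage energy per operation rises at very low supply voltages.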

This paper presents a robust subthreshold SRAM bitcell that reduces leakage current in both active (write/read operation) and standby modes without extra dynamic energy dissipation or a performance penalty.

### 2. Subthreshold SRAM bitcell with self-adaptive leakage cut off scheme

To maintain a balance between read and write requirements, conventional 6T SRAM bitcell design is optimized by sizing the pull-down, pull-up and access transistors. However, given the exponential sensitivity of subthreshold current to threshold voltage ( $V_{\text{th}}$ ), it is impractical to rely on transistor sizing to ensure robustness under process variations. Therefore, a redesign of the bitcell structure instead of sizing is necessary.

#### 2.1. Proposed subthreshold SRAM bitcell design

A circuit schematic of the proposed bitcell is illustrated in Fig. 1(a). Transistors P1, N1 and P2, N2 form two cross-coupled inverters. Transistors N3, N4, N5 and N6 are access transistors. Transistors N7 and N8, combined with N5 and N6, implement two buffering circuits which protect the stored data during read operation. Self-adaptive leakage cut off transistors P3 and P4, which are gated by the signals  $\overline{Q}$  and Q, respectively, are introduced to the bitcell to minimize the leakage energy dissipation.

To maintain reliability and reduce leakage current in both active (write/read operation) and standby modes, a reconfigurable operating scheme is adopted, as shown in Fig. 1(b). In this way, readability and writability can be enhanced by the buffering circuit (N5\_N7, N6\_N8) and the control signal (discussed in detail in Section 3), respectively. Furthermore, using transistors with minimum size is possible (Fig. 1(c)), which is crucial for SRAM density requirements, as shown in Table 1.

#### 2.2. Self-adaptive leakage cut off scheme

The self-adaptive leakage cut off scheme relies on the fact that the drain-to-source current of a transistor in the subthreshold region varies exponentially with  $V_{\rm GS}$  and  $|V_{\rm th}|$ , as shown in Eq. (1).

$$I_{\rm sub} = I_{\rm sub0} \exp \frac{V_{\rm GS} - V_{\rm th} + \eta V_{\rm DS} - \gamma V_{\rm SB}}{n V_{\rm T}} \left(1 - \exp \frac{-V_{\rm DS}}{V_{\rm T}}\right),\tag{1}$$

<sup>\*</sup> Project supported by the China State-Funded Study Abroad Program for High-Level Universities.

<sup>†</sup> Corresponding author. Email: realbain@gmail.com

Received 30 December 2011, revised manuscript received 17 February 2012

Table 1. A comparison of various SRAM bitcells.

| Design | Process (nm) | Read | #WL | #BL | Component transistor size (nm) |
|--------|--------------|------|-----|-----|--------------------------------|
| Referenced_10T <sup>[4]</sup> | 90 | Differential | 2 | 2 | Not mentioned |
| Referenced_ST <sup>[5]</sup> | 130 | Differential | 1 | 2 | $W_{\rm PU}/W_{\rm AX}/W_{\rm PD}/W_{\rm NFL}/W_{\rm NL2} = 160/240/320/160/160$; $L_{\rm ALL} = 120$ |
| Referenced_8T <sup>[6]</sup> | 130 | Single ended | 1 | 3 | $W_{\rm ALL} = 160$; $L_{\rm M1, M4}/L_{\rm M3, M6}/L_{\rm M7, M8}/L_{\rm M2, M5} = 120/360/240/120$ |
| Proposed design | 130 | Differential | 2 | 2 | $W_{\rm ALL} = 160$, $L_{\rm ALL} = 120$ |

 $W_{\rm ALL}/L_{\rm ALL}$ : the width/length of all of the component transistors.



Fig. 1. (a) The proposed SRAM bitcell structure. (b) The reconfigurable operating principle. (c) The layout.



Fig. 2. Simulated waveform of write '0' to  $\overline{Q}$ .



Fig. 3. Simulated waveforms for the active and standby modes of the proposed design.

$$V_{\rm th} = V_{\rm t0} + \gamma \left(\sqrt{|-2\phi_{\rm F} + V_{\rm SB}|} - \sqrt{|-2\phi_{\rm F}|}\right),\tag{2}$$

where *n* is the subthreshold swing factor,  $\eta$  is the DIBL coefficient, and  $\gamma$  is the linearized body effect factor.  $V_{\rm T}$  is the thermal voltage,  $I_{\rm sub0}$  is the saturation current,  $V_{\rm th}$  is the threshold voltage when substrate bias is present,  $V_{\rm SB}$  is the source-to-body substrate bias,  $2\phi_{\rm F}$  is the surface potential, and  $V_{\rm t0}$  is the threshold voltage for zero substrate bias.
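As a rough worked example of this exponential dependence (a sketch only: the values $n = 1.5$ and $V_{\rm T} \approx 26$ mV at room temperature are assumed here, not taken from the measured process), holding all bias voltages in Eq. (1) fixed and raising the threshold voltage by $\Delta V_{\rm th}$ scales the current by

$$\frac{I_{\rm sub}(V_{\rm th} + \Delta V_{\rm th})}{I_{\rm sub}(V_{\rm th})} = \exp\left(-\frac{\Delta V_{\rm th}}{n V_{\rm T}}\right), \qquad n V_{\rm T} \ln 10 \approx 1.5 \times 26\ {\rm mV} \times 2.303 \approx 90\ {\rm mV}.$$

Under these assumed values, roughly every 90 mV of additional $|V_{\rm th}|$ cuts the subthreshold current by about one decade.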

Initially, consider the condition that Q = "0" and  $\overline{Q} = "1"$ . In order to write a "0" to  $\overline{Q}$ , bitlines BL and  $\overline{BL}$  are forced to "1" and "0", respectively. The wordlines (WL, WWL) of the accessed SRAM bitcell are raised and the precharge signal is disabled. Thus, the node  $\overline{Q}$  is discharged to "0" according to the value of  $\overline{BL}$ , and transistor P2 turns on, so the node Q is charged to "1" by the pull-up transistor P2. Note that the  $V_{\rm GS}$  of N1 is increased during this transient operation. Ultimately, with N1 in the "on" state, the Q<sub>L</sub> node voltage decreases to the same value as  $\overline{\rm Q}$ , as illustrated in Fig. 2. This electrically connects the source terminal of transistor P3 to node  $\overline{\rm Q}$ . The gate-to-source voltage of P3 then equals "0", and P3 turns off once the write operation is completed. It is known that if the source-to-body voltage of a transistor is different from zero ( $V_{\rm SB} \neq 0$ ), then  $|V_{\rm th}|_{V_{\rm SB}\neq 0} > |V_{\rm th}|_{V_{\rm SB}=0}$ , as shown in Eq. (2). Therefore, the drain-to-source current of P3 decreases exponentially as  $|V_{\rm th}|_{\rm P3}$  increases once the write operation is completed. At the same time, the value of  $V_{\rm GS}$  for P4 changes from "0" to a positive value, and the drain-to-source current of P4 also decreases dramatically.

Fig. 4. Transient waveform of write '0' to  $\overline{Q}$  under a 1000-sample Monte Carlo simulation (both process and mismatch simulation).

As shown in Fig. 3, either  $\text{QL}$  or  $\overline{\text{QL}}$  is raised to a positive value  $\Delta V$  or  $\Delta V'$ , depending on the bitcell value, during read and standby operations. Through the effects of  $\text{QL}$  and  $\overline{\text{QL}}$  on the values of  $V_{\text{GS}}$ ,  $V_{\text{SB}}$  and  $|V_{\text{th}}|$  for transistors N1, N2, P3 and P4, the leakage is reduced remarkably, according to Eqs. (1) and (2). According to the simulation results, the leakage reduction obtained from the body effect during read and standby operations outweighs the additional leakage induced by the parallel-connected transistors shown in Fig. 1. Therefore, the leakage current of the proposed design is also reduced in both active (read/write operation) and standby modes.

The cut-off transistors P3 and P4 neither introduce extra load on the bitline nor require auxiliary circuits to switch the bitcell into a leakage cut off mode. The proposed design can therefore reduce leakage in both active (read/write operation) and standby modes without any extra dynamic energy dissipation or performance penalty.
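To illustrate how a raised source node suppresses leakage through Eqs. (1) and (2), the following minimal sketch evaluates a generic "off" NMOS whose source is lifted by $\Delta V$. It is not the netlist of the proposed bitcell: all device parameters (`v_t0`, `gamma`, `phi_f`, `n`, `eta`, `i_sub0`) are assumed values chosen for illustration, and the linearized $\gamma V_{\rm SB}$ term of Eq. (1) is omitted because the body effect is already carried by $V_{\rm th}$ from Eq. (2).

```python
import math

V_T = 0.02585  # thermal voltage near 300 K, in volts (assumed temperature)


def threshold_voltage(v_t0, gamma, phi_f, v_sb):
    """Eq. (2): threshold voltage under a source-to-body bias v_sb."""
    return v_t0 + gamma * (
        math.sqrt(abs(-2.0 * phi_f + v_sb)) - math.sqrt(abs(-2.0 * phi_f))
    )


def subthreshold_current(v_gs, v_ds, v_th, i_sub0=1.0, n=1.5, eta=0.05):
    """Eq. (1), with the body effect folded into v_th (assumed parameters)."""
    exponent = (v_gs - v_th + eta * v_ds) / (n * V_T)
    return i_sub0 * math.exp(exponent) * (1.0 - math.exp(-v_ds / V_T))


def off_leakage(v_dd, delta_v, v_t0=0.35, gamma=0.4, phi_f=-0.3):
    """Leakage of an off NMOS (gate at 0 V, drain at v_dd, body at 0 V)
    whose source node is raised by delta_v."""
    v_th = threshold_voltage(v_t0, gamma, phi_f, v_sb=delta_v)
    return subthreshold_current(v_gs=-delta_v, v_ds=v_dd - delta_v, v_th=v_th)


baseline = off_leakage(0.2, 0.0)
for delta_v in (0.0, 0.02, 0.05):
    relative = off_leakage(0.2, delta_v) / baseline
    print(f"dV = {delta_v * 1e3:4.0f} mV -> relative leakage {relative:.3f}")
```

Raising the source lowers $V_{\rm GS}$, raises $V_{\rm th}$ through $V_{\rm SB}$ and shrinks $V_{\rm DS}$, so all three terms push the leakage in the same direction.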

### 3. Simulation results of the proposed design

In the remainder of this paper, the referenced bitcells and the proposed design are simulated under the same conditions: 130 nm logic process, 200 mV  $V_{dd}$  and 100 kHz frequency. The bitcell current is defined as the total current through the supply voltage and bitlines (BL,  $\overline{BL}$ ). Currents on the wordlines (WL, WWL) are not included since they receive currents from other independent power sources.

#### 3.1. Write operation

To enhance writability, some of the previous subthreshold SRAM bitcells collapse  $V_{dd}$  in write operation<sup>[3]</sup>. However, this also degrades the hold stability of the SRAM bitcells in other rows sharing the same  $V_{dd}$  line<sup>[3]</sup>. To improve the write margin, reference [6] utilizes the reverse short channel effect (RSCE). Unfortunately, these techniques incur too much area overhead, especially for large SRAM blocks. Instead of power gating<sup>[3]</sup> and RSCE<sup>[6]</sup>, boosted WL and WWL are adopted in the proposed design. At 200 mV, writability is ensured by boosting the wordline voltage to about 250 mV. Figure 4 shows that such boosting of the wordlines (WL, WWL) provides good writability in 1000-sample Monte Carlo transient simulations. Since gate input boosting outweighs the sizing effect in the subthreshold region, strong writability can be obtained without incurring a large area penalty, despite having series access transistors.
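The claim that gate boosting outweighs sizing can be checked with a back-of-the-envelope estimate based on the exponential $V_{\rm GS}$ term of Eq. (1). This is a sketch under assumed values: the subthreshold swing factor `n = 1.5` and the room-temperature thermal voltage are not taken from the paper, and the function name is hypothetical.

```python
import math


def boost_current_gain(delta_v_gs, n=1.5, v_t=0.02585):
    """Multiplicative increase in subthreshold current for a gate boost of
    delta_v_gs volts, from the exp(V_GS / (n * V_T)) term of Eq. (1).
    n and v_t are assumed values, not measured ones."""
    return math.exp(delta_v_gs / (n * v_t))


# Wordline boosted from 200 mV to about 250 mV, as described in the text.
gain = boost_current_gain(0.250 - 0.200)
print(f"access-transistor current gain from a 50 mV boost: {gain:.2f}x")
```

With these assumed values the gain is about 3.6×. Because subthreshold current scales only linearly with transistor width, obtaining the same drive by sizing would require a proportionally wider access transistor.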

#### 3.2. Read operation

Read failure is the most critical problem in realizing a subthreshold SRAM block. References [3, 4, 6] add extra buffering circuits (two or four transistors) to conventional 6T SRAM bitcells to obtain a read mechanism that does not disturb the internal nodes of the bitcell. However, in the subthreshold region, the bitline swing of an SRAM column during read operation deteriorates, especially at the worst-case process corner. This makes it difficult for a single-ended read sensing scheme to distinguish the correct value. Hence, the fully differential read scheme of the proposed design improves the bitline noise immunity and enhances SRAM robustness during read operation<sup>[4, 5]</sup>.

In the proposed design, WL is enabled while WWL is set to "0" during read operation. Considering Q = "0" and  $\overline{Q} =$ "1", BL is conditionally discharged through the buffering circuit, transistors N5 and N7, depending on the value of node  $\overline{Q}$ . Since the bitcell node is isolated from the bitline during this operation, the problem of read SNM degradation is removed. Thus, the critical static noise margin of the proposed design is transferred to its hold SNM.

#### 3.3. Standby operation

During standby mode, WWL is enabled while WL is set to "0". As N3 and N4 are on during this operation, a current path to ground is formed by N3–N7 or N4–N8 to enhance the pull-down strength of the node storing data "0", depending on the stored data. As a result, the stability of the proposed design in standby operation is ensured.
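The reconfigurable operating scheme described in Sections 2.2 and 3.1 to 3.3 can be summarized as a small lookup table. This is only a sketch of the wordline states stated in the text: the names `Wordlines`, `OPERATING_MODES` and `wordlines_for` are hypothetical, and analog details such as the boosted wordline level during write are not modelled.

```python
from typing import NamedTuple


class Wordlines(NamedTuple):
    wl: bool   # wordline WL
    wwl: bool  # wordline WWL


# Wordline states per operating mode, as stated in the text.
OPERATING_MODES = {
    "write": Wordlines(wl=True, wwl=True),     # both raised (boosted to ~250 mV)
    "read": Wordlines(wl=True, wwl=False),     # WL enabled, WWL set to "0"
    "standby": Wordlines(wl=False, wwl=True),  # WWL enabled, WL set to "0"
}


def wordlines_for(mode: str) -> Wordlines:
    """Return the wordline states for a mode, rejecting unknown modes."""
    try:
        return OPERATING_MODES[mode]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None


for mode, state in OPERATING_MODES.items():
    print(f"{mode:8s} WL={int(state.wl)} WWL={int(state.wwl)}")
```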

Both mismatch and process sensitivity simulation results of the referenced 10T<sup>[4]</sup> and the proposed design for hold SNM are provided in Fig. 5. Compared to the referenced 10T bitcell, the proposed bitcell improves the hold margin. This is because the series connected N3 and N7 or N4 and N8 can weaken the pull-down current at the fast NMOS and slow PMOS corner, which is the worst-case process corner for hold stability.

Fig. 5. A comparison of the hold SNM of (a) referenced 10T and (b) the proposed design in Monte Carlo analysis (10000 samples, both process and mismatch simulation).

#### 3.4. Write-back scheme for row data preservation

In a column MUXed array, the write operation still has stability problems because the enabled write wordline is also shared by the unselected columns<sup>[8]</sup>. This is also referred to as the half-selection problem in conventional 6T designs. To remove this problem and enhance readability with a smaller area penalty, a write-back sense amplifier is proposed, as shown in Fig. 6. During read operation, the bitlines of the columns (both selected and unselected) are also charged to  $V_{dd}$ . The sense amplifier is triggered by the bitline swing. It stores signals according to the bitline difference and writes them back to the bitcells in both read and write operations to enhance the transistor driving strength. By rewriting the read data back to unselected bitcells, there is no voltage difference between the bitlines (BL,  $\overline{BL}$ ) and the bitcell nodes. The contention current is eliminated in this way, and the half-selection problem is also removed.



Fig. 6. A write-back sense amplifier.

#### 3.5. Current distribution

The current distributions of the referenced 10T, ST<sup>[5]</sup> and the proposed design in both standby and active modes are shown in Fig. 7. The mean current of the proposed bitcell in standby mode is 9.0% and 55.59% less than that of the referenced 10T and ST SRAM bitcells, respectively. Meanwhile, the mean per-bitcell currents of the referenced 10T and ST SRAM bitcells in active mode are 592.845 pA and 224.567 pA, respectively, while the proposed bitcell consumes only 154.615 pA at 200 mV. The proposed bitcell thus exhibits a 73.92% and 31.15% reduction in active-mode current compared to the referenced 10T and ST, respectively. This is because the proposed design reduces leakage in both active and standby modes, while the referenced 10T only reduces leakage in standby mode. Compared to the referenced 10T and ST design, the proposed design exhibits: (1) 90.0% and 30.38% standby leakage standard deviation (Std), and (2) 21.11% and 58.78% active leakage Std. Therefore, the proposed design is more robust against process variation compared with the referenced 10T and ST designs.
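The active-mode percentages follow directly from the mean currents quoted above. A short check, using only those three reported values (the helper name `percent_reduction` is introduced here for illustration):

```python
def percent_reduction(reference, proposed):
    """Reduction of `proposed` relative to `reference`, in percent."""
    return 100.0 * (reference - proposed) / reference


# Mean per-bitcell active-mode currents at 200 mV, in pA, as quoted in the text.
ACTIVE_MEAN_PA = {
    "referenced_10T": 592.845,
    "referenced_ST": 224.567,
    "proposed": 154.615,
}

reductions = {
    name: percent_reduction(current, ACTIVE_MEAN_PA["proposed"])
    for name, current in ACTIVE_MEAN_PA.items()
    if name != "proposed"
}

for name, value in reductions.items():
    print(f"{name}: {value:.2f}% lower active-mode current in the proposed bitcell")
```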

### 4. Measurement results

A 32 × 256 bit SRAM array is implemented in a 130 nm process. The die photo is shown in Fig. 8(a). Measurement results confirm that both the active (write/read operation) and standby modes function well at a 200 mV supply voltage, as shown in Fig. 8(b). The maximum operation frequency and the average power consumption of the 32 × 256 bit SRAM array are shown in Fig. 9. The proposed SRAM array achieves a frequency of 138 kHz at 200 mV  $V_{dd}$ . The power is measured with a random input vector under an activity rate of 50% access per cycle. The total power (both dynamic power and standby power) consumption at 200 mV  $V_{dd}$  is 0.13  $\mu$ W, which is 1.78% of that at 600 mV. A comparison of the proposed design with other referenced designs is given in Table 2.
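A few derived figures follow from the measured numbers above. The sketch below uses only the reported values. The derived quantities (implied power at 600 mV, average energy per clock cycle, average power per bit) are simple ratios computed here and are not figures reported by the measurement itself.

```python
ROWS, COLS = 32, 256           # array organization from the text
POWER_AT_200MV_W = 0.13e-6     # total power at 200 mV
FREQ_AT_200MV_HZ = 138e3       # maximum frequency at 200 mV
POWER_RATIO_VS_600MV = 0.0178  # power at 200 mV is 1.78% of that at 600 mV

array_bits = ROWS * COLS
power_at_600mv_w = POWER_AT_200MV_W / POWER_RATIO_VS_600MV
energy_per_cycle_j = POWER_AT_200MV_W / FREQ_AT_200MV_HZ
power_per_bit_w = POWER_AT_200MV_W / array_bits

print(f"array size: {array_bits} bits")
print(f"implied power at 600 mV: {power_at_600mv_w * 1e6:.2f} uW")
print(f"average energy per cycle at 200 mV: {energy_per_cycle_j * 1e12:.3f} pJ")
print(f"average power per bit at 200 mV: {power_per_bit_w * 1e12:.1f} pW")
```

The array size of 8192 bits matches the 8 kb entry for the proposed design in Table 2. The implied power at 600 mV is about 7.3 μW, and the average energy per clock cycle at 200 mV is about 0.94 pJ.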



Fig. 7. The per bitcell current distribution of (a) referenced 10T, (b) ST and (c) the proposed design in both standby and active modes.

Table 2. A comparison of the proposed design with other referenced designs.

| Type of bitcell     | Traditional 6T bitcell | Referenced 10T bitcell | Referenced ST bitcell | Proposed design      |
|---------------------|------------------------|------------------------|-----------------------|----------------------|
| Structure           | 6T                     | 10T                    | 10T                   | 12T                  |
| Processing (nm)     | 65                     | 90                     | 130                   | 130                  |
| Size (kb)           | 16                     | 32                     | 4                     | 8                    |
| Area $(\mu m^2)$    | $120 \times 200$       | Not mentioned          | Not mentioned         | $141.1 \times 352.6$ |
| Supply voltage (mV) | 1200                   | 160                    | 160                   | 200                  |
| Power $(\mu W)$     | 11.2 @ 1200 mV         | 1.81 @ 300 mV          | Not mentioned         | 0.13 @ 200 mV        |
| Frequency (kHz)     | 1000 @ 1200 mV         | 0.5 @ 160 mV           | 620 @ 400 mV          | 138 @ 200 mV         |

Fig. 8. (a) A chip micrograph and (b) an SRAM operating waveform at 200 mV.

Fig. 9. The total power and maximum frequency of the proposed SRAM array versus supply voltage.

### 5. Conclusion

A 200 mV subthreshold SRAM bitcell with self-adaptive leakage cut off transistors is proposed in this paper. The key feature of the proposed design is that it reduces leakage in both active and standby modes without any extra dynamic energy dissipation or performance penalty. The stability and performance of the proposed design in the subthreshold region are enhanced by the differential read scheme, the buffering circuit and the operating scheme. Compared to the referenced 10T and ST SRAM bitcells, the proposed bitcell has five areas of improvement: (1) better critical state noise margin, (2) 9.0% and 55.59% smaller standby leakage mean, (3) 90.0% and 30.38% standby leakage standard deviation, (4) a 73.92% and 31.15% reduction in active leakage mean, and (5) 21.11% and 58.78% active leakage standard deviation. Measurement results show that the 32  $\times$  256 bit SRAM array functions correctly at a 200 mV supply voltage with 0.13  $\mu$ W power consumption at 138 kHz frequency.

### References

- [1] Bo Z, Pant S, Nazhandali L, et al. Energy-efficient subthreshold processor design. IEEE Trans Very Large Scale Integration Syst, 2009, 17(8): 1127
- [2] Wang A, Chandrakasan A. A 180-mV subthreshold FFT processor using a minimum energy design methodology. IEEE J Solid-State Circuits, 2005, 40(1): 310
- [3] Verma N, Chandrakasan A P. A 256 kb 65 nm 8T subthreshold SRAM employing sense-amplifier redundancy. IEEE J Solid-State Circuits, 2008, 43(1): 141
- [4] Chang I J, Kim J J, Park S P, et al. A 32 kb 10T sub-threshold SRAM array with bit-interleaving and differential read scheme in 90 nm CMOS. IEEE J Solid-State Circuits, 2009, 44(2): 650
- [5] Kulkarni J P, Kim K, Roy K. A 160 mV robust Schmitt trigger based subthreshold SRAM. IEEE J Solid-State Circuits, 2007, 42(10): 2303
- [6] Kim T H, Liu J, Kim C H. An 8T subthreshold SRAM cell utilizing reverse short channel effect for write margin and read performance improvement. IEEE Custom Integrated Circuits Conference, 2007: 241
- [7] Lakshminarayanan S, Joung J, Narasimhan G, et al. Standby power reduction and SRAM cell optimization for 65 nm technology. 10th International Symposium on Quality Electronic Design, 2009: 471
- [8] Kim T H, Liu J, Keane J, et al. A 0.2 V, 480 kb subthreshold SRAM with 1 k cells per bitline for ultra-low-voltage computing. IEEE J Solid-State Circuits, 2008, 43(2): 518