A light-powered sub-threshold microprocessor*

Liu Ming(刘鸣)†, Chen Hong(陈虹), Zhang Chun(张春), Li Changmeng(李长猛), and Wang Zhihua(王志华)

(Institute of Microelectronics, Tsinghua University, Beijing 100084, China)

Abstract: This paper presents an 8-bit sub-threshold microprocessor that can be powered by an integrated photosensitive diode. With a custom-designed sub-threshold standard cell library and a 1 kbit sub-threshold SRAM, a leakage power of 58 nW, a dynamic power of 385 nW @ 165 kHz, an EDP of 13 pJ/inst and an operating voltage of 350 mV are achieved. Under light of about 150 kLux, the microprocessor can run at up to 500 kHz. The microprocessor can be used in wireless-sensor-network nodes.

Key words: power harvesting; wireless-sensor-network; sub-threshold microprocessor; photosensitive diode

DOI: 10.1088/1674-4926/31/11/115002

EEACC: 2570

1. Introduction

Solar energy harvesting has been proposed as a replacement for batteries in order to lengthen the lifetime of an operating device and to reduce undesirable weight and volume[1]. However, the solar cell is still an external device that is incompatible with the standard CMOS process. CMOS photodiodes, which are widely used in digital cameras, have been exploited for sunlight power harvesting, but they suffer from low efficiency[2]. Because of the limited voltage and power that a single CMOS photodiode can generate, the chip must work below about 500 mV and consume no more than several microwatts, which is a big challenge for circuit design.

Sub-threshold logic is well suited to energy-constrained applications, such as a power-harvesting sensor, for which very high speed is not required. By lowering the supply voltage into the sub-threshold domain, the circuit can achieve ultra-low power consumption at the expense of speed, robustness and stability. There are several papers on sub-threshold circuit design[3–5], but few of them focus on the design of a standard cell library that optimizes both power and speed.

This paper presents a light energy harvesting node that contains a 180 nm sub-threshold microprocessor and an integrated CMOS photodiode. Under the light of 150 kLux, the sensor can run at a rate of up to 500 kHz. The digital part of the microprocessor is constructed from an optimized sub-threshold library. A 128 × 8 bit sub-threshold SRAM is added to the microprocessor as an internal RAM.

To design such a microprocessor, the conventional standard library might be useful. However, this is designed for the normal voltage domain and not optimized for sub-threshold operation due to the different current characteristic. It is difficult to design a sub-threshold circuit because of the nonlinear modu-

2. System architecture

The system architecture is shown in Fig. 1. An integrated CMOS photodiode acts as the voltage supply, and a sub-threshold 8-bit microprocessor is connected to it directly. The internal ROM is programmed to test the system functions. The 8-bit microprocessor contains 128 bytes of SRAM as internal RAM. The other parts of the processor include the control unit (CU), the arithmetic logic unit (ALU), the operation-program decoder (OP_DECODER) and the register file (REGS_FILE). An 8-bit testing register (TEST_REG) is added to the internal SFR_BUS. The output signal can be observed with an oscilloscope. The microprocessor is fully compatible with the 8051.

* Project supported by the National Natural Science Foundation of China (No. 60906010).
† Corresponding author. Email: lium02@mails.tsinghua.edu.cn

Received 26 April 2010, revised manuscript received 22 June 2010 © 2010 Chinese Institute of Electronics

3. Sub-threshold circuit design

3.1. Physical principles

The current characteristic of the sub-threshold domain differs from that of the super-threshold domain. The drain current is expressed by

\[ I = \frac{W}{L} \mu C_{ox} U_t^2 (n - 1) \exp \left( \frac{V_{gs} - V_{th}}{n U_t} \right) \left[ 1 - \exp \left( -\frac{V_{ds}}{U_t} \right) \right] \]

where \( W \) is the width and \( L \) is the length of the transistor, \( \mu \) is the carrier mobility, \( C_{ox} \) is the gate-oxide capacitance per unit area, \( V_{th} \) is the threshold voltage, \( U_t \) is the thermal voltage, \( n \) is a process-dependent factor, and \( V_{gs} \) and \( V_{ds} \) are the gate-to-source and drain-to-source voltages respectively. This equation can be used to analyze and predict the characteristics of circuits theoretically, but it is not accurate in the transition region near the threshold.
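As a numerical illustration of the current equation, the following minimal Python sketch evaluates the standard weak-inversion form, with a \( U_t^2 \) prefactor and a drain term \( 1 - \exp(-V_{ds}/U_t) \). The function names and all device parameters (mobility, oxide capacitance, threshold voltage, slope factor \( n \)) are placeholder assumptions, not values of the process used in this work. The sketch shows two properties relied on in this section: the current changes by one decade for every \( n U_t \ln 10 \) of gate voltage, and the drain term vanishes at \( V_{ds} = 0 \) and saturates once \( V_{ds} \) exceeds a few \( U_t \).

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C


def thermal_voltage(temp_k=300.0):
    """Thermal voltage U_t = kT/q in volts."""
    return K_B * temp_k / Q_E


def subthreshold_current(w, l, vgs, vds, vth=0.45, n=1.5,
                         mu=0.03, cox=8.5e-3, temp_k=300.0):
    """Weak-inversion drain current in amperes.

    w, l     : transistor width and length in metres
    vth, n   : assumed threshold voltage (V) and slope factor
    mu, cox  : assumed mobility (m^2/Vs) and oxide capacitance (F/m^2)
    """
    ut = thermal_voltage(temp_k)
    prefactor = (w / l) * mu * cox * ut ** 2 * (n - 1.0)
    gate_term = math.exp((vgs - vth) / (n * ut))
    drain_term = 1.0 - math.exp(-vds / ut)
    return prefactor * gate_term * drain_term


if __name__ == "__main__":
    ut = thermal_voltage()
    decade = 1.5 * ut * math.log(10.0)  # gate swing for a tenfold current change
    i_a = subthreshold_current(1e-6, 0.18e-6, 0.20, 0.35)
    i_b = subthreshold_current(1e-6, 0.18e-6, 0.20 + decade, 0.35)
    print(f"U_t = {ut * 1e3:.2f} mV, swing per decade = {decade * 1e3:.1f} mV")
    print(f"I(0.20 V) = {i_a:.3e} A, ratio after one swing = {i_b / i_a:.3f}")
```

Because the dependence on \( V_{gs} \) and \( V_{th} \) is exponential, a small shift in either quantity changes the current, and hence the gate delay, by a large factor.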

The BSIM threshold model contains six effects: the body effect, charge sharing, drain-induced barrier lowering (DIBL), the reverse short-channel effect (RSCE), the narrow-width effect and the small-size effect. Theoretical calculation indicates that, in the sub-threshold domain, the DIBL effect can be neglected, whereas the RSCE strongly affects the NMOS transistor.

The sub-threshold drive and leakage currents of the NMOS and PMOS transistors versus \( W \) and \( L \) are plotted in Fig. 2. It can be seen that when the width and length increase, the transistor current does not increase correspondingly, because \( W \) and \( L \) modulate the current nonlinearly. Consequently, it is hard to predict the current characteristic from the transistor size alone, which is different from the normal voltage domain.

3.2. Design method for a sub-threshold circuit

The transistor current is modulated nonlinearly by \( W \) and \( L \), so the performance of the circuit cannot be predicted simply from a change in size. To solve this problem, the sizes of the NMOS and PMOS transistors are each fixed. The threshold is then affected only by the body effect.

The first step is to decide the standard sizes of the NMOS and PMOS transistors. From Fig. 2, we can see that a small transistor has the lowest leakage power, and its dynamic power is also low. This is useful for applications that are powered by batteries. A small transistor also presents a light load to its driver. Since the delay is approximately proportional to the load capacitance, the loss in performance is partly compensated.

The second problem is drivability. The low leakage of a small transistor also means poor drivability. To solve this problem, a parallel structure is used to speed up the circuit. Figure 3 illustrates an example.

The third problem is the impact of process variation on the small transistor. The threshold-voltage variation is proportional to \((WL)^{-1/2}\), so a small transistor suffers stronger variation than a normal one. However, because the variations are random, a large-scale circuit is not affected as much as a single transistor. Figure 4 shows a comparison.

Two types of 32-bit adder, SUB (sub-threshold adder) and NORMAL (normal adder), were analyzed using the Monte-Carlo method. The \( x \)-axis is the delay and the \( y \)-axis is the number of occurrences. Figure 4 indicates that the “3σ/μ” values of the two adders are approximately the same. The figure also shows that the SUB adder is faster than the NORMAL adder.
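The averaging argument can be illustrated with a simplified Monte-Carlo sketch in Python. It is not the simulation setup behind Fig. 4. It assumes that each gate has an independent Gaussian threshold shift with standard deviation \( A_{vt}/\sqrt{WL} \), that the gate delay scales as \( \exp(\Delta V_{th} / n U_t) \) in line with the exponential current equation of Section 3.1, and that the path delay is the sum of the gate delays. The matching coefficient `a_vt_mv_um`, the slope factor, the transistor size and the function names are hypothetical.

```python
import math
import random
import statistics


def sigma_vth(w_um, l_um, a_vt_mv_um=5.0):
    """Threshold-voltage standard deviation in volts: A_vt / sqrt(W * L).

    a_vt_mv_um is an assumed matching coefficient in mV*um.
    """
    return a_vt_mv_um * 1e-3 / math.sqrt(w_um * l_um)


def three_sigma_over_mu(n_gates, sigma_v, n_slope=1.5, ut=0.0259,
                        trials=2000, seed=1):
    """Monte-Carlo estimate of 3*sigma/mu for the delay of an n_gates path."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    delays = []
    for _ in range(trials):
        total = 0.0
        for _ in range(n_gates):
            dvth = rng.gauss(0.0, sigma_v)             # per-gate Vth shift
            total += math.exp(dvth / (n_slope * ut))   # normalized gate delay
        delays.append(total)
    return 3.0 * statistics.stdev(delays) / statistics.mean(delays)


if __name__ == "__main__":
    s = sigma_vth(0.22, 0.18)  # assumed small-transistor size in um
    print(f"sigma(Vth) = {s * 1e3:.1f} mV")
    print(f"single gate : 3s/mu = {three_sigma_over_mu(1, s):.2f}")
    print(f"32-gate path: 3s/mu = {three_sigma_over_mu(32, s):.2f}")
```

Under these assumptions the relative spread of a path of \( N \) independent gates shrinks roughly as \( 1/\sqrt{N} \), which is why a long logic path is far less sensitive to variation than a single small transistor.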

3.3. Sub-threshold library design

Following the design method described in Section 3.2, a customized library is established. The standard inverter is sized to achieve low leakage and maximal noise margin, which is very important for a sub-threshold circuit. The library contains 80 cells and is compatible with standard CAD tools and the standard design flow.

Fig. 3. (a) Parallel inverters. (b) Driving strength simulations.

Fig. 4. Monte–Carlo analysis on delay of two types of 32-bit adder.

Table 1. Simulation comparisons between 32-bit sub-threshold and standard CLAs.

<table>
<thead>
<tr>
<th>Synthesized library</th>
<th>Frequency (kHz)</th>
<th>Leakage (nA)</th>
<th>Dynamic (nW)</th>
<th>PDP (pJ)</th>
<th>EDP (pJ μs)</th>
<th>Cell amount</th>
</tr>
</thead>
<tbody>
<tr>
<td>Sub-threshold</td>
<td>690</td>
<td>2</td>
<td>45</td>
<td>0.07</td>
<td>0.08</td>
<td>343</td>
</tr>
<tr>
<td>Standard</td>
<td>501</td>
<td>5</td>
<td>90</td>
<td>0.18</td>
<td>0.36</td>
<td>349</td>
</tr>
<tr>
<td>Comparison</td>
<td>37%↑</td>
<td>60%↓</td>
<td>50%↓</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

A 32-bit carry-look-ahead adder (CLA) is simulated for evaluation. The results for the longest path, namely the average frequency, the leakage and dynamic power, the power-delay product (PDP) and the energy-delay product (EDP), are listed in Table 1. For comparison, the adder is synthesized separately with a conventional 180 nm standard library and with the sub-threshold library. Judging from the results, the sub-threshold adder outperforms the standard one in every listed metric.
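The comparison figures quoted in Table 1 (37%↑, 60%↓, 50%↓) can be checked from the frequency, leakage and dynamic-power entries with a few lines of Python. The sketch also assumes that the PDP is the dynamic power divided by the operating frequency. This definition is not stated in the text, but it reproduces the tabulated PDP values after rounding. The helper names are illustrative.

```python
def relative_change_percent(new, ref):
    """Signed change of `new` relative to `ref`, in percent."""
    return (new - ref) / ref * 100.0


def pdp_pj(dynamic_nw, freq_khz):
    """Power-delay product in pJ, assuming PDP = P_dynamic / f.

    nW / kHz = 1e-9 W / 1e3 Hz = 1e-12 J, i.e. the quotient is already in pJ.
    """
    return dynamic_nw / freq_khz


# Values copied from Table 1.
SUB = {"freq_khz": 690.0, "leak_na": 2.0, "dyn_nw": 45.0}
STD = {"freq_khz": 501.0, "leak_na": 5.0, "dyn_nw": 90.0}

if __name__ == "__main__":
    for key in ("freq_khz", "leak_na", "dyn_nw"):
        print(f"{key}: {relative_change_percent(SUB[key], STD[key]):+.1f} %")
    print(f"PDP sub-threshold: {pdp_pj(SUB['dyn_nw'], SUB['freq_khz']):.3f} pJ")
    print(f"PDP standard     : {pdp_pj(STD['dyn_nw'], STD['freq_khz']):.3f} pJ")
```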

3.4. Sub-threshold SRAM design

The sub-threshold SRAM is another important part of the system. The RAM structure is depicted in Fig. 5. The cell array is built from an improved 11-T sub-threshold cell (see Fig. 6), which consists of a standard 6-T cell and a read-out buffer (M7–M11). For stability, the address decoders and data drivers are synthesized from combinational logic.

Unlike in the traditional design, the read bit-line (RBL) is connected to VDD during the idle time. Normally this would increase the leakage current from VDD to GND through the read-out buffer when all cells store ‘0’. To mitigate the problem, an extra transistor (M11 in Fig. 6) is added to the cell. When the RBL is connected to VDD, all cells appear as ‘1’ to the RBL, which suppresses the leakage. In theory, this means that more cells can be connected to the RBL. There are three main reasons for tying the RBL to VDD:

1. The conventional pre-charge of the RBL is bypassed, which increases the speed.
2. The system stability is enhanced. In the ultra-low voltage domain, a floating RBL is easily disturbed by random noise coupled through parasitic capacitance.
3. The speed is increased because the delay of reading logic ‘1’ is zero, so the read delay depends only on the discharge current through the NMOS stack. In this technology, the NMOS is much stronger than the PMOS (about ten times).

4. Photo-sensitive diode

An integrated photodiode is designed as the power source. The structure is depicted in Fig. 7. There are two diodes, D1 and D2. D2, formed by the N-well and the P-substrate, is more light-sensitive than the diode formed by the N-well and the P-diffusion. Thus only D1, with the N-well connected to ground, can generate positive power, and serially connected diodes cannot be realized. When exposed to light of a certain intensity, D1 can deliver an output voltage of about 500 mV, but the photocurrent is very small. The measured short-circuit currents and the output voltages across different load resistors versus light intensity are plotted in Fig. 8. The results show that the maximum output voltage of D1 is about 500 mV. With a 50 kΩ resistor, the maximum output current is about 10 µA.
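The load figures above follow from Ohm's law, as the short Python sketch below shows. It also sets the resulting current against the current implied by the processor power quoted in the abstract (58 nW leakage plus 385 nW dynamic at 350 mV). Since the two operating points are at different voltages and the photodiode is not an ideal source, this is only an order-of-magnitude comparison. The function names are illustrative.

```python
def load_current_ua(v_out_mv, r_load_kohm):
    """Current through a resistive load in uA (mV / kOhm = uA)."""
    return v_out_mv / r_load_kohm


def load_power_uw(v_out_mv, r_load_kohm):
    """Power delivered to the load in uW (V * uA = uW)."""
    return (v_out_mv * 1e-3) * load_current_ua(v_out_mv, r_load_kohm)


def supply_current_ua(power_nw, vdd_mv):
    """Average supply current in uA for a given power and VDD (nW / mV = uA)."""
    return power_nw / vdd_mv


if __name__ == "__main__":
    i_pd = load_current_ua(500.0, 50.0)           # photodiode into 50 kOhm
    p_pd = load_power_uw(500.0, 50.0)
    i_cpu = supply_current_ua(58.0 + 385.0, 350.0)  # processor at 165 kHz
    print(f"photodiode: {i_pd:.1f} uA, {p_pd:.1f} uW")
    print(f"processor : {i_cpu:.2f} uA at 350 mV")
```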

5. Test results

The microprocessor was fabricated using 180 nm standard technology. The die size was 1300 × 1000 µm² (see Fig. 9).

The performances of the memory and digital circuit operation are summarized separately and plotted in Fig. 10.

From Fig. 10, we can deduce that at 350 mV the leakage power is about 58 nW, and the EDP/inst is 13.3 pJ/inst for the memory operation and 4.35 pJ/inst for the digital operation.

Performance comparisons are made between three sub-threshold MCUs, as listed in Table 2.

From Table 2, we can see that our chip’s leakage power is

Table 2. Performance comparisons between three MCUs.

<table>
<thead>
<tr>
<th>Parameter</th>
<th></th>
<th></th>
<th>This work</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology (nm)</td>
<td>130</td>
<td>65</td>
<td>180</td>
</tr>
<tr>
<td>Lowest $V_{DD}$ (mV)</td>
<td>200</td>
<td>300</td>
<td>350 (SRAM), 250 (Digital)</td>
</tr>
<tr>
<td>Leakage power @ 350 mV (pJ/inst)</td>
<td>2</td>
<td>-</td>
<td>1.1</td>
</tr>
<tr>
<td>Operation frequency (kHz)</td>
<td>84.66 @ 260 mV, 1529 @ 400 mV</td>
<td>434 @ 500 mV</td>
<td>100 @ 250 mV, 330 @ 400 mV (SRAM), 1200 @ 400 mV (Digital)</td>
</tr>
<tr>
<td>Energy per inst (pJ)</td>
<td>2.6 @ 360 mV</td>
<td>27.3 @ 500 mV</td>
<td>7 @ 350 mV</td>
</tr>
<tr>
<td>SRAM (kb)</td>
<td>2</td>
<td>128</td>
<td>1</td>
</tr>
<tr>
<td>DC–DC</td>
<td>Available</td>
<td>Available</td>
<td>None</td>
</tr>
</tbody>
</table>

Fig. 10. Leakage current and performance versus $V_{DD}$.

The integrated photodiode serves as the power source. Under light of 150 kLux, the microprocessor can run at up to 500 kHz. Figure 11 shows the experimental environment.

6. Conclusions

A WSN node faces stringent power limitations when its power comes from the environment instead of a battery. A photosensitive diode can use light to generate power. Because of its poor driving strength and low output voltage, the power consumption of the device is more critical than its speed. In this work, sub-threshold logic is used to ease the power constraint.

A light-powered 8-bit microprocessor for the WSN is presented in this paper. Under the light of 150 kLux, the microprocessor can run at up to 500 kHz.

A design method for sub-threshold circuits is provided in this paper. It can enhance the performance of a sub-threshold circuit with respect to $V_{DD}$ and temperature. The microprocessor core is designed with a customized standard library that contains about 80 cells optimized for leakage, robustness and speed. A 128 × 8 sub-threshold SRAM is designed with an improved 11-T cell for the microprocessor. The test results show that a leakage power of 46 nW and a dynamic power of 385 nW @ 165 kHz are achieved at an operating voltage of 350 mV, with an energy per instruction of about 13 pJ. The digital operation of the microprocessor can run at up to 500 kHz @ 350 mV, and its EDP/inst is 4.35 pJ/inst. This MCU has potential applications in future WSNs.

References


