# A Low-Power, Single-Poly, Non-Volatile Memory for Passive RFID Tags\*

Zhao Dixian<sup>1</sup>, Yan Na<sup>1</sup>, Xu Wen<sup>2</sup>, Yang Liwu<sup>2</sup>, Wang Junyu<sup>1,†</sup>, and Min Hao<sup>1</sup>

(1 State Key Laboratory of ASIC & Systems, Fudan University, Shanghai 201203, China)

(2 Design Services, Semiconductor Manufacturing International (Shanghai) Corp., Shanghai 201203, China)

Abstract: Single-poly, 576bit non-volatile memory is designed and implemented in an SMIC $0.18\mu m$ standard CMOS process for the purpose of reducing the cost and power of passive RFID tag chips. The memory bit cell is designed with conventional single-poly pMOS transistors, based on the bi-directional Fowler-Nordheim tunneling effect, and the typical program/erase time is 10ms for every 16bits. A new, single-ended sense amplifier is proposed to reduce the power dissipation in the current sensing scheme. The average current consumption of the whole memory chip is $0.8\mu A$ for the power supply voltage of 1.2V at a reading rate of 640kHz.

Key words: RFID; single-poly; non-volatile memory; standard CMOS process; sense amplifier; low power

**EEACC:** 1265D

## 1 Introduction

In recent years, radio frequency identification (RFID) has received much attention for its explosive growth in use in public transportation, supply chain management, access control, and animal tracking [1~3]. Two of the most important concerns with a passive tag IC are the cost, which is the main driver for the popularization of RFID technology, and the power consumption of tag chips, which determines the operational range of the tag [4]. A critical component of tag chips is the embedded non-volatile memory (NVM), which stores essential information such as the electronic product code (EPC), chip configuration bits, tag manufacturer information, and possibly user data.
However, the conventional embedded non-volatile memory approach, such as EEPROM, which has been widely used in RFID technology, is expensive because it requires additional mask and process steps. Aiming at low cost and low power, we developed a single-poly non-volatile memory (SPNVM) that is compatible with the standard CMOS process. The single-poly memory cell consists of a coupling capacitor and a tunneling capacitor, where the capacitance of the coupling capacitor is much larger than that of the tunneling one. Because the two capacitors form a capacitive divider, a large fraction of the voltage applied between the two terminals appears across the dielectric of the tunneling capacitor. Several mechanisms, such as Fowler-Nordheim tunneling, can then modulate the charge on the floating gate, creating the logic "1" and "0" states.

Several publications have addressed this approach [5~7]. Reference [5] used an nMOS transistor as the tunneling capacitor. In fact, using a pMOS transistor instead is a better choice because of its better data retention and endurance characteristics. In Ref. [6], each bit cell has its own localized sense amplifier and switching circuits. Although this offers better reliability, the bit cell is complicated and therefore too large and too power-hungry. Reference [7] used a MIM capacitor as the coupling capacitor, but this reduces the memory retention time due to leakage from the poly contact and the metal inter-layer dielectric. In this paper, both the coupling and tunneling capacitors are realized with pMOS transistors. Write/erase operations are performed by exploiting Fowler-Nordheim tunneling [8] to minimize the power consumption. The limitation of an SPNVM bit cell is its larger size compared with the conventional EEPROM bit cell. However, this is not a constraint in RFID technology, since the required data storage capacity is small and a capacity of 576bit is enough in most cases.
In addition, optimization of the periphery circuits, such as word-line driver, is also considered. A new single-ended sense amplifier is proposed for reducing the power consumption during the read process. An SPNVM chip containing a 576bit memory array has been fabricated in an SMIC 0.18µm standard CMOS process. Testing is performed to qualify the memory bit cells and validate the whole memory chip.

<sup>\*</sup> Project supported by the National High Technology Research and Development Program of China (No. 2005AA1Z1300)

<sup>†</sup> Corresponding author. Email: junyuwang@fudan.edu.cn

## 2 Cell structure and principle of SPNVM operation

A cross section of the SPNVM cell structure is illustrated in Fig. 1 (a). The memory cell consists of two pMOS capacitors that are interconnected through a common floating polysilicon. Here, the physical size of transistor Mp1 is much larger than that of transistor Mp2, so Mp1 behaves like a coupling capacitor and can be used to control the floating-gate voltage and establish a large electric field across the tunneling capacitor (Mp2). In order to keep the power consumption as small as possible, both write and erase operations are based on the Fowler-Nordheim tunneling effect and, therefore, avoid the high drain current required for hot electron injection. By modulating the amount of charge residing on the floating gate, logic "1" and "0" states can be created.

Fig. 1 Cross section (a) and layout (b) of the SPNVM cell structure. The ratio of Mp1 to Mp2 is roughly 16:1.

Special attention was paid to the layout of the memory bit cell, shown in Fig. 1 (b). In order to prevent the floating gate from disturbing neighboring signals, we adopted a low layer metal, which is connected to clean ground as a shield to cover the floating gate. In addition, those controlling signals were implemented in high layer metals and they were forbidden from passing above the floating gate in order to keep cross talk to a minimum.
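As a rough illustration of the capacitive divider formed by Mp1 and Mp2, the Python sketch below estimates how much of the applied voltage appears across the tunneling oxide. It is an idealized model under stated assumptions, not the authors' design calculation: parasitic capacitances are ignored, the floating gate is assumed to start uncharged, and the capacitance ratio is assumed to equal the roughly 16:1 ratio quoted for Mp1 to Mp2. The 6V and 4nm values are the approximate program voltage and gate-oxide thickness quoted for this bit cell; the function names are hypothetical.

```python
# Idealized capacitive-divider model of the two-pMOS floating-gate cell.
# Assumptions (not from the paper): no parasitic capacitance, an initially
# uncharged floating gate, and a capacitance ratio equal to the ~16:1
# Mp1:Mp2 ratio.

def tunnel_oxide_voltage(v_applied, coupling_ratio):
    """Voltage across the tunneling capacitor (Mp2) of a series divider.

    coupling_ratio is C_coupling / C_tunnel.
    """
    return v_applied * coupling_ratio / (coupling_ratio + 1.0)


def oxide_field_mv_per_cm(v_oxide, t_ox_nm):
    """Average oxide field in MV/cm for v_oxide volts across t_ox_nm nanometres."""
    # 1 V/nm equals 10 MV/cm.
    return v_oxide / t_ox_nm * 10.0


v_pp = 6.0     # approximate program/erase voltage, in volts
ratio = 16.0   # assumed C_coupling / C_tunnel
t_ox = 4.0     # approximate gate-oxide thickness, in nm

v_tun = tunnel_oxide_voltage(v_pp, ratio)
field = oxide_field_mv_per_cm(v_tun, t_ox)
print(f"Voltage across tunneling oxide: {v_tun:.2f} V")
print(f"Average oxide field: {field:.1f} MV/cm")
```

Under these assumptions about 5.6V of the 6V appears across the tunneling oxide, which corresponds to an average field of roughly 14MV/cm. A smaller coupling ratio would leave less voltage on the tunneling capacitor, which is why Mp1 is made much larger than Mp2.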
The schematic of a complete SPNVM bit cell is shown in Fig. 2 (a). Mp1 and Mp2 are the coupling and tunneling capacitors, respectively. Both are low voltage transistors and the thickness of their gate oxide is approximately 4nm. Due to this thin gate oxide, the high voltage required for the program/erase operation is only about 6V. Mn1 is a readout transistor and it provides a sink current for the read operation. The magnitude of this sink current is determined by the logic state of the memory cell. Mn2, Mn3, and Mn4 are control transistors, selecting the active block that will do the program or read operations. According to our specification, since all bit cells in the same block will be erased before programming, Mn2 can be shared in one block. The four nMOS transistors are realized by 3.3V input/output transistors, which are readily available in a standard CMOS process. They can withstand a higher voltage before Fowler-Nordheim tunneling affects them. The table in Fig. 2 (b) shows the operational conditions, where $V_{\rm dd}$ equals the voltage of the power supply; $V_{\rm pp}$ is the high voltage generated by the charge pump, and $V_{\rm sen}$ roughly equals the threshold voltage of Mn1.

| Mode / Node | WL | BL | PRO | ERA | AG |
|---------|-------------|-------------|--------------|--------------|-------------|
| Program | $V_{\rm pp}$ | $V_{\rm dd}$ | $V_{\rm pp}$ | 0 | $V_{\rm dd}$ |
| Erase | $V_{\rm pp}$ | $V_{\rm dd}$ | 0 | $V_{\rm pp}$ | $V_{\rm dd}$ |
| Read | $V_{\rm dd}$ | $V_{\rm dd}$ | $V_{\rm sen}$ | $V_{\rm sen}$ | 0 |

Fig. 2 (a) SPNVM bit cell schematic; (b) Programming and reading modes versus bit cell signals

## 3 Memory architecture

A general description of the memory architecture is given in Fig. 3. The 576bit memory array is arranged with 36 rows and 16 columns.
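The array geometry can be made concrete with a short Python sketch. It only illustrates the arithmetic of a 36 x 16 array: the row-major bit ordering, the helper names, and the assumption that rows are programmed one at a time are hypothetical and are not taken from the paper. The 10ms figure is the typical program/erase time per 16bits quoted in the abstract.

```python
# Address arithmetic for a 36-row by 16-column array (576 bits in total).
# Assumptions (not from the paper): bits are numbered row-major, and each
# 16-bit row is programmed in a single 10 ms operation.

ROWS = 36              # word lines
COLS = 16              # bit lines
PROGRAM_TIME_MS = 10   # typical program/erase time per 16 bits


def bit_to_row_col(bit_index):
    """Map a linear bit index (0..575) to a (row, column) pair."""
    if not 0 <= bit_index < ROWS * COLS:
        raise ValueError(f"bit index {bit_index} outside 0..{ROWS * COLS - 1}")
    return divmod(bit_index, COLS)


def row_col_to_bit(row, col):
    """Inverse of bit_to_row_col."""
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError("row or column out of range")
    return row * COLS + col


def full_array_program_time_ms():
    """Time to program every row once, one row per operation (assumed)."""
    return ROWS * PROGRAM_TIME_MS


print(ROWS * COLS)                   # 576
print(bit_to_row_col(575))           # (35, 15)
print(full_array_program_time_ms())  # 360
```

With these assumptions, writing the whole array takes about 360ms of programming time, not counting the preceding erase.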
The periphery circuits consist of a word-line decoder, including the row pre-decoder and word-line drivers, and a bit-line decoder, including the column pre-decoder and bit-line select module. The data latch stores the data that will be programmed in parallel and drives the corresponding bit lines to the high voltage. The charge pump internally generates the high voltage necessary to program the single-poly memory cells. To minimize power consumption, data is read out serially and thus only one sense amplifier is needed. The control logic manages the functional and test modes and synchronizes the datapath using a clock signal provided by the baseband of the tag chip.

Fig. 3 General description of the memory architecture

## 4 Design optimization

Many design techniques for low power non-volatile memories have been reported [9,10]. The main aspects of low power optimization include word-line driving, the current sensing circuit for the read operation, and the charge pump circuit for memory programming [11,12].

### 4.1 Word line driving

According to the bit cell operational conditions (see Fig. 2), the word line signals must be driven to the high voltage $V_{\rm pp}$ in the program mode, but to the low voltage $V_{\rm dd}$ in the read mode. The corresponding word line driving scheme is shown in Fig. 4. The address is partitioned into sections of 2bit that are decoded in advance in the first stage. The resulting signals are then combined using 4-input NAND gates to produce the fully decoded array of word line signals<sup>[13]</sup>. The use of a pre-decoder in the decoding scheme reduces the number of transistors required and, thus, reduces the propagation delay and the power consumption. The transistors (M1 $\sim$ M4) are high voltage devices, acting as a level shifter. The buffer that follows provides isolation, preventing the large word line capacitance from disturbing the voltage-shifting operation.

Fig. 4 Word line driver description

### 4.2 A new single-ended sense amplifier

Sense amplifiers (SA) play a major role in the functionality and performance of memory circuits. In order to distinguish the logic "1" and "0", most conventional SAs [14,15] need a bias-generator module, such as a bandgap reference, which provides a bias voltage or reference current for comparison with the readout cell current. Such auxiliary circuit modules are one of the main sources of power consumption during the read process.

In this work, a new single-ended SA is proposed, as shown in Fig. 5 (a). Since the absolute value of $I_{\rm ref}$ is not important in our circuit, we omit the auxiliary modules that provide an accurate reference. Mp1, Mp2 and Mp3 all have a length to width ratio of 20:1 and they are connected in series as a high-impedance resistor, providing the reference current ($I_{\rm ref}$). Simulations at different corners show that $I_{\rm ref}$ is about $180 \sim 250 \, \rm nA$ when node B is grounded. Transistor Mp4 provides the pre-charge current ($I_{\rm pre}$). Given the power constraints of the tag IC, the width to length ratio of Mp4 is kept small in order to reduce transient power consumption.

Fig. 5 (a) Schematic of the proposed single-ended SA; (b) Timing diagram for the operation of the SA

For simplicity, the bit-line switch and memory cell are represented by transistors Ms and Mc1-Mc2, respectively, and $C_b$ represents the parasitic capacitor at the bit line. Assuming that the potential of the node FG is -0.8V for the "0" state cell and 1.1V for the "1" state cell (these two voltages can be calculated and correspond to the high voltage applied during the program process), the transient current of the memory cell is less than 1nA ($I_{\rm off}$) for the "0" state cell, and $5\mu A$ ($I_{\rm on}$) for the "1" state cell when node B is high enough and transistor Mc1 works in the saturation region.
As the potential of node B decreases and gradually falls below the threshold voltage of the inverter, $I_{\rm on}$ is reduced accordingly. The threshold voltage of the inverter must therefore be designed carefully to guarantee that the correct output is obtained.

The timing diagram for the operation of the SA is shown in Fig. 5 (b). In the pre-charge phase, the Precharge signal is low and the bit line is pre-charged to a clamped value, which is near $V_{\rm dd}$ and higher than the threshold voltage of the inverter. When the Precharge, ReadSyn and WL signals go high, the SA is activated. When the active memory cell is a "0" cell ($I_{\rm off} \ll I_{\rm ref}$), the potential of node B is nearly unchanged since the parasitic capacitor of the memory cell is small, and DataOut equals "0". Conversely, when the active memory cell is a "1" cell ($I_{\rm on} \gg I_{\rm ref}$), node B is discharged and DataOut finally equals "1". Transistors Mp1 $\sim$ Mp3 and the inverter form a feedback loop, which is used to save the reference branch current ($I_{\rm ref}$) after DataOut equals "1".

The main aim of this circuit is to minimize the power consumption in read mode. Simulation results show that the power consumption decreases by 30% with the proposed SA. Other techniques<sup>[16]</sup> were also used to minimize the voltage swing on the bit lines and further reduce the power dissipation during the memory reading operation in our current sensing scheme.

Fig. 6 Microphotographs of two memory chips, implemented in EEPROM (a) & standard CMOS process (b), respectively

## 5 Results and discussion

The microphotographs of the two memory chips are shown in Fig. 6. The chip in Fig. 6 (a) is implemented in an SMIC $0.18\mu m$ EEPROM process, while the chip in Fig. 6 (b) is fabricated in an SMIC $0.18\mu m$ standard CMOS process. A comparison between the two tag chips is given in Table 1.
Although the area of the SPNVM bit cell is comparatively large, the share of the tag chip area occupied by the whole memory chip increased by only 7.7 percentage points (from 15.0% to 22.7%). The total cost of the tag chip decreased by 19.7%, thanks to the relatively cheap process and fewer photolithography steps.

Table 1 Comparison between two tag chips with different processes

| Process (SMIC 0.18μm) | EEPROM | CMOS |
|-------------------------------|----------------------|------------------------------|
| Area of memory bit cell | $3.96 \mu m^2$ | $150 \mu m^2$ |
| Number of mask layers | 28 | 19 |
| Number of photolithography steps | 29 | 19 |
| Memory chip area | $0.15 mm^2$ | $0.25 mm^2$ |
| Memory chip area / tag chip area | 15.0% | 22.7% |
| Cost increase due to area | 0 | 10.3% |
| Cost increase due to process | 0 | -30.0% |
| Total cost increase of tag IC | 0 | -19.7% |

Tests performed include qualification of the SPNVM bit cells and validation of the whole memory chip. After a stress of 1000 program/erase cycles, there is no distinct degradation of the readout currents. Even after a stress of 10000 cycles, the memory bit cells still work well and the programmed and erased states can be sensed by the peripheral circuits. The data retention characteristics of the memory bit cells are slightly inferior to those of conventional EEPROM bit cells. The main reason is the relatively thin gate oxide (4nm). Electrons flow through the oxide by a trap-assisted tunneling process<sup>[17]</sup>, so the tunneling probability is higher in a thinner oxide with an equal trap density. In addition, the comparatively large capacitance of the floating gate, and thus the longer program/erase time (10ms), also influences the retention characteristics. Retention can be improved by using differential bit cells, which allow the doubling of the storage window.

A functional test of the whole memory chip was carried out under typical conditions, covering block-erase, block-program, chip-erase, chip-program, bit-read, and address access<sup>[18]</sup>. Different patterns, such as all zeros, all ones, random and diagonal, were applied during programming. The read process then follows, and the read-out data are sent to the logic analyzer to validate the memory chip.

The performance test was carried out by changing the working conditions of the memory chip. Different power supplies and read-out data rates were applied to validate the robustness of the memory chip. When the data rate of the read process is $640 \, \text{kHz}$, the current consumed by the whole chip is $0.8 \, \mu \text{A}$ for the power supply voltage of $1.2 \, \text{V}$.

## 6 Conclusion

A low-power SPNVM has been designed and implemented in an SMIC 0.18µm standard CMOS process, which makes the production cost of the tag chip 19.7% lower than with the conventional EEPROM approach. The SPNVM bit cells have good endurance characteristics and can be cycled up to 10000 times. In addition, a novel, single-ended sense amplifier with a feedback loop was proposed for power consumption optimization. For a majority of applications, such as supply chains, the total operating time of tag chips may be less than 100h and the tag chip will be read or programmed only a limited number of times during its lifetime. These factors lessen the requirements on the tag's memory compared with conventional non-volatile memory. Thus, this low-cost, low-power SPNVM chip is suitable and reliable for RFID technology.

Acknowledgements The authors would like to thank Liu Wenjun and Prof. Li Mingfu from the Device Reliability Research Laboratory at Fudan University for their significant support on the memory chip test.

## References

- [1] Finkenzeller K. RFID handbook: fundamentals and applications in contactless smart cards and identification. 2nd ed. New York: John Wiley & Sons, 1999
- [2] Glidden R, Bockorick C, Cooper S, et al. Design of ultra-low-cost UHF RFID tags for supply chain applications.
IEEE Commun Magazine, 2004, 42(8):140
- [3] Roussos G. Enabling RFID in retail. Computer, 2006, 39(3):25
- [4] Han Yifeng. Research and design of RFID reader. PhD Dissertation, Fudan University, 2005 (in Chinese) [韩益锋. 射频识别阅读器的研究与设计. 复旦大学博士学位论文, 2005]
- [5] Ohsaki K, Asamoto N, Takagaki S. A single poly EEPROM cell structure for use in standard CMOS processes. IEEE J Solid-State Circuits, 1994, 29(3):311
- [6] Raszka J, Advani M, Tiwari V, et al. Embedded flash memory for security applications in a 0.13µm CMOS logic process. IEEE International Solid-State Circuits Conference, 2004, 1:46
- [7] Na K Y, Kim Y S. High-performance single polysilicon EEPROM with stacked MIM capacitor. IEEE Electron Device Lett, 2006, 27(4):294
- [8] Weinberg Z A. On tunneling in metal-oxide-silicon structures. J Appl Phys, 1982, 53(7):5052
- [9] Daga J M, Papaix C, Racape E, et al. A 40ns random access time low voltage 2Mbits EEPROM memory for embedded applications. Proceedings of MTDT Workshop, 2003:81
- [10] Liu Dongsheng, Zou Xuecheng, Zhang Fan, et al. New design of EEPROM memory for RFID tag IC. IEEE Circuits & Devices Magazine, 2006, 22(6):53
- [11] Pelliconi R, Iezzi D, Baroni A, et al. Power efficient charge pump in deep submicron standard CMOS technology. IEEE J Solid-State Circuits, 2003, 38(6):1068
- [12] Ker M D, Chen S L, Tsai C S. Design of charge pump circuit with consideration of gate-oxide reliability in low-voltage CMOS processes. IEEE J Solid-State Circuits, 2006, 41(5):1100
- [13] Rabaey J M, Chandrakasan A, Nikolic B. Digital integrated circuits: a design perspective. 2nd ed. New Jersey: Prentice Hall, 2003
- [14] Conte A, Giudice G L, Palumbo G, et al. A high-performance very low-voltage current sense amplifier for nonvolatile memories. IEEE J Solid-State Circuits, 2005, 40(2):507
- [15] Papaix C, Daga J M. A new single ended sense amplifier for low voltage embedded EEPROM. Proceedings of MTDT Workshop, 2002:149
- [16] Yan Na, Tan Xi, Zhao Dixian. An ultra-low-power embedded EEPROM for passive RFID tags. Chinese Journal of Semiconductors, 2006, 27(6):994
- [17] Maserjian J, Zamani N. Behavior of the Si/SiO<sub>2</sub> interface observed by Fowler-Nordheim tunneling. J Appl Phys, 1982, 53:559
- [18] Sharma A K. Semiconductor memories: technology, testing and reliability. 1st ed. New York: Wiley-IEEE Press, 2002

## A Low-Power, Single-Poly, Non-Volatile Memory for Passive RFID Tags\*

Zhao Dixian<sup>1</sup>, Yan Na<sup>1</sup>, Xu Wen<sup>2</sup>, Yang Liwu<sup>2</sup>, Wang Junyu<sup>1,†</sup>, Min Hao<sup>1</sup>

(1 State Key Laboratory of ASIC & Systems, Fudan University, Shanghai 201203, China)

(2 Design Services, Semiconductor Manufacturing International (Shanghai) Corp., Shanghai 201203, China)

Abstract: For low-cost, low-power passive RFID tags, a single-poly, 576bit non-volatile memory is designed and implemented in an SMIC $0.18\mu m$ standard CMOS process. The memory cell is based on the bi-directional Fowler-Nordheim tunneling effect and is built from ordinary pMOS transistors; the program/erase time is 10ms per 16bit. The chip supports block program and block erase, and a new sense amplifier is proposed to optimize the read power. At a supply voltage of 1.2V and a data rate of 640kHz, the average current consumed by a read operation is about $0.8\mu A$.

Key words: RFID; single-poly; non-volatile memory; standard CMOS process; sense amplifier; low power

**EEACC:** 1265D

CLC number: TN492 Document code: A Article ID: 0253-4177(2008)01-0099-06

<sup>\*</sup> Project supported by the National High Technology Research and Development Program of China (No. 2005AA1Z1300)

<sup>†</sup> Corresponding author. Email: junyuwang@fudan.edu.cn

Received 2007-07-03, finalized 2007-08-24