# 20 Gb/s 1:2 Demultiplexer in 0.18µm CMOS<sup>\*</sup>

Wang Gui, Wang Zhigong, Wang Huan, Ding Jingfeng, and Xiong Mingzhen

(Institute of RF- & OE-ICs, Southeast University, Nanjing 210096, China)

**Abstract:** A 1:2 demultiplexer is designed and realized in a standard 0.18µm CMOS technology. A novel high-speed, low-voltage latch is used to realize the core circuit cell. Compared with the traditional source-coupled FET logic latch, it needs a lower supply voltage and is faster. In addition, negative feedback is used in the buffer circuit to widen its bandwidth. Measurement results show that the chip works at a data rate of 20 Gb/s. The supply voltage is 1.8V and the current, including the buffer circuit, is 72mA.

Key words: demultiplexer; latch; CMOS; high-speed circuit
EEACC: 1280; 2570D
CLC number: TN432
Document code: A
Article ID: 0253-4177(2005)10-1881-05

## 1 Introduction

Giga-bit-per-second optical-fiber-link systems have become more important because of the increasing demand for high-speed communications. The demultiplexer (DEMUX) is a key component of the system. The integrated circuit (IC) must handle high-frequency signals whose frequency bands are at least half of the data rate. Fast switching speed is therefore necessary.

Usually, a DEMUX with an operating speed greater than 10 Gb/s is realized in GaAs MESFETs, GaAs HBTs, Si bipolar transistors, SiGe bipolar transistors, or Si BiCMOS<sup>[1-3]</sup>. The power consumption of these ICs, however, is relatively large because their supply voltage is high and their penetration current is large.

CMOS technologies have the advantages of low power consumption and low cost. Traditionally, they have rarely been used in such high-speed systems because the transistors' operating speed was too low. However, with a modern deep-submicron CMOS technology, the switching speed of CMOS transistors is high enough to realize an ultra-high-speed DEMUX. A low-power 0.18µm CMOS realization of a 10 Gb/s MUX/DEMUX has already been reported<sup>[4]</sup>. It consumed only 102mW from a 2V power supply, which is at least 75% less than designs using GaAs MESFET, InP-based HBT, or SiGe technology. However, that design operates only at 10 Gb/s. In this paper, we present a CMOS 20 Gb/s 1:2 DEMUX based on a novel high-speed, low-voltage latch and implemented in a standard 0.18µm CMOS process. The proposed DEMUX operates from a 1.8V power supply.

## 2 Circuit design

The whole circuit consists of input buffers, output buffers, and a 1:2 DEMUX cell, as shown in Fig. 1. In this architecture, the input data and clock signals are first amplified by their respective input buffers and then enter the DEMUX cell. Each data output of the DEMUX cell is likewise amplified by an output buffer. To reject common-mode noise and interference, all input and output signals are differential.

\* Project supported by the National High Technology Research and Development Program of China (No. 2001AA312050)

Wang Gui male, PhD candidate. He works in the field of ultra-high-speed ICs.

Wang Zhigong male, professor. He is engaged in RF- & OE-IC design.

Received 31 March 2005, revised manuscript received 4 June 2005



Fig. 1 Circuit diagram of the DEMUX

Figure 2 shows the block diagram of the 1:2 DEMUX cell. It consists of a master-slave-master flip-flop that captures the leading bit on the positive edge of the clock and a master-slave flip-flop that captures the second bit on the negative edge of the clock. The advantages of this architecture are as follows:

(1) The output signals are synchronous with each other, so no additional flip-flop is needed;

(2) The clock frequency is half of the input data rate, which eases the design.



Fig. 2 Block diagram of the 1:2 DEMUX
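The bit-level behavior of this architecture can be sketched in a few lines of Python. This is only an idealized model under stated assumptions, not the transistor-level circuit: the function name `demux_1to2` is hypothetical, the first bit of the stream is assumed to arrive on a rising clock edge, and the retiming that makes both outputs change together is represented simply by returning the two streams position-aligned.

```python
def demux_1to2(bits):
    """Idealized 1:2 demultiplexer (hypothetical helper, not from the paper).

    Assumption: bits at even positions are captured on rising edges of the
    half-rate clock, bits at odd positions on falling edges.
    """
    if len(bits) % 2 != 0:
        raise ValueError("expected an even number of input bits")
    first = list(bits[0::2])   # leading bit of every pair
    second = list(bits[1::2])  # second bit of every pair
    # Both lists have the same length, so element k of each output belongs
    # to the same half-rate clock cycle (synchronous outputs).
    return first, second
```

Each output list carries half as many bits as the input, which mirrors the statement that the clock runs at half the input data rate.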

Figure 3 shows the timing chart of the 1:2 DEMUX. To obtain the maximum timing margin, the clock edges should be aligned with the center of the data pulse. This can be realized by delaying the clock signal to the proper position relative to the data signal.

Fig. 3 1:2 DEMUX timing chart

The circuit topology of the latch is very important to the whole circuit. It affects not only the circuit speed but also the power dissipation. In this paper, a novel low-voltage, high-speed latch is used in the DEMUX cell, as shown in Fig. 4. This latch, consisting of a differential pair and a regenerative pair, achieves high speed in CMOS technologies. It is derived from the traditional MCML latch<sup>[5,6]</sup> by moving the clock-controlled transistors to the top. In the traditional latch, the transistor working as the current source is stacked under the source-coupled pair transistors. For the transistors to operate correctly, the supply voltage must be high enough for the three-level transistor structure. To realize a low-voltage latch, series gating should therefore be avoided, as in the topology based on triple-tail cells in Fig. 4.

The pairs M1, M2 and M3, M4 form two differential stages that are activated alternately by turning off M5 and M6, respectively. When signal CK is high, transistor M6 deactivates the differential pair M3, M4, hence the output is equal to the input (i.e. the latch is transparent). When signal CK is low (i.e. signal /CK is high), transistor M5 deactivates the differential pair M1, M2, and the output voltage is set by the cross-coupled transistors M3, M4, which hold the previous result (i.e. the latch is in hold mode). To increase the circuit speed, the source follower of the traditional structure is omitted. This may reduce the latch's ability to drive a load, so the output buffer must be designed carefully.



Fig. 4 Low voltage latch
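The transparent and hold modes described above can be illustrated with a small logic-level model. This is a hedged sketch, not the analog circuit: the class names `LevelLatch` and `MasterSlaveFlipFlop` are hypothetical, signals are reduced to 0/1, and the clock polarity chosen for the master and slave latches is an assumption made for the example.

```python
class LevelLatch:
    """Logic-level latch: transparent while ck is high, holding while ck is low."""

    def __init__(self, initial=0):
        self.q = initial

    def update(self, d, ck):
        if ck:
            self.q = d      # transparent mode: output follows the input
        return self.q       # hold mode: previous value is kept


class MasterSlaveFlipFlop:
    """Two latches on opposite clock phases (polarity is an assumption)."""

    def __init__(self):
        self.master = LevelLatch()
        self.slave = LevelLatch()

    def update(self, d, ck):
        # Master is transparent while ck is low, slave while ck is high,
        # so the output changes only when ck goes high.
        m = self.master.update(d, not ck)
        return self.slave.update(m, ck)
```

Feeding the flip-flop the sequence `(d, ck) = (1, 0), (0, 1), (0, 0), (1, 1)` yields `0, 1, 1, 0`: the value present while the clock was low appears at the output once the clock goes high.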

As we know, the charge/discharge time of the output node is decided by the following formula:

$$T = \frac{CV_a}{I_c}$$

where C is the total capacitance of the output node, $V_a$ is the amplitude of the output signal, and $I_c$ is the charge/discharge current. To increase the speed of the latch, C and $V_a$ should be decreased, whereas $I_c$ should be increased. However, $V_a$ cannot be too small and $I_c$ cannot be too large, because extra current increases the power consumption and a sufficient voltage swing is needed to drive the load. There is therefore a trade-off among the W/L of transistors M1 ~ M4, the tail current, and the load resistors $R_1$, $R_2$. To operate effectively, the clock pair (M5, M6) should be much larger than the sample and hold pairs (M1 ~ M4).
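To make the trade-off concrete, the formula can be evaluated numerically. The component values below are illustrative assumptions only (they are not taken from the paper), and `charge_time` is a hypothetical helper name.

```python
def charge_time(c_farads, v_swing, i_amps):
    """Charge/discharge time T = C * Va / Ic, in seconds."""
    if i_amps <= 0:
        raise ValueError("charge/discharge current must be positive")
    return c_farads * v_swing / i_amps


# Assumed example values: 50 fF node capacitance, 0.4 V swing, 1 mA current.
t_base = charge_time(50e-15, 0.4, 1e-3)      # 20 ps
# Doubling the current halves the time, at the cost of higher power.
t_fast = charge_time(50e-15, 0.4, 2e-3)      # 10 ps
```

With these assumed numbers the node settles in 20 ps, which is of the same order as the 50 ps bit period of a 20 Gb/s stream; this is why the swing and node capacitance have to be kept small.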

Both data and clock inputs and outputs are terminated by on-chip polysilicon resistors. These resistors provide a 50 Ω impedance match to avoid reflections. The input buffer is designed to amplify the input signal to the required amplitude. The parallel output data must be buffered to drive the external 50 Ω load. As mentioned above, a latch without a source follower cannot drive a heavy load. The buffer is therefore designed in multiple stages: the differential-pair transistors of the first stage are small, and the W/L ratio of the differential-pair transistors is increased step by step in the following stages. In this design, a three-stage differential amplifier structure is used. The large capacitance of the pads degrades the output signals. To achieve a better eye diagram, negative feedback is applied by connecting a pair of resistors between the input terminals and the output terminals of the last-stage amplifier.
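The benefit of the 50 Ω termination can be quantified with the standard reflection coefficient of a resistive load, Γ = (R − Z0)/(R + Z0). The sketch below is a generic transmission-line calculation, not a result from the paper; the function names are hypothetical and the 60 Ω value is an assumed example of a termination resistor that is 20 % too high.

```python
import math


def reflection_coefficient(r_term, z0=50.0):
    """Voltage reflection coefficient of a resistive termination on a line of impedance z0."""
    return (r_term - z0) / (r_term + z0)


def return_loss_db(r_term, z0=50.0):
    """Return loss in dB; infinite for a perfect match."""
    gamma = abs(reflection_coefficient(r_term, z0))
    if gamma == 0.0:
        return math.inf
    return -20.0 * math.log10(gamma)


gamma_ideal = reflection_coefficient(50.0)   # perfect match, no reflection
gamma_high = reflection_coefficient(60.0)    # assumed +20 % resistor error
```

For the assumed 60 Ω case about 9 % of the incident voltage is reflected, which corresponds to a return loss of roughly 20.8 dB.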

## 3 Layout and fabrication

The DEMUX circuit was designed in a one-poly, six-metal standard 0.18µm CMOS technology and fabricated by Taiwan Semiconductor Manufacturing Co., Ltd. Figure 5 shows the micrograph of the chip. The chip area including bonding pads is $1 \text{ mm} \times 0.9 \text{ mm}$. Only about 4% of the total chip area, in the middle region, is used for the active part.



Fig. 5 Chip micrograph of the 1:2 DEMUX

Because the circuit operates at up to 20 Gb/s, the layout must be designed carefully. The following methods are used to enhance the performance. As the circuit is differential, the layout of the differential signal path is kept strictly symmetrical to suppress common-mode noise and stabilize the high-frequency ground. Minimum-length interconnections are preferred for the signal lines, especially at key nodes such as the drains of the differential-pair transistors. To filter power-supply noise, the parasitic capacitance between supply and ground should be as large as possible. This is achieved by enlarging the overlap area of the supply plane and the ground plane in the layout.

## 4 Measurement results

The performance of the fabricated DEMUX was evaluated in an on-wafer test using a Cascade Microtech probe station, an Advantest D3186 pulse pattern generator, an Agilent 81250 parallel BER tester, an Advantest R6142 programmable DC voltage/current generator, a Rohde & Schwarz SMP04 signal generator (10MHz ~ 40 GHz), and an Agilent 86100A Infiniium DCA wide-bandwidth oscilloscope. The measurement setup is shown in Fig. 6. Both the D3186 and the 81250 are referenced to the clock generated by the SMP04, so that the D3186 provides the differential clock signal and the 81250 provides the differential data signal.



Fig. 6 Measurement diagram

Under the 1.8V DC supply voltage, a current of 72mA was measured for the whole circuit. The 20 Gb/s differential input data, a pseudorandom bit sequence (PRBS) of length $2^{31} - 1$, were obtained from the 81250. The 10 GHz input clock signals provided by the D3186 were also differential. The output data of the DEMUX were analyzed with the Agilent 86100A oscilloscope. Several bias tees were used to realize the AC-coupled connections. Figure 7 shows the measured eye diagram of the DEMUX output signal for a 20 Gb/s input, that is, at an output data rate of 10 Gb/s. The measured eye opening is more than 100mV on the external 50 Ω load. The double traces in the eye are due to the limited bandwidth of the output buffer. Since the rate of the output data signal from the 81250 cannot be adjusted continuously, this measurement setup cannot test the circuit at its operating limit. That is, this DEMUX circuit may operate at a data rate higher than 20 Gb/s.



Fig. 7 Output eye diagram at 20 Gb/s input
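A PRBS of length $2^{31} - 1$ such as the one used for the input data can be generated in software with a linear feedback shift register. The sketch below is an illustration only: the paper does not state which generator polynomial the instrument used, so the commonly used $x^{31} + x^{28} + 1$ is an assumption here, and the function names are hypothetical.

```python
def prbs_bits(order, tap, count, seed=1):
    """Return `count` bits from a Fibonacci LFSR.

    Feedback is the XOR of register bits `order` and `tap` (1-indexed),
    i.e. the polynomial x^order + x^tap + 1.
    """
    mask = (1 << order) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the LFSR locks up")
    out = []
    for _ in range(count):
        new_bit = ((state >> (order - 1)) ^ (state >> (tap - 1))) & 1
        state = ((state << 1) | new_bit) & mask
        out.append(new_bit)
    return out


def prbs31(count, seed=1):
    """PRBS of length 2^31 - 1, assuming the polynomial x^31 + x^28 + 1."""
    return prbs_bits(31, 28, count, seed)
```

The same helper with `order=7, tap=6` gives the short PRBS7 pattern, whose 127-bit period is easy to check by hand; the long period of the $2^{31} - 1$ pattern is what makes it a demanding test, because it contains long runs of identical bits.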

In fact, the maximum operating speed of this circuit can be estimated by the following method. Figure 8 shows the output eye diagram of this DEMUX for a 10 Gb/s input. According to this figure, the rise time and fall time from 10% to 90% of the peak-to-peak voltage are 72ps and 75ps, respectively.



Fig. 8 Output eye diagram at 10 Gb/s input

In the limiting condition, this time $T_{\rm t}$ equals two bit periods, hence the maximum output data rate is

$$R_{\rm max} = \frac{2}{T_{\rm t}} = \frac{2}{183.75\,\text{ps}} \approx 0.01088\ \text{Tb/s} = 10.88\ \text{Gb/s}$$

From this, the maximum input data rate of the 1:2 DEMUX is $2R_{\text{max}}$, that is, 21.76 Gb/s.
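The estimate can be reproduced numerically. The value of 183.75 ps used in the formula is numerically consistent with scaling the sum of the measured 10 % to 90 % rise and fall times to the full swing, (72 + 75)/0.8; the sketch below assumes that reading, and the function name is hypothetical.

```python
def estimate_max_rates(rise_ps, fall_ps, swing_fraction=0.8):
    """Estimate the maximum output and input data rates of the 1:2 DEMUX.

    Assumption: the total transition time is the sum of the 10%-90% rise
    and fall times scaled to the full swing (division by 0.8).
    Returns (total_time_ps, output_rate_gbps, input_rate_gbps).
    """
    total_ps = (rise_ps + fall_ps) / swing_fraction
    # Limiting condition: the total time equals two bit periods.
    output_rate_gbps = 2.0 / total_ps * 1000.0   # bits per ps -> Gb/s
    input_rate_gbps = 2.0 * output_rate_gbps     # 1:2 DEMUX doubles the rate
    return total_ps, output_rate_gbps, input_rate_gbps


total_ps, r_out, r_in = estimate_max_rates(72.0, 75.0)
```

With the measured 72 ps and 75 ps this gives 183.75 ps, about 10.88 Gb/s at each output, and about 21.77 Gb/s at the input, in line with the figures quoted above (which round the output rate to 10.88 Gb/s before doubling).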

## 5 Conclusion

A 1:2 demultiplexer has been designed and fabricated in the TSMC standard 0.18µm CMOS technology. Under a 1.8V supply, this DEMUX works at a data rate of 20 Gb/s. The whole circuit dissipates 129mW and has an active area of 245µm × 150µm. These results show that the novel latch can be used in ultra-high-speed circuits in a deep-submicron CMOS process.
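The headline figures can be cross-checked against each other with simple arithmetic. The helper names below are hypothetical; the input numbers are the ones reported in the paper.

```python
def power_mw(supply_v, current_ma):
    """DC power in mW from supply voltage (V) and current (mA)."""
    return supply_v * current_ma


def active_area_fraction(active_um, chip_mm):
    """Fraction of the chip area occupied by the active part.

    active_um: (width, height) in micrometres
    chip_mm:   (width, height) in millimetres
    """
    active = active_um[0] * active_um[1]
    chip = (chip_mm[0] * 1000.0) * (chip_mm[1] * 1000.0)
    return active / chip


p = power_mw(1.8, 72.0)                                   # 129.6 mW
frac = active_area_fraction((245.0, 150.0), (1.0, 0.9))   # about 0.041
```

The product of 1.8V and 72mA is 129.6mW, which matches the quoted 129mW after truncation, and the 245µm × 150µm active area is about 4% of the 1 mm × 0.9 mm chip.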

## References

- [1] Shioiri S, Soda M, Hashimoto T, et al. A 10-Gb/s SiGe bipolar framer/demultiplexer for SDH systems. ISSCC Dig, 1998:202
- [2] Yoshida N, Fujii M, Atsumo T, et al. Low-power-consumption 10-Gb/s GaAs 8:1 multiplexer/1:8 demultiplexer. IEEE GaAs IC Symp, 1997:113
- [3] Hauenschild J, Dorschky C, Seitz R, et al. A 10 Gb/s BiCMOS clock and data recovering 1:4 demultiplexer in standard plastic package with external VCO. ISSCC Dig, 1996:202
- [4] Tanabe A, Umetani M, Fujiwara I, et al. 0.18µm CMOS 10 Gb/s multiplexer/demultiplexer ICs using current mode logic with tolerance to threshold voltage fluctuation. IEEE J Solid-State Circuits, 2001, 36(6):988
- [5] Wang Zhigong. IC design for optic-fiber communications. Beijing: Higher Education Press, 2003
- [6] Yamashina M, Yamada H. An MOS current mode logic (MCML) circuit for low-power sub-GHz processors. IEICE Trans Electron, 1992, E75-C:1181

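## Design of a 20 Gb/s 1:2 Demultiplexer in 0.18µm CMOS\*

#### Wang Gui, Wang Zhigong, Wang Huan, Ding Jingfeng, and Xiong Mingzhen

(Institute of RF- & OE-ICs, Southeast University, Nanjing 210096, China)

Abstract: A 1:2 demultiplexer is designed and implemented in a standard 0.18µm CMOS process. The core circuit cell uses a new high-speed, low-voltage latch structure. Compared with a latch in the traditional source-coupled FET logic structure, it has a lower supply voltage and a higher speed. In addition, negative feedback is used in the buffer amplifier to extend the bandwidth. Measurement results show that the chip works at a data rate of 20 Gb/s. With a 1.8V supply, the whole chip, including the buffer circuits, draws 72mA.

Key words: demultiplexer; latch; CMOS; high-speed circuit
EEACC: 1280; 2570D
CLC number: TN432
Document code: A
Article ID: 0253-4177(2005)10-1881-05

<sup>\*</sup> Project supported by the National High Technology Research and Development Program of China (No. 2001AA312050)

Wang Gui male, PhD candidate, mainly engaged in the design of ultra-high-speed integrated circuits.

Wang Zhigong male, professor, engaged in the design of RF and optoelectronic integrated circuits.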