# A 10 Gb/s burst-mode clock and data recovery circuit\*

Gu Gaowei(顾皋蔚), Zhu En(朱恩)<sup>†</sup>, Lin Ye(林叶), and Liu Wensong(刘文松)

Institute of RF- & OE-ICs, Southeast University, Nanjing 210096, China

**Abstract:** We introduce a gated oscillator based on XNOR/XOR cells and illustrate its working process. A half-rate BM-CDR circuit based on the proposed oscillator is designed and implemented in SMIC 0.13 $\mu$ m CMOS technology, occupying an area of 675 × 625 $\mu$ m<sup>2</sup>. The measured results show that this circuit can recover clock and data from each 10 Gbit/s burst-mode data packet within 5 bits, and the recovered data pass the eye-mask test defined in IEEE standard 802.3av.

Key words: 10G-EPON; clock and data recovery; burst-mode; gated voltage-controlled-oscillator; frequency locked loop

**DOI:** 10.1088/1674-4926/33/7/075011 **EEACC:** 2570

## 1. Introduction

Driven by explosive user-bandwidth demands, access network technologies have been evolving from copper-based lines to those based on optical fibers<sup>[1]</sup>. The 10G-EPON (10 Gb/s Ethernet passive optical network), which was ratified as IEEE Standard 802.3av in September 2009<sup>[2]</sup>, is one of the most promising technologies for next-generation networks because of its high speed, low cost, and ease of implementation and maintenance. The 10G-EPON system is a point-to-multipoint (P2MP) architecture consisting of an optical line terminal (OLT), multiple optical network units (ONUs) and an optical distribution network of passive optical components. While the downstream data from the OLT is broadcast to all ONUs in continuous mode, the OLT receives burst-mode data packets from multiple ONUs, since a time division multiple access (TDMA) protocol is adopted. The phases of the burst data packets received at the OLT vary arbitrarily due to the different distances travelled. As a result, clock acquisition must be completed within nanoseconds for efficient transfer, and the phase-locked loop clock recovery technique is not suitable in such a situation, as it requires a long locking time of several microseconds. Several multi-gigabit burst-mode clock and data recovery (CDR) circuits have been reported by overseas institutes<sup>[3-6]</sup>.

In this paper, we present a burst-mode CDR circuit for symmetric-rate 10G-EPON working at a half-rate of 5 GHz. The design is implemented in SMIC 0.13 $\mu$ m CMOS technology and tested on-wafer. The recovered clock and data meet the IEEE 802.3av specification.

## 2. Architecture

Gated voltage-controlled oscillators (GVCOs) were widely adopted in earlier PON systems. The architecture of a conventional GVCO-based BM-CDR circuit is illustrated in Fig. 1<sup>[7]</sup>. There are three matched GVCOs. GVCO1 and GVCO2 start and stop oscillating alternately according to the received digits. The sum of their outputs forms a continuous full-rate clock. GVCO3 and a reference PLL provide the control voltage for all three oscillators.

Figure 2 shows a block diagram of the proposed burst-mode CDR, which contains only one inductor-less GVCO. The generated clock edges, either rising or falling, are aligned to the input data. Data is retimed using the D-type flip-flops and the 2 : 1 selector. Instead of a reference PLL, the control voltage in this design is provided by a frequency-locked loop (FLL), and no replica GVCO is needed. The oscillating frequency of the proposed GVCO is half of the data rate. As the input data rate is rather high, the lower center frequency of the VCO reduces the design difficulty for high-speed applications and improves the noise performance.



Fig. 1. A conventional GVCO-based burst-mode CDR.

\* Project supported by the Key Technology Research and Development Program of Jiangsu Province, Industry Part, China (No. BE2008128).

<sup>†</sup> Corresponding author. Email: zhuenpro@seu.edu.cn

Received 1 December 2011, revised manuscript received 8 February 2012



Fig. 2. A block diagram of the presented BM-CDR.



Fig. 3. (a) A half-rate GVCO. (b) An XOR/XNOR cell.

## 3. Circuit design

### 3.1. Half-rate GVCO design

Figure 3(a) illustrates the proposed half-rate GVCO, which is composed of four XOR/XNOR delay cells. All the cells are implemented as fully differential circuits, as shown in Fig. 3(b), so their propagation delays are the same. Suppose that the input data is all '0's: Cells 2, 3 and 4 then work as buffers and Cell 1 works as an inverter, forming a four-stage oscillator. On the other hand, if the input data is all '1's, then Cell 2 works as an inverter and Cells 1, 3 and 4 work as buffers, again forming a four-stage oscillator.

Fig. 4. (a) The clock lags the data. (b) The clock leads the data.

To explain how the proposed GVCO works, we mark four nodes as A, B, C, and D. When a data transition occurs, the waveform at node A (the output node of Cell 1) changes polarity, for example, from charging to discharging. For Cell 2, the two inputs flip at the same time, so the phase of the clock at node B is continuous. The waveforms at the following nodes, C and D, are not directly influenced by the input data. If the clock is synchronous with the data, the waveform at node A is at its zero-crossing when the data edge causes the 180° phase shift, and the time for the following charge or discharge action is unchanged, remaining half of the cell delay. If the clock lags the data, as shown in Fig. 4(a), a data edge arrives before v(A) reaches its threshold, so less charge needs to be transferred and the oscillation is sped up. Conversely, if the clock initially leads the data, as shown in Fig. 4(b), v(A) has already crossed the threshold when the data transition causes the 180° phase shift. It then takes more time to discharge from the present voltage to the bottom level (or to charge to the top level), which slows down the oscillation so that it tracks the input data.
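The cell roles described above can be checked with a small Boolean model. This is only a behavioral sketch: the wiring (Cell 1 as XNOR of the data and node D, Cell 2 as XOR of the data and node A, Cells 3 and 4 as XOR cells with one input tied low) is an assumption chosen to be consistent with the text, since Fig. 3(a) is not reproduced here, and the function names are hypothetical.

```python
# Behavioral (Boolean) sketch of the four-cell gated ring.
# ASSUMPTION: Cell 1 = XNOR(data, D), Cell 2 = XOR(data, A),
# Cells 3 and 4 = XOR with one input tied low (plain buffers).
# The real wiring is defined by Fig. 3(a), which is not available here.

def xor(a: int, b: int) -> int:
    return a ^ b

def xnor(a: int, b: int) -> int:
    return 1 - (a ^ b)

def cell_roles(data: int) -> list[str]:
    """Classify each cell as 'inverter' or 'buffer' for a static data level."""
    cells = [
        lambda x: xnor(data, x),  # Cell 1
        lambda x: xor(data, x),   # Cell 2
        lambda x: xor(0, x),      # Cell 3
        lambda x: xor(0, x),      # Cell 4
    ]
    # A cell that maps 0 to 1 inverts its ring input; otherwise it buffers.
    return ["inverter" if cell(0) == 1 else "buffer" for cell in cells]

def node_b(data: int, d: int) -> int:
    """Logic level at node B for a given data level and level at node D."""
    a = xnor(data, d)    # node A, output of Cell 1
    return xor(data, a)  # node B, output of Cell 2

if __name__ == "__main__":
    for data in (0, 1):
        print(f"data={data}: {cell_roles(data)}")
    for d in (0, 1):
        print(f"D={d}: B(data=0)={node_b(0, d)}, B(data=1)={node_b(1, d)}")
```

In this model exactly one cell inverts for either data level, so the loop always has the odd number of inversions needed to oscillate. Node B depends only on node D, which mirrors the statement that the clock at B stays continuous across a data transition.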

The differential XNOR/XOR cell is realized in the form of a Gilbert cell<sup>[8]</sup>, for its high speed and immunity to common-mode noise. The load at each end of this differential pair is composed of a poly-silicon resistor in parallel with a PMOS transistor operating in the triode region. If there are no data bursts, the input is a long run of successive '0's, and the GVCO's tuning characteristics and phase noise can be roughly calculated<sup>[9]</sup>.

Under the condition that

$$V_{\rm pp} \gg \sqrt{2} V_{\rm effd}, \tag{1}$$

the GVCO works in the large-signal state, with the differential output swing:

$$V_{\rm pp} = IR_{\rm eq}. \tag{2}$$

Denote by $\tau_d$ the cell delay, that is, the time between the zero-crossings of the input and output differential waveforms. As the load is an RC circuit, $\tau_d$ is determined by decaying exponentials:

$$\tau_{\rm d} = \frac{C V_{\rm pp} \ln 2}{I} = R_{\rm eq} C \ln 2. \tag{3}$$

Thus the oscillation frequency is:

$$f_0 = \frac{1}{2M\tau_{\rm d}}. \tag{4}$$

In this design, M is 4. The equivalent load resistance is:

$$R_{\rm eq} = R \parallel \frac{1}{g_{\rm m3,4}} \approx \frac{1}{\dfrac{1}{R} + \mu_{\rm p} C_{\rm ox} \dfrac{W}{L} \left(V_{\rm DD} - V_{\rm ctrl} - |V_{\rm THP}|\right)}. \tag{5}$$

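Substituting Eqs. (3) and (5) into Eq. (4) and differentiating with respect to $V_{\rm ctrl}$, with the load capacitance $C$ written as $C_{\rm load}$, the gain of the VCO is derived as:

$$K_{\rm VCO} = \frac{\partial f}{\partial V_{\rm ctrl}} \approx -\frac{1}{2M \ln 2 \, C_{\rm load}} \mu_{\rm p} C_{\rm ox} \left(\frac{W}{L}\right)_{8,9}. \tag{6}$$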
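As a rough numerical illustration of Eqs. (3) to (6), the sketch below evaluates the oscillation frequency and the tuning gain for one set of component values. Apart from M = 4 and the 1.2 V supply, every value (R, C, the product of mobility and oxide capacitance, W/L, the threshold voltage) is an assumption picked only to land near 5 GHz, and the function names are hypothetical. The model is valid only while the PMOS load stays on and in the triode region.

```python
import math

# All component values below are illustrative ASSUMPTIONS, not design data.
M = 4            # number of delay cells (from the text)
VDD = 1.2        # supply voltage in volts (from the text)
R = 600.0        # poly-silicon resistor in ohms (assumed)
C = 85e-15       # load capacitance per node in farads (assumed)
UP_COX = 80e-6   # mu_p * C_ox in A/V^2 (assumed)
W_OVER_L = 20.0  # aspect ratio of the PMOS load (assumed)
VTHP = 0.35      # |V_THP| in volts (assumed)

def r_eq(v_ctrl: float) -> float:
    """Eq. (5): resistor in parallel with the triode-region PMOS."""
    g_pmos = UP_COX * W_OVER_L * (VDD - v_ctrl - VTHP)
    return 1.0 / (1.0 / R + g_pmos)

def f_osc(v_ctrl: float) -> float:
    """Eqs. (3) and (4): f0 = 1 / (2 * M * R_eq * C * ln 2)."""
    tau_d = r_eq(v_ctrl) * C * math.log(2)
    return 1.0 / (2 * M * tau_d)

def k_vco() -> float:
    """Eq. (6): analytical tuning gain in Hz/V."""
    return -UP_COX * W_OVER_L / (2 * M * math.log(2) * C)

if __name__ == "__main__":
    for v in (0.3, 0.4, 0.5):
        print(f"V_ctrl = {v:.1f} V -> f0 = {f_osc(v) / 1e9:.2f} GHz")
    slope = (f_osc(0.5) - f_osc(0.3)) / 0.2
    print(f"finite-difference gain: {slope / 1e9:.2f} GHz/V")
    print(f"Eq. (6) gain:           {k_vco() / 1e9:.2f} GHz/V")
```

Because the frequency in this simplified model is linear in the control voltage, the finite-difference slope and Eq. (6) agree. The negative sign reflects that raising the control voltage weakens the PMOS load, increases the equivalent resistance and slows the ring.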

Noise in the GVCO can be categorized into two types, white noise and flicker noise, whose contributions can be calculated separately. The single-side-band (SSB) phase noise due to white noise is:

$$L_{\rm BWN}(f) = \frac{2kT}{I\ln 2} \left[ \gamma \left( \frac{3/4}{V_{\rm eff4-8}} + \frac{1}{V_{\rm eff2,3}} + \frac{1}{V_{\rm eff1}} \right) + \frac{1}{V_{\rm op}} \right] \times \left( \frac{f_0}{f} \right)^2. \tag{7}$$

And the SSB phase noise due to flicker noise is:

$$L_{\rm flicker}(f) = \frac{K_{\rm f}}{WLC_{\rm ox}f} \left(\frac{1}{V_{\rm eff1}^2} + \frac{1}{V_{\rm eff2,3}^2}\right). \tag{8}$$

### 3.2. Data recovery circuit

Half-rate data recovery is realized using D-flip-flops (DFFs) and a 2 : 1 selector. Since the supply voltage in standard 0.13 $\mu$ m CMOS technology is only 1.2 V, all the sub-circuits are designed in current-mode logic (CML), which requires less voltage headroom. CML circuits also provide common-mode suppression<sup>[10]</sup>. The internal single-ended signal swing is chosen to be about 400 mV, as a tradeoff between speed and reliability. The sub-circuits are given in Fig. 5.
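The half-rate retiming idea can be sketched at the bit level. The model below is an idealized assumption, not the CML circuit of Fig. 5: one flip-flop is taken to sample on the rising edge of the half-rate clock (even-indexed bits), the other on the falling edge (odd-indexed bits), and latency is ignored. The function name is hypothetical.

```python
def half_rate_retime(bits):
    """Split a full-rate bit stream into two half-rate streams and re-merge.

    ASSUMPTION: the rising-edge flip-flop captures even-indexed bits and the
    falling-edge flip-flop captures odd-indexed bits. The 2:1 selector,
    driven by the same half-rate clock, interleaves the two outputs again.
    """
    even = bits[0::2]  # output of the rising-edge flip-flop
    odd = bits[1::2]   # output of the falling-edge flip-flop
    merged = []
    for i in range(len(bits)):
        clk_phase = i % 2  # 0: first half of the clock period, 1: second half
        source = even if clk_phase == 0 else odd
        merged.append(source[i // 2])
    return even, odd, merged

if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1, 0, 1]
    even, odd, merged = half_rate_retime(data)
    print("even  :", even)
    print("odd   :", odd)
    print("merged:", merged)
```

Each flip-flop only has to toggle at half the data rate, which is the reason a half-rate clock relaxes the speed requirement on the sampling circuits.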



Fig. 5. Circuits of (a) D-latch, (b) D-FF, and (c) 2 : 1 selector.



Fig. 6. (a) Frequency detector, (b) V/I circuit and LF.

### 3.3. Design of the FLL

An FLL is adopted in this design to keep the difference between the oscillator frequency  $(f_{gvco})$  and the half-rate of data transfer  $(f_0)$  within a tolerance of:

$$|f_{\rm gvco} - f_0| < \frac{f_0}{2N_{\rm CID}}. \tag{9}$$
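To get a feel for the bound in Eq. (9), the sketch below evaluates it for a few run lengths. It reads the bound as limiting the phase error accumulated over a run of identical digits to half a bit period, which is an interpretation rather than something the text states. The half-rate frequency is taken from the 10.3125 Gb/s line rate, the run lengths are arbitrary illustrative values, and the function names are hypothetical.

```python
def max_freq_offset(f0_hz: float, n_cid: int) -> float:
    """Right-hand side of Eq. (9): largest tolerable |f_gvco - f0| in Hz."""
    return f0_hz / (2 * n_cid)

def phase_drift_ui(f0_hz: float, f_gvco_hz: float, n_cid: int) -> float:
    """Phase error, in bit periods, accumulated over n_cid bits without edges.

    ASSUMPTION: the oscillator free-runs at f_gvco during the run, so the
    error grows linearly as n_cid * |f_gvco - f0| / f0.
    """
    return n_cid * abs(f_gvco_hz - f0_hz) / f0_hz

if __name__ == "__main__":
    f0 = 10.3125e9 / 2  # half of the line rate in Hz
    for n_cid in (10, 32, 66):  # illustrative run lengths (assumed)
        tol = max_freq_offset(f0, n_cid)
        drift = phase_drift_ui(f0, f0 + tol, n_cid)
        print(f"N_CID = {n_cid:3d}: tolerance = {tol / 1e6:7.2f} MHz, "
              f"drift at the limit = {drift:.2f} UI")
```

The tolerance shrinks in inverse proportion to the run length, so the accuracy required of the FLL is set by the longest run of identical digits the line code allows.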





In Eq. (9), $N_{\text{CID}}$ is the maximum number of consecutive identical digits (CID) in the transferred NRZ data.

The circuit of the frequency detector (FD) and the V/I converter, along with the loop filter (LF), is shown in Fig. 6. The load of the V/I converter adopts a folded current-mirror structure that effectively widens the output voltage range. The output of the V/I converter is single-ended, and the on-chip filter is a second-order $\Pi$-type low-pass filter. Since the capacitance of $C_1$ is relatively large, it is realized with an NMOS transistor. $C_2$ is implemented as a MIM capacitor.

The clock acquisition process of a gated VCO is separated into two steps: a relatively slow frequency acquisition and a fast phase acquisition at each data burst. Figure 7 shows the transient simulation results of these two processes. Once the free-running frequency of a burst-mode VCO is locked, the recovered clock is always aligned to the input data. In this way, input data jitter is transferred to the recovered clock, which results in larger jitter than in continuous-mode CDRs. On the other hand, since the recovered clock can track the input data jitter, this circuit has a large jitter tolerance and a very fast response.

Fig. 7. Transient simulation results. (a) The frequency acquisition process. (b) The phase acquisition process.

## 4. Layout and experimental results

The presented circuit is fabricated in the SMIC 0.13 $\mu$ m CMOS process, occupying an area of 625 $\mu$ m by 675 $\mu$ m. Pads for the input data and reference clock are located at the left side of the chip, and output pads for the recovered clock and data are at the right side to isolate the high-speed input and output signals. Buffers are inserted to drive the 50 $\Omega$ loads of the test instrument. A micro-photo of the chip is given in Fig. 8.

Fig. 8. Die photo of the implemented burst-mode CDR circuit.

The BM-CDR is measured on a Cascade probe station. The supply voltage is 1.2 V and the total power consumption is 74.0 mW. First, we set the input data to all '0's, and the reference clock is a sinusoidal waveform from an Agilent RF signal generator. The GVCO is locked to four times the reference frequency; the locking range is from 4.9 to 6.0 GHz.

An Agilent wide-bandwidth oscilloscope is used to display the waveforms of the recovered data and clock. Given a $2^{23}-1$ pseudo-random bit sequence (PRBS) at 10.3125 Gb/s generated by an Advantest pulse pattern generator, and a reference clock of 1.28906 GHz, the data is correctly recovered; an eye diagram of the recovered clock and data is given in Fig. 9. The 10GBASE-PR PMD eye mask defined in the IEEE 802.3av specification is applied, and 10k waveforms are accumulated without violation. The peak-to-peak voltage ( $V_{p-p}$ ) swing of the recovered data is 171 mV, and that of the recovered clock is 134 mV. The recovered clock jitter is 4.98 ps rms and 24.44 ps peak-to-peak. The recovered data jitter is 5.67 ps rms.

The pattern generator and sampling oscilloscope used can hardly generate and capture the phase jumps of gigabit signals. To examine the fast phase-alignment feature of the BM-CDR, a user-defined pattern is adopted and an error between the reference clock frequency and 1/8 of the data rate is deliberately introduced. A visible phase error between the data and the recovered clock appears after a long sequence of consecutive '0's. This error is quickly eliminated when a data transition occurs; the output clock and data are shown in Fig. 10. The locking time is less than 5 bits.
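The measured figures can be put on a common scale with a few lines of arithmetic. The sketch below uses only numbers quoted in this section (line rate, reference frequency, locking range, rms jitter and the 5-bit locking bound). The function and key names are hypothetical, and expressing jitter in unit intervals (UI) is a convenience added here, not a quantity reported by the measurement.

```python
DATA_RATE = 10.3125e9        # line rate in b/s (from the text)
F_REF = 1.28906e9            # reference clock in Hz (from the text)
CLK_JITTER_RMS = 4.98e-12    # recovered clock jitter in s rms (from the text)
DATA_JITTER_RMS = 5.67e-12   # recovered data jitter in s rms (from the text)
LOCK_BITS = 5                # upper bound on locking time in bits (from the text)

def summarize() -> dict:
    """Derive unit interval, clock ratio, locking time and jitter in UI."""
    ui = 1.0 / DATA_RATE         # one bit period in seconds
    half_rate = DATA_RATE / 2.0  # GVCO operating frequency in Hz
    return {
        "ui_s": ui,
        "half_rate_hz": half_rate,
        "ref_multiple": half_rate / F_REF,  # GVCO frequency over reference
        "lock_time_s": LOCK_BITS * ui,
        "clk_jitter_ui": CLK_JITTER_RMS / ui,
        "data_jitter_ui": DATA_JITTER_RMS / ui,
    }

if __name__ == "__main__":
    s = summarize()
    print(f"unit interval     : {s['ui_s'] * 1e12:.2f} ps")
    print(f"GVCO frequency    : {s['half_rate_hz'] / 1e9:.5f} GHz")
    print(f"GVCO / reference  : {s['ref_multiple']:.4f}")
    print(f"5-bit locking time: {s['lock_time_s'] * 1e9:.3f} ns")
    print(f"clock jitter      : {s['clk_jitter_ui']:.3f} UI rms")
    print(f"data jitter       : {s['data_jitter_ui']:.3f} UI rms")
```

The half-rate frequency falls inside the measured 4.9 to 6.0 GHz locking range and is four times the reference clock, and five bit periods come to slightly under 0.5 ns.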







Fig. 9. An eye diagram of the recovered clock and data.

Fig. 10. Fast phase alignment.

## 5. Conclusion

A burst-mode CDR for 10G-EPON applications is presented. A half-rate GVCO based on differential XNOR/XOR cells is used to align the output clock phase to the input data edge, and an FLL is introduced to provide a proper control voltage. The circuit is fabricated in SMIC 0.13 $\mu$ m CMOS technology and the chip area is 675 × 625 $\mu$ m<sup>2</sup>. Experimental results show that this CDR circuit has a fast phase-alignment capability that makes it suitable for 10G-EPON upstream applications. Given a 200 mV 2<sup>23</sup>–1 PRBS data input, the recovered clock jitter is 4.98 ps rms, and the data jitter is 5.67 ps rms. The locking time is less than 5 bits (about 0.5 ns). The recovered data passed the 10GBASE-PR PMD eye mask defined in the IEEE 802.3av specification. The total power consumption (with buffers) is 74.0 mW at a 1.2 V supply voltage.

### References

- [1] Tanaka K, Agata A, Horiuchi Y. IEEE 802.3av 10G-EPON standardization and its research and development status. J Lightwave Technol, 2010, 28(4): 651
- [2] IEEE Standard 802.3av, 10G EPON task force, 2009
- [3] Liang C F, Hwu S C, Liu S I. A 10 Gbps burst-mode CDR circuit in 0.18 μm CMOS. Proc IEEE Custom Integrated Circuits Conference, 2006: 599
- [4] Cho L C, Lee C, Liu S I. A 33.6-to-33.8Gb/s burst-mode CDR in 90 nm CMOS. IEEE International Solid-State Circuits Conference, Digest of Technical Papers, 2007: 48
- [5] Chu H L, Hsieh C L, Liu S I. 20 Gb/s 1/4-rate and 40 Gb/s 1/8-rate burst-mode CDR circuits in 0.13 μm CMOS. IEEE Asian Solid-State Circuits Conference, 2008: 429
- [6] Terada J, Nishimura K, Kimura S, et al. A 10.3125 Gb/s burst-mode CDR circuit using a $\Delta\Sigma$ DAC. IEEE International Solid-State Circuits Conference, Digest of Technical Papers, 2008: 226
- [7] Banu M, Dunlop A E. A 660 Mb/s CMOS clock recovery circuit with instantaneous locking for NRZ data and burst-mode transmission. IEEE International Solid-State Circuits Conference, Digest of Technical Papers, 1993: 102
- [8] Razavi B. Design of analog CMOS integrated circuits. New York: McGraw-Hill, 2003
- [9] Abidi A. Phase noise and jitter in CMOS ring oscillators. IEEE J Solid-State Circuits, 2006, 41(8): 1803
- [10] Rein H M, Moller M. Design considerations for very-high-speed Si bipolar IC's operating up to 50 Gb/s. IEEE J Solid-State Circuits, 1996, 31(8): 1076