# 12.5 Gbps 1:16 DEMUX IC with high speed synchronizing circuits\*

Zhou Lei(周磊)<sup>1,2</sup>, Wu Danyu(吴旦昱)<sup>1,2</sup>, Chen Jianwu(陈建武)<sup>1,2</sup>, Jin Zhi(金智)<sup>1,2,†</sup>, and Liu Xinyu(刘新宇)<sup>1,2</sup>

<sup>1</sup>Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China <sup>2</sup>Key Laboratory of Microelectronics Devices & Integrated Technology, Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China

Abstract: A 12.5 Gbps 1:16 demultiplexer (DEMUX) integrated circuit is presented for multi-channel high-speed data transmission. A novel high-speed synchronizing technique is proposed and integrated in this DEMUX chip. Compared with conventional synchronizing techniques, the proposed method largely simplifies the system configuration. The experimental result demonstrates that the proposed circuit is effective in two-channel synchronization under a clock frequency of 12.5 GHz. The circuit is realized using 1  $\mu$ m GaAs heterojunction bipolar transistor technology with die area of 2.3 × 2.3 mm<sup>2</sup>.

Key words: demultiplexer; synchronization; heterojunction bipolar transistor

DOI: 10.1088/1674-4926/32/12/125010

EEACC: 2560J

# 1. Introduction

Optical communication systems place stringent requirements on building blocks such as multiplexers (MUX) and demultiplexers (DEMUX). In particular, at the receiving end, a high-speed, high-order $(1:2^N)$ DEMUX is required. A number of DEMUX integrated circuits operating at data rates above 10 Gbps have been reported using GaAs/InP HBTs<sup>[1–5]</sup>, SiGe HBTs<sup>[6]</sup> and CMOS<sup>[7, 8]</sup>. To further improve the data rate of the communication system, multi-channel transmission is introduced. In such systems, a synchronizing circuit is essential to align the clock signals between independent channels.

Synchronization of high-speed circuits operating at several GHz is a challenging task. Several synchronizing techniques have been proposed previously. One known technique employs a synchronous pulse as a reset signal, as Fig. 1(a) shows. Each channel is synchronized with the rising or falling edge of the reset pulse. However, this reset method requires extra pulse generation circuits. In addition, the propagation-delay variation of the reset pulse needs to be small, which constrains the routing. To overcome this limitation, we propose an improved method for multi-channel synchronization. A block diagram of the proposed synchronizing technique is shown in Fig. 1(b). The proposed method uses the adjacent channel to provide the synchronizing signal, which largely simplifies the system configuration. Details of this synchronizing circuit are given in this paper.

In this paper, we present a 12.5 Gbps 1:16 DEMUX with high-speed synchronizing circuits. The circuit was fabricated using 1  $\mu$ m GaAs HBTs. The circuit architecture and design considerations are presented in this paper. The experimental results demonstrate that the synchronizing circuit is capable of working under 12.5 GHz input clock frequency.

# 2. Circuit description

### 2.1. Circuit architecture

A detailed block diagram of the proposed DEMUX circuit is illustrated in Fig. 2. The chip consists of several building blocks, including the DEMUX core, the synchronizing circuits and the decision circuit. Input data are first sampled and recovered by the decision circuit. Then the DEMUX core converts the high-speed data stream (12.5 Gbps) into 16 parallel low-speed data streams (781.25 Mbps). An LVDS output buffer is integrated to provide a suitable interface with digital processing devices.

A wideband input buffer is also integrated. It isolates the input signal from the decision circuit and reduces kickback noise. A Cherry–Hooper type amplifier with emitter follower feedback<sup>[9]</sup> is employed to expand the input bandwidth. A 3-dB bandwidth of 16.7 GHz is achieved in simulation. The buffer also enhances the sensitivity of the decision circuit. The measured input sensitivity of this DEMUX chip is 35.6 mV<sub>pp</sub>.

The DEMUX core is made up of four stages of DEMUX (1:2) cells in a tree-type architecture. Each DEMUX cell consists of five latches, as Fig. 3 shows. In these circuits, both the rising and falling edges of the clock signal are used as sampling signals, which relaxes the requirement on the switching speed of each latch unit. In order to optimize power consumption and speed, two types of latches are employed: emitter-coupled-logic (ECL) and current-mode-logic (CML). The ECL latch uses an emitter follower to isolate the load resistor from the parasitic capacitance at the output node, which helps it operate at higher speed. The CML latch is comparatively slower but more power efficient. In this design, the first two stages, where the data rates are 12.5 Gbps and 6.25 Gbps, adopt ECL type latches. CML latches are used in the next two stages to reduce power consumption.
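The tree-type architecture described above can be illustrated with a purely behavioral sketch. It is not the authors' circuit: the function names `demux_1to2` and `demux_tree` are hypothetical, each 1:2 cell is idealized as splitting a bit sequence into its even-indexed and odd-indexed bits, and latch-level timing is ignored. The sketch also reproduces the per-stage data rates quoted in the text.

```python
def demux_1to2(stream):
    """Idealized 1:2 cell: route alternate bits to two outputs."""
    return stream[0::2], stream[1::2]


def demux_tree(stream, stages=4):
    """Cascade 1:2 cells in a tree; 4 stages give a 1:16 DEMUX."""
    branches = [stream]
    for _ in range(stages):
        branches = [half for b in branches for half in demux_1to2(b)]
    return branches


# Label each serial bit with its position to see where it ends up.
serial = list(range(32))
leaves = demux_tree(serial)

# Input data rate of each stage, and the final output rate, in bit/s.
input_rate = 12.5e9
stage_rates = [input_rate / 2**k for k in range(4)]
output_rate = input_rate / 2**4

print(len(leaves))   # 16 parallel outputs
print(stage_rates)   # 12.5, 6.25, 3.125 and 1.5625 Gbps
print(output_rate)   # 781.25 Mbps
```

In this idealized model every output carries each 16th bit of the serial stream, and the leaves of the tree appear in bit-reversed order (for example, the second leaf holds bits 8 and 24), so a real design would have to account for the output ordering when mapping leaves to pins.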

<sup>\*</sup> Project supported by the State Key Development Program for Basic Research of China (No. 2010CB327505).

<sup>†</sup> Corresponding author. Email: jinzhi@ime.ac.cn

Received 8 June 2011, revised manuscript received 16 July 2011



Fig. 1. System diagrams of synchronizing technologies. (a) Conventional method. (b) The proposed method.



Fig. 2. Block diagram of the DEMUX circuit.

A frequency divider chain is used to generate a series of clock signals for the DEMUX core. The lowest-frequency clock is sent to the output port and used as the synchronizing signal. If the lowest-frequency clock is synchronized, the higher-frequency clocks will be phase aligned as well.

### 2.2. Synchronizing circuits

The synchronizing circuit is inserted between the input clock buffer and the frequency divider chain. Figure 4 is a simplified diagram of the synchronizing circuits. The output clock signal of channel N+1 (Clk<sub>N+1</sub>) is fed back and compared with the clock signal from the adjacent channel (Clk<sub>N</sub>) in a phase detector (PD). A control voltage ( $V_{ctrl}$ ) is generated by the PD and then gates the input clock through an AND gate. When a phase difference exists between Clk<sub>N+1</sub> and Clk<sub>N</sub>, a pulse is generated by the PD, which in turn blocks the input clock signal. The subsequent DEMUX circuits wait until the synchronizing state is achieved. It takes several clock cycles for channel N+1 to synchronize with channel N. In the worst case, when Clk<sub>N+1</sub> is late by one input clock cycle at the initial point, it takes 32 cycles to reach the synchronizing state, as Fig. 5 shows. In practice, when more than two channels exist, a master channel should be chosen first to provide a reference signal, and then each slave channel is synchronized with the master channel one by one. Compared with the conventional synchronizing technique, the proposed method is more suitable for high-speed operation: because only the adjacent channel provides the synchronizing signal, the propagation-delay variation can be kept small, which largely relaxes the routing rules.

Fig. 4. Simplified diagram of the synchronizing circuits.
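The clock-blocking behavior described above can be explored with a toy cycle-level model. This is a sketch under several assumptions that are not taken from the paper: each channel's divider chain is reduced to a single counter, the output clock period is assumed to be 32 input clock cycles and high during the first half of the period, the PD is an ideal XOR of the two output clock levels with no latency, and the function name `cycles_to_sync` is hypothetical.

```python
def cycles_to_sync(master_phase, slave_phase, period=32, max_cycles=1000):
    """Return the number of input clock cycles until both counters match.

    Hypothetical model: each channel's output clock is a counter modulo
    `period` whose output is high during the first half of the period.
    """
    half = period // 2
    for cycle in range(max_cycles):
        if master_phase == slave_phase:
            return cycle
        master_high = master_phase < half
        slave_high = slave_phase < half
        if master_high == slave_high:
            # Ideal PD (XOR) sees equal levels: the clock passes the AND gate.
            slave_phase = (slave_phase + 1) % period
        # Otherwise the input clock of the slave is blocked and it holds.
        master_phase = (master_phase + 1) % period
    raise RuntimeError("no alignment within max_cycles")


# Slave one input clock cycle late, master just past its rising edge.
worst = cycles_to_sync(master_phase=0, slave_phase=31)
print(worst)  # prints 32
```

In this simplified model the slave stalls whenever the two output clocks disagree, so its phase only slips in one direction until it lines up with the master. The one-cycle-late case starting at the master's rising edge takes 32 cycles here, in line with the figure quoted in the text, although the real circuit includes a DFF-retimed control voltage and buffer delays that the model leaves out.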

In order to handle a high-speed input clock signal, the PD needs to be designed carefully. First, the input bandwidth should be large to ensure that the relative phase of the input signals is detected. When the input data rate is 12.5 Gbps, the minimum time difference (Td) of the input signals is 80 ps. The rise time (RT) is set to be less than 50% of Td in this design, to avoid misjudgments of the PD. According to the relationship between bandwidth and rise time<sup>[10]</sup>:

$$BW = 0.35/RT. \tag{1}$$



Fig. 5. Timing diagram of the proposed synchronizing circuits.



Fig. 6. Micrograph of the DEMUX chip.

In this design, the required bandwidth is calculated to be 8.75 GHz.
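As a worked check of Eq. (1) with the numbers given in the text, writing $T_d$ for Td and taking the rise time at its upper bound of 50% of $T_d$:

$$T_d = \frac{1}{12.5\ \text{GHz}} = 80\ \text{ps}, \qquad RT = 0.5\,T_d = 40\ \text{ps}, \qquad BW = \frac{0.35}{RT} = \frac{0.35}{40\ \text{ps}} = 8.75\ \text{GHz}.$$

A shorter rise time than this bound would call for a proportionally larger PD input bandwidth.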

Second, the control voltage ( $V_{\text{ctrl}}$ ) should be well aligned with the input clock. Any misalignment may result in glitches on $V_{\text{ctrl}}$, which cause erroneous gating of the clock signal. In this paper, the PD is implemented as an XOR gate followed by an ECL D-flip-flop (DFF). The DFF aligns $V_{\text{ctrl}}$ with the rising edges of the input clock. A wideband input buffer is also included in this PD. All the transistors of the differential pairs are biased at peak $f_t$ to meet the requirement on operation speed.

# 3. Circuit implementation and measurements

A 1  $\mu$ m GaAs HBT process with  $f_t$  of 60 GHz was used in the fabrication of this IC. A micrograph of the circuit is shown in Fig. 6. The complete IC has a complexity of 1100 HBTs. The total die area is 2.3 × 2.3 mm<sup>2</sup>.

Figure 7 shows a block diagram of the test system. In order to verify the performance of the synchronization circuit, two individual DEMUX ICs are included on the test board. The first chip is set as the master chip, which provides a reference clock for the second chip. The input clock signal is generated by a microwave source and fed into each chip through a power divider. A LeCroy 7710 and a LeCroy Wave Master 816Zi-A oscilloscope are used to observe the output waveforms.

Table 1. Performance comparison of compound semiconductor DEMUX ICs.

| Parameter | This work | Ref. [2] | Ref. [3] | Ref. [4] | Ref. [11] |
|---|---|---|---|---|---|
| Technology | GaAs HBT 1 $\mu$m | GaAs HBT | InP HBT 1.2 $\mu$m | InP HBT 1 $\mu$m | InP HBT 0.7 $\mu$m |
| Cutoff frequency, $f_t$ (GHz) | 60 | 60 | 123 | 150 | 350 |
| Function | 1:16 | 1:4 | 1:16 | 1:4 | 1:2 |
| Operation data rate (Gbps) | 12.5 | 30 | 11 | 43.2 | 112 |
| Input sensitivity (mV<sub>pp</sub>) | 35.6 | 200 | 50 | 27 | 250 |
| Supply voltage (V) | -5.2 and +3.3 | N/A | -2.6 | +3.3 | -4.4 |
| Power (W) | 3<sup>(1)</sup> | 2.7 | 1 | 3.3<sup>(2)</sup> | 2.2<sup>(2)</sup> |

(1) The power consumption includes the synchronizing circuits.

(2) The power consumption includes the CDR circuits.

Fig. 7. Block diagram of the test system.

Fig. 8. Output clock and data signal at 12.5 GHz input clock and 6.640625 GHz input data.

In order to validate the logic function of the DEMUX circuit, the frequencies of the input data and clock signals are set as follows:

$$F_{\text{data}} = (1/2 + 1/32)F_{\text{clk}}. \tag{2}$$

When the above relationship is met, the sampled data change from 0 to 1 (or 1 to 0) every 16 input clock cycles. After the 1:16 DEMUX circuit, each output data stream exhibits an alternating 0101 pattern. For a 12.5 GHz input clock, a 6.640625 GHz sinusoidal signal is chosen as the input data. Figure 8 shows the output clock and three of the output data signals, which validates the DEMUX function at a 12.5 GHz clock frequency. The clock and data signals are aligned in a double-data-rate (DDR) way.
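The choice of test frequency in Eq. (2) can be checked numerically with a small idealized sketch. It assumes an ideal sampler that decides 1 when the sinusoid is positive at the sampling instant, an ideal 1:16 DEMUX that sends every 16th sample to the same output, and a small fixed phase offset (`PHASE`, a hypothetical value chosen only so that no sample falls exactly on a zero crossing). None of these names or values come from the paper.

```python
import math
from fractions import Fraction

F_CLK = 12.5e9                              # input clock frequency, Hz
RATIO = Fraction(1, 2) + Fraction(1, 32)    # Eq. (2): F_data / F_clk
F_DATA = F_CLK * float(RATIO)               # 6.640625e9 Hz

PHASE = math.pi / 32  # assumed offset to keep samples off zero crossings


def sampled_bits(n_samples):
    """Sample the sinusoidal data once per input clock cycle."""
    bits = []
    for n in range(n_samples):
        angle = 2 * math.pi * float(RATIO) * n + PHASE
        bits.append(1 if math.sin(angle) > 0 else 0)
    return bits


bits = sampled_bits(16 * 8)
# Ideal 1:16 DEMUX: output k receives samples k, k + 16, k + 32, ...
outputs = [bits[k::16] for k in range(16)]

print(F_DATA)         # 6640625000.0
print(outputs[0][:4])  # [1, 0, 1, 0]
print(outputs[1][:4])  # [0, 1, 0, 1]
```

In this model every one of the 16 outputs toggles on each successive output sample, which corresponds to the 0101 pattern used to validate the DEMUX function.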

A comparison of output clock signals before and after synchronizing is illustrated in Fig. 9. The signals are measured single-ended. The measurement is carried out at 12.5 GHz input clock frequency. The experimental result confirms that the synchronizing circuit succeeds in two-chip synchronization.

Fig. 9. Comparison of output clock signals at 12.5 GHz input clock. (a) Before synchronization. (b) After synchronization.

Table 1 gives a comparison between this work and previously published DEMUX ICs fabricated in compound semiconductor technologies. The proposed DEMUX chip achieves a balance among data rate, input sensitivity and power consumption.

# 4. Conclusion

A 12.5 Gbps 1:16 DEMUX integrated circuit is presented using the 1  $\mu$ m GaAs HBT process. An improved synchronization technique is proposed and integrated in the DEMUX circuits. The proposed technique uses the adjacent channel to provide synchronizing signals which largely simplifies the system configuration. The experimental result shows that this technique is effective in two-chip synchronization at 12.5 GHz clock frequency.

# Acknowledgement

The authors genuinely appreciate the help of all the members of the IMECAS compound semiconductor device department.

# References

- [1] Tanaka K, Shikata M, Kimura T, et al. High speed 8:1 multiplexer and 1:8 demultiplexer ICs using GaAs DCFL circuit. 13th Annual Gallium Arsenide Integrated Circuit Symposium Tech Dig, 1991: 229
- [2] Runge K, Zampardi P J, Pierson R L, et al. High speed AlGaAs/GaAs HBT circuits for up to 40 Gb/s optical communication. 19th Annual Gallium Arsenide Integrated Circuit Symposium Tech Dig, 1997: 211
- [3] Ishii K, Nosaka H, Nakajima H, et al. Low-power 1:16 DEMUX and one-chip CDR with 1:4 DEMUX using InP–InGaAs heterojunction bipolar transistors. IEEE J Solid-State Circuits, 2002, 37(9): 1146
- [4] Yen J, Case M G, Nielsen S, et al. A fully integrated 43.2 Gb/s clock and data recovery and 1:4 DEMUX IC in InP HBT technology. IEEE International Solid-State Circuits Conference

- [5] Suzuki Y, Mamada M, Yamazaki Z. Over-100-Gb/s 1:2 demultiplexer based on InP HBT technology. IEEE J Solid-State Circuits, 2007, 42(11): 2594
- [6] Reinhold M, Dorschky C, Rose E, et al. A fully integrated 40-Gb/s clock and data recovery IC with 1:4 DEMUX in SiGe technology. IEEE J Solid-State Circuits, 2001, 36(12): 1937
- [7] Rylyakov A, Rylov S, Ainspan H, et al. A 30Gb/s 1:4 demultiplexer in 0.12  $\mu$ m CMOS. IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, 2003: 176
- [8] Kanda K, Yamazaki D, Yamamoto T, et al. 40 Gb/s 4:1 MUX/1:4 DEMUX in 90 nm standard CMOS. IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, 2005: 152
- [9] Holdenried C D, Haslett J W, Lynch M W. Analysis and design of HBT Cherry-Hooper amplifiers with emitter-follower feedback for optical communications. IEEE J Solid-State Circuits, 2004, 39(11): 1957
- [10] Bogatin E. Signal integrity: simplified. New Jersey: Prentice Hall PTR, 2004
- [11] Makon R E, Driad R, Schubert C, et al. 107–112 Gbit/s fully integrated CDR/1:2 DEMUX using InP-based DHBTs. Proc 5th EUMIC Conf, 2010: 206