# 80 Gb/s 2 : 1 multiplexer in 0.13-$\mu$m SiGe BiCMOS technology

Zhao Yan(赵衍)<sup>†</sup>, Wang Zhigong(王志功), and Li Wei(李伟)

(Institute of RF- & OE-ICs, Southeast University, Nanjing 210096, China)

**Abstract:** This work presents an ultra-high-speed 2 : 1 multiplexer (MUX) in a SiGe BiCMOS technology with $f_{\rm T} = 103$ GHz. To boost the operating speed, the architecture is optimized: a 2 : 1 selector circuit directly drives an external 50 $\Omega$ load, and two wideband data buffers and one clock buffer form the input stage. The chip exhibits an open eye at 80 Gb/s with a 160 mV single-ended voltage swing.

**Key words:** Gilbert cell; multiplexer; selector; ultrahigh-speed integrated circuit; wideband buffer

**DOI:** 10.1088/1674-4926/30/2/025008

**EEACC:** 1280; 1350F; 2570A

# 1. Introduction

The ever-increasing data demands of the information age have been driving impressive growth in optical communication technologies. The multiplexer is a key building block in optical communication systems, and the first experimental 80 Gb/s multiplexer IC was reported in 1998<sup>[1]</sup>. In the following years, as fabrication technology improved, research on the multiplexer became competitive<sup>[2-5]</sup>, and the working speed grew exponentially up to 165 Gb/s, a record set in 2006<sup>[6]</sup>. Most of these MUX chips were realized in InP HEMT or HBT processes, which are very advanced technologies. To reduce the fabrication cost, CMOS technologies have also been evaluated<sup>[7-9]</sup>. The highest operating speed reported for these chips is 60 Gb/s<sup>[10]</sup>; CMOS is still unable to operate at 80 Gb/s and above. The rapidly growing demand for communication bandwidth now calls for a next generation of multiplexers with lower costs. The aim of this paper is to demonstrate a 2 : 1 MUX that works at 80 Gb/s but is realized in a SiGe technology with $f_{\rm T} = 103$ GHz, a much cheaper technology than those used in the previously reported MUXs. To the authors' knowledge, this is the highest data rate reported for a MUX in mainland China.

# 2. Circuitry design

Our chip consists of a selector core, two data input buffers, and a clock buffer. The block diagram of the chip is shown in Fig.1. The labels D1 and D2 denote the input data signals, and Clk denotes the input clock signal. For simulation, a high current model (HiCUM) of the bipolar transistors is used. The HiCUM model is expected to be more accurate, especially in predicting device behavior at high currents and in the saturation region. Since the $g_m$ of the transistor drops dramatically at 80 GHz, we designed the 2 : 1 selector with large-emitter transistors so that it directly drives an external 50 $\Omega$ load. The biggest advantage of this approach is a significant improvement of the maximum operating speed, because the chip is no longer subject to the bandwidth limitation of an output buffer. The drawback is that the large transistors in the selector demand a higher driving capability from the input data buffers in the preceding stage. To solve this driving problem, the data input buffers are designed with wideband techniques based on a negative feedback topology. So that the multiplexer can be applied over a wide range of input data rates, the clock buffer is also designed to be wideband.
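The function of the selector core can be illustrated with a purely behavioral sketch. It ignores all analog effects, and the choice of which clock phase selects D1 is an assumption made for illustration, not something stated in the paper. It only shows how two half-rate streams interleave into one full-rate stream.

```python
def mux2to1(d1, d2):
    """Interleave two equal-length half-rate bit streams into one full-rate stream.

    Behavioral model only. Assumption: D1 is passed during the first half of
    each clock period and D2 during the second half.
    """
    if len(d1) != len(d2):
        raise ValueError("D1 and D2 must carry the same number of bits")
    out = []
    for b1, b2 in zip(d1, d2):
        out.append(b1)  # first half of the clock period: D1 selected
        out.append(b2)  # second half of the clock period: D2 selected
    return out


def output_rate_gbps(input_rate_gbps):
    """A 2 : 1 MUX doubles the per-channel data rate."""
    return 2 * input_rate_gbps


if __name__ == "__main__":
    d1 = [1, 0, 1, 1]  # illustrative bit patterns, not measured data
    d2 = [0, 0, 1, 0]
    print(mux2to1(d1, d2))
    print(output_rate_gbps(40), "Gb/s")
```

With two 40 Gb/s inputs, each output bit lasts half of a 40 GHz clock period, which corresponds to the 80 Gb/s output rate targeted here.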

It is a challenge to design an input data buffer when the input data rate approaches half the $f_{\rm T}$ of the technology. To obtain a bandwidth from DC to 40 GHz, a trans-conductance amplifier (TCA) driving a trans-impedance amplifier (TIA) with peaking inductors is used, as shown in Fig.2 (a). To further increase the bandwidth, the TIA, which operates at the input clock frequency, uses peaking inductors, $L_{p1}$ and $L_{p2}$, in series with its load resistors, $R_1$ and $R_2$, which extends the bandwidth from 55 to 70 GHz. $L_{p1}$ and $L_{p2}$ are combined into a center-tapped spiral inductor with a simulated differential inductance of 80 pH. To reduce the attenuation of the emitter followers, all the current sources employ high-breakdown transistors (Q7-Q12), which show a much higher output resistance than the high-performance transistors. The resistors $(R_5-R_6)$ are used to further increase the output resistance of the current source. Considering the tradeoff between output resistance and voltage headroom, the resistors $(R_5 - R_7)$ are 140 $\Omega$, and the others $(R_8 - R_{10})$ are 70 $\Omega$ because the current in the TIA is double that of the TCA. The output signals are fed back to the input of the TIA through Q7, $R_3$ and Q8, $R_4$. The values of $R_3$ and $R_4$ are determined by the bandwidth and stability requirements.



Fig.1. Block diagram of the chip.

© 2009 Chinese Institute of Electronics



Fig.2. (a) Circuit diagram of the TCA-driving-TIA buffer; (b) Simulated frequency response of the buffer.

Figure 2 (b) shows that a bandwidth improvement of 40% can be achieved when peaking inductors are used. The simulated 3 dB bandwidth of the TCA-driving-TIA buffer with peaking inductor is 50 GHz, and the average power gain is 10 dB.
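To give a feel for how a series peaking inductor extends bandwidth, the sketch below evaluates the textbook shunt-peaked load, a resistor $R$ in series with an inductor $L$ driving a capacitance $C$. This is a simplified single-stage model, not the authors' TCA-driving-TIA circuit with feedback. The normalized parameter `m = L/(R^2 C)` and the values swept are assumptions for illustration.

```python
def gain_sq(x, m):
    """|Z(jw)/R|^2 of a shunt-peaked RC load.

    x = w*R*C is the normalized frequency, m = L/(R^2*C) the peaking ratio.
    Z(s) = (R + s*L) / (1 + s*R*C + s^2*L*C).
    """
    return (1 + (m * x) ** 2) / ((1 - m * x * x) ** 2 + x * x)


def bandwidth_ratio(m, x_max=10.0, iters=100):
    """-3 dB frequency relative to the unpeaked RC load (m = 0), by bisection.

    The half-power equation has exactly one positive root for 0 <= m <= 1,
    so bisection on [0, x_max] is sufficient.
    """
    lo, hi = 0.0, x_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gain_sq(mid, m) > 0.5:
            lo = mid  # still above half power: root lies higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for m in (0.0, 0.25, 0.41, 0.5):  # illustrative peaking ratios
        print(f"m = {m:.2f}: bandwidth x{bandwidth_ratio(m):.2f}")
```

In this idealized model a modest peaking ratio already yields a bandwidth extension of a few tens of percent, which is the same order as the improvement reported for Fig.2 (b); the exact figure for the real buffer depends on its feedback network and parasitics.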

The input clock buffer uses the same design as the input data buffer so that the selector can operate over a wide range of speeds, from 20 to 80 Gb/s.

The circuit of the selector core is shown in Fig.3, where a Gilbert cell is used. To drive external 50 $\Omega$ loads, the transistors Q13–Q18 use 2 strip emitters with a width of 8 $\mu$m. They are biased near the optimal current for $f_{\rm T}$ by the current source Q29. On one side of the Gilbert cell, either Q13 or Q14 is on when Q17 is switching, and the on transistor behaves like a common-base circuit. Thus, if Q13 is on, its input impedance $R_{\rm in}$ is $1/g_{\rm m1}$, a very small value, while the output impedance of Q17, $r_{ce}$, is high. Improving the impedance matching between the emitter of Q13 or Q14 and the collector of Q17 therefore raises the working speed of the selector, and a transmission line is inserted between them for this purpose. To further improve the operating speed, inductive peaking is also used. The simulated peaking inductance of $L_1$ or $L_2$ is 180 pH, and the quality factor (Q) is no less than 10 at 80 GHz. Moreover, the input emitter followers, Q19-Q22, are designed with the same emitter area as the transistors in the selector core so that they can supply the large AC current.
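The claim that $1/g_{\rm m1}$ is very small can be checked with the first-order bipolar relation $g_m = I_C/V_T$. The sketch below is only an estimate: the bias currents are hypothetical (the paper does not state them), and parasitic emitter and base resistances are ignored.

```python
K_B = 1.380649e-23     # Boltzmann constant, J/K
Q_E = 1.602176634e-19  # elementary charge, C


def thermal_voltage(temp_k=300.0):
    """Thermal voltage V_T = k*T/q in volts."""
    return K_B * temp_k / Q_E


def common_base_rin(ic_amps, temp_k=300.0):
    """First-order common-base input resistance R_in = 1/g_m = V_T/I_C."""
    gm = ic_amps / thermal_voltage(temp_k)
    return 1.0 / gm


if __name__ == "__main__":
    for ic_ma in (1, 5, 10):  # hypothetical collector currents in mA
        rin = common_base_rin(ic_ma * 1e-3)
        print(f"I_C = {ic_ma} mA -> R_in = {rin:.2f} ohm")
```

Even at a few milliamperes the emitter input resistance is only a few ohms, far below the collector output resistance of the switching transistor, which is why a matching element between the two nodes is worthwhile.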

# 3. Layout considerations

The layout of the 2 : 1 selector chip is designed by using the Cadence Virtuoso Layout Editor. As shown in Fig.4 (a), layout symmetry with respect to the complementary clock and data paths has been carefully maintained. Two pairs of 40 Gb/s data streams are supplied from the left and right sides, and a pair of 40 GHz differential clock signals is injected from the bottom. All the long transmission lines are coupled lines with metal shields on both sides to reduce radiation. The peaking inductors are kept far enough away from each other to suppress crosstalk and maintain their Q-factors. The trench isolators underneath each inductor effectively decrease the spiral-to-substrate capacitance and lower the turn-to-turn coupling.

Fig.3. Circuit diagram of the selector.

Fig.4. Layout design: (a) Layout of the chip; (b) Layout of the input buffer.

The layout of the input data buffer is depicted in Fig.4 (b). The first important principle is to keep the transmission lines connecting the TCA and the TIA ($d_1$) long enough to cancel the input capacitance of the TIA, while the path distance, $d_2$, from the output of the TIA to the input of the selector is kept as short as possible, because extra transmission line length can increase the output resistance of the TIA and thereby cause an impedance mismatch between the TIA and the selector. In addition, the effective length of the peaking inductors is another important issue. Although the main part of the inductor $L_{p1}$ or $L_{p2}$ is fabricated in the top metal layer, its actual inductance is underestimated if only the top metal layer is considered. In differential mode, the effective length of the inductor is measured from the pin of the load resistor ($R_1$ or $R_2$) to the marked virtual ground at the midpoint of the lower metal layer connecting the two inductors.

| Ref.      | Supply voltage (V) | Power consumption (W) | Output swing (V)    | Rms jitter (ps) | Working speed (Gb/s) | Tech.             | $f_{\rm T}$ (GHz) | Utilization efficiency (%) |
|-----------|--------------------|-----------------------|---------------------|-----------------|----------------------|-------------------|-------------------|----------------------------|
| [1]       | -5.2               | 2.7                   | -                   | -               | 80                   | InP HEMT          | 195               | 41                         |
| [2]       | -5.2               | 1.3                   | 0.7                 | -               | 90                   | InP HEMT          | 175               | 51                         |
| [3]       | -3.3               | 1.0                   | 0.75                | -               | 100                  | InP HEMT          | 175               | 57                         |
| [4]       | -3.3               | 1.45                  | 0.25                | 0.34            | 108                  | SiGe BiCMOS       | 210               | 51                         |
| [5]       | -5.7               | 1.23                  | 0.5                 | -               | 144                  | InP HEMT          | 245               | 59                         |
| [6]       | -3.2               | 1.6                   | -                   | -               | 165                  | InP HBT           | 300               | 55                         |
| [7]       | 1.8                | 0.28                  | 0.1                 | -               | 43                   | 90 nm CMOS        | 110               | 39                         |
| [8]       | 1                  | 0.023 <sup>a</sup>    | 0.07                | -               | 50                   | 90 nm CMOS        | 110               | 45                         |
| [9]       | 1.5                | 0.097                 | 0.1                 | -               | 50                   | 0.13 $\mu$m CMOS  | 130               | 38                         |
| [10]      | 0.7                | 0.011 <sup>a</sup>    | 0.05                | -               | 60                   | 90 nm CMOS        | 155               | 39                         |
| This work | -5                 | 1.1                   | > 0.16 <sup>b</sup> | 1.1             | 80                   | SiGe BiCMOS       | 103               | 78                         |

Table 1. Performance comparison of MUXs.

<sup>a</sup>Only a selector is integrated on the chip, without any other buffers, so the power consumption is very small.

<sup>b</sup>Limited by our measurement conditions, the input clock signal is seriously attenuated by the coaxial lines. Using low-loss coaxial lines for the 40 GHz clock signal would significantly improve the output swing.

Fig.5. Micrograph of the MUX chip.

# 4. Measurements

The chip was implemented in a 0.13-$\mu$m SiGe BiCMOS process with a maximum cutoff frequency of 103 GHz. The high-performance NPN transistors are available as single and dual emitter stripe devices with a fixed emitter width of 0.12 $\mu$m. The emitter is processed with a pedestal mask, which results in a lower breakdown voltage but a higher speed. An N-well is used to contact the subcollector, and it wraps around the collector so as to reduce the extrinsic collector resistance.



Fig.6. Measurement setup at 80 Gb/s.

The chip micrograph is shown in Fig.5. It occupies 700  $\times$  900  $\mu$ m<sup>2</sup>.

Figure 6 shows the measurement setup for the selector chip operating at 80 Gb/s. A 40 GHz signal was produced by an R&S SMP04 millimeter-wave signal generator and converted into a differential signal by a balun. Since the balun is only specified up to 20 GHz, phase shifter 2 was used to correct the phase difference between the balun's differential outputs. Meanwhile, the divide-by-2 output from the back panel of the signal generator was sent to the 20 GHz external reference port of an Agilent 40 Gb/s pseudorandom binary sequence (PRBS) generator to clock the PRBS module. To adjust the delay of the input clock signal relative to the data streams, another phase shifter (phase shifter 1) was used. The output and the inverted output of the PRBS module were delayed by two cables (line 1 and line 2) of different lengths to ensure that D1 and D2 were sufficiently independent. Moreover, the clock divide-by-4 output from the Agilent PRBS module was used to trigger the precision time base of an Agilent 86100A wideband oscilloscope. All the data and clock ports must be biased at the design voltage (-1.19 V). At the output port of the chip, a wideband probe and a low-loss cable with 67 GHz bandwidth were used to transmit the 80 Gb/s output data streams to the Agilent oscilloscope.

Fig.7. Measured eye diagram at 80 Gb/s.

The chip consumed 1.1 W from a -5 V power supply. The measured eye diagram is shown in Fig.7. The output data rate is around 80 Gb/s with a 10 GHz trigger frequency. The single-ended voltage swing was 160 mV. The measured jitter at the eye crossing is 6.6 ps peak-to-peak and 1.1 ps rms. Since the PRBS generator itself produced a 40 Gb/s eye diagram with a jitter of 800 fs rms and 3.8 ps peak-to-peak, the 80 Gb/s output jitter of our chip is acceptable. As the figure shows, the eye is not fully open, which is a limitation of the measurement conditions. The balun in Fig.6 has a bandwidth from 10 to 20 GHz, and at 40 GHz it shows a very high loss (9 dB). Together with the 3 dB loss of the phase shifter, the 2 dB loss of the bias-tee, and the 10 dB loss of line 3 and line 4, the total measured loss from the output of the R&S signal generator to the clock input of the chip is 24 dB, which attenuates the maximum 13 dBm output power of the signal generator to -11 dBm. Since such a weak clock signal cannot fully switch the differential transistor pair, the output eye cannot open completely. With low-loss coaxial lines in future measurements, a larger eye opening and a lower jitter are expected. Additionally, the data inputs are designed to receive differential data streams, but because of the available equipment, only two single-ended data streams could be applied in the measurements, which further degrades the symmetry of the eye diagram.
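The arithmetic in this paragraph can be reproduced with a short sketch. The loss and jitter values are taken from the text; the quadrature subtraction of the source jitter is an assumption (it holds only for uncorrelated random contributions and ignores jitter from the clock path), so the "added jitter" figure is a rough indication rather than a measured result.

```python
import math


def clock_power_at_chip(source_dbm, losses_db):
    """Power at the chip's clock input: source power minus all path losses (dB)."""
    return source_dbm - sum(losses_db)


def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def added_rms_jitter(total_ps, source_ps):
    """Jitter added by the chip, ASSUMING uncorrelated contributions
    that combine in quadrature (root-sum-square)."""
    return math.sqrt(total_ps ** 2 - source_ps ** 2)


if __name__ == "__main__":
    losses = {  # values quoted in the text, in dB
        "balun": 9,
        "phase shifter": 3,
        "bias-tee": 2,
        "line 3 and line 4": 10,
    }
    p_chip = clock_power_at_chip(13, losses.values())
    print(f"total loss = {sum(losses.values())} dB, clock at chip = {p_chip} dBm "
          f"({dbm_to_mw(p_chip):.3f} mW)")

    unit_interval_ps = 1e3 / 80  # bit period at 80 Gb/s, in ps
    print(f"unit interval = {unit_interval_ps} ps, "
          f"rms jitter = {1.1 / unit_interval_ps:.1%} of UI")
    print(f"added rms jitter (quadrature assumption) = "
          f"{added_rms_jitter(1.1, 0.8):.2f} ps")
```

The loss budget confirms the 24 dB total and the -11 dBm clock level stated above, which corresponds to less than 0.1 mW reaching the clock input.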

In Table 1, some recently reported MUXs working from 43 to 165 Gb/s are listed. The MUX proposed in this paper is very competitive in technology utilization efficiency, defined as the ratio of working speed to $f_{\rm T}$. Although the designs in CMOS consume very little power, their speed still cannot reach 80 Gb/s, due to the low $g_m$ of the transistors.
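The utilization efficiency column of Table 1 follows directly from this definition. The sketch below recomputes it from the working speed and $f_{\rm T}$ values listed in the table; rounding to the nearest integer percent is an assumption that happens to reproduce every table entry.

```python
# (reference, working speed in Gb/s, f_T in GHz), copied from Table 1
TABLE_1 = [
    ("[1]", 80, 195),
    ("[2]", 90, 175),
    ("[3]", 100, 175),
    ("[4]", 108, 210),
    ("[5]", 144, 245),
    ("[6]", 165, 300),
    ("[7]", 43, 110),
    ("[8]", 50, 110),
    ("[9]", 50, 130),
    ("[10]", 60, 155),
    ("This work", 80, 103),
]


def utilization_percent(speed_gbps, ft_ghz):
    """Technology utilization efficiency: working speed / f_T, in percent."""
    return round(100 * speed_gbps / ft_ghz)


if __name__ == "__main__":
    for ref, speed, ft in TABLE_1:
        print(f"{ref:>9}: {speed:>3} Gb/s / {ft:>3} GHz = "
              f"{utilization_percent(speed, ft)} %")
```

By this metric the present design reaches 78%, compared with at most 59% for the other entries in the table.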

# 5. Conclusion

This paper presents a 2 : 1 multiplexer chip with built-in wideband input buffers that is functional up to 80 Gb/s, demonstrated in a SiGe technology with a maximum $f_{\rm T} = 103$ GHz. The power consumption is 1.1 W from a -5 V supply voltage. Because the chip is realized in a low-cost technology, the result is relevant to cost reduction for next-generation ICs in backbone and metro networks.

#### Acknowledgements

The authors would like to gratefully acknowledge Dr. J. Feng for her valuable discussions and encouragement and Ms. Li Zhang for her help with the chip realization.

#### References

- [1] Otsuji T, Murata K, Enoki T, et al. An 80-Gbit/s multiplexer IC using InAlAs/InGaAs/InP HEMT's. IEEE J Solid-State Circuits, 1998, 33(9): 1321
- [2] Suzuki T, Nakasha Y, Takahashi T, et al. A 90Gb/s 2 : 1 multiplexer IC in InP-based HEMT technology. IEEE Int Solid-State Circuits Conf, Digest Tech Paper, 2002, 1: 192
- [3] Suzuki T, Nakasha Y, Skoda T, et al. A 100-Gbit/s 2 : 1 multiplexer in InP HEMT technology. IEEE Int MTT Symp Digest, 2003, 2: 1173
- [4] Meghelli M. A 108Gb/s 4 : 1 multiplexer in 0.13μm SiGe bipolar technology. IEEE Int Solid-State Circuits Conf, Digest Tech Paper, 2004, 1: 236
- [5] Suzuki T, Nakasha Y, Takahashi T, et al. 144-Gbit/s selector and 100-Gbit/s 4 : 1 multiplexer using InP HEMTs. IEEE Int MTT Symp Digest, 2004, 1: 117
- [6] Hallin J, Kjellberg T, Swahn T. A 165-Gb/s 4 : 1 multiplexer in InP DHBT technology. IEEE J Solid-State Circuits, 2006, 41(10): 2209
- [7] Yamamoto T, Horinaka M, Yamazaki D. A 43Gb/s 2 : 1 selector IC in 90nm CMOS technology. IEEE Int Solid-State Circuits Conf, Digest Tech Paper, 2004, 1: 238
- [8] Yamamoto T, Yamazaki D, Horinaka M. 2-to-1 selector IC in 90-nm CMOS technology operating up to 50 Gb/s. IEEE Compound Semiconductor Integrated Circuit Symposium, 2004: 243
- [9] Kehrer H D, Wohlmuth M W, Knapp H. 50 Gbit/s 2 : 1 multiplexer in 0.13 μm CMOS technology. IEE Electron Lett, 2004, 40(2): 100
- [10] Kehrer D, Wohlmuth H D. A 60-Gb/s 0.7-V 10-mW monolithic transformer-coupled 2 : 1 multiplexer in 90 nm CMOS. IEEE Compound Semiconductor IC Symp, 2004: 105