# High performance power-configurable preamplifier in a high-density parallel optical receiver\*

Wang Xiaoxia(王晓霞) and Wang Zhigong(王志功)<sup>†</sup>

Institute of RF- & OE-ICs, Southeast University, Nanjing 210096, China

Abstract: A power-configurable high-performance preamplifier was implemented in standard 180-nm CMOS technology for a $12 \times 10$ Gb/s high-density ultra-high-speed parallel optical communication system. Under tight constraints on power consumption, area and fabrication cost, the preamplifier achieves high bandwidth, high trans-impedance gain, low noise and high stability. A novel feed-forward common gate (FCG) stage is adopted to ease the trade-off between trans-impedance gain and bandwidth: it isolates the large input capacitance while consuming little voltage headroom, and it uses complex-pole peaking instead of inductors to extend the bandwidth. A multi-supply power-configurable scheme was employed to avoid the power wasted by a pessimistic estimation of process-voltage-temperature (PVT) variation. Two representative samples provide a trans-impedance gain of 53.9 dB$\Omega$, a 3-dB bandwidth of 6.8 GHz and a power dissipation of 6.26 mW without power-configuration, and a trans-impedance gain of 52.1 dB$\Omega$, a 3-dB bandwidth of 8.1 GHz and a power dissipation of 6.35 mW with power-configuration, respectively. The measured average input-referred noise-current spectral density is no more than 28 pA/$\sqrt{\text{Hz}}$. The chip area is only 0.08 × 0.08 mm<sup>2</sup>.

Key words: preamplifier; parallel optical receiver; low power; low cost; feed-forward common-gate (simplified FCG) stage; power-configurable

**DOI:** 10.1088/1674-4926/33/1/015004 **EEACC:** 4270; 1220

## 1. Introduction

Fiber-optic communication has migrated from telephony and wide-area-network infrastructures, where low attenuation and high information capacity are of most concern, to shorter scales such as storage area networks, memory links and even chip-scale global signals, where electrical interconnects are undesirable because of explosively growing data streams and processing demands<sup>[1]</sup>. The high-density parallel optical communication system is becoming the preferred choice for future short-reach links thanks to its ultra-high data rate and low cost. The preamplifier, the most critical module of a parallel optical receiver, must meet specifications such as high bandwidth, high trans-impedance gain and high sensitivity under much tighter constraints than those in a traditional long-haul optical communication system. High-performance designs that use low-cost processes, occupy little area and consume little power, thereby easing the heat-dissipation problem of high-density analog chips, are highly preferred. In addition, it is much more difficult to achieve a broad bandwidth because of the limited $f_{\rm T}$ of low-cost, less-advanced CMOS technology. Peaking techniques, e.g. inductive peaking, capacitive peaking and transformer peaking<sup>[2-9]</sup>, are the usual solutions. However, the channel pitch of a parallel receiver is specified to be 250 $\mu$m, which leaves no room for such area-consuming techniques. Furthermore, power consumption ($P_{\rm dc}$) is always an important constraint on preamplifier performance, especially for high-density parallel systems. For a fair comparison between different designs, the figure-of-merit (FOM) is defined as the gain-bandwidth (GBW) product per unit power and unit area.
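The FOM definition can be made concrete with a short numerical sketch. The function names below are illustrative and not taken from the paper; the inputs are the values quoted in the abstract for the sample measured without power-configuration, and the trans-impedance gain is assumed to follow the usual $20\log_{10}$ dB$\Omega$ convention.

```python
def dbohm_to_ohm(z_dbohm: float) -> float:
    """Convert a trans-impedance gain from dB-ohm to ohm (20*log10 convention)."""
    return 10.0 ** (z_dbohm / 20.0)


def fom(z_dbohm: float, f3db_ghz: float, p_mw: float, area_mm2: float) -> float:
    """Gain-bandwidth product per unit power and unit area, in ohm*GHz/mW/mm^2."""
    gbw = dbohm_to_ohm(z_dbohm) * f3db_ghz  # ohm * GHz
    return gbw / p_mw / area_mm2


# Values quoted in the abstract (sample without power-configuration).
fom_sample = fom(z_dbohm=53.9, f3db_ghz=6.8, p_mw=6.26, area_mm2=0.08 * 0.08)
print(f"FOM = {fom_sample / 1e3:.1f} kOhm*GHz/mW/mm^2")
```

Under these assumptions the result is close to the 84 k$\Omega\cdot$GHz/mW/mm$^2$ listed for this work in Table 3.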

In this design, a novel feed-forward common gate (FCG) stage replaces the traditionally adopted preamplifier topologies. It resolves the conflicts between the specifications and achieves an excellent FOM at minimum cost in area and power consumption.

In an ultra deep submicron (UDSM) CMOS process, PVT variations make it hard to achieve a "safe" design unless an overly conservative design with a high safety margin is adopted. If measurement shows that the chip operates in better conditions than the worst case, the extra power dissipated is the price of the overestimation. For a high-density parallel optical communication system, the wasted power is multiplied by the number of channels and adds to the already troublesome heat-dissipation problem. The heat generated may degrade the chip performance and increase the device failure rate, which raises the cost in the long term. Therefore, it is worthwhile to find an adaptive solution that both ensures a safe design and avoids unnecessary power consumption in a parallel system, because the benefit is multiplied by the channel count whereas the cost overhead is shared among the channels. In this design, a power-configurable scheme adaptively changes the supply voltage of the preamplifier to keep the DC current at a satisfactory level.

<sup>\*</sup> Project supported by the National Natural Science Foundation of China (No. 61106024) and the Natural Science Foundation of Jiangsu Province, China (No. BK2010411).

<sup>†</sup> Corresponding author. Email: zgwang@seu.edu.cn Received 11 July 2011, revised manuscript received 19 August 2011



Fig. 1. Typical topologies of single-stage (a) RSF, (b) CG, (c) RGC and (d) FCG stages.

## 2. FCG preamplifier analysis

With the down-scaling of CMOS processes, conventionally adopted topologies such as cascodes and source followers are no longer suitable, because the supply voltage scales down faster than the transistor threshold voltage and the design headroom shrinks accordingly. UDSM technologies do not allow two gate-source voltages to be stacked while maintaining maximum circuit speed at a low supply voltage<sup>[8]</sup>. The following single-stage topologies are possible options in UDSM:

- (1) Resistive shunt feedback (RSF) preamplifier,
- (2) Common gate (CG) preamplifier,
- (3) Regulated cascode (RGC) preamplifier,
- (4) Feed-forward common gate (FCG) preamplifier.

Typical topologies of single-stage RSF, CG, RGC and FCG stages are shown in Figs. 1(a), 1(b), 1(c) and 1(d).

The absolute values of the low-frequency trans-impedance gains ( $Z_{T\_RSF}$ ,  $Z_{T\_CG}$ ,  $Z_{T\_RGC}$ ,  $Z_{T\_FCG}$ ) and the constraint conditions (CC\_RSF, CC\_CG, CC\_RGC, CC\_FCG) under which the approximations hold for each input stage are:

$$Z_{\text{T}\_\text{RSF}} \approx \left| \frac{-R_1(g_{\text{m1}}R_{\text{f}} - 1)}{g_{\text{ds1}}R_1 + g_{\text{m1}}R_1 + 1} \right| \approx R_{\text{f}},$$

$$\text{CC}\_\text{RSF}: \ g_{\text{m1}}R_{\text{f}} \gg 1; \ g_{\text{m1}}R_1 \gg 1 + g_{\text{ds1}}R_1, \tag{1}$$

$$Z_{\text{T}\_\text{CG}} \approx \frac{R_1(g_{\text{m1}} + g_{\text{mb1}} + g_{\text{ds1}})}{g_{\text{m1}} + g_{\text{mb1}} + g_{\text{ds1}} + g_{\text{dss}}(1 + g_{\text{ds1}}R_1)} \approx R_1,$$

$$\text{CC}\_\text{CG}: \ g_{\text{m1}} + g_{\text{mb1}} + g_{\text{ds1}} \gg g_{\text{dss}}(1 + g_{\text{ds1}}R_1), \tag{2}$$

$$Z_{\text{T}\_\text{RGC}} \approx \frac{g_{\text{m1}}R_1(g_{\text{m3}}R_3 + 1)}{g_{\text{m1}}(g_{\text{m3}}R_3 + 1) + g_{\text{ds4}}} \approx R_1,$$

$$\text{CC}\_\text{RGC}: \ g_{\text{ds1,3,4}} \approx 0, \tag{3}$$

$$Z_{\text{T}\_\text{FCG}} \approx \frac{* + g_{\text{mb1}}R_1}{* + **} \approx R_1,$$

$$* = R_1g_{\text{m1}}[1 + (g_{\text{m2}} + g_{\text{mb2}})R_2g_{\text{m3}}R_3],$$

$$** = g_{\text{m2}} + g_{\text{mb2}} + g_{\text{mb1}} + g_{\text{ds4}},$$

$$\text{CC}\_\text{FCG}: \ * \gg **; \ g_{\text{ds1,3,4}} \approx 0; \ * \gg g_{\text{mb1}}R_1. \tag{4}$$



Fig. 2. AC response comparison for the same  $R_1$  ( $R_f$ ).

CC<sub>\_RSF</sub> and CC<sub>\_CG</sub> are harder to meet than CC<sub>\_RGC</sub> and CC<sub>\_FCG</sub> in UDSM. Consequently, with the same $R_1$ in CG, RGC and FCG and the same $R_{\rm f}$ in RSF (e.g. 470 $\Omega$, equivalent to 53.4 dB$\Omega$), $Z_{T\_RSF}$ and $Z_{T\_CG}$ (49.0 and 52.4 dB$\Omega$) are smaller than $Z_{T\_RGC}$ and $Z_{T\_FCG}$ (both 53.1 dB$\Omega$). The simulated AC responses are shown in Fig. 2. The 3-dB bandwidths ($f_{3dB\_RSF}$, $f_{3dB\_CG}$, $f_{3dB\_RGC}$, $f_{3dB\_FCG}$) are 2.6, 1.9, 4.8 and 9.5 GHz, respectively. Since a bandwidth above 8 GHz is expected in the typical case for a 10 Gb/s data rate, only the FCG configuration provides enough bandwidth while achieving adequate trans-impedance gain in a single stage.
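As a quick check of the numbers in this comparison, the sketch below converts the 470 $\Omega$ load to dB$\Omega$ and reports how far each simulated gain falls below that ideal value. The dictionary and variable names are illustrative only; the gain figures are the ones quoted in the paragraph above.

```python
import math


def ohm_to_dbohm(r_ohm: float) -> float:
    """Express a resistance (trans-impedance) in dB-ohm."""
    return 20.0 * math.log10(r_ohm)


r_load = 470.0  # ohm, R_1 (or R_f for RSF) as quoted in the text
ideal_dbohm = ohm_to_dbohm(r_load)

# Simulated low-frequency gains quoted in the text, in dB-ohm.
simulated_dbohm = {"RSF": 49.0, "CG": 52.4, "RGC": 53.1, "FCG": 53.1}
shortfall_db = {name: ideal_dbohm - z for name, z in simulated_dbohm.items()}

print(f"ideal: {ideal_dbohm:.2f} dBOhm")
for name, loss in shortfall_db.items():
    print(f"{name}: {loss:.2f} dB below the ideal value")
```

The ordering of the shortfalls mirrors the statement that the RSF and CG constraint conditions are the hardest to satisfy.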

On the other hand, to avoid area-consuming peaking techniques for bandwidth compensation, the poles at the input and output nodes should be carefully placed above 8 GHz. Even with the most optimistic estimation, the capacitive loading at the input node ( $C_{pd}$ ) is at least 300 fF. To effectively isolate  $C_{pd}$ , the low-frequency input resistance ( $Z_{in}$ ) should be no more than 66  $\Omega$ . For the four topologies:

$$Z_{\text{in}\_\text{RSF}} \approx \frac{R_1 + R_{\text{f}} + g_{\text{ds1}} R_1 R_{\text{f}}}{g_{\text{ds1}} R_1 + g_{\text{m1}} R_1 + 1} \approx 1/g_{\text{m1}}, \tag{5}$$

$$Z_{\text{in}\_\text{CG}} \approx \frac{g_{\text{ds1}} R_1 + 1}{g_{\text{m1}} + g_{\text{mb1}} + g_{\text{ds1}} + g_{\text{dss}}(g_{\text{ds1}} R_1 + 1)} \approx 1/(g_{\text{m1}} + g_{\text{mb1}}), \tag{6}$$



Fig. 3. AC responses with equal  $f_{3dB}$  and  $P_{dc}$ .

$$Z_{\text{in}\_\text{RGC}} \approx \frac{1}{g_{\text{m1}}} \frac{1}{1 + g_{\text{m3}} R_3}, \tag{7}$$

$$Z_{\text{in}\_\text{FCG}} \approx \frac{1}{g_{\text{m1}}[1 + (g_{\text{m2}} + g_{\text{mb2}})R_2g_{\text{m3}}R_3] + **} \approx \frac{1}{g_{\text{m1}}} \frac{1}{1 + (g_{\text{m2}} + g_{\text{mb2}})R_2g_{\text{m3}}R_3}. \tag{8}$$

For RSF and CG,  $g_{m1}$  must be larger than 15 mS for the input pole to meet this requirement. In this design, the upper limit of the power consumption per channel is 8 mW. With this constraint and the bandwidth constraint of 8 GHz, the other parameters can be derived with reasonable biasing. The RGC and FCG stages reduce the low-frequency input impedance of the RSF and CG configurations by factors of  $1+g_{m3}R_3$  and  $1+(g_{m2} + g_{mb2})R_2g_{m3}R_3$ , respectively. Therefore, the same  $Z_{in}$  can be obtained in RGC and FCG with a smaller  $g_{m1}$ , which avoids a large current through  $R_1$  and hence allows larger values of  $R_1$  than in RSF and CG. This is confirmed by the higher  $Z_{T\_RGC}$  and  $Z_{T\_FCG}$  in Fig. 3, which shows the simulated AC curves of the four circuits with device parameters optimized to achieve approximately the same bandwidth and power consumption.
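The input-pole budget behind the 66 $\Omega$ and 15 mS figures can be reproduced with a single-pole RC estimate, $f = 1/(2\pi Z_{in} C_{pd})$. The sketch below is only an illustration: the boost factor is a hypothetical number chosen for the example, not a value from this design, and $g_{mb1}$ is ignored for the CG case.

```python
import math


def max_input_resistance(f_pole_hz: float, c_in_f: float) -> float:
    """Largest resistance that keeps the RC pole 1/(2*pi*R*C) at or above f_pole_hz."""
    return 1.0 / (2.0 * math.pi * f_pole_hz * c_in_f)


f_target = 8e9    # Hz, bandwidth target from the text
c_pd = 300e-15    # F, most optimistic input capacitance from the text

z_in_max = max_input_resistance(f_target, c_pd)  # ohm
gm1_min_plain = 1.0 / z_in_max                   # S, RSF/CG: Z_in ~ 1/g_m1

# Hypothetical boost (NOT from the paper): stands for g_m3*R_3 in RGC
# or (g_m2 + g_mb2)*R_2*g_m3*R_3 in FCG, see Eqs. (7) and (8).
assumed_boost = 4.0
gm1_min_boosted = gm1_min_plain / (1.0 + assumed_boost)

print(f"Z_in must stay below {z_in_max:.1f} ohm")
print(f"g_m1 without boosting: {gm1_min_plain * 1e3:.1f} mS")
print(f"g_m1 with assumed boost of {assumed_boost:g}: {gm1_min_boosted * 1e3:.1f} mS")
```

Whatever the actual boost is, the required $g_{m1}$ scales down by the factor in Eq. (7) or Eq. (8), which is what frees current and headroom for a larger $R_1$.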

The FCG stage reduces the input impedance more efficiently than the RGC stage, as indicated by  $f_{3dB\_FCG}$  exceeding  $f_{3dB\_RGC}$  (9.6 GHz versus 5.9 GHz). One reason is that the feed-forward path in FCG introduces two complex poles and two zeros. It is well known that a complex-pole system can exhibit resonance (peaking), which extends the bandwidth. In addition, the most attractive feature of the FCG configuration is that it achieves the input impedance reduction with a lower  $V_i$ , which not only alleviates the degradation of  $g_{m1}$  caused by the body effect but also releases more voltage headroom for  $R_1$  and thus boosts the trans-impedance gain. As shown in Fig. 3, the FCG stage achieves the largest trans-impedance gain,  $Z_{T\_FCG}$ = 53.1 dB$\Omega$, among the four stages at equal bandwidth and power consumption. With the advantages of low input impedance and economical headroom consumption, a higher FOM is achieved.
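The bandwidth benefit of complex poles can be illustrated with a generic all-pole second-order low-pass response, $H(s) = 1/(1 + s/(Q\omega_n) + s^2/\omega_n^2)$. This is a textbook model used here as an assumption; it ignores the two zeros of the FCG stage and is not the transfer function of this design. Setting $|H|^2 = 1/2$ gives a quadratic in $x = (\omega_{3dB}/\omega_n)^2$, which the sketch solves in closed form.

```python
import math


def normalized_bandwidth(q: float) -> float:
    """-3 dB bandwidth, in units of the natural frequency w_n, of
    H(s) = 1 / (1 + s/(q*w_n) + (s/w_n)**2)."""
    b = 2.0 - 1.0 / q**2
    # Positive root of x**2 - b*x - 1 = 0, where x = (w_3dB / w_n)**2.
    x = (b + math.sqrt(b * b + 4.0)) / 2.0
    return math.sqrt(x)


bw_real = normalized_bandwidth(0.5)                     # two coincident real poles
bw_complex = normalized_bandwidth(1.0 / math.sqrt(2.0))  # complex pair, maximally flat

print(f"real double pole : {bw_real:.3f} * w_n")
print(f"complex pole pair: {bw_complex:.3f} * w_n")
print(f"extension factor : {bw_complex / bw_real:.2f}")
```

For the same natural frequency, moving from coincident real poles to a maximally flat complex pair widens the bandwidth by roughly a factor of 1.5 in this simplified model.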



Fig. 4. FCG with power-configurable scheme.

The noise analysis is based on the Van der Ziel MOSFET noise model<sup>[3, 8]</sup>. The MOSFET noise current spectral density is comprised of three terms:

(1) The mean-square channel thermal noise current spectral density:

$$\overline{I_{\rm n,d}^2} = 4kT\alpha g_{\rm m},\tag{9}$$

where  $\alpha = \gamma g_{d0}/g_m$ ,  $\gamma \approx 1.2$  and  $g_{d0}$  is the zero-bias drain conductance.

(2) The mean-square induced gate noise current spectral density:

$$\overline{I_{n,g}^2} = 4kT\delta g_g,\tag{10}$$

where  $\delta \approx 4/15$ ,  $g_g = (\omega C_0)^2/g_{d0}$  and  $C_0$  is the gate-oxide capacitance of the MOSFET.

(3) The cross-correlation of the channel thermal noise and the induced gate noise:

$$\overline{I_{n,d}} \, \overline{I_{n,g}^*} = c \sqrt{\overline{I_{n,d}^2} \, \overline{I_{n,g}^2}},\tag{11}$$

where  $c \approx -0.4i$ .
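Equations (9) to (11) translate directly into a few helper functions. The sketch below uses the coefficients stated above ($\gamma \approx 1.2$, $\delta \approx 4/15$, $c \approx -0.4i$); the device values ($g_{d0}$, $C_0$, frequency, temperature) are hypothetical numbers picked only to exercise the formulas and do not come from this design.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def channel_thermal_noise(g_d0, temp_k=300.0, gamma=1.2):
    """Eq. (9): 4kT*alpha*g_m with alpha = gamma*g_d0/g_m, i.e. 4kT*gamma*g_d0 (A^2/Hz)."""
    return 4.0 * K_BOLTZMANN * temp_k * gamma * g_d0


def induced_gate_noise(freq_hz, c_0, g_d0, temp_k=300.0, delta=4.0 / 15.0):
    """Eq. (10): 4kT*delta*g_g with g_g = (omega*C_0)^2 / g_d0 (A^2/Hz)."""
    omega = 2.0 * math.pi * freq_hz
    g_g = (omega * c_0) ** 2 / g_d0
    return 4.0 * K_BOLTZMANN * temp_k * delta * g_g


def cross_correlation(i_nd2, i_ng2, c=-0.4j):
    """Eq. (11): c * sqrt(I_nd^2 * I_ng^2); complex because c is imaginary."""
    return c * math.sqrt(i_nd2 * i_ng2)


# Hypothetical device values (assumptions for illustration only).
g_d0 = 15e-3   # S
c_0 = 50e-15   # F
freq = 5e9     # Hz

i_nd2 = channel_thermal_noise(g_d0)
i_ng2 = induced_gate_noise(freq, c_0, g_d0)
i_cross = cross_correlation(i_nd2, i_ng2)

print(f"channel thermal noise: {math.sqrt(i_nd2) * 1e12:.1f} pA/sqrt(Hz)")
print(f"induced gate noise   : {math.sqrt(i_ng2) * 1e12:.2f} pA/sqrt(Hz)")
print(f"cross-correlation    : {i_cross:.3e} A^2/Hz")
```

With these assumed values the channel thermal noise dominates, while the induced gate noise grows with the square of frequency and matters more toward the band edge.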

The total equivalent output noise voltage spectral density collects the contributions of all noise current sources. It is obtained by summing the product of each noise source and its corresponding trans-impedance gain:

$$V_{\text{no}\_\text{equ}}^{2} = V_{\text{no}\_\text{M}}^{2} + V_{\text{no}\_\text{R}}^{2} = \sum_{i=1}^{4} \left( \overline{I_{n,di}^{2}} |Z_{di}|^{2} + \overline{I_{n,gi}^{2}} |Z_{gi}|^{2} + \overline{I_{n,d}} \, \overline{I_{n,g}^{*}} |Z_{di} Z_{gi}| \right) + \sum_{i=1}^{3} \overline{I_{n,Ri}^{2}} |Z_{Ri}|^{2}, \tag{12}$$

where  $Z_{d1} = Z_{T\_V_i-V_0} - Z_{T\_V_0-V_0}$ ,  $Z_{g1} = Z_{T\_V_y-V_0} - Z_{T\_V_i-V_0}$ ;  $Z_{d2} = Z_{T\_V_i-V_0} - Z_{T\_V_x-V_0}$ ,  $Z_{g2} = -Z_{T\_V_i-V_0}$ ;  $Z_{d3} = -Z_{T\_V_y-V_0}$ ,  $Z_{g3} = Z_{T\_V_x-V_0}$ ;  $Z_{d4} = Z_{T\_V_i-V_0}$ ,  $Z_{g4} = 0$ ;  $Z_{R1} = Z_{T\_V_0-V_0}$ ,  $Z_{R2} = Z_{T\_V_x-V_0}$ ,  $Z_{R3} = Z_{T\_V_y-V_0}$ .

Here,  $Z_{T\_V_i-V_0}$ ,  $Z_{T\_V_x-V_0}$ ,  $Z_{T\_V_y-V_0}$  and  $Z_{T\_V_0-V_0}$  are the low-frequency trans-impedance gains from the nodes  $V_i$ ,  $V_x$ ,  $V_y$  and  $V_0$ , respectively, to the output node  $V_0$ .

Table 1. Truth table of the CTRL GEN block and the corresponding $V_{\rm VDD}$.

| EN | CTRL[1] | CTRL[0] | Sel<sub>3</sub> | Sel<sub>2</sub> | Sel<sub>1</sub> | Sel<sub>0</sub> | $V_{\rm VDD}$ (V) |
|----|---------|---------|-----------------|-----------------|-----------------|-----------------|-------------------|
| 1  | 0       | 0       | 0               | 0               | 0               | 1               | 1.77              |
| 1  | 0       | 1       | 0               | 0               | 1               | 0               | 1.67              |
| 1  | 1       | 0       | 0               | 1               | 0               | 0               | 1.56              |
| 1  | 1       | 1       | 1               | 0               | 0               | 0               | 1.46              |
| 0  | X       | X       | 0               | 0               | 0               | 0               | 0.28              |

Table 2. Performance comparison with and without power-configuration at 9 typical corners.

| Corner     | CTRL[1:0] | $f_{\rm 3dB}$ (GHz) |       | $Z_{\rm T}$ (dB$\Omega$) |       | $P_{\rm dc}$ (mW) |       | FOM ($10^5\ \Omega \cdot \text{GHz/mW/mm}^2$) |      |
|------------|-----------|---------------------|-------|--------------------------|-------|-------------------|-------|-----------------------------------------------|------|
|            |           | A\*                 | B\*\* | A\*                      | B\*\* | A\*               | B\*\* | A\*                                           | B\*\* |
| SS, -40 °C | 00        | 9.92                | 10.01 | 55.37                    | 55.37 | 6.16              | 6.16  | 1.48                                          | 1.49 |
| SS, 27 °C  | 00        | 7.78                | 7.90  | 54.70                    | 54.71 | 6.30              | 6.48  | 1.05                                          | 1.04 |
| SS, 85 °C  | 00        | 6.90                | 6.95  | 54.19                    | 54.21 | 6.38              | 6.57  | 0.87                                          | 0.85 |
| TT, -40 °C | 01        | 10.55               | 10.85 | 53.75                    | 53.77 | 6.68              | 8.21  | 1.20                                          | 1.00 |
| TT, 27 °C  | 01        | 9.45                | 9.78  | 53.06                    | 53.09 | 6.70              | 8.24  | 0.99                                          | 0.84 |
| TT, 85 °C  | 01        | 7.50                | 8.00  | 52.58                    | 52.58 | 6.70              | 8.21  | 0.74                                          | 0.65 |
| FF, -40 °C | 11        | 10.86               | 11.78 | 51.72                    | 51.77 | 6.14              | 10.66 | 1.07                                          | 0.67 |
| FF, 27 °C  | 11        | 9.70                | 10.40 | 50.98                    | 51.06 | 6.12              | 10.55 | 0.88                                          | 0.55 |
| FF, 85 °C  | 11        | 7.86                | 9.51  | 50.41                    | 50.51 | 6.11              | 10.43 | 0.67                                          | 0.48 |

A\*: Preamplifier with power-configurable scheme.

B\*\*: Preamplifier without power-configurable scheme.

The total equivalent input noise current spectral density of the proposed FCG input stage is given by

$$I_{\text{in}\_\text{equ}}^{2} = V_{\text{no}\_\text{equ}}^{2} / |Z_{\text{T}\_\text{FCG}}|^{2}. \tag{13}$$

The calculated equivalent input noise current spectral density in typical case is shown as curve "CALC" in Fig. 8.

## 3. Power-configurable scheme

To stabilize the FOM of the FCG preamplifier under PVT variation, a power-configurable multi-supply scheme is adopted, as illustrated in Fig. 4. The bandgap reference block generates four equally spaced voltage levels, $V_{b0}$–$V_{b3}$, from 1.8 to 1.5 V. Four power gating blocks PG<sub>0</sub>–PG<sub>3</sub> pass $V_{b0}$–$V_{b3}$ to $V'_{b0}$–$V'_{b3}$ through large PMOS transistors, which are enabled or disabled by the switching signals Sel<sub>0</sub>–Sel<sub>3</sub>, respectively. $V'_{b0}$–$V'_{b3}$ are then tied together to provide the virtual $V_{DD}$ ($V_{VDD}$) that supplies the FCG stage. The switching signals Sel<sub>0</sub>–Sel<sub>3</sub> are decoded from the power enable signal EN and the two control bits CTRL[1:0] by the CTRL GEN block. The truth table of this logic block and the corresponding $V_{VDD}$ are listed in Table 1.
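The decoding performed by the CTRL GEN block amounts to a 2-to-4 one-hot decoder gated by EN. The behavioural sketch below reproduces the logic values and supply levels of Table 1; the function and constant names are hypothetical, and the model says nothing about the actual gate-drive polarity of the PMOS switches.

```python
from typing import Tuple

# Virtual supply per selected level, taken from Table 1 (volts).
V_VDD_BY_LEVEL = (1.77, 1.67, 1.56, 1.46)
V_VDD_DISABLED = 0.28


def ctrl_gen(en: int, ctrl1: int, ctrl0: int) -> Tuple[Tuple[int, ...], float]:
    """Return ((Sel3, Sel2, Sel1, Sel0), V_VDD) for the given EN and CTRL[1:0]."""
    if not en:
        # EN = 0: all power-gating blocks are off, CTRL is a don't-care.
        return (0, 0, 0, 0), V_VDD_DISABLED
    level = (ctrl1 << 1) | ctrl0   # 0..3
    one_hot = 1 << level           # exactly one Sel bit set
    sel = tuple((one_hot >> bit) & 1 for bit in (3, 2, 1, 0))
    return sel, V_VDD_BY_LEVEL[level]


for en, c1, c0 in [(1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1), (0, 0, 0)]:
    sel, v_vdd = ctrl_gen(en, c1, c0)
    print(f"EN={en} CTRL={c1}{c0} -> Sel[3:0]={sel} V_VDD={v_vdd} V")
```

A larger CTRL code selects a lower virtual supply, which is how faster corners are throttled back toward the target DC current.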

The power-configurable scheme aims to save redundant power if the chip proves to be in better PVT conditions than the worst-case estimation. Its advantage is that it both provides enough margin to ensure the yield in the worst case and minimizes the price paid for the overestimation if the chip is in better conditions. This configuration can be implemented on an FCG preamplifier because the FCG stage achieves broad bandwidth and high trans-impedance gain with economical consumption of voltage headroom and is hence suitable for low-supply applications. Furthermore, saving power is necessary, because any unwanted power is multiplied by the number of channels and adds to the already challenging heat-dissipation problem. Table 2 compares the performance with and without power-configuration at 9 typical corners: the power saved in a 12-channel parallel optical receiver can be as large as 54.2 mW at the FF corner and -40 °C. The results also show that, with acceptable $Z_{\rm T}$ and $f_{\rm 3dB}$, the power varies from 6.16 to 10.66 mW among corners without the power-configurable scheme, compared to 6.11 to 6.70 mW with it. At the same time, the worst-case FOM is improved from $0.48 \times 10^5$ to $0.67 \times 10^5\ \Omega \cdot \text{GHz/mW/mm}^2$. Another benefit of this configuration is that the adjustable supply makes the design more robust against supply variation. On the other hand, the area and power overheads of the power-configurable scheme are shared by 12 channels, and hence the cost is small. The maximum deterioration of the power supply noise rejection ratio (PSRR) of the preamplifier with the power-configurable scheme, compared to that without it, is 3.9 dB at 5.88 GHz. Below 3.7 GHz, the PSRRs of the two preamplifiers are almost the same. The main reason for the PSRR deterioration is that the high-frequency trans-impedance gain of the preamplifier with the power-configurable scheme is slightly worse than that without it; this can be compensated by removing high-frequency noise on the supply with filter capacitors.
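The 54.2 mW figure follows from the $P_{\rm dc}$ columns of Table 2 multiplied by the channel count. The sketch below recomputes the per-corner saving for a 12-channel receiver; the data are copied from Table 2, while the variable names and the ASCII corner labels are illustrative.

```python
CHANNELS = 12

# P_dc per channel in mW from Table 2: (with scheme "A", without scheme "B").
P_DC_MW = {
    "SS, -40 C": (6.16, 6.16),
    "SS, 27 C": (6.30, 6.48),
    "SS, 85 C": (6.38, 6.57),
    "TT, -40 C": (6.68, 8.21),
    "TT, 27 C": (6.70, 8.24),
    "TT, 85 C": (6.70, 8.21),
    "FF, -40 C": (6.14, 10.66),
    "FF, 27 C": (6.12, 10.55),
    "FF, 85 C": (6.11, 10.43),
}

# Receiver-level saving: per-channel difference times the number of channels.
savings_mw = {corner: CHANNELS * (b - a) for corner, (a, b) in P_DC_MW.items()}
best_corner = max(savings_mw, key=savings_mw.get)

for corner, saved in savings_mw.items():
    print(f"{corner:10s}: {saved:5.1f} mW saved")
print(f"largest saving at {best_corner}: {savings_mw[best_corner]:.1f} mW")
```

The saving is negligible at the slow corner, where the full supply is needed anyway, and grows toward the fast corner, where the conservative design would otherwise burn the most excess current.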

## 4. Experimental results

The proposed FCG preamplifier with analog components of the power-configurable scheme is implemented in SMIC 180-nm RFCMOS technology. The chip microphotograph of the proposed preamplifier is shown in Fig. 5. With the absence of area-consuming inductors and additional gain-boosting stages, the core circuit, including the FCG stage and equivalent power-configurable share per channel, occupies an area of only  $0.08 \times 0.08 \text{ mm}^2$, which is quite suitable for a 250- $\mu$ m pitch high-density parallel optical receiver. The total chip area with the pads is  $0.47 \times 0.27 \text{ mm}^2$.

Fig. 5. Microphotograph of the preamplifier.

Fig. 6. Single channel eye-diagram with 10 Gb/s  $2^{31}$-1 PRBS NRZ input (*y*-scale: 40 mV/div, *x*-scale: 20 ps/div). Panel (b): TST\_11 (CTRL = "11").

Two chips close to the worst and best performance conditions were characterized without (TST\_00: CTRL="00") and with (TST\_11: CTRL="11") power-configuration, respectively, which keeps their power consumption at 6.26 mW and 6.35 mW. The output eye-diagrams were measured with an Agilent 86100A wide-bandwidth oscilloscope. With a 2<sup>31</sup>-1 PRBS input voltage at 10 Gb/s ( $V_{\text{in.pp}} = 20 \text{ mV}$ ), the output eye-diagrams of the two chips are shown in Fig. 6. They qualitatively confirm that both preamplifiers operate properly at a high data rate within a strict power constraint. The AC responses are shown in Fig. 7 as curves TST\_00 and TST\_11, which exhibit  $Z_{\rm T}/f_{\rm 3dB}$  of 53.87 dB$\Omega$/6.8 GHz and 52.09 dB$\Omega$/8.1 GHz, respectively. These are derived from small-signal characteristics obtained by S-parameter measurement with an Agilent 8363B network analyzer. The corresponding simulation curves of the worst and best performance cases, SIM\_00 (SS corner, 85 °C, CTRL="00") and SIM\_11 (FF corner, -40 °C, CTRL="11"), exhibit  $Z_{\rm T}/f_{\rm 3dB}$  of 54.21 dB$\Omega$/6.95 GHz and 51.77 dB$\Omega$/11.78 GHz, respectively. The measured  $Z_{\rm T}$  and  $f_{\rm 3dB}$  are within the variation range of the simulation results. The equivalent input noise current spectral densities are derived from the noise figures measured with the Agilent N8975A noise figure analyzer and are plotted in Fig. 8 together with the simulated and calculated noise. The average input-referred noise current spectral densities of TST\_00, TST\_11, SIM\_00 and SIM\_11 are 27, 28, 24 and 27 pA/$\sqrt{\text{Hz}}$, respectively, from 100 MHz up to each 3-dB bandwidth, which agrees well with the calculated average of 25 pA/$\sqrt{\text{Hz}}$ deduced in Section 2.

Fig. 7. Measured and simulated AC responses.

Fig. 8. Measured, simulated and calculated  $\overline{I_{n,in}}$.

Some outstanding designs in different CMOS technologies are summarized in Table 3. The proposed design shows an excellent GBW performance and an outstanding FOM using the power-configuration scheme. Though Ref. [8] exhibits a higher FOM, the performance-to-cost ratio of this work is still first-class when the different manufacturing costs of the two processes (180-nm and 80-nm CMOS) are considered. The preamplifier proposed in Ref. [11] is also an excellent design for ultra high-speed high-density parallel optical communication systems in the same process as this work, but its RGC input stage together with trans-impedance feedback and an external passive network makes it less economical in power and area consumption. The proposed design is more competitive than other commonly used topologies for ultra high-speed high-density parallel optical communication systems because the FCG stage achieves a high GBW in a simpler structure with fewer current branches and dispenses with area-consuming inductors. Meanwhile, the FCG topology is suitable for low-supply applications. Therefore, the stacked power-configurable block takes full advantage of this merit to stabilize the performance under PVT variation without any sacrifice in power or area.

| Reference | $f_{\rm 3dB}$ @ $C_{\rm pd}$ (GHz @ pF) | $Z_{\rm T}$ (dB$\Omega$) | $P_{\rm dc}$ @ $V_{\rm dd}$/Process (mW @ V) | Noise (pA/$\sqrt{\text{Hz}}$) | Area (mm<sup>2</sup>) | FOM (k$\Omega \cdot$GHz/mW/mm$^2$) |
|-----------|------------------------------------------|--------------------------|----------------------------------------------|-------------------------------|-----------------------|------------------------------------|
| [10]      | 3.5 @ 0.75                               | 64                       | 20 @ 3.3/500-nm CMOS                         | 15                            | $0.28 \times 0.48$    | 2.1                                |
| [2]       | 8 @ 0.25                                 | 53                       | 13.5 @ 1.8/180-nm CMOS                       | 18                            | $0.45 \times 0.25$    | 2.4                                |
| [11]      | 7.9 @ 0.8                                | 53                       | 18 @ 1.8/180-nm CMOS                         | 30                            | $0.23 \times 0.25$    | 3.4                                |
| [12]      | 36.5 @ 0.3                               | 53.6                     | 110 @ 3/130-nm BiCMOS                        | 36.5                          | $0.33 \times 0.21$    | 1.8                                |
| [8]       | 13.4 @ 0.32                              | 52.8                     | 2.2 @ 1/80-nm CMOS                           | 50                            | $0.14 \times 0.07$    | 240                                |
| [13]\*    | 22.8 @ -                                 | 69.8                     | 74 @ 1.8/1/65-nm CMOS                        | -                             | 0.4                   | 2.4                                |
| [14]\*\*  | 30 @                                     | 55                       | 9 @ 1/45-nm SOI                              | 20.5                          | $0.52 \times 0.54$    | 6.7                                |
| This Work | 6.8 @ 0.35                               | 53.9                     | 6.26 @ 1.8/180-nm CMOS                       | 27                            | $0.08 \times 0.08$    | 84                                 |

Table 3. Performance comparison of preamplifiers in different processes.

\*This design is built up with PreAmp (1.8 V) + PostAmp (1.0 V) + OutputBuffer + Equalizer.

\*\*This design is built up with PreAmp + PostAmp.

## 5. Conclusions

In this paper, a power-configurable FCG stage is presented for preamplifier design in a parallel optical receiver. The proposed topology resolves the conflict between bandwidth extension and trans-impedance gain boosting better than other commonly adopted topologies under limited voltage headroom and power consumption, and achieves an excellent FOM with outstanding GBW performance and low power and area consumption. The headroom advantage is further exploited by introducing a power-configurable scheme that avoids the power wasted by a pessimistic estimation of PVT variation. Using a standard 180-nm CMOS process, an inductor-less 10-Gb/s preamplifier with the power-configurable scheme is designed and fabricated. Two chips close to the worst and best performance conditions exhibit, respectively, a trans-impedance gain of 53.9 dB$\Omega$, a 3-dB bandwidth of 6.8 GHz and a power dissipation of 6.26 mW without power-configuration, and a trans-impedance gain of 52.1 dB$\Omega$, a 3-dB bandwidth of 8.1 GHz and a power dissipation of 6.35 mW with power-configuration. Owing to the absence of inductors, the core area of the preamplifier measures only  $0.08 \times 0.08$ mm<sup>2</sup>, and the measured average input-referred noise-current spectral density is no more than 28 pA/$\sqrt{\text{Hz}}$. The design is well suited to monolithic system integration in ultra high-speed high-density parallel optical communication applications in low-cost CMOS processes.

## References

- [1] Kim J, Buckwalter J F. Bandwidth enhancement with low group-delay variation for a 40-Gb/s transimpedance amplifier. IEEE Trans Circuits Syst I, Reg Papers, 2010, 57(8): 214
- [2] Lu Z, Yeo K S, Ma J, et al. Broad-band design techniques for transimpedance amplifiers. IEEE Trans Circuits Syst I, Reg Papers, 2007, 54: 590
- [3] Lu Z, Yeo K S, Lim W M, et al. Design of a CMOS broadband transimpedance amplifier with active feedback. IEEE Trans Very Large Scale Integration Syst, 2010, 18: 461
- [4] Analui B, Hajimiri A. Bandwidth enhancement for transimpedance amplifiers. IEEE J Solid-State Circuits, 2004, 39: 1263
- [5] Shekhar S, Walling J S, Allstot D J. Bandwidth extension techniques for CMOS amplifiers. IEEE J Solid-State Circuits, 2006, 41: 2424
- [6] Wu C H, Lee C H, Chen W S, et al. CMOS wideband amplifiers using multiple inductive-series peaking technique. IEEE J Solid-State Circuits, 2005, 40: 548
- [7] Wang C Y, Wang C S, Wang C K. An 18-mW two-stage CMOS transimpedance amplifier for 10 Gb/s optical application. Proc IEEE Asian Solid-State Circuits Conf, 2007: 412
- [8] Kromer C, Sialm G, Morf T, et al. A low-power 20-GHz 52-dBΩ transimpedance amplifier in 80-nm CMOS. IEEE J Solid-State Circuits, 2004, 39: 885
- [9] Wang X X, Wang Z G, Liu J S, et al. 10-Gb/s high-density trans-impedance amplifier in 0.18-μm CMOS. International Conference on Wireless Communications & Signal Processing, 2009: 1
- [10] Hasan S M R. Design of a low-power 3.5-GHz broad-band CMOS trans-impedance amplifier for optical transceivers. IEEE Trans Circuits Syst I, 2005, 52: 1061
- [11] Li Zhiqun, Chen Lili, Wang Zhigong. Design of a 12-channel 120-Gb/s optical receiver front-end amplifier in 0.18-$\mu$m CMOS technology. Symposium on Photonics and Optoelectronics (SOPO), 2010: 1
- [12] Amid S B, Plett C, Schvan P. Fully differential, 40 Gb/s regulated cascode transimpedance amplifier in 0.13 $\mu$m SiGe BiCMOS technology. IEEE Bipolar/BiCMOS Circuits and Technology Meeting, 2010: 33
- [13] Takemoto T, Yuki F, Yamashita H, et al. A 25 Gb/s × 4-channel 74 mW/ch transimpedance amplifier in 65 nm CMOS. IEEE Custom Integrated Circuits Conf, 2010: 1
- [14] Kim J, Buckwalter J F. A 40-Gb/s optical transceiver front-end in 45 nm SOI CMOS technology. IEEE Custom Integrated Circuits Conf, 2010: 1