# A programmable gain amplifier with digitally assisted DC offset calibration for a direct-conversion WLAN receiver

Yao Xiaocheng(姚小城)<sup>1,†</sup>, Gong Zheng(龚正)<sup>1</sup>, and Shi Yin(石寅)<sup>1,2</sup>

<sup>1</sup>Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China <sup>2</sup>Suzhou-CAS Semiconductors Integrated-Technology Research Center, Suzhou 215021, China

**Abstract:** This paper presents a programmable gain amplifier (PGA) circuit with a digitally assisted DC offset cancellation (DCOC) scheme for a direct-conversion WLAN receiver. Implemented in a standard 0.13-$\mu$m CMOS process, the PGA occupies a die area of 0.39 mm<sup>2</sup> and dissipates 6.5 mW from a 1.2 V supply. Using a single-loop, single digital-to-analog converter (DAC) mixed-signal DC offset cancellation topology, it achieves a minimum DCOC settling time of 1.6 $\mu$s with the PGA gain ranging from -8 to 54 dB in 2 dB steps. The DCOC loop uses a segmented DAC structure to lower the design complexity without sacrificing accuracy, and a digital control algorithm that dynamically sets the DCOC loop to fast or normal response mode, so that the PGA circuit complies with the targeted WLAN specifications.

Key words: direct-conversion receiver; digitally assisted DC offset cancellation; segmented current-mode digital-to-analog converter; settling time

**DOI:** 10.1088/1674-4926/33/11/115006 **EEACC:** 2570

## 1. Introduction

The development of multiple WLAN standards (802.11a/b/g/n) in the past few years has created the need for integrated low-cost, low-power, multimode transceivers. In comparison with super-heterodyne architectures, direct-conversion architectures<sup>[1-3]</sup> are often preferred for modern WLAN receivers because they lend themselves more easily to integration: the external front-end band-select filters are eliminated, and the LNA/mixer interface can be optimized for power dissipation, linearity and noise performance without requiring a 50-$\Omega$ impedance. However, the direct-conversion receiver architecture also faces several issues, among which the DC offset caused by LO leakage, mixer self-mixing or baseband circuit mismatch is especially challenging. With an analog baseband gain of 40 dB or even higher, the DC offset appearing at the baseband inputs can easily saturate the following circuits and prevent signal detection.

Several DCOC schemes have been adopted to cope with the DC offset problem. Possible methods include AC coupling the baseband stages to highpass-filter the DC offsets from the mixer outputs and between the baseband stages, or applying a lowpass network in a feedback servo loop<sup>[4]</sup> around the baseband path to eliminate DC offsets. Both methods are area-inefficient because large on-chip capacitors are always required. More seriously, the trade-off between signal spectrum integrity and DCOC loop response time makes the highpass corner frequency of the baseband path hard to choose.

In this paper, a digitally assisted DCOC loop consisting of PGAs, a rail-to-rail input comparator, a digital control block, and an 11-bit DAC decouples signal spectrum integrity from DCOC loop response time. With this novel mixed-signal DCOC circuit, large on-chip capacitors and power-hungry DCOC servo amplifiers can be eliminated, which makes the design area- and power-efficient.

## 2. Circuit implementation

Possible solutions for eliminating baseband DC offsets with a mixed-signal control loop are illustrated in Fig. 1. In Fig. 1(a), digital calibration codes are fed back to multiple DACs<sup>[5]</sup> that are tied to the inputs of different PGA stages. This lowers the number of bits required of each DAC and reduces the design complexity. However, calibration monotonicity must be taken into account in this scheme to guarantee that the calibration process converges, especially when the PGA gains change during automatic gain control (AGC) tuning. From this perspective, the calibration scheme in Fig. 1(b) is preferred for robustness, because the single-DAC topology prevents any nonmonotonicity as long as the DAC itself is monotonic. The presented digitally assisted design adopts the topology in Fig. 1(b), whose performance relies heavily on the DAC resolution.

#### 2.1. The PGAs

The proposed baseband chain, comprising the PGAs and an output buffer stage, is built from operational amplifier (op-amp) based structures. A coarse-step gain stage (8 dB/step) is placed at the front and a fine-step gain stage (2 dB/step) in the middle, followed by an output buffer with 0 dB voltage gain. In total, the baseband chain achieves a gain range of -8 to 54 dB in 2 dB steps.
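To illustrate how a coarse 8 dB/step stage and a fine 2 dB/step stage can together cover the whole range, the following Python sketch splits a target gain into a coarse and a fine setting. The partition used here (coarse settings from -8 to 48 dB, fine settings from 0 to 6 dB) and the function name `split_gain` are assumptions made for illustration; the text does not specify how the gain is actually distributed between the two stages.

```python
def split_gain(gain_db):
    """Split a total gain in dB into (coarse_db, fine_db).

    Assumed partition, not taken from the paper:
    coarse stage -8..48 dB in 8 dB steps, fine stage 0..6 dB in 2 dB steps.
    """
    if gain_db % 2 != 0 or not -8 <= gain_db <= 54:
        raise ValueError("gain must be an even value between -8 and 54 dB")
    coarse_db = -8 + 8 * ((gain_db + 8) // 8)  # largest coarse setting <= gain
    fine_db = gain_db - coarse_db              # remainder: 0, 2, 4 or 6 dB
    return coarse_db, fine_db


# Every gain setting from -8 to 54 dB in 2 dB steps.
settings = [split_gain(g) for g in range(-8, 56, 2)]
```

Under this assumed partition, 8 coarse settings and 4 fine settings give 32 gain codes in total.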

The op-amp, whose topology is shown in Fig. 2, uses the prevalent two-stage structure to achieve a large DC gain and a large output swing simultaneously. $C_M$ is the Miller compensation capacitor, and $R_Z$ cancels the right-half-plane (RHP) zero. Transistors MP8a,b, in parallel with the input pair MP7a,b, are added to suppress large input common-mode (CM) interference. The auxiliary input pair MP9a,b helps form a common-mode feedforward (CMFF) path that improves the op-amp common-mode rejection ratio (CMRR), especially at high frequencies. In the left part of the figure, the common-mode amplifier, a transconductor built with MP10a,b and $R_3$, compares the output CM voltage with the CM reference $V_{CM}$, amplifies the difference, and feeds the result back to current sources MN5a,b to stabilize the output CM voltage.

Fig. 1. Analog baseband circuit with a digitally assisted DCOC using (a) multiple DACs and (b) a single DAC.

Fig. 2. The op-amp.

<sup>†</sup> Corresponding author. Email: xcyao82@gmail.com

Received 28 April 2012, revised manuscript received 22 May 2012

#### 2.2. The DAC

Given the maximum tolerable residual DC offset voltage at the output after cancellation, $V_{\text{res, out}}$, and the maximum input DC offset voltage at the baseband chain inputs, $V_{\text{os, max}}$, it follows that:

$$V_{\rm FS} \ge V_{\rm os,\,max},\tag{1}$$

$$V_{\rm LSB} \le \frac{1}{2} \frac{V_{\rm res,\,out}}{A_{\rm v,\,max}},\tag{2}$$

where  $V_{\text{FS}}$ ,  $V_{\text{LSB}}$ ,  $A_{\text{v, max}}$  stand for the DAC full scale voltage, the DAC least significant bit (LSB) voltage, and the maximum voltage gain in the baseband path, respectively. The DAC resolution requirement can thus be estimated by:

$$B \ge \log_2\left(2 \times \frac{A_{\rm v,\,max}V_{\rm os,\,max}}{V_{\rm res,\,out}} + 1\right),\tag{3}$$

where *B* stands for the required number of bits of the proposed DAC. Given  $A_{v, max} \approx 500 (54 \text{ dB})$ ,  $V_{os, max} = 20 \text{ mV}$  and  $V_{res, out} = 10 \text{ mV}$ , it follows that:

$$B \ge 10.97. \tag{4}$$
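The estimate in Eq. (3) can be reproduced numerically. The following Python sketch evaluates it for the values quoted above; the function name `required_dac_bits` is introduced here for illustration only.

```python
import math


def required_dac_bits(av_max, vos_max, vres_out):
    """Lower bound on the DAC resolution B according to Eq. (3)."""
    return math.log2(2.0 * av_max * vos_max / vres_out + 1.0)


# Values from the text: A_v,max ~ 500 (54 dB), V_os,max = 20 mV, V_res,out = 10 mV.
bits = required_dac_bits(av_max=500.0, vos_max=20e-3, vres_out=10e-3)
min_integer_bits = math.ceil(bits)  # smallest whole number of bits meeting the bound
```

Rounding the bound up to the next integer gives the 11-bit resolution used in the design.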

As shown in Fig. 3, an 11-bit current-source-based DAC is adopted to fulfill the resolution requirement calculated above. The DAC uses a 4-4-3 segmented topology to trade off accuracy against complexity and die area.

Fig. 3. The 11-bit segmented DAC topology.

Fig. 4. WLAN receiving mechanism and frame structure.

In Fig. 3, the most significant 4-bit sub-DAC and the middle 4-bit sub-DAC use a unit-element structure to improve conversion accuracy. These two sub-DACs are driven by the 16-bit thermometer control codes A<15:0> and B<15:0>, which are generated by two binary-to-thermometer decoders from the 4 most significant bits (MSBs) D<10:7> and the 4 middle bits D<6:3>, respectively. The MSB unit elements are 16 PMOS current sources, each carrying a unit current $\Delta I$. As D<10:7> is increased successively from "0000" to "1111", 15 of these current sources divert their currents one after another, and one current source steers a unit current $\Delta I$ to the less significant bit sub-DAC as its full-scale current to improve monotonicity.

The middle 4-bit sub-DAC and the least significant 3-bit sub-DAC are built from identical resistors and switches to achieve good matching. The former has a unit-element topology in which all branch switches in each of the 15 elements are controlled by one thermometer control bit, from B<14> to B<0>. The latter uses a binary-weighted topology to save area and reduce design complexity, with each of its resistor branches controlled by one of the LSB binary codes D<2> to D<0>. Furthermore, all of the branch resistors are implemented by paralleled unit NMOS transistors with large device length, which gives large resistor values, further improves matching, and minimizes the effect of the switches' turn-on resistances. The total differential output current is first folded into the source nodes of the cascode devices (nodes A and B) and then output from their drain nodes to minimize the variation of the current DAC's output impedance with the input code. This final output current is ultimately injected into the differential input nodes of the first PGA op-amp through two resistors, which convert the DC offset calibration current to a voltage, and it can be calculated that:

$$I_{\text{out, diff}} = \frac{1}{2} \times \left[ 2 \sum_{i=0}^{10} \left(D\langle i \rangle \times 2^i\right) - 2047 \right] \times I_{\text{LSB}}, \tag{5}$$

where $I_{\text{out, diff}}$ and $I_{\text{LSB}}$ stand for the total differential output current and the LSB output current of the proposed DAC, respectively, with $I_{\text{LSB}} = \Delta I/64$ and $I_{\text{out, diff}}$ ranging from $-1023.5I_{\text{LSB}}$ to $+1023.5I_{\text{LSB}}$. The constant 2047 is the largest input code of the 11-bit DAC.
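The 4-4-3 segmentation and Eq. (5) can be checked with a small behavioural model. The Python sketch below is an idealized illustration, not the circuit itself: the function names are hypothetical, the decoder produces 15 thermometer bits per segment (the role of the sixteenth control bit is not modeled), and the output is expressed in units of $I_{\text{LSB}}$.

```python
def split_code(code):
    """Split an 11-bit code D<10:0> into its 4-4-3 segments."""
    if not 0 <= code <= 2047:
        raise ValueError("code must fit in 11 bits")
    msb = (code >> 7) & 0xF  # D<10:7>
    mid = (code >> 3) & 0xF  # D<6:3>
    lsb = code & 0x7         # D<2:0>
    return msb, mid, lsb


def thermometer(value, width=15):
    """Binary-to-thermometer decode: bit i is 1 when value > i."""
    return [1 if value > i else 0 for i in range(width)]


def i_out_diff(code):
    """Differential output current in units of I_LSB, following Eq. (5)."""
    msb, mid, lsb = split_code(code)
    # Each active MSB element weighs 2**7, each active middle element 2**3.
    weighted_sum = 128 * sum(thermometer(msb)) + 8 * sum(thermometer(mid)) + lsb
    return 0.5 * (2 * weighted_sum - 2047)
```

In this ideal model every code increment changes the differential output by exactly one $I_{\text{LSB}}$, and the end codes 0 and 2047 give $-1023.5$ and $+1023.5$ in units of $I_{\text{LSB}}$, matching the range stated above.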

#### 2.3. DCOC digital module

The fundamental access method of the IEEE 802.11 MAC is a distributed coordination function (DCF) known as carrier sense multiple access with collision avoidance (CSMA/CA). The DCF shall be implemented in all stations (STAs), for use within both independent basic service set (IBSS) and infrastructure network configurations<sup>[6]</sup>. The CSMA/CA distributed algorithm mandates that a gap of a minimum specified duration must exist between contiguous frame sequences, as shown in Fig. 4. The signal rx\_en indicates the receiving state: reception is valid when rx\_en is high. The duration A in Fig. 4 means that the channel is currently idle. The duration of DCOC valid indicates that DC offset detection is in effect; the DCOC holds the compensation value when DCOC valid is idle.

Fig. 5. DCOC flowchart.

The input currents to the comparator shown in Fig. 1 can be described as

$$i_{\rm p} = I_{\rm offset} + a(t), \quad i_{\rm n} = -I_{\rm offset} - a(t),\tag{6}$$

where $I_{\text{offset}}$ is the DC offset current and $a(t)$ is the signal. The output of the comparator is 1 when $i_p > i_n$, and 0 otherwise, so it is determined by both the DC offset and the signal. Only when the channel is idle does the comparator input consist of just the DC offset and additive white Gaussian noise (AWGN). An optimum scheme is therefore to detect the DC offset during the *A* domain and refine it during STS, as shown in Fig. 4. The relation between the DC offset to noise ratio (DCON) and the mark-space ratio (MSR) is approximately linear when DCON $\in [-4 \text{ dB}, 4 \text{ dB}]$. As a result, it is reasonable to scale the DC offset using the MSR.

The DCOC flowchart is shown in detail in Fig. 5, where MSR represents the mark-space ratio and cons is a variable derived from the AGC words to guarantee the accuracy of cancellation. The DCOC module comprises an offset detection block, an offset cancellation block, and an offset amplifying block. In particular, the offset cancellation block comprises a counter that counts the polarities of the DC offset during a predetermined period, which is programmable using the output of the offset detection block. A controller comprises first to fourth branch units, which execute the fast down-compensation, regular down-compensation, fast up-compensation, and regular up-compensation branches, respectively. The controller selects one of the four branches according to the MSR derived from the counter. A data register controls the offset amplifying block. This 11-bit register, whose width is identical to the bit width of the DAC, has an initial value of 1024.

If the DC offset is severe, namely MSR < FDT or MSR > FUT, the fast compensation branch is executed and changes the value of the data register using a binary search. If the DC offset is not severe, namely FDT < MSR < RDT or RUT < MSR < FUT, the regular compensation branch is executed and changes the value of the data register by adding or subtracting cons. If the MSR derived from the counter is an appropriate value, the value of the data register is maintained. The DCOC module has two modes: a fast mode, in which only the fast compensation branches exist, and a normal mode, in which both the fast and the regular compensation branches are employed. The mode can easily be configured through the thresholds FDT, RDT, FUT and RUT. The default thresholds [FDT, RDT, FUT, RUT] are [1/2, 1/2, 1/2, 1/2] for fast mode and [1/8, 2/8, 7/8, 6/8] for normal mode.
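The branch selection described above can be summarized in a few lines of Python. This is a minimal sketch under stated assumptions rather than the authors' implementation: the MSR is taken to be the fraction of comparator samples equal to 1 within the counting window, FDT/RDT/FUT/RUT are read as the fast-down, regular-down, fast-up and regular-up thresholds, an MSR that falls exactly on a threshold is treated as "hold", and the function and branch names are hypothetical.

```python
FAST_MODE = (1 / 2, 1 / 2, 1 / 2, 1 / 2)    # default [FDT, RDT, FUT, RUT]
NORMAL_MODE = (1 / 8, 2 / 8, 7 / 8, 6 / 8)  # default [FDT, RDT, FUT, RUT]


def mark_space_ratio(samples):
    """Fraction of comparator outputs equal to 1 (assumed MSR definition)."""
    return sum(samples) / len(samples)


def select_branch(msr, thresholds):
    """Pick the compensation branch for a measured MSR."""
    fdt, rdt, fut, rut = thresholds
    if msr < fdt:
        return "fast_down"
    if msr > fut:
        return "fast_up"
    if fdt < msr < rdt:
        return "regular_down"
    if rut < msr < fut:
        return "regular_up"
    return "hold"  # MSR is acceptable: keep the data register unchanged


# Example: 3 ones out of 16 comparator samples in normal mode.
branch = select_branch(mark_space_ratio([1, 1, 1] + [0] * 13), NORMAL_MODE)
```

With the fast-mode defaults all four thresholds coincide at 1/2, so the two regular branches can never be selected, which is consistent with the description of fast mode above.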

## 3. Experimental results

The proposed PGA with digitally assisted DCOC is fabricated in a standard 0.13-$\mu$m CMOS process. Its die photograph is shown in Fig. 6. The active area, which includes the PGA core, DAC, digital control block, comparator and bias circuits, is 0.39 mm<sup>2</sup>. The chip consumes 5.4 mA from a 1.2 V supply, of which only 0.25 mA is drawn by the DAC circuit.
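As a quick consistency check, the supply currents quoted above can be converted to power. The calculation below uses only the two currents and the supply voltage given in the text; the symbols $P_{\rm total}$, $P_{\rm DAC}$ and $P_{\rm core}$ are introduced here for convenience.

$$P_{\rm total} = 5.4\ \text{mA} \times 1.2\ \text{V} = 6.48\ \text{mW} \approx 6.5\ \text{mW}, \quad P_{\rm DAC} = 0.25\ \text{mA} \times 1.2\ \text{V} = 0.3\ \text{mW}, \quad P_{\rm core} = P_{\rm total} - P_{\rm DAC} \approx 6.2\ \text{mW}.$$

These values agree with the 6.5 mW quoted in the abstract and with the core and DAC entries of Table 1.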

Fig. 6. Chip die photograph.

Table 1. Summarized experimental results.

| Parameter | This work | Ref. [5] |
|---|---|---|
| Technology | 0.13 μm CMOS | 0.13 μm CMOS |
| Supply voltage (V) | 1.2 | 1.2 |
| Bandwidth (MHz) | > 40 | 250 |
| Voltage gain (dB) | -8 to 54, 2 dB step | -9 to 73, 0.5 dB step |
| Power consumption, core (mW) | 6.2 | 79.2 |
| Power consumption, DACs for DCOC (mW) | 0.3 | 3.6 |
| Minimum DCOC settling time ($\mu$s) | 1.6 | 4 |
| Die area (mm<sup>2</sup>) | 0.39 | 0.8 |

Fig. 7. PGA output transient response during DCOC process in (a) fast mode and (b) normal mode.

The measured transient responses of the proposed PGA in signal-independent mode (fast mode) and signal-dependent mode (normal mode) are shown in Figs. 7(a) and 7(b), respectively. In Fig. 7(a), the DCOC loop uses a successive approximation algorithm, by which the output converges within 1.6 $\mu$s to the final offset cancellation result, which is limited by the DAC resolution. In Fig. 7(b), the PGA and the DCOC loop work concurrently, and the offset cancellation is accomplished within 5 $\mu$s with an allowable residue of no more than 15 mV. Both modes comply with the design specifications, which validates the proposed digitally assisted DCOC PGA circuit.
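The successive approximation used in fast mode can be pictured as a bit-by-bit binary search over the 11-bit data register. The Python sketch below is a behavioural model under stated assumptions, not the authors' implementation: offsets are expressed in units of the DAC LSB, the DAC is ideal and follows Eq. (5), the comparator is noiseless, the sign convention of the subtraction is assumed, and the names `residual_lsb` and `sar_calibrate` are hypothetical.

```python
def residual_lsb(offset_lsb, code):
    """Residual offset in LSB after subtracting the ideal DAC output.

    Per Eq. (5) an 11-bit code produces (code - 1023.5) LSB.
    Subtracting it from the offset is an assumed sign convention.
    """
    return offset_lsb - (code - 1023.5)


def sar_calibrate(offset_lsb, bits=11):
    """Binary-search the DAC code with one comparator decision per bit."""
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)  # the first trial code is 1024 (mid-scale)
        if residual_lsb(offset_lsb, trial) > 0:  # comparator output is 1
            code = trial                         # keep this bit set
    return code


# Example: an input offset worth 300.2 LSB.
code = sar_calibrate(300.2)
residue = residual_lsb(300.2, code)
```

In this model the search needs 11 comparator decisions and leaves a residue below one LSB, which illustrates why the final cancellation result is limited by the DAC resolution.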

Table 1 summarizes the measurement results of the proposed PGA circuit and compares them with recently published work<sup>[5]</sup>. The table shows that the proposed single-DAC mixed-signal DCOC circuit achieves higher power efficiency, smaller chip area and a shorter minimum DCOC loop response time than its multiple-DAC counterpart.

## 4. Conclusion

A PGA circuit with a digitally assisted DCOC scheme for WLAN receiver applications is implemented in a 0.13-$\mu$m CMOS process. By using a single-loop, single-DAC mixed-signal DC offset cancellation scheme, the proposed PGA achieves low power, small die area, and fast settling simultaneously. A segmented current-source-based DAC is adopted to lower the design complexity without sacrificing accuracy. The digital control algorithm dynamically sets the DCOC loop to fast or normal response mode. In fast mode, the proposed DCOC scheme greatly surpasses conventional analog DCOC circuits in speed and is also faster than the recently published multiple-DAC digital DCOC circuit.

### References

- [1] Zargari M, Su D K, Yue P, et al. A 5-GHz CMOS transceiver for IEEE 802.11a wireless LAN systems. IEEE J Solid-State Circuits, 2002, 37(12): 1688
- [2] Zargari M, Terrovitis M, Jen S H M, et al. A single-chip dual-band tri-mode CMOS transceiver for IEEE 802.11a/b/g wireless LAN. IEEE J Solid-State Circuits, 2004, 39(12): 2239
- [3] Mehta S S, Weber D, Terrovitis M, et al. An 802.11g WLAN SoC. IEEE J Solid-State Circuits, 2005, 40(12): 2239
- [4] Chen T M, Chiu Y M, Wang C C, et al. A low-power fullband 802.11a/b/g WLAN transceiver with on-chip PA. IEEE J Solid-State Circuits, 2007, 42(2): 983
- [5] Shih H Y, Kuo C N, Chen W H, et al. A 250 MHz 14 dB-NF 73 dB-gain 82 dB-DR analog baseband chain with digital-assisted DC-offset calibration for ultra-wideband. IEEE J Solid-State Circuits, 2010, 45(2): 338
- [6] IEEE Std 802.11n™-2009. Available: http://standards.ieee.org/getieee802/download/802.11n-2009.pdf