# Design and implementation of an IEEE 802.11 baseband OFDM transceiver in 0.18 $\mu m$ CMOS\*

Wu Bin(吴斌)<sup>†</sup>, Zhou Yumei(周玉梅), Zhu Yongxu(朱勇旭), Zhang Zhengdong(张振东), and Cai Jingjing(蔡菁菁)

Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China

**Abstract:** A SISO IEEE 802.11 baseband OFDM transceiver ASIC is implemented. The chip supports all of the SISO IEEE 802.11 work modes by optimizing the key modules and sharing modules between the transmitter and receiver. The area and power are greatly reduced compared with other designs. The baseband prototype has been verified with WLAN baseband test equipment and by transferring video. The layout is implemented in 0.18 $\mu$m 1P/6M CMOS technology and the chip is fabricated by SMIC; it occupies a 2.6 × 2.6 mm<sup>2</sup> area and consumes 83 mW under typical work modes.

**Key words:** WIFI; IEEE 802.11; OFDM; baseband; VLSI

**DOI:** 10.1088/1674-4926/32/5/055001

**EEACC:** 2570

## 1. Introduction

Orthogonal frequency division multiplexing (OFDM), multiple-input multiple-output (MIMO), and forward error correction (FEC) are the key baseband technologies of IEEE 802.11a/g/n, which, in the modes considered here, supports data rates from 6 up to 65 Mbps. Almost all WLAN devices must now support the SISO IEEE 802.11a/g/n OFDM modes, especially low-power mobile multimedia devices, which creates an urgent need for low-power, low-cost chip designs.

The presented baseband transceiver ASIC is composed of a transmitter and a receiver, which support all of the non-HT modes (IEEE 802.11a/g, 6–54 Mbps) as well as the mixed HT modes (IEEE 802.11n, 6.5–65 Mbps). The chip is implemented in SMIC 0.18 $\mu$m CMOS technology. The baseband prototype has been verified with WLAN baseband test equipment.

## 2. Transceiver architecture

The frame structure is very important for implementing the OFDM transceiver. Figure 1 shows the frame structure of the IEEE 802.11n mixed HT mode.

The proposed baseband transceiver architecture is shown in Fig. 2, where the upper part is the transmitter and the lower part is the receiver. The left part is the baseband control manage unit.

The transmitter mainly consists of a training sequence generation unit, a QAM mapping unit, a pilot insertion unit, an IFFT unit, a cyclic prefix (CP) extension unit, a digital up-conversion (DUC) unit and a presorting operation unit. The transmitted signal fully complies with the power spectrum mask requirement.

The receiver consists of a digital down-conversion (DDC) unit, a frame start detector unit, a symbol start detector unit and an FFT unit. The receiving chain also includes a multi-step carrier frequency offset recovery loop unit, an adaptation-aided frequency-domain equalizer (FEQ) unit, a de-interleaver unit and a Viterbi decoder unit.

To efficiently handle the complexity entailed by the large number of transmission modes defined in IEEE 802.11, the receiver is broken up into three controller levels and five main data-path elements.

To lower the power and area of the transceiver efficiently, multiple techniques are used in this design. The most important strategies are optimizing the system architecture and sharing hardware, which greatly decrease the area and power compared with other designs.

## 3. Transmitter design

The chip implements bit-interleaved coded modulation with convolutional encoding, puncturing, and bit-wise interleaving. A single pipelined 64-point FFT/IFFT unit is shared between the transmitter and the receiver. The transmitter mainly consists of training sequence generation, QAM mapping, pilot insertion, IFFT, cyclic prefix (CP) extension, windowing, and wave shaping functional blocks.

By generating the preambles in the time-domain, the transmitter latency is reduced to less than 0.2  $\mu$ s. The wave shaping filter adopts a raised cosine filter with a roll-off factor of 0.22. The transmitted signal completely complies with the power spectrum mask requirement<sup>[1]</sup>.

### 3.1. Multi-mode FFT/IFFT

The 64-point pipelined FFT/IFFT processor is implemented with the radix-4 algorithm in the single-path delay feedback (SDF) architecture, and can be divided into three pipeline stages. In order to decrease the impact of the error introduced by the finite bit-width, the bit-width is selected as 16 bits, which gives an SNR of 65 dB for this FFT/IFFT design (Fig. 3) and satisfies the needs of the transceiver. The processor is designed to support both FFT and IFFT, so it can be shared between the transmitter and the receiver, and a suitable bit-width is chosen; together these measures decrease the area efficiently<sup>[3]</sup>.

<sup>\*</sup> Project supported by the Major National Science & Technology Program of China (No. 2009ZX03007-002) and the National Natural Science Foundation of China (No. 60976022).

<sup>†</sup> Corresponding author. Email: wubin@ime.ac.cn

Received 2 November 2010, revised manuscript received 11 January 2011

Fig. 1. IEEE 802.11n mixed format frame.

Fig. 2. Architecture of the IEEE 802.11 transceiver.

Fig. 3. SNR versus bit-width for 64-point FFT.
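The three-stage radix-4 decomposition of a 64-point transform, and the reuse of one FFT core for the IFFT, can be illustrated with a minimal floating-point sketch. It is not the SDF pipeline or the 16-bit fixed-point datapath of the chip; the function names `fft_radix4` and `ifft_radix4` are hypothetical, and the IFFT is obtained here by the conjugation identity, which is only one possible way to share the core.

```python
import cmath

def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    # Four interleaved sub-sequences, each transformed by a quarter-length FFT.
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    q = n // 4
    out = [0j] * n
    for k in range(q):
        # Twiddle factors W_n^(r*k) applied to the output of sub-FFT r.
        t = [subs[r][k] * cmath.exp(-2j * cmath.pi * r * k / n) for r in range(4)]
        # 4-point butterfly: only multiplications by 1, -1, j and -j.
        out[k] = t[0] + t[1] + t[2] + t[3]
        out[k + q] = t[0] - 1j * t[1] - t[2] + 1j * t[3]
        out[k + 2 * q] = t[0] - t[1] + t[2] - t[3]
        out[k + 3 * q] = t[0] + 1j * t[1] - t[2] - 1j * t[3]
    return out

def ifft_radix4(x):
    """IFFT computed with the same FFT core: conj(FFT(conj(x))) / N."""
    n = len(x)
    y = fft_radix4([v.conjugate() for v in x])
    return [v.conjugate() / n for v in y]
```

For a 64-point input the recursion passes through exactly three radix-4 levels (64, 16, 4), which corresponds to the three pipeline stages mentioned above.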

### 3.2. Multi-mode interleaver/de-interleaver

The multi-mode interleaver/de-interleaver is implemented with a novel design strategy and a new architecture based on an adder and a cyclic-shift register. We propose three design techniques: splitting the interleaving operations, merging the permutations, and replacing the arithmetic expressions with optimized cell structures. The resulting unit can be shared between the transmitter and the receiver and has a smaller area than other designs<sup>[4]</sup>.
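For reference, the permutations that such a unit has to realize can be written down directly from the two-step block interleaver of the IEEE 802.11 OFDM PHY. The sketch below is a plain software model of those index equations, not the adder and cyclic-shift architecture of Ref. [4]; the function names and the `n_col` parameter (16 columns for the non-HT modes, 13 for the HT 20 MHz modes) are illustrative choices.

```python
def interleave_map(n_cbps, n_bpsc, n_col=16):
    """Return perm such that output[perm[k]] = input[k] for one OFDM symbol.

    n_cbps: coded bits per OFDM symbol, n_bpsc: coded bits per subcarrier.
    """
    s = max(n_bpsc // 2, 1)
    n_row = n_cbps // n_col
    perm = []
    for k in range(n_cbps):
        # First permutation: adjacent coded bits go to non-adjacent subcarriers.
        i = n_row * (k % n_col) + k // n_col
        # Second permutation: alternate between more and less significant bits.
        j = s * (i // s) + (i + n_cbps - (n_col * i) // n_cbps) % s
        perm.append(j)
    return perm

def interleave(bits, perm):
    out = [0] * len(bits)
    for k, j in enumerate(perm):
        out[j] = bits[k]
    return out

def deinterleave(bits, perm):
    # Inverse mapping: original bit k is read back from position perm[k].
    return [bits[j] for j in perm]
```

Because the de-interleaver only reads the same index map in the opposite direction, one address generator can serve both the transmit and the receive path, which is the property the shared design exploits.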

## 4. Receiver design

The SISO IEEE 802.11 OFDM receiver consists of the inner receiver and the outer receiver. The detailed algorithms are described in Ref. [2]. The receiver has a high computational complexity compared with the transmitter. The key signal processing modules are synchronization, channel estimation, residual carrier frequency offset and phase tracking, and decoding.

### 4.1. Frame structure

The non-HT frame structure is shown in Fig. 1. The first part of the short preamble, about 4.8 $\mu$s, is reserved for AGC operation. The remaining short preamble is used for coarse symbol boundary detection and for carrier frequency offset (CFO) estimation and compensation. The fine symbol boundary detection, the fine CFO estimation and the channel estimation are completed during the long preamble. The residual CFO is tracked in the data field. The receive work mode information is obtained by demodulating the signal field.

### 4.2. Time and frequency joint synchronization

The time-domain frequency offset estimation algorithm is as follows. When a frequency offset exists, the received baseband signal is

$$r_n = s_n \mathrm{e}^{\mathrm{j}2\pi f_\Delta n T_\mathrm{s}},\tag{1}$$

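where $s_n$ is the transmitted baseband signal, $f_{\Delta} = f_{tx} - f_{rx}$ is the frequency offset, and $T_s$ is the sampling period. Correlating the received signal with a copy of itself delayed by $D$ samples over a window of $L$ samples gives

$$z = \sum_{n=0}^{L-1} r_n r_{n+D}^* = \sum_{n=0}^{L-1} s_n s_{n+D}^* \mathrm{e}^{\mathrm{j}2\pi f_\Delta n T_\mathrm{s}} \mathrm{e}^{-\mathrm{j}2\pi f_\Delta (n+D) T_\mathrm{s}}. \tag{2}$$

Because the preamble repeats with a period of $D$ samples, $s_n s_{n+D}^* = |s_n|^2$, so

$$\sum_{n=0}^{L-1} r_n r_{n+D}^* = \mathrm{e}^{-\mathrm{j}2\pi f_\Delta D T_\mathrm{s}} \sum_{n=0}^{L-1} |s_n|^2. \tag{3}$$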

The frequency offset is calculated as

$$f_{\Delta} = -\frac{1}{2\pi DT_{\rm s}} \arctan(\max(z)). \tag{4}$$
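The estimator can be checked numerically with a short sketch. It assumes a noise-free preamble that repeats every $D = 16$ samples and a 20 MHz sample rate ($T_s = 50$ ns), and it takes the phase of the correlation sum $z$ with `cmath.phase`; the function name `estimate_cfo`, the test sequence and the 100 kHz offset are illustrative values, not taken from the design.

```python
import cmath
import math

def estimate_cfo(r, delay, length, ts):
    """Estimate the frequency offset from the delay-and-correlate sum z."""
    z = sum(r[n] * r[n + delay].conjugate() for n in range(length))
    # The phase of z equals -2*pi*f_delta*D*Ts, so invert that relation.
    return -cmath.phase(z) / (2 * math.pi * delay * ts)

# Hypothetical test signal: a unit-modulus sequence with period D = 16.
ts = 50e-9
period = [cmath.exp(2j * math.pi * k * k / 16) for k in range(16)]
s = period * 4
f_true = 100e3
# Apply the frequency offset as in Eq. (1).
r = [s[n] * cmath.exp(2j * math.pi * f_true * n * ts) for n in range(len(s))]
f_est = estimate_cfo(r, delay=16, length=32, ts=ts)
```

Since the phase is only unambiguous within $\pm\pi$, this estimator can resolve offsets up to $1/(2DT_s)$ in magnitude, which is 625 kHz for the values assumed here.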

The design uses the autocorrelation to detect the start of the frame, which avoids the effects of multipath and frequency offset. Sharing the autocorrelation between time and frequency synchronization in the joint synchronization method decreases the area and reduces the hardware cost<sup>[5]</sup>.



Fig. 4. Time frequency synchronization architecture.

$$P(d) = \sum_{m=0}^{L-1} r_{d+m} r_{d+m+L}^*. \tag{5}$$

The autocorrelation peak is not sharp enough to locate the start of the received frame precisely, so in this paper the precise position is obtained with a matched filter as

$$m(n) = \left| \sum_{k=0}^{L-1} s_k r_{n+k}^* \right|^2. \tag{6}$$
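The two timing metrics can be compared with a small software model. The sketch below evaluates the autocorrelation $P(d)$ and the matched-filter output $m(n)$ on a hypothetical noise-free signal in which a known 32-sample unit-modulus sequence starts at sample 20; the sequence, the offset and the function names are illustrative assumptions.

```python
import cmath
import math
import random

def autocorr_metric(r, d, L):
    """P(d) of Eq. (5): correlate L samples at d with the L samples that follow."""
    return sum(r[d + m] * r[d + m + L].conjugate() for m in range(L))

def matched_filter(r, s):
    """m(n) of Eq. (6): squared magnitude of the correlation with the known s."""
    L = len(s)
    return [abs(sum(s[k] * r[n + k].conjugate() for k in range(L))) ** 2
            for n in range(len(r) - L + 1)]

# Hypothetical test signal: a known QPSK-like sequence placed at offset 20.
rng = random.Random(0)
s = [cmath.exp(1j * (math.pi / 4 + math.pi / 2 * rng.randrange(4)))
     for _ in range(32)]
r = [0j] * 20 + s + [0j] * 20
m = matched_filter(r, s)
start = m.index(max(m))  # index of the matched-filter peak
```

In this noise-free case the matched-filter output reaches its maximum only when the window is aligned with the known sequence, which is why it gives a sharper timing point than the autocorrelation.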

The circuit architecture of the time and frequency joint synchronization is shown in Fig. 4.

### 4.3. Frequency domain channel estimation and equalization

In the design, the channel estimate $\hat{H}_k$ is obtained from the long training sequence.

The typical ways to estimate the channel for equalization are least squares (LS) and minimum mean square error (MMSE). MMSE performs better than LS but has a higher computational complexity, so LS is used in practice in order to achieve low complexity:

$$\hat{H}_{\rm LS} = F Q_{\rm LS} F^{\rm H} X^{\rm H} Y, \tag{7}$$

$$Q_{\rm LS} = (F^{\rm H} X^{\rm H} X F)^{-1}, \tag{8}$$

$$\hat{H}_{\rm LS} = X^{-1}Y. \tag{9}$$
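As a quick check of how the general LS expression collapses to a per-subcarrier division, assume that $F$ is the full $N \times N$ DFT matrix (hence invertible), that $X$ is a diagonal matrix of nonzero training symbols, and that $Q_{\rm LS}$ takes its usual form $(F^{\rm H} X^{\rm H} X F)^{-1}$; these assumptions are not stated explicitly in the text. Then

$$\begin{aligned}
\hat{H}_{\rm LS} &= F (F^{\rm H} X^{\rm H} X F)^{-1} F^{\rm H} X^{\rm H} Y \\
&= F F^{-1} (X^{\rm H} X)^{-1} (F^{\rm H})^{-1} F^{\rm H} X^{\rm H} Y \\
&= (X^{\rm H} X)^{-1} X^{\rm H} Y \\
&= X^{-1} Y.
\end{aligned}$$

Under these assumptions the LS estimate needs no matrix inversion in hardware: each subcarrier is handled by a single division, or by a multiplication with $X_k^*$ when $|X_k| = 1$.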

According to the frame structure, the actual channel estimation and channel equalization are obtained as follows,



Fig. 5. VLSI architecture for residual carrier offset tracking.

$$\begin{aligned}
\hat{H}_{k} &= \frac{1}{2}(R_{1,k} + R_{2,k})X_{k}^{*} \\
&= \frac{1}{2}(H_{k}X_{k} + W_{1,k} + H_{k}X_{k} + W_{2,k})X_{k}^{*} \\
&= H_{k}|X_{k}|^{2} + \frac{1}{2}(W_{1,k} + W_{2,k})X_{k}^{*} \\
&= H_{k} + \frac{1}{2}(W_{1,k} + W_{2,k})X_{k}^{*},
\end{aligned} \tag{13}$$

$$X(k) = \frac{Y(k)}{\hat{H}(k)} + \frac{W(k)}{\hat{H}(k)}. \tag{14}$$
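The averaging over the two long training symbols and the subsequent one-tap equalization can be sketched in a few lines. The example assumes unit-modulus training symbols ($|X_k| = 1$, as for the BPSK long training sequence), a noise-free channel and only four subcarriers for brevity; the function names and the numerical values are hypothetical.

```python
def ls_channel_estimate(r1, r2, x):
    """Average the two received long training symbols and remove the known
    training symbol; multiplying by conj(x) equals dividing by x when |x| = 1."""
    return [0.5 * (a + b) * xk.conjugate() for a, b, xk in zip(r1, r2, x)]

def equalize(y, h_hat):
    """One-tap frequency-domain equalizer: divide each subcarrier by its estimate."""
    return [yk / hk for yk, hk in zip(y, h_hat)]

# Hypothetical 4-subcarrier example without noise.
h = [1 + 0.5j, 0.8 - 0.2j, -0.3 + 0.9j, 0.6 + 0.1j]
x_train = [1 + 0j, -1 + 0j, 1 + 0j, -1 + 0j]
r1 = [hk * xk for hk, xk in zip(h, x_train)]
r2 = list(r1)  # second long training symbol sees the same channel
h_hat = ls_channel_estimate(r1, r2, x_train)

data = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
y = [hk * dk for hk, dk in zip(h, data)]
data_hat = equalize(y, h_hat)
```

With noise present, averaging the two training symbols halves the noise variance of the estimate, which is the benefit of the factor $\frac{1}{2}(W_{1,k} + W_{2,k})$ in the derivation above.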

### 4.4. Residual carrier offset and phase tracking

The residual carrier offset has a serious impact on the performance of the receiver, which becomes even more serious as the OFDM symbol length increases (Eqs. (15)–(17))<sup>[6]</sup>. The VLSI architecture for residual frequency offset estimation and phase tracking is shown in Fig. 5. As the architecture shows, the pilots are used to estimate the residual carrier offset and phase.

$$\begin{aligned}
Y_{k,l} = {} & X'_{k,l} H_{k,l} e^{j\frac{2\pi}{N}(N_{\rm CP} + lN_{\rm sym})\phi_k} e^{j\pi \left(\frac{N_d - 1}{N}\phi_k + \theta\right)} \\
& \times \operatorname{si}(\pi\phi_k) + W_{k,l},
\end{aligned} \tag{15}$$

$$Y_{k,l} = X'_{k,l} H'_{k,l} e^{jl\phi_l(k)C} + W_{k,l}, \tag{16}$$

$$C = 2\pi \frac{N_{\rm sym}}{N}. \tag{17}$$

In order to simplify the circuit design, the residual frequency estimation and residual phase tracking are dealt with at the same time.

$$R_{l,k} = P_{l,k} \tilde{H}_{l,k} \exp(j\phi_{l,k}), \tag{18}$$

$$\hat{\phi}_{l,k} = \arg \frac{R'_{l,k}}{P_{l,k}}. \tag{19}$$
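A software view of the pilot-based phase estimate is sketched below. It assumes that $R'_{l,k}$ denotes the received pilot after division by the channel estimate, and it combines the pilots of one OFDM symbol by summing the complex ratios before taking the argument; that combining rule, the function names and the numerical values are assumptions made for illustration, not details given in the text.

```python
import cmath

def pilot_phase(r_pilots, h_pilots, p_pilots):
    """Common phase of one OFDM symbol estimated from its pilot subcarriers."""
    acc = 0j
    for r, h, p in zip(r_pilots, h_pilots, p_pilots):
        acc += (r / h) / p  # equalized pilot divided by the known pilot value
    return cmath.phase(acc)

def derotate(symbols, phi):
    """Remove the estimated common phase from all subcarriers of the symbol."""
    rot = cmath.exp(-1j * phi)
    return [v * rot for v in symbols]

# Hypothetical example: four pilots rotated by a residual phase of 0.3 rad.
p = [1 + 0j, 1 + 0j, 1 + 0j, -1 + 0j]
h = [0.9 + 0.2j, 1.1 - 0.1j, 0.7 + 0.6j, -0.4 + 0.8j]
phi_true = 0.3
r = [pk * hk * cmath.exp(1j * phi_true) for pk, hk in zip(p, h)]
phi_hat = pilot_phase(r, h, p)
corrected = derotate(r, phi_hat)
```

Estimating one phase per symbol in this way covers both the residual frequency offset, which appears as a phase that grows from symbol to symbol, and the phase noise, so the two effects can be handled by the same circuit.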



Fig. 6. 64-state trellis of the (2,1,7) convolutional code.



Fig. 7. Receiver PER simulation in AWGN.



Fig. 8. Layout and fabricated ASIC.

### 4.5. Channel decoding

The Viterbi decoder is used in the outer receiver. Its key task is to find the path through the trellis with the shortest distance to the received sequence and to output the bits along that path (Fig. 6). The channel decoder is implemented as a fully parallel radix-2 architecture in order to decrease the area of the baseband, and it supports a 65 Mbps throughput with a low area cost at an 80 MHz clock frequency.
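The shortest-distance search can be illustrated with a compact hard-decision software model of the (2,1,7) code. It uses the generator polynomials 133 and 171 (octal) of the IEEE 802.11 convolutional code and assumes a frame terminated with six zero tail bits; it keeps full survivor paths in lists, which is a simplification and does not reflect the fully parallel radix-2 hardware described above. All function names are hypothetical.

```python
G0, G1 = 0o133, 0o171  # generator polynomials, newest bit in the MSB

def parity(v):
    return bin(v).count("1") & 1

def conv_encode(bits):
    """Rate-1/2, constraint-length-7 convolutional encoder."""
    state, out = 0, []
    for b in bits:
        reg = (b << 6) | state  # 7-bit register: input bit plus 6 state bits
        out += [parity(reg & G0), parity(reg & G1)]
        state = reg >> 1
    return out

def viterbi_decode(coded):
    """Hard-decision Viterbi decoder over the 64-state trellis.

    Assumes the encoder started in state 0 and was flushed back to state 0.
    """
    n_states = 64
    inf = float("inf")
    metric = [0] + [inf] * (n_states - 1)
    paths = [[] for _ in range(n_states)]
    for t in range(0, len(coded), 2):
        rx = coded[t:t + 2]
        new_metric = [inf] * n_states
        new_paths = [None] * n_states
        for s in range(n_states):
            if metric[s] == inf:
                continue  # state not reachable yet
            for b in (0, 1):
                reg = (b << 6) | s
                nxt = reg >> 1
                # Branch metric: Hamming distance to the received bit pair.
                bm = (parity(reg & G0) != rx[0]) + (parity(reg & G1) != rx[1])
                m = metric[s] + bm
                if m < new_metric[nxt]:  # add-compare-select
                    new_metric[nxt] = m
                    new_paths[nxt] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    return paths[0]  # survivor ending in the all-zero state
```

A typical use is to append six zero bits to the message, encode it, and pass the received hard decisions to `viterbi_decode`, which returns the message followed by the tail bits.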



Fig. 9. 16QAM EVM.



Fig. 10. Baseband video transfer demo.

## 5. Result analysis

Figure 7 shows the packet error rate (PER) performance of the transceiver design, with the SNR swept from 1 to 35 dB, for the different modulation and coding schemes (MCS0–MCS7) in SISO IEEE 802.11n 1×1 mode with a 20 MHz bandwidth in an AWGN channel model. The whole design has been verified on an FPGA prototype and the ASIC has been manufactured in SMIC 0.18 $\mu$m technology. The transceiver is implemented in a 2.6 × 2.6 mm<sup>2</sup> chip area. The power consumption is 83 mW under a 1.8 V/3.3 V power supply. The layout and the chip micrograph are provided in Fig. 8.

The error vector magnitude (EVM) characterizes the transmitter's performance. The EVM is measured with standard WLAN test equipment. The measured EVM of –31 dB is better than the requirement of the IEEE 802.11 specification (Fig. 9). The video transfer demo shows that the baseband system works well (Fig. 10).

The design in this paper supports the IEEE 802.11 SISO OFDM work modes. Table 1 compares the performance of the designed chip with other SISO IEEE 802.11 OFDM baseband chips. The baseband is the biggest part of a SISO IEEE 802.11 SoC. The performance summary shows that the design in this paper has a smaller area and lower power than the other baseband chips.

Table 1. Performance summary.

| Design                 | This work | Ref. [7]      | Ref. [8]           | Ref. [9] | Ref. [10] |
|------------------------|-----------|---------------|--------------------|----------|-----------|
| Area (mm<sup>2</sup>)  | 6.76      | 46.24         | 12.25              | 256      | 59        |
| Process (µm)           | 0.18      | 0.25          | 0.25               | 0.18     | 0.25      |
| Transistors (M)        | 0.8       | 4.0           | 1.2<sup>(1)</sup>  | 30       |           |
| Power (mW)             | 83        | 452           | 109                | 958      | 795       |
| Voltage (V)            | 1.8/3.3   | 2.5/3.3       | 2.5                | 1.5/3.3  | 2.5       |
| Function               | bb        | bb/mac/afe/rf | bb                 | bb+mac   | bb        |

Note (1): The design has 302189 gates (about 1.2 M transistors).

## 6. Conclusion

This paper presents the chip architecture of a SISO IEEE 802.11 transceiver baseband ASIC. The chip consists of a transmitter and a receiver, which support all of the SISO IEEE 802.11 OFDM work modes. By sharing the hardware resources efficiently and optimizing the key modules, the area and the power are decreased greatly. The simulation results show that the PER satisfies the needs of the IEEE 802.11 specification, and the prototype works well. The fabricated chip shows that the design supports 802.11a/g OFDM as well as the 802.11n $1 \times 1$ mode, and occupies a smaller area than other SISO 802.11a/g baseband chips.

## References

- [1] Draft 802.11n Standard, IEEE P802.11n/D4.0, March, 2008
- [2] Tse D. Fundamentals of wireless communication. Cambridge University Press, 2007
- [3] Jiang Xin. Design and implementation for FFT/IFFT processor. BaD Dissertation, University of Electronic Science and Technology of China/Institute of Microelectronics, Chinese Academy of Sciences, 2009 (in Chinese)
- [4] Zhang Zhendong, Wu Bin, Zhu Yongxu, et al. Design and implementation of a multi-mode interleaver/deinterleaver for MIMO OFDM systems. IEEE International Conference on ASIC, 2009, 1: 514
- [5] Wu Bin. Research for the key technology of the VLSI circuits for MIMO-OFDM receive. PhD Dissertation, Institute of Microelectronics, Chinese Academy of Sciences, 2011
- [6] Das S, Kumar R V R. Low complexity residual phase tracking algorithm for OFDM-based WLAN systems. Proc Fourth International Symposium on Communication Systems, Networks and Digital Signal Processing, 2004: 128
- [7] Thomson J, Baas B. An integrated 802.11a baseband and MAC processor. IEEE International Solid-State Circuits Conference, Digest of Technical Papers, 2002: 126
- [8] Seng W H, Chang C C. Digital VLSI OFDM transceiver architecture for wireless SOC design. IEEE International Symposium on Circuits and Systems, 2005, 6: 5794
- [9] Fujisawa T, Hasegawa J, Tsuchie K, et al. A single-chip 802.11a MAC/PHY with a 32-b RISC processor. IEEE J Solid-State Circuits, 2003, 38(11): 2001
- [10] Chinchilla A L T. Synchronization and channel estimation in OFDM: algorithms for efficient implementation of WLAN systems. PhD Dissertation, Institute for High Performance microelectronics in Frankfurt an der Oder, Germany, 2003