# A high speed direct digital frequency synthesizer realized by a segmented nonlinear DAC

Yuan Ling(袁凌)<sup>1,†</sup>, Ni Weining(倪卫宁)<sup>1</sup>, Hao Zhikun(郝志坤)<sup>1</sup>, Shi Yin(石寅)<sup>1</sup>, and Li Wenchang(李文昌)<sup>2</sup>

(1 Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China)

(2 University of Electronic Science and Technology of China, Chengdu 610054, China)

Abstract: This paper presents a high speed ROM-less direct digital frequency synthesizer (DDFS) which has a phase resolution of 32 bits and a magnitude resolution of 10 bits. A 10-bit nonlinear segmented DAC is used in place of the ROM look-up table for phase-to-sine amplitude conversion and the linear DAC in a conventional DDFS. The design procedure for implementing the nonlinear DAC is presented. To ensure high speed, current mode logic (CML) is used. The chip is implemented in Chartered 0.35 $\mu$m CMOS technology with an active area of 2.0 × 2.5 mm<sup>2</sup> and a total power consumption of 400 mW at a single 3.3 V supply voltage. The maximum operating frequency is 850 MHz at room temperature and 1.0 GHz at 0 °C.

**Key words:** direct digital frequency synthesizer; nonlinear DAC; segmented; ROM-less; CML

**DOI:** 10.1088/1674-4926/30/9/095003

**EEACC:** 1265H

## 1. Introduction

Direct digital frequency synthesizers (DDFS) play a very important role in modern digital communication systems. Due to their fine frequency resolution, large bandwidth and fast settling time, they are essential components in spread-spectrum systems such as Bluetooth devices and radar.

A conventional DDFS contains a ROM that stores the sine waveform data, and the ROM size grows exponentially with the desired phase resolution. The ROM for the sine look-up table occupies the majority of the DDFS area and also limits the maximum operating frequency because of the delay through the multi-layer decoders. Though many ROM compression methods have been proposed, such as trigonometric and parabolic approximation<sup>[1]</sup>, the problems indicated above still exist.

A novel approach is to replace the conventional linear DAC and the ROM with a nonlinear DAC that converts the digital phase word into an analog sine waveform directly<sup>[2]</sup>. Thus the ROM is completely removed and the performance of the DDFS is significantly improved.

In this paper, an application-specific DDFS consisting of an on-chip current-steering nonlinear DAC and a 32-bit frequency-tuning word (FTW) accumulator is implemented in a standard twin-well, 4-metal-layer 0.35 $\mu$m CMOS process. The nonlinear DAC based DDFS and the architecture of the nonlinear DAC are presented first. The 32-bit pipelined accumulator is then discussed, together with the circuit design and layout techniques.

## 2. Nonlinear DAC based DDFS architecture

The conceptual block diagram of the nonlinear DAC based ROM-less DDFS is shown in Fig. 1<sup>[2,3]</sup>. The *M*-bit frequency tuning word (FTW) feeds into a phase accumulator that sets the output frequency of the synthesized sine waveform as

$$f_{\rm out} = \frac{\rm FTW}{2^M} f_{\rm clock}. \tag{1}$$
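As a quick numerical illustration of Eq. (1), the sketch below computes the frequency resolution and the tuning words for the two output frequencies reported later in the measurements, assuming M = 32 and an 850 MHz clock. The function names `ftw_for` and `f_out` are hypothetical helpers introduced here, not part of the design.

```python
# Sketch of Eq. (1): f_out = FTW / 2^M * f_clock.
# ftw_for() and f_out() are illustrative helper names, not from the paper.

M_BITS = 32          # phase accumulator width used in this work
F_CLOCK_HZ = 850e6   # clock frequency used in the measurements


def ftw_for(f_out_hz, f_clock_hz=F_CLOCK_HZ, m_bits=M_BITS):
    """Nearest integer tuning word for a requested output frequency."""
    return round(f_out_hz * 2 ** m_bits / f_clock_hz)


def f_out(ftw, f_clock_hz=F_CLOCK_HZ, m_bits=M_BITS):
    """Output frequency produced by a given tuning word, Eq. (1)."""
    return ftw * f_clock_hz / 2 ** m_bits


# Smallest frequency step: the output frequency for FTW = 1.
resolution_hz = f_out(1)

if __name__ == "__main__":
    print(f"frequency resolution: {resolution_hz:.4f} Hz")
    for target in (1.45e6, 31.25e6):
        word = ftw_for(target)
        print(f"target {target / 1e6} MHz -> FTW = {word}, "
              f"actual = {f_out(word):.3f} Hz")
```

With a 32-bit accumulator the frequency step is a fraction of a hertz, which is why a wide accumulator is attractive even though only a few of its bits reach the DAC.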

The output of the phase accumulator is truncated to *k* bits according to the signal-to-noise ratio (SNR) requirement of the sine output. Exploiting the quadrant symmetry of the sine wave, the two most significant bits (MSBs) determine which quadrant the phase accumulator output lies in. The remaining *k* − 2 bits pass through the complementor and are converted to a sine waveform by the nonlinear DAC.
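The quadrant decoding described above can be sketched behaviourally as follows. This is a minimal model under stated assumptions, not the circuit itself: the nonlinear DAC is replaced by an ideal quarter-wave sine, the phase is normalised to $2^k$ steps per period, and the complementor is modelled as a ones' complement of the lower k − 2 bits. The names `quadrant_decode`, `ideal_quarter_sine` and `ddfs_sample` are hypothetical. The 0.5 LSB offset in the ideal quarter-wave is what makes the ones' complement mirror the waveform exactly.

```python
import math

K = 10  # truncated phase width assumed for this sketch


def quadrant_decode(phase, k=K):
    """Split a k-bit phase word into (negate, quarter-wave index)."""
    mask = (1 << (k - 2)) - 1
    negate = bool((phase >> (k - 1)) & 1)   # MSB: second half of the period
    mirror = bool((phase >> (k - 2)) & 1)   # 2nd MSB: falling quarter
    index = phase & mask
    if mirror:
        index = ~index & mask               # ones' complement (complementor)
    return negate, index


def ideal_quarter_sine(index, k=K):
    """Ideal stand-in for the nonlinear DAC, with a 0.5 LSB phase offset."""
    return math.sin(math.pi * (index + 0.5) / 2 ** (k - 1))


def ddfs_sample(phase, k=K):
    """Full-period sine sample rebuilt from one quadrant of data."""
    negate, index = quadrant_decode(phase, k)
    amplitude = ideal_quarter_sine(index, k)
    return -amplitude if negate else amplitude


if __name__ == "__main__":
    worst = max(
        abs(ddfs_sample(p) - math.sin(2 * math.pi * (p + 0.5) / 2 ** K))
        for p in range(2 ** K)
    )
    print(f"worst-case deviation from a full-period sine: {worst:.2e}")
```

In the real chip the sign is presumably handled in the analog output stage; here it is simply applied as a negation for clarity.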

## 3. Architecture of nonlinear DAC

Fig. 1. Conceptual block diagram of the DDFS.

<sup>†</sup> Corresponding author. Email: lyuan@semi.ac.cn Received 14 February 2009, revised manuscript received 21 April 2009

The segmentation technique is widely used in linear DAC design since it can reduce power consumption and conserve chip area<sup>[4]</sup>. Similar to the segmented linear DAC, we first divide the (*k* − 2)-bit phase accumulator output into three parts: *A*, *B*, *C*, where *A* is the MSB part and *C* is the LSB part. Assume that the numbers of bits for the *A* part, the *B* part and the *C* part are given as *a*, *b* and *c*, respectively. Then based on trigonometric identities, the first quadrant of the sine function can be expressed as<sup>[2]</sup>:

$$\begin{aligned} f(A, B, C) &= \sin \frac{\pi(A + B + C)}{2(2^{a+b+c} - 1)} \\ &= \sin \frac{\pi(A + B)}{2(2^{a+b+c} - 1)} \cos \frac{\pi C}{2(2^{a+b+c} - 1)} + \cos \frac{\pi(A + B)}{2(2^{a+b+c} - 1)} \sin \frac{\pi C}{2(2^{a+b+c} - 1)}, \end{aligned} \tag{2}$$

where a + b + c = k - 2,  $2^{a+b+c} > A \gg B \gg C$ .

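In order to use the segmented DAC design technique, the first and the second terms of $f(A, B, C)$ must be monotonic. Since *C* is relatively small, the factor $\cos \frac{\pi C}{2(2^{a+b+c}-1)}$ is close to 1, and *B* in the cosine factor of the second term can be replaced by its average value, so Eq. (2) can be simplified to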

$$f(A, B, C) \approx \left[ \sin \frac{\pi (A+B)}{2(2^{a+b+c}-1)} + \cos \frac{\pi (A+B_{avg})}{2(2^{a+b+c}-1)} \sin \frac{\pi C}{2(2^{a+b+c}-1)} \right], \quad (3)$$

where $B_{avg}$ is the average value of *B*, which can be expressed as $B_{avg} = \frac{1}{(2^{b+c} - 2^c)} \sum B$.

It can be shown from Eq. (3) that the first term is monotonic, and that the second term is also monotonic for a fixed value of *A*. The first term can therefore be realized using a monotonic nonlinear sine-weighted DAC. The second term cannot be realized directly with the same technique, but for a fixed value of *A* the segmented DAC technique can be applied to implement a monotonic nonlinear sub-DAC. A different sub-DAC is activated according to *A*, and the output of the selected sub-DAC is determined by *C*. A fine DAC can then be constructed using $2^c - 1$ nonlinear sub-DACs. Because of the complementor used in the nonlinear DAC shown in Fig. 1, an offset of 0.5 LSB should be introduced into Eq. (3). Assuming that $I_0$ is the unit current and $k = 10$, the sine-weighted DAC output current $I_{AB}$ and the *A*-th sub-DAC output current $I_{AC}$ are given as

$$I_{\rm AB} = \left[ (2^9 - 1) \sin \frac{2\pi (A + B + 0.5)}{2(2^8 - 1)} \right] I_0, \tag{4}$$

$$I_{\rm AC} = \left[ (2^9 - 1) \sin \frac{2\pi C}{2(2^8 - 1)} \cos \frac{2\pi (A + B_{\rm avg} + 0.5)}{2(2^8 - 1)} \right] I_0. \tag{5}$$
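To get a feel for how much accuracy the simplification from Eq. (2) to Eq. (3) costs, the sketch below evaluates both over the whole first quadrant. It is an illustration under assumptions rather than the authors' design flow: the amplitude is normalised to 1, the 0.5 LSB offset and the current scaling of Eqs. (4) and (5) are left out, *A*, *B* and *C* are taken as the weighted values of their bit fields, and $B_{avg}$ is taken as the plain arithmetic mean of the possible *B* values. The function names are hypothetical.

```python
import math


def _scale(a, b, c):
    """Phase step pi / (2 * (2^(a+b+c) - 1)) used in Eqs. (2) and (3)."""
    return math.pi / (2 * (2 ** (a + b + c) - 1))


def exact_sine(a_idx, b_idx, c_idx, a=3, b=3, c=2):
    """Eq. (2): exact first-quadrant sine of the combined phase A + B + C."""
    phase = (a_idx << (b + c)) + (b_idx << c) + c_idx
    return math.sin(_scale(a, b, c) * phase)


def segmented_sine(a_idx, b_idx, c_idx, a=3, b=3, c=2):
    """Eq. (3): coarse sine term plus a fine term that ignores B."""
    s = _scale(a, b, c)
    A = a_idx << (b + c)          # weighted value of the MSB field
    B = b_idx << c                # weighted value of the middle field
    C = c_idx                     # LSB field
    # Assumption: B_avg is the arithmetic mean of all possible B values.
    b_avg = sum(i << c for i in range(2 ** b)) / 2 ** b
    coarse = math.sin(s * (A + B))
    fine = math.cos(s * (A + b_avg)) * math.sin(s * C)
    return coarse + fine


def max_abs_error(a=3, b=3, c=2):
    """Largest gap between Eq. (3) and Eq. (2), as a fraction of full scale."""
    return max(
        abs(segmented_sine(i, j, l, a, b, c) - exact_sine(i, j, l, a, b, c))
        for i in range(2 ** a)
        for j in range(2 ** b)
        for l in range(2 ** c)
    )


if __name__ == "__main__":
    for split in ((2, 2, 4), (3, 3, 2), (4, 3, 1)):
        print(split, f"max error = {max_abs_error(*split):.5f} of full scale")
```

Because the sketch omits the offset and the rounding of cell currents, its numbers should not be expected to reproduce the NINL<sub>max</sub> column of Table 1; it only shows the trend that a smaller *C* field gives a smaller approximation error.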

Based on Eqs. (4) and (5), a 10-bit segmented nonlinear DAC can be implemented. Figure 2 shows the overall segmented nonlinear DAC architecture. The 10-bit nonlinear DAC is divided into two parts: a sine-weighted DAC and a fine DAC. Each square cell shown in Fig. 2 represents a current cell whose value is given by Eqs. (4) and (5). The sine-weighted DAC current cells are addressed by row and column thermometer decoders according to *A* and *B*, and the fine DAC current cells are addressed by the row binary decoder and the column thermometer decoder according to *A* and *C*, respectively. In the fine DAC, when a row is selected according to *A*, the corresponding nonlinear sub-DAC in that row is selected.

Fig. 2. Segmented nonlinear DAC architecture.

Table 1. Area-utilization efficiency.

| A-B-C (bits) | NINL<sub>max</sub> | NUM<sub>tot</sub> | AUE    |
|--------------|--------------------|-------------------|--------|
| 2-2-4        | 3.78               | 74                | 1.0938 |
| 2-3-3        | 2.20               | 58                | 0.4984 |
| 2-4-2        | 1.08               | 74                | 0.3122 |
| 3-2-3        | 1.12               | 86                | 0.3763 |
| 3-3-2        | 0.56               | 86                | 0.1881 |
| 4-2-2        | 0.29               | 110               | 0.1246 |
| 4-3-1        | 0.27               | 142               | 0.1498 |

Another critical problem is how to choose the values of *a*, *b* and *c*. By analogy with the maximum integral nonlinearity (INL<sub>max</sub>) used to quantify the accuracy of a linear DAC, the maximum amplitude difference between an ideal sine wave and the nonlinear DAC output, denoted NINL<sub>max</sub>, is used to quantify the accuracy of the nonlinear DAC. According to this definition, NINL<sub>max</sub> can be expressed as:

$$NINL_{max} = \max_n \left| A_n - A_{n,\rm ideal} \right|, \qquad (6)$$

where $A_n$ is the output of the nonlinear DAC, and $A_{n,\rm ideal}$ is the ideal sine wave amplitude. However, different segmentations among *A*, *B* and *C* may result in different nonlinear DAC core areas. A simple area-utilization efficiency (AUE) is therefore introduced to study the nonlinear DAC<sup>[2]</sup>. AUE is defined as

$$AUE = NINL_{max} \times NUM_{tot}/2^{k-2}, \qquad (7)$$

where NUM<sub>tot</sub> is the total number of DAC cells, and *k* is the resolution of the nonlinear DAC. Table 1 lists several combinations of *A*, *B* and *C* and their corresponding AUE values.

The AUE values for the 3-3-2, 4-2-2 and 4-3-1 combinations are about the same, but the 3-3-2 combination gives the minimum die area. In addition, its thermometer-code decoders are simpler and smaller. Therefore, it was chosen for the final implementation.
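The AUE column of Table 1 can be recomputed from Eq. (7) with k = 10, as in the sketch below. The NINL<sub>max</sub> and NUM<sub>tot</sub> values are copied from the table. The cell-count expression in `num_cells` is an assumption: it is a formula that happens to reproduce every NUM<sub>tot</sub> entry in the table, but the paper does not state it.

```python
# Recompute Table 1 from Eq. (7): AUE = NINL_max * NUM_tot / 2^(k-2).

K = 10  # resolution of the nonlinear DAC

# (a, b, c) -> (NINL_max, NUM_tot, AUE as printed in Table 1)
TABLE_1 = {
    (2, 2, 4): (3.78, 74, 1.0938),
    (2, 3, 3): (2.20, 58, 0.4984),
    (2, 4, 2): (1.08, 74, 0.3122),
    (3, 2, 3): (1.12, 86, 0.3763),
    (3, 3, 2): (0.56, 86, 0.1881),
    (4, 2, 2): (0.29, 110, 0.1246),
    (4, 3, 1): (0.27, 142, 0.1498),
}


def aue(ninl_max, num_tot, k=K):
    """Area-utilization efficiency, Eq. (7). Lower is better."""
    return ninl_max * num_tot / 2 ** (k - 2)


def num_cells(a, b, c):
    """Assumed cell count: 2^(a+b) coarse cells plus 2^a sub-DACs of
    (2^c - 1) cells each, minus 2. Inferred from Table 1, not from the text."""
    return 2 ** (a + b) + 2 ** a * (2 ** c - 1) - 2


if __name__ == "__main__":
    for split, (ninl, num, printed) in TABLE_1.items():
        print(split, f"cells={num_cells(*split):3d}",
              f"AUE={aue(ninl, num):.4f} (table: {printed})")
```

The recomputed values agree with the printed ones to within rounding of the NINL<sub>max</sub> entries.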



Fig. 3. Block diagram of the accumulator.

## 4. The 32-bit pipelined accumulator

A wide phase accumulator is often used in a DDFS to obtain fine frequency resolution at a high clock frequency<sup>[5]</sup>. However, a wide accumulator cannot finish one addition within a short clock period because of the delay of the carry bits propagating through the adder. There are two ways to improve the performance of the phase accumulator. First, a pipelined accumulator architecture can be used. Second, high-speed logic can be used to implement the accumulator. Conventional adder architectures are the ripple-carry adder and the look-ahead logarithmic adder. When the adder is longer than 4 bits, the look-ahead logarithmic adder is faster than the ripple-carry adder, but it consumes more area and power. Figure 3 shows the block diagram of the accumulator; ripple-carry adders are used to implement the 2-bit pipelined 32-bit accumulator. The output of the 32-bit accumulator is truncated to 10 bits to match the nonlinear DAC.
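The idea of pipelining the carry chain can be illustrated with a small cycle-level model. The sketch below is an assumption about the structure, since Fig. 3 is not reproduced here: it reads "2-bit pipelined" as sixteen 2-bit ripple-carry stages, registers the carry between stages, skews the FTW slices on the way in, and deskews the stage sums on the way out. The function name and its parameters are hypothetical.

```python
from collections import deque


def pipelined_accumulator(ftw, n_cycles, width=32, stage_bits=2):
    """Cycle-level model of a carry-pipelined phase accumulator.

    Each stage adds only stage_bits bits per clock, so the critical path
    is one short ripple-carry adder instead of a full-width carry chain.
    Returns the deskewed accumulator word after every clock cycle.
    """
    n_stages = width // stage_bits
    mask = (1 << stage_bits) - 1
    # Input skew registers: stage i sees its FTW slice i cycles late.
    in_skew = [deque([0] * i) for i in range(n_stages)]
    # Output deskew registers: stage i is delayed by (n_stages - 1 - i).
    out_skew = [deque([0] * (n_stages - 1 - i)) for i in range(n_stages)]
    sums = [0] * n_stages
    carries = [0] * n_stages
    outputs = []
    for _ in range(n_cycles):
        new_sums, new_carries = [], []
        for i in range(n_stages):
            in_skew[i].append((ftw >> (i * stage_bits)) & mask)
            f_i = in_skew[i].popleft()
            # Carry registered by the previous stage on the previous clock.
            carry_in = carries[i - 1] if i > 0 else 0
            total = sums[i] + f_i + carry_in
            new_sums.append(total & mask)
            new_carries.append(total >> stage_bits)
        sums, carries = new_sums, new_carries
        word = 0
        for i in range(n_stages):
            out_skew[i].append(sums[i])
            word |= out_skew[i].popleft() << (i * stage_bits)
        outputs.append(word)
    return outputs


if __name__ == "__main__":
    ftw = 0x12345678
    out = pipelined_accumulator(ftw, 20)
    latency = 32 // 2 - 1  # pipeline latency in clock cycles
    print([hex(x) for x in out[latency - 1:latency + 3]])
```

The price of the short critical path is a fixed latency of one cycle per extra stage, which is harmless for a free-running synthesizer but delays the response to a change of FTW.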

## 5. Implementation

Compared with other logic styles, CMOS current mode logic (CML) is the best choice for realizing high speed circuits because of its superior speed performance<sup>[6,7]</sup>. Figure 4 shows the common representation of a CMOS CML circuit. The pull-up loads, M1 and M2, operate in the triode region and act as load resistors. All of the current from the constant current source flows through one of the two branches, depending on the state of the differential pull-down network (PDN). CMOS CML provides complementary output signals, so the SNR of this logic is better than that of other logic styles. The delay of a CMOS CML circuit can be approximated as

$$\tau = \frac{\Delta V}{I_0} C_{\rm L} = R_{\rm L} C_{\rm L},\tag{8}$$

where  $\Delta V$  is the output voltage swing, and  $I_0$  denotes the current that flows through the current source.  $C_L$  is the load capacitance, and  $R_L$  is the equivalent load resistance.

Increasing $I_0$ could reduce the delay. However, because some of the transistors in the PDN need to stay in saturation while keeping their previous overdrive voltage, the size of the PDN transistors must increase in proportion to $I_0$. The load capacitance therefore also increases, and the delay improvement is only marginal when the load capacitance is dominated by the transistors of the PDN. The typical delay of the CMOS CML is 100–200 ps.

Fig. 4. Common representation of the CML circuit.

Fig. 5. Die photo of the DDFS chip.
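The diminishing return from raising the tail current can be seen by plugging numbers into Eq. (8). All component values in the sketch below are illustrative assumptions chosen to land in the 100–200 ps range quoted above; they are not taken from the chip. The load is modelled as a fixed part plus a part that grows linearly with $I_0$, which stands in for the PDN transistors being widened in proportion to the current.

```python
def cml_delay(delta_v, i0, c_load):
    """Eq. (8): tau = (delta_V / I0) * C_L, in seconds."""
    return delta_v / i0 * c_load


def cml_delay_scaled(delta_v, i0, c_fixed, c_per_amp):
    """Delay when the PDN capacitance scales with the tail current.

    c_fixed   : wiring and fan-out capacitance independent of I0 (F)
    c_per_amp : PDN capacitance added per ampere of tail current (F/A)
    Both parameters are hypothetical modelling knobs.
    """
    return cml_delay(delta_v, i0, c_fixed + c_per_amp * i0)


# Assumed example values, not measured data.
DELTA_V = 0.4        # output swing, volts
C_FIXED = 10e-15     # 10 fF
C_PER_AMP = 2e-10    # 40 fF at 200 uA

if __name__ == "__main__":
    for i0 in (200e-6, 400e-6, 800e-6):
        tau = cml_delay_scaled(DELTA_V, i0, C_FIXED, C_PER_AMP)
        print(f"I0 = {i0 * 1e6:4.0f} uA -> delay = {tau * 1e12:5.1f} ps")
    # The delay can never drop below delta_V * c_per_amp in this model.
    print(f"asymptote = {DELTA_V * C_PER_AMP * 1e12:.1f} ps")
```

In this model quadrupling the current shortens the delay by only a small fraction, which matches the qualitative argument in the text.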

The chip is implemented in a 2-poly, 4-metal 0.35 $\mu$m CMOS technology provided by Chartered Semiconductor, and occupies an active area of $2.0 \times 2.5 \text{ mm}^2$. The layout of the DDFS chip is shown in Fig. 5. The current sources and the switches are placed in separate arrays to avoid coupling from the digital signals to the current sources. Two dummy rows and columns have been added around the current source array to avoid edge effects. To minimize the systematic error introduced by the voltage drop in the ground lines of the current-source transistors, sufficiently wide lines have been used. As illustrated in Fig. 5, a clock driver is placed between the input differential clock pads and the DDFS core. The clock inputs from the PCB are shaped and amplified by the on-chip clock driver, which then provides differential CML-compatible signals to the DDFS core. The maximum delay of the on-chip metal wires is about 50 ps, and the clock tree is carefully built to ensure an acceptable clock skew.

## 6. Experimental results

The DDFS is measured at a single power supply of 3.3 V. The maximum output current into a pair of 50 $\Omega$ termination resistors is 20 mA, which gives a maximum single-ended analog output voltage of 1.0 V. The power consumption of the DDFS is about 400 mW. The supply voltage can vary from 2.4 to 4 V. The maximum operating frequency is 850 MHz at room temperature and 1.0 GHz at 0 °C.

Fig. 6. Output spectrum at $f_{out} = 1.45$ MHz with an 850 MHz clock.
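The headline figures in this paragraph are related by simple arithmetic, which the short sketch below makes explicit using only the values quoted in the text. The variable names are illustrative, and the power efficiency is computed as power divided by clock frequency, which is an assumption about how that figure of merit is defined.

```python
# Values quoted in the measurement paragraph.
I_OUT_MAX_A = 20e-3    # maximum output current
R_LOAD_OHM = 50.0      # termination resistor
POWER_W = 0.4          # total power consumption
F_CLOCK_GHZ = 0.85     # maximum clock frequency at room temperature

# Single-ended output swing across one termination resistor.
v_out_max = I_OUT_MAX_A * R_LOAD_OHM

# Assumed figure of merit: power per unit clock frequency.
power_efficiency = POWER_W / F_CLOCK_GHZ

if __name__ == "__main__":
    print(f"max single-ended output voltage: {v_out_max:.2f} V")
    print(f"power efficiency: {power_efficiency:.2f} W/GHz")
```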

Figure 6 shows the measured spectrum at an output frequency of 1.45 MHz with an input clock frequency of 850 MHz. The measured spectrum at an output frequency of 31.25 MHz is presented in Fig. 7. It can be seen that the wideband spurious free dynamic range (SFDR) of the DDFS is about 50 dB, and the narrowband SFDR of the DDFS is more than 75 dB for both $f_{out} = 1.45$ MHz and $f_{out} = 31.25$ MHz.

The SFDR degrades at high synthesized frequencies. This is mainly due to switching feed-through from the digital inputs of the current-steering pairs to the biasing voltage of the current sources. As a result of this, undesired harmonics and other spurious signals are generated. A better result can be obtained if an on-chip biasing source with low output impedance is used.

## 7. Conclusion

Table 2 summarizes the performance of the DDFS and compares it with other reported DDFSs.

Table 2. Comparison between different reported DDFSs.

|                                | Ref. [2] | Ref. [8] | Ref. [3] | This work |
|--------------------------------|----------|----------|----------|-----------|
| Process (µm)                   | 0.25     | 0.13     | 0.35     | 0.35      |
| Active area (mm<sup>2</sup>)   | 1.4      | 0.01     | 0.008    | 5         |
| Power efficiency (W/GHz)       | 0.8      | 0.008    | 0.008    | 0.47      |
| Phase resolution (bits)        | 12       | 24       | NA       | 32        |
| Amplitude resolution (bits)    | 11       | 10       | 9        | 10        |
| SFDR (dBc)                     | 51       | 63.2     | 50       | 50        |
| Max clock freq (MHz)           | 300      | 1000     | 50       | 850       |

In this paper, a high speed ROM-less DDFS which incorporates a 10-bit nonlinear current-steering DAC is presented. In order to achieve higher speed, CMOS CML is used to implement the logic cells. The maximum operating frequency is 850 MHz at room temperature and 1.0 GHz at 0 °C. The wideband and narrowband SFDR of the presented DDFS are about 50 dBc and 75 dBc, respectively. The active area of the DDFS is $2.0 \times 2.5$ mm<sup>2</sup> and the total power consumption is about 400 mW at a single 3.3 V power supply.

## References

- [1] Sodagar A M, Lahiji G R. Mapping from phase to sine-amplitude in direct digital frequency synthesizers using parabolic approximation. IEEE Trans Circuits Syst II, 2000, 47: 1452
- [2] Jiang J, Lee E K F. A low-power segmented nonlinear DAC-based direct digital frequency synthesizer. IEEE J Solid-State Circuits, 2002, 37: 1326
- [3] McEwan A, Collins S. Direct digital frequency synthesis by analog interpolation. IEEE Trans Circuits Syst II, 2006, 53: 1294
- [4] Bosch A V D, Borremans M A F, Steyaert M S J, et al. A 10-bit 1-GSample/s Nyquist current-steering CMOS D/A converter. IEEE J Solid-State Circuits, 2001, 36: 315
- [5] Nicholas H T, Samueli H. An analysis of the output of direct digital frequency synthesizers in the presence of phase-accumulator truncation. Proc 41st Annual Frequency Control Symposium, 1987: 495
- [6] Mizuno M, Yamashina M, Furuta K, et al. A GHz MOS adaptive pipeline technique using MOS current-mode logic. IEEE J Solid-State Circuits, 1996, 31: 784
- [7] Yuan Ling, Ni Weining, Shi Yin. A 10-bit 2-GHz CMOS D/A converter for high-speed system applications. Chinese Journal of Semiconductors, 2007, 28(10): 1540
- [8] Ashrafi A, Milenković A, Adhami R. A 1-GHz direct digital frequency synthesizer based on the quasi-linear interpolation method. IEEE Int Symp Circuits Syst (ISCAS), 2007