# A Compact Direct Digital Frequency Synthesizer for the Rubidium Atomic Frequency Standard

Cao Xiaodong<sup>†</sup>, Ni Weining, Yuan Ling, Hao Zhikun, and Shi Yin

(Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China)

Abstract: A compact direct digital frequency synthesizer (DDFS) is developed for system-on-chip implementation of a high-precision rubidium atomic frequency standard. To keep the chip small and the power consumption low, the phase to sine mapping data are compressed using the sine symmetry technique, the modified Sunderland technique, the sine-phase difference technique, the quad line approximation technique, and the quantization and error read only memory (QE-ROM) technique. Together, these techniques reduce the ROM size by 98%. A compact DDFS chip with 32bit phase storage depth and a 10bit on-chip digital to analog converter has been successfully implemented in a standard $0.35\mu$m CMOS process. The core area of the DDFS is 1.6mm<sup>2</sup>. It consumes 167mW at 3.3V, and its spurious free dynamic range is 61dB.

Key words: CMOS integrated circuit; DDFS; rubidium atomic frequency standard; SoC

EEACC: 1280; 1265H; 1265Z

CLC number: TN432 Document code: A Article ID: 0253-4177(2008)09-1723-06

## 1 Introduction

The system-on-chip (SoC) technique allows an unprecedented degree of miniaturization of rubidium atomic frequency standards. In our miniaturized rubidium atomic frequency standard, a compact direct digital frequency synthesizer (DDFS) plays an important role by generating the modulated 5.3125MHz sine wave. DDFSs can generate single-phase or quadrature sinusoids with excellent frequency resolution, good spectral purity, very fast frequency switching, and phase continuity on switching<sup>[1]</sup>. Generally, a DDFS consists of a phase accumulator, a phase to sine converter (sine ROM), and a DAC. At each clock pulse, the phase increment word in a frequency register is added to the phase value previously held in the phase accumulator. The phase value is generated using the modulo $2^{j}$ overflowing property of a *j* bit phase accumulator<sup>[2]</sup>. The output frequency is the rate of the overflows,

$$f_{\text{out}} = \frac{\Delta p f_{\text{clk}}}{2^{j}} \quad \forall \quad f_{\text{out}} \leqslant \frac{f_{\text{clk}}}{2} \tag{1}$$

where $\Delta p$ is the phase increment word, *j* is the number of phase accumulator bits, $f_{\text{clk}}$ is the system clock frequency, and $f_{\text{out}}$ is the output frequency. The constraint $f_{\text{out}} \leqslant f_{\text{clk}}/2$ is required by the sampling theorem. The spectral purity of a conventional DDFS is partly determined by the resolution of the values stored in the ROM. Increasing the ROM resolution is desirable, but higher resolution generally means a larger ROM. As the ROM size increases, the access speed and the maximum output frequency decrease. Larger ROM storage also means higher power consumption, lower reliability, and greatly increased cost. It is therefore important to reduce the ROM size while still meeting the high resolution requirement of the rubidium atomic frequency standard. The most elementary compression technique is to store only $\pi/2$ rad of sine information and to generate the ROM samples for the full range of $2\pi$ by exploiting the quarter-wave symmetry of the sine function<sup>[3]</sup>. The quarter-wave memory can be compressed further using the modified Sunderland technique, the sine-phase difference technique, the quad line approximation (QLA) technique, and the quantization and error ROM (QE-ROM) technique. The phase of a quarter of a sine wave can be decomposed as $\varphi = \alpha + \beta + \gamma$, where $\alpha$, $\beta$, and $\gamma$ are, respectively, the most significant bits (MSBs), the middle bits, and the least significant bits (LSBs). The whole compression process in this paper is depicted in Fig. 1.

A 10bit on-chip current steering DAC is also implemented to convert the digital amplitude word to an equivalent analog amplitude.

## 2 Phase accumulator

Fig. 1 Compression process diagram

<sup>†</sup> Corresponding author. Email: xdcao@semi.ac.cn

Received 28 April 2008, revised manuscript received 2 June 2008

A 32bit phase accumulator is used in this DDFS. Figure 2 shows the block diagram of the phase accumulator. The phase accumulator is based on a 32bit ripple carry adder, with a string of full adders that operate on the same clock phase. The outputs of the full adders have built-in registers, and the sum bits feed back internally to perform accumulation. A multiplexer (MUX) is used to select the active phase increment word from the 32bit phase increment words stored in registers. To guarantee phase continuity, the select signal should be synchronized with the system clock signal of the DDFS.
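The accumulator behaviour and Eq. (1) can be illustrated with a short behavioural sketch. It is not a model of the ripple carry adder itself: the function names `tuning_word`, `output_frequency`, and `accumulate` are hypothetical, and the 20MHz clock and the 625kHz and 5.3125MHz outputs are taken from the measurements reported in Section 5.

```python
F_CLK_HZ = 20_000_000   # system clock used in the measurements of Section 5
ACC_BITS = 32           # phase accumulator width j


def tuning_word(f_out_hz, f_clk_hz=F_CLK_HZ, bits=ACC_BITS):
    """Phase increment word from Eq. (1), rounded to the nearest integer."""
    if not 0 <= f_out_hz <= f_clk_hz / 2:
        raise ValueError("f_out must satisfy 0 <= f_out <= f_clk / 2")
    return round(f_out_hz * (1 << bits) / f_clk_hz)


def output_frequency(delta_p, f_clk_hz=F_CLK_HZ, bits=ACC_BITS):
    """Output frequency produced by a given phase increment word, Eq. (1)."""
    return delta_p * f_clk_hz / (1 << bits)


def accumulate(delta_p, n_clocks, bits=ACC_BITS):
    """Phase after each clock; the mask models the modulo-2^j overflow."""
    mask = (1 << bits) - 1
    phase = 0
    phases = []
    for _ in range(n_clocks):
        phase = (phase + delta_p) & mask
        phases.append(phase)
    return phases


if __name__ == "__main__":
    for f_out in (625_000, 5_312_500):
        print(f_out, tuning_word(f_out))
    print("frequency resolution in Hz:", output_frequency(1))
```

With a 32bit accumulator, the smallest frequency step is $f_{\text{clk}}/2^{32}$, which is why a wide accumulator is attractive for a frequency standard even though only the upper phase bits address the sine ROM.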

## 3 Phase to sine converter

### 3.1 Sine function symmetry technique

Fig. 2 Phase accumulator block diagram

The full-period sine wave can be reconstructed from only $\pi/2$ rad of sine information by exploiting the quarter-wave symmetry of the sine function. Figure 3 shows the details of this method. Since the two most significant bits of the phase accumulator output represent the quadrant of the sine function, the most significant bit (MSB) is used as the sign bit of the result, and the second most significant bit (2nd MSB) controls whether the phase increases or decreases<sup>[4]</sup>. The low $j - 2$ bits of the phase accumulator output are sent to a complementor controlled by the 2nd MSB to generate the addresses for the quarter-sine ROM. The slope of the sawtooth is inverted for the second quadrant, as shown in Fig. 3. The waveform at the output of the quarter-sine ROM is the quantized sine wave. The full-period sine wave is generated at the output of the second complementor, which consists of $m - 1$ exclusive-or gates and an inverter. Thus, the ROM capacity decreases at the expense of additional logic.
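The quarter-wave reconstruction can be sketched in software as follows. This is a minimal illustration, not the chip's logic: the 12bit phase and 10bit signed output widths are assumed from Table 1, the names `QUARTER_ROM` and `full_wave_sample` are hypothetical, and arithmetic negation stands in for the hardware output complementor.

```python
import math

PHASE_BITS = 12                       # assumed truncated phase width j
AMP_BITS = 10                         # assumed output width: sign + 9bit magnitude
QUARTER_SIZE = 1 << (PHASE_BITS - 2)  # entries in the quarter-sine ROM
MAG_MAX = (1 << (AMP_BITS - 1)) - 1   # largest magnitude, 511

# Quarter-sine ROM. The half-LSB phase offset makes the table symmetric
# under a bitwise complement of the address.
QUARTER_ROM = [
    round(MAG_MAX * math.sin((k + 0.5) * math.pi / (2 * QUARTER_SIZE)))
    for k in range(QUARTER_SIZE)
]


def full_wave_sample(phase):
    """Full-period sine sample rebuilt from the quarter-sine ROM."""
    msb = (phase >> (PHASE_BITS - 1)) & 1         # sign of the result
    second_msb = (phase >> (PHASE_BITS - 2)) & 1  # count up or count down
    addr = phase & (QUARTER_SIZE - 1)             # low j - 2 bits
    if second_msb:
        addr ^= QUARTER_SIZE - 1                  # first complementor
    magnitude = QUARTER_ROM[addr]
    return -magnitude if msb else magnitude       # stand-in for second complementor


if __name__ == "__main__":
    print([full_wave_sample(p) for p in range(0, 1 << PHASE_BITS, 512)])
```

The ROM holds only a quarter of the entries of a full-period table; the two complementors are the additional logic mentioned in the text.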

### 3.2 Modified Sunderland technique

Fig. 3 Phase to sine converter block diagram

To reduce the ROM size, this DDFS uses the modified Sunderland technique based on simple trigonometric identities<sup>[5]</sup>. The phase address of a quarter of a sine wave can be decomposed as $\varphi = \alpha + \beta + \gamma$, where $\alpha$ is the MSBs, $\beta$ is the middle bits, and $\gamma$ is the LSBs. The quarter-wave sine function is given by

$$\sin \frac{(\alpha + \beta + \gamma) \times \pi}{2^{j-1}} = \sin \frac{(\alpha + \beta) \times \pi}{2^{j-1}} \times \cos \frac{\gamma \times \pi}{2^{j-1}} + \cos \frac{(\alpha + \beta) \times \pi}{2^{j-1}} \times \sin \frac{\gamma \times \pi}{2^{j-1}} \tag{2}$$

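where $j$ is the number of phase accumulator output bits. Since $\gamma$ is small, $\cos(\gamma\pi/2^{j-1}) \approx 1$, and since $\beta$ is small relative to $\alpha$, $\cos((\alpha+\beta)\pi/2^{j-1}) \approx \cos(\alpha\pi/2^{j-1})$, so Eq. (2) can be simplified further to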

$$\sin \frac{(\alpha + \beta + \gamma) \times \pi}{2^{j-1}} \approx \sin \frac{(\alpha + \beta) \times \pi}{2^{j-1}} + \cos \frac{\alpha \times \pi}{2^{j-1}} \times \sin \frac{\gamma \times \pi}{2^{j-1}} \tag{3}$$

The first term on the right of Eq. (3) is stored in a coarse ROM, and the second term is stored in a fine ROM. The method introduces 1/2 LSB offsets into the phase and amplitude of the sine ROM samples. Because $\beta$ and $\gamma$ are small relative to $\alpha$, significant savings in ROM size can be realized. In this design, the 10bit phase data from the accumulator output is divided into three parts. Computer simulations determine the optimum partitioning: 4 bits for $\alpha$, 3 bits for $\beta$, and 3 bits for $\gamma$<sup>[6]</sup>.
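The coarse/fine split of Eq. (3) can be checked numerically with the sketch below. It assumes the 4/3/3 partition stated above, uses unquantized floating-point ROM contents, and omits the 1/2 LSB offsets, so it shows only the error of the trigonometric approximation itself. The names `COARSE`, `FINE`, and `sunderland_sine` are hypothetical.

```python
import math

A_BITS, B_BITS, G_BITS = 4, 3, 3        # partition of the quarter-wave phase
Q_BITS = A_BITS + B_BITS + G_BITS       # 10bit quarter-wave phase
SCALE = math.pi / (2 * (1 << Q_BITS))   # maps the phase word to [0, pi/2)

# Coarse ROM: sin(alpha + beta), addressed by the 7 upper bits.
COARSE = [math.sin((ab << G_BITS) * SCALE)
          for ab in range(1 << (A_BITS + B_BITS))]

# Fine ROM: cos(alpha) * sin(gamma), addressed by alpha and gamma (7 bits).
FINE = [
    [math.cos((a << (B_BITS + G_BITS)) * SCALE) * math.sin(g * SCALE)
     for g in range(1 << G_BITS)]
    for a in range(1 << A_BITS)
]


def sunderland_sine(phase):
    """Quarter-wave sine approximated by Eq. (3): coarse term plus fine term."""
    ab = phase >> G_BITS                 # alpha and beta bits
    a = phase >> (B_BITS + G_BITS)       # alpha bits only
    g = phase & ((1 << G_BITS) - 1)      # gamma bits
    return COARSE[ab] + FINE[a][g]


def max_abs_error():
    """Largest deviation from the exact sine over all 1024 phase words."""
    return max(abs(sunderland_sine(p) - math.sin(p * SCALE))
               for p in range(1 << Q_BITS))


if __name__ == "__main__":
    print("max approximation error:", max_abs_error())
```

Two tables of 128 entries each replace a single table of 1024 entries, at the cost of one adder.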

### 3.3 Sine-phase difference technique

The sine amplitude of the first term on the right of Eq. (3) is reduced using the sine phase difference technique, and two bits of the word length can be saved by only storing the difference of the sine amplitude and its phase.

$$y(\alpha + \beta) = \sin \frac{(\alpha + \beta) \times \pi}{2^{j-1}} - \frac{\alpha + \beta}{2^{j-2}} \tag{4}$$

$$\max[y(\alpha + \beta)] \approx 0.21 \max\left[\sin\frac{(\alpha + \beta) \times \pi}{2^{j-1}}\right] \tag{5}$$
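The factor of about 0.21 in Eq. (5) can be checked by treating the normalized phase as a continuous variable. The symbol $x = (\alpha + \beta)/2^{j-2}$ is introduced here only for this check and is not part of the original notation.

$$y(x) = \sin\frac{\pi x}{2} - x, \qquad 0 \leqslant x < 1$$

$$\frac{\mathrm{d}y}{\mathrm{d}x} = \frac{\pi}{2}\cos\frac{\pi x}{2} - 1 = 0 \;\Rightarrow\; \cos\frac{\pi x}{2} = \frac{2}{\pi} \;\Rightarrow\; x \approx 0.56$$

$$y_{\max} = \sqrt{1 - \frac{4}{\pi^{2}}} - \frac{2}{\pi}\arccos\frac{2}{\pi} \approx 0.771 - 0.561 \approx 0.21$$

Since the maximum of the quarter-wave sine is 1 and $0.21 < 2^{-2}$, the two most significant bits of the stored difference are always zero, which is consistent with the two-bit saving stated above.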

### 3.4 QLA technique

The QLA waveform is used to approximate the sine-phase difference, which is calculated using Eq. (4). The QLA approximation can be expressed with the following four expressions.

$$qla(\alpha + \beta) = \frac{\alpha + \beta}{2^{j-1}}, \quad 0 < \frac{\alpha + \beta}{2^{j-1}} < \frac{1}{8} \tag{6}$$

$$qla(\alpha + \beta) = \frac{1}{16} + \frac{\alpha + \beta}{2^{j}}, \quad \frac{1}{8} < \frac{\alpha + \beta}{2^{j-1}} < \frac{1}{4} \tag{7}$$

$$qla(\alpha + \beta) = \frac{1}{4} + \frac{1}{16} - \frac{\alpha + \beta}{2^{j}}, \quad \frac{1}{4} < \frac{\alpha + \beta}{2^{j-1}} < \frac{3}{8} \tag{8}$$

$$qla(\alpha + \beta) = \frac{1}{2} - \frac{\alpha + \beta}{2^{j-1}}, \quad \frac{3}{8} < \frac{\alpha + \beta}{2^{j-1}} < \frac{1}{2} \tag{9}$$

The data for  $0 < (\alpha + \beta)/2^{j-1} < 1/8$  are generated through shifting down the phase  $(\alpha + \beta)/2^{j-2}$  by 1bit. The data for  $1/8 < (\alpha + \beta)/2^{j-1} < 1/4$  are generated through shifting down the phase  $(\alpha + \beta)/2^{j-2}$  by 2bit and by changing the first and second MSBs of the phase to "10". The data for  $0 < (\alpha + \beta)/2^{j-1} < 1/4$  and the data for  $1/4 < (\alpha + \beta)/2^{j-1} < 1/2$  are symmetric<sup>[7]</sup>. A complementor is needed to reconstruct the symmetric waveform. Two bits of word length can be saved.

$$cr(\alpha + \beta) = y(\alpha + \beta) - qla(\alpha + \beta) \tag{10}$$

$$\max[cr(\alpha + \beta)] \approx 0.26 \max[y(\alpha + \beta)] \tag{11}$$
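Equations (4) and (6) to (10) can be evaluated on the 7bit coarse address grid with the sketch below. It works in floating point and ignores quantization and the 1/2 LSB offsets, so the ratio it prints should only be expected to be close to the value quoted in Eq. (11), not identical to it. The variable `x` (the normalized phase $(\alpha+\beta)/2^{j-2}$) and the function names are assumptions made for illustration.

```python
import math

AB_BITS = 7            # alpha (4 bits) + beta (3 bits)
N = 1 << AB_BITS       # number of coarse addresses


def y(k):
    """Sine-phase difference of Eq. (4) at coarse address k."""
    x = k / N          # normalized quarter-wave phase in [0, 1)
    return math.sin(math.pi * x / 2) - x


def qla(k):
    """Quad line approximation, Eqs. (6)-(9), written in terms of x."""
    x = k / N
    if x < 1 / 4:
        return x / 2                   # Eq. (6): phase shifted down by 1 bit
    if x < 1 / 2:
        return 1 / 16 + x / 4          # Eq. (7): phase shifted down by 2 bits
    if x < 3 / 4:
        return 1 / 4 + 1 / 16 - x / 4  # Eq. (8): mirror image of Eq. (7)
    return 1 / 2 - x / 2               # Eq. (9): mirror image of Eq. (6)


def cr(k):
    """Residual of Eq. (10) that remains to be stored in ROM."""
    return y(k) - qla(k)


max_y = max(y(k) for k in range(N))
max_cr = max(cr(k) for k in range(N))

if __name__ == "__main__":
    print("max y :", max_y)
    print("max cr:", max_cr)
    print("ratio :", max_cr / max_y)
```

The four segments meet at the breakpoints, and the residual stays non-negative over the whole quarter wave, so no sign bit is needed for the stored residual in this idealized model.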

### 3.5 QE-ROM technique

Based on the continuity of the data to be compressed, the ROM size of the DDFS can be further reduced using the QE-ROM technique. In this design, the second term on the right of Eq. (3) and $cr(\alpha + \beta)$ are both compressed using the QE-ROM technique. The ROM size can be reduced to $2^{l} \times m + 2^{a} \times n$ bits using the QE-ROM technique, where $l$, $m$, $a$, and $n$ are, respectively, the length of the address of the quantization ROM, the length of the data in the quantization ROM, the length of the address of the original data, and the length of the data in the error ROM. There may be several groups of parameter values that minimize the ROM size. To find these, the following algorithm is used.

(1) Find the maximum amplitude of the original data and determine how many bits are needed to represent it. This bit count is the maximum value of $m$, denoted $\mathit{maxm}$.

(2) Set $m = \mathit{maxm}$.

(3) Set l = a, which is the address length of the original data.

(4) Calculate the quantized values.

(5) Calculate the errors between the original data and the quantization ROM data.

(6) Determine how many bits it takes to represent these errors.

(7) Calculate the total ROM size  $(2^l \times m + 2^a \times n)$ .

(8) Decrease $l$ by 1. If $l < 0$, then go to (9); otherwise repeat the process from (4) to (7).

(9) Decrease $m$ by 1. If $m < 0$, then go to (10); otherwise reset $l$ as in (3) and repeat the process from (4) to (8).

(10) Determine the optimum values of the above parameters that minimize the total ROM size.
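The search in steps (1) to (10) can be sketched as an exhaustive loop over $l$ and $m$. The paper does not spell out how the quantized values in step (4) are computed, so the rule used here is an assumption: each block of $2^{a-l}$ consecutive entries shares one quantized value, taken as the block minimum truncated to its top $m$ bits, which keeps every error non-negative. The function names and the sample data are hypothetical, and the result is not expected to reproduce the ROM sizes reported for the chip.

```python
import math


def bits_needed(value):
    """Bits required for a non-negative integer (0 needs 0 bits)."""
    return value.bit_length()


def qe_rom_search(data):
    """Return (total_bits, l, m, n) minimizing 2**l * m + 2**a * n."""
    a = (len(data) - 1).bit_length()            # address length of the data
    if len(data) != 1 << a or min(data) < 0:
        raise ValueError("data must hold 2**a non-negative integers")
    max_m = bits_needed(max(data))              # step (1)
    best = None
    for m in range(max_m, -1, -1):              # steps (2) and (9)
        shift = max_m - m                       # low bits dropped from the quantized value
        for l in range(a, -1, -1):              # steps (3) and (8)
            block = 1 << (a - l)
            # Step (4), assumed rule: block minimum truncated to m bits.
            quant = [(min(data[i:i + block]) >> shift) << shift
                     for i in range(0, len(data), block)]
            # Step (5): error between original and quantized data.
            errors = [d - quant[i // block] for i, d in enumerate(data)]
            n = bits_needed(max(errors))        # step (6)
            total = (1 << l) * m + (1 << a) * n  # step (7)
            if best is None or total < best[0]:  # step (10)
                best = (total, l, m, n)
    return best


# Hypothetical sample data: the sine-phase difference of Eq. (4) on a
# 7bit address grid, scaled by 512 and rounded to integers.
SAMPLE = [round(512 * (math.sin(math.pi * k / 256) - k / 128))
          for k in range(128)]

if __name__ == "__main__":
    print(qe_rom_search(SAMPLE))
```

Because the original value is always the quantized value plus the stored error, the decomposition is lossless; the saving comes from the error words being much shorter than the original words when the data vary smoothly.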

 $cr(\alpha + \beta)$  is stored into a coarse ROM that is divided into a quantization ROM and an error ROM using the QE-ROM technique. The above calculations indicate that the minimum size of the coarse ROM for a 10bit output DDFS is 480bit  $(2^5 \times 3 + 2^7 \times 3 = 480)$ . The second term on the right of Eq. (3) is stored into a fine ROM, which is also divided into a quantization ROM and an error ROM using the QE-ROM technique. The minimum size of the fine ROM for a 10bit output DDFS is 288bit  $(2^5 \times 1 + 2^7 \times 2 = 288)$ . So the total ROM size for a 10bit sine output is only 768bit.
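The arithmetic behind these figures can be written out as follows. The comparison against an uncompressed $2^{12} \times 10$ bit table is an assumption based on the first row of Table 1; under that assumption the result agrees with the 98% reduction quoted in the abstract.

$$\underbrace{2^{5} \times 3 + 2^{7} \times 3}_{\text{coarse ROM}} + \underbrace{2^{5} \times 1 + 2^{7} \times 2}_{\text{fine ROM}} = 480 + 288 = 768\ \text{bit}$$

$$\frac{2^{12} \times 10}{768} = \frac{40960}{768} = \frac{160}{3} \approx 53.3, \qquad 1 - \frac{768}{40960} \approx 98.1\%$$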



Fig. 4 Architecture of the phase to sine converter

### 3.6 Architecture of the phase to sine converter

Figure 4 shows the block diagram of the phase to sine converter, which consists of complementors, multiplexers (MUXs), ROMs, and adders. The complementors are used to recover the full wave output from the quarter sine ROM by inverting the phase and amplitude appropriately. Four column MUXs and three adders are also required. Figure 5 shows the intermediate results during the approximation process. Figure 6 shows the sum of the data in the coarse ROM and the fine ROM and the final error. Figure 7 shows the relative bit positions of the data used for reconstructing a sine wave. From top to bottom, the rows of Fig. 7 represent the 7bit phase, the 7bit quad line approximation, the 3bit quantization-ROM data in the coarse ROM, the 3bit error-ROM data in the coarse ROM, the 1bit quantization-ROM data in the fine ROM, the 2bit error-ROM data in the fine ROM, and the 9bit output of the adders.



Fig. 5 Approximation process



Fig.6 Sum of the coarse ROM data and the fine ROM data and the final error

| P9 | P8 | P7  | P6  | P5  | P4  | P3  |     |     |
|----|----|-----|-----|-----|-----|-----|-----|-----|
|    |    | QA7 | QA6 | QA5 | QA4 | QA3 | QA2 | QA1 |
|    |    |     |     | CQ5 | CQ4 | CQ3 |     |     |
|    |    |     |     |     |     | CE3 | CE2 | CE1 |
|    |    |     |     |     |     |     | FQ2 |     |
|    |    |     |     |     |     |     | FE2 | FE1 |
| S9 | S8 | S7  | S6  | S5  | S4  | S3  | S2  | S1  |

Fig. 7 Relative bit positions of data to be added in the adders

As Table 1 shows, the ROM required by this DDFS is no larger than that required by any of the other compression methods listed. Moreover, the ROM using this technique maintains a good spur level, with a worst case spur of -72.32dBc.

## 4 Digital to analog converter

Table 1 Summary of memory compression and algorithmic techniques in the case of 12bit phase to 10bit amplitude mapping

| Compression method               | Needed ROM/bit     | Total compression ratio | Worst case spur/dBc |
|----------------------------------|--------------------|-------------------------|---------------------|
| Uncompressed memory              | $2^{12} \times 10$ | 1:1                     | -81.76              |
| Quarter sine wave                | $2^{10} \times 9$  | 40:9                    | -78.76              |
| Double trigonometric             | $2^{10} \times 6$  | 20:3                    | -78.76              |
| Modified Sunderland architecture | $2^{7} \times 10$  | 32:1                    | -73.59              |
| Modified Nicholas architecture   | $2^{7} \times 9$   | 320:9                   | -74.56              |
| Parabolic approximation          | $2^{7} \times 6$   | 160:3                   | -66.8               |
| This work                        | $2^{7} \times 6$   | 160:3                   | -72.32              |

Fig. 8 Simplified DAC architecture

In this design, an on-chip 10bit segmented current steering digital to analog converter (DAC) is implemented. The converter has 6bit thermometer-decoded most significant bits (MSBs) and 4bit thermometer-decoded least significant bits (LSBs). Both segments are thermometer-decoded to guarantee monotonicity and minimal glitches. The simplified DAC architecture is shown in Fig. 8. A clock buffer is included on the chip to obtain good timing accuracy for the different clock signals used in the converter<sup>[8]</sup>. A step-down buffer, shown in Fig. 9, has also been added in front of every current switch to achieve better dynamic performance. An improper switching voltage may cause nonmonotonicity in the DDFS. Normally, the switching voltage for CMOS circuits is about 0.4V, but we choose 0.6V to ensure that the switches operate effectively. Two important parameters of a DAC's static performance are the integral nonlinearity (INL) and the differential nonlinearity (DNL), which depend on the layout strategy. The DAC used in this paper employs a novel switching scheme called a Q<sup>2</sup> random walk to reduce the nonlinearity, which can otherwise be degraded by the symmetric error and two-dimensional graded error of the DAC<sup>[9]</sup>. This current steering DAC can output 20mA at full scale.
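The 6bit plus 4bit thermometer segmentation can be illustrated with an idealized behavioural sketch. It assumes perfectly matched current sources and that the 20mA full scale corresponds to code 1023; neither the decoder structure nor the names `thermometer`, `dac_units`, and `dac_current_ma` are taken from the chip.

```python
MSB_BITS, LSB_BITS = 6, 4
FULL_SCALE_MA = 20.0     # full-scale output current stated in the text


def thermometer(value, bits):
    """Thermometer code with 2**bits - 1 lines: `value` ones, then zeros."""
    return [1 if i < value else 0 for i in range((1 << bits) - 1)]


def dac_units(code):
    """Output in LSB units for a 10bit input code (ideal sources assumed)."""
    msb = code >> LSB_BITS
    lsb = code & ((1 << LSB_BITS) - 1)
    msb_lines = thermometer(msb, MSB_BITS)   # 63 sources of 16 LSB units each
    lsb_lines = thermometer(lsb, LSB_BITS)   # 15 sources of 1 LSB unit each
    return sum(msb_lines) * (1 << LSB_BITS) + sum(lsb_lines)


def dac_current_ma(code):
    """Ideal output current, assuming code 1023 delivers the full scale."""
    full_code = (1 << (MSB_BITS + LSB_BITS)) - 1
    return FULL_SCALE_MA * dac_units(code) / full_code


if __name__ == "__main__":
    for code in (0, 1, 15, 16, 512, 1023):
        print(code, dac_units(code), dac_current_ma(code))
```

In a thermometer-coded segment, raising the input only ever switches additional sources on, never off, which is why each segment is monotonic by construction.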

## 5 Layout and experiment results

The compact DDFS has been fabricated using a $0.35\mu$m CMOS process. The core area is 1.6mm<sup>2</sup>, which is almost 80% smaller than the area of DDFSs with a complete 10bit sine ROM lookup table. A chip micrograph of the DDFS is shown in Fig. 10. As the figure shows, other servo circuits of the rubidium atomic frequency standard are implemented on the same chip alongside the DDFS. Figure 11 (a) shows the output spectrum at $f_{clk} = 20$MHz and $f_{out} = 625$kHz. The DDFS has a wideband SFDR of 61dB, which is better than the 55dB SFDR of a recently reported DDFS. Figure 11 (b) shows the output spectrum at $f_{clk} = 20$MHz and $f_{out} = 5.3125$MHz. The good spectral purity in Fig. 11 (b) means that the DDFS can provide a clean 5.3125MHz sinusoidal signal for the excitation circuit, so that the rubidium atoms are excited effectively and the output of the physics package carries the useful transition information of the rubidium atoms. The experimental results show that this compact DDFS can generate a good modulated 5.3125MHz sine wave under control of the processor in our rubidium atomic frequency standard.

Reduction in power dissipation is also an important objective in the design of this DDFS. With a 20MHz system clock, the DDFS consumes only 167mW at 3.3V. The measured results show that the most important specifications of this compact DDFS are better than those of the DDFS chips previously used in our rubidium atomic frequency standard, and that the compact DDFS fully meets the requirements of rubidium atomic frequency standards. Through SoC implementation of the servo circuits, the size and power dissipation of the rubidium atomic frequency standard are significantly reduced and its reliability is greatly improved.

Fig. 9 Step-down buffer

Fig. 10 Chip micrograph

Fig. 11 (a) Output spectrum at $f_{clk} = 20$MHz and $f_{out} = 625$kHz; (b) Output spectrum at $f_{clk} = 20$MHz and $f_{out} = 5.3125$MHz

## 6 Conclusion

A compact DDFS for SoC implementation of high resolution rubidium atomic frequency standards is presented in this paper. The DDFS consists of two 32bit phase registers, a 32bit phase accumulator, a phase to sine converter (768bit sine ROM), and a 10bit on-chip DAC. A DDFS chip with a core area of 1.6mm<sup>2</sup> has been successfully fabricated using a standard $0.35\mu$m CMOS process. The DDFS consumes 167mW at 3.3V and its SFDR is 61dB.

## References

- [1] De Caro D, Strollo A G M. High-performance direct digital frequency synthesizers using piecewise-polynomial approximation. IEEE Trans Circuits Syst I, Regular Papers, 2005, 52(2):324
- [2] Vankka J, Halonen K. Direct digital synthesizers: theory, design and applications. Boston, London: Kluwer, 2001
- [3] Vankka J. Methods of mapping from phase to sine amplitude in direct digital synthesis. IEEE International Frequency Control Symposium, 1996:942
- [4] Nicholas H T, Samueli H. A 150-MHz direct digital frequency synthesizer in 1.25-μm CMOS with 90-dBc spurious performance. IEEE J Solid-State Circuits, 1991, 26(12):1959
- [5] Vankka J. Methods of mapping from phase to sine amplitude in direct digital synthesis. IEEE Trans Ultrason Ferroelectr Freq Contr, 1997, 44(2):526
- [6] Kim Y S, Kim S H, Baek K H, et al. Multiple trigonometric approximation of sine-amplitude with small ROM size for direct digital frequency synthesizers. IEEE Proceedings of the 16th International Conference on VLSI Design, 2003
- [7] Yang B D, Han J H, Han S H, et al. An 800-MHz low-power direct digital frequency synthesizer with an on-chip D/A converter. IEEE J Solid-State Circuits, 2004, 39(5):761
- [8] Van de Plassche R. CMOS integrated analog-to-digital and digital-to-analog converters. Boston, London: Kluwer, 2003
- [9] Ni Weining, Geng Xueyang, Shi Yin. A 12bit 300MHz current-steering CMOS D/A converter. Chinese Journal of Semiconductors, 2005, 26(6):1129

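# A Compact Direct Digital Frequency Synthesizer for the Rubidium Frequency Standard

Cao Xiaodong<sup>†</sup>, Ni Weining, Yuan Ling, Hao Zhikun, Shi Yin (Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China)

**Abstract:** A compact direct digital frequency synthesizer (DDFS) for SoC implementation of a high-precision rubidium frequency standard chip is developed. To reduce the chip area and the power consumption, the phase to sine mapping data are compressed using the sine symmetry technique, the modified Sunderland technique, the sine-phase difference technique, the quad line approximation technique, and the quantization and error ROM technique. With these techniques, the ROM size is reduced by 98%. A compact DDFS with 32bit phase storage depth and a 10bit DAC has been successfully fabricated in a standard 0.35μm CMOS process, with a core area of 1.6mm<sup>2</sup>. With a 3.3V supply, the chip consumes 167mW and its spurious free dynamic range (SFDR) is 61dB.

Key words: CMOS integrated circuit; direct digital frequency synthesizer; rubidium atomic frequency standard; system on chip

EEACC: 1280; 1265H; 1265Z

CLC number: TN432 Document code: A Article ID: 0253-4177(2008)09-1723-06

<sup>†</sup> Corresponding author. Email: xdcao@semi.ac.cn

Received 28 April 2008, revised manuscript received 2 June 2008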