On the design of high-speed energy-efficient successive-approximation logic for asynchronous SAR ADCs

    Corresponding author: Lin He, helin77@ustc.edu.cn
  • 1. Institute of MESIC, University of Science and Technology of China, Hefei 230027, China
  • 2. Science and Technology on Analog Integrated Circuit Laboratory, Chongqing 400060, China

Key words: analog-to-digital conversion, successive approximation, low-power, high-speed, internal switching activities

Abstract: This paper analyzes the power consumption and delay mechanisms of the successive-approximation (SA) logic of a typical asynchronous SAR ADC, and provides strategies to reduce both of them. Following these strategies, a unique direct-pass SA logic is proposed based on a full-swing once-triggered DFF and a self-locking tri-state gate. The unnecessary internal switching power of a typical TSPC DFF, which is commonly used in the SA logic, is avoided. The delay of the ready detector as well as the sequencer is removed from the critical path. A prototype SAR ADC based on the proposed SA logic is fabricated in 130 nm CMOS. It achieves a peak SNDR of 56.3 dB at 1.2 V supply and 65 MS/s sampling rate, and has a total power consumption of 555 μW, while the digital part consumes only 203 μW.


1.   Introduction
  • Successive-approximation register (SAR) ADCs are known for their excellent power efficiency, small size and easy integration into a digital process. They are widely used in wireless sensors [1, 2] and portable biomedical systems[3-5], where power consumption and size are the main concerns. They also serve as potential replacements for pipelined ADCs in wireline/wireless communication systems[6-10]. Much research has been done to minimize the DAC switching power [6, 7, 11-15]. However, for a high-speed SAR ADC, this is not enough because the share of the digital power consumption increases rapidly. The switches and the gates that drive them cannot use the minimum size; they must be properly scaled to minimize the propagation delay. Therefore, the digital part consumes much more power than the DAC itself. For example, in Ref. [6], the DAC and the comparator together consume around 40% of the total power, while the rest of the power is attributed to the digital part.

    Asynchronous clock generation[16], which detects the completion of a comparison and automatically triggers the next one, was proposed to speed up SAR ADCs. However, the asynchronous clock generator must allocate enough time to fully reset the comparator between two adjacent comparisons. An improved architecture was proposed in Refs. [17, 18], which uses $N$ comparators to compare and store the $N$-bit comparison results. It removes the comparator reset time as well as the digital propagation delay, at the cost of additional hardware and area. Comparator alternation[19] uses two comparators to generate the binary codes alternately, which also removes the comparator reset time. However, for a resolution beyond 6 bits, the offset mismatch between the comparators needs to be calibrated to avoid performance degradation, which is impractical in many application scenarios. Besides that, the asynchronous clock still has to tolerate the maximum DAC settling time. Ref. [20] realized full self-timing by using a timer to track the DAC settling behavior and to start the comparator's decision as soon as the settling is completed. This solution requires additional complexity to detect adequate settling precisely. As every method has limitations, the typical asynchronous architecture with a single self-timed comparator is still the most commonly used option.

    This paper analyzes the primary sources of the power consumption and the propagation delay of the digital circuits in a typical asynchronous SAR ADC with a single comparator, and provides strategies to lower the power consumption and shorten the propagation delay. Following these strategies, a unique direct-pass successive-approximation logic is proposed based on a full-swing once-triggered DFF and a self-locking tri-state gate. The paper is organized as follows. Section 2 reviews the operational principle of a typical asynchronous SAR ADC, analyzes the power consumption and the propagation delay, and provides strategies for reducing both. Section 3 presents the proposed direct-pass logic. Section 4 describes the implementation details of the proposed SAR ADC scheme. Section 5 gives the measurement results. Section 6 draws the conclusion.

2.   Successive approximation logic

    2.1.   Operational principle

  • A typical asynchronous SAR ADC[16] is shown in Fig. 1. It contains a sample-and-hold (S/H), a switched-capacitor DAC, a comparator and the successive-approximation (SA) logic. The comparator is self-clocked by an asynchronous clock generator. The conventional SA logic[6] consists of a sequencer and a pair of data registers (only one set of data registers is shown for simplicity).

    During the sampling phase, the S/H switch turns on and samples the input signal onto the DAC. Both the data register and the sequencer are reset. Then the S/H switch turns off and the conversion process starts by detecting the polarity of the sampled input using the comparator. The OR gate detects the completion of the comparison and sets Ready to high, which in turn causes a rising transition of ck$_{1}$, the first output of the sequencer. The first DFF in the data register is triggered and stores the current comparison result. Then the capacitive DAC adds or subtracts $V_{\rm ref}$/2 from $V_{\rm in}$, according to the comparison result. At the same time, ClkC starts to reset the comparator to prepare for the next comparison. After that, the comparator detects the polarity of the updated DAC output again and the comparison result is stored in the second bit of the data register. This procedure repeats until the last bit is resolved.
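
    The conversion procedure described above can be sketched behaviorally as follows. This is a minimal illustration only: the function name `sar_convert`, the differential full-scale range of $\pm V_{\rm ref}$ and the ideal, noise-free comparator are assumptions made here, not details of the circuit in Fig. 1.

```python
def sar_convert(v_in, v_ref=1.0, n_bits=10):
    """Resolve n_bits by successive approximation on a differential residue.

    Assumes v_in is the sampled differential input with |v_in| < v_ref and
    an ideal comparator (hypothetical behavioral model, not the real circuit).
    """
    residue = v_in
    bits = []
    for i in range(n_bits):
        # Comparator decision: polarity of the current DAC output.
        bit = 1 if residue > 0 else 0
        bits.append(bit)
        # The DAC subtracts or adds a binary-weighted step; the first is v_ref/2.
        step = v_ref / 2 ** (i + 1)
        residue += -step if bit else step
    return bits, residue


bits, residue = sar_convert(0.3)
code = int("".join(str(b) for b in bits), 2)
print(bits, code, residue)
```

    Each pass through the loop corresponds to one comparison cycle: a decision, a register update and a DAC step. The residue left after the last bit is bounded by $V_{\rm ref}/2^{N}$ in this idealized model.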

  • 2.2.   Power consumption

    2.2.1.   Clock/data driving power
  • For a high-speed design, each switch in the switched-capacitor DAC and its associated digital circuits, including the DFFs, must be scaled according to the weight of the DAC element so that every bit settles in the same time; only the least significant bits can be driven by minimum-size switches. The overall capacitive load of the comparator is therefore roughly proportional to $2^{N}$, where $N$ is the resolution of the ADC in bits. Charging and discharging such a heavy load $N$ times during a conversion takes approximately $N 2^{N} C_{\rm in,dff} V_{\rm DD}^{2}$[21], where $C_{\rm in,dff}$ is the input capacitance of a unit DFF. It follows that, to save power, $N$ or $C_{\rm in,dff}$ should be made smaller.
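
    To give a feel for how this estimate scales, the short sketch below evaluates $N 2^{N} C_{\rm in,dff} V_{\rm DD}^{2}$ for two resolutions. The unit input capacitance of 1 fF is an assumed placeholder, not a value from this design; only the 1.2 V supply is taken from the text.

```python
def driving_energy(n_bits, c_in_dff, v_dd):
    """Approximate clock/data driving energy per conversion, in joules.

    Implements the estimate N * 2^N * C_in,dff * V_DD^2 quoted in the text.
    """
    return n_bits * 2 ** n_bits * c_in_dff * v_dd ** 2


C_IN_DFF = 1e-15  # assumed 1 fF unit DFF input capacitance (illustrative only)
V_DD = 1.2        # supply voltage used in this work

e10 = driving_energy(10, C_IN_DFF, V_DD)
e8 = driving_energy(8, C_IN_DFF, V_DD)
print(f"10-bit: {e10 * 1e12:.2f} pJ, 8-bit: {e8 * 1e12:.2f} pJ, ratio: {e10 / e8:.1f}")
```

    Because the load grows as $2^{N}$ and is driven $N$ times, removing two bits from one comparator's load cuts this term by a factor of five in the sketch, which is why splitting the load or shrinking $C_{\rm in,dff}$ pays off.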

  • 2.2.2.   DFF internal switching
  • Another primary source of power consumption is the unnecessary internal switching activities in the DFFs[22]. Usually, the DFFs in both the sequencer and the data register are implemented with the typical resettable TSPC structure, which contains three stages of 3-transistor (3-T) branches, shown in Fig. 2. To begin with, the 3-T branch alone is taken out and analyzed, as is shown in Fig. 3. The two transistors at both ends are complementary and MA, the transistor in the middle, is a control switch. If MA is turned off, the charging/discharging path is cut off. When MA is on, the 3-T branch behaves like an inverter. If $B$ is periodic, the parasitic capacitances and the load capacitances of nodes $C$ and $D$ will be charged and discharged periodically, which consumes power.

    In the sequencer, even if $D$ stays at 0, the incoming CK will cause the switching activity of the second 3-T branch due to this mechanism. A similar case happens in the data register when CK stays at 0 to enable the first 3-T branch. Each time the comparator generates a result, the input $D$ of the data register on either p-side or n-side experiences a transition from 0 to 1 and dissipates power.

  • 2.3.   Propagation delay

  • Fig. 4 illustrates the detailed timing diagram of a single comparison cycle. A single conversion cycle starts when the comparator fires. After $t_{\rm comp}$, the comparison completes. The ready detector, the sequencer and the data register work successively to pass the comparison result to the DAC, and the DAC takes $t_{\rm DAC}$ to settle. Once the comparison completes, the asynchronous clock generator also starts to reset the comparator after $t_{\rm delay}$. The total delay of a single comparison cycle is

    $T_{\rm cycle} = t_{\rm comp} + T_{\rm critical} + t_{\rm DAC}$,

    where $T_{\rm critical}$, the critical path's logic delay of the comparison result, equals $t_{\rm ready} + t_{\rm cki} + t_{\rm data}$, as is also indicated with the dashed curve in Fig. 1.

    In practical applications, $t_{\rm delay}$, the logic delay in the asynchronous clock generator, is often realized by a comparison-result ready/reset detector and an inverter chain. To ensure adequate DAC settling time in every cycle, $t_{\rm delay}$ must satisfy $t_{\rm delay} \geqslant \max (T_{\rm critical} + t_{\rm DAC} - t_{\rm comp-rst})/2$. With a given configuration to generate $t_{\rm delay}$, any extension of $T_{\rm critical}$ caused by parasitic effects or PVT variations will compress the time allocated for DAC settling.
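
    The constraint on $t_{\rm delay}$ can be explored numerically with the sketch below. All delay values (in picoseconds) are invented for illustration, and the helper names are hypothetical; the functions simply rearrange the inequality above for the worst-case cycle.

```python
def min_t_delay(t_critical, t_dac, t_comp_rst):
    """Smallest t_delay satisfying t_delay >= (T_critical + t_DAC - t_comp_rst) / 2."""
    return max(0.0, (t_critical + t_dac - t_comp_rst) / 2)


def dac_settling_window(t_delay, t_critical, t_comp_rst):
    """Time left for DAC settling once t_delay is fixed (same inequality, rearranged)."""
    return 2 * t_delay + t_comp_rst - t_critical


# Assumed example delays in ps (not measured or simulated values from this work).
t_ready, t_cki, t_data = 60, 90, 80
t_critical = t_ready + t_cki + t_data  # 230 ps
t_dac, t_comp_rst = 400, 150

t_delay = min_t_delay(t_critical, t_dac, t_comp_rst)
nominal = dac_settling_window(t_delay, t_critical, t_comp_rst)
# A 50 ps extension of T_critical (for example from PVT variation) eats into settling.
degraded = dac_settling_window(t_delay, t_critical + 50, t_comp_rst)
print(t_delay, nominal, degraded)
```

    With $t_{\rm delay}$ fixed by the inverter chain, every picosecond added to $T_{\rm critical}$ is taken directly from the DAC settling window, which is the motivation for shortening the critical path.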

    It follows that the comparison result must wait for $t_{\rm ready} + t_{\rm cki}$ after it is generated before it can be stored. This raises the following question: can the comparison result be used to control the DAC directly?

3.   Digital logic design

    3.1.   Direct-pass logic

  • In order to reduce the propagation delay, we propose to replace the edge-triggered DFF in the data register with a tri-state gate, which passes the $i$-th comparison result directly to the $i$-th storage unit in the data register that controls the activity of the switched-capacitor DAC. This tri-state gate is enabled when the comparator starts to compare, so ck$_{\rm i}$ now serves as an enabling signal instead of triggering the target DFF in the data register. The tri-state gate should be designed in such a way that it locks immediately after the comparison result is passed. The conceptual SA logic is shown in Fig. 5. The critical path shown in Fig. 6 is given by

    $T_{\rm critical} = t_{\rm data}$,

    where $t_{\rm ready}$ and $t_{\rm cki}$ are removed.

  • 3.2.   Self-locking tri-state (SLTS) gate

  • A dedicated self-locking tri-state gate is proposed as the storage unit for the data register, as shown in Fig. 7. The proposed SLTS gate is based on pre-charge dynamic logic. It contains a P cell and an N cell. Both cells are identical except that the gate of Mn1 is connected to a different output of the comparator. When EN $=$ 0, both cells are pre-charged to logic 1 and the gate is unlocked. When EN rises to logic 1, Mp1 turns off and Mn2 turns on, and the SLTS gate is ready to receive the comparison results DP and DN. If either cell in this latch outputs a logic 1, Lock is set to 1, which locks the latch and prevents any further comparison results from passing through.
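
    The enable, capture and lock sequence can be summarized with the logic-level sketch below. It is a behavioral abstraction written for illustration, not a model of the transistors in Fig. 7: the class name `SLTSGate`, the `stored` attribute and the choice to represent the captured decision as the pair (DP, DN) are assumptions, and node polarities inside the P and N cells are deliberately not modeled.

```python
class SLTSGate:
    """Logic-level sketch of a self-locking tri-state storage unit."""

    def __init__(self):
        self.lock = 0
        self.stored = None  # None until a comparison result is captured

    def step(self, en, dp, dn):
        if en == 0:
            # Pre-charge phase: the gate is unlocked and holds no decision.
            self.lock = 0
            self.stored = None
        elif not self.lock and (dp or dn):
            # First valid comparator output while enabled: capture it and lock.
            self.stored = (dp, dn)
            self.lock = 1
        # Once locked, later comparator activity is ignored.
        return self.stored


gate = SLTSGate()
print(gate.step(0, 1, 0))  # disabled: comparator activity is not captured
print(gate.step(1, 0, 0))  # enabled, comparator still in reset
print(gate.step(1, 1, 0))  # comparison result arrives and is latched
print(gate.step(1, 0, 1))  # a later comparison cannot overwrite the stored bit
```

    The point of the sketch is the ordering: the gate is transparent only between the rise of EN and the first valid comparator output, so no separate clock edge is needed to store the bit.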

    The logic delay of this SLTS gate is the only component of $T_{\rm critical}$. As with the conventional logic, its PVT variations should be carefully evaluated during design, and the minimal $t_{\rm delay}$ should be long enough to tolerate the worst expected $t_{\rm data}$.

    With regard to the power consumption, since Mn2 is turned off while EN stays at 0, activity on DP does not cause a current from $V_{\rm DD}$ to ground. Another benefit is that DP controls Mn1 only, which halves the input capacitance compared with a conventional TSPC DFF. Accordingly, the size of the buffer between the comparator output and the data register can be reduced, which further reduces the power consumption and logic delay.

    The energy consumed by the SLTS gate in different ck$_{\rm i}$ states when a comparison result is applied was simulated and compared with that of the typical TSPC DFF using a 0.13-$\mu$m CMOS process and a 1.2-V power supply, as shown in Table 1. The SLTS gate essentially eliminates the power consumption caused by unnecessary internal switching activities.

  • 3.3.   Full-swing once-triggered (FSOT) DFF

  • The DFFs in the sequencer also need to be redesigned to avoid the internal switching activities. The once-triggered DFF (Fig. 8) in Ref. [23] has only three stages, in which no complementary pair is connected to its high-speed clock CK. This DFF is well suited for the sequencer, as it considerably reduces the energy consumed by charging and discharging the internal capacitances. Nevertheless, when $\overline{\text{RST}}$ turns to 0, the node $W_{0}$ is left floating and cannot be reset to its highest level for CK $=$ 0 and $\overline{\text{D}}=1$. Then, when the rising edge of CK arrives, the charges on the capacitances of $\overline{\text{Q}}$ and $W_{0}$ are redistributed and the level of $W_{0}$ drifts even if $\overline{\text{D}}$ remains at 1. A leakage current from $V_{\rm DD}$ to ground then flows through the output stage persistently until $\overline{\text{D}}$ turns to 0 and CK triggers the DFF.

    We address this problem by adding a transistor (marked by a dashed circle) in parallel with the reset switch, as shown in Fig. 9. The polarities of all the transistors and of the input signals are reversed to remove the inverter. The 0$\to$1 transition is slow because two PMOS transistors are in series in the charging path, but this does not matter because this DFF is no longer in the critical path. The added transistor ensures that node $W_{0}'$ is reset to 0 and forces $Q$ to stay at 0 when $D = 0$.

    In addition, as the clock input $\overline{\rm CK}$ controls only two transistors, the input capacitance is halved compared with the TSPC DFF. Thus the FSOT DFF also reduces the power consumed by the clock driving buffer, which was discussed in Section 2.

4.   Implementation
  • A 1.2-V 10-bit SAR ADC employing the proposed SA logic is designed in a 0.13-$\mu $m CMOS process.

  • 4.1.   Serial comparator switching

  • The data register and the sequencer are split into two segments, and each segment is assigned its own comparator. Bit 1 to bit 4 are processed by comparator 1 and the other bits are processed by comparator 2. Each comparator has its own asynchronous clock generator. Lock$_{4}$, which signals the completion of the fourth cycle, is used to disable the self-timing loop of the first comparator and enable the second one.

    In this way, both the capacitive load of each comparator and the number of comparisons made by each comparator are reduced. As analyzed in Section 2, this effectively reduces the power consumption.

    The low-noise low-offset comparator in Ref. [24] is adopted in this design with a meta-stability detection circuit to keep the bit error rate low.

  • 4.2.   Capacitive DAC

  • A capacitive DAC array based on the monotonic switching method[6] is adopted in this design for its excellent power efficiency, high speed and simplified digital logic. Two binary redundancies, 4C and 7C, are placed after bit 4 and bit 7 to relax the settling requirement. The offset mismatch between the comparators is also corrected by the redundant bits.

    The unit capacitor is implemented with an interdigitated structure using metal 1 to metal 4 and has a capacitance of 4 fF. A bridging capacitor of 4 fF is placed between bit 7C and bit 8, so the effective LSB capacitor is equivalent to 1 fF. The overall capacitance is around 1 pF for each side.

5.   Measurement results
  • The proposed ADC was fabricated using the 1P8M 0.13-$\mu $m CMOS process. The die photograph of the chip is shown in Fig. 10.

    Fig. 11 shows the measured DNL and INL. The measured peak DNL and INL are $-0.815/0.951$ LSB and $-0.922/1.04$ LSB, respectively. Fig. 12 shows a 65536-point FFT of the ADC output for a 31 MHz input signal at the 65 MS/s sampling rate. The measured SNDR is 56.3 dB and the measured SFDR is 73.6 dB. The total power consumption of the SAR ADC is 555 $\mu$W at 1.2 V and the 65 MS/s sampling rate. The analog part consumes 352 $\mu$W in total, with the capacitive DAC consuming 100 $\mu$W, the comparator 168 $\mu$W, and the sample-and-hold 84 $\mu$W. The digital part, excluding the meta-stability detection circuit, consumes 203 $\mu$W. The digital power is not reduced as much as expected, partly because of our limited design experience.
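
    As a quick consistency check on these figures, the sketch below sums the reported power breakdown and derives the digital share. It also converts the measured SNDR to an effective number of bits using the standard relation ENOB = (SNDR $-$ 1.76)/6.02, which is a textbook formula rather than a result reported in this section; the variable names are chosen here for illustration.

```python
# Reported power figures in microwatts (from the measurement results).
analog_uw = {"capacitive DAC": 100, "comparator": 168, "sample-and-hold": 84}
digital_uw = 203

analog_total = sum(analog_uw.values())
total = analog_total + digital_uw
digital_share = digital_uw / total


def enob(sndr_db):
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


print(f"analog: {analog_total} uW, total: {total} uW")
print(f"digital share: {digital_share:.1%}")
print(f"ENOB at 56.3 dB SNDR: {enob(56.3):.2f} bits")
```

    The components add up to the stated 352 $\mu$W analog and 555 $\mu$W total power, and the digital part accounts for a little over one third of the total.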

    Table 2 compares this work with prior art at similar technology nodes. It shows that our design reduces the percentage of digital power consumption while at the same time increasing the conversion speed.

6.   Conclusion
  • In this paper, a direct-pass SA logic is proposed to reduce the power consumption and the propagation delay of the digital part of an asynchronous SAR ADC. To verify the performance of the strategy, a 1.2-V 10-bit SAR ADC based on the proposed SA logic was fabricated and tested. Compared with prior art at similar technology nodes, the prototype achieved a similar peak SNDR with reduced digital power consumption and enhanced conversion speed.


Journal of Semiconductors © 2017 All Rights Reserved