# CMOS implementation of a low-power BPSK demodulator for wireless implantable neural command transmission\*

Wu Zhaohui(吴朝晖)<sup>1,†</sup>, Zhang Xu(张旭)<sup>2</sup>, Liang Zhiming(梁志明)<sup>1</sup>, and Li Bin(李斌)<sup>1</sup>

<sup>1</sup>Institute of Microelectronics, School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, China

<sup>2</sup>State Key Laboratory on Integrated Optoelectronics, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China

**Abstract:** A new BPSK demodulator is presented. Replacing the analog multiplier of the traditional BPSK demodulator with a clock multiplier of very simple circuit structure simplifies the demodulator and hence lowers its power consumption. The simpler structure and lower power make the demodulator better suited to the internal single-chip design of a wireless implantable neural recording system. The proposed BPSK demodulator was implemented in Global Foundries 0.35 $\mu$m CMOS technology with a 3.3 V power supply. The chip area is only 0.07 mm<sup>2</sup> and the power consumption is 0.5 mW. Test results show that the demodulator works correctly.

**Key words:** CMOS integrated circuits; low-power BPSK demodulator; implantable biomedical devices; wireless command transmission

**EEACC:** 2570D; 1250; 7550

**DOI:** 10.1088/1674-4926/33/5/055005

## 1. Introduction

Recently, wireless implantable micro-systems have been widely studied for recording in-vivo neural activity in real time<sup>[1-4]</sup>. The attraction of such a micro-system is that no battery needs to be placed inside the organism to power the implanted devices. Instead, power is transmitted to the implanted devices through magnetic coupling between external and internal coils. Configuration commands are usually delivered through the same magnetic coupling to control the acquisition and transmission of the bio-electronic signal. The command data is typically modulated onto the carrier wave by constant-envelope BPSK (binary phase shift keying) in order to maintain a steady power supply to the implanted circuit and to achieve better power transfer efficiency<sup>[5-7]</sup>. The modulated signal is then sent out through the RF (radio frequency) power amplifier.

When the wave reaches the internal coil, a demodulation process is needed to recover the configuration command data. Since PSK (phase shift keying) is a modulation process in which the input signal shifts the phase of the output waveform to one of a fixed number of states, the PSK waveform is a suppressed-carrier signal by nature. Coherent detection is therefore required, and the carrier has to be recovered first. Several techniques are applied to carrier recovery, such as the squaring loop, the COSTAS loop and the remodulator loop<sup>[8]</sup>. Among them, the COSTAS loop is often used as the PSK demodulator. However, all-digital COSTAS loop demodulators have a complex circuit structure and suffer from high power consumption<sup>[9]</sup>, which is intolerable for implanted applications, while analog COSTAS loop demodulators are also complex because they use a complex analog multiplier<sup>[10]</sup>. Reference [11] proposed a novel BPSK demodulator using BPSK signal regeneration and PFD-based phase-locked loop (PLL) techniques. However, it needs off-chip RC components, which makes a single-chip design difficult. Some QPSK demodulators were proposed to achieve a high data rate<sup>[12, 13]</sup>, but their larger die area and higher power limit their use in implanted modules. In our previous work<sup>[14]</sup>, a clock multiplier with a simpler structure was designed to replace the complex analog multiplier in a traditional BPSK demodulator based on coherent demodulation; a simple structure and low power were achieved, as shown in Fig. 1. This new BPSK demodulator was verified in a board-level wireless power and command transmission system<sup>[14]</sup>. In this paper, we first supplement and refine the working principle of this new BPSK demodulator and then focus on its implementation as a CMOS ASIC.



Fig. 1. Block diagram of the addressed BPSK demodulator.

<sup>\*</sup> Project supported by the National Natural Science Foundation of China (Nos. 60976026, 61076023), the National Basic Research Program of China (No. 2011CB933203), and the Fundamental Research Funds for the Central Universities, SCUT (No. 2009ZM0196).

<sup>†</sup> Corresponding author. Email: phzhwu@scut.edu.cn Received 24 October 2011, revised manuscript received 31 December 2011



Fig. 2. Waveforms illustrating the carrier wave recovery.



Fig. 3. Circuit diagram of the clock multiplier.



Fig. 4. Parameter description for RC charging and discharging.

## 2. Circuit structure

Before entering the demodulator, the signal received by the internal coil is simply shaped so that its voltage level is compatible with the digital circuit. The shaped input BPSK waveform has two phase states: 0 phase and $\pi$ phase. After the clock multiplier, the frequency of the signal is doubled and the two phase states become 0 phase and $2\pi$ phase, which are indistinguishable. The signal after the PLL has a clean phase and an accurate frequency of twice that of the carrier wave. After the clock divider, the carrier wave is recovered. The waveforms of the carrier recovery are shown in Fig. 2. The recovered carrier wave is then compared in phase with the received BPSK signal by an XOR gate. After an LPF (low pass filter) and a shaping circuit, the configuration command data is recovered. Because the other parts use typical circuit structures, this paper focuses only on the clock multiplier and PLL modules.
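The signal chain described above can be checked with a sample-level logic sketch. The code below is a minimal, idealized model and not the authors' circuit: the oversampling factor, the bit length and all function names are assumptions chosen for illustration, the PLL is replaced by an ideal clock at twice the carrier frequency, and the LPF plus shaping circuit is modeled as a per-bit majority vote.

```python
SAMPLES_PER_PERIOD = 16  # assumed samples per carrier period
PERIODS_PER_BIT = 8      # assumed carrier periods per command bit
QUARTER = SAMPLES_PER_PERIOD // 4
BIT_LEN = SAMPLES_PER_PERIOD * PERIODS_PER_BIT


def square_wave(n_samples, period):
    """50% duty-cycle square wave that starts high."""
    return [1 if (i % period) < period // 2 else 0 for i in range(n_samples)]


def bpsk_modulate(bits):
    """Phase 0 for bit 0, phase pi for bit 1 (carrier XOR data)."""
    data = [b for b in bits for _ in range(BIT_LEN)]
    carrier = square_wave(len(data), SAMPLES_PER_PERIOD)
    return [c ^ d for c, d in zip(carrier, data)]


def clock_multiplier(signal):
    """XOR the signal with a copy delayed by a quarter carrier period."""
    delayed = [0] * QUARTER + signal[:-QUARTER]
    return [a ^ b for a, b in zip(signal, delayed)]


def divide_by_two(clock):
    """Toggle flip-flop clocked on rising edges (initial state 0)."""
    out, state, prev = [], 0, 0
    for level in clock:
        if prev == 0 and level == 1:
            state ^= 1
        out.append(state)
        prev = level
    return out


def demodulate(received):
    # Ideal stand-in for the PLL output: a clean clock at twice the carrier.
    pll_clock = square_wave(len(received), SAMPLES_PER_PERIOD // 2)
    recovered_carrier = divide_by_two(pll_clock)
    mixed = [r ^ c for r, c in zip(received, recovered_carrier)]
    # Majority vote per bit stands in for the LPF and shaping circuit.
    return [1 if 2 * sum(mixed[i:i + BIT_LEN]) > BIT_LEN else 0
            for i in range(0, len(mixed), BIT_LEN)]


bits = [0, 1, 1, 0, 1]
received = bpsk_modulate(bits)
doubled = clock_multiplier(received)
ideal = square_wave(len(received), SAMPLES_PER_PERIOD // 2)
# Samples where the multiplier output deviates from a clean 2x clock.
glitch_samples = sum(a != b for a, b in zip(doubled, ideal))
decoded = demodulate(received)
print(decoded, glitch_samples)
```

In this model the doubled signal deviates from the ideal clock only for a quarter period after each data transition, which is the disturbance the PLL is expected to smooth out. The divider's initial state is chosen here so that the recovered carrier is in phase with the transmitted one; a real divide-by-two circuit has a 180° phase ambiguity that would invert the decoded data.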

The circuit structure of the clock multiplier is shown in Fig. 3. An RC phase-shift network and a Schmitt-Trigger generate an output waveform that has a constant time delay relative to the input digital signal. When the original signal and the phase-shifted signal are applied to the two inputs of the XOR gate, the output signal has twice the frequency of the original input signal.
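The effect of the delay on the XOR output can be illustrated with a short sketch. It is a logic-level illustration under assumed parameters (the sample count per period and the helper names are hypothetical), and it shows that any delay shorter than half a period doubles the frequency, while only a quarter-period delay gives a 50% duty cycle.

```python
def square_wave(n_samples, period):
    """50% duty-cycle square wave that starts high."""
    return [1 if (i % period) < period // 2 else 0 for i in range(n_samples)]


def delayed_copy(signal, shift):
    """Circular delay; valid because the signal holds whole periods."""
    return signal[-shift:] + signal[:-shift]


def xor_double(signal, shift):
    """XOR of the signal with its delayed copy."""
    return [a ^ b for a, b in zip(signal, delayed_copy(signal, shift))]


def rising_edges(signal):
    """Count 0->1 transitions, treating the signal as circular."""
    return sum(1 for i in range(len(signal))
               if signal[i - 1] == 0 and signal[i] == 1)


def duty_cycle(signal):
    return sum(signal) / len(signal)


PERIOD = 32  # assumed samples per input period
clock = square_wave(8 * PERIOD, PERIOD)
for shift in (PERIOD // 8, PERIOD // 4, 3 * PERIOD // 8):
    out = xor_double(clock, shift)
    ratio = rising_edges(out) // rising_edges(clock)
    print(f"delay = {shift}/{PERIOD} T, frequency x{ratio}, "
          f"duty = {duty_cycle(out):.2f}")
```

The duty cycle of the output equals twice the delay divided by the period, which is why the delay is later designed to be T/4.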



Fig. 5. Charging and discharging voltage waveform of the capacitor.

As shown in Fig. 4, let the supply voltage of the digital circuit be  $U_{\text{DD}}$ , the top charging voltage be  $U_{\text{T}}$  and the bottom discharging voltage be  $U_{\text{B}}$ . Based on the charging and discharging theory of the RC circuit<sup>[15]</sup>, the charging voltage  $U_{\text{charge}}$  and the discharging voltage  $U_{\text{discharge}}$  are:

$$U_{\rm charge} = (U_{\rm DD} - U_{\rm B}) \left( 1 - {\rm e}^{-t_{\rm charge}/RC} \right) + U_{\rm B}, \tag{1}$$

$$U_{\rm discharge} = U_{\rm T} {\rm e}^{-t_{\rm discharge}/RC}. \tag{2}$$

Figure 5 gives three periods of the charging and discharging voltage waveform of the capacitor. Assuming that the input digital signal has a constant period of T and a constant duty cycle of 50%, from Eqs. (1) and (2), the relationship of the six parameters shown in Fig. 5 can be obtained as follows:

$$U_{\rm T}^{(0)} = \left(U_{\rm DD} - U_{\rm B}^{(0)}\right) \left(1 - {\rm e}^{-T/2RC}\right) + U_{\rm B}^{(0)}, \tag{3}$$

$$U_{\rm B}^{(1)} = U_{\rm T}^{(0)} {\rm e}^{-T/2RC}, \tag{4}$$

$$U_{\rm T}^{(1)} = \left(U_{\rm DD} - U_{\rm B}^{(1)}\right) \left(1 - {\rm e}^{-T/2RC}\right) + U_{\rm B}^{(1)}, \tag{5}$$

$$U_{\rm B}^{(2)} = U_{\rm T}^{(1)} {\rm e}^{-T/2RC}. \tag{6}$$

From Eqs. (3)–(6), the top and bottom voltage of the capacitor during a certain period can be obtained by recursion as,

$$U_{\rm T}^{(n-1)} = \left(U_{\rm DD} - U_{\rm B}^{(n-1)}\right) \left(1 - {\rm e}^{-T/2RC}\right) + U_{\rm B}^{(n-1)}, \tag{7}$$

$$U_{\rm B}^{(n)} = U_{\rm T}^{(n-1)} {\rm e}^{-T/2RC}. \tag{8}$$

From Eqs. (7) and (8), the relationship of the bottom discharging voltage of the capacitor between two adjacent periods can be obtained as,

$$U_{\rm B}^{(n)} = \left[ \left( U_{\rm DD} - U_{\rm B}^{(n-1)} \right) \left( 1 - {\rm e}^{-T/2RC} \right) + U_{\rm B}^{(n-1)} \right] {\rm e}^{-T/2RC}. \tag{9}$$

That is

$$U_{\rm B}^{(n)} = U_{\rm DD} {\rm e}^{-T/2RC} - U_{\rm DD} {\rm e}^{-T/RC} + U_{\rm B}^{(n-1)} {\rm e}^{-T/RC}. \tag{10}$$
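The step from Eq. (9) to Eq. (10) is a direct expansion. Writing $x = {\rm e}^{-T/2RC}$ as a shorthand (a symbol introduced here only for brevity and not part of the original notation), the bracket in Eq. (9) simplifies before the final multiplication:

$$\left(U_{\rm DD} - U_{\rm B}^{(n-1)}\right)(1 - x) + U_{\rm B}^{(n-1)} = U_{\rm DD}(1 - x) + x\,U_{\rm B}^{(n-1)},$$

$$U_{\rm B}^{(n)} = \left[U_{\rm DD}(1 - x) + x\,U_{\rm B}^{(n-1)}\right] x = U_{\rm DD}\,x - U_{\rm DD}\,x^{2} + U_{\rm B}^{(n-1)}\,x^{2},$$

where $x^{2} = {\rm e}^{-T/RC}$.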

Thus, a series of equations can be obtained as follows according to Eq. (10):

$$U_{\rm B}^{(n)} = U_{\rm DD} {\rm e}^{-T/2RC} - U_{\rm DD} {\rm e}^{-2T/2RC} + U_{\rm DD} {\rm e}^{-3T/2RC} - U_{\rm DD} {\rm e}^{-4T/2RC} + U_{\rm B}^{(n-2)} {\rm e}^{-4T/2RC}, \tag{11}$$

$$\begin{aligned} U_{\rm B}^{(n)} = {} & U_{\rm DD} {\rm e}^{-T/2RC} - U_{\rm DD} {\rm e}^{-2T/2RC} + U_{\rm DD} {\rm e}^{-3T/2RC} \\ & - \dots - U_{\rm DD} {\rm e}^{-2nT/2RC} + U_{\rm B}^{(0)} {\rm e}^{-2nT/2RC}, \end{aligned} \tag{12}$$

$$U_{\rm B}^{(n)} = U_{\rm DD} \frac{1 + {\rm e}^{-(2n-1)T/2RC}}{1 + {\rm e}^{-T/2RC}} {\rm e}^{-T/2RC} - U_{\rm DD} {\rm e}^{-2nT/2RC} + U_{\rm B}^{(0)} {\rm e}^{-2nT/2RC}. \tag{13}$$

Fig. 6. CMOS circuit of the clock multiplier.

Because the value of  $e^{-2nT/2RC}$  vanishes very fast as the value of *n* increases, the bottom voltage of the capacitor can reach a relatively steady value after a few charging and discharging periods, and the bottom and top voltages can be written as follows:

$$U_{\rm B} = \frac{U_{\rm DD}}{1 + {\rm e}^{T/2RC}}, \tag{14}$$

$$U_{\rm T} = \frac{U_{\rm DD}}{1 + {\rm e}^{-T/2RC}}. \tag{15}$$
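The convergence of the recursion to these steady-state values can be checked numerically. The following sketch iterates Eqs. (7) and (8) from a discharged capacitor and compares the result with Eqs. (14) and (15). The value RC = T/2 is an assumption made only for this illustration; it is not taken from the design.

```python
import math


def iterate_voltages(u_dd, period, rc, n_periods, u_b0=0.0):
    """Apply Eqs. (7) and (8) n_periods times, starting from U_B^(0) = u_b0."""
    decay = math.exp(-period / (2.0 * rc))
    u_b = u_b0
    u_t = u_b0
    for _ in range(n_periods):
        u_t = (u_dd - u_b) * (1.0 - decay) + u_b  # Eq. (7)
        u_b = u_t * decay                          # Eq. (8)
    return u_b, u_t


def steady_state(u_dd, period, rc):
    """Closed-form bottom and top voltages, Eqs. (14) and (15)."""
    u_b = u_dd / (1.0 + math.exp(period / (2.0 * rc)))
    u_t = u_dd / (1.0 + math.exp(-period / (2.0 * rc)))
    return u_b, u_t


U_DD = 3.3                 # supply voltage from the paper
PERIOD = 1.0 / 13.56e6     # carrier period from the paper
RC = PERIOD / 2.0          # assumed time constant, for illustration only

u_b_inf, u_t_inf = steady_state(U_DD, PERIOD, RC)
for n in (1, 2, 3, 5, 10):
    u_b, u_t = iterate_voltages(U_DD, PERIOD, RC, n)
    print(f"n = {n:2d}: U_B = {u_b:.6f} V, error = {abs(u_b - u_b_inf):.2e} V")
```

Each period multiplies the remaining error by ${\rm e}^{-T/RC}$, so a handful of periods is enough for the capacitor voltages to settle in this example.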

To obtain an output with a duty cycle of approximately 50%, the RC network and the Schmitt-Trigger should together produce a time shift of T/4. Let  $\alpha U_{\rm DD}$  and  $(1-\alpha)U_{\rm DD}$  be the upper and lower switching-point voltages of the Schmitt-Trigger, respectively, and assume that the circuit parameters are adjusted so that  $U_{\rm charge}$  reaches the trigger voltage  $\alpha U_{\rm DD}$  at  $t_{\rm charge} = T/4$ . The RC time constant then follows from Eq. (1):

$$RC = -\frac{T}{4\ln\frac{U_{\rm DD} - \alpha U_{\rm DD}}{U_{\rm DD} - U_{\rm B}}}. \tag{16}$$

Since  $U_{\text{discharge}} = (1-\alpha)U_{\text{DD}}$  when the Schmitt-Trigger is triggered at the falling edge, substituting Eq. (16) into Eq. (2) gives the discharging time:

$$t_{\rm discharge} = \frac{\ln \frac{U_{\rm DD} - \alpha U_{\rm DD}}{U_{\rm T}}}{\ln \frac{U_{\rm DD} - \alpha U_{\rm DD}}{U_{\rm DD} - U_{\rm B}}} \frac{T}{4}. \tag{17}$$



Fig. 7. Working principle demonstration of the CMOS implemented clock multiplier. (a) Input signal. (b) Output of the RC-Network. (c) Output of the Schmitt-Trigger. (d) Output of the clock multiplier.

Equations (14) and (15) show that  $U_{\rm T}$  equals ( $U_{\rm DD} - U_{\rm B}$ ), so the numerator and denominator logarithms in Eq. (17) are identical and  $t_{\rm discharge}$  also equals T/4. In theory, therefore, the clock multiplier output has a duty cycle of 50%.
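As a worked example of this design condition, the sketch below finds the RC value that yields a T/4 delay for a given Schmitt-Trigger threshold. The relation it solves is derived here rather than stated in the paper: combining Eq. (16) with the steady-state result $U_{\rm DD} - U_{\rm B} = U_{\rm T}$ from Eqs. (14) and (15), and writing $y = {\rm e}^{-T/4RC}$, gives $y = (1-\alpha)(1 + y^2)$. The threshold $\alpha = 0.6$ is a hypothetical value chosen for illustration.

```python
import math


def rc_for_quarter_period_delay(alpha, period):
    """Solve y = (1 - alpha) * (1 + y**2) for y = exp(-T / (4 RC)), 0 < y < 1."""
    if not 0.5 < alpha < 1.0:
        raise ValueError("a T/4 delay requires 0.5 < alpha < 1")
    k = 1.0 - alpha
    y = (1.0 - math.sqrt(1.0 - 4.0 * k * k)) / (2.0 * k)
    return -period / (4.0 * math.log(y))


def switching_delays(alpha, u_dd, period, rc):
    """Charge and discharge times to the thresholds, from Eqs. (1) and (2)."""
    u_b = u_dd / (1.0 + math.exp(period / (2.0 * rc)))   # Eq. (14)
    u_t = u_dd / (1.0 + math.exp(-period / (2.0 * rc)))  # Eq. (15)
    t_charge = -rc * math.log((u_dd - alpha * u_dd) / (u_dd - u_b))
    t_discharge = -rc * math.log((1.0 - alpha) * u_dd / u_t)
    return t_charge, t_discharge


U_DD = 3.3              # supply voltage from the paper
PERIOD = 1.0 / 13.56e6  # carrier period from the paper
ALPHA = 0.6             # hypothetical upper threshold ratio

rc = rc_for_quarter_period_delay(ALPHA, PERIOD)
t_charge, t_discharge = switching_delays(ALPHA, U_DD, PERIOD, rc)
print(f"RC = {rc * 1e9:.1f} ns")
print(f"t_charge = {t_charge * 1e9:.2f} ns, "
      f"t_discharge = {t_discharge * 1e9:.2f} ns, T/4 = {PERIOD / 4 * 1e9:.2f} ns")
```

For this assumed threshold the solution is $y = 0.5$, that is, $RC = T/(4\ln 2)$, and both delays come out equal to T/4.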

## 3. Working principle

#### 3.1. Clock multiplier

Figure 6 shows the schematic of the CMOS implemented clock multiplier. There are three main function blocks in the circuit: the RC network, the Schmitt trigger and the AOI XOR gate. The working principle of the clock multiplier is illustrated by the simulated waveforms in Fig. 7. The input signal (s1, Fig. 7(a)) is a square wave with a frequency of 13.56 MHz and a duty cycle of 50%. The RC-Network and Schmitt-Trigger delay this wave by about 18.4 ns, which is a quarter of the 73.7 ns carrier period, giving signal s2 (Fig. 7(c)). With s1 and s2 as the inputs of the AOI XOR gate, the output (Fig. 7(d)) has a frequency of 27.12 MHz, twice that of the input signal, and a duty cycle of about 50%.

Fig. 8. Current-starved VCO for the CPPLL<sup>[16]</sup>.

#### 3.2. Charge-pump phase-locked loop

A charge-pump phase-locked loop (CPPLL) is used to recover the carrier in this design. It comprises a voltage-controlled oscillator (VCO), a phase-frequency detector (PFD) and a charge pump (CP). A current-starved VCO, shown in Fig. 8, is designed for the CPPLL. A 4-stage current-starved delay cell is used and the corresponding components in every stage have the same parameters. In order to satisfy the phase condition of the oscillator, an inverter is added in the loop, which also acts as an amplitude booster. The center frequency of the VCO is set to about 27.12 MHz. Since the delay time of the inverter in the loop is quite small and can be neglected, the frequency of the VCO is mainly determined by the delay time of the 4-stage current-starved delay cell. In Fig. 8, the drain currents of PM4 and NM4 are set to the same value, that is,  $I_{\rm D} = I_{\rm D(PM4)} = I_{\rm D(NM4)}$ . Assuming that the capacitor in every stage has a value of C, the delay time of a single current-starved delay cell can be calculated as follows<sup>[16]</sup>:

$$t_{\rm d} \approx \frac{C_{\rm tot} V_{\rm DD}}{I_{\rm D}},\tag{18}$$

where  $C_{\text{tot}}$  is the total capacitance on the drains of PM4 and NM4 and can be calculated as<sup>[16]</sup>,

$$C_{\rm tot} = \frac{5}{2} C_{\rm ox} (W_{\rm p} L_{\rm p} + W_{\rm n} L_{\rm n}) + C. \tag{19}$$

Thus, the oscillating frequency of the VCO can be written as,

$$f_{\rm osc} \approx \frac{1}{4t_{\rm d}} = \frac{I_{\rm D}}{4C_{\rm tot}V_{\rm DD}}. \tag{20}$$
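Equations (18) and (20) can be turned into a quick sizing estimate. The sketch below computes the drain current needed for the 27.12 MHz center frequency. The total node capacitance of 100 fF is a hypothetical value chosen for illustration, since the paper does not report the device sizes or the stage capacitor, and the function names are likewise assumptions.

```python
import math

N_STAGES = 4  # number of current-starved delay stages, from the text


def stage_delay(c_tot, v_dd, i_d):
    """Delay of one current-starved stage, Eq. (18)."""
    return c_tot * v_dd / i_d


def oscillation_frequency(c_tot, v_dd, i_d):
    """VCO frequency, Eq. (20); the loop inverter delay is neglected."""
    return 1.0 / (N_STAGES * stage_delay(c_tot, v_dd, i_d))


def required_current(f_osc, c_tot, v_dd):
    """Eq. (20) solved for the drain current I_D."""
    return N_STAGES * c_tot * v_dd * f_osc


V_DD = 3.3          # supply voltage from the paper
F_TARGET = 27.12e6  # VCO center frequency from the paper
C_TOT = 100e-15     # hypothetical total capacitance per stage (100 fF)

i_d = required_current(F_TARGET, C_TOT, V_DD)
print(f"I_D = {i_d * 1e6:.1f} uA for C_tot = {C_TOT * 1e15:.0f} fF")
print(f"check: f_osc = {oscillation_frequency(C_TOT, V_DD, i_d) / 1e6:.2f} MHz")
```

Because the frequency is proportional to $I_{\rm D}$, the control voltage tunes the VCO by setting this starving current.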

A typical structure is applied to the PFD, which is composed of two D flip-flops for comparison between the input signal and VCO output signal and one AND gate for reset function, as shown in Fig. 9. The working waveform of the designed PFD is also pictured in Fig. 9.
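The behavior of this kind of PFD can be sketched with a small event-driven model. The code is an idealized illustration under assumptions: the flip-flop outputs are named UP and DOWN here, the flip-flop clocked by the input signal is taken to drive UP, and the reset through the AND gate is treated as instantaneous, whereas a real circuit has a finite reset delay.

```python
def pfd_net_up_time(ref_edges, vco_edges):
    """Return total UP time minus total DOWN time for the given rising edges.

    ref_edges and vco_edges are rising-edge times of the input signal and
    of the VCO output. A positive result means the input leads the VCO.
    """
    events = sorted([(t, "ref") for t in ref_edges] +
                    [(t, "vco") for t in vco_edges])
    up = down = 0
    net = 0
    last_t = events[0][0] if events else 0
    for t, source in events:
        net += (up - down) * (t - last_t)  # accumulate the active output
        if source == "ref":
            up = 1      # D flip-flop clocked by the input signal
        else:
            down = 1    # D flip-flop clocked by the VCO output
        if up and down:
            up = down = 0  # AND gate resets both flip-flops
        last_t = t
    return net


print(pfd_net_up_time([0, 10, 20], [2, 12, 22]))  # input leads the VCO
print(pfd_net_up_time([2, 12, 22], [0, 10, 20]))  # VCO leads the input
print(pfd_net_up_time([0, 10, 20], [0, 10, 20]))  # edges aligned
```

The sign of the result indicates which direction the charge pump should move the control voltage, and its magnitude grows with the phase error.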

As demonstrated in Fig. 10, the main structure of the CP designed for the CPPLL is the same as that of Ref. [17]. Because the signal from the clock multiplier is not strictly constant in frequency (Fig. 2), even if the PLL is in a locked state, the output voltage of the charge pump cannot achieve a constant value and a voltage ripple will occur when the phase of the input BPSK signal changes. Therefore, an additional low pass filter, which is made up of  $R_2$  and  $C_3$ , is set before the output of the charge pump to reduce ripple. Meanwhile the buffer together with the MOS transistors PM4 and NM7 are used to reduce the charge sharing effect in the CP.

## 4. Results and discussions

The proposed BPSK demodulator was implemented in a Chartered 0.35  $\mu$ m CMOS technology with a 3.3 V power supply. The simulated waveforms of the design are shown in Fig. 11. Figure 11(a) is the command data before modulation, a serial square wave with a period of 700  $\mu$ s and a duty cycle of about 43%. After about 18  $\mu$ s, the control voltage of the VCO, shown in Fig. 11(b), reaches a relatively steady value, which means that the PLL has locked. Once the PLL is locked, the correct data is obtained from the demodulator, as shown in Fig. 11(c).

Fig. 9. Circuit structure and working waveform of the PFD for the PLL<sup>[17]</sup>.

Fig. 10. CMOS circuit of the charge pump<sup>[17]</sup>.

Fig. 11. Simulated waveform of the CMOS circuit of the BPSK demodulator. (a) The modulation signal. (b) Control voltage of the VCO. (c) Output of the demodulator.

Table 1. Simulated sub-block current consumption for the proposed demodulator.

| Building block   | Current consumption ($\mu$A) |
|------------------|------------------------------|
| Clock multiplier | 27.4                         |
| CPPLL            | 65.7                         |
| Clock divider    | 5.2                          |
| XOR              | 7.5                          |

Table 1 presents the consumed currents for the sub-blocks of the demodulator: clock multiplier, CPPLL, clock divider and XOR. The fifth sub-block, the low pass filter, is composed entirely of passive resistors and capacitors, so it does not consume power and is not listed. The BPSK carrier frequency used in the simulation is 13.56 MHz with a maximum data rate of 330 kbps, which satisfies the command transmission rate required in a wireless implantable neural recording system. Summing the currents in Table 1 (105.8 $\mu$A) at the 3.3 V supply gives a simulated power consumption of about 350 $\mu$W for the BPSK demodulator.
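The power figure follows directly from the Table 1 currents and the supply voltage, as the short calculation below shows. The dictionary and function names are illustrative only.

```python
SUPPLY_VOLTAGE = 3.3  # V, from the paper

# Simulated sub-block currents in microamperes, copied from Table 1.
CURRENT_UA = {
    "Clock multiplier": 27.4,
    "CPPLL": 65.7,
    "Clock divider": 5.2,
    "XOR": 7.5,
}


def total_current_ua(currents_ua):
    """Sum of the sub-block currents in microamperes."""
    return sum(currents_ua.values())


def total_power_uw(currents_ua, supply_v):
    """Total power in microwatts (microamperes times volts)."""
    return total_current_ua(currents_ua) * supply_v


total_ua = total_current_ua(CURRENT_UA)
power_uw = total_power_uw(CURRENT_UA, SUPPLY_VOLTAGE)
print(f"total current = {total_ua:.1f} uA, power = {power_uw:.0f} uW")
```

The CPPLL accounts for more than half of the total current, so it is the natural target for any further power reduction.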

Including an extra output buffer sub-block added for testing, the total area of the designed chip is  $320 \times 220 \ \mu m^2$ , as shown in Fig. 12. Five sub-blocks are labeled in the photograph; the XOR gate is not labeled since it is too small. Among them, the CPPLL (Part b) and the output buffer (Part a) take up most of the chip area.

Fig. 12. Photograph of the designed BPSK demodulator chip; the area is  $320 \times 220 \ \mu m^2$ . (a) Output buffer. (b) CPPLL. (c) Clock multiplier. (d) Clock divider. (e) Low pass filter.

Figure 13 shows the test results of the designed BPSK demodulator, where part (a) shows the input command signal and the BPSK modulated carrier signal, part (b) gives a clearer view of the rising edge in part (a), and part (c) shows the input command signal and the output of the demodulator. In this test, square waves with frequencies of 165 kHz and 13.56 MHz are used for the input command signal and the carrier signal, respectively. Figure 13(b) shows that when the input command signal changes from "0" to "1", the phase of the carrier signal changes from 0 phase to  $\pi$  phase, which confirms that the BPSK modulation is successful. Comparison between the measured command output signal and the input command signal in Fig. 13(c) demonstrates that the designed BPSK demodulator works correctly.

Table 2. Comparison results.

| Reference     | Modulation | Carrier frequency (MHz) | Data rate (Mbps) | Technology       | Core die area (mm<sup>2</sup>) | Power consumption ($\mu$W) |
|---------------|------------|-------------------------|------------------|------------------|--------------------------------|----------------------------|
| Ref. [9]\*    | BPSK       | 13.56                   | 1.12             | 0.18 $\mu$m CMOS | 0.19                           | 610 at 1.8 V               |
| Ref. [11]\*   | BPSK       | 13.56                   | 0.02             | 0.5 $\mu$m CMOS  | 1.0                            | 3000 at 3.3 V              |
| Ref. [12]\*\* | QPSK       | 13.56                   | 4                | 0.18 $\mu$m CMOS | _                              | 750 at 1.8 V               |
| Ref. [13]\*\* | QPSK       | 13.56                   | 8                | 0.18 $\mu$m CMOS | 0.238                          | 680 at 1.8 V               |
| This work\*   | BPSK       | 13.56                   | 0.16             | 0.35 $\mu$m CMOS | 0.0704                         | 500 at 3.3 V               |

\*Measured; \*\*Simulated.

The performance of the designed chip is compared with that of other studies in Table 2. Although 0.35  $\mu$ m/3.3 V technology is used, the power of the proposed BPSK demodulator (500  $\mu$ W) is lower than that of the COSTAS loop BPSK demodulator<sup>[9]</sup> and the QPSK demodulators<sup>[12, 13]</sup> (610 to 750  $\mu$ W), and its area (0.0704 mm<sup>2</sup>) is smaller than the areas reported in Refs. [9] and [13] (0.19 and 0.238 mm<sup>2</sup>), even though those designs all use 0.18  $\mu$ m/1.8 V technology.

## 5. Conclusion

A new BPSK demodulator with a simple circuit structure and low power was designed and implemented successfully in a 3.3 V Chartered 0.35  $\mu$ m CMOS technology. In this system, a clock multiplier made up of only an RC phase-shift network, a Schmitt-Trigger and an XOR gate was designed to double the frequency of the demodulator input signal from 13.56 to 27.12 MHz. A charge-pump phase-locked loop (CPPLL) with a current-starved VCO, together with a clock divider, was designed to recover the carrier wave. The test results show that when a BPSK modulated signal with a 13.56 MHz carrier and a 165 kHz input command signal is applied to the newly designed BPSK demodulator, an output signal with a frequency of about 165 kHz is obtained, demonstrating that it can work efficiently. The design takes up an area of  $320 \times 220 \ \mu\text{m}^2$  with a power consumption of  $500 \ \mu\text{W}$ . The power might be reduced further if the new circuit structure is used in a single-chip design for a wireless implantable neural recording system.

Fig. 13. Test results of the designed BPSK demodulator. (a) Input command signal and BPSK modulated carrier signal. (b) Enlarged view of the rising edge in (a). (c) Input command signal versus demodulator output signal.

## References

- [1] Yu H, Najafi K. Circuitry for a wireless microsystem for neural recording microprobes. Conference Proceedings of the 23rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Istanbul, Turkey, 2001: 761

- [2] Harrison R R, Watkins P T, Kier R J, et al. A low-power integrated circuit for a wireless 100-electrode neural recording system. IEEE J Solid-State Circuits, 2007, 42(1): 123
- [3] Shen Xiaoyan, Wang Zhigong, Lü Xiaoying, et al. Microelectronic neural bridge for signal regeneration and function rebuilding over two separate nerves. Journal of Semiconductors, 2011, 32(6): 065011
- [4] Mai Songping, Zhang Chun, Chao Jun, et al. A new cochlear prosthetic system with an implanted DSP. Journal of Semiconductors, 2008, 29(9):1745
- [5] Donaldson N, Perkins T A. Analysis of resonant coupled coils in the design of radio frequency transcutaneous links. Med Biol Eng Comput, 1983, 21: 612
- [6] Galbraith D, Soma M, White R L. A wide-band efficient inductive transdermal power and data link with coupling insensitive gain. IEEE Trans Biomed Eng, 1987, 34(4): 265
- [7] Djemouai A, Sawan M. Prosthetic power supplies, in encyclopedia of electrical and electronics engineering. New York: Wiley, 1999, 17: 413
- [8] Best R E. Phase-locked loops: design, simulation, and applications. McGraw-Hill, 2004: 362
- [9] Hu Y, Sawan M. A fully-integrated low-power BPSK demodulator for implantable medical devices. IEEE Trans CAS I, 2005, 52(12): 2552

- [10] Mizokami M, Takakubo K, Takakubo H. Four-quadrant-input linear transconductor employing source and sink currents pair for analog multiplier. IEICE Trans, 2006, 89-A(2): 362
- [11] Luo Z, Sonkusale S. A novel BPSK demodulator for biological implants. IEEE Trans Circuits Syst I: Regular Papers, 2008, 55(6): 1478
- [12] Deng S, Hu Y, Sawan M. A high data rate QPSK demodulator for inductively powered electronics implants. Proc IEEE Int Symp Circuits and Systems (ISCAS), 2006: 2577
- [13] Lu Z, Sawan M. An 8 Mbps data rate transmission by inductive link dedicated to implantable devices. Proc IEEE Int Symp Circuits and Systems (ISCAS), 2008: 3057
- [14] Wu Z H, Liang Z M, Li B. A new BPSK demodulation circuit for command transmission in wireless implantable neural recording system. IEEE Asia Pacific Conference on Circuits and Systems, Macao, China, 2008: 1526
- [15] Bueche F J, Hecht E. Schaum's outline of theory and problems of college physics. 9th ed. McGraw-Hill, 1997: 321
- [16] Baker R J, Li W H, Boyce E D. CMOS circuit design, layout and simulation. 2nd ed. New York: IEEE Press: 524
- [17] Razavi B. Design of analog CMOS integrated circuits. New York: McGraw-Hill International Edition, 2001: 528