# A low-power high-speed driving circuit for spatial light modulators

Zhu Minghao(朱明皓)<sup>1,2,†</sup>, Zhu Congyi(朱从义)<sup>1</sup>, Li Wenjiang(李文江)<sup>1</sup>, and Zhang Yaohui(张耀辉)<sup>1</sup>

<sup>1</sup>Suzhou Institute of Nano-Tech and Nano-Bionics, Chinese Academy of Sciences, Suzhou 215125, China <sup>2</sup>Graduate University of the Chinese Academy of Sciences, Beijing 100049, China

Abstract: This paper describes the design and test of a novel custom driving circuit for multi-quantum-well (MQW) spatial light modulators (SLMs). Unlike previous solutions, all blocks are integrated in one chip to synchronize the control logic circuit and the driving circuits. Single-slope digital-to-analog converters (DACs) inside each pixel are not adopted because it is difficult to eliminate capacitor mismatch. Instead, 64 column-shared 8-bit resistor-string DACs provide programmable output voltages from 0.5 to 3.8 V. They are placed compactly along the top of the 64 × 64 driving pixels, together with several dummy elements, so that they match each other. Each DAC performs its conversion in 280 ns and draws 80 $\mu$A. For a high data transfer rate, the system adopts a 2-stage shift register that operates at 50 MHz, and the modulating rate reaches 50 K frames/s while dissipating 302 mW from a 5-V supply. The die is fabricated in a 0.35 $\mu$m CMOS process and its area is 5.5 × 7 mm<sup>2</sup>.

**Key words:** spatial light modulator; driving circuit; high speed; low power **DOI:** 10.1088/1674-4926/33/2/025013 **EEACC:** 2570

## 1. Introduction

Two-dimensional spatial light modulators (SLMs) are a significant element in optical communication and optical interconnects. Their ability to perform mathematical operations is exploited in many vector-by-matrix multiplication (VMM) units, where they cooperate with integrated circuits as electro-optic converters. The efficiency of these converters depends directly on the response of the SLMs. Modulators based on multi-quantum wells (MQWs) are of particular interest because the response of a single pixel can exceed 5 GHz, faster than SLMs based on other materials<sup>[1]</sup>. MQW SLMs have made outstanding achievements in electro-optical signal processing and optical computing. A new optical digital signal processor (DSP) based on an MQW SLM achieves 8000 Giga MAC/s with 5 mW/Giga MAC power dissipation<sup>[2]</sup>.

The development of SLMs puts forward new requirements for driving circuits with higher speed, lower power consumption and higher accuracy. Several earlier reports have described a multi-chip module consisting of a CMOS driving array and four off-chip digital-to-analog converters (DACs)<sup>[3]</sup>. This solution permits high frame rates, but at the cost of high power consumption. Furthermore, a clock signal path on the motherboard is required to synchronize the driving chip and the four DAC chips. With further scaling, the clock path that synchronizes all blocks becomes too long to support high-speed transmission. A solution to these problems was presented that builds a single-slope DAC into each pixel<sup>[4]</sup>. As it integrates all blocks in one chip, it decreases power consumption and resolves the problem of data synchronization. The drawback is that the accuracy of the DAC depends on the value of its integrating capacitor, which is limited by the process technology. In this application, severe capacitor mismatch causes relative output errors of more than 10%, which cannot be neglected.

In this paper, we describe a novel monolithic structure that overcomes the drawbacks mentioned above, as shown in Fig. 1(a). The chip uses a 2-stage shift register for high-speed transmission and data synchronization. Instead of single-slope DACs, resistor-string DACs are adopted for high accuracy. The numerous unity-gain buffers play a very important role in low-power design, which was neglected in previous reports. A class AB output stage is employed that reduces power dissipation by 30% compared with previous designs. Based on these techniques, the system is designed to achieve a 50 K/s frame rate and a multi-level grayscale output voltage with 302 mW.

## 2. System architecture

In view of the above-mentioned facts, we set the following design goals: (1) monolithic integration; (2) reduction of mismatch; (3) low quiescent power consumption; (4) suitability for further scaling. With these goals, the proposed schematic consists of a 32-bit I/O, a 2-stage shift register, 64 column-shared 8-bit resistor-string DACs, $64 \times 64$ driving pixels, a timing controller, and a bias voltage generator.

Figure 1(b) shows the block diagram of the proposed driving circuit. The 2-stage shift register synchronizes the input data from the 32-bit parallel I/Os and formats them into 512-bit words that are transmitted to the column-shared DACs. 64 resistor-string DACs are employed for their excellent monotonicity and good behavior at resolutions up to 8 bits. The timing controller switches on the pixels row by row. When a pixel is switched on, it samples and holds the current output voltage of the corresponding DAC. The pixel drivers are required to provide an analog voltage from 0.5 to 3.8 V. Consequently, a 5 V supply is required for the DACs and pixels. To lower power consumption and to reject fluctuation from the digital circuits, a separate 3.3 V supply is used for the digital circuits.

<sup>†</sup> Corresponding author. Email: mhzhu2008@sinano.ac.cn Received 25 July 2011, revised manuscript received 19 September 2011



Fig. 1. The proposed (a) architectures and (b) schematic of the system.



Fig. 2. The 4-bit R-string DAC.

SLM pixels have a nonlinear response to the applied voltage. To compensate, a software preprocessing unit is utilized<sup>[4]</sup>.

## 3. Design and simulation

### 3.1. Column-shared DAC

64 column-shared 8-bit resistor-string DACs provide multi-level grayscale voltages for the $64 \times 64$ pixels. A conventional resistor-string DAC includes $2^N$ resistors and switches. In accordance with the pixel pitch and die area, each column DAC is confined to a rectangle 65 $\mu$m wide and 1.8 mm long. An area-efficient segmented converter is therefore adopted that employs a pair of resistor strings comprising $2^{N/2+1}$ resistors instead of $2^N$, saving 80% of the area in this application<sup>[5]</sup>. For a detailed description, the schematic diagram of a 4-bit DAC based on this solution is shown in Fig. 2. Table 1 shows the relationship between the 4-bit digital words fed to this DAC and the open/closed positions of its switches.
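As a quick check of the resistor counts quoted above, the following sketch compares $2^N$ with $2^{N/2+1}$ for the 8-bit column DAC and for the 4-bit example of Fig. 2. The function name is illustrative only, and the sketch counts resistors, not layout area: the 80% area saving in the text is the authors' figure and is not derived here.

```python
def resistor_counts(n_bits: int) -> tuple[int, int]:
    """Return (conventional, segmented) resistor counts for an N-bit R-string DAC.

    conventional: a single string of 2**N resistors.
    segmented:    a pair of strings totalling 2**(N/2 + 1) resistors,
                  as stated in the text. Assumes N is even.
    """
    if n_bits % 2:
        raise ValueError("this sketch assumes an even number of bits")
    conventional = 2 ** n_bits
    segmented = 2 ** (n_bits // 2 + 1)
    return conventional, segmented


conventional, segmented = resistor_counts(8)
reduction = 1 - segmented / conventional
print(f"8-bit: {conventional} vs {segmented} resistors "
      f"({reduction:.1%} fewer resistors)")
print("4-bit:", resistor_counts(4))
```

For the 4-bit case the count of 8 agrees with Fig. 2 as described by Table 1: five SM taps bound four coarse resistors and four SL taps select among four fine resistors.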

The 64 column-shared DACs need 512 bits of data to produce the 64 outputs for the pixels in one row, and the 32-bit I/O delivers these bits in 16 clock cycles. Each DAC must therefore finish its conversion within 16 clock cycles. Allowing for the sampling time, the DAC settling time must be less than 280 ns (1/3.57 MHz). The simulation results are shown in Table 2. The accuracy and power consumption meet the requirements.
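The row timing budget can be worked through numerically. The sketch below uses only figures from the text (50 MHz clock, 32-bit I/O, 64 columns, 8 bits per DAC, 280 ns settling limit); treating the remainder of the row period as the time left for sampling is an assumption made for illustration, not a breakdown given by the authors.

```python
CLOCK_HZ = 50e6            # shift-register clock
IO_WIDTH_BITS = 32         # parallel I/O width
COLUMNS = 64               # one DAC per column
BITS_PER_DAC = 8
SETTLING_LIMIT_S = 280e-9  # required DAC settling time

bits_per_row = COLUMNS * BITS_PER_DAC            # data needed for one row
cycles_per_row = bits_per_row // IO_WIDTH_BITS   # clock cycles to load one row
row_period_s = cycles_per_row / CLOCK_HZ         # time available per row
# Assumption: whatever is not spent on settling is left for sampling.
margin_s = row_period_s - SETTLING_LIMIT_S
conversion_rate_hz = 1 / SETTLING_LIMIT_S

print(f"bits per row:      {bits_per_row}")
print(f"cycles per row:    {cycles_per_row}")
print(f"row period:        {row_period_s * 1e9:.0f} ns")
print(f"remaining margin:  {margin_s * 1e9:.0f} ns")
print(f"conversion rate:   {conversion_rate_hz / 1e6:.2f} MHz")
```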

Table 1. Relationship between digital words and switches.

| $D_3D_2D_1D_0$ | Closed switches          | Opened switches                                  |
|----------------|--------------------------|--------------------------------------------------|
| 0000           | $SM_0$, $SM_1$, $SL_0$   | $SM_2$, $SM_3$, $SM_4$, $SL_1$, $SL_2$, $SL_3$   |
| 0001           | $SM_0$, $SM_1$, $SL_1$   | $SM_2$, $SM_3$, $SM_4$, $SL_0$, $SL_2$, $SL_3$   |
| 0010           | $SM_0$, $SM_1$, $SL_2$   | $SM_2$, $SM_3$, $SM_4$, $SL_0$, $SL_1$, $SL_3$   |
| 0011           | $SM_0$, $SM_1$, $SL_3$   | $SM_2$, $SM_3$, $SM_4$, $SL_0$, $SL_1$, $SL_2$   |
| 0100           | $SM_1$, $SM_2$, $SL_3$   | $SM_0$, $SM_3$, $SM_4$, $SL_0$, $SL_1$, $SL_2$   |
| 0101           | $SM_1$, $SM_2$, $SL_2$   | $SM_0$, $SM_3$, $SM_4$, $SL_0$, $SL_1$, $SL_3$   |
| 0110           | $SM_1$, $SM_2$, $SL_1$   | $SM_0$, $SM_3$, $SM_4$, $SL_0$, $SL_2$, $SL_3$   |
| 0111           | $SM_1$, $SM_2$, $SL_0$   | $SM_0$, $SM_3$, $SM_4$, $SL_1$, $SL_2$, $SL_3$   |
| 1000           | $SM_2$, $SM_3$, $SL_0$   | $SM_0$, $SM_1$, $SM_4$, $SL_1$, $SL_2$, $SL_3$   |
| 1001           | $SM_2$, $SM_3$, $SL_1$   | $SM_0$, $SM_1$, $SM_4$, $SL_0$, $SL_2$, $SL_3$   |
| 1010           | $SM_2$, $SM_3$, $SL_2$   | $SM_0$, $SM_1$, $SM_4$, $SL_0$, $SL_1$, $SL_3$   |
| 1011           | $SM_2$, $SM_3$, $SL_3$   | $SM_0$, $SM_1$, $SM_4$, $SL_0$, $SL_1$, $SL_2$   |
| 1100           | $SM_3$, $SM_4$, $SL_3$   | $SM_0$, $SM_1$, $SM_2$, $SL_0$, $SL_1$, $SL_2$   |
| 1101           | $SM_3$, $SM_4$, $SL_2$   | $SM_0$, $SM_1$, $SM_2$, $SL_0$, $SL_1$, $SL_3$   |
| 1110           | $SM_3$, $SM_4$, $SL_1$   | $SM_0$, $SM_1$, $SM_2$, $SL_0$, $SL_2$, $SL_3$   |
| 1111           | $SM_3$, $SM_4$, $SL_0$   | $SM_0$, $SM_1$, $SM_2$, $SL_1$, $SL_2$, $SL_3$   |
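The closed-switch column of Table 1 follows a regular pattern that can be sketched in a few lines. The decoding rule below is inferred from the table rather than stated by the authors: the two upper bits select an adjacent pair of SM switches, and the two lower bits select one SL switch, with the SL order reversed in odd-numbered segments. The function names are illustrative.

```python
# All switches of the 4-bit example: SM0..SM4 (coarse) and SL0..SL3 (fine).
ALL_SWITCHES = {f"SM{i}" for i in range(5)} | {f"SL{i}" for i in range(4)}


def closed_switches(code: int) -> set[str]:
    """Return the switches closed for a 4-bit input code, per Table 1."""
    if not 0 <= code <= 0b1111:
        raise ValueError("code must be a 4-bit value")
    segment = code >> 2   # upper two bits: which coarse segment
    fine = code & 0b11    # lower two bits: position inside the segment
    if segment % 2:       # odd segments traverse the fine taps in reverse
        fine = 3 - fine
    return {f"SM{segment}", f"SM{segment + 1}", f"SL{fine}"}


def open_switches(code: int) -> set[str]:
    """Every switch that is not closed is open."""
    return ALL_SWITCHES - closed_switches(code)


for code in (0b0000, 0b0100, 0b0111, 0b1111):
    print(f"{code:04b}", sorted(closed_switches(code)))
```

Reversing the fine index in alternate segments means that only one SL switch changes between adjacent codes within a segment, and the SL switch stays put when the code crosses a segment boundary, which is consistent with the rows of Table 1.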

### 3.2. Driver array

According to the requirements of the spatial light modulator, the pixel driver is restricted to a $65 \times 65\ \mu m^2$ square and consists of a sample/hold capacitor, a voltage buffer and several switches. The tradeoff between switch charge injection and switching rate must be considered. A large sample capacitor helps to reduce switch noise, but it requires a higher driving capacity from the DAC buffer and a larger area. Bottom-plate sampling is not implemented because it would route a signal across the whole die and introduce extra interference.

Fig. 3. (a) A compensated cascode op amp with bias voltage. (b) A compensated op amp without bias voltage. (c) An ac small-signal equivalent circuit of Fig. 3(b). (d) Thevenin equivalent circuit of Fig. 3(c).

Table 2. Simulation results of the DAC.

| DAC spec         | TT (TTMOS 270)                 |
|------------------|--------------------------------|
| DNL              | 0.086 LSB                      |
| INL              | 0.078 LSB                      |
| SNDR @ 62 kHz    | 49.7 dB                        |
| SNDR @ 1.416 MHz | 49.6 dB                        |
| Total power      | 273 μW @ DC, 400 μW @ 3.57 MHz |
| Load             | 3 pF                           |

### 3.3. Buffer

A common-gate technique (Fig. 3(a)) has been reported to cancel the zero induced by the Miller capacitor<sup>[6]</sup>. In such an application, an extra current mirror is required to bias the common-gate transistors, which would increase the power consumption and the area of the whole chip by more than 18%. A pseudo-cascode architecture (Fig. 3(b)) is employed to solve the problem<sup>[7]</sup>. In Ref. [7], the authors predict that the slew rate depends on the sum of the two compensation capacitors and make the two equal. However, we obtain better results with two different capacitors. Since area is scarce, optimum allocation of the capacitors is essential to improving the performance of the op amp.

A small-signal model suitable for analyzing the frequency response appears in Fig. 3(c). The current sources $g_{m2}V_i/2$ and $G_{m2}V_i/2$ result from the action of the input pair. The former corresponds to MN2 and the latter corresponds to MN4 as a current mirror of MN1. The conductances $g_{02}$ and $g_{04}$ are the small-signal conductances of MN2 and MN4, respectively. $g_{m7}$ and $g_{07}$ are the transconductance and conductance of the output stage. $g_{06}$ represents the conductance of MN6, which works in the linear region. MN4 and MN6 compose a common-source stage with source degeneration. We substitute the input source, MN4 and MN6 by a Thevenin equivalent. As illustrated in Fig. 3(d), MN4 degenerated by MN6 is equivalent to a current source $g_{m2}V_i/2$ shunted by two series conductances<sup>[8]</sup>, where $G_{04}$ equals $\frac{g_{04}g_{06}}{g_{m4}+g_{06}}$, which is far less than $g_{06}$, and $g_{m4}$ is the transconductance of MN4.

Analysis of the Thevenin equivalent circuit results in the following transfer function, a ratio of polynomials in $s$:

$$\frac{V_{\rm o}}{V_{\rm i}} = \frac{g_{\rm m2} \left[g_{\rm m7}G_{05} - (G_{04}C_2 + G_{05}C_1 - g_{\rm m7}C_2)s - C_1C_2s^2\right]}{G_{02}g_{06}g_{07} + (g_{\rm m7}g_{06}C_1 - g_{\rm m7}g_{07}C_2 + G_{02}g_{06}C_{\rm L})s + (g_{\rm m7}C_1C_2 + g_{06}C_1C_{\rm L})s^2 + C_1C_2C_{\rm L}s^3},$$

where $G_{02}$ equals $g_{02}$ plus $G_{04}$, and $G_{05}$ is the sum of $G_{04}$ and $g_{06}$. Since $g_{m7}$, $g_{m4} \gg G_{02}$, $g_{07}$, a simplification of the equation can be obtained by factoring:

$$\frac{V_{\rm o}}{V_{\rm i}} = \frac{g_{\rm m2} \left(g_{\rm m7} - C_1 s\right) \left(G_{05} + C_2 s\right)}{s C_1 \left(g_{\rm m7} + C_{\rm L} s\right) \left(g_{06} + C_2 s\right)}$$

Since $g_{06}$ is much larger than $G_{04}$, $G_{05}$ is approximately equal to $g_{06}$. A zero and a pole in the left-half plane can then be removed as a common factor. Finally we obtain two poles, one at the origin and another at $g_{m7}/C_L$. Compared with conventional Miller compensation, the dominant pole is close to the origin rather than at $\frac{g_{02}g_{07}}{g_{m7}C_1}$. As a result, the relation between gain-bandwidth (GBW) and $C_1$ can be neglected. We can decrease the value of $C_1$ to push the right-half-plane zero further out and increase $C_2$ to keep their sum constant. The best ratio of $C_2$ to $C_1$ is from 1.4 to 2.2, depending on the fabrication process. This improves the phase margin and shortens the settling time.

Fig. 4. Schematic of the class AB amplifier.

Table 3. Simulation results of the buffer.

| Buffer spec          | TT (TTMOS 270)                              |
|----------------------|---------------------------------------------|
| Gain                 | 86.3 dB                                     |
| GBW                  | 14.9 MHz                                    |
| Phase margin         | 59.3°                                       |
| Settling time        | 280 ns                                      |
| PSRR_power           | 110 dB @ DC, 92 dB @ 3.57 MHz               |
| PSRR_substrate       | 50 dB @ DC, 42 dB @ 3.57 MHz                |
| Power                | 212 $\mu$W for DACs, 66 $\mu$W for pixels   |
| Offset @ Monte Carlo | 20.8 mV (pessimistic)                       |

Table 4. Performance summary.

| Parameter        | Performance |
|------------------|-------------|
| DNL              | 3.40 LSB    |
| INL              | 3.28 LSB    |
| SNDR             | 38.2 dB     |
| ENOB             | 6.15 bit    |
| Settling time    | 280 ns      |
| Frame rate       | 50 K/s      |
| Power (in total) | 302 mW      |
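For the buffer analysis in Section 3.3, the pole-zero cancellation can be written out explicitly. Taking $G_{05} \approx g_{06}$ in the factored transfer function, the factor $(g_{06} + C_2 s)$ appears in both numerator and denominator and cancels. The symbols $z$, $p_1$ and $p_2$ for the remaining zero and poles are introduced here for illustration only:

$$\frac{V_{\rm o}}{V_{\rm i}} \approx \frac{g_{\rm m2} \left(g_{\rm m7} - C_1 s\right)}{s C_1 \left(g_{\rm m7} + C_{\rm L} s\right)}, \qquad z = +\frac{g_{\rm m7}}{C_1}, \quad p_1 = 0, \quad p_2 = -\frac{g_{\rm m7}}{C_{\rm L}}$$

In this approximation $C_2$ drops out entirely, and reducing $C_1$ moves the right-half-plane zero $z$ to a higher frequency, which is the mechanism the text relies on when it shifts capacitance from $C_1$ to $C_2$.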

In order to increase the transconductance with a lower quiescent current, a class AB output stage with a push-pull structure is adopted (Fig. 4). For simplicity, no slew-rate enhancement technique is applied, since the dynamic driving capacity is sufficient.

As shown in Table 3, the settling time meets the requirement with low power consumption. However, the power supply rejection ratio (PSRR) with respect to the substrate is not good enough, because noise is coupled to the output through $C_2$ and MN3. The output offset may reach 2 least significant bits (LSBs), and the Monte Carlo result implies that the output may be even worse. Nevertheless, we have to sacrifice accuracy, since power consumption and area are more significant.
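To relate the Monte Carlo offset in Table 3 to LSBs, the sketch below divides it by the size of one LSB. It assumes that the LSB is the 0.5 to 3.8 V output span divided by $2^8$; the text does not state whether the span is divided by $2^N$ or $2^N - 1$, and the difference is negligible here.

```python
V_MIN, V_MAX = 0.5, 3.8   # output voltage range from the text, in volts
N_BITS = 8                # DAC resolution
OFFSET_V = 20.8e-3        # Monte Carlo offset from Table 3, in volts

# Assumption: one LSB is the full span divided by 2**N.
lsb_v = (V_MAX - V_MIN) / 2 ** N_BITS
offset_lsb = OFFSET_V / lsb_v

print(f"LSB size: {lsb_v * 1e3:.2f} mV")
print(f"offset:   {offset_lsb:.2f} LSB")
```

Under this assumption the offset comes out between 1 and 2 LSBs, in line with the statement that it may reach 2 LSBs.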



Fig. 5. Die photograph.

## 4. Measured performances

The prototype driving circuit has been fabricated in a standard mixed-signal 0.35-$\mu$m CMOS technology. The die photo is shown in Fig. 5 and the active area is 5.5 × 7 mm<sup>2</sup>. The SLM driven by the circuit achieves 50 K frames/s with 302 mW. This corresponds to 105 Giga MAC/s with 2.9 mW/Giga MAC power dissipation, less than 60% of the result in Ref. [2].
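The power-efficiency comparison can be reproduced from the quoted numbers. The sketch below takes the 302 mW, 105 Giga MAC/s and 5 mW/Giga MAC figures directly from the text; it does not derive the MAC rate itself, and the variable names are illustrative.

```python
POWER_MW = 302.0              # measured total power
THROUGHPUT_GMAC_S = 105.0     # MAC rate quoted in the text
REFERENCE_MW_PER_GMAC = 5.0   # figure quoted for Ref. [2]

mw_per_gmac = POWER_MW / THROUGHPUT_GMAC_S
ratio_to_reference = mw_per_gmac / REFERENCE_MW_PER_GMAC

print(f"{mw_per_gmac:.2f} mW per Giga MAC/s")
print(f"{ratio_to_reference:.1%} of the reference figure")
```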

Evaluating the output deviation of the driving circuit is similar to the traditional test of a DAC. The output waveform of the driving circuit is shown in Fig. 6(a). The DNL and INL of the circuit are 3.40 LSB and 3.28 LSB, respectively (Fig. 6(b)). The signal-to-noise ratio (SNR) is 38.6 dB and the signal-to-noise and distortion ratio (SNDR) is 38.2 dB (Fig. 6(c)). The effective number of bits (ENOB) is 6.15. Table 4 summarizes the circuit's performance.

The dynamic accuracy is constrained by two factors. Firstly, the restricted area leads to mismatch. Secondly, power-supply noise and especially substrate noise, contributed by all the blocks including the 64 DACs and 4096 buffers, introduce deviation. A more accurate driving voltage would cost more power and die area.

## 5. Conclusion

This paper describes a CMOS circuit that provides programmable 64-level grayscale voltages from 0.5 to 3.8 V for an MQW SLM with $64 \times 64$ pixels. The architecture achieves 50 K frames/s operation and sustains 1.6 Gbps throughput at 50 MHz, while the power dissipation is 302 mW.
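The throughput and frame-rate figures can be cross-checked against each other. The sketch below assumes that every frame reloads all 8 bits for each of the $64 \times 64$ pixels over the 32-bit I/O with no protocol or blanking overhead; that assumption is made for illustration and is not stated in the text.

```python
IO_WIDTH_BITS = 32
CLOCK_HZ = 50e6
ROWS = COLUMNS = 64
BITS_PER_PIXEL = 8

io_throughput_bps = IO_WIDTH_BITS * CLOCK_HZ       # raw input data rate
bits_per_frame = ROWS * COLUMNS * BITS_PER_PIXEL   # data for one full frame
# Assumption: no overhead, so this is an upper bound on the frame rate.
max_frame_rate = io_throughput_bps / bits_per_frame

print(f"I/O throughput: {io_throughput_bps / 1e9:.1f} Gbps")
print(f"bits per frame: {bits_per_frame}")
print(f"frame rate bound: {max_frame_rate:.0f} frames/s")
```

The bound lands close to the quoted 50 K frames/s, which suggests that the input data rate is what sets the frame rate in this architecture.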

A class AB amplifier is adopted, together with an analysis of the optimum design of its compensation capacitors. This solution improves the buffer performance while occupying minimum area and dissipating minimum power. The whole system is integrated in one chip that is flip-chip bonded to the MQW SLM, and further scaling is feasible. The uniformity calibration of the driving array will then become the central issue.

Fig. 6. (a) Output waveform of the driving circuit. (b) Measured DNL and INL. (c) Measured SNDR.

## References

- [1] Goossen K, Walker J, D'Asaro L A, et al. GaAs MQW modulators integrated with silicon CMOS. IEEE Photonics Technol Lett, 1995, 7: 360
- [2] Eisenbach S. Optical signal processing. Confidential and Proprietary Information of Lenslet, 2003
- [3] Garvin C, Trezza J A, Ahhearn J S, et al. Overview of high-speed multiple quantum well optical modulator devices at Lockheed Martin Sanders. Proceedings of SPIE-The International Society for Optical Engineering, 1998, 3466: 145
- [4] Wu Lan, Yu Ningmei, Zhang Yaohui. A universal programmable driving circuit for spatial light modulators. Journal of Semiconductors, 2009, 30(7): 075013
- [5] Dempsey D, Gorman C. Digital to analog converter. USA Patent, No. 5969657, 1999

- [6] Ribner D B, Copeland M A. Design techniques for cascoded CMOS op amps with improved PSRR and common-mode input range. IEEE J Solid-State Circuits, 1984, 19(6): 919
- [7] Itakura T, Minamizaki H. 10 μA quiescent current opamp design for LCD driver ICs. Analog Integrated Circuits and Signal Processing, 1999, 20(2): 111
- [8] Razavi B. Design of analog CMOS integrated circuits. The McGraw-Hill Companies, Inc, 2001, Chap 3
- [9] Woodward T K, Krishnamoorthy A V, Goossen K W, et al. Modulator-driver circuits for optoelectronic VLSI. IEEE Photonics Technol Lett, 1997, 9(6): 839
- [10] Bell M J. An LCD column driver using a switch capacitor DAC. IEEE J Solid-State Circuits, 2005, 40(12): 2756
- [11] Weiler M H, Abearn J S, Adams S B, et al. Large scale modulator arrays for beam steering and optical modulator applications. IEEEAC paper#458, 2002
- [12] Worchesky T L, Ritter K J, Martin R, et al. Large arrays of spatial light modulators hybridized to silicon integrated circuits. Appl Opt, 1996, 35:1180