1. Introduction
1.1 From memristor to memtransistor
The past decades have witnessed tremendous progress in memristor-based neuromorphic computing. The memristor, a two-terminal device whose resistance (memristance) is defined as the ratio of the change in magnetic flux to the change in the charge that has flowed through it, was proposed by Chua in the 1970s[1]. It was not experimentally realized until nearly 40 years later, when Strukov et al. showed that the conductance of a memristor could be modulated back and forth in a nonvolatile manner[2]. Because it acts like a voltage-driven microscale sliding rheostat (Fig. 1(a)), the memristor was found to be a promising candidate for artificial synapses[3]. The synaptic strength defines the transmission efficiency of signals from the pre-synaptic neuron to the post-synaptic one and is thus electrically equivalent to the conductance of a memristor. Moreover, synaptic strengths have to be tunable, and the tuning results have to be retained during the learning process of a neural network; the nonvolatile conductance change of the memristor matches these demands well. As a result, in less than 10 years since the first demonstration of the memristor as an electronic synapse, research on memristor-based neuromorphic computing has advanced rapidly from device-level illustrations to array-level[4−6] and chip-level integration[7, 8].

Recently, memtransistors (MT), multi-terminal devices generalized from the two-terminal memristor, have been experimentally demonstrated and intensively explored for neuromorphic computing[9−11]. As seen in Fig. 1(b), an MT is defined as a transistor whose channel conductance can be manipulated in a nonvolatile manner by the gate voltage. However, compared with the simple structure of the two-terminal memristor, MT not only cost much more to fabricate but also take up much more chip area. A question therefore arises: what are the killer applications of MT that could hardly be achieved by memristors? One potential answer is the concurrent forwarding and updating in spiking neural network (SNN) training enabled by the multiple terminals of MT, as seen in Fig. 1(c)[11, 12]. First, samples are encoded as spikes and fed to the network. Note that this forward propagation uses the source and drain terminals of the MT. After propagating to the last layer, the output signals are compared with the target (the desired spike train) in the supervise unit. Programming voltages are then generated by the supervise circuit module and applied to the gate terminals of the MT. In this way, the synaptic weights are updated and the next round of sample forwarding and learning is triggered. It is worth noting that in Fig. 1(b), supervised learning (SL) is taken merely as an example. For other types of learning rules, such as reward-modulated spike-timing-dependent plasticity (R-STDP), which is known as a reinforcement learning (RL) rule, the supervise module can be replaced by a reward-signal module and the above concurrent forwarding and updating paradigm remains effective. In other words, the concurrent scheme shown in Fig. 1(b) is universal for various types of learning in SNN. Compared with the forward and backward propagation realized through memristor crossbars, the training scheme demonstrated in Fig. 1(c) makes full use of the multi-terminal characteristics of MT. Such a training paradigm may achieve better performance in MT-based SNN designs[11].
1.2 From MT to complementary MT
It is widely recognized that complementary metal-oxide-semiconductor (CMOS) field-effect transistors (FET), made up of a pair of MOSFET with a p-doped channel (p-MOS) and an n-doped one (n-MOS), are the fundamental building block of today’s very-large-scale integration (VLSI) circuits. Will complementary memtransistors (CMT) play a similarly essential role in future neuromorphic VLSI? If so, how can CMT be realized, what are their characteristics, and why bother to develop CMT-based neuromorphic computing (in other words, where are the killer applications)? The first question (how) is, more precisely, how to realize wafer-scale mass production of CMT with high quality and consistency. Various types of materials and mechanisms have been exploited in recent research, and we will compare their advantages and disadvantages. The second question, on the unique functions of CMT (what), will be investigated through the implementation of CMT as a pair of potentiative and depressive synapses. As seen from the bottom row to the middle one in Fig. 2, given the same positive gate voltage pulses, the conductance tuning in p- and n-channel MT is opposite. For the p-channel MT, the positive programming voltage turns the originally upward ferroelectric polarization downward (indicated by the arrows within the dashed box), thus reducing the holes in the channel; for the n-channel one, the situation is opposite and more electrons are induced in the channel (also indicated by the dashed box). These opposite conductance tuning behaviors in the p- and n-channel MT can also be understood from the gate voltage-induced Fermi-level displacements shown in the bottom row of Fig. 2. Positive gate voltages raise the Fermi levels in both p- and n-channels. However, the upward shift of the Fermi level in the p-channel results in fewer holes and thus decreased conductance, while in the n-channel there are more electrons and hence increased conductance. As a result, p- and n-channel MT show opposite conductance tuning behaviors, as demonstrated by the long-term conductance tuning curves Gds(nset, nreset) shown in the middle row of Fig. 2. The final question, on application scenarios (why), is even more important but quite difficult. It means that we have to find a path from the unique device-level properties of CMT to the efficient algorithm-level implementation of learning rules in neuromorphic computing, as seen in Fig. 2. Moreover, since CMT double the area consumption and process complexity compared with MT, the benefits of using CMT to realize the target learning rules should be large enough to compensate for these extra costs. In this review, we will discuss several designs of CMT-based hardware systems: one for an SL rule, namely ReSuMe; another for an RL rule known as R-STDP; and the last for in-sensor neuromorphic computing associated with dynamic vision, as seen from the middle row to the top one in Fig. 2. We will demonstrate that by fully exploiting the opposite conductance tuning behaviors in CMT, advantages such as a significant reduction of circuit modules, simplified implementation of learning rules, and enhanced energy efficiency can be obtained. On the other hand, the challenges of using CMT will also be discussed through each example.
2. CMT: Materials, mechanisms & behaviors
2.1 Nonvolatile conductance tuning
Compared to two-terminal memristors, additional electrical stimulation terminals have been introduced in MT. Overall, the resistance switching (RS) process of MT depends on the electrically induced dynamic variations. Various non-volatile conductance modulation mechanisms have been proposed, and as shown in Fig. 3, eight types of modulation mechanisms commonly found are summarized in this review.

Ferroelectric materials undergo reversible polarization switching in the presence of an external programming electric field. Compared with conventional transistors, ferroelectric transistors use ferroelectric materials as the gate insulation layer. Upon the application of a gate voltage, the polarization of the ferroelectric gate dielectric aligns with the external electric field. Taking the n-type transistor with a top gate as an example, the polarization of the ferroelectric gate dielectric flips downwards given a positive gate voltage pulse. Consequently, the electron concentration in the channel increases, leading to enhanced channel conductance. Conversely, for negative gate voltage pulses the ferroelectric polarization gradually flips upwards, and the electron concentration and channel conductance decrease accordingly. Owing to the remnant polarization of ferroelectric materials, the electrons do not fully relax after the gate voltage is removed, and thus a non-volatile modulation of channel conductance is achieved. In recent years, hafnium-based ferroelectric materials have been widely studied due to their compatibility with CMOS processes[21, 22]. Kim et al. prepared high-performance ferroelectric transistors using InZnOx as the channel material and HfZrOx as the ferroelectric gate dielectric layer, as shown in Fig. 3(a)[13]. Compared with current flash memory[23], the device demonstrated operation several hundred times faster (<10⁻⁶ s), operating voltages four times lower (<5 V), and excellent endurance (>10⁸ cycles). Besides, through contact engineering the dominant injected carrier type was modulated at the metal–semiconductor junction, resulting in n-type, p-type and ambipolar FeFET with a 2D WSe2/Al0.68Sc0.32N structure[24]. Such reconfigurable channels showed high electron and hole current densities of ~20 and ~10 μA/μm, a high on/off ratio of ~10⁷ and a large memory window of >6 V (0.14 V/nm)[24].
The mechanism of ferroelectric semiconductor (FeS) memtransistors is similar to that of ferroelectric memtransistors, the main difference being that the FeS channel layer is composed of a semiconductor material that is itself ferroelectric, as shown in Fig. 3(b). The non-volatile modulation comes from the polarization of the channel rather than that of a ferroelectric gate dielectric. Both the ferroelectric tuning and the semiconductor channel functions are therefore fulfilled by the FeS layer, which simplifies the preparation process. Liao et al. reported a metal oxide ferroelectric semiconductor field-effect transistor (MOFeS FET) based on the van der Waals (vdW) ferroelectric semiconductor InSe, in which the out-of-plane (OOP) ferroelectric polarization of InSe was used for data storage, while its semiconductor properties were used for logic computing[14]. A long retention time, a high on/off ratio (greater than 10⁶), a high program/erase (P/E) ratio (10³), and stable cycling endurance were demonstrated in the MOFeS FET.
Floating gate memtransistors are typically composed of a source, a drain, a control gate, and a floating gate located within the gate dielectric, as seen in Fig. 3(c). The floating gate layer is generally made of metal or polycrystalline silicon. Taking n-type floating gate transistors as an example, when a sufficiently large positive voltage is applied to the control gate, electrons tunnel into the floating gate through the oxide layer, which is known as the "writing process". When the gate voltage is reversed, electrons stored in the floating gate pass through the dielectric layer and return to the channel, which is known as the "erase process". Applying a gate voltage therefore changes the charge state of the floating gate, leading to different channel conductance. Moreover, once the voltage is withdrawn, the charges stored in the floating gate cannot escape to the channel because of the insulating gate dielectric layer, and consequently nonvolatile conductance tuning is achieved. Wang et al. reported a threshold switching floating gate MT fabricated with a vdW heterostructure of MoS2/hBN/graphene stacked on a SiO2/Si substrate[15]. Unlike conventional floating gate transistors, the threshold switching behavior originated from impact ionization in the channel and the coupled charge injection into the floating gate. Upon reaching the threshold, a sub-30 mV·dec⁻¹ increase of transient conductivity by more than four orders of magnitude was triggered within several milliseconds. The device was used to emulate nonlinear neural activation and heterodyne behavior, and by adopting optical signals as modulation inputs, two machine vision tasks that rely on adaptive neural activation, namely collision avoidance and adaptive visual perception, were successfully implemented.
The structure of the charge trapping MT is similar to that of the floating gate one, except that the charge trapping layer is usually composed of high-dielectric-constant oxides, as seen in Fig. 3(d). Its conductance modulation mechanism is basically the same as that of the floating gate MT; the main difference lies in the materials used as the charge storage layer. Compared with a floating gate layer, oxide materials used for charge trapping enable local modulation of electrons/holes. Xiong et al. demonstrated an HfO2/ReS2/POx/BP vdW heterostructure MT for nonconventional logic and memory applications[16]. POx was formed by the natural oxidation of black phosphorus (BP) and played a crucial role as the charge trapping functional layer. An artificial synapse was then realized with synaptic effects reconfigurable between excitatory and inhibitory states. In addition to POx, HfO2 is also a classic material for the charge trapping functional layer. Chou et al. fabricated charge-trapping MT with a single-charge-trapping-layer gate stack or a double-charge-trapping-layer gate stack on germanium (Ge) substrates, in which HfO2 was used as the charge trapping layer[25]. By implementing the double-charge-trapping-layer gate stack, the memory window was expanded by 372 mV and the on/off ratio was increased to 75.3.
Neuromorphic devices based on ion migration mechanisms are commonly referred to as ENODE (electrochemical neuromorphic devices) or EIS (electrochemical ion synapses), which rely on the controllable intercalation of dopant ions in semiconductor channels[26−28]. The ion migration based MT usually consists of a source, a drain, a gate (also serving as an ion reservoir), an electrolyte layer, and a channel layer, as seen in Fig. 3(e). The electrolyte layer is located between the ion reservoir and the channel layer. Upon an applied gate voltage pulse, ions shuttle back and forth between the ion reservoir and the channel, allowing the conductivity of the channel to increase or decrease in a controllable manner[29]. During the programming process, ions are transported through the electrolyte while electrons flow through the external circuit. After the removal of the gate voltage, i.e., when the gate/reservoir is left electrically open, electrons can no longer flow and the ions remain localized in the channel. In this way, nonvolatile conductance tuning is realized. Among various types of migrating ions, H+ (protons) are particularly attractive because of their small radius, light weight, high mobility, and compatibility with CMOS[30]. Onen et al. used Pd as an inorganic reservoir/gate, a metal oxide (WO3) as an inorganic channel, and phosphosilicate glass (PSG) as an electrolyte to realize an ion migration MT[17]. Impressive performance, including nanosecond-level reversible modulation with many conductance states covering a 20× dynamic range, was reported for this MT. In addition, organic electrochemical transistors (OECT) also control the doping state of the organic film by using the gate voltage to drive ions into the channel. Qian et al. demonstrated the fabrication and performance of multi-gate poly(3-hexylthiophene) (P3HT) OECTs with ion-gel gating. The neuromorphic behaviors were observed to depend on the degree of temporal correlation and on the distance between the in-plane gate and the channel[31].
The mechanism of the filamentary MT is similar to that of metal oxide resistive memory, as shown in Fig. 3(f). An applied electric field drives the migration of cations/anions in the switching layer, thereby controlling the formation and rupture of a filament within the layer. Unlike in two-terminal memristors, the gate modulation introduced through the third terminal of the filamentary MT helps reduce the switching voltage. Yan et al. reported an MT based on a SnOx/MoS2 heterostructure in which filament formation and rupture were enabled by the migration of O²⁻ in the SnOx layer[18]. By exploiting the inherent randomness of ion motion, this work used gate modulation to control the stochastic features in the output characteristics of the device, achieving reconfigurable stochastic neurons and Boltzmann machines. The filamentary mechanism has also been demonstrated in GaSe-based MT, in which a positive drain voltage bias causes Se²⁻ ions to migrate to the drain contact, forming a conductive filament composed of p-type Ga vacancies[32]. The GaSe-based MT exhibits nonvolatile bipolar RS characteristics, and its RS behavior was significantly enhanced even after one week of exposure to air, with an on/off ratio of 5.3 × 10⁵ and an ultra-low threshold electric field of about 3.3 × 10² V·cm⁻¹.
Phase change materials (PCM) have also been utilized as channel layers, as shown in Fig. 3(g). Under an external electric field, the channel layer undergoes a non-volatile phase transition, thereby achieving non-volatile conductance modulation. Vanadium oxide undergoes a Mott metal-insulator transition (MIT) at 341 K, which can be triggered by electrical or thermal stimulation. Polycrystalline VO2 thin films were used in low-power MT, where the MIT-based RS process was sharply tuned at low bias voltages (<0.5 V) with a fast switching time (~35 ns), as the RS process does not rely on the migration of atoms or charges[33]. A Ge15Sb85&Sb MT was also reported. As a phase change synapse, it leveraged both the nonvolatility of the phase configurations and the volatility of the field effect modulation to implement tunable plasticity[19].
Various heterostructures can be realized through van der Waals bonding of 2D materials. The gate voltage can modulate the height of the Schottky barrier at the heterojunction interface, thereby regulating RS in the channel, as shown in Fig. 3(h). Rehman et al. utilized a vertical heterojunction of Cu/ReS2/graphene to obtain a low-power gate-tunable MT. In this device, RS was manipulated by tuning the barrier height at the ReS2/graphene interface, which controlled the electron flow that neutralizes Cu ions and forms the Cu filament[20]. Although the gate voltage did not directly interact with the charge carriers within the channel layer, the work function of graphene was adjusted by the back gate voltage, leading to different barrier heights and thus tuning the operating voltages. In this way, the Ron/Roff ratio was modulated from 10² to 10⁵. A summary of key indices for previously demonstrated memtransistors is provided in Table 1.
Mechanism | Channel layer | Memory layer | Operating voltage (V) | On/off ratio | Endurance (cycles) | Retention (s) | Speed (s) | Ref
Ferroelectric | InZnOx | HfZrOx | <5 | ~10⁴ | >10⁸ | N/A | <10⁻⁶ | [13]
Ferroelectric semiconductor | InSe | | 4 | >10⁶ | >100 | >10⁴ | 2 × 10⁻⁷ | [14]
Charge trapping | BP-ReS2 | POx | 4 | 10⁶ | N/A | 50 | 10⁻³ | [16]
Floating gate | MoS2 | Gr | 6 | >10⁵ | 30 | N/A | 10⁻⁴ | [15]
Ion migration | WO3 | | 8.5/10 | 20 | >10⁵ | ~10² | N/A | [17]
Filamentary | MoS2&SnOx | | 5 | ~10⁸ | 10⁶ | N/A | N/A | [18]
Phase change | Ge15Sb85&Sb | | N/A | 10² | 10¹⁵ | N/A | N/A | [19]
vdW heterojunctions | ReSe2&graphene | | N/A | 10³ | 10³ | ~10⁴ | N/A | [20]
2.2 Complementary behaviors by oppositely doped channels
2.2.1 Complementary LTP & LTD
As seen in Figs. 4(a) and 4(b), by tuning the ferroelectric polarization of the gate dielectric layer, the channel of an MT can be switched between p- and n-doped states owing to the electrostatic doping effect[34] (note that, for demonstration purposes, the zero-band-gap semiconductor graphene was used here to illustrate the ambipolar tuning of the conductance). Moreover, such opposite doping of the channel leads to opposite trends in the nonvolatile conductance change ∆G when write voltage pulses VG are applied through the gate. Figs. 4(c) and 4(d) show that given the same negative voltage pulses (VG < 0), the p-channel MT experiences an increase of the conductance (∆G > 0) while the n-channel MT experiences a decrease (∆G < 0). This was ascribed to the fact that negative gate pulses reinforce the upward polarization of the ferroelectric gate dielectric layer (Fig. 4(a)), thus enhancing the electrostatic doping of holes in the p-channel MT; in contrast, such negative pulses attenuate the downward polarization (Fig. 4(b)) and reduce the electron doping in the n-channel MT. Therefore, Figs. 4(c) and 4(d) show that by setting the MT channel to opposite doping states, the nonvolatile conductance tuning induced by the same gate voltage sweep is opposite.

Inspired by the above opposite behaviors, complementary long-term potentiation (LTP) and depression (LTD) were designed and experimentally realized, as seen in Figs. 4(e) and 4(f). Given the same positive pulse train followed by a negative one, the p-channel MT first showed decreasing conductance (LTD) and then increasing conductance (LTP), while the n-channel MT showed exactly the opposite, first LTP and then LTD. The concepts of potentiative and depressive synapses were then introduced, and as seen in Figs. 4(e) and 4(f), CMT can be utilized as these complementary synapses[12].
Finally, it is worth noting that the dependence of the conductance tuning ∆G on the initial conductance G0 was also plotted in Figs. 4(c) and 4(d), illustrating an important side effect, namely conductance saturation. For example, for the n-channel MT with a large initial conductance, positive gate voltage pulses were unable to further enhance the conductance, as shown by the red lines at the upper-right corner of Fig. 4(d). Since saturation means that the conductance is no longer tunable, the complementary LTP and LTD would break down. The impact of this breakdown on network-level training processes is illustrated in the following sections, where potential remedies are also discussed.
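The complementary LTP/LTD behavior and its saturation can be illustrated with a toy model. The sketch below is not fitted to the devices in Fig. 4; it assumes a generic soft-bound update in which each pulse moves the conductance a fixed fraction of the remaining distance to a bound. The names `pulse_update`, `run_train`, `ALPHA`, `G_MIN` and `G_MAX` are hypothetical and the conductance is normalized.

```python
import numpy as np

G_MIN, G_MAX = 0.0, 1.0   # assumed normalized conductance bounds
ALPHA = 0.1               # assumed fractional update per pulse


def pulse_update(g, v_gate_sign, channel):
    """Return the conductance after one gate pulse (assumed soft-bound model)."""
    # A positive pulse potentiates the n-channel device and depresses the
    # p-channel one; a negative pulse does the opposite.
    direction = v_gate_sign if channel == "n" else -v_gate_sign
    if direction > 0:
        return g + ALPHA * (G_MAX - g)   # step shrinks as g approaches G_MAX
    return g - ALPHA * (g - G_MIN)       # step shrinks as g approaches G_MIN


def run_train(g0, channel, n_pos=20, n_neg=20):
    """Apply n_pos positive pulses followed by n_neg negative pulses."""
    g = g0
    trace = [g]
    for sign in [+1] * n_pos + [-1] * n_neg:
        g = pulse_update(g, sign, channel)
        trace.append(g)
    return np.array(trace)


p_trace = run_train(0.5, "p")   # LTD first, then LTP
n_trace = run_train(0.5, "n")   # LTP first, then LTD
```

In this idealized model the two traces mirror each other exactly. The step size also shrinks near the bounds, which is the saturation effect discussed above: a device that starts close to its upper limit barely responds to further potentiating pulses.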
2.2.2 STDP & anti-STDP
Spike timing-dependent plasticity (STDP) was discovered in biological nerves and regarded as the first rule of synaptic tuning[35, 36]. The quantitative expression for the synaptic strength change is as follows:
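$$
\Delta w = +\sum_{f,f'} \Delta w_{+}\,\Theta\!\left(t_{\mathrm{post}}^{(f')} - t_{\mathrm{pre}}^{(f)}\right)\exp\!\left[-\left(t_{\mathrm{post}}^{(f')} - t_{\mathrm{pre}}^{(f)}\right)\right] - \sum_{f,f'} \Delta w_{-}\,\Theta\!\left(t_{\mathrm{pre}}^{(f)} - t_{\mathrm{post}}^{(f')}\right)\exp\!\left[-\left(t_{\mathrm{pre}}^{(f)} - t_{\mathrm{post}}^{(f')}\right)\right]. \tag{1}
$$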
In the above expression, $t_{\mathrm{pre}}^{(f)}$ and $t_{\mathrm{post}}^{(f')}$ are the firing times of the pre- and post-synaptic neurons, respectively, Θ(t) is the Heaviside function that selects the temporal order of the pre- and post-synaptic spikes, and ∆w± are the coefficients of the potentiative/depressive weight change in STDP learning. Fig. 5(a) shows the amount of weight update as a function of the timing difference between pre- and post-synaptic spikes. There are two crucial points in the STDP rule: first, the weight change is positive or negative depending on the relative order of the pre- and post-synaptic spikes; second, the magnitude of the weight change decays exponentially with the time lag between these spikes. Note that by exchanging the positive and negative signs of the two terms in Eq. (1), the anti-STDP learning rule, whose synaptic weight update is opposite to that of STDP, is obtained, as seen in Fig. 5(b).
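As a minimal numerical sketch of the two rules for a single spike pair, the snippet below evaluates the weight change as a function of dt = t_post − t_pre. It assumes the exponentially decaying kernel described in the text; the decay constant `tau` is a hypothetical parameter that does not appear in Eq. (1) and is set to 1 by default, and the value at dt = 0 is set to zero as a convention. The function names are illustrative only.

```python
import numpy as np


def stdp(dt, dw_plus=1.0, dw_minus=1.0, tau=1.0):
    """Pair-based STDP weight change for dt = t_post - t_pre.

    tau is an assumed decay constant; dt == 0 is mapped to 0 by convention.
    """
    dt = np.asarray(dt, dtype=float)
    decay = np.exp(-np.abs(dt) / tau)
    potentiation = dw_plus * decay     # pre fires before post (dt > 0)
    depression = -dw_minus * decay     # post fires before pre (dt < 0)
    return np.where(dt > 0, potentiation, np.where(dt < 0, depression, 0.0))


def anti_stdp(dt, **kwargs):
    """Anti-STDP: the signs of both STDP terms are exchanged."""
    return -stdp(dt, **kwargs)


# Example: weight changes over a range of timing differences.
dts = np.linspace(-5.0, 5.0, 11)
stdp_curve = stdp(dts)
anti_curve = anti_stdp(dts)
```

Plotting `stdp_curve` and `anti_curve` against `dts` gives two curves that are mirror images about the horizontal axis, which is the relation between the two rules that the complementary devices are meant to provide.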

For the hardware realization of STDP, various MT-based approaches have been demonstrated during the past decade[19, 37−40]. Generally, with carefully designed spike waveforms, the superposition of pre- and post-synaptic spikes differs considerably depending on the timing difference, as seen in Fig. 5(d). This results in different amplitudes and durations of the effective write voltage Vdrop, so that synaptic conductance tuning ∆G with different directions and magnitudes can be realized in MT, as seen in Fig. 5(e). Furthermore, researchers found that the anti-STDP rule could also be realized by modifying the spike waveform design at the source, drain or gate terminals of the MT[41]. Our group found that given the same spike waveforms imposed on the MT, p-channel and n-channel MT show opposite tuning of the channel conductance, as seen in Figs. 5(c)−5(e)[12, 34]. Here the conductance tuning in the p-type channel matched the STDP rule, while that in the n-type channel matched the anti-STDP rule.
STDP is widely accepted by neuroscientists as a law of cause and effect[43]. In contrast, anti-STDP violates the causal law. Thus, a direct application of the anti-STDP learning rule would lead to absurd results in real learning tasks, just like using imaginary numbers straightforwardly in the real world. Is the hardware realization of anti-STDP just a toy, or can it be utilized as a powerful tool for neuromorphic computing? In the following sections, we show that the opposite behaviors of STDP and anti-STDP learning realized by CMT can be exploited to implement several types of learning with high efficiency.
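The waveform-superposition idea can be sketched numerically. The example below is an illustration under stated assumptions, not the waveform used in Fig. 5: each spike is taken to be a short positive pulse followed by a decaying negative tail, the write voltage is taken as V_pre − V_post, and the device is assumed to respond only when |Vdrop| exceeds a threshold `v_th`. All names and parameter values are hypothetical.

```python
import numpy as np


def spike(t, amp=1.0, width=1.0, tail_amp=0.5, tau=5.0):
    """Assumed spike shape: short positive pulse, then a decaying negative tail."""
    t = np.asarray(t, dtype=float)
    pulse = np.where((t >= 0) & (t < width), amp, 0.0)
    tail = np.where(t >= width, -tail_amp * np.exp(-(t - width) / tau), 0.0)
    return pulse + tail


def effective_write(dt, v_th=1.1, t_max=60.0, n=6001):
    """Peak signed overdrive of V_drop = V_pre - V_post beyond the threshold.

    The pre-synaptic spike fires at t = 0 and the post-synaptic one at t = dt.
    A single spike (amplitude 1.0) stays below v_th, so only overlapping
    waveforms can write.
    """
    t = np.linspace(-t_max, t_max, n)
    v_drop = spike(t) - spike(t - dt)
    over = np.sign(v_drop) * np.clip(np.abs(v_drop) - v_th, 0.0, None)
    return over[np.argmax(np.abs(over))]


overdrive_pre_first = effective_write(3.0)    # negative overdrive
overdrive_post_first = effective_write(-3.0)  # positive overdrive
```

In this sketch the sign of the overdrive follows the spike order and its magnitude shrinks as the spikes move apart, vanishing once the tail has decayed below the margin to `v_th`. How a given overdrive sign maps onto ∆G depends on the device; according to the text, p- and n-channel MT would respond to the same Vdrop in opposite directions.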
3. CMT for neuromorphic computing
3.1 Application in ReSuMe
Several supervised learning rules have been developed for SNN, including SpikeProp, the remote supervised method (ReSuMe)[44, 45], Tempotron, and SPAN. Among these rules, ReSuMe is one whose hardware implementation efficiency has been found to improve greatly when CMT are used[12, 34, 42]. The basic idea of ReSuMe is illustrated in Fig. 6(a): apart from the pre-synaptic and post-synaptic neurons, a teacher neuron is introduced. Once the pre-synaptic neuron emits a spike at time tin, the post-synaptic one responds at tout, while the teacher neuron emits a spike at the desired time td. By tuning the strength of the connection between the pre-synaptic and post-synaptic neurons, the actual firing time tout of the post-synaptic neuron is gradually synchronized with that of the teacher. Mathematically, the synaptic weights are updated through the following rule[45]:
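$$
\Delta w_{oh}(t) = \frac{1}{n_h} S_h(t)\left[\int_0^{\infty} a^{\mathrm{pre}}(s)\left[S_o^{d}(t-s) - S_o^{a}(t-s)\right]\mathrm{d}s\right] + \frac{1}{n_h}\left[S_o^{d}(t) - S_o^{a}(t)\right]\left[a + \int_0^{\infty} a^{\mathrm{post}}(s)\,S_h(t-s)\,\mathrm{d}s\right], \tag{2}
$$
with
$$
a^{\mathrm{pre}}(s) = -A_{-}\exp\!\left(-\frac{s}{\tau_{-}}\right), \qquad a^{\mathrm{post}}(s) = +A_{+}\exp\!\left(-\frac{s}{\tau_{+}}\right). \tag{3}
$$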
In the above expressions, $S_h(t)$ and $S_o(t)$ stand for the spike trains $\sum_f \delta\big(t - t_{h/o}^{(f)}\big)$ of a neuron in the pre-synaptic layer h or in the post-synaptic layer o, respectively, the superscript 'd/a' stands for desired/actual spiking, $n_h$ is the number of neurons in the pre-synaptic layer h, τ± denotes the decay time constants of the traces stimulated by the neurons, and A± denotes the associated amplitudes.
The above formula is not easy to interpret, let alone to translate into a simple hardware realization. Here we provide a physical picture, shown in Figs. 6(b) and 6(c), that leads to a highly succinct design based on CMT. First, we rewrite the target synaptic weight w as the sum of two components wd and wo, and show their update processes under ReSuMe in Fig. 6(b). The spiking of the input neuron, marked by tin in the figure, stimulates two traces of the synaptic components. Both traces decay exponentially with time, and they are symmetric with respect to zero. Subsequently, the firing of the teacher neuron at the desired time td and that of the output neuron at tout sample the two traces respectively, yielding two changes ∆w1 and −∆w2. Note that the wd trace is positive while the wo one is negative. Therefore, if the output neuron fires later than desired (tout > td), the two sampled changes satisfy ∆w1 > ∆w2, which results in a positive change of the target synaptic weight (∆w > 0). Physically, an increased synaptic weight leads to earlier firing of the associated neuron the next time. In this way, the initially late firing of the output neuron catches up round by round through ReSuMe, and the learning finally stops at tout = td. Similar training in the opposite direction happens in the other case, where the output neuron fires earlier than desired (tout < td), as seen in Fig. 6(c).
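The trace-sampling picture can be written down in a few lines. The sketch below covers only the case in which the input spike precedes both the teacher and the output spikes; the amplitude `amp` and time constant `tau` are assumed values, and the linear relation between weight and output timing used in `train` (t_out = t0 − k·w) is purely hypothetical, introduced only to show the direction of convergence.

```python
import math


def resume_dw(t_in, t_d, t_out, amp=1.0, tau=10.0):
    """Net weight change from sampling the two traces started at t_in."""
    def trace(t):
        # Zero before the input spike, exponentially decaying afterwards.
        return amp * math.exp(-(t - t_in) / tau) if t >= t_in else 0.0

    dw1 = trace(t_d)     # sampled on the positive w_d trace by the teacher spike
    dw2 = trace(t_out)   # magnitude sampled on the negative w_o trace by the output spike
    return dw1 - dw2


def train(t_in=0.0, t_d=5.0, t0=12.0, k=5.0, epochs=200):
    """Toy loop; the map t_out = t0 - k * w is a hypothetical neuron response."""
    w = 0.0
    for _ in range(epochs):
        t_out = t0 - k * w                    # larger weight -> earlier firing
        w += resume_dw(t_in, t_d, t_out)
    return t0 - k * w                         # final output timing


late = resume_dw(0.0, 5.0, 8.0)    # t_out > t_d gives a positive change
early = resume_dw(0.0, 8.0, 5.0)   # t_out < t_d gives a negative change
final_t_out = train()
```

With these assumed parameters the output timing approaches the desired timing from the late side and the update vanishes once tout = td, matching the qualitative behavior described above.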
The above physical picture, especially the complementary traces stimulated by the spiking of the input neuron and their separate sampling by the teacher and output neurons, inspired us to propose the CMT-based implementation shown in Fig. 7(a)[12, 34, 42]. From the viewpoint of biologically plausible learning rules, the input and teacher neurons make up the STDP pair, while the input and output neurons make up the anti-STDP pair. Therefore, the former can be implemented with the potentiative synaptic MT and the latter with the depressive synaptic one, as seen in Fig. 7(a).
The primary advantage of the above CMT-based design for ReSuMe implementation is the substantial simplification of the supervise circuit. Without the STDP and anti-STDP properties of CMT, a rather complicated supervise circuit module would be required, as seen in Fig. 7(b)[12, 34]. The STDP and anti-STDP changes of the synapses connecting the input−output and input−teacher neuron pairs have to be realized first through the generation of two opposite traces (the lower-left components in the figure) and then through separate sampling by the output and desired signals (the lower-middle components in the figure). Finally, the sampled STDP and anti-STDP changes have to be summed up (the lower-right component in the figure) and applied to the gate of the synaptic transistor (upward-pointing arrow in the figure). Fig. 7(c) further shows the signal flowchart of ReSuMe. The difference between the CMT-based design and the ordinary MT one is clearly illustrated in the flowchart: in the former, STDP/anti-STDP are realized directly by the input-teacher/input-output neuron pairs, while in the latter they have to be implemented with rather complex circuits. A quantitative comparison is provided in Table 2, where at least two orders of magnitude of energy reduction, an order of magnitude of area saving and several orders of magnitude of reduction in time consumption were found[34].
Challenges and outlooks: Fig. 6 and Fig. 7 also indicate that the implementation of ReSuMe with CMT requires strict symmetry between the STDP and anti-STDP properties provided by the potentiative and depressive synapses. Otherwise, given asymmetric STDP and anti-STDP curves, the training would converge at the wrong timing of the output neuron, as seen in Fig. 5(b) or 5(c) (tout ≠ td). Yet the conductance tuning in real memristive devices usually does not meet such a stringent demand. As seen in Figs. 4(c) and 4(d), the conductance change ∆G not only depends strongly on the initial conductance G0 but also saturates when approaching the upper and lower limits. More generally, we believe this symmetry challenge would be met in almost every possible application of CMT, since by definition the word 'complementary' implies symmetry. A refresh operation has therefore been proposed to address the symmetry demand, as seen in Fig. 7.
The refresh operation periodically exchanges the roles of potentiative and depressive synapses between the two MT in the cell. Technically, it is realized by using positive and negative pulses alternately, as seen in Fig. 8(a). The conductance evolution curves sketched in Fig. 8(b) show that after several rounds of synaptic weight update, the two MT assuming the roles of potentiative and depressive synapses approach the upper and lower conductance limits, respectively. By exchanging the roles of the two MT, conductance saturation is avoided and the training can continue. It is worth noting that, owing to the symmetry between the conduction and valence bands of graphene, the positive and negative pulse-induced STDP in the exchanged devices show nearly identical conductance tuning, as sketched in Fig. 8(b). Fig. 8(c) further demonstrates the network-level difference between training with and without the refresh operation, tested on MNIST. Without the exchange, the accuracy increases only briefly (for nearly 100 epochs) and then decreases over a long period until it falls back to the level of a completely untrained network.
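The effect of the role exchange on saturation can be seen in a toy simulation. The sketch below is not the experiment of Fig. 8; it assumes a generic soft-bound update with normalized conductance, applies one update per step to each of the two devices, and records the smallest update step encountered. The names `step` and `simulate`, the rate `alpha` and the refresh period are hypothetical.

```python
def step(g, direction, alpha=0.2):
    """Assumed soft-bound update of a normalized conductance g in [0, 1]."""
    if direction > 0:
        return g + alpha * (1.0 - g)   # potentiation slows down near 1
    return g - alpha * g               # depression slows down near 0


def simulate(n_steps, refresh_period=None):
    """Return the smallest update step seen over n_steps.

    Device A starts in the potentiative role and device B in the depressive
    role. If refresh_period is given, the roles are exchanged every
    refresh_period steps.
    """
    g_a, g_b = 0.5, 0.5
    role_a = +1
    min_step = float("inf")
    for k in range(n_steps):
        if refresh_period and k > 0 and k % refresh_period == 0:
            role_a = -role_a                       # exchange the two roles
        new_a = step(g_a, role_a)
        new_b = step(g_b, -role_a)
        min_step = min(min_step, abs(new_a - g_a), abs(new_b - g_b))
        g_a, g_b = new_a, new_b
    return min_step


without_refresh = simulate(100)                    # updates shrink towards zero
with_refresh = simulate(100, refresh_period=5)     # updates stay finite
```

Without the exchange, both devices drift to their limits and the available update shrinks geometrically; with a periodic exchange, the conductances keep cycling inside the tunable window and the update step stays bounded away from zero.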

Finally, it is worth stressing that alongside the ubiquitous problems of using memtransistors as synapses, the unique challenges of CMT center on symmetry. Unlike conventional CMOS where the two transistors work alternately, here the two complementary MT are connected in parallel and thus work simultaneously, as seen in Fig. 7. The implementation of the ReSuMe algorithm requires highly symmetric STDP and anti-STDP rules from the two MT. On the other hand, device-level realities such as conductance saturation and the stochasticity of conductance tuning make symmetric STDP and anti-STDP difficult to maintain over the hundreds or thousands of epochs of a training process. As a preliminary attempt, we have proposed a refresh mechanism that periodically exchanges the roles of the two MT. Whether it works for real applications involving large-scale CMT arrays and long-duration training processes, however, remains to be seen. Therefore, an effective and efficient solution to the symmetry problem is called for in real applications of CMT in neuromorphic computing.
3.2 Application in R-STDP
It has long been known that, although biologically plausible, the STDP learning rule does not show satisfactory performance on AI benchmark tasks. This is ascribed to the lack of supervision or reward signals, which not only set 'goals' for the training but also implement global optimization over the whole network, in contrast to the local learning of STDP. Thereby, various modified versions of STDP have been proposed to enhance its capability. Among these modifications, reward-modulated STDP (R-STDP) was first observed in biological nerve experiments and then generalized to applications in neuromorphic computing. One possible biological foundation of R-STDP is dopamine-induced modulation, as seen in Figs. 9(a) and 9(b)[49]. Since the pre-synaptic neuron fires later than the post-synaptic one, the synaptic strength would experience a depression according to the STDP rule, as shown by the black lines in Fig. 9(b). However, the release of dopamine molecules reverses the process, turning the depression into potentiation, as shown by the red lines.
The mathematical expression of R-STDP is then refined from the above biological processes as follows:
$$\frac{\partial w_{ij}}{\partial t} = \gamma\, r(t)\, \epsilon_{ij}(t), \tag{4}$$
where wij is the strength of the synapse connecting the jth neuron in the pre-layer to the ith one in the post-layer, γ is the learning rate, r(t) is the time-varying reward signal from the environment and εij(t) is the eligibility trace, which characterizes the short-term STDP learning results produced by the spike emission of the pre- and post-synaptic neurons. The dynamic equation for its evolution is as follows:
$$\tau_{\epsilon}\, \frac{\mathrm{d}\epsilon_{ij}(t)}{\mathrm{d}t} = -\epsilon_{ij}(t) + \xi_{ij}(t), \tag{5}$$
where τε is the associated relaxation time and ξij(t) is the sampled STDP:
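$$\xi_{ij}(t) = \sum_{f,f'} \Delta w_{+}\, \Theta\!\left(t_{\mathrm{post}}^{(f')} - t_{\mathrm{pre}}^{(f)}\right) \exp\!\left(t_{\mathrm{post}}^{(f')} - t_{\mathrm{pre}}^{(f)}\right) \delta\!\left(t - t_{\mathrm{post}}^{(f')}\right) - \sum_{f,f'} \Delta w_{-}\, \Theta\!\left(t_{\mathrm{pre}}^{(f)} - t_{\mathrm{post}}^{(f')}\right) \exp\!\left(t_{\mathrm{pre}}^{(f)} - t_{\mathrm{post}}^{(f')}\right) \delta\!\left(t - t_{\mathrm{pre}}^{(f)}\right). \tag{6}$$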
The quantities in the above expression share the same definitions as those in Eq. (1), except that ξij(t) is now a sampling of the STDP-induced synaptic weight change, as indicated by the δ functions. Eqs. (4)−(6) appear quite complicated. To illuminate the physical mechanisms behind these formulas, the flow chart of the signals involved in R-STDP is plotted in Fig. 9(c). The first two rows denote the spike trains of the pre- and post-synaptic neurons. Accordingly, two exponentially decaying curves, namely the STDP traces, are triggered by the spikes in these two trains with opposite signs (the third and fourth rows). Then, the post-synaptic spikes sample the STDP trace generated by the pre-synaptic ones and vice versa (the fifth row). In this way, ξij(t) is obtained. Note that the sampling results are the initial values of the short-term STDP learning, and the corresponding trace, known as the eligibility trace εij(t), decays exponentially as seen in the sixth row. The product of this eligibility trace and the time-dependent reward signal R(t) (the seventh row) defines the rate of change of the synaptic weight: ∂wij/∂t ∝ R × εij. Finally, the evolution of the synaptic weight is calculated from this dynamic equation and shown in the last row. Note that the important change points are marked by circles of different colors in Fig. 9(c) to help distinguish the opposite effects.
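The signal flow of Fig. 9(c) can be followed step by step in a short numerical sketch of Eqs. (4)−(6) for a single synapse. This is only an illustration under stated assumptions: forward-Euler integration with time step dt, STDP traces that decay with hypothetical time constants tau_plus and tau_minus, and arbitrary values for all parameters. None of these numbers come from the reviewed devices.

```python
import math


def simulate_rstdp(pre_spikes, post_spikes, reward, t_end,
                   dt=1e-3, gamma=1.0, tau_eps=0.2,
                   tau_plus=0.02, tau_minus=0.02,
                   dw_plus=1.0, dw_minus=1.0):
    """Integrate Eqs. (4)-(6) for one synapse and return the final weight change.

    pre_spikes, post_spikes: spike times in seconds.
    reward: function t -> r(t).
    All default parameter values are illustrative assumptions.
    """
    pre = {round(t / dt) for t in pre_spikes}
    post = {round(t / dt) for t in post_spikes}
    x_pre = x_post = 0.0      # STDP traces left behind by pre/post spikes
    eps = 0.0                 # eligibility trace
    w = 0.0                   # accumulated weight change
    for k in range(round(t_end / dt)):
        t = k * dt
        # The STDP traces decay exponentially between spikes.
        x_pre *= math.exp(-dt / tau_plus)
        x_post *= math.exp(-dt / tau_minus)
        # Sampled STDP, Eq. (6): a post spike samples the pre trace
        # (potentiation), a pre spike samples the post trace (depression).
        xi = 0.0
        if k in post:
            xi += dw_plus * x_pre
        if k in pre:
            xi -= dw_minus * x_post
        # Spikes start new traces only after sampling.
        if k in pre:
            x_pre += 1.0
        if k in post:
            x_post += 1.0
        # Eligibility trace, Eq. (5): leaky decay plus an impulse of area xi.
        eps += (-eps * dt + xi) / tau_eps
        # Reward-gated weight update, Eq. (4).
        w += gamma * reward(t) * eps * dt
    return w


causal_pair = dict(pre_spikes=[0.010], post_spikes=[0.020], t_end=1.0)
w_rewarded = simulate_rstdp(reward=lambda t: 1.0, **causal_pair)
w_punished = simulate_rstdp(reward=lambda t: -1.0, **causal_pair)
w_no_reward = simulate_rstdp(reward=lambda t: 0.0, **causal_pair)
print(w_rewarded, w_punished, w_no_reward)
```

For a causal spike pair (pre before post) the eligibility trace is positive, so in this sketch a positive reward yields a net potentiation, a negative reward yields a depression of equal size, and no reward leaves the weight unchanged.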
From the viewpoint of hardware realization, the fact that STDP takes two terminals of a synaptic device while the reward takes a third one inspired our idea of an MT-based R-STDP implementation. Furthermore, the potentiation/depression of synaptic strength by positive/negative reward signals suggested that the opposite conductance tuning behaviors of CMT could be utilized to realize these contrasting tunings. Hence, a CMT-based design to implement R-STDP was proposed, as seen in Fig. 10(a). The basic idea is that positive rewards are fed to one of the two parallel-connected MT, while negative ones, also known as punishments, are fed to the other. In the former case, the positive gate voltage VG, together with the forward-propagating signals along the source−drain direction, increases the conductance of the target n-channel MT, while in the latter case the same VG decreases the conductance of the p-channel MT. Fig. 10(b) shows the fabricated CMT cell implementing the proposed design, where 2D WSe2 was used as the channel material and the organic ferroelectric material PVDF as the gate dielectric. By setting one MT as n-channel and the other as p-channel, a CMT pair for R-STDP was constructed. Their conductance variation under the same coupled voltage pulses of Vpre and VG is shown in Figs. 10(c) and 10(d), where the STDP characteristics, i.e., the dependence of the conductance change ∆w on the spike timing difference ∆t, are clearly illustrated. Besides, an incremental change of conductance under a positive reward signal and a decremental one under a negative reward were achieved by alternately utilizing one MT in the CMT cell.

The performance of the above CMT designed for R-STDP was further tested on a benchmark task, namely cart-pole balancing, as seen in Fig. 10(e). In this task, a left- or right-oriented force with fixed amplitude can be applied to the cart to prevent the pole from falling, and the force has to be switched between left and right in real time depending on the currently observed state. A single-layer SNN was then set up, with the input neurons encoding four physical quantities of the system (x, dx, θ, dθ), which are the position and velocity of the cart and the deviation angle and line speed of the pole, while the two output neurons represent the selection of the left and right forces through time-to-first-spike coding, as seen in Fig. 10(f). The synapses of this SNN were assumed to be made of the above CMT cells implementing the R-STDP learning rule. Network-level simulation then showed that after 100 rounds of training the pole could be kept upright for 500 steps. The synaptic strengths of the two output neurons after training are plotted as heat maps with the pole angle θ on the x-axis and the pole line speed dθ on the y-axis, as shown in Figs. 10(g) and 10(h). The higher synaptic weights, represented by the red color, mainly lie in regions where θ and dθ are both negative (positive). At the neural network level, such inputs result in a quick response of the left (right) neuron. Physically, a force to the left (right) is commanded by the output of the SNN, and thus the strongly tilted pole is rescued by the force. A comparison of various hardware approaches to implementing R-STDP is given in Table 3.
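The time-to-first-spike selection between the two output neurons can be sketched as follows. The sketch is an illustration only: the non-leaky integrate-and-fire neuron, the threshold, the time window and the representation of the input as a vector of non-negative activities are all assumptions, since the encoding details of the simulated network are not specified here.

```python
def first_spike_time(drive, threshold=1.0, dt=1e-3, t_max=0.1):
    """Non-leaky integrate-and-fire neuron under a constant drive.

    Returns the time of the first threshold crossing, or None if the neuron
    stays silent within t_max. All parameter values are assumed.
    """
    v = 0.0
    for k in range(1, round(t_max / dt) + 1):
        v += drive * dt
        if v >= threshold:
            return k * dt
    return None


def select_force(activity, w_left, w_right):
    """Pick the force whose output neuron fires first (ties go to 'left')."""
    t_left = first_spike_time(sum(w * a for w, a in zip(w_left, activity)))
    t_right = first_spike_time(sum(w * a for w, a in zip(w_right, activity)))
    if t_left is None and t_right is None:
        return None
    if t_right is None:
        return "left"
    if t_left is None:
        return "right"
    return "left" if t_left <= t_right else "right"


# Hypothetical input activity and weights: the first input neuron is active
# and is strongly connected to the 'left' output neuron.
print(select_force([1.0, 0.0], w_left=[50.0, 10.0], w_right=[20.0, 10.0]))
```

A larger synaptic weight on the active inputs gives a larger drive and hence an earlier first spike, which is how the high-weight regions of the heat maps translate into a quick response of the corresponding output neuron.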
Challenges and outlooks: It has been found that the eligibility trace, as described by Eq. (5) and demonstrated in Fig. 9(c), plays a crucial role when applying R-STDP in reinforcement learning scenarios[53]. On the one hand, a quickly decaying eligibility trace leads to fewer layers of synapses being covered by the delayed reward signals, thereby weakening the effect of reward modulation. On the other hand, an eligibility trace that decays too slowly may cause errors in SNN learning: the short-term tuning results produced by STDP in the layers near the input last so long that a new frame of input causes the learning processes of adjacent inputs to overlap. Thereby, an efficient hardware approach to realizing the eligibility trace is called for. First, the short-term STDP learning should be implemented by volatile tuning of conductance. Moreover, such short-term tuning has to be reinforced or reversed in a nonvolatile manner by an additional signal from the third terminal. The former requirement may be met by tuning mechanisms in the subthreshold region such as ferroelectric polarization switching[54], while the latter may be met by the short-term to long-term memory transition of conductance switching discovered in memristive devices[19, 55].
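The trade-off on the decay rate can be made explicit from Eq. (5). As a worked step, assume that no further sampling events occur after a time t0 (so that ξij = 0 for t > t0) and that the reward arrives after a delay T; both t0 and T are symbols introduced here for illustration only. The trace then relaxes freely:

```latex
\tau_{\epsilon}\,\frac{\mathrm{d}\epsilon_{ij}}{\mathrm{d}t} = -\epsilon_{ij}
\quad\Longrightarrow\quad
\epsilon_{ij}(t) = \epsilon_{ij}(t_0)\,\exp\!\left(-\frac{t - t_0}{\tau_{\epsilon}}\right),
\qquad t \ge t_0 ,
\qquad
\frac{\epsilon_{ij}(t_0 + T)}{\epsilon_{ij}(t_0)} = \exp\!\left(-\frac{T}{\tau_{\epsilon}}\right).
```

A reward delayed by T therefore acts on only a fraction exp(−T/τε) of the original trace: a small τε suppresses delayed rewards, while a large τε lets the traces of consecutive inputs overlap.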
3.3 Application in event-driven optoelectronic sensing and computing
Dynamic vision sensors (DVS) have attracted substantial interest in the SNN research community over the past decade. Instead of emitting spikes according to the intensity of the incident light, the pixels of a DVS respond only to changes of light. The output of a DVS is in the address-event representation (AER), (a1, t1, ±), ···, (an, tn, ±), where an and tn denote the address and timing of the light change event and ± denotes the event polarity (light increasing/decreasing). Such a dynamic response mode results in much sparser information compared with a conventional frame-by-frame camera, thus greatly reducing not only power consumption but also bandwidth and other burdens. Moreover, event-driven sensing is ideal for subsequent SNN processing, since the emission of spikes corresponds perfectly to the occurrence of events. In this area, various types of learning rules have been explored to develop event-driven computing and applications.
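The sparsity of the address-event representation can be seen in a minimal sketch that converts a sequence of intensity frames into (address, time, polarity) events. The fixed change threshold and the per-pixel reference update are assumptions made for illustration, and the names Event and frames_to_events are hypothetical rather than part of any DVS interface.

```python
from typing import List, NamedTuple, Sequence


class Event(NamedTuple):
    address: int    # pixel index a_n
    t: float        # event time t_n
    polarity: int   # +1 for increasing light, -1 for decreasing light


def frames_to_events(frames: Sequence[Sequence[float]],
                     times: Sequence[float],
                     threshold: float) -> List[Event]:
    """Emit an event whenever a pixel changes by at least `threshold`
    relative to the level at which it last produced an event."""
    reference = list(frames[0])
    events: List[Event] = []
    for frame, t in zip(frames[1:], times[1:]):
        for address, value in enumerate(frame):
            diff = value - reference[address]
            if abs(diff) >= threshold:
                events.append(Event(address, t, 1 if diff > 0 else -1))
                reference[address] = value
    return events


frames = [
    [1.0, 1.0, 1.0],
    [1.0, 2.0, 1.0],   # pixel 1 brightens
    [1.0, 2.0, 0.0],   # pixel 2 darkens
]
events = frames_to_events(frames, times=[0.0, 1.0, 2.0], threshold=0.5)
print(events)
```

Here nine intensity samples reduce to two events, because unchanged pixels produce no output at all.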
The fact that both positive and negative light-change events are characterized in DVS inspired our idea of fabricating optoelectronic CMT as DVS, as seen in Fig. 11. By setting the channel of one MT as p−i−n doped and the other as n−i−p doped with strict symmetry, the photocurrents generated by the two MT offset each other exactly. Note that this zero sum holds for static incident light. However, by introducing a capacitor in one branch, there is a response lag between the two branches upon light changes. As seen in Fig. 11(a), given an increase of light, the branch without the capacitor experiences a step-like increase of the photocurrent, while that with the capacitor shows a slower increase. Consequently, a positive spike is generated from the different responses of this CMT branch pair. A similar mechanism leads to a negative current spike upon decreasing light. In this way, the CMT cell composed of p−i−n and n−i−p doped channels achieves the required dynamic vision. Fig. 11(b) shows the experimental results, where the intensity of incident light with λ = 520 nm increases and then decreases step by step (step size = 25 mW/cm²). Two positive electrical current pulses are triggered by the increasing steps and another two by the decreasing ones, indicating the effectiveness of the optoelectronic CMT-based DVS.
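The spike generation from the lagging branch can be reproduced qualitatively with a simple sketch. It assumes that the branch with the capacitor behaves as a first-order lag with a hypothetical time constant tau, and that both branches have the same proportionality factor k between light level and photocurrent magnitude; neither value is taken from the measurements.

```python
def dvs_response(light, dt, tau, k=1.0):
    """Net output of two opposing photocurrent branches.

    The fast branch follows the light level instantly. The other branch
    (with the capacitor) follows it through a first-order lag of time
    constant tau. Their difference is zero in steady state.
    """
    lagged = k * light[0]          # start from steady state
    output = []
    for level in light:
        fast = k * level
        lagged += (fast - lagged) * dt / tau
        output.append(fast - lagged)
    return output


# Light level: constant, then a step up, then a step back down.
light = [1.0] * 50 + [2.0] * 50 + [1.0] * 50
out = dvs_response(light, dt=1e-3, tau=5e-3)
print(max(out), min(out))
```

The output stays at zero under static light, shows a positive transient at the upward step and a negative transient at the downward step, and in this linear model the peak amplitude scales with the size of the light change.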
Fig. 12(a) shows a sketch and an optical image of the fabricated optoelectronic MT: two-dimensional (2D) WSe2 was used as the channel material, a HfO2 layer within the Al2O3 gate dielectric served as the charge trapping layer, and two split gates lay at the bottom for locally tuning the electrostatic doping in the separate channel regions. The channel length of each WSe2 photodiode is 3 μm, and the area of the WSe2 photoactive region is approximately 45 μm². The basic transfer characteristics of the fabricated device, obtained by back-and-forth gate voltage sweeping, are shown in Fig. 12(b). Note that here the same voltage was applied to the two split gates. The results demonstrate not only good ambipolar conduction of the WSe2 channel, but also the reconfiguration of the channel from heavily n-doped to slightly p-doped during the positive sweep of the gate voltage. The regulation of electrostatic doping within the channel was ascribed to charge injection into and release from the HfO2 charge trapping layer during the gate voltage sweep, as seen in the inset of Fig. 12(b). Then, by setting the same or opposite voltages on the two split gates VG1,G2, the output characteristics of this MT demonstrated conduction behaviors of pure p-type, pn-junction, n-type or np-junction, respectively. Hence, the measurements shown in Fig. 12(c) verified the successful configuration of the WSe2 channel as p−i−n or n−i−p doped. Regarding the photovoltaic properties, the enhancement of the output current under increasing intensity of the incident laser (λ = 520 nm) on the MT channel is shown in Fig. 12(d) for the p−i−n and n−i−p doped configurations, respectively. Quantitative evaluation indicated a linear dependence of the short-circuit current Isc on the power of the incident laser for both configurations of the WSe2 channel, as seen in Fig. 12(e). The photoresponsivity was then estimated as R = Isc/Peff, where Peff is the effective power of the incident laser on the channel.

The concept of in-sensor computing has become quite popular, since it integrates the three important functions of sensing, neuromorphic computing and memory in one system or even in one device[58−60]. Such integration can process the observed information directly in the local system rather than having to transmit it to the cloud, wait for processing and then receive the results. Revolutionary savings in energy, time and transmission bandwidth, together with more intelligent computing at the edge, may be realized with this novel computing paradigm. When applied to the above CMT-based dynamic vision sensors, in-sensor computing further requires that the photoresponsivity of the optoelectronic CMT-based cell be tunable and, moreover, that the tuned photoresponsivity be retained. The mechanism can be seen by comparing the following equations:
$$V \times G = I, \tag{7}$$
$$\Delta I \times R = I_{\mathrm{sc}}. \tag{8}$$
Eq. (7) denotes the electrical approach to neuromorphic computing, where the forward process is implemented via the multiplication of the input voltage signals V (vector) and the conductance G (matrix), while the update process is realized by the nonvolatile tuning of the conductance G. Eq. (8) illustrates the optoelectronic counterpart: the forward process is the multiplication of the input optical change signals ∆I (vector) and the photoresponsivity R (matrix), while the update is the nonvolatile tuning of the photoresponsivity. Therefore, the tunability of photoresponsivity is equivalent to the plasticity of synapses. The logic is that, given the same intensity of input light, a tunable photoresponsivity leads to different output currents; that is, for the same input signals, the output signals differ, which is exactly the function of synaptic changes. Besides, the tuning of photoresponsivity has to be nonvolatile to mimic the long-term potentiation/depression of the synapse. In the case of the above optoelectronic MT, the demand for nonvolatile tuning of photoresponsivity was fulfilled by the split gates and the charge-trapping layer. The quantitative results are shown in Fig. 12(f), where the variation of the photoresponsivity was measured under a series of split-gate voltage pulses. The increasing/decreasing behavior mimics the long-term potentiation/depression characteristics of synaptic plasticity, with satisfactory linearity, symmetry and multiple states observed.
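The parallel between the two equations can be made concrete with a small sketch in which the same vector-matrix product serves both cases. The numerical values, the linear pulse response and the function names forward and apply_pulses are illustrative assumptions, not device data or an established interface.

```python
def forward(inputs, weights):
    """Vector-matrix product: out[j] = sum_i inputs[i] * weights[i][j]."""
    n_out = len(weights[0])
    return [sum(x * row[j] for x, row in zip(inputs, weights))
            for j in range(n_out)]


def apply_pulses(value, n_pulses, step, lower, upper):
    """Nonvolatile update: each pulse shifts the stored value by `step`
    (assumed linear), clipped to the device range [lower, upper]."""
    return max(lower, min(upper, value + n_pulses * step))


# Electrical case, Eq. (7): voltages times conductances give currents.
V = [0.1, 0.2]
G = [[1.0, 2.0],
     [3.0, 4.0]]
I = forward(V, G)

# Optoelectronic case, Eq. (8): light changes times signed
# photoresponsivities give short-circuit photocurrents.
dI = [1.0, -1.0, 0.0]
R = [[0.2, -0.1],
     [0.5, 0.3],
     [0.9, 0.9]]
Isc = forward(dI, R)

# Tuning one photoresponsivity changes the output for the same optical input.
R[0][0] = apply_pulses(R[0][0], n_pulses=3, step=0.1, lower=-1.0, upper=1.0)
Isc_after = forward(dI, R)
print(I, Isc, Isc_after)
```

The third pixel contributes nothing because its light level does not change, and only the first output current shifts after the update, which is the behavior expected of a plastic synapse.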
Then, motion/action recognition was demonstrated with a 128 × 128 × n array of the above CMT cells, as seen in Fig. 13[56]. Here 128 × 128 denotes the spatial resolution of the vision pixels, while n is the number of subpixels per pixel, which is also the number of output classes. In the circuit, the photocurrents produced by the jth subpixel in each pixel are summed and conducted to the jth output neuron to represent the jth output class. For static illumination, the photocurrent in each subpixel remains zero since the subpixels are made of optoelectronic CMT, as seen in the lower left inset of Fig. 13(a). Upon a change of the incident light, the summed photocurrents Ij differ from each other because the synaptic maps formed by the photoresponsivities of the subpixels have been tuned to be distinguishable. Moreover, given motions such as the hand waving shown in Fig. 13(b), the pixels along the movement traces give out positive or negative photocurrent spikes. Such time-varying sequences of spikes activate the neurons in the output layer, and then the winner-take-all and STDP learning rules take effect. In this way, 92% recognition accuracy was achieved after 100 epochs of training for the three types of hand waving shown in Fig. 13(b). Besides, a 3 × 3 pixel array was created using 18 WSe2 photodiodes, which achieved a temporal resolution of 5 μs.

Finally, it is worth mentioning that a very recent work also reported dynamic vision and in-sensor computing using optoelectronic devices[61]. At the device level, the photoresponsivity was regulated not only by the back- and top-gate voltages but also by the polarity of the source−drain voltage. At the circuit level, a small 3 × 3 array was used as a cell to extract spatial variation information from the incident light. Based on these novel devices and structures, successful tracking of a UAV flying through an alley was demonstrated.
Challenges and outlooks: SNN utilizing information from DVS have been intensively studied and several learning rules have been proposed[62−65]. A common feature of these studies is that the signals transmitted from DVS to SNN are ±1. In other words, only the polarity of the light changes is kept while the amplitudes are neglected. Such information compression on the one hand greatly simplifies the subsequent processing by the SNN, while on the other hand it discards the abundant information contained in the amplitudes or speeds of the light changes, which can cause errors in the SNN computing. In contrast, as seen in Fig. 11, there is much less information loss in our CMT-based DVS, since the amplitude of the output photocurrent is proportional to that of the light change. Yet such effective preservation of the sensed information also demands innovative SNN algorithms capable of handling this non-spike-like, analog input. In other words, there remains the challenge of finding efficient methods to interface the analog-format information generated by the CMT sensor with spike-format SNN processing.
4. Conclusion and perspectives
Complementary memtransistors, made up of two memtransistors with opposite gate-tuning behaviors of the channel conductance, have been reviewed in this work with regard to two critical questions: First, what types of materials and mechanisms can be utilized to fabricate CMT? Second, what types of novel hardware paradigms can be designed by utilizing CMT in neuromorphic computing?
For the first question, there are two critical demands: 1) highly efficient approaches to nonvolatile tuning of the channel conductance and 2) opposite gate-tuning behaviors of the channel conductance in CMT. For the former, the mainstream approaches have been discussed, with the key performance indices listed, including programming speed, number of states, retention and endurance. Our conclusion is that, currently, each type of nonvolatile conductance tuning has both advantages and disadvantages. For example, the commercially mature charge-trapping-layer mechanism achieves quite impressive performance, yet its programming speed remains a bottleneck. In the future, the novel properties found in 2D or heterogeneously stacked 2D materials may enable high-performance MT. Not only has ultrafast (<100 ns) and robust (10⁶ endurance and 10-year retention) floating-gate tuning been reported in 2D flash memory devices, but ferroelectric tuning has also been demonstrated in heterogeneously stacked 2D transistors. Yet researchers are still struggling with the large-scale and highly consistent production of 2D materials. Therefore, the potential and advantages of 2D-material-based CMT have yet to be realized at the chip level with high quality and in mass production. The latter demand, complementary tuning of the channel conductance, seems easy to meet at first glance: by using n- and p-doped channels, two MTs with opposite transfer characteristics can be realized. However, strict symmetry between the n- and p-doped MTs remains a significant challenge. Here it is worth mentioning that graphene is a particularly suitable channel material for CMT fabrication. Its conduction and valence bands are exactly symmetric, resulting in CMT with highly symmetric conductance tuning, as seen in Fig. 3. Considering the additional advantage of convenient and cheap fabrication of graphene channels, graphene stands out as the best candidate for proof-of-concept research on CMT.
For the second question, introducing the unique properties of CMT into neuromorphic computing design brings both great benefits and severe challenges. On the one hand, tremendous advantages can be gained by exploiting the complementary conductance tuning behaviors of CMT. For example, we have seen that by using CMT as the potentiative and depressive synapse pair in ReSuMe learning, the supervision circuit module is fully removed, hence reducing the hardware and energy consumption by several orders of magnitude. On the other hand, the application of complementary behaviors places specific demands on CMT cells. Primarily, strict symmetry between the conductance tuning of the two MTs is required throughout the whole training process. However, the conductance saturation met at both the high- and low-resistance limits of real devices would easily break the symmetry of the two MTs during the synaptic weight updates of network-level training. To address this challenge, a refresh operation that periodically exchanges the potentiative and depressive roles of the two MTs has been proposed and has shown promising potential for avoiding conductance saturation.
Another interesting question is whether other notable benefits will emerge if CMT channels are further fabricated with materials capable of switching between n- and p-doping, namely reconfigurable channels[37, 51, 66]. Physically, by preparing MT channels with narrow-bandgap semiconductor materials, the electrostatic doping of the channel can be switched between n- and p-type, as seen in Fig. 5. The former demonstrates potentiative synapse behavior while the latter demonstrates depressive behavior. One potential application scenario of such reconfigurable channels is optoelectronic computing. By locally tuning the channel regions to p−i−n doping through split gates, a photodiode is formed, as seen in Fig. 9. Such a transistor can convert input optical signals into electrical ones. Furthermore, by tuning the electrostatic doping density of the p−i−n channel through the split-gate voltages, the photoresponsivity can be changed. This means that, given the same input optical signals, the output photocurrent signals are tunable depending on the reconfigurable channel doping. From the viewpoint of neuromorphic computing, such tunable photoresponsivity enabled by the reconfigurable channels provides the synaptic plasticity function. In this way, in-sensor neuromorphic computing has been realized in MT, and more complicated dynamic vision has then been demonstrated with CMT. It is worth pointing out that in the latter application, reconfigurable channels play an even more critical role. The reason is that network-level training requires a great many flips of the synaptic weights between positive (+) and negative (−) values, which have to be realized by tuning the cell photoresponsivity between positive and negative. In our CMT design, this means exchanging the p−i−n and n−i−p doping of the two complementary optoelectronic MT. Hence, the demand on channel reconfigurability becomes even more prominent.
For the work discussed in Section 3.3, the reconfiguration was physically realized through local tuning effects, i.e., programming voltage pulses imposed on the split gates of the CMT. The major limitation of this reconfiguring approach is endurance. Frequent tunnelling or hot carrier injection through the gate dielectric layer on the one hand meets the need for channel doping reconfiguration, while on the other hand it leads to rapid degradation of the gate dielectric. More generally, the shortcomings of implementing channel reconfigurability largely depend on the nonvolatile tuning mechanisms and materials employed by the CMT.
At the neural network level, various learning rules realized through CMT cells have been discussed in this work. It seems that ReSuMe matches CMT best, since the STDP and anti-STDP making up the rule are implemented by the two MTs in the cell, respectively. However, at the current stage it is very difficult to train SNN with more than one layer through ReSuMe[67]. This limitation prevents ReSuMe from solving problems more complicated than the Modified National Institute of Standards and Technology (MNIST) dataset. Similar problems exist in the R-STDP algorithm. Although imposing a reward signal has greatly improved the training accuracy of the originally unsupervised STDP rule, whether such a globally identical signal would significantly benefit the training of multilayer, namely deep, SNN remains to be seen[68]. Thereby, algorithm-level breakthroughs are called for to advance the application of CMT-based SNN design. In addition to the ReSuMe and R-STDP learning algorithms, will there be other important neuromorphic algorithms that can be realized by CMT with revolutionary efficiency? We believe that the key issue here is the sophisticated usage of anti-STDP. In contrast to the STDP rule, anti-STDP runs against the law of causality and thus seems not to be allowed in reality. However, we argue that the relation between STDP and anti-STDP is just like that between real and imaginary numbers. A well-crafted usage of anti-STDP complementary to STDP would facilitate the application of neuromorphic algorithms in the real world, just as complex numbers do in physics. Therefore, design technology co-optimization (DTCO) will be of particular importance for CMT-based neuromorphic computing, since not only should the complementary behaviors of conductance tuning be technically tailored, but the CMT cells and circuits should also be carefully designed to match the requirements of different learning rules.
Acknowledgement
This work was supported by the National Key Research and Development Program of China (No. 2023YFB4502200), Natural Science Foundation of China (Nos. 92164204 and 62374063) and the Science and Technology Major Project of Hubei Province (No. 2022AEA001).