## Supplementary Information

# Demonstration of a novel Majority logic in the memristive crossbar array for in-memory parallel computing

Moon Gu Choi, Jae Hyun In, Hanchan Song, Gwangmin Kim, Hakseung Rhee, Woojoon Park and Kyung Min Kim\*

M. G. Choi, J. H. In, H. Song, G. Kim, H. Rhee, W. Park and Prof. K. M. Kim

Department of Materials Science and Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea

\*E-mail: km.kim@kaist.ac.kr

This Supplementary Information contains the following materials:

**S1.** A detailed explanation of calculating the VTF value of logic gates (Supplementary Note 1)

**S2.** A comparison of the VTF value of logic gates depending on the different $\rho$ values (Fig. S1)

**S3.** The requirements for high performance logic operation across different types of memristors (Supplementary Note 2)

**S4.** Material characterization of the R<sub>S</sub>-controlled crossbar array and Ta/HfO<sub>2</sub>/Pt memristor (Fig. S2)

**S5.** Precise analysis and mathematical calculations of the influence of $R_S$ on the set and reset voltage of the memristor (Supplementary Notes 3-4)

**S6.** More electrical characterization of the Ta/HfO<sub>2</sub>/Pt memristor (Fig. S3-6)

**S7.** A detailed calculation of gate operation voltages of the logic gate (Fig. S7-9)

**S8.** Experimental results of logic gates for functionally complete MAJ logic (Fig. S10-11)

**S9.** Truth table of 1-bit Full Adder and 1-bit Full Subtractor (Fig. S12)

**S10.** A circuit design and SPICE simulation of an RVC circuit (Fig. S13)

**S11.** A feasible circuit architecture for a large-scale crossbar array (Fig. S14)

**S12.** A comparison of energy consumption of the RVC circuit and the memristor (Fig. S15)

**S13.** A detailed explanation of logical equality (Supplementary Note 5)

**S14.** A comprehensive specification comparison of 1-bit Full Adder (Table S1)

**S15.** A full procedure of 4-bit Carry Lookahead Adder (Fig. S16)

**S16.** A circuit design of 8-bit Kogge-Stone Adder (KSA) (Fig. S17)

**S17.** A full procedure of 8-bit Kogge-Stone Adder (Fig. S18-20)

**S18.** Scaling-up of Kogge-Stone Adder to n-bit $(n \ge 16)$ adder (Fig. S21)

**S19.** A comparison of energy consumption in 64-bit adder between NOR-based and MAJ-based logic system (Fig. S22)

**S1.** A detailed explanation of calculating the VTF value of various logic gates

Supplementary Note 1. The detailed calculation procedure of the VTF.

1) Assumptions used in the VTF calculation of the logic gate.

- i) The ON/OFF ratio of a memristor is high enough to ignore the conductance of HRS when calculating the node voltage.
- ii) The conductance of LRS is constant since the variation of LRS is much smaller than that of HRS.
- iii) The resistance of the series resistor (R<sub>S</sub>) is the same as the resistance of the LRS.
- iv) Only cycle-to-cycle variation is considered.

2) Define cycle-to-cycle variation parameters and normalization with the maximum SET voltage (V<sub>SET, Max</sub>).



The upper left figure shows the switching voltage distribution of a single memristor, while the lower left figure shows the distribution normalized by the maximum SET voltage ($V_{SET, Max}$). The table (right) shows the defined parameters and their expressions ($v_i$ will be discussed later).

3) Define voltage conditions corresponding to each logic condition of the logic gates via truth tables. (NOT gate is shown as an example.)



The left figure shows the circuit configuration of the NOT gate and  $V_{Node}$  is the potential of the BL<sub>1</sub>. The right figure shows the voltage conditions corresponding to each logic condition. 1  $\rightarrow$  0 case can be ignored since RESET logic is not used.

| $V_{Node}$ | $v_A$ | $v_B$ | | $V_{Node}$ | $v_A$ | $v_B$ |
|---|---|---|---|---|---|---|
| $0$ | $V_A$ (0) | $V_B$ (0) | SET → | $\frac{V_B}{2}$ | $V_A - \frac{V_B}{2}$ (0) | $\frac{V_B}{2}$ (1) |
| $\frac{V_A}{2}$ | $\frac{V_A}{2}$ (1) | $V_B - \frac{V_A}{2}$ (0) | | $\frac{V_A + V_B}{3}$ | $\frac{2V_A - V_B}{3}$ (1) | $\frac{2V_B - V_A}{3}$ (1) |

By using Kirchhoff's current law and the previously mentioned assumptions, $V_{Node}$ can be calculated as $\sum_i V_i G_i / \sum_i G_i$. The voltage drop on the memristor ($v_i$) can be calculated by subtracting the node voltage ($V_{Node}$) from the voltage applied to the memristor cell ($V_i$). The above table shows the case for the NOT gate, and the yellow circles (rendered here as numbers in parentheses) indicate the corresponding logical state ('0' or '1'). For example, if the input is '0' before SET switching, the output must be '1' for the NOT gate. The input must not change while the output must change appropriately; thus, the voltage drop on device 'A', which is $v_A$, must be smaller than $1 - \Delta_{SET}$ (0 $\rightarrow$ 0), whereas the voltage drop on device 'B', which is $v_B$, must be larger than 1 (0 $\rightarrow$ 1).
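The node-voltage rule above can be checked numerically. The following is a minimal sketch under the stated assumptions (HRS conductance neglected, R<sub>S</sub> equal to the LRS resistance and tied to ground); the function names and the two gate voltages are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def node_voltage(applied, in_lrs, v_series=Fraction(0)):
    """KCL node voltage in units where G_LRS = G_RS = 1 and G_HRS = 0.

    applied:  dict of cell name -> voltage V_i applied to that cell
    in_lrs:   dict of cell name -> True if the cell is in the LRS
    v_series: potential at the far end of the series resistor R_S (ground here)
    """
    numerator = Fraction(v_series)   # R_S always conducts (assumption iii)
    denominator = Fraction(1)
    for name, v in applied.items():
        if in_lrs[name]:             # HRS cells are ignored (assumption i)
            numerator += v
            denominator += 1
    return numerator / denominator

def voltage_drops(applied, in_lrs):
    """Return V_Node and the drop v_i = V_i - V_Node on every cell."""
    v_node = node_voltage(applied, in_lrs)
    return v_node, {name: v - v_node for name, v in applied.items()}

# Hypothetical normalized gate voltages, used only to exercise the formulas.
V_A, V_B = Fraction(3, 5), Fraction(11, 10)
applied = {"A": V_A, "B": V_B}

for a_lrs, b_lrs in [(False, False), (True, False), (False, True), (True, True)]:
    v_node, drops = voltage_drops(applied, {"A": a_lrs, "B": b_lrs})
    print(f"A={int(a_lrs)} B={int(b_lrs)}  V_Node={v_node}  "
          f"v_A={drops['A']}  v_B={drops['B']}")
```

Exact fractions are used so that each printed value can be compared directly with the closed-form entries of the NOT-gate table, for example $V_{Node} = (V_A + V_B)/3$ when both cells are in the LRS.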

4) Solve all inequalities for all conditions.

Considering all voltage conditions of the NOT gate, 6 inequalities can be obtained as follows.

$$V_A < 1 - \Delta_{SET}$$

$$V_A - \frac{V_B}{2} < 1 - \Delta_{SET}$$

$$V_B - \frac{V_A}{2} < 1 - \Delta_{SET}$$

$$\frac{V_A}{2} > \rho$$

$$\frac{V_B}{2} > \rho$$

$$V_B > 1$$


There are two variables, $\rho$ and $\Delta_{SET}$, and both are device parameters. If we fix the $\rho$ value in all inequalities, it is possible to solve them and find the solution for which $\Delta_{SET}$ is maximized. The variation tolerance factor (VTF) of the logic gate is defined as this maximum value of $\Delta_{SET}$.<sup>1</sup> The gate operation voltages, which are $V_A$ and $V_B$, are the values that result in the VTF. However, the VTF considers only cycle-to-cycle variations, not cell-to-cell variations. Therefore, a slight compensation in the gate operation voltage value is recommended during the actual implementation of logic gates.
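As a worked illustration of this step, the sketch below combines only three of the six NOT-gate inequalities (those that do not contain $\rho$), so it gives an upper bound on $\Delta_{SET}$ rather than the VTF itself:

$$\begin{aligned}
V_B > 1,\quad V_B - \frac{V_A}{2} < 1 - \Delta_{SET} \;&\Rightarrow\; \frac{V_A}{2} > V_B - (1 - \Delta_{SET}) > \Delta_{SET} \\
V_A > 2\Delta_{SET},\quad V_A < 1 - \Delta_{SET} \;&\Rightarrow\; 2\Delta_{SET} < 1 - \Delta_{SET} \;\Rightarrow\; \Delta_{SET} < \frac{1}{3}
\end{aligned}$$

The remaining inequalities, including those containing $\rho$, can only tighten this bound.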

5) Evaluate whether a particular logic gate is practical.

After obtaining the VTF, we can evaluate whether a particular logic gate is practical by comparing the VTF of the logic gate with the device's  $\Delta_{SET}$ . The VTF indicates how much variation a particular logic gate can tolerate. Therefore, the logic gate is practical if the VTF of the logic gate is larger than the device's variation (VTF >  $\Delta_{SET}$ ).

**S2.** A comparison of the VTF value of logic gates depending on the different  $\rho$  values



Fig. S1. The VTF values of various logic gates depending on different  $\rho$  values.

**S3.** The requirements for high performance logic operation across different types of memristors

**Supplementary Note 2.** A comparison of various parameters for logic operation across different types of memristors.

| Type | Device structure [Ref.] | $V_{RESET, Max}$ / $V_{SET, Max}$ | $\frac{V_{RESET, Max}}{V_{SET, Max}}$ | VTF | Variation tolerance | On/Off ratio | Switching speed | Endurance / Uniformity |
|------|-------------------------|-----------------------------------|---------------------------------------|-----|---------------------|--------------|-----------------|------------------------|
| ECM | Ag/HfO<sub>2</sub>/Pt [2] | -1.5 V / 4 V | 0.4 | 0.04 | Low | ~10<sup>5</sup> | Medium | Low / Low |
| TCM | TiN/TiO<sub>2</sub>/Al [3] | 1.65 V / 4.3 V | 0.4 | 0.04 | Low | ~10<sup>2</sup> | Fast | Low / Low |
| CTM | Pt/Ta<sub>2</sub>O<sub>5</sub>/Nb<sub>2</sub>O<sub>5-x</sub>/Al<sub>2</sub>O<sub>3-y</sub>/Ti [4] | -6 V / 10 V | 0.6 | 0.16 | Medium | ~10<sup>3</sup> | Slow | Medium / High |
| VCM | Ta/TaO<sub>x</sub>/Pt [5] | -0.9 V / 1.85 V | 0.5 | 0.1 | Low | ~10<sup>2</sup> | Fast | High / High |
| VCM | Pt/Ta/HfO<sub>2</sub>/Pt [This work] | -1.06 V / 0.87 V | 1.2 | 0.4 | High | ~40 | Fast | High / High |

From the perspective of reliable logic operation, the most important factor is maximizing the VTF of the device, and this is achieved by having a higher  $V_{RESET}$  to  $V_{SET}$  ratio. In this regard, the VCM-type HfO<sub>2</sub>-based memristor used in our study has advantages compared to other mechanisms.
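The ratio column of the table can be reproduced from the listed switching voltages. This is a small sketch that assumes the ratio is taken between magnitudes and rounded to one decimal place; the labels and the helper name are illustrative.

```python
# (label, V_RESET,Max in volts, V_SET,Max in volts), copied from the table above
devices = [
    ("Ag/HfO2/Pt (ECM)", -1.5, 4.0),
    ("TiN/TiO2/Al (TCM)", 1.65, 4.3),
    ("Pt/Ta2O5/Nb2O5-x/Al2O3-y/Ti (CTM)", -6.0, 10.0),
    ("Ta/TaOx/Pt (VCM)", -0.9, 1.85),
    ("Pt/Ta/HfO2/Pt (VCM, this work)", -1.06, 0.87),
]

def switching_ratio(v_reset, v_set):
    """Magnitude ratio |V_RESET| / |V_SET| (assumed definition of the ratio column)."""
    return abs(v_reset) / abs(v_set)

ratios = [round(switching_ratio(v_reset, v_set), 1) for _, v_reset, v_set in devices]

for (label, _, _), ratio in zip(devices, ratios):
    print(f"{label}: {ratio}")
```

Only the Pt/Ta/HfO<sub>2</sub>/Pt device has a ratio above 1, which is the property the paragraph above links to a high VTF.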

First, the switching mechanism of the ECM memristor is the drift or diffusion of highly mobile metal ions, such as Ag or Cu, which form a conductive filament. In the ECM memristor, reset failure (set-stuck) can occur because metal ions accumulate over repeated switching, which is a critical problem for endurance. In addition, it shows low uniformity because metal ions are randomly located in the switching layer. Also, this makes the set operation require a higher voltage than the reset operation, which results in a low switching ratio (low VTF).

Second, the switching mechanism of the TCM memristor is cation reduction and oxidation. It exhibits unipolar switching because the conductive filament is ruptured by Joule heating. Unfortunately, the unipolar switching results in the reset operation in the positive voltage range (smaller than set voltage). This prevents the memristor from switching conditionally, which is the fundamental principle of logic operation. Furthermore, the TCM memristor shows a low switching ratio (low VTF) since the conductive filament ruptures easily through Joule heating.

Third, the switching mechanism of the CTM memristor is charge trapping and detrapping in the defect sites of the oxide layer. It exhibits high uniformity. However, the switching speed is about 1 ms, which is much slower than the VCM memristor (~50 ns). Also, the switching voltage is much higher than that of other types of memristors due to its switching mechanism. Slow switching speed and high switching voltage lead to slow and energy-intensive computing, which is inappropriate for logic operation.

Last, the switching mechanism of the VCM memristor is the drift and diffusion of oxygen vacancies between the oxide and oxygen reservoir. This results in not only high endurance and uniformity, but also fast switching speed. In summary, the VCM memristor with a high switching ratio (high VTF) is appropriate for the logic operation.



**S4.** Material characterization of the R<sub>S</sub>-controlled crossbar array and Ta/HfO<sub>2</sub>/Pt memristor

Fig. S2. a) An optical microscopic image of the  $R_s$ -controlled crossbar array and b) the equivalent circuit of the  $R_s$ -controlled crossbar array. c) A cross-sectional TEM image of the Ta/HfO<sub>2</sub>/Pt memristor. The scale bars of the images are 100 µm and 10 nm, respectively.

**S5.** Precise analysis and mathematical calculations of the influence of  $R_s$  on the set and reset voltage of the memristor

**Supplementary Note 3.** Experimental analysis of the influence of  $R_s$  on the set and reset voltage of the memristor.



a) The *I-V* curves of the device with 4 different $R_S$ values, measured for 15 cycles each. The *I-V* curves of the 1st, 5th, and 15th cycles are shown. b) The distributions of set and reset switching voltages with different $R_S$ values. c) The $\rho$ values with different $R_S$ values. The $\rho$ value between 1.2 and 1.5 is the practically available range for variation-tolerant logic operation.

To precisely analyze the influence of $R_S$ on the set and reset voltage of the memristor, 15 cycles of the device with 4 different $R_S$ values and without $R_S$ were measured. The $V_{SET}$ of the device was almost constant in every case, while the $V_{RESET}$ continuously decreased (became more negative) due to the voltage divider effect. Accordingly, the $\rho$ value increased as $R_S$ increased.

However, $R_S$ does not need to be larger than 500 $\Omega$ in terms of variation tolerance and energy consumption. The $\rho$ value of 1.5 is already sufficient to obtain the VTF value for variation-tolerant logic operation. Also, a larger $R_S$ further decreases the reset voltage, resulting in an increase in energy consumption.

**Supplementary Note 4.** The voltage divider effect during set and reset switching.



Assume that R<sub>S</sub> (300 $\Omega$) is $\frac{1}{3}$ of the device's LRS (900 $\Omega$) and $\frac{1}{100}$ of the HRS (30 k$\Omega$).

1) Comparison of $V_{SET}$ during SET switching with and without R<sub>S</sub>

i) without R<sub>S</sub>: $v_A$ is applied to the memristor

ii) with R<sub>S</sub>: $\frac{30000}{30000 + 300} \times v_A \approx 0.99\, v_A$ is applied to the memristor

Therefore, $V_{SET}$ is nearly the same with and without R<sub>S</sub>.

2) Comparison of $V_{RESET}$ during RESET switching with and without R<sub>S</sub>

i) without R<sub>S</sub>: $v_A$ is applied to the memristor

ii) with R<sub>S</sub>: $\frac{900}{900 + 300} \times v_A = 0.75\, v_A$ is applied to the memristor

An applied voltage of $\frac{4}{3} \times v_A$ is required to apply $v_A$ to the R<sub>S</sub>-incorporated memristor. Therefore, the magnitude of V<sub>RESET</sub> of the R<sub>S</sub>-incorporated memristor is $\frac{4}{3}$ times that of the memristor without R<sub>S</sub>.
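The two divider ratios of this note can be reproduced with a few lines of arithmetic. This is a minimal sketch using the resistance values assumed in the note; the function and variable names are illustrative.

```python
def memristor_share(r_memristor, r_series):
    """Fraction of the applied voltage that drops across the memristor."""
    return r_memristor / (r_memristor + r_series)

# Resistances in ohms, as assumed in Supplementary Note 4.
R_S, R_LRS, R_HRS = 300.0, 900.0, 30_000.0

set_share = memristor_share(R_HRS, R_S)      # device is in the HRS before SET
reset_share = memristor_share(R_LRS, R_S)    # device is in the LRS before RESET
reset_scale = 1.0 / reset_share              # growth factor of the applied RESET voltage

print(f"SET:   {set_share:.3f} of the applied voltage reaches the memristor")
print(f"RESET: {reset_share:.3f} of the applied voltage reaches the memristor")
print(f"Applied RESET voltage must grow by a factor of {reset_scale:.3f}")
```

Because the memristor is in the HRS before SET and in the LRS before RESET, the same series resistor leaves the SET voltage almost untouched while scaling the RESET voltage, which is why the ratio between them grows with R<sub>S</sub>.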

**S6.** More electrical characterization of the Ta/HfO<sub>2</sub>/Pt memristor



**Fig. S3.** Statistical results of the V<sub>SET</sub> and V<sub>RESET</sub> over 50 cycles. a) The *I-V* curves of 50 switching cycles of the R<sub>S</sub>-incorporated device. The self-limited switching region is indicated. b) Statistical distributions of the V<sub>SET</sub> and V<sub>RESET</sub>. The relative dispersion values ($\sigma/\mu$) of the V<sub>SET</sub> and V<sub>RESET</sub> are 7.8 % and 3.6 %, respectively. The inset shows their histograms.



**Fig. S4.** Robust cyclic endurance up to $10^6$ cycles, where the set and reset pulses were 0.9 V for 10 µs and -1.4 V for 10 µs, respectively, and the reading pulse was 0.2 V for 10 µs.



Fig. S5. Stable retention results up to  $10^4$  seconds read at 0.2 V at room temperature.



**Fig. S6.** Excellent device-to-device uniformity in the crossbar array. 8 randomly selected devices from the crossbar array were measured for 25 switching cycles each.





**S7.** A detailed calculation of gate operation voltages of the logic gate

**Fig. S7.** Operation clock pulse calculation of the BUFFER gate in consideration of the VTF. a) Circuit configuration of the BUFFER gate. b) Physical state of input and output and appropriate voltage conditions for each logic condition. c) All voltage conditions that satisfy all logic conditions for the BUFFER gate.



**Fig. S8.** Operation clock pulse calculation of the NOT gate in consideration of the VTF. a) Circuit configuration of the NOT gate. b) Physical state of input and output and appropriate voltage conditions for each logic condition. c) All voltage conditions that satisfy all logic conditions for the NOT gate.

b) Logic conditions and the corresponding voltage conditions:

| Logic condition | Voltage condition |
|-----------------|-------------------|
| 0 → 0 | $v_i < 1 - \Delta_{SET}$ |
| 1 → 1 | $v_i > \rho$ |
| 0 → 1 | $v_i > 1$ |

c) Input C is logical '0'. The number in parentheses is the logical state of the device.

Output Y in state '0':

| $V_{Node}$ | $v_A$ | $v_B$ | $v_Y$ |
|------------|-------|-------|-------|
| $0$ | $V_A$ (0) | $V_A$ (0) | $V_B$ (0) |
| $\frac{V_A}{2}$ | $\frac{V_A}{2}$ (0) | $\frac{V_A}{2}$ (1) | $V_B - \frac{V_A}{2}$ (0) |
| $\frac{V_A}{2}$ | $\frac{V_A}{2}$ (1) | $\frac{V_A}{2}$ (0) | $V_B - \frac{V_A}{2}$ (0) |
| $\frac{2V_A}{3}$ | $\frac{V_A}{3}$ (1) | $\frac{V_A}{3}$ (1) | $V_B - \frac{2V_A}{3}$ (0) |

Output Y in state '1':

| $V_{Node}$ | $v_A$ | $v_B$ | $v_Y$ |
|------------|-------|-------|-------|
| $\frac{V_B}{2}$ | $V_A - \frac{V_B}{2}$ (0) | $V_A - \frac{V_B}{2}$ (0) | $\frac{V_B}{2}$ (1) |
| $\frac{V_A + V_B}{3}$ | $\frac{2V_A - V_B}{3}$ (0) | $\frac{2V_A - V_B}{3}$ (1) | $\frac{2V_B - V_A}{3}$ (1) |
| $\frac{V_A + V_B}{3}$ | $\frac{2V_A - V_B}{3}$ (1) | $\frac{2V_A - V_B}{3}$ (0) | $\frac{2V_B - V_A}{3}$ (1) |
| $\frac{2V_A + V_B}{4}$ | $\frac{2V_A - V_B}{4}$ (1) | $\frac{2V_A - V_B}{4}$ (1) | $\frac{3V_B - 2V_A}{4}$ (1) |

d) Input C is logical '1'.

Output Y in state '0':

| $V_{Node}$ | $v_A$ | $v_B$ | $v_Y$ |
|------------|-------|-------|-------|
| $V_C$ | $V_A - V_C$ (0) | $V_A - V_C$ (0) | $V_B - V_C$ (0) |
| $\frac{V_A + V_C}{2}$ | $\frac{V_A - V_C}{2}$ (0) | $\frac{V_A - V_C}{2}$ (1) | $V_B - \frac{V_A + V_C}{2}$ (0) |
| $\frac{V_A + V_C}{2}$ | $\frac{V_A - V_C}{2}$ (1) | $\frac{V_A - V_C}{2}$ (0) | $V_B - \frac{V_A + V_C}{2}$ (0) |
| $\frac{2V_A + V_C}{3}$ | $\frac{V_A - V_C}{3}$ (1) | $\frac{V_A - V_C}{3}$ (1) | $V_B - \frac{2V_A + V_C}{3}$ (0) |

Output Y in state '1':

| $V_{Node}$ | $v_A$ | $v_B$ | $v_Y$ |
|------------|-------|-------|-------|
| $\frac{V_B + V_C}{2}$ | $V_A - \frac{V_B + V_C}{2}$ (0) | $V_A - \frac{V_B + V_C}{2}$ (0) | $\frac{V_B - V_C}{2}$ (1) |
| $\frac{V_A + V_B + V_C}{3}$ | $\frac{2V_A - V_B - V_C}{3}$ (0) | $\frac{2V_A - V_B - V_C}{3}$ (1) | $\frac{2V_B - V_A - V_C}{3}$ (1) |
| $\frac{V_A + V_B + V_C}{3}$ | $\frac{2V_A - V_B - V_C}{3}$ (1) | $\frac{2V_A - V_B - V_C}{3}$ (0) | $\frac{2V_B - V_A - V_C}{3}$ (1) |
| $\frac{2V_A + V_B + V_C}{4}$ | $\frac{2V_A - V_B - V_C}{4}$ (1) | $\frac{2V_A - V_B - V_C}{4}$ (1) | $\frac{3V_B - 2V_A - V_C}{4}$ (1) |

**Fig. S9.** Operation clock pulse calculation of the Majority gate in consideration of the VTF. a) Circuit configuration of the three-input, one-output Majority gate. b) Physical state of input and output and appropriate voltage conditions for each logic condition. c, d) All voltage conditions that satisfy all logic conditions of the Majority gate when the input ($V_C$) is logical '0' and '1', respectively.
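The voltage conditions of panels c and d follow from the same node-voltage rule as in Supplementary Note 1. In those expressions the node voltage equals $V_C$ when all three memristors are in the HRS, which suggests that the third input is applied as a voltage through the series resistor. The sketch below adopts that reading as an assumption (it is inferred from the expressions, not stated in the caption); the function name and the test voltages are hypothetical.

```python
from fractions import Fraction
from itertools import product

def maj_gate_drops(v_a, v_b, v_c, a, b, y):
    """Return (V_Node, v_A, v_B, v_Y) for device states a, b, y (True = LRS).

    Inputs A and B are both driven by v_a, the output Y by v_b, and the series
    resistor (conductance equal to G_LRS) is assumed to be tied to v_c.
    """
    numerator, denominator = Fraction(v_c), Fraction(1)
    for applied, is_lrs in ((v_a, a), (v_a, b), (v_b, y)):
        if is_lrs:                      # HRS devices are neglected
            numerator += applied
            denominator += 1
    v_node = numerator / denominator
    return v_node, v_a - v_node, v_a - v_node, v_b - v_node

# Arbitrary test voltages (hypothetical, not the operating point of the paper).
V_A, V_B, V_C = Fraction(-1, 2), Fraction(4, 5), Fraction(-1, 4)

for a, b, y in product((False, True), repeat=3):
    v_node, drop_a, drop_b, drop_y = maj_gate_drops(V_A, V_B, V_C, a, b, y)
    print(f"A={int(a)} B={int(b)} Y={int(y)}  V_Node={v_node}  "
          f"v_A={drop_a}  v_Y={drop_y}")
```

Setting `V_C` to zero reproduces the panel c expressions, so a single routine covers both input cases.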

**S8.** Experimental results of logic gates for functionally complete MAJ logic



**Fig. S10.** Experimental demonstration of the NOT gate. a) Circuit configuration of the NOT gate. The inset shows the operation clock pulse condition. b) Experimental results of the NOT gate.

Input and output states are read before and after the operation clock pulse. The read currents are measured at 0.25 V amplitude. Operating voltages and read currents are shown in gray and blue, respectively.



**Fig. S11.** Experimental demonstration of the BUFFER gate. a) Circuit configuration of the BUFFER gate. The inset shows the operation clock pulse condition. b) Experimental results of the BUFFER gate.

Input and output states are read before and after the operation clock pulse. The read currents are measured at 0.25 V amplitude. Operating voltages and read currents are shown in gray and blue, respectively.

**S9.** Truth table of 1-bit Full Adder and 1-bit Full Subtractor

| Input A | Input B | Input C | Output C<sub>out</sub> | Output Sum |
|---------|---------|---------|------------------------|------------|
| 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 0 | 1 |
| 0 | 1 | 0 | 0 | 1 |
| 0 | 1 | 1 | 1 | 0 |
| 1 | 0 | 0 | 0 | 1 |
| 1 | 0 | 1 | 1 | 0 |
| 1 | 1 | 0 | 1 | 0 |
| 1 | 1 | 1 | 1 | 1 |

Fig. S12. The truth table of ALU operation. a) 1-bit Full Adder b) 1-bit Full Subtractor.
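Both truth tables named in the caption can be generated from the arithmetic definitions. This is a minimal sketch; the subtractor rows come from the standard definition of a full subtractor (A minus B minus borrow-in) and are not copied from the source figure, and the function names are illustrative.

```python
from itertools import product

def full_adder(a, b, c_in):
    """Return (C_out, Sum) of a 1-bit full adder."""
    total = a + b + c_in
    return total >> 1, total & 1

def full_subtractor(a, b, b_in):
    """Return (B_out, Difference) of a 1-bit full subtractor computing a - b - b_in."""
    diff = a - b - b_in
    return int(diff < 0), diff & 1

adder_rows = [(bits, full_adder(*bits)) for bits in product((0, 1), repeat=3)]
subtractor_rows = [(bits, full_subtractor(*bits)) for bits in product((0, 1), repeat=3)]

print("A B C | C_out Sum")
for (a, b, c), (c_out, s) in adder_rows:
    print(f"{a} {b} {c} |   {c_out}    {s}")

print("A B Bin | B_out Diff")
for (a, b, b_in), (b_out, d) in subtractor_rows:
    print(f"{a} {b}  {b_in}  |   {b_out}    {d}")
```

The Sum and Difference columns are identical (both equal the three-input XOR); only the carry and borrow columns differ.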



**S10.** A circuit design and SPICE simulation of an RVC circuit

**Fig. S13.** Circuit configuration of an R-to-V converter and its timing diagram. a) Circuit configuration of an R-to-V converter. The Comparator, D flip-flop, inverter, and transistors are connected. b, c) Timing diagram of the R-to-V converter for HRS and LRS, respectively.

The output signal of the comparator is high enough to trigger the D flip-flop if the selected memristor is in the LRS, whereas it is not sufficient to trigger it if the memristor is in the HRS. Even after the read process, the converted signal Q is retained unless a clock pulse is given to the D flip-flop, so it can be used whenever it is needed. Also, the complement of the read signal is available through the $\overline{Q}$ signal. Lastly, $V_Q$ turns on the appropriate transistor, so that $V_{out}$ equals $V_{ss}$ (-0.65 V) or ground (0 V) when the read memristor is in the LRS or HRS, respectively.

**S11.** A feasible circuit architecture for a large-scale crossbar array



Fig. S14. Feasible circuit architecture for a large-scale crossbar array.

The circuit architecture consists of a 1T1M (one-transistor, one-memristor) array and peripheral CMOS circuits.<sup>6</sup> The 1T1M structure can reduce sneak current during logic operations. The peripheral CMOS circuit consists of demultiplexers, control units, R-to-V converters, and resistors. A control unit is the main unit that regulates the entire operation and sends appropriate signals to other subordinate units, such as pulse generators and demultiplexers. The address control unit delivers clock signals to the selected row and column demultiplexers. Converted signals from R-to-V converters can be stored temporarily in the register (or D flip-flop), and output data can be stored permanently in the crossbar array.





**S12.** A comparison of energy consumption of the RVC circuit and the memristor

**Fig. S15.** Comparison of energy consumption. a) Set switching transient of the memristor and b) the yellow region of Fig. S15a.

A 0.9 V amplitude pulse is applied to the device, and 100 ns is enough for the device to SET switch to 900 $\Omega$ (LRS). The energy consumption of a single memristor during SET switching is calculated as 70 pJ. This energy is comparable to that of logic operations using conventional VCM memristors.<sup>7</sup> The measured power of flip-flops fabricated in conventional 65 nm CMOS technology at 1.2 V supply voltage, 100 MHz, and 20 % activity ratio is 0.37 $\mu$W. For a clock pulse width of 150 ns, the energy consumption is calculated as 0.055 pJ.<sup>8</sup> Also, the measured power of the voltage comparator fabricated in conventional 65 nm CMOS technology at 1.0 V supply voltage and 25 MHz is about 5 $\mu$W, and the measured energy per comparison is 0.192 pJ.<sup>9</sup> The summed energy of the flip-flop and comparator is 0.247 pJ, which is 0.35 % of the SET switching energy of a memristor. Therefore, the energy consumption of the RVC circuit is negligible.
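The energy budget above is simple enough to recompute. The following sketch uses only the figures quoted in the paragraph; the variable names are illustrative. The flip-flop energy evaluates to 0.0555 pJ, which the text quotes as 0.055 pJ.

```python
PICO = 1e-12

# Figures quoted in the paragraph above.
flip_flop_power_w = 0.37e-6        # 65 nm flip-flop, 1.2 V, 100 MHz, 20 % activity
clock_width_s = 150e-9             # clock pulse width
comparator_energy_pj = 0.192       # measured energy per comparison
memristor_set_energy_pj = 70.0     # SET switching energy of a single memristor

flip_flop_energy_pj = flip_flop_power_w * clock_width_s / PICO
rvc_energy_pj = flip_flop_energy_pj + comparator_energy_pj
share_percent = 100.0 * rvc_energy_pj / memristor_set_energy_pj

print(f"Flip-flop energy: {flip_flop_energy_pj:.4f} pJ")
print(f"RVC energy (flip-flop + comparator): {rvc_energy_pj:.4f} pJ")
print(f"Share of the memristor SET energy: {share_percent:.2f} %")
```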

**S13.** A detailed explanation of logical equality

**Supplementary Note 5.** Logical equality of Sum calculation.

**1.** Using the $C_n$ bit

$$\begin{aligned}
C_{n+1} &= MAJ_3(A_n, B_n, C_n) = A_nB_n + B_nC_n + C_nA_n \\
\overline{C_{n+1}} &= \overline{A_nB_n + B_nC_n + C_nA_n} = \overline{A_n}\,\overline{B_n} + \overline{B_n}\,\overline{C_n} + \overline{C_n}\,\overline{A_n} \\
P_{n+1} &= MAJ_3(A_n, B_n, \overline{C_n}) = A_nB_n + B_n\overline{C_n} + \overline{C_n}A_n \\
S_n &= MAJ_3(C_n, P_{n+1}, \overline{C_{n+1}}) = C_nP_{n+1} + P_{n+1}\overline{C_{n+1}} + \overline{C_{n+1}}C_n \\
C_nP_{n+1} &= A_nB_nC_n \\
P_{n+1}\overline{C_{n+1}} + \overline{C_{n+1}}C_n &= A_n\overline{B_n}\,\overline{C_n} + \overline{A_n}B_n\overline{C_n} + \overline{A_n}\,\overline{B_n}C_n \\
S_n &= A_n\overline{B_n}\,\overline{C_n} + \overline{A_n}B_n\overline{C_n} + A_nB_nC_n + \overline{A_n}\,\overline{B_n}C_n \\
&= (A_n \oplus B_n)\overline{C_n} + (A_n \odot B_n)C_n \\
&= A_n \oplus B_n \oplus C_n
\end{aligned}$$

**2.** Using the $A_n$ bit

$$\begin{aligned}
C_{n+1} &= MAJ_3(A_n, B_n, C_n) = A_nB_n + B_nC_n + C_nA_n \\
\overline{C_{n+1}} &= \overline{A_nB_n + B_nC_n + C_nA_n} = \overline{A_n}\,\overline{B_n} + \overline{B_n}\,\overline{C_n} + \overline{C_n}\,\overline{A_n} \\
Q_{n+1} &= MAJ_3(\overline{A_n}, B_n, C_n) = \overline{A_n}B_n + B_nC_n + C_n\overline{A_n} \\
S_n &= MAJ_3(A_n, Q_{n+1}, \overline{C_{n+1}}) = A_nQ_{n+1} + Q_{n+1}\overline{C_{n+1}} + \overline{C_{n+1}}A_n \\
A_nQ_{n+1} &= A_nB_nC_n \\
Q_{n+1}\overline{C_{n+1}} + \overline{C_{n+1}}A_n &= \overline{A_n}B_n\overline{C_n} + \overline{A_n}\,\overline{B_n}C_n + A_n\overline{B_n}\,\overline{C_n} \\
S_n &= A_n\overline{B_n}\,\overline{C_n} + \overline{A_n}B_n\overline{C_n} + A_nB_nC_n + \overline{A_n}\,\overline{B_n}C_n \\
&= A_n \oplus B_n \oplus C_n
\end{aligned}$$

$\oplus$: XOR gate, $\odot$: XNOR gate
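Both identities of this note can be confirmed by exhaustive enumeration of the eight input combinations. This is a minimal sketch; the function names are illustrative, and NOT is modeled as `1 - x` on 0/1 integers.

```python
from itertools import product

def maj3(x, y, z):
    """Three-input majority of 0/1 values."""
    return (x & y) | (y & z) | (z & x)

def sum_using_c(a, b, c):
    """S_n = MAJ3(C_n, P_{n+1}, NOT C_{n+1}) with P_{n+1} = MAJ3(A_n, B_n, NOT C_n)."""
    c_next = maj3(a, b, c)
    p_next = maj3(a, b, 1 - c)
    return maj3(c, p_next, 1 - c_next)

def sum_using_a(a, b, c):
    """S_n = MAJ3(A_n, Q_{n+1}, NOT C_{n+1}) with Q_{n+1} = MAJ3(NOT A_n, B_n, C_n)."""
    c_next = maj3(a, b, c)
    q_next = maj3(1 - a, b, c)
    return maj3(a, q_next, 1 - c_next)

mismatches = [
    (a, b, c)
    for a, b, c in product((0, 1), repeat=3)
    if not (sum_using_c(a, b, c) == sum_using_a(a, b, c) == a ^ b ^ c)
]
print("mismatching inputs:", mismatches)
```

An empty mismatch list means that either form yields the Sum bit from three Majority operations plus inversions.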

## **S14.** A comprehensive specification comparison of 1-bit Full Adder

**Table S1.** A comprehensive specification comparison of 1-bit Full Adder.

| Reference           | Year | Used gates        | Logic state variables | # of steps | # of cells | STC\* | Considering variation? |
|---------------------|------|-------------------|-----------------------|------------|------------|-------|------------------------|
| S. Kvatinsky et al. | 2013 | 2IMP, NOT         | Resistance            | 29         | 6          | 174   | Yes                    |
| G. C. Adam et al.   | 2016 | 2IMP, NOT         | Resistance            | 35         | 6          | 210   | No                     |
| P. Huang et al.     | 2016 | 3NAND, 3AND       | Resistance            | 10         | 9          | 90    | No                     |
| S. G. Rohani et al. | 2017 | 2IMP              | Resistance            | 22         | 5          | 110   | No                     |
| Z. Sun et al.       | 2018 | 5SUM, 4CARRY      | Resistance            | 2          | 5          | 10    | No                     |
| K. M. Kim et al.    | 2019 | 3NIMP, 3AND, 2OR  | Resistance            | 7          | 7          | 49    | Yes                    |
| L. Cheng et al.     | 2019 | 3NOR, 2IMP, NOT   | Resistance            | 12         | 11         | 132   | No                     |
| Y. S. Kim et al.    | 2020 | 3NOR, NOT, BUFFER | Resistance            | 12         | 13         | 156   | Yes                    |
| Y. Song et al.      | 2022 | 3XOR, 3AND, 3OR   | Resistance & Voltage  | 6          | 9          | 54    | No                     |
| This work           | -    | 3MAJ, NOT         | Resistance & Voltage  | 5          | 7          | 35    | Yes                    |

\*STC<sub>FA</sub> = Steps × Cells
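As a quick consistency check, the STC column of Table S1 can be recomputed from the step and cell counts. This is a sketch only; the dictionary simply copies the table values, and the labels are informal.

```python
# (steps, cells) for each full-adder design, copied from Table S1
designs = {
    "S. Kvatinsky et al. (2013)": (29, 6),
    "G. C. Adam et al. (2016)": (35, 6),
    "P. Huang et al. (2016)": (10, 9),
    "S. G. Rohani et al. (2017)": (22, 5),
    "Z. Sun et al. (2018)": (2, 5),
    "K. M. Kim et al. (2019)": (7, 7),
    "L. Cheng et al. (2019)": (12, 11),
    "Y. S. Kim et al. (2020)": (12, 13),
    "Y. Song et al. (2022)": (6, 9),
    "This work": (5, 7),
}

# STC is defined in the table footnote as Steps x Cells
stc = {name: steps * cells for name, (steps, cells) in designs.items()}

# List the designs from smallest to largest STC
for name, value in sorted(stc.items(), key=lambda item: item[1]):
    print(f"{value:4d}  {name}")
```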



## **S15.** A full procedure of 4-bit Carry Lookahead Adder

**Fig. S16.** The full procedure of the 4-bit Carry Lookahead Adder (CLA). Carry-out generation requires 2 steps per bit (reading the carry-in bit and executing the MAJ gate). Sum generation requires 3 steps regardless of the number of bits. Thus, an *N*-bit CLA requires $2N + 3$ steps.

## **S16.** A circuit design of 8-bit KSA



**Fig. S17.** A circuit design of the 8-bit KSA. A parallel-prefix adder requires $\log_2 N$ rounds for an *N*-bit addition. Therefore, the 8-bit KSA requires 3 rounds to compute all $c_{out}$ outputs.<sup>10</sup> The data obtained during the logic operations ($M_i$, $N_i$, $c_i$) exactly match the data shown in Fig. 5b of the main text.



## **S17.** A full procedure of 8-bit Kogge-Stone Adder

**Fig. S18.** Carry-out generation and relocation process of the Kogge-Stone Adder. The carry-out generation process requires $4\log_2 N - 2$ steps, and the relocation process requires $\log_2 N - 1$ steps.



**Fig. S19.** Sum generation process of the Kogge-Stone Adder. The sum generation process requires 4 steps. Therefore, the total number of steps required by an *N*-bit Kogge-Stone Adder is $5\log_2 N + 1$ (i.e., $4\log_2 N - 2 + \log_2 N - 1 + 4$).
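The step counts stated in the captions of Fig. S16, S18 and S19 can be tabulated for several bit widths. The sketch below assumes that *N* is a power of two and that the stated formulas hold unchanged for every such *N*; the function names are hypothetical.

```python
def cla_steps(n_bits):
    """Steps for an N-bit CLA as stated for Fig. S16: 2N + 3."""
    return 2 * n_bits + 3


def ksa_steps(n_bits):
    """Steps for an N-bit Kogge-Stone adder, N assumed to be a power of two."""
    rounds = n_bits.bit_length() - 1          # log2(N) for a power of two
    if n_bits < 2 or 2 ** rounds != n_bits:
        raise ValueError("n_bits must be a power of two and at least 2")
    carry_generation = 4 * rounds - 2         # Fig. S18
    relocation = rounds - 1                   # Fig. S18
    sum_generation = 4                        # Fig. S19
    return carry_generation + relocation + sum_generation


print(" N   CLA   KSA")
for n in (4, 8, 16, 32, 64):
    print(f"{n:2d}  {cla_steps(n):4d}  {ksa_steps(n):4d}")
```

For the 8-bit case this gives $5\log_2 8 + 1 = 16$ steps for the Kogge-Stone Adder, and the gap to the linear $2N + 3$ count widens as *N* grows.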



**Fig. S20.** Illustration of the 8-bit Kogge-Stone Adder. The addition $1110110_{(2)} + 0100101_{(2)} + 1_{(2)}$ is used as an example, and all bits generated during the calculation are shown.

## **S18.** Scaling-up of Kogge-Stone Adder to *n*-bit ($n \ge 16$) adder



**Fig. S21.** A detailed explanation of the Kogge-Stone Adder and its scaling up to 16 bits. Each gray box executes two MAJ gates, which compute two incomplete bits, while each yellow box executes one MAJ gate, which computes a completed $c_{out}$ bit. The number written in each box indicates the degree of completeness of $c_{out}$. For instance, to calculate $c_5$, execute the MAJ gate using the **'5:2'** purple box bits and the **'1:0'** green box bit (i.e., **5:2** + **1:0** = **5:0**). In summary, $c_{n+1}$ can be obtained when the number in the box reaches n:0.
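The range bookkeeping described for Fig. S21 (for example, **5:2** + **1:0** = **5:0**) can be illustrated with a textbook Kogge-Stone carry computation. Note that this sketch uses the conventional generate/propagate formulation rather than the MAJ-gate cell mapping of this work, and the helper names `to_bits` and `kogge_stone_carries` are assumptions made for illustration. Each node tracks the span `hi:lo` it currently covers, and a carry is complete once its span reaches `i:0`.

```python
def to_bits(value, width):
    """Return the lowest `width` bits of `value`, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def kogge_stone_carries(a_bits, b_bits, c_in=0):
    """Return carries c_1..c_N for LSB-first bit lists (textbook G/P form)."""
    n = len(a_bits)
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate signals
    p = [a | b for a, b in zip(a_bits, b_bits)]  # propagate signals
    span = [(i, i) for i in range(n)]            # node i covers bits hi:lo
    dist = 1
    while dist < n:                              # ceil(log2 N) rounds
        new_g, new_p, new_span = g[:], p[:], span[:]
        for i in range(dist, n):
            j = i - dist
            # Merge the span of node i with the adjacent lower span of node j,
            # e.g. 5:2 merged with 1:0 gives 5:0.
            new_g[i] = g[i] | (p[i] & g[j])
            new_p[i] = p[i] & p[j]
            new_span[i] = (span[i][0], span[j][1])
        g, p, span = new_g, new_p, new_span
        dist *= 2
    # After the last round, every node covers the full range i:0.
    assert all(s == (i, 0) for i, s in enumerate(span))
    return [g[i] | (p[i] & c_in) for i in range(n)]


# Usage: carries of a 4-bit addition with a carry-in of 1
carries = kogge_stone_carries(to_bits(0b1011, 4), to_bits(0b0110, 4), c_in=1)
print("c_1..c_4 =", carries)
```

All merges within one round are independent of each other, which is why the number of rounds grows only logarithmically with the bit width.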

## **S19.** A comparison of energy consumption in 64-bit adder between NOR-based and MAJ-based logic system



**Fig. S22.** The comparison of energy consumption in a 64-bit adder between a NOR-based and a MAJ-based logic system. The NOR-based logic system requires about 24 nJ, while the MAJ-based logic system requires about 27 nJ. The numbers of set-switching events, reset-switching events, and RVC uses were all taken into account.

Energy consumption during logic operations is heavily influenced by the number of switching events in the memristor cells, which can vary depending on the arrangement of input values. Therefore, we took an example of a 64-bit addition operation as shown below, where 0s and 1s are evenly distributed, to compare the energy consumption of a NOR-based logic system and our MAJ-based logic system.

 $+ 011011001001101100100110110010011011001001101100100110110010011_{(2)}$ 

Before the comparison, we assumed that the energy consumption of the RVC circuit used in this study is 0.35 % of the set switching energy, as calculated in Fig. S15. Moreover, reset switching requires more energy than set switching because the former starts from a much higher current, while the latter starts from almost zero. We therefore assumed that the reset switching energy is twice the set switching energy. Thus, set switching, reset switching, and the RVC circuit require 70 pJ, 140 pJ, and 0.25 pJ per event, respectively. We then counted the set-switching and reset-switching events during the adder operation. In our MAJ-based logic system, the total number of set-switching events is 387, and the RVC circuit is used to fetch data 610 times. In the NOR-based logic system, there are 281 set-switching events and 31 reset-switching events. In summary, our MAJ-based logic system requires about 27 nJ, while the NOR-based logic system requires about 24 nJ.
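The totals above follow directly from the event counts and the per-event energies. The sketch below assumes that event types not listed in the text (reset switching in the MAJ-based system, RVC use in the NOR-based system) contribute zero events; this is an assumption made for the example, not a statement from the measurements.

```python
# Per-event energies stated in the text (pJ)
E_SET_PJ = 70.0
E_RESET_PJ = 140.0   # assumed to be twice the set switching energy
E_RVC_PJ = 0.25      # about 0.35 % of the set switching energy


def total_energy_nj(n_set, n_reset, n_rvc):
    """Total energy in nJ for the given numbers of events."""
    total_pj = n_set * E_SET_PJ + n_reset * E_RESET_PJ + n_rvc * E_RVC_PJ
    return total_pj / 1000.0


# Event counts from the text; unlisted event types are assumed to be zero.
maj_nj = total_energy_nj(n_set=387, n_reset=0, n_rvc=610)
nor_nj = total_energy_nj(n_set=281, n_reset=31, n_rvc=0)

print(f"MAJ-based: {maj_nj:.2f} nJ")
print(f"NOR-based: {nor_nj:.2f} nJ")
```

Under these assumptions the RVC circuit accounts for well under 1 % of the MAJ-based total, so the difference between the two systems comes almost entirely from the switching events.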

This result suggests that the number of switching events required to obtain the result of a logic operation does not differ significantly, regardless of the gate used, leading to similar overall energy consumption. Meanwhile, our MAJ-based logic system required only 10 % of the total steps of a 64-bit adder operation compared to the NOR-based logic system. The main advantage of the Parallel-Prefix Adder is its extremely fast speed, achieved at the cost of a slight increase in energy consumption.

## References

- 1 Y. S. Kim, M. W. Son, H. Song, J. Park, J. An, J. B. Jeon, G. Y. Kim, S. Son and K. M. Kim, *Adv. Intell. Syst.*, 2020, **2**, 1900156.
- 2 Y. Liu, H. Tian, F. Wu, A. Liu, Y. Li, H. Sun, M. Lanza and T.-L. Ren, *Nat. Commun.*, 2023, **14**, 2695.
- 3 K. M. Kim, N. Xu, X. Shao, K. J. Yoon, H. J. Kim, R. S. Williams and C. S. Hwang, *Phys. Status Solidi-Rapid Res. Lett.*, 2019, **13**, 1800629.
- 4 G. Kim, S. Son, H. Song, J. B. Jeon, J. Lee, W. H. Cheong, S. Choi and K. M. Kim, *Adv. Sci.*, 2023, **10**, 2205654.
- 5 D. H. Kim, Y. S. Kim, W. H. Cheong, H. Song, H. Rhee, S. N. Kay, J.-W. Han and K. M. Kim, *Adv. Intell. Syst.*, 2022, **4**, 2200058.
- 6 P. Huang, J. Kang, Y. Zhao, S. Chen, R. Han, Z. Zhou, Z. Chen, W. Ma, M. Li, L. Liu and X. Liu, *Adv. Mater.*, 2016, **28**, 9758.
- 7 L. Cheng, Y. Li, K. S. Yin, S. Y. Hu, Y. T. Su, M. M. Jin, Z. R. Wang, T. C. Chang and X. S. Miao, *Adv. Funct. Mater.*, 2019, **29**, 1905660.
- 8 J.-Y. Park, M. Jin, S.-Y. Kim and M. Song, *Electronics*, 2022, **11**, 877.
- 9 S. Chevella, D. O'Hare and I. O'Connell, *IEEE Solid-State Circ. Lett.*, 2020, **3**, 154.
- 10 V. Pudi, K. Sridharan and F. Lombardi, *IEEE Trans. Comput.*, 2017, **66**, 1824.