# Design Exploration of Hybrid CMOS and Memristor Circuit by New Modified Nodal Analysis

Wei Fei, Student Member, IEEE, Hao Yu, Member, IEEE, Wei Zhang, Member, IEEE, and Kiat Seng Yeo, Senior Member, IEEE

This document is downloaded from DR-NTU (https://dr.ntu.edu.sg), Nanyang Technological University, Singapore.

Citation: Fei, W., Yu, H., Zhang, W., & Yeo, K. S. (2011). Design exploration of hybrid CMOS and memristor circuit by new modified nodal analysis. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 20(6), 1012-1025. https://hdl.handle.net/10356/94859 https://doi.org/10.1109/TVLSI.2011.2136443

© 2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/TVLSI.2011.2136443].

*Abstract*: Design of hybrid circuits and systems based on CMOS and nano-devices requires rethinking of fundamental circuit analysis to aid design exploration. Conventional circuit analysis with modified nodal analysis (MNA) cannot consider new nano-devices such as the memristor together with traditional CMOS devices. This paper introduces a new MNA method with magnetic flux $(\Phi)$ as a new state variable. A new SPICE-like circuit simulator is thereby developed for the design of hybrid CMOS and memristor circuits.
A number of CMOS and memristor-based designs are explored, such as an oscillator, a chaotic circuit, programmable logic, an analog-learning circuit, and a crossbar memory, whose functionality, performance, reliability, and power can be efficiently verified by the newly developed simulator. Specifically, a new 3-D crossbar architecture with diode-added memristors is also proposed to improve integration density and to avoid sneak paths during read-write operations.

*Index Terms*: 3-D crossbar memory, memristor, nano-scale circuit simulation.

W. Fei, H. Yu, and K. S. Yeo are with the School of Electrical and Electronic Engineering, Nanyang Technological University, 639798 Singapore (e-mail: haoyu@ntu.edu.sg). W. Zhang is with the School of Computer Engineering, Nanyang Technological University, 639798 Singapore.

#### I. INTRODUCTION

In order to extend Moore's law, many new devices have recently been created at the nano-scale. Scaling to the nano-scale also led to the discovery of the fourth circuit element [1]–[3], the memristor, which could not be observed at the traditional scale. Theoretically, the discovery of the memristor resolves a mystery predicted almost 40 years ago [1], [2] concerning the link between flux and charge in circuit theory. Practically, the successful fabrication of the memristor [3] might provide new approaches to designing high-performance circuits and systems, such as oscillators and memories, at the nano-scale. In order to deal with a design composed of a large number of memristors together with traditional devices such as CMOS, this new element needs to be included in a circuit simulator like SPICE [4].

Traditional nodal analysis (NA) only contains nodal voltages $(v_n)$ at the terminals of devices. Since an inductor is a short at dc and its two terminal voltages become dependent, the state matrix is indefinite at dc.
This problem is resolved by modified nodal analysis (MNA) [4]–[6], which modifies the NA by adding branch currents $(j_b)$ as state variables. However, many nontraditional devices introduced at the nano-scale (memristors, spin-torque-transfer devices, etc.) have to be described by state variables other than the traditional nodal voltages and branch currents. As such, the conventional circuit formulation in MNA may not be able to include these new nano-devices. For example, the fundamental device branch equation, or branch constitutive equation (BCE), of a memristor relates the change of charge to the change of magnetic flux. This relation requires an explicit deployment of magnetic flux as the state variable.

The magnetic flux has been explored as a state variable before. In [7], by replacing the inductive branch current with the magnetic flux as the state variable, a new MNA formulation is derived to stamp the inverse of the inductance matrix, called the susceptance matrix. In this paper, using the magnetic flux as the new state variable, we derive a new MNA formulation to stamp the memristor together with other traditional devices. A new SPICE-like circuit simulator is also developed for simulations of large-scale hybrid designs of CMOS and memristor. Compared with the equivalent-circuit-based approach [8], our implementation of the memristor model in a SPICE-like simulator has an explicit dependence on its geometry and process parameters, which is more scalable and flexible for process migration.

Note that the memristor is also promising for a wide range of applications in new circuit designs. Its negative differential resistance can lead to potential applications in oscillator and chaotic circuit design. Moreover, its nonlinear behavior also fits well with the requirements of a resistive crossbar, and hence the memristor has been extensively applied in crossbar-based designs [9]–[13].
Another natural application of the memristor is in neuromorphic systems [11], [14]–[17]. However, due to the lack of a suitable circuit simulator, all the aforementioned applications are currently designed at very limited size. The challenges faced when integrating with traditional CMOS devices remain unsolved. With the aid of the SPICE-like simulator for memristors developed in this paper, we demonstrate a number of hybrid CMOS and memristor circuit examples with efficient verification of functionality, performance, reliability, and power consumption.

A memristor-based crossbar architecture for memory design is also explored in this paper. The fundamental crossbar structure consists of horizontal and vertical nanowires with their crossing points configured as electronic devices. Resistors, diodes, and transistors have been implemented as the crossing-point junctions to present bistable states [18]–[24]. Compared with active crossbars, resistive crossbars have the advantage of higher density, simpler structure, and easier fabrication [18], [20]. The resistive crossbar utilizes a bistable resistive material with hysteretic *I-V* behavior to represent different states. Memristors can thereby be applied under this scheme, and hence have been extensively investigated for crossbar-based memory design [9].

One major limitation of the resistive crossbar is its high leakage current through sneak paths. That is, besides the desired path through the memory cell, where the current ought to flow during the writing and reading process, there are also many sneak paths through other junctions in the memory array and through the junctions of the demuxes [9]. This results in great degradation in both performance and power efficiency. These sneak paths are unavoidable unless elements such as diodes can be integrated with the memristors.
Recent research has made it possible to fabricate a diode-added component together with the memristor without affecting the performance [13]. However, this feature has not yet been studied for crossbar-based memory design. In this paper, with the use of the newly developed circuit simulator, the performance improvement of memory design can be analyzed for diode-added memristors. Moreover, existing crossbar-based memory design is mainly based on 2-D integration, whose integration density is low compared to 3-D integration. The existing crossbar-based memory with 3-D integration, however, requires either a large peripheral area [25], [26] or many CMOS layers [27]. As such, we also propose a new crossbar-based memory using diode-added memristors with 3-D integration, which needs only one CMOS stack and hence greatly increases the device density.

The contributions of this paper are summarized as follows.

- A new MNA using magnetic flux as the state variable is developed within a SPICE-like simulator to verify the design of hybrid CMOS and memristor circuits.
- A new 3-D crossbar-based memory architecture using diode-added memristors is proposed to improve integration density and reduce sneak-path power consumption.

The rest of this paper is organized as follows. In Section II, the background of circuit theory and the traditional MNA are reviewed. Then, the derivation of the new MNA for the memristor and the corresponding SPICE-like circuit simulation are presented. Two different device models for memristors are analyzed in Section III, with analytical formulas for the memristive power. In Section IV, a low-power and high-density 3-D crossbar-based memory is presented and analyzed. Experimental results are presented in Section V, and the paper is concluded in Section VI.

#### II. MEMRISTOR CIRCUIT SIMULATION

In this section, the new MNA formulation that includes the memristor is derived within the SPICE-like simulator.
All the variables used in this section are summarized in Table I.

TABLE I: DEFINITIONS OF VARIABLES USED FOR MEMRISTOR CIRCUIT SIMULATOR

| Variables | Definition |
|-----------|------------|
| $E$ | incidence matrix defined in Section II-A ($[E_c \ E_g \ E_l \ E_m \ E_i]$ describe the topological connections of capacitive, conductive, inductive, memductor, and voltage-source elements) |
| $v_n, v_b, j_b$ | nodal voltage, branch voltage, and branch current |
| $j_i, j_l, j_m$ | source, inductive, and flux branch current |
| $\Phi_n, \Phi_b$ | nodal and branch flux |
| $M, W$ | memristance and memductance |
| $S, G, C$ | susceptance, conductance, and capacitance |
| $h_k$ | $k$-th time-step from $t_{k-1}$ to $t_k = t_{k-1} + h_k$ |
| $\alpha_{k_i}, \beta_{k_i}$ | integration coefficients of $i$-th order backward-differential-formula (BDF) at $k$-th time-step |
| $\gamma_k$ | $\alpha_{k_0}/h_k$ |
| $r_k$ | remainder in equation (13), containing previously calculated $X$ and $\dot{X}$ |
| $\varepsilon_{\dot{q}}$ | estimated error of $\dot{q}$ $(dq/dt)$ |

#### A. Background

Kirchhoff's Current Law (KCL) and Kirchhoff's Voltage Law (KVL) are the two fundamental equations governing the electrical properties of a circuit [6]. These two laws can be compactly formulated by an incidence matrix determined by the topology of the circuit. Assuming $n$ nodes and $b$ branches, the incidence matrix $E \ (\in R^{n\times b})$ is defined by

$$e_{i,j} = \begin{cases} 1, & \text{if branch } j \text{ flows into node } i \\ -1, & \text{if branch } j \text{ flows out of node } i \\ 0, & \text{if branch } j \text{ is not included at node } i. \end{cases}$$

By further denoting the branch currents as $j_b$, the branch voltages as $v_b$, and the nodal voltages as $v_n$, KCL and KVL can be described by

$$\text{KCL: } E j_b = 0 \quad \text{KVL: } E^T v_n = v_b. \tag{1}$$

*Modified Nodal Analysis:* Ideally, the branch current vector is a function that depends purely on the nodal voltages through the device branch equation

$$j_b = \frac{d}{dt}q(E^Tv_n, t) + j(E^Tv_n, t).$$

However, since the inductor and voltage-source branches become indefinite at dc when only the nodal voltages are used (NA), the MNA breaks the branch current vector into four pieces with four corresponding incidence matrices, and deploys the branch inductive current $j_l$ and the branch source current $j_i$ as new state variables. As such, the KCL and KVL in (1) become

$$\begin{aligned} \frac{d}{dt}E_cq(E_c^Tv_n,t) + E_gj(E_g^Tv_n,t) + E_lj_l + E_ij_i &= 0 \\ \frac{d}{dt}\Phi(j_l) - E_l^Tv_n &= 0 \\ E_i^Tv_n &= 0. \end{aligned} \tag{2}$$

Here the four incidence matrices $[E_c \ E_g \ E_l \ E_i]$ describe the topological connections of capacitive, conductive, inductive, and voltage-source elements. Introducing the state variable $x = [v_n, j_l, j_i]^T$, the above MNA formulation can be written compactly as the following differential-algebraic equation (DAE):

$$\mathcal{F}(x,\dot{x},t) = \frac{d}{dt}q(x,t) + j(x,t) = 0. \tag{3}$$

*Memristor Branch Equation:* A memristor by definition is a linkage between charge and flux. For a charge-controlled memristor

$$M(q) = d\Phi(q)/dq \tag{4}$$

the device branch equation is given by

$$v_b = M(q(t))j_b. \tag{5}$$

For a flux-controlled memristor, also called a memductor,

$$W(\Phi) = dq(\Phi)/d\Phi \tag{6}$$

the device branch equation is given by

$$j_b = W(\Phi(t))v_b. \tag{7}$$

Since the value of a memristor or memductor depends on charge or flux, its terminal voltage or current depends on the complete history.
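To make this history dependence concrete, the following is a minimal Python sketch of the flux-controlled branch equation (7). It assumes a hypothetical two-level memductance `memductance(phi)` with illustrative values for `k1`, `k2`, and `delta`; these names and numbers are assumptions chosen for the example, not part of the simulator described in this paper. The flux is accumulated as the time integral of the applied voltage, and the branch current is then read off as $W(\Phi)v$.

```python
import math

def memductance(phi, k1=0.5e-5, k2=1e-5, delta=7.96e-10):
    """Hypothetical two-level memductance W(phi) in siemens (illustrative values)."""
    return k1 if abs(phi) < delta else k2

def simulate(v_amp=1.0, omega=2 * math.pi * 1e8, steps=1200):
    """Drive the memductor with v(t) = v_amp*sin(omega*t) for half a period.

    Returns a list of (t, v, phi, j) samples, where phi is the running time
    integral of v (trapezoidal rule) and j = W(phi) * v as in equation (7).
    """
    dt = (math.pi / omega) / steps
    phi, v_prev = 0.0, 0.0
    samples = [(0.0, 0.0, 0.0, 0.0)]
    for n in range(1, steps + 1):
        t = n * dt
        v = v_amp * math.sin(omega * t)
        phi += 0.5 * (v + v_prev) * dt  # flux stores the history of v
        samples.append((t, v, phi, memductance(phi) * v))
        v_prev = v
    return samples

samples = simulate()
rising = samples[200]    # omega*t = pi/6, voltage about 0.5 V on the way up
falling = samples[1000]  # omega*t = 5*pi/6, same voltage on the way down
print("rising : v = %.3f V, j = %.3e A" % (rising[1], rising[3]))
print("falling: v = %.3f V, j = %.3e A" % (falling[1], falling[3]))
```

Both samples see the same terminal voltage, but under the assumed values the accumulated flux has crossed `delta` by the second sample, so the current is `k2 * v` instead of `k1 * v`. The branch current is therefore not a function of the instantaneous voltage alone.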
As a result, there can be many nontraditional switching phenomena in nano-scale devices, such as current-voltage anomalies in switching with a hysteretic conductance, multiple-state conductances, and the commonly observed "negative differential resistance". With the concept of the memristor or memductor, a range of nontraditional electrical switching phenomena at the nano-scale can now be explained in a simple manner. Note that the reason this new circuit element has not yet been widely adopted is mainly that, in micro-scale chips, the value of the memristance is too small to be observed. The two-terminal memristor device model in [3] shows that the magnitude of the memristance grows inversely proportional to the device area.

#### B. New MNA for Memristor Simulation

The terminal voltage of a memristor depends on the complete history when branch currents are assumed as the state variables. As such, memristors cannot be easily deployed together with other devices in the traditional MNA formulation. This section first shows that the magnetic flux $\Phi$ can be used as the state variable to replace the inductive and flux branch current variables $j_l$ and $j_m$. This leads to a new MNA formulation for both the inverse of the memristor element, called the memductor, and the inverse of the inductor, called the susceptor. Moreover, a corresponding transient analysis is presented using a backward-differential-formula (BDF) integrated with the local-truncation-error (LTE) check.

*MNA for Memristor:* We first break the incidence matrix into five pieces, with the additional one $(E_m)$ for the memductor branches. Similarly to the nodal voltage $v_n$, by introducing a nodal flux $\Phi_n$, (2) becomes

$$\frac{d}{dt}E_{c}q(E_{c}^{T}v_{n},t) + \frac{d}{dt}E_{m}q(E_{m}^{T}\Phi_{n},t) + E_{g}j(E_{g}^{T}v_{n},t) + E_{l}j(E_{l}^{T}\Phi_{n},t) + E_{i}j_{i} = 0, \quad E_{i}^{T}v_{n} = 0. \tag{8}$$

Defining a new state variable vector

$$X = [v_n, \Phi_n, j_i]^T \tag{9}$$

the above new MNA can still be described by the same differential-algebraic equation as in (3).

Let us further derive the Jacobian, or the generalized conductance, capacitance, susceptance, and memductance, of the DAE. At one biasing point $X_0$, the first-order derivative (Jacobian) of the nonlinear equation in (8) with respect to $X$ is given by

$$\begin{aligned} \mathcal{G} &= \left( E_g \frac{d}{dv_b^g} j\left(v_b^g, t\right) E_g^T \right) \Big|_{X=X_0}, \quad v_b^g = E_g^T v_n, \\ \mathcal{C} &= \left( E_c \frac{d}{dv_b^c} q\left(v_b^c, t\right) E_c^T \right) \Big|_{X=X_0}, \quad v_b^c = E_c^T v_n, \\ \mathcal{S} &= \left( E_l \frac{d}{d\Phi_b^l} j\left(\Phi_b^l, t\right) E_l^T \right) \Big|_{X=X_0}, \quad \Phi_b^l = E_l^T \Phi_n, \\ \mathcal{W} &= \left( E_m \frac{d}{d\Phi_b^m} q\left(\Phi_b^m, t\right) E_m^T \right) \Big|_{X=X_0}, \quad \Phi_b^m = E_m^T \Phi_n. \end{aligned} \tag{10}$$

As such, the linearized DAE becomes

$$\mathcal{G} \cdot \delta v_n + \mathcal{C} \cdot \delta \dot{v}_n + \mathcal{S} \cdot \delta \Phi_n + \mathcal{W} \cdot \delta \dot{\Phi}_n + E_i \cdot \delta j_i = -\mathcal{F}(X_0, \dot{X}_0, t), \quad E_i^T \cdot \delta v_n = 0. \tag{11}$$

Note that there is an additional constraint between the magnetic flux and the voltage through Faraday's law

$$E_l^T \dot{\Phi}_n = E_l^T v_n.$$

As a result, we have the following linearized first-order system equation:

$$\begin{bmatrix} \mathcal{G} & \mathcal{S} & E_{i} \\ -I & 0 & 0 \\ -E_{i}^{T} & 0 & 0 \end{bmatrix} \begin{pmatrix} \delta v_{n} \\ \delta \Phi_{n} \\ \delta j_{i} \end{pmatrix} + \begin{bmatrix} \mathcal{C} & \mathcal{W} & 0 \\ 0 & I & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{pmatrix} \delta \dot{v}_{n} \\ \delta \dot{\Phi}_{n} \\ \delta \dot{j}_{i} \end{pmatrix} = -\mathcal{F}(X_{0}, \dot{X}_{0}, t). \tag{12}$$

Such a state matrix not only integrates the memductor together with the other device elements but also results in a stamping of the inverse inductance matrix $\mathcal{S}$, which is diagonally dominant and can be stably sparsified [7].

*Transient Analysis of Memristor Circuit:* The DAE can be numerically integrated at discrete time points $t_1, t_2, \ldots$ by BDF [6], [28]. For the $k$th time-step $h_k$ from $t_{k-1}$ to $t_k = t_{k-1} + h_k$, the time derivative of charge $dq/dt$ in (8) at the time point $t_k$ is approximated by the $p$th-order BDF

$$\dot{q}(X_k) = \frac{1}{h_k} \sum_{i=0}^{p} \alpha_{k_i} q(X_{k-i}) - \sum_{i=1}^{p} \beta_{k_i} \dot{q}(X_{k-i}) = \gamma_k q(X_k) + r_k \tag{13}$$

where $\gamma_k = \alpha_{k_0}/h_k$ and $r_k$ contains the previously calculated $X$ and $\dot{X}$. Note that $\alpha_{k_i}$ and $\beta_{k_i}$ are the integration coefficients of the $i$th-order BDF at the $k$th time-step. As a result, the numerical solution of the DAE (3) is reduced to solving the nonlinear equation

$$\mathcal{F}(\gamma_k q(X_k) + r_k, X_k, t_k) = 0 \tag{14}$$

which is solved iteratively by Newton's method with the Jacobian calculated in (10).
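As an illustration of (13) and (14), the sketch below applies the first-order BDF (backward Euler, $\gamma_k = 1/h_k$) with a Newton iteration to a deliberately tiny circuit: one node with a capacitor and a flux-controlled memductor to ground, fed by a current source. The state is $X = [v, \Phi]$. Everything specific here is an assumption made for the example: the circuit itself, the component values, the piecewise-linear $q(\Phi)$, the tolerances, and the function names `q_mem`, `w_mem`, `i_src`, `be_step`, and `simulate`. It is not the simulator implementation described in this paper.

```python
import math

C = 1e-15               # node capacitance in farads (assumed)
K1, K2 = 0.5e-5, 1e-5   # memductance levels in siemens (assumed)
DELTA = 7.96e-10        # flux threshold in webers (assumed)

def q_mem(phi):
    """Memductor charge q(phi): piecewise linear, slope K1 for |phi| < DELTA."""
    return K2 * phi + 0.5 * (K1 - K2) * (abs(phi + DELTA) - abs(phi - DELTA))

def w_mem(phi):
    """Memductance W(phi) = dq/dphi."""
    return K1 if abs(phi) < DELTA else K2

def i_src(t, amp=1e-5, omega=2 * math.pi * 1e8):
    """Hypothetical sinusoidal current source feeding the node."""
    return amp * math.sin(omega * t)

def be_step(v_prev, phi_prev, t_k, h, max_iter=50):
    """One backward-Euler step for X = [v, phi], solved by Newton's method.

    Residuals with gamma = 1/h:
      f1 = gamma*(C*v + q_mem(phi) - C*v_prev - q_mem(phi_prev)) - i_src(t_k)
      f2 = gamma*(phi - phi_prev) - v        (Faraday's law)
    """
    gamma = 1.0 / h
    q_prev = C * v_prev + q_mem(phi_prev)
    v, phi = v_prev, phi_prev  # predictor: previous solution
    for _ in range(max_iter):
        f1 = gamma * (C * v + q_mem(phi) - q_prev) - i_src(t_k)
        f2 = gamma * (phi - phi_prev) - v
        # Jacobian [[gamma*C, gamma*W], [-1, gamma]]; solve J*dX = -F by Cramer's rule.
        a, b, c, d = gamma * C, gamma * w_mem(phi), -1.0, gamma
        det = a * d - b * c
        dv = (-f1 * d + b * f2) / det
        dphi = (-a * f2 + c * f1) / det
        v += dv
        phi += dphi
        # Separate absolute/relative tolerances for voltage and flux.
        if abs(dv) <= 1e-9 + 1e-6 * abs(v) and abs(dphi) <= 1e-18 + 1e-6 * abs(phi):
            return v, phi
    raise RuntimeError("Newton iteration did not converge")

def simulate(n_steps=500, h=1e-11):
    """Integrate half a source period; also accumulate the injected charge."""
    v, phi, injected = 0.0, 0.0, 0.0
    for k in range(1, n_steps + 1):
        t_k = k * h
        v, phi = be_step(v, phi, t_k, h)
        injected += h * i_src(t_k)
    return v, phi, injected

v_end, phi_end, q_injected = simulate()
print(v_end, phi_end, C * v_end + q_mem(phi_end), q_injected)
```

The 2-by-2 Jacobian in `be_step` has the same block pattern as the flux-based formulation: a capacitive term and a memductance term scaled by $\gamma_k$ in the KCL row, and the Faraday constraint in the second row. A useful sanity check for such a scheme is that the charge stored on the node, `C*v + q_mem(phi)`, tracks the charge injected by the source, since backward Euler enforces this balance at every step.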
Starting from a predictor $X_k^{(0)}$, for example $X_{k-1}$, the correction $\delta X_k^{(l)} = X_k^{(l)} - X_k^{(l-1)}$ at the $l$th iteration is calculated from the linearized equation (12), which has the following form under BDF:

$$\begin{bmatrix} \mathcal{G}_{k}^{(l-1)} + \gamma_{k} \mathcal{C}_{k}^{(l-1)} & \mathcal{S}_{k}^{(l-1)} + \gamma_{k} \mathcal{W}_{k}^{(l-1)} & E_{i} \\ -\mathcal{S}_{k}^{(l-1)} & \gamma_{k} \mathcal{S}_{k}^{(l-1)} & 0 \\ -E_{i}^{T} & 0 & 0 \end{bmatrix} \cdot \delta X_k^{(l)} = -\mathcal{F}\left(\gamma_{k} q_k^{(l-1)} + r_{k}, X_{k}^{(l-1)}, t_{k}\right) \tag{15}$$

where

$$\begin{split} q_k^{(l-1)} &= q\left(X_k^{(l-1)}\right), \\ \mathcal{G}_k^{(l-1)} &= \mathcal{G}\left(X_k^{(l-1)}\right), \quad \mathcal{C}_k^{(l-1)} = \mathcal{C}\left(X_k^{(l-1)}\right), \\ \mathcal{S}_k^{(l-1)} &= \mathcal{S}\left(X_k^{(l-1)}\right), \quad \mathcal{W}_k^{(l-1)} = \mathcal{W}\left(X_k^{(l-1)}\right). \end{split} \tag{16}$$

The Newton iteration is considered converged once the correction $\|\delta X_k^{(l)}\|$ satisfies the error bound set by the relative tolerance and the absolute tolerance for $v_n$, $\Phi_n$, and $j_i$, respectively. Moreover, in order to have an adaptive time-step control and a robust convergence, the LTE check needs to be implemented. For example, for a first-order BDF (backward Euler), the estimated error of $\dot{q}$ $(dq/dt)$ is given by

$$\varepsilon_{\dot{q}} = \frac{1}{2} h_{k+1} \mathbf{DD2}(q(X_k))$$

where $\mathbf{DD2}$ is the second-order divided difference. As such, the time-step $h_{k+1}$ is chosen so that this estimated error is bounded by a specified value $\varepsilon_{\mathrm{trtol}}$. Recall that there are two contributions to $q(X_k)$. One is from the capacitive charge $q^c = q(v_b^c)$ and the other is from the flux charge $q^m = q(\Phi_b^m)$.

Thereby, the transient simulation of a memristor circuit is summarized in the following steps.

- 1) Characterize the device branch function $q(\Phi_b^m)$; the memductance $W$ is then given by (6).
- 2) Form a new MNA by (12) with the memductance matrix $\mathcal{W} = E_m W E_m^T$.
- 3) Solve $\Phi_n(t)$ and $v_n(t)$ from (15) and obtain $\Phi_b^m(t)$, $v_b^m(t)$, and $q(\Phi_b^m, t)$.

The runtime of a SPICE-like simulator usually consists of three parts: device evaluation, matrix solution, and DAE integration. For small CMOS-memristor circuits, the runtime is mainly dominated by device evaluation and DAE integration. For large CMOS-memristor circuits, the runtime is mainly dominated by the matrix solution. Note that our new MNA formulation still enables a sparse matrix formulation. Moreover, under the MNA formulation with flux, the new MNA can even have a sparse representation for inductors, as shown in [7]. For a sparse matrix, the complexity is $O(n^{\alpha})$ $(1 < \alpha < 2)$, which is mainly determined by the fill-ins created during the LU factorization. Usually, by selecting a proper sparse-matrix pre-ordering for LU, the complexity can be significantly reduced. For example, the column-based AMD pre-ordering is deployed in the current implementation.

#### III. MEMRISTOR DEVICE MODEL AND POWER

In order to design memristor circuits within a SPICE-like circuit simulator, two parts are required: 1) the new modified nodal analysis and 2) a memristor device model. Since the recent rediscovery of the memristor, different device models for both memristors and memristive systems have been developed [3], [8]. In this paper, two different models are analyzed and used for memristive circuit design. These designs are later verified in our simulator in further detail.

TABLE II: DEFINITIONS OF VARIABLES USED FOR MEMRISTOR DEVICE MODELING

| Variables | Definition |
|-----------|------------|
| $\Phi$ | magnetic flux |
| $k_1, k_2$ | slopes of the $q$-$\Phi$ curve (memductance) |
| $\delta$ | value of $\Phi$ at which the memductance changes in Fig. 1 |
| $V_o \sin \omega t$ | a sine input with amplitude $V_o$ and angular frequency $\omega$ |
| $\Phi_0$ | initial $\Phi$ value |
| $A_r, A_f$ | areas under the rising and falling parts of the hysteresis curve, as indicated in Figs. 2 and 5 |
| $A_{enclose}$ | area enclosed by the two curves $(A_f - A_r)$, as indicated in Figs. 2 and 5 |
| $R_{on}, R_{off}$ | ON-state resistance and OFF-state resistance of the memristor |
| $D$ | memristor length |
| $\mu_v$ | average ion mobility in the memristor |
| $w(t)$ | boundary between the doped and undoped regions of the memristor, with value ranging from 0 to $D$, indicating the variation of the memristance between $R_{on}$ and $R_{off}$ |
| $M, W$ | memristance and memductance |
| $a$ | $\mu_v R_{on}(R_{off}-R_{on})/D^2$, used to simplify the equations |
| $M_0 \sim M_2$ | memristance at different stages on the hysteresis curve, as indicated in Fig. 5. $M_0$: memristance when the voltage rises from the origin; $M_1$: memristance when the voltage rises to the maximum point; $M_2$: memristance when the voltage drops back to the origin |

Since the memristor defines a relationship between the change of charge and the change of magnetic flux, the $q$-controlled memristor and the $\Phi$-controlled memristor can be transformed into each other mathematically. Because the magnetic flux $(\Phi)$ is used as the state variable, both models are first transformed into $\Phi$-controlled memristors. Moreover, in this paper, we define the *memristive power*, which is the I-V area under the hysteresis curve consumed by the memristor. The analytical power formula can be applied during circuit design exploration when power is a concern.
Note that the term "memristor" is used in the rest of the paper for simplicity of presentation, although it can also mean other memristive systems. In addition, all the variables used in this section are summarized in Table II.

#### A. Piecewise Linear Model

*Device Model and I-V Relation:* As shown in Fig. 1, this model describes an ideal case where the slope of $q(\Phi)$ of the memristor jumps between two constant values when $\Phi$ changes. Its $q$-$\Phi$ relation

$$q(\Phi) = k_2 \Phi + 0.5(k_1 - k_2)(|\Phi + \delta| - |\Phi - \delta|)$$

and the corresponding memductance ($W$)

$$W(\Phi) = \frac{dq}{d\Phi} = \begin{cases} k_1, & \text{if } |\Phi| < \delta \\ k_2, & \text{if } |\Phi| > \delta \end{cases}$$

can be given, respectively.

Fig. 1. $q$-$\Phi$ relation of piecewise linear model.

Since the memductance is directly defined once $\Phi$ is given, the memristor behavior is affected by the initial condition of $\Phi$. Different I-V curves can thereby appear for different $\Phi(0)$. The plot in Fig. 2 is drawn for $\Phi(0) = 0$ with a sine input $(V_o \sin \omega t)$, where the hysteresis curve appears, i.e., the current path is different when the voltage changes from different directions.

Fig. 2. I-V relation for piecewise linear model with a sine input $(V_o \sin \omega t)$ with the initial flux $\Phi(0)=0$. Parameters used here are: $V_o=1$ V, $\omega=2\pi\times 1\text{e}8$ rad/s, $k_1=0.5\text{e}{-5}\ \Omega^{-1}$, $k_2=1\text{e}{-5}\ \Omega^{-1}$, $\delta=V_o/(2\omega)=7.96\text{e}{-10}$ Wb.

Moreover, in order to show the hysteresis behavior in Fig. 2, two conditions are needed:

$$-\delta < \Phi_0 < \delta \tag{17}$$

$$\delta - \Phi_0 < \frac{V_o}{\omega} \tag{18}$$

where (17) sets the initial memductance to be $k_1$, and (18) ensures that the threshold is reached during the rising period.

*Memristive Power:* Recall that the memristive power is defined as the area under the I-V hysteresis curves.
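Before turning to closed-form expressions, the area defined here can be estimated numerically as a cross-check. The Python sketch below assumes the piecewise linear memductance with the parameter values listed in the Fig. 2 caption and uses the exact flux of a sine input, $\Phi(t) = \Phi_0 + (V_o/\omega)(1-\cos\omega t)$. The function names `memductance` and `iv_area`, the trapezoidal rule, and the sample count are choices made for this example only.

```python
import math

def memductance(phi, k1, k2, delta):
    """Piecewise linear model: W = k1 inside the threshold, k2 outside."""
    return k1 if abs(phi) < delta else k2

def iv_area(t_start, t_end, v_o, omega, phi0, k1, k2, delta, n=20000):
    """Trapezoidal estimate of the area under i(v) between two time points."""
    def point(t):
        v = v_o * math.sin(omega * t)
        # Flux is the exact time integral of the sine input from t = 0.
        phi = phi0 + (v_o / omega) * (1.0 - math.cos(omega * t))
        return v, memductance(phi, k1, k2, delta) * v

    area = 0.0
    v_prev, i_prev = point(t_start)
    for s in range(1, n + 1):
        v, i = point(t_start + (t_end - t_start) * s / n)
        area += 0.5 * (i + i_prev) * abs(v - v_prev)
        v_prev, i_prev = v, i
    return area

# Parameter values taken from the Fig. 2 caption.
V_O = 1.0
OMEGA = 2 * math.pi * 1e8
K1, K2 = 0.5e-5, 1e-5
DELTA = V_O / (2 * OMEGA)
PHI_0 = 0.0

quarter = math.pi / (2 * OMEGA)  # time at which v reaches V_O
a_rise = iv_area(0.0, quarter, V_O, OMEGA, PHI_0, K1, K2, DELTA)
a_fall = iv_area(quarter, 2 * quarter, V_O, OMEGA, PHI_0, K1, K2, DELTA)
a_enclosed = a_fall - a_rise
print(a_rise, a_fall, a_enclosed)
```

The three printed values can be compared against the analytical expressions for $A_r$, $A_f$, and $A_{\text{enclose}}$ derived in the remainder of this subsection. The small residual difference comes from the trapezoidal step that straddles the memductance jump.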
To derive this area analytically, a sine input $V_o \sin \omega t$ with the initial state $\Phi(0) = \Phi_0$ is assumed. By segmenting the rising curve (marked with the arrow) into two parts, one with memductance $k_1$ and the other with $k_2$, the areas under the rising and falling curves can be obtained as

$$\begin{split} A_r &= \frac{(1-\alpha^2)k_1 + \alpha^2 k_2}{2} V_o^2, \quad \alpha = 1 - \frac{\omega(\delta - \Phi_0)}{V_o} \\ A_f &= \frac{k_2 V_o^2}{2}. \end{split}$$

The area enclosed by the hysteresis curve is then

$$A_{\text{enclose}} = A_f - A_r = \frac{(1 - \alpha^2)(k_2 - k_1)}{2} V_o^2$$

which can be used for power exploration.

Note that the piecewise linear model is an ideal model for understanding and predicting memristive behaviors. For instance, we can explore a negative $k_1$ to show the negative-resistance behavior used in oscillator and chaotic circuit design. Some design examples are discussed in Section V-A.

#### B. Square Root Model

*Device Model and I-V Relation:* One realistic memristor model presented by HP Labs [3] is

$$v(t) = \left(R_{\text{on}} \frac{w(t)}{D} + R_{\text{off}} \left(1 - \frac{w(t)}{D}\right)\right) i(t)$$

$$\frac{dw(t)}{dt} = \mu_v \frac{R_{\text{on}}}{D} i(t)$$

where $v(t)$ and $i(t)$ are the memristor's voltage and current, $R_{\text{on}}$ and $R_{\text{off}}$ are the ON-state and OFF-state resistances, $D$ is the device length, and $\mu_v$ is the average ion mobility, respectively. Moreover, $w(t)$ represents the boundary between the doped and undoped regions of the memristor, and its rate of change is controlled by $i(t)$.

Fig. 3. $q$-$\Phi$ relation of square root model.

Assume the following initial conditions: $\Phi(0) = 0$, $q(0) = 0$, $w(0) = w_0$, $M(0) = M_0$, where $M$ represents the memristance. Moreover, define $a = \mu_v R_{\text{on}}(R_{\text{off}} - R_{\text{on}})/D^2$.
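The intermediate steps between the two device equations and a flux-controlled form are short, and a sketch of them is given here. It uses only the symbols and initial conditions introduced above and assumes that the boundary stays strictly inside the device, $0 < w(t) < D$. Integrating the drift equation gives the boundary position as a function of the charge, which makes the memristance linear in $q$:

$$w(t) = w_0 + \mu_v \frac{R_{\text{on}}}{D}\, q(t), \qquad M(q) = R_{\text{off}} - (R_{\text{off}} - R_{\text{on}})\frac{w}{D} = M_0 - a\,q.$$

Since $d\Phi/dq = M(q)$ by (4), integrating from $\Phi(0) = q(0) = 0$ gives

$$\Phi = M_0\, q - \frac{a}{2}\, q^2 \quad \Longleftrightarrow \quad a\,q^2 - 2 M_0\, q + 2\Phi = 0.$$

Of the two roots of this quadratic, only the one with the minus sign in front of the square root satisfies $q = 0$ at $\Phi = 0$.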
As such, one can derive the following $q$-$\Phi$ relation:

$$q(t) = \frac{M_0 - \sqrt{M_0^2 - 2a\Phi(t)}}{a}$$

and the corresponding memductance ($W$)

$$W(\Phi(t)) = \frac{dq(t)}{d\Phi(t)} = \frac{1}{\sqrt{M_0^2 - 2a\Phi(t)}}.$$

Note that this is a square-root relation between $q$ and $\Phi$. Taking the boundary conditions into consideration, the above formula can be modified as

$$W(\Phi) = \frac{dq}{d\Phi} = \begin{cases} \frac{1}{R_{\text{off}}}, & \text{if } \Phi < \frac{M_0^2 - R_{\text{off}}^2}{2a} \\ \frac{1}{\sqrt{M_0^2 - 2a\Phi}}, & \text{if } \frac{M_0^2 - R_{\text{off}}^2}{2a} \le \Phi \le \frac{M_0^2 - R_{\text{on}}^2}{2a} \\ \frac{1}{R_{\text{on}}}, & \text{if } \Phi > \frac{M_0^2 - R_{\text{on}}^2}{2a}. \end{cases}$$

Fig. 4. Input voltage and output current for a single memristor of square root model. The parameters for the memristor are set as: $R_{\text{on}} = 3.33\text{e}7\ \Omega$, $R_{\text{off}} = 3.33\text{e}10\ \Omega$, $\mu_v = 2.5\text{e}{-6}\ \text{m}^2\text{s}^{-1}\text{V}^{-1}$, $D = 1\text{e}{-8}$ m.

Fig. 5. I-V hysteresis for a square root model memristor.

To explore the memristive behavior under the square-root model, a voltage source is applied as the input to a memristor. As shown in Fig. 4, the peak output current lags the peak input voltage due to the memristance change. This results in the I-V hysteresis shown in Fig. 5. Note that the input frequency is kept low enough to show the hysteresis.

*Memristive Power:* Similarly, the memristive power is derived analytically as follows. Assume that a sine input $V = V_o \sin \omega t$ keeps the resistance of the memristor within the boundary at all times. The areas under the rising and falling curves indicated by arrows in Fig. 5 can be derived as

$$A_r = \frac{\omega^2}{6a^2} \left( M_0^3 + 2M_1^3 - 3M_0 M_1^2 \right)$$

$$A_f = \frac{\omega^2}{6a^2} \left( M_2^3 + 2M_1^3 - 3M_2 M_1^2 \right)$$

where

$$M_1 = \sqrt{M_0^2 - \frac{2aV_o}{\omega}}$$

and

$$M_2 = \sqrt{M_0^2 - \frac{4aV_o}{\omega}}$$

are the memristances when $V$ rises to $V_o$ and when it returns to 0, respectively. The memristance changes from $M_2$ to $M_1$ and then back to $M_0$ when the input voltage drops into the negative region shown in Fig. 5. The enclosed area can be similarly derived from the hysteresis curve in Fig. 5:

$$A_{\text{enclose}} = A_f - A_r = \frac{\omega^2}{12a^2} (M_0 - M_2)^3.$$

Note that the square-root model can be used for predicting most memristive behaviors, including I-V hysteresis and negative dynamic differential resistance. Therefore, it can be used for evaluating many memristive circuit designs. In this paper, the square-root model is used to build a crossbar, a decoder, an adder, and an amoeba learning circuit, as shown in Sections V-B and V-C.

#### IV. MEMRISTOR CIRCUIT DESIGN

Based on the developed circuit simulator and device model for the memristor, we can further explore the design of hybrid CMOS and memristor circuits. Previous research has shown the potential of the nanowire-based crossbar architecture as the next generation of memory due to its simple structure, high density, large-scale fabrication, and flexible function [29]. Besides the memory application, the crossbar can also be applied in arithmetic processing [30], neuromorphic systems [15], and pattern recognition [10]. In this paper, we focus on the design of the memristive crossbar for memory.

One major drawback of the current resistive crossbar-based memory is the lack of isolation between memory cells. As a result, the presence of sneak paths can severely degrade the performance of the memory and increase power consumption.
As power consumption is the most important metric for memory design, this problem has become the major limitation for the application of resistive crossbar-based memory [31]. Moreover, density is another important feature for memory design, which directly relates to the cost. The current 2-D resistive crossbar-based memory can be extended to 3-D to achieve a higher device density. In order to resolve the aforementioned two issues, in this section, we propose a 3-D crossbar-based memory using the diode-added memristor.

#### A. Diode-Added Memristor-Based Memory

Recent device research has made it possible to fabricate each cross-point with a memristor and a pn-junction connected in series [13]. Though the junction could be modeled as a memristor in series with a diode, the pn-junction does not prevent setting the memristor value in the reverse direction [13]. In this way, the sneak paths can be extensively reduced and a large portion of the power consumption is saved.

Fig. 6 shows the structure of a $4 \times 4$ memristor crossbar, whose read access is controlled by two 4-to-1 switch MUXes connected to the voltage source. Cross-points of the memory crossbar (red circles) can be implemented using either a pure resistive memristor or a diode-added memristor.

Fig. 6. $4 \times 4$ crossbar memory for read-operation.

How our design prevents the sneak paths is illustrated in Fig. 7. Here, only cross-points with an ON-state memristor are shown for visual clarity. The solid blue line indicates the current path to read the cell in the first row and first column. Two possible sneak paths are shown with red dotted lines for the case when pure resistive memristors are used in the memory cells. Each sneak path may be composed of 3, 5, or more (an odd number of) ON-state cells. Their resistances are connected in parallel with the reading path, resulting in not only larger power consumption but also a large performance degradation.

Fig. 7. Sneak path prevention.
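The size of this effect can be gauged with a rough lumped estimate, sketched below in Python. It is not the circuit simulation used in this paper; it rests on several stated assumptions: ideal (zero-resistance) wires, floating unselected rows and columns, every unselected cell in the ON state, and an idealized diode that fully blocks reverse current. Under these assumptions, symmetry reduces the sneak network of an $n \times n$ array to three groups of parallel cells in series. The function names `sneak_resistance` and `read_resistance` are hypothetical, and the resistance values are borrowed from the Fig. 4 caption.

```python
def sneak_resistance(n, r_on):
    """Lumped sneak-path resistance of an n x n resistive crossbar.

    Assumes all unselected cells are ON and unselected lines float, so the
    sneak network is (n-1) cells in parallel, then (n-1)^2 cells in parallel,
    then (n-1) cells in parallel, all connected in series.
    """
    m = n - 1
    return r_on / m + r_on / (m * m) + r_on / m

def read_resistance(r_cell, n, r_on, diode_added):
    """Resistance seen by the read source for the selected cell."""
    if diode_added or n < 2:
        return r_cell  # idealized diode: every sneak path is blocked
    r_sneak = sneak_resistance(n, r_on)
    return r_cell * r_sneak / (r_cell + r_sneak)  # sneak network in parallel

R_ON, R_OFF = 3.33e7, 3.33e10  # values from the Fig. 4 caption
N = 4                          # 4 x 4 array as in Fig. 6

for label, r_cell in (("ON cell ", R_ON), ("OFF cell", R_OFF)):
    plain = read_resistance(r_cell, N, R_ON, diode_added=False)
    diode = read_resistance(r_cell, N, R_ON, diode_added=True)
    print("%s: resistive %.3e ohm, diode-added %.3e ohm" % (label, plain, diode))
```

For `n = 2` the sneak network reduces to three ON cells in series, which matches the shortest sneak path described above. In this simplified picture, a selected OFF cell in a pure resistive array is shunted by a sneak network whose resistance is on the order of `R_ON`, so the read current is dominated by the sneak paths rather than by the state of the cell.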
When diode-added memristors are used for memory cells instead, current can only flow in one direction for a read-operation. As shown in the figure, current can only flow from vertical bars to horizontal bars. Therefore, a sneak path cannot pass through other cells without being blocked by at least one diode. The two cells marked by a pink circle block the previous sneak paths. Adding the diode into the memory cells does not change the memory writing scheme described in [9]. Instead, it only introduces different energy requirements for setting and resetting memristor states [13]. The two designs, based on the diode-added memristor and on the resistive memristor, are compared in terms of functionality and power consumption using our circuit simulator. Detailed results are reported in Section V-C.

However, the new crossbar is no longer strictly "resistive" because of the embedded diode. This leads to some drawbacks. For example, the depletion capacitance of the reverse-biased diode increases the capacitive load and may cause some delay. Moreover, the threshold voltage of the diode can reduce the output voltage swing. Therefore, appropriate sizing and doping are required to minimize these drawbacks. Nevertheless, the superior performance and the large reduction in power consumption brought by the diode-based design motivate us to explore the potential of the diode-added memristor for memory.

Fig. 8. One 3-8-decoder based on bistable diode crossbar.

#### B. Low Power Decoder

Decoders are the essential peripheral circuits to support memory access, and are also important building blocks for memory-based logic. The diode-added memristor can be used to further improve the decoder as follows.

The previous demux-based decoders are usually implemented using the pre-programmed pure resistive memristors [9].
However, due to the nature of the resistive crossbar, the demux functions as a voltage divider, with large power consumption and performance degradation caused by sneak paths. By adding the pn-junction, i.e., the diode, to the memristor, a diode-added memristor crossbar can be developed. Fig. 8 shows one decoder implemented with a bistable diode-added crossbar. Here, ON-state (low-resistance) cross-points are marked with circles, while the other cross-points are in OFF-states (high-resistance). Since the crossbar does not include the inversion function, the address signals (A0–A2) and their complements are needed. The circuit is essentially a lookup table. The ON-state diode-added memristors at the cross-points form an AND-gate in each row. A line is selected and remains at high voltage when its cross-points are all connected to logic "1". On the other hand, one or more ON-state cross-points in each non-selected row connect to logic "0", which acts as a current sink to pull down the non-selected line. The current flow-paths are marked by pink lines. We can see four sub-currents flowing through A0′–A2′ in Fig. 8, and hence the power consumption is large.

In order to save power, a new decoder structure is proposed in Fig. 9. The number of current paths is now halved relative to the previous design. By using pure resistive memristors in the columns for the first input signal, the current from A2 or A2′ can flow into all the rows through the resistive cross-points. Hence the logic "1" in A2 or A2′ can replace the voltage source. In this way, the leaking sub-currents in non-selected lines can be reduced to two. Hence, the new decoder can be used for the write-operation together with the diode-added memristor memory to save total power.

Fig. 9. One low-power 3-8-decoder based on diode-added memristor crossbar.

The performance of the two designs, in terms of functionality and power, is investigated in detail in Section V-B.

#### C. 3-D Crossbar Memory With Diode-Added Memristor

Based on the previously discussed building blocks, we further discuss memory architecture by introducing a 3-D crossbar-based design that uses diode-added memristors.

The pioneering idea of exploring nanoelectronics at the architecture level comes from the work on the CMOS and molecular logic circuit (CMOL) [32]. CMOL adds the nanowire crossbar on top of the CMOS stack, so as to further increase the device density. This hybrid architecture can be used to implement memory, reconfigurable logic, and neuromorphic networks [32]. Fig. 10(a) indicates that the traditional CMOL uses a special pin to reach the top layer of the crossbar. However, fabrication variation may cause this pin to entangle with the bottom layer of the crossbar, and hence may result in missing contacts and defective circuits. To solve this problem, a modified CMOL, called Field Programmable Nanowire Interconnect (FPNI), is developed in [33]. As shown in Fig. 10(b), FPNI uses large nano-pads to contact the CMOS stack, leading to a fabrication with high defect-tolerance. However, the large pad size results in a low device density. Another solution is to introduce a 3D-CMOL with two CMOS stacks and one crossbar layer in between [27]. As shown in Fig. 10(c), each CMOS stack only needs to contact the nearer nanowire layer of the crossbar. However, since the CMOS peripheral area is relatively small in memory design, only one CMOS stack is needed below the nanowire crossbars. In addition, 3-D memory design is also discussed in [25], [26], where multiple layers of nanowire crossbars are fabricated above one CMOS stack to form the 3-D Resistive RAM (RRAM). The crossbars are separated from each other by insulator layers [see Fig. 11(a)]. Nanowires are then contacted with the CMOS stack in a way similar to FPNI, leading to a large peripheral area.
Apart from the limitations of each of the architectures discussed above, pure resistive crossbar-based memory also has a common limitation on the maximum size achievable for implementing one function. The number of sneak paths increases as the memory size rises, causing large crossbar memories to fail in operation. In this paper, we propose a different 3-D crossbar architecture with the use of diode-added memristors [see Fig. 11(b)]. The sneak path is prevented in this design, and the limitation on crossbar size for proper operation is therefore removed. Larger crossbars can be built with a much smaller peripheral area overhead.

Fig. 10. Different architectures for CMOL. (a) Traditional CMOL. (b) FPNI. (c) 3-D CMOL.

Fig. 11. Architecture for 3-D crossbar memory. (a) 3-D RRAM. (b) Our design.

Moreover, we can further reduce the memory area and increase the device density by folding a two-layer nanowire crossbar into a three-layer crossbar. As shown in Fig. 11(b), two nanowire layers now share one perpendicular nanowire layer. The memory folding detail is shown in Fig. 12. Because the diode directions for two adjacent crossbars are opposite to each other, the folded crossbar memory can function correctly. Due to the folding of the longer dimension of the memory, there is an estimated 33% increase in the memory density that can be built for the same technology. The performance of this 3-D crossbar memory is analyzed in Section V-C.

Fig. 12. Folding of the nanowire crossbar.

#### V. EXPERIMENT

Using the new MNA formulation introduced in Section II, a new SPICE-like circuit simulator has been developed for the evaluation of large-scale hybrid CMOS and memristor designs. In this section, both the piecewise-linear model and the square-root model are used in experiments. The piecewise-linear model describes simplified behaviors of ideal memristors without physical limitations.
Two experiments are carried out to study the memristor for oscillator and chaotic circuit design. The square-root model, on the other hand, is based on the physically fabricated device proposed by HP. All device parameters are selected in a range similar to previous work [9], [13], [25], [26], [31]. First, the accuracy of the simulator is verified by comparing simulation results with published data in [34]. Then, the performance of the proposed decoder is evaluated, and a full adder is designed based on the proposed decoder and compared with a conventional adder. After that, a model for amoeba-learning is built and analyzed together with a CMOS spike-generator. These experiments demonstrate the effectiveness of our circuit simulator in handling various hybrid CMOS and memristor circuits. Moreover, a number of large memristor circuits are built to explore runtime scalability. After the verification of the simulator, the crossbar-based memories are designed and verified for process variation, low power design, and sneak path prevention. Finally, the new 3-D crossbar-based memory with sneak path prevention and high device density is designed and verified.

#### A. Piecewise Linear Model

Memristor Controlled Oscillator: Since a nonlinear negative resistance can be realized by a memristor, it can be used for oscillator design. As shown in Fig. 13, a memristor is connected with an LC tank to form an oscillator. Its parameters are set as $k_1 = -3\mathrm{e}4$, $k_2 = 9\mathrm{e}4$, $\delta = 1\mathrm{e}{-12}$. Since negative resistance is realized in the $k_1$ region, the memristor here functions as an active device, and therefore it can oscillate autonomously with no external supply. The flux-controlled memristor switches according to the flux magnitude at one terminal. This is equivalent to modulating the magnitude and frequency of the LC oscillator. Fig. 14(a) shows the trajectory plane composed by $V_1$ and $V_2$, and Fig. 14(b) and (c) further show the transient voltages $V_1$ and $V_2$ over a stop-time of 1 ms. Both indicate a memristor-controlled oscillation.

Fig. 13. Diagram of an oscillator circuit composed of a memristor controlled LC

Fig. 14. Waveforms of the oscillator circuit: (a) phase diagram between $V_1$ and $V_2$; (b) waveform of $V_1$; and (c) waveform of $V_2$.

Memristor Chain: Due to its nonlinearity, a memristor can replace Chua's diode to generate chaotic outputs. As shown in Fig. 15, a chain of memristors cascaded with RC tanks is constructed to produce chaotic outputs. Their parameters are set as $k_1 = 5\mathrm{e}4$, $k_2 = 2\mathrm{e}4$, $\delta = 4\mathrm{e}{-12}$. For this example, Fig. 16(a) shows the state-trajectory plane composed by $V_1$ and $\Phi_1$, which is a chaotic attractor. Moreover, Fig. 16(b) and (c) further show the transient voltage $V_1$ and the flux $\Phi_1$ over a stop-time of 1 ms.

Fig. 15. Diagram of a chaos circuit composed of a memristor-diode-chain.

Fig. 16. Waveforms of the chaos circuit: (a) phase diagram between $V_1$ and $\Phi_1$; (b) waveform of $V_1$; and (c) waveform of $\Phi_1$.

#### B. Square Root Model

Accuracy Verification: A specifically sized memristor fed with a specified input ($V \sin 2\pi ft$) is simulated and its I-V curve is compared with published data in [34], as shown in Fig. 17. Parameters are set according to that paper: $R_{\rm on} = 100~\Omega$, $R_{\rm off} = 20\,000~\Omega$, $\mu_v = 3\mathrm{e}{-12}~\mathrm{m^2\,s^{-1}\,V^{-1}}$, $D = 1\mathrm{e}{-8}$ m, $V = 1$ V, $f = 100$ Hz. (Note: here $\mu_v$ combines both the carrier mobility and the fitting constant in [34].) In [34], simulation was done by generating an AHDL model using the voltage-controlled memductive model derived from the q-$\Phi$ relationship shown in Section III-B. The exactly matched data verify the accuracy of the proposed simulator.

Fig. 17. I-V hysteresis curves for accuracy verification of proposed simulator. Parameters and input signals are set exactly the same as in [34].
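A hysteresis loop of this kind can also be cross-checked against the closed-form areas $A_r$, $A_f$, and $A_{\text{enclose}}$ given with the square-root model. The sketch below is an independent numerical check, not the paper's simulator. It assumes the memristance law $M(\Phi) = \sqrt{M_0^2 - 2a\Phi}$, which is inferred here from the expressions for $M_1$ and $M_2$, and a sinusoidal drive $V_0 \sin \omega t$ starting at zero flux. The values of `M0` and `A` are illustrative assumptions chosen so that $M_2$ stays real; they are not claimed to reproduce Fig. 17.

```python
import math


def memristance(m0, a, flux):
    """Square-root model M(flux) = sqrt(M0^2 - 2*a*flux) (assumed form)."""
    return math.sqrt(m0 * m0 - 2.0 * a * flux)


def area_closed_form(m0, a, v0, omega):
    """Return (A_r, A_f, A_enclose) from the closed-form expressions."""
    m1 = memristance(m0, a, v0 / omega)        # V has risen to V0
    m2 = memristance(m0, a, 2.0 * v0 / omega)  # V has returned to 0
    scale = omega ** 2 / (6.0 * a ** 2)
    a_r = scale * (m0 ** 3 + 2.0 * m1 ** 3 - 3.0 * m0 * m1 ** 2)
    a_f = scale * (m2 ** 3 + 2.0 * m1 ** 3 - 3.0 * m2 * m1 ** 2)
    a_enclose = omega ** 2 / (12.0 * a ** 2) * (m0 - m2) ** 3
    return a_r, a_f, a_enclose


def area_numeric(m0, a, v0, omega, steps=100_000):
    """Midpoint-rule loop area over the positive half cycle of V."""
    dt = (math.pi / omega) / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        v = v0 * math.sin(omega * t)
        dv_dt = v0 * omega * math.cos(omega * t)
        flux = v0 / omega * (1.0 - math.cos(omega * t))  # integral of V dt
        total += v / memristance(m0, a, flux) * dv_dt * dt
    # The falling branch carries more current than the rising one,
    # so the signed integral of I dV is negative; flip the sign.
    return -total


# Illustrative values (assumptions): M0 in ohms, A in ohm^2 per weber.
M0, A, V0, OMEGA = 2.0e4, 6.0e10, 1.0, 2.0 * math.pi * 100.0
A_R, A_F, A_ENCLOSE = area_closed_form(M0, A, V0, OMEGA)
A_NUMERIC = area_numeric(M0, A, V0, OMEGA)
print("closed form:", A_ENCLOSE, "numeric:", A_NUMERIC)
```

The numerical loop area should agree with $A_f - A_r$ and with the compact expression $\omega^2 (M_0 - M_2)^3 / (12a^2)$ to within the integration error.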
Low Power Decoder: The two decoder designs mentioned in Section IV and the demux structure proposed by HP [9] are each used to construct a 2-to-4 demux for the decoder. Their performances are compared in Table III. The parameters of the memristors are set similar to those in Fig. 4: $R_{\rm on} = 1\mathrm{e}7~\Omega$, $R_{\rm off} = 1\mathrm{e}10~\Omega$, $\mu_v = 2.5\mathrm{e}{-6}~\mathrm{m^2\,s^{-1}\,V^{-1}}$, $D = 1\mathrm{e}{-8}$ m, $V_{\rm thd} = 2$ V, and $V_{\rm thr} = 4$ V, except for memristors in the first two columns (see Fig. 9), whose $R_{\rm on}$ and $R_{\rm off}$ values are set 10 times larger to assist voltage division. Here, $V_{\rm thd}$ and $V_{\rm thr}$ are the threshold voltages for programming the diode-added memristor and the pure resistive memristor, respectively. Similarly, the pull-up resistors (see Fig. 8) are set to 10 times $R_{\rm on}$ ($R_{\rm pu} = 1\mathrm{e}8~\Omega$) for better performance. All outputs are loaded with $R_{\rm load} = 10 R_{\rm off} = 1\mathrm{e}11~\Omega$. The threshold voltage of the diode is set as 0.43 V. An input voltage level of $\pm 1.5$ V is used for HP's design, and 1.5 V for the other two designs. By adding the state variable $\Phi$, our simulator is able to handle the historical information of the memristor, and can therefore handle the hybrid memristor-CMOS diode circuit easily. Simulation results are shown in Table III.

TABLE III. DEMUX PERFORMANCE WHEN OUTPUT1 IS SELECTED

| Structure | HP | Virginia | This Paper |
|----------------------------|-----------|----------|-----------------------|
| Cross-point implementation | Memristor | Diode | Diode-added memristor |
| Output1 (V) | 1.497 | 1.4776 | 1.0614 |
| Output2 (V) | -0.499 | 0.5386 | 0.496 |
| Output3 (V) | -0.499 | 0.5386 | 0.496 |
| Output4 (V) | -0.499 | 0.4896 | 0.4302 |
| Total power (nW) | 1805.4 | 44.334 | 8.5953 |

As Table III indicates, the power consumption decreases from 1805.4 nW to 8.5953 nW when diode-added memristors are used. Distinct output voltage levels are important for operations in memory.
The output voltage levels in the latter two demux structures are limited by the threshold voltage of the diode. Note that the diode's threshold voltage is an unwanted feature of the diode-added memristor and should be minimized. Therefore, the actual performance can be improved if the diode's threshold voltage can be lowered.

Full Adder of Programmable Logic Circuit: Decoders can be used to implement memory-based logic and CMOS buffers. In this paper, the proposed decoder with diode-added memristors is used to implement a full-adder [see Fig. 18(b)]. For comparison, another full-adder [see Fig. 18(a)] is implemented by the pure-resistive-memristor-based crossbar method [12]. As Fig. 18(a) shows, the logic is again realized by voltage division. In Fig. 18(a), each line of memristors with an inverter forms a NOR-gate, as demonstrated by HP in [12]. The parameters for the memristors are set the same as in the decoder design. To design the inverter, the parameters for the nMOS are set as $W/L = 100~\mu\mathrm{m}/0.24~\mu\mathrm{m}$, $\mu_n C_{\rm ox} = 117.7\mathrm{e}{-6}~\mathrm{A\,V^{-2}}$, $V_{\rm tn} = 0.43$ V, $\lambda = 0.06~\mathrm{V^{-1}}$. A 33 k$\Omega$ resistor is connected in series with the nMOS to form the inverter. The design in [22] is used in Fig. 18(b) for the second full-adder, with the decoder changed to the newly designed one in Fig. 9. Two pull-down resistors ($R_{\rm pd}$) are set to 100 times the $R_{\rm on}$ of the memristors in the decoder. A small CMOS buffer is then used for obtaining the output. A 3 V supply voltage is used for both adders.

Fig. 18. Two full adders implemented by (a) pure resistive memristor crossbar and CMOS inverters and (b) proposed low power decoder with CMOS buffers.

Fig. 19. Inputs and outputs of two full adders. V(Cin), V(A), V(B) are the inputs to the adders, while V(Sum) and V(Cout) are the outputs.

The simulator now handles hybrid circuits with both memristors and various CMOS components. The inputs and outputs of the two designs are shown in Fig. 19. The power consumption of the memristor-based logic is compared for the two full-adders. The experimental results show that the power consumption improves from around $3.5~\mu\mathrm{W}$ to around $0.18~\mu\mathrm{W}$ when shifted to the diode-added memristor, saving 95% of power while maintaining the same performance.

Memristive Model for Amoeba Learning: The value-adaptive nature of the memristor can lead to potential applications in neuromorphic systems. Much recent research has been conducted on implementing memristors in neural networks and other biological circuits [11], [14]–[17]. In [17], a memristive circuit (see Fig. 20) is used to model the amoeba's learning behavior. When exposed to a periodic environment change, the amoeba is able to remember the change and adapts its behavior for the next stimuli. By using a simple RLC circuit together with a memristor, this learning process can be emulated. According to the authors of [17], this model may also be extended and applied in neural networks.

Fig. 20. Memristive model for the amoeba-learning together with the spike generator.

To examine the full learning process, a CMOS spike-generator is cascaded with the amoeba model in this paper to emulate the changing environment. Parameters for the memristor are $R_{\rm on} = 3~\Omega$, $R_{\rm off} = 20~\Omega$, $\mu_v = 1\mathrm{e}{-16}~\mathrm{m^2\,s^{-1}\,V^{-1}}$, $D = 1\mathrm{e}{-8}$ m, $V_{\rm thd} = 2.5$ V. The rest of the model is set as $R = 0.195~\Omega$, $L = 0.02$ H, $C = 0.01$ F. As shown in Fig. 21, the memristor adjusts its value to facilitate oscillation when facing periodic spikes. As the memristance becomes larger, when a following spike is fed to the circuit again, the oscillation becomes less attenuated and lasts longer. This can be viewed as emulating the way the amoeba remembers the environment change and adapts its behavior to anticipate the next stimulus.
On the other hand, when non-periodic inputs are fed, the adjustment is much less obvious than in the case of periodic input. Here the added state variable $\Phi$ keeps information for both the memristor and the inductor. Simulation results in Fig. 21 show that the proposed simulator works well for analog simulation of hybrid memristor-CMOS circuits.

Fig. 21. Outputs of a memristive model for amoeba learning. (a) Periodic spike input causes memristor to adjust its value, leading to longer oscillation when the spike is fed again. (b) Non-periodic spike input results in less obvious adjustment.

#### C. Crossbar Memory

Sneak Path Prevention: To analyze the effect of the new diode-added memristor, each cross-point is modeled as a memristor connected in series with a diode to form a new $4\times 4$ crossbar. A read operation is then simulated and compared with the same crossbar built from pure resistive memristors. A switch-MUX is implemented similarly to [9]. For simplicity, memristors for the memory and the switch-MUX are set with the same parameters except for the threshold voltage, which is set larger for the switch-MUX to prevent unwanted value changes during the write operation. The parameter settings are the same as in our designed decoder. With $\pm 0.8$ V as reading voltages, the output current is used to determine the ON/OFF state stored in the memory cells. Simulation results are shown in Table IV, where performance and power consumption are compared.

TABLE IV. "READ" PERFORMANCE FOR CROSSBAR MEMORIES

| | 2-D $4\times 4$ crossbar memory, with diode | 2-D $4\times 4$ crossbar memory, without diode | 3-D $4\times 8$ crossbar memory, with diode-added memristor |
|---|---|---|---|
| $I_{\rm on}$, all other cells off (nA) | 38.349 | 53.587 | 38.47 |
| $I_{\rm on}$, all cells on (nA) | 38.478 | 65.893 | 38.688 |
| $I_{\rm off}$, all other cells on (nA) | 0.81361 | 57.872 | 1.275 |
| $I_{\rm off}$, all cells off (nA) | 0.46612 | 0.63801 | 0.69832 |
| $I_{\rm on}$ range (nA) | 38.349 to 38.478 | 53.587 to 65.893 | 38.47 to 38.688 |
| $I_{\rm off}$ range (nA) | 0.46612 to 0.81361 | 0.63801 to 57.872 | 0.69832 to 1.275 |
| Worst case $I_{\rm on}/I_{\rm off}$ | 47.13 | 0.93 (fail) | 30.17 |
| P range (nW) | 0.746 to 61.6 | 1.02 to 105 | 1.12 to 61.9 |

In Table IV, $I_{\rm on}$ and $I_{\rm off}$ indicate the resulting output currents when reading an ON-state or an OFF-state, respectively. The worst cases are reading an ON-state while all other cells are in OFF-states, and reading an OFF-state while all other cells are in ON-states. These two operations generate the minimum $I_{\rm on}$ and the maximum $I_{\rm off}$, whose ratio (worst case $I_{\rm on}/I_{\rm off}$) is used as a measure of memory performance. As Table IV indicates, the worst-case $I_{\rm on}/I_{\rm off}$ ratio for the read operation improves from 0.93 to 47.13 when the diode-added memristor is used, while the power consumption is also greatly decreased. Note that although the minimum power consumption for the memory without diode is only 1.02 nW (0.638 nA $\times$ 1.6 V), the existence of sneak paths raises the power consumption to 92.6 nW (57.87 nA $\times$ 1.6 V) when reading an OFF-state ($I_{\rm off}$, all other cells on). When the diode-added memristor is used, on the other hand, high power consumption only appears when reading an ON-state. Also, the maximum power consumption is almost halved. Therefore, the total power consumption can be improved around four times. When the memory size increases, this improvement is expected to increase further.

As mentioned earlier, the existence of sneak paths limits the maximum memory size for proper operation.
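The derived rows of Table IV follow directly from the four simulated currents in each column. The short sketch below recomputes the worst-case $I_{\rm on}/I_{\rm off}$ ratio and the power range from the tabulated currents, assuming (as in the text) that the power is the read current multiplied by the 1.6 V span between the $+0.8$ V and $-0.8$ V reading voltages. The dictionary layout and function names are illustrative.

```python
READ_SPAN_V = 1.6  # volts between the +0.8 V and -0.8 V reading voltages

# Currents in nA, copied from Table IV:
# (I_on all other cells off, I_on all cells on,
#  I_off all other cells on, I_off all cells off)
TABLE_IV_NA = {
    "2-D 4x4 with diode": (38.349, 38.478, 0.81361, 0.46612),
    "2-D 4x4 without diode": (53.587, 65.893, 57.872, 0.63801),
    "3-D 4x8 with diode-added memristor": (38.47, 38.688, 1.275, 0.69832),
}


def worst_case_ratio(currents_na):
    """Minimum ON current divided by maximum OFF current."""
    i_on = currents_na[:2]
    i_off = currents_na[2:]
    return min(i_on) / max(i_off)


def power_range_nw(currents_na):
    """Smallest and largest read power in nW (nA times volts)."""
    return (min(currents_na) * READ_SPAN_V, max(currents_na) * READ_SPAN_V)


SUMMARY = {
    name: (worst_case_ratio(c), power_range_nw(c))
    for name, c in TABLE_IV_NA.items()
}

for name, (ratio, (p_min, p_max)) in SUMMARY.items():
    status = "ok" if ratio > 1.0 else "fail"
    print(f"{name}: ratio {ratio:.2f} ({status}), "
          f"P from {p_min:.3g} to {p_max:.3g} nW")
```

A ratio below 1 means the largest OFF current exceeds the smallest ON current, so no single current threshold can separate the two states.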
As shown in Table IV, the $4 \times 4$ crossbar memory built with pure resistive memristors already fails because it cannot distinguish an ON-state from an OFF-state (worst-case $I_{\rm on}/I_{\rm off}$ ratio < 1). Therefore, the maximum memory size achievable with the given device parameters is less than $4\times 4$. Since parts of the peripheral components do not shrink along with the memory [26], this limitation in size can result in a limitation on device density, which is resolved when diode-added memristors are deployed instead.

Fig. 22. $4 \times 4$ crossbar with various inputs.

Variation Analysis for Write: We can also efficiently evaluate the process variation of memristive circuits by applying Monte Carlo simulations within the new simulator. A $4\times 4$ crossbar memory implemented with memristors is used for variation analysis of the write-operation. As Fig. 22 shows, three different input patterns (step functions switching between $\pm 4$ V) are fed to the 8 bars through buffers to write the memory cells at the junctions. A $\pm 30\%$ variation is assumed for the memristor device length ($D$), resulting in a distinct I-V hysteresis path for each memristor. For simplicity, $R_{\rm on}$, $R_{\rm off}$, and $D$ are assumed to be uncorrelated. Parameters are set as in Fig. 4: $R_{\rm on} = 3.33\mathrm{e}7~\Omega$, $R_{\rm off} = 3.33\mathrm{e}10~\Omega$, $\mu_v = 2.5\mathrm{e}{-6}~\mathrm{m^2\,s^{-1}\,V^{-1}}$, $D = 1\mathrm{e}{-8}$ m $\pm 30\%$, $R_s = 1\mathrm{e}7~\Omega$. All memristances are set to $R_{\rm off}$ at the beginning.

Diverse input voltages and variation in the parameter $D$ can lead to complicated transient paths for the memristor values in the crossbar. Fig. 23 shows the transient change of memristance for one of the memristors ($W_{1-1}$). As the figure indicates, the memristance is successfully written despite the variations in $D$. In our experiment, all 16 memristors are written to the expected values. On the other hand, the transient path of the memristor value is very sensitive to $D$.
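The strong dependence on $D$ can be made plausible with a back-of-the-envelope Monte Carlo sketch. It is not the circuit-level analysis performed by the simulator: it assumes the square-root model $M(\Phi) = \sqrt{M_0^2 - 2a\Phi}$ with $a = \mu_v R_{\rm on} R_{\rm off} / D^2$ (the form that follows from linear ion drift when $R_{\rm off} \gg R_{\rm on}$), a constant write voltage applied directly across a single memristor (the series resistance $R_s$ and the rest of the crossbar are ignored), and a uniform distribution for $D$. The function names, the 4 V drive, and the seed are illustrative assumptions.

```python
import random

R_ON, R_OFF = 3.33e7, 3.33e10   # ohms, from the text
MU_V = 2.5e-6                   # m^2 s^-1 V^-1, from the text
D_NOMINAL = 1e-8                # m, from the text
V_WRITE = 4.0                   # V, assumed constant across the memristor


def write_delay(d, v=V_WRITE):
    """Time for M to fall from R_OFF to R_ON under a constant voltage.

    With flux = v * t and M^2 = R_OFF^2 - 2*a*flux, the memristance
    reaches R_ON at t = (R_OFF^2 - R_ON^2) / (2 * a * v).
    """
    a = MU_V * R_ON * R_OFF / d ** 2   # assumed definition of a
    return (R_OFF ** 2 - R_ON ** 2) / (2.0 * a * v)


def monte_carlo_delays(samples=2000, spread=0.30, seed=0):
    """Sample D uniformly within +/- spread and return the write delays."""
    rng = random.Random(seed)
    return [write_delay(D_NOMINAL * (1.0 + rng.uniform(-spread, spread)))
            for _ in range(samples)]


NOMINAL_DELAY = write_delay(D_NOMINAL)
DELAYS = monte_carlo_delays()
RELATIVE = [t / NOMINAL_DELAY - 1.0 for t in DELAYS]
print("nominal delay (s):", NOMINAL_DELAY)
print("relative spread:", min(RELATIVE), "to", max(RELATIVE))
```

Because the delay scales as $D^2$ in this simplified model, a $\pm 30\%$ spread in $D$ maps to a spread between roughly $-51\%$ and $+69\%$ in write delay, which is of the same order as the variation reported by the full Monte Carlo analysis.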
A Monte Carlo analysis (see Fig. 24) shows that a $\pm 30\%$ variation in $D$ leads to more than $\pm 50\%$ variation in the time delay of the write-operation.

Fig. 23. Transient path of value for one memristor ($W_{1-1}$) with $\pm 30\%$ variation of device lengths ($D$) for all 16 memristors. Only part of the results are shown in the plot for visual clarity.

Fig. 24. Monte Carlo analysis for parameter $D$'s impact on the transient changing path of memristance in the crossbar. For a $\pm 30\%$ variation of $D$, the initial changing speed of memristance has a mean value of $8.75~\Omega\,\mathrm{s}^{-1}$ and a variation of $\pm 54\%$, and the time delay before a successful "write" has a mean value of 3.95 ns and a variation of $\pm 52\%$.

3-D $4 \times 8$ Crossbar Memory With Diode-Added Memristor: Using the proposed architecture in Fig. 12, two $4 \times 4$ crossbars are merged together on top of the CMOS stack to form a folded 3-D $4 \times 8$ memory. With the same memristors, switch MUX, and reading voltages as in the 2-D $4 \times 4$ crossbar memory, the resulting output current and power consumption are shown in Table IV. As Table IV indicates, the worst-case $I_{\rm on}/I_{\rm off}$ ratio for the read-operation degrades from 47.13 to 30.17 compared to the 2-D memory, which can be explained by the increase in memory size. As the memory size doubles compared to the 2-D memory, $I_{\rm off}$ is expected to rise due to the increase in leakage current paths, while $I_{\rm on}$ should not be affected much. This is confirmed by the simulated data in Table IV. More importantly, the 3-D power consumption remains at the same level as the 2-D crossbar memory although the memory size is doubled. This benefit comes from the prevention of sneak paths, which greatly reduces the power consumption.

#### VI. CONCLUSION

One new modified nodal analysis (MNA) is introduced in this paper to handle the rediscovered memristor device.
With the new MNA implemented in a SPICE-like circuit simulator, hybrid CMOS and memristor circuits can be analyzed in the same way as traditional integrated circuits in CMOS technology. Full memristor circuit and system verification, including transient analysis for functionality and Monte Carlo analysis for reliability, can be performed efficiently. Since adding the memristor is similar to implementing a CMOS device in the SPICE-like circuit simulator, our approach has more flexibility to be scaled for process migration. Based on our newly developed circuit simulator, a number of CMOS and memristor based hybrid circuit designs are also explored with efficient verification of functionality, performance, reliability, and power. Specifically, one new diode-added memristor crossbar-based memory with 3-D integration is proposed to improve the integration density and to avoid the sneak path during read-write operation. Experiments demonstrate the advantage of employing this new simulator for the design exploration of hybrid CMOS and memristor circuits.

#### REFERENCES

- [1] L. Chua, "Memristor-the missing circuit element," *IEEE Trans. Circuit Theory*, vol. 18, no. 5, pp. 507–519, Sep. 1971.
- [2] L. Chua and S. Kang, "Memristive devices and systems," *Proc. IEEE*, vol. 64, no. 2, pp. 209–223, Feb. 1976.
- [3] D. Strukov, G. Snider, D. Stewart, and S. Williams, "The missing memristor found," *Nature*, vol. 453, pp. 80–83, 2008.
- [4] L. Nagel, "Spice2: A computer program to simulate semiconductor circuits," Univ. California, Berkeley, ERL-M520, 1975.
- [5] C. W. Ho, A. E. Ruehli, and P. A. Brennan, "The modified nodal approach to network analysis," in *Proc. Int. Symp. Circuits Syst.*, 1974, pp. 504–509.
- [6] L. Chua and P. Lin, *Computer-Aided Analysis of Electronic Circuits: Algorithms and Computational Techniques*. Englewood Cliffs, NJ: Prentice-Hall, 1975.
- [7] H. Yu, Y. Shi, L. He, and D. Smart, "A fast block structure preserving model order reduction for inverse inductance circuits," in *Proc. Int. Conf. Comput.-Aided Des.*, 2006, pp. 7–12.
- [8] Y. Chen and X. Wang, "Compact modeling and corner analysis of spintronic memristor," in *Proc. Int. Symp. Nanoscale Arch. (NANOARCH)*, 2009, pp. 7–12.
- [9] P. Vontobel, W. Robinett, P. Kuekes, D. Stewart, J. Straznicky, and R. Williams, "Writing to and reading from a nano-scale crossbar memory based on memristors," *Nanotechnology*, vol. 20, p. 425204, 2009.
- [10] M. B. Laurent, "Pattern recognition using memristor crossbar array," U.S. Patent 7 459 933, Dec. 2, 2008.
- [11] A. Afifi and A. Ayatollahi, "Implementation of biologically plausible spiking neural network models on the memristor crossbar based CMOS/Nano circuits," in *Proc. Eur. Conf. Circuit Theory Des.*, 2009, pp. 563–566.
- [12] J. Borghetti, Z. Y. Li, J. Straznicky, X. M. Li, D. A. A. Ohlberg, W. Wu, D. R. Stewart, and R. S. Williams, "A hybrid nanomemristor/transistor logic circuit capable of self-programming," in *Proc. Nat. Acad. Sci.*, 2009, pp. 1699–1703.
- [13] M. B. Laurent, "Programmable crossbar signal processor," U.S. Patent 7 302 513, Nov. 27, 2007.
- [14] G. S. Snider, "Self-organized computation with unreliable, memristive nanodevices," *Nanotechnology*, vol. 18, p. 365202, 2007.
- [15] S. H. Jo, T. Chang, I. Ebong, B. B. Bhadviya, P. Mazumder, and W. Lu, "Nanoscale memristor device as synapse in neuromorphic systems," *Nano Lett.*, vol. 10, pp. 1297–1301, 2010.
- [16] A. Afifi, A. Ayatollahi, and F. Raissi, "STDP implementation using memristive nanodevice in CMOS-nano neuromorphic networks," *IEICE Electron. Expr.*, vol. 6, no. 3, pp. 148–153, Feb. 2009.
- [17] Y. V. Pershin, S. L. Fontaine, and M. D. Ventra, "Memristive model of amoeba learning," *Phys. Rev. E*, vol. 80, p. 021926, 2009.
- [18] A. Flocke and G. Noll, "Fundamental analysis of resistive nano-crossbars for the use in hybrid NANO/CMOS-memory," in *Proc. Eur. Solid State Circuits Conf.*, 2007, pp. 328–331.
- [19] R. Waser and M. Aono, "Nanoionics-based resistive switching memories," *Nat. Mater.*, vol. 6, no. 11, pp. 833–839, Nov. 2007.
- [20] M. Meier, C. Schindler, S. Gilles, R. Rosezin, A. Rudiger, C. Kugeler, and R. Waser, "A nonvolatile memory with resistively switching methyl-silsesquioxane," *IEEE Electron Device Lett.*, vol. 30, no. 1, pp. 8–10, Jan. 2009.
- [21] M. R. Stan, P. D. Franzon, S. C. Goldstein, J. C. Lach, and M. M. Ziegler, "Molecular electronics: From devices and interconnect to circuits and architecture," *Proc. IEEE*, vol. 91, no. 11, pp. 1940–1957, Nov. 2003.
- [22] M. M. Ziegler and M. R. Stan, "CMOS/Nano co-design for crossbar-based molecular electronic systems," *IEEE Trans. Nanotechnol.*, vol. 2, no. 4, pp. 217–230, Dec. 2003.
- [23] G. Snider, P. Kuekes, T. Hogg, and R. S. Williams, "Nanoelectronic architectures," *Appl. Phys. A*, vol. 80, pp. 1183–1195, 2005.
- [24] R. J. Luyken and F. Hofmann, "Concepts for hybrid CMOS molecular non-volatile memories," *Nanotechnology*, vol. 14, pp. 273–276, 2003.
- [25] C. Kugeler, M. Meier, R. Rosezin, S. Gilles, and R. Waser, "High density 3D memory architecture based on the resistive switching effect," *Solid-State Electron.*, vol. 53, no. 12, pp. 1287–1292, 2009.
- [26] D. L. Lewis and H. S. Lee, "Architectural evaluation of 3D stacked RRAM caches," in *Proc. Conf. 3D Syst. Integr.*, 2009, pp. 1–4.
- [27] D. Tu, M. Liu, W. Wang, and S. Haruehanroengra, "Three-dimensional CMOL: Three-dimensional integration of CMOS/nanomaterial hybrid digital circuits," *Micro Nano Lett.*, vol. 2, pp. 40–45, Jun. 2007.
- [28] U. Ascher and L. Petzold, *Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations*. Philadelphia, PA: SIAM, 1998.
- [29] M. Dong and L. Zhong, "Nanowire crossbar logic and standard cell-based integration," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 17, no. 8, pp. 997–1006, Aug. 2009.
- [30] B. Mouttet, "Logicless computational architectures with nanoscale crossbar arrays," in *Techn. Proc. NSTI Conf. Trade Show*, 2008, pp. 73–75.
- [31] K. Akarvardar and H.-S. P. Wong, "Ultralow voltage crossbar non-volatile memory based on energy-reversible NEM switches," *IEEE Electron Device Lett.*, vol. 30, no. 6, pp. 626–628, Jun. 2009.
- [32] X. S. Hu, A. Khitun, K. K. Likharev, M. T. Niemier, M. Bao, and K. L. Wang, "Design and defect tolerance beyond CMOS," in *Proc. CODES+ISSS*, 2008, pp. 223–230.
- [33] G. S. Snider and R. S. Williams, "Nano/CMOS architectures using a field-programmable nanowire interconnect," *Nanotechnology*, vol. 18, p. 035204, 2007.
- [34] S. Shin and K. Kim, "Memristor-based fine resolution programmable resistance and its applications," in *Proc. Int. Conf. Commun., Circuits, Syst.*, 2009.

**Wei Fei** (S'11) received the B.S. degree in electrical and electronics engineering from Nanyang Technological University, Singapore, in 2007, where he is currently pursuing the Ph.D. degree with the School of Electrical and Electronics Engineering. His research interest is to explore IC design at the extreme scale, including both nano-scale circuit design and millimeter-wave frequency circuit design.

**Hao Yu** (M'06) received the B.S. degree from Fudan University, Shanghai, China, in 1999 and the M.S. and Ph.D. degrees in the field of integrated circuits and embedded computing from the Electrical Engineering Department, University of California, Los Angeles, in 2007. He was a senior research staff member with Berkeley Design Automation (BDA), one of the top-100 start-ups selected by Red Herring in Silicon Valley, until 2009. Since 2009, he has been an Assistant Professor with Nanyang Technological University, Singapore. He has 42 refereed international publications and 5 books/chapters. His primary research interests include 3-D cyber-physical computing systems and analog/RF design exploration at extreme scale. Dr.
Yu was a recipient of one Best Paper Award from the ACM Transactions on Design Automation of Electronic Systems (TODAES), two Best Paper Award Nominations from the Design Automation Conference (DAC) and the International Conference on Computer-Aided Design (ICCAD), and one Inventor Award from the Semiconductor Research Corporation (SRC). He is on the editorial board of several journals and serves as a technical program committee member and session chair for several conferences.

**Wei Zhang** (M'05) received the B.S. and M.S. degrees in electrical engineering from Harbin Institute of Technology, Harbin, China, in 1999 and 2001, respectively, and the Ph.D. degree in electrical engineering from Princeton University, Princeton, NJ, in 2009. She is an Assistant Professor with the School of Computer Engineering, Nanyang Technological University, Singapore. Her research interests focus on embedded systems, reconfigurable computing, nano-electronic VLSI, and electronic design automation.

**Kiat Seng Yeo** (M'00–SM'09) received the B.Eng. degree (with Honors) and the Ph.D. degree from Nanyang Technological University, Singapore, in 1993 and 1996, respectively, both in electrical engineering. In 1996, he joined the School of Electrical and Electronic Engineering, Nanyang Technological University, as Head of the Division of Circuits and Systems and a Board Member of the Singapore Semiconductor Industry Association (SSIA). He is a widely known authority on low-power IC design and a recognized expert in CMOS technology and radio frequency IC design. As a result of his pioneering work in the field of IC design, he has attracted over US\$30 million of external research funding from various funding agencies and the industry in the last five years.
He is currently a Professor with the School of Electrical and Electronic Engineering and Founding Director of VIRTUS, a new research center of excellence jointly set up by Nanyang Technological University and the Singapore Economic Development Board. He has authored over 300 refereed papers in top-tier journals and conferences in his area of research. He is the author of 6 books and 3 book chapters and holds 25 patents, including two patents for the world's smallest integrated transformer and several patents for 60-GHz applications. He provides consultation to multinational corporations and was General Chair, Co-General Chair, and Technical Chair of several international conferences. He has given keynotes and invited presentations at various scientific meetings, workshops, and seminars. Dr. Yeo serves on the Editorial Board of IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES. He was awarded the Public Administration Medal (Bronze) on National Day 2009 by the President of the Republic of Singapore. He also received the distinguished Nanyang Alumni Award in 2009 for his outstanding contributions to the university and Singapore's society.