# PETE: A Device/Circuit Analysis Framework for Evaluation and Comparison Of Charge Based Emerging Devices

Charles Augustine<sup>1</sup>, Arijit Raychowdhury<sup>2</sup>, Yunfei Gao<sup>1</sup>, Mark Lundstrom<sup>1</sup>, Kaushik Roy<sup>1</sup>

<sup>1</sup>School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA

<sup>2</sup>Intel Corporation, Hillsboro, OR, USA

#### ABSTRACT

This paper describes PETE, a tool that has been developed for circuit/system level evaluation of nanoscaled devices. The motivation behind developing this tool is the fact that traditional device metrics like CV/Ion, Ioff or CV<sup>2</sup>f can no longer capture the true potential of semiconductor devices and underestimate or overestimate system level performance. At the same time, the development and deployment of compact models for any new device is a time-consuming effort, a task that can only be undertaken once the potential of the device has been established. Towards this end, we have developed PETE, so that device and circuit designers can perform a fast and reasonably accurate estimation of any new device without having to develop compact models. The inputs to PETE can be numerical I-V and C-V characteristics (derived from experiments or device simulations), and the tool can numerically evaluate a wide array of circuit/system level metrics pertaining to performance and power of logic gates, ring oscillators and mega-cells. We have evaluated four emerging device technologies, namely, 15nm Silicon MOSFET transistors, Multi-gate FinFET transistors, Band-to-band-tunneling transistors, and Ferroelectric FETs with PETE and the results obtained are within a 5% level of inaccuracy when compared to a traditional SPICE based approach. PETE has been deployed on the nanoHUB (nanohub.org) for public use, and its simple web interface ensures that even a non-expert in circuits-system design can obtain accurate estimation of performance-power trade-off of any new technology.

#### 1. Introduction

CMOS technology scaling is fast approaching its fundamental physical limits [1]. As a result, there is growing interest among device technologists in developing alternative technologies that can provide better computational capability than silicon MOSFETs under power, delay and area constraints. A multitude of devices have been proposed as substitutes for silicon MOSFETs [2-5]. Some of the potential candidates include Carbon Nanotube Transistors (CNT) [2], Band To Band Tunneling devices (BTBT) [1], Ferro Electric-FETs (FEFET) [5], nanowires and nano-magnet based devices [5]. Functional circuits have been developed with some of these devices, while others are still in the developmental stage. Before emerging as a possible post-silicon technology, each of these devices needs to be evaluated on all circuit-level criteria (delay, area and power). In this paper we propose a web-based simulation framework to evaluate any charge-based device without having to develop physical or empirical compact models for the devices.

The typical design flow for developing and deploying any new technology is shown in fig. 1. The flow starts with physicists and material scientists proposing a device structure with certain I-V (Current-Voltage) and C-V (Capacitance-Voltage) characteristics. This device information is transferred to circuit designers in the form of a numerical or analytical model, preferably a compact model. Circuit designers simulate circuits/systems using the compact device models to evaluate the performance and power benefits of the devices. The performance evaluation indicates whether the intrinsic device improvement translates into circuit/system level performance improvement. However, an evaluation focusing on a single circuit (ring oscillator or adder) cannot estimate the true potential of the device; a set of representative benchmark circuits needs to be evaluated to understand the true trade-offs and benefits of the device under evaluation. Device engineers can use feedback from circuit/system evaluation to re-engineer the device to improve system-level performance, a task which is often application specific (high speed or ultra low-power). This design process may need to undergo several iterations before finally arriving at optimized device specifications. Hence, developing a compact model for each iteration is not time efficient. To expedite the device development flow we have developed a 'device analysis framework' called PETE. PETE effectively reduces design cycle time for exploratory devices by helping device engineers quickly assess any new device from a circuit/system perspective for both power and performance without having to perform detailed circuit simulations.

Fig.1 Device development flow (A unified approach towards system design)

In this paper, we describe the proposed tool, PETE. We have used PETE to evaluate four genres of charge based devices for different application domains: (a) nanoscaled single-gate bulk MOSFETs, (b) multi-gate SOI MOSFETs, (c) Band to Band Tunneling FETs and (d) ferroelectric insulator based FETs. These four types of FETs hold tremendous promise for future technology nodes, and research is being conducted in earnest to improve their power-performance trade-offs. However, it is a challenging task to compare and contrast devices that are based on different physics of operation unless a common benchmark or metric has been established. To this end, we also propose a new weighted metric, which can be tuned to fit the primary design target, high performance or low power. Thus, the true potential of a device, as a candidate for performance boosting or power savings, can be prudently judged.

The rest of the paper is organized as follows. In section 2, the PETE framework is discussed along with the different device models available in PETE for public use. In section 3, the algorithms used in PETE are presented. Section 4 compares and contrasts different exploratory devices using PETE. Finally, section 5 concludes the paper.

#### 2. OVERVIEW OF THE PETE SIMULATION FRAMEWORK

The tool PETE is designed to handle both i) MOSFET based device characteristics and ii) arbitrary charge based device characteristics. The MOSFET based model in PETE can be used to understand the significance of MOSFET parameters like threshold voltage (Vt), subthreshold swing (S) etc. for the final circuit/architecture performance. The generic device input model, on the other hand, helps emerging technology researchers obtain circuit evaluation results for any three-terminal charge based switching device, as long as a set of data points on its I-V and C-V characteristics is known.

The details about MOSFET and generic device models are presented in the following sub-sections.

#### 2.1 MOSFET BASED MODEL:

The MOSFET based model requires eleven device parameters (for both NMOS and PMOS devices) and five technology parameters, which are described in table 1. The models used to generate the I-V and C-V characteristics of the MOSFET device from the device parameters are explained in detail below.

#### 2.1.1 Model for MOSFET I-V characteristics:

MOSFET I-V characteristics have two regions of operation: 1) the subthreshold region, where Vgs (gate-source voltage) is less than  $V_t$  (threshold voltage), which is dominated by diffusion current, and 2) the super-threshold region, where Vgs is greater than or equal to  $V_t$ , which is dominated by drift current.

MOSFET current in subthreshold region is given by

$$I = I_{off} \cdot 10^{\frac{V_{gs}}{S}} \left(1 - e^{-\frac{V_{ds}}{m V_t}}\right) \tag{1}$$

and V<sub>t</sub> is given by

$$V_t = V_{th\_vdd} + DIBL \cdot (V_{dd} - V_{ds}) \tag{2}$$

where  $I_{\rm off}$  is the OFF current, 'S' is the subthreshold swing (in mV/decade), Vds is the drain to source voltage and 'm' is the body-effect coefficient, which is given by:

$$m = \frac{S}{60\ \text{mV/decade (ideal } S)} \tag{3}$$

Current through the MOSFET in super-threshold region is given by

$$I = I_o \left(V_{gs} V_{min} - 0.5 V_{min}^{2}\right)(1 + \lambda V_{ds}) \tag{4}$$

and I<sub>o</sub> is given by

$$I_o = \mu_n C_{ox} \frac{W}{L} \tag{5}$$

where  $C_{ox}$  is the oxide capacitance per unit area, W is the MOSFET width and L is the channel length,  $\lambda$  is the channel length modulation parameter.  $V_{min}$  is given by

$$V_{min} = \min(V_{gs}, V_{ds}, V_{ds\_sat}) \tag{6}$$

where  $V_{ds\_sat}$  is the drain-source saturation voltage, given by

$$V_{ds\_sat} = \frac{V_{gs} - V_t}{m} \tag{7}$$

Depending on the simulation accuracy requirement (say, 2% to 10% levels of inaccuracy) PETE generates a voltage grid with more resolution for 2% and less resolution for 10%, starting from 0 to Vdd (power-supply voltage) for both Vgs and Vds inputs. Current (Id) through the MOSFET for these Vgs and Vds grid points can be computed using the equations 1-7 and finally a response surface I-V model is generated.
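As a rough illustration of how eqs. 1-7 could be turned into a response-surface table, the following Python sketch evaluates the drain current on a uniform Vgs/Vds grid. It is not PETE's code: the function names, the dictionary keys and all numerical parameter values are hypothetical, S is expressed in V/decade, and eqs. 1 and 4 are coded exactly as printed above.

```python
import math

def threshold_voltage(vds, p):
    """Eq. 2: threshold voltage including DIBL."""
    return p["vth_vdd"] + p["dibl"] * (p["vdd"] - vds)

def drain_current(vgs, vds, p):
    """Drain current from eqs. 1-7 (S in V/decade, SI units otherwise)."""
    m = p["S"] / 0.060                           # eq. 3: body-effect coefficient
    vt = threshold_voltage(vds, p)
    if vgs < vt:                                 # subthreshold region, eq. 1
        return p["ioff"] * 10 ** (vgs / p["S"]) * (1 - math.exp(-vds / (m * vt)))
    i_o = p["mu"] * p["cox"] * p["w"] / p["l"]   # eq. 5
    vds_sat = (vgs - vt) / m                     # eq. 7
    vmin = min(vgs, vds, vds_sat)                # eq. 6
    return i_o * (vgs * vmin - 0.5 * vmin ** 2) * (1 + p["lam"] * vds)  # eq. 4

def iv_surface(p, n_points):
    """Tabulate Id on an n_points x n_points grid spanning 0..Vdd.

    A finer grid would correspond to a tighter inaccuracy target (e.g. 2%),
    a coarser one to a looser target (e.g. 10%).
    """
    step = p["vdd"] / (n_points - 1)
    grid = [i * step for i in range(n_points)]
    table = [[drain_current(vgs, vds, p) for vds in grid] for vgs in grid]
    return grid, table

# Hypothetical parameter set, chosen only to exercise the equations.
EXAMPLE_PARAMS = {
    "S": 0.100, "vth_vdd": 0.3, "dibl": 0.1, "vdd": 0.9, "ioff": 1e-9,
    "mu": 0.03, "cox": 0.02, "w": 1e-6, "l": 65e-9, "lam": 0.1,
}
```

Calling `iv_surface(EXAMPLE_PARAMS, 11)` returns the voltage grid and a table indexed as `table[vgs_index][vds_index]`, which is the kind of numerical I-V input the rest of the flow consumes.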

#### 2.1.2 Model for MOSFET C-V characteristics:

Similar to the MOSFET I-V characteristics, MOSFET C-V characteristics can be characterized by a) subthreshold (depletion) and b) super-threshold (inversion) regions. In the depletion region  $C_{tg}$  is given by:

$$C_{tg} = \frac{Ct_{ox}}{\sqrt{1 + \frac{2 C_{ox}^2 V_{gs}}{\varepsilon_{si} \varepsilon_{o} q N_a}}} \tag{8}$$

where  $\varepsilon_{si}\varepsilon_{o}$  is the permittivity of silicon and  $N_a$  is the doping density of silicon. In the inversion region,  $C_{tg}$  (at high frequency) is given by

$$C_{tg} = Ct_{ox} \parallel \sqrt{\frac{\varepsilon_{si}\varepsilon_{o}q^{2}N_a}{4KT \ln(N_a/N_i)}} \tag{9}$$

| Device parameters                        | Technology parameters                                                |
|------------------------------------------|----------------------------------------------------------------------|
| Sub-threshold swing (S)                  | Supply voltage (Vdd)                                                 |
| Mobility (μ)                             | Fixed interconnect capacitance (C<sub>fix</sub>)                     |
| Lambda (λ)                               | Extrinsic device capacitance (C<sub>ext</sub>)                       |
| Threshold voltage (V<sub>th_vdd</sub>)   | Minimum device width (W<sub>min</sub>)                               |
| DIBL (drain induced barrier lowering)    | Constant load (C<sub>L</sub>) for analyzing high-cap buses and pads  |
| Oxide thickness (tox)                    |                                                                      |
| Transistor length (L)                    |                                                                      |
| Off current (Ioff)                       |                                                                      |
| Saturation velocity (Vsat)               |                                                                      |
| Doping concentration (Na)                |                                                                      |
| Flat band voltage (Vfb)                  |                                                                      |

Table 1. Device and Technology parameters

where  $N_i$  is the intrinsic carrier density of silicon and  $Ct_{ox}$  is the oxide capacitance, given by

$$Ct_{ox} = C_{ox} \cdot W \cdot L \tag{10}$$

Again, depending on the required simulation inaccuracy (2% to 10%),  $C_{tot}$  is calculated at Vgs grid points ranging from '0' to 'Vdd'.
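The C-V relations of eqs. 8-10 can be sketched in the same spirit. The code below is only an illustration under stated assumptions: the switch between the two regions is placed at Vgs = Vt, the `∥` in eq. 9 is read as a series combination of two capacitances, the square-root term of eq. 9 is treated as a per-unit-area capacitance and scaled by W·L, and the value used for the intrinsic carrier density is approximate. The function name and argument names are hypothetical.

```python
import math

Q = 1.602e-19        # electron charge (C)
K_B = 1.381e-23      # Boltzmann constant (J/K)
EPS_0 = 8.854e-12    # vacuum permittivity (F/m)
EPS_SI = 11.7        # relative permittivity of silicon

def gate_capacitance(vgs, vt, cox, w, l, na, ni=1.0e16, temp=300.0):
    """Intrinsic gate capacitance C_tg in farads.

    cox is the oxide capacitance per unit area (F/m^2); na and ni are in m^-3.
    """
    ct_ox = cox * w * l                          # eq. 10
    eps = EPS_SI * EPS_0
    if vgs < vt:                                 # depletion region, eq. 8
        return ct_ox / math.sqrt(1 + 2 * cox ** 2 * vgs / (eps * Q * na))
    # Inversion region, eq. 9. Assumption: the square-root term is per unit
    # area, so it is scaled by W*L before the series combination.
    c_dep = w * l * math.sqrt(
        eps * Q ** 2 * na / (4 * K_B * temp * math.log(na / ni))
    )
    return ct_ox * c_dep / (ct_ox + c_dep)
```

Evaluating `gate_capacitance` on the same Vgs grid used for the I-V table gives the intrinsic part of the capacitance; the extrinsic and interconnect contributions are separate technology inputs.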

From the above two subsections it is clear that all eleven device parameters determine the MOSFET I-V and C-V characteristics. Since the circuit/system performance is determined by the device I-V and C-V characteristics, as will be explained in detail in section 3, the impact of each device parameter on circuit performance can be estimated using the PETE MOSFET model.

#### 2.2 GENERIC CHARGE BASED MODEL:

The generic three-terminal model, as incorporated in the publicly available version of PETE, directly takes device characteristics in the form of numerical current-voltage (I-V) and capacitance-voltage (C-V) values. These values can be obtained from any device modeling tool (e.g. Medici-Taurus [6]) or from device experiments and measurements. When the I-V and C-V characteristics are obtained from device measurements, PETE can take the experimental data directly and perform circuit/system level evaluation with it.

#### 2.3 INPUT VALIDATION FOR MOSFET/GENERIC MODEL

The 'input validation parser' in PETE checks the correctness of the inputs for both models. The validation includes checking the consistency of the table sizes for the current (Id) and capacitance (C) values, along with a polarity check of the current through the device (currents through N and P devices are assumed to be positive and negative, respectively). The detailed error report can be used to correct errors in the inputs before proceeding with circuit simulation in PETE.
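A minimal sketch of such a check is given below. It only illustrates the two tests named above (table-size consistency and current polarity); the function name, the table layout (Id as rows over Vgs and columns over Vds, C as one value per Vgs point) and the error messages are assumptions, not PETE's actual parser.

```python
def validate_device_tables(vgs_grid, vds_grid, id_table, c_table, device_type):
    """Return a list of error strings; an empty list means the input passed."""
    if device_type not in ("N", "P"):
        return ["device_type must be 'N' or 'P'"]
    errors = []
    # Table-size consistency: one Id row per Vgs point, one column per Vds point.
    if len(id_table) != len(vgs_grid):
        errors.append("Id table has %d rows, expected %d"
                      % (len(id_table), len(vgs_grid)))
    for r, row in enumerate(id_table):
        if len(row) != len(vds_grid):
            errors.append("Id row %d has %d columns, expected %d"
                          % (r, len(row), len(vds_grid)))
    if len(c_table) != len(vgs_grid):
        errors.append("C table has %d entries, expected %d"
                      % (len(c_table), len(vgs_grid)))
    # Polarity: N-device current is taken as positive, P-device as negative.
    sign = 1.0 if device_type == "N" else -1.0
    for r, row in enumerate(id_table):
        for c, value in enumerate(row):
            if sign * value < 0.0:
                errors.append("Id[%d][%d] = %g has the wrong polarity for a %s device"
                              % (r, c, value, device_type))
    return errors
```

Returning all problems at once, rather than stopping at the first one, mirrors the idea of a detailed error report that the user can work through before simulation.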

#### 3. THE PETE CIRCUIT/SYSTEM SOLVER

The PETE circuit/system solver consists of a set of algorithms that process the inputs (both MOSFET and generic) and generate power and delay results for a set of representative circuits. The results include DC results for an inverter circuit and transient results for unit cells like 2-inp NAND, 2-inp NOR and 2-inp XOR gates, mega cells like adders, chains of unit cells and ring oscillators. The algorithms for computing a) DC and b) AC (transient) results are discussed in further detail in the following sections.

#### 3.1 INVERTER DC CHARACTERISTICS

The DC performance of the inverter is obtained by calculating the inverter output voltage (Vout) for a given input voltage (Vin) by equating the currents through the P-device and the N-device. This calculation is performed for a range of input voltages (Vin), yielding the Voltage Transfer Characteristics (VTC) shown in fig. 2. From the VTC, PETE also calculates the noise margins, noise margin low (NML) and noise margin high (NMH), the input voltages low and high (VIL and VIH), and the output voltages low and high (VOL and VOH). PETE DC results also include the  $\beta$ -ratio (the ratio of P-device ON current to that of the N-device), which determines the width ratio of complementary P and N devices needed to obtain equal rise and fall delays at the circuit output.

Fig.2 Inverter VTC (with tox=1.1 nm and tox=1.8nm) using 65 nm CMOS

Fig. 2 shows VTC characteristics obtained using PETE and SPICE[7] based on PTM model[7] for two different devices (different tox) in 65nm CMOS technology. The close match between SPICE generated results and PETE shows that PETE has correctly captured the *static device characteristics*.
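The current-balance idea behind the VTC can be sketched as follows. The text does not state which root finder PETE uses for this step, so plain bisection is used here as a stand-in; the device model `toy_current` is a made-up smooth function, not a real transistor model, and P-device currents are handled as magnitudes for simplicity.

```python
import math

def toy_current(vgs, vds, k=1e-3):
    """Hypothetical device: current magnitude for |Vgs| and |Vds| >= 0."""
    return k * vgs ** 2 * math.tanh(vds / 0.2)

def inverter_vout(vin, vdd, i_n, i_p, tol=1e-6):
    """Find Vout where the pull-up and pull-down currents are equal."""
    lo, hi = 0.0, vdd
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # N-device: Vgs = Vin, Vds = Vout. P-device: |Vgs| = Vdd - Vin,
        # |Vds| = Vdd - Vout.
        if i_p(vdd - vin, vdd - mid) > i_n(vin, mid):
            lo = mid        # pull-up still stronger: Vout must be higher
        else:
            hi = mid
    return 0.5 * (lo + hi)

def inverter_vtc(vdd, i_n, i_p, n_points):
    """Return a list of (Vin, Vout) pairs covering 0..Vdd."""
    vins = [vdd * i / (n_points - 1) for i in range(n_points)]
    return [(vin, inverter_vout(vin, vdd, i_n, i_p)) for vin in vins]
```

With tabulated device data, `i_n` and `i_p` would be interpolators over the I-V tables rather than closed-form functions; noise margins can then be read off the resulting list of points.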

#### 3.2 TRANSIENT RESULTS FOR REPRESENTATIVE CIRCUITS

Transient results are necessary for circuit/system designers to come up with optimized circuit architectures for a device technology. The representative circuits in PETE include multi-input complementary logic circuits like NAND, NOR and XOR, which require stacking of devices (P-device, N-device or both). The current (Id) flowing through these stacked devices is computed self-consistently by solving for the node voltages at all intermediate nodes in the stack, as described in detail below.

#### Current in stacked devices:

Let us consider a 2-inp NAND gate (fig. 3) to illustrate the 'stack solving algorithm' used to solve for the current in stacked devices. When input B is connected to '1' and input A switches from '0' to '1', the current through the N-stack is determined by the voltages at the output node (Vout) and the intermediate node (V1), as shown in fig. 3. First, the derivative based Newton Raphson (NR) method [8] is used to compute the node voltage V1 self-consistently. However, the NR method fails in certain cases where the I-V and C-V data are obtained from experimental measurements or device simulations. This can happen due to sharp discontinuities in the device I-V or C-V characteristics or due to the non-existence of higher order derivatives in these characteristics. If NR fails, PETE uses non-derivative based methods [9] to determine the node voltages V1 and Vout. The non-derivative method uses an exhaustive search procedure to obtain the node voltage V1 by searching the entire voltage space (0 to Vdd). The non-derivative based method is guaranteed to converge, but at the cost of a longer convergence time. Once the node voltages are obtained, the current through the N-stack can be determined. For other stacked circuits like NOR and XOR, PETE employs a generalized algorithm, which is illustrated in detail in fig. 3. After obtaining the current through the stacked circuits, PETE computes the delay and power for all representative circuits, using the approach described in the following sub-sections.

Fig.3 Transistor stack solving algorithm
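The two-stage strategy described above (derivative based first, exhaustive search as a fallback) can be sketched for the two-transistor N-stack of a NAND gate. This is an illustrative sketch, not the algorithm of fig. 3: the toy device model, the function names, the numerical derivative and the grid size are all assumptions.

```python
import math

def newton_solve(g, v0, lo, hi, tol=1e-12, max_iter=50, dv=1e-6):
    """Newton-Raphson with a numerical derivative; returns None on failure."""
    v = v0
    for _ in range(max_iter):
        gv = g(v)
        if abs(gv) < tol:
            return v
        slope = (g(v + dv) - g(v - dv)) / (2.0 * dv)
        if slope == 0.0:
            return None                 # flat region: derivative is unusable
        v -= gv / slope
        if not lo <= v <= hi:
            return None                 # iterate left the valid voltage range
    return None

def exhaustive_solve(g, lo, hi, n_points=2001):
    """Scan the whole voltage range and keep the point with the least mismatch."""
    step = (hi - lo) / (n_points - 1)
    candidates = [lo + i * step for i in range(n_points)]
    return min(candidates, key=lambda v: abs(g(v)))

def solve_node(g, lo, hi, v0):
    """Try Newton-Raphson first, fall back to exhaustive search."""
    v = newton_solve(g, v0, lo, hi)
    return v if v is not None else exhaustive_solve(g, lo, hi)

def toy_current(vgs, vds, k=1e-3):
    """Hypothetical N-device current, zero for negative Vgs or Vds."""
    return k * max(vgs, 0.0) ** 2 * math.tanh(max(vds, 0.0) / 0.2)

def nand_stack_current(va, vb, vout, vdd, device=toy_current):
    """Solve the intermediate node V1 of a 2-input NAND N-stack.

    Top device: gate A, drain Vout, source V1. Bottom device: gate B,
    drain V1, source ground. Both must carry the same current.
    """
    def mismatch(v1):
        return device(va - v1, vout - v1) - device(vb, v1)
    v1 = solve_node(mismatch, 0.0, vdd, v0=0.0)
    return v1, device(vb, v1)
```

The exhaustive search is slower and limited to the grid resolution, which is consistent with the trade-off noted above: it always returns an answer, but at a higher cost than a converging NR iteration.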

#### Calculation of the switching delay:

Delay of a circuit is defined as the time taken to propagate information from the circuit input to the circuit output. Delay is calculated by estimating the time taken to charge or discharge the output node capacitance, as the input changes. To compute the circuit delay faster without losing accuracy, PETE computes the delay of logic gates in the Voltage domain. The entire output voltage range ('0' to 'Vdd') is divided into several short voltage intervals and the time taken for the output to transition through each voltage interval is calculated. The time taken for each voltage step is given by

$$\Delta T_i = \frac{Ctot_{i+1}V_{i+1} - Ctot_iV_i}{(I_{i+1} + I_i)/2}, \quad I_i = \text{net}(I_{char} - I_{dischar}) \tag{11}$$

where Ctot is the sum of the intrinsic device capacitance  $(C_{tg})$,  extrinsic device capacitance  $(C_{ext})$  and fixed interconnect capacitance  $(C_{fix})$,  which are all inputs to the PETE model as explained in section 2. The time taken  $(\Delta T_i)$  for each voltage interval is summed to obtain the total delay. When the output capacitance is charging, the voltage interval starts at '0' and goes to 'Vdd'; when it is discharging, the voltage interval goes from 'Vdd' to '0'. The delays  ($T_{delay\text{-}charge}$  and  $T_{delay\text{-}discharge}$)  are given by

$$T_{delay\text{-}charge} = \sum_{V_i=0}^{0.9 V_{dd}} \Delta T_i, \quad T_{delay\text{-}discharge} = \sum_{V_i=0.9 V_{dd}}^{0} \Delta T_i \tag{12}$$

and the average delay is given by

$$T_{avg\text{-}delay} = \frac{T_{delay\text{-}charge} + T_{delay\text{-}discharge}}{2} \tag{13}$$

Fig. 4a) shows the average delay results obtained with PETE and BSIM based HSPICE for different logic gates (designed using 65nm PTM CMOS technology[7]) with different load capacitances (C<sub>L</sub>). The close match shows that PETE model has correctly captured the transient characteristics of the device.
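To make the voltage-domain delay calculation of eqs. 11-13 concrete, the sketch below sums the time spent in each voltage interval. The function names, the number of voltage steps and the callables `ctot` and `i_net` (total capacitance and net charging current as functions of the output voltage) are assumptions made for illustration; the summation limits follow eq. 12 as printed.

```python
def transition_delay(v_start, v_end, ctot, i_net, n_steps=200):
    """Sum eq. 11 over equal voltage intervals between v_start and v_end.

    ctot(v) is the total output capacitance and i_net(v) the net current
    (charging minus discharging) at output voltage v. During discharge both
    the charge change and the net current are negative, so each step is
    still a positive time.
    """
    dv = (v_end - v_start) / n_steps
    total = 0.0
    for i in range(n_steps):
        va = v_start + i * dv
        vb = v_start + (i + 1) * dv
        delta_q = ctot(vb) * vb - ctot(va) * va      # numerator of eq. 11
        i_avg = 0.5 * (i_net(va) + i_net(vb))        # denominator of eq. 11
        total += delta_q / i_avg
    return total

def gate_delays(vdd, ctot, i_charge_net, i_discharge_net, n_steps=200):
    """Return (charge delay, discharge delay, average delay), eqs. 12-13."""
    t_charge = transition_delay(0.0, 0.9 * vdd, ctot, i_charge_net, n_steps)
    t_discharge = transition_delay(0.9 * vdd, 0.0, ctot, i_discharge_net, n_steps)
    return t_charge, t_discharge, 0.5 * (t_charge + t_discharge)
```

For a constant capacitance C and a constant current I the sum collapses to C·0.9Vdd/I, which is a convenient sanity check before plugging in tabulated device data.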

#### Active & leakage power

Active power  $(P_{act})$  is defined as the power consumed in switching the circuit output.  $P_{act}$  is calculated using the equation

$$P_{act} = \frac{V_{dd} \sum_{delay} \Delta I \, \Delta T}{T_{delay}} \tag{14}$$

where  $\Delta I$  is the current flowing out of Vdd in time  $\Delta T$  and  $T_{delay}$  is the output switching time (either  $T_{delay\text{-}charge}$  or  $T_{delay\text{-}discharge}$,  depending on charging or discharging, as given by eq. 12).  $P_{leak}$  is the power consumed when the circuit is idle and it is calculated using the equation

$$P_{leak} = I_{leak} \cdot V_{dd} \tag{15}$$

where  $I_{leak}$  is leakage current through circuit contributed by devices with  $|Vgs| \leq 0V$  and  $|Vds| \geq 0V$ .
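Eqs. 14 and 15 translate almost directly into code. In the sketch below the supply current and the duration of each voltage step are passed in as two equal-length lists; these argument names and the list-based interface are hypothetical choices for illustration.

```python
def active_power(vdd, supply_currents, time_steps):
    """Eq. 14: average power drawn from Vdd over the switching time."""
    if len(supply_currents) != len(time_steps):
        raise ValueError("need one supply current per time step")
    t_delay = sum(time_steps)
    # Charge drawn from the supply, accumulated step by step.
    charge = sum(i * dt for i, dt in zip(supply_currents, time_steps))
    return vdd * charge / t_delay

def leakage_power(vdd, i_leak):
    """Eq. 15: power consumed when the circuit is idle."""
    return i_leak * vdd
```

The per-step durations would typically be the  $\Delta T_i$  values already produced by the delay calculation, so power comes at little extra cost once the delay has been evaluated.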

After computing  $P_{act}$  and  $P_{leak}$  for the unit cells, PETE estimates the power of mega-cells like the Ripple Carry Adder (RCA) using an activity based power calculation procedure, which is explained in the following sub-section.

#### Activity based power calculation

Fig.4 a) Delay Vs load characteristics of Inverter, 2-inp NAND and 2-inp NOR using 65nm CMOS

Fig.4 b) Ring Oscillator Frequency Vs Parasitic Capacitance using 15nm CMOS[12]

The procedure to estimate power dissipation in logic can be explained using the example of a Ripple Carry Adder (RCA). The RCA is commonly used in arithmetic logic units (ALUs) and digital signal processing systems (DSPs). An 8-bit RCA is designed in PETE using 16 2-inp XOR gates and 12 2-inp NAND gates. The frequency of this circuit is determined by the critical path delay, which is the 'carry in' (Ci) to 'carry out' (Cout) path as indicated in fig. 5, and it is given by

$$CP_{delay} = \sum_{i=1}^{8} CP_{FAi} \tag{16}$$

where  $CP_{FA}$  is the critical path delay of the Full Adder (FA) in RCA and  $CP_{FA}$  is given by  $CP_{FA} = Delay_{XOR} + 2.Delay_{NAND}$ 

where Delay<sub>XOR</sub> and Delay<sub>NAND</sub> are the delays of the 2-inp XOR and 2-inp NAND, respectively. PETE employs a signal activity based P<sub>act</sub> and P<sub>leak</sub> calculation for the RCA, since only a subset of the intermediate nodes switches at any particular time during the complete computation period. It is assumed that all inputs (A, B and Ci) have the same transition probability (or activity) of 0.5.

P<sub>act</sub> for the FA is given by

$$P_{actFA} = 2 \frac{Delay_{XOR}}{Delay_{FA}} \cdot \frac{1}{2} \cdot \frac{1}{2} \cdot P_{actXOR} + 2 \frac{Delay_{NAND}}{Delay_{FA}} \cdot \frac{1}{4} \cdot \frac{3}{4} \cdot P_{actNAND} + \frac{Delay_{NAND}}{Delay_{FA}} \cdot \frac{7}{16} \cdot \frac{9}{16} \cdot P_{actNAND} \tag{17}$$

where  $P_{actXOR}$  and  $P_{actNAND}$  are the active power of the 2-inp XOR and 2-inp NAND, respectively. The  $P_{act}$  of the 8-bit RCA is given by

$$P_{actRCA} = \frac{\sum_{i=1}^{8} (CP_{FAi} \cdot P_{actFAi})}{\sum_{i=1}^{8} CP_{FAi}} \tag{18}$$

$P_{leakFA}$  is the leakage power of the FA and is given by

$$P_{leakFA} = 2 \cdot \frac{Delay_{FA} - Delay_{XOR}}{Delay_{FA}} \cdot P_{leakXOR} + 3 \cdot \frac{Delay_{FA} - Delay_{NAND}}{Delay_{FA}} \cdot P_{leakNAND} \tag{19}$$

where  $P_{leakXOR}$  and  $P_{leakNAND}$  are the leakage power of the 2-inp XOR and 2-inp NAND, respectively.

P<sub>leak</sub> of RCA is computed from P<sub>leakFA</sub> using the following equation

$$P_{leakRCA} = \frac{\sum_{i=1}^{8} (CP_{FAi} \cdot P_{leakFAi})}{\sum_{i=1}^{8} CP_{FAi}} \tag{20}$$

Schematic diagram of Full Adder (FA) along with the critical path and signal activity at each node is given in fig. 5.
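The adder relations of eqs. 16-20 can be collected in a few small functions. The sketch below is illustrative only: the function names are hypothetical, and it assumes that Delay<sub>FA</sub> in eqs. 17 and 19 equals the full-adder critical path delay CP<sub>FA</sub>, which the text does not state explicitly.

```python
def fa_critical_path(delay_xor, delay_nand):
    """Critical path of one full adder: one XOR followed by two NANDs."""
    return delay_xor + 2.0 * delay_nand

def rca_critical_path(fa_delays):
    """Eq. 16: carry-in to carry-out delay of the ripple carry adder."""
    return sum(fa_delays)

def fa_active_power(delay_xor, delay_nand, p_act_xor, p_act_nand):
    """Eq. 17, with Delay_FA taken as the FA critical path (assumption)."""
    d_fa = fa_critical_path(delay_xor, delay_nand)
    xor_term = 2.0 * (delay_xor / d_fa) * (1 / 2) * (1 / 2) * p_act_xor
    nand_term = 2.0 * (delay_nand / d_fa) * (1 / 4) * (3 / 4) * p_act_nand
    carry_term = (delay_nand / d_fa) * (7 / 16) * (9 / 16) * p_act_nand
    return xor_term + nand_term + carry_term

def fa_leakage_power(delay_xor, delay_nand, p_leak_xor, p_leak_nand):
    """Eq. 19: each gate leaks for the part of Delay_FA in which it is idle."""
    d_fa = fa_critical_path(delay_xor, delay_nand)
    return (2.0 * (d_fa - delay_xor) / d_fa * p_leak_xor
            + 3.0 * (d_fa - delay_nand) / d_fa * p_leak_nand)

def rca_weighted_power(fa_delays, fa_powers):
    """Eqs. 18 and 20: delay-weighted average of the per-FA power."""
    weighted = sum(d * p for d, p in zip(fa_delays, fa_powers))
    return weighted / sum(fa_delays)
```

When all eight full adders are identical, the weighted averages of eqs. 18 and 20 reduce to the single-FA values, so the weighting only matters when the per-stage delays or powers differ.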

Significance of device parasitics and interconnects in emerging technologies: It is important to note that the capacitance (Ctot) used in eq. 11 is the total capacitance of the circuit, including all parasitic (extrinsic) capacitance and interconnect (fixed) capacitance. Fig. 4b) shows the dependence of RO frequency on parasitic capacitance. It can be observed that the frequency degrades super-linearly with increasing parasitic capacitance. As eq. 11 and eq. 14 indicate, both circuit delay and power degrade with higher capacitance, so the device needs to be characterized for both intrinsic and extrinsic capacitances, along with feasible interconnect materials, to obtain an accurate circuit/system level performance estimate.

#### 4. COMPARISON OF DIFFERENT EMERGING TECHNOLOGIES

In this section we will demonstrate the use of PETE in evaluating four prominent emerging technologies, namely, a 15nm scaled thermal MOSFET, an optimized double gate FinFET transistor, a band-to-band tunneling FET and a thermal MOSFET with a ferroelectric dielectric. Since the application space of these four different device technologies will potentially be different, it is prudent to define a metric that can span the entire space of high performance as well as low power computation. In order to compare the different emerging technologies, we propose the use of a single performance metric, namely weighted frequency per unit wattage (P) as defined by:

$$P = \frac{F^{\alpha}}{(P_{act} + P_{leak})^{\beta}} \tag{21}$$

where F is the normalized frequency of operation (normalized with respect to the ring oscillator frequency of an ideal MOSFET with S=60 mV/decade, Ioff=1 nA/um,  $V_{th\_vdd}=300$ mV  and Vdd=0.9 V),  $P_{act}$  is the active power (eq. 14) and  $P_{leak}$  is the leakage power (eq. 15), both normalized with respect to Pact+Pleak of the same ideal MOSFET. With the ideal MOSFET input parameters we have obtained F=57GHz and P<sub>act</sub>+P<sub>leak</sub>=64μW for a 5-stage ring oscillator, which is used as the reference in all subsequent calculations.

Fig.5 Ripple Carry Adder with circuit activity at different nodes

Fig.6 Delay Vs Subthreshold Swing (S) at Iso-I<sub>leak</sub> conditions

Fig.7 Ileak Vs Subthreshold Swing (S) at Iso-Delay conditions

Fig.8a) Band diagram for 15nm MOSFET device

Fig.8b) FinFET device with source/drain under/overlap

The values  $\alpha$  and  $\beta$  determine whether performance or power dissipation is the primary design target. They are set depending on the specific application in which the device is used. For instance, a device that is used in a High Performance (HP) application should have  $\alpha \ge 1$  and  $\beta \le 1$,  whereas a device used in a Low Power (LP) application should have  $\alpha \le 1$  and  $\beta \ge 1$.  Note further that when  $\alpha = 1$  and  $\beta = 1$,  the metric P reduces to the normalized inverse of energy per switching, a traditional circuit design metric. Furthermore, the on-current of the device, Ion, can be written as

$$I_{on} = I_{ideal} \cdot TP \tag{22}$$

where TP is the transmission probability and I<sub>ideal</sub> is the I<sub>on</sub> of the ideal thermal MOSFET. In addition,  $I_{leak}$  (used in the calculation of P<sub>leak</sub>, eq. 15) can be modeled as

$$I_{leak} = K \cdot 10^{(-1/S)} \tag{23}$$

where K is a constant and S represents the subthreshold swing (an input parameter for the MOSFET model in PETE). The significance of S for device performance can be understood from fig. 6 and fig. 7. Devices with higher S will result in higher I<sub>leak</sub> at iso-delay, as shown in fig. 7, or conversely in higher delay at iso-$I_{leak}$,  as shown in fig. 6. It can be concluded that devices with lower 'S' can lead to an improvement of both power and performance. Note that in standard MOSFETs, due to thermionic emission, the lower limit of S is 60mV/decade (ideal MOSFET), which bounds the total leakage in MOSFETs.

We have used PETE to analyze and compare four application specific novel devices, a) 15nm MOSFET, b) Optimized double gate MOSFET (FinFET), c) BTBT CNT-FET (for LP applications) and d) FEFET (for HP applications) using our proposed metric P. The characteristics of each of these devices are imported to PETE and used for calculating the circuit level metrics.

#### Scaled 15nm bulk MOSFET[12]:

The 15nm bulk MOSFET characteristics have been derived from the ITRS roadmap and from [12]. As shown in fig. 8a, the device relies on thermionic emission of carriers over the channel barrier, which restricts the subthreshold swing (S) to 60mV/decade. Along with the restriction on S, the device also suffers from DIBL and higher leakage current compared to the ideal MOSFET (S=60mV/decade) due to severe short channel effects.

#### Optimized 32nm FinFET[13]:

FinFETs [13] are designed to offer more gate control over the channel compared to single gate MOSFETs. As a result, short channel effects like DIBL are better controlled in FinFET devices. The device also offers the additional advantage of a lower subthreshold swing due to the use of a fully depleted body. FinFETs can be further optimized to include source and drain underlapping [13] to obtain improved gate overlap capacitance with a marginal decrease in Ion. We have considered two flavors of FinFET devices: a) Nominal FinFET and b) Symmetric underlap (equal source and drain underlap) FinFET, which has been optimized for higher performance [13]. Fig. 8b shows a symmetrically underlapped FinFET device.

Fig 9: 2D plot for P, Transmission Probability Vs Sub-threshold swing for a high performance(HP: α=2.0, β=0.5) application (high Vdd: 1.0V, nominal Vdd: 0.9V and low Vdd: 0.3V)

#### BTBT CNT-FET[2]:

Fig.8c) BTBT CNT device

The BTBT FET considered in this analysis is a carbon nanotube (CNT) based tunneling FET, described in detail in [2]. The band diagram of a BTBT device is shown in fig. 8c. It can be observed that electrons tunnel from the source valence band to the channel conduction band through a tunneling barrier. Due to this tunneling mechanism, a subthreshold swing better than 60mV/decade can be obtained, but it comes at the cost of lower TP. Hence BTBT devices represent a class of extremely low current devices.

#### 65nm FEFET[4]:

An FEFET (proposed in [4]) is a MOSFET device with a subthreshold swing (S) better than 60mV/decade due to the presence of a ferroelectric dielectric between the gate and the channel. The hysteresis present in the ferroelectric dielectric is engineered to obtain negative gate capacitance, which decreases S as shown in fig. 8d. Thus it provides excellent subthreshold swing without lowering the TP or I<sub>on</sub> of the device.

Every device described above is simulated with iso-Vdd (=0.9V) in PETE and a set of circuit performance results are computed. Table 2 shows some of the results obtained from PETE. The values for the performance parameter (P) are also computed for both high performance (HP) and low power (LP) settings for each device



Fig 10: 2D plot for P, Transmission Probability Vs Sub-threshold swing for a low power(LP:  $\alpha$ =0.5,  $\beta$ =2.0) application (high Vdd: 1.0V, nominal Vdd: 0.9V and low Vdd: 0.3V)

| Technology /<br>Circuits<br>(Vdd=0.9V)          | Inverter Delay(pSec) |                       |                                | Inverter Power (µW) |                   | Ring-<br>oscillator        | Ring                            | 8-bit                  | 10-<br>stage                     | 10-<br>stage                   | Weighted frequency<br>per unit wattage (P)<br>based on RO circuit |                   |                   |
|-------------------------------------------------|----------------------|-----------------------|--------------------------------|---------------------|-------------------|----------------------------|---------------------------------|------------------------|----------------------------------|--------------------------------|-------------------------------------------------------------------|-------------------|-------------------|
|                                                 | PETE                 | CV<br>I <sub>on</sub> | CV<br>I <sub>eff</sub><br>[11] | PETE                | CV <sup>2</sup> f | (RO)<br>Frequency<br>(GHz) | oscillator<br>(RO)Power<br>(µW) | RCA<br>Delay<br>(nSec) | Nand<br>chain<br>delay<br>(nSec) | Nand<br>chain<br>power<br>(µW) | HP<br>α=2<br>β=.5                                                 | NOM<br>α=1<br>β=1 | LP<br>α=.5<br>β=2 |
| BTBT CNT [2],<br>L=35nm                         | 5.7                  | 3.8                   | 6.5                            | 145                 | 184               | 17.5                       | 36                              | 0.3                    | 0.17                             | 138                            | 0.12                                                              | 0.55              | 1.76              |
| MOSFET[12],<br>L=15nm                           | 2.5                  | 1.6                   | 2.4                            | 1010                | 1125              | 40                         | 255                             | 0.15                   | 0.13                             | 518                            | 0.25                                                              | 0.17              | 0.05              |
| FEFET [4],<br>L=65nm                            | 3.85                 | 4.6                   | 11                             | 182                 | 218               | 26                         | 45.5                            | 0.06                   | 0.19                             | 98.4                           | 0.25                                                              | 0.64              | 1.34              |
| Nominal<br>FinFET,<br>L=35nm                    | 2.4                  | 2.2                   | 2.9                            | 1600                | 1788              | 42                         | 424                             | 0.11                   | 0.05                             | 966                            | 0.21                                                              | 0.11              | 0.02              |
| Symmetric<br>Underlap<br>FinFET [13],<br>L=35nm | 2.3                  | 2.2                   | 3                              | 1220                | 1693              | 42.7                       | 344                             | 0.1                    | 0.05                             | 767                            | 0.24                                                              | 0.14              | 0.03              |

Table 2. Benchmark results obtained with PETE for different novel technologies (the values of the metric P for the devices most suitable for HP and LP applications are marked in bold)

technology. It is observed that the double gate devices (Lgate=35nm) offer nearly the same benefit as the 15nm MOSFET in HP applications (P of 0.21 and 0.24 versus 0.25 in Table 2), which supports the prediction that multi-gate devices will provide a suitable replacement for single gate devices in scaled technologies. The optimized FinFET [13] outperforms the nominal FinFET in both the HP and LP metrics. From this observation we can predict that an optimized FinFET (with source and drain underlap) will have power and performance advantages over the nominal FinFET. Another interesting observation is that the FEFET offers more benefit than the Ideal MOSFET for low power applications but not for HP applications. To understand this further, Vdd is increased to 1V and the performance metrics of the FEFET and the Ideal MOSFET are compared. It is observed that P for HP increases to 1 and P for LP decreases to 0.35. This shows that the FEFET can be used for both LP and HP applications, at different supply voltages. Table 2 also illustrates that the nominal metric, the inverse of the energy per switching event ( $\alpha$ =1 and  $\beta$ =1), cannot capture the true merit of a device over the entire range of applications from HP to LP.
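The effect of the weights on the device ranking can be illustrated with a short sketch that uses only the ring oscillator frequency and power columns of Table 2. It assumes that the metric has the form P = TP<sup>α</sup>/Power<sup>β</sup>, with the RO frequency standing in for TP; this form is an assumption of the sketch, chosen because it reduces to the inverse of the energy per switching event for α=β=1 (frequency divided by power). The function names are hypothetical and no normalization is applied, so only the ordering of the devices, not the absolute values, should be compared with Table 2.

```python
# Ring oscillator frequency (GHz) and power (uW), copied from Table 2.
RO_DATA = {
    "BTBT CNT, L=35nm": (17.5, 36.0),
    "MOSFET, L=15nm": (40.0, 255.0),
    "FEFET, L=65nm": (26.0, 45.5),
    "Nominal FinFET, L=35nm": (42.0, 424.0),
    "Symmetric underlap FinFET, L=35nm": (42.7, 344.0),
}

# (alpha, beta) weights of the three application classes in Table 2.
WEIGHTS = {"HP": (2.0, 0.5), "NOM": (1.0, 1.0), "LP": (0.5, 2.0)}


def weighted_metric(freq_ghz, power_uw, alpha, beta):
    """ASSUMED form of the metric: throughput**alpha / power**beta."""
    return freq_ghz ** alpha / power_uw ** beta


def rank_devices(mode):
    """Return the device names sorted from highest to lowest assumed metric."""
    alpha, beta = WEIGHTS[mode]
    scores = {
        name: weighted_metric(freq, power, alpha, beta)
        for name, (freq, power) in RO_DATA.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    for mode in WEIGHTS:
        print(mode, rank_devices(mode))
```

Under these assumptions the BTBT device ranks first for LP and last for HP, and the underlap FinFET ranks above the nominal FinFET in both cases, which is the same ordering as the P columns of Table 2.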

The use of each of these devices in a ring oscillator is illustrated in figs. 9 and 10. We have used PETE to generate a 2D profile of the metric P (for different values of  $\alpha$  and  $\beta$ ) for devices with varying TP and subthreshold swing. The five optimized devices under consideration along with Ideal MOSFET and regular FinFET are six points in this 2D profile plot. The red (blue) region represents a higher (lower) value of P. Note that for HP applications a high TP is desired (the subthreshold swing is of lower significance), whereas for LP applications a lower subthreshold swing is more useful. Figs. 9 and 10 lead to two distinct conclusions. In HP applications (fig. 9) with Vdd=1V, the FEFET yields a value of P similar to that of the Ideal MOSFET, but with Vdd=0.9V the FEFET is inferior to the Ideal MOSFET. The 15nm MOSFET performs similarly to the FEFET at Vdd=0.9V. This is indicated by the colors of the regions in which these points lie. In LP applications (fig. 10), however, the FEFET (with Vdd=0.9V) achieves a significantly higher value of P than both the Ideal MOSFET and the 15nm MOSFET. BTBT devices, in turn, offer much higher benefits than FEFET devices in the low power application space, owing to their lower subthreshold swing and lower Ion. Hence, the different genres of devices represent different trade-offs between power and performance, and the ideal choice is guided by the target application. PETE provides a common benchmark for evaluating devices with different conduction mechanisms and identifying the suitable choice. Our proposed metric captures the trade-off between power and performance, and it can be weighted appropriately to suit the target application.
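The role of the weights can be made explicit with a short worked relation. This is a sketch that assumes the metric has the form $P = TP^{\alpha}/\mathrm{Power}^{\beta}$, which is an assumption here rather than a restatement of the definition used by PETE. Taking the logarithm and differentiating gives

$$\ln P = \alpha \ln TP - \beta \ln \mathrm{Power}, \qquad \frac{dP}{P} = \alpha\,\frac{dTP}{TP} - \beta\,\frac{d\,\mathrm{Power}}{\mathrm{Power}}.$$

Under this assumption, a device change improves P only if the relative gain in TP exceeds $\beta/\alpha$ times the relative increase in power. With the weights of Table 2, this ratio is 0.25 for HP ($\alpha$=2, $\beta$=0.5), 1 for the nominal case, and 4 for LP ($\alpha$=0.5, $\beta$=2), so a 1% increase in power must be offset by a TP gain of about 0.25%, 1% and 4%, respectively.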

#### 5. CONCLUSION

In this work we have developed PETE, an exploratory device simulation framework that can be used to assess the performance of emerging devices. It has been deployed for public use on the nanoHUB.

PETE has been used to benchmark different novel technologies such as the BTBT FET, the FEFET, and a 15nm MOSFET. We note that these three devices have different application domains, depending on whether the target design is performance constrained or power constrained. A new weighted and unified metric has been defined to evaluate these devices with PETE. PETE removes the need to develop compact models at an early stage of a device's inception and can help experimentalists as well as theoreticians obtain an early understanding of the circuit/system level performance of new device technologies. The software has been deployed (<a href="https://www.nanohub.org/resources/2841">www.nanohub.org/resources/2841</a>) for public use, and its usage among academic and industrial researchers (more than 100 unique users) has increased continuously since its launch.

#### REFERENCES

- [1] S. E. Thompson et al., IEEE TSM, vol. 18, no. 1, pp. 26–36, February 2005.
- [2] S. O. Koswatta et al., Nano Lett., vol. 7, pp. 1160–1164,
- [3] A. Raychowdhury et al., TCAD, pp. 1411–1420, Oct. 2004.
- [4] S. Salahuddin, S. Datta, Nano Lett., doi: 10.1021/nl071804g, 2007.
- [5] R. Cowburn and M. Welland, Science, vol. 287, pp. 1466–1468, 2000.
- [6] Taurus-Medici: Synopsys Corp., Mountain View, CA.
- [7] http://www.eas.asu.edu/~ptm/latest.html.
- [8] C. T. Kelley, Fundamentals of Algorithms, SIAM, 2003.
- [9] S. Climer, ICAPS 2004, June 3-7, 2004.
- [10] S. Mukhopadhyay et al., IEEE DAC, 2003.
- [11] J. Deng et al., IEEE TED, vol. 53, no. 6, pp. 1317–1322, June
- [12] A. Khakifirooz et al., IEEE TED, vol. 55, no. 6, pp. 1391–1400, June 2008.
- [13] A. Bansal et al., IEEE TED, vol. 54, no. 6, June 2007.