<table>
<thead>
<tr>
<th><strong>Title</strong></th>
<th>A Low-Power Hybrid RO PUF With Improved Thermal Stability for Lightweight Applications</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Author(s)</strong></td>
<td>Cao, Yuan; Zhang, Le; Chang, Chip-Hong; Chen, Shoushun</td>
</tr>
<tr>
<td><strong>Date</strong></td>
<td>2015</td>
</tr>
<tr>
<td><strong>URL</strong></td>
<td><a href="http://hdl.handle.net/10220/39237">http://hdl.handle.net/10220/39237</a></td>
</tr>
<tr>
<td><strong>Rights</strong></td>
<td>© 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [<a href="http://dx.doi.org/10.1109/TCAD.2015.2424955">http://dx.doi.org/10.1109/TCAD.2015.2424955</a>].</td>
</tr>
</tbody>
</table>
A Low-power Hybrid RO PUF with Improved Thermal Stability for Lightweight Applications

Yuan Cao, Student Member, IEEE, Le Zhang, Student Member, IEEE, Chip-Hong Chang, Senior Member, IEEE, and Shoushun Chen, Senior Member, IEEE

Abstract—Ring oscillator (RO) based physical unclonable function (PUF) is resilient against noise impacts, but its response is susceptible to temperature variations. This paper presents a low-power and small footprint hybrid RO PUF with a very high temperature stability, which makes it an ideal candidate for lightweight applications. The negative temperature coefficient of the low-power subthreshold operation of current starved inverters is exploited to mitigate the variations of differential RO frequencies with temperature. The new architecture uses conspicuously simplified circuitries to generate and compare a large number of pairs of RO frequencies. The proposed 9-stage hybrid RO PUF was fabricated using GF 65 nm CMOS technology. The PUF occupies only 250 $\mu$m² of chip area and consumes only 32.3 $\mu$W per CRP at 1.2 V and 230 MHz. The measured average and worst-case reliability of its responses are 99.84% and 97.28%, respectively over a wide range of temperature from –40 to 120 °C.

Index Terms—Physical Unclonable Function, ring oscillator, hardware security, temperature stability, process variation.

I. INTRODUCTION

The internet of things (IoT) is envisaged to become an ultimate driver for the next growth phase of semiconductor industry. Lightweight electronic tagging technologies will avail themselves most in this ubiquitous computing revolution of advance connectivity of devices, systems and services. Unfortunately, the footprint and power budget have severely limited the strength of cryptographic algorithm implementable on radio frequency identification (RFID) and other intelligent tags. The secret data stored in these devices can be easily read or reverse engineered and copied [1]. Critics are concern that the widespread IoT adoption will make cyber attack an increasingly devastating physical (as opposed to virtual) threat. In this light, physical unclonable function (PUF) comes in handy as a new secure and low-cost primitive for integrated circuit (IC) authentication and counterfeit prevention [1].

A PUF is a circuit module that generates chip signatures based on its innate uncontrollable and unpredictable manufacturing process variations. Many PUFs have been proposed and successfully implemented in mobile devices [2]–[6]. RO PUF is superior to other silicon based PUFs [7] in that: 1) The RO can be implemented as a hard macro and instantiated as many times as needed in the top-level design, making all the ROs identical in terms of placement and routing. The output frequencies are also independent of the delay due to the routing of the RO outputs to the counter. 2) The difference in RO frequencies can be amplified by allowing them to “ring” for a longer time. Albeit the above advantages of RO PUF, the reliability of its responses is still highly susceptible to temperature variations [2]. In [8], a temperature-aware cooperative RO PUF is proposed. Bit generation rules are defined to convert the unreliable bits. In [5], [9], this problem is addressed by selecting only those pairs of oscillators of sufficiently large frequency distances to desensitize their variations with temperature. Methods to correct the noisy bits by using fuzzy extractors [7] are also proposed to improve the reliability of the PUF at the cost of its hardware area, power and complexity of operation. In [10], multi-level supply voltages are used to stabilize the PUF responses at varying operating temperature, with the drawback of additional power management circuits for voltage monitoring and sequencing.

In this paper, we propose a novel design of RO PUF that has much lower power and area consumptions than the conventional implementations, yet possesses enhanced reliability. To counteract the effect of thermal induced deviations in a randomly chosen pair of ROs, each RO consists of a nearly equal (i.e., different by one) number of positive temperature coefficient current starved inverter stages and negative temperature coefficient regular inverter stages to prevent the flipping of the response bit. The current starved inverter stages operate in the subthreshold region, which reduce the overall power consumption significantly. To exponentially increase the number of RO frequencies that can be generated for a given area, each RO in the randomly selected pairs are constructed from one of the two inverters in each inverter stage.

II. CLASSIC RO PUF AND ITS TEMPERATURE-INDUCED RESPONSE STABILITY PROBLEM

The classic RO PUF is made of 2 $N$-to-1 multiplexors, 2 counters, 1 comparator and $N$ identical ROs, as shown in Fig. 1(a). Due to the inter- and intra-chip process variations, the frequency of each RO differs. A $2 \log_2 N$-bit challenge is input to the two multiplexors to select a pair of ROs. Depending on which of the selected ROs has a larger frequency, the response bit of the PUF is either 0 or 1. Therefore, the greater the difference between the oscillation frequencies of any RO pair, the more reliable is the output response bit of the PUF. A 1-out-of-$k$ masking scheme is adopted in [5], where groups of $k$ 5-stage ROs with $k = 8$ are implemented. A stable response bit is derived by selecting the pair of ROs in a group that has the maximum frequency difference. The reliability is improved over the classic RO PUF by requiring $k \times n$ ROs for an $n$-bit
response. In [9], each RO is replaced by a configurable RO in a CLB of a FPGA. The configurable design enables \( k \) instead of one RO pair to be formed between two CLBs. Similar to the 1-out-of-\( k \) masking scheme, the pair that has the maximum distance among \( k \) RO pairs is selected. High reliability is achieved at the cost of substantial hardware redundancy.

The dynamic variation of the oscillation frequency with temperature is a major concern for the response bit stability. The output frequency of the oscillator is inversely proportional to the temperature [5]. Fig. 1(b) shows a scenario that the temperature dependent changes in the oscillation frequencies of the two ROs can reverse the relation between their frequencies and generate an error bit as the temperature varies from \( t_1 \) to \( t_2 \). Fig. 1(b-ii) shows the scenario that the temperature dependent changes in the oscillation frequencies of the two ROs are small enough to avoid the output of the PUF from flipping.

The oscillation frequency of the RO is directly determined by the propagation delay \( t_d \) of each inverter stage. The first order estimate of \( t_d \) can be expressed as [11]:

\[
t_d = \frac{C_0 V_{ld}}{\eta I_D}
\]

(1)

where \( C_0 \) is the total load capacitance, \( V_{ld} \) is the power supply voltage, \( \eta I_D \) is the mean current (disregard leakage and short-circuit current), \( \eta \) is a fixed parameter for a given inverter and \( I_D \) is the saturation current. To a crude approximation, \( I_D \) is given by [11]:

\[
I_D = \frac{\mu C_{ox} W}{2L} (V_{GS} - V_t)^2
\]

(2)

where \( W, L, V_{GS}, C_{ox}, V_t \) and \( \mu \) are the effective channel width and length, gate-to-source voltage, gate capacitance, threshold voltage and charge carrier mobility, respectively.

From (2), the temperature coefficient of switching current \( TCC \) [12] can be derived as:

\[
TCC = \frac{1}{I_D} \frac{d I_D}{dT} = \frac{1}{\mu I_D} \frac{d \mu}{dT} - \frac{2}{V_{GS} - V_t} \frac{d V_t}{dT}
\]

(3)

The temperature dependent parameters, \( V_t \) and \( \mu \), are expressed as [2]:

\[
V_t(T) = V_t(T_0) - \sigma (T - T_0)
\]

(4)

\[
\mu(T) = \mu(T_0) \left( \frac{T}{T_0}\right)^\kappa
\]

(5)

where \( T_0 \) is the reference temperature. \( \kappa \) and \( \sigma \) are respectively the mobility temperature exponent in the range of 1.2~2 and the threshold voltage temperature coefficient in the range of 0.5~3mV/K.

The threshold voltage \( V_t(T) \) decreases with increasing temperature, resulting in a rising drain saturation current as temperature increases. On the contrary, the mobility of charge carriers decreases with increasing temperature, which in turn reduces the drain saturation current. The reduction in carrier mobility is more prominent than the reduction in threshold voltage in the super-threshold operation region. Consequently, the delay of a regular inverter gate exhibits an overall positive temperature dependence relation.

III. PROPOSED TEMPERATURE COEFFICIENT COMPENSATED HYBRID-INVERTER BASED RO PUF

By adding two transistors to the regular inverter circuit, the MOSFET transistors of the current starved inverter circuit in Fig. 2(a) can be made to operate in the sub-threshold region by adjusting the bias voltages \( V_p \) and \( V_n \). The drain current can be expressed as:

\[
I_{D,sub} = \frac{\mu C_{ox} W}{L} \left( \frac{\kappa_B T}{q}\right)^2 (n - 1) e^{\frac{q(V_{GS} - V_t)}{n k_B T}} (1 - e^{\frac{qV_d}{n k_B T}})
\]

(6)

where \( \kappa_B \) is a temperature independent coefficient. \( C_S, C_d \) and \( C_{ox} \) are the capacitance associated with the semiconductor, fast surface states and gate oxide, respectively. The temperature coefficient of the switching current \( TCC_{sub} \) can be formulated as [12]:

\[
TCC_{sub} = \frac{1}{\mu u} \frac{d \mu}{dT} + \frac{2}{T} - \frac{q}{nk_BT} \left( \frac{d V_t}{dT} + \frac{V_{GS} - V_t}{T}\right)
\]

(7)
Since the decrease of threshold voltage dominates the decrease of mobility with increasing temperature in the subthreshold region, the value of $TCC_{sub}$ is negative [12]. As a result, the delay of a current starved inverter stage decreases with increasing temperature.

Based on the above analysis, the positive temperature coefficient effect of the current starved inverters can counteract the negative temperature coefficient effect of the regular inverters of the classic RO PUF. Fig. 2(b) shows the simulation results of the relative frequency deviations (with reference to the frequency at 27 °C) versus temperature for the regular, current starred and hybrid 9-stage ROs in GF 65 nm CMOS technology. The hybrid RO is made up of 5 regular inverters and 4 current starred inverters. The results show that the frequency of hybrid RO is least susceptible to temperature variations. The characteristics of these three types of 9-stage ROs are summarized in Table I. The temperature sensitivity is defined as the output frequency deviation per degree Celsius. The results show that the hybrid RO has a much lower power consumption and temperature sensitivity than the regular RO but uses 8 more transistors. These additional biasing transistors can be sized smaller and share their diffusion areas with other transistors to reduce the area overheads.

The architecture of the proposed $(n+1)$-stage $(n$ is even) hybrid RO PUF consists of an $n$-bit LFSR counter, a bidirectional counter, a two-input NAND gate, $\frac{n}{2}$ regular inverter stages and $\frac{3}{2}$ current starred inverter stages. Fig. 3 shows the CMOS circuit implementation of a 9-stage hybrid RO PUF. The NAND gate is equivalent to a regular inverter when $EN$ is asserted. Two multiplexors are placed before and after the inverters in each stage. The multiplexors are realized with transmission gates to reduce their delay and transistor count. The two multiplexors in each stage share the same select signal. The NAND gate is set to low. The select signal picks up either the upper or lower inverter output, and $2^8$ different possible combinations of inverter path for the RO can be selected.

Each response bit of this PUF is generated by the comparison of two selected ROs’ frequencies. Fig. 4 shows the timing diagram of its operation. First, the LFSR counter is initialized by shifting an 8-bit challenge $C_A$ through the Serial_In port with the Mode signal asserted. The enable line $EN$ of the PUF is set to low to disable the RO. After a small delay when $C_A$ is loaded into the LFSR, $EN$ is pulled high and the bidirectional counter is reset by $Rst$. The selected $RO_A$ starts to “ring” and its output is connected to the Clock input of the bidirectional counter. The bidirectional counter is configured as an up counter by setting the Up/down signal high. The counter value is registered after a specific time $t$ determined by the frequency $f_A$ of $RO_A$. Then, $EN$ is set to low. The bidirectional counter is then configured as a down counter by setting $Up/down$ signal low. Next, a shadow challenge $C_B$ is generated from the LFSR counter after $N_{clk}$ ($N_{clk} < 2^8$) clock cycles. With a well-chosen feedback function, the LFSR counter will produce a pseudo random sequence with a very long cycle and $C_B \neq C_A$. After $C_B$ is stable, $EN$ is set to high. With the same counting time $t$, the value stored in the counter is directly proportional to the frequency difference of the two selected ROs, i.e., $\Delta f = f_A - f_B$. The most significant bit (MSB) of the counter is the output bit of the PUF. The length of the bidirectional counter has to be large enough to discriminate the two successive RO’s frequencies. The same input challenge can generate a different response with a different $N_{clk}$. This structure can be regarded as a logically reconfigurable PUF [13] for increasing the security of the PUF. It allows the CRP behavior to be changed by changing $N_{clk}$ without physically replacing or modifying the underlying PUF. If logical reconfigurability is not required, $C_A$ and $C_B$ can be fed successively without the LFSR.

### IV. QUALITY ANALYSIS AND EXPERIMENTAL RESULTS

The proposed 9-stage hybrid RO PUF was successfully implemented and fabricated in GF 65 nm CMOS process. The
active area of the proposed PUF is only $5 \times 50 \mu m^2$. Five dice are packaged and tested with the IC probe station. Agilent oscilloscope with 1GS/s sampling rate is used to capture the output frequencies of the RO and the responses of the PUF. The control signals and the LFSR counter outputs are generated externally from a Xilinx Virtex-II Pro FPGA board. $N_{clk}$ of the LFSR is fixed at 10 to emulate the direct feeding of arbitrary 8-bit challenges without logical reconfigurability.

A. Uniqueness of Proposed Hybrid RO PUF

Uniqueness can be estimated by the average inter-die Hamming Distance (HD) of the responses produced by different PUFs. Let $R_u$ and $R_v$ be the $n$-bit responses of two different chips, $u$ and $v$, to the same input challenge $C$. The uniqueness $U$ for $m$ chips is expressed as:

$$U = \frac{2}{m(m-1)} \sum_{u=1}^{m-1} \sum_{v=u+1}^{m} HD(R_u, R_v) \times 100\%$$

(8)

10000 CRPs generated by the PUFs were collected from the five dice to evaluate the uniqueness. The distribution of the measured inter-die HDs is shown in Fig. 5(a). The uniqueness calculated from the inter-die HDs of the proposed PUF is 50.42%. Monte Carlo simulation for a larger population of 50 PUF instances was also performed by Cadence Virtuoso Spectre using the process design kit of GF 65nm 1.2V CMOS technology. The simulated inter-die HDs distribution is shown in Fig. 5(b). The uniqueness of these 50 instances is calculated to be 49.62%. The best fit Gaussian curve to the histogram diagram plotted in Fig. 5(b) has a mean of $\mu = 49.62\%$ and a standard deviation of $\sigma = 5.86\%$.

B. Reliability of the Proposed Hybrid RO PUF

The reliability measures how reproducible or stable are the CRPs of a PUF under varying operating conditions. It can be measured by its bit error rate (BER) by comparing the responses taken at different time with a reference response to the same challenge. Let $R_i$ be an $n$-bit response to an input challenge $C$ produced by the PUF of a chip $i$ under a nominal operating condition. The same set of challenges are then applied $k$ times to the same PUF under varying environmental conditions to obtain the responses $R_{i,j}$ for $j = 1, 2, \cdots, k$. The reliability $S$ of chip $i$ can be computed by:

$$S = 1 - BER = 1 - \frac{1}{k} \sum_{j=1}^{k} HD(R_i, R_{i,j}) \times 100\%$$

(9)

The reliability is measured using 1000 CRPs generated by the PUF under varying supply voltages and temperatures. Fig. 6(a) shows the reliability of the fabricated hybrid RO PUF against the voltage variations. Voltage regulator and voltage limiter are typically used in modern ASIC design to minimize the supply voltage variations. With regulated voltage changes of $\pm2\%$ from 1.2V nominal supply, the worst reliability is 98.26%. Fig. 6(b) shows the average reliability of the five hybrid RO PUF chips measured by the thermal station. The working temperature is varied from $-40$ to $120^\circ C$, with $27^\circ C$ as the reference temperature. The average reliability measured from the hybrid RO PUF chips is as high as 99.84% and the worst-case reliability is 98.28% at $-40^\circ C$. The results attest that the frequency of hybrid RO is much less susceptible to temperature fluctuation. Comparing with the worst-case reliability of 82% at $100^\circ C$ reported in [2] for the classic RO PUF, our proposed RO PUF has increased its temperature reliability by more than 16%.

C. Unpredictability of the Proposed Hybrid RO PUF

The unpredictability is estimated by its number of independent output bits [5]. The number of independent bits of an RO PUF is $\log_2(N_{osc})$ [5], where $N_{osc}$ is the number of oscillators. For comparison, the number of independent bits generated by an RO PUF is expressed in terms of the number of transistors required to realize the PUF. For the classic RO PUF, the number of independent bits is $\log_2(M/2M)$, if each RO has $N$ inverter stages and $M$ is the total number of transistors. For our proposed hybrid RO PUF, 16 transistors are used in each inverter stage on average. With $M$ transistors, $\log_2(2^{(M+16)})$ independent bits can be generated by our design. Fig. 7 compares the number of independent bits that can be produced by the classic RO PUF ($N = 5$) and the proposed hybrid RO PUF implemented with the same number of transistors. More
independent bits can be generated by the proposed PUF with the same amount of hardware resources. Although it takes two cycles to generate a response bit, the mechanism for increasing the number of independent response bits is decoupled from that for enhancing the reliability without having to increase the number of ROs to achieve both objectives. From security point of view, this increases the difficulty for brute force attack as it will take the attacker double the time to read the CRPs.

D. Power Analysis

In the classic RO PUF, the RO consumes the most power, as it has the greatest switching activities. In our design, the current starved inverters are biased in the subthreshold region, which reduces the power consumption of the RO. Besides, only one “RO” is active at any time by selecting one of the two inverters in each stage. A power analysis is carried out by applying 1000 random challenges to a prototype PUF IC. The power consumption is averaged over all the challenges. The average power consumption measured for each CRP generated by the PUF including the power dissipated by the bidirectional counter is 32.3μW at 1.2V and the maximum RO’s frequency of 230MHz.

A comparison of different PUFs reported in the literature is summarized in Table II. Unfortunately, the area and power results of RO [5] and Bistable ring [14] PUFs cannot be compared as they are not available and were implemented on FPGA. The results of the remaining PUFs are obtained from custom chip implementation. Our fabricated PUF has tiny footprint and very low power consumption. Its uniqueness and reliability are also highly competitive. It exhibits a measured reliability of 100% over a temperature range of −20 to 120°C, as depicted in Fig. 6(b).

ACKNOWLEDGEMENT

This work is supported by MOE AcRF Tier I grant no. MOE 2014-T1-002-141.

V. CONCLUSION

A low-cost RO PUF with improved response stability has been presented. The proposed PUF utilizes the positive temperature coefficient of the current starved inverters to offset the response instability due to the negative temperature coefficient of the regular inverters used in the classic RO PUF. The prototype PUF chip fabricated in GF 65nm CMOS technology consumes only 32.3μW per CRP at 1.2V with a working frequency of 230MHz. The measured CRPs show a nearly perfect average inter-die HD of 50.46% and an average BER of 0.16% with temperature varied from −40 to 120°C. The proposed PUF stands out as an ideal candidate for lightweight security applications by comparing its overall figures of merit with other existing PUFs.

REFERENCES


TABLE II COMPARISON OF THE QUALITIES AND COSTS OF PUFs.

<table>
<thead>
<tr>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>Power (μW)</td>
<td>250</td>
<td>NA</td>
<td>0.93</td>
<td>NA</td>
</tr>
<tr>
<td>Process (nm)</td>
<td>350</td>
<td>180</td>
<td>90</td>
<td>130</td>
</tr>
<tr>
<td>Area (μm²)</td>
<td>23,436</td>
<td>1,470,000</td>
<td>NA</td>
<td>15,288</td>
</tr>
<tr>
<td>Uniqueness (%)</td>
<td>40.00</td>
<td>46.14</td>
<td>64.70</td>
<td>50.90</td>
</tr>
<tr>
<td>Worst-case reliability (%)</td>
<td>95.00</td>
<td>95.20</td>
<td>99.52*</td>
<td>96.96</td>
</tr>
<tr>
<td>Reliability</td>
<td>1.5–5V</td>
<td>±2%</td>
<td>1.2–1.08V</td>
<td>0.9–1.2V</td>
</tr>
<tr>
<td>Conditions</td>
<td>−25–125°C</td>
<td>40–40°C</td>
<td>20–120°C</td>
<td>40–120°C</td>
</tr>
</tbody>
</table>

* The reliability against voltage and temperature variations was shown to be highly sensitive to different RO configurations and PUF settings [9].