Lo 2011
Abstract: SRAM design has recently been undergoing renovation, aiming to withstand the ever-increasing process variation as well as to support ultra-low-power applications using even subthreshold supply voltages. We present in this paper a novel P-P-N-based 10T SRAM cell, in which the latch is formed essentially by a cross-coupled P-P-N inverter pair. This type of cell can operate at a voltage as low as 285 mV while still demonstrating high resilience to process variation. Its noise margin is elevated not only in the hold state but also during read operations. As compared to previous 10T SRAM cells, our cell excels in particular in two aspects: 1) ultra-low cell leakage, and 2) high immunity to the data-dependent bitline leakage. The second merit makes it especially suitable for an SRAM macro with long bitlines – a property often desirable in order to achieve high density. We have fabricated and validated its performance through a 16 Kb SRAM test chip using the UMC 90 nm process technology.

Index Terms: 10T cell, bitline leakage, low-leakage, noise margin, PPN, SRAM.

I. INTRODUCTION

A subthreshold SRAM is a type of SRAM that is able to support normal operation even when the supply voltage is below the normal threshold voltage of a transistor, e.g., as low as 0.3 V. It has drawn great research attention recently due to its ultra-low-power merit and resilience to process variation. Conventional 6T-based SRAM cannot support reliable low-supply-voltage operation due to degraded noise margins. As a result, with the voltage down-scaling in many low-power applications, a new and robust SRAM cell capable of operating at a subthreshold voltage while consuming low leakage power during the standby mode is in high demand.

As the CMOS process technology continues to scale to the nanometer regime, process variation and leakage current of transistors become more severe. Both are further aggravated by fluctuations in the operating conditions, such as variation of the supply voltage and/or the temperature, leading to a higher chance of device malfunction. For example, as indicated in [1], the impact of the threshold variation could be exponential on the SRAM SNM (Static Noise Margin) when operating in the subthreshold region. This makes the SRAM performance highly sensitive to the process variation. Besides, the bitline leakage (current) often poses another threat. An on/off current ratio is often used as a gauge for this issue, where the on-current refers to the cell current drawn by an accessed cell during the read operation from the sensing bitline, while the off-current refers to the leakage current drawn by all the other unaccessed cells from the other complementary bitline on the same column. Typically, a rule of thumb demands that this on/off ratio be greater than 10 so that there will be adequate voltage swing between the bitline pair at the moment when the sense amplifier is activated to ensure reliable read operation. However, this ratio has deteriorated in advanced process technologies, especially at low voltages, due to the increased off-state leakage current associated with a transistor. The overall bitline leakage is data dependent in the sense that it is strongly related to the data pattern stored in the cell column being accessed. Thus, the bitline leakage varies from one cell column to another over time, making the distribution of the access time of an SRAM macro even more dispersed. In order to accommodate the worst-case bitline leakage, one may sometimes need to limit the bitline length, i.e., the number of cells attached to a bitline, which in turn reduces the area efficiency.

In general, with reduced cell stability and writeability, the conventional 6T SRAM may sporadically experience data flipping (i.e., a bit cell changes its state from '0' to '1' or vice versa after being read) or write failure (i.e., the data to be written into a bit cell fails to overwrite its previously stored value). To cope with these problems, several new SRAM cells equipped with some supportive peripheral circuits have been proposed. These emerging SRAM cells could be categorized into two types, namely single-ended sensing cells [2]–[6] and differential sensing cells [7], [8]. Fig. 1 shows the schematics of four of such 10T SRAM cells. In general, a single-ended sensing cell is not as robust as the differential one, and hence, it often requires some extra compensation scheme to maintain the reliability as proposed in [5]. Moreover, it is difficult, if not impossible, for a single-ended sensing cell to support column multiplexing (also known as bit-interleaving), in which an IO is shared among several cell columns. Fig. 1(c) and (d) show two variants of differential-sensing 10T cells. Fig. 1(c) is referred to as the 10T Schmitt-trigger cell, or ST cell for short, throughout this paper [7]. The stability and writeability of this cell have been improved by using two cross-coupled Schmitt-trigger inverters to form the storage cell. There is a minor problem with this type of ST cell – it still suffers from the read disturbance problem, which refers to the phenomenon that a storage node with data

Manuscript received July 21, 2010; revised December 16, 2010; accepted December 16, 2010. Date of publication January 31, 2011; date of current version February 24, 2011. This paper was approved by Associate Editor Peter Gillingham. This work was supported in part by the National Science Council of Taiwan (NSC) under Contract NSC 98-2220-E-007-033.
The authors are with the Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan (e-mail: chlo@larc.ee.nthu.edu.tw; syhuang@ee.nthu.edu.tw).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSSC.2010.2102571
0018-9200/$26.00 © 2011 IEEE
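To make the on/off current ratio rule of thumb from the Introduction concrete, the following minimal Python sketch estimates the ratio for a given bitline length, and the longest bitline that still meets a target ratio. It assumes the worst-case data pattern in which every unaccessed cell on the column leaks onto the complementary bitline. The function names and the current values (1000 nA of cell current, 1 nA of leakage per cell) are hypothetical illustrations and are not taken from the measured chip.

```python
import math


def on_off_ratio(i_on, i_leak_per_cell, cells_per_bitline):
    """On-current of the accessed cell divided by the summed leakage of all
    other (unaccessed) cells on the same column, worst-case data pattern."""
    unaccessed = cells_per_bitline - 1
    if unaccessed <= 0:
        return math.inf  # a single cell has no competing leakage
    return i_on / (unaccessed * i_leak_per_cell)


def max_cells_per_bitline(i_on, i_leak_per_cell, min_ratio=10.0):
    """Largest bitline length N such that on_off_ratio(...) >= min_ratio,
    i.e. (N - 1) * i_leak_per_cell * min_ratio <= i_on."""
    return math.floor(i_on / (min_ratio * i_leak_per_cell)) + 1


# Hypothetical currents, both in nA (assumed values, not measured data).
I_ON_NA = 1000.0
I_LEAK_NA = 1.0

print(max_cells_per_bitline(I_ON_NA, I_LEAK_NA))   # longest bitline meeting ratio 10
print(on_off_ratio(I_ON_NA, I_LEAK_NA, 256))       # ratio for a 256-cell bitline
```

Under these assumed numbers the bitline length is capped near one hundred cells, which illustrates why a cell with lower data-dependent bitline leakage permits longer bitlines and hence better area efficiency.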
696 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 46, NO. 3, MARCH 2011
Fig. 1. (a), (b) Single-ended 10T SRAM cells [2], [3]. (c), (d) Differential 10T SRAM cells [7], [8].
Fig. 3. (a) Cell status during the read operation. (b) Read SNM comparison of our cell and ST cell [7].
Fig. 6. (a) Cell status during the write operation. (b) Corresponding waveforms.
in our cell. As for an ST cell, the read current flows through the
storage node directly, thereby causing read disturbance, i.e., the
voltage at data node Q will rise temporarily. This will degrade
the read stability because the cell flipping will be more likely to
take place.
As shown in Fig. 3(b), our cell achieves a 1.5× read SNM improvement (from 70 mV to 106 mV) as compared to the ST cell.
In order to examine the cell stability in the worst case process corner (which is FNSP in our design), we run Monte Carlo simulation (M.C.) for both read SNM and hold SNM. To show the difference between our cell and the ST cell [7], M.C. simulation is also conducted for the ST cell, as shown in Fig. 4. Although the ST cell gives higher hold SNM than ours, the effective SNM is limited by the lower read SNM. Statistically, our cell gives 1.72× the mean SNM and a 12% reduction in the standard deviation as compared to the ST cell. Our cell offers a good balance between the hold SNM and the read SNM. Detailed examination also reveals that the read SNM is slightly higher than the hold SNM. The higher read SNM mainly results from the VDD-precharged bitline. When WL is enabled and raised high, the high bitline voltage helps strengthen the cell stability by providing another charging path for the data node storing '1'. If the wordline is boosted to a higher voltage to improve access time, sensing margin and writeability, the read SNM can be even higher as well.

Bitline discharging speed has a direct impact on the SRAM access time. In the ST cell, the read current flows through three stacked transistors, resulting in reduced read current. On the other hand, the discharging path in our cell is formed by two transistors (one pass-gate transistor plus one pull-down transistor). Thus, the read current of our cell is 1.4 times that of the ST cell when the supply voltage is 300 mV, as shown in Fig. 5.

C. Write Operation

Fig. 6(a) shows a snapshot of our cell during a write operation. At the beginning, storage node Q stores '0' while Qb stores '1'. To perform a write operation, the wordline WL is enabled and one bitline, e.g., BLB, is pulled down to ground in advance. When the supply voltage is relatively high (e.g., 1 V), node Qb (storing '1') in this example will be pulled down directly through the discharging path formed by [...]. In turn, node Q will be charged up to complete the data-flipping process.

Fig. 7. The normalized current versus the channel length, demonstrating the RSCE effect in the subthreshold region (based on simulation of UMC 90 nm technology).

When operating in a subthreshold region (e.g., when VDD is 300 mV), the data-flipping process takes place in slow motion in two steps. In general, the lower portion of our P-P-N inverter pair can be viewed as a latch consisting of PUL2-PDL2 and PUR2-PDR2. In some sense, this latch takes node pQ and node pQb as the pseudo supply terminals. In step 1, pQb is pulled down quickly to nearly the ground voltage, as shown in Fig. 6(b), at the beginning of the write operation since it is driven by BLB tied to a strong '0'. Then pQb starts to affect Qb via the pMOS between them (i.e., PUR2), reducing the voltage of Qb to a lower middle voltage (e.g., 120 mV for a time span of about 4 ns when the supply voltage is 300 mV). During this time span, PUL1 and PUL2, controlled by Qb, still conduct weakly to pull up the voltage at node Q, as shown in Fig. 6(b). Due to the coupling effect of parasitic capacitances Cgs and Cgd, the voltage of Qb, which is in the floating state, rises with node Q but only slightly. In step 2, the data flipping finally takes place when Q is strong enough to conduct the PDR transistor to discharge Qb down to the ground voltage.

It is worth mentioning that even though such a write mechanism takes a relatively longer time to accomplish the data flipping, it is still shorter than the read access time, and therefore, overall it does not introduce any operating frequency penalty.

We can also further improve the cell's writeability by strengthening the access transistors (PGL and PGR). Unlike the ST cell, doing so does not affect the performance of the read
LO AND HUANG: P-P-N BASED 10T SRAM CELL FOR LOW-LEAKAGE AND RESILIENT SUBTHRESHOLD OPERATION 699
Fig. 9. WNM of our cell at different process corners with and without incorporating the RSCE-aware sizing.
Fig. 10. Monte Carlo simulation results for WNM in the worst case process corner (SNFP – Slow nMOS Fast pMOS).
operation at all. Fig. 7 shows one example, in which the size of the PFET gives 1.2× drivability with 2× minimum channel length and the size of the NFET gives 1.4× drivability with 3× minimum channel length, according to the RSCE (Reverse Short Channel Effect). The sizes and layouts of transistors PGL, PGR, PUL2 and PUR2 are shown in Fig. 8. Overall, the WNM (Write Noise Margin) is significantly improved, as shown in Fig. 9. In the worst case corner (SNFP) under process variation, we run Monte Carlo simulation for our cell and the ST cell to derive the WNM distribution shown in Fig. 10. Our cell reduces the standard deviation of the WNM by 52% relative to that of the ST cell. It is worth mentioning that the transistor sizing of the ST cell is more complicated than ours, considering that its size requirement for the read operation conflicts with that for the write operation. That leads to a dilemma – boosting the writeability may harm the stability of the read operation in the ST cell. As a result, the transistor sizing in our cell is relatively easier.

In general, we use the following transistor sizing guidelines. 1) The pull-up transistors PUL1 and PUR1 are usually made weaker to ease a write operation, just like in a conventional 6T cell. 2) The pull-down transistors PDL1 and PDR1, forming the cell discharging paths, need to be stronger to facilitate a larger read current and thereby a quicker access. 3) The pass-gate transistors PGL and PGR need to be strong enough to serve as high-conduction paths between the accessed cell and the bitlines during both the read and write operations. 4) The two pull-up transistors, PUL2 and PUR2, need to be slightly stronger to compensate for the conductivity degradation of the cascaded pMOS structure linking the storage nodes (i.e., Q and Qb) and VDD, which helps contribute to a good hold SNM. 5) Unlike in a 6T cell, the pull-down transistors PDL2 and PDR2 do not have to be strong, since they are not involved in the cell discharging paths. However, their strengths are made comparable to the cascaded pMOS structure mentioned above to achieve a more balanced cell structure, which could lead to a larger hold SNM.

D. Efficient Bit-Interleaving

Bit-interleaving (also called column-multiplexing) is a scheme that allows multiple columns of cells to share a sense amplifier and its following output pin driver. Such a scheme is almost indispensable in SRAM design because it not only increases the cell density but also eases the pitch matching of the overall layout between the cell array and the periphery circuit.
Fig. 11. (a) Bit-interleaving structure. (b) Timing diagrams of our cell showing bit-interleaving capability.
Fig. 14. Sensing margin comparison under the worst case column pattern.
Fig. 17. Die photo and layout of our test chip with 16 Kb cells.
Fig. 18. Shmoo plots using X-march test pattern. (a) Our P-P-N cell macro. (b) ST cell macro [7].
TABLE II
COMPARISON WITH PREVIOUS WORKS
DRV of our cells mainly results from the stacking of two PFETs in our cell. The overall characteristics of our test chip are summarized in Table I.

A comparison with [2], [3], [7], and [8] is shown in Table II. Since an SRAM design is sensitive to transistor sizing and process technology, comparison between various SRAM cells is rather complicated. Besides, the detailed transistor sizing was not mentioned in some of the published papers. Therefore, we only try to compare the features of each work from a more qualitative point of view.

Among the works under comparison, [2] and [3] use single-ended outputs, while [7] and [8] use differential outputs like ours. Therefore, we focus more on the comparison with [7] and [8]. Previously, we have compared with [7] quite comprehensively. Here, we will briefly differentiate ours from the work of [8]. The hold/read SNMs of the 10T cell proposed in [8] could be higher than ours. But the problem with the work of [8] is more about the writeability. The cell in [8] incorporates two wordlines (namely, READ-WL and WRITE-WL) and uses two cascaded pass-gate transistors (controlled respectively by READ-WL and [...] at 300 mV of VDD. Relatively speaking, a reasonably good WRITE SNM is easier to achieve in our cell than in that of [8]. It is notable that elevating the WRITE SNM is sometimes crucial because it could be the smallest and the limiting factor among the three SNM values.

IV. CONCLUSION

The traditional 6T SRAM cell has been facing a survival battle against the relentlessly increasing process variation and cell leakage current. New cell structures with more transistors are widely deemed to be inevitable in order to support future process technologies. The P-P-N-based 10T SRAM cell we have presented has many merits one may desire when building an SRAM macro, including 1) ample and balanced noise margins, 2) ability to operate at a subthreshold voltage, 3) ability to support bit-interleaving, 4) low cell leakage, and 5) high immunity to the bitline leakage. Measurement of a fabricated SRAM macro indicates its minimum operating voltage is only 285 mV and
its stand-by current is as low as 1.08 µA, which is an 80% reduction relative to a state-of-the-art predecessor.

ACKNOWLEDGMENT

The authors would like to thank the National Chip Implementation Center (CIC), Taiwan, for their technical support and the grant of access to the ATE, and United Microelectronics Corporation (UMC) for chip fabrication.

REFERENCES

[1] B. Calhoun and A. Chandrakasan, “Static noise margin variation for subthreshold SRAM in 65-nm CMOS,” IEEE J. Solid-State Circuits, vol. 41, pp. 1673–1679, 2006.
[2] B. H. Calhoun and A. Chandrakasan, “A 256 kb subthreshold SRAM in 65 nm CMOS,” IEEE J. Solid-State Circuits, vol. 42, no. 3, pp. 680–688, Mar. 2007.
[3] T. Kim, J. Liu, J. Keane, and C. Kim, “A 0.2 V, 480 kb subthreshold SRAM with 1 k cells per bitline for ultra-low-voltage computing,” IEEE J. Solid-State Circuits, vol. 43, no. 2, pp. 518–529, Feb. 2008.
[4] B. Zhai, D. Blaauw, and D. Sylvester, “A variation-tolerant sub-200 mV 6-T subthreshold SRAM,” IEEE J. Solid-State Circuits, vol. 43, no. 10, pp. 2338–2348, Oct. 2008.
[5] N. Verma and A. Chandrakasan, “A 256 kb 65 nm 8T sub-Vt SRAM employing sense-amplifier redundancy,” IEEE J. Solid-State Circuits, vol. 43, no. 1, pp. 141–149, Jan. 2008.
[6] T. Kim, J. Liu, and C. Kim, “A voltage scalable 0.26 V, 64 kb 8T SRAM with Vmin lowering techniques and deep sleep mode,” IEEE J. Solid-State Circuits, vol. 44, no. 6, pp. 1785–1795, Jun. 2009.
[7] J. Kulkarni, K. Kim, and K. Roy, “A 160 mV robust Schmitt trigger based subthreshold SRAM,” IEEE J. Solid-State Circuits, vol. 42, no. 10, pp. 2303–2313, Oct. 2007.
[8] I. J. Chang, J. Kim, S. P. Park, and K. Roy, “A 32 kb 10T subthreshold SRAM array with bit-interleaving and differential read scheme in 90 nm CMOS,” IEEE J. Solid-State Circuits, vol. 44, no. 2, pp. 650–658, Feb. 2009.
[9] M. Sharifkhani and M. Sachdev, “An energy efficient 40 Kb SRAM module with extended read/write noise margin in 0.13 µm CMOS,” IEEE J. Solid-State Circuits, vol. 44, no. 2, pp. 620–630, Feb. 2009.

Cheng-Hung Lo (S'08) received the B.S. degree in electrical engineering from Yuan-Ze University, Taoyuan, Taiwan, in 2007, and the M.S. degree in electrical engineering from National Tsing-Hua University, Taiwan, in 2009.
His research interests are mainly in ultra-low power SRAM design.

Shi-Yu Huang (M'97) received the B.S. and M.S. degrees in electrical engineering from National Taiwan University, Taiwan, in 1988 and 1992, and the Ph.D. degree in electrical and computer engineering from the University of California, Santa Barbara, in 1997.
He joined the faculty of the Department of Electrical Engineering, National Tsing-Hua University, Taiwan, in 1999. His research interests are mainly in VLSI design, automation, and testing, with an emphasis on SoC power estimation methodology, nanometer SRAM design, all-digital phase-locked loop design, and timing-related testing.
Dr. Huang served as a Program Co-Chair of the IEEE Asian Test Symposium in 2004, a General Co-Chair of the IEEE Asian Test Symposium in 2009, and the Program Chair of the IEEE International Workshop on Memory Technology, Design, and Testing in 2005 and 2006.