Microelectronics Journal: Mohammad A. Tehrani, Farshad Safaei, Mohammad Hossein Moaiyeri, Keivan Navi
Microelectronics Journal
journal homepage: www.elsevier.com/locate/mejo
Article info

Article history:
Received 4 September 2010
Received in revised form 23 February 2011
Accepted 7 March 2011
Available online 11 April 2011

Keywords:
Quantum-dot Cellular Automata
Multistage Interconnection Networks
Network design
Nanoelectronics

Abstract

Quantum-dot Cellular Automata (QCA) is a promising nanotechnology with ultra-small feature size and ultra-low power consumption compared with transistor-based technologies. During the past decade QCA has been carefully studied, and it has demonstrated the ability to use quantum phenomena for implementing logical devices. Multistage Interconnection Networks (MINs) have been frequently suggested as the connection means in parallel systems. This architecture provides the maximum bandwidth to the components and the minimum-latency access to memory modules. MINs are generally accepted concepts in the semiconductor industry for solving problems related to on-chip communications. Although there has been a large amount of research on MINs for parallel processing, there seem to be surprisingly few attempts to utilize the unique characteristics of QCA for designing and implementing MINs. In an effort to fill this gap, this paper presents the first design methodology of MINs using QCA. To demonstrate the functionality and validity of the proposed methodology, performance evaluations of MINs using the QCADesigner simulator are given and analyzed.

© 2011 Elsevier Ltd. All rights reserved.

0026-2692/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.
doi:10.1016/j.mejo.2011.03.004
914 M.A. Tehrani et al. / Microelectronics Journal 42 (2011) 913–922
Many variations of MINs have already been introduced. These architectures provide the maximum bandwidth to components (such as DSP, IP, etc.) and the minimum access delay to memory modules. A MIN is characterized by its topology, switching mechanism, routing algorithm, task scheduling strategy, and fault-tolerance [11]. Various topologies of MINs have been proposed and studied in the last few decades. Most of these topologies are derived from well-known graph topologies, including mesh, star, shuffle exchange, tree, and cube-connected networks, among others [7].

The communication platform of parallel architectures can be implemented with MINs, which must be reconfigured for various purposes. The QCA strategy can be used to implement digital logic systems by properly arranging cells. So far, several studies on QCA-based circuit design have been reported in the literature [12–14]. However, to the best of our knowledge, there seem to be surprisingly few attempts to design and implement MINs considering the unique characteristics of QCA. Indeed, such networks are very important circuits because they are expected to be used to design and realize large-scale parallel systems [7]. In an effort to bridge the gap between MIN and QCA, the main concern of this article is to implement and simulate types of MINs such as the Omega network, Butterfly network, Baseline network, and Generalized Cube network [7] on the basis of QCA architecture. A novel QCA switching element is also implemented, which is used in the fabric switches and MIN networks.

The remainder of the paper is organized as follows. Section 2 gives an introduction to QCA, describing its physical interactions and logical behavior. In Section 3, the architecture of a MIN is introduced. Next, the structural design of a generic MIN and its components is presented. Section 4 describes the implementation of MINs on the basis of the QCA strategy. In Section 5, the simulation results and verification of functionality for the test networks are detailed. Finally, a summary of results and conclusions can be found in Section 6.

When two QCA cells are placed next to each other, they affect each other: the electrons of each cell exert a force on the electrons of the other cell. When the cells are situated as shown in Fig. 2(a), they prefer to have the same polarization to minimize the Coulombic repulsion. But when they are offset at a 45° gradient, their polarizations must be different to have the least repulsion (Fig. 2(b)).

A QCA wire can be made by arranging cells in a line as shown in Fig. 3(a). When the head of the line has a specified polarization, it propagates the value through the line and every cell gets the same state. A ripple wire can also be made by arranging cells in a line with a 45° gradient (Fig. 3(b)). There is a very important feature in the interconnection of normal wires and ripple wires: these wires can cross each other without any signal interference in a coplanar design. That is, neither signal affects the other, and both signals are propagated correctly to the rest of their wires (Fig. 3(c)).

QCA logic gates differ somewhat from conventional CMOS design. The Majority gate and the Inverter are the basic elements of QCA design, and even "AND" and "OR" gates are implemented using Majority gates. A Majority gate consists of five cells: three inputs, one output, and one voter cell. The voter chooses its polarization depending on the states of the input cells. Subsequently, the signal is propagated to the output cell and the rest of the circuit. Another important QCA gate is the Inverter. Its functionality is the same as that of a normal CMOS Inverter. A QCA Majority gate and an Inverter are shown in Fig. 4 [5].

A Majority gate can simply be turned into an "AND" gate. When one of the inputs of the Majority gate is set to the static value of "0", it acts like an AND gate (Eq. (2.a)). An "OR" gate is also a Majority gate, with a static "1" input (Eq. (2.b)).

$$\mathrm{Majority}(A, B, 0) = A \cdot B + B \cdot 0 + A \cdot 0 = A \cdot B \tag{2.a}$$

$$\mathrm{Majority}(A, B, 1) = A \cdot B + B \cdot 1 + A \cdot 1 = A + B \tag{2.b}$$
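The reductions in Eqs. (2.a) and (2.b) can be checked exhaustively. The following is a minimal Python sketch, assuming logic values are encoded as the integers 0 and 1; the function names `majority`, `qca_and`, and `qca_or` are illustrative and do not come from the paper or from QCADesigner.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote: A.B + B.C + A.C."""
    return (a & b) | (b & c) | (a & c)


def qca_and(a: int, b: int) -> int:
    """AND as a Majority gate with one input fixed to 0 (Eq. (2.a))."""
    return majority(a, b, 0)


def qca_or(a: int, b: int) -> int:
    """OR as a Majority gate with one input fixed to 1 (Eq. (2.b))."""
    return majority(a, b, 1)


if __name__ == "__main__":
    # Print the truth table: A, B, AND, OR.
    for a, b in product((0, 1), repeat=2):
        print(a, b, qca_and(a, b), qca_or(a, b))
```

Fixing the third input is the only difference between the two gates, which is why a single Majority layout can serve as either gate in a QCA design.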
3. Network modeling
Fig. 1. (a) A QCA cell has four quantum-dots and two surplus electrons and (b) illustration of the two QCA logical states (Logic 0: polarization −1; Logic 1: polarization +1).
Fig. 3. Illustration of signal propagation through QCA wires: (a) a normal QCA wire does not change the signal value and transfers it intact, (b) a 45° wire changes the signal in each cell, so the value of the signal at the end depends on the number of passed cells, and (c) two QCA wires can cross each other in just one layer without signal interference.
Fig. 4. Illustration of two important QCA logic elements: (a) a Majority gate (Input A, Input B, Input C, Voter, Output), in which the voter cell chooses its arrangement according to the states of its neighbor cells, and (b) a QCA Inverter (Input, Output).
In this section, we present a classification of MIN and restate some definitions necessary for the proposed classification.

MINs have been classified into three classes depending on the availability of paths to establish new connections. Fig. 6 illustrates a topological classification of MINs [7].

Definition 2. [7] A Banyan network is defined as a class of MINs in which there is one and only one path from any input node to any output node.

Fig. 5. Architecture of a generic MIN using 2 × 2 Switching Elements (SE).

Depending on the kind of channels and switches, MINs can be either unidirectional or bidirectional [7]. Additionally, each channel can be either multiplexed or be replaced by two or more channels. The latter case is referred to as a dilated MIN [7].

Definition 3. [7] A uniform MIN is one in which all the SEs of a stage are of the same degree.
Definition 5. [7] Delta networks are built using anbn (where n is

Fig. 7. A typical Delta network.

$$\beta_i^k : (x_{n-1} x_{n-2} \ldots x_{i+1} x_i x_{i-1} \ldots x_1 x_0) \rightarrow (x_{n-1} x_{n-2} \ldots x_{i+1} x_0 x_{i-1} \ldots x_1 x_i) \tag{6}$$

The ith Butterfly permutation interchanges the 0th and ith digits of the index. It should be observed that $\beta_0^k$ defines a straight one-to-one permutation and is also called identity permutation [7]. A Butterfly (16, 2) network is shown in Fig. 9.

3.3.3. Baseline network

In a Baseline network [7], the ith k-ary baseline permutation, $\delta_i^k$, $0 \le i \le n-1$, is expressed by

$$\delta_i^k : (x_{n-1} x_{n-2} \ldots x_{i+1} x_i x_{i-1} \ldots x_1 x_0) \rightarrow (x_{n-1} x_{n-2} \ldots x_{i+1} x_0 x_i x_{i-1} \ldots x_1) \tag{7}$$

The ith baseline permutation performs a cyclic shifting of the i+1 least significant digits in the index to the right for one position.

3.3.4. Generalized Cube network

In a Generalized Cube network [7,19,20] the ith cube permutation $E_i$, $0 \le i \le n-1$, is defined only for k = 2 by

$$E_i : (x_{n-1} x_{n-2} \ldots x_{i+1} x_i x_{i-1} \ldots x_1 x_0) \rightarrow (x_{n-1} x_{n-2} \ldots x_{i+1} \overline{x_i} x_{i-1} \ldots x_1 x_0) \tag{8}$$

The ith cube permutation complements the ith bit of the index. The permutation $E_0$ is also called exchange permutation [7]. Fig. 11 illustrates a Generalized Cube (16, 2) network.

4. QCA realization of MINs

In Section 2, the static characteristic of QCA cells has been introduced. To realize large circuits such as Switching Element or Multistage Interconnection Networks, it is important to discuss the features of QCA cells as a part of a large system. In this section,
the notation of a tool called QCADesigner [21] is also mentioned, by which most of the figures in this article are drawn.

4.1. QCA clocking

One of the important behaviors of the cells is their response to the clocking signals. They have a special clocking that can expedite the signal propagation and reduce noise through a circuit. Clocking in QCA is completely different from clocking in CMOS circuits: the QCA clock either allows data to propagate or forces them to stay in position. The clock raises and lowers the barriers between the dots, so that the electrons can either tunnel between dots or must stay where they are. The clock has four phases (switch, hold, release, and relax), which describe the raising and lowering of the clock signal (Fig. 12). Each clock phase acts on the quantum-dot barriers and thereby affects the value of the cell as follows.

1. Switch: The cells pass from having no value to having definite values.
2. Hold: The barriers are kept high and the values are the same as in the switch phase.
3. Release: The barriers are lowered and allow the electrons to start tunneling. The cells move from a fixed polarization to no polarization.
4. Relax: The barriers remain lowered and the cell has no polarization.

The clock has four different phases, so the cells can be arranged in four pipelined zones to propagate the signal faster through the circuit. There is a notation for showing the clocking phases in models [22]: a group of cells in the same phase is shown with the same color (Fig. 13).

Fig. 12. Four phases of QCA clocking.

Fig. 13. Illustration of QCA clocking sections in a QCA wire.

4.2. QCADesigner

QCADesigner is a tool generally used for simulating QCA circuits. A QCA model may be single-layer or multilayer. In a single-layer design, only normal cells and fixed-polarization cells are used; they are depicted in Fig. 14(a) and (b). When a QCA signal moves from one layer to another, it goes via vertical cells (Fig. 14(c)). Then, in the upper layer, it propagates through crossover cells (Fig. 14(d)). Finally, it can go down to the main layer via vertical cells.

4.3. QCA 2 × 2 switching element

In this section the hardware implementation of a 2 × 2 SE is presented. The signals can either propagate straight through or exchange their paths. Here, the SE is implemented using two multiplexers. The best multiplexer design is suggested by Mardiris and Karafyllidis [23], with 62 cells and 0.12 μm² area, as shown in Fig. 15(a). The logical design of the 2 × 2 SE is demonstrated in Fig. 15(b).

The QCA implementation of the 2 × 2 SE design is shown in Fig. 16(a). This QCA implementation has been simulated and tested, and the results have been verified with QCADesigner [21] Version 2.0.3. Table 1 presents a brief description of each parameter used for the bistable approximation simulation engine [24].

Fig. 16(b) shows the simulated waveforms of the QCA 2 × 2 SE. It contains 157 cells arranged in a 0.25 μm² area. It is implemented in a single layer with six clock zones, and the output shows the results after a delay of 1.5 clock cycles.
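The logical behavior of the 2 × 2 SE can be illustrated in software. The sketch below is a hedged Python model, not the authors' layout: it assumes the textbook 2-to-1 multiplexer expression built from Majority gates and an Inverter, and a hypothetical single control input named `exchange` that selects between the straight and exchanged paths. The internal structure of the multiplexer in [23] may differ.

```python
def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote on 0/1 values."""
    return (a & b) | (b & c) | (a & c)


def inverter(a: int) -> int:
    """Logical NOT on a 0/1 value."""
    return 1 - a


def mux2(d0: int, d1: int, sel: int) -> int:
    """2-to-1 multiplexer from Majority gates (assumed decomposition).

    AND is Majority(x, y, 0) and OR is Majority(x, y, 1), so
    out = (d0 AND NOT sel) OR (d1 AND sel).
    """
    pick_d0 = majority(d0, inverter(sel), 0)
    pick_d1 = majority(d1, sel, 0)
    return majority(pick_d0, pick_d1, 1)


def switching_element(in0: int, in1: int, exchange: int) -> tuple[int, int]:
    """2 x 2 SE built from two multiplexers sharing one control input.

    exchange = 0: straight  (out0 = in0, out1 = in1)
    exchange = 1: exchanged (out0 = in1, out1 = in0)
    """
    out0 = mux2(in0, in1, exchange)
    out1 = mux2(in1, in0, exchange)
    return out0, out1


if __name__ == "__main__":
    for exchange in (0, 1):
        for in0 in (0, 1):
            for in1 in (0, 1):
                print(exchange, in0, in1, switching_element(in0, in1, exchange))
```

Sharing one control input between the two multiplexers guarantees that the two outputs never select the same input, which is the property a 2 × 2 switch needs.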
Fig. 14. Various QCADesigner cells: (a) Normal cell, (b) Fix polarization cell, (c) Vertical cell, and (d) Crossover cell.

Fig. 15. (a) Implementation of a multiplexer with QCA cells and (b) illustration of the logic diagram of a 2 × 2 Switching Element.

5. Experimental results

Some models of MINs are implemented here using the mentioned SE. These models are implemented in three stages with 12 SEs.
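The three-stage, 12-SE structure is consistent with an (8, 2) network: 8 = 2^3 ports need three stages of 8/2 = 4 switches each. The Python sketch below computes this shape and, as an illustration, implements the Butterfly, Baseline, and Cube permutations of Eqs. (6)-(8) on port indices. All function names are hypothetical, and the sketch says nothing about how the stages are wired in the authors' QCA layouts.

```python
def min_shape(num_ports: int, k: int) -> tuple[int, int]:
    """Return (stages, switches per stage) for a MIN of k x k switches."""
    if k < 2:
        raise ValueError("switch degree k must be at least 2")
    stages, size = 0, 1
    while size < num_ports:
        size *= k
        stages += 1
    if size != num_ports:
        raise ValueError("num_ports must be a power of k")
    return stages, num_ports // k


def to_digits(index: int, k: int, n: int) -> list[int]:
    """Base-k digits of index; element i of the result is digit x_i."""
    digits = []
    for _ in range(n):
        digits.append(index % k)
        index //= k
    return digits


def from_digits(digits: list[int], k: int) -> int:
    """Inverse of to_digits."""
    value = 0
    for d in reversed(digits):
        value = value * k + d
    return value


def butterfly(index: int, i: int, k: int, n: int) -> int:
    """Eq. (6): interchange digits x_0 and x_i."""
    x = to_digits(index, k, n)
    x[0], x[i] = x[i], x[0]
    return from_digits(x, k)


def baseline(index: int, i: int, k: int, n: int) -> int:
    """Eq. (7): cyclic right shift of the i+1 least significant digits."""
    x = to_digits(index, k, n)
    low = x[: i + 1]                # [x_0, x_1, ..., x_i]
    x[: i + 1] = low[1:] + low[:1]  # x_0 moves up to position i
    return from_digits(x, k)


def cube(index: int, i: int) -> int:
    """Eq. (8): complement bit i (defined for k = 2 only)."""
    return index ^ (1 << i)


if __name__ == "__main__":
    stages, per_stage = min_shape(8, 2)
    print("stages:", stages, "SEs per stage:", per_stage,
          "total SEs:", stages * per_stage)
    print("butterfly, i=2:", [butterfly(p, 2, 2, 3) for p in range(8)])
    print("baseline,  i=2:", [baseline(p, 2, 2, 3) for p in range(8)])
    print("cube,      i=0:", [cube(p, 0) for p in range(8)])
```

Each function maps a source port index to a destination port index, so applying it to every port of a stage gives one full inter-stage connection pattern.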
Fig. 16. (a) Illustration of a QCA 2 × 2 SE and (b) output signals of the QCA 2 × 2 SE when the number of samples equals 50,000.
Table 1
Parameters model in QCADesigner simulator.

| Parameter | Description | Value |
|---|---|---|
| Cell width | Width of each QCA square (should be equal to the height) | 18 nm |
| Cell height | Height of each QCA square | 18 nm |
| Dot diameter | Diameter of each dot in a QCA cell | 5 nm |
| Number of samples | Number of tested data during the simulation. Accuracy depends on this parameter | 50,000 and 2,000,000 |
| Convergence tolerance | Simulation for each sample iterates until the new value of polarization deviates from the old value by less than this predefined error limit | 0.001 |
| Radius of effect | Radius at which a cell will interact with other cells | 65 nm |
| Relative permittivity | Ratio of the permittivity of the fabrication material (for GaAs/AlGaAs) to the vacuum permittivity | 12.9 |
| Clock high | Saturation energy of clock signal when it is high | 9.8E−22 J |
| Clock low | Saturation energy of clock signal when it is low | 3.8E−23 J |
| Clock amplitude factor | To make an effective clock, the top 25% and bottom 25% of a sine signal are dismissed | 2 |
| Layer separation | Distance between two layers | 11.5 nm |
| Maximum iterations per sample | When the simulation for a state has not converged within this number of iterations, it automatically goes to the next state | 100 |

As presented in the previous section, each switch has a delay of six clock phases. When the switches are put in three stages, the total delay is expected to reach 18 clock phases. The network has 18 clock zones in total and takes 4.5 clock cycles to generate the output. In this scheme, the maximum number of cells included in one clock phase is reduced to improve the polarization as well as the transmission of the signal throughout the wire [25].

All the discussed models are implemented and simulated using the QCADesigner tool. As an instance, the QCA implementation of the Generalized Cube network is demonstrated in Fig. 17. Moreover, the simulation results are shown in Table 2. The schemas are single-layer implementations and the signal is injected into the circuit using the coplanar crossover model. All the models have the same clock zones, with varying cell counts and areas.

According to [26], the QCA clock rate could be in the range of 1–2 THz. Although there is no frequency setup in QCADesigner, the normal range of QCA clock rate is assumed. Therefore, the delays can be estimated at these clock frequencies.

MINs have also been designed in 16 nm MOSFET and CNFET nanotechnologies and have been simulated at 0.7 V supply voltage using the HSPICE circuit simulator. For 16 nm MOSFET technology, the 16 nm PTM model [27–29] has been used. Furthermore, for 16 nm CNFET technology the Compact SPICE Model for CNFETs including all nonidealities has been utilized [30,31]. This standard model has been designed for unipolar, MOSFET-like CNFET devices, and operates correctly for CNFETs with a minimum channel length of 10 nm. In this model, each transistor may have one or more CNTs as its channel(s). The model also considers Schottky barrier effects; parasitics, including CNT, source/drain, and gate resistances and capacitances; and CNT charge screening effects. The parameters of the CNFET model and their values with brief descriptions are summarized in Table 3.

The simulation results are shown in Table 4 and are plotted in Fig. 18. It is worth mentioning that the delay parameter denotes the critical path delay of the networks. As can be inferred from the
Table 3
Characteristics of CNFET devices.

Table 4
Simulation results of Delta MINs in QCA, CNFET, and MOSFET technologies.
Delay (×10⁻¹² s)
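As a rough cross-check of the QCA timing, the clock-phase counts stated in the text can be converted to time under the assumed 1–2 THz clock rates. This Python sketch assumes four clock phases per cycle, six phases per SE, and three stages, as described in the text; the resulting numbers are back-of-envelope estimates, not the values reported in Table 4.

```python
PHASES_PER_CYCLE = 4  # switch, hold, release, relax


def network_delay_seconds(phases_per_switch: int, stages: int,
                          clock_hz: float) -> float:
    """Estimate the input-to-output delay of a clocked QCA network.

    Total clock phases are converted to clock cycles, then to seconds.
    """
    total_phases = phases_per_switch * stages
    cycles = total_phases / PHASES_PER_CYCLE
    return cycles / clock_hz


if __name__ == "__main__":
    for clock_hz in (1e12, 2e12):
        delay_ps = network_delay_seconds(6, 3, clock_hz) * 1e12
        print(f"{clock_hz / 1e12:.0f} THz clock: {delay_ps:.2f} ps")
```

With six phases per switch and three stages this gives 18 phases, or 4.5 cycles, which corresponds to roughly 4.5 ps at 1 THz and 2.25 ps at 2 THz.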
6. Conclusions
Fig. 17. QCA implementation of Generalized Cube network.

Fig. 18. Comparison of the average Delta MINs in QCA, CNFET, and MOSFET technologies. Networks: Omega (8, 2), Generalized Cube (8, 2), Butterfly (8, 2), Baseline (8, 2). Technologies: MOSFET-16 nm, CNFET-16 nm, QCA-18 nm (1 THz), QCA-18 nm (2 THz). Vertical axis ticks: 0 to 40.
Fig. 19. (a) Outline of the signal path through Baseline network model and (b) baseline network model sample output.
CMOS design process by QCA design process that will allow moves toward more advanced architectures. In this paper, we developed a QCA strategy to construct a generic Delta MIN architecture. We examined the possible implementation types of MINs, such as the Omega network, Butterfly network, Baseline network, and Generalized Cube network. The results presented in this paper show that these networks can be successfully implemented using QCA cells and outperform other nanotechnology-based implementations such as 16 nm CMOS and 16 nm CNFET. Further, the introduced network model can be extended to create more complex devices, and the network model suggested here is capable of being rearranged without making any physical alteration.

Acknowledgment

The authors would like to thank Ms. Sara Hashemi of Nanotechnology and Quantum Computing Laboratory ECE department of Shahid Beheshti University, G.C., for her help with the QCADesigner simulations.

References

[1] Semiconductor Industries Association Roadmap, http://public.itrs.net, 2010.
[2] M.A. Tehrani, K. Navi, A novel quantum dot cellular automata for implementation of multi-valued logic, in: Nano Today Conference, Elsevier, 2009.
[3] N. Kazemifard, M. Ebrahimpour, M. Rahimi, M. Tehrani, K. Navi, Performance evaluation of in-circuit testing on QCA based circuits, in: Proceedings of the 6th IEEE East–West Design and Test Symposium, 2008.
[4] C.S. Lent, P.D. Tougaw, W. Porod, G.H. Bernstein, Quantum cellular automata, Nanotechnology 4 (1) (1993) 49–57.
[5] M.R. Azghadi, O. Kavehei, K. Navi, A novel design for quantum-dot cellular automata cells and full adders, Journal of Applied Sciences 7 (22) (2007) 3460–3468.
[6] K. Navi, R. Farazkish, S. Sayedsalehi, M.R. Azghadi, A new quantum-dot cellular automata full-adder, Microelectronics Journal 41 (12) (2010) 820–826.
[7] J. Duato, S. Yalamanchili, L.M. Ni, Interconnection Networks: An Engineering Approach, Morgan Kaufmann Publishers, 2003.
[8] R. Lauwereins, Creating a world of smart reconfigurable devices, in: Proceedings of the Field Programmable Logic (FPL) Conference, 2002, pp. 790–794.
[9] T. Cheung, A simulation study of the Cray X-MP memory system, IEEE Transactions on Computers 35 (7) (1986) 613–622.
[10] S. Duquennoy, S. Le Beux, P. Marquet, S. Meftali, J. Dekeyser, MpNOC design: modeling and simulation, in: Proceedings of the 15th IP based SoC Design Conference, 2006, pp. 229–232.
[11] Y. Aydi, S. Meftali, M. Abid, Design and performance evaluation of a reconfigurable delta MIN for MPSoC, in: Proceedings of the Ninth International Conference on Microelectronics (ICM), 2007, pp. 115–118.
[12] P.D. Tougaw, C.S. Lent, Logical devices implemented using quantum cellular automata, Journal of Applied Physics 75 (3) (1994) 1818–1825.
[13] C.R. Graunke, D.I. Wheeler, D. Tougaw, J.D. Will, Implementation of a crossbar network using quantum-dot cellular automata, IEEE Transactions on Nanotechnology 4 (4) (2005).
[14] E.N. Ganesh, L. Kishore, M.J.S. Rangachar, Implementation of quantum cellular automata combinational and sequential circuits using majority logic reduction method, International Journal of Nanotechnology and Applications 2 (1) (2008) 89–106.
[15] J.H. Patel, Processor–memory interconnections for multiprocessors, in: Proceedings of the Sixth Annual Symposium on Computer Architecture, 1979, pp. 168–177.
[16] J.H. Patel, Performance of processor–memory interconnections for multiprocessors, IEEE Transactions on Computers 30 (10) (1981) 771–780.
[17] D.A. Lawrie, Access and alignment of data in an array processor, IEEE Transactions on Computers 24 (12) (1975) 1145–1155.
[18] M. Collier, A systematic analysis of equivalence in multistage networks, Journal of Lightwave Technology 20 (9) (2002) 228–240.
[19] H.J. Siegel, et al., Using the multistage cube network topology in parallel computers, Proceedings of the IEEE 77 (12) (1989) 1932–1953.
[20] H.J. Siegel, Interconnection Networks for Large Scale Parallel Processing: Theory and Case Studies, McGraw-Hill, 1990.
[21] K. Walus, et al., http://www.atips.ca/projects/qcadesigner.
[22] M.T. Niemier, P.M. Kogge, Exploring and exploiting wire-level pipelining in emerging technologies, in: Proceedings of the International Symposium of Computer Architecture (ISCA), 2001, pp. 166–177.
[23] V.A. Mardiris, I.G. Karafyllidis, Design and simulation of modular 2^n to 1 quantum-dot cellular automata (QCA) multiplexers, International Journal of Circuit Theory and Applications 38 (8) (2010).
[24] K. Walus, T.J. Dysart, G.A. Jullien, R.A. Budiman, QCADesigner: a rapid design and simulation tool for quantum-dot cellular automata, IEEE Transactions on Nanotechnology 3 (1) (2004) 26–31.
[25] X. Yang, L. Cai, H. Huang, X. Zhao, A comparative analysis and design of quantum-dot cellular automata memory cell architecture, International Journal of Circuit Theory and Applications, DOI: 10.1002/cta.710, 2010.
[26] K. Kim, K. Wu, R. Karri, Quantum-dot cellular automata design guideline, IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E89–A (6) (2006) 1607–1614.
[27] http://ptm.asu.edu/, 2010.
[28] F. Safaei, M.H. Moaiyeri, M.A. Tehrani, Design and evaluating carbon nanotube interconnects for a generic delta MIN, in: Proceedings of the 19th Euromicro International Conference on Parallel, Distributed and Network-Based Computing, 2011.
[29] G. Cho, Y.B. Kim, F. Lombardi, Performance evaluation of CNFET-based logic gates, in: Proceedings of the IEEE International Instrumentation and Measurement Technology Conference, 2009, pp. 909–912.
[30] J. Deng, H.-S.P. Wong, A compact SPICE model for carbon-nanotube field-effect transistors including nonidealities and its application, part I: model of the intrinsic channel region, IEEE Transactions on Electron Devices 54 (12) (2007) 3186–3194.
[31] J. Deng, H.-S.P. Wong, A compact SPICE model for carbon-nanotube field-effect transistors including nonidealities and its application, part II: full device model and circuit performance benchmarking, IEEE Transactions on Electron Devices 54 (12) (2007) 3195–3205.