Microelectronics Journal: Mohammad A. Tehrani, Farshad Safaei, Mohammad Hossein Moaiyeri, Keivan Navi


Microelectronics Journal 42 (2011) 913–922

Contents lists available at ScienceDirect

Microelectronics Journal
journal homepage: www.elsevier.com/locate/mejo

Design and implementation of Multistage Interconnection Networks using Quantum-dot Cellular Automata

Mohammad A. Tehrani, Farshad Safaei*, Mohammad Hossein Moaiyeri, Keivan Navi
Faculty of ECE, Shahid Beheshti University G.C., Evin 1983963113, Tehran, Iran

Article info

Article history:
Received 4 September 2010
Received in revised form 23 February 2011
Accepted 7 March 2011
Available online 11 April 2011

Keywords:
Quantum-dot Cellular Automata
Multistage Interconnection Networks
Network design
Nanoelectronics

Abstract

Quantum-dot Cellular Automata (QCA) is a promising nanotechnology with ultra-small feature size and ultra-low power consumption compared with transistor-based technologies. During the past decade QCA has been carefully studied, and it has demonstrated the ability to use quantum phenomena for implementing logical devices. Multistage Interconnection Networks (MINs) have been frequently suggested as the connection means in parallel systems. This architecture provides the maximum bandwidth to the components and the minimum-latency access to memory modules. MINs are generally accepted concepts in the semiconductor industry for solving problems related to on-chip communications. Although there has been a large amount of research on MINs for parallel processing, there seem to be surprisingly few attempts to utilize the unique characteristics of QCA for designing and implementing MINs. In an effort to fill this gap, this paper presents the first design methodology of MINs using QCA. To demonstrate the functionality and validity of the proposed methodology, performance evaluations of MINs using the QCADesigner simulator are given and analyzed.
© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

In the last three decades, the dimension scaling required for high-density, low-power and high-speed VLSI systems has been provided by complementary metal-oxide semiconductor (CMOS) technology. To sustain Moore's Law, many worldwide efforts have concentrated on finding proper alternatives to CMOS technology, because some sources predict that the CMOS revolution will terminate in the next decade [1,2]. The short time remaining emphasizes the need for research on novel nanoscale technologies that are expected to achieve a density of approximately 10^12 devices/cm^2 [3].

Various kinds of nanoelectronic designs have been introduced, but the main focus of this article is on Quantum-dot Cellular Automata (QCA). QCA was introduced in 1993 [4], and many efforts have recently been made to develop QCA-based designs [2]. It has been proposed as a possible replacement for bulk CMOS technology. In this nanodevice the logical states zero and one are represented by the two possible configurations of a resident electron pair. Because the electrons are unable to move from cell to cell within the circuit, the power dissipation is very small [5,6].

The intensive development of modern communication technology has made it possible to design and construct more complicated, more convenient and more economical high-performance computers and very complex interconnection networks. Large-scale parallel computers, Multiprocessor Systems-on-Chip (MPSoCs), multicomputers, cluster computers and peer-to-peer networks are collections of independent, cooperating microprocessors that communicate by sending and receiving messages over high-speed interconnection networks. These systems are desirable platforms for future generations because they satisfy many critical requirements: they will be energy-efficient, cheap and reliable, and will offer sufficient computing power for advanced and complex applications. To satisfy all these requirements simultaneously, future systems will integrate various types of processors and data memory units, resulting in very heterogeneous platforms.

Communication systems play a very significant role in today's parallel computers, where they interconnect the various components. The specific requirements of these communication systems depend on the architecture of the parallel computer. Multistage Interconnection Networks (MINs) are widely used in parallel multiprocessor systems to connect processors to processors and/or to memory modules. Their popularity is due to the high switching cost of crossbar networks [7]. For instance, MINs are frequently used to connect the nodes of the IBM SP [8] and CRAY X-MP series [9]. Furthermore, MINs are applied in Networks-on-Chip (NoCs) to connect processors to memory modules on MPSoCs [10].

* Corresponding author.
E-mail addresses: m_tehrani@sbu.ac.ir (M.A. Tehrani), f_safaei@sbu.ac.ir, safaei@ipm.ir (F. Safaei), h_moaiyeri@sbu.ac.ir (M.H. Moaiyeri), navi@sbu.ac.ir (K. Navi).

0026-2692/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.
doi:10.1016/j.mejo.2011.03.004
914 M.A. Tehrani et al. / Microelectronics Journal 42 (2011) 913–922

Many variations of MINs have already been introduced. These architectures provide the maximum bandwidth to components (such as DSP, IP, etc.) and the minimum access delay to memory modules. A MIN is characterized by its topology, switching mechanism, routing algorithm, task scheduling strategy, and fault-tolerance [11]. Various topologies of MINs have been proposed and studied in the last few decades. Most of these topologies are derived from well-known graph topologies, including mesh, star, shuffle exchange, tree, and cube-connected networks, among others [7].

The communication platform of parallel architectures can be implemented with MINs, which must be reconfigured for various purposes. QCA can be used to implement digital logic systems by properly arranging cells. So far, several studies on QCA-based circuit design have been reported in the literature [12–14]. However, to the best of our knowledge, there seem to be surprisingly few attempts to design and implement MINs that exploit the unique characteristics of QCA. Indeed, such networks are very important circuits because they are expected to be used to design and realize large-scale parallel systems [7]. In an effort to bridge the gap between MINs and QCA, the main concern of this article is to implement and simulate several types of MINs, namely the Omega network, Butterfly network, Baseline network, and Generalized Cube network [7], on the basis of the QCA architecture. A novel QCA switching element is also implemented, which is used in the fabric switches and MINs.

The remainder of the paper is organized as follows. Section 2 gives an introduction to QCA, describing its physical interactions and logical behavior. In Section 3, the architecture of a MIN is introduced and the structural design of a generic MIN and its components is presented. Section 4 describes the implementation of MINs on the basis of QCA. In Section 5, the simulation results and the verification of functionality for the test networks are detailed. Finally, a summary of results and conclusions can be found in Section 6.

2. Quantum-dot Cellular Automata

The most important feature of QCA lies in the cell architecture. Each cell consists of four quantum-dots placed in a square arrangement as shown in Fig. 1(a). There are also two surplus electrons trapped in these dots. The electrons are unable to move between cells, but they can freely tunnel between the dots within a cell. Due to Coulombic repulsion, the electrons arrange themselves diagonally in the cell so as to be at the farthest possible distance from each other. In this case the system energy is minimized and the electrons are in their ground state. The cell polarization P is defined to measure this alignment and is expressed by Eq. (1), where $\rho_i$ is the probability of presence of an electron in quantum-dot i. For a typical QCA cell shown in Fig. 1(a), the polarization is given by [4]

$$P = \frac{(\rho_1 + \rho_3) - (\rho_2 + \rho_4)}{\rho_1 + \rho_2 + \rho_3 + \rho_4} \tag{1}$$

So, the electrons are mostly found in two possible arrangements, which are used to represent the logical values "zero" and "one". A QCA cell's polarizations and its logical values are illustrated in Fig. 1(b).

Fig. 1. (a) A QCA cell has four quantum-dots (numbered 1 to 4) and two surplus electrons and (b) illustration of the two QCA logical states (logic 0: polarization -1; logic 1: polarization +1).

When two QCA cells are placed together, they affect each other: the electrons of each cell exert a force on the electrons of the other cell. When the cells are situated as shown in Fig. 2(a), they prefer to have the same polarization to minimize the Coulombic repulsion. But when they are rotated by 45°, their polarizations must be different to obtain the least repulsion (Fig. 2(b)).

Fig. 2. (a) Signal propagation between two normal 90° cells and (b) signal propagation between two 45° cells.

A QCA wire can be made by arranging cells in a line as shown in Fig. 3(a). When the head of the line has a specified polarization, the value propagates through the line and every cell takes the same state. A ripple wire can also be made by arranging 45° cells in a line (Fig. 3(b)). The interconnection of normal wires and ripple wires has a very important feature: these wires can cross each other in a coplanar design without any signal interference. That is, the signal on one wire does not affect the signal on the other, and both signals are propagated correctly to the rest of the wires (Fig. 3(c)).

Fig. 3. Illustration of signal propagation through QCA wires: (a) a normal QCA wire does not change the signal value and transfers it intact, (b) a 45° wire changes the signal in each cell, so the value of the signal at the end depends on the number of cells passed, and (c) two QCA wires can cross each other in just one layer without signal interference.

QCA logic gates differ somewhat from conventional CMOS design. The Majority gate and the Inverter are the basic elements of QCA design, and even "AND" and "OR" gates are implemented using Majority gates. A Majority gate consists of five cells: three inputs, one output, and one voter cell. The voter chooses its polarization depending on the states of the input cells. Subsequently, the signal is propagated to the output cell and the rest of the circuit. Another important QCA gate is the Inverter, whose functionality is the same as that of a normal CMOS Inverter. A QCA Majority gate and an Inverter are shown in Fig. 4 [5].

Fig. 4. Illustration of two important QCA logic elements: (a) a Majority gate, with Input A, Input B, Input C, the voter cell and the output; the voter cell chooses its arrangement according to the states of its neighbor cells, and (b) a QCA Inverter.

A Majority gate can easily be turned into an "AND" gate. When one of the inputs of the Majority gate is set to the static value "0", it acts like an AND gate (Eq. (2.a)). An "OR" gate is likewise a Majority gate with a static "1" input (Eq. (2.b)).

$$\mathrm{Majority}(A, B, 0) = A \cdot B + B \cdot 0 + A \cdot 0 = A \cdot B \tag{2.a}$$

$$\mathrm{Majority}(A, B, 1) = A \cdot B + B \cdot 1 + A \cdot 1 = A + B \tag{2.b}$$

3. Network modeling

In this section, we give a brief introduction to the MIN architecture that is used to design the interconnection platform dedicated to MPSoCs. We then present the necessary background information that is used in the paper.

3.1. Structure of MINs

Definition 1. [7] A MIN is defined as a network used to interconnect a group of N inputs to a group of M outputs using several stages of small crossbar switching elements (SE), each followed by an interconnection linking stage. It has n stages, $G_0$ to $G_{n-1}$. Each stage $G_i$ has $w_i$ switches of size $a_{i,j} \times b_{i,j}$, $1 \le j \le w_i$. Thus stage $G_i$ has $p_i$ inputs and $q_i$ outputs such that

$$p_i = \sum_{j=1}^{w_i} a_{i,j}, \qquad q_i = \sum_{j=1}^{w_i} b_{i,j} \tag{3}$$

Linking stages are interconnection functions, each of which is a bijection on the switch addresses of the previous stage that connects all SE outputs of a given stage to the inputs of the next stage. A generic MIN architecture is shown in Fig. 5.
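Definition 1 and Eq. (3) can be made concrete with a small data model. The following is only a sketch: the names `Switch`, `stage_ports` and `is_linking_stage`, the list-based representation of a stage, and the example link pattern are all assumptions introduced for illustration and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Switch:
    a: int  # number of inputs of the SE  (a_{i,j} in Eq. (3))
    b: int  # number of outputs of the SE (b_{i,j} in Eq. (3))


def stage_ports(stage):
    """Eq. (3): return (p_i, q_i), the total inputs and outputs of one stage."""
    return sum(s.a for s in stage), sum(s.b for s in stage)


def is_linking_stage(link, n_ports):
    """A linking stage must be a bijection on the port addresses 0..n_ports-1."""
    return sorted(link) == list(range(n_ports))


# Hypothetical 8 x 8 example: three stages of four 2 x 2 SEs each.
stages = [[Switch(2, 2)] * 4 for _ in range(3)]

# Hypothetical link pattern: output port x feeds input port link[x]
# (a one-position left rotation of the 3-bit port address).
link = [((x << 1) & 0b111) | (x >> 2) for x in range(8)]

print(stage_ports(stages[0]))      # total inputs and outputs of stage G_0
print(is_linking_stage(link, 8))   # True when the pattern is a bijection
```

For two consecutive stages to be connected by a bijection, the output count q_i of one stage has to equal the input count p_{i+1} of the next, which is the case for every stage of this example.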

Fig. 5. Architecture of a generic MIN using 2 × 2 Switching Elements (SE).

3.2. Classification of MINs

In this section, we present a classification of MINs and restate some definitions necessary for the proposed classification. MINs have been classified into three classes depending on the availability of paths to establish new connections. Fig. 6 illustrates a topological classification of MINs [7].

Definition 2. [7] A Banyan network is defined as a class of MINs in which there is one and only one path from any input node to any output node.

Depending on the kind of channels and switches, MINs can be either unidirectional or bidirectional [7]. Additionally, each channel can be either multiplexed or replaced by two or more channels. The latter case is referred to as a dilated MIN [7].

Definition 3. [7] A uniform MIN is one in which all the SEs of a stage are of the same degree.
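The unique-path condition of Definition 2 can be checked mechanically by counting paths. The sketch below assumes a simplified model that is not taken from the paper: every stage consists of 2 × 2 SEs that join ports 2j and 2j+1, and `links[s]` is a hypothetical list giving the permutation applied to the port addresses in front of stage s.

```python
def count_paths(src, dst, links):
    """Count the distinct paths from input `src` to output `dst`.

    links[s][p] is the port reached from port p through the link pattern
    in front of stage s; each 2 x 2 SE connects both of its inputs to
    both of its outputs (ports 2j and 2j + 1).
    """
    counts = {src: 1}
    for link in links:
        reached = {}
        for port, ways in counts.items():
            p = link[port]
            for out in (p & ~1, p | 1):  # the two outputs of the SE
                reached[out] = reached.get(out, 0) + ways
        counts = reached
    return counts.get(dst, 0)


def is_banyan(n_ports, links):
    """Definition 2: exactly one path for every input/output pair."""
    return all(count_paths(s, d, links) == 1
               for s in range(n_ports) for d in range(n_ports))


# Hypothetical 4-port examples with two stages of two 2 x 2 SEs.
rotate = [0, 2, 1, 3]      # left rotation of the 2-bit port address
straight = [0, 1, 2, 3]    # no permutation between stages

print(is_banyan(4, [rotate, rotate]))
print(is_banyan(4, [straight, straight]))
```

With straight-through links, inputs 0 and 1 reach outputs 0 and 1 along two paths each and never reach outputs 2 and 3, so that arrangement fails the test, whereas the rotated links give every pair exactly one path.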
Fig. 6. Classification of MINs.

Definition 4. [7] A Square MIN is one in which a MIN of degree r is built from SEs of size r.

In this paper, we are concerned only with unidirectional Delta Square Uniform Banyan networks (DSUBs), which are a subset of Banyan networks. Delta networks, a subset of Banyan networks, were first proposed by Patel [15,16].

Definition 5. [7] Delta networks are built using $a^n \times b^n$ (where n is the number of stages) digit controlled crossbars of which no input and output can be left unconnected.

Every Delta network is a Banyan network, while the reverse is not always true. Delta networks have a routing property called the self-routing property or Delta property. Fig. 7 shows a typical Delta network.

Fig. 7. A typical Delta network.

Since we are concerned only with Delta networks, non-Delta Banyan networks are not of interest for this article. Uniform Banyan MINs can be either square or non-square. Thus, according to Definitions 1–4, a non-uniform network is also non-square. A DSUB network is a Delta network with all its SEs having the same size, and is the focus of this paper.

In order to simplify the construction of a Delta MIN as well as the design of a routing algorithm, Patel [15,16] proposed a regular link pattern that can be used between all stages and thus avoids a different, difficult construction procedure for every Delta network. Patel termed this regular link pattern the q-shuffle. The q-shuffle of a group of qr elements is a permutation of these elements denoted by

$$S_{q \times r}(i) = \begin{cases} q\,i \bmod (qr - 1), & 0 \le i < qr - 1 \\ i, & i = qr - 1 \end{cases} \tag{4}$$

Applying the q-shuffle function to a number represented in base q corresponds to applying a cyclic shift to the digits of that number. This leads to the construction of a class of MINs called "shuffle-exchange MINs" [15]. Omega networks, which were first defined by Lawrie [17] and are one of the most popular types of Delta networks, are usually described as shuffle-exchange MINs. In fact, all Delta networks are shuffle-exchange MINs.

In the next subsection, we describe the types of Delta networks that are considered the most popular. It should be noted that we focus primarily on the DSUB network with sizes that are a power of 2, i.e., Delta MINs that have crossbars of size 2 × 2.

3.3. Types of Delta networks

The topology plays an important role in the routing strategy, network latency, throughput, and data transfer. We restrict the study to Delta MINs. There exist various popular MINs, which we have grouped as different types of Delta networks. The difference between these networks is the topology of the interconnection links between the crossbar stages. A study on the equivalence of various types of Delta MINs has been reported in [18]. All Delta networks are considered to be topologically equivalent as well as functionally equivalent [18]. We thus classify the popular types of Delta MINs as Omega network, Butterfly network, Baseline network, and Generalized Cube network. It is worth mentioning that the Flip network, reverse Butterfly network, reverse Baseline network, and indirect Binary n-cube network are mirror images of the first four network types, respectively. Hence, in the current study, we focus on the implementation of only the first four types.

3.3.1. Omega network

Omega networks are considered to be the most popular Delta networks. They use the perfect shuffle, which is a special case of a q-shuffle. A more intuitive way to describe the perfect k-shuffle permutation, $\sigma^k$, is [7]

$$\sigma^k : (x_{n-1} x_{n-2} \ldots x_1 x_0) \rightarrow (x_{n-2} \ldots x_1 x_0 x_{n-1}) \tag{5}$$

The perfect k-shuffle permutation performs a cyclic shift of the digits of x to the left by one position.

Fig. 8 shows a schematic diagram of Omega (16, 2), where the first parameter refers to the size of the network and the second parameter corresponds to its degree. It should be noted that the values of q and r, as defined in the q-shuffle formula, are $S_{2 \times 2}$ for this Omega network.
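The statement that the q-shuffle of Eq. (4) acts as a cyclic shift of base-q digits, as in Eq. (5), can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and it assumes that q·r equals the number of shuffled addresses and that this number is a power of q (for example q = 2 and r = 8 for 16 addresses written with four binary digits).

```python
def q_shuffle(i, q, r):
    """Eq. (4): position of element i after the q-shuffle of q*r elements."""
    last = q * r - 1
    return i if i == last else (q * i) % last


def rotate_digits_left(x, n_digits, base):
    """Eq. (5): cyclic left shift of the n_digits base-`base` digits of x."""
    weight = base ** (n_digits - 1)
    msd, rest = divmod(x, weight)  # most significant digit, remaining digits
    return rest * base + msd


# Assumed example: 16 addresses, q = 2, r = 8, four binary digits.
by_formula = [q_shuffle(i, 2, 8) for i in range(16)]
by_rotation = [rotate_digits_left(i, 4, 2) for i in range(16)]

print(by_formula)
print(by_formula == by_rotation)
```

Both lists agree in this example, and each is a permutation of the 16 addresses, which is what a linking stage requires.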
Fig. 8. Schematic diagram of Omega (16, 2) network.

3.3.2. Butterfly network

A Butterfly network is basically an unfolded hypercube. The dimensions of the hypercube correspond to the number of interconnection links between the crossbar stages of Butterfly networks. The ith k-ary butterfly permutation $\beta_i^k$, $0 \le i \le n-1$, is defined by [7]

$$\beta_i^k : (x_{n-1} x_{n-2} \ldots x_{i+1} x_i x_{i-1} \ldots x_1 x_0) \rightarrow (x_{n-1} x_{n-2} \ldots x_{i+1} x_0 x_{i-1} \ldots x_1 x_i) \tag{6}$$

The ith Butterfly permutation interchanges the 0th and ith digits of the index. It should be observed that $\beta_0^k$ defines a straight one-to-one permutation and is also called the identity permutation [7]. A Butterfly (16, 2) network is shown in Fig. 9.

Fig. 9. Schematic diagram of Butterfly (16, 2) network.

3.3.3. Baseline network

In a Baseline network [7], the ith k-ary baseline permutation, $\delta_i^k$, $0 \le i \le n-1$, is expressed by

$$\delta_i^k : (x_{n-1} x_{n-2} \ldots x_{i+1} x_i x_{i-1} \ldots x_1 x_0) \rightarrow (x_{n-1} x_{n-2} \ldots x_{i+1} x_0 x_i x_{i-1} \ldots x_1) \tag{7}$$

The ith baseline permutation performs a cyclic shift of the i+1 least significant digits of the index to the right by one position. It is clear that $\delta_0^k$ also defines the identity permutation I. Fig. 10 shows a Baseline (16, 2) network.

Fig. 10. Schematic diagram of Baseline (16, 2) network.

3.3.4. Generalized Cube network

In a Generalized Cube network [7,19,20] the ith cube permutation $E_i$, $0 \le i \le n-1$, is defined only for k = 2 by

$$E_i : (x_{n-1} x_{n-2} \ldots x_{i+1} x_i x_{i-1} \ldots x_1 x_0) \rightarrow (x_{n-1} x_{n-2} \ldots x_{i+1} \bar{x}_i x_{i-1} \ldots x_1 x_0) \tag{8}$$

The ith cube permutation complements the ith bit of the index. The permutation $E_0$ is also called the exchange permutation [7]. Fig. 11 illustrates a Generalized Cube (16, 2) network.

Fig. 11. Schematic diagram of Generalized Cube (16, 2) network.

4. QCA realization of MINs

In Section 2, the static characteristics of QCA cells were introduced. To realize large circuits such as a Switching Element or Multistage Interconnection Networks, it is important to discuss the features of QCA cells as a part of a large system. In this section,
the notation of a tool called QCADesigner [21], with which most of the figures in this article are drawn, is also described.

4.1. QCA clocking

One of the important behaviors of the cells is their response to the clocking signals. QCA uses a special clocking scheme that can expedite signal propagation and reduce noise in a circuit. Clocking in QCA is completely different from clocking in CMOS circuits and other digital technologies. The QCA clock either allows data to propagate or forces them to stay in position. The clock raises and lowers the inter-dot barriers, so that an electron can either tunnel between dots or must stay where it is. The clock has four phases, namely switch, hold, release, and relax, which describe the raising and lowering of the clock signal (Fig. 12). Each clock phase acts on the quantum-dot barriers and thereby affects the value of the cell as follows.

1. Switch: The cells pass from having no value to having definite values.
2. Hold: The barriers are kept high and the values are the same as in the switch phase.
3. Release: The barriers are lowered and allow the electrons to start tunneling. The cells move from a fixed polarization to no polarization.
4. Relax: The barriers are down and the cell has no polarization.

Because the clock has four different phases, the cells can be arranged in four pipelined zones to propagate the signal faster through the circuit. There is a notation for showing the clocking phases in models [22]: a group of cells in the same phase is shown with the same color (Fig. 13).

Fig. 12. Four phases of QCA clocking.

Fig. 13. Illustration of QCA clocking sections in a QCA wire.

4.2. QCADesigner

QCADesigner is a tool generally used for simulating QCA circuits. A QCA model can be single layer or multilayer. In a single layer design, only normal cells and fixed polarization cells are used; they are depicted in Fig. 14(a) and (b). When a QCA signal moves from one layer to another, it goes via vertical cells (Fig. 14(c)). Then, in the upper layer, it propagates through crossover cells (Fig. 14(d)). Finally, it can go down to the main layer via vertical cells.

4.3. QCA 2 × 2 switching element

In this section the hardware implementation of a 2 × 2 SE is presented. The signals can either propagate straight through or exchange their paths. Here, the SE is implemented using two multiplexers. The best multiplexer design is suggested by Mardiris and Karafyllidis [23], with 62 cells and 0.12 μm² area, as shown in Fig. 15(a). The logical design of the 2 × 2 SE is shown in Fig. 15(b).

The QCA implementation of the 2 × 2 SE design is shown in Fig. 16(a). This QCA implementation has been simulated and tested, and the results have been verified with QCADesigner [21] Version 2.0.3. Table 1 presents a brief description of each parameter used for the bi-stable approximation simulation engine [24].

Fig. 16(b) shows the simulated waveforms of the QCA 2 × 2 SE. It contains 157 cells arranged in a 0.25 μm² area. It is implemented in a single layer with six clock zones, and the output shows the results after a delay of 1.5 clock cycles.
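At the logic level, a 2 × 2 SE built from two multiplexers can be sketched as follows. This is only an illustration under stated assumptions: the function names and the `exchange` control input are hypothetical, and the multiplexer is written in the textbook AND/OR/NOT form, with AND and OR expressed through Majority gates as in Eqs. (2.a) and (2.b). It does not reproduce the cell-level multiplexer of [23] or the layout of Fig. 16.

```python
def majority(a, b, c):
    """Three-input Majority gate on 0/1 values."""
    return (a & b) | (b & c) | (a & c)


def and_gate(a, b):
    return majority(a, b, 0)   # Eq. (2.a): one input fixed to 0


def or_gate(a, b):
    return majority(a, b, 1)   # Eq. (2.b): one input fixed to 1


def inverter(a):
    return 1 - a


def mux2(d0, d1, sel):
    """2-to-1 multiplexer: returns d0 when sel = 0 and d1 when sel = 1."""
    return or_gate(and_gate(d0, inverter(sel)), and_gate(d1, sel))


def switch_2x2(in0, in1, exchange):
    """Two multiplexers sharing one control input (assumed name `exchange`).

    exchange = 0 passes the inputs straight through,
    exchange = 1 swaps them.
    """
    return mux2(in0, in1, exchange), mux2(in1, in0, exchange)


for exchange in (0, 1):
    print(exchange, switch_2x2(1, 0, exchange))
```

Each multiplexer in this form uses three Majority gates and shares one Inverter, so the sketch is a functional model only and says nothing about cell count, area or clock zones.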

Fig. 14. Various QCADesigner cells: (a) Normal cell, (b) Fix polarization cell, (c) Vertical cell, and (d) Crossover cell.

Fig. 15. (a) Implementation of a multiplexer with QCA cells and (b) illustration of the logic diagram of a 2 × 2 Switching Element.

5. Experimental results

Some models of MINs are implemented here using the mentioned SE. These models are implemented in three stages with 12 SEs.
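For the 8-port, three-stage case used in Section 5 (four 2 × 2 SEs per stage, 12 SEs in total), the link permutations of Eqs. (5)–(8) reduce to simple bit operations on a 3-bit port address. The sketch below is illustrative: all function names are hypothetical, and the destination-tag routing shown for the Omega network is the standard textbook scheme, assumed here rather than taken from the paper's QCA layout.

```python
N_BITS = 3  # 8 ports, hence three stages of four 2 x 2 SEs


def shuffle(x, n=N_BITS):
    """Eq. (5) with k = 2: rotate the n address bits left by one position."""
    msb = (x >> (n - 1)) & 1
    return ((x << 1) & ((1 << n) - 1)) | msb


def butterfly(x, i):
    """Eq. (6) with k = 2: interchange bit 0 and bit i."""
    if (x & 1) != ((x >> i) & 1):
        x ^= (1 << i) | 1
    return x


def baseline(x, i):
    """Eq. (7) with k = 2: rotate the i + 1 low bits right by one position."""
    mask = (1 << (i + 1)) - 1
    low = x & mask
    return (x & ~mask) | (low >> 1) | ((low & 1) << i)


def cube(x, i):
    """Eq. (8): complement bit i."""
    return x ^ (1 << i)


def omega_route(src, dst, n=N_BITS):
    """Destination-tag routing through an Omega network (assumed scheme).

    Before every stage the address is shuffled; the SE then selects its
    upper or lower output according to one bit of dst, MSB first.
    Returns the final port and a list of (stage, SE index, output) hops.
    """
    addr, hops = src, []
    for stage in range(n):
        addr = shuffle(addr, n)
        bit = (dst >> (n - 1 - stage)) & 1
        addr = (addr & ~1) | bit
        hops.append((stage, addr >> 1, bit))
    return addr, hops


final, hops = omega_route(3, 7)
print(final, hops)
```

Routing every source to every destination in this model visits each of the 12 SEs, and every path crosses exactly three of them, one per stage.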
Fig. 16. (a) Illustration of a QCA 2 × 2 SE and (b) output signals of the QCA 2 × 2 SE when the number of samples equals 50,000.

Table 1
Parameters of the model in the QCADesigner simulator.

| Parameter | Description | Value |
|---|---|---|
| Cell width | Width of each QCA square (should be equal to the height) | 18 nm |
| Cell height | Height of each QCA square | 18 nm |
| Dot diameter | Diameter of each dot in a QCA cell | 5 nm |
| Number of samples | Number of tested data during the simulation. Accuracy depends on this parameter | 50,000 and 2,000,000 |
| Convergence tolerance | Simulation for each sample iterates until the new value of polarization deviates from the old value by less than this predefined error limit | 0.001 |
| Radius of effect | Radius of effect of a cell is the radius at which it will interact with other cells | 65 nm |
| Relative permittivity | Ratio of the permittivity of the fabrication material (for GaAs/AlGaAs) to the vacuum permittivity | 12.9 |
| Clock high | Saturation energy of clock signal when it is high | 9.8E-22 J |
| Clock low | Saturation energy of clock signal when it is low | 3.8E-23 J |
| Clock amplitude factor | To make an effective clock, the top 25% and bottom 25% of a sine signal are dismissed | 2 |
| Layer separation | Distance between two layers | 11.5 nm |
| Maximum iterations per sample | When the simulation of a state has not converged after this number of iterations, it automatically goes to the next state | 100 |

As presented in the previous section, each switch introduces a delay of six clock phases. When the switches are put in three stages, the total delay is expected to reach 18 clock phases. Each network has 18 clock zones in total and takes 4.5 clock cycles to generate the output. In this scheme, the maximum number of cells included in one clock phase is reduced to improve the polarization as well as the transmission of the signal throughout the wire [25].

All the discussed models are implemented and simulated using the QCADesigner tool. As an instance, the QCA implementation of the Generalized Cube network is shown in Fig. 17. Moreover, the simulation results are shown in Table 2. The schemes are single layer implementations and the signal is injected into the circuit using the coplanar crossover model. All the models have the same number of clock zones but differ in cell count and area.

According to [26], the QCA clock rate could be in the range of 1–2 THz. Although there is no frequency setup in QCADesigner, the normal range of the QCA clock rate is assumed. Therefore, the delays can be estimated at these clock frequencies.

MINs have also been designed in 16 nm MOSFET and CNFET nanotechnologies and have been simulated at a 0.7 V supply voltage using the HSPICE circuit simulator. For 16 nm MOSFET technology, the 16 nm PTM model [27–29] has been used. Furthermore, for 16 nm CNFET technology the Compact SPICE Model for CNFETs including all nonidealities has been utilized [30,31]. This standard model has been designed for unipolar, MOSFET-like CNFET devices and operates correctly for CNFETs with a minimum channel length of 10 nm. In this model, each transistor may have one or more CNTs as its channel(s). The model also considers Schottky barrier effects, parasitics (including CNT, source/drain, and gate resistances and capacitances) and CNT charge screening effects. The parameters of the CNFET model and their values, with brief descriptions, are summarized in Table 3.

The simulation results are shown in Table 4 and are plotted in Fig. 18. It is worth mentioning that the delay parameter denotes the critical path delay of the networks. As can be inferred from the
Table 3
Characteristics of CNFET devices.

| Parameter | Description | Value |
|---|---|---|
| DCNT | CNT diameter | 1.487 nm |
| Lch | Physical channel length | 16 nm |
| Lss | Length of doped CNT source-side extension region | 16 nm |
| Ldd | Length of doped CNT drain-side extension region | 16 nm |
| Lgeff | Scattering mean free path in the intrinsic CNT channel and S/D regions | 100 nm |
| Kox | Dielectric constant of high-k top gate dielectric material | 16 |
| Ksub | Dielectric constant of substrate (10 μm thick SiO2) | 4 |
| Tox | Thickness of high-k top gate dielectric material | 4 nm |
| Efi | Fermi level of the doped S/D tube | 6 eV |
| Csub | Coupling capacitance between the channel region and the substrate (SiO2) | 20 aF/μm |

Table 4
Simulation results of Delta MINs in QCA, CNFET, and MOSFET technologies. Delay (×10^-12 s).

| Network | MOSFET-16 nm | CNFET-16 nm | QCA-18 nm (1 THz) | QCA-18 nm (2 THz) |
|---|---|---|---|---|
| Baseline (8,2) | 35.216 | 17.141 | 4.500 | 2.250 |
| Butterfly (8,2) | 35.901 | 17.031 | 4.500 | 2.250 |
| Generalized Cube (8,2) | 38.550 | 17.443 | 4.500 | 2.250 |
| Omega (8,2) | 32.954 | 17.036 | 4.500 | 2.250 |
| Average Improvement (QCA-CNFET) % | | | 73.78 | 86.89 |
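The QCA columns and the last row of Table 4 can be reproduced with a few lines of arithmetic. The sketch below rests on two assumptions that are not spelled out in the paper: one clock cycle spans four clock zones (the four phases of Section 4.1), and the improvement is defined as 1 - t_QCA / t_CNFET, averaged over the four networks. The function names are hypothetical.

```python
CLOCK_ZONES = 18       # three SEs in series, six clock zones each
ZONES_PER_CYCLE = 4    # assumed: one zone per clock phase


def qca_delay_ps(clock_thz):
    """Delay in picoseconds: cycles divided by the clock rate in THz."""
    cycles = CLOCK_ZONES / ZONES_PER_CYCLE
    return cycles / clock_thz


# CNFET-16 nm delays from Table 4, in picoseconds.
cnfet_ps = {
    "Baseline (8,2)": 17.141,
    "Butterfly (8,2)": 17.031,
    "Generalized Cube (8,2)": 17.443,
    "Omega (8,2)": 17.036,
}


def average_improvement(reference_ps, qca_ps):
    """Mean of (1 - t_QCA / t_ref) over all networks, in percent."""
    gains = [1 - qca_ps / t for t in reference_ps.values()]
    return 100 * sum(gains) / len(gains)


for f_thz in (1.0, 2.0):
    t = qca_delay_ps(f_thz)
    print(f_thz, t, round(average_improvement(cnfet_ps, t), 2))
```

Under these assumptions the computed delays are 4.5 ps and 2.25 ps, and the averaged improvements round to the 73.78% and 86.89% reported in the table.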

5.1. Verification of functionality

The models of the above-mentioned MINs have been tested and verified with QCADesigner [21] Version 2.0.3, using 2,000,000 samples during the simulation process. All the models are implemented in only one layer, so both the logic devices and the interconnecting wires are placed in the main layer. Since all the designs are simulated in the same way, only the output vector for one set of test data is presented here. It shows the transition path from the input to the output of the Baseline multistage network. The other transition paths are obtained in the same manner.

Fig. 19 depicts an output sample of the Baseline network generated by the QCADesigner tool, in which the data propagate from Input 3 to Output 7 through S10, S21, and S32. In this scheme, 4.5 clock cycles are needed for the inputs to propagate and become available at the outputs.

results, QCA-based networks outperform the CNFET- and MOSFET-based ones, and the delay improvement is about 80% in QCA-based networks even though their feature size is larger than that of the CNFET and MOSFET processes [27].

Fig. 17. QCA implementation of Generalized Cube network.

Table 2
Characteristics of each MIN network with QCA strategy.

| Network | Area (μm²) | Complexity (#cells) | Clock cycle |
|---|---|---|---|
| Baseline | 3.85 | 2491 | 4.5 (18 zones) |
| Butterfly | 3.81 | 2503 | 4.5 (18 zones) |
| Generalized Cube | 3.81 | 2503 | 4.5 (18 zones) |
| Omega | 3.94 | 2617 | 4.5 (18 zones) |

6. Conclusions

The evolution of digital design lies in the ability to shrink the circuit size with each advance in process technology. The future thus points to nanoelectronics as the way to continue the improvements that have so far been achieved using CMOS technology. One of the nanoelectronic architectures that has been recognized as one of the top six emerging technologies for future computers is Quantum-dot Cellular Automata (QCA). On the other hand, Multistage Interconnection Networks (MINs) play a very significant role in modern digital logic design and are widely used in parallel systems to interconnect the various components. While most QCA devices designed to this point have focused on discrete logic elements, the majority of actual circuits have been implemented using the standard bulk CMOS process. This has motivated the need to replace the conventional
CMOS design process with a QCA design process that allows a move toward more advanced architectures. In this paper, we developed a QCA strategy to construct a generic Delta MIN architecture. We examined the possible implementation types of MINs, such as the Omega network, Butterfly network, Baseline network, and Generalized Cube network. The results presented in this paper show that these networks can be successfully implemented using QCA cells and outperform other nanotechnology-based implementations such as 16 nm CMOS and 16 nm CNFET. Furthermore, the introduced network model can be extended to create more complex devices, and the network model suggested here is capable of being rearranged without any physical alteration.

Fig. 18. Comparison of the average Delta MINs in QCA, CNFET, and MOSFET technologies. (Plot with a vertical axis from 0 to 40; networks: Omega (8, 2), Generalized Cube (8, 2), Butterfly (8, 2), Baseline (8, 2); series: MOSFET-16 nm, CNFET-16 nm, QCA-18 nm (1 THz), QCA-18 nm (2 THz).)

Fig. 19. (a) Outline of the signal path through the Baseline network model and (b) Baseline network model sample output.

Acknowledgment

The authors would like to thank Ms. Sara Hashemi of the Nanotechnology and Quantum Computing Laboratory, ECE department of Shahid Beheshti University, G.C., for her help with the QCADesigner simulations.

References

[1] Semiconductor Industries Association Roadmap, 〈http://public.itrs.net〉, 2010.
[2] M.A. Tehrani, K. Navi, A novel quantum dot cellular automata for implementation of multi-valued logic, in: Nano Today Conference, Elsevier, 2009.
[3] N. Kazemifard, M. Ebrahimpour, M. Rahimi, M. Tehrani, K. Navi, Performance evaluation of in-circuit testing on QCA based circuits, in: Proceedings of the 6th IEEE East–West Design and Test Symposium, 2008.
[4] C.S. Lent, P.D. Tougaw, W. Porod, G.H. Bernstein, Quantum cellular automata, Nanotechnology 4 (1) (1993) 49–57.
[5] M.R. Azghadi, O. Kavehei, K. Navi, A novel design for quantum-dot cellular automata cells and full adders, Journal of Applied Sciences 7 (22) (2007) 3460–3468.
[6] K. Navi, R. Farazkish, S. Sayedsalehi, M.R. Azghadi, A new quantum-dot cellular automata full-adder, Microelectronics Journal 41 (12) (2010) 820–826.
[7] J. Duato, S. Yalamanchili, L.M. Ni, Interconnection Networks: An Engineering Approach, Morgan Kaufmann Publishers, 2003.

[8] R. Lauwereins, Creating a world of smart reconfigurable devices, in: Proceeding of the Field Programmable Logic (FPL) conference, 2002, pp. 790–794.
[9] T. Cheung, A simulation study of the Cray X-MP memory system, IEEE Transactions on Computers 35 (7) (1986) 613–622.
[10] S. Duquennoy, S. Le Beux, P. Marquet, S. Meftali, J. Dekeyser, MpNOC design: modeling and simulation, in: Proceedings of the 15th IP based SoC Design Conference, 2006, pp. 229–232.
[11] Y. Aydi, S. Meftali, M. Abid, Design and performance evaluation of a reconfigurable delta MIN for MPSoC, in: Proceedings of the ninth International Conference on Microelectronics (ICM), 2007, pp. 115–118.
[12] P.D. Tougaw, C.S. Lent, Logical devices implemented using quantum cellular automata, Journal of Applied Physics 75 (3) (1994) 1818–1825.
[13] C.R. Graunke, D.I. Wheeler, D. Tougaw, Jeffery D. Will, Implementation of a crossbar network using quantum-dot cellular automata, IEEE Transactions on Nanotechnology 4 (4) (2005).
[14] E.N. Ganesh, L. Kishore, M.J.S. Rangachar, Implementation of quantum cellular automata combinational and sequential circuits using majority logic reduction method, International Journal of Nanotechnology and Applications 2 (1) (2008) 89–106.
[15] J.H. Patel, Processor–memory interconnections for multiprocessors, in: Proceedings of the sixth Annual Symposium on Computer Architecture, 1979, pp. 168–177.
[16] J.H. Patel, Performance of processor–memory interconnections for multiprocessors, IEEE Transactions on Computers 30 (10) (1981) 771–780.
[17] D.A. Lawrie, Access and alignment of data in an array processor, IEEE Transactions on Computers 24 (12) (1975) 1145–1155.
[18] M. Collier, A systematic analysis of equivalence in multistage networks, Journal of Light Wave Technology 20 (9) (2002) 228–240.
[19] H.J. Siegel, et al., Using the multistage cube network topology in parallel computers, Proceedings of the IEEE 77 (12) (1989) 1932–1953.
[20] H.J. Siegel, Interconnection Networks for Large Scale Parallel Processing: Theory and Case Studies, McGraw-Hill, 1990.
[21] K. Walus, et al., http://www.atips.ca/projects/qcadesigner.
[22] M.T. Niemier, P.M. Kogge, Exploring and exploiting wire-level pipelining in emerging technologies, in: Proceeding of the International Symposium of Computer Architecture (ISCA), 2001, pp. 166–177.
[23] V.A. Mardiris, I.G. Karafyllidis, Design and simulation of modular 2^n to 1 quantum-dot cellular automata (QCA) multiplexers, International Journal of Circuit Theory and Applications 38 (8) (2010).
[24] K. Walus, T.J. Dysart, G.A. Jullien, R.A. Budiman, QCADesigner: a rapid design and simulation tool for quantum-dot cellular automata, IEEE Transactions on Nanotechnology 3 (1) (2004) 26–31.
[25] X. Yang, L. Cai, H. Huang, X. Zhao, A comparative analysis and design of quantum-dot cellular automata memory cell architecture, International Journal of Circuit Theory and Applications, DOI: 10.1002/cta.710, 2010.
[26] K. Kim, K. Wu, R. Karri, Quantum-dot cellular automata design guideline, IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E89–A (6) (2006) 1607–1614.
[27] http://ptm.asu.edu/, 2010.
[28] F. Safaei, M.H. Moaiyeri, M.A. Tehrani, Design and evaluating carbon nanotube interconnects for a generic delta MIN, in: Proceedings of the 19th Euromicro International Conference on Parallel, Distributed and Network-Based Computing, 2011.
[29] G. Cho, Y.B. Kim, F. Lombardi, Performance evaluation of CNFET-based logic gates, in: Proceeding of the IEEE International Instrumentation and Measurement Technology Conference, 2009, pp. 909–912.
[30] J. Deng, H.-S.P. Wong, A compact SPICE model for carbon-nanotube field-effect transistors including nonidealities and its application, part I: model of the intrinsic channel region, IEEE Transactions on Electron Devices 54 (12) (2007) 3186–3194.
[31] J. Deng, H.-S.P. Wong, A compact SPICE model for carbon-nanotube field-effect transistors including nonidealities and its application, part II: full device model and circuit performance benchmarking, IEEE Transactions on Electron Devices 54 (12) (2007) 3195–3205.
