Sequential Circuit Design: Practice

RTL Hardware Design, Chapter 9
by P. Chu

Outline
1. Poor design practice and remedy
2. More counters
3. Register as fast temporary storage
4. Pipelined circuit

1. Poor design practice and remedy

• Synchronous design is the most important methodology
• Poor practice in the past (to save chips)
– Misuse of asynchronous reset
– Misuse of gated clock
– Misuse of derived clock

Misuse of asynchronous reset
• Poor design: use reset to clear the register during
normal operation.
• E.g., a poorly designed mod-10 counter
– Clear the register immediately after the counter
reaches 1010

• Problems
– Glitches in the transition 1001 (9) => 0000 (0)
– Glitches in async_clr can reset the counter
– Timing analysis is difficult (what is the maximal clock
rate?)
• Asynchronous reset should only be used
for power-on initialization

• Remedy: load “0000” synchronously
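
The synchronous remedy can be illustrated with a small behavioral model. This is a Python sketch rather than HDL, and the names mod10_next and simulate are hypothetical, chosen only for illustration. The point it shows is that the clear is part of the next-state logic, so the register never holds 1010 and no asynchronous signal is involved.

```python
def mod10_next(r_reg):
    """Next-state logic: load 0 synchronously after 9, otherwise increment."""
    return 0 if r_reg == 9 else r_reg + 1


def simulate(cycles, start=0):
    """Return the register value observed in each clock cycle."""
    r_reg = start
    trace = []
    for _ in range(cycles):
        trace.append(r_reg)
        # The register only takes its next value at the clock edge.
        r_reg = mod10_next(r_reg)
    return trace
```

Because the wrap-around happens in the next-state logic, the state 1010 is never reached, which is what removes the glitch-sensitive asynchronous clear path.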

Misuse of gated clock
• Poor design: use an AND gate to disable the
clock and stop the register from loading a new value
• E.g., a counter with an enable signal

• Problem
– Gated clock width can be narrow
– Gated clock may pass glitches of en
– Difficult to design the clock distribution
network

• Remedy: use a synchronous enable
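
A synchronous enable can be pictured as a multiplexer in front of the register input: the clock always runs, and en only selects between holding and incrementing. The following Python model is a sketch under that assumption; counter_next and run are hypothetical names, and the 4-bit width is an arbitrary choice.

```python
def counter_next(r_reg, en, width=4):
    """Next-state logic: increment when en is 1, hold the value when en is 0."""
    if en:
        return (r_reg + 1) % (1 << width)
    return r_reg


def run(en_sequence, width=4):
    """Apply one enable value per clock edge; return the value after each edge."""
    r_reg = 0
    trace = []
    for en in en_sequence:
        r_reg = counter_next(r_reg, en, width)
        trace.append(r_reg)
    return trace
```

For instance, run([1, 1, 0, 0, 1]) returns [1, 2, 2, 2, 3]: the register is clocked on every edge but keeps its value while en is 0.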

Misuse of derived clock
• Subsystems may run at different clock rates
• Poor design: use a derived slow clock for the slow
subsystem

• Problems
– Multiple clock distribution networks
– Timing analysis is difficult (what is the maximal clock
rate?)

• Better: use a synchronous enable pulse that is one clock cycle wide

• E.g., seconds and minutes counter
– Input: 1 MHz clock
– Poor design:

– Better design

• VHDL code of poor design

• Remedy: use a synchronous 1-clock pulse
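
A behavioral sketch of the one-clock-pulse approach follows. It is a Python model, not the slide's VHDL, and the class name SecMinCounter, the signal names s_en and m_en, and the clk_per_sec parameter are assumptions made for illustration. All three counters are updated on the same clock edge; the slower counters advance only in the cycles where their enable pulse is asserted.

```python
class SecMinCounter:
    """Seconds/minutes counter driven by one clock plus enable pulses."""

    def __init__(self, clk_per_sec=1_000_000):
        self.clk_per_sec = clk_per_sec  # 1 MHz clock: 1,000,000 cycles per second
        self.r_reg = 0  # free-running mod-clk_per_sec counter
        self.sec = 0    # mod-60 seconds counter
        self.min = 0    # mod-60 minutes counter

    def tick(self):
        """Model one rising edge of the single system clock."""
        # One-clock-wide enable pulses derived from the current state.
        s_en = self.r_reg == self.clk_per_sec - 1
        m_en = s_en and self.sec == 59
        # Compute every next value from the current values first.
        r_next = 0 if s_en else self.r_reg + 1
        sec_next = self.sec
        if s_en:
            sec_next = 0 if self.sec == 59 else self.sec + 1
        min_next = self.min
        if m_en:
            min_next = 0 if self.min == 59 else self.min + 1
        # Then update all registers together, as flip-flops would.
        self.r_reg, self.sec, self.min = r_next, sec_next, min_next
```

For quick experiments a small clk_per_sec (for example 4) keeps the simulation short without changing the structure.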

A word about power
• Power is now a major design criterion
• In CMOS technology
– Dynamic power is proportional to the switching
frequency of transistors
– A high clock rate implies a high switching frequency
• Clock manipulation
– Can reduce switching frequency
– But should not be done at the RT level

• Development flow:
1. Design/synthesize/verify a regular
synchronous subsystem
2(a). Derived clock: use a special circuit (PLL etc.)
to obtain derived clocks
2(b). Gated clock: use a “power optimization”
software tool to convert some registers to use a
gated clock

2. More counters
• A counter circulates through a set of specific patterns
• Counter types:
– Binary counter
– Gray counter
– Ring counter
– Linear feedback shift register (LFSR)
– BCD counter

• Binary counter:
– State follows binary counting sequence
– Use an incrementor for the next-state logic

[Figure: binary counter block diagram. A register (d, q, clk, reset) holds r_reg; a +1 incrementor computes r_next from r_reg.]

• Gray counter:
– State changes one bit at a time
– Use a Gray incrementor
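
One common way to build a Gray incrementor is to convert the Gray code to binary, add 1, and convert back. The Python sketch below assumes that construction; it may differ from the circuit on the original slides, and the function names are hypothetical.

```python
def gray_to_bin(g):
    """Convert a Gray-coded integer to plain binary (xor of all right shifts)."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b


def bin_to_gray(b):
    """Convert a binary integer to Gray code."""
    return b ^ (b >> 1)


def gray_next(g, width=4):
    """Gray incrementor: Gray -> binary, add 1 (mod 2**width), binary -> Gray."""
    b = (gray_to_bin(g) + 1) % (1 << width)
    return bin_to_gray(b)
```

Starting from 0000, repeated calls to gray_next visit 0000, 0001, 0011, 0010, 0110 and so on, with exactly one bit changing per step, including the wrap-around back to 0000.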

Ring counter
• Circulate a single 1
• E.g., 4-bit ring counter:
1000, 0100, 0010, 0001
• n patterns for n-bit register
• Output appears as an n-phase signal
• Non self-correcting design
– Insert “0001” at initialization and
circulate the pattern in normal operation
– Fastest counter
• Self-correcting design:
shift in a ‘1’ only when the 3 MSBs are 000
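
The self-correcting rule can be checked with a short Python model. It assumes a 4-bit register that shifts right (1000, 0100, 0010, 0001) with the serial input entering at the MSB; ring_next is a hypothetical name.

```python
def ring_next(r_reg, width=4):
    """Self-correcting ring counter: shift right, feed a 1 only when the MSBs are all 0."""
    upper = r_reg >> 1                 # the width-1 MSBs, moved down one position
    s_in = 1 if upper == 0 else 0      # serial input for the new MSB
    return (s_in << (width - 1)) | upper
```

With this next-state logic an invalid pattern such as 1010 drains to 0101, 0010, 0001 and then rejoins the normal 1000, 0100, 0010, 0001 cycle, and the all-zero state goes directly to 1000.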

LFSR (Linear Feedback Shift Reg)
• A shift register with a special feedback circuit
to generate the serial input
• The feedback circuit performs an xor
operation over specific bits
• Can circulate through 2^n - 1 states for an n-bit
register

• E.g., 4-bit LFSR

• Property of LFSR
– An n-bit LFSR can cycle through up to 2^n - 1 states (the all-zero state is excluded)
– A feedback circuit that achieves this always exists
– The sequence is pseudorandom
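
These properties can be observed on a small model. The Python sketch below assumes a 4-bit LFSR that shifts right and feeds back the xor of bits 1 and 0 (corresponding to the primitive polynomial x^4 + x + 1); the tap choice and the names lfsr4_next and lfsr4_cycle are assumptions and may not match the figure on the original slide.

```python
def lfsr4_next(r_reg):
    """Shift right by one; the new MSB is the xor of the old bits 1 and 0."""
    fb = ((r_reg >> 1) ^ r_reg) & 1
    return (fb << 3) | (r_reg >> 1)


def lfsr4_cycle(seed=0b0001):
    """Collect the states visited from seed until the seed reappears."""
    states = []
    s = seed
    while True:
        states.append(s)
        s = lfsr4_next(s)
        if s == seed:
            return states
```

Starting from any nonzero seed the model visits all 15 nonzero states before repeating, while the all-zero state maps to itself, which is why it must be avoided at initialization.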

• Application of LFSR
– Pseudorandom: used in testing, data
encryption/decryption
– A counter with simple next-state logic
e.g., a 128-bit LFSR uses 3 xor gates to circulate
2^128 - 1 patterns (takes about 10^20 years for a 100 GHz
system)

• Read the remainder of Section 9.2.3 (design that
includes the 00..00 state)

• Read Section 9.2.4 (BCD counter; the design is
similar to the seconds/minutes counter in
Section 9.1.3)

PWM (pulse width modulation)
• Duty cycle: percentage of time that the
signal is asserted
• PWM: use a signal, w, to specify the duty
cycle
– Duty cycle is w/16 if w is not “0000”
– Duty cycle is 16/16 if w is “0000”
• Implemented by a binary counter with a
special output circuit
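
The duty-cycle rule above can be modeled as a 4-bit binary counter plus a comparison. This Python sketch is an illustration only; pwm_wave is a hypothetical name, and a hardware version would typically also pass the output through a register to remove glitches.

```python
def pwm_wave(w, cycles=16):
    """Return the PWM output for each clock cycle, given a 4-bit duty value w."""
    r_reg = 0  # free-running 4-bit binary counter
    out = []
    for _ in range(cycles):
        # Output circuit: high while the count is below w; w = 0 means always high.
        out.append(1 if (r_reg < w or w == 0) else 0)
        r_reg = (r_reg + 1) % 16
    return out
```

Summing one 16-cycle period gives the duty cycle directly: sum(pwm_wave(4)) is 4 (4/16), and sum(pwm_wave(0)) is 16 (16/16).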
3. Register as fast temporary storage
• RAM
– RAM cell is designed at the transistor level
– Cell uses minimal area
– Behaves like a latch
– For mass storage
– Needs special interface logic
• Register
– D FF requires a much larger area
– Synchronous
– For small, fast storage
– E.g., register file, fast FIFO, fast CAM (content
addressable memory)

Register file
• Registers arranged as a 1-D array
• Each register is identified by an address
• Normally has 1 write port (with a write
enable signal)
• Can have multiple read ports

• E.g., 4-word register file w/ 1 write port
and two read ports

• Register array:
– 4 registers
– Each register has an enable signal
• Write decoding circuit:
– Output is 0000 if wr_en is 0
– 1 bit is asserted according to w_addr if wr_en is 1
• Read circuit:
– A mux for each read port
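
The three parts listed above (register array, write decoder, read muxes) map onto a small Python model. This is a behavioral sketch, not the slide's VHDL; the class RegFile, its method names, and the 8-bit word width are assumptions for illustration, while wr_en and w_addr follow the slide.

```python
class RegFile:
    """4-word register file with one write port and two read ports."""

    def __init__(self, depth=4, width=8):
        self.depth = depth
        self.mask = (1 << width) - 1
        self.regs = [0] * depth  # the register array

    def write_decode(self, wr_en, w_addr):
        """Write decoder: one enable bit per register, all zeros if wr_en is 0."""
        return [1 if (wr_en and i == w_addr) else 0 for i in range(self.depth)]

    def clock(self, wr_en, w_addr, w_data):
        """One clock edge: only the register whose enable bit is set loads w_data."""
        en = self.write_decode(wr_en, w_addr)
        self.regs = [
            (w_data & self.mask) if en[i] else self.regs[i]
            for i in range(self.depth)
        ]

    def read(self, r_addr0, r_addr1):
        """Read circuit: one mux per read port, selecting by address."""
        return self.regs[r_addr0], self.regs[r_addr1]
```

A short usage example: after rf = RegFile() and rf.clock(1, 2, 0xAB), the call rf.read(2, 1) returns (0xAB, 0), and a later rf.clock(0, 1, 0xFF) changes nothing because wr_en is 0.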

• A 2-D data type is needed to describe the register array

FIFO Buffer
• “Elastic” storage between two subsystems

• Circular queue implementation
• Use two pointers and a “generic storage”
– Write pointer: points to the empty slot before
the head of the queue
– Read pointer: points to the tail of the queue

• FIFO controller
– Read and write pointers: 2 counters
– Status circuit (full/empty): the difficult part
• Design 1: augmented binary counter
• Design 2: with status FFs
– An LFSR can be used as the pointer counter

• Augmented binary counter:
– Widen each pointer counter by 1 bit
– Use the LSBs as the register address
– Use the MSBs to distinguish full from empty
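
The augmented-counter scheme can be sketched as follows. This Python model assumes a 4-word FIFO (2 address bits, so 3-bit pointers) and that a write when full or a read when empty is simply ignored; the class name FifoCtrlAug and its attributes are hypothetical.

```python
class FifoCtrlAug:
    """FIFO controller whose pointers carry one extra MSB."""

    def __init__(self, addr_bits=2):
        self.n = addr_bits
        self.w_ptr = 0  # addr_bits + 1 bits wide
        self.r_ptr = 0

    def _split(self, ptr):
        """Return (extra MSB, register address) of a pointer."""
        return ptr >> self.n, ptr & ((1 << self.n) - 1)

    @property
    def empty(self):
        # Same address and same MSB: the pointers are identical.
        return self.w_ptr == self.r_ptr

    @property
    def full(self):
        # Same address but different MSB: the writer is one lap ahead.
        w_msb, w_addr = self._split(self.w_ptr)
        r_msb, r_addr = self._split(self.r_ptr)
        return w_addr == r_addr and w_msb != r_msb

    def clock(self, wr, rd):
        """One clock edge; status is evaluated before the pointers move."""
        mod = 1 << (self.n + 1)
        do_wr = bool(wr) and not self.full
        do_rd = bool(rd) and not self.empty
        if do_wr:
            self.w_ptr = (self.w_ptr + 1) % mod
        if do_rd:
            self.r_ptr = (self.r_ptr + 1) % mod
```

After four writes the pointers are 100 and 000: the address bits match while the MSBs differ, so full is asserted; after four reads both pointers are 100 and empty is asserted.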

• 2 extra status FFs
– full_reg/empty_reg memorize the current status
– Initialized to 0 and 1, respectively
– Modified according to the wr and rd signals (listed as wr, rd):
• 00: no change
• 11: advance read pointer and write pointer; full/empty do not
change
• 10: advance write pointer; de-assert empty; assert full if
needed (when the advanced write pointer = read pointer)
• 01: advance read pointer; de-assert full; assert empty
if needed (when the advanced read pointer = write pointer)
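
The four cases above translate into a compact update rule. The Python model below is a sketch assuming a 4-word FIFO and that a lone write when full or a lone read when empty is ignored; the class name FifoCtrlFlags is hypothetical. As in the slide's rule, the 11 case advances both pointers unconditionally.

```python
class FifoCtrlFlags:
    """FIFO controller with two status flip-flops, full and empty."""

    def __init__(self, addr_bits=2):
        self.size = 1 << addr_bits
        self.w_ptr = 0
        self.r_ptr = 0
        self.full = False   # full_reg, initialized to 0
        self.empty = True   # empty_reg, initialized to 1

    def clock(self, wr, rd):
        """One clock edge, following the 00/11/10/01 cases."""
        w_succ = (self.w_ptr + 1) % self.size
        r_succ = (self.r_ptr + 1) % self.size
        if wr and rd:
            # 11: advance both pointers; full and empty are unchanged.
            self.w_ptr, self.r_ptr = w_succ, r_succ
        elif wr and not self.full:
            # 10: advance the write pointer; full if it catches the read pointer.
            self.w_ptr = w_succ
            self.empty = False
            self.full = w_succ == self.r_ptr
        elif rd and not self.empty:
            # 01: advance the read pointer; empty if it catches the write pointer.
            self.r_ptr = r_succ
            self.full = False
            self.empty = r_succ == self.w_ptr
        # 00 (or an ignored request): no change.
```

Because the flags carry the full/empty information, the pointers need no extra bit, which is what allows other counter types to be substituted for the binary counters.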

• Non-binary counter for the pointers
– The exact location does not matter as long as the
write pointer and read pointer follow the same
pattern
– Other counters can be used with the second
scheme
– E.g., use an LFSR

4. Pipelined circuit
• Two performance criteria:
– Delay: time required to complete one task
– Throughput: number of tasks completed per unit time
• E.g., ATM machine
– Original: 3 minutes to process a transaction
delay: 3 min; throughput: 20 transactions per hour
– Option 1: a faster machine, 1.5 min per transaction
delay: 1.5 min; throughput: 40 transactions per hour
– Option 2: two machines
delay: 3 min; throughput: 40 transactions per hour
• Pipelined circuit: increases throughput

• Pipeline: overlap certain operations
• E.g., pipelined laundry:

• Non-pipelined:
– Delay: 60 min
– Throughput: 1/60 load per min
• Pipelined:
– Delay: 60 min
– Throughput: k/(40 + 20k) loads per min for k loads,
about 1/20 when k is large
– Throughput is 3 times better than non-pipelined
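
The throughput figure can be derived in a few lines. The derivation assumes three 20-minute stages, which is consistent with the 60-minute delay and the 40 + 20k term but is not stated explicitly in the surviving text: the first load finishes after 60 minutes and each later load finishes 20 minutes after the previous one.

```latex
\begin{align*}
T(k) &= 60 + (k - 1)\cdot 20 = 40 + 20k \quad \text{(minutes for $k$ loads)} \\
\text{throughput}(k) &= \frac{k}{T(k)} = \frac{k}{40 + 20k} \quad \text{(loads per minute)} \\
\lim_{k \to \infty} \frac{k}{40 + 20k} &= \frac{1}{20},
\qquad \frac{1/20}{1/60} = 3
\end{align*}
```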

Pipelined combinational circuit

Adding pipeline to a comb circuit
• Candidate circuit for pipeline:
– enough input data to feed the pipelined circuit
– throughput is a main performance criterion
– comb circuit can be divided into stages with
similar propagation delays
– propagation delay of a stage is much larger
than the setup time and the clock-to-q delay
of the register.
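
The last two criteria can be made concrete with a rough timing estimate. The Python sketch below assumes the usual model in which the clock period equals the slowest stage delay plus the register's setup time and clock-to-q delay; pipeline_metrics and its parameter names are hypothetical, and the numbers in the usage note are made up for illustration.

```python
def pipeline_metrics(stage_delays, t_setup, t_cq):
    """Return (clock_period, delay, throughput) for a pipelined chain.

    stage_delays: propagation delay of each combinational stage
    t_setup, t_cq: setup time and clock-to-q delay of the stage registers
    """
    # The slowest stage plus the register overhead sets the clock period.
    t_clk = max(stage_delays) + t_setup + t_cq
    # One item needs one clock period per stage to pass through.
    delay = len(stage_delays) * t_clk
    # Once the pipeline is full, one result completes per clock period.
    throughput = 1.0 / t_clk
    return t_clk, delay, throughput
```

For a 40 ns circuit split into four balanced 10 ns stages with 1 ns setup and 1 ns clock-to-q, the period is 12 ns and the delay grows to 48 ns, but the throughput rises from 1/40 to 1/12 results per ns. With an unbalanced split such as [25, 5, 5, 5] the period becomes 27 ns, which is why stages of similar delay matter.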

• Procedure
– Derive the block diagram of the original
combinational circuit and arrange the circuit as a
cascading chain
– Identify the major components and estimate the
relative propagation delays of these components
– Divide the chain into stages of similar
propagation delays
– Identify the signals that cross the stage boundaries
of the chain
– Insert registers for these signals at the boundaries

Pipelined comb multiplier
