12 Vlsicad Timing


VLSI CAD:

Logic to Layout
Rob A. Rutenbar
University of Illinois
Lecture 12.1
ASIC Timing:
Basics
2013, R.A. Rutenbar
Timing Issues: Role of CAD Tools
Deep interactions between logic synthesis and layout






Important facts
Logic-side tools must estimate delays through unplaced/unrouted logic
Layout tools must estimate delays through placed/routed logic
[Figure: design flow. High-level description + timing specifications -> Logic Synthesis -> connected cells with delay constraints on signal paths -> Physical Design -> placed cells with real locations, real connecting wires]
Slide 2
Our Topics for ASIC Timing
Logic-side
Static Timing Analysis
How do we estimate the worst-case
timing through a logic network?
Layout-side
Interconnect Delay Analysis
We place the gates, route the wires:
how do we estimate wire delays?
Slide 3
ASIC Timing: Logic Side
Logic-side
Static Timing Analysis
How do we estimate the worst-case
timing through a logic network?
Slide 4
All problems look like longest
(or shortest) paths through a
graph that properly models
the gates, and (maybe) the wires

Surprisingly the maze routing
idea reappears, in a very nice way
Our Topics for ASIC Timing
Layout-side
Interconnect Delay Analysis
We place the gates, route the wires:
how do we estimate wire delays?
Slide 5
The problem starts as an
electrical circuit model
(This is unavoidable)
However, we skip circuit details,
and just show key results

Surprisingly it all turns into
another computational walk
on another special tree!
Lecture 12.2
ASIC Timing:
Logic-Level Timing:
Basic Assumptions & Models
Timing Analysis at the Logic Level
Goal: Verify timing behavior of our logic design
I give you a gate-level netlist
I give you some timing models of the gates and (after place/route) the wires too
You tell me:
When signals arrive at various points in the network
Longest delays through gate network
Does the netlist satisfy the timing requirement? If not, where are the key problems?
This is surprisingly complicated in the real world...
Slide 7
Acknowledgements
Early versions of this lecture used material from:
Karem Sakallah (U Michigan) and Tom Szymanski (AT&T Bell Labs)
This version of lecture has benefited extensively by inputs from
David Hathaway (IBM Essex Junction, VT)
Aside: Hathaway is the principal designer of Einstimer, IBM's static timing tool
Current version also benefited from versions of my VLSI CAD lectures taught jointly by
John Cohn (IBM) and Dave Hathaway (IBM) at University of Vermont Dept of EE.
Many thanks to Karem, Tom, John, and especially Dave for all
the inputs on this material
Slide 8
Analyzing Design Performance
Assume design is synchronous
All storage is in explicit sequential elements, eg, flip-flops
Consequence: for us, we can just focus on delays through combinational gates
[Figure: combinational logic (no feedback loops) sitting between two banks of flip-flops, all driven by a common clock]
Slide 9
Question: Can't We Just Simulate Logic?
What logic simulation does
Determines how a system will behave, simulates the logical function
Gives the most accurate answer (with good simulation models)
...but it is (practically) impossible to give a complete answer, especially for timing
Requires examination of an exponential number of cases
All possible input vectors...
With all possible relative timings...
Under all possible manufacturing variations...
We need a different, faster solution...
Slide 10
Timing Analysis: Basic Model
Assume we know clock cycle: e.g., 1GHz clock, cycle = 1ns
Slide 11

[Figure: logic block between two banks of flip-flops; CLOCK period is 1 ns, so the longest delay must be < 1 ns]
For this logic to work
successfully, longest delay
through network must be
shorter than 1ns
(For simplicity, ignore some flip flop timing issues)
Timing Analysis: Gate Delay Models
First: we need a model of delay through each logic gate
Slide 12
Network delay = ?
Gate delay Δ = ?
In Real World: This is Amazingly Complex
Slide 13
Gate type affects delay
Gate loading affects delay
Waveform shape affects delay
Transition direction affects delay
In Real World: This is Amazingly Complex
Slide 14
Gate input pin affects delay
Why? Different transistor-level circuit paths from input to output
Simple example: NAND
[Figure: 2-input NAND gate with inputs A and B; 1V = logic 1]
At nanoscale, delays are really statistical
[Figure: normal distribution, from http://upload.wikimedia.org/wikipedia/commons/8/8c/Standard_deviation_diagram.svg]
Our Model: Pin-to-Pin Delay
This lecture, keep it simple: Fixed, pin-to-pin delay model
No slopes, electricity, distributions. Loading effects pushed back into gate delay itself
Per-pin delays are essential, but we'll use just 1 value per gate, for simplicity
Turns out this is enough to see all the interesting algorithm ideas



Slide 15
[Figure: example gates labeled with fixed delays Δ=3 and Δ=4.1]
Next: Do We Consider Logical Function?
Does this matter? Try an example, where we erase gates
In this example: PI = Primary Input, PO = Primary Output
Slide 16
[Figure: network with the gates erased, shown only as delay blocks (Δ = 8, 1, 2, 8, 1, 2, 1), with three PIs and one PO]
Longest delay is
8+2+8+2=20
Now, Suppose We Know Logic Gates
Slide 17
You cannot sensitize this path: cannot make a logic change at
this input propagate down this path to change this output
[Figure: the same network with its gates shown, including two 2:1 muxes; sensitizing the longest path needs conflicting values (0 and 1) on the same signal. Conflict!]
Topological vs Logical Timing Analysis
When we ignore logic, this is called Topological Analysis
We only work with the graph and the delays; we don't consider the logic
We can get wrong answers: what we found was called a False Path
Going forward: we ignore the logic (Too tough to deal with)
Assume that all paths are statically sensitizable
Means: Can find a constant pattern of inputs to other PIs that makes some
output sensitive to some input
Reminder: this is exactly the Boolean Difference concept of sensitivity
This timing analysis has a name: Static Timing Analysis (STA)
Slide 18
Lecture 12.3
ASIC Timing:
Logic-Level Timing:
STA Delay Graph,
ATs, RATs, and Slacks
STA Representation: Delay Graph
From gate-level network, we build this delay graph
Vertices: Wires in gate network, 1 per gate output; also 1 for each PI and PO
Edges: Gates, input pin to output pin (1 edge per input). Put gate delays on edges
Slide 20
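[Figure: gate network with PIs a, b, d, internal wire c, and PO e, gate delays Δ=2 and Δ=3; next to it, the delay graph on nodes a, b, c, d, e with edge weights 2, 2, 3, 3]
The edges are called cell arcs, because they are the edges that explain timing for each cell in the tech library.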
Delay Graph
Common convention: Add Source / Sink nodes
Add 1 source (SRC) node that has a 0-weight edge to each PI
Add 1 sink (SNK) node with a 0-weight edge from each PO
Why do this?
Now, the network has exactly 1 entry node, and 1 exit node
All the longest (or shortest) path questions have the same start / end nodes
Slide 21
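[Figure: the delay graph on nodes a, b, c, d, e (edge weights 2, 2, 3, 3) with SRC connected to each PI and each PO connected to SNK by 0-weight edges]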
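The construction above (one vertex per wire, one edge per gate input pin, plus SRC and SNK) can be sketched in a few lines. This is only an illustration: the function name `build_delay_graph`, the netlist format, and the assumption that the Δ=2 gate drives c from a and b while the Δ=3 gate drives e from c and d are guesses made for this sketch, not something the slides state explicitly.

```python
from collections import defaultdict

def build_delay_graph(gates, primary_inputs, primary_outputs):
    """Build a delay graph as {wire: [(successor wire, delay), ...]}.

    gates is a list of (output_wire, [input_wires], delay), using one
    fixed delay per gate as in the simple pin-to-pin model.
    """
    edges = defaultdict(list)
    for out_wire, in_wires, delay in gates:
        for in_wire in in_wires:            # one cell arc per input pin
            edges[in_wire].append((out_wire, delay))
    for pi in primary_inputs:               # 0-weight edge SRC -> each PI
        edges["SRC"].append((pi, 0))
    for po in primary_outputs:              # 0-weight edge each PO -> SNK
        edges[po].append(("SNK", 0))
    return dict(edges)

# Assumed reading of the small example network (hypothetical wiring).
gates = [("c", ["a", "b"], 2), ("e", ["c", "d"], 3)]
graph = build_delay_graph(gates, ["a", "b", "d"], ["e"])
```

A wire can be handled the same way later on, by listing it as one more "gate" with a single input and its own delay.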
Representation: Delay Graph
What about interconnect delay?
Can still use delay graph: model each wire as a special gate that just has a delay
Slide 22
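[Figure: the same network with each wire modeled as a special gate that just has a delay (wire delays Δ = 1.2, 1.6, 1.5, 1.0, 1.8; gate delays Δ = 2 and 3); the delay graph gains nodes x, y, w, z, and q = PO, with SRC and SNK attached by 0-weight edges]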
So how do we use this graph to do timing analysis?
What we do not do: Try to enumerate all the source-to-sink paths
Why not? Exponential explosion in number of paths, even for small graph
There's a smarter answer: Node-oriented timing analysis
Find, for each node in delay graph, worst delay to the node along any path
Operations on Delay Graph
Slide 23
[Figure: chain of nodes 0, 1, 2, 3, ..., n]
How many paths from 0 to n? 2^n!
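To see why enumeration is hopeless, here is a tiny sketch that counts paths without listing them. It assumes the chain in the figure offers two alternative routes between each pair of consecutive nodes (that reading of the lost figure is an assumption); the function name `count_paths` is made up for this example.

```python
def count_paths(n, routes_per_stage=2):
    """Count paths from node 0 to node n in a chain where every pair of
    consecutive nodes is joined by routes_per_stage alternative routes."""
    paths_to = [0] * (n + 1)
    paths_to[0] = 1                       # one way to be at the start
    for i in range(1, n + 1):
        # every path reaching node i-1 can continue along any route
        paths_to[i] = routes_per_stage * paths_to[i - 1]
    return paths_to[n]

small = count_paths(10)
large = count_paths(64)
```

Note that the count itself is computed with one visit per node, which is the same node-oriented trick the timing analysis uses.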
Define Values on Nodes in Delay Graph
Arrival Time at a node (AT)
AT(n) = Latest time the signal can
become stable at node n
Think: Longest path from source
Called: Delays TO node
Required Arrival Time at node (RAT)
RAT(n) = Latest time the signal is allowed
to become stable at node n
Think: Longest path to sink (sort of...)
Called: Delays FROM node
Slide 24
[Figure: node n between SRC and SNK, among other paths; ATs are measured from SRC up to n, RATs are measured from n on to SNK]
Define Values on Nodes in Delay Graph
Slack at node n: Slack(n) = RAT(n) - AT(n)
Amount of timing margin for the signal: positive is good, negative is bad
Determined by longest path through node
Amount by which a signal can be delayed at node and not increase the longest
path through the network
Can increase delay at node (to minimize power, circuit area) with positive slack and
not degrade overall performance
Slide 25
Slack is Hugely Important in Timing Analysis
About slacks
Defined so negative slack is always bad: it indicates a timing problem
Measures sensitivity of network to this node's delay
Positive slack
Good: I can change something at this node, and not hurt the network's overall timing
Example: I can make this node slower, maybe save some power, not hurt timing
Negative slack
Bad: I have problem at this node; more negative the slack, bigger the problem
Looking for a node to fix to help timing? These nodes are where to look first.
These affect my critical paths the most

Slide 26
How To Compute ATs? Recursively
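Slide 27
[Figure: node n between SRC and SNK; predecessor paths arrive at n through each p in pred(n), along an edge of delay Δ(p,n); successor paths leave n through succ(n)]
pred(n) = predecessors of n
AT(n) = maximum delay to n:
$$\mathrm{AT}(n) = \begin{cases} 0 & \text{if } n = \mathrm{SRC} \\ \max\limits_{p \in \mathrm{pred}(n)} \big[\mathrm{AT}(p) + \Delta(p,n)\big] & \text{otherwise} \end{cases}$$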
A Quick Concrete AT Example
Big idea
If we know the longest path to each predecessor of n, it's a simple Maximum
operation to compute the longest path to n itself. (Yes, it's just Dijkstra again!)
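Slide 28
[Figure: node n with predecessors p, q, r, all reached from SRC: AT(p) = 5 with Δ(p,n) = 7, AT(q) = 10 with Δ(q,n) = 1, AT(r) = 5 with Δ(r,n) = 5]
$$\mathrm{AT}(n) = \max_{x \in \{p,q,r\}} \big[\mathrm{AT}(x) + \Delta(x,n)\big] = \max\{5+7,\ 10+1,\ 5+5\} = 12$$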
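The recursive AT definition translates almost word for word into a memoized function. The sketch below reuses the numbers from the quick example; the SRC edges with weights 5, 10, 5 are an assumption added here only so that AT(p), AT(q), AT(r) come out as on the slide, and the names `pred` and `arrival_time` are illustrative.

```python
from functools import lru_cache

# pred[n] = list of (predecessor, delay on the edge predecessor -> n).
# The SRC -> p/q/r weights are assumed, chosen to give AT = 5, 10, 5.
pred = {
    "SRC": [],
    "p": [("SRC", 5)],
    "q": [("SRC", 10)],
    "r": [("SRC", 5)],
    "n": [("p", 7), ("q", 1), ("r", 5)],
}

@lru_cache(maxsize=None)          # each node's AT is computed only once
def arrival_time(node):
    """AT(node): longest delay from SRC to node."""
    if node == "SRC":
        return 0
    return max(arrival_time(p) + delay for p, delay in pred[node])

at_n = arrival_time("n")          # max(5+7, 10+1, 5+5)
```

Memoization is what keeps this node-oriented: without it, the recursion would re-walk every path, which is exactly the exponential enumeration we are trying to avoid.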
How To Compute RATs? Recursively
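Slide 29
[Figure: node n between SRC and SNK; predecessor paths arrive through pred(n); successor paths leave n through each s in succ(n), along an edge of delay Δ(n,s); a CLOCK waveform marks the Cycle Time]
succ(n) = successors of n
RAT(n) = latest time in cycle where n could change and signal would still propagate to sink before end of cycle:
$$\mathrm{RAT}(n) = \begin{cases} \text{Cycle Time} & \text{if } n = \mathrm{SNK} \\ \min\limits_{s \in \mathrm{succ}(n)} \big[\mathrm{RAT}(s) - \Delta(n,s)\big] & \text{otherwise} \end{cases}$$
RATs are defined relative to clock cycle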
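The RAT recursion mirrors the AT one, walking toward the sink instead of the source. As a sketch, the code below borrows the small B/C/D/E graph (cycle time 29) that this lecture uses later for path reporting; the dictionary layout and the name `required_arrival_time` are assumptions of this example.

```python
from functools import lru_cache

CYCLE_TIME = 29

# succ[n] = list of (successor, delay on the edge n -> successor).
succ = {
    "SRC": [("B", 3), ("C", 4)],
    "B": [("D", 5), ("E", 11)],
    "C": [("E", 9)],
    "D": [("SNK", 6)],
    "E": [("SNK", 15)],
    "SNK": [],
}

@lru_cache(maxsize=None)          # each node's RAT is computed only once
def required_arrival_time(node):
    """RAT(node): latest time node may settle and still meet the cycle."""
    if node == "SNK":
        return CYCLE_TIME
    return min(required_arrival_time(s) - delay for s, delay in succ[node])

rats = {n: required_arrival_time(n) for n in succ}
```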
ATs versus RATs: Look at Clock Cycle
Why the differences between AT and RAT definitions?
Slide 30
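$$\mathrm{AT}(n) = \begin{cases} 0 & \text{if } n = \mathrm{SRC} \\ \max\limits_{p \in \mathrm{pred}(n)} \big[\mathrm{AT}(p) + \Delta(p,n)\big] & \text{otherwise} \end{cases}$$
$$\mathrm{RAT}(n) = \begin{cases} \text{Cycle Time} & \text{if } n = \mathrm{SNK} \\ \min\limits_{s \in \mathrm{succ}(n)} \big[\mathrm{RAT}(s) - \Delta(n,s)\big] & \text{otherwise} \end{cases}$$
[Figure: one clock cycle (Cycle Time) from launch edge to capture edge, with AT(n) and RAT(n) marked]
AT: longest logic delay after launch edge of clock
RAT: longest logic delay to the capture edge of clock, but it is expressed relative to the Cycle Time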
Bad Things Happen When We See THIS
SLACK = RAT - AT = Negative
Signal arrives too late, and there is too much delay from node to output
Signal does not arrive at flip-flop input before the capture edge of clock
Slide 31
[Figure: one clock cycle from launch edge to capture edge, with AT(n) falling later than RAT(n)]
Lecture 12.4
ASIC Timing:
Logic-Level Timing:
A Detailed Example,
and the Role of Slack
Let's Do a Bigger Example
Delays are on edges; let clock cycle be 12
Compute the min/max delays by eye for now
AT = longest path from SRC TO node;
RAT = (cycle time 12) - (longest path FROM node to SNK)
Slack = RAT - AT
Slide 33
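[Figure: example delay graph from SRC through the PIs, internal nodes, and POs to SNK (nodes a, b, c, d, e, f, g, h, j, k, n), with delays on the edges, 0-weight edges at SRC and SNK, and Cycle = 12; the edge layout is not recoverable from this text]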
Let's Do a Bigger Example: ATs
Slide 34
Compute ATs from SRC to SNK
[Figure: the same graph (Cycle = 12), each node annotated with Δ, AT, RAT, slack = (RAT - AT); AT values shown: 0, 0, 0, 0, 6, 1, 2, 4, 10, 12, 7, 15, 15]
Let's Do a Bigger Example: RATs
Slide 35
Compute RATs from SNK to SRC
[Figure: the same graph with ATs and RATs; RAT values as shown: 12, 12, 12, 12, 10, 7, 3, -2, 4, -3, -1, 2, -3]
Bug fix: RAT=+2, not -2
Let's Do a Bigger Example: Slacks
Slide 36
[Figure: the same graph with ATs, RATs, and slacks; slack values as shown: -3, -3, -1, 2, -3, 2, -3, 6, -3, 5, 0, -3, -3]
Worst (most negative) slack is -3. Trace worst path, SRC→SNK
Bug fix: RAT=+2, and Slack=+2 also
Analyzing this Example
Look at those slacks
A negative slack at an output (PO) means a missed requirement
A negative slack on internal node n means it feeds a problem PO
So, there is a path from n to some problem PO
Big result: the negative slack appears along this entire worst path
Your worst timing violation at an output (PO) = the most negative slack value
You can always trace a path with this slack value back to a PI
So, slacks are hugely useful
Beyond just knowing what is the worst path; slacks tell us problem gates on this path
Slide 37
Lecture 12.5
ASIC Timing:
Logic-Level Timing:
Computing ATs, RATs,
Slacks, and Worst Paths
The Most Typical STA Problem
Answer this: What are all the too-slow paths that violate timing?
Most useful answer: Report paths in order, from slowest to fastest
In other words: Enumerate these paths, in delay order
Slide 39
[Figure: logic block between two banks of flip-flops, 1 ns CLOCK]
What Do We Need?
Calculate all the ATs
Calculate all the RATs
Calculate all the Slacks
...do all of this very efficiently: Delay graphs are huge!
...then enumerate the violating paths, in worst delay order
Slide 40
Computational Strategy
One approach: Topologically sort the delay graph
Sort the vertices in the delay graph into one single ordered list
Essential property: if there is an edge p→s, p appears before s in sorted order
Compute ATs by going forward through the sorted list
Compute RATs by going backward through the sorted list
Legal Topological Sort Orders
SRC, B, D, C, E, SNK
SRC, B, C, D, E, SNK
SRC, B, C, E, D, SNK
SRC, C, B, D, E, SNK
SRC, C, B, E, D, SNK
Slide 41
[Figure: example delay graph with edges SRC→B (3), SRC→C (4), B→D (5), B→E (11), C→E (9), D→SNK (6), E→SNK (15)]
Topological Sorting (Topsort)
Pretty easy application of depth-first-search (DFS)
Slide 42
From: Wikipedia: Topological Sorting
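As a sketch of the DFS idea: visit each node, finish all of its successors first, then record the node; reversing the finish order gives a topological order. The code assumes the graph is a DAG (no cycle detection) and uses recursion, so a production tool on a huge delay graph would want an iterative version; the name `topsort` and the adjacency-dict format are assumptions of this example.

```python
def topsort(succ):
    """Return the nodes of a DAG so that every edge p -> s has p before s.

    succ maps each node to a list of its successor nodes.
    """
    finished = []          # nodes in order of DFS completion
    visited = set()

    def visit(node):
        if node in visited:
            return
        visited.add(node)
        for nxt in succ.get(node, []):
            visit(nxt)
        finished.append(node)   # all successors are already finished

    for node in succ:
        visit(node)
    finished.reverse()
    return finished

# The small example graph from the previous slide (edge weights omitted).
succ = {"SRC": ["B", "C"], "B": ["D", "E"], "C": ["E"],
        "D": ["SNK"], "E": ["SNK"]}
order = topsort(succ)
```

Any of the legal orders is equally good for timing analysis; which one comes out depends only on the order in which successors are visited.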
Assume We Have a Topsort: Compute ATs
computeATs() {
    AT(SRC) = 0;
    foreach ( node n in topsort order, skipping SRC ) {
        AT(n) = -∞;
        foreach ( node p in pred(n) ) {
            AT(n) = max( AT(n), AT(p) + Δ(p,n) );
        }
    }
}
Slide 43
Computing RATs
computeRATs() {
    RAT(SNK) = CycleTime;
    foreach ( node n in reverse topsort order, skipping SNK ) {
        RAT(n) = +∞;
        foreach ( node s in succ(n) ) {
            RAT(n) = min( RAT(n), RAT(s) - Δ(n,s) );
        }
    }
}
Trick:
Pretend all edges are reversed,
they point from SNK to SRC,
and walk graph backwards
Slide 44
Using Slack For Path Reporting
Useful slack property: all nodes on the longest path have the same slack
Surprising result: Lets us find N worst paths, even though we did not trace them all
Slide 45
[Figure: example delay graph, Cycle = 29, with edges SRC→B (3), SRC→C (4), B→D (5), B→E (11), C→E (9), D→SNK (6), E→SNK (15)]
Node   AT   RAT   Slack
SRC     0     0     0
B       3     3     0
C       4     5     1   (= 5-4)
D       8    23    15   (= 23-8)
E      14    14     0
SNK    29    29     0
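The AT, RAT, and slack values for this graph can be reproduced with one forward and one backward sweep over a topological order. This is a minimal sketch, assuming a single SRC and a single SNK and a topological order supplied by hand; the function name `compute_slacks` and the edge-list format are choices made for this example.

```python
def compute_slacks(edges, order, cycle_time):
    """Return (AT, RAT, slack) dicts for a DAG given in topological order.

    edges is a list of (from_node, to_node, delay).
    """
    pred = {n: [] for n in order}
    succ = {n: [] for n in order}
    for p, s, delay in edges:
        pred[s].append((p, delay))
        succ[p].append((s, delay))

    at = {}
    for n in order:                       # forward sweep: SRC has no preds
        at[n] = max((at[p] + d for p, d in pred[n]), default=0)

    rat = {}
    for n in reversed(order):             # backward sweep: SNK has no succs
        rat[n] = min((rat[s] - d for s, d in succ[n]), default=cycle_time)

    slack = {n: rat[n] - at[n] for n in order}
    return at, rat, slack

edges = [("SRC", "B", 3), ("SRC", "C", 4), ("B", "D", 5), ("B", "E", 11),
         ("C", "E", 9), ("D", "SNK", 6), ("E", "SNK", 15)]
order = ["SRC", "B", "C", "D", "E", "SNK"]
at, rat, slack = compute_slacks(edges, order, cycle_time=29)
```

Each edge is touched once per sweep, so the whole computation is linear in the size of the delay graph.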
N-Worst Path Reporting (Like Maze Routing!)
Find N worst paths
We evolve partial paths; each partial path stores 3 things:
< Path itself, Delay of this path, Slack of the final node on path >
We store the partial paths in a heap, which is indexed on this Slack value
Sort so path with worst slack endpoint is always on top
Initially this heap contains only the source node
Algorithm is quite simple (and familiar)
Expand: Pop a partial path off the heap; it has the most negative (smallest) slack
Reach target? If its end node is the sink: Print out the path
Reach: Else add each successor node to make new partial paths, push them back onto the
heap, each with <path, delay, slack> labeled
Repeat until N paths have been reported: go pop the next partial path
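A compact sketch of this heap-driven search follows. One deliberate variation, flagged as an assumption: the heap key here is the path slack RAT(end node) minus the delay of the partial path, rather than the end node's own slack, so that two partial paths ending at the same node are still ordered correctly. The RAT values are taken from the example graph (Cycle = 29); the function name `n_worst_paths` is made up for illustration.

```python
import heapq

def n_worst_paths(succ, rat, n_paths, source="SRC", sink="SNK"):
    """Report up to n_paths source-to-sink paths, longest delay first.

    succ maps node -> [(successor, delay)]; rat maps node -> RAT value.
    """
    # Heap entries: (path slack, path delay, path); smallest slack on top.
    heap = [(rat[source], 0, [source])]
    reported = []
    while heap and len(reported) < n_paths:
        _, delay, path = heapq.heappop(heap)      # Expand
        end = path[-1]
        if end == sink:                           # Reach target?
            reported.append((delay, path))
            continue
        for nxt, d in succ[end]:                  # Reach
            new_delay = delay + d
            heapq.heappush(heap, (rat[nxt] - new_delay, new_delay, path + [nxt]))
    return reported

succ = {"SRC": [("B", 3), ("C", 4)], "B": [("D", 5), ("E", 11)],
        "C": [("E", 9)], "D": [("SNK", 6)], "E": [("SNK", 15)]}
rat = {"SRC": 0, "B": 3, "C": 5, "D": 23, "E": 14, "SNK": 29}
worst = n_worst_paths(succ, rat, 3)
```

On this graph the three reported delays come out as 29, 28, and 14, matching the paths SRC-B-E-SNK, SRC-C-E-SNK, and SRC-B-D-SNK.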
Slide 46
Worst Case Path Reporting: Example




Heap starts as
<Path, Delay, Slack> = <SRC, 0, 0>
Slide 47
[Figure: example graph with node slacks: D = 15, C = 1, all other nodes 0]
Heap (minimum on top):
<SRC, 0, 0>
Worst Case Path Reporting: Example
Reach B, reach C
Slide 48
Heap (minimum on top):
<SRC B, 3, 0>
<SRC C, 4, 1>
Worst Case Path Reporting: Example
Expand SRC-B, reach D, reach E
Slide 49
Heap contents:
<SRC BD, 8, 15>
<SRC C, 4, 1>
<SRC BE, 14, 0>
Worst Case Path Reporting: Example
Expand SRC-BE, reach SNK
with <SRC BE SNK, 29, 0>
so 1st worst path delay = 29
Slide 50
Heap contents:
<SRC BD, 8, 15>
<SRC C, 4, 1>
Worst Case Path Reporting: Example
Expand SRC-C, reach E
Slide 51
Heap contents:
<SRC BD, 8, 15>
<SRC CE, 13, 0>
Worst Case Path Reporting: Example
Expand SRC-CE, reach SNK
with <SRC CE SNK, 28, 0>
so 2nd worst delay is 28
Slide 52
Heap contents:
<SRC BD, 8, 15>
Worst Case Path Reporting: Example
Expand SRC-BD, reach SNK
with <SRC BD SNK, 14, 0>
so 3rd worst delay is 14
Slide 53
Heap contents: (empty)
Note: only 3 possible paths
from source to sink in graph,
and we found them correctly
in delay order!
Static Timing Analysis: Summary
STA is a very important step in design of complex ASICs
It's a critical sign-off step, which means: you don't get to fabricate unless you pass
Several big ideas
Gate level delay models matter, and can be pretty complex in real world
Logical ≠ Topological path analysis (the latter == STA)
Build delay graph, calculate ATs, RATs, slacks recursively
Concept of slack is big: lets us locate worst paths, and problem gates on path
An idea very like maze routing lets us find worst paths in delay order
Slide 54
Static Timing Summary: Aside
STA is a huge topic: several things we did not cover
STA for sequential elements
How do we model flip flops and latches, so we can verify, eg, that setup and hold
times are met? More tricks with the delay graph
Early mode versus late mode timing
Our development was only so-called late mode timing, where we care about
longest path. Early mode focuses on shortest paths, and is critical for more
advanced timing optimizations, eg, with transparent latches
Incremental STA
In practice, you change 10,000 gates out of 1,000,000 gates, you don't want to
redo the whole STA analysis. Advanced methods can update incrementally
Slide 55
Lecture 12.6
ASIC Timing:
Interconnect Timing:
Electrical Models of
Wire Delay
Interconnect (Wire) Delay Modeling
The problem
You place the logic; it puts the pins at a certain distance apart
You route the wires, each wire has an input-to-output delay
Where does the delay come from? How accurately can we predict this delay?
How efficiently can we model this delay for use in layout or synthesis or STA?
Slide 57
Sources of Delay: Model 1
Delay = Finite speed signal propagation through physical wires
Model = Length
Delay proportional to length; shorter = better
Analysis
Good: This is really easy, qualitatively OK
Bad: Not quantitatively accurate, extremely crude
Slide 58
Delay ∝ bounding box ΔX + ΔY
Sources of Delay: Model 2
Add: Delay affected by electrical circuit drive limitations
Model = Wire load
Delay proportional to length, fanout, capacitance of the driven pins
Analysis
Good: Qualitatively better, not too hard to curve fit models from data
Bad: Still focuses mostly on the pins, not on the wire; can be off by lots
Slide 59
Delay = F ( bounding box ΔX + ΔY,
size of driver gate, fanout,
capacitance of pins on driven gates, ...)
Fanout is 2, account for
loading due to 2 pins
Sources of Delay: Model 3
Add: Delay comes from electrical loading of the interconnect
Depends critically on exact geometry of the wired net
Model = Electrical Circuit
Interconnect must be modeled as a circuit, analyzed as a circuit
Slide 60
[Figure: cross-section showing silicon, insulator, and a first-level metal wire]
At nanoscale, the interconnect geometry is large relative to the devices themselves
Interconnect Model: RC Trees
Most popular interconnect model used in layout applications
First: Interconnect → Circuit
Slide 61
[Figure: metal wire of length L, width W, thickness H, at height d above the silicon, with current flowing down its length]
Metal wire has resistance = R to current flowing down its length
Physics: R = ρ L / (W H)
ASIC: R = r L / W
We control L, W of a wire
Interconnect Model: RC Trees
Slide 62
[Figure: the same wire (L, W, H) as metal over an insulator of height d over silicon]
Metal wire has capacitance to silicon substrate, with insulator between
Physics: C = ε W L / d
ASIC: C = c W L
We can control W, L of wire
Aside: About Real Capacitance (Cap)!
Note: this model is very simplistic
You really get capacitance between any pair of conducting surfaces
So, in a multi-layer metal process you get Caps between all the layers
Vertically adjacent conductors create Overlap Cap
Laterally adjacent conductors (next to you or below you) create Fringe Cap
Slide 63
[Figure: cross-section view of metal layers M3, M4, M5]
Fringe cap between 2 adjacent wires on the same layer
Overlap cap between 2 overlapping wires on vertically adjacent layers
Sidewall fringe cap from side of one layer to the conductors below it
Interconnect Models: RC Trees
Typical circuit model: π model (pi model)
Accounts for the resistance R and the capacitance C of wire segment
Symmetric (note: split capacitance in two halves); small model, only need 2 numbers
Slide 64
[Figure: wire segment (L, W, H, height d) and its π model]
R = r L / W, with C = (1/2) c W L at each end of the resistor
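A segment's π model is just two numbers computed from its geometry. The sketch below is an illustration only: the function name `pi_model` is made up, and the default constants r = 1 and c = 2 are borrowed from the Elmore example later in this lecture rather than from any real process.

```python
def pi_model(length, width, r=1.0, c=2.0):
    """Return (R, C_near, C_far) for one straight wire segment.

    R = r * L / W is the series resistance; the total capacitance
    C = c * W * L is split into two equal halves, one at each end.
    """
    resistance = r * length / width
    total_cap = c * width * length
    return resistance, total_cap / 2, total_cap / 2

short_wire = pi_model(length=5, width=1)     # R = 5, each C = 5
wide_wire = pi_model(length=5, width=10)     # R drops, C grows
```

Widening a wire trades resistance for capacitance, which is why the effect on delay has to be computed rather than guessed.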
From Wire Segments to RC Tree
Big idea: Replace every straight wire segment with pi model
Slide 65
Each wire
segment creates
its own RC circuit
From Wire Segments to Final RC Tree
Simplification: Recall a rule from basic circuits (or physics)
Parallel capacitors can be replaced by 1 capacitor with C = Σ Ci
C1 || C2 || C3 = C1 + C2 + C3
Slide 66
RC Tree
Note: each of the Rs,
Cs in this tree are
different numbers.
R, C depend on the
geometry of each wire
segment.
Lecture 12.7
ASIC Timing:
Interconnect Timing:
The Elmore Delay Model
Using RC Trees for Interconnect Analysis
Lots of nice Electrical Engineering detail we could do, now...
...except not everybody in the class has this circuits-oriented background
Useful thing about RC tree model
It starts as a circuit...
...but it turns into a simple tree object
And a special computational walk on tree gives us all the delay information we need!
So, (almost) no circuits for us. Just the computational recipe for how to use tree
Slide 68
This is It: RC Tree
RC Tree general form
A tree of resistors (no loops); capacitors hanging off all intermediate tree nodes
Root of tree is where signal is input; Leaves of tree are the driven outputs
Slide 69
[Figure: the same RC tree on nodes a, b, c, d, e, f drawn twice: once as a circuit (a resistor R on each branch, a capacitor C at each node), once as a graph]
RC Trees: Delay Estimation
Missing circuits detail: need to model driver, driven gates
Voltage source + resistor as input at root (this models driving gate)
Capacitor as load at each leaf (each models a driven gate)
Slide 70
[Figure: RC tree on nodes a-f with a step voltage source (V=1 at t=0) plus resistor as the driving input at the root, and a load capacitor at each leaf, with output voltages V1 and V2 at the driven loads]
Summary: Gates + Wires → RC Tree
Slide 71
[Figure: driver, wires, and driven gates redrawn as an RC tree on nodes a-f, with step input V=1 at t=0 and outputs V1 and V2]
Next: Use tree to compute a delay number for each output of tree
We get a unique delay number for each output
RC Trees: The Elmore Delay
Famous formula:
Elmore delay
Derived in the 1940s for circuits applications
Resurrected in the 1980s by Penfield,
Rubinstein, Horowitz for RC trees
Very simple, very useful
Easy computational recipe
Slide 72
Elmore Delay τ: Tree Walk Computing Recipe
Do this:
Set τ = 0; walk down path of resistors from Root to Leaf where you want delay
At each resistor, do τ = τ + R × (sum of all capacitors downstream)
Downstream capacitor = any C that is reachable in tree below this resistor
Slide 73
[Figure: example RC tree on nodes a-f with resistor and capacitor values; one resistor is labeled Ri = 2]
Example: at the Ri=2 resistor in our RC tree, the term we would add to Elmore delay τ
on a tree walk through Ri = 2×(2+1+1+1+3) = 16
Elmore Delay τ: Tree Walk Computing Recipe
Example:
Set τ = 0; walk down path of resistors from Root to Leaf where you want delay
At each resistor, do τ = τ + R × (sum of all capacitors downstream)
Slide 74
[Figure: the same example RC tree on nodes a-f]
Delay τ
τ = 0
+ 5(1+2+1+1+3+1)
+ 2(2+1+1+3+1)
+ 4(1+3)
+ 1(3)
= 45+16+16+3
= 80
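The tree walk can be sketched directly: find the root-to-leaf path, and at each resistor add R times the total capacitance hanging below it. The tree below is an assumed reconstruction of the example: the on-path values (resistors 5, 2, 4, 1 and the capacitor totals) follow the arithmetic above, while the node names and the layout of the off-path branch (resistors 1 and 3, capacitors 1 and 1) are guesses, and they do not affect the delay to the chosen leaf.

```python
def downstream_cap(node, children, cap):
    """Total capacitance at node and at everything below it in the tree."""
    return cap[node] + sum(downstream_cap(ch, children, cap)
                           for ch, _ in children.get(node, []))

def find_path(node, leaf, children):
    """Return the list of nodes from node down to leaf, or None."""
    if node == leaf:
        return [node]
    for child, _ in children.get(node, []):
        rest = find_path(child, leaf, children)
        if rest:
            return [node] + rest
    return None

def elmore_delay(root, leaf, children, cap):
    """Elmore delay from root to leaf: sum of R * (downstream C) on the path."""
    path = find_path(root, leaf, children)
    tau = 0
    for parent, child in zip(path, path[1:]):
        resistance = next(r for ch, r in children[parent] if ch == child)
        tau += resistance * downstream_cap(child, children, cap)
    return tau

# children[node] = [(child, resistance of the branch to that child)]
children = {"in": [("a", 5)], "a": [("b", 2)], "b": [("c", 1), ("d", 4)],
            "c": [("e", 3)], "d": [("f", 1)]}
cap = {"in": 0, "a": 1, "b": 2, "c": 1, "d": 1, "e": 1, "f": 3}
tau_f = elmore_delay("in", "f", children, cap)   # 45 + 16 + 16 + 3
```

Recomputing `downstream_cap` at every resistor keeps the sketch short; a real tool would compute all downstream sums once in a single bottom-up pass.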
Insight: Stream Analogy
Think of RC tree like a branching stream, current like water
Goal: You are downstream, trying to fill your bucket; how fast can you fill it?
Unfortunately, at every branch point, somebody else has a bucket
The farther you are downstream, the less water you get from upstream.
What matters here?
Width of the upstream branches.
Size of all the other buckets
Slide 75
[Figure: branching stream with water in (fixed, limited supply) at the top and YOU with a bucket far downstream]
Water → Circuits
Slide 76
[Figure: the stream picture (water in, YOU) relabeled with circuit terms: driving gate, wire resistance, wire capacitance, current, driven gate]
Circuits Aside: What Is Elmore Delay?
If you could model the path from input to an output as a simplified
circuit with exactly one R and one C, the best RC value = Elmore τ
Slide 77
[Figure: the example RC tree on nodes a-f, with output voltage V(e) marked, next to a single-R, single-C circuit]
$$V(e) \approx V_{in}\left(1 - e^{-t/\tau}\right)$$
Lecture 12.8
ASIC Timing:
Interconnect Timing:
Elmore Delay Examples
Using the Elmore Delay
The Elmore delay formulas are immensely useful
Simple enough for layout folks to use them in algorithms
Accurate enough that they beat simple length-based schemes
(Unfortunately, not so accurate that you can avoid later verification with what are
called higher order models that incorporate more than one time constant)

Applications
Numerous!
But for us: can take a real routed wire, and build a good delay model for STA
Slide 79
Elmore Example
Simple tree with 4 leaf nodes
Electrical parameters: r = 1 , c = 2
So, for each segment, total R = r L / W, C = c W L
Segment sizes: W=1, L=20; W=1, L=5; W=1, L=2
Slide 80
[Figure: wire tree on nodes a, b, c, d with four leaves e, f, g, h]
Example, segment b-d (W=1, L=5): R = 1(5/1) = 5; C = 2(1*5) = 10, so C/2 = 5 at each end
Remember: Add Cs at each node in the tree! 3 Cs to add at this node!
Elmore Example
RC Tree for the interconnect alone
Again: Remember to add up Cs hanging off each internal node of tree
Slide 81
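[Figure: the wire tree (segments W=1, L=20; W=1, L=5; W=1, L=2) and its RC tree on nodes a-h. Resistors: 20 for the L=20 segment, 5 for each L=5 segment, 2 for each L=2 segment. Node capacitors after adding: 20, 30, 9, 9, and 2 at each of the four leaves]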
Elmore Example
Add driver and driven gates
[Figure: the same RC tree with a driver resistance R0 = 20 added in front of node a and a load Cload = 1 added at each leaf, so each leaf capacitor becomes 2+1 = 3]
Slide 82
Elmore Example: Compute Delay to Each Leaf
Since symmetric, only need to compute 1 path
Remember the recipe:
1. Set τ = 0, walk from root to leaf
2. At each R, τ += R × (sum of all Cs downstream)
Slide 83
[Figure: full RC tree: R0 = 20, then resistors 20, 5, 2 down to a leaf; capacitors 20, 30, 9, 9, and 3 at each leaf]
τ = 0
+ 20(20+30+2*9+4*3)
+ 20(30+2*9+4*3)
+ 5(9+2*3)
+ 2(3)
= 2881
New Elmore Example
What can layout (ie, placement, routing) do to wiring?
Change the length of a wire, or even the width of a wire
Try example: change L on 1 segment
Slide 84
[Figure: the same tree with segment b-d lengthened: W=1, L=20; W=1, L=40 (b-d); W=1, L=2; R0 = 20, Cload = 1]
For segment b-d: R = 40, C/2 = 40 at each end
R and C increase for longer wire
New Elmore Example
OK, now what is delay to each leaf?
Slide 85
[Figure: RC tree for the longer b-d wire: resistors 20 (R0), 20, 5 (left), 40 (right), and 2 to each leaf; capacitors 20, 65, 9 (left), 44 (right), and 3 at each leaf]
Right side: τ = 7606
Left side: τ = 5681
Note: Extra C of longer wire also loads the left side of tree, increasing the delay
New Elmore Example, version 2
How about instead we change W=width on 1 segment?
Slide 86
[Figure: the same tree with one segment widened: W=1, L=20; W=10, L=5 (right side); W=1, L=2; R0 = 20, Cload = 1]
R smaller, C bigger
Right side: τ = 6436
Left side: τ = 6481
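All three examples (base tree, longer b-d wire, wider b-d wire) can be checked with one short script that builds the RC tree from segment geometry and then applies the Elmore recipe. This is a sketch under stated assumptions: r = 1, c = 2, R0 = 20, Cload = 1 come from the slides, while the node naming (trunk a-b, branches b-c and b-d, leaves e, f under c and g, h under d), the extra "drv" root node, and all function names are choices made for this example.

```python
def segment_rc(length, width, r=1.0, c=2.0):
    """Total R and C of one straight segment: R = r*L/W, C = c*W*L."""
    return r * length / width, c * width * length

def build_rc_tree(segments, r_driver, c_load, leaves, root="drv", first="a"):
    """Turn (parent, child, L, W) segments into an RC tree using pi models."""
    children = {root: [(first, r_driver)]}      # driver resistance R0
    cap = {root: 0.0}
    for parent, child, length, width in segments:
        res, total_c = segment_rc(length, width)
        children.setdefault(parent, []).append((child, res))
        cap[parent] = cap.get(parent, 0.0) + total_c / 2   # near-end half
        cap[child] = cap.get(child, 0.0) + total_c / 2     # far-end half
    for leaf in leaves:
        cap[leaf] += c_load                     # driven gate's input cap
    return children, cap

def elmore_delays(children, cap, root="drv"):
    """Elmore delay from root to every node of the tree."""
    down = {}
    def total(node):                            # bottom-up downstream sums
        down[node] = cap[node] + sum(total(ch) for ch, _ in children.get(node, []))
        return down[node]
    total(root)
    delay = {root: 0.0}
    stack = [root]
    while stack:                                # top-down accumulation
        node = stack.pop()
        for child, res in children.get(node, []):
            delay[child] = delay[node] + res * down[child]
            stack.append(child)
    return delay

def example(bd_length=5, bd_width=1):
    """Delays to a left leaf (e) and a right leaf (g) of the example tree."""
    segments = [("a", "b", 20, 1), ("b", "c", 5, 1), ("b", "d", bd_length, bd_width),
                ("c", "e", 2, 1), ("c", "f", 2, 1), ("d", "g", 2, 1), ("d", "h", 2, 1)]
    children, cap = build_rc_tree(segments, r_driver=20, c_load=1, leaves="efgh")
    delay = elmore_delays(children, cap)
    return delay["e"], delay["g"]

base = example()                   # symmetric tree
longer = example(bd_length=40)     # b-d lengthened to L = 40
wider = example(bd_width=10)       # b-d widened to W = 10
```

Under these assumptions the script gives 2881 on both sides for the base tree, 5681 (left) and 7606 (right) for the longer wire, and 6481 (left) and 6436 (right) for the wider wire, in agreement with the slide values.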
Elmore Applications
Do people really use this delay metric? Yes!
Timing verification
Can use this to give realistic wire delays, post layout, for final STA
During placement
Estimate wire shape (eg, a simple Steiner tree), and you can get a very quick delay estimate
Analytical placers use it to adjust weights on wires, to coerce critical wires to be short
Slide 87
Summary
Interconnect has a huge impact on chip speed
Cannot ignore delays caused by the electrical properties of real wires
Layout tools responsible for part of timing guarantee
Upstream tools determine levels of logic, gate count, fanouts, etc
Physical design tools responsible for how long the wires end up
All of these impact wire length and distribution
Individual wires are today modeled as complex circuits
RC tree is the most useful model; Elmore delay is easiest to compute
There are sophisticated estimators beyond Elmore...
Can use for both verification, and for layout optimizations (eg clock)
Slide 88
