WA1.3 Bhat Paper
Simulation
Shankarnarayan Bhat,
Shrivatsa Prahallada,
Sriram Satakopan,
Sanjay Muchini
Qualcomm India Pvt Ltd,
Whitefield, Bangalore,
India - 560066
www.qualcomm.com
ABSTRACT
Gate level verification is becoming extremely challenging due to increased gate counts in SoC
designs and the limited capabilities of EDA tools. On-time completion of the GLS activity reduces
the tape-out and silicon bring-up risk of the product, which directly impacts Time-To-Market. One of the key challenges in gate level verification is X propagation debug. X
propagation originates from un-initialized memories, un-initialized flops and timing violations. Identifying the
non-resettable flops in the design is the key challenge in gate level simulation.
EDA tools provide limited capability to identify un-initialized flops/UDPs. An in-house
flow was developed leveraging the capabilities of two Synopsys tools, PrimeTime and the VCS
simulator. The design netlist and the simulation dump are the inputs to the flow, which analyzes flop
state during the reset condition. Flop state is recorded and checked every simulation cycle to identify
an X state. The identified flops are initialized to a random value of 0 or 1. Randomizing these
initial values on the identified flops increases the probability of uncovering gate level design bugs.
The new flow enabled rapid identification of un-initialized flops and significantly reduced gate level
verification time.
Table of Contents
1. Introduction
2.
3.
4. Results
5. Conclusions
6.
7. Acknowledgements
Table of Figures
Figure 1 Asynchronous resettable flop
Figure 2 Synchronous resettable flop
Figure 3 Synchronous resettable flop with no clock during reset period
Figure 4 Combinational loop back
Figure 5 X propagation through design
Figure 6 Flow chart to identify un-initialized flops in the design
Figure 7 evcd dump sequence using VCS simulator
Figure 7 un-initialized flops breakup
Figure 8 Person efforts utilized in X propagation debug
SNUG 2010
Table of Tables
Table 1 GLS efforts on a project case study
Table 2 Comparison of Traditional X propagation debug and debug with New Flow
1. Introduction
Gate level simulation is an integral part of pre-silicon verification. As design complexity and
flop count increase, gate level verification is becoming an increasingly challenging and
time-consuming activity. A basic question arises: why do we need gate level simulation when STA and formal verification are widely accepted? Even with very robust STA
and formal verification, gate level verification remains mandatory for multiple reasons:
- Validate the use of wildcards in static timing constraints. Wildcards can set false and multicycle paths where they do not belong, and a wrong understanding of the design may lead to a wrong false path definition.
- Validate asynchronous interfaces, which STA cannot analyse.
- Validate constraints defined during formal verification, since scan and MEMBIST insertion happen after synthesis.
- Run timing simulation with back-annotated SDF to validate dynamic switching.
- Generate ATE test patterns from SDF simulation.
- Identify any unintended initial condition (X/Z) of a flop; RTL simulation may be optimistic or pessimistic.
Zero-delay gate level simulation is run to validate that the design comes out of reset and that
scan insertion is correct. Fully SDF back-annotated gate level simulation is run to validate timing constraints, dynamic switching and asynchronous interfaces. SDF-based simulation is also
used for ATE pattern generation.
One of the key challenges of gate level simulation is identifying the source of X propagation. X propagation may be caused by un-initialized memory, un-initialized registers, a wrong library
model, timing violations, etc. In a multimillion-gate design, a considerable amount of time is
spent identifying the source of X and fixing the X propagation. Table 1 shows the timeline
and complexity involved for one project.
Gate level simulation is a painful and time-consuming task. Traditionally, gate level
debug is done using the VCS simulator or any other industry-standard simulator. Most of the
time is spent debugging X propagation in the simulation. Chasing X is a manual effort that depends on the engineer's experience and design knowledge. In practice it is not
possible to verify 100% of the test cases at gate level. In the above project (Table 1) only 5% of the total
verification test suite was verified at gate level, which limits the number of issues that can be uncovered. In that project manual debug identified only 260 non-reset flops; the design actually
had many more, which resulted in a silicon bug.
There are several causes of X propagation in GLS. The most complex to debug is non-reset or un-initialized flops. This paper presents an innovative flow that finds all non-reset flops
that cause X propagation in a very short period of time. An in-house tool based
on PrimeTime and the VCS simulator precisely identifies all non-reset flops. These flops are reviewed with the designer and forced to a random value (0 or 1) during simulation. Using random force
deposition on the flops increases the probability of identifying design and initialization issues. The paper describes the different sources of X propagation during GLS and a new methodology/flow to identify non-reset flops up front, before starting the simulation, and to create a random force file for simulation. It also includes one project case study with actual data
collected from the simulations.
Asynchronous reset flops are initialized to 0 or 1 as soon as the reset/clear signal is asserted.
In the case of synchronous resettable flops (Figure 2), the device reset is applied to the D input through
combinatorial logic. On the first clock after reset, the flop is initialized to a valid value.
Ideally all flops in the design should be either asynchronously or synchronously resettable. In order to reduce area and cost, fewer asynchronous resettable flops are used.
During simulation any of the above three conditions originates X, and these Xs are propagated
through the design as in Figure 5.
2.3.4 X propagation
At a given point in simulation time, a flop that has an X value on its Q output may not be the originator of the X. The originating flop is identified by tracing the X propagation path hierarchically through the design using the VCS simulator. This requires extensive design knowledge to trace the
signal and is a time-consuming task.
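The manual trace described above amounts to walking backwards through the fan-in of each X node until a node is reached whose drivers are all known. A minimal Python sketch of that idea follows. It is an illustration only, not the authors' tool: the function name find_originators and the drivers map (node to the set of nodes driving it) are hypothetical, and a real netlist would need the fan-in extracted from the design first.

```python
def find_originators(x_nodes, drivers):
    """Return the X nodes that are not fed by any other X node.

    x_nodes: set of node names whose value is X at the time of interest.
    drivers: dict mapping a node to the set of nodes that drive it
             (hypothetical fan-in map, assumed to be available).
    """
    originators = set()
    for node in x_nodes:
        # Ignore a node's own feedback path when looking for X drivers.
        fanin = drivers.get(node, set()) - {node}
        if not (fanin & x_nodes):
            originators.add(node)
    return originators


if __name__ == "__main__":
    fanin_map = {"c": {"b"}, "b": {"a"}, "a": set()}
    print(find_originators({"a", "b", "c"}, fanin_map))
```

In this toy chain a drives b and b drives c, so only a is reported even though all three nodes show X.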
[Figure 7 evcd dump sequence using VCS simulator; labels: reset_in, start evcd dump, stop evcd dump]
The eVCD is dumped only during reset cycles. The q, clk and d ports of all flops are captured in the eVCD.
3.4 Find_all_Xs
Find_all_Xs is a Perl utility consisting of two modules, viz. an eVCD parser and an eVCD analyser.
The eVCD parser stores all the valid states and value change information for all the pins in a
hash data structure. As the eVCD is generated at the flop level of the design, the analyser gets
the state of every input/output pin of each flop. Based on this information, the flops of concern are
categorised.
The output Q/NQ of the flip flop is a function of D input and CLK clock input.
Q = Fn(D, CLK);
We used 20 clock cycles of the reset period for the analysis. The number of cycles used
depends on the reset logic sequence.
FlopInst: top.u_core_top.xxxx.yyy!lc_server_ch1!lc_bank_reg[15][1]
PIN   CODE    STATES
q     <3525   pX(0)
d     <3527   pN(0)
clk   <3528   pN(0) pD(116) pN(125) pD(134) pN(143) pD(152) pN(161) pD(170)

FlopInst: top.u_core_top.xxx.yyy!u_l2cc_s1!u_l2cc_lrb!data_reg_reg[170]
PIN   CODE    STATES
q     <3531   pX(0)
d     <3533   pN(0)
clk   <3534   pN(0) pD(26)
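The STATES column in the records above can be read mechanically. The following Python sketch shows one way to split such a string into (state, time) pairs. It is not the authors' Perl parser: the names parse_states and toggles are hypothetical, and reading the number in parentheses as the time of the value change is an assumption based on the listing.

```python
import re

# One entry looks like "pN(0)": a state code followed by a number in
# parentheses (assumed here to be the time of the value change).
_STATE_RE = re.compile(r"(\w+)\((\d+)\)")


def parse_states(states):
    """Split a STATES string into a list of (state_code, time) tuples."""
    return [(code, int(time)) for code, time in _STATE_RE.findall(states)]


def toggles(states):
    """Number of value changes recorded after the initial state."""
    return max(len(parse_states(states)) - 1, 0)


if __name__ == "__main__":
    print(parse_states("pN(0) pD(26)"))
    print(toggles("pX(0)"))
```

A pin whose record holds a single entry, such as q and d in both records, never changed during the dump window; the clk pin of the first record changed seven times.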
Formatted output of the find_all_Xs module
#*** 2D MATRIX where ... ROWS=FLOPS and COLS=CYCLES ...
PIN_CODE , PIN |  0 | 364001 | 416000 | 442000 | 468000 | 494000 | 520000
<0  , q        | pX | X      | X      | X      | X      | X      |
<1  , clk      | pN | N      | N      | pD     | pN     | pD     |
<2  , d        | pN | N      | N      | N      | N      | N      |
<3  , q        | pX | X      | X      | X      | X      | X      |
<4  , clk      | pN | N      | N      | pD     | pN     | pD     |
<5  , d        | pN | N      | N      | N      | N      | N      |
<6  , q        | pX | X      | X      | X      | X      | X      |
<7  , clk      | pN | N      | N      | pD     | pN     | pD     |
<8  , d        | pN | N      | N      | N      | N      | N      |
<9  , q        | pX | X      | X      | X      | X      | X      |
<10 , clk      | pN | N      | N      | pD     | pN     | pD     |
<11 , d        | pN | N      | N      | N      | N      | N      |
<12 , q        | pX | X      | X      | X      | X      | X      |
<13 , clk      | pN | N      | N      | pD     | pN     | pD     |
<14 , d        | pN | N      | N      | N      | N      | N      |
<15 , q        | pX | X      | X      | X      | X      | X      |
So if Q = X, then either
the D input is non-deterministic, or
the clock is dead, or the clock is X.
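The rule above can be expressed as a small decision function. The Python sketch below is an illustration under stated assumptions, not the authors' analyser. It assumes the standard extended VCD port state characters (X for an unknown output, N for an unknown input), treats the leading p in the listings as the port value-change marker and drops it, and calls a clock "dead" when its state never changes over the dumped window. The name suspect_cause and the returned labels are hypothetical.

```python
# Assumed eVCD state characters: X = unknown output, N = unknown input.
UNKNOWN = {"X", "N"}


def strip_p(state):
    """Drop the leading 'p' that the listings put in front of a state."""
    return state[1:] if len(state) > 1 and state[0] == "p" else state


def suspect_cause(q, d, clk):
    """Classify one flop from its per-cycle q, d and clk state lists."""
    q, d, clk = ([strip_p(s) for s in pin] for pin in (q, d, clk))
    if q[-1] not in UNKNOWN:
        return "Q is initialized"
    if all(s in UNKNOWN for s in clk):
        return "clock is X"
    if len(set(clk)) == 1:
        return "dead clock"
    if any(s in UNKNOWN for s in d):
        return "D is non-deterministic"
    return "undetermined"


if __name__ == "__main__":
    # First flop of the matrix: q stays X, d stays N, clk changes state.
    print(suspect_cause(["pX", "X", "X", "X"],
                        ["pN", "N", "N", "N"],
                        ["pN", "N", "N", "pD"]))
```

The checks run in a fixed order, so a flop with both an unknown clock and an unknown D is reported under the clock first; a different priority may suit a different reset scheme.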
3.5 Detect_ancestor_kit
From find_all_Xs we get a filtered list of X propagating flip-flops whose D input may be
non-deterministic. These flops are categorized as below.
3.5.1 Flop with feedback loop
These are the flops whose D input is derived from the Q output of the same flop, as discussed in
section 2.3.3. For all the flops listed in section 3.4, the parent flops of each flop are identified
with the help of a PrimeTime session. If the same flop appears as one of its own parent flops, it is
categorized as self-looped (feedback loop from the same flop). Q is X at time 0 for these flops and
never recovers during the reset period. These flops need to be initialized during simulation.
3.5.2 X propagating flop
A flop with a valid clock and an X on its D input is construed as an X propagating flop, as
discussed in section 2.3.4. The Detect_ancestor_kit tool analyses such cases. No force or fix is needed
for these flops.
3.5.3 Non-resettable Flops
These are the flops that do not fall under either of the above categories. They are purely non-resettable. The design coding needs to be changed for this set of flops. The generated random force file
initialises these flops for simulation purposes.
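The three categories of section 3.5 can be sketched as a lookup on a parent map. The Python below is an illustration, not the Detect_ancestor_kit implementation. The parents dictionary (flop to the set of flops feeding its D input) stands in for whatever the PrimeTime session reports and is a hypothetical data structure, and "D input is X" is approximated here as "at least one parent flop is itself X".

```python
def categorize(flop, parents, x_flops):
    """Assign one X flop to a category of section 3.5.

    parents: dict mapping a flop to the set of flops feeding its D input
             (hypothetical; assumed to be extracted beforehand).
    x_flops: set of flops whose Q stayed X through the reset window.
    """
    fanin = parents.get(flop, set())
    if flop in fanin:
        return "feedback loop"    # 3.5.1: initialize during simulation
    if fanin & x_flops:
        return "X propagating"    # 3.5.2: no force or fix needed
    return "non-resettable"       # 3.5.3: fix the design coding


if __name__ == "__main__":
    parent_map = {"a": {"a", "b"}, "c": {"d"}, "d": set()}
    x_set = {"a", "c", "d"}
    for name in sorted(x_set):
        print(name, categorize(name, parent_map, x_set))
```

Checking the self loop first mirrors the order of the subsections: a self-looped flop is never reported as merely propagating.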
3.6 Random_force_gen
With the above flow we have identified all three types of un-initialized flops that originate X
propagation:
1) Non-resettable flops
2) Flops with combo loop back, Q not initialized to 0 or 1 at reset
3) Flops with valid D but no clock provided
These flops are initialized during simulation to a known value (0 or 1). When silicon comes out
of reset, these un-initialized flops may come up in a state of either 0 or 1. A unique combination of
flop states at reset can cause silicon failure. With n such flops, 2^n combinations of forces would
have to be applied to identify all the issues. It is practically impossible to predict the reset state of a non-resettable
flop during silicon bring-up and to verify all 2^n force combinations. In order to increase the probability of
identifying the root cause, we randomly initialize the flip-flops.
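To see why exhaustive forcing is out of reach, it helps to put numbers on the 2^n argument. The Python sketch below is an illustration only; the function names are hypothetical. It counts the reset states for n flops and gives an upper bound on the fraction of those states that a given number of randomly initialized simulation runs can exercise.

```python
def reset_state_count(n_flops):
    """Each un-initialized flop can power up as 0 or 1 independently."""
    return 2 ** n_flops


def coverage_fraction(n_flops, n_runs):
    """Upper bound on the share of reset states hit by n_runs simulations,
    each applying one random assignment to all n_flops flops."""
    return min(1.0, n_runs / reset_state_count(n_flops))


if __name__ == "__main__":
    print(reset_state_count(3))
    print(coverage_fraction(3, 4))
    print(len(str(reset_state_count(260))))
```

With only the 260 flops found by manual debug in the case study, the count is already a 79-digit number, so any practical number of runs covers a vanishing fraction of the space. Random initialization therefore samples the space rather than enumerating it.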
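#!/usr/bin/perl
use strict;
use warnings;
# ...
# $no_of_non_resetable_flops and $arbitrary_value are set in the elided code.
# Note: '**' is exponentiation in Perl; '^' is bitwise XOR.
my $range = 2 ** $no_of_non_resetable_flops;
my $minimum = $arbitrary_value;
my $random_number = int(rand($range)) + $minimum;
my $binary_random_number = dec2bin($random_number);

# Convert a decimal number to a binary string.
# pack("N") holds 32 bits, so this handles values up to 2**32 - 1.
sub dec2bin {
    my $str = unpack("B32", pack("N", shift));
    $str =~ s/^0+(?=\d)//;    # strip leading zeros, keeping at least one digit
    return $str;
}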
-deposit top.u_xxx.yyy!u_src!des_data_q_reg[12] 1
-deposit top.u_xxx.yyy.core_rd_data_lat_reg[17] 0
-deposit top.u_xxx.yyy_mncntr_not_n_m_val_reg[20] 1
-deposit top.u_xxx.yyy_not_n_m_val_reg[19] 1
-deposit top.u_xxx.yyy!u_map0_par1_15!q_reg[4] 1
-deposit top.u_xxx.yyy!DDUpperFifo_reg[1][7] 0
-deposit top.u_xxx.yyy!AcceptB_reg 1
-deposit top.u_xxx.yyydeblocker_c_wr_addr1_buf_loc_reg[10] 1
-deposit top.u_xxx.yyy!cpp_sym_cnt_reg_2!q_reg[0] 0
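A force file with the layout of the listing above could be produced along the following lines. This Python sketch is an illustration, not the authors' Random_force_gen (which is written in Perl). The function name force_lines and the seed parameter are hypothetical, and the line format is copied from the listing rather than from any tool documentation.

```python
import random


def force_lines(flop_paths, seed=None):
    """Return one '-deposit <path> <0|1>' line per flop.

    seed is an illustrative parameter: passing the same seed reproduces
    the same random assignment, which helps when re-running a failure.
    """
    rng = random.Random(seed)
    return ["-deposit {} {}".format(path, rng.randint(0, 1))
            for path in flop_paths]


if __name__ == "__main__":
    flops = ["top.u_xxx.yyy!AcceptB_reg",
             "top.u_xxx.yyy!DDUpperFifo_reg[1][7]"]
    for line in force_lines(flops, seed=1):
        print(line)
```

Drawing one bit per flop avoids building a single n-bit random number, so the sketch is not limited by integer or floating-point width when the flop count is large.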
4. Results
The automated flow was evaluated on the project using the same number of test cases.
In the above analysis (Figure 7), 73% of the flops were initialized to either 0 or 1 when reset was
asserted. The remaining 27% of the flops, which were un-initialized, fall into three categories:
13% of the flops did not receive a clock during the reset sequence, 10% of the flops had a self loop back
from Q to D, and 4% of the flops were non-resettable due to design coding issues.
A force file was generated for this 27% of the flops using the random value generator function and applied during simulation.
A significant reduction in manual debug effort was recorded for a similar number of test cases when
the new flow was deployed.
Each of these tests is a system level test case with a simulation time of 5 msec to 10 msec. The run time
of each test case is 12 hours to 24 hours. Figure 8 captures the person-weeks used for simulation debug. It took ~8 weeks for 7 engineers to manually chase X and debug. By applying the force file
generated by the new flow, a similar number of test cases was debugged by only 3 engineers within 9
weeks.
Deploying the automated flow on a new project took one day of effort.
We observed a ~52% reduction in person hours in GLS debug. The flow also enabled an early
feedback mechanism to designers for a possible fix of the issue.
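The ~52% figure follows from the staffing numbers quoted above, as the short Python check below shows. The function names are illustrative, and the check assumes that each engineer was engaged for the full number of weeks stated.

```python
def person_weeks(engineers, weeks):
    """Total effort, assuming every engineer works the full period."""
    return engineers * weeks


def reduction_percent(before, after):
    """Relative saving of 'after' with respect to 'before', in percent."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    manual = person_weeks(7, 8)      # traditional X chasing
    with_flow = person_weeks(3, 9)   # debug with the generated force file
    print(manual, with_flow, round(reduction_percent(manual, with_flow), 1))
```

Seven engineers for 8 weeks is 56 person-weeks and three engineers for 9 weeks is 27 person-weeks, a saving of 29 person-weeks.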
Table 2 Comparison of Traditional X propagation debug and debug with New Flow
(surviving cell values: 56 wks; 260 flops; 187K; 01; 00; 160)
The new flow identified all un-initialized flops in the design. Refer to Table 2 for a comparison with the traditional debug method. One design bug caused by an un-initialized flop was found in the post-silicon
phase. That bug was root-caused to a flop that the flow had identified
(in the 13% category). The flow enabled a sustainable model for gate level simulation that
can be ported across projects with minimal resource utilization.
5. Conclusions
This paper discussed a novel approach that leverages the existing capabilities of two Synopsys tools,
PrimeTime and VCS. The flow can be extended in an automated way
across projects.
The new flow enabled identification of 100% of the un-initialized flops in the design within a day, and
also resulted in a 50% reduction in GLS debug cycle time.
7. Acknowledgements
Our sincere thanks to the entire Qualcomm GLS team, who supported gate level debug and validation
of this flow.