Testing High-Performance Pipelined Circuits With Slow-Speed Testers
This article presents a methodology for testing high-performance pipelined circuits with slow-speed
testers. The technique uses a clock timing circuit to control data transfer in the pipeline in test
mode. The technique adds no extra hardware in the data path of the pipeline and therefore has
virtually no performance penalty. A clock timing circuit capable of achieving a timing resolution of
50 ps in 0.18 µm CMOS technology is presented. The design provides the ability to test the clock
timing circuit itself. The effectiveness of the technique is demonstrated using a 16-bit pipelined
multiplier as a test vehicle. Simulations show that we are able to detect delay faults as small as
50 ps at an input clock frequency of 100 MHz.
Categories and Subject Descriptors: B.6.2 [Logic Design]: Reliability and Testing—testability;
B.7.3 [Integrated Circuits]: Reliability and Testing—testability
General Terms: Design, Reliability
Additional Key Words and Phrases: Delay-fault testing, high-performance testing, design for delay
testability
1. INTRODUCTION
The 2001 edition of the International Technology Roadmap for Semiconductors
(ITRS) expects that the clock frequency of high-performance state-of-the-art
CMOS VLSI circuits will exceed 6 GHz by the year 2007 [ITRS 2001]. According to
the ITRS, potential manufacturing yield loss associated with the at-speed func-
tional test methodology is related to the growing gap between automatic test
equipment (ATE) performance and the ever increasing device I/O speed. In the
last two decades, while the clock frequencies of VLSI circuits have improved at
an average rate of 30% per year, the tester accuracy has improved only at a rate
of 12%. If this trend continues, tester timing accuracy will soon approach the
cycle time of high-performance devices making at-speed test almost impossible.
This article is an extended version of the authors’ work published in Proceedings of the Design,
Automation and Test in Europe Conference 2003 (DATE03), vol. 3 (March), 212–217.
Authors’ address: Electrical and Computer Engineering, University of Waterloo, 200 University
Avenue West, Waterloo, ON, N2L 3G1, Canada.
© 2003 ACM 1084-4309/03/1000-0506 $5.00
ACM Transactions on Design Automation of Electronic Systems, Vol. 8, No. 4, October 2003, Pages 506–521.
Due to the slow advances and the high cost of ATEs, we might not be able
to test future high-performance VLSI circuits. Therefore, it will be essential to
design these circuits with design-for-testability/built-in-self-test (DFT/BIST)
techniques to reduce the reliance on traditional, high-cost, full-feature testers.
The requirements of ATEs designed to work with DFT/BIST techniques are
much simpler than those of traditional testers.
In this article, we propose a methodology for testing high-performance
pipelines using slow-speed testers. The technique depends on test mode clock
shifting. The technique adds no extra hardware in the data path of the pipeline
and therefore has virtually no performance penalty.
2. BACKGROUND
The size of a defect determines whether the defect affects the logic functionality
of a circuit. Normally, smaller defects, which are likely to cause partial shorts
or opens, have a higher probability of occurrence. Such defects often cause tim-
ing failures without altering the logic functionality of the circuit. A number
of recent studies show concerns about new failure mechanisms in scaled ge-
ometries that are harder to detect with conventional means. Nigh et al. [1997]
reported a significant number of timing-only failures that did not affect
the steady-state logic functionality. Similarly, for Intel’s manufacturing pro-
cesses, Needham et al. [1998] reported an increasing shift towards soft defects
as technology moved from 0.35 to 0.25 µm. These defects do not always cause
failures at all temperature and voltage conditions and are considered to be
major long term reliability threats.
Using DFT/BIST techniques that allow testing high-performance circuits
with slow-speed testers is one way to tackle the problem of high-performance
circuit testing. The creation of a low frequency test mode in digital circuits
was first introduced by Agrawal and Chakraborty [1995]. In their proposal, a
quantifiable, externally controlled delay is added such that high-performance
testing can be carried out with relatively slow-speed testers. They used a pulse-
triggered flip-flop in which a dynamic latch is introduced inside a traditional
master-slave flip-flop. Shashaani and Sachdev [1999] proposed the controlled-
delay flip-flop (CDFF) as an alternative to the pulse-triggered flip-flop. In this
technique, an additional test mode clock is used to control the delay of the flip-
flop. The main advantages of the CDFF over the pulse-triggered flip-flop are
the stable operation and improved performance in normal mode. Kerkhoff et al.
[2001] suggested a BIST environment for detecting delay faults based on the
use of CDFFs. Nummer and Sachdev [2001, 2003] proposed an on-chip clock
generation methodology that allows the test mode clock frequency of circuits
using CDFFs to be reduced arbitrarily.
Fig. 1. Block diagram of the DUT and the clock timing circuit.
shifted version(s) of the input clock, IPCLK, to control the timing of data flow
through the pipeline in the test mode. In order to achieve that, each register in
the pipeline has to have a separately routed clock, as shown in Figure 1. This
results in higher complexity of clock generation and propagation. The clock net
of a pipelined circuit normally consists of a tree of buffers. Special care should
be given to balance the load of the different clock tree branches in order to
keep the skew between the different clock signals within acceptable limits. In
our technique, only those buffers close to the clock timing circuit require extra
design and layout effort. Due to the small number of these upstream buffers,
the extra effort imposed by the technique would not be substantial.
The normal and test mode operation of the circuit is illustrated in Figure 2,
showing stage i in a pipeline (0 ≤ i ≤ n). In normal mode, a single phase high-
frequency clock is used for all registers in the pipeline. This is shown in
Figure 2(b). As a result, the operation of the circuit depends on the period of
this clock. The delay of stage i, $t_{d_i}$, can be expressed using
$$t_{d_i} = t_{\mathrm{prop}_i} + t_{\mathrm{comb}_i} + t_{\mathrm{setup}_{i+1}} + \left(t_{CK_i} - t_{CK_{i+1}}\right), \tag{1}$$
where $t_{\mathrm{prop}_i}$ is the propagation delay of register i, $t_{\mathrm{comb}_i}$ is the delay of the ith
stage combinational block, $t_{\mathrm{setup}_{i+1}}$ is the setup time of register i + 1, and
$(t_{CK_i} - t_{CK_{i+1}})$ is the difference between the delays through the clock driving networks
of registers i and i + 1. For the pipeline to function correctly, the normal mode
clock period, $T_{NM}$, has to be at least equal to the largest stage delay, that is,
$$T_{NM} \ge \max_{i=0}^{n} \left(t_{d_i}\right). \tag{2}$$
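As a minimal sketch of Eqs. (1) and (2), the Python below computes the delay of each stage and the smallest admissible normal mode clock period. The `Stage` record, the function names, and all numeric values are hypothetical and chosen only for illustration; they are not measurements from the test vehicle.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """Timing parameters of pipeline stage i, in picoseconds (illustrative)."""
    t_prop: float        # propagation delay of register i
    t_comb: float        # delay of the stage-i combinational block
    t_setup_next: float  # setup time of register i+1
    t_ck: float          # clock-network delay to register i
    t_ck_next: float     # clock-network delay to register i+1


def stage_delay(s: Stage) -> float:
    """Eq. (1): t_d = t_prop + t_comb + t_setup(i+1) + (t_CK(i) - t_CK(i+1))."""
    return s.t_prop + s.t_comb + s.t_setup_next + (s.t_ck - s.t_ck_next)


def min_clock_period(stages) -> float:
    """Eq. (2): the normal mode period must cover the largest stage delay."""
    return max(stage_delay(s) for s in stages)


# Hypothetical numbers, not taken from the article.
stages = [
    Stage(t_prop=80, t_comb=500, t_setup_next=60, t_ck=20, t_ck_next=10),
    Stage(t_prop=80, t_comb=450, t_setup_next=60, t_ck=10, t_ck_next=25),
]
t_nm = min_clock_period(stages)
```

Note that the clock-skew term can either lengthen or shorten a stage: in the second hypothetical stage the capture clock arrives later than the launch clock, which reduces the effective stage delay.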
In the test mode, a delayed version of the clock is used to test the pipeline.
This is illustrated in Figure 2(c). In order to test stage i, a delayed version of the
input clock, IPCLK, with delay Td is applied to register i + 1, while the original
clock is used for all other registers. The test mode clock period, TTM , has to be at
least equal to the normal mode clock period. It is clear that using a larger value
of TTM means that we can use a slow-speed tester in the test mode. Setting
Td to be equal to tdi allows the ith stage to operate within its normal mode
timing constraints while the whole circuit is running at a lower frequency. As a
result, a slow-speed tester can be used for performance binning and delay fault
testing. In the test mode, the tester supplies a slow-speed input clock as well
Fig. 2. Proposed technique. (a) Circuit model. (b) Normal mode. (c) Test mode.
as test vectors for the target stage. These vectors are supplied at the rate of the
slow-speed clock. After a predetermined number of clock cycles, the tester reads
the results at the same rate of the slow-speed clock. This technique does not
require any changes in the design of the registers or the combinational blocks
and therefore has virtually no performance penalty.
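The test mode clocking described above can be sketched as follows. The code assumes ideal clock edges and a pipeline of `n_stages` stages bounded by `n_stages + 1` registers; the function name `stage_windows` and the example values are hypothetical. It returns the time each stage has to propagate data when register i + 1 alone receives the clock delayed by Td.

```python
def stage_windows(n_stages: int, target: int, t_tm: float, t_d: float):
    """Time available to each stage (ps) when stage `target` is under test.

    Register target+1 is clocked by the delayed clock (delay t_d); every
    other register uses the undelayed clock of period t_tm.
    """
    if not 0 <= target < n_stages:
        raise ValueError("target stage out of range")
    if not 0 < t_d < t_tm:
        raise ValueError("t_d must lie strictly between 0 and t_tm")
    # Offset of each register's active clock edge within one cycle.
    offset = [t_d if r == target + 1 else 0.0 for r in range(n_stages + 1)]
    windows = []
    for i in range(n_stages):
        launch, capture = offset[i], offset[i + 1]
        # Captured in the same cycle if the capture edge comes later,
        # otherwise on the corresponding edge of the next cycle.
        if capture > launch:
            windows.append(capture - launch)
        else:
            windows.append(capture - launch + t_tm)
    return windows


# Hypothetical example: five stages, stage 3 under test,
# 100 MHz tester clock (10 000 ps period), Td = 700 ps.
windows = stage_windows(n_stages=5, target=3, t_tm=10_000, t_d=700)
```

Under these assumptions only the target stage sees a window of Td; the stage that follows it gets the period minus Td, and all remaining stages get the full slow-clock period.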
For the proposed technique to achieve its goals, it is essential for the value
of Td to be controllable within state-of-the-art timing accuracy. Furthermore,
we should have the ability to program Td in order to bin the device under
test (DUT) for performance. This feature also allows different stages in the
pipeline to be tested even if they have unbalanced delays. For these reasons,
an on-chip clock timing circuit is used to generate the delayed version of the
input clock and to control the clocks propagated to the different stages in the
pipeline.
The second step is to add the partial products together in a summation network
which reduces the partial products to two operands. The product is generated
in the final step by adding the resulting two operands using a carry propagate
adder.
In our design, no encoding is used to generate the partial products. This re-
sults in a number of partial products equal to the size of the multiplier (16).
These partial products are added in the summation network using 4-2 com-
pressors as the main component. A 4-2 compressor accepts four partial sums
and reduces them to two [Mehta et al. 1991]. In order to reduce the 16 partial
products to 2 operands, this has to be done in three levels of 4-2 compression. A
carry-lookahead adder with conditional sum select [Ohkubo et al. 1995] is used
to generate the product from these two operands.
The multiplier is implemented with five pipelined stages, as shown in
Figure 3. The first stage is used to generate the 16 partial products and re-
duce them to 8 partial sums after the first level of the summation network.
The second and third levels of the summation network are implemented in the
second and third stages of the pipeline. The final addition is done in the last
two stages.
Pipeline stages are separated by registers to control the timing of data flow
through the multiplier. Static flip-flops are used as the storage elements in all
registers. As shown in Figure 3, each register is controlled by a separate clock
provided through the clock timing circuit. Details of the design of this circuit
are given in the next section.
Table I. Critical-path delay of each pipeline stage
Stage     Delay (ps)
SN L1     715
SN L2     655
SN L3     655
CLA L1    665
CLA L2    615
Performance characterization of the multiplier is carried out in order to find
its maximum operating frequency and the critical path through each stage of
the pipeline. These results are shown in Table I. The delays shown include the
propagation delay of the register feeding the stage, the setup time of the register
accepting the output of the stage, and the difference in delays of the clock driving
networks of the input and output registers (refer to Eq. 1). As shown in Table I,
the first stage (SN L1) has the largest delay and the operating frequency of
the multiplier is determined by this stage. This delay is equal to 715 ps which
translates to a maximum operating frequency of 1.4 GHz.
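The arithmetic behind this figure can be checked with a short sketch that applies Eq. (2) to the stage delays listed in Table I. The dictionary and function names are illustrative only.

```python
# Stage delays from Table I, in picoseconds.
STAGE_DELAY_PS = {
    "SN L1": 715,
    "SN L2": 655,
    "SN L3": 655,
    "CLA L1": 665,
    "CLA L2": 615,
}


def critical_stage(delays):
    """Return the name and delay of the slowest stage."""
    name = max(delays, key=delays.get)
    return name, delays[name]


def max_frequency_ghz(period_ps):
    """Convert a clock period in ps to a frequency in GHz."""
    return 1e3 / period_ps


name, t_nm = critical_stage(STAGE_DELAY_PS)
f_max = max_frequency_ghz(t_nm)
```

The slowest stage sets the clock period, so the result is 1000/715, or roughly 1.4 GHz.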
Fig. 4. Clock Timing Circuit (multiplexer select inputs are not shown).
Figure 4 shows the clock timing circuit used in our design (multiplexer select
inputs are not shown). This circuit is designed using CMOS 0.18 µm technology
provided through TSMC. It comprises a phase-locked-loop (which could be part
of the clock generation circuit for the chip, as is the case for high-performance
microprocessors, for example), three delay lines, a phase splitter & delay circuit,
a DLL, and a number of multiplexers. The design allows Td to vary between
250 ps and 1000 ps with 50 ps increments (resolution). These are design vari-
ables and it is up to the designer to choose the values suiting a specific circuit
or application. For our test vehicle, the range of Td is reasonable to allow us to
do performance binning and test for delay faults. A 50 ps timing resolution is
high enough considering the maximum operating frequency of the multiplier.
The design and operation of the different blocks used in the clock timing circuit
are described below.
to calibrate the delay lines requires that the delay for any edge be within
a certain limit, otherwise the signal will be lost before reaching the end of
the delay line. As this is not feasible using the technology at hand (0.18 µm
CMOS technology), we designed the delay elements to have a delay of 100 ps
and used two delay lines with a 50 ps delay in between to get a resolution of
50 ps. It is worth noting that sizing of the transistors in the delay element
should be such that there are values of the control voltages that result in a
delay of 100 ps under worst case conditions (slow-PMOS and slow-NMOS
transistor models, T = 100 °C, and Vdd is 10% lower than its nominal value)
as well as best-case conditions (fast-PMOS and fast-NMOS models, room
temperature, and Vdd is 10% higher than its nominal value).
DL0 consists of 10 delay elements. It is used to calibrate the other two de-
lay lines. This is achieved by closing the DLL loop using DL0 and applying
HFCLK from the PLL to the input of this delay line. The period of HFCLK is
1 ns for an input clock frequency of 100 MHz. Once locked, the output volt-
ages from the DLL should be such that the total delay across DL0 is equal
to one clock period. Since these voltages control all three delay lines, the
delays of all delay elements in the circuit are adjusted to 100 ps. If IPCLK
were to be used to calibrate the delay lines, DL0 would have 100 delay ele-
ments rather than only 10. This explains the benefit of generating HFCLK
using the PLL.
As shown in Figure 4, the phase splitter & delay (PSD) circuit accepts
either IPCLK (low-frequency clock, in the test mode) or HFCLK and gen-
erates two clocks, A and B. The PSD circuit is designed such that the clock
at A is delayed by half of one delay-element delay with reference to the clock at B.
Therefore, when Vp and Vn are such that the delay of the delay element is
100 ps, the delay from B to A is 50 ps. These two clocks are fed to the other
two delay lines, DL1 and DL2. DL1 and DL2 consist of 11 and 10 delay
elements, respectively. The main function of these two delay lines is to gen-
erate DCLK with programmable delays with respect to CLK (Figure 4). It is
worth noting that the input clock jitter has virtually no effect on the timing
accuracy of DCLK and hence on the high-performance delay-fault testability.
Any edge placement inaccuracy at the input propagates through both DL1
and DL2. As a result, the same amount of jitter is added to both CLK and
DCLK keeping the delay between these two clocks at the desired value
of Td .
(4) Multiplexers M1–M7. The multiplexers shown in Figure 4 can be divided
into two groups. The first group includes M2 and M5. Outputs from 16 delay
elements in DL1 and DL2 are tapped and fed to M2. According to the select
inputs of M2, one of the 16 inputs is selected to be DCLK. This allows the
delay between CLK and DCLK to be varied between 250 ps and 1000 ps
with 50 ps increments (note that CLK is buffered to compensate for the
delay through M2). M5 consists of six 2:1 multiplexers. It is used to control
the clocks feeding all registers in the pipeline (CK0 to CK5). Depending on
the mode of operation and the pipeline stage to be tested, M5 sets the clock
of each register to either CLK or DCLK.
The second group of multiplexers includes M1, M3, M4, M6, and M7. These
multiplexers are used only to ensure the functionality of the clock timing
circuit. Details on the operation of these multiplexers are given in the next
section.
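The way two delay lines built from 100 ps elements yield a 50 ps grid can be illustrated with a short sketch. It assumes ideal, calibrated elements, measures every tap from clock B, and takes DL1 as driven by B and DL2 as driven by A. The select-code mapping in `m2_select` is hypothetical, since the actual ordering of the M2 inputs is not given here.

```python
ELEMENT_PS = 100     # calibrated delay of one delay element
PSD_OFFSET_PS = 50   # clock A lags clock B by half an element delay


def tap_delays(n_elements, input_offset_ps):
    """Delay at the output of each element, measured from clock B."""
    return [input_offset_ps + k * ELEMENT_PS for k in range(1, n_elements + 1)]


dl1_taps = tap_delays(11, 0)              # DL1: 11 elements, driven by B
dl2_taps = tap_delays(10, PSD_OFFSET_PS)  # DL2: 10 elements, driven by A
merged = sorted(dl1_taps + dl2_taps)      # interleaved taps
steps = {b - a for a, b in zip(merged, merged[1:])}


def td_settings(t_min=250, t_max=1000, step=50):
    """Programmable values of Td in ps (range and resolution from the text)."""
    return list(range(t_min, t_max + 1, step))


def m2_select(td_ps):
    """Hypothetical select code: position of td_ps among the settings."""
    return td_settings().index(td_ps)
```

Interleaving the two lines halves the step without requiring an element faster than 100 ps, and the range of 250 ps to 1000 ps in 50 ps increments gives exactly the 16 settings that M2 chooses among.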
The proposed clock timing circuit takes into account all the issues mentioned
at the beginning of this section. Programmability is achieved through DL1, DL2,
and M2. Using a DLL allows us to achieve the same timing accuracy regardless
of process, temperature, and/or supply voltage variations. The DLL, together with a
number of multiplexers, helps ensure the functionality of the clock timing circuit
itself. With the help of the PLL, tester clock jitter is not allowed to propagate to
DL0. It also has no effect on the timing accuracy of signals generated from DL1
and DL2. The area overhead due to this design is estimated to be 100 gates
per pipeline stage. This should be acceptable for medium to large pipelined cir-
cuits. It is important to note that matching between the different components
in the design is essential to ensure correct timing even with small local process,
temperature, and/or supply voltage variations. This can be achieved through
circuit layout techniques similar to those used for analog circuits.
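The benefit of calibrating DL0 with HFCLK rather than IPCLK comes down to simple arithmetic, sketched below. The function names are illustrative; the sketch assumes that the calibration line must span exactly one period of the calibrating clock and that each element is set to 100 ps.

```python
def period_ps(freq_mhz):
    """Clock period in picoseconds for a frequency given in MHz."""
    return 1_000_000 / freq_mhz


def elements_needed(freq_mhz, element_ps=100):
    """Number of delay elements whose total delay equals one clock period."""
    return round(period_ps(freq_mhz) / element_ps)


n_hfclk = elements_needed(1000)  # HFCLK: 1 ns period
n_ipclk = elements_needed(100)   # IPCLK: 100 MHz input clock
```

A 1 ns period needs 10 elements, whereas the 10 ns period of a 100 MHz input clock would need 100, which is why the PLL-generated clock keeps DL0 short.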
6. MODES OF OPERATION
In this section, we put the different blocks of the clock timing circuit together
and show how they function in the different modes of operation. The clock timing
circuit operates in three different modes. In normal mode, the pipeline is used to
perform the function it is designed for. In the DUT test mode, the clock timing
circuit is used to verify the performance of the DUT and to test it for delay
faults. In the clock timing circuit test mode, the clock timing circuit is tested to
ensure its ability to give a correct image about the performance of the pipeline.
set to DCLK. As a result, data flows between registers 3 and 4 within normal
mode timing constraints. For all other stages, the low frequency clock allows
operation under relaxed timing. This is important to ensure that delay faults
in these stages do not affect the target path and hence the stage under test.
This procedure is repeated for every path to be tested until the DUT is tested
completely.
6.3.1 Phase 1: Testing DL0, DL1, and DL2. In this phase, delay lines DL0,
DL1, and DL2 are tested between nodes C and H, B and G, and A and E,
respectively. Table II gives the number of delay elements between the different
nodes in the clock timing circuit. The first three entries in the table have the
same number of delay elements. As a result, closing the DLL loop with these
node pairs one at a time should result in very close values of Vn (monitored off-
chip) for all three configurations. As shown in Figure 6, the test in this phase
is done in four steps:
Ideally, the three values should be equal. Mismatch between the delay el-
ements and the different components in the DLL would result in some dif-
ferences. Characterization is necessary to define how much difference due to
mismatch and process variations is acceptable. In this context, we assume that
only a single delay fault exists in the circuit. It is highly unlikely to have the
exact same amount of delay fault in two or all three of the delay lines. In addition
to DL0, DL1, and DL2, the test in this phase covers paths through multiplexers
M3 and M4 used to close the DLL loop (A-X, B-X, C-X, E-Y, G-Y, and H-Y).
Table II. Number of delay elements between nodes in the clock timing circuit
Node pair    Delay elements
A and E      10
B and G      10
C and H      10
A and F      10½
B and E      10½
6.3.2 Phase 2: Testing the Phase Splitter & Delay Circuit. Referring to
Table II, the delay between nodes A and F on one side and nodes B and E
on the other are both equivalent to 10½ delay elements. This is only true if
the two outputs of the PSD circuit (A and B) are exactly half delay-element
delay apart. This observation is used to test the PSD circuit for delay faults. As
shown in Figure 6, the test is done first by closing the DLL loop using nodes
A and F. When locked, DLL output Vn is recorded as Vn3 . Similarly, Vn4 is
obtained by closing the DLL loop through nodes B and E. For fault-free PSD,
the difference between Vn3 and Vn4 should be within acceptable limits (defined
through characterization). The test in this phase covers the G-F-Y path as well.
6.3.3 Phase 3: Testing Multiplexer M5. This is done with the help of multi-
plexers M6 and M7. The main idea is to set the delay between CLK and DCLK
(Td ) to one cycle of HFCLK (1 ns). The different paths in M5 are tested two at
a time. As shown in Figure 6, to test the paths from CLK to CKi and DCLK
to CKi+1 (where i is any number between 0 and 4), M5 is set accordingly and
M6 and M7 are used to close the DLL loop using nodes CKi and CKi+1. Under
these conditions, if all signal paths are free of delay fault, when locked, the
DLL output, Vn , should be equal to the values obtained in phase 1 of the test
procedure (for fault free DL0, DL1, and DL2). In addition to M5, the test in this
phase covers M6, M7, the 15-DCLK path in M2, the D-X path in M3, the I-Y
path in M4, and the buffer used for CLK.
target path, two vectors are used to test the circuit. The first vector initial-
izes the DUT while the second vector activates the target path. The delay
of a given path depends on the input vectors of the circuit. As a result, for
every stage in the pipeline, the critical path and its delay might change de-
pending on the applied vectors. For all paths tested in our simulations, acti-
vation vectors are chosen such that these paths are the critical paths in their
pipeline stage for these vectors. As mentioned before, our design allows adjust-
ment of Td such that paths of different delays can be tested for small delay
faults.
Delay-fault simulation results for the test vehicle are shown in Table III.
The test is done at an input clock frequency of 100 MHz. Delay faults of 50 ps
are inserted in each path one at a time. Td from the clock timing circuit is
set to the next higher value compared to the delay of the target path. Under
these conditions, the multiplier gives incorrect output for all paths tested in our
simulations, as shown in Table III. The left half of the table gives the delays
of the different paths and the value of Td used to detect a 50 ps delay fault
in each path. The right half of the table gives the test vectors used for each
path as well as the fault-free and faulty products of the multiplier. The extent
of delay fault that goes undetected is a function of the slack between the delay
of the path and the value of Td used to test it. A path with larger slack will
have a larger undetectable delay fault. For example, in the case of path # 2, we
expect that delay faults as small as 15 ps should be detectable using the same
value of Td . On the other hand, for path # 3, it will take a delay fault of at least
45 ps to cause the timing failure. The situation is worse if the target path is
not the critical path for the applied vectors. In general, delay fault detection is
dependent on the target path delay. Most of the delay fault testing techniques
have similar limitations. Balancing path delays is the most commonly used
method to alleviate this problem.
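The relation between path delay, the chosen Td, and the smallest detectable delay fault can be sketched as follows. The code assumes the programmable range of 250 ps to 1000 ps in 50 ps steps and treats the slack as an approximation of the smallest detectable fault. The example inputs are the stage delays of Table I, used only for illustration; they are not the paths of Table III.

```python
import bisect

TD_SETTINGS_PS = list(range(250, 1001, 50))  # 16 programmable values of Td


def next_td(path_delay_ps):
    """Smallest programmable Td strictly greater than the path delay."""
    idx = bisect.bisect_right(TD_SETTINGS_PS, path_delay_ps)
    if idx == len(TD_SETTINGS_PS):
        raise ValueError("path delay exceeds the programmable range")
    return TD_SETTINGS_PS[idx]


def slack(path_delay_ps):
    """Margin between Td and the path delay; faults below it go undetected."""
    return next_td(path_delay_ps) - path_delay_ps


# Stage delays from Table I (ps), used here as example path delays.
example = {d: (next_td(d), slack(d)) for d in (715, 655, 665, 615)}
```

Because Td moves in 50 ps steps, the slack never exceeds 50 ps, which is consistent with a 50 ps fault being detected on every tested path while smaller faults are caught only on paths that happen to sit close to a Td setting.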
Table IV. Delay-Fault Simulation Results for the Clock Timing Circuit
Fault #   Fault location   Faulty path   Delay fault (ps)   Vn0 (mV)   Vn1 (mV)   Vn2 (mV)   ΔVn (mV)
F1        DL0              C-H           60                 633        611.5      611.5      21.5
F2        DL1              B-G           100                611.5      649        611.5      37.5
F3        DL2              A-E           200                611.5      611.5      691.7      80.2
F4        M3               A-X           60                 611.5      611.5      594.2      −17.3
F5        M3               B-X           100                611.5      583.3      611.5      −28.2
F6        M3               C-X           200                554        611.5      611.5      −57.5
F7        M4               E-Y           60                 611.5      611.5      634        22.5
F8        M4               G-Y           100                611.5      648.7      611.5      37.2
F9        M4               H-Y           200                697        611.5      611.5      85.5
A positive ΔVn indicates that the delay seen by the DLL is larger than it
should be. As a result, the DLL causes the voltage to increase in order to
compensate for the extra delay. The opposite is true for negative ΔVn. For a
given delay fault, the larger the magnitude of ΔVn, the easier it is to observe the
error due to the fault. As shown in Table IV, the smallest magnitude of ΔVn is 17.3 mV,
which can be easily measured off-chip. These results demonstrate our ability
to test the clock timing circuit for delay faults. This is important to ensure its
ability to give a true image about the operation and performance of the DUT.
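One way to post-process the three recorded control voltages is sketched below. It relies on the single-fault assumption, so two of the three readings agree and their common value (the median) serves as the fault-free reference. The function names and the 10 mV threshold are hypothetical; the acceptable limit would have to come from characterization.

```python
from statistics import median


def delta_vn(vn0, vn1, vn2):
    """Return (index of the deviating reading, deviation in mV).

    Index 0, 1, 2 corresponds to the loop closed through DL0, DL1, DL2.
    """
    readings = [vn0, vn1, vn2]
    ref = median(readings)
    idx = max(range(3), key=lambda k: abs(readings[k] - ref))
    return idx, round(readings[idx] - ref, 1)


def is_faulty(vn0, vn1, vn2, threshold_mv=10.0):
    """Flag a fault if the deviation exceeds a (hypothetical) threshold."""
    _, dv = delta_vn(vn0, vn1, vn2)
    return abs(dv) > threshold_mv


f1 = delta_vn(633, 611.5, 611.5)    # row F1 of Table IV
f4 = delta_vn(611.5, 611.5, 594.2)  # row F4 of Table IV
```

Applied to rows F1 and F4 of Table IV, the sketch reproduces the tabulated deviations of 21.5 mV and −17.3 mV and points to the loop through DL0 and DL2, respectively.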
8. CONCLUSIONS
In this article, we presented a methodology for testing high-performance
pipelined circuits with slow-speed testers. In this technique, each pipeline stage
is clocked using a separate clock generated from an on-chip clock timing circuit
in test mode. The technique adds no extra hardware in the data path of the
pipeline and therefore has virtually no performance penalty.
A design for the clock timing circuit capable of achieving a timing resolution
of 50 ps in 0.18 µm CMOS technology was presented. The design provides the
ability to test the clock timing circuit itself.
The effectiveness of the technique was demonstrated using a 16-bit pipelined
multiplier as a test vehicle. Simulations show that we are able to detect delay
faults as small as 50 ps at an input clock frequency of 100 MHz. Simulations
also prove our ability to test the clock timing circuit itself for delay faults.
REFERENCES
AGRAWAL, V. D. AND CHAKRABORTY, T. J. 1995. High-performance circuit testing with slow-speed
testers. In Proceedings of the International Test Conference, 302–310.
ITRS 2001. International technology roadmap for semiconductors, 2001 edition.
KERKHOFF, H. ET AL. 2001. Design for delay testability in high-speed digital ICs. J. Elect. Test.:
Theory Appl. 17, 3–4 (June–August), 225–231.
MEHTA, M., PARMAR, V., AND SWARTZLANDER, E., JR. 1991. High-speed multiplier design using
multi-input counter and compressor circuits. In Proceedings of the 10th IEEE Symposium on Computer
Arithmetic. IEEE Computer Society Press, Los Alamitos, Calif., 43–50.
MOYER, G. C. 1996. The Vernier Technique for Precise Delay Generation and Other Applications.
Ph.D. thesis, The Department of Computer Engineering, North Carolina State University.
NEEDHAM, W., PRUNTY, C., AND YEOH, E. H. 1998. High volume microprocessor test escapes, an
analysis of defects our tests are missing. In Proceedings of International Test Conference, 25–34.
NIGH, P. ET AL. 1997. So what is an optimal test mix? A discussion of the SEMATECH methods
experiment. In Proceedings of International Test Conference, 1037–1038.
NUMMER, M. AND SACHDEV, M. 2001. A methodology for testing high-performance circuits at arbi-
trarily low test frequency. In Proceedings of IEEE VLSI Test Symposium. IEEE Computer Society
Press, Los Alamitos, Calif., 68–74.
NUMMER, M. AND SACHDEV, M. 2003. A DFT technique for testing high-speed circuits with arbi-
trarily slow testers. J. Elect. Test.: Theory Appl. 19, 3 (June), 299–314.
OHKUBO, N. ET AL. 1995. A 4.4 ns CMOS 54 × 54-b multiplier using pass-transistor multiplexer.
IEEE J. Solid-State Circ. 30, 3 (Mar.), 251–257.
SHASHAANI, M. AND SACHDEV, M. 1999. A DFT technique for high-performance circuit testing. In
Proceedings of International Test Conference, 267–285.