This document summarizes how a commercial ATPG compression tool was used along with design for testability techniques to significantly reduce the scan test cost of an audio processor chip. Specifically, it describes how an on-chip PLL was used to generate high-speed clocks for at-speed testing instead of requiring an expensive tester. It also explains how test pattern volume was reduced through the use of broadside and launch-off-shift test patterns applied via internal scan enable logic and configurable capture counters controlled by test data shifting. These techniques allowed testing at the required high quality level while using a lower-cost tester.
Abstract
With the advent of nanometer technologies, the design size of integrated circuits is getting larger and the operation speed is getting faster. As a consequence, test cost is becoming unbearable with traditional test methods. The big challenge for design and test engineers is how to guarantee the required high levels of test quality and yield while keeping the test cost low. From a scan-based ATPG point of view, there are two main ways to reduce test cost. One way is to reduce test pattern volume and test run time. The problem is how to maintain the same test coverage with a smaller test pattern set. The other way to reduce test cost is to use lower-end testers, which are much cheaper but have limited memory, data channels, and clocking capabilities. This paper shares the experiences of a real case of how to significantly reduce scan test cost by using DFT techniques.

1. Introduction

The design is the first Dual-Core programmable CMOS digital signal processor (DSP) in a series of products that target the audio market. Since this Audio Processor is for an automotive application, it requires the highest possible test coverage. It is also for the consumer market so the cost needs to be as low as possible. This combination presents a common quandary for manufacturing test plans -- how to get high test quality at low cost [1].

The design will be packaged into 4 types of parts. The smallest package is an 80-pin quad flat pack, so only 16 scan chains are available for the design. With normal scan-based test methods, this requires more than 3000 registers on each scan chain. During scan-based testing, this configuration leads to a lot of time shifting test data into and out of the device which drives up the test cost. As the test time required for each device goes up, the throughput of the manufacturing test floor goes down.

High test quality is another challenge for test cost. As we know, at-speed testing is generally required to achieve high quality test levels [2][3].
The cost of including at-speed test comes in two parts -- first is the high test pattern volume needed and the second is providing the required high-speed clocking sequences. The Audio Processor is targeted to operate at 200MHz and potentially at even higher frequency with overdrive in the future. A tester that can provide 1000MHz clock driving capability costs almost twice as much as one with 100MHz clock driving capability. The clock speed for scan shifting is much lower and can be provided by any tester. The basic characteristics of this design are shown in Table 1.
This paper describes how a commercial ATPG compression tool was used in conjunction with the test plan to get high levels of test quality for this design while at the same time significantly reducing the cost of that test. Section 2 explains how the at-speed clocks were generated and provided for the at-speed test patterns. Section 3 describes the logic and methodology used to compress the test data volume. The results of the compression are provided in section 4 and the paper is concluded in section 5.

Table 1. Basic information of the audio processor

Item            Value
Process         90nm CMOS
Package Types   80pin, 144pin_A, 144pin_B, 208pin
Frequency       > 200 MHz
Area (um2)      53002
Transistors     ~1,290,000
Registers       ~49,000
Scan chains     16

A Real Case of Significant Scan Test Cost Reduction
Selina Sha, Freescale Semiconductor, selina.sha@freescale.com
Bruce Swanson, Mentor Graphics Corp., bruce_swanson@mentor.com
IEEE Computer Society Annual Symposium on VLSI, 978-0-7695-3170-0/08 $25.00 © 2008 IEEE, DOI 10.1109/ISVLSI.2008.32

2. On-chip PLL clock generation for at-speed test

This design includes an on-chip phase-locked loop (PLL) for generating the high frequency functional clocks. To get a high quality at-speed test, it is better to use these on-chip clocks for test purposes instead of having the high speed test clocks come from the tester equipment. It is also a big cost savings because this approach allows for the use of less sophisticated and hence less expensive testers.

To take advantage of the on-chip functional clocks during test mode, a small piece of logic in the form of a clock chopper was added inside the design which can generate qualified scan clocks during scan test operation. The clock control circuitry for this design is shown in Figure 1. There are many other similar clock control circuits described in the literature [4][5][6][7].

Figure 1. On-chip clock generation circuitry
To test a chip at-speed, only the capture clock is required to run at-speed. In the Audio Processor the capture clock is generated by the PLL circuitry while the scan shift clock remains the same as the slower reference clock. The benefit of using the reference clock instead of the clock divided from the PLL as the shift clock is to avoid synchronization problems between the shift clock and shift-in data.

As shown in the broadside mode timing waveform of Figure 2, during scan shift cycles the external scan enable (ext_scan_en) is set high and the scan clock (scan_clk) comes from the reference clock (ref_clk). Once ext_scan_en is cleared low for the capture cycles, scan_clk comes from the clock gate which is only open when the chopper controller outputs a capture enable signal (cap_ena).

Figure 2. AC test clock waveform for broadside mode
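The clock selection described for broadside mode can be written down as a small behavioral model. The following Python sketch is only an illustration of that description, not the actual circuit of Figure 1: the function name is hypothetical, signals are modeled as 0/1 integers, and it assumes the clock gate output idles low whenever cap_ena is 0.

```python
def scan_clk(ext_scan_en: int, cap_ena: int, ref_clk: int, pll_clk: int) -> int:
    """Behavioral model of the scan clock selection in broadside mode."""
    if ext_scan_en:
        # Shift cycles: the slow reference clock drives the scan chains.
        return ref_clk
    # Capture cycles: PLL pulses pass only while the clock gate is open.
    # Assumption: the gate output idles low when cap_ena is 0.
    return pll_clk & cap_ena


# Shift: scan_clk follows ref_clk regardless of the PLL.
print(scan_clk(ext_scan_en=1, cap_ena=0, ref_clk=1, pll_clk=0))
# Capture with the gate closed: PLL pulses are blocked.
print(scan_clk(ext_scan_en=0, cap_ena=0, ref_clk=1, pll_clk=1))
# Capture with the gate open: the fast PLL pulse reaches the scan cells.
print(scan_clk(ext_scan_en=0, cap_ena=1, ref_clk=0, pll_clk=1))
```

The model deliberately ignores glitch filtering and synchronization between the two clock domains, which the real clock gate has to handle.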
The chopper controller is actually a counter triggered by the PLL output clock (pll_clk). It asserts the cap_ena signal so that the number of fast pulses configured in the capture count (cap_cnt) register passes through the clock gate. The clock gate ensures the fast pulses from the PLL are filtered without glitches. The values specified for the cap_cnt register are determined for each test pattern and these values are loaded into the register cells as part of the scan test data that is shifted into the circuit.

The example diagram shown in Figure 2 has two at-speed clock pulses for the launch and capture cycles. This is a broadside or launch-off clock type of test pattern and a sequential depth of two is sufficient to capture the vast majority of the transition faults in the design. To detect transition faults around memories in the design, more capture pulses are often necessary. The clock control circuitry described here is flexible enough to provide up to seven at-speed clock pulses for those harder to detect transition faults. Targeting those additional faults is sometimes required to reach the required test coverage goals.

The scan enable circuitry of this design includes a gated or pipelined piece of logic as shown in the bottom of Figure 1. With this circuitry, the design is able to use the launch-off shift type of transition pattern with the on-chip PLL clocks. The timing waveform for launch-off shift transition patterns for this design is shown in Figure 3.

Figure 3. AC test clock waveform for launch-off shift mode
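To make the role of the cap_cnt register concrete, the sketch below models the chopper controller as a simple pulse counter. It is a behavioral illustration under stated assumptions, not the implemented logic: it assumes cap_cnt is a 3-bit register holding the pulse count in plain binary (which agrees with the limit of seven pulses and with the 0, 1, 0 setting used for a two-pulse pattern), it assumes cap_ena opens on the first pll_clk cycle of the capture window, and the function names are hypothetical.

```python
def encode_cap_cnt(pulses: int, width: int = 3) -> list[int]:
    """Return the cap_cnt bits, MSB first, for a requested pulse count.

    Assumption: the register stores the pulse count in plain binary.
    """
    max_pulses = (1 << width) - 1
    if not 1 <= pulses <= max_pulses:
        raise ValueError(f"pulse count must be between 1 and {max_pulses}")
    return [(pulses >> bit) & 1 for bit in reversed(range(width))]


def cap_ena_window(cap_cnt_bits: list[int], pll_cycles: int) -> list[int]:
    """Return the cap_ena value for each pll_clk cycle of a capture window.

    Assumption: the enable opens on the first cycle and closes after the
    configured number of pulses; real hardware adds synchronization latency.
    """
    count = int("".join(str(bit) for bit in cap_cnt_bits), 2)
    return [1 if cycle < count else 0 for cycle in range(pll_cycles)]


bits = encode_cap_cnt(2)             # two-pulse broadside pattern
window = cap_ena_window(bits, 5)     # five pll_clk cycles in the window
print(bits, window, sum(window))
```

Because the register cells sit on a scan chain, choosing a different pulse count for a pattern only changes the bits shifted in for that pattern.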
In general, launch-off shift mode patterns have higher transition test coverage than broadside patterns because they are a simpler test for the ATPG tool to create. However, much of the additional coverage is due to testing non-functional faults [8]. The coverage is similar to broadside once false and multicycle path analysis is performed [9][10]. To get a higher test coverage for at-speed test with fewer patterns, the internal scan enable was implemented for this design so that both broadside and launch-off shift transition patterns could be used. The scan mode select is cap_mode; it is configured per pattern and its value is loaded while the other scan pattern data is shifted in. When this register is set to 1, launch-off shift mode is enabled. Table 2 shows a comparison of the pattern count required by the two types of transition patterns to reach 80% test coverage for this design.
To get the correct values loaded into the capture counter and capture mode cells, the ATPG tool provides a method to specify those values when those cells are part of the scan chains. The way to accomplish this is to use condition statements within the named capture procedures. Named capture procedures are user defined and tell the ATPG tool how the clock control circuitry around the PLL logic works. An example is illustrated in Table 3. Within the named capture procedure are two modes: the external mode describes the external clocks and controls from the primary inputs, and the internal mode describes the clocks and controls on the internal chip side of the clock control logic. The example shown in Table 3 creates a 2 clock pulse broadside type of pattern because of the values specified in the condition statements.

Multiple named capture procedures can be created and used. To test for at-speed faults around the memories of this design, a minimum of 5 cycles is required, so named capture procedures were written to accomplish that. Another way to test around memories is to use multiple-load patterns. To create test patterns more efficiently, the general pattern creation flow was to first run the ATPG tool with the 2 at-speed clock pulse named capture procedures on the whole fault list. Then, to target the remaining faults and get the highest possible test coverage, the 6 at-speed clock pulse named capture procedure was turned on.

Since the use of the on-chip PLL clocks for at-speed test worked so well, this design can be tested with a tester that supplies clocks of 50 MHz or less instead of requiring a 200 MHz tester. This reduced the cost of the test equipment substantially.

Table 2. AC pattern volume for broadside vs. launch-off shift

Item            Broadside   Launch-off shift
pattern count   5856        2816
test coverage   80%         80%

Table 3. Condition statement usage

    set time scale 1 ps;

    timeplate cap_ext =
        force_pi 0;
        measure_po 8000;
        pulse extal 10000 20000;
        period 40000;
    end;

    timeplate cap_int =
        force_pi 0;
        pulse pll_clk 1250 2500;
        period 5000;
    end;

    procedure capture cap_broadside_dep2 =
        condition /dsp_top/ac_config/cap_mode/Q 0;
        condition /dsp_top/ac_config/cap_cnt_2_/Q 0;
        condition /dsp_top/ac_config/cap_cnt_1_/Q 1;
        condition /dsp_top/ac_config/cap_cnt_0_/Q 0;
        mode external =
            timeplate cap_ext;
            cycle =
                force_pi;
                force ref_clk 0;
                force reset_b 1;
                force ext_scan_en 0;
                pulse ref_clk;
            end;
            cycle = pulse ref_clk; end;
            cycle = pulse ref_clk; end;
            cycle = pulse ref_clk; end;
        end;
        mode internal =
            timeplate cap_int;
            cycle =
                force_pi;
                force ref_clk 0;
                force reset_b 1;
                force ext_scan_en 0;
                force int_scan_en 0;
                force scan_clk 0;
            end;
            ......
            cycle = pulse pll_clk; end;
            cycle = pulse pll_clk; end;
            ......
    end;

3. On-chip test pattern compression

The Audio Processor will be taped out once but the die will be packaged into 4 different products. So only the common pins that are present in all the types of packages can be used as scan control and data channels. 16 scan chains are all that are available. As seen in Table 1, almost 49,000 registers share these 16 scan chains, so there are more than 3000 registers on a single scan chain in a standard ATPG scan setup. With standard ATPG, it takes 1176 test patterns to achieve 96% stuck-at test coverage. This means that 3.81Mbit of data storage is required for each scan chain input/output pin on the tester. About 152.2ms is required to execute this test set if the clock shift rate is at 25MHz. But that's only for the stuck-at faults! The patterns for transition faults can easily be 3X (3 times) larger than that.

Embedded Deterministic Test (EDT) is a non-intrusive DFT technology for reducing test data volume and test time dramatically [11]. EDT accomplishes this reduction by applying a patented type of compression during deterministic test pattern generation.
EDT also requires a small amount of logic on the chip that resides only in the scan path between the scan channel interface and the internal scan chains. This on-chip logic receives the compressed test pattern data from the tester and feeds the scan chains. Then it compresses the captured responses on the output side before sending that data back to the tester to compare against the expected results.

The benefits of using EDT on the design are twofold. It reduces the volume of test data that is required for the tester memory, allowing the use of less expensive testers. Secondly, it also shortens the test application time and so higher tester throughput is possible than with traditional ATPG [12].

The way that it shortens the test time is by configuring the internal scan chains differently than in standard scan ATPG. With EDT logic on-chip, the scan chains are re-configured to be much shorter in length. This creates many more scan chains, but that is not a problem since they only interface to the decompressor and compactor logic and not directly to the scan channel pins on the I/O. The tester still sees the design as having only 16 scan chains/channels, but each is much shorter in length and the test patterns are loaded/unloaded much faster. Figure 4 shows a diagram of how the EDT logic and scan chains were configured for the Audio Processor design. It used 600 short internal scan chains.

Figure 4. Scan configuration with compression logic
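The reason chain length dominates both tester memory and test time can be checked with simple arithmetic. The Python sketch below is a first-order estimate only: it uses the 1176 stuck-at patterns quoted for standard ATPG and the 3240 scan cells per chain listed in Table 4, and it assumes one tester bit and one shift clock per scan cell per pattern, with capture cycles and other overhead ignored. The function names are hypothetical.

```python
def per_pin_bits(patterns: int, cells_per_chain: int) -> int:
    """Tester bits stored per scan pin: one bit per cell for every pattern."""
    return patterns * cells_per_chain


def shift_time_ms(bits: int, shift_hz: float) -> float:
    """Time to shift the given number of bits through one pin, in ms."""
    return bits / shift_hz * 1e3


# Standard ATPG (EDT bypassed): 1176 patterns on 3240-cell chains at 25 MHz.
bits = per_pin_bits(patterns=1176, cells_per_chain=3240)
print(f"{bits / 1e6:.2f} Mbit per pin")
print(f"{shift_time_ms(bits, 25e6):.1f} ms of shifting")
```

Under these assumptions the estimate reproduces the 3.81 Mbit per pin figure and lands close to, though not exactly on, the reported test time; the small difference presumably comes from overhead that this sketch does not model. Since both quantities scale linearly with chain length, shortening the chains reduces memory and time per pattern in the same proportion.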
As mentioned before, only 16 scan input/outputs are available in this design. 15 are used for EDT scan channels. The last one is used for the at-speed scan clock configuration, which was discussed in Section 2. This is optional because these scan cells can be made part of the regular scan chains.

The EDT logic can also include optional bypass logic to configure the many short scan chains back into 16 long scan chains. This used to be done for failing part diagnosis, but the ATPG provider can now diagnose compressed EDT test patterns as well.

EDT requires 3 pins for controlling the logic. They are edt_clock, edt_update and edt_bypass. They can be shared with functional pins. The normal sequence for these signals is shown in Figure 5 and described below:

- In the load_unload cycle, the edt_clock pulses with edt_update asserted to clear the old stored data in the EDT module.
- In shift cycles, edt_clock pulses with edt_update cleared. The test pattern data is shifted into the EDT decompressor via the scan-in channels. The decompressor calculates the data which is distributed to all the short scan chains. The captured output data from the previous pattern is compressed via the compactor and shifted out from the scan output channels.
- In capture cycle(s), edt_clock is held at a constrained 0, and the values on edt_bypass and edt_update are don't-cares.

Creating the EDT logic for the design is quite easy. In fact, there are three standard EDT flows that are supported: the external flow, the internal flow, and the skeleton flow.

Figure 5. Basic EDT signals waveform
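The per-pattern sequence of the EDT control signals can be summarized in a small table-driven model. This Python sketch only restates the three phases described above and is not a tool-generated test procedure: the names are hypothetical, "pulse" and "X" are just labels for a clock pulse and a don't-care, and holding edt_bypass at 0 during load_unload and shift is an assumption for compressed mode that the text does not state.

```python
# Control values per phase, as described for the EDT signals.
# Assumption: edt_bypass is held at 0 outside capture (compressed mode).
CONTROL_VALUES = {
    "load_unload": {"edt_clock": "pulse", "edt_update": 1, "edt_bypass": 0},
    "shift": {"edt_clock": "pulse", "edt_update": 0, "edt_bypass": 0},
    "capture": {"edt_clock": 0, "edt_update": "X", "edt_bypass": "X"},
}


def pattern_phases(shift_cycles: int, capture_cycles: int) -> list[str]:
    """Phase of every tester cycle for one compressed pattern."""
    return (["load_unload"]
            + ["shift"] * shift_cycles
            + ["capture"] * capture_cycles)


def control_trace(shift_cycles: int, capture_cycles: int) -> list[tuple[str, dict]]:
    """Pair each cycle's phase with the control values applied in it."""
    return [(phase, CONTROL_VALUES[phase])
            for phase in pattern_phases(shift_cycles, capture_cycles)]


# Toy example: 3 shift cycles and 2 capture cycles.
for phase, values in control_trace(shift_cycles=3, capture_cycles=2):
    print(f"{phase:12s} {values}")
```

In the real design the number of shift cycles per pattern is set by the length of the short internal chains, and the number of capture cycles by the named capture procedure in use.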
The internal and external flows both need the core design to be at the netlist level. For the external flow, the basic netlist core is without any I/O pads or boundary scan logic. Those elements are inserted after the EDT logic is created and reside at the same level of hierarchy. If the design netlist already includes I/O pads and boundary scan logic, the internal flow is used when creating the EDT logic. This will include a wrapper with the EDT logic and core inside.

Since the Audio Processor is a new design project, the core RTL design was still changing when the DFT work needed to begin. Because of this, the skeleton flow was used. This flow allows the use of a skeleton netlist as input to the EDT logic creation step as shown in Figure 6. The skeleton netlist only contains the basic design information such as the number of scan chains and their clock domains. With this information, the EDT tool is able to create the necessary EDT logic at the RTL level.

Figure 6. Skeleton DFT flow with EDT

The design team integrated this EDT logic into the design before synthesis and scan insertion as shown in the diagram. Compared to the core logic, the addition of the EDT logic was less than 1%. The use of the skeleton flow was a big advantage because the DFT work was started before the first netlist was ready. In addition, if the core netlist changes, it is not necessary to re-generate the EDT logic. Since netlist changes are common in the design phase, this was another considerable cost savings to the project.

4. Test pattern compression results

The test compression results obtained were quite good. Table 4 shows the stuck-at test pattern volume comparison for the design with and without the EDT implemented. At 96% test coverage, 3.81Mbits are required on the tester for each scan input/output for data storage if the EDT logic is bypassed. With the EDT logic in place, this tester memory per pin requirement drops to 0.20Mbits.
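The compression factors follow directly from the per-pin tester memory figures this section reports: 3.81 versus 0.20 Mbit for the stuck-at patterns and 11.62 versus 0.52 Mbit for the broadside transition patterns. The Python sketch below simply redoes that arithmetic; the function and variable names are hypothetical and the inputs are the rounded values as published, so the results are approximate.

```python
def compression_ratio(bypass_mbit: float, edt_mbit: float) -> float:
    """Per-pin tester memory without EDT divided by the memory with EDT."""
    return bypass_mbit / edt_mbit


# Per-pin tester memory in Mbit: (EDT bypassed, EDT enabled).
stuck_at = (3.81, 0.20)
transition = (11.62, 0.52)

print(f"stuck-at:   ~{compression_ratio(*stuck_at):.0f}X")
print(f"transition: ~{compression_ratio(*transition):.0f}X")

# Combined memory with EDT as a fraction of the combined bypass memory.
combined = (stuck_at[1] + transition[1]) / (stuck_at[0] + transition[0])
print(f"combined:   {combined:.1%} of the bypass memory")
```

With these inputs the combined requirement comes out a little under 5% of the bypass-mode memory.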
NOTE: The coverage shown is not especially high because, to keep the comparison simple, we did not stress the tool or try all possible configurations. We can get > 98% stuck-at coverage if we do so.

The compression obtained for broadside transition test patterns was slightly higher; the results are shown in Table 5. At 80% test coverage, 11.62Mbits are required for each scan input/output for data storage in bypass mode while only 0.52Mbits are required with the EDT logic in place.

Table 4. DC pattern volume comparison

Item                           Without EDT   With EDT
scan chains                    16            601
scan cells per chain           3240          82
pattern count                  1176          2278
tester memory needed per pin   3.81M         0.20M
pattern volume                 61.00M        3.21M
test time (25MHz shift)        152.50ms      8.02ms
test coverage                  96.02%        96.02%
compression                    ~ 19 X

Table 5. AC pattern volume comparison

Item                           Without EDT   With EDT
scan chains                    16            601
scan cells per chain           3240          82
pattern count                  3584          5856
tester memory needed per pin   11.62M        0.52M
pattern volume                 185.91M       8.25M
test time (25MHz shift)        464.77ms      20.61ms
test coverage                  80%           80%
compression                    ~ 22 X

NOTE: The pattern counts shown are both calculated for broadside mode.

NOTE: The coverage shown is not especially high because, to keep the comparison simple, we did not stress the tool or try all possible configurations. We can get > 89% transition coverage if we do so.

By combining the stuck-at and transition test patterns with the on-chip compression logic, the amount of required tester memory drops to just 5% of the memory that would be required without the on-chip compression.

5. Conclusions

The Audio Processor utilized several DFT techniques to reduce test cost significantly.
The first methodology was to add a small amount of logic to enable the use of the on-chip PLL clocks for at-speed test purposes. Since the shift clock frequency was just 25MHz during test, we were able to use a less sophisticated, and much less expensive tester while still testing the design at its 200MHz operational frequency. Considering that a 1000MHz Pinscale tester is almost double the cost of a 100MHz J750 tester, the test cost is cut in half by using this technique.

Another big test cost savings comes from the compression of the test patterns which require much less tester memory. For less than 1% logic area added to the design for EDT logic, both the stuck-at and transition test patterns take up much less tester memory as shown in the previous section. Since the design has over half of the total area consumed by memory, the EDT logic amount drops to under 0.5% of total chip area.

Finally, the saying that time is money is really true, especially on the manufacturing test floor. Figure 7 shows a comparison of the amount of tester time required to test this design with the bypass (standard ATPG) and the compressed test patterns using the EDT logic.

Figure 7. Scan pattern test time comparison

The times in milliseconds to run the at-speed test patterns are shown on the left and the stuck-at patterns on the right. By using the EDT logic and test patterns, we were able to dramatically reduce the test time per device and increase the throughput of the production test line.

6. References

[1] J. Saxena, et al., Scan-Based Transition Fault Testing - Implementation and Low Cost Test Challenges, Proc. International Test Conference, 2002, pp. 1120-1129.
[2] X. Lin, et al., High-frequency, At-speed Scan Testing, IEEE Design & Test of Computers, Sept.-Oct. 2003, pp. 17-25.
[3] B. R. Benware, R. Madge, C. Lu, R. Daasch, Effectiveness Comparisons of Outlier Screening Methods for Frequency Dependent Defects on Complex ASICs, Proc. IEEE VLSI Test Symposium, 2003, pp. 39-46.
[4] N. Tendolkar, et al., Novel techniques for achieving high at-speed transition fault test coverage for Motorola's microprocessors based on PowerPC instruction set architecture, Proc. IEEE VLSI Test Symposium, 2002, pp. 3-8.
[5] J. Boyer, R. Press, Easily Implement PLL Clock Switching for At-Speed Test, Chip Design Magazine, Feb.-March 2006.
[6] M. Beck, et al., Logic design for on-chip test clock generation - implementation details and impact on delay, Proc. Design, Automation and Test in Europe, 2005, pp. 56-61.
[7] H. Nakamura, et al., Low Cost Delay Testing of Nanometer SoCs Using On-Chip Clocking and Test Compression, Proc. IEEE Asian Test Symposium, 2005, pp. 156-161.
[8] K. S. Kim, S. Mitra, P. Ryan, Delay Defect Characteristics and Testing Strategies, IEEE Design & Test of Computers, Sept.-Oct. 2003, pp. 8-16.
[9] V. Vorisek, B. Swanson, K.-H. Tsai, D. Goswami, Improving Handling of False and Multicycle Paths in ATPG, Proc. IEEE VLSI Test Symposium, 2006, pp. 160-165.
[10] D. Goswami, et al., At-Speed Testing with Timing Exceptions and Constraints - Case Studies, Proc. IEEE Asian Test Symposium, 2006, pp. 153-159.
[11] J. Rajski, et al., Embedded Deterministic Test for Low-Cost Manufacturing Test, Proc. International Test Conference, 2002, pp. 1120-1129.
[12] F. Poehl, et al., Industrial experience with adoption of EDT for low-cost test without concessions, Proc. International Test Conference, 2003, pp. 1211-1220.