DFT - CLK - Mux and DFT - CLK - Chain Data Sheet
ABSTRACT
DFT Compiler adds DFT_clk_mux and DFT_clk_chain components to the design when insert_dft is run with the set_dft_configuration -clock_controller enable setting. These components are not documented in the DFT Compiler Scan User Guide. This data sheet documents the architecture and operation of these components and provides a checklist for users concerned about the components' impact on their design.

This document describes the implementation instantiated by the F-2011.09 release. The most recent changes were:
In the D-2010.03-SP2 release, an option was added to use clock gating latches.
In the E-2010.12 release, the hierarchy of the new blocks was flattened during insert_dft.

Note: The PLL controller that is included with DFT Compiler is an example that is not guaranteed to be appropriate for use in your design. If you decide to use this design, you are responsible for validating that this functionality works in the context of your design.
SYNOPSYS CONFIDENTIAL
DFT_clk_mux
1 System Overview
The DFT_clk_mux and DFT_clk_chain are inserted as two separate modules in the top level of the design, but they always function together as a unit. The DFT_clk_mux is inserted between the OCC (On-Chip Clocking) clock generator, usually a PLL (Phase-Locked Loop), and its clock tree to provide control over the clock for scan shifting and capture. The DFT_clk_chain contains data to control the capture operation of the DFT_clk_mux.

These blocks are kept separate because the flip-flops inside DFT_clk_mux must be nonscan to allow them to switch clock sources correctly, but the flip-flops inside DFT_clk_chain must be on the scan chains so that the capture pulses can be controlled by ATPG.

The purpose of these blocks is to allow ATPG to specify capture sequences consisting of a fixed number of pulses from a PLL, which may be running asynchronously to the primary inputs controlled by the ATE. The scan shift operation takes place under direct ATE control, and switching between the different clock sources is done glitchlessly.

The fast sequential ATPG engine in TetraMAX specifies capture sequences with a maximum of 10 cycles, so it is not meaningful to create DFT_clk_mux blocks capable of emitting more pulses, although doing so is legal and the IP block still works in this case.
1.1 Schematics
These schematics correspond to the connections made automatically by the insert_dft command for a specification with two PLL clocks and a maximum of two clock pulses per capture cycle. If more clock pulses are selected, the DFT_clk_chain becomes longer, and the counter and decoder become larger. Note that the logic is shown generically, and may appear different after synthesis. In Figure 1, the DFT_clk_mux is shown as it would be instantiated in the design. Before the insert_dft command is run, the PLL is connected to the Clock Drivers, and the Clock Trees and Scan Flops must already exist. These are not changed by insert_dft (besides adding the Scan Enable and serial scan connections), but the DFT_clk_mux is inserted at the output of the PLL with DFT_clk_chain controlling it. The circuitry inside DFT_clk_mux is shown in separate figures for clarity. The hierarchy inside it is flattened during insert_dft.
Figure 1. DFT_clk_mux & DFT_clk_chain in the design. The contents of the dashed boxes are shown in the following figures.
(Diagram labels: DFT_clk_chain with four scan flip-flops between test_siN and test_soN, outputs [0] to [3]; PLL; CLKA [3:2]; DFT_clk_mux; Clock Drivers; Clock Trees; Scan Flops.)
Figure 2. Contents of the Fast Pulse Controller block from Figure 1. The instance names of the clock domain crossing synchronization flip-flops are shown as they appear before running the change_names command, and are for the first DFT_clk_mux to be inserted. For subsequent instances, increment the first 0.
(Diagram labels: \U_clk_control_i_0/load_n_meta_1_l_reg; \U_clk_control_i_0/load_n_meta_2_l_reg; load_n (load 0); Q[1:0]; Decoder: 2-to-4.)
Figure 3. Contents of the Clock Selection Circuit block from Figure 1, using the default (false) of test_occ_insert_clock_gating_cells. Clock paths are shown in red.
(Diagram labels: test_se; slow_clk_enable; slow_clk; clk; pipeline_or_tree.)
Figure 4. Contents of the Clock Selection Circuit block from Figure 1, using set test_occ_insert_clock_gating_cells true. The inner dashed boxes show logic that can be replaced by integrated clock gating cells using the test_icg_p_ref_for_dft variable. Clock paths are shown in red.
(Diagram labels: test_se; slow_clk_enable; slow_clk; clk; pipeline_or_tree; latches with GN gate pins.)
2 DFT_clk_mux
2.1 Naming Convention
The module is instantiated under this name:

    <string>_DFT_clk_mux_<number>

where
    <string> is the current_design during the insert_dft run
    <number> is the uniquification number of the controller, starting from 0
2.2 Ports
Port Name         Direction   Function
reset             Input       1 to reset controller, 0 to allow controller to operate
test_mode         Input       1 to control clock, 0 to select fast_clk unconditionally
pll_bypass        Input       1 to select slow_clk, 0 to allow clock switch-over operations
scan_en           Input       Mediates clock switch-over operation
clk_enable[m:0]   Input       Capture pulse control from clock chain
fast_clk[n:0]     Input       Fast clock from PLL
slow_clk          Input       ATE clock
clk[n:0]          Output      Output clock to scan flip-flops
Table 1. DFT_clk_mux I/O ports
The widths of the buses are determined by the options of the set_dft_clk_controller command:
clk and fast_clk are as wide as the number of elements in the -pllclocks list.
clk_enable is as wide as the number of elements in the -pllclocks list times the argument of the -cycles_per_clock option.
When the bus width would be 1, a scalar port of the same name is used instead.
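The width rules above can be sketched as a small calculation. This is an illustrative Python helper, not part of DFT Compiler: the function names and the example pin names are hypothetical, and the inputs simply stand in for the -pllclocks list and the -cycles_per_clock value.

```python
def clk_mux_bus_widths(pll_clocks, cycles_per_clock):
    """Return DFT_clk_mux port widths following the rules stated in the text.

    pll_clocks: list standing in for the -pllclocks elements (hypothetical input).
    cycles_per_clock: value standing in for the -cycles_per_clock argument.
    """
    n_clocks = len(pll_clocks)
    return {
        "clk": n_clocks,
        "fast_clk": n_clocks,
        "clk_enable": n_clocks * cycles_per_clock,
    }


def port_decl(name, width):
    """Format a port as a scalar when its width is 1, otherwise as a bus."""
    return name if width == 1 else f"{name}[{width - 1}:0]"


# Example with two made-up PLL hookup pins and two capture pulses per clock.
widths = clk_mux_bus_widths(["pll/CLKA", "pll/CLKB"], cycles_per_clock=2)
print([port_decl(name, width) for name, width in widths.items()])
```

With a single PLL clock and -cycles_per_clock 1, every width is 1, so all three ports would appear as scalars.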
2.3 Connections
As instantiated by insert_dft, the DFT_clk_mux ports are connected as follows:

Port Name     Type            Default Name
reset         Primary Input   pll_reset
test_mode     Primary Input   test_mode
pll_bypass    Primary Input   pll_bypass
scan_en       Primary Input   test_se
clk_enable    Internal        DFT_clk_chain(clk_ctrl_data)
fast_clk      Internal        -pllclocks hookup pin (last element in list is bit 0)
slow_clk      Primary Input   -ateclocks argument
clk           Internal        -pllclocks destination
The remaining inputs control the dynamic selection of the two clocks. When used properly, they ensure that switching between the clocks is done glitchlessly. A clock is deselected on its own falling edge, and the clk output is then held low until the new clock is selected on its own falling edge. This ensures glitchless operation and full pulse widths.

reset is only used for initialization. In the test protocol, it pulses to 1 and then stays at 0 for the remainder of the test. When reset goes back to 0, the sequence of operations is:
If scan_en is 1, one slow_clk pulse is required and then slow_clk is selected.
If scan_en is 0, the next fast_clk pulse starts a capture pulse sequence.

Pulsing reset to 1 after initialization is improper use, and will result in the clk output immediately going to 0.
DFT_clk_mux can reset itself even without the reset pulse. Set scan_en to 1 and wait for one fast_clk pulse followed by one slow_clk pulse (which selects the slow_clk input). After five more fast_clk pulses, the block is ready to go through a capture sequence.

clk_enable is a bus connected to the clk_ctrl_data output of a DFT_clk_chain block. This bus is loaded during the scan shift operation. Changing this input while scan_en is low is improper use and can result in unpredictable glitching on the clk output. Each bit enables a pulse on an output clk signal at a particular clock cycle count of its corresponding fast_clk input. A value of 1 represents a pulse and a value of 0 represents no pulse. The grouping is first by output clock and second by count. For example, if set_dft_clk_controller has three elements in its -pllclocks list and a -cycles_per_clock argument of 2:
clk_enable[0] enables a pulse on count 1 on clk[0]
clk_enable[1] enables a pulse on count 2 on clk[0]
clk_enable[2] enables a pulse on count 1 on clk[1]
clk_enable[3] enables a pulse on count 2 on clk[1]
clk_enable[4] enables a pulse on count 1 on clk[2]
clk_enable[5] enables a pulse on count 2 on clk[2]

scan_en is connected to the scan enable signal used by the internal scan chains. It works as follows:
When scan_en goes high, slow_clk is selected following its first falling edge. Every transition on slow_clk is passed through to the clk output.
When scan_en goes low, the signal is resynchronized from the slow clock domain (captured by a single flip-flop in the clock selection block) to the fast clock domain (resynchronized by three successive synchronizer flip-flops in the fast pulse controller block).
Once the low scan enable signal has been resynchronized, a counting sequence from 0 to N+1 is initiated by the fast pulse controller, according to the -cycles_per_clock N argument. Cycles 0 and N+1 are quiet, while cycles 1 through N selectively issue fast clock pulses depending on the values loaded into the clock chain.
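The clk_enable bit ordering and the counting sequence described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the function names are hypothetical, the index formula is inferred from the three-clock example in the text, and the model reports at which counts a pulse is enabled without modeling the exact cycle in which the pulse appears on clk.

```python
def clk_enable_index(clock_index, count, cycles_per_clock):
    """Bit of clk_enable that enables a pulse on clk[clock_index] at `count`.

    count runs from 1 to cycles_per_clock. Bits are grouped first by
    output clock and second by count, as in the example in the text.
    """
    if not 1 <= count <= cycles_per_clock:
        raise ValueError("count must be in 1..cycles_per_clock")
    return clock_index * cycles_per_clock + (count - 1)


def capture_schedule(clk_enable_bits, clock_index, cycles_per_clock):
    """For counts 0..N+1, list whether a pulse is enabled on clk[clock_index].

    clk_enable_bits[i] holds the value (0 or 1) of clk_enable[i].
    Counts 0 and N+1 are always quiet; counts 1..N follow clk_enable.
    """
    schedule = [False]  # count 0 is quiet
    for count in range(1, cycles_per_clock + 1):
        index = clk_enable_index(clock_index, count, cycles_per_clock)
        schedule.append(bool(clk_enable_bits[index]))
    schedule.append(False)  # count N+1 is quiet (terminal)
    return schedule


# Three PLL clocks, two cycles per clock: clk_enable[5:0], listed from bit 0.
bits = [1, 0, 1, 1, 0, 1]
for clock in range(3):
    print(clock, capture_schedule(bits, clock, cycles_per_clock=2))
```

In this example, clk[0] is enabled only at count 1, clk[1] at counts 1 and 2, and clk[2] only at count 2.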
If the OCC controller is used with a pipelined scan-enable signal, additional steps are needed to ensure correct operation. For more information, see On-Chip Clocking Support in the DFT Compiler Scan User Guide. Figures 5 and 6 show the behaviors in a case with set_dft_clk_controller -cycles_per_clock 2:
Figure 5. (Timing diagram of fast_clk, slow_clk, scan_en, and clk. Annotations: 3 synchronization cycles; Count = 0 (no pulse); Count = 1 (pulse next cycle if enabled); Count = 2 (pulse next cycle if enabled); Count = 3 (terminal); scan_en falling deselects slow_clk asynchronously; scan_en rising takes effect on next falling clock edges.)
Figure 6. (Timing diagram of fast_clk, slow_clk, scan_en, and clk. Annotations: 3 synchronization cycles; Count = 0 (no pulse); Count = 1 (pulse 2nd following cycle if enabled); Count = 2 (pulse 2nd following cycle if enabled); Count = 3 (terminal).)
The dotted arrows show data setup relationships to their corresponding clock edges. scan_en must be synchronized to slow_clk and it must change while slow_clk is low to avoid truncating its pulse on clk. No synchronization with fast_clk is assumed and clock domain crossing synchronization logic is provided.

Minimum widths are required for both the high and low pulses of scan_en:

The scan_en low pulse must encompass a slow_clk pulse followed by a number of fast_clk pulses equal to the -cycles_per_clock argument plus five (three synchronization cycles plus two extra counter cycles). Failure to meet this requirement will cause a failure during pattern simulation. Capture pulses will be skipped, but no glitching will occur and the following scan operation will work correctly. If needed, increase the duration of the scan_en low pulse by using the set_atpg
clock cycles that the signal is held low. You can calculate this value using the waveform diagrams, the period of the slow clock, and the largest period across all fast clocks. If the clock pulses have considerable propagation delay to the scan flip-flops, you can also use the -min_ateclock_cycles option to add delay to the low scan_en pulse so that the clock pulses reach their destination before the rising scan enable transition. There is no maximum scan_en low pulse width.

The scan_en high pulse must encompass a slow_clk pulse followed by five fast_clk pulses. Failure to meet this requirement may cause all capture pulses in the next capture cycle to be skipped. There is no maximum scan_en high pulse width.
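As a rough starting point for the calculation described above, the minimum pulse widths can be estimated from the clock periods. The sketch below is an assumption-laden approximation, not a tool-provided formula: it treats "encompassing a pulse" as one full clock period, uses the largest fast clock period for every fast_clk pulse, and ignores propagation delay. The function names are hypothetical, and the waveform diagrams remain the reference.

```python
import math

# Three synchronization cycles plus two extra counter cycles, per the text.
LOW_OVERHEAD_FAST_PULSES = 5
# The high pulse must cover a slow_clk pulse followed by five fast_clk pulses.
HIGH_FAST_PULSES = 5


def min_scan_en_low_time(slow_period, fast_periods, cycles_per_clock):
    """Estimate the minimum scan_en low time (same unit as the periods)."""
    fast_pulses = cycles_per_clock + LOW_OVERHEAD_FAST_PULSES
    return slow_period + fast_pulses * max(fast_periods)


def min_scan_en_high_time(slow_period, fast_periods):
    """Estimate the minimum scan_en high time (same unit as the periods)."""
    return slow_period + HIGH_FAST_PULSES * max(fast_periods)


def slow_cycles_needed(duration, slow_period):
    """Round a duration up to a whole number of slow clock periods."""
    return math.ceil(duration / slow_period)


# Example: 100 ns slow clock, fast clocks of 2 ns and 4 ns, two capture cycles.
low = min_scan_en_low_time(slow_period=100, fast_periods=[2, 4], cycles_per_clock=2)
high = min_scan_en_high_time(slow_period=100, fast_periods=[2, 4])
print(low, slow_cycles_needed(low, 100), high)
```

Any additional margin for clock propagation delay to the scan flip-flops would be added on top of these estimates.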
configuration file attribute. See the VCS User Guide for details.

4. Static timing analysis requires a special setup to enable the required clock gating checks. This setup is described in SolvNet article 022490, titled Static Timing Analysis Constraints for On-Chip Clocking Support.

5. Clock Tree Synthesis (CTS) can cause timing problems if it is not set up properly. If CTS is allowed to balance the clock skew to the flip-flops inside DFT_clk_mux to the same value as the flip-flops on the endpoints of the clock tree, then the clock output of DFT_clk_mux may include glitches or shortened clock pulses. This is because the DFT_clk_mux flip-flops gate the clock before it has gone through the clock tree's delay. The solution is to skew the clock to the DFT_clk_mux flip-flops so that it arrives earlier than the clock going to other destinations of the same clock. In IC Compiler, this can be done using the set_clock_tree_exceptions -float_pins command. See the IC Compiler documentation for details.

Note that the clock for DFT_clk_chain can use a clock balanced to the functional flip-flops on endpoints of the clock tree. Its flip-flops are on the scan chains with the functional flip-flops, and its outputs to DFT_clk_mux are ignored during shift but stable during the capture cycle, so they do not have to meet single-cycle timing requirements on those paths.
3 DFT_clk_chain
This section describes the use of the DFT_clk_chain block with regular scan and scan compression.
3.2 Ports
Port Name            Direction   Function
clk                  Input       Falling edge clock
se                   Input       1 to shift scan chains, 0 to hold previous data
si[n:0]              Input       Scan inputs
so[n:0]              Output      Scan outputs
clk_ctrl_data[m:0]   Output      Parallel output data
Table 4. DFT_clk_chain I/O ports
The widths of the buses are determined by the options of the set_dft_clk_controller command:
si and so are as wide as the argument of -chain_count.
clk_ctrl_data is as wide as the number of elements in the -pllclocks list times the argument of the -cycles_per_clock option.
When the bus width would be 1, a scalar port of the same name is used instead.
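The port functions in Table 4 (shift when se is 1, hold when se is 0, parallel data on clk_ctrl_data) can be illustrated with a small behavioral model. This is a hypothetical Python sketch, not the synthesized logic: it assumes a single scan chain, and it assumes that data shifts from si into bit 0 and out of the highest bit to so, which the text does not state.

```python
class ClkChainModel:
    """Toy model of a single-chain DFT_clk_chain (assumed bit order: si -> bit 0)."""

    def __init__(self, length):
        # length stands in for the clk_ctrl_data width
        # (number of -pllclocks elements times -cycles_per_clock).
        self.bits = [0] * length

    def falling_edge(self, se, si):
        """Apply one falling clk edge: shift if se is 1, otherwise hold."""
        if se:
            self.bits = [si] + self.bits[:-1]

    @property
    def so(self):
        """Scan output, taken from the last flip-flop in the chain."""
        return self.bits[-1]

    @property
    def clk_ctrl_data(self):
        """Parallel output data presented to DFT_clk_mux(clk_enable)."""
        return list(self.bits)


# Two PLL clocks with two cycles per clock give a four-bit chain.
chain = ClkChainModel(length=4)
for value in [1, 0, 1, 1]:
    chain.falling_edge(se=1, si=value)   # scan shift loads the chain
chain.falling_edge(se=0, si=0)           # capture: se low, contents are held
print(chain.clk_ctrl_data, chain.so)
```

The hold behavior is what keeps clk_ctrl_data stable while scan_en is low, which matches the requirement that clk_enable must not change during capture.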
3.3 Connections
As instantiated by the insert_dft command, the DFT_clk_chain ports are connected as follows:

Port Name       Type             Default Name
clk             Internal         DFT_clk_mux(clk[max])
se              Primary Input    test_se
si              Primary Input    test_si
so              Primary Output   test_so
clk_ctrl_data   Internal         DFT_clk_mux(clk_enable)