Informatica Guide
PREPARED BY:
Ammar Hasan
CONTENTS
CHAPTER 1: TOOL KNOWLEDGE
1.1 Informatica PowerCenter
1.2 Product Overview
1.2.1 PowerCenter Domain
1.2.2 Administration Console
1.2.3 PowerCenter Repository
1.2.4 PowerCenter Client
1.2.5 Repository Service
1.2.6 Integration Service
1.2.7 Web Services Hub
1.2.8 Data Analyzer
1.2.9 Metadata Manager
CHAPTER 2: REPOSITORY MANAGER
2.1 Adding a Repository to the Navigator
2.2 Configuring a Domain Connection
2.3 Connecting to a Repository
2.4 Viewing Object Dependencies
2.5 Validating Multiple Objects
2.6 Comparing Repository Objects
2.7 Truncating Workflow and Session Log Entries
2.8 Managing User Connections and Locks
2.9 Managing Users and Groups
CHAPTER 3: DESIGNER
3.1 Source Analyzer
3.1.1 Working with Relational Sources
3.1.2 Working with Flat Files
3.2 Target Designer
3.3 Mappings
3.4 Transformations
3.4.1 Working with Ports
3.4.2 Using Default Values for Ports
3.4.3 User-Defined Default Values
3.5 Tracing Levels
3.6 Basic First Mapping
3.7 Expression Transformation
3.8 Filter Transformation
3.9 Router Transformation
3.10 Union Transformation
3.11 Sorter Transformation
3.12 Rank Transformation
3.13 Aggregator Transformation
3.14 Joiner Transformation
3.15 Source Qualifier
3.16 Lookup Transformation
3.16.1 Lookup Types
3.16.2 Lookup Transformation Components
3.16.3 Connected Lookup Transformation
3.16.4 Unconnected Lookup Transformation
3.16.5 Lookup Cache Types: Dynamic, Static, Persistent, Shared
3.17 Update Strategy
3.18 Dynamic Lookup Cache Use
3.19 Lookup Query
3.20 Lookup and Update Strategy Examples: Example to Insert and Update without a Primary Key; Example to Insert and Delete based on a condition
3.21 Stored Procedure Transformation
3.21.1 Connected Stored Procedure Transformation
3.21.2 Unconnected Stored Procedure Transformation
3.22 Sequence Generator Transformation
3.23 Mapplets: Mapplet Input and Mapplet Output Transformations
3.24 Normalizer Transformation
3.25 XML Sources Import and Usage
3.26 Mapping Wizards
3.26.1 Getting Started
3.26.2 Slowly Changing Dimensions
3.27 Mapping Parameters and Variables
3.28 Parameter File
3.29 Indirect Flat File Loading
CHAPTER 4: WORKFLOW MANAGER
4.1 Informatica Architecture
4.1.1 Integration Service Process
4.1.2 Load Balancer
4.1.3 DTM Process
4.1.4 Processing Threads
4.1.5 Code Pages and Data Movement
4.1.6 Output Files and Caches
4.2 Working with Workflows
4.2.1 Assigning an Integration Service
4.2.2 Working with Links
4.2.3 Workflow Variables
4.2.4 Session Parameters
4.3 Working with Tasks
4.3.1 Session Task
4.3.2 Email Task
4.3.3 Command Task
4.3.4 Working with Event Tasks
4.3.5 Timer Task
4.3.6 Decision Task
4.3.7 Control Task
4.3.8 Assignment Task
4.4 Schedulers
4.5 Worklets
4.6 Partitioning
4.6.1 Partitioning Attributes
4.6.2 Partitioning Types
4.6.3 Some Points
4.7 Session Properties
4.8 Workflow Properties
Chapter 1
Informatica PowerCenter
Data Cleanse and Match Option: features powerful, integrated cleansing and matching capabilities to correct and remove duplicate customer data.
Data Federation Option: enables a combination of traditional physical and virtual data integration in a single platform.
High Availability Option: minimizes service interruptions during hardware and/or software outages and reduces costs associated with data downtime.
Metadata Exchange Options: coordinate technical and business metadata from data modeling tools, business intelligence tools, source and target database catalogs, and PowerCenter repositories.
Partitioning Option: helps IT organizations maximize their technology investments by enabling hardware and software to jointly scale to handle large volumes of data and users.
Pushdown Optimization Option: enables data transformation processing, where appropriate, to be pushed down into any relational database to make the best use of existing database assets.
Team-Based Development Option: facilitates collaboration among development, quality assurance, and production administration teams and across geographically disparate teams.
Unstructured Data Option: expands PowerCenter's data access capabilities to include unstructured data formats, providing virtually unlimited access to all enterprise data formats.
Service Manager: The Service Manager is built in to the domain to support the domain and the application services. The Service Manager runs on each node in the domain. The Service Manager starts and runs the application services on a machine.
The Service Manager performs the following functions:
Alerts: Provides notifications about domain and service events.
Authentication: Authenticates user requests.
Authorization: Authorizes user requests for services.
Domain configuration: Manages domain configuration metadata.
Node configuration: Manages node configuration metadata.
Licensing: Registers license information and verifies license information.
Logging: Provides accumulated log events from each service in the domain.
Application services: A group of services that represent PowerCenter server-based functionality.
Repository Service: Manages connections to the PowerCenter repository.
Integration Service: Runs sessions and workflows.
Web Services Hub: Exposes PowerCenter functionality to external clients through web services.
SAP BW Service: Listens for RFC requests from SAP NetWeaver BW and initiates workflows to extract from or load to SAP BW.
Global repository: The global repository is the hub of the repository domain. Use
the global repository to store common objects that multiple developers can use through shortcuts. These objects may include operational or Application source definitions, reusable transformations, mapplets, and mappings.
Local repositories: A local repository is any repository within the domain that is not the global repository. Use local repositories for development. From a local repository, you can create shortcuts to objects in shared folders in the global repository. These objects include source definitions, common dimensions and lookups, and enterprise standard transformations. You can also create copies of objects in non-shared folders.
PowerCenter supports versioned repositories. A versioned repository can store multiple versions of an object. PowerCenter version control allows you to efficiently develop, test, and deploy metadata into production.
Designer:
Use the Designer to create mappings that contain transformation instructions for the Integration Service. The Designer has the following tools that you use to analyze sources, design target schemas, and build source-to-target mappings:
Source Analyzer: Import or create source definitions.
Target Designer: Import or create target definitions.
Transformation Developer: Develop transformations to use in mappings. You can also develop user-defined functions to use in expressions.
Mapplet Designer: Create sets of transformations to use in mappings.
Mapping Designer: Create mappings that the Integration Service uses to extract, transform, and load data.
Data Stencil
Use the Data Stencil to create mapping templates that can be used to generate multiple mappings. Data Stencil uses the Microsoft Office Visio interface to create mapping templates. Developers do not usually use it.
Repository Manager
Use the Repository Manager to administer repositories. You can navigate through multiple folders and repositories, and complete the following tasks:
Manage users and groups: Create, edit, and delete repository users and user groups. We can assign and revoke repository privileges and folder permissions.
Perform folder functions: Create, edit, copy, and delete folders. Work we perform in the Designer and Workflow Manager is stored in folders. If we want to share metadata, we can configure a folder to be shared.
We create repository objects using the Designer and Workflow Manager client tools. We can view the following objects in the Navigator window of the Repository Manager:
Source definitions: Definitions of database objects (tables, views, synonyms) or files that provide source data.
Target definitions: Definitions of database objects or files that contain the target data.
Mappings: A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Integration Service uses to transform and move data.
Reusable transformations: Transformations that we use in multiple mappings.
Mapplets: A set of transformations that you use in multiple mappings.
Sessions and workflows: Sessions and workflows store information about how and when the Integration Service moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.
Workflow Manager
Use the Workflow Manager to create, schedule, and run workflows. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. The Workflow Manager has the following tools to help us develop a workflow:
Task Developer: Create tasks we want to accomplish in the workflow.
Worklet Designer: Create a worklet in the Worklet Designer. A worklet is an object that groups a set of tasks. A worklet is similar to a workflow, but without scheduling information. We can nest worklets inside a workflow.
Workflow Designer: Create a workflow by connecting tasks with links in the Workflow Designer. You can also create tasks in the Workflow Designer as you develop the workflow.
When we create a workflow in the Workflow Designer, we add tasks to the workflow. The Workflow Manager includes tasks, such as the Session task, the Command task, and the Email task, so you can design a workflow. The Session task is based on a mapping we build in the Designer. We then connect tasks with links to specify the order of execution for the tasks we created. Use conditional links and workflow variables to create branches in the workflow.
Workflow Monitor
Use the Workflow Monitor to monitor scheduled and running workflows for each Integration Service. We can view details about a workflow or task in Gantt Chart view or Task view. We can run, stop, abort, and resume workflows from the Workflow Monitor. We can view sessions and workflow log events in the Workflow Monitor Log Viewer. The Workflow Monitor displays workflows that have run at least once. The Workflow Monitor continuously receives information from the Integration Service and Repository Service. It also fetches information from the repository to display historic information.
PowerCenter Client: Use the Designer and Workflow Manager to create and
store mapping metadata and connection object information in the repository. Use the Workflow Monitor to retrieve workflow run status information and session logs written by the Integration Service. Use the Repository Manager to organize and secure metadata by creating folders, users, and groups.
Integration Service (IS): When we start the IS, it connects to the repository to
schedule workflows. When we run a workflow, the IS retrieves workflow task and mapping metadata from the repository. IS writes workflow status to the repository.
Web Services Hub: When we start the Web Services Hub, it connects to the
repository to access web-enabled workflows. The Web Services Hub retrieves workflow task and mapping metadata from the repository and writes workflow status to the repository.
SAP BW Service: Listens for RFC requests from SAP NetWeaver BW and initiates
workflows to extract from or load to SAP BW.
We install the Repository Service when we install PowerCenter Services. After we install the PowerCenter Services, we can use the Administration Console to manage the Repository Service.
Repository Connectivity:
PowerCenter applications such as the PowerCenter Client, the Integration Service, pmrep, and infacmd connect to the repository through the Repository Service.
The following process describes how a repository client application connects to the repository database:
1) The repository client application sends a repository connection request to the master gateway node, which is the entry point to the domain (node B in the diagram).
2) The Service Manager sends back the host name and port number of the node running the Repository Service (node A in the diagram). If you have the high availability option, you can configure the Repository Service to run on a backup node.
3) The repository client application establishes a link with the Repository Service process on node A. This communication occurs over TCP/IP.
4) The Repository Service process communicates with the repository database and performs repository metadata transactions for the client application.
Understanding Metadata
The repository stores metadata that describes how to extract, transform, and load source and target data. PowerCenter metadata describes several different kinds of repository objects. We use different PowerCenter Client tools to develop each kind of object. If we enable version control, we can store multiple versions of metadata objects in the repository. We can also extend the metadata stored in the repository by associating information with repository objects. For example, when someone in our organization creates a source definition, we may want to store the name of that person with the source definition. We associate information with repository metadata using metadata extensions.
Administering Repositories
We use the PowerCenter Administration Console, the Repository Manager, and the pmrep and infacmd command line programs to administer repositories. We can:
Back up the repository to a binary file
Restore the repository from a binary file
Copy repository database tables
Delete repository database tables
Create a Repository Service
Remove a Repository Service
Create folders to organize metadata
Add repository users and groups
Configure repository security
Informatica developers do not normally use this, and it is not in the scope of our training.
Chapter 2
Repository Manager
CHAPTER 2: REPOSITORY MANAGER
We can navigate through multiple folders and repositories and perform basic repository tasks with the Repository Manager. It is an administration tool used by the Informatica administrator.
Add a repository to the Navigator, and then configure the domain connection information when we connect to the repository.
2. Enter the name of the repository and a valid repository user name.
3. Click OK.
Before we can connect to the repository for the first time, we must configure the connection information for the domain that the repository belongs to.
3. Click the Add button. The Add Domain dialog box appears.
4. Enter the domain name, gateway host name, and gateway port number.
5. Click OK to add the domain connection.
Select the objects you want to validate.
Click Analyze and select Validate.
Select validation options from the Validate Objects dialog box.
Click Validate.
Click a link to view the objects in the results group.
In the Repository Manager, connect to the repository.
In the Navigator, select the object you want to compare.
Click Edit > Compare Objects.
Click Compare in the dialog box displayed.
Launch the Repository Manager and connect to the repository.
Click Edit > Show User Connections or Show Locks.
The locks or user connections will be displayed in a window. We can then manage them as needed.
There are two default repository user groups:
Administrators: This group initially contains two users that are created by default. The default users are Administrator and the database user that created the repository. We cannot delete these users from the repository or remove them from the Administrators group.
Public: The Repository Manager does not create any default users in the Public group.
3. Click OK.
Chapter 3
Designer
CHAPTER 3: DESIGNER
The Designer has tools to help us build mappings and mapplets so we can specify how to move and transform data between sources and targets. The Designer helps us create source definitions, target definitions, and transformations to build the mappings. The Designer lets us work with multiple tools, folders, and repositories at the same time. It also includes windows so we can view folders, repository objects, and tasks.
Designer Tools:
Source Analyzer: Use to import or create source definitions for flat file, XML, COBOL, Application, and relational sources.
Target Designer: Use to import or create target definitions.
Transformation Developer: Use to create reusable transformations.
Mapplet Designer: Use to create mapplets.
Mapping Designer: Use to create mappings.
Designer Windows:
Navigator: Use to connect to and work in multiple repositories and folders.
Workspace: Use to view or edit sources, targets, mapplets, transformations, and mappings.
Status bar: Displays the status of the operation we perform.
Output: Provides details when we perform certain tasks, such as saving work or validating a mapping.
Overview: An optional window to simplify viewing workbooks containing large mappings or a large number of objects.
Instance Data: View transformation data while you run the Debugger to debug a mapping.
Target Data: View target data while you run the Debugger to debug a mapping.
Overview Window
5) Enter a database user name and password to connect to the database.
6) Click Connect. Table names will appear.
7) Select the relational object or objects you want to import.
8) Click OK.
9) Click Repository > Save.
6) Click Next. Follow the directions in the wizard to manipulate the column breaks in the file preview window. Move existing column breaks by dragging them. Double-click a column break to delete it.
7) Click Next and enter column information for each column in the file.
8) Click Finish.
9) Click Repository > Save.
Escape Character (Optional): Character immediately preceding a column delimiter character embedded in an unquoted string, or immediately preceding the quote character in a quoted string.
Remove Escape Character From Data (Optional): Clear this option to include the escape character in the output string.
Use Default Text Length (Optional): If selected, the Flat File Wizard uses the entered default text length for all string datatypes.
Text Qualifier (Required): Quote character that defines the boundaries of text strings. Choose No Quote, Single Quote, or Double Quotes.
4) Enter column information for each column in the file.
5) Click Finish.
6) Click Repository > Save.
Target flat files are handled in the same way as described in the sections above. Just make sure that instead of the Source Analyzer, you select Tools -> Target Designer. The rest is the same.
3.3 MAPPINGS
A mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation. Mappings represent the data flow between sources and targets. When the Integration Service runs a session, it uses the instructions configured in the mapping to read, transform, and write data.
Mapping Components:
Source definition: Describes the characteristics of a source table or file.
Transformation: Modifies data before writing it to targets. Use different transformation objects to perform different functions.
Target definition: Defines the target table or file.
Links: Connect sources, targets, and transformations so the Integration Service can move the data as it transforms it.
The work of an Informatica developer is to build mappings as per client requirements. We drag the source definition and target definition into the workspace and create the various transformations needed to modify the data. We then run the mappings by creating a session and workflow, and we also unit test the mappings.
3.4 TRANSFORMATIONS
A transformation is a repository object that generates, modifies, or passes data. You configure logic in a transformation that the Integration Service uses to transform data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data. Transformations in a mapping represent the operations the Integration Service performs on the data. Data passes through transformation ports that we link in a mapping or mapplet.
Types of Transformations:
Active: An active transformation can change the number of rows that pass through
it, such as a Filter transformation that removes rows that do not meet the filter condition.
Passive: A passive transformation does not change the number of rows that pass
through it, such as an Expression transformation that performs a calculation on data and passes all rows through the transformation.
Reusable: Reusable transformations can be used in multiple mappings. They are created in the Transformation Developer tool, or by promoting a non-reusable transformation from the Mapping Designer. We can create most transformations as either non-reusable or reusable. The External Procedure transformation can be created as a reusable transformation only. The Source Qualifier is not reusable.
Non-reusable: Non-reusable transformations exist within a single mapping. These are created in the Mapping Designer tool.
Note: Variable ports do not support default values. The Integration Service initializes variable ports according to the datatype. Note: The Integration Service ignores user-defined default values for unconnected transformations.
ERROR: Generate a transformation error. Write the row and a message in the
session log or row error log. The Integration Service writes the row to session log or row error log based on session configuration. Use the ERROR function as the default value when we do not want null values to pass into a transformation. For example, we might want to skip a row when the input value of DEPT_NAME is NULL. You could use the following expression as the default value: ERROR('Error. DEPT is NULL')
ABORT: Abort the session. Session aborts when the Integration Service encounters
a null input value. The Integration Service does not increase the error count or write rows to the reject file. Example: ABORT('DEPT is NULL')
Normal: The Integration Service logs initialization and status information, errors encountered, and skipped rows due to transformation row errors. Summarizes session results, but not at the level of individual rows.
Terse: The Integration Service logs initialization information, error messages, and notification of rejected data.
Verbose Initialization: In addition to normal tracing, the Integration Service logs additional initialization details: names of index and data files used, and detailed transformation statistics.
Verbose Data: In addition to verbose initialization tracing, the Integration Service logs each row that passes into the mapping. Allows the Integration Service to write errors to both the session log and error log when you enable row error logging. The Integration Service writes row data for all rows in a block when it processes a transformation.
Change the tracing level to a Verbose setting only when we need to debug a transformation that is not behaving as expected. To add a slight performance boost, we can also set the tracing level to Terse.
Note: We can edit the source definition by dragging the table in Source Analyzer only.
Now we have all the tables we need in the shared folder. We now need to create shortcuts to these in our folder.
1. Right-click the Shared folder and select Disconnect.
2. Select the folder where we want to create the mapping.
3. Right-click the folder and click Open. The folder will become bold.
4. We will now create shortcuts to the tables we need in our work folder.
5. Click the + sign on the Shared folder, open the + sign on Sources, and select the EMP table.
6. Now click Edit -> Copy.
7. Now select the folder which is bold.
8. Click Edit -> Paste Shortcut.
9. Do the same for all source and target tables.
10. Also rename all the shortcuts and remove Shortcut_to_ from all of them.
11. Click Repository -> Save.
Shortcut use:
If we select the Paste option instead, a copy of the EMP table definition will be created. Suppose we are 10 people: 5 use shortcuts and 5 copy the definition of EMP. Now suppose the definition of EMP changes in the database and we reimport the EMP definition, replacing the old definition. Developers who were using shortcuts will see that the changes are reflected in their mappings automatically. Developers using copies will have to reimport manually. So for maintenance and ease, we use shortcuts to source and target definitions in our folder, and shortcuts to other reusable transformations and mapplets.
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give mapping name. Ex: m_basic_mapping
4. Drag EMP from source and EMP_Tgt from target in mapping.
5. Link ports from SQ_EMP to EMP_Tgt.
6. Click Mapping -> Validate
7. Repository -> Save
Creating Session:
Now we will create the session in the Workflow Manager.
1. Open Workflow Manager -> Connect to repository.
2. Open the folder with the same name in which we created the mapping.
3. Make sure the folder is bold.
4. Now click the Task Developer tool.
5. Click Task -> Create -> Select Session task and give a name. Ex: s_m_basic_mapping
6. Select the correct mapping from the list displayed.
7. Click Create and Done.
8. Now right-click the session and click Edit.
9. Select the Mapping tab.
10. Go to SQ_EMP in source and give the correct relational connection for it.
11. Do the same for EMP_Tgt.
12. Also for the target table, give the Load Type option as Normal and select the Truncate Target Table option.
13. Task -> Validate
Creating Workflow:
1. Now click Tools -> Workflow Designer.
2. Workflow -> Create -> Give name like wf_basic_mapping.
3. Click OK.
4. The START task will be displayed. It is the starting point for the Informatica server.
5. Drag the session to the workflow.
6. Click Task -> Link Task. Connect START to the session.
7. Click Workflow -> Validate
8. Repository -> Save
Calculating Values
To use the Expression transformation to calculate values for a single row, we must include the following ports:
Input or input/output ports for each value used in the calculation: For example, to calculate Total Salary, we need salary and commission.
Output port for the expression: We enter one expression for each output port. The return value for the output port needs to match the return value of the expression.
We can enter multiple expressions in a single Expression transformation. We can create any number of output ports in the transformation.
Example: Calculating Total Salary of an Employee
Import the source table EMP in the shared folder. If it is already there, then don't import it. In the shared folder, create the target table Emp_Total_SAL. Keep all ports as in the EMP table except Sal and Comm in the target table. Add a Total_SAL port to store the calculation. Create the necessary shortcuts in the folder.
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give mapping name. Ex: m_totalsal
4. Drag EMP from source in mapping.
5. Click Transformation -> Create -> Select Expression from list. Give name and click Create. Now click Done.
6. Link ports from SQ_EMP to the Expression transformation.
7. Edit the Expression transformation. As we do not want Sal and Comm in the target, remove the check from the output port for both columns.
8. Now create a new port out_Total_SAL. Make it an output port only.
9. Click the small button that appears in the Expression section of the dialog box and enter the expression in the Expression Editor.
10. Enter the expression SAL + COMM. You can select SAL and COMM from the Ports tab in the Expression Editor.
11. Check the expression syntax by clicking Validate.
12. Click OK -> Click Apply -> Click OK.
13. Now connect the ports from the Expression to the target table.
14. Click Mapping -> Validate
15. Repository -> Save
Create Session and Workflow as described earlier. Run the workflow and see the data in target table.
As COMM is null, Total_SAL will be null in most cases. Now open your mapping and the Expression transformation. Select the COMM port and give 0 as the Default Value. Now apply the changes, validate the mapping, and save. Refresh the session and validate the workflow again. Run the workflow and see the result again.
Now use ERROR in the default value of COMM to skip rows where COMM is null. Syntax: ERROR('Any message here')
Similarly, we can use the ABORT function to abort the session if COMM is null. Syntax: ABORT('Any message here')
Make sure to double-click the session after doing any changes in the mapping. It will prompt that the mapping has changed. Click OK to refresh the mapping. Run the workflow after validating and saving it.
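Instead of relying on a default value or on ERROR/ABORT, the null can also be handled inside the expression itself. The line below is only an illustrative sketch using the out_Total_SAL port created above; IIF and ISNULL are standard transformation-language functions:
SAL + IIF(ISNULL(COMM), 0, COMM)
With this expression, rows with a NULL COMM get the salary alone as the total instead of a NULL result.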
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give mapping name. Ex: m_filter_example
4. Drag EMP from source in mapping.
5. Click Transformation -> Create -> Select Filter from list. Give name and click Create. Now click Done.
6. Pass ports from SQ_EMP to the Filter transformation.
7. Edit the Filter transformation. Go to the Properties tab.
8. Click the Value section of the filter condition, and then click the Open button.
9. The Expression Editor appears.
10. Enter the filter condition you want to apply.
11. Click Validate to check the syntax of the conditions you entered.
12. Click OK -> Click Apply -> Click OK.
13. Now connect the ports from the Filter to the target table.
14. Click Mapping -> Validate
15. Repository -> Save
Create Session and Workflow as described earlier. Run the workflow and see the data in target table.
How to filter out rows with null values? To filter out rows containing null values or spaces, use the ISNULL and IS_SPACES functions to test the value of the port. For example, if we want to filter out rows that contain NULLs in the FIRST_NAME port, use the following condition: IIF(ISNULL(FIRST_NAME),FALSE,TRUE) This condition states that if the FIRST_NAME port is NULL, the return value is FALSE and the row should be discarded. Otherwise, the row passes through to the next transformation.
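As a further illustration (a sketch only, using the EMP source from the example above), a filter condition can combine several checks in one expression:
DEPTNO = 10 AND NOT ISNULL(COMM)
Only rows from department 10 that have a non-null commission pass through; the Filter transformation drops every other row.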
Mapping A uses three Filter transformations while Mapping B produces the same result with one Router transformation. A Router transformation consists of input and output groups, input and output ports, group filter conditions, and properties that we configure in the Designer.
User-Defined Groups: We create a user-defined group to test a condition based on incoming data. A user-defined group consists of output ports and a group filter condition. We can create and edit user-defined groups on the Groups tab with the Designer. Create one user-defined group for each condition that we want to specify. The Default Group: The Designer creates the default group after we create one
new user-defined group. The Designer does not allow us to edit or delete the default group. This group does not have a group filter condition associated with it. If all of the conditions evaluate to FALSE, the IS passes the row to the default group.
Example: Filtering employees of Department 10 to EMP_10, Department 20 to EMP_20 and the rest to EMP_REST
Source is the EMP table. Create 3 target tables EMP_10, EMP_20 and EMP_REST in the shared folder. Structure should be the same as the EMP table. Create the shortcuts in your folder.
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give mapping name. Ex: m_router_example
4. Drag EMP from source in mapping.
5. Click Transformation -> Create -> Select Router from list. Give name and click Create. Now click Done.
6. Pass ports from SQ_EMP to the Router transformation.
7. Edit the Router transformation. Go to the Groups tab.
8. Click the Groups tab, and then click the Add button to create a user-defined group. The default group is created automatically.
9. Click the Group Filter Condition field to open the Expression Editor.
10. Enter a group filter condition. Ex: DEPTNO=10
11. Click Validate to check the syntax of the conditions you entered.
12. Create another group for EMP_20. Condition: DEPTNO=20
13. The rest of the records not matching the above two conditions will be passed to the DEFAULT group. See the sample mapping.
14. Click OK -> Click Apply -> Click OK.
15. Now connect the ports from the Router to the target tables.
16. Click Mapping -> Validate
17. Repository -> Save
Create Session and Workflow as described earlier. Run the workflow and see the data in the target tables. Make sure to give connection information for all 3 target tables.
Sample Mapping:
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give mapping name. Ex: m_union_example
4. Drag EMP_10, EMP_20 and EMP_REST from source in mapping.
5. Click Transformation -> Create -> Select Union from list. Give name and click Create. Now click Done.
6. Pass ports from SQ_EMP_10 to the Union transformation.
7. Edit the Union transformation. Go to the Groups tab.
8. One group will already be there as we dragged ports from SQ_EMP_10 to the Union transformation.
9. As we have 3 source tables, we need 3 input groups. Click the Add button to add 2 more groups. See the sample mapping.
10. We can also modify ports in the Ports tab.
11. Click Apply -> OK.
12. Drag the target table now.
13. Connect the output ports from the Union to the target table.
14. Click Mapping -> Validate
15. Repository -> Save
Create Session and Workflow as described earlier. Run the workflow and see the data in the target table. Make sure to give connection information for all 3 source tables.
When we create a Sorter transformation in a mapping, we specify one or more ports as a sort key and configure each sort key port to sort in ascending or descending order. We also configure sort criteria the PowerCenter Server applies to all sort key ports and the system resources it allocates to perform the sort operation. The Sorter transformation contains only input/output ports. All data passing through the Sorter transformation is sorted according to a sort key. The sort key is one or more ports that we want to use as the sort criteria.
2. Case Sensitive:
The Case Sensitive property determines whether the PowerCenter Server considers case when sorting data. When we enable the Case Sensitive property, the PowerCenter Server sorts uppercase characters higher than lowercase characters.
3. Work Directory
Directory PowerCenter Server uses to create temporary files while it sorts data.
4. Distinct:
Check this option if we want to remove duplicates. Sorter will sort data according to all the ports when it is selected.
Example: Sorting data of EMP by ENAME Source is EMP table. Create a target table EMP_SORTER_EXAMPLE in target designer. Structure same as EMP table. Create the shortcuts in your folder.
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give mapping name. Ex: m_sorter_example
4. Drag EMP from source in mapping.
5. Click Transformation -> Create -> Select Sorter from list. Give name and click Create. Now click Done.
6. Pass ports from SQ_EMP to the Sorter transformation.
7. Edit the Sorter transformation. Go to the Ports tab.
8. Select ENAME as the sort key: check KEY in front of ENAME.
9. Click the Properties tab and select properties as needed.
10. Click Apply -> OK.
11. Drag the target table now.
12. Connect the output ports from the Sorter to the target table.
13. Click Mapping -> Validate
14. Repository -> Save
Create Session and Workflow as described earlier. Run the workflow and see the data in the target table. Make sure to give connection information for all tables.
I: Port to receive data from another transformation.
O: Port we want to pass to another transformation.
V: Can be used to store values or calculations to use in an expression.
R: Rank port. Rank is calculated according to it. The Rank port is an input/output port. We must link the Rank port to another transformation. Example: Total Salary.
Rank Index
The Designer automatically creates a RANKINDEX port for each Rank transformation. The PowerCenter Server uses the Rank Index port to store the ranking position for each row in a group. For example, if we create a Rank transformation that ranks the top five salaried employees, the rank index numbers the employees from 1 to 5. The RANKINDEX is an output port only. We can pass the rank index to another transformation in the mapping or directly to a target. We cannot delete or edit it.
Defining Groups
Rank transformation allows us to group information. For example, if we want to select the top 3 salaried employees of each department, we can define a group for department. By defining groups, we create one set of ranked rows for each group. We define a group in the Ports tab: click Group By for the needed port. We cannot Group By on a port which is also the Rank port.
1> Example: Finding Top 5 Salaried Employees
EMP will be the source table. Create a target table EMP_RANK_EXAMPLE in the Target Designer. Structure should be the same as the EMP table; just add one more port Rank_Index to store the RANK INDEX. Create the shortcuts in your folder.
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give mapping name. Ex: m_rank_example
4. Drag EMP from source in mapping.
5. Create an EXPRESSION transformation to calculate TOTAL_SAL.
6. Click Transformation -> Create -> Select RANK from list. Give name and click Create. Now click Done.
7. Pass ports from the Expression to the Rank transformation.
8. Edit the Rank transformation. Go to the Ports tab.
9. Select TOTAL_SAL as the rank port. Check the R type in front of TOTAL_SAL.
10. Click the Properties tab and select properties as needed.
11. Select Top in Top/Bottom and set Number of Ranks to 5.
12. Click Apply -> OK.
13. Drag the target table now.
14. Connect the output ports from the Rank to the target table.
15. Click Mapping -> Validate
16. Repository -> Save
Create Session and Workflow as described earlier. Run the workflow and see the data in the target table. Make sure to give connection information for all tables.
2> Example: Finding Top 2 Salaried Employees for every DEPARTMENT
Open the mapping made above. Edit the Rank transformation. Go to the Ports tab. Select Group By for DEPTNO. Go to the Properties tab. Set Number of Ranks to 2. Click Apply -> OK. Mapping -> Validate and Repository -> Save. Refresh the session by double-clicking it. Save the changes and run the workflow to see the new result.
RANK CACHE
When the PowerCenter Server runs a session with a Rank transformation, it compares an input row with rows in the data cache. If the input row out-ranks a stored row, the PowerCenter Server replaces the stored row with the input row.
Example: PowerCenter caches the first 5 rows if we are finding the top 5 salaried employees. When the 6th row is read, it compares it with the 5 rows in the cache and places it in the cache if needed.
The transformation language includes the following aggregate functions: AVG, COUNT , MAX, MIN, SUM FIRST, LAST MEDIAN, PERCENTILE, STDDEV, VARIANCE Single Level Aggregate Function: MAX(SAL) Nested Aggregate Function: MAX( COUNT( ITEM ))
Conditional Clauses
We can use conditional clauses in the aggregate expression to reduce the number of rows used in the aggregation. The conditional clause can be any clause that evaluates to TRUE or FALSE. SUM( COMMISSION, COMMISSION > QUOTA )
Non-Aggregate Functions
We can also use non-aggregate functions in the aggregate expression. IIF( MAX( QUANTITY ) > 0, MAX( QUANTITY ), 0 )
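As a sketch grounded in the EMP table used in the examples below (these two expressions are illustrations, not part of the original text), both ideas can be combined in Aggregator output ports:
IIF( ISNULL( MAX( COMM ) ), 0, MAX( COMM ) )
SUM( SAL, COMM > 0 )
The first expression replaces a NULL maximum commission with 0; the second sums salaries only for rows with a positive commission.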
Note: The PowerCenter Server uses memory to process an Aggregator transformation with sorted ports. It does not use cache memory. We do not need to configure cache memory for Aggregator transformations that use sorted ports.
1> Example: To calculate MAX, MIN, AVG and SUM of salary of EMP table. EMP will be source table. Create a target table EMP_AGG_EXAMPLE in target designer. Table should contain DEPTNO, MAX_SAL, MIN_SAL, AVG_SAL and SUM_SAL Create the shortcuts in your folder.
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give mapping name. Ex: m_agg_example
4. Drag EMP from source in mapping.
5. Click Transformation -> Create -> Select AGGREGATOR from list. Give name and click Create. Now click Done.
6. Pass SAL and DEPTNO only from SQ_EMP to the AGGREGATOR transformation.
7. Edit the AGGREGATOR transformation. Go to the Ports tab.
8. Create 4 output ports: OUT_MAX_SAL, OUT_MIN_SAL, OUT_AVG_SAL, OUT_SUM_SAL.
9. Open the Expression Editor one by one for all output ports and give the calculations. Ex: MAX(SAL), MIN(SAL), AVG(SAL), SUM(SAL)
10. Click Apply -> OK.
11. Drag the target table now.
12. Connect the output ports from the Aggregator to the target table.
13. Click Mapping -> Validate
14. Repository -> Save
Create Session and Workflow as described earlier. Run the workflow and see the data in the target table. Make sure to give connection information for all tables.
2> Example: To calculate MAX, MIN, AVG and SUM of salary of the EMP table for every DEPARTMENT
Open the mapping made above. Edit the Aggregator transformation. Go to the Ports tab. Select Group By for DEPTNO. Click Apply -> OK. Mapping -> Validate and Repository -> Save. Refresh the session by double-clicking it. Save the changes and run the workflow to see the new result.
Scene1: What will be the output if there is no calculation and no Group By?
Here we are not doing any calculation or Group By. In this case, the DEPTNO and SAL of the last record of the EMP table will be passed to the target.
Scene2: What will be the output of the above picture if Group By is done on DEPTNO?
Here we are not doing any calculation but Group By is there on DEPTNO. In this case, the last record of every DEPTNO from the EMP table will be passed to the target.
Scene3: What will be the output of EXAMPLE 1?
In Example 1, we are calculating MAX, MIN, AVG and SUM but we are not doing any Group By. In this case, the DEPTNO of the last record of the EMP table will be passed. The calculations, however, will be correct.
Scene4: What will be the output of EXAMPLE 2?
In Example 2, we are calculating MAX, MIN, AVG and SUM for every DEPT. In this case, the DEPTNO and the correct calculations for every DEPTNO will be passed to the target.
Scene5: Use SORTED INPUT in the Properties tab and check the output.
Example: To join EMP and DEPT tables. EMP and DEPT will be source table. Create a target table JOINER_EXAMPLE in target designer. Table should contain all ports of EMP table plus DNAME and LOC as shown below. Create the shortcuts in your folder.
Creating Mapping:
1> Open folder where we want to create the mapping.
2> Click Tools -> Mapping Designer.
3> Click Mapping -> Create -> Give mapping name. Ex: m_joiner_example
4> Drag EMP, DEPT and the target. Create the Joiner transformation. Link as shown below.
5> Specify the join condition in the Condition tab. See steps on next page.
6> Set the Master in the Ports tab. See steps on next page.
7> Mapping -> Validate
8> Repository -> Save
Create Session and Workflow as described earlier. Run the workflow and see the data in the target table. Make sure to give connection information for all tables.
JOIN CONDITION:
The join condition contains ports from both input sources that must match for the PowerCenter Server to join two rows. Example: DEPTNO=DEPTNO1 in above. 1. Edit Joiner Transformation -> Condition Tab 2. Add condition We can add as many conditions as needed. Only = operator is allowed.
If we join Char and Varchar datatypes, the PowerCenter Server counts any spaces that pad Char values as part of the string. So if you try to join the following: Char(40) = "abcd" and Varchar(40) = "abcd", then the Char value is "abcd" padded with 36 blank spaces, and the PowerCenter Server does not join the two fields because the Char field contains trailing spaces. Note: The Joiner transformation does not match null values.
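One common workaround for the trailing-space issue above (offered here only as a sketch, not described in the original text) is to trim the padding in an Expression transformation placed before the Joiner, for example with the standard RTRIM function:
RTRIM(CHAR_COL)
where CHAR_COL stands for whatever Char port is being joined; with the padding removed, the values can match.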
JOIN TYPES
In SQL, a join is a relational operator that combines data from multiple tables into a single result set. The Joiner transformation acts in much the same manner, except that tables can originate from different databases or flat files.
Types of Joins:
Normal
Master Outer
Detail Outer
Full Outer
Note: A normal or master outer join performs faster than a full outer or detail outer join.
Example: In EMP, we have employees with DEPTNO 10, 20, 30 and 50. In DEPT, we have DEPTNO 10, 20, 30 and 40. DEPT will be the MASTER table as it has fewer rows.
Normal Join: With a normal join, the PowerCenter Server discards all rows of data from the master and detail source that do not match, based on the condition. All employees of 10, 20 and 30 will be there as only they are matching.
Master Outer Join: This join keeps all rows of data from the detail source and the matching rows from the master source. It discards the unmatched rows from the master source. All data of employees of 10, 20 and 30 will be there. There will be employees of DEPTNO 50 and the corresponding DNAME and LOC columns will be NULL.
Detail Outer Join: This join keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source. All employees of 10, 20 and 30 will be there. There will be one record for DEPTNO 40 and corresponding data of EMP columns will be NULL.
Full Outer Join: A full outer join keeps all rows of data from both the master and detail sources. All data of employees of 10, 20 and 30 will be there. There will be employees of DEPTNO 50 and corresponding DNAME and LOC columns will be NULL. There will be one record for DEPTNO 40 and corresponding data of EMP columns will be NULL.
JOINER CACHES
Joiner always caches the MASTER table. We cannot disable caching. It builds Index cache and Data Cache based on MASTER table. 1> Joiner Index Cache: All Columns of MASTER table used in Join condition are in JOINER INDEX CACHE. Example: DEPTNO in our mapping. 2> Joiner Data Cache: Master column not in join condition and used for output to other transformation or target table are in Data Cache. Example: DNAME and LOC in our mapping example.
All of the above are possible in the Properties tab of the Source Qualifier transformation.
SAMPLE MAPPING TO BE MADE:
Source will be EMP and DEPT tables. Create target table as showed in Picture above. Create shortcuts in your folder as needed.
Creating Mapping:
1> Open folder where we want to create the mapping.
2> Click Tools -> Mapping Designer.
3> Click Mapping -> Create -> Give mapping name. Ex: m_SQ_example
4> Drag EMP, DEPT and the target.
5> Right-click SQ_EMP and select Delete from the mapping.
6> Right-click SQ_DEPT and select Delete from the mapping.
7> Click Transformation -> Create -> Select Source Qualifier from the list -> Give a name -> Click Create.
8> Select EMP and DEPT both. Click OK.
9> Link all as shown in the above picture.
10> Edit the SQ -> Properties tab -> Open User Defined Join -> Give the join condition EMP.DEPTNO=DEPT.DEPTNO. Click Apply -> OK. (More details after 2 pages.)
11> Mapping -> Validate
12> Repository -> Save
Create Session and Workflow as described earlier. Run the workflow and see the data in the target table. Make sure to give connection information for all tables.
SQ PROPERTIES TAB
In the Mapping Designer, open a Source Qualifier transformation.
Select the Properties tab.
Click the Open button in the Source Filter field.
In the SQL Editor dialog box, enter the filter. Example: EMP.SAL>2000
Click OK.
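The Integration Service appends the source filter to the WHERE clause of the default query it generates. For illustration only (a sketch, assuming for simplicity that just SAL and DEPTNO are connected downstream), the generated SQL would take roughly this shape:
SELECT EMP.SAL, EMP.DEPTNO FROM EMP WHERE EMP.SAL>2000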
Validate the mapping. Save it. Now refresh session and save the changes. Now run the workflow and see output.
Steps:
1> Open the Source Qualifier transformation, and click the Properties tab.
2> Click the Open button in the User Defined Join field. The SQL Editor dialog box appears.
3> Enter the syntax for the join.
4> Click OK -> Again OK.
Validate the mapping and save it. Now refresh the session and save the changes. Now run the workflow and see the output.
Join Type and Syntax:
Equi Join: DEPT.DEPTNO=EMP.DEPTNO
Left Outer Join: {EMP LEFT OUTER JOIN DEPT ON DEPT.DEPTNO=EMP.DEPTNO}
Right Outer Join: {EMP RIGHT OUTER JOIN DEPT ON DEPT.DEPTNO=EMP.DEPTNO}
Curly braces are needed in the syntax. Try all of the above and also the full outer join. The session fails for FULL OUTER: {EMP FULL OUTER JOIN DEPT ON DEPT.DEPTNO=EMP.DEPTNO}
In mapping above, we are passing only SAL and DEPTNO from SQ_EMP to Aggregator transformation. Default query generated will be: SELECT EMP.SAL, EMP.DEPTNO FROM EMP
4. The SQL Editor displays the default query the PowerCenter Server uses to select source data. 5. Click Cancel to exit.
Note: If we do not cancel the SQL query, the PowerCenter Server overrides the default query with the custom SQL query. We can enter an SQL statement supported by our source database. Before entering the query, connect all the input and output ports we want to use in the mapping.
Example: As in our case we can't use a full outer join in the user-defined join, we can write the SQL query for a FULL OUTER JOIN:
SELECT DEPT.DEPTNO, DEPT.DNAME, DEPT.LOC, EMP.EMPNO, EMP.ENAME, EMP.JOB, EMP.SAL, EMP.COMM, EMP.DEPTNO FROM EMP FULL OUTER JOIN DEPT ON DEPT.DEPTNO=EMP.DEPTNO WHERE SAL>2000
We also added a WHERE clause. We can enter more conditions and write more complex SQL.
We can write any query. We can join as many tables in one query as required if all are in same database. It is very handy and used in most of the projects.
Important Points:
When creating a custom SQL query, the SELECT statement must list the port names in the order in which they appear in the transformation. Example: DEPTNO is the top column and DNAME is second in our SQ mapping, so when we write the SQL query, the SELECT statement must have DEPTNO first, DNAME second, and so on: SELECT DEPT.DEPTNO, DEPT.DNAME ...
Once we have written a custom query like the one above, this query will always be used to fetch data from the database. In our example, we used WHERE SAL>2000. Now if we use the Source Filter and give the condition SAL>1000 or any other, it will not work; Informatica will always use the custom query only.
Make sure to test the query in the database first before using it in the SQL Query property. If the query does not run in the database, then it won't work in Informatica either. Also, always connect to the database and validate the SQL in the SQL query editor.
The PowerCenter Server queries the lookup source based on the lookup ports in the transformation. It compares Lookup transformation port values to lookup source column values based on the lookup condition. Pass the result of the lookup to other transformations and a target. We can use the Lookup transformation to perform the following:
Get a related value: EMP has DEPTNO but DNAME is not there. We use a Lookup to get DNAME from the DEPT table based on the lookup condition.
Perform a calculation: We want only those employees whose SAL > AVG(SAL). We will write a lookup override query.
Update slowly changing dimension tables: The most important use. We can use a Lookup transformation to determine whether rows already exist in the target.
Relational Lookup:
When we create a Lookup transformation using a relational table as a lookup source, we can connect to the lookup source using ODBC and import the table definition as the structure for the Lookup transformation. We can override the default SQL statement if we want to add a WHERE clause or query multiple tables. We can use a dynamic lookup cache with relational lookups.
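As an example of such an override (a sketch only, assuming a lookup on the DEPT table whose lookup ports are DEPTNO, DNAME, and LOC, kept in the same order as the default query):
SELECT DEPT.DEPTNO, DEPT.DNAME, DEPT.LOC FROM DEPT WHERE DEPT.LOC = 'CHICAGO'
The added WHERE clause restricts the rows that are looked up and cached, which can also keep the cache smaller.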
Cache includes all lookup/output ports in the lookup condition and the lookup/return port. If there is no match for the lookup condition, the PowerCenter Server returns NULL.
If there is a match for the lookup condition, the PowerCenter Server returns the result of the lookup condition into the return port.
1. Lookup Source:
We can use a flat file or a relational table for a lookup source. When we create a Lookup t/f, we can import the lookup source from the following locations:
Any relational source or target definition in the repository
Any flat file source or target definition in the repository
Any table or file that both the PowerCenter Server and Client machine can connect to
The lookup table can be a single table, or we can join multiple tables in the same database using a lookup SQL override in the Properties tab.
2. Ports:
I (Connected, Unconnected; minimum 1): Input port to the Lookup. Usually ports used for the join condition are input ports.
O (Connected, Unconnected; minimum 1): Ports going to another transformation from the Lookup.
L (Connected, Unconnected; minimum 1): Lookup port. The Designer automatically designates each column in the lookup source as a lookup (L) and output (O) port.
R (Unconnected; 1 only): Return port. Used only in an unconnected Lookup t/f.
3. Properties Tab
Lookup SQL Override (Relational): Overrides the default SQL statement to query the lookup table.
Lookup Table Name (Relational): Specifies the name of the table from which the transformation looks up and caches values.
Lookup Caching Enabled (Flat File, Relational): Indicates whether the PowerCenter Server caches lookup values during the session.
Lookup Policy on Multiple Match (Flat File, Relational): Determines what happens when the Lookup transformation finds multiple rows that match the lookup condition. Options: Use First Value, Use Last Value, Use Any Value, or Report Error.
Lookup Condition (Flat File, Relational): Displays the lookup condition you set in the Condition tab.
Connection Information (Relational): Specifies the database containing the lookup table.
Source Type (Flat File, Relational): Whether the lookup is from a database or a flat file.
Lookup Cache Directory Name (Flat File, Relational): Location where the cache is built.
Lookup Cache Persistent (Flat File, Relational): Whether to use a persistent cache or not.
Dynamic Lookup Cache (Flat File, Relational): Whether to use a dynamic cache or not.
Recache From Lookup Source (Flat File, Relational): Rebuilds the cache if the cache source changes and we are using a persistent cache.
Insert Else Update (Relational): Use only with dynamic caching enabled. Applies to rows entering the Lookup transformation with the row type of insert.
Update Else Insert (Relational): Use only with dynamic caching enabled. Applies to rows entering the Lookup transformation with the row type of update.
Lookup Data Cache Size: Data cache size.
Lookup Index Cache Size: Index cache size.
Cache File Name Prefix: Use only with persistent lookup cache. Specifies the file name prefix to use with persistent lookup cache files.
Some other properties for flat file lookups are: Datetime Format, Thousand Separator, Decimal Separator, Case-Sensitive String Comparison, Null Ordering, and Sorted Input.
4: Condition Tab
We enter the lookup condition here. The PowerCenter Server uses the lookup condition to test incoming values. We compare transformation input values with values in the lookup source or cache, represented by lookup ports. The datatypes in a condition must match. When we enter multiple conditions, the PowerCenter Server evaluates each condition as an AND, not an OR. The PowerCenter Server matches null values. The input value must meet all conditions for the lookup to return a value.
The operators =, >, <, >=, <= and != can be used.
Example: IN_DEPTNO = DEPTNO
         IN_DNAME = 'DELHI'
Tip: If we include more than one lookup condition, place the conditions with an equal sign first to optimize lookup performance.
Note:
1. We can use only the = operator in the case of a dynamic cache.
2. The PowerCenter Server fails the session when it encounters multiple keys for a Lookup transformation configured to use a dynamic cache.
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give name. Ex: m_CONN_LOOKUP_EXAMPLE
4. Drag EMP and the target table.
5. Connect all fields from SQ_EMP to the target except DNAME and LOC.
6. Transformation -> Create -> Select LOOKUP from list. Give name and click Create.
7. The following screen is displayed.
8. As DEPT is the Source definition, click Source and then Select DEPT.
10> Now pass DEPTNO from SQ_EMP to this Lookup. DEPTNO from SQ_EMP will be named DEPTNO1. Edit the Lookup and rename it to IN_DEPTNO in the Ports tab.
11> Now go to the Condition tab and add the condition DEPTNO = IN_DEPTNO. Click Apply and then OK. Link the mapping as shown below:
12> Since we are not passing IN_DEPTNO and DEPTNO to any other transformation from the Lookup, we can edit the Lookup transformation and remove the Output check from them.
13> Mapping -> Validate
14> Repository -> Save
Create Session and Workflow as described earlier. Run the workflow and see the data in the target table. Make sure to give connection information for all tables. Make sure to give the connection for the LOOKUP table also.
We use a Connected Lookup when we need to return more than one column from the lookup table. The Return port is not used in a Connected Lookup.
14. Now add a condition in the Condition tab: DEPTNO = IN_DEPTNO. Click Apply and then OK.
15. Now we need to call this Lookup from an Expression transformation.
16. Edit the Expression t/f and create a new output port out_DNAME with the same datatype as DNAME. Open the Expression Editor and call the Lookup as given below:
We double-click the unconnected Lookup at the bottom of the Functions tab and, as we need only DEPTNO, we pass only DEPTNO as input.
17. Validate the call in the Expression Editor and click OK.
18. Mapping -> Validate
19. Repository -> Save
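For reference, the expression entered for out_DNAME looks roughly like the line below. This is only a sketch: LKP_DEPT is an assumed name, and the actual name of your unconnected Lookup transformation must be used instead.
:LKP.LKP_DEPT(DEPTNO)
The :LKP reference invokes the unconnected Lookup with DEPTNO as its input, and the value of the Lookup's return port (DNAME) becomes the value of out_DNAME.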
Create Session and Workflow as described earlier. Run the workflow and see the data in the target table. Make sure to give connection information for all tables, including the LOOKUP table.
2. Dynamic Cache
To cache a target table or flat file source and insert new rows or update existing rows in the cache, use a Lookup transformation with a dynamic cache. The IS dynamically inserts or updates data in the lookup cache and passes data to the target. The target table is also our lookup table. This is not good for performance if the table is huge.
3. Persistent Cache
If the lookup table does not change between sessions, we can configure the Lookup transformation to use a persistent lookup cache. The IS saves and reuses cache files from session to session, eliminating the time required to read the lookup table.
5. Shared Cache
Unnamed cache: When Lookup transformations in a mapping have compatible caching structures, the IS shares the cache by default. You can only share static unnamed caches. Named cache: Use a persistent named cache when we want to share a cache file across mappings or share a dynamic and a static cache. The caching structures must match or be compatible with a named cache. You can share static and dynamic named caches.
To configure the session to create concurrent caches: Edit Session -> Config Object tab -> Additional Concurrent Pipelines for Lookup Cache Creation -> give a value here (Auto by default).
You can set the following update strategy options:
Insert: Select this option to insert a row into a target table.
Delete: Select this option to delete a row from a table.
Update: We have the following options in this situation:
o Update as Update: Update each row flagged for update if it exists in the target table.
o Update as Insert: Insert each row flagged for update.
o Update else Insert: Update the row if it exists; otherwise, insert it.
Truncate table: Select this option to truncate the target table before loading data.
Steps:
1. Create the Update Strategy transformation.
2. Pass all ports needed to it.
3. Set the expression in the Properties tab (see the sketch below).
4. Connect to other transformations or the target.
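For illustration, a simple Update Strategy expression set in the Properties tab might flag rows based on a condition. This is a sketch; the actual condition depends on the mapping:

IIF(SAL > 2000, DD_UPDATE, DD_INSERT)

DD_INSERT, DD_UPDATE, DD_DELETE and DD_REJECT are the constants used to flag rows for insert, update, delete and reject respectively.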
To use a Dynamic Cache, first edit the Lookup transformation -> Properties tab -> select the Dynamic Lookup Cache option. Also select the Insert Else Update or Update Else Insert option.
We can update a target table only when it has a Primary Key. If there is no Primary Key, then we need to use the Update Override option in the Properties tab of the target table.
Associated Port:
Associate lookup ports with either an input/output port or a sequence ID. Each lookup port is associated with a source port so that the transformation can compare the changes. Alternatively, we can generate a sequence 1, 2, 3 and so on with it; the Sequence-ID option is available when the datatype is Integer or Small Integer.
Ignore In Comparison:
When we do not want to compare a particular column in the source with the target, we can use this option. Ex: HIREDATE will always be the same, so there is no need to compare it.
In the above: The topmost port is NewLookupRow; it is hidden. All lookup table ports have been prefixed with PREV_. ENAME has been associated with PREV_ENAME, and so on for the others. The PREV_COMM port has been checked for Ignore Null Inputs for Updates. PREV_HIREDATE has been checked for Ignore in Comparison.
Example: Working with Dynamic Cache using Update Strategy. EMP will be source table. Create a target table DYNAMIC_LOOKUP. Structure same as EMP. Make EMPNO as Primary Key. Create Shortcuts as necessary.
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give name. Ex: m_DYNAMIC_LOOKUP_EXAMPLE
4. Drag EMP and target table.
5. Transformation -> Create -> Select LOOKUP from list. Give name and click Create.
6. Create a Lookup on the DYNAMIC_LOOKUP table as created in the connected example.
7. Drag all ports from SQ_EMP to the Lookup transformation.
8. Edit the Lookup transformation. Edit all ports and add PREV_ before them.
9. Also remove the 1 added to the names of all ports coming from SQ_EMP.
10. Now go to the Properties tab -> select the Dynamic Cache and Insert Else Update options.
11. Now associate ports and set the Ignore Null Inputs or Ignore in Comparison option as shown in the picture above.
12. Transformation -> Create -> Select Filter -> Give name and click Done.
13. Pass all ports as shown in the mapping below to the Filter and give the condition: NewLookupRow != 0
14. Transformation -> Create -> Select Update Strategy -> Give name and click Done.
15. Pass all ports from the Filter to the Update Strategy and give the Update Strategy expression: IIF(NewLookupRow = 1, DD_INSERT, DD_UPDATE)
16. Link all the needed ports from the Update Strategy to the target.
17. Mapping -> Validate and Repository -> Save.
Create Session and Workflow as usual. First time all rows will be inserted. Now Change the data of target table in Oracle and Run workflow again. You can see how the data is updated as per the properties selected.
We pass the data to the Filter from the Lookup cache and not from the source. This is because the cache is updated continuously and contains the most up-to-date data.

Example source data (EMPNO, Name, SAL, DEPTNO):
9000  Amit Kumar   9000  10
9001  Rahul Singh  9500  20
9002  Sanjay       8000  30
9003  Sumit Singh  7000  20

Data already in target (EMPNO, Name, DEPTNO):
9000  Amit Kumar   10
9001  Rahul Singh  20

The initial cache (with the NewLookupRow column) contains the two rows already in the target (Amit Kumar, Rahul Singh). After the source rows are processed, the cache contains all four rows (Amit Kumar, Rahul Singh, Sanjay, Sumit Singh), with NewLookupRow set according to whether each row was inserted into or updated in the cache.
So we always need to write the lookup query as:
SELECT COLUMN1 AS COL_1, COLUMN2 AS COL_2, COLUMN3 AS COL_3 FROM ABC
Here COL_1, COL_2 and COL_3 are the lookup port names.
As there is no Primary Key, we need to write an Update Override.
Steps:
1. Edit the target table INS_UPD_NO_PK_EXAMPLE1. This is a copy of the table definition.
2. Properties tab -> Update Override -> Generate SQL. The default SQL is:
UPDATE INS_UPD_NO_PK_EXAMPLE SET EMPNO = :TU.EMPNO, ENAME = :TU.ENAME, JOB = :TU.JOB, MGR = :TU.MGR, HIREDATE = :TU.HIREDATE, SAL = :TU.SAL, COMM = :TU.COMM, DEPTNO = :TU.DEPTNO
3. There is no WHERE clause in this SQL. We need to modify it:
UPDATE INS_UPD_NO_PK_EXAMPLE SET ENAME = :TU.ENAME, JOB = :TU.JOB, MGR = :TU.MGR, HIREDATE = :TU.HIREDATE, SAL = :TU.SAL, COMM = :TU.COMM, DEPTNO = :TU.DEPTNO WHERE EMPNO = :TU.EMPNO
4. Paste this modified SQL there and click OK. Click Apply -> OK.
5. Mapping -> Validate
6. Repository -> Save
7. Refresh the session by double clicking on it. Save all the changes and run the workflow again. Now see that the data has been updated.
Example2: To Insert if record is not present in target and Delete it if SAL in Source < Sal in Target.
EMP will be the source table. Create a target table INS_DELETE_EXAMPLE with the same structure as EMP. Make EMPNO the Primary Key; we cannot delete a record if there is no Primary Key. Create shortcuts as necessary.
Make the session. Do not select Truncate, and select the Delete option for targets. Also use NORMAL mode. Make the workflow, run it and see the data. Now change the data in the target: update some SAL fields and delete 3-4 rows. Run the workflow again and see the target data.
A Stored Procedure transformation is an important tool for populating and maintaining databases. Database administrators create stored procedures to automate tasks that are too complicated for standard SQL statements.
Uses of a Stored Procedure in a mapping:
Check the status of a target database before loading data into it.
Determine if enough space exists in a database.
Perform a specialized calculation.
Drop and recreate indexes. (Mostly used for this in projects.)
Stored Procedures:
Connect to the source database and create the stored procedures given below:

CREATE OR REPLACE PROCEDURE sp_agg (in_deptno IN NUMBER,
                                    max_sal OUT NUMBER,
                                    min_sal OUT NUMBER,
                                    avg_sal OUT NUMBER,
                                    sum_sal OUT NUMBER)
AS
BEGIN
  SELECT MAX(sal), MIN(sal), AVG(sal), SUM(sal)
  INTO max_sal, min_sal, avg_sal, sum_sal
  FROM emp
  WHERE deptno = in_deptno
  GROUP BY deptno;
END;
/
CREATE OR REPLACE PROCEDURE sp_unconn_1_value (in_deptno IN NUMBER,
                                               max_sal OUT NUMBER)
AS
BEGIN
  SELECT MAX(sal)
  INTO max_sal
  FROM emp
  WHERE deptno = in_deptno;
END;
/
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give name. Ex: m_SP_CONN_EXAMPLE
4. Drag DEPT and Target table.
5. Transformation -> Import Stored Procedure -> Give Database Connection -> Connect -> Select the procedure sp_agg from the list.
6. Drag DEPTNO from SQ_DEPT to the stored procedure input port and also to DEPTNO port of target. 7. Connect the ports from procedure to target as shown below:
8. Mapping -> Validate 9. Repository -> Save Create Session and then workflow. Give connection information for all tables. Give connection information for Stored Procedure also. Run workflow and see the result in table.
Method of returning the value of output parameters to a port: Assign the output value to a local variable. Assign the output value to the system variable PROC_RESULT. (See Later)
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give name. Ex: m_sp_unconn_1_value
4. Drag DEPT and Target table.
5. Transformation -> Import Stored Procedure -> Give Database Connection -> Connect -> Select the procedure sp_unconn_1_value from the list. Click OK.
6. The Stored Procedure has been imported.
7. T/F -> Create Expression T/F. Pass DEPTNO from SQ_DEPT to the Expression T/F.
8. Edit the expression and create an output port OUT_MAX_SAL of decimal datatype.
9. Open the Expression editor and call the stored procedure as below:
Click OK and connect the port from expression to target as in mapping below:
10. Mapping -> Validate 11. Repository Save. Create Session and then workflow. Give connection information for all tables. Give connection information for Stored Procedure also. Run workflow and see the result in table.
PROC_RESULT use:
If the stored procedure returns a single output parameter or a return value, we use the reserved variable PROC_RESULT as the output variable.
Example: DEPTNO as input and MAX_SAL as output:
:SP.SP_UNCONN_1_VALUE(DEPTNO, PROC_RESULT)
If the stored procedure returns multiple output parameters, you must create variables for each output parameter.
Example: DEPTNO as input and MAX_SAL, MIN_SAL, AVG_SAL and SUM_SAL as output, then:
1. Create four variable ports in the expression: VAR_MAX_SAL, VAR_MIN_SAL, VAR_AVG_SAL and VAR_SUM_SAL.
2. Create four output ports in the expression: OUT_MAX_SAL, OUT_MIN_SAL, OUT_AVG_SAL and OUT_SUM_SAL.
3. Call the procedure in the last variable port, say VAR_SUM_SAL:
:SP.SP_AGG(DEPTNO, VAR_MAX_SAL, VAR_MIN_SAL, VAR_AVG_SAL, PROC_RESULT)
Example 2:
DEPTNO as input and MAX_SAL, MIN_SAL, AVG_SAL and SUM_SAL as output.
Stored Procedure to drop the index in Pre-Load of Target.
Stored Procedure to create the index in Post-Load of Target.
DEPT will be the source table. Create a target table SP_UNCONN_EXAMPLE with fields DEPTNO, MAX_SAL, MIN_SAL, AVG_SAL and SUM_SAL. Write the stored procedures in the database first and create shortcuts as needed.
Stored procedures are given below to drop and create index on target. Make sure to create target table first.
CREATE OR REPLACE PROCEDURE create_index
AS
BEGIN
  EXECUTE IMMEDIATE 'create index unconn_dept on SP_UNCONN_EXAMPLE(DEPTNO)';
END;
/

CREATE OR REPLACE PROCEDURE drop_index
AS
BEGIN
  EXECUTE IMMEDIATE 'drop index unconn_dept';
END;
/
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give name. Ex: m_sp_unconn_1_value
4. Drag DEPT and Target table.
5. Transformation -> Import Stored Procedure -> Give Database Connection -> Connect -> Select the procedure sp_agg from the list. Click OK.
6. The Stored Procedure has been imported.
7. T/F -> Create Expression T/F. Pass DEPTNO from SQ_DEPT to the Expression T/F.
8. Edit the Expression and create 4 variable ports and 4 output ports as shown below:
9. Call the procedure in the last variable port VAR_SUM_SAL:
10. :SP.SP_AGG(DEPTNO, VAR_MAX_SAL, VAR_MIN_SAL, VAR_AVG_SAL, PROC_RESULT)
11. Click Apply and OK.
12. Connect to the target table as needed.
13. Transformation -> Import Stored Procedure -> Give Database Connection for target -> Connect -> Select the procedures CREATE_INDEX and DROP_INDEX from the list. Click OK.
14. Edit DROP_INDEX -> Properties tab -> Select Target Pre Load as Stored Procedure Type and in the call text write drop_index. Click Apply -> OK.
15. Edit CREATE_INDEX -> Properties tab -> Select Target Post Load as Stored Procedure Type and in the call text write create_index. Click Apply -> OK.
Create the session and then the workflow. Give connection information for all tables. Give connection information for the stored procedures also. Also make sure that you execute the procedure CREATE_INDEX on the database before using these procedures in the mapping; if there is no index on the target table, DROP_INDEX will fail and the session will also fail. Run the workflow and see the result in the table.
NEXTVAL:
Use the NEXTVAL port to generate sequence numbers by connecting it to a transformation or target. For example, we might connect NEXTVAL to two target tables in a mapping to generate unique primary key values.
The sequence for Table 1 will be generated first; only when Table 1 has been loaded will the sequence for Table 2 be generated.
CURRVAL:
CURRVAL is NEXTVAL plus the Increment By value. We typically only connect the CURRVAL port when the NEXTVAL port is already connected to a downstream transformation. If we connect the CURRVAL port without connecting the NEXTVAL port, the Integration Service passes a constant value for each row. When we connect the CURRVAL port in a Sequence Generator transformation, the Integration Service processes one row in each block. We can optimize performance by connecting only the NEXTVAL port in a mapping.
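For example (a small illustration, assuming the default Increment By of 1): if NEXTVAL generates 1, 2, 3 for three rows, CURRVAL for those same rows is 2, 3, 4, i.e. NEXTVAL plus the Increment By value.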
Example: To use Sequence Generator transformation EMP will be source. Create a target EMP_SEQ_GEN_EXAMPLE in shared folder. Structure same as EMP. Add two more ports NEXT_VALUE and CURR_VALUE to the target table. Create shortcuts as needed.
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give name. Ex: m_seq_gen_example
4. Drag EMP and Target table.
5. Connect all ports from SQ_EMP to the target table.
6. Transformation -> Create -> Select Sequence Generator from list -> Create -> Done
7. Connect NEXT_VAL and CURR_VAL from the Sequence Generator to the target.
8. Validate Mapping
9. Repository -> Save
Create Session and then workflow. Give connection information for all tables. Run workflow and see the result in table.
Optional Sequence Generator properties: Cycle, Reset.
POINTS:
If Current Value is 1 and End Value is 10 with no Cycle option, and there are 17 records in the source, the session will fail.
If we connect just CURR_VAL only, the value will be the same for all records.
If Current Value is 1, End Value is 10, the Cycle option is set and Start Value is 0, with 17 records in the source, the sequence is: 1, 2, ... 10, then 0, 1, 2, 3, ...
To make the above sequence restart from 1 (1-10, then 1 onwards), give Start Value as 1. Start Value is used along with the Cycle option only.
If Current Value is 1, End Value is 10, the Cycle option is set and Start Value is 1, with 17 records in the source, the session runs and generates 1-10, then 1-7. 7 will be saved in the repository. If we run the session again, the sequence will start from 8.
Use the Reset option if you want to start the sequence from CURR_VAL every time.
3.23 MAPPLETS
A mapplet is a reusable object that we create in the Mapplet Designer. It contains a set of transformations and lets us reuse that transformation logic in multiple mappings. Created in Mapplet Designer in Designer Tool.
Suppose we need to use the same set of 5 transformations in, say, 10 mappings. Instead of building the 5 transformations in each of the 10 mappings, we create a mapplet of these 5 transformations and use this mapplet in all 10 mappings.
Example: To create a surrogate key in the target, we create a mapplet using a stored procedure to create the primary key for the target table. We give the target table name and key column name as input to the mapplet and get the surrogate key as output.
Mapplets help simplify mappings in the following ways:
Include source definitions: Use multiple source definitions and source qualifiers to provide source data for a mapping.
Accept data from sources in a mapping.
Include multiple transformations: As many transformations as we need.
Pass data to multiple transformations: We can create a mapplet to feed data to multiple transformations. Each Output transformation in a mapplet represents one output group in a mapplet.
Contain unused ports: We do not have to connect all mapplet input and output ports in a mapping.
Mapplet Input:
Mapplet input can originate from a source definition and/or from an Input transformation in the mapplet. We can create multiple pipelines in a mapplet. We use Mapplet Input transformation to give input to mapplet. Use of Mapplet Input transformation is optional.
Mapplet Output:
The output of a mapplet is not connected to any target table. We must use Mapplet Output transformation to store mapplet output. A mapplet must contain at least one Output transformation with at least one connected port in the mapplet.
Example 1: We will join the EMP and DEPT tables, then calculate total salary and give the output to a Mapplet Output transformation. EMP and DEPT will be source tables. Output will be given to the transformation Mapplet_Out.
Steps:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapplet Designer.
3. Click Mapplets -> Create -> Give name. Ex: mplt_example1
4. Drag EMP and DEPT table.
5. Use a Joiner transformation as described earlier to join them.
6. Transformation -> Create -> Select Expression from list -> Create -> Done
7. Pass all ports from the Joiner to the Expression and then calculate total salary as described in the Expression transformation section.
8. Now Transformation -> Create -> Select Mapplet Output from list -> Create -> Give name and then Done.
9. Pass all ports from the Expression to the Mapplet Output.
10. Mapplet -> Validate
11. Repository -> Save
Making a mapping: We will use mplt_example1, and then create a filter transformation to filter records whose Total Salary is >= 1500. mplt_example1 will be source. Create target table same as Mapplet_out transformation as in picture above.
Creating Mapping
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give name. Ex: m_mplt_example1
4. Drag mplt_Example1 and target table.
5. Transformation -> Create -> Select Filter from list -> Create -> Done.
6. Drag all ports from mplt_example1 to the Filter and give the filter condition.
7. Connect all ports from the Filter to the target. We can add more transformations after the Filter if needed.
8. Validate mapping and save it.
Make session and workflow. Give connection information for mapplet source tables. Give connection information for target table. Run workflow and see result.
Example 2: We will join the EMP and DEPT tables. The ports of the DEPT table will be passed to the mapplet in the mapping; we will use a Mapplet Input transformation to pass the ports of DEPT to the Joiner. Then calculate total salary and give the output to a Mapplet Output transformation. EMP will be the source table in the Mapplet Designer. DEPT ports will be created in the Mapplet Input and passed to the Joiner. Output will be given to the transformation Mapplet_Out.
Steps:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapplet Designer.
3. Click Mapplets -> Create -> Give name. Ex: mplt_example2
4. Drag EMP table.
5. Transformation -> Create -> Select Mapplet Input from list -> Create -> Done
6. Edit Mapplet Input.
7. Go to the Ports tab and add 3 ports: DEPTNO, DNAME and LOC.
8. Use a Joiner transformation as described earlier to join them.
9. Transformation -> Create -> Select Expression from list -> Create -> Done
10. Pass all ports from the Joiner to the Expression and then calculate total salary as described in the Expression transformation section.
11. Now Transformation -> Create -> Select Mapplet Output from list -> Create -> Give name and then Done.
12. Pass all ports from the Expression to the Mapplet Output.
13. Mapplet -> Validate
14. Repository -> Save
Making a mapping: We will use mplt_example2, and then create a filter transformation to filter records whose Total Salary is >= 1500. mplt_example2 will be source. Create target table same as Mapplet_out transformation as in picture above.
Creating Mapping
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give name. Ex: m_mplt_example2
4. Drag DEPT, mplt_Example2 and target table.
5. Pass all ports from DEPT to the input ports of mplt_Example2.
6. Transformation -> Create -> Select Filter from list -> Create -> Done.
7. Drag all ports from mplt_example2 to the Filter and give the filter condition.
8. Connect all ports from the Filter to the target. We can add more transformations after the Filter if needed.
9. Validate mapping and save it.
Make session and workflow. Give connection information for mapplet source tables. Give connection information for target table. Run workflow and see result.
Example 1: To create 4 records of every employee in EMP table. EMP will be source table. Create target table Normalizer_Multiple_Records. Structure same as EMP and datatype of HIREDATE as VARCHAR2. Create shortcuts as necessary.
Creating Mapping
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give name. Ex: m_Normalizer_Multiple_Records
4. Drag EMP and Target table.
5. Transformation -> Create -> Select Expression -> Give name, click Create, Done.
6. Pass all ports from SQ_EMP to the Expression transformation.
7. Transformation -> Create -> Select Normalizer -> Give name, Create and Done.
8. Try dragging ports from the Expression to the Normalizer. It is not possible.
9. Edit the Normalizer and go to the Normalizer tab. Add columns equal to the columns in the EMP table, with the same datatypes.
10. The Normalizer does not have a DATETIME datatype, so convert HIREDATE to char in the Expression t/f. Create an output port out_hdate and do the conversion (see the sketch after these steps).
11. Connect ports from the Expression to the Normalizer.
12. Edit the Normalizer and go to the Normalizer tab. As EMPNO identifies source records and we want 4 records of every employee, give OCCURS for EMPNO as 4.
13. Click Apply and then OK. 14. Add link as shown in mapping below:
15. Mapping -> Validate
16. Repository -> Save
Make session and workflow. Give connection information for source and target tables. Run the workflow and see the result.
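For step 10, the conversion expression for out_hdate might look like the following. This is only a sketch; the format string is an assumption and should match the target column:

TO_CHAR(HIREDATE, 'MM/DD/YYYY')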
Make the source a flat file. Import it and create the target table. Create the mapping as before. In the Normalizer tab, create only 3 ports, Roll_Number, Name and Marks, as there are 3 columns in the target table. Also, as we have 3 marks in the source, give Occurs as 3 for Marks in the Normalizer tab. Connect accordingly and connect to the target. Validate and save. Make the session and workflow and run it. Give the Source File Directory and Source File Name for the source flat file in the source properties in the Mapping tab of the session. See the result.
Steps:
1. Open Shared Folder -> Tools -> Source Analyzer
2. Sources -> Import XML Definition.
3. Browse for the location where the XML file is present. To import the definition, we should have the XML file on the local system on which we are working.
4. Select the file and click Open.
5. The message "Option for Override Infinite Length is not set. Do you want to set it?" is displayed.
6. Click Yes.
7. Check "Override all infinite lengths with value" and give the value as 2.
8. Do not modify other options and click OK.
9. Click NEXT and then click FINISH.
10. The definition has been imported and can be used in a mapping as we select other sources.
SESSION PROPERTIES
Open the session for mapping where we used XML sources. In mapping tab, select the XML source. In properties, we do not give relational connection here. We give Source File Directory and Source Filename information.
Steps: 1. Open the folder where we want to create the mapping. 2. In the Mapping Designer, click Mappings > Wizards > Getting Started. 3. Enter a mapping name and select Simple Pass Through, and click next. 4. Select a source definition to use in the mapping. 5. Enter a name for the mapping target table and click Finish. 6. To save the mapping, click Repository > Save.
Handling Keys: When we use the Slowly Growing Target option, the Designer creates an additional column in target, PM_PRIMARYKEY. In this column, the Integration Service generates a primary key for each row written to the target, incrementing new key values by 1. Steps: 1. Open the folder where we want to create the mapping. 2. In the Mapping Designer, click Mappings > Wizards > Getting Started. 3. Enter a mapping name and select Slowly Growing Target, and click next. 4. Select a source definition to be used in the mapping. 5. Enter a name for the mapping target table. Click Next. 6. Select the column or columns from the Target Table Fields list that we want the Integration Service to use to look up data in the target table. Click Add. These columns are used to compare source and target.
7. Click Finish. 8. To save the mapping, click Repository > Save. Note: The Fields to Compare for Changes field is disabled for the Slowly Growing Targets mapping.
Handling Keys: When we use the SCD Type1 option, the Designer creates an additional column in target, PM_PRIMARYKEY. Value incremented by +1. Steps: 1. Open the folder where we want to create the mapping. 2. In the Mapping Designer, click Mappings > Wizards > Slowly Changing Dimension. 3. Enter a mapping name and select Type 1 Dimension, and click Next. 4. Select a source definition to be used by the mapping. 5. Enter a name for the mapping target table. Click Next. 6. Select the column or columns we want to use as a lookup condition from the Target Table Fields list and click add. 7. Select the column or columns we want the Integration Service to compare for changes, and click add.
8. Click Finish. 9. To save the mapping, click Repository > Save. Configuring Session: In the session properties, click the Target Properties settings on the Mappings tab. To ensure the Integration Service loads rows to the target properly, select Insert and Update as Update for each relational target. Flow1: New record is inserted into target table. Flow2: Changed record is updated into target table.
Note: In the Type 1 Dimension mapping, the Designer uses two instances of the same target definition to enable inserting and updating data in the same target table. Generate only one target table in the target database.
When we use this option, the Designer creates two additional fields in the target:
1. PM_PRIMARYKEY: The Integration Service generates a primary key for each row written to the target.
2. PM_VERSION_NUMBER: The IS generates a version number for each row written to the target.
Steps:
1. Follow Steps 1-7 as we did in SCD Type 1, except select Type 2 Dimension in Step 3.
2. Click Next. Select Keep the 'Version' Number in Separate Column.
3. Click Finish.
4. To save the mapping, click Repository > Save.
Note: The Designer uses two instances of the same target definition to enable the two separate data flows to write to the same target table. Generate only one target table in the target database.
Configuring Session: In the session properties, click the Target Properties settings on the Mappings tab. To ensure the Integration Service loads rows to the target properly, select Insert for each relational target.
Flow1: New record is inserted into the target table.
Flow2: Changed record is inserted into the target table.
When we use this option, the Designer creates two additional fields in the target:
1. PM_PRIMARYKEY: The Integration Service generates a primary key for each row written to the target.
2. PM_CURRENT_FLAG: The Integration Service flags the current row "1" and all previous versions "0".
Steps:
1. Follow Steps 1-7 as we did in SCD Type 1, except select Type 2 Dimension in Step 3.
2. Click Next. Select Mark the 'Current' Dimension Record with a Flag.
3. Click Finish.
4. To save the mapping, click Repository > Save.
Note: In the Type 2 Dimension/Flag Current mapping, the Designer uses three instances of the same target definition to enable the three separate data flows to write to the same target table. Generate only one target table in the target database.
Configuring Session: In the session properties, click the Target Properties settings on the Mappings tab. To ensure the Integration Service loads rows to the target properly, select Insert and Update as Update for each relational target.
Flow1: New record is inserted into the target table.
Flow2: Changed record is inserted into the target table.
Flow3: The Current Flag of the changed record is updated in the target table.
When we use this option, the Designer creates 3 additional fields in the target:
1. PM_PRIMARYKEY: The Integration Service generates a primary key for each row written to the target.
2. PM_BEGIN_DATE: For each new and changed record, it is populated with SYSDATE. This SYSDATE is the date on which the ETL process runs.
3. PM_END_DATE: It is populated as NULL when a record is inserted. A new record is inserted when a record changes; the PM_END_DATE of the changed record is then updated with SYSDATE.
Steps:
1. Follow Steps 1-7 as we did in SCD Type 1, except select Type 2 Dimension in Step 3.
2. Click Next. Select Mark the Dimension Records with their Effective Date Range.
3. Click Finish.
4. To save the mapping, click Repository > Save.
Configuring Session: It is the same as we did in SCD Type 2 Flag Current.
Flow1: New record is inserted into the target table with PM_BEGIN_DATE as SYSDATE.
Flow2: Changed record is inserted into the target with PM_BEGIN_DATE as SYSDATE.
Flow3: The PM_END_DATE of the changed record is updated in the target table.
When we use this option, the Designer creates the following additional fields in the target:
1. PM_PRIMARYKEY: The Integration Service generates a primary key for each row written to the target.
2. PM_PREV_ColumnName: The Designer generates a previous column corresponding to each column for which we want historical data. The IS keeps the previous version of the record data in these columns.
3. PM_EFFECT_DATE: An optional field. The IS uses the system date to indicate when it creates or updates a dimension.
Steps:
1. Follow Steps 1-7 as we did in SCD Type 1, except select Type 3 Dimension in Step 3.
2. Click Next. Select Effective Date if desired.
3. Click Finish.
4. To save the mapping, click Repository > Save.
Configuring Session: It is the same as we did in SCD Type 2 Flag Current.
Flow1: New record is inserted into the target table.
Flow2: Changed record is updated in the target table.
MAPPING PARAMETERS
A mapping parameter represents a constant value that we can define before running a session. A mapping parameter retains the same value throughout the entire session.
Example: When we want to extract the records of a particular month during the ETL process, we create a Mapping Parameter and use it in the SQL override to compare it with the timestamp field. After we create a parameter, it appears in the Expression Editor. We can then use the parameter in any expression in the mapplet or mapping. We can also use parameters in a source qualifier filter, user-defined join, or extract override, and in the Expression Editor of reusable transformations.
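As a sketch, a Source Qualifier SQL override using such a parameter might look like the following ($$LoadMonth is an assumed parameter name; the actual name and date format depend on how the parameter is defined):

SELECT * FROM EMP
WHERE TO_CHAR(HIREDATE, 'YYYY-MM') = '$$LoadMonth'

The Integration Service expands $$LoadMonth to the value supplied (for example, in the parameter file) before running the query.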
MAPPING VARIABLES
Unlike mapping parameters, mapping variables are values that can change between sessions. The Integration Service saves the latest value of a mapping variable to the repository at the end of each successful session. We can override a saved value with the parameter file. We can also clear all saved values for the session in the Workflow Manager.
We might use a mapping variable to perform an incremental read of the source. For example, we have a source table containing timestamped transactions and we want to evaluate the transactions on a daily basis. Instead of manually entering a session override to filter source data each time we run the session, we can create a mapping variable, $$IncludeDateTime. In the source qualifier, create a filter to read only rows whose transaction date equals $$IncludeDateTime, such as: TIMESTAMP = $$IncludeDateTime In the mapping, use a variable function to set the variable value to increment one day each time the session runs. If we set the initial value of $$IncludeDateTime to 8/1/2004, the first time the Integration Service runs the session, it reads only rows dated 8/1/2004. During the session, the Integration Service sets $$IncludeDateTime to 8/2/2004. It saves 8/2/2004 to the repository at the end of the session. The next time it runs the session, it reads only rows from August 2, 2004.
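A sketch of the variable function that could be used in an Expression transformation to advance the variable by one day on each run (the port and variable names follow the example above):

SETVARIABLE($$IncludeDateTime, ADD_TO_DATE($$IncludeDateTime, 'DD', 1))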
Variable Values:
Start Value:
The start value is the value of the variable at the start of the session. The Integration Service looks for the start value in the following order: 1. Value in parameter file 2. Value saved in the repository 3. Initial value 4. Default value
Current Value:
The current value is the value of the variable as the session progresses. When a session starts, the current value of a variable is the same as the start value. The final current value for a variable is saved to the repository at the end of a successful session. When a session fails to complete, the Integration Service does not update the value of the variable in the repository. Note: If a variable function is not used to calculate the current value of a mapping variable, the start value of the variable is saved to the repository.
Variable Functions
Variable functions determine how the Integration Service calculates the current value of a mapping variable in a pipeline. SetMaxVariable: Sets the variable to the maximum value of a group of values. It ignores rows marked for update, delete, or reject. Aggregation type set to Max. SetMinVariable: Sets the variable to the minimum value of a group of values. It ignores rows marked for update, delete, or reject. Aggregation type set to Min. SetCountVariable: Increments the variable value by one. It adds one to the variable value when a row is marked for insertion, and subtracts one when the row is marked for deletion. It ignores rows marked for update or reject. Aggregation type set to Count. SetVariable: Sets the variable to the configured value. At the end of a session, it compares the final current value of the variable to the start value of the variable. Based on the aggregate type of the variable, it saves a final value to the repository.
4. Enter name. Do not remove $$ from name. 5. Select Type and Datatype. Select Aggregation type for mapping variables. 6. Give Initial Value. Click ok.
Creating Mapping
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give name. Ex: m_mp_mv_example
4. Drag EMP and target table.
5. Transformation -> Create -> Select Expression from list -> Create -> Done.
6. Drag EMPNO, ENAME, HIREDATE, SAL, COMM and DEPTNO to the Expression.
7. Create Parameter $$Bonus and give the initial value as 200.
8. Create variable $$var_max of MAX aggregation type and initial value 1500.
9. Create variable $$var_min of MIN aggregation type and initial value 1500.
10. Create variable $$var_count of COUNT aggregation type and initial value 0. COUNT is visible when the datatype is INT or SMALLINT.
11. Create variable $$var_set of MAX aggregation type.
13. Create 5 output ports: out_TOTAL_SAL, out_MAX_VAR, out_MIN_VAR, out_COUNT_VAR and out_SET_VAR.
14. Open the expression editor for out_TOTAL_SAL. Do the same as we did earlier for SAL + COMM. To add $$Bonus to it, select the Variables tab and select the parameter from Mapping Parameters:
SAL + COMM + $$Bonus
15. Open the Expression editor for out_max_var.
16. Select the variable function SETMAXVARIABLE from the left-side pane. Select $$var_max from the Variables tab and SAL from the Ports tab as shown below:
SETMAXVARIABLE($$var_max, SAL)
17. Open the Expression editor for out_min_var and write the following expression: SETMINVARIABLE($$var_min, SAL). Validate the expression.
18. Open the Expression editor for out_count_var and write the following expression: SETCOUNTVARIABLE($$var_count). Validate the expression.
19. Open the Expression editor for out_set_var and write the following expression: SETVARIABLE($$var_set, ADD_TO_DATE(HIREDATE,'MM',1)). Validate.
20. Click OK. The Expression transformation is shown below:
21. Link all ports from expression to target and Validate Mapping and Save it. 22. See mapping picture on next page.
Make session and workflow. Give connection information for source and target table. Run workflow and see result.
A parameter file contains the following types of parameters and variables:
Workflow variable: References values and records information in a workflow.
Worklet variable: References values and records information in a worklet. Use predefined worklet variables in a parent workflow, but we cannot use workflow variables from the parent workflow in a worklet.
Session parameter: Defines a value that can change from session to session, such as a database connection or file name.
Mapping parameter and Mapping variable.
To enter a parameter file in the session properties:
1. Open a session in the Workflow Manager.
2. Click the Properties tab and open the General Options settings.
3. Enter the parameter directory and name in the Parameter Filename field.
4. Example: D:\Files\Para_File.txt or $PMSourceFileDir\Para_File.txt
5. Click OK.
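For illustration, the contents of such a parameter file might look like the following. This is only a sketch; the folder, workflow and session names and the parameter values are assumptions:

[Practice.WF:wf_mp_mv_example.ST:s_m_mp_mv_example]
$$Bonus=300
$$IncludeDateTime=01/01/2005
$DBConnection_Source=ORACLE_SOURCE_DEV

Each section header names the folder, workflow and session the values apply to; the lines below it assign values to mapping parameters and variables (prefixed with $$) and session parameters (prefixed with $).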
Chapter 4
Workflow Manager
To move data from sources to targets, the Integration Service uses the following components: Integration Service process Load Balancer Data Transformation Manager (DTM) process
Steps to assign IS from Menu: 1. Close all folders in the repository. 2. Click Service > Assign Integration Service. 3. From the Choose Integration Service list, select the service we want to assign. 4. From the Show Folder list, select the folder we want to view. 5. Click the Selected check box for each workflow you want the Integration Service to run. 6. Click Assign.
Valid Workflow:
Example of loop:
System variables:
Use the SYSDATE and WORKFLOWSTARTTIME system variables within a workflow.
Task-specific variables:
The Workflow Manager provides a set of task-specific variables for each task in the workflow. The Workflow Manager lists task-specific variables under the task name in the Expression Editor.
Condition (Decision task): Result of the decision condition expression. NULL if the task fails.
EndTime (All tasks): Date and time when a task ended.
ErrorCode (All tasks): Last error code for the associated task. 0 if there is no error.
ErrorMsg (All tasks): Last error message for the associated task. Empty string if there is no error.
FirstErrorCode (Session): Error code for the first error message in the session. 0 if there is no error.
FirstErrorMsg (Session): First error message in the session. Empty string if there is no error.
PrevTaskStatus (All tasks): Status of the previous task in the workflow that the IS ran. Can be ABORTED, FAILED, STOPPED, SUCCEEDED.
SrcFailedRows (Session): Total number of rows the Integration Service failed to read from the source.
SrcSuccessRows (Session): Total number of rows successfully read from the sources.
StartTime (All tasks): Date and time when the task started.
Status (All tasks): Status of the previous task in the workflow. Can be ABORTED, DISABLED, FAILED, NOTSTARTED, STARTED, STOPPED, SUCCEEDED.
TgtFailedRows (Session): Total number of rows the Integration Service failed to write to the target.
TgtSuccessRows (Session): Total number of rows successfully written to the target.
TotalTransErrors (Session): Total number of transformation errors.
6. Enter the default value for the variable in the Default field. 7. To validate the default value of the new workflow variable, click the Validate button. 8. Click Apply to save the new workflow variable. 9. Click OK to close the workflow properties.
Example: Suppose we want to read data from 10 different databases containing the same table and then load it into the same database table.
Solution 1: Open the session and give the connection for each database, 10 times, and then run the workflow.
Solution 2: Create a session parameter for the source database connection and give its value in the parameter file.
Session Parameter Types and Naming Conventions:
Database Connection: $DBConnectionName
Source File: $InputFileName
Target File: $OutputFileName
Lookup File: $LookupFileName
Reject File: $BadFileName
Source file, target file, lookup file, reject file parameters are used for Flat Files.
Similarly give the parameter for target and reject file in Target properties. For Lookup file parameter, select Lookup file in Transformations node and give the parameter there for Lookup file name.
1. In the Task Developer or Workflow Designer, choose Tasks -> Create.
2. Select an Email task and enter a name for the task. Click Create. Click Done.
3. Double-click the Email task in the workspace. The Edit Tasks dialog box appears.
4. Click the Properties tab.
5. Enter the fully qualified email address of the mail recipient in the Email User Name field.
6. Enter the subject of the email in the Email Subject field. Or, you can leave this field blank.
7. Click the Open button in the Email Text field to open the Email Editor.
8. Click OK twice to save your changes.
Create a workflow wf_sample_email.
Drag any session task to the workspace.
Edit the Session task and go to the Components tab.
See the On Success Email option there and configure it.
In Type, select Reusable or Non-reusable. In Value, select the email task to be used.
Click Apply -> OK.
Validate the workflow and Repository -> Save.
We can also drag the Email task and use it as per need. We can set the option to send email on success or failure in the Components tab of a Session task.
Steps for creating a Command task:
1. In the Task Developer or Workflow Designer, choose Tasks -> Create.
2. Select Command Task for the task type.
3. Enter a name for the Command task. Click Create. Then click Done.
4. Double-click the Command task. Go to the Commands tab.
5. In the Commands tab, click the Add button to add a command.
6. In the Name field, enter a name for the new command.
7. In the Command field, click the Edit button to open the Command Editor.
8. Enter only one command in the Command Editor (see the sketch below).
9. Click OK to close the Command Editor.
10. Repeat steps 5-9 to add more commands in the task.
11. Click OK.
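For step 8, a sample command to copy a file might look like the following (a sketch assuming a Windows Integration Service host and hypothetical paths):

copy D:\FILES\abc.txt D:\BACKUP\abc.txt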
Steps to create the workflow using the Command task:
1. Create a task using the above steps to copy a file in the Task Developer.
2. Open the Workflow Designer. Workflow -> Create -> Give name and click OK.
3. Start is displayed. Drag a session, say s_m_Filter_example, and the Command task.
4. Link Start to the Session task and the Session to the Command task.
5. Double click the link between the Session and the Command task and give the condition in the editor as:
6. $S_M_FILTER_EXAMPLE.Status=SUCCEEDED
7. Workflow -> Validate
8. Repository -> Save
Types of Event tasks:
EVENT RAISE: The Event-Raise task represents a user-defined event. We use this task to raise a user-defined event.
EVENT WAIT: The Event-Wait task waits for a file watcher event or a user-defined event to occur before executing the next session in the workflow.
Example 1: Use an Event-Wait task and make sure that session s_filter_example runs only when the abc.txt file is present in the D:\FILES folder.
Steps for creating the workflow:
1. Workflow -> Create -> Give name wf_event_wait_file_watch -> Click OK.
2. Task -> Create -> Select Event Wait. Give name. Click Create and Done.
3. Link Start to the Event Wait task.
4. Drag s_filter_example to the workspace and link it to the Event Wait task.
5. Right click on the Event Wait task and click EDIT -> EVENTS tab. Select the Pre Defined option there.
6. In the blank space, give the directory and filename to watch. Example: D:\FILES\abc.txt
7. Workflow -> Validate and Repository -> Save.
Example 2: Raise a user-defined event when session s_m_filter_example succeeds. Capture this event in an Event-Wait task and run session S_M_TOTAL_SAL_EXAMPLE.
Steps for creating the workflow:
1. Workflow -> Create -> Give name wf_event_wait_event_raise -> Click OK.
2. Workflow -> Edit -> Events tab and add event EVENT1 there.
3. Drag s_m_filter_example and link it to the START task.
4. Click Tasks -> Create -> Select EVENT RAISE from list. Give name ER_Example. Click Create and then Done.
5. Link ER_Example to s_m_filter_example.
6. Right click ER_Example -> EDIT -> Properties tab -> Open Value for User Defined Event and select EVENT1 from the list displayed. Apply -> OK.
7. Click the link between ER_Example and s_m_filter_example and give the condition $S_M_FILTER_EXAMPLE.Status=SUCCEEDED
8. Click Tasks -> Create -> Select EVENT WAIT from list. Give name EW_WAIT. Click Create and then Done.
9. Link EW_WAIT to the START task.
10. Right click EW_WAIT -> EDIT -> EVENTS tab.
11. Select User Defined there. Select EVENT1 by clicking the Browse Events button.
12. Apply -> OK.
13. Drag S_M_TOTAL_SAL_EXAMPLE and link it to EW_WAIT.
14. Mapping -> Validate
15. Repository -> Save.
16. Run the workflow and see.
Example: The Command task should run only if either s_m_filter_example or S_M_TOTAL_SAL_EXAMPLE succeeds. If either of them fails, then S_m_sample_mapping_EMP should run.
Steps for creating the workflow:
1. Workflow -> Create -> Give name wf_decision_task_example -> Click OK.
2. Drag s_m_filter_example and S_M_TOTAL_SAL_EXAMPLE to the workspace and link both of them to the START task.
3. Click Tasks -> Create -> Select DECISION from list. Give name DECISION_Example. Click Create and then Done. Link DECISION_Example to both s_m_filter_example and S_M_TOTAL_SAL_EXAMPLE.
4. Right click DECISION_Example -> EDIT -> GENERAL tab.
5. Set Treat Input Links As to OR. Default is AND. Apply and click OK.
6. Now edit the Decision task again and go to the PROPERTIES tab. Open the Expression editor by clicking the VALUE section of the Decision Name attribute and enter the following condition: $S_M_FILTER_EXAMPLE.Status = SUCCEEDED OR $S_M_TOTAL_SAL_EXAMPLE.Status = SUCCEEDED
7. Validate the condition -> Click Apply -> OK.
8. Drag the Command task and the S_m_sample_mapping_EMP task to the workspace and link them to the DECISION_Example task.
9. Double click the link between S_m_sample_mapping_EMP and DECISION_Example and give the condition: $DECISION_Example.Condition = 0. Validate and click OK.
10. Double click the link between the Command task and DECISION_Example and give the condition: $DECISION_Example.Condition = 1. Validate and click OK.
11. Workflow -> Validate and Repository -> Save.
12. Run the workflow and see the result.
Control Options: Fail Me, Fail Parent, Stop Parent, Abort Parent, Fail Top-Level WF, Stop Top-Level WF, Abort Top-Level WF.
Example: Drag any 3 sessions, and if any one fails, then abort the top-level workflow.
Steps for creating the workflow:
1. Workflow -> Create -> Give name wf_control_task_example -> Click OK.
2. Drag any 3 sessions to the workspace and link all of them to the START task.
3. Click Tasks -> Create -> Select CONTROL from list. Give name cntr_task. Click Create and then Done.
4. Link all sessions to the control task cntr_task.
5. Double click the link between cntr_task and any session, say s_m_filter_example, and give the condition: $S_M_FILTER_EXAMPLE.Status = SUCCEEDED.
6. Repeat the above step for the remaining 2 sessions also.
7. Right click cntr_task -> EDIT -> GENERAL tab. Set Treat Input Links As to OR. Default is AND.
8. Go to the PROPERTIES tab of cntr_task and select the value Fail Top-Level Workflow for Control Option. Click Apply and OK.
9. Workflow -> Validate and Repository -> Save.
10. Run the workflow and see the result.
Steps to create an Assignment task:
1. Open any workflow where we want to use the Assignment task.
2. Edit the workflow and add user-defined variables.
3. Choose Tasks -> Create. Select Assignment Task for the task type.
4. Enter a name for the Assignment task. Click Create. Then click Done.
5. Double-click the Assignment task to open the Edit Task dialog box.
6. On the Expressions tab, click Add to add an assignment.
7. Click the Open button in the User Defined Variables field.
8. Select the variable for which you want to assign a value. Click OK.
9. Click the Edit button in the Expression field to open the Expression Editor.
10. Enter the value or expression you want to assign.
11. Repeat steps 7-10 to add more variable assignments as necessary.
12. Click OK.
We can use the user-defined variable in our link conditions as per the need, and we can also calculate or set the value of the variable in the Assignment task.
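For illustration, assuming a user-defined workflow variable $$Run_Count of integer type (a hypothetical name), the Assignment task could set:

$$Run_Count = $$Run_Count + 1

A link condition leaving the Assignment task could then test the variable, for example $$Run_Count = 1.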
4.4 SCHEDULERS
We can schedule a workflow to run continuously, repeat at a given time or interval, or we can manually start a workflow. The Integration Service runs a scheduled workflow as configured. By default, the workflow runs on demand. We can change the schedule settings by editing the scheduler. If we change schedule settings, the Integration Service reschedules the workflow according to the new settings. A scheduler is a repository object that contains a set of schedule settings. Scheduler can be non-reusable or reusable. The Workflow Manager marks a workflow invalid if we delete the scheduler associated with the workflow. If we choose a different Integration Service for the workflow or restart the Integration Service, it reschedules all workflows. If we delete a folder, the Integration Service removes workflows from the schedule.
The Integration Service does not run the workflow if:
The prior workflow run fails.
We remove the workflow from the schedule.
The Integration Service is running in safe mode.
Steps:
1. Open the folder where we want to create the scheduler.
2. In the Workflow Designer, click Workflows > Schedulers.
3. Click Add to add a new scheduler.
4. In the General tab, enter a name for the scheduler.
5. Configure the scheduler settings in the Scheduler tab.
6. Click Apply and OK.
1. Run on Demand: The Integration Service runs the workflow when we start the workflow manually.
2. Run Continuously: The Integration Service runs the workflow as soon as the service initializes. The Integration Service then starts the next run of the workflow as soon as it finishes the previous run.
3. Run on Server Initialization: The Integration Service runs the workflow as soon as the service is initialized. The Integration Service then starts the next run of the workflow according to the settings in Schedule Options.
Schedule Options for Run on Server Initialization:
Run Once: Run the workflow just once.
Run Every: Run the workflow at regular intervals, as configured.
Customized Repeat: The Integration Service runs the workflow on the dates and times specified in the Repeat dialog box.
Start options for Run on Server initialization: Start Date Start Time
End options for Run on Server Initialization:
End On: The IS stops scheduling the workflow on the selected date.
End After: The IS stops scheduling the workflow after the set number of workflow runs.
Forever: The IS schedules the workflow as long as the workflow does not fail.
5. If we select Reusable, choose a reusable scheduler from the Scheduler Browser dialog box. 6. Click Ok.
Some Points: To remove a workflow from its schedule, right-click the workflow in the Navigator window and choose Unschedule Workflow. To reschedule a workflow on its original schedule, right-click the workflow in the Navigator window and choose Schedule Workflow.
4.5 WORKLETS
A worklet is an object that represents a set of tasks that we create in the Worklet Designer. Create a worklet when we want to reuse a set of workflow logic in more than one workflow. To run a worklet, include the worklet in a workflow. Worklet is created in the same way as we create Workflows. Tasks are also added in the same way as we do in workflows. We can link tasks and give link conditions in same way.
Worklets can be:
Reusable Worklet: Created in the Worklet Designer.
1. In the Worklet Designer, click Worklet > Create.
2. Enter a name for the worklet.
3. Click OK.
4. Add tasks as needed. Give links and conditions.
5. Worklet -> Validate
6. Repository -> Save
Non-Reusable Worklet: Created in the Workflow Designer.
1. In the Workflow Designer, open a workflow.
2. Click Tasks > Create.
3. For the Task type, select Worklet.
4. Enter a name for the task.
5. Click Create.
6. Click Done.
To add tasks to a non-reusable worklet:
1. Create a non-reusable worklet in the Workflow Designer workspace.
2. Right-click the worklet and choose Open Worklet.
3. Add tasks in the worklet by using the Tasks toolbar or click Tasks > Create in the Worklet Designer.
4. Connect tasks with links.
Some Points: We cannot run two instances of the same worklet concurrently in the same workflow. We cannot run two instances of the same worklet concurrently across two different workflows. Each worklet instance in the workflow can run once.
4.6 PARTITIONING
A pipeline consists of a source qualifier and all the transformations and targets that receive data from that source qualifier. When the Integration Service runs the session, it can achieve higher performance by partitioning the pipeline and performing the extract, transformation, and load for each partition in parallel.
A partition is a pipeline stage that executes in a single reader, transformation, or writer thread. The number of partitions in any pipeline stage equals the number of threads in the stage. By default, the Integration Service creates one partition in every pipeline stage.
2. Number of Partitions We can define up to 64 partitions at any partition point in a pipeline. When we increase or decrease the number of partitions at any partition point, the Workflow Manager increases or decreases the number of partitions at all partition points in the pipeline. Increasing the number of partitions or partition points increases the number of threads. The number of partitions we create equals the number of connections to the source or target. For one partition, one database connection will be used.
3. Partition types The Integration Service creates a default partition type at each partition point. If we have the Partitioning option, we can change the partition type. This option is purchased separately. The partition type controls how the Integration Service distributes data among partitions at partition points.
6. Key range Partition Type We specify one or more ports to form a compound partition key. The Integration Service passes data to each partition depending on the ranges we specify for each port. Use key range partitioning where the sources or targets in the pipeline are partitioned by key range. Example: Customer 1-100 in one partition, 101-200 in another and so on. We define the range for each partition.
2. PROPERTIES TAB
The Properties tab has the following settings (Required/Optional):
Write Backward Compatible Session Log File (Optional): Select to write the session log to a file.
Session Log File Name (Optional): Name of the session log file.
Session Log File Directory (Required): Location where the session log is created.
Parameter File Name (Optional): Name and location of the parameter file.
Enable Test Load (Optional): To test a mapping.
Number of Rows to Test (Optional): Number of rows of source data to test.
$Source Connection Value (Optional): Enter the database connection we want to use for the $Source variable.
$Target Connection Value (Optional): Enter the database connection we want to use for the $Target variable.
Treat Source Rows As (Required): Indicates how the IS treats all source rows. Can be Insert, Update, Delete or Data Driven.
Commit Type (Required): Determines whether the Integration Service uses a source-based, target-based or user-defined commit.
Commit Interval (Required): Indicates the number of rows after which a commit is fired.
We can configure performance settings on the Properties tab. In Performance settings, we can increase memory size, collect performance details, and set configuration parameters.
RECOVERY STRATEGY
Workflow recovery allows us to continue processing the workflow and workflow tasks from the point of interruption. We can recover a workflow if the Integration Service can access the workflow state of operation. The Integration Service recovers tasks in the workflow based on the recovery strategy of the task. By default, the recovery strategy for Session and Command tasks is to fail the task and continue running the workflow. We can configure the recovery strategy for Session and Command tasks. The strategy for all other tasks is to restart the task.
Recovery strategy options for a Session task:
Resume from the last checkpoint.
Restart task.
Fail task and continue workflow.
Recovery strategy options for a Command task:
Restart task.
Fail task and continue workflow.
Target Recovery Tables: When the Integration Service runs a session that has a resume recovery strategy, it writes to recovery tables on the target database system. The following recovery tables are used: PM_RECOVERY, PM_TGT_RUN_ID.
Recovery Options:
Suspend Workflow on Error: Available in Workflow
Suspension Email: Available in Workflow
Enable HA Recovery: Available in Workflow
Automatically Recover Terminated Tasks: Available in Workflow
Maximum Automatic Recovery Attempts: Available in Workflow
Recovery Strategy: Available in Session and Command
Fail Task If Any Command Fails: Available in Command
6. COMPONENTS TAB
In the Components tab, we can configure the following: Pre-Session Command Post-Session Success Command Post-Session Failure Command On Success Email On Failure Email
2. PROPERTIES TAB
The Properties tab has the following options:
Parameter File Name
Write Backward Compatible Workflow Log File: Select to write the workflow log to a file. It is optional.
Workflow Log File Name
Workflow Log File Directory
Save Workflow Log By: Required. Options are By Run and By Timestamp.
Save Workflow Log For These Runs: Required. How many logs need to be saved for a workflow.
Enable HA Recovery: Not required.
Automatically Recover Terminated Tasks: Not required.
Maximum Automatic Recovery Attempts: Not required.
3. SCHEDULER TAB
The Scheduler Tab lets us schedule a workflow to run continuously, run at a given interval, or manually start a workflow.
4. VARIABLE TAB
It is used to declare User defined workflow variables.
5. EVENTS TAB
Before using the Event-Raise task, declare a user-defined event on the Events tab.