How Can We Update A Record in a Target Table Without Using Update Strategy?
A target table can be updated without using an Update Strategy transformation. To do this, we need to define the key of the target table at the Informatica level and then connect the key and the field we want to update in the mapping target. At the session level, we should set the target property to "Update as Update" and check the "Update" check-box. Let's assume we have a target table "Customer" with the fields "Customer ID", "Customer Name" and "Customer Address", and suppose we want to update "Customer Address" without an Update Strategy. We then define "Customer ID" as the primary key at the Informatica level and connect the Customer ID and Customer Address fields in the mapping. If the session properties are set as described above, the mapping will update only the Customer Address field for all matching Customer IDs.
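Conceptually, the writer then issues a key-based UPDATE for each row it receives. A rough sketch of such a statement for the example above (the table and column names are the hypothetical ones from this example, and the ? placeholders stand for the connected port values):

UPDATE CUSTOMER
SET    CUSTOMER_ADDRESS = ?   -- non-key port connected in the mapping
WHERE  CUSTOMER_ID = ?        -- key defined at the Informatica level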
Under what conditions may selecting Sorted Input in an Aggregator fail the session?
If the input data is not actually sorted, the session will fail. Even if the input data is properly sorted, the session may fail if the ports the data is sorted by and the group-by ports of the Aggregator are not in the same order.
Here our source is the DEPT table and the EMP table is used as a lookup. The lookup condition is based on the department number.
Basically, we try to achieve the same result as the SQL select below:
SELECT DEPT.DEPTNO, DEPT.DNAME, DEPT.LOC, EMP.ENAME, EMP.SAL
FROM DEPT LEFT OUTER JOIN EMP ON DEPT.DEPTNO = EMP.DEPTNO
When we run a session, the Integration Service may create a reject file for each target instance in the mapping to store the rejected target records. With the help of the session log and the reject file we can identify the cause of data rejection in the session. Eliminating the cause of rejection will lead to rejection-free loads in subsequent session runs. If the Informatica writer or the target database rejects data for any valid reason, the Integration Service logs the rejected records into the reject file. Every time we run the session, the Integration Service appends the rejected records to the reject file.
Row Indicator    Indicator Significance    Rejected By
0                Insert                    Writer or target
1                Update                    Writer or target
2                Delete                    Writer or target
3                Reject                    Writer
4                Rolled-back insert        Writer
5                Rolled-back update        Writer
6                Rolled-back delete        Writer
7                Committed insert          Writer
8                Committed update          Writer
9                Committed delete          Writer
Next come the column data values, each followed by its column indicator, which determines the data quality of the corresponding column.
The column indicator values and their significance are:
D - Valid data
O - Overflowed numeric data
N - Null value
T - Truncated string data
Also note that the second column contains the column indicator flag value 'D', which signifies that the row indicator is valid.
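Purely as an illustration (all values are made up), a reject-file line for the Customer example used earlier might look like this, with the row indicator first and a column indicator accompanying each value:

0,D,1234,D,John Smith,D,New York,D

Here 0 is the row indicator (Insert) and each D is a column indicator marking valid data.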
Using incremental aggregation, we apply the captured changes in the source data (the CDC part) to aggregate calculations in a session. If the source changes incrementally and we can capture those changes, we can configure the session to process only the changes. This allows the Integration Service to update the target incrementally, rather than forcing it to delete the previously loaded data, process the entire source, and recalculate the same aggregations each time the session runs.
Incremental Aggregation
When the session runs with incremental aggregation enabled for the first time, say in the 1st week of Jan, we use the entire source. This allows the Integration Service to read and store the necessary aggregate data information. In the 2nd week of Jan, when we run the session again, we filter out only the CDC records from the source, i.e. the records loaded after the initial load. The Integration Service then processes this new data and updates the target accordingly. Use incremental aggregation when the changes do not significantly change the target. If processing the incrementally changed source alters more than half of the existing target, the session may not benefit from incremental aggregation; in that case, drop and recreate the target with the entire source data and recalculate the aggregations. Incremental aggregation may be helpful, for example, when we need to load a monthly fact on a weekly basis.
Sample Mapping
Let us see a sample mapping to implement incremental aggregation:
Look at the Source Qualifier query to fetch the CDC part, using a BATCH_LOAD_CONTROL table that stores the last successful load date for the particular mapping.
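A rough sketch of such a Source Qualifier SQL override is shown below; only the BATCH_LOAD_CONTROL idea comes from the text, while the source table, column names and mapping name are hypothetical:

SELECT CUSTOMER_KEY, INVOICE_KEY, MON_KEY, AMOUNT
FROM   SALES_SRC
WHERE  LOAD_DATE > (SELECT NVL(MAX(LAST_LOAD_DATE), TO_DATE('19000101','YYYYMMDD'))
                    FROM   BATCH_LOAD_CONTROL
                    WHERE  MAPPING_NAME = 'm_load_sales_fact')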
Now, the most important session-properties configuration to implement incremental aggregation:
If we want to reinitialize the aggregate cache, say during the first week of every month, we can configure the same session in a separate workflow with the "Reinitialize aggregate cache" session property checked.
After the first Load on 1st week of Jan 2010, the data in the target is as follows:
Now, during the 2nd week's load, the session processes only the incremental data in the source, i.e. those records having a load date greater than the last session run date. After the 2nd week's load, incremental aggregation of the incremental source data with the aggregate cache file data updates the target table with the following dataset:

CUSTOMER_KEY  INVOICE_KEY  MON_KEY  AMOUNT  Remarks/Operation
1111          6008         201001   450     The cache file is updated after aggregation
2222          6009         201001   500     The cache file is updated after aggregation
3333          5003         201001   300     The cache file remains the same as before
4444          1234         201001   350     New group row inserted in the cache file
5555          6157         201001   500     New group row inserted in the cache file
Each subsequent time we run the session with incremental aggregation, we use only the incremental source changes in the session. For each input record, the Integration Service checks the historical information in the index file for a corresponding group. If it finds a corresponding group, it performs the aggregate operation incrementally, using the aggregate data for that group, and saves the incremental change. If it does not find a corresponding group, it creates a new group and saves the record data.

When writing to the target, the Integration Service applies the changes to the existing target. It saves the modified aggregate data in the index and data files to be used as historical data the next time the session runs. Each subsequent time we run a session with incremental aggregation, the Integration Service creates a backup of the incremental aggregation files, so the cache directory for the Aggregator transformation must contain enough disk space for two sets of the files.

The Integration Service creates new aggregate data, instead of using historical data, when we configure the session to reinitialize the aggregate cache, delete the cache files, etc. When the Integration Service rebuilds the incremental aggregation files, the data in the previous files is lost.

Pushdown Optimization, a concept introduced in Informatica PowerCenter, allows developers to balance the data transformation load among servers. The following describes pushdown techniques.
Suppose a mapping contains a Filter transformation that filters out all employees except those with a DEPTNO greater than 40; the filter condition used in this mapping is DEPTNO > 40. The Integration Service can push the transformation logic to the database, and it generates the following SQL statement to process the transformation logic:
INSERT INTO EMP_TGT (EMPNO, ENAME, SAL, COMM, DEPTNO)
SELECT EMP_SRC.EMPNO, EMP_SRC.ENAME, EMP_SRC.SAL, EMP_SRC.COMM, EMP_SRC.DEPTNO
FROM EMP_SRC
WHERE (EMP_SRC.DEPTNO > 40)
The Integration Service generates an INSERT SELECT statement and filters the data using a WHERE clause; it does not extract data from the database at this time. We can configure pushdown optimization in the following ways: source-side, target-side, or full pushdown optimization. For example, suppose a mapping contains the following transformations:
SourceDefn -> SourceQualifier -> Aggregator -> Rank -> Expression -> TargetDefn
Aggregator: SUM(SAL), SUM(COMM), group by DEPTNO
Rank: RANK port on SAL
Expression: TOTAL = SAL + COMM
The Rank transformation cannot be pushed to the database. If the session is configured for full pushdown optimization, the Integration Service pushes the Source Qualifier transformation and
the Aggregator transformation to the source, processes the Rank transformation, and pushes the Expression transformation and target to the target database. When we use pushdown optimization, the Integration Service converts the expression in the transformation or in the workflow link by determining equivalent operators, variables, and functions in the database. If there is no equivalent operator, variable, or function, the Integration Service itself processes the transformation logic. The Integration Service logs a message in the workflow log and the Pushdown Optimization Viewer when it cannot push an expression to the database. Use the message to determine the reason why it could not push the expression to the database.
Informatica scenarios

Design a mapping to load the last 3 rows from a flat file into a target.

Solution: Consider the source has the following data.
col
a
b
c
d
e

Step 1: Assign row numbers to each record. Generate the row numbers using an Expression transformation as mentioned above and call the generated row-number port O_count. Create a DUMMY output port in the same Expression transformation and assign 1 to it, so that the DUMMY output port always returns 1 for each row.
In the Expression transformation, the ports are:
V_count = V_count + 1
O_count = V_count
O_dummy = 1

The output of the Expression transformation will be:
col, o_count, o_dummy
a, 1, 1
b, 2, 1
c, 3, 1
d, 4, 1
e, 5, 1

Step 2: Pass the output of the Expression transformation to an Aggregator and do not specify any group-by condition. Create an output port O_total_records in the Aggregator and assign the O_count port to it. The Aggregator returns the last row by default. The output of the Aggregator contains the DUMMY port, which has value 1, and the O_total_records port, which holds the total number of records in the source.
In the Aggregator transformation, the ports are:
O_dummy
O_count
O_total_records = O_count
The output of the Aggregator transformation will be:
O_total_records, O_dummy
5, 1

Step 3: Pass the outputs of the Expression transformation and the Aggregator transformation to a Joiner transformation and join on the DUMMY port. In the Joiner transformation check the Sorted Input property; only then can you connect both the Expression and the Aggregator to the Joiner transformation. In the Joiner transformation, the join condition will be:
O_dummy (port from the Aggregator transformation) = O_dummy (port from the Expression transformation)

The output of the Joiner transformation will be:
col, o_count, o_total_records
a, 1, 5
b, 2, 5
c, 3, 5
d, 4, 5
e, 5, 5

Step 4: Now pass the output of the Joiner transformation to a Filter transformation and specify the filter condition as O_total_records (port from the Aggregator) - O_count (port from the Expression) <= 2.
In the Filter transformation, the filter condition will be:
O_total_records - O_count <= 2

The output of the Filter transformation will be:
col, o_count, o_total_records
c, 3, 5
d, 4, 5
e, 5, 5

Design a mapping to load the first record from a flat file into table A, the last record into table B and the remaining records into table C.

Solution: This is similar to the above problem; the first 3 steps are the same. In the last step, instead of using the Filter transformation, use a Router transformation. In the Router transformation create two output groups. In the first group, the condition should be O_count = 1; connect this output group to table A. In the second group, the condition should be O_count = O_total_records; connect this output group to table B. The output of the default group should be connected to table C.
1. Consider the following products data, which contains duplicate rows:
A
B
C
C
B
D
B

Q1. Design a mapping to load all unique products in one table and the duplicate rows in another table.
The first table should contain the following output:
A
D
The second table should contain the following output:
B
B
B
C
C

Solution: Use a Sorter transformation and sort the products data. Pass the output to an Expression transformation, create a dummy port O_dummy and assign 1 to it, so that the DUMMY output port always returns 1 for each row.
The output of the Expression transformation will be:
Product, O_dummy
A, 1
B, 1
B, 1
B, 1
C, 1
C, 1
D, 1

Pass the output of the Expression transformation to an Aggregator transformation and check group by on the product port. In the Aggregator, create an output port O_count_of_each_product and write the expression COUNT(product).
The output of the Aggregator will be:
Product, O_count_of_each_product
A, 1
B, 3
C, 2
D, 1

Now pass the outputs of the Expression transformation and the Aggregator transformation to a Joiner transformation and join on the product port. In the Joiner transformation check the Sorted Input property; only then can you connect both the Expression and the Aggregator to the Joiner transformation.
The output of the Joiner will be:
product, O_dummy, O_count_of_each_product
A, 1, 1
B, 1, 3
B, 1, 3
B, 1, 3
C, 1, 2
C, 1, 2
D, 1, 1

Now pass the output of the Joiner to a Router transformation, create one group and specify the group condition as O_dummy = O_count_of_each_product. Connect this group to one table and connect the output of the default group to the other table.

Q2. Design a mapping to load each product once into one table and the remaining duplicate products into another table.
The first table should contain the following output:
A
B
C
D
The second table should contain the following output:
B
B
C

Solution: Use a Sorter transformation and sort the products data. Pass the output to an Expression transformation and create a variable port V_curr_product and assign the product port to it. Then create a V_count port and in the expression editor write IIF(V_curr_product = V_prev_product, V_count + 1, 1). Create one more variable port V_prev_product and assign the product port to it. Now create an output port O_count and assign the V_count port to it.
In the Expression transformation, the ports are:
Product
V_curr_product = product
V_count = IIF(V_curr_product = V_prev_product, V_count + 1, 1)
V_prev_product = product
O_count = V_count
The output of the Expression transformation will be:
Product, O_count
A, 1
B, 1
B, 2
B, 3
C, 1
C, 2
D, 1

Now pass the output of the Expression transformation to a Router transformation, create one group and specify the condition as O_count = 1. Connect this group to one table and connect the output of the default group to another table.

Design a mapping to get the previous row's salary for the current row. If no previous row exists for the current row, then the previous row salary should be displayed as null. The output should look like:
employee_id, salary, pre_row_salary
10, 1000, Null
20, 2000, 1000
30, 3000, 2000
40, 5000, 3000

Solution: Connect the Source Qualifier to an Expression transformation. In the Expression transformation, create a variable port V_count and increment it by one for each row entering the transformation. Also create a V_salary variable port and assign the expression IIF(V_count = 1, NULL, V_prev_salary) to it. Then create one more variable port V_prev_salary and assign salary to it. Now create an output port O_prev_salary and assign V_salary to it. Connect the Expression transformation to the target ports.
In the Expression transformation, the ports will be:
employee_id
salary
V_count = V_count + 1
V_salary = IIF(V_count = 1, NULL, V_prev_salary)
V_prev_salary = salary
O_prev_salary = V_salary
Design a mapping to get the next row's salary for the current row. If there is no next row for the current row, then the next row salary should be displayed as null. The output should look like:
employee_id, salary, next_row_salary
10, 1000, 2000
20, 2000, 3000
30, 3000, 5000
40, 5000, Null

Solution:
Step 1: Connect the Source Qualifier to two Expression transformations. In each Expression transformation, create a variable port V_count and in the expression editor write V_count + 1. Now create an output port O_count in each Expression transformation. In the first Expression transformation, assign V_count to O_count. In the second Expression transformation, assign V_count - 1 to O_count.
In the first Expression transformation, the ports will be:
employee_id
salary
V_count = V_count + 1
O_count = V_count
In the second Expression transformation, the ports will be:
employee_id
salary
V_count = V_count + 1
O_count = V_count - 1

Step 2: Connect both Expression transformations to a Joiner transformation and join them on the port O_count. Consider the first Expression transformation as master and the second one as detail. In the Joiner, specify the join type as Detail Outer Join. In the Joiner transformation check the Sorted Input property; only then can you connect both Expression transformations to the Joiner transformation.

Step 3: Pass the output of the Joiner transformation to a target table. From the Joiner, connect the employee_id and salary obtained from the first Expression transformation to the employee_id and salary ports in the target table. Then, from the Joiner, connect the salary obtained from the second Expression transformation to the next_row_salary port in the target table.
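For reference, the same previous-row and next-row salary results can be produced directly in SQL with analytic functions, assuming a database such as Oracle that supports them (the employees table name is hypothetical):

SELECT employee_id,
       salary,
       LAG(salary)  OVER (ORDER BY employee_id) AS pre_row_salary,   -- previous row's salary
       LEAD(salary) OVER (ORDER BY employee_id) AS next_row_salary   -- next row's salary
FROM   employees;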
Design a mapping to find the sum of salaries of all employees, and this sum should repeat for all the rows. The output should look like:
employee_id, salary, salary_sum
10, 1000, 11000
20, 2000, 11000
30, 3000, 11000
40, 5000, 11000

Solution:
Step 1: Connect the Source Qualifier to an Expression transformation. In the Expression transformation, create a dummy port and assign the value 1 to it.
In the Expression transformation, the ports will be:
employee_id
salary
O_dummy = 1

Step 2: Pass the output of the Expression transformation to an Aggregator. Create a new port O_sum_salary and in the expression editor write SUM(salary). Do not specify group by on any port.
In the Aggregator transformation, the ports will be:
salary
O_dummy
O_sum_salary = SUM(salary)

Step 3: Pass the outputs of the Expression transformation and the Aggregator transformation to a Joiner transformation and join on the DUMMY port. In the Joiner transformation check the Sorted Input property; only then can you connect both the Expression and the Aggregator to the Joiner transformation.

Step 4: Pass the output of the Joiner to the target table.
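The same repeated grand total can be expressed in SQL with a window function, assuming a database that supports analytic functions (the employees table name is hypothetical):

SELECT employee_id,
       salary,
       SUM(salary) OVER () AS salary_sum   -- grand total repeated on every row
FROM   employees;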
2. Consider the following employees table as the source:
department_no, employee_name
20, R
10, A
10, D
20, P
10, B
10, C
20, Q
20, S
Q1. Design a mapping to load a target table with the following values from the above source:
department_no, employee_list
10, A
10, A,B
10, A,B,C
10, A,B,C,D
20, A,B,C,D,P
20, A,B,C,D,P,Q
20, A,B,C,D,P,Q,R
20, A,B,C,D,P,Q,R,S

Solution:
Step 1: Use a Sorter transformation and sort the data using department_no as the sort key, then pass the output to an Expression transformation.
In the Expression transformation, the ports will be:
department_no
employee_name
V_employee_list = IIF(ISNULL(V_employee_list), employee_name, V_employee_list||','||employee_name)
O_employee_list = V_employee_list

Step 2: Now connect the Expression transformation to a target table.
Q2. Design a mapping to load a target table with the following values from the above source:
department_no, employee_list
10, A
10, A,B
10, A,B,C
10, A,B,C,D
20, P
20, P,Q
20, P,Q,R
20, P,Q,R,S

Solution:
Step 1: Use a Sorter transformation and sort the data using department_no as the sort key, then pass the output to an Expression transformation.
In the Expression transformation, the ports will be:
department_no
employee_name
V_curr_deptno = department_no
V_employee_list = IIF(V_curr_deptno != V_prev_deptno, employee_name, V_employee_list||','||employee_name)
V_prev_deptno = department_no
O_employee_list = V_employee_list

Step 2: Now connect the Expression transformation to a target table.
Q3. Design a mapping to load a target table with the following values from the above source:
department_no, employee_names
10, A,B,C,D
20, P,Q,R,S

Solution: The first step is the same as in the above problem. Pass the output of the Expression transformation to an Aggregator transformation and specify the group by as department_no. Now connect the Aggregator transformation to a target table.
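For comparison, the Q3 result can also be produced directly in SQL, assuming an Oracle 11g+ database that supports LISTAGG (the employees table name is hypothetical):

SELECT department_no,
       LISTAGG(employee_name, ',') WITHIN GROUP (ORDER BY employee_name) AS employee_names
FROM   employees
GROUP BY department_no;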
How to solve PCSF_10342 when enabling the Integration Service in Informatica 8.6.1?
Answered by: Sudarshan on: May 2nd, 2012
Login to the admin console, delete the repository content and recreate it, then restart the Integration Service. It works!

Hi, I am learning Informatica 8.1 (which is what I could get my hands on). I am connecting to Oracle 10g. I created 2 connections to the DB using Connection --> Relational Connection Browser. The source and target are the same DB in this case, just different table names, but I created 2 separate connections. I...
Answered by: Raghu on: Mar 9th, 2012
Delete it.
Answered by: Lokesh M on: Dec 20th, 2011
Try these and see if it helps:
- Delete statistics and try to retrieve.
- Try to export with INDEXES=n STATISTICS=none
- Disable auditing with "noaudit session":
SQL> noaudit session;
1. Junk Dimension: contains miscellaneous data such as flags and indicators.
2. Degenerated Dimension: a dimension which is derived from the fact table and does not have a dimension table of its own.
3. Conformed Dimension: a dimension which is shared by more than one fact table.
Index cache: the Integration Service stores all conditional values in the index cache and all output values in the data cache.
Unix
How to print/display the first line of a file?
There are many ways to do this. However the easiest way to display the first line of a file is using the [head] command.
$> head -1 file.txt
No prizes for guessing that if you specify [head -2] then it would print the first 2 records of the file. Another way is to use the [sed] command. [sed] is a very powerful stream editor which can be used for various text manipulation purposes like this.
$> sed '2,$ d' file.txt
How does the above command work? The 'd' parameter basically tells [sed] to delete all the records from display from line 2 to last line of the file (last line is represented by $ symbol). Of course it does not actually delete those lines from the file, it just does not display those lines in standard output screen. So you only see the remaining line which is the 1st line.
How to print/display the last line of a file?
The easiest way is to use the [tail] command, e.g. [tail -1 file.txt]. If you want to do it using the [sed] command, here is what you should write:
$> sed -n '$ p' test
From our previous answer, we already know that '$' stands for the last line of the file. So '$ p' basically prints (p for print) the last line in standard output screen. '-n' switch takes [sed] to silent mode so that [sed] does not print anything else in the output.
How to display the n-th line of a file?
Again [sed] can do this, printing only the requested line with the '-n' switch and the 'p' (print) action:
$> sed -n '<n> p' file.txt
You need to replace <n> with the actual line number. So if you want to print the 4th line, the command will be
$> sed -n '4 p' file.txt
Of course you can do it by using [head] and [tail] command as well like below:
$> head -<n> file.txt | tail -1
You need to replace <n> with the actual line number. So if you want to print the 4th line, the command will be
$> head -4 file.txt | tail -1
How to remove the first line (header) from a file?
We already know that [sed] can be used to delete certain lines from the output, so to skip the first line we can use:
$> sed '1 d' file.txt
But the issue with the above command is that it just prints out all the lines except the first line of the file on standard output; it does not really change the file in place. So if you want to delete the first line from the file itself, you have two options. Either you can redirect the output to some other file and then rename it back to the original file, like below:
$> sed '1 d' file.txt > new_file.txt
$> mv new_file.txt file.txt
Or, you can use the built-in [sed] switch '-i', which changes the file in place. See below:
$> sed -i '1 d' file.txt
How to remove the last line/ trailer from a file in Unix script?
Always remember that [sed] switch '$' refers to the last line. So using this knowledge we can deduce the below command:
$> sed -i '$ d' file.txt
Similarly, you can delete a range of lines by giving [sed] a line-number range:
$> sed -i '5,7 d' file.txt
The above command will delete lines 5 to 7 from the file file.txt.
But you will not always know the number of lines present in the file (the file may be generated dynamically, etc.). In that case there are many different ways to solve the problem; some of them are quite complex and fancy. But let's first do it in a way that we can understand and remember easily. Here is how it goes:
$> tt=`wc -l file.txt | cut -f1 -d' '`; sed -i "`expr $tt - 4`,$tt d" file.txt
As you can see, there are two commands. The first one (before the semi-colon) calculates the total number of lines present in the file and stores it in a variable called tt. The second command (after the semi-colon) uses the variable and works in exactly the way shown in the previous example, deleting the last 5 lines of the file.
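As a side note, on Linux systems with GNU coreutils, [head] accepts a negative line count, which gives a simpler (though not in-place) alternative; this is only a sketch and assumes GNU head:

$> head -n -5 file.txt > new_file.txt   # print all but the last 5 lines
$> mv new_file.txt file.txt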
How to find the length of a particular line in a file?
We already know how to print a particular line from a file:
$> sed -n '<n> p' file.txt
where <n> is to be replaced by the actual line number that you want to print. Once you know that, it is easy to print the length of this line by piping it to the [wc] command with the '-c' switch.
$> sed -n '35 p' file.txt | wc -c
The above command will print the length of the 35th line in file.txt (note that [wc -c] counts the trailing newline character as well).
How to get the n-th word or field of a line in Unix?
We can use the [cut] command with the '-f' (field) and '-d' (delimiter) switches. The '-d' switch tells [cut] what the delimiter (or separator) is, which is a space ' ' in this case. If the separator were a comma, we would write -d',' instead. So, suppose I want to find the 4th word of the string "A quick brown fox jumped over the lazy cat"; we will do something like this:
$> echo "A quick brown fox jumped over the lazy cat" | cut -f4 -d' '
But I want to introduce one more command to do this: [awk]. [awk] is a very powerful command for text pattern scanning and processing. Here we will see how we may use [awk] to extract the first field (or first column) from the output of another command. As above, suppose I want to print the first column of the [wc -c] output. Here is how it goes:
$> wc -c file.txt | awk '{print $1}'
109
In the action space, we have asked [awk] to take the action of printing the first column ($1). More on [awk] later.
How to replace the n-th line in a file with a new line in Unix?
This can be done in two steps. The first step is to remove the n-th line, and the second step is to insert a new line at the n-th position. Here we go.
Step 1: remove the n-th line
$>sed -i'' '10 d' file.txt # d stands for delete
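Step 2 would then insert the replacement text at the same position. A minimal sketch, assuming GNU sed (its 'i' command inserts text before the addressed line, so inserting before the new line 10 puts the text back at position 10; the replacement text is just an example):

$>sed -i'' '10 i This is the new line' file.txt   # i stands for insert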
How to find out the type of a file in Unix?
The [file] command tells you the type of a file. If you want to know the technical MIME type of the file, use the '-i' switch.
$> file -i file.txt
file.txt: text/plain; charset=us-ascii
How to connect to an Oracle database from within a shell script?
If you connect to the database using the here-document method shown below, the advantage is that you can pass Unix-side shell variable values to the database. See the example below:
$> res=`sqlplus -s username/password@database_name <<EOF
SET HEAD OFF;
select count(*) from student_table t where t.last_name=$1;
EXIT;
EOF`
$> echo $res
12
Similarly, you can execute a stored procedure from within the shell script:
$> SqlReturnMsg=`sqlplus -s username/password@database <<EOF
BEGIN
Proc_Your_Procedure( your-input-parameters );
END;
/
EXIT;
EOF`
$> echo $SqlReturnMsg
How to check the command line arguments in a UNIX command in Shell Script?
In a bash shell, you can access the command line arguments using the $0, $1, $2, ... variables, where $0 prints the command name, $1 the first input parameter of the command, $2 the second input parameter, and so on.
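A minimal sketch to illustrate this (the script name show_args.sh and the sample run are purely illustrative):

$> cat show_args.sh
#!/bin/bash
# Print the command name and the first two arguments
echo "Command name: $0"
echo "First arg   : $1"
echo "Second arg  : $2"
$> ./show_args.sh apple banana
Command name: ./show_args.sh
First arg   : apple
Second arg  : banana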
inside your program, then your program will throw an error and exit immediately.
How to check if the last command ran successfully?
The special variable $? holds the return code of the last executed command: 0 means success and any non-zero value means failure. For example, run [ls -l file.txt] and then [echo $?]. If the file exists, the [ls] command will be successful, hence [echo $?] will print 0. If the file does not exist, the [ls] command will fail and hence [echo $?] will print a non-zero value.
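A small sketch of how this check is typically used inside a script (the script and file names are illustrative):

$> cat check_file.sh
#!/bin/bash
# Check whether file.txt exists by testing the exit status of ls
ls -l file.txt > /dev/null 2>&1
if [ $? -eq 0 ]; then
    echo "file.txt exists"
else
    echo "file.txt is missing"
fi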
How to check all the processes running in Unix?
The standard command to see this is [ps]. But [ps] only shows you a snapshot of the processes at that instant. If you need to monitor the processes over a certain period of time and refresh the results at each interval, consider using the [top] command.
$> ps -ef
If you wish to see the % of memory usage and CPU usage, then consider the below switches
$> ps aux
If you wish to use this command inside some shell script, or if you want to customize the output of [ps] command, you may use -o switch like below. By using -o switch, you can specify the columns that you want [ps] to print out.
$>ps -e -o stime,user,pid,args,%mem,%cpu