User:Prudhvi2003/sandbox

Data Lineage

Data lineage is an important module in the Informatica ETL tool because it shows where the columns or data in the tables flow to. In earlier Informatica versions, which lacked this module, identifying data dependencies among tables and jobs was tedious and time-consuming.

For example, you may want to find out in which mappings a particular table is used as a source instance, a lookup, or a target instance. The advantage of data lineage over the Repository Manager is that the Repository Manager can only show whether a table is used as a source or a target; it does not show whether the table is used in lookups or in overrides.
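The kind of lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Informatica implements it: the `Mapping` class, the `find_table_usage` function, and the sample mapping and table names are all hypothetical, and a real repository stores far more metadata than this.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Mapping:
    """Hypothetical, simplified stand-in for a mapping's metadata."""
    name: str
    sources: set = field(default_factory=set)
    lookups: set = field(default_factory=set)
    targets: set = field(default_factory=set)
    sql_overrides: list = field(default_factory=list)


def find_table_usage(mappings, table):
    """Return {role: [mapping names]} for every role in which `table` appears."""
    usage = {"source": [], "lookup": [], "target": [], "override": []}
    # Whole-word, case-insensitive match, so CUSTOMER does not match CUSTOMER_HIST.
    pattern = re.compile(rf"\b{re.escape(table)}\b", re.IGNORECASE)
    for m in mappings:
        if table in m.sources:
            usage["source"].append(m.name)
        if table in m.lookups:
            usage["lookup"].append(m.name)
        if table in m.targets:
            usage["target"].append(m.name)
        if any(pattern.search(sql) for sql in m.sql_overrides):
            usage["override"].append(m.name)
    return usage


# Made-up sample data, for illustration only.
MAPPINGS = [
    Mapping("m_load_customer", sources={"CUSTOMER"}, targets={"DIM_CUSTOMER"}),
    Mapping("m_load_orders", sources={"ORDERS"}, lookups={"CUSTOMER"},
            targets={"FACT_ORDERS"}),
    Mapping("m_customer_report", sources={"DIM_CUSTOMER"}, targets={"RPT_CUSTOMER"},
            sql_overrides=["SELECT c.id FROM customer c "
                           "JOIN customer_hist h ON c.id = h.id"]),
]

usage = find_table_usage(MAPPINGS, "CUSTOMER")
```

In this sketch a source-or-target-only view would report `m_load_customer` alone, while the lineage-style search also surfaces the lookup in `m_load_orders` and the SQL override in `m_customer_report`.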

Data Lineage Importance[1]

The main purpose of data lineage is to track the columns in the tables. It is widely used in data mining and in dependency checks for jobs. Typical uses include:

1. Finding where the data is transformed

2. Finding where the columns in the tables are sourced to

3. Data mining

4. Checking data dependencies

5. Investigating production support failures

6. Error handling
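Dependency checks of this kind amount to walking a graph of column-to-column links. The following is a minimal sketch under that assumption: the `EDGES` list, the column names, and the `build_graph` and `reachable` helpers are invented for illustration and do not come from Informatica.

```python
from collections import defaultdict, deque

# Made-up column-level lineage: each pair reads "left column feeds right column".
EDGES = [
    ("SRC.CUSTOMER.CUST_ID", "STG.CUSTOMER.CUST_ID"),
    ("STG.CUSTOMER.CUST_ID", "DW.DIM_CUSTOMER.CUSTOMER_KEY"),
    ("SRC.ORDERS.CUST_ID", "STG.ORDERS.CUST_ID"),
    ("STG.ORDERS.CUST_ID", "DW.FACT_ORDERS.CUSTOMER_KEY"),
    ("DW.DIM_CUSTOMER.CUSTOMER_KEY", "DW.FACT_ORDERS.CUSTOMER_KEY"),
]


def build_graph(edges, reverse=False):
    """Build an adjacency map; reverse=True flips the edges for upstream walks."""
    graph = defaultdict(set)
    for src, dst in edges:
        if reverse:
            src, dst = dst, src
        graph[src].add(dst)
    return graph


def reachable(graph, start):
    """Breadth-first search: every column reachable from `start`, excluding it."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)  # only matters if the lineage contains a cycle
    return seen


# Impact analysis: what is affected if this source column changes?
downstream = reachable(build_graph(EDGES), "SRC.CUSTOMER.CUST_ID")

# Root-cause analysis: which columns feed this target column?
upstream = reachable(build_graph(EDGES, reverse=True), "DW.FACT_ORDERS.CUSTOMER_KEY")
```

The downstream walk corresponds to knowing where a column is sourced to, and the upstream walk is the sort of trace that helps when investigating a production failure in a target table.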


Data Lineage Window[2]

The data lineage diagram appears in a browser window. Data lineage can be displayed for PowerCenter[3] objects in the Designer. When data lineage is displayed for a PowerCenter object, the Designer connects to the Metadata Manager application and extracts the data lineage information from the Metadata Manager warehouse. Data lineage analysis on a PowerCenter repository displays one or more PowerCenter objects.

Requirements to run the Data Lineage

1. To access data lineage, Metadata Manager must be up and running.

2. Make sure that a Metadata Manager Service exists in the domain that contains the PowerCenter repository on which you want to run data lineage analysis.

3. Create a resource for the PowerCenter repository in Metadata Manager and load the PowerCenter repository metadata into the Metadata Manager warehouse.

4. Configure the Metadata Manager Service and the resource name for the PowerCenter repository in the Administrator tool.

5. Configure the web browser. When you run data lineage analysis on a PowerCenter object, the Designer launches data lineage analysis in the default browser configured for your system. Data lineage requires the Flash 9 viewer to be installed in the browser.
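The five requirements above can be treated as a simple pre-flight checklist. The sketch below only models that checklist: the `LineageEnvironment` class, its field names, and `unmet_requirements` are hypothetical, and the flags would have to be filled in by hand or by whatever checks are available in a given environment, since no Informatica API is called here.

```python
from dataclasses import dataclass


@dataclass
class LineageEnvironment:
    """Hypothetical record of the five prerequisites, one flag per requirement."""
    metadata_manager_running: bool
    service_in_repository_domain: bool
    resource_created_and_loaded: bool
    service_and_resource_configured: bool
    browser_has_flash9: bool


# (flag name, message shown when the requirement is not met)
CHECKS = [
    ("metadata_manager_running",
     "Metadata Manager is not up and running"),
    ("service_in_repository_domain",
     "No Metadata Manager Service in the PowerCenter repository's domain"),
    ("resource_created_and_loaded",
     "PowerCenter resource not created or not loaded into the warehouse"),
    ("service_and_resource_configured",
     "Service and resource name not configured in the Administrator tool"),
    ("browser_has_flash9",
     "Default browser lacks the Flash 9 viewer"),
]


def unmet_requirements(env):
    """Return the messages for every requirement that is not satisfied."""
    return [message for flag, message in CHECKS if not getattr(env, flag)]


# Example: everything is ready except the resource load and the browser.
env = LineageEnvironment(True, True, False, True, False)
problems = unmet_requirements(env)
```

An empty `problems` list would mean that all five requirements are met in this model.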

References

  1. "Explore Data Lineage". IBM. 2010-01-07. Retrieved 7 January 2010.
  2. "Data Lineage Configuration".
  3. "Informatica Resource Center".