Informatica has some advantages over other data integration systems. Some of these advantages are:
It is faster than many comparable platforms.
You can easily monitor your jobs with Informatica Workflow Monitor.
It makes data validation, iteration, and project development easier than before.
If a job fails, it is easy to identify the failure and recover from it. The same applies to jobs that run slowly.
Informatica has a wide range of applications across many areas.
Some basic Informatica objects are:
Mappings: A mapping is designed in the Designer and defines the ETL process: data is read from the original sources, transformation logic is applied, and the transformed data is written to the targets.
Workflows: A workflow is a collection of tasks that describes the runtime ETL process. Workflows are designed in the Workflow Manager.
Task: A task is an executable set of actions, commands, or functions. A sequence of tasks defines how an ETL process behaves at runtime.
There are many development components in Informatica. However, these are the most widely used of them:
Expression: Used to transform data by applying functions on a row-by-row basis.
Lookups: Extensively used to look up reference data and join it to the incoming rows.
Sorter and Aggregator: The tools of choice for sorting data and aggregating it.
Java transformation: The choice of developers who want to invoke Java variables, Java methods, third-party APIs, and built-in Java packages.
Source qualifiers: Commonly used to convert source data types to the equivalent Informatica data types.
Transaction control: Used to define transactions and take full control over rollbacks and commits.
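The commit-or-rollback behavior of transaction control can be sketched in plain Python using `sqlite3` as a stand-in target. This is an illustrative analogy, not Informatica's API; the table and data here are invented. A batch of rows is committed only if every row loads successfully, otherwise the whole batch is rolled back:

```python
import sqlite3

# Hypothetical sketch: commit a batch only if every row loads successfully,
# otherwise roll back the whole batch. Informatica's Transaction Control
# transformation expresses the same commit/rollback decision declaratively.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, amount REAL)")

rows = [(1, 10.0), (2, 20.0), (2, 30.0)]  # duplicate key forces a rollback
try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.executemany("INSERT INTO target VALUES (?, ?)", rows)
except sqlite3.IntegrityError:
    pass  # the partial inserts were rolled back

count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

Because the third row violates the primary key, none of the three rows survive: the target table stays empty.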
ETL tools are quite different from other tools. They are used for performing actions such as:
Extracting data from sources such as database tables or files.
Transforming the data received from different sources in an organized way. Notable sources include SAP solutions, Teradata, and web services.
Loading the transformed data into a data warehouse, known as the target.
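The extract, transform, and load steps above can be sketched in a few lines of plain Python. All names here are illustrative, with in-memory lists standing in for real sources and targets:

```python
# Minimal ETL sketch: extract raw records, transform them, load into a target.

def extract(source):
    """Read raw records from a source (here, an in-memory list)."""
    return list(source)

def transform(rows):
    """Apply transformation logic: normalize names and drop invalid rows."""
    return [
        {"id": r["id"], "name": r["name"].strip().title()}
        for r in rows
        if r.get("id") is not None
    ]

def load(rows, target):
    """Write transformed rows into the target (a list standing in for a
    data-warehouse table)."""
    target.extend(rows)
    return target

source = [{"id": 1, "name": " alice "},
          {"id": None, "name": "x"},      # invalid row, filtered out
          {"id": 2, "name": "BOB"}]
warehouse = load(transform(extract(source)), [])
```

The invalid middle row is dropped during transformation, and the remaining names are cleaned before loading.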
There are three types of data warehouses:
· Enterprise data warehouse
· ODS (operational data store)
· Data mart
A data mart is a subset of a data warehouse that is designed for a particular line of business, such as sales, marketing, or finance. In a dependent data mart, data is derived from an enterprise-wide data warehouse. In an independent data mart, data is collected directly from the sources.
According to Bill Inmon, known as the father of data warehousing: "A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process."
A star schema is the simplest form of data warehouse schema. It consists of a central fact table connected to one or more dimension tables.
A snowflake schema also has one fact table connected to a number of dimension tables, but the dimension tables are further normalized into multiple related tables. The snowflake and star schemas are both methods of storing data that is multidimensional in nature.
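A star schema can be sketched with plain Python dictionaries: one fact table whose rows reference dimension tables by key. All table and column names here are invented for illustration:

```python
# Star-schema sketch: a central fact table referencing two dimension tables.

dim_product = {10: {"name": "Widget"}, 11: {"name": "Gadget"}}
dim_store = {1: {"city": "Austin"}, 2: {"city": "Boston"}}

fact_sales = [
    {"product_key": 10, "store_key": 1, "amount": 100.0},
    {"product_key": 11, "store_key": 1, "amount": 50.0},
    {"product_key": 10, "store_key": 2, "amount": 75.0},
]

# A typical star-schema query: join each fact row to its dimensions.
report = [
    (dim_product[f["product_key"]]["name"],
     dim_store[f["store_key"]]["city"],
     f["amount"])
    for f in fact_sales
]
```

In a snowflake variant, `dim_product` would itself be split into further normalized tables (for example, a separate product-category table referenced by key).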
ETL stands for Extraction, Transformation, and Loading of data into a data warehouse for decision making. ETL refers to the methods involved in accessing and manipulating source data and loading it into the target database.
Dimension tables contain attributes that describe fact records in the fact table.
Data modeling is representing a real-world set of data structures or entities, and their relationships, in the form of the data models required for a database. Data modeling comes in various types, such as:
Conceptual data modeling
Logical data modeling
Physical data modeling
Enterprise data modeling
Relational data modeling
Dimensional data modeling
A surrogate key is a substitute for the natural primary key. It is simply a unique identifier or number for each row that can be used as the primary key of the table.
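Surrogate key assignment can be sketched as a simple sequence plus a lookup from natural key to surrogate key. The natural key here (an email address) is an invented example; the surrogate key is just a sequence number owned by the warehouse, independent of any source system:

```python
import itertools

# Sketch: assign surrogate keys to rows, reusing the key for a known
# natural key and generating a new one for an unseen natural key.
surrogate_seq = itertools.count(start=1)
key_map = {}  # natural key -> surrogate key

def surrogate_key(natural_key):
    """Return the existing surrogate key for a natural key, or assign one."""
    if natural_key not in key_map:
        key_map[natural_key] = next(surrogate_seq)
    return key_map[natural_key]

k1 = surrogate_key("alice@example.com")
k2 = surrogate_key("bob@example.com")
k3 = surrogate_key("alice@example.com")  # same natural key, same surrogate
```

In a real warehouse the sequence would typically be a database sequence or identity column rather than an in-memory counter.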
Data mining is the process of analyzing data from different perspectives and summarizing it into useful information.
An ODS is an operational data store, which comes as a second layer in a data warehouse architecture. It has characteristics of both OLTP and DSS systems.
OLTP stands for OnLine Transaction Processing and operates on normalized tables.
OLAP (OnLine Analytical Processing), by contrast, contains the history of the OLTP data, is non-volatile, and acts as a decision support system.
There are three types of dimensions available.
Mapplet: A mapplet is a set of transformations that you build in the Mapplet Designer and can reuse in multiple mappings.
Session: A session is a set of instructions that tells the server how to move data to the target.
Batch: A batch is a set of one or more tasks (sessions, event wait, email, command, etc.).
Dimensions that change over time are called Slowly Changing Dimensions (SCDs).
Slowly Changing Dimension Type 1: keeps only current records.
Slowly Changing Dimension Type 2: keeps current records plus full historical records.
Slowly Changing Dimension Type 3: keeps current records plus one previous record.
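The difference between Type 1 and Type 2 handling can be sketched against a small customer dimension. All column names (`key`, `city`, `start_date`, `end_date`, `current`) are illustrative, not a fixed Informatica layout:

```python
# SCD sketch: Type 1 overwrites in place; Type 2 expires the current row
# and inserts a new current row, preserving history.

def scd_type1(dim_rows, key, new_city):
    """Type 1: overwrite the attribute; only the current value survives."""
    for row in dim_rows:
        if row["key"] == key:
            row["city"] = new_city
    return dim_rows

def scd_type2(dim_rows, key, new_city, as_of):
    """Type 2: close the current row, then append a new current row."""
    for row in dim_rows:
        if row["key"] == key and row["current"]:
            row["current"] = False
            row["end_date"] = as_of
    dim_rows.append({"key": key, "city": new_city,
                     "start_date": as_of, "end_date": None, "current": True})
    return dim_rows

dim = [{"key": 1, "city": "Austin", "start_date": "2023-01-01",
        "end_date": None, "current": True}]
dim = scd_type2(dim, 1, "Boston", "2024-06-01")
```

After the Type 2 update the dimension holds two rows for the same customer: the expired Austin row and the new current Boston row. Type 3 would instead keep a single row with an extra `previous_city` column.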
There are two modes of data movement:
Normal mode, in which a separate DML statement is prepared and executed for every record.
Bulk mode, in which one DML statement is prepared and executed for multiple records, which improves performance.
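The two modes can be illustrated with `sqlite3` as a stand-in target: "normal" issues one INSERT call per row, while "bulk" hands many rows to a single prepared statement. The table and data are invented; the point is the one-statement-per-row versus batched pattern, not the exact API:

```python
import sqlite3

# Sketch contrasting per-row inserts (normal mode) with a single batched
# executemany call (bulk mode) against an in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")

rows = [(i, f"row{i}") for i in range(1000)]

# Normal mode: one DML statement executed per record.
for r in rows[:500]:
    conn.execute("INSERT INTO t VALUES (?, ?)", r)

# Bulk mode: one prepared statement executed for many records at once.
conn.executemany("INSERT INTO t VALUES (?, ?)", rows[500:])
conn.commit()

total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

Both halves land the same data; the bulk half simply amortizes statement preparation and round trips across many rows, which is where the performance gain comes from.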
Active Transformation: An active transformation can change the number of rows that pass through it from source to target, i.e., it can eliminate rows that do not meet the transformation condition.
Passive Transformation: A passive transformation does not change the number of rows that pass through it, i.e., it passes all rows through the transformation.
Connected Transformation: A connected transformation is connected to other transformations or directly to the target table in the mapping.
Unconnected Transformation: An unconnected transformation is not connected to other transformations in the mapping. It is called from within another transformation and returns a value to that transformation.
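The active-versus-passive distinction can be sketched with two small functions: a filter (active, can change the row count) and an expression (passive, transforms every row but passes all rows through). Field names are illustrative:

```python
# Active vs passive sketch: a filter may drop rows; an expression never does.

rows = [{"qty": 5}, {"qty": 0}, {"qty": 3}]

def filter_transform(rows):
    """Active: rows failing the condition are eliminated."""
    return [r for r in rows if r["qty"] > 0]

def expression_transform(rows):
    """Passive: every row passes through, with a derived field added."""
    return [{**r, "doubled": r["qty"] * 2} for r in rows]

active_out = filter_transform(rows)       # fewer rows than the input
passive_out = expression_transform(rows)  # same number of rows as the input
```

Here the filter drops one of three rows, while the expression returns all three rows with an extra derived field.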
Various types of aggregation are SUM, AVG, COUNT, MIN, and MAX.
Aggregator transformation is an Active and Connected transformation. This transformation is useful to perform calculations such as averages and sums (mainly to perform calculations on multiple rows or groups).
Expression transformation is a Passive and Connected transformation. This can be used to calculate values in a single row before writing to the target.
Filter transformation is an Active and Connected transformation. This can be used to filter rows in a mapping that do not meet the condition.
Joiner Transformation is an Active and Connected transformation. This can be used to join two sources coming from two different locations or from the same location.
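A joiner followed by an aggregator can be sketched in plain Python: join orders to customers on a key, then sum amounts per region. All table and column names are invented for illustration:

```python
from collections import defaultdict

# Joiner + aggregator sketch: join two sources on customer_id,
# then group the joined rows by region and sum the amounts.

customers = [{"customer_id": 1, "region": "East"},
             {"customer_id": 2, "region": "West"}]
orders = [{"customer_id": 1, "amount": 100.0},
          {"customer_id": 2, "amount": 40.0},
          {"customer_id": 1, "amount": 60.0}]

# Joiner: index the smaller (master) source, then stream the detail source.
by_id = {c["customer_id"]: c for c in customers}
joined = [{**o, "region": by_id[o["customer_id"]]["region"]} for o in orders]

# Aggregator: group by region and sum the amounts.
totals = defaultdict(float)
for row in joined:
    totals[row["region"]] += row["amount"]
```

Indexing the smaller source before streaming the larger one mirrors the common practice of designating the smaller input as the master in a joiner for performance.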