Informatica transformations:
Informatica Transformations are repository objects that can create, read, modify, or pass data to defined target structures like tables, files, or any other targets.
The goal of transformation in Informatica is to adjust the source data to meet the needs of the target system. It also ensures that the data being loaded into the target is of high quality.
A transformation is a set of rules that governs how data flows and how it is loaded into the targets. Informatica includes a number of transformations that can be used to perform specific functions. To pass data through a transformation, we connect input ports to it, and the result is returned through output ports.
Classification of transformations:
Transformations are classified in two ways: by connectivity and by whether they change the number of rows. First, we'll look at transformations from a connectivity standpoint.
Based on connectivity, there are two types of transformations:
- Connected Transformations
- Unconnected Transformations
In Informatica, connected transformations are transformations that are linked to other transformations within a mapping.
For instance, the Source Qualifier transformation of the source table Stud is linked to a Filter transformation in order to filter the students of a class. Unconnected transformations are those that are not connected to any other transformation.
Their functionality is accessed by calling them from within other transformations, and they are not part of the mapping's data flow. Connected transformations are preferred when the transformation is called for each input row or is expected to return a value.
Unconnected transformations are useful when their functionality is needed only occasionally or under certain conditions, for instance, calculating tax details only when the tax value is not available.
Based on changes to the number of rows, there are two categories of transformations:
- Active Transformations and
- Passive Transformations
Active Transformations: Active transformations are those that change the data rows as well as the number of rows that pass through them. For instance, if a transformation receives ten rows as input and returns fifteen rows as output, it is an active transformation. The data in the rows can also be modified by an active transformation.
Passive transformations: Passive transformations do not alter the number of rows. The number of input and output rows remains the same, and data is modified only at the row level.
A passive transformation cannot create new rows, and no existing rows are dropped.
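To make the distinction concrete, here is a minimal Python sketch (not Informatica code, just an analogy using hypothetical data) of how an active transformation changes the row count while a passive one does not:

```python
# Minimal sketch of the active vs. passive distinction:
# an active transformation can change the number of rows, a passive one cannot.
rows = [{"NAME": "asha", "MARKS": 85}, {"NAME": "john", "MARKS": 40}]

# Active: a filter-like rule drops rows, so the row count changes (2 in, 1 out)
active_out = [r for r in rows if r["MARKS"] >= 50]

# Passive: data is modified row by row, but the row count stays the same (2 in, 2 out)
passive_out = [{**r, "NAME": r["NAME"].upper()} for r in rows]

print(len(rows), len(active_out), len(passive_out))
```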
List of Informatica transformations:
The following is the list of Informatica transformations:
1.Source Qualifier Transformation
2.Aggregator Transformation
3.Router Transformation
4.Joiner Transformation
5.Rank Transformation
6.Sequence Generator Transformation
7.Transaction Control Transformation
8.Lookup Transformation
9.Normalizer Transformation
10.Performance Tuning for Transformation
11.Expression Transformation
12.External Procedure Transformation
Now we will explore the above-mentioned Informatica transformations one after the other.
1.Source Qualifier Transformation:
The Source Qualifier transformation is an active and connected transformation. It is used to represent the rows that the Integration Service reads when a session runs. In a mapping, we must connect the Source Qualifier transformation to every relational or flat file source definition.
The Source Qualifier transformation converts the source data types to Informatica's native data types. As a result, there is no need to change the data types of the ports.
The source qualifier transformation performs the following functions:
- Joining tables: We can join two or more tables from the same source database. By default, the sources are joined based on the primary key-foreign key relationships. This can be changed by setting the "user-defined join" property and specifying the join condition.
- Filtering rows: We can filter rows from the source database. The Integration Service adds a WHERE clause to the default query.
- Sorting input: By specifying the number of sorted ports, we can sort the source data. The Integration Service adds an ORDER BY clause to the default SQL query.
- Selecting distinct rows: By enabling the "Select Distinct" property, we can retrieve only distinct rows from the source. The Integration Service adds a SELECT DISTINCT statement to the default SQL query.
- Custom SQL query: We can write our own SQL query to perform calculations.
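The sketch below (plain Python, not Informatica code, using a hypothetical STUD table and columns) illustrates how these properties change the default query that the Source Qualifier generates:

```python
# Minimal sketch of how Source Qualifier properties shape the generated SQL.
def build_default_query(columns, table, source_filter=None,
                        sorted_ports=0, select_distinct=False):
    select = "SELECT DISTINCT" if select_distinct else "SELECT"   # Select Distinct property
    query = f"{select} {', '.join(columns)} FROM {table}"
    if source_filter:                       # Source Filter -> WHERE clause
        query += f" WHERE {source_filter}"
    if sorted_ports > 0:                    # Number of Sorted Ports -> ORDER BY clause
        query += " ORDER BY " + ", ".join(columns[:sorted_ports])
    return query

print(build_default_query(["STUD_ID", "STUD_NAME", "CLASS"], "STUD",
                          source_filter="CLASS = 'A'", sorted_ports=1))
# SELECT STUD_ID, STUD_NAME, CLASS FROM STUD WHERE CLASS = 'A' ORDER BY STUD_ID
```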
2.Aggregator Transformation:
The Aggregator transformation is an active and connected transformation. It is used to perform calculations on data, such as sums, averages, counts, and so on.
The Integration Service stores the group data and row data in the aggregate cache. Compared to plain SQL aggregation, the Aggregator transformation is more flexible; for example, we can use conditional clauses to filter rows.
Features of aggregator transformations:
Here are the main features of the aggregator transformation:
- Aggregate expression
- Sorted input
- Aggregate cache
- Unsorted input
Aggregate expression:
Aggregate functions are used to build aggregate expressions, which can be developed in variable ports or output ports only.
Sorted input:
To improve aggregation performance, the group-by ports are sorted using a Sorter transformation, and the aggregator receives the sorted data as input.
The Sorter transformation is placed before the Aggregator transformation so that the data is sorted on the group-by ports.
Aggregate cache:
The Integration Service creates an aggregate cache to hold the group and row data.
Unsorted input:
With unsorted input, the aggregate cache is made up of the group-by ports, non-group-by input ports, and the output port that holds the aggregate expression.
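Here is a minimal Python sketch (not Informatica code; the department and salary data are hypothetical) of what an Aggregator transformation does: group rows on a port and compute one aggregate row per group.

```python
from collections import defaultdict

rows = [
    {"DEPT": "HR", "SALARY": 4000},
    {"DEPT": "HR", "SALARY": 5000},
    {"DEPT": "IT", "SALARY": 7000},
]

totals = defaultdict(float)                 # acts like the aggregate cache
for row in rows:
    totals[row["DEPT"]] += row["SALARY"]    # SUM(SALARY) grouped by DEPT

# One output row per group: active transformation (3 rows in, 2 rows out)
output = [{"DEPT": dept, "SUM_SALARY": total} for dept, total in totals.items()]
print(output)
```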
3.Router Transformation:
The Router transformation is an active and connected transformation that is similar to the Filter transformation in that it filters the source data.
In a Router transformation, the Integration Service uses the filter conditions to evaluate every row of incoming data. It tests the conditions of each user-defined group before processing the default group.
If a row satisfies more than one group filter condition, the Integration Service passes that row to multiple groups. Rows that do not satisfy any of the conditions are passed to the default output group.
When we need to test the same input data against multiple conditions, we use a single Router transformation in the mapping rather than creating multiple Filter transformations.
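A minimal Python sketch (not Informatica code; the group names and conditions are hypothetical) of how a Router evaluates every row against each user-defined group and sends unmatched rows to the default group:

```python
rows = [{"CLASS": "A", "MARKS": 85}, {"CLASS": "B", "MARKS": 40}, {"CLASS": "C", "MARKS": 65}]

groups = {
    "HIGH_SCORERS": lambda r: r["MARKS"] >= 80,   # user-defined group conditions
    "CLASS_A":      lambda r: r["CLASS"] == "A",
}

output = {name: [] for name in groups}
output["DEFAULT"] = []

for row in rows:
    matched = False
    for name, condition in groups.items():
        if condition(row):
            output[name].append(row)   # a row can be routed to several groups
            matched = True
    if not matched:
        output["DEFAULT"].append(row)  # unmatched rows go to the default group

print(output)
```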
4.Joiner Transformation:
The Joiner transformation is an active and connected transformation. It provides the option of creating joins in Informatica, and the joins created with the Joiner transformation are similar to database joins.
The Joiner transformation is used to join two heterogeneous sources. It joins the sources based on a condition that matches one or more pairs of columns between the two sources.
There are two input pipelines: a master pipeline and a detail pipeline. To join more than two sources, we must join the output of the Joiner transformation with another source, so n-1 Joiner transformations are required to join n sources in a mapping.
In the Joiner transformation properties, we can specify which data source should be the master source and which should be the detail source.
During execution, the master source is cached in memory for the join. As a result, the source with the fewest records should be chosen as the master source.
Configuring Joiner Transformations:
In Informatica, we can configure the following Joiner transformation properties:
- Case-Sensitive String Comparison: This option is used by the integration service when performing joins on string columns. The case sensitive string comparison option is enabled by default.
- Cache Directory: A directory where the master or detail rows are cached. $PMCacheDir is the default directory path. We can also override this value.
- Join Type: Specifies whether the join should be a Master Outer Join, a Detail Outer Join, a Normal Join, or a Full Outer Join.
- Tracing Level: Specifies the amount of detail that is logged in the session log file.
- Joiner Data Cache Size: This value indicates the size of the data cache. And the data cache size is set to Auto by default.
- Joiner Index Cache Size: This value indicates the size of the index cache. And the index cache size is set to Auto by default.
- Sorted Input: When the input data is sorted, this option is used. It also improves performance.
- Master Sort Order: This property specifies the sort order of the master source data. If the master source data is sorted in ascending order, we select Ascending. If we select Ascending, we must also enable the Sorted Input option. The default value for this option is Auto.
- Transformation Scope: We can choose between All Input and Row as the transformation scope.
Types of joins:
The joiner transformation in Informatica can be used to create the following joins:
- Master outer join: A master outer join returns all records from the detail source and only the matching rows from the master source.
- Detail outer join: A detail outer join returns all rows from the master source and only the matching rows from the detail source.
- Full outer join: A full outer join returns all records from both sources. In SQL terms, the master outer and detail outer joins are equivalent to left outer joins.
- Normal join: A normal join returns only the matching rows from both sources.
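As an illustration (plain Python, not Informatica code; the department and employee rows are hypothetical), the following sketch caches the smaller master source and shows how a normal join and a master outer join differ:

```python
master = [{"DEPT_ID": 10, "DEPT_NAME": "HR"}, {"DEPT_ID": 20, "DEPT_NAME": "IT"}]
detail = [{"EMP_ID": 1, "DEPT_ID": 10}, {"EMP_ID": 2, "DEPT_ID": 30}]

# Cache the master rows by the join key (like the joiner's index/data cache)
master_cache = {row["DEPT_ID"]: row for row in master}

normal_join, master_outer = [], []
for d in detail:
    m = master_cache.get(d["DEPT_ID"])
    if m:                                    # matching rows from both sources
        normal_join.append({**d, **m})
        master_outer.append({**d, **m})
    else:                                    # master outer keeps all detail rows
        master_outer.append({**d, "DEPT_NAME": None})

print(normal_join)   # only EMP_ID 1 has a matching department
print(master_outer)  # EMP_ID 2 is kept with a NULL department name
```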
The Joiner transformation is also known as a blocking transformation. Depending on whether the Joiner transformation is configured for sorted input or not, the Integration Service blocks and unblocks the source data.
Unsorted Joiner transformation:
When using an unsorted joiner transformation, the integration service reads all of the master rows before reading the detail rows.
The Integration Service blocks the detail source while it caches all of the master rows. After reading all of the master rows, it unblocks the detail source and reads the detail rows.
Sorted Joiner transformation:
In the case of sorted joiner transformation, blocking logic may or may not be possible. If the integration service can use blocking logic without blocking all sources in the target load order group, it does so. Otherwise, no blocking logic is used.
5.Rank Transformation:
Rank is an active and connected transformation that filters data based on groups and ranks. The Rank transformation also provides the ability to perform group-based ranking.
The Rank transformation has a rank output port that is used to rank the rows.
It is used in Informatica to select a top or bottom range of data. The Rank transformation can rank string value ports as well as numeric port values. One might think that the MAX and MIN functions can perform the same job.
However, the Rank transformation allows groups of records to be listed instead of a single value or record. The Rank transformation is built using the ports listed below.
- Input port (I)
- Output port (O)
- Variable port (V)
- Rank Port (R)
Rank Port: The port that participates in the rank calculation is referred to as the rank port.
Variable Port: A variable port is one that allows us to create expressions to temporarily store data for the rank calculation. Using variable ports, we can write expressions for the rank calculation.
Configuring the rank transformation:
Let's look at how to set up the following Rank transformation properties:
- Cache Directory: This is the location where the integration service stores the index and data cache files.
- Top/Bottom: This specifies whether we want to select the top or bottom rank of the data.
- Number of Ranks: This specifies the number of rows to rank.
- Case-Sensitive String Comparison: It is used to sort strings based on case.
- Tracing Level: The amount of logging that will be recorded in the session log file.
- Rank Data Cache Size: The default data cache size is 2,000,000 bytes. For the data cache size, we could indeed specify a numeric value or leave it at Auto. In the case of Auto, the cache size is determined at runtime by the Integration Service.
- Rank Index Cache Size: The index cache size is set to 1,000,000 bytes by default. For the index cache size, we can specify a numeric value or leave it at Auto. In the case of Auto, the cache size is determined at runtime by the Integration Service.
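A minimal Python sketch (not Informatica code; the department and salary data and the rank count are hypothetical) of what a group-based, top-N rank looks like:

```python
from collections import defaultdict

rows = [
    {"DEPT": "HR", "EMP": "A", "SALARY": 4000},
    {"DEPT": "HR", "EMP": "B", "SALARY": 6000},
    {"DEPT": "HR", "EMP": "C", "SALARY": 5000},
    {"DEPT": "IT", "EMP": "D", "SALARY": 7000},
]

groups = defaultdict(list)
for row in rows:
    groups[row["DEPT"]].append(row)        # group by the DEPT port

top_n = 2                                  # "Number of Ranks" property
output = []
for dept, members in groups.items():
    ranked = sorted(members, key=lambda r: r["SALARY"], reverse=True)  # Top
    for rank, row in enumerate(ranked[:top_n], start=1):
        output.append({**row, "RANKINDEX": rank})

print(output)   # top 2 salaries per department: fewer rows out than in
```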
6.Sequence Generator Transformation:
The sequence generator transformation is a passive and connected transformation that generates numeric sequence values such as 1, 2, 3, and so on. It has no effect on the number of rows in the input.
The Sequence Generator transformation is used to generate unique primary key values and to replace missing primary keys.
If we want to assign sequence values to the source records, for example, we must use a sequence generator.
There are two output ports on the Sequence Generator transformation, and these ports cannot be edited or deleted:
- CURRVAL
- NEXTVAL
NEXTVAL:
The NEXTVAL port generates sequence numbers when it is connected to a transformation or target. The generated sequence numbers are determined by the Current Value and Increment By properties.
If the sequence generator is not configured to Cycle, the NEXTVAL port increments the sequence numbers until the End Value is reached.
We can connect the NEXTVAL port to multiple transformations to generate different values for each row.
The Sequence Generator transformation generates a block of numbers at a time. Once a block of numbers is used, the next block of sequence numbers is generated.
For instance, in a mapping, we could connect NEXTVAL to two target tables to generate unique primary key values.
For the first target, the integration service generates a block of numbers ranging from 1 to 10. Only after the first block of numbers has been loaded will a second block of numbers ranging from 11 to 20 be generated for the second target.
CURRVAL:
The CURRVAL port value is NEXTVAL plus the Increment By value.
We connect the CURRVAL port when the NEXTVAL port is already connected to a downstream transformation. When we connect the CURRVAL port but not the NEXTVAL port, the Integration Service returns a constant value for each row.
When the CURRVAL port is connected in a Sequence Generator transformation, the Integration Service processes one row in each block.
In a mapping, we can improve performance by connecting only the NEXTVAL port.
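A minimal Python sketch (not Informatica code) of the NEXTVAL/CURRVAL behaviour described above, assuming hypothetical Current Value, Increment By, End Value, and Cycle settings:

```python
class SequenceGenerator:
    """Sketch of NEXTVAL/CURRVAL; CURRVAL = NEXTVAL + Increment By."""
    def __init__(self, current_value=1, increment_by=1, end_value=10, cycle=False):
        self.start = current_value
        self.current = current_value
        self.increment = increment_by
        self.end = end_value
        self.cycle = cycle

    def nextval(self):
        if self.current > self.end:
            if not self.cycle:
                raise ValueError("End Value reached and Cycle is disabled")
            self.current = self.start          # wrap around when Cycle is set
        value = self.current
        self.current += self.increment
        return value

    def currval(self, last_nextval):
        return last_nextval + self.increment   # CURRVAL = NEXTVAL + Increment By

seq = SequenceGenerator(current_value=1, increment_by=1, end_value=5)
for _ in range(3):
    n = seq.nextval()
    print(n, seq.currval(n))   # rows receive 1/2, 2/3, 3/4
```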
7.Transaction Control Transformation:
A Transaction Control transformation is an active and connected transformation. It enables us to commit and roll back transactions based on a set of rows that pass through the Transaction Control transformation.
Commit and rollback operations are critical because they ensure the availability of data.
A transaction is a set of rows bound by commit or rollback rows. We can define a transaction based on a varying number of input rows. We can also identify transactions based on a group of rows ordered by a common key, such as an employee ID or order entry date.
When processing a large amount of data, there may be a need to commit the data to the target at intervals. If a commit is performed too often, it creates overhead for the system.
If a commit is performed too late, the data may be lost in the event of a failure. The Transaction Control transformation therefore provides the needed flexibility.
The transaction control transformation in PowerCenter is described at the following levels:
- Within a mapping: To determine a transaction within a mapping, we use the Transaction Control transformation. In a Transaction Control transformation, we define transactions with an expression. We can commit, rollback, or continue based on the expression's return value without affecting the transaction.
- Within a session: We set up a session for the user-specified commit. If the Integration Service fails to transform or write any rows to the target, we have the option of committing or rolling back a transaction.
When we run the session, the Integration Service evaluates the expression for each row that enters the transformation. When it evaluates a commit row, it commits all rows in the transaction to the target or targets. When it evaluates a rollback row, it rolls back all rows in the transaction from the target or targets.
If the mapping's target is a flat file, the Integration Service can generate a new output file for each transaction, and the target flat files can be named dynamically.
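A minimal Python sketch (not Informatica code; the order data and the commit-on-date-change rule are hypothetical) of the per-row commit logic that a Transaction Control expression implements:

```python
rows = [
    {"ORDER_ID": 1, "ORDER_DATE": "2024-01-01"},
    {"ORDER_ID": 2, "ORDER_DATE": "2024-01-01"},
    {"ORDER_ID": 3, "ORDER_DATE": "2024-01-02"},
]

committed, buffer = [], []
prev_date = None
for row in rows:
    # TC_COMMIT_BEFORE-style rule: commit the open transaction when the key changes
    if prev_date is not None and row["ORDER_DATE"] != prev_date:
        committed.extend(buffer)      # commit the buffered transaction
        buffer = []
    buffer.append(row)                # otherwise continue the transaction
    prev_date = row["ORDER_DATE"]

committed.extend(buffer)              # final commit at end of data
print(committed)
```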
8.Lookup Transformations:
The lookup transformation is often used to find relevant data by looking up a source, source qualifier, or target.
It is a join operation in which one of the joining tables contains the source data and the other joining table contains the lookup data.
The Lookup transformation is often used to extract data based on a lookup condition that is specified. A Lookup transformation, for instance, could be used to retrieve values from a database table for codes found in source data.
When a mapping task contains a Lookup transformation, the task queries the lookup source using the lookup fields and a lookup condition. The Lookup transformation sends the lookup result to the target or another transformation.
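A minimal Python sketch (not Informatica code; the country-code lookup table and column names are hypothetical) of how a lookup condition retrieves a value for each source row:

```python
lookup_table = {"IN": "India", "US": "United States"}   # code -> description

source_rows = [{"CUST_ID": 1, "COUNTRY_CODE": "IN"},
               {"CUST_ID": 2, "COUNTRY_CODE": "FR"}]

output = []
for row in source_rows:
    # Lookup condition: COUNTRY_CODE equals the lookup key; unmatched rows get NULL
    row["COUNTRY_NAME"] = lookup_table.get(row["COUNTRY_CODE"])
    output.append(row)

print(output)   # FR has no match, so COUNTRY_NAME is None (NULL)
```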
9.Normalizer Transformations:
The Normalizer is an active transformation. It is used to split a single row into multiple rows. When the Normalizer transformation receives a row with multiple-occurring data, it returns a row for each instance of that data.
When there is repeating data in multiple columns of a single row, it can be split into multiple rows.
For instance, a relational source might contain four fields holding quarterly sales data. A Normalizer transformation can be used to generate a separate output row for each quarter.
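A minimal Python sketch (not Informatica code; the store and quarterly columns are hypothetical) of how one row with four repeating columns is normalized into four rows:

```python
row = {"STORE": "S1", "Q1": 100, "Q2": 120, "Q3": 90, "Q4": 150}

output = []
for quarter in ("Q1", "Q2", "Q3", "Q4"):
    # One output row per occurrence of the repeating data (active: 1 row in, 4 rows out)
    output.append({"STORE": row["STORE"], "QUARTER": quarter, "SALES": row[quarter]})

print(output)
```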
10.Performance tuning:
The aim of performance tuning is to improve session performance by removing bottlenecks in order to achieve an acceptable ETL load time.
Tuning begins with identifying bottlenecks in the source, target, and mapping, and then progresses to session tuning. The system resources on which the Informatica PowerCenter services run may also need to be tuned further.
When tuning session performance, we can use the test load option to run sessions.
Once all of the bottlenecks are tuned, we can further optimise session performance by increasing the number of pipeline partitions in the session.
11.Expression Transformation:
In Informatica, an Expression Transformation is a Passive and Connected transformation that is used to perform non-aggregate calculations on the source data. You can use this transformation to perform any type of manipulation you want on a single record. It accepts row-by-row data, manipulates it, and then moves it to the target.
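A minimal Python sketch (not Informatica code; the name and salary columns are hypothetical) of the kind of row-by-row, non-aggregate calculation an Expression transformation performs:

```python
rows = [{"FIRST": "Asha", "LAST": "Rao", "SALARY": 5000},
        {"FIRST": "John", "LAST": "Lee", "SALARY": 6000}]

output = []
for row in rows:
    output.append({
        "FULL_NAME": row["FIRST"] + " " + row["LAST"],   # string manipulation per row
        "ANNUAL_SALARY": row["SALARY"] * 12,             # arithmetic per row
    })

print(output)   # same number of rows in and out: passive transformation
```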
12.External Procedure Transformation:
The External Procedure transformation is a passive transformation that can be connected or unconnected. We can use the External Procedure transformation to call a procedure from a shared library or from the Windows COM layer.
Conclusion:
In this blog post, we discussed Informatica and its types of transformations in depth. If you find anything that has not been covered, please drop it in the comments section, and we will definitely consider it. Happy learning!