Ab Initio Interview Questions

Ab Initio is a popular software platform for data processing and enterprise application integration. It provides a single platform for data analysis, complex event processing, batch processing, data manipulation, and quantitative and qualitative data processing. In this post, our experts have put together the top 30 Ab Initio interview questions to help you through your interview process. Go through the Ab Initio interview questions and answers below to sharpen your skills.

Most Frequently Asked Ab Initio Interview Questions and Answers

1. What are the components of Ab Initio architecture?

Ans: The following are the components of Ab Initio architecture.

  • Co>Operating system (Co>Op)
  • Component Library
  • Graphical Development Environment (GDE)
  • Enterprise Meta>Environment (EME)
  • Data Profiler
  • Conduct>It

2. Tell me about the Co>Operating System in Ab Initio.

Ans: The Co>Operating System runs on top of the native operating system and serves as the base for all Ab Initio processes. It can run on operating systems such as Windows, Linux, Solaris, AIX, HP-UX, and z/OS. It provides Ab Initio extensions to the operating system through which ETL processes can be controlled. It also manages and runs Ab Initio graphs, and it manages metadata by interacting with the EME.

3. Explain the relation between the EME, the GDE, and the Co>Operating System.

Ans: The Co>Operating System is installed on top of the native operating system. The Enterprise Meta>Environment (EME) is a repository for storing and managing metadata. The Graphical Development Environment (GDE) is the graphical application for designing and running Ab Initio graphs. Users can access the EME metadata through the GDE, a web browser, or the Co>Operating System command line.

4. What do you mean by dependency analysis in Ab Initio?

Ans: The dependency analysis is a process through which the EME analyzes the project for the dependencies within and between the graphs. It examines the entire project and tracks how data is being transformed and transferred from component to component and field by field. The steps involved in dependency analysis are translation and analysis.

5. How is Ab Initio EME segregated?

Ans: The Ab Initio EME is logically segregated into two parts: a data integration portion and a user interface for accessing metadata information.

6. What do you know about overflow errors?

Ans: Overflow errors are raised when a computed or stored value does not fit in the memory allocated for it, which is common when processing bulky sets of data. For example, storing a value that needs more than 8 bits in an 8-bit field raises an overflow error.
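
As a rough illustration in Python rather than Ab Initio DML, the sketch below shows the same idea: a value that needs more than 8 bits cannot be stored in a single byte. The helper name `fits_in_one_byte` is hypothetical and only serves the example.

```python
def fits_in_one_byte(value: int) -> bool:
    """Return True if an unsigned integer fits in 8 bits (0..255)."""
    try:
        # to_bytes raises OverflowError when the value needs more than 1 byte
        # (or is negative, since the conversion is unsigned by default).
        value.to_bytes(1, byteorder="big")
        return True
    except OverflowError:
        return False


print(fits_in_one_byte(255))  # largest unsigned 8-bit value
print(fits_in_one_byte(256))  # needs 9 bits, so it overflows one byte
```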

7. How to connect EME to Ab Initio Server?

Ans: EME can be connected to the Ab Initio server in several ways. The following are the ways to connect to EME.

8. What are the file extensions used in Ab Initio?

Ans: Below are the file extensions used in Ab Initio.

  • .mp - graph or graph component
  • .mpc - custom component or program
  • .dbc - database configuration files
  • .dat - data files
  • .mdc - dataset template files or custom dataset components
  • .ksh - shell scripting file
  • .xfr - transform function files
  • .dml - record format files

9. What are the kinds of layouts that Ab Initio supports?

Ans: A layout defines where each component runs. Ab Initio has two kinds of layouts.

  • Serial layout - the level of parallelism is 1.
  • Parallel layout - the level of parallelism depends on the data partition.

10. How to add default rules in the transformer?

Ans: Go to the component properties, open the parameters tab, and double-click the transform parameter. The transform editor opens. From the Edit menu, select the 'Add Default Rules' option. You can choose between the Match Names and Wildcard options.

11. What is a local lookup?

Ans: The local variant of the lookup function is used when the lookup file is a multifile that is partitioned and sorted on a particular key. The search is then local to a single partition, which is determined by the key. Because the records of the lookup file are loaded into memory, the transform function retrieves them faster than it could from disk.
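
The idea of a partition-local lookup can be sketched in Python. This is not Ab Initio code: the function names, the toy hash in `partition_of`, and the sample records are all assumptions made for illustration.

```python
def partition_of(key: str, n_partitions: int) -> int:
    """Map a key to a partition (a toy deterministic hash, assumed here)."""
    return sum(key.encode("utf-8")) % n_partitions


def build_partitioned_lookup(records, key_field, n_partitions):
    """Split lookup records into one in-memory dictionary per partition."""
    partitions = [dict() for _ in range(n_partitions)]
    for rec in records:
        key = rec[key_field]
        partitions[partition_of(key, n_partitions)][key] = rec
    return partitions


def lookup_in_partition(partitions, partition_index, key):
    """Search only one partition's table, as a local lookup would."""
    return partitions[partition_index].get(key)


customers = [{"id": "C1", "name": "Ann"}, {"id": "C2", "name": "Bob"}]
tables = build_partitioned_lookup(customers, "id", n_partitions=2)
home = partition_of("C2", 2)
print(lookup_in_partition(tables, home, "C2"))      # found in its own partition
print(lookup_in_partition(tables, 1 - home, "C2"))  # not visible elsewhere
```

Because each partition searches only its own table, the input flow must be partitioned on the same key as the lookup data for every record to find its match.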

12. What is the difference between a lookup file and lookup?

Ans: A lookup file represents one or more serial files, also known as flat files. It is the physical file where the lookup data is stored, and it must be small enough to be held in memory. A lookup is a component of an Ab Initio graph in which data resides along with a key parameter. The data can be retrieved using this key parameter.

13. What information does a .dbc file provide to connect to the database?

Ans: The .dbc file provides the following information to connect to a database.

  • Name and version number of the database
  • Name of the system on which the database server or instance is running
  • Name of the server or database instance

14. How can you execute a graph infinitely in Ab Initio?

Ans: Calling the graph's .ksh file from the end script runs the graph in an endless loop. If the graph is named xyz.mp, the end script of the graph should call xyz.ksh.

15. Explain the different types of parallelism used in Ab Initio.

Ans: Ab Initio provides three types of parallelism.

  • Component parallelism - It is used by a graph that has multiple processes executing simultaneously on separate data.
  • Data parallelism - It is used by a graph that works with data divided into segments and operates on each segment simultaneously.
  • Pipeline parallelism - It is used by a graph that has multiple components executing simultaneously on the same data.
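
Data parallelism in particular can be sketched in plain Python. This is an illustrative analogue, not how the Co>Operating System is implemented. The function names and the doubling transform are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, n):
    """Deal records into n partitions, one record at a time."""
    return [records[i::n] for i in range(n)]


def transform_partition(partition):
    """The same logic runs independently on every partition."""
    return [value * 2 for value in partition]


def run_data_parallel(records, n_partitions):
    """Apply the transform to all partitions concurrently."""
    partitions = split_into_partitions(records, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        # pool.map returns results in the same order as the partitions.
        return list(pool.map(transform_partition, partitions))


print(run_data_parallel([1, 2, 3, 4, 5, 6], n_partitions=3))
```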

16. What do you mean by Sort Component in Ab Initio?

Ans: The Sort component in Ab Initio is used for reordering data. It has two main parameters.

  • Key - It specifies the field or fields to sort on and therefore the collation order.
  • Max-core - It sets the maximum amount of memory the sort may use, which determines how often the component dumps data from memory to disk.
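
The spill-to-disk behaviour that max-core governs resembles a classic external merge sort. The Python sketch below is an assumption-laden illustration, not Ab Initio's implementation: the hypothetical `max_records_in_memory` parameter counts records, whereas a real memory limit would be expressed as a memory size.

```python
import heapq
import pickle
import tempfile


def external_sort(records, key, max_records_in_memory):
    """Sort records, spilling sorted runs to temp files when the buffer fills."""
    runs = []
    buffer = []

    def spill():
        # Sort what is in memory and write it out as one sorted run.
        buffer.sort(key=key)
        run_file = tempfile.TemporaryFile()
        for rec in buffer:
            pickle.dump(rec, run_file)
        run_file.seek(0)
        runs.append(run_file)
        buffer.clear()

    for rec in records:
        buffer.append(rec)
        if len(buffer) >= max_records_in_memory:
            spill()
    if buffer:
        spill()

    def read_run(run_file):
        # Stream records back from one run until the file is exhausted.
        while True:
            try:
                yield pickle.load(run_file)
            except EOFError:
                return

    # Merge the sorted runs into one sorted output.
    merged = list(heapq.merge(*(read_run(f) for f in runs), key=key))
    for f in runs:
        f.close()
    return merged


print(external_sort([5, 3, 8, 1, 9, 2], key=lambda x: x, max_records_in_memory=2))
```

A smaller in-memory limit produces more runs on disk and more merge work, which is why a generous memory setting tends to make sorting faster.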

17. What do the dedup component and the replicate component do?

Ans: The two components do the following.

  • Dedup component - It removes duplicate records from the flow based on a specified key.
  • Replicate component - It combines the input records it receives into one flow and writes a copy of that flow to each of its output ports.
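
A minimal Python sketch of both behaviours follows. The function names are hypothetical, and keeping the first record per key is an assumption about which duplicate survives.

```python
def dedup_keep_first(records, key_field):
    """Keep the first record seen for each key and drop later duplicates."""
    seen = set()
    unique = []
    for rec in records:
        key = rec[key_field]
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def replicate(records, n_outputs):
    """Return one independent copy of the flow per output port."""
    return [list(records) for _ in range(n_outputs)]


flow = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 1, "v": "c"}]
print(dedup_keep_first(flow, "id"))
print(replicate(flow, n_outputs=2))
```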

18. What are the types of partition components in Ab Initio?

Ans: Ab Initio has the following partition components.

  • Partition by Round-Robin - It distributes the data evenly in block-size chunks.
  • Partition by Range - Based on a key and a set of partitioning ranges, it divides data evenly among nodes.
  • Partition by Key - It partitions data based on a key, so records with the same key value go to the same partition.
  • Partition by Percentage - It distributes data so that the output is proportional to fractions of 100.
  • Partition by Expression - It divides data based on a DML expression.
  • Partition with Load Balance - It distributes data based on dynamic load balancing.
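
Three of these strategies can be sketched in Python to make the differences concrete. These are illustrative analogues only. The function names, the CRC32 hash, and the split-point convention are assumptions, not Ab Initio behaviour.

```python
from bisect import bisect_left
from zlib import crc32


def partition_round_robin(records, n, block_size=1):
    """Deal records to n partitions in turn, block_size records at a time."""
    parts = [[] for _ in range(n)]
    for i, rec in enumerate(records):
        parts[(i // block_size) % n].append(rec)
    return parts


def partition_by_key(records, key_field, n):
    """Send records with the same key value to the same partition."""
    parts = [[] for _ in range(n)]
    for rec in records:
        digest = crc32(str(rec[key_field]).encode("utf-8"))
        parts[digest % n].append(rec)
    return parts


def partition_by_range(records, key_field, split_points):
    """split_points is sorted; a value goes to the first range that covers it."""
    parts = [[] for _ in range(len(split_points) + 1)]
    for rec in records:
        parts[bisect_left(split_points, rec[key_field])].append(rec)
    return parts


print(partition_round_robin(list(range(6)), n=3))
print(partition_by_range([{"age": a} for a in (5, 10, 15, 25)], "age", [10, 20]))
```

Round-robin gives the most even spread but scatters equal keys. Key-based partitioning keeps equal keys together, which key-dependent steps such as joins or rollups rely on.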

19. What is a surrogate key?

Ans: A surrogate key is a system-generated, unique, sequential number. It acts as a primary key.
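
As a small Python illustration, a surrogate key can be generated by attaching a running counter to each record. The function name and the field name `surrogate_key` are hypothetical.

```python
from itertools import count


def assign_surrogate_keys(records, start=1):
    """Attach a unique sequential number to every record."""
    counter = count(start)
    return [{"surrogate_key": next(counter), **rec} for rec in records]


rows = assign_surrogate_keys([{"name": "Ann"}, {"name": "Bob"}])
print(rows)
```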

20. Explain the sandbox.

Ans: A sandbox is a directory that contains a collection of Ab Initio graphs and related files. It is local to a user and is a replica of the project in the EME. It is helpful for version control, migration, and navigation.

21. What is the relation between limit and ramp?

Ans: Limit and ramp together set the reject tolerance of a graph. Limit is an absolute number of rejects, and ramp is the rate of rejection relative to the number of records processed. The formula for the reject tolerance is:

limit + (ramp*no_of_records_processed)
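
The formula can be worked through in a few lines of Python. The function names and the sample values are assumptions chosen for illustration, as is the convention that the run stops once rejects exceed the tolerance.

```python
def reject_tolerance(limit, ramp, records_processed):
    """Number of rejects tolerated so far: limit + ramp * records processed."""
    return limit + ramp * records_processed


def exceeds_tolerance(rejects, limit, ramp, records_processed):
    """True when the rejects seen so far are above the tolerated number."""
    return rejects > reject_tolerance(limit, ramp, records_processed)


# With limit = 50 and ramp = 0.01, 1000 processed records tolerate
# 50 + 0.01 * 1000 rejects.
print(reject_tolerance(50, 0.01, 1000))
print(exceeds_tolerance(75, 50, 0.01, 1000))
```

With a ramp of 0 the tolerance is a fixed count, and with a limit of 0 it is purely proportional to the volume processed.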


22. How will you handle it if DML is changing dynamically?

Ans: There are several ways to handle dynamically changing DML. Some of the methods are:

  • Use a conditional DML
  • Call the vector functionality while calling the DMLs
  • Use the MULTI REFORMAT component

23. What are the air commands used in Ab Initio?

Ans: Here are some of the air commands in Ab Initio.

  • air object ls
  • air object rm
  • air project modify
  • air object versions -verbose
  • air lock show -user
  • air sandbox status

24. What are the types of output formats that we can get after processing data?

Ans: The output can be of the following formats.

  • Charts
  • Tables
  • Vectors
  • Plain Text files
  • Maps
  • Raw files
  • Image files

25. Explain the rollup component.

Ans: A rollup component groups records based on certain field values. It is a multi-stage transform that contains functions such as initialize, rollup, and finalize.
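
The three stages can be sketched in Python as three callbacks applied per group. This is an analogue for illustration, not DML. The function signatures, the parameter name `accumulate` (standing in for the rollup stage), and the sample data are assumptions.

```python
def rollup(records, key_field, initialize, accumulate, finalize):
    """Group records by key and run the three stages for each group."""
    state = {}
    for rec in records:
        key = rec[key_field]
        if key not in state:
            state[key] = initialize(rec)             # once per group
        state[key] = accumulate(state[key], rec)     # once per record
    return [finalize(key, temp) for key, temp in state.items()]  # once per group


sales = [
    {"customer": "A", "amount": 10},
    {"customer": "B", "amount": 5},
    {"customer": "A", "amount": 7},
]
totals = rollup(
    sales,
    "customer",
    initialize=lambda rec: {"total": 0, "count": 0},
    accumulate=lambda temp, rec: {
        "total": temp["total"] + rec["amount"],
        "count": temp["count"] + 1,
    },
    finalize=lambda key, temp: {"customer": key, **temp},
)
print(totals)
```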

26. What are is_valid and is_defined used for?

Ans: Both functions test a value and return 1 or 0.

  • is_valid - It tests whether a value is valid. If the expression is a valid data item, the result is 1. Otherwise, the result is 0.
  • is_defined - It tests whether an expression is not NULL. If the expression is non-NULL, the result is 1. Otherwise, the result is 0.
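
The distinction between the two checks can be mimicked in Python. These helpers are hypothetical analogues, with `None` standing in for NULL and a date format standing in for the declared type. They are not the DML functions themselves.

```python
from datetime import datetime


def is_defined(value):
    """Return 1 if the value is not NULL (None here), else 0."""
    return 1 if value is not None else 0


def is_valid_date(text, fmt="%Y-%m-%d"):
    """Return 1 if the text parses as a date in the given format, else 0."""
    if text is None:
        return 0
    try:
        datetime.strptime(text, fmt)
        return 1
    except ValueError:
        return 0


print(is_defined("2023-02-30"), is_valid_date("2023-02-30"))  # defined but not valid
print(is_defined(None), is_valid_date(None))
```

A value can be defined without being valid, which is why the two checks are often used together before a field is transformed.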


27. How does the force_error function work?

Ans: The force_error function raises an error when a specified condition is not met. It is useful when you want to stop the execution of a graph that doesn't meet the set condition. It sends the record to the reject port and the error message to the error port.
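
The routing described here can be imitated in Python with an exception that a small runner catches. Everything in this sketch is an assumption made for illustration: the `ForcedError` class, the `run` function, and the three lists that stand in for the output, reject, and error ports.

```python
class ForcedError(Exception):
    """Raised deliberately when a record fails a business rule."""


def force_error(message):
    """Abort processing of the current record with the given message."""
    raise ForcedError(message)


def transform(rec):
    """Reject negative amounts; otherwise add a derived field."""
    if rec["amount"] < 0:
        force_error(f"negative amount for id {rec['id']}")
    return {**rec, "amount_cents": rec["amount"] * 100}


def run(records):
    """Route each record to the out port, or to reject plus error."""
    out, reject, error = [], [], []
    for rec in records:
        try:
            out.append(transform(rec))
        except ForcedError as exc:
            reject.append(rec)
            error.append(str(exc))
    return out, reject, error


out, reject, error = run([{"id": 1, "amount": 3}, {"id": 2, "amount": -1}])
print(out, reject, error)
```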

28. How to improve the performance of a graph?

Ans: We can improve the performance of a graph through the following methods.

  • Use lookup instead of join and merge components.
  • When you have to join two files and don't want duplicates, use a union function instead of a duplicate remover.
  • Minimize the use of sort components.
  • Reduce the use of regular expression functions in the transform functions.
  • Don't use broadcast as a partitioner for large datasets.
  • Use only the fields that are required in sort, reformat, and join components.

29. List the commonly used components in an Ab Initio graph.

Ans: The commonly used components in an Ab Initio graph are:

  • Input file/output file
  • Lookup file
  • Input table/output table
  • Join
  • Sort
  • Partition
  • Partition by key
  • Gather
  • Reformat
  • Concatenate

30. Explain the difference between conventional loading and direct loading.

Ans: The two loading methods differ in when the table constraints are checked.

  • Conventional load - All the table constraints are checked against the data before loading it.
  • Direct load - All the table constraints are disabled, and the data is loaded directly. After the data load is done, the table constraints are checked against the data.


Major companies such as American Express, Citi Bank, JP Morgan Chase, Time Warner Cable, Home Depot, and Premier use Ab Initio for their data processing and integration needs. By industry, its customers are 20% Computer Software, 10% Information Technology and Services, 9% Higher Education, and 9% Education Management, among others, and it has a market share of 5.12%. Ab Initio developer and admin roles are in high demand, so prepare well on the basics of Ab Initio, and you will have a good chance of cracking the interview.

Research Analyst
As a Senior Writer for HKR Trainings, Sai Manikanth has a great understanding of today’s data-driven environment, which includes key aspects such as Business Intelligence and data management. He manages the task of creating great content in the areas of Digital Marketing, Content Management, Project Management & Methodologies, Product Lifecycle Management Tools.