Data Modeling Interview questions

Data modeling is the process of creating a model for the data to store in a database. It is a conceptual representation of data objects, the association between different data objects, and the rules.

In this article, you can go through the set of Data Modeling interview questions most frequently asked in the interview panel. This will help you crack the interview as the topmost industry experts curate these at HKR training.

Let us have a quick review of the Data Modeling interview questions.

1. What is data modeling?

Ans: A data model is a conceptual representation of business requirements (logical data model) or database objects (physical) required for a database and is very powerful in expressing and communicating the business requirements and database objects. The approach by which data models are created is called data modeling.

2. What does the data model contain?


Logical Data Model: 

Entity, Attributes, Super Type, Sub Type, Primary Key, Alternate Key, Inversion Key Entry, Rule, Relationship, Definition, business rule, etc

Physical Data Model: 

Table, Column, Primary key Constraint, Unique Constraint or Unique Index, Non Unique Index, Check Constraint, Default Value, Foreign Key, comment etc.

3. Explain various types of data models.

Ans: There are mainly three different types of data models:

  • Conceptual: Conceptual data model defines what should the system contain. This model is typically created by business stakeholders and data architects. The purpose is to organize, scope, and define business concepts and rules.
  • Logical: Defines how the system should be implemented regardless of the DBMS. This model is typically created by data architects and business analysts. The purpose is to develop a technical map of rules and data structures.
  • Physical: This data model describes how the system will be implemented using a specific DBMS system. This model is typically created by DBA and developers. The purpose is the actual implementation of the database.

4. What are the important types of relationships in a data model?

Ans: The types of relationship in a data model are stated as follows.

  • Identifying.
  • Non-Identifying Relationship.
  • Self-Recursive relationship.

5. What is the difference between the logical data model and physical data model?


Logical data model:

  • A logical data model can design the requirement of business logically.
  • It is responsible for the actual implementation of data which is stored in the database.
  • It contains an entity, primary key attributes, Inversion keys, alternate key, rule, business relation, definition, etc.

Physical data model:

  • A physical data model provides information about the target database source and its properties.
  • A physical data model helps you to create a new database model from existing and apply the referential integrity constraint.
  • A physical data model contains a table, key constraints, unique key, columns, foreign key, indexes, default values, etc.

6. What are the characteristics of a logical data model?

Ans: Characteristics of the logical data model are:

  • Describes data needs for a single project but could integrate with other logical data models based on the scope of the project.
  • Designed and developed independently from the DBMS.
  • Data attributes will have data types with exact precisions and length.
  • Normalization processes to the model, which are generally applied typically till 3NF.

7. What are the characteristics of the physical data model?

Ans: Characteristics of the physical data model are:

  • The physical data model describes the data needed for a single project or application. It may be integrated with other physical data models based on project scope.
  • The data model contains relationships between tables that address cardinality and nullability of the relationships.
  • Developed for a specific version of a DBMS, location, data storage, or technology to be used in the project.
  • Columns should have exact data types, lengths assigned, and default values.
  • Primary and foreign keys, views, indexes, access profiles, and authorizations, etc. are defined.

8. What are the advantages of using data modeling?

Ans: The advantages of using data modeling in data warehousing are:

  • It helps you to manage business data by normalizing it and defining its attributes.
  • Data modeling integrates the data of various systems to reduce data redundancy.
  • It enables to create efficient database design.
  • Data modeling helps the organization department to function as a team.
  • It facilitates access to data with ease.

9. Explain about the phases in the Data Modeling Development cycle.


  • Gathering Business Requirements: Data Modelers have to interact with business analysts to get the functional requirements and with end-users to find out the reporting needs.
  • Conceptual Data Modeling(CDM): This data model includes all major entities, relationships and it will not contain much detail about attributes and is often used in the INITIAL PLANNING PHASE.
  • Logical Data Modeling(LDM): This is the actual implementation of a conceptual model in a logical data model. An LDM is the version of the model that represents all of the business requirements of an organization.
  • Physical Data Modeling(PDM): This is a complete model that includes all required tables, columns, relationship, database properties for the physical implementation of the database.
  • Database: DBAs instruct the data modeling tool to create SQL code from the physical data model. Then the SQL code is executed on the server to create databases.

10. What is Data Mart?

Ans: A data mart is the most straightforward set of data warehousing and is used to focus on one functional area of any given business. Data marts are a subset of data warehouses oriented to a specific line of business or functional area of an organization such as marketing, finance and sales. Data enters data marts by an assortment of transactional systems, other data warehouses, or even external sources.

11. Explain data warehousing in detail.

Ans: Data warehousing is a process for collecting and managing data from varied sources. It provides meaningful business enterprise insights. Data warehousing is typically used to connect and analyse data from heterogeneous sources. It is the core of the BI system, which is built for data analysis and reporting.

Data Modelling Training Online

  • Master Your Craft
  • Lifetime LMS & Faculty Access
  • 24/7 online expert support
  • Real-world & Project Based Learning


12. What is the difference between view and materialized view?



  • A view is created by combining data from different tables. Hence, a view doesn’t have data of itself.
  • When a view is created, the data is not stored in the database. The data is created when a query is fired on the view. 

Materialized view:

  • Materialized view usually used in data warehousing has data. This data helps in decision making, performing calculations etc. 
  • Data of a materialized view is stored. The data stored by calculating it beforehand using queries.

13. Explain the Two Different Design Schemas.

Ans: The two design schema is called Star schema and Snowflake schema. 

  • The Star schema has a fact table centered with multiple dimension tables surrounding it. 
  • A Snowflake schema is similar, except that the level of normalization is higher, which results in the schema looking like a snowflake.

14. What is the difference between star flake and snow flake schema?


Star Schema: 

In a star schema, you just enter your desired facts and all the primary keys of your dimensional tables in the Fact table. The fact tables primarily are the union of its all dimension table key. In a star schema, dimensional tables are usually not in BCNF form.

Snow Flake: 

It’s almost similar to a star schema but in this, our dimension tables are in 3rd NF, so more dimensions tables. These dimension tables are linked by primary, foreign key relation.

15. What are the critical relationship types found in a Data Model? Describe them.

Ans: The main relationship types are:

  • Identifying: A relationship line normally connects parent and child tables. But if a child table’s reference column is part of the table’s primary key, the tables are connected by a thick line, signifying an identifying relationship.
  • Non-identifying: If a child table’s reference column is NOT a part of the table’s primary key, the tables are connected by a dotted line, signifying a no-identifying relationship.
  • Self-recursive: A recursive relationship is a standalone column in a table connected to the primary key in the same table.

16. What is the difference between a data mart and a data warehouse?


Data mart:

  • It focuses on a single subject area of business.
  • It is used to make tactical decisions for business growth.
  • Data mart follows the bottom-up model.
  • Data source comes from one data source.

Data warehouse:

  • It focuses on multiple areas of business.
  • It helps business owners to take a strategic decision.
  • Data warehouse follows a top-down model.
  • Data source comes from more than one heterogeneous data source.

17. What does a data modeller use normalization for?

Ans: The purposes of normalization are:

  • Remove useless or redundant data.
  • Reduce data complexity.
  • Ensure relationships between the tables in addition to the data residing in the tables.
  • Ensure data dependencies and that the data is stored logically.

18. What is OLTP data modeling?

Ans: OLTP acronym stands for ONLINE TRANSACTIONAL PROCESSING. The approach by which data models are constructed for transactions is called OLTP data modeling. 

Example: all online transactions, bank transactions, trading transactions.

19. Explain various types of fact tables.

Ans: There are three types of fact tables:

  • Additive: It is a measure that is added to any dimension.
  • Non-additive: It is a measure that can't be added to any dimension.
  • Semi-additive: It is a measure that can be added to a few dimensions.

20. What is the purpose of the Factless Fact Table?

Ans: Fact less tables simply contain keys which refer to the dimension tables. Hence, they don’t really have facts or any information but are more commonly used for tracking some information about an event.

Example: To find the number of leaves taken by an employee in a month.

21. What is a level of Granularity of a fact table?

Ans: A fact table is usually designed at a low level of Granularity. This means finding the lowest level of information that can store in a fact table.

Example: Employee performance is a very high level of granularity. Employee_performance_daily, employee_perfomance_weekly can be considered lower levels of granularity.

22. What’s the difference between forward and reverse engineering, in the context of Data Models?


  • Forward engineering is a process where Data Definition Language (DDL) scripts are generated from the data model itself. DDL scripts can be used to create databases. 
  • Reverse Engineering creates data models from a database or scripts. Some data modeling tools have options that connect with the database, allowing the user to engineer a database into a data model.

Subscribe to our youtube channel to get new updates..!


23. What are the two types of data modelling techniques? Describe them.

Ans: Two types of data modelling techniques are: 

Entity-Relationship (E-R) Model:

ER model or entity-relationship model is a methodology of data modeling that normalizes the data by reducing redundancy. 

UML (Unified Modelling Language):

It is a general-purpose, database development, modelling language in the field of software engineering. The main intention is to provide a generalized way to visualize system design.

24. What is dimensional modeling?

Ans: Dimensional model consists of dimension and fact tables. Fact tables store different transactional measurements and the foreign keys from dimension tables that qualify the data. The goal of the Dimensional model is not to achieve a high degree of normalization but to facilitate easy and faster data retrieval.

25. Differentiate between Relational Data modeling and Dimensional Data modeling.


Relational Data modeling:

  • Data is stored in RDBMS.
  • Tables are units of storage.
  • Data is normalized and used for OLTP. Optimized for OLTP Processing.
  • Several tables and chains of relationships among them.
  • Volatile and time-variant.
  • Details level of transactional data.

Dimensional Data modeling:

  • Data is stored in RDBMS or Multidimensional databases.
  • Cubes are units of storage.
  • Data is denormalized and used in data warehouses and data mart. Optimized for OLAP.
  • Few tables and fact tables are connected to dimensional tables.
  • Non volatile and time-invariant.
  • Summary of bulky transactional data.

26. Differentiate between ER Modeling and Dimensional Modeling.


  • The entity-relationship model is a method used to represent the logical flow of entities/objects graphically that in turn create a database. It has both a logical and physical model. This model is good for reporting and point queries.
  • A Dimensional model is a method in which the data is stored in two types of tables namely facts table and dimension table. It has only a physical model. It is good for ad hoc query analysis

27. What is the importance of using keys?


  • Keys help you to identify any row of data in a table. In a real-world application, a table could contain thousands of records.
  • Keys ensure that you can uniquely identify a table record despite these challenges.
  • Allows you to establish a relationship between and identify the relation between tables.
  • Help you to enforce identity and integrity in the relationship.

28. What is a surrogate key? Explain it with an example.

Ans: Data warehouses commonly use a surrogate key to uniquely identify an entity. A surrogate is not generated by the user but by the system. A primary difference between a primary key and surrogate key in a few databases is that the primary key uniquely identifies a record while a surrogate key uniquely identifies an entity.

Example: An employee may be recruited before the year 2010 while another employee with the same name may be recruited after the year 2010. Here, the primary key will uniquely identify the record while the surrogate key will be generated by the system (say a serial number) since the surrogate key is not derived from the data.

Data Modelling Training Online

Weekday / Weekend Batches


29. What are the examples of the OLTP system?

Ans: Example of OLTP system are:

  • Sending a text message.
  • Add a book to the shopping cart.
  • Online airline ticket booking.
  • Online banking.
  • Order entry.

30. What is an enterprise data model?

Ans: Enterprise data model comprises all entities required by an enterprise. The development of a common consistent view and understanding of data elements and their relationships across the enterprise is referred to as Enterprise Data Modeling. For better understanding purpose, these data models are split up into subject areas.

Find our upcoming Data Modelling Training Online Online Classes

  • Batch starts on 27th May 2022, Fast Track batch

  • Batch starts on 31st May 2022, Weekday batch

  • Batch starts on 4th Jun 2022, Weekend batch

Global Promotional Image


Request for more information

Research Analyst
As a Senior Writer for HKR Trainings, Sai Manikanth has a great understanding of today’s data-driven environment, which includes key aspects such as Business Intelligence and data management. He manages the task of creating great content in the areas of Digital Marketing, Content Management, Project Management & Methodologies, Product Lifecycle Management Tools. Connect with him on LinkedIn and Twitter.