Having all data in one place improves business processes and enables enterprises to make informed decisions. IBM Netezza is a popular data warehousing system. Whether you are a fresher or an experienced candidate, there are many Netezza opportunities at major companies. In this post, we go through Netezza interview questions curated by working professionals. Review the frequently asked Netezza architecture interview questions and answers below to help you prepare for the interview process. Let's get started.
Ans: Netezza is a company that provides data warehouse appliances for data warehousing, predictive analytics, and business intelligence. It was founded in 1999 by Foster Hinshaw, and IBM acquired it in 2010. IBM Netezza is used to store historical data and enables faster processing of that data.
Ans: Here are some of the advantages of Netezza,
Ans: Four environment variables are required to connect to Netezza.
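As a quick illustration, here is a minimal Python sketch that checks whether the connection variables are set before a CLI tool is called. It assumes the four variables are the commonly used NZ_HOST, NZ_DATABASE, NZ_USER, and NZ_PASSWORD. The helper name `missing_netezza_vars` is hypothetical.

```python
import os

# Assumption: these are the four variables the Netezza CLI tools read.
REQUIRED_VARS = ("NZ_HOST", "NZ_DATABASE", "NZ_USER", "NZ_PASSWORD")


def missing_netezza_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_netezza_vars()` with no argument inspects the current process environment; an empty list means all four variables are present.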
Ans: There are five states to the Netezza server.
Ans: Netezza stores system catalog information on the Netezza host. Each time a table is created, its definition is stored on the host. The actual rows of the table are stored on the Netezza disks.
Ans: Clustering is the process of grouping similar items. Netezza consists of two SMP hosts that form a cluster: one active host and one passive host.
Ans: In Netezza, we can't change the extent and page sizes; they are fixed. The extent size is 3 MB and the page size is 128 KB, so each extent holds 24 pages.
Ans: The following is the list of enforced and non-enforced constraints in Netezza.
Ans: Netezza uses a Field Programmable Gate Array (FPGA) to filter out unwanted data as it is read from disk. This removes I/O bottlenecks and frees up CPU, memory, and network resources.
Ans: A snippet is a small unit of database operations that needs to be carried out in Netezza. When we run a snippet, it shows all the details of its execution. A snippet can run on an SPU, on the host, or on both.
Ans: Netezza maintains zone maps automatically on all tables. Zone maps offer functionality similar to partitions. When a query needs particular data, Netezza can rely on the zone map to read only the range of data that is required.
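The idea behind zone maps can be sketched in a few lines of Python. This is a toy model, not Netezza's implementation: the `Extent` class and `scan_with_zone_map` function are hypothetical names, and each extent simply remembers the minimum and maximum value it holds so that a range scan can skip extents that cannot match.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    rows: list  # values of one column stored in this extent

    @property
    def zone(self):
        # The "zone map" entry: smallest and largest value in the extent.
        return (min(self.rows), max(self.rows))


def scan_with_zone_map(extents, low, high):
    """Return matching values and the number of extents actually read."""
    hits, extents_read = [], 0
    for ext in extents:
        zmin, zmax = ext.zone
        if zmax < low or zmin > high:
            continue  # the range cannot overlap this extent, so skip it
        extents_read += 1
        hits.extend(v for v in ext.rows if low <= v <= high)
    return hits, extents_read


extents = [Extent([1, 5, 9]), Extent([10, 14, 19]), Extent([20, 25, 29])]
hits, extents_read = scan_with_zone_map(extents, 12, 21)
# hits == [14, 19, 20]; only 2 of the 3 extents were read
```

The benefit grows when the data is stored roughly in order of the filtered column, because fewer extents then overlap any given range.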
Ans: Netezza gathers statistics about the tables in the database through the GENERATE STATISTICS command. The statistics include null values, duplicate values, maximum values, minimum values, etc. There are two ways to generate statistics.
Ans: Materialized views reduce the width of the data (the number of columns). A materialized view is a thin version of the base table that contains only the frequently queried columns. It has the same distribution key as the base table.
Ans: Here are some of the best practices to follow when working with materialized views.
Ans: We have some restrictions on materialized views.
Ans: Tables in Netezza are split into groups of rows and stored across the data slices of the SPUs. There are two distribution methods available: hash distribution on one or more columns, and random (round-robin) distribution.
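To see how the choice of method affects the spread of rows, here is a small Python simulation that contrasts hash distribution on a column with round-robin distribution. It is only a sketch: `zlib.crc32` stands in for Netezza's internal hash function, which is not described here, and all function names are hypothetical.

```python
import zlib
from collections import Counter
from itertools import cycle


def hash_slice(key, num_slices):
    """Map a distribution-key value to a data slice (crc32 is a stand-in hash)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_slices


def distribute_hash(rows, key_index, num_slices):
    """Count rows per slice when distributing on the column at key_index."""
    return Counter(hash_slice(row[key_index], num_slices) for row in rows)


def distribute_round_robin(rows, num_slices):
    """Count rows per slice when rows are dealt out in turn."""
    slices = cycle(range(num_slices))
    return Counter(next(slices) for _ in rows)


rows = [(order_id, "US") for order_id in range(8)]
even = distribute_round_robin(rows, 4)   # 2 rows on each of the 4 slices
skewed = distribute_hash(rows, 1, 4)     # every row has key "US": one slice gets all 8
```

The skewed case shows why a low-cardinality column makes a poor distribution key: all rows with the same key value land on the same data slice.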
Ans: No, Netezza does not cache everything. It caches only the table schema and other database objects. The SPUs hold the actual data of the tables.
Ans: Netezza supports the following join types,
Ans: Netezza supports 3 data loading formats for loading the data from external sources.
Ans: If the same record is updated concurrently, Netezza rolls back the more recent of the conflicting transactions on that record. Netezza locks the table and uses serializable transaction isolation to ensure that no dirty reads occur.
Ans: The cumulative sum or running sum can be calculated for queries by using the Netezza analytic functions.
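A running sum is typically written with the standard `SUM(...) OVER (...)` window syntax. The sketch below runs that query shape through Python's built-in `sqlite3` module purely as a stand-in engine, since no Netezza system is assumed to be available. The `sales` table and its columns are hypothetical, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 10), (2, 20), (3, 5)])

# Each row's running_total adds up amount from the first row through the current row.
running = conn.execute(
    """
    SELECT day,
           amount,
           SUM(amount) OVER (
               ORDER BY day
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total
    FROM sales
    ORDER BY day
    """
).fetchall()
# running == [(1, 10, 10), (2, 20, 30), (3, 5, 35)]
```

Adding a `PARTITION BY` clause inside `OVER` would restart the running total for each group, for example per customer or per region.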
Ans: Netezza does not enforce primary keys, so duplicate rows can be inserted into Netezza tables.
Ans: There are multiple ways to identify and delete duplicate rows from a table.
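One widely used approach is to group on all columns to find duplicates, then keep only the row with the smallest `rowid` in each group. The sketch below demonstrates the query shape with Python's built-in `sqlite3` module as a stand-in engine; SQLite also exposes a `rowid`, but its behavior on a real Netezza system should be verified before relying on this. The `customers` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ann"), (1, "Ann"), (2, "Bob"), (2, "Bob"), (3, "Cy")],
)

# Step 1: identify duplicates by grouping on every column.
duplicates = conn.execute(
    "SELECT id, name, COUNT(*) FROM customers "
    "GROUP BY id, name HAVING COUNT(*) > 1 ORDER BY id"
).fetchall()

# Step 2: delete every row except the one with the smallest rowid per group.
conn.execute(
    "DELETE FROM customers WHERE rowid NOT IN "
    "(SELECT MIN(rowid) FROM customers GROUP BY id, name)"
)
remaining = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
# duplicates == [(1, 'Ann', 2), (2, 'Bob', 2)]
# remaining  == [(1, 'Ann'), (2, 'Bob'), (3, 'Cy')]
```

For very large tables, rebuilding the table from a `SELECT DISTINCT` may be a simpler alternative to deleting rows in place.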
Ans: The best way to redistribute a table is to create a new table with the desired distribution using CTAS (CREATE TABLE AS), which loads the data at the same time. Alternatively, we can change the distribution property in the /nz/data/postgresql.conf configuration file and restart the database.
Ans: If two tables share the same distribution key, they are called collocated tables. When two collocated tables are joined on that key, the join is called a collocated join.
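The benefit of collocation can be illustrated with a toy Python model that counts how many rows would have to move between data slices before a join. It assumes a hypothetical `customers` table distributed on `customer_id`, uses `zlib.crc32` as a stand-in for the real hash function, and all names are illustrative.

```python
import zlib

NUM_SLICES = 4


def slice_for(value):
    """Data slice for a distribution-key value (crc32 is a stand-in hash)."""
    return zlib.crc32(str(value).encode("utf-8")) % NUM_SLICES


def rows_needing_redistribution(orders, order_dist_col):
    """Count order rows stored on a different slice than their matching
    customer row, given customers distributed on customer_id."""
    moved = 0
    for order in orders:
        order_slice = slice_for(order[order_dist_col])
        customer_slice = slice_for(order["customer_id"])
        if order_slice != customer_slice:
            moved += 1
    return moved


orders = [{"order_id": i, "customer_id": i % 3} for i in range(100, 112)]

collocated = rows_needing_redistribution(orders, "customer_id")  # always 0
not_collocated = rows_needing_redistribution(orders, "order_id")  # usually > 0
```

When both tables are distributed on the join key, matching rows already sit on the same slice, so each SPU can join its own data without moving rows across the network.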
Ans: nzload is a command-line utility that loads bulk data from a file into a table. It can load data from the local host or from a remote client.
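For scripting, the nzload call can be assembled as an argument list in Python. The sketch below only builds the command and does not run it. The flags shown (`-host`, `-db`, `-u`, `-pw`, `-t`, `-df`, `-delim`) are the commonly documented ones, but they should be checked against the installed nzload version. The function name and sample values are hypothetical.

```python
def build_nzload_command(host, database, user, password, table, data_file, delimiter="|"):
    """Assemble an nzload argument list suitable for subprocess.run().

    Assumption: flag names follow the commonly documented nzload options.
    """
    return [
        "nzload",
        "-host", host,
        "-db", database,
        "-u", user,
        "-pw", password,
        "-t", table,
        "-df", data_file,
        "-delim", delimiter,
    ]


cmd = build_nzload_command("nzhost", "sales_db", "loader", "secret", "orders", "/tmp/orders.dat")
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting problems with delimiters such as `|`.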
Ans: There are four ways to upload data into a Netezza table,
Ans: Netezza allows only one transaction at a time to operate on a table. Transactions are tracked through two slots, create xid and delete xid, and each row or record contains both slots.
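The role of the two slots can be sketched with a simplified Python model in which a delete only stamps the row with the deleting transaction's ID and an update is a delete followed by an insert. This is an illustration under stated assumptions, not Netezza's storage engine: the class names are hypothetical, a delete xid of 0 is taken to mean "not deleted", and transaction isolation is ignored.

```python
from dataclasses import dataclass


@dataclass
class RowVersion:
    data: dict
    createxid: int       # transaction that inserted this version
    deletexid: int = 0   # assumption: 0 means the version is not deleted


class Table:
    def __init__(self):
        self.versions = []

    def insert(self, xid, data):
        self.versions.append(RowVersion(dict(data), createxid=xid))

    def delete(self, xid, predicate):
        # Logical delete: stamp the row instead of removing it.
        for version in self.versions:
            if version.deletexid == 0 and predicate(version.data):
                version.deletexid = xid

    def update(self, xid, predicate, changes):
        # Update = mark the old version deleted, then insert a new version.
        targets = [v for v in self.versions if v.deletexid == 0 and predicate(v.data)]
        for version in targets:
            version.deletexid = xid
            self.versions.append(RowVersion({**version.data, **changes}, createxid=xid))

    def visible_rows(self):
        return [v.data for v in self.versions if v.deletexid == 0]


table = Table()
table.insert(10, {"id": 1, "qty": 5})
table.update(11, lambda row: row["id"] == 1, {"qty": 7})
# Two versions now exist; only the one created by xid 11 is visible.
```

Because old versions stay on disk until they are cleaned up, tables with frequent updates and deletes grow until the deleted rows are reclaimed.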
Ans: No, we cannot update distribution columns in Netezza tables. Tables are distributed across the nodes by the distribution column. An attempt to update a distribution column throws an error and locks the entire table.
Ans: The data loaded into Netezza is stored on the Snippet Processing Units (SPUs). Each SPU has a dedicated hard drive.