Netezza Interview Questions
Last updated on Jan 08, 2024
Do you have a technical skill? And are you having the capability to become a Netezza developer? Then you are to the right place to learn and gain professional experience, HKR has come up with a list of Netezza interview questions and Answers article which is designed by the SME expert team. As per the Gartner report, there is a huge buzz created for Database administrators so you can expect a huge salary offer. Why Netezza is so popular? The answer is that here user can fast access the consumer data, and also offers incredible analytical solutions. So what you are waiting for, let’s begin learning Netezza concepts:
Most Frequently Asked Netezza Interview Questions
What is Netezza?
Ans: Netezza is a SQL based platform developed by one of the top companies in the world “IBM” Pvt. Ltd. IBM Netezza is a subsidiary of American technology. This technology offers high-performance data ware appliances and advanced analytical applications. The important functions included are data appliance warehousing, business intelligence, analytic predictions, and business planning. One important thing is that Netezza supports only one data type that is SQL at a time.
What are the advantages of Netezza?
Ans: Below are the important advantages of using the Netezza application such as;
- Netezza is much easier with hardware and software optimized for data warehousing applications.
- No need for labor intensive tuning
- No requirements needed for partitioning, indexing, and building cubes.
- A Netezza twin fin 12 requires 5 Ip addresses and 4 network droops. The theme of simplicity is reflected in installation as in operations.
List the environment variables required to connect to Netezza
Ans: The environments are as follows;
- NZ_HOST
- NZ_DATABASE
- NZ_USER
- NZ_PASSWORD.
What are the states involved in Netezza?
Ans: The different states of Netezza are as follows;
- Online state: this is a type of normal or usual state
- Stopped: this Netezza state will terminate once you finish current queries and new queries will not be allowed.
- Offline: this state will wait for the completion of the current query operation and new queries will have an error.
- Paused: here this works the same as the offline state but doesn’t display any error. This is will happen during the time of boot up or start up the Netezza application.
- Downstate: this will halt the application due to server problems.
What are the different data warehouse appliances you know?
Ans: The following are various types of data warehouse appliance available they are;
- Netezza
- Greenplum
- Vertica
- Teradata
- Dataupia
- Oracle Exadata.
Get ahead in your career by learning Netezza course through hkrtrainings netezza certification Training
How FPGA can be helpful in improving query performance?
Ans: When the application starts reading the data from the desk, the FGPA (Field programmable GateWay) will filters unwanted data sets. This FRGA process also helps in the data elimination process, IO bottlenecks, and frees up the components such as memory, CPU, and network management.
How the zone maps are created and updated?
Ans: In Netezza, the zone maps are generated and refreshed in SPU with the help of the following mechanisms such as;
- Generate statistics
- Insert extents
- Nzload operations
- Update the operations
- Nzreclaim operations.
What are the constraints in Netezza?
Ans: The following are the list of enforced and non enforced constraints in Netezza.
- Primary Key (Not enforced)
- Unique key (Not enforced)
- Referential Integrity (Not enforced)
- Not Null (Enforced)
- Data type verification (Enforced)
How can FPGA help improve query performance?
Ans: Netezza uses Field Programmable Gate Array (FPGA) to filter out unwanted data when reading the data from the disk. This removes IO bottlenecks and frees up CPU, memory, and network components.
) Define snippet
Ans: A snippet is nothing but a small unit of work that will be carried out by SPU.
Netezza Training
- Master Your Craft
- Lifetime LMS & Faculty Access
- 24/7 online expert support
- Real-world & Project Based Learning
) Explain about zone maps
Ans: Zone maps are the smallest unit disk allocation located on the SPU. Zone maps are considered as internal mapping structures that will extend the internal ordering data and this eliminates the data extents, no need for any scanning. The purpose of using Zone mapping is to offer transparency and avoids unreferenced rows scanning. In Netezza, the zone maps are generated for each data column this contains minimum/maximum values for the extents.
) What permission will you give to connect to a database?
Ans: Netezza gathers stats about the tables in the database through 'generate statics'. The stats include null values, duplicate values, maximum values, minimum values, etc. There are two ways to generate statistics.
- Generate statistics on a table
- Generate express statistics on a table
) Do we need to drop all tables and objects in that database, before dropping a database?
Ans: Materialized views reduce the width of data (number of columns). It creates a thin version of the base table, such that it only contains the frequently queried columns. It will have the same distribution key as the base table.
) What are the best practices for creating materialized views?
Ans: Here are some of the best practices to follow when working with material views.
- Only create a material view on the frequently queried columns.
- Refresh the material views often.
- Stop the material view before performing a large operation on the base table. Once the operation is complete, refresh the material view so the updates will come into effect
) What are the limitations of materialized views?
Ans: Below are the few limits of materialized views;
- In a materialized view, you cannot add, delete, update, or remove any data. If you make any changes to the base tables, this will affect the materialized view.
- Here you can define only one data base table that to from the clause.
- The base table in the materialized view can’t be specified as an external table or any temporary table.
- One more drawback is that, here where clause can’t be used in the materialized view.
- In a materialized view, expressions are not used as columns.
) Explain about partition methods available in Netezza
Ans: The two partitioning methods are used;
- Random partitioning method: with this method, data will be distributed randomly.
- Hash portioning method: with this method, data will be distributed on the columns.
If you want to Explore more about Netezza? then read our updated article - Netezza Tutorial
) Will everything get cached in Netezza?
Ans: No, Netezza does not cache everything. It will only cache the table schema and other database objects. The SPUs hold the actual data of the tables.
) What are the join types in Netezza?
Ans: Netezza supports the following join types,
- Hash join
- Merge Sort join
- Expression join
- Exist join
- Cross Product Join
) What are the data loading formats supported by Netezza?
Ans: Netezza supports 3 data loading formats for loading the data from external sources.
- Text-delimited - every row or record should end with a delimiter.
- Fixed-length - allows an expressive form of the external table definition.
- Compressed binary - compresses the data before a backup or restore.
) Does Netezza support concurrent updates of the same record?
Ans: If the same record is updated concurrently, Netezza will roll back the affected transaction that was performed recently on the same record. Netezza locks the table by using serializable transaction isolation to ensure no dirty reads are being performed.
Subscribe to our YouTube channel to get new updates..!
) How to calculate the cumulative sum in Netezza?
Ans: The cumulative sum or running sum can be calculated for queries by using the Netezza analytic functions.
) Can we insert duplicate rows in the Netezza table?
Ans: Netezza does not have a primary key enforced. So, we can insert duplicate rows into the Netezza tables.
) How can we identify and remove duplicate records in the Netezza table?
Ans: We have multiple ways to identify and delete duplicate rows from a table,
- Create a temporary table and insert the distinct rows of the base table in it. Truncate the base table and insert the rows from the temporary table.
- Delete the rows using row id.
- Use the row_number analytic function to rank the rows for each group and then select only one row from the group.
) How can we redistribute a table?
Ans: The best way to redistribute a table is to create a new table using CTAS (Create table AS) and load the data at the same time. Alternatively, we can also change the distribution property in the /nz/data/postgresql.conf configuration file and restart the database.
) Explain about collocated join
Ans: If two tables share the same distribution key, then those tables are called collocated tables. When two collocated tables are joined together, then it is called a collocated join.
) Explain about the nzload
Ans: The nzload is a utility or command that is used to load bulk data from a file into a table. The nzload can upload data from the localhost or a remote client.
) What are the different ways to load data?
Ans: There are four ways to upload data into a Netezza table,
- The nzload command
- External tables
- Create table AS
- Inserts
- The nz_migrate utility
) How are transactions maintained in Netezza?
Ans: Netezza allows only one transaction to perform on a table at a time. The transactions are maintained through two slots - create xid and delete xid. Each row or record will contain these two slots.
) Can we update a distribution column?
Ans: No, we cannot update distribution columns in the Netezza tables. The tables are distributed across several nodes using the distribution column. If you try to update the distribution columns, it will throw an error and lock the entire table.
) How does the data get stored in Netezza?
Ans: The data loaded to Netezza will be stored in the Snippet Processing Unit (SPU). Each SPU has a dedicated hard drive.
) Where is the system catalog information stored?
Ans: Netezza stores system catalog information in the Netezza hosts. Each time a table gets created in Netezza, table definition gets stored in hosts. The actual rows or records of the table gets stored in the Netezza disks.
) Explain about clustering in Netezza
Ans: Clustering is the process of grouping similar items. Netezza consists of 2 SMP hosts that are part of a cluster - one for active host and the other for the passive host.
) What are the extent and page sizes in Netezza?
Ans: In Netezza, we can't change the extent and page sizes, they are fixed by default. The extent size is 3 MB, and the page size is 128 KB.
) What are the constraints in Netezza?
Ans: The following are the list of enforced and non enforced constraints in Netezza.
- Primary Key (Not enforced)
- Unique key (Not enforced)
- Referential Integrity (Not enforced)
- Not Null (Enforced)
- Data type verification (Enforced)
) How can FPGA help improve query performance?
Ans: Netezza uses Field Programmable Gate Array (FPGA) to filter out unwanted data when reading the data from the disk. This removes IO bottlenecks and frees up CPU, memory, and network components.
) How can we get stats on tables in Netezza?
Ans: Netezza gathers stats about the tables in the database through 'generate statics'. The stats include null values, duplicate values, maximum values, minimum values, etc. There are two ways to generate statistics.
- Generate statistics on a table
- Generate express statistics on a table
) Explain about materialized views
Ans: Materialized views reduce the width of data (number of columns). It creates a thin version of the base table, such that it only contains the frequently queried columns. It will have the same distribution key as the base table.
. How to update a row within a table in Netezza?
Ans. In Netezza, each record consists of two slots- Deletexid & Createxid. Generally, Netezza removes the current row by marking the withdrawn column with the existing transaction ID. Further, it updates a row with the new values.
. Define the ways for loading data into a file from a table.
Ans. There are two ways to load data into a file from a table:-
- By creating an external table.
- Using NoSQL with the option -o.
. Which data type is best for zone maps?
Ans. For integer data types, zone maps work the best.
. What are the limitations imposed on a table?
Ans. The following are the limitations- Not Null and Default. But Netezza doesn't apply to Primary Key and Foreign key.
. Does CTAS (CREATE TABLE AS) allocate data randomly or assign it based on the table on which it receives data?
Ans. A newly built table from CTAS allocates data from the original table.
. Why is it best to use Truncate instead of the drop table command?
Ans. The reason is that it only removes the table data and keeps the structure and permissions of the table flawless.
. Define how data loading is possible into a table from a file.
Ans. For this, we have to use the following:-
- The command' nzload" is useful here.
- Also, build an external table to load the original table using it.
. What is the way to eliminate
Ans. When we remove a row from a table, it means it is not physically deleted but logically removed by marking the deleted field within the table. Further, to remove the "Logically Deleted Records'' in Netezza, we utilize NZRECLAIM.
. Define the data types best suited for the columns mentioned in the distribution clause?
Ans. "Integer" is the best data type that suits the columns mentioned within the distribution clause.
. How many columns can you state within the distribution clause in Netezza?
Ans. In the distribution clause, it allows specifying up to four columns.
. What type of information are the statistics maintained in the table?
Ans. It maintains the Maximum & Minimum values of the column.
. Name the helpful command to monitor the reclaim status in Netezza.
Ans. The command you must use here is- NZ Stats- type reclaim.
. In what conditions will we get invalid results?
Ans. A driver may rarely send back invalid output that is getting processed and returned to the client. The client may think the computation is over and doesn't require updating the final production. Hence, the driver must wait until Netezza completes its operation on the host computer before shipping the same.
. Is it required to drop all the objects & tables within the database before removing them?
Ans. It would help if you simply dropped the database; it will handle the remaining.
. Why is it better to use the Not Null specification within Netezza?
Ans. It is better to mention Not Null as it provides better performance. Also, we can track Null values easily at the row header. If all the columns in a table are Not Null, no record header will be there.
. Which distribution will Netezza use when no distribution clause is used in creating a table?
Ans. Here, Netezza will use "the first column" as distribution.
. What will happen to the records loaded in the nzload process but were not executed?
Ans. These are logically removed records, and the admin may run nzreclaim. Further, we can also truncate the table.
. How to check the table rows whether distributed in SPUs equally or not?
Ans. To check the distribution of rows, we have to run the below command, "SELECT" the sliced data, COUNT (*) FROM GROUP BY data cut.
. Which log file in Netezza records the SQL submitted to the NPS system?
Ans. The system writes the log information to the following log file.:-
/nz/kit/log/postgres/pg.log
. What are the features that define Materialized Views?
Ans. The following features better define the materialized view in Netezza.:-
- They can include the clause "ORDER BY."
- Moreover, these views can only reference a single base table within the “FROM” clause.
Upcoming Netezza Training Online classes
Batch starts on 14th Sep 2024 |
|
||
Batch starts on 18th Sep 2024 |
|
||
Batch starts on 22nd Sep 2024 |
|