A snowflake schema is a logical arrangement of tables in a multidimensional database whose entity-relationship diagram resembles a snowflake. It consists of centralized fact tables connected to multiple dimensions. "Snowflaking" is the process of normalizing the dimension tables of a star schema. In this blog, we cover an overview of the snowflake schema, its characteristics, and its advantages and disadvantages.
Snowflake's Data Cloud is built on a data platform that is offered as Software-as-a-Service (SaaS). Snowflake provides data storage, processing, and analytic solutions that are faster, easier to use, and more flexible than traditional systems. Snowflake is not built on any existing database technology or "big data" software platform such as Hadoop. Instead, it combines a completely new SQL query engine with an architecture designed natively for the cloud. To the user, Snowflake offers all of the features and capabilities of an enterprise analytic database.
Snowflake runs entirely in the cloud. All of its components (except for optional command-line clients, drivers, and connectors) run on public cloud infrastructure. Snowflake's compute needs are met by virtual compute instances, and data is stored persistently in a storage service. Snowflake cannot be run on private cloud infrastructure (hosted or on-premises). It is not a software package that users install; Snowflake itself manages all software installation and updates.
The architecture of Snowflake is a hybrid of shared-nothing and shared-disk databases. Snowflake uses a central data repository for persisting data that is accessible from all compute nodes in the platform, similar to shared-disk systems. Snowflake, however, performs queries utilizing MPP (massively parallel processing) compute clusters, in which each node in the cluster maintains a piece of the full data set locally, akin to shared-nothing systems. This method combines the ease of data management of a shared-disk design with the performance and scale-out advantages of a shared-nothing architecture.
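To make the hybrid idea more concrete, here is a small toy sketch in Python. It is only an illustration of the concept, not how Snowflake is actually implemented: a single list stands in for the central data repository, and each worker thread stands in for a compute node that keeps its own slice of the data and processes it independently. All names (CENTRAL_STORAGE, node_partition, parallel_sum) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy stand-in for the central repository that every compute node can read.
CENTRAL_STORAGE = list(range(1, 101))


def node_partition(node_id, node_count):
    """Return the slice of central storage that one node keeps locally."""
    return CENTRAL_STORAGE[node_id::node_count]


def node_partial_sum(node_id, node_count):
    """Each node aggregates only its own local slice."""
    local_data = node_partition(node_id, node_count)
    return sum(local_data)


def parallel_sum(node_count=4):
    """Run the per-node work in parallel, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = pool.map(
            lambda node_id: node_partial_sum(node_id, node_count),
            range(node_count),
        )
        return sum(partials)
```

Changing node_count changes how the data is split across nodes, but the combined result stays the same because every row still lives in the one shared repository.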
Model of a Snowflake Schema in a Data Warehouse
The Employee dimension table contains the attributes EmployeeID, EmployeeName, DepartmentID, Region, and Territory. The Employee table is connected to the Department dimension table by the DepartmentID attribute. The Department dimension holds specific information about each department, such as the department's name and location. The Customer dimension table contains the attributes CustomerID, CustomerName, Address, and CityID. The Customer dimension table and the City dimension table are connected by the CityID attribute. The City dimension table holds each city's details, including CityName, Zip Code, State, and Country.
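The model described above can be sketched with Python's built-in sqlite3 module. The dimension table and column names come from the description; the column types, the Sales fact table (the example does not name a fact table), and all sample rows are assumptions added only for illustration.

```python
import sqlite3

# Column names follow the description above; the types, the Sales fact
# table, and all sample values are assumptions made for illustration.
SCHEMA = """
CREATE TABLE Department (
    DepartmentID   INTEGER PRIMARY KEY,
    DepartmentName TEXT,
    Location       TEXT
);
CREATE TABLE Employee (
    EmployeeID   INTEGER PRIMARY KEY,
    EmployeeName TEXT,
    DepartmentID INTEGER REFERENCES Department(DepartmentID),
    Region       TEXT,
    Territory    TEXT
);
CREATE TABLE City (
    CityID   INTEGER PRIMARY KEY,
    CityName TEXT,
    ZipCode  TEXT,
    State    TEXT,
    Country  TEXT
);
CREATE TABLE Customer (
    CustomerID   INTEGER PRIMARY KEY,
    CustomerName TEXT,
    Address      TEXT,
    CityID       INTEGER REFERENCES City(CityID)
);
-- Hypothetical fact table; the text does not name one.
CREATE TABLE Sales (
    SaleID     INTEGER PRIMARY KEY,
    EmployeeID INTEGER REFERENCES Employee(EmployeeID),
    CustomerID INTEGER REFERENCES Customer(CustomerID),
    Amount     REAL
);
"""


def build_example_db():
    """Create an in-memory database with one sample row per table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Department VALUES (1, 'Sales', 'Building A')")
    conn.execute("INSERT INTO Employee VALUES (10, 'Asha', 1, 'West', 'T1')")
    conn.execute("INSERT INTO City VALUES (100, 'Austin', '73301', 'Texas', 'USA')")
    conn.execute("INSERT INTO Customer VALUES (1000, 'Acme', '1 Main St', 100)")
    conn.execute("INSERT INTO Sales VALUES (1, 10, 1000, 250.0)")
    conn.commit()
    return conn


def sales_by_department_and_city(conn):
    """Walk from the fact table out through both dimension chains."""
    query = """
        SELECT d.DepartmentName, c.CityName, SUM(s.Amount)
        FROM Sales s
        JOIN Employee e   ON e.EmployeeID = s.EmployeeID
        JOIN Department d ON d.DepartmentID = e.DepartmentID
        JOIN Customer cu  ON cu.CustomerID = s.CustomerID
        JOIN City c       ON c.CityID = cu.CityID
        GROUP BY d.DepartmentName, c.CityName
    """
    return conn.execute(query).fetchall()
```

Note how the query has to pass through Employee to reach Department and through Customer to reach City. Those chained dimension tables are what give the diagram its snowflake shape.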
The main distinction between star and snowflake schemas is that the snowflake schema keeps its dimension tables in normalized form to minimize redundancy. The benefit is that normalized tables are simpler to maintain and take up less storage. However, queries then need more joins to reach the same attributes, which can have an adverse effect on query performance.
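The trade-off can be seen in a small side-by-side sketch, again using sqlite3. The CustomerStar table is a hypothetical denormalized (star-style) version of the Customer dimension, the Sales fact table is an assumption, and all sample rows are made up. The count_joins helper is a deliberately naive keyword count, good enough only for these two queries.

```python
import sqlite3


def build_db():
    """Create the same customer data in snowflake form and in star form."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Snowflake form: city details live in their own table.
        CREATE TABLE City (CityID INTEGER PRIMARY KEY, CityName TEXT, Country TEXT);
        CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, CityID INTEGER);
        -- Star form (hypothetical): city details repeated on every customer row.
        CREATE TABLE CustomerStar (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT,
                                   CityName TEXT, Country TEXT);
        -- Hypothetical fact table.
        CREATE TABLE Sales (CustomerID INTEGER, Amount REAL);
    """)
    conn.execute("INSERT INTO City VALUES (1, 'Austin', 'USA')")
    conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)",
                     [(1, "Acme", 1), (2, "Globex", 1)])
    conn.executemany("INSERT INTO CustomerStar VALUES (?, ?, ?, ?)",
                     [(1, "Acme", "Austin", "USA"), (2, "Globex", "Austin", "USA")])
    conn.executemany("INSERT INTO Sales VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
    conn.commit()
    return conn


# Star schema: one join from the fact table to the wide dimension.
STAR_QUERY = """
SELECT cs.CityName, SUM(s.Amount)
FROM Sales s
JOIN CustomerStar cs ON cs.CustomerID = s.CustomerID
GROUP BY cs.CityName
"""

# Snowflake schema: an extra join is needed to reach the city name.
SNOWFLAKE_QUERY = """
SELECT c.CityName, SUM(s.Amount)
FROM Sales s
JOIN Customer cu ON cu.CustomerID = s.CustomerID
JOIN City c ON c.CityID = cu.CityID
GROUP BY c.CityName
"""


def count_joins(sql):
    """Naive count of JOIN keywords in a query string."""
    return sql.upper().split().count("JOIN")
```

Both queries return the same totals, but the snowflake version stores the city name once instead of once per customer and pays for that with one more join.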
The two key advantages of the snowflake schema are therefore that its normalized dimension tables are easier to maintain and that they take up less storage space.
In this blog, we have covered an overview of the snowflake schema and of the Snowflake platform, including Snowflake as a cloud service, the architecture of Snowflake, and ways of connecting to Snowflake. We have also discussed an example of a snowflake schema along with its characteristics, benefits, and drawbacks. We hope this blog has given you enough knowledge to understand the snowflake schema and its related concepts.