Snowflake Query
Last updated on Jul 17, 2024
Snowflake Query - Table of Content
- What is SQL
- What is Snowflake SQL – SnowSQL
- Prerequisites Of Snowflake
- Creating Snowflake Objects
- Diagnosing Queries in Snowflake
- Conclusion
What is SQL?
A standardised programming language called Structured Query Language (SQL) is often employed to administer relational databases and carry out diverse actions on the data they contain. Originally developed in the 1970s, SQL is frequently used among database administrators as well as programmers creating scripts for data integration and data analysts setting up and running analytical queries.
The following are some of the uses of SQL:
- Altering the index and table structures of a database;
- Adding, changing, and removing data rows; and
- Retrieving specific sets of data from relational database management systems (RDBMSes); this data can be utilised for analytics applications, transaction processing, and other programs that call for interaction with a relational database.
Users can add, change, or extract content from database tables by using SQL queries as well as other actions, which take the shape of statements expressed as commands.
The most fundamental component of a database is a table, which is made up of columns and rows of data. Records are kept in a single table, and every record is kept in a table row. The most common kind of relational database items, or structures that store or relate to data, are tables. These are some other database object types:
- Data gathered from 1 or perhaps more database tables is represented logically in Views.
- Lookup tables with indexes can expedite database lookup operations.
- Data from 1 or more tables, typically a sample of that data that is chosen relying on search criteria, are used to create reports.
Because they are made up of tables that link to one another, relational databases are just that—relational. For instance, a SQL database in use for customer service may contain tables for specific purchases, product codes, and client contacts in addition to a single table for client names and locations. In order to relate to a customer's record in a different table meant to hold client data, including name and contact information, a database used to manage customer contacts typically utilises a special customer identifier termed a key or primary key.
What is Snowflake SQL – SnowSQL?
You may connect to Snowflake and run SQL commands using SnowSQL, the Snowflake SQL cmd line client. Additionally, it enables you to carry out all DDL and DML tasks, such as uploading and discharging data from database tables.
The SnowSQL program can be used to run SnowSQL in batch mode or as an interactive shell. Either stdin or the -f option can be used for this.
The Snowflake Connector for Python was utilised in the creation of SnowSQL. However, installing SnowSQL is not necessary for this connector. The installers come pre-bundled with all the software needed to install SnowSQL. All widely used operating systems, especially Windows, macOS, as well as some Linux variants, are supported by SnowSQL.
Become a Snowflake Certified professional by learning this HKR Snowflake Training !
Prerequisites Of Snowflake
Before going forward with Snowflake, make sure you fulfil the following:
1. Installing and setting up SnowSQL
2. The appropriate rights must be granted to your Snowflake user in order to create various Snowflake objects.
Accessing SnowSQL
1. Activate the terminal window.
2. Use the following command to launch SnowSQL at the cmd prompt:
$ snowsql -a
The id that Snowflake has given to the account is accountName>.
The login name given to the Snowflake user is
3. Type the password associated with your Snowflake user when SnowSQL prompts you to do so.
Snowflake Training
- Master Your Craft
- Lifetime LMS & Faculty Access
- 24/7 online expert support
- Real-world & Project Based Learning
Creating Snowflake Objects
Before you can import data into Snowflake, a database and table are required. In a database entitled demo, this article inserts data into the demo table. A virtual warehouse is also necessary for data loading and querying. This offers the necessary computing power for completing these jobs.
- Create a Database: The following syntax can be used to build the database:
CREATE OR REPLACE DATABASE demo;
Every database created by Snowflake receives a default schema called public; as a result, you do not need to define a schema yourself. The current session is now using the database & schema which was just built. Although it is shown in the SnowSQL cmd prompt, one can also do so by using:
SELECT CURRENT_DATABASE(), CURRENT_SCHEMA();
Creating a Table:
Applying the CREATE TABLE command, make a table called
CREATE OR REPLACE TABLE emp_info (
first_name STRING,
last_name STRING,
email STRING,
address STRING,
city STRING,
start_date DATE
);
Keep in mind that the table's number, placement, and data kinds match up with the sections in the CSV data files which will be staged in the following phase.
Creating a Virtual Warehouse
Use the following syntax to build an X-Small warehouse with the name demo_wh:
CREATE OR REPLACE WAREHOUSE demo_wh WITH
WAREHOUSE_SIZE='X-SMALL'
AUTO_SUSPEND = 180
AUTO_RESUME = TRUE
INITIALLY_SUSPENDED=TRUE;
Here,
The quantity of servers for each cluster is specified by WAREHOUSE SIZE.
AUTO RESUME & AUTO SUSPENDThe warehouse will be suspended if it is dormant for the predetermined amount of time. When a query is filed that calls for computational capabilities from the warehouse and the warehouse is currently in a session, auto-resume is automatically enabled.
Determines if the warehouse is initially built in the "Suspended" state using INITIALLY SUSPENDED = TRUE | FALSE.
It must be observed that the warehouse would not be launched initially; instead, it is configured to auto-resume, which means that when the first SQL operation that calls for its compute resources is executed, the warehouse will begin running automatically.
Also, take note that your current session is currently using the warehouse that was built. The following command can be used to inspect the data that will be shown in the SnowSQL command prompt:
SELECT CURRENT_WAREHOUSE();
What To Do When a Snowflake Query Is Slow?
Understanding the cause of your Snowflake queries' sluggish performance is necessary before you can take any corrective action.
Get ahead in your career with our Snowflake Tutorial !
Subscribe to our YouTube channel to get new updates..!
Diagnosing Queries in Snowflake
Let's start by dispelling some myths about the causes of Snowflake queries' slowness. Because Snowflake operates as a cloud service, the hardware and the OS have no bearing on how quickly operations are carried out.
Although the network may play a role in sluggish queries, it is not a substantial enough factor to consistently impede execution. Let's now explore the additional potential causes of your inquiries' delays.
- Check the Information Schema
In a nutshell, the INFORMATION SCHEMA serves as the design template for each database you build in Snowflake. Viewing historical data for tables, warehouses, permissions, and queries is made possible through this.
Its data cannot be changed because it is read-only. The QUERY HISTORY and QUERY HISTORY BY_* tables are among the INFORMATION SCHEMA's key features. These tables aid in identifying the reasons behind delayed Snowflake queries.
Remember that only data that is accessible to your Snowflake account is returned by this tool.
- Check the Query History Page
The query history page for Snowflake finds columns with useful data. We receive the following columns in our situation:
- The query's status is shown in the EXECUTION STATUS, including whether it is executing, queued, blocked, or successful.
- QUEUED PROVISIONING TIME shows the amount of time spent awaiting the assignment of an appropriate warehouse.
- The amount of time needed to fix the warehouse is shown by QUEUED REPAIR TIME.
- The amount of time that a query is continuing to overload the warehouse is shown in the QUEUED OVERLOAD TIME field.
- The more frequent occurrence is overloading, thus QUEUED OVERLOAD TIME is an important diagnostic indicator.
Here is an example of a search:
select *
from table(information_schema.query_history_by_session())
order by start_time;
This displays Snowflake's most recent 100 queries from the current session. Additionally, you can obtain the query history according to the individual and the warehouse.
- Check the Query Profile
We saw what occurs when several queries are impacted at once in the previous section. It's equally crucial to respond to each question specifically. Employ the query profile option for that.
The profile of a query may be found under the Snowflake History tab.
The UI for query profiles resembles a sophisticated flowchart containing a step-by-step query execution. You should concentrate mostly on the nodes and operator tree.
Based on the duration of their operation, the operator nodes are dispersed. The operator tree displays all operations that accounted for more than 1% of the overall execution time.
- Check Your Caching
A query's execution and retrieval of the results could both take 500 milliseconds. Snowflake gives you the ability to cache a query if you regularly use it to retrieve the same results so that the subsequent execution is shorter than 500 milliseconds.
Data is stored in the results cache by Snowflake. It first looks in the result cache when data is needed. If no data is found, it looks on the local hard disc. If the data is still missing, the remote storage is checked.
Data retrieval from the results cache is quicker than from the distant memory or the hard drive. Utilizing the result cache wisely is therefore great practice. The result cache keeps data for 24 hours. The query must then be run once more in order to retrieve the information from the hard drive.
You may examine Snowflake's usage of the result cache. Inspect the Query Profile tab after running the query through Snowflake.
- Check Snowflake Join Performance
You should contrast the predicted output with the actual result if query execution is sluggish. You might have run into a fight that erupted.
A query result that provides significantly more rows than expected is known as a "row explosion." As a result, it takes much longer than expected. For instance, you might anticipate four million records to be produced, but the actual number could be enormously higher. Your queries' joins, which aggregate rows from various tables, cause this issue. The join order is important. You have two options: either search for the join criteria you applied or check the join order using Snowflake's optimizer.
Search for join operators in the query profile that show greater rows in the output as compared to those in the input links to quickly ascertain whether this is the issue. Make certain that the query result doesn't have more rows than all of the inputs combined to prevent a row explosion.
Top 30 frequently asked Snowflake interview questions & answers for freshers & experienced professionals
Conclusion
The fundamentals of utilizing Snowflake SQL are now familiar to you. Data can be loaded, accessed, and modified using SnowSQL. You can also carry out numerous ETL procedures and migrate data from MySQL Server to Snowflake.
Related Articles:
About Author
A technical lead content writer in HKR Trainings with an expertise in delivering content on the market demanding technologies like Networking, Storage & Virtualization,Cyber Security & SIEM Tools, Server Administration, Operating System & Administration, IAM Tools, Cloud Computing, etc. She does a great job in creating wonderful content for the users and always keeps updated with the latest trends in the market. To know more information connect her on Linkedin, Twitter, and Facebook.
Upcoming Snowflake Training Online classes
Batch starts on 21st Dec 2024 |
|
||
Batch starts on 25th Dec 2024 |
|
||
Batch starts on 29th Dec 2024 |
|
FAQ's
Snowflake is a data platform and data warehouse that supports the most common standardized version of SQL: ANSI. This means that all of the most common operations are usable within Snowflake.
The data imported into the emp basic table can be queried using all supported functions and operators as well as ordinary SQL. Using common DML instructions, you can also modify the data, such as changing the loaded data or adding new data.
Yes, Snowflake has a query builder. To utilise the Query Builder for Snowflake, go to Metrics > Query Builder. When the Data Source drop-down menu appears, choose your linked Snowflake Data Source by clicking the green + Create Custom Metric button.
Using secure views, snowflake data providers can exchange information from several databases. Provided the databases in question are part of the same account, a secure view can make references to objects through one or more databases, including schemas, tables, and other views.