Snowflake is incredibly fast out of the box, but as your data volumes and query complexity grow, even the most powerful engine needs a tune-up. Slow-running queries not only frustrate users but also drive up credit consumption and cost. The good news is that most performance issues can be solved with a few key techniques.
If you’re an experienced data engineer, mastering Snowflake performance tuning is a critical skill that separates you from the crowd. It’s about understanding how Snowflake works under the hood and making strategic decisions to optimize your workloads.
This guide will walk you through five actionable techniques to diagnose and fix slow-running queries in Snowflake.
Before You Tune: Use the Query Profile
The first rule of optimization is: don’t guess, measure. Snowflake’s Query Profile is the single most important tool for diagnosing performance issues. Before applying any of these techniques, you should always analyze the query profile of a slow query to identify the bottlenecks. It will show you exactly which operators are taking the most time, how much data is being scanned, and if you’re spilling data to disk.
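If you’re not sure which queries to profile first, the account usage views can surface candidates. Here’s a starting point, assuming your role can read the SNOWFLAKE.ACCOUNT_USAGE schema (its views can lag real time by up to 45 minutes):
-- Surface the slowest queries of the past 7 days as profiling candidates
SELECT
    query_id,
    warehouse_name,
    total_elapsed_time / 1000 AS elapsed_seconds,
    partitions_scanned,
    partitions_total,
    bytes_spilled_to_remote_storage
FROM snowflake.account_usage.query_history
WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
ORDER BY total_elapsed_time DESC
LIMIT 20;
Grab a query_id from the results and open its Query Profile in Snowsight to see the operator-level breakdown.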
1. Right-Size Your Virtual Warehouse
One of the most common misconceptions is that a bigger warehouse is always better. The key is to choose the right size for your workload.
- Scale Up for Complexity: Increase the warehouse size (e.g., from Small to Medium) when you need to improve the performance of a single, complex query. Larger warehouses have more memory and local SSD caching, which is crucial for large sorts, joins, and aggregations.
- Scale Out for Concurrency: Use a multi-cluster warehouse when you need to handle a high number of simultaneous, simpler queries. This is ideal for BI dashboards where many users are running queries at the same time. Scaling out adds more warehouses of the same size, distributing the user load without making any single query faster.
Actionable Tip: If a single ETL job is slow, try running it on the next warehouse size up and measure the performance gain. If your BI dashboard is slow during peak hours, configure your warehouse as a multi-cluster warehouse with an auto-scaling policy.
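Both adjustments are a single ALTER away. A quick sketch (the warehouse names etl_wh and bi_wh are placeholders; multi-cluster warehouses require Enterprise Edition or higher):
-- Scale up: run the heavy ETL job on the next size and compare timings
ALTER WAREHOUSE etl_wh SET WAREHOUSE_SIZE = 'MEDIUM';
-- Scale out: let the BI warehouse add clusters under concurrent load
ALTER WAREHOUSE bi_wh SET
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 4
    SCALING_POLICY = 'STANDARD';
Remember to scale back down (or rely on auto-suspend) once the job finishes, or the larger size quietly eats the savings.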
2. Master Your Clustering Keys
This is arguably the most impactful technique for tuning large tables. Snowflake automatically stores data in micro-partitions. A clustering key co-locates data with similar values in the same micro-partitions, which allows Snowflake to prune (ignore) the partitions that aren’t needed for a query.
When to Use:
- On very large tables (hundreds of gigabytes or terabytes).
- When your queries frequently filter or join on the same columns (e.g., user_id, event_timestamp).
Actionable Tip: Analyze your slow queries in the Query Profile. If you see a “TableScan” operator that is scanning a huge number of partitions but only returning a few rows, it’s a strong indicator that you need a clustering key.
-- Define a clustering key when creating a table
CREATE TABLE my_large_table (
    event_timestamp TIMESTAMP_NTZ,
    user_id VARCHAR,
    payload VARIANT
) CLUSTER BY (user_id, event_timestamp);
-- Check the clustering health of a table
SELECT SYSTEM$CLUSTERING_INFORMATION('my_large_table');
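A clustering key can also be added to an existing table. Be aware that Automatic Clustering then reclusters the table in the background and consumes credits, so weigh that ongoing cost against the pruning benefit:
-- Add (or change) a clustering key on an existing table;
-- background reclustering will consume credits
ALTER TABLE my_large_table CLUSTER BY (user_id, event_timestamp);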
3. Avoid Spilling to Remote Storage
“Spilling” happens when an operation runs out of memory and has to write intermediate data to storage. Spilling to local SSD is fast, but spilling to remote cloud storage is a major performance killer.
How to Detect It:
- In the Query Profile, look for a “Bytes spilled to remote storage” warning on operators like Sort or Join.
How to Fix It:
- Increase Warehouse Size: The simplest solution is to run the query on a larger warehouse with more available memory.
- Optimize the Query: Try to reduce the amount of data being processed. Filter data as early as possible in your query, and select only the columns you need.
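To find offenders account-wide rather than one query at a time, the same query_history view exposes spill volumes. A minimal sketch, with the same ACCOUNT_USAGE access and latency caveats as above:
-- List recent queries that spilled to remote storage, worst first
SELECT
    query_id,
    warehouse_name,
    warehouse_size,
    bytes_spilled_to_local_storage,
    bytes_spilled_to_remote_storage
FROM snowflake.account_usage.query_history
WHERE bytes_spilled_to_remote_storage > 0
  AND start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
ORDER BY bytes_spilled_to_remote_storage DESC
LIMIT 20;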
4. Use Materialized Views for High-Frequency Queries
If you have a complex query that is run very frequently on data that doesn’t change often, a Materialized View can provide a massive performance boost.
A materialized view pre-computes the result of a query and stores it, almost like a cached result set. When you query the materialized view, you’re just reading the stored results, which is incredibly fast. Snowflake automatically keeps the materialized view up-to-date in the background as the base table data changes. Keep in mind that materialized views require Enterprise Edition (or higher) and that the background maintenance consumes credits.
When to Use:
- On a query that aggregates or joins data from a large, slowly changing table.
- When the query is run hundreds or thousands of times a day (e.g., powering a critical dashboard).
CREATE MATERIALIZED VIEW mv_daily_sales_summary AS
SELECT
    sale_date,
    category,
    SUM(amount) AS total_sales
FROM raw_sales
GROUP BY 1, 2;
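Once created, you query it like any other table, and you can check its maintenance state (the sale_date filter below is just for illustration):
-- Query the view directly; Snowflake keeps it current in the background
SELECT * FROM mv_daily_sales_summary WHERE sale_date = CURRENT_DATE();
-- Inspect refresh/maintenance status
SHOW MATERIALIZED VIEWS LIKE 'mv_daily_sales_summary';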
5. Optimize Your Joins
Poorly optimized joins are a common cause of slow queries.
- Join Order: Join your largest tables last. Join your smaller dimension tables together first, then join the result to your large fact table. This keeps the intermediate result sets small.
- Filter Early: Apply WHERE clauses to your tables before you join them, especially on the large fact table. This reduces the number of rows that need to be processed in the join.
-- GOOD: Filter before joining
SELECT
    u.user_name,
    SUM(s.amount)
FROM
    (SELECT * FROM sales WHERE sale_date > '2025-01-01') s -- Filter first
JOIN
    users u ON s.user_id = u.user_id
GROUP BY 1;

-- BAD: Join everything, then filter
SELECT
    u.user_name,
    SUM(s.amount)
FROM
    sales s
JOIN
    users u ON s.user_id = u.user_id
WHERE
    s.sale_date > '2025-01-01' -- Filter last
GROUP BY 1;
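To sanity-check a join without spending credits on a full run, EXPLAIN prints the operator plan so you can confirm where filters are applied and how the tables are joined (a complement to the Query Profile of an actual run):
-- Inspect the plan without executing the query
EXPLAIN
SELECT
    u.user_name,
    SUM(s.amount)
FROM
    (SELECT * FROM sales WHERE sale_date > '2025-01-01') s
JOIN
    users u ON s.user_id = u.user_id
GROUP BY 1;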
Conclusion
Snowflake performance tuning is a blend of science and art. By using the Query Profile to diagnose bottlenecks and applying these five techniques—warehouse management, clustering, avoiding spilling, using materialized views, and optimizing joins—you can significantly improve the speed of your queries, reduce costs, and build a highly efficient data platform.
