Simple 3 steps to creating a table and setting up daily incremental refresh on Snowflake
Why do you want to use a table?
Obviously if you created a view and it works fast, then that would be the simplest way to go. But if you are dealing with millions or sometimes billions of rows, every time you need to run the view, it will not only cost much but also it will take too long — it will likely timeout. If you could make your codes more efficient and that makes it faster enough, it is great! There will be times, however, your code is as efficient as possible and the data is still too large and the view is taking too long no matter what you do. Then you should turn to creating tables. So let’s quickly talk about how to create a table and a task to insert day-before-data daily. It is simple!
1. Create a table
Creating a table in Snowflake is EASY. Just use the following:
CREATE OR REPLACE TABLE [table name] AS
[your SQL code here]
***If you are dealing with too many rows to create the table at once, you may have to create with a filter and then use INSERT INTO and add the rest of the data. For my table, I had to create a table and insert 3 months of data at a time so Snowflake would not time out. If you don’t know how to insert, use this:
INSERT INTO [table name] [your SQL code here]
2. Create a task
Create a task to insert day-before data into the table:
You can make tasks more complex, like, “run after this other task runs” sort of thing, but here, we are only going to talk about a simple, once a day at a specific time run task. To create a task to insert, and make a schedule, use the following — update the highlighted parts:
CREATE OR REPLACE TASK [task name] WAREHOUSE = [warehouse name] SCHEDULE = ‘USING CRON 0 17 * * * UTC’ (this is an example. 0 means, 0 minutes, and 17 means 5pm. So this schedule means the task will run every day at 5:00PM UTC.)
AS INSERT INTO [table name] [your SQL code here]
***in SQL, make sure the WHERE statement limits the date to day-before using DATEADD:
WHERE TO_DATE(DATE) = DATEADD(DAY, -1, CURRENT_DATE())
***I used TO_DATE() because the date in my code was datetime, not just date. If your date is not datetime, you can go without TO_DATE().
Pretty simple, right? Now you created a task successfully. But you need to go through one more step to finish the entire process.
3. alter the task to make it run
Yes, I know, why wouldn’t it run? But when you create a task, it is automatically “suspended”. So you need to make it start. To do so, use the following:
ALTER TASK [task name] RESUME
Now you are finally done. Enjoy your table and every day automatic insert! If you want to check your task to make sure the status is showing “started” then use the following command:
SHOW TASKS LIKE ‘[table name]’
Or you can use:
SHOW TASKS [database name].[schema name]
Before we finish...
Before we finish, here’s the link to Snowflake DDL Commands. You can use these commands to create different tasks with various functionality. Also, remember – if you need to alter task in the future, you need to SUSPEND the task before you alter it. Then, do not forget to RESUME again! Enjoy your table and the task!
Comments