Video |

Snowflake | Data engineering | Analytics

Getting started with Snowflake PT-3: Running queries and using JSON

Re-run the exact same query and it comes back in 65 milliseconds for free, because Snowflake's cache already has the answer.

Tim Ngwena •16 May 2021 •1 min read •45 min watch •7.5K

Snowflake's result cache holds query results for 24 hours across all warehouses, so re-running an identical query returns in milliseconds and uses no compute credit
Zero-copy cloning takes a snapshot via metadata and pointers rather than copying data, letting you spin up an independent, writable dev environment without doubling storage
The variant column type lets Snowflake ingest semi-structured data like JSON, Parquet or Avro without predefining a schema
You can query nested JSON fields directly using SQL dot notation (e.g. v:city.coord.lat) as if each field were a relational column
The query profile view breaks down where time and cost were spent, such as a table scan consuming most of the work

0:00Hey, it's Tim here. We're back doing more

0:01Snowflake. We're going to be pushing on,

0:03this is the third video in a series where I

0:04'm essentially just going through the

0:06getting started

0:07guide that Snowflake have on their website.

0:10Essentially, we finished off at the fifth

0:13step

0:14last time loading data. So we essentially

0:16managed to load data from S3 into our Snow

0:18flake environment.

0:19And so today we're going to be pushing on,

0:21we're probably going to be looking at some

0:22more of the

0:23analytical queries and also trying to look

0:25at some of the account administration

0:27aspects.

0:27So without further ado, let's get stuck in.

0:29This is a long video. If you look at the

0:31timestamps

0:31below, you'll sort of get a gauge of what's

0:33coming up and what you might be interested

0:35in.

0:35So yeah, let's get stuck in. Let's not

0:37waste any time and let's start doing this.

0:39Okay, so last time we finished this step of

0:41loading data, you can see here,

0:43we basically went through this entire step

0:45and we got pretty much the end. We had a

0:47couple of issues.

0:48Most notably, specifically when it came to

0:51making sure that we had the right column

0:54types and

0:55making sure that the CSV files that were

0:57loading from S3 were actually being

0:59ingested properly.

1:00We managed to get around that using the SQL

1:02that was actually in the worksheet. So this

1:05worksheet

1:06is available to you when you start off the

1:08demo. There's a little bug here where you

1:10can see that

1:10I've loaded it up, but it's not actually

1:12sort of loading it. And now when we go out

1:14and come back

1:14in, it works. So that's something to be

1:16aware of. So basically, yes, you can see

1:18here that the last

1:19thing we did was we just essentially copied

1:22everything from, well, we copied into trips

1:25from

1:25CityBikeTrips, if that makes sense. Okay,

1:28so we can see here the file format was CSV.

1:30So now that we've done that, the next thing

1:33to do is to carry on to Module 5, which is

1:35actually

1:35Module 6. And at this point in this demo,

1:38all they're trying to do is get you to use

1:40Snowflake

1:41as a database. So for those of you who've

1:42used Tableau, this is normally the kind of

1:44stuff you

1:45use Tableau for. You wouldn't be using SQL,

1:47you'd be doing it nicely in Tableau Desktop

1:50. Okay. So

1:51essentially, this is a very simple part of

1:53the process. Now, I've already got Snow

1:55flake running

1:56over here. So we don't need to sort of

1:57worry too much about that. What we will be

1:59doing sometimes

2:00is just simply highlighting the query and

2:02then hitting run just to save time for the

2:04video.

2:05But we will try and sort of decipher some

2:06of the queries and see how they're working.

2:08I'm not a

2:08SQL expert. So I'll just be sort of trying

2:10to give my basic understanding of this. But

2:13as you know,

2:13this is a video where I'm learning and you

2:15're kind of learning with me. So hopefully

2:17that helps. Okay.

2:18So yeah, let's go back to the Step 6. And

2:20let's just sort of figure our way around

2:23this. And one

2:24of the key things to know about this step,

2:25and actually many of the steps going

2:26forward is that

2:27Snowflake is trying to sort of introduce

2:29the philosophy of how they work. Okay.

2:32And so there'll be sort of a lot of things

2:34here that seem simple, but actually, they

2:35're kind of

2:36trying to get you to sort of see the

2:38difference between this and the traditional

2:40relational

2:40database as it were. Okay. So some of these

2:43points might be a bit labored, but I think

2:45they're really

2:45useful ones to sort of go through. So let's

2:47work through this. So let's see this.

2:49Everything

2:49they highlight in yellow is always useful

2:51to read. So I will skip right to that. A

2:53real world roles

2:53and queries. Within a real company,

2:56analytics users would likely have a

2:58different role than

2:59SysAdmin. Essentially, not everyone's an

3:02administrator. To keep the lab simple,

3:04we're going to stay with a SysAdmin role

3:05for this section. Additionally, querying

3:08would typically be

3:09done with a business intelligence product

3:11like Tableau. Hey, hey, hey, Looker, Power

3:13BI, and so

3:14on and so forth. In my view, that's in the

3:17right order. Okay. For more advanced

3:19analytics, data

3:21science tools like DataRobot, DataQ, AWS

3:23SageMaker, or many others could query Snow

3:26flake essentially.

3:27So if you need more sort of hands-on tools,

3:29then you've already got those capabilities

3:31for you.

3:33Any technology that leverages JDBC/ODBC,

3:36these are database connectors. So JDBC, I

3:40think it

3:40just stands for Java Database Connector or

3:42something like that, and ODBC's Open

3:45Database

3:45Connector or something like that. Spark or

3:48Python can run analytics on the data in

3:50Snowflake.

3:51To keep this lab simple, all queries are

3:53being done via Snowflake worksheet. So

3:55essentially,

3:56what they're saying is, listen, in the real

3:57world, you wouldn't use the Snowflake

3:59interface,

4:00and you'd probably use actual tools to do

4:02this. So like, you know, this is just us

4:04showing you how it works. But in the real

4:06world, you'll probably be using something a

4:08bit more

4:08organic, like a tool, because you're likely

4:11trying to do something to your data. Okay,

4:13so that's a good context to know. Like, a

4:15lot of people will sort of hear databases,

4:17and they'll immediately see SQL, and they

4:19'll think that everyone in the world writes

4:21SQL. Actually,

4:22no, you use tools that typically do the SQL

4:24for you. Or you use tools that help compile

4:27the SQL,

4:27but then you run the bits that you want and

4:29bits you don't want. Okay. So execute the

4:33select

4:34statements and result cache. So essentially

4:36, this one is going to be trying to get us

4:38to see the

4:38cache in action. So run the query below to

4:41see a sample of the trip status. So let's

4:43go back.

4:44And you see here that it's actually got

4:46everything set up. Now, it's going to make

4:48sure that we're

4:48using this specific role. So let's just

4:50make sure we're using that role. So role is

4:52sysadmin,

4:54warehouse is analytical warehouse. In my

4:56case, I'm going to stick to compute, then

4:58you've got

4:59use database, city bike, and use schema

5:01public. So everything is pretty much where

5:03we left it off

5:04last time. Okay. So let's go ahead and just

5:06select this line and hit run. And then see

5:08what we get

5:09back, it's going to ask this, I'm going to

5:11keep sort of not ticking this box, because

5:13I think I

5:14want people to see this every single time.

5:15So they're clear that I'm actually running

5:17the query.

5:18So when we go ahead and run that query, it

5:20took a bit of time, because of course,

5:22my snowflake instance was asleep, but the

5:23warehouse wasn't running. Now it's running,

5:25everything is good to go. So we can see

5:27here, we get a sample of the data, and we

5:29're pretty much

5:30good to go. Everything's worked exactly as

5:32expected. Okay. So we get pretty much the

5:35same

5:35data, it might be slightly different given

5:37you know, where, where everything is and

5:39how everything

5:40is done. But essentially, we get the sample

5:43data. First, let's look at some basic

5:45hourly statistics

5:46on city bike usage. Okay, so we've

5:48downloaded some data from S3. Now we want

5:51to start analyzing it

5:52and see how our bikes used. Run the query

5:55below in the worksheet, it will show for

5:57each hour,

5:58the number of trips, average trip duration

6:01and average trip distance. So let's go

6:03ahead and

6:04go back. I think actually my dog is snoring

6:06in the background. One sec.

6:09Alright, I had to wake him up slowly just

6:15to stop snoring.

6:17It's as boring as as my dog thinks it is. I

6:21hope not. If that's the case, let me know

6:24in the

6:24comments below. So let's go back here and

6:27just select a few of these lines. So this

6:30is the step

6:31we're trying to do here. So let's just look

6:33at this query in more detail. Let me just

6:35give this

6:35more space. Let me see if I can drag this

6:37down there and get blue marker down. And if

6:39I just

6:40hit command plus, I think this will zoom in

6:42. There we go. So select basically, it's

6:45basically taking

6:46the start time and it wants us to take the

6:48hour, it's truncating to the hour and just

6:51bring it back

6:52as a date. So basically, select the trip

6:54time, round it up to the hour and just

6:56bring it back

6:56as a date. Essentially, the trip time is

6:59actually like a date time interface. So by

7:03truncating to

7:03the hour, we just return everything within

7:05the hour to have the same start time. And

7:07then the

7:08date, we just bring in as a date

7:09essentially. So that will give the column

7:11name date essentially.

7:12And then count the number of trips,

7:14basically count everything and return it as

7:17number of trips.

7:18And then the average trip duration. So

7:20essentially the trip duration is in seconds

7:22, divided by 60

7:23will give us the minutes. And so if we do

7:25that and then take the average of that,

7:28then essentially

7:30we'll get the average duration in minutes.

7:32So they've done it slightly different here.

7:34They've taken sort of the trip time,

7:35created an average in seconds, then divided

7:38it by 60.

7:39You could do the same thing essentially,

7:40then you get the answer in minutes. It will

7:42be decimal

7:43point minutes, but essentially it'll still

7:45be fine. Then the average start latitude,

7:50the start station longitude, end station.

7:54So what it's basically doing is it's using

7:58a special

7:59function. So have a sign, I think is a

8:02specific spatial function or specific

8:04function that will

8:05allow us, if I just scroll across so we can

8:07see the rest of this query and see if I can

8:09,

8:10I can't really, I can bring this to the

8:11left. So if we just look at this, it's

8:13basically going to

8:14be calculating the distance. So this is

8:16like a big clue of what's going on. And I

8:18think this function

8:19is like a mathematical function that

8:21essentially takes a start point, the finish

8:24point, and takes

8:26those two things and it calculates the

8:27distance. I don't know how it does it

8:29because it's not like

8:30the journey time. It's just basically

8:32drawing a line between the two. I don't

8:34believe it's

8:34doing anything sort of complex, like

8:36plotting the routes. To do that, you'd need

8:39an API to go out to,

8:40let's say something like a mapping database

8:42so you can get something back. So it's just

8:45calculating

8:45the distance as the bird flies essentially,

8:48and it's doing that from trips and a group

8:50by one,

8:50order by one essentially. So just give it

8:53some sense of order. So let's go ahead and

8:55highlight

8:56that and hit run and let's hit run again.

8:58And we'll sort of bring this up and there

9:01you go.

9:02You get basically for each hour, the number

9:05of trips, the average duration, the average

9:07distance.

9:08And because it's aggregating this, we don't

9:10get sort of the locations, just taking look

9:12how many trips are happening in a

9:13particular time and what's the average

9:15distance and average

9:16duration. So we can sort of look at all the

9:18different start times of our journey by

9:20hour

9:21and see, okay, how many trips are being

9:23made and what's the average sort of picture

9:25look like in

9:25this sense. It's a very generalized way of

9:27doing this. Once you start doing more

9:29analytical

9:29questions, you might want to do some more

9:31sort of bespoke queries. So that's that

9:33query done.

9:34Essentially, we just select that, we run it

9:36, and then you're able to just look at it.

9:38Now,

9:38if I go ahead and look at the query ID and

9:40I just go into the query by clicking on it,

9:42you actually go to this sort of view where

9:44it shows you the query again. So you can

9:46see

9:46it's a success. You can see the query ID

9:48itself and you can see the query that run.

9:51One thing you can do is you can actually

9:53profile the query. So if I just zoom back

9:55out,

9:55you can actually sort of see the profile of

9:56the query. Essentially, it's showing you

9:58sort of what was happening. Okay, so if I

10:00just click on this, if I just show you this

10:03little

10:03square marker here, it makes sure that

10:05everything is sort of in the screen. And it

10:07sort of shows you

10:08like, you know, what cost you the most, you

10:10know, what was the most expensive part of

10:12this process.

10:13And so you can see here the table scan was,

10:15you know, took 72% of sort of the total

10:18cost.

10:18I never know how to sort of interpret this

10:20because it's sort of like costs in Snow

10:23flake are based on

10:24credit systems. So it's, it doesn't sort of

10:26give you a monetary value, but it's

10:28essentially just

10:28showing you look, this query, and most of

10:30your effort was spent down here. And then

10:32look, some

10:33of these other things took a lot less time.

10:35So this is our sort of whole query, we go

10:36to 61

10:37million rows, we aggregate it, we sort it,

10:39and then we return the result. And that's

10:42what we see.

10:43Inside of the query window. And you can

10:45actually see some of that sort of

10:48performance over here on

10:50the right hand side, broken down by various

10:51factors. Again, this is probably worth

10:53another

10:54video. So I'll probably do another video on

10:56, you know, breaking down the profile of a

10:58query in

10:58Snowflake or something like that. I keep

11:00taking notes, by the way, on the side,

11:01making sure I take

11:02all these things down, because they're not

11:04things I understand, and it's probably

11:05worth coming back

11:06to them. But if we go back, then we just go

11:08back in this sort of window, and it takes

11:10us back to

11:10our initial query that we ran. And we can

11:12just sort of expand this out. So that's

11:14that's basically

11:15that I just wanted to sort of give a deep

11:16dive into what one query looks like. So you

11:18can

11:19understand to the mechanics of how this is

11:21all working and how it's all coming

11:22together. Okay,

11:23so let's go back to our query. And this is

11:25the query we've looked at, we've got a

11:27result.

11:28And basically, the point here is that Snow

11:30flake has a result cache that holds the

11:32results

11:32of every query executed in the past 24

11:34hours. These are available across

11:36warehouses. So

11:37essentially, it doesn't matter how many

11:39warehouses you have, whether a small

11:41warehouse did the job or

11:42a large warehouse did the job, the cache is

11:45universal to all of those warehouses. Okay,

11:47so query results return to one user

11:49available to any other user on the system

11:52who executes the same

11:53query. So it's got like a global awareness

11:55of queries being asked. So if someone, let

11:58's say,

11:58run a query on a super fast state of

12:00warehouse, sorry, and then I come along

12:02with my small

12:03warehouse and ask the same question, I'll

12:05benefit from the fact that the question's

12:07already been

12:07answered by a more powerful warehouse. So

12:09that's a nice way of saving costs as well.

12:11It's really

12:11trying to optimize the amount of use you

12:14get out of this query sort of hits don't

12:16cost you anything

12:17because there's already an answer there. So

12:20why would I need to ask the same question

12:22again?

12:23And that cache here is executed over 24

12:25hours. I'm not sure if you can change that

12:27period,

12:28it'll probably cost more to keep that cache

12:30available for longer. But it's nice that

12:33any

12:33query asked in the last 24 hours are going

12:36to be available, which means that data is

12:38kept fresh

12:38data is, you know, you don't have anything

12:41that's sort of over over a day. But the

12:43other key thing

12:44is, it's also interesting to understand

12:46whether this is sort of automatically

12:48invalidated once

12:49you load new data into it. So does it have

12:51an awareness of, oh, I've had new data, so

12:53let me

12:53not rely on the query? I'm sure that's

12:55probably the case. Maybe that will get

12:57explained in a bit.

12:58But let's push on. Not only do these

13:00repeated queries return extremely fast, but

13:02they also use

13:03no compute credit. So you don't get charged

13:05for them. Essentially. Let's see the result

13:07class in action by running the exact same

13:09query again. So let's go back. And you'll

13:12see here that

13:12they don't actually give us the query again

13:14. So we're just going to literally do the

13:16same thing,

13:16highlight it, and run it again. And it's

13:18going to ask us again, and we're going to

13:21hit run.

13:21And you'll see that this one returned in 65

13:24milliseconds. Okay. So if we go to the

13:26query ID,

13:27and we go to it, you'll see the query

13:29profile view that we ran earlier on. Now,

13:32it looks completely

13:34different. And you can see that because for

13:36a brief moment, you saw what it looked like

13:38. In

13:38fact, what I'll do is I'll probably take a

13:40screenshot of that and overimpose and fade

13:42in

13:42between the two of them in post. But you

13:44can see this one just use the cache. So it

13:46basically is

13:46a query reuse, and return the answer

13:48straight away. The other way I can show you

13:50that if I go

13:51to the history tab, I've only run two

13:53queries, essentially. So they were

13:56immediately after each

13:57other. And you can see here they are. So

13:59these are basically the exact same query.

14:01So here's the

14:01second one that I ran. And that's the

14:03sequel. And then here's the first one that

14:05ran exactly the

14:06same. And just just to sort of keep this

14:09easy for you to see, let me just highlight

14:11those two

14:11queries. So here are the two queries, you

14:14can see they're both done in the Snowflake

14:16UI. And the key

14:17thing you want to sort of pay attention to

14:19is the total duration. So the first time it

14:21ran, it took

14:222.1 seconds, not like a slow response at

14:24all. But the second time it ran, it was 65

14:27milliseconds

14:28to get the response back. And it didn't

14:30need to scan the data at all either the

14:32byte scan stayed

14:33basically zero and the second time around.

14:35So essentially, the second query was free,

14:38I didn't

14:38end up using any warehouse in order to do

14:40that. Okay, so that's a really sort of nice

14:42thing to

14:42see. It's nice to see that it's always

14:44being optimized to sort of save me money,

14:46which is

14:47sort of nice to make sense. Okay, so let's

14:49go back to where I was before. Let's go

14:52back a couple

14:52of steps. And I'm just heading back in my

14:54browser for the record recently, you can't

14:56see your

14:57browser bars, because I think it's easier

14:58to see what I'm doing without it. So I've

15:00basically turned

15:01this into like a web app, essentially, so

15:03that we can just go through this without

15:04any sort of

15:05distraction. And so that's that. And that's

15:07basically that we've seen the cache working

15:09,

15:09let's go on to the next step. In the

15:11history window, note that the second query

15:13runs

15:13significantly faster because the results

15:15have been cached. I've shown you this

15:16already. And next,

15:18let's run this query to see which months

15:20are the busiest months. So let's go ahead

15:22back into our

15:22little SQL script. And let's have a look at

15:25this SQL here. So select start time, and as

15:29month name.

15:30So I think month name is like a function

15:32that returns just the name of the month and

15:34returns

15:34it as month. Notice that it's not month

15:37year, so it will just say January, February

15:39, March across

15:40multiple years of data. So it will

15:41aggregate all of these up, count the number

15:43of trips in each of

15:44those months and do that from the trips

15:47table group by one order by two descending.

15:50Okay, so

15:52I think essentially group by one, I'm

15:54thinking that's like a like this function

15:56here that just

15:57gives us a row number and then order by

16:00descending just basically returns the in

16:03descending order by

16:05a number of trips. So if I just, let's just

16:07run that query, I'll actually be able to

16:08see what it

16:09does rather than just talking out my ass

16:11one second. Yeah, so it does, it does

16:12essentially

16:13do what I thought it does. So if I just

16:15move my face here to the top, you can see

16:17here that June

16:18is the busiest month, then September, then

16:20October. And yeah, we basically pretty much

16:22get everything

16:23that we need and it's sort of really nice

16:25to see. Okay, perfect. So that's that. So

16:29just another

16:30query. There's no real point here. It's

16:32just basically getting us to run a query,

16:33which is nice.

16:34Cloning a table. Now this is nice, okay.

16:37Snowflake allows you to create clones, also

16:40known as zero

16:41copy clones of tables, schemas and

16:44databases in seconds. A snapshot of a data

16:47present to,

16:48sorry, let me start again. A snapshot of

16:50data present in the source object is taken

16:52from the

16:52clone from when the clone is created and is

16:54made available to the cloned object. In

16:57English,

16:58basically, it takes a snapshot of the data,

17:00but it doesn't actually copy the data. We

17:02'll come to that

17:03in a second. The cloned object is writable

17:05and independent of the client source.

17:07Therefore,

17:07changes made to either source object or the

17:10client object and not included in the other

17:12. I.e.

17:12let's say I want to test something. I take

17:16a snapshot of my real data. It creates a

17:18separate

17:19clone. And what it does is then from that

17:21point onwards, these two clones are

17:23independent. So

17:24anything I do to my client doesn't change

17:26the real data source. Okay. And the next

17:29cool thing is

17:29a popular use case for zero copy cloning is

17:32to clone production environment for use by

17:34development and testing to test and

17:36experiment with that adversely impacting

17:38production

17:38environment and eliminating the need to set

17:40up and manage two separate environments.

17:42So a typical thing in analytics is let's

17:44say you're building an analytical product,

17:46let's say some dashboards. And what you

17:48want to do is you want to use real

17:50production data

17:50in order to do this. But what you don't

17:52want to do is to take down your production

17:54data.

17:54So what you typically have to do is stand

17:56up a whole nother environment called dev or

17:59development

18:00where you basically publish all your data.

18:02And it's normally a copy from production.

18:04So you have

18:04to go through the whole process of moving

18:06all your data across. And that in itself

18:08can actually break

18:09the production database. I've actually seen

18:11this happen a couple of times. And so it's

18:13a lot of

18:13effort and work. And in real terms, what

18:16you actually end up having is two physical

18:18databases,

18:19one for dev and one for prod. And there's

18:21also a downside, which is dev is never as

18:23fast as prod,

18:24because companies like to cheap out. So

18:26what you end up happening is you build an

18:28analytical

18:28product never really knowing what the real

18:30performance is like, because you built it

18:32on dev.

18:32And then when you get to prod, prod has a

18:34whole set of different circumstances that

18:36you have to

18:36cater for. So all the work you did on to

18:38have never actually even mattered. And so

18:40that's me

18:41having a bit of a rant. But anyway, zero

18:43copy cloning is cool. And let's just go

18:45into this a

18:46bit. A massive benefit of zero copy cloning

18:49is that the underlying data is not copied.

18:51Only the metadata and pointers to the

18:54underlying data change. Hence, they are

18:56zero copy. And

18:58storage requirements are not double when

18:59the data is cloned. Most data warehouses

19:01cannot do this.

19:02But for Snowflake, this is easy. So what

19:04they're saying is, you create a clone, but

19:07you don't take

19:07up twice as much storage, because all you

19:09're doing is you're copying the metadata

19:11around it.

19:11So all your query, everything runs on this

19:14other metadata layer, that's essentially

19:16not actually

19:16touching your data. So your data always

19:18trade stays true to form, everything you've

19:21done on

19:21top of it, sort of kept separate to the

19:23data. And the zero copy clone is sort of an

19:26enhanced

19:27version of doing that. And you can do that

19:29to enable other people to start working on

19:31it. Let's

19:32say analytical use cases, they're normally

19:34quite resource intensive. So they tend to

19:36be done on dev

19:37environments or separate analytical

19:39warehouses. And so this is really nice. Now

19:41, the weird thing

19:42here is that Snowflake are confusing the

19:44term warehouses. So don't again confuse

19:46warehouses

19:47and Snowflake with warehouses in typical

19:49databases. So when it talks about most data

19:52warehouses cannot do this, Snowflake means

19:55a traditional database warehouse, not a

19:58Snowflake

19:59warehouse. Okay. So they're going to show

20:01us how to do this. Run the following

20:02command in the

20:03worksheet to create the development table.

20:05So we're just gonna go back to our sheet.

20:07And let's

20:08go run that. So create table trips, dev

20:11clone trips, hit run, hit run again, this

20:14will take a

20:15bit of time. Oh, actually didn't take time

20:17because of course, it exists. It's just

20:18creating another

20:19instance. So table trips, dev successfully

20:22created. Okay. So in order to see that, let

20:25's, let's go

20:25ahead table trips, dev, I think that's in

20:28our city bike. So let's go into public. And

20:30you can see

20:31here that trips dev has been created. So if

20:34I just sort of highlight this for you, you

20:36can see here,

20:37I now have two databases. And they're

20:38essentially the same, I could run one query

20:40on one and another

20:41query on the other. And they're exactly the

20:43same. But now these two are separated. I

20:45can't,

20:46I can't break trips just because I'm

20:48working on trips dev. But the underlying

20:50data is exactly the

20:51same. I didn't need to load all my data

20:53from S3 again, to get this one to work,

20:56right? It's

20:56everything is just sort of been kept nicely

20:58separate. Okay. So that's really cool. I

21:00really

21:01like that. It's nice, sort of nice to have.

21:04And if I just go back to my, actually, don

21:07't do that,

21:07let's go to my worksheets. And then why hit

21:09back there, we're actually on this step.

21:11And we've

21:12already completed that. So that's good.

21:14Essentially, we just wanted to see where

21:16trips dev was. And

21:17we've seen that already. So that's a nice

21:19thing. That's a nice thing to have. But

21:22again, just to

21:22sort of summarize this section, analytical

21:24queries, results, cache cloning, these are

21:26just standard

21:27database capabilities. If you work with a

21:29database, this stuff is bread and butter.

21:32But actually,

21:32some of the benefits of Snowflake is it

21:34breaks that down in a way that's sort of

21:35really, really

21:36cool. And I think it's really interesting.

21:39Okay. Next, number seven. So we're moving

21:41on here,

21:42we could see at the top, it says 46

21:43remaining, I like this sort of time

21:45estimate, I doubt we'll

21:46keep to that because I'm talking whilst

21:47doing this. So if I did that might end up

21:49being like

21:50a three hour video or something. But anyway

21:52, let's keep going. Working with semi

21:53structured data,

21:55views and join. Okay, so going back to the

21:58lab example, the city bikes analytics team

22:00wants to

22:01determine how weather impacts ride counts.

22:03To do this in this section, we will load

22:06data in JSON

22:07format here, held in the public s3 bucket.

22:10I'll come to that in a minute and create a

22:12view and

22:12query the semi structured data using SQL

22:15dot notation, run a query that joins the

22:18JSON data

22:18to the previously loaded trips data,

22:21analyze the weather and ride count data to

22:23determine

22:24their relationship. Okay. So this is really

22:26interesting, because JSON is a slightly

22:28different

22:29data format. Okay, it's not a sort of a

22:31traditional data source, like a CSV.

22:33It's typically more used in web development

22:35. So I think it stands for JavaScript object

22:38notation,

22:38which is just another sort of, I don't know

22:41why they come up with all these names. But

22:43JavaScript

22:44object notation is essentially a way that

22:46you write JavaScript to talk about sort of

22:49information in

22:49the hierarchy. It typically uses

22:52information stored in arrays and different

22:54levels of hierarchy

22:56with arrays in those hierarchies. But I'm

22:58not a web programmer. So I could also just

23:00be, you know,

23:01maligning that description. So if I am let

23:03me know in the description, we'll get stuck

23:05in. Okay. So

23:06this is actually quite a cool step, because

23:08we quite often have to work with data

23:10sources from

23:10the web. And I know in Tableau, you can

23:12connect to JSON files directly. So you kind

23:14of don't have

23:15to worry about that. But what's even better

23:17is actually not having to burden Tableau

23:19with the

23:20sort of processing that it's going to have

23:22to do to process that JSON, and just

23:24connecting to it

23:25directly through Snowflake. And so Snow

23:27flake can do the hard work of reading that

23:29data. And then

23:30you can just natively query columns and

23:32rows exactly like you would in any other

23:35database,

23:35and let Snowflake do all the hard work of

23:38sort of understanding that that data. So it

23:40's

23:40a nice little sort of feature. And I think

23:42it's a really sort of nice, nice thing to

23:44do. So let's

23:45let's figure out how it works. So let's

23:47read this yellow section semi structured

23:50data. Snowflake can

23:51easily load and query semi structured data

23:53such as JSON, Parquet, or Avro without

23:56transformation.

23:57So you don't need to do any transformation

23:59work to get this to work. This is important

24:01because an

24:01increasing amount of business relevant data

24:03being generated today is semi structured,

24:06and many

24:06traditional data warehouses cannot easily

24:08load and query such data. Snowflake makes

24:10it easy. So the

24:12point here is, typically, if you were

24:13having to work with this data, you'd have

24:15to run a step

24:16before you loaded it into your database,

24:18then you can start to work with it. The

24:20problem there is it

24:21adds time, cost and also expertise, because

24:23you've got to have people who know what

24:24they're doing to

24:25write the scripts and the transformations

24:27required to do that. Now, for some people,

24:29that might not

24:29be a big step. But if lots of your data

24:31comes from web data sources, then this is

24:34literally going to

24:35be like a whole team doing this and making

24:38sure this works every time. So to do this

24:41demo, we need

24:41to create a new table. So we'll go back to

24:44our worksheet, and we'll just start sort of

24:46using some

24:47of these. So create a database, we're going

24:49to create some roles to look at this data.

24:52And then

24:52next, let's create a table called JSON

24:54weather. We're basically just going to go

24:56work our way

24:57through this and bring in data from various

24:59places. So this is a fun exercise, it'll

25:01take a

25:01bit of time. But let's get into it. I think

25:04we should be able to get to the end of this

25:06very

25:06easily. So let's go ahead and do this. So

25:08if I go back in here, and we're just into

25:11model six, so

25:12let's go ahead and create a database called

25:14weather. And if we do that and hit run, it

25:17's

25:18actually going to say database weather has

25:20been created, I'm just going to refresh

25:21this left hand

25:22side. And you should see here weather is

25:25turned up. So this one here is just the one

25:27we want.

25:28And what we can do is we can just sort of

25:30expand this and just see that there's

25:32nothing there. So

25:33there's no tables or anything in our public

25:36sort of area in this particular view. So

25:40that's good.

25:41Then we have some roles to create. So use

25:43role sysadmin, which is already where I am,

25:45use warehouse compute warehouse already

25:47where I am, use database weather already

25:49where I am,

25:50and use schema public already where I am.

25:52Now notice, in the previous step, we were

25:55running

25:55everything on city bikes. And so when I

25:58actually created this database, I reckon it

26:01automatically,

26:02I'm not 100% sure, but I think it

26:04automatically took me to that database.

26:06Because if you create

26:08a database, likelihood is the thing you

26:10want to do next is to work in it. So

26:12instead of having you

26:14have to remember each time to switch

26:15databases, it just switched you over, which

26:17is sort of a nice

26:18thing. It just keeps stops you sort of

26:19working. But if you're working in your own

26:21tool, don't

26:22create a database and then start running

26:23the wrong queries on the wrong database.

26:26That's a bit of a

26:27however. Okay. So now once we've created

26:29our database, we can go ahead and create

26:31this table.

26:32So JSON weather data and v variant. Let's

26:34just go back and see what it says about

26:36that.

26:37So here we go. Let's create a table called

26:41JSON weather that will be used for loading

26:43JSON. In

26:44the worksheet, run the SQL text below. Snow

26:46flake has a special column type called

26:48variant, which

26:49will allow us to store the entire JSON

26:51object and eventually query it directly. So

26:54that's why this

26:55v variant thing has been inserted. I don't

26:58know what the initial v is for. Again, if

27:02you know,

27:02let me know in the comments, but variant

27:04makes sense. They've actually told us what

27:06that is. So

27:07essentially if we're working with JSON data

27:09, we need to tell Snowflake that's what it

27:11is. Okay.

27:12Semi-structured data magic. The variant

27:15data type allows Snowflake to ingest semi-

27:17structured

27:17data without having to predefine the schema

27:20. This is very cool. So remember before when

27:23we worked on

27:23CityBikes, we had to sort of convey the

27:25schema because we had to tell it that, look

27:28, this column

27:28is a string. This column is a number. This

27:31column is called this. This column is

27:32called that. JSON

27:34has that sort of hierarchy built in. So

27:35essentially what it's saying is that the

27:37variant sort of

27:38notation allows them to just go in and not

27:40have to specify a schema. The reason this

27:43is is because

27:43when you use JavaScript object notation,

27:46essentially you can have an endless number

27:48of

27:48hierarchies inside of that sort of data

27:50array. So if you had to describe the schema

27:52, you'd always

27:53be limiting yourself because if you add,

27:55like I say, a new feature in a web

27:56application,

27:57you might add a new hierarchy in your JSON

27:59object to reflect the data that comes from

28:02that new

28:02feature. And so if you don't have to do

28:04that, the schema can always be adapting

28:06itself to suit your

28:07needs. And so not having to specify schema

28:10is really, really, really handy. So that

28:13makes a

28:13lot of sense. So let's go back. And I just

28:15wasn't sure what the variant was. So let's

28:17go ahead and

28:18create table and run. So now that's just

28:21done, what we should be able to see if I

28:23just go back into

28:25public and just refresh this again, not in

28:28my views, go to public. You can see here

28:31that tables

28:32is there. So JSON where the data has been

28:34created, and that's what we were working

28:37with. So we're

28:37still in weather. And that's all been

28:40nicely created. Okay. So now that we've

28:43done that,

28:44let's have a look and see what's next. So

28:47verify that the table JSON has been created

28:49, we've seen

28:50it already, we've gone to database says now

28:52we can load data into it. So we've checked

28:55all this

28:55already. So we can now start loading data.

28:57So via the worksheet, create a stage where

29:00the unstructured

29:00data is stored in AWS S3. So again, like

29:04last time, we need to go tell Snowflake

29:06where the data is,

29:08this time round, notice we're doing it

29:10entirely inside of the worksheet we're

29:13doing inside of the

29:14query editor. So this is something you can

29:16do, I could go into the databases, click

29:19create a stage

29:20and everything, but you can actually do it

29:21all through the command line. So you know,

29:23what tends

29:24to happen is the more and more you work

29:25with these stores, the more and more you

29:26just do all your

29:27work in one place. Because having done it

29:29in that one place, you can then just make

29:31sure that

29:32everything is sort of done properly. And

29:34like we saw before, it's actually probably

29:35a better way to

29:36do it, because then you can make sure you

29:37've covered all your bases when it comes to

29:39writing

29:39the sequel you need to write. So let's go

29:42ahead and run this. And let's see what

29:45happens. So

29:45create stage New York weather URL is that.

29:48And that's pretty much it. So hit run.

29:52Stage error New York weather successfully

29:54created, we can go validate that a little

29:56bit by going to

29:57databases, we can go to weather, and then

30:00we go to the stages tab. And we should see

30:04the stage there

30:04New York weather. So that's been created

30:06for us, it's really nice to see that. Let's

30:08go back to the

30:09worksheet. And we're pretty much good to go

30:11. Now when we say list at New York weather,

30:14what we're

30:14basically going to see is what files are

30:16inside of that S3 bucket. So let's go ahead

30:19and hit that.

30:20And what we should see is all the different

30:22sort of weather files that dump that as

30:24JSON files.

30:24So essentially, the weather is being sort

30:27of collected at various intervals. And

30:29every day,

30:30I assume it's just sort of dumping the

30:32weather for that particular day. So it's

30:35also sort of

30:36in different parts. I'm not sure exactly

30:38what weather is in which file, but you can

30:40see here

30:40that it's it's just dumping in there. And

30:43the second thing is it's compressed. It's a

30:45gzip

30:45file again, something to be aware of. Okay.

30:48So once it's there, I don't think we have

30:50to do

30:51anything because you can just see here, the

30:53next step is to copy that JSON weather data

30:55from our New York weather staging area to

30:58the new place, which is JSON weather. So

31:01this is sort of

31:02written backwards. So what we're basically

31:04saying is, copy all our data from New York

31:07weather

31:08into the JSON weather data. And the file

31:11format is type JSON. So it's a bit like Y

31:14oda in

31:16Star Wars, he always sort of talks from the

31:17middle of the sentence to the front and

31:19then to the back,

31:20you know, wise one is and you could just

31:23say you're a wise one. Okay. So this is

31:25sort of like

31:26JSON. This is sort of sequel for you. I don

31:28't know why it's like this. This is

31:30programmers years ago,

31:32who just wanted to make things maybe more

31:34cumbersome. Who knows, maybe there's a good

31:37reason. If you know, let me know. So let's

31:40go ahead and run this query. Copy into JSON

31:42weather data from New York weather file

31:44format equals type J equals JSON, hit run.

31:47And now once

31:48we've done that, you can see I'm fairly

31:50confident here, I know what's going to

31:51happen. So I'm not

31:52even bothering to go back to the guide,

31:54because you know, you can start to sort of

31:55guess what's

31:56going to happen each time. And you can see

31:58that it loaded each and every one of these

32:00pretty well,

32:01we got a loaded status for each one. So

32:03that works, we're pretty happy with that.

32:06And now what

32:07we can do now we've loaded the data, we can

32:09just go ahead and look at the first 10 rows

32:12of data

32:13from our new database. So let's go ahead

32:15and hit run. And when you do that, you'll

32:17see that this

32:18is real JSON data. So we get the city, let

32:20me just bring this up so we can see more of

32:23it. So we only

32:23get 10 rows. But you can see here that if I

32:26just, I just click on one of the lines, it

32:28gives me a

32:28little preview. And so you can see here, we

32:30get the city, the coordinates, the Latin

32:32long. And so

32:33you can sort of see the hierarchy I was

32:35talking about. So there's this object

32:37called city. And

32:38then in that hierarchy, it goes one step,

32:40and each city has a coordinate. And each

32:43coordinate has two

32:44attributes, a latitude and longitude. And

32:47that's all part of the city label. Okay,

32:50then you've got

32:51country. Actually, I'm wrong. Coordinate is

32:55one thing, country is another thing. And

32:59the name is

32:59another thing. The ID is another thing. The

33:01languages, the different ways you can say

33:03it is

33:03another thing. So this is a really sort of

33:06complex hierarchy. And essentially, I think

33:08this is just

33:09the weather in that particular city at that

33:11particular time. I don't I think if I keep

33:13scrolling down, it's actually loading more

33:15data, which is what's going on. So when I

33:17get to row 226,

33:19I think that's actually the bottom of the

33:22data set this time. I don't think there's

33:24other I think

33:25there is more 226 is the last row. So you

33:27can see there's quite a lot of data. And

33:29you've got

33:30different timestamps and different sort of

33:33weather. But I think this is just here's

33:36this

33:36specific time, here's the weather. And this

33:38is the description basically. Okay, so that

33:40's how this

33:41data looks like. And then you've got this

33:43multiple times each and every time. So

33:45essentially, this is

33:46New York weather again and again and again.

33:49And it's got different locations in New

33:51York, it's not

33:52just the one location, you can see here, we

33:54've got different latitudes and longitudes,

33:55they're ever

33:56so slightly different. Maybe they're coming

33:58from set weather stations. And that's what

34:00they're

34:00using essentially to take the weather in

34:02New York, because New York is a big place.

34:05But if you're

34:05just talking about the city, New York, well

34:07, there's probably only going to be three or

34:08four

34:08weather stations you can realistically use

34:10to get a real time weather reading from the

34:13city

34:13and as a data source. So that makes a lot

34:15of sense. So that's essentially it, we've

34:17seen we've loaded

34:18our JSON, we're pretty much ready to go.

34:20Now, if I go back to the guide, it's going

34:22to talk a little

34:23bit about working with V. So we've done all

34:25of this step, if you're wondering why I'm

34:27skipping

34:27past this, we've actually done all of this,

34:29I didn't necessarily do it here, I just

34:31talked

34:31through it. Okay. Next thing, let's look at

34:34how Snowflake allows us to create a view

34:36and also

34:36query the JSON data directly using SQL.

34:39Okay, so views and materialized views, a

34:41view allows the

34:42result of a query to be accessed as if it

34:44were a table. Views can help present data

34:47to end users in

34:48a cleaner manner, limit what end users can

34:50view in a source table and write more

34:52modular SQL.

34:53There are also materialized views in which

34:55SQL results are stored as though the

34:57results were

34:58a table. This allows faster access but

35:00requires storage space. Materialized views

35:02can be accessed

35:03with Snowflake's Enterprise Edition or

35:06higher. Okay, so you need to be able to use

35:08a certain sort

35:09of set of databases. Now, this is sort of

35:12interesting because I'm actually on a paid

35:15version of Snowflake, so I'm not sure this

35:17is going to work. Materialized views can be

35:19accessed

35:19with an Enterprise Edition or higher. I'm

35:21not an Enterprise Edition, I'm just on the

35:22standard

35:23edition but I know the demo would probably

35:24have enabled this. So if this doesn't work,

35:27this is

35:27probably why. So let's go ahead and try and

35:29run this anyway. We should get some of the

35:32warning

35:32that sort of makes some sense. So let's see

35:34if we can create the view anyway and just

35:37hit run.

35:37Okay, so here we are with the SQL. We've

35:40got all the information there, let's hit

35:42run. So yeah,

35:43the view was successfully created. Let's

35:46just refresh this little right hand side

35:49area.

35:49And you can see here that I now have a new

35:52view just over here. So we have tables

35:55and then we have views. Don't forget, views

35:57are like tables but they're not the same,

35:59okay.

35:59They're sort of based on tables. It's like

36:02you're basically creating a perspective of

36:05the real

36:05thing. So you want something to be simpler

36:07for someone to work with, so what you go

36:09and do is

36:10you go and do all the work in advance. And

36:12when you materialize that view, essentially

36:14that work

36:14doesn't have to be done every single time.

36:16It's already been done once and that can be

36:18the sort

36:19of starting point. But what you're not

36:20doing is changing your source data in order

36:22to do that.

36:23So that's why sort of views are useful in

36:25this particular instance. So that's really

36:27nice.

36:27SQL.notation v.city.core.lat is used in

36:31this command to pull out values at lower

36:33levels within

36:34the JSON hierarchy. This allows us to treat

36:36each field as if it were a column in a

36:38relational table.

36:40So in the previous step, what we'd actually

36:43seen is that each city has a bunch of

36:45information. Things

36:46like coordinates, things like the location,

36:49the Latin long, the different ways it can

36:51be spelt

36:52for example, different ways it can be used

36:54in language. And what we're doing with the

36:56dot

36:56notation here is we're just going deeper

36:58and deeper into the hierarchy. So again,

37:00you can go

37:01deeper into a JSON hierarchy as you add

37:04more complexity to your hierarchy. That's

37:06essentially

37:07what's going on here. So you can see here v

37:09.city, essentially the name of the city,

37:12and then we go dot id and then we bring it

37:16back as a city id, essentially. It's an

37:19integer which

37:19is why it says int. The next one is v.city,

37:22so city.name. And again, we just bring it

37:25back

37:25as a string as city name. Let's go

37:27something a little bit harder. So v.city,

37:31then dot coordinates,

37:32dot lat, that basically brings us back the

37:35latitude. And then as you go down, you can

37:37see

37:37that there's obviously different levels of

37:39this hierarchy that you can go and get. So

37:41it's a

37:42really nice simple way of getting to data

37:44in a hierarchy without having to know too

37:46much about

37:47whole entire structure of the whole thing,

37:49or having a column for everything pred

37:51efined in

37:51a schema in advance. So that's a really

37:53nice thing to do. And what they've done is

37:57they've

37:57actually filtered it a little bit. So from

37:59Jason Weather where city id equals this. So

38:02this I think

38:03is the city id for New York. So that's why

38:05that's there. It's filtering the data. We

38:09're getting it

38:09from Jason Weather, but we're making sure

38:12the city id is 5128638, which I believe

38:15would be the same

38:16id as New York. So if I click on this, I

38:18can actually go ahead and preview the data.

38:21And so

38:22what I did there in case you missed that

38:24just for reference is I went down here, I

38:26clicked on Jason

38:27Weather, that created this little preview

38:29so I could see all the data types coming

38:31from this

38:31thing. And then I clicked on preview data,

38:34and that preview data then gave me this

38:36table. So that's

38:37sort of how it works. And that's how it all

38:40came together. So yeah, you can see here,

38:42this is all

38:43New York 5128638. That's essentially just

38:45like me running a filter in Tableau

38:47essentially. And now

38:48that's super, super easy to work with. Okay

38:50, so that's that's it. That's basically this

38:53particular

38:53step, we've got a couple more steps. Let's

38:55see if we can sort of figure out the next

38:57step without

38:58reading the instructions. So select star

39:00from Jason Weather data, select everything

39:01from Jason

39:02Weather data, where the date is the same as

39:052018, January the first 2018, essentially,

39:09and just

39:09return the first 20 rows. Okay, so let's

39:12just run that to make sure that is actually

39:15what's going on.

39:16Let's hit run. And we should see that

39:19everything is from yes, the Oh, interesting

39:22. So hold on,

39:24hold on a second. So let's what happened

39:26here. So date truncates the month, of

39:28course. And

39:31observation, observation time is okay. So

39:33it's basically taking everything in 2018,

39:37January,

39:38and whether it's, you know, whether it's on

39:41the 17th of Jan, or the first of Jan, it's

39:44all going

39:44to return the basically the the day of the

39:47first of the month, because we're trunc

39:50ating to the

39:50month. So if I give you a date and the time

39:52, let's say the second of January, 2018, on,

39:57you know,

39:57whatever, what, what time, let's choose a

39:59time like 2am in the morning, when you

40:01truncate that

40:02down to the month, it gets rid of

40:03everything up until month, and it just

40:05returns that but it gives

40:06it a timestamp still of midnight. So that's

40:10why you'll get to the 17th and the ninth,

40:13because

40:13they're all coming back as that. And what

40:16we can see here is we're not actually

40:19giving the column

40:20name as well, we're not doing it as that.

40:22So just just be careful that this is this

40:24is like running

40:25a filter essentially, okay. And we're

40:27limiting it to the first 20 rows. And we're

40:29basically just

40:30returning everything. So if we go back to

40:31the guide, let's kind of see what logic

40:33they were

40:33trying to get us to do here. So if we go

40:36back here, view the worksheet, verify the

40:39view with

40:39the following query, notice the results

40:41look like a regular structured data source,

40:42your result

40:43set may have different observation time

40:45values, okay. So that's basically it, they

40:47just wanted us

40:48to see that look, you've loaded data from a

40:50semi-structured data source. So that's very

40:53easy to run. Okay, now, then we start to do

40:55things that are a little bit more complex.

40:58Now we're going

40:59to start working with joins. So here you

41:01can see user join operation to correlate

41:04against data set.

41:05So we'll now join the JSON weather data to

41:08our citybike.public.trips data to

41:10understand,

41:12so to answer our original question of how

41:14weather impacts the number of rides. So run

41:16the command

41:17below to join weather to trips and count

41:19the number of trips associated with certain

41:22weather

41:22conditions. So essentially what they want

41:24us to do is start doing something you do in

41:26the real world.

41:27Let's match up this weather data to this

41:29trips data and see if there's a correlation

41:31.

41:31So that is going to require a little bit of

41:33sort of bringing data together, but also a

41:36little bit

41:36of math because essentially we need to sort

41:38of create some sort of correlation, see if

41:40the two

41:41are related. So here you can just look at

41:43the query, select weather as conditions,

41:45count the number of trips from city bike

41:48public trips, and then bring that data

41:52together left

41:53out of join to the JSON weather data view

41:56on date trunk, our observation time equal

41:59date trunk,

42:00our start time. So basically it's saying

42:02compare the weather observation time to the

42:04hour to the

42:05date truncation hour start time. So you

42:08compare the observation time to the start

42:11time of the

42:12journey and where the condition is not null

42:14, where basically there is no weather data

42:16and group by

42:17one order by two descending. Okay so let's

42:20go to the query and just look at that. So

42:23let's there

42:23it is again, it's pretty straightforward, I

42:26hope it's straightforward, if not I'll go

42:28over it once

42:28it's run so we can see it more clearly. So

42:31let's hit run and what we should see is you

42:33should see

42:34here the conditions on the left and here

42:35the number of trips on the right. So it's

42:37just

42:37basically doing a general sort of tick for

42:39each condition go and do this. So if we

42:42just actually

42:42look at the SQL again I think now we we're

42:45in a better place to understand it. So

42:48select weather

42:48as conditions, so that's why we have

42:51conditions on the left, and then count the

42:53number of trips

42:54from CityBike public trips, hence we have

42:57this on the right hand side. Okay left out

42:59to join

43:00JSON weather data on date trunk observation

43:02time equals that. So it's basically just

43:05then defining

43:05the relationship between the two things

43:07where condition is not null and then group

43:09by order

43:10by two. So essentially what it's doing is

43:13you're defining the way these two datasets

43:16come together

43:16here and you're basically choosing what you

43:19're going to see here at the top and then

43:21you're

43:21choosing how it's ordered over here

43:23essentially. Okay so that's pretty cool

43:26that's nice to see

43:27it's nice to see that that works and that I

43:29think is the very last step of this

43:30particular exercise.

43:31So that was seven working with semi-struct

43:33ured data views and joining essentially

43:35doing analytical

43:36queries and making sure that all works

43:39really really well. Okay so that's pretty

43:41much it we're

43:42going to stop this here in the next video

43:44what we're going to do is look at time

43:45travel and try

43:46and figure out how that works. It's a

43:48really cool feature it basically lets you

43:50undo things

43:50and you can see here that I've actually

43:52been already looking at it just checking it

43:53out but

43:54it lets you undo things basically go back

43:56in time if you're on the standard edition

43:58it goes back

43:59up to 24 hours if you're on enterprise

44:01edition up to 90 days of history you can

44:04essentially run a

44:04query as if you were doing it back in time

44:06and essentially get the results from that

44:09point in

44:09time and not where you are today. It also

44:11allows you to rescue yourself should you

44:12delete something

44:13or do something that you sort of shouldn't

44:15do. So we'll check this out in the next

44:17video and this

44:18has been sort of 40-45 minutes so I think

44:20this is about the right time if I carry on

44:22for another bit

44:23longer we'll probably end up being an hour

44:25and a half and that's probably too long for

44:26a video. So

44:27thanks for watching I'll catch you in the

44:29next video if you've enjoyed this let me

44:30know what

44:30you think in the comments below and we'll

44:32catch you in the next video take it easy.

Future-proof your career https://n1d.io

| My Courses on Linkedin Learning: https://www.linkedin.com/learning/instructors/tim-ngwena In this video we go through using and running analytical queries in Snowflake. We will continue to use the web interface but the same queries will work in any tool. We also take a look at semi-structured data formats, specifically JSON with it’s support for flexible schemas. We will build a database, load JSON data into it and query it without transforming the structure of the data using Snowflake’s native capabilities.

Join this channel to get access to perks:
https://www.youtube.com/channel/UC7HYxRWmaNlJux-X7rNLZyw/join ----------
(C) 2023 TN-Media LTD. No re-use, unauthorized use, or redistribution, of this video without prior permission.