Tableau Prep Builder: Pivot / Transformations (Part 6 of 7)
The pivot tool is also secretly an aggregation tool, so you've got to know the grain of your data before you pivot.
- The pivot tool in Tableau Prep works both ways under one icon, turning columns to rows or rows to columns, and the icon changes to show which direction you're pivoting.
- Pivoting secretly aggregates data, so you must understand the grain (row level) of your data before pivoting to avoid unexpected aggregation.
- You can extract the year as a string from a date using a calculated field with DATEPART wrapped in STR, giving the same result as an existing year field.
- The aggregation tool can replace manual field removal: anything you don't bring into the aggregation is automatically excluded, making it a faster route to the same output.
- Tableau Prep is non-destructive and flexible, letting you delete individual change steps to undo mistakes and branch flows to achieve identical results multiple ways.
0:00Okay, in this video, we're going to turn it
0:04up a little bit, we're going to be working
0:06on data transformations. In Excel, you
0:09might consider this to be pivoting or unpiv
0:11oting values. So they're easier to work with
0:13, maybe for some data prep, but actually,
0:18usually you do this because you want to
0:18share this with someone.
0:19But in order to do this, we're also going
0:21to have to do a little bit of data prep
0:22ourselves to get ourselves to the necessary
0:25step. So in this video, we're going to be
0:27calling on all the skills we've built upon
0:29in the previous four or five videos. Okay,
0:32let's get stuck in.
0:34So again, I'm going to open the sample flow
0:36. I connect this because I know everyone has
0:39this in their version of tablet prep.
0:42One thing to be aware of if you're using an
0:43older version of tablet prep, you might not
0:45see some of the features that I'm working
0:47with, but everything I'm covering should be
0:49available to you. And if it's not, just
0:52check the release notes and grab a new
0:54version of tablet prep.
0:55I'm actually using tablet prep 2019 dot
0:57four dot one, this is not the latest
1:00version 2020 dot one is actually the latest
1:03version. So this should hopefully help with
1:06that. Okay, so what I want you to do in
1:08this particular video is to show you how to
1:11use this.
1:11So what I'd like you to do in this
1:13particular step is just to delete
1:16everything up until this clean step, I'm
1:19going to keep the clean step as is. Because
1:22what I'd like to do is add a little bit
1:24more value to this flow by creating our own
1:26steps, then we're going to pivot this data
1:29in a much, much easier way.
1:31Now, the tool that we're going to be
1:33working with is actually this pivot tool.
1:35And you can see that it has this single
1:37icon here, that suggests that you can turn
1:40a column into a row. But it actually also
1:42has the ability to do the opposite to turn
1:44a row into a column. So everything is sort
1:47of under that one icon, if that makes sense
1:49.
1:50But in order to do that, we need to do some
1:52data prep first, we need to have a look at
1:54our data and see what we'd like to do. And
1:56I'm going to add a step rather than adding
1:58additional calculations into this clean
2:01step to make it easier to see what we've
2:03done versus what was already in the flow.
2:06So I'm going to click on the Add step
2:07option. And I'm going to zoom back out.
2:09Okay, so we have a clean list of changes
2:12here in our changes pane. I'm going to keep
2:14that open so we can see what's going on.
2:17Okay. Now what I'd like to do is I'd like
2:19to pivot the values, okay, so that I build
2:23a table where I have the dates going across
2:26the top. Let's assume that I want to create
2:28a very simple table that I sent to some
2:31colleagues that has the categories on the
2:34left hand side, and then the years across
2:36the top.
2:38Okay, I'd like this broken down by category
2:40and subcategory. Okay. In order to do this,
2:43I need to do a little bit of data cleansing
2:45here because I need to remove a whole bunch
2:47of items in this table, I have quite a few
2:49items that I don't need for that very
2:51simple table.
2:53Okay, so what I'm going to do is I'm going
2:55to go to this pane here on the right hand
2:57side.
2:58And what you can actually do is you can
2:59actually remove fields that you don't need.
3:02So I need the year of sale, but I don't
3:04need the weather is returned or not. And
3:06notice as I remove these, you get this sort
3:08of change happening here on the left.
3:11I don't need the days to ship. And
3:13unfortunately adds them one by one. So this
3:14is going to get very, very long, very, very
3:16quickly. So don't panic. If this is moving
3:20too fast.
3:21I'll just go ahead and remove quite a few
3:23of these. Now be careful because I want to
3:25keep the category. So let's say I made a
3:27mistake and I remove that. And I carry on,
3:30and I keep going. Well, if I want to bring
3:33the categories back, I can just delete that
3:35one single step, and it will pop back there
3:37.
3:38So this is really non destructive way of
3:40working with your data. It's really
3:42flexible, and you can carry on and sort of
3:44correct mistakes, as I say, you can fail
3:46fast. So I'm just going to keep the date
3:49because I might need that in some of my
3:50analysis.
3:52And the key thing to also bear in mind here
3:54is that you need to be very comfortable
3:55with the row level data. Okay. In business
3:58intelligence, this is typically called the
4:01grain of your data.
4:03And if you look at this data set, you might
4:06think that it captures orders from a
4:08particular storm. But in fact, the grain of
4:11the data is at the product level. Actually,
4:13each order will have multiple products. And
4:16this data actually tells me about which
4:19products were ordered.
4:21It's not just about orders. So you have to
4:23understand that when you're working with
4:25the data. But in this case, it's okay,
4:27because we're going to aggregate some of
4:29the information. And I'm going to remove
4:32profit, I'm going to keep profit and
4:33quantity.
4:35And I probably want to remove region. I don
4:38't think I need that. I do need subcategory,
4:41I do need cells, I think I can go ahead and
4:43just delete everything else. Okay.
4:46Now, the interesting thing here is when you
4:49build a prep flow, you kind of build it as
4:52you go along, right? And so you will build
4:55this one way and then find another way of
4:58doing this.
4:59So whilst I'm building this, I just want
5:01you to think, as I do this particular step
5:03here, what is the other way I could have
5:05achieved this exact same operation?
5:08And if you can do that whilst we're
5:09building, and then when I show you the
5:11other way you could have done this, you'll
5:14start to see how you can use Tableau prep
5:16to work in a flexible manner, depending on
5:18what you're trying to do.
5:20Okay. So I think I have all the fields that
5:22I need here. Year of sale, category, order
5:24date, and subcategory. So let's go back to
5:26the summary view. I just have four columns.
5:29Okay.
5:30Now, I still have 16,000 rows. I see that
5:32here in the top left-hand side, and I now
5:35have just five fields. So that just gives
5:37me a little bit of context of what's going
5:39on.
5:40Now, the key thing to understand is here
5:42that the year is a number. I think it's a
5:45number, a whole number. What I'm going to
5:47do is I'm actually going to turn it into a
5:49string. Okay.
5:50I just want it to work as text. The reason
5:52is I'm actually going to be exporting this
5:54to Excel, so I have to kind of think in the
5:56backward way of Excel, and I'm actually
5:59going to make these text items across the
6:01top of my table.
6:03And on the left-hand side, I said I wanted
6:04the category and subcategory, so let me put
6:07those together. And the value that I'd like
6:09in that table is the sales. Okay.
6:12So because I've already got a grouping
6:14field here for each of the years, I don't
6:16need to do anything particular with the
6:19order date. But let's assume I didn't have
6:21this year of sale.
6:23What I could do is I could click on this
6:25order date, create a calculated field, and
6:28one of the things I can do in the date
6:30functions is return the date part.
6:33So if I double-click that, and then I'll
6:35just zoom in here so you can see what I'm
6:38doing, I'll take that with me here, and I
6:40'll paste it in there. And the date part
6:42that I'm interested in is the year.
6:45I could just type in year, put a comma in
6:47there. That's a valid calculation. And then
6:50what I'll do is I'll just type in year.
6:53And because I want it as a string, I can
6:56wrap it in the str function, and that will
6:59return whatever that integer is as a string
7:02.
7:03And I'll just call this year manual so it's
7:06easy to see what's what. Hit apply.
7:10And zoom back out. Click save. And you can
7:13see that the manual version is identical to
7:16the one that's there. Okay.
7:18This is a calculated field, whereas this
7:20was just the field that was in my dataset
7:23already.
7:24So those are the two ways you could have
7:25arrived at the same conclusion. But having
7:27created that, we're going to sadly get rid
7:30of our wonderful work and remove it.
7:33Okay, so I'm also going to remove this date
7:34because we've established that I don't need
7:36it. And now I just have the four columns
7:38and values that I need. Okay.
7:41So we've still got 16,000 rows. One of the
7:44advantages is when we pivot data, we kind
7:46of need to make sure that everything works
7:49the way it should.
7:51If I pivot 16,000 rows and I don't
7:54necessarily have the right level of detail,
7:56then some aggregation has to take place,
7:58right?
7:59Those rows have to go somewhere. So what I
8:01tend to do with pivoting, I always do this
8:02thing. Maybe it's a bad habit, but I can't
8:05always foresee what's going to happen
8:07because the pivot can sometimes find
8:10datasets.
8:10And nulls and sort of weird characters that
8:12kind of break the pivot. So I always try
8:14and just do a bit of trial and error first.
8:18So I bring the pivot in. And then it's
8:19asking, well, what pivot fields would you
8:22like to pivot? Well, the thing I'd like to
8:24pivot is the year. Okay.
8:26So if I do that, you can see that the pivot
8:29name is year of sales and the pivot value
8:33is year of sales. Okay.
8:36But in this particular case, this isn't
8:38quite working the way I want it to. This
8:41view is not doing exactly what I wanted.
8:45If I just go back one step and I just look
8:47at this, I already had year of sales in a
8:50column.
8:51So this pivoting to rows, pivoting this
8:53column to rows is actually creating two
8:56columns to do the job of one, which is not
8:58quite the right thing.
9:00What I need to actually do is pivot it the
9:03other way. Okay. So now I'm going to drag
9:06the year of sale back in here and look how
9:08the outcome is different.
9:10Okay. I now have the same year of sale item
9:16. Okay. But this is not finished yet because
9:19I haven't given it a field to aggregate.
9:22So the field to aggregate for new columns
9:25is going to be the sales. Yeah. So let's
9:28just drag that in to here.
9:31And now you can see that this is working
9:33the way it should do. The years have gone
9:36across the top. The category is on the
9:37right hand side.
9:38I can correct that very simply by just
9:39bringing those here to the left hand side.
9:42Be careful you don't drag that too far to
9:45the left and off the canvas.
9:47And now we've pivoted the information in a
9:49very sort of simple to see way. And now you
9:52can hopefully start to see the format that
9:54we needed.
9:55So you've got to be really careful.
9:57Whenever you pivot, eventually certain rows
9:59are going to get aggregated.
10:02So the pivot tool is also secretly an
10:04aggregation tool. And the aggregation here
10:07can be defined much like our aggregation
10:09options that we saw in the past.
10:12Okay. Those are also available to you here.
10:17Now, earlier on, I said to think about
10:18another way of doing this that would
10:21essentially arrive at the same result.
10:23Okay. So what I'm going to do for the first
10:25time, I'm going to actually create a branch
10:27in this step.
10:29So I'm going to click on this plus icon and
10:31create a branch. And this is going to
10:33create a new step just above.
10:35Now, ironically, I don't need a new sort of
10:39step. What I'd actually like to do is add a
10:43new aggregation step.
10:46And that will create a branch as well. And
10:48it will allow me to sort of add this in
10:50here.
10:51Now, if I look back at my data set as it
10:53came in before, you'll notice that I
10:56actually did a bunch of removal of columns,
10:59right?
11:00And interestingly, here with the
11:02aggregation, I've just gone right into it.
11:05So what I needed to do is I can search for
11:07the year column here.
11:09I can bring the year of sales and I'm going
11:11to group it by the year of sales. Then I'm
11:14going to bring my category into the group.
11:18Subcategory into the group. And then the
11:22last thing I'm going to bring across is the
11:24sales.
11:25Okay. So what this does is in one step, it
11:27actually does everything we did in the top
11:30step, because by virtue of being the
11:32aggregation tool,
11:33anything I don't bring in is automatically
11:35left out and only the stuff I bring in is
11:37actually going to affect the aggregation.
11:40So whereas before I went through and
11:42removed all of these fields, you can see
11:44them listed here, I didn't actually need to
11:46do that.
11:47The faster way would have been to use the
11:49aggregation tool to do this.
11:51Okay. And then what I think I can do, I'm
11:53pretty confident I can do this, is I can
11:56actually copy this top step here and paste
11:58it just below.
12:00Now, it's a bit weird in prep when you
12:01paste. It pastes it all the way over here
12:03on the left at the moment.
12:04So I hope that's a bug that will get fixed.
12:07And to then connect these two, just I'm
12:09going to zoom in so you can see what
12:10happens here.
12:11I can just drag this aggregation and you'll
12:13see that when I hover over these items, you
12:16get the new sort of capabilities that you
12:19can see.
12:20So I just want to add this aggregation to
12:22this particular pivot. And when I do that,
12:26it creates a step.
12:28And so if I zoom out, what I expect to see
12:30is that this top pivot and this bottom
12:33pivot have exactly the same outcome.
12:36Okay. If I click on that, I get category,
12:38subcategory, and then the years. If I click
12:41on the bottom, there you go.
12:44I get exactly the same thing. So I've
12:46achieved the same result once with an
12:48aggregation tool and the other time without
12:51.
12:52And hopefully you start to see how data
12:54prep in Tableau Prep starts to become very
12:56flexible. There's multiple ways you can do
12:58the same thing.
13:00Now, the key thing to understand here is
13:01that just because you've done a
13:03transformation one way doesn't mean you can
13:04't do it the other way.
13:06So let's say I'd like to correct this. Let
13:08's say actually rather than deleting this, I
13:10'd like to unpivot this data.
13:12Well, notice here that the icon changes to
13:14show you which way you're pivoting. Okay.
13:18So this one has one column going to two,
13:21whereas this has two going to two. Okay.
13:24So this is really important to understand.
13:26Tableau is constantly communicating with
13:28you what's actually going on.
13:30And for this particular setup, what I need
13:32to do is simply drag all the columns that I
13:35'd like to pivot into one column.
13:38Okay. So in this case, I'm going to select
13:402015, go all the way down to 2018, drag it
13:43onto the pivoted fields.
13:46You'll see the summary view change. Then
13:48what I need to do is rename these because
13:50it doesn't know what to name these when the
13:52pivot is done. Okay.
13:54And then here I can just call this "Sales."
13:57And so we're back to square one. We're back
14:00to the exact same output that we had here.
14:03Okay.
14:04So hopefully that's been a good
14:06introduction into the pivot and unpivoting
14:09tools.
14:10I've also shown you ways you can do
14:11transformations without necessarily using
14:13things like the aggregation tools.
14:16And it's also really important to
14:17understand that when you're pivoting, you
14:20can actually inadvertently aggregate the
14:22data.
14:23And you need to be aware of how those
14:24aggregations are going to work. So it's
14:27just something to bear in mind. Okay.
14:30Now, just as before, you can see here that
14:31there's a little change that's been made.
14:34So I get to see an icon here.
14:36I rename the two fields. That's why this
14:37icon is a little bit smaller. And those
14:39changes live here in the changes pane.
14:42So just like I've done in the previous
14:43videos, all those changes are consistently
14:46being displayed to you throughout the flow.
14:49Okay.
14:51That's pretty much it for transformations
14:53and pivots.
14:54You can obviously do way more advanced
14:56transformations with a combination of
14:58different skill sets and tools.
15:00But I just wanted to show that one as a
15:01basic introduction as to how that works
15:03here in Tableau Prep.
15:05It's the same interface you're starting to
15:07see again and again and again. The same
15:09sort of mechanics again and again.
15:11And the same visual cues each time. Okay.
15:15That's it. If you've enjoyed the video, hit
15:17subscribe. Otherwise, drop a comment below.
15:19Let me know the kind of content you'd like
15:21to see. And I'll catch you in the next
15:23video where we'll talk about joins/vlookups
15:26. All right.
Pivoting your data from columns to rows or rows to columns is a pretty heavily used function and capability in Microsoft excel. In this video I show you how to do both and how to make sure you get yours perfectly configured. -----Join my Discord Server. https://discord.gg/shBuxXr it’s a little sparse at the moment but hang in there.