Tableau Prep Builder: Formulas and Calculations (Part 4 of 7)
The big shift for Tableau Desktop users: Prep does its formulas at the row level, so there are no aggregations in this calculation window.
- Tableau Prep calculations work at the row level of detail, so aggregations like SUM() cause errors here unlike in Tableau Desktop; a separate aggregation step is required.
- Rename and add descriptions to your steps (e.g. 'calculation step') to document your flow as best practice for others.
- Be careful dragging columns in Prep: dropping one column onto another merges them, so watch for the visual line cue that indicates placement between columns.
- Filter calculations produce a true/false Boolean and filter the dataset directly rather than creating a new column, whereas a calculated field keeps those values visible.
- The list view and aggregated summary view make it far easier to read columns and verify what your calculations are doing through highlighting.
0:00Okay, welcome back to this fourth video in
0:03the series for Tableau Prep for Excel users
0:07.
0:07In this video, I'm going to be focusing
0:10purely on formulas and calculations inside
0:12of Tableau
0:12Prep.
0:13Now, in order to do that, you just need to
0:15open up Tableau Prep, and we're going to be
0:17using the sample flow.
0:19But we're going to make a change to that
0:20flow so that you can build along with me
0:22and have
0:23a go at these calculations yourself.
0:24Yes, that's right.
0:25I'm going to be asking you to get stuck in.
0:28So let's open up this Superstore sample
0:30flow.
0:31What I want you to do is I want you to take
0:33everything on the right-hand side that I've
0:35selected here.
0:36If I just zoom in so you can see that, I
0:38want you to right-click on it, and I want
0:41you to
0:41remove it.
0:43We're going to start this flow using this
0:45position.
0:46So I'm just going to give you a few minutes
0:48just to do that.
0:49All right, we're there.
0:50So here we are.
0:53We're looking at this dataset, and already
0:55some data prep has been done.
0:57And if you watched the previous video, you
1:00'd now be familiar with the step flow.
1:02You'd also be familiar with how to bring
1:05datasets in.
1:06And in the future videos, we're going to be
1:09covering this union step.
1:10So don't worry too much about this
1:12particular step.
1:13But I just want to build on our
1:14understanding.
1:15I want to focus in on formulas and
1:18calculations.
1:19To add a calculation step, I'm just going
1:22to hit this plus icon and add a step after
1:24that particular step.
1:27Now what will happen is that Tableau will
1:29do two things.
1:30If I just go back one step, and I click
1:33that icon and hit add step, it will open up
1:36this
1:36panel and it will add the step.
1:39It can sometimes be confusing because that
1:40change is almost instantaneous.
1:42You don't get any sort of animation into it
1:44.
1:44So it does that in one step.
1:46And the key thing here is because we've got
1:49no calculations in yet, this step won't
1:51have
1:51any icons along the top.
1:53If you look at the top here, this is a very
1:55sort of blank step.
1:56And more importantly, the changes here on
1:59the right-hand side should say zero.
2:01So we've just added a step.
2:03It's called Clean 2.
2:05And there are no changes made just yet.
2:07Now, it might be frustrating you that it's
2:09called Clean 2.
2:10It frustrates me.
2:11So what you can do is you can double click
2:14that and just call this calculation step.
2:18You can also add a description.
2:20So if I zoom back out, sorry, there's a
2:23little bug there.
2:24If I zoom in, double click on the name, I
2:28can also then add a description.
2:31I can just type in, "We're going to do some
2:35profit ratio calcs."
2:38Okay.
2:40This is useful for documenting your flow as
2:42you go along.
2:43So now we're starting to work with prep.
2:45I'm starting to show you some of the best
2:47practice you can do in your prep flow to
2:49help
2:49others understand what's going on.
2:52I'll zoom back out.
2:54And now that we've got our step and we've
2:57got the description and the name all set,
2:59we can click back on the step itself, start
3:02to understand what's going on.
3:04Now, in order to do a profit ratio
3:07calculation, I'm going to first need a set
3:10of values.
3:11The values I'll need are my sales, the
3:14total sales, and then I'll also need the
3:17profits
3:17from each individual sale.
3:21If I look at my data, I can look at it in
3:23one of three ways.
3:24This is probably not the most efficient way
3:26because I have to scroll across to check
3:28each
3:28and every column, and it's not a very easy
3:30way of reading that.
3:31I can actually switch over here to this
3:34third view, which shows a list view.
3:36And then I can just scroll down this and it
3:38's much, much easier.
3:39What I'm going to do is I'm going to move
3:41my columns so they're side by side and I
3:43can
3:43more easily see them in the window.
3:46So now that I've found all my fields, I'm
3:49now going to try and bring these together.
3:51So I know sales is right at the end, profit
3:54is there, and I think quantity might be
3:57another
3:58useful field to have.
4:00Quantity is just right there.
4:01So quantity and profit are next to each
4:02other, but sales is all the way over here.
4:04If I go back to my summary view, I can very
4:07simply just bring that in because I've
4:10still
4:10got it selected and just bring it across.
4:13The window will automatically sort of move
4:16across and I can just drop it in here in
4:18front
4:18of sales.
4:20Now the one thing to be careful with in
4:23prep is when you drag and drop columns, you
4:26can
4:26sometimes drop them on each other.
4:29And when you do that, Tableau perceives
4:31that as an instruction.
4:33So just watch what happens.
4:34I'm going to take the product name and the
4:36product ID, and I'm going to drop them on
4:38top of each other.
4:40Tableau will actually merge those fields.
4:43You'll see here that it's got now a merged
4:46field.
4:46If I just see this, you can see here that
4:49it's now got a new sort of column ID and it
4:52's
4:52merged them.
4:54That's not what I wanted to do.
4:55If you look at this icon, that's the merge
4:57icon.
4:58That's not what I wanted to do.
4:59I did not want to merge those fields.
5:01So it's just very, just be very careful
5:03when you click and drag items that you are
5:05explicitly
5:06dropping them where you need them and you
5:08're not sort of conflating particular
5:10operations
5:11with each other.
5:12So if I drag the cells, I need to look for
5:14this visual cue of the line.
5:16If you see this line here as I hover over,
5:18that's the line you need to be looking for
5:20because that tells you that you're placing
5:23that column in between those two objects.
5:26Now we've got our sales column.
5:28To create a calculation, you can do this
5:30one of two ways.
5:31The easiest way is just to click on the
5:33column that you're probably going to use in
5:34your
5:34calculation and then create the calculated
5:37field just by clicking that blue icon.
5:39If you're familiar with Tableau, you'll be
5:41familiar with this calculation window.
5:43But if you're not, well, welcome.
5:45This is how calculations are done in Table
5:48au.
5:48Now for those of you who are Tableau users,
5:51this is going to immediately be slightly
5:53different
5:54because one thing you have to remember is
5:56that Tableau Prep is doing most of its
5:58formulas
5:59and calculations at a row level of detail.
6:02Okay?
6:03So if you instantly open this and your
6:05instinct is to type in "sum of sales", you
6:08're going
6:08to get an error because there are no aggreg
6:11ations in this calculation window.
6:13The reason being you're working at the row
6:16level of data.
6:17You're not aggregating anything just yet.
6:20There's a separate step required to do that
6:22.
6:22There's an aggregation step that aggregates
6:24columns up.
6:25But when you're working with raw data, you
6:27're working at the row level.
6:29So let's just remove that sum and leave it
6:33as so.
6:34Okay?
6:35Now, I've also got the profit for each
6:37transaction in my data set.
6:39So I'm just going to type in "profit", okay
6:43?
6:43And I'm just going to put "profit divided
6:45by sales" to give me the profit ratio.
6:48Okay?
6:49Profit ratio.
6:50Okay?
6:51Not ration, ratio even.
6:54Now the thing to bear in mind is when you
6:56've built a calculation that works, Tableau
6:59will
6:59say that the calculation is valid.
7:01There are no new notifications.
7:03If I just do that and I delete that
7:05particular step, let's say, it's going to
7:08give me an
7:08error where it thinks it perceives it.
7:10So unexpected text and it's highlighting
7:12sales because it's already had a
7:14calculation.
7:15It's not expecting another one without an
7:17instruction.
7:17So adding the division tells Tableau what's
7:21going on and now that's no longer an error.
7:25If you want to close this sort of reference
7:26pane on the right-hand side, you can click
7:28this triangle just by doing that.
7:30If you know what you're doing, you can just
7:32close that to get a little bit more real
7:33estate.
7:34But I tend to keep it open because all the
7:36functions that you're familiar with in
7:38Excel
7:38are also here.
7:39So if I just take date functions, for
7:42example, they're pretty common.
7:45And I can actually see when I click on
7:47these an example of how that calculation
7:49should
7:50be written.
7:51So it's also a really helpful way for
7:53making sure the calculations are done
7:55correctly.
7:56Let me just close this particular reference
8:00pane.
8:01I'm going to hit OK.
8:02I'm going to save.
8:04And I'm going to zoom back out.
8:06All new calculations appear on the left-
8:09hand side.
8:10So all new calculations appear on the left-
8:12hand side.
8:13And you can see here that they get this
8:15icon, which is a box with an equal sign.
8:18So you're starting to see this sort of icon
8:21ography carry through.
8:22Any time a calculated field is used, you
8:25get a box, and the icon in the box denotes
8:27what
8:27type of calculation that you've done.
8:29Okay?
8:30So that's a very basic calculation.
8:33Let's do another calculation, this time on
8:35dates.
8:36Dates are a pretty common thing in Excel.
8:38So let's go ahead and do a calculation
8:41based on that.
8:42Now, I could do a filter calculation.
8:45That would basically be a calculation that
8:48keeps dates that are true or false.
8:50So let's have a go and do that.
8:52Let's click calculation.
8:54And all I want to do is essentially create
8:57a Boolean field.
8:59Okay?
9:00Now, a Boolean field is a field which
9:02stores basically two types of data.
9:05Those two types are true and false.
9:08Yes or no, on and off.
9:10Binary fields, essentially.
9:12And so I want to create -- if I want to
9:14create a filter, it can only do one of two
9:17things.
9:18Keep the things that meet the true criteria
9:20.
9:20So let's say today is Saturday.
9:24That would, in this case, be true.
9:26So that would keep this particular video,
9:29let's say, in YouTube's context.
9:31Let's say you only wanted to look at videos
9:34uploaded on a Monday.
9:36That calculation would generate this video
9:39as false.
9:39And so therefore, this video wouldn't show
9:42up in those listings.
9:43It's the same thing here with filter
9:45calculations.
9:46You're essentially using a calculation to
9:48filter out a set of data.
9:50Now, what I also want to do is I also want
9:52to do a very basic test.
9:54If I look at my order day, I can move this
9:56around.
9:57I can see here that I've got date ranges
9:59from 2015 all the way to 2019.
10:02So what I'd like to do is filter out
10:05anything that is older than 2018.
10:09So I don't want any recent data in my
10:12dataset.
10:13Okay?
10:14So let's take this date function.
10:20And if we look down this list, you'll see
10:22that I get, obviously, the ability to make
10:24a date.
10:25Okay?
10:26And that's fine.
10:28But I'm trying to figure out in my head,
10:30well, how am I going to do this calculation
10:32?
10:32Well, what I normally do is I normally
10:35enter new row, and I basically just write
10:37the calculation in human terms.
10:38So basically, I want to check if the date
10:42for the order is greater than 2018.
10:47Okay?
10:48So that's basically how I'd do the
10:50calculation.
10:51So I already have the order date here in my
10:54view.
10:55Let me zoom in so it's easy to see.
10:57So I can absolutely go ahead and do that.
10:59But I don't have a date in my dataset that
11:02I can use.
11:03So I'm going to need to use this date
11:05function to manually type in the date.
11:08So I'm going to drag that in.
11:09I'm going to double click that in, actually
11:11.
11:11And I'm going to simply write in a date.
11:14And I'm going to use a text field to do
11:15this.
11:16So I'm just going to write 2018.
11:20I'm doing it in the format that I'm seeing
11:21here in the reference pane.
11:23Hyphen 01, hyphen 01.
11:26Okay?
11:27And I'm not going to bother with the time.
11:30It can actually make that a date time.
11:31I'm just going to keep that as is.
11:33And then I'm just going to bring that there
11:35.
11:36Now, you can see that Tableau is saying
11:37this calculation is valid.
11:39So technically, this is going to generate a
11:42true or false return.
11:44And you can see here that it says this must
11:46return a boolean value, true or false.
11:49Now, when I hit apply, we're going to get a
11:53new column.
11:54Okay?
11:54So I'm going to click save.
11:57And actually, we won't get a new column
12:01because this is a filter calculation.
12:03It's actually filtering the dataset.
12:04So if we look at our order dates, you will
12:09see that it's only kept the values that
12:13start
12:13from the 1st of January 2018.
12:15So I actually wrote this calculation
12:17backwards.
12:18If I wanted it to do the right thing, I
12:20should have actually done this and even
12:21included
12:23the 1st of January just to keep that sort
12:25of nice and healthy.
12:27Hit save.
12:28You can see that my data changes.
12:30And now you only get 2015 onwards.
12:34I have this weird anomaly here where I have
12:37one transaction from that period.
12:40But that is because it actually falls on
12:43the 1st of January 2018.
12:46So just for my OCD, if I just remove that
12:49equals, which I thought would be a smart
12:51idea,
12:52hit save, and then I shouldn't see anything
12:55else.
12:56Perfect.
12:56So we pretty much have all the values.
12:59Because of the weird way it groups the data
13:01, it technically needs this 1st of January
13:042018
13:05as a buffer stop.
13:06But it doesn't have any values from that
13:08year.
13:08It just has 2017, 2016, and 2015, as you
13:12can see here in this filter.
13:14So this aggregated view is proving really
13:16handy because I can actually see what's
13:18going
13:18on.
13:19Now, I can change the spacing in these
13:21columns.
13:21This is not something I've shown you yet.
13:23You can change this spacing just to give
13:25certain columns a little bit more room if
13:27you're working
13:28with them a lot.
13:31So we've done two calculations.
13:32We've done a simple calculation, and then
13:34we've done a filter calculation.
13:36The filter calculation doesn't appear in my
13:38columns.
13:39If I wanted that to appear, I could
13:41absolutely grab this calculation.
13:44I could copy it.
13:47And this time create a calculated field.
13:52And if I just paste that, oh, it didn't
13:54happen.
13:54Oh, well, we'll get the practice in.
13:56Let's just type this again.
13:57Filter date is less than, and I need the
14:02date function.
14:04And then I just need to do speech marks,
14:07then type in the date, 2018-01-01.
14:12That's a valid calculation.
14:14I'll call this date filtering so we can see
14:18a column name.
14:19Click Save.
14:22And there we go.
14:23Now, the challenge here is that this
14:27calculation is doing what this filter was
14:29doing.
14:30But if I want this filter to be available
14:32to the user, then I need the true and false
14:35values back in here.
14:36So let's say I want to go back a few
14:38several steps.
14:38Well, the right thing to do isn't to go
14:40back.
14:41It's just to click this Delete icon, and
14:43this will delete that filter.
14:45And you'll see my false and true values
14:48appearing.
14:49And the really cool thing here is you can
14:50actually see the highlighting going on here
14:52in this year's summary.
14:53So if I click on False, you'll see any 2018
14:55is highlighted.
14:56And click on True, then you'll see all the
14:59other values get highlighted.
15:02That's how calculations are done in Tableau
15:04Prep.
15:04They're very, very simple.
15:06You can hop into some more advanced
15:08calculations as you get more and more
15:11advanced with your
15:12calculations.
15:13But the key thing, especially for those who
15:14've used Tableau very heavily, is that
15:16aggregations
15:16are not done inside of this calculation
15:18window.
15:19Whereas in Tableau Desktop, you can do
15:21aggregations inside of the calculation
15:23window.
15:23So just be aware that that's a meaningful
15:26difference.
15:27And the reason is we're working with data
15:30at a row level of detail.
15:32Thanks for watching this video.
15:33If you found it useful, hit Subscribe.
15:35If not, drop a comment below.
15:36Let us know what kind of content you'd like
15:38to see on the channel.
15:39Otherwise, I'll catch you in the next one.
In this video we dive into formulas and calculations in Tableau Prep Builder. I take you through the nuances, of how it works, a few tips and tricks and show you how easy it is to work with your data all whilst never editing or changing your original input files. -----Join my Discord Server. https://discord.gg/shBuxXr it’s a little sparse at the moment but hang in there.