How to create a date scaffold in Tableau using Relationships
Here's how to scaffold dates and count patients per day in Tableau, with no data prep beyond one tiny Excel file.
- You can build a date scaffold without Alteryx or Tableau Prep by creating a single Excel file containing one row per day across the full range you need
- Choose an arbitrary start year that doesn't exist in your real data to avoid any date conflicts when generating the scaffold
- Make the date scaffold your primary data set when you want to visualise a value across every day of a range
- Define the relationship with two conditions so each scaffold date falls between the start and end date (date >= start date and date <= end date)
- Relationships blend the data at query time rather than via a join, so you avoid an exploded data set, then use a distinct count of ID to count patients per day
0:00Hey, Tim here. In today's video, I'm back
0:01with another challenge. I actually got
0:03asked this a
0:04couple weeks ago. And the question was this
0:06, and the user wanted to essentially find a
0:08way
0:09of showing a visualization that showed the
0:11number of patients in a hospital at any
0:13given day over a
0:15period of time. But their data set only had
0:18this setup. So they had a ID for the
0:20patient, they had
0:22a start date for the date in which the
0:23patient started treatment. They had another
0:26column,
0:27which was the day which treatment ended.
0:28And then on the final column, they had some
0:31sort of
0:32location identified to tell you where the
0:34hospital was. In this case, I've just gone
0:36with state.
0:37And so with this particular setup, they had
0:39a couple of requirements, you couldn't do
0:41data
0:41prep. So I can't use something like ortrix
0:43or tablet prep to do this. And they wanted
0:45to be
0:45able to show the number of patients in any
0:48given day, from basically the minimum date
0:51to the
0:52maximum date, essentially the range of the
0:54entire data set. And the range really
0:56involves the start
0:57date for, you know, the first patient they
1:00saw to the end date of the last patient. So
1:03we don't
1:03really even know that because it's really
1:05hard to see that in this data set. So I
1:07could probably
1:08sort this, but I'm not going to do this
1:09here in Excel, we're going to do it in
1:11Tableau and try
1:11and figure it out. So I thought a bit about
1:13this and actually realize, hey, we can
1:15probably do some
1:16date scaffolding here with the
1:18relationships and data models. So I kind of
1:20had to go and actually
1:22worked out. So that's why I'm making a
1:23video about it, because I think it's
1:24another nice way
1:25you can do scaffolding without having to do
1:28really any data prep other than one small
1:30step,
1:31which I'm going to show you in this video.
1:33Let's get stuck in. Okay, so what I want to
1:35do before
1:36I get stuck into this is I just want to
1:38sort of remind you what this data looks
1:39like. So you've
1:40got the ID, the start, the end and the
1:43state. What I really need to do is for if I
1:46just take this
1:47patient, patient number one here, let me
1:49just zoom in to make it clearer. What I
1:50need to do is that
1:51for every day in between the, I think this
1:54is an American data, so the 1st of February
1:582021,
1:59up until the 11th of July 2021, I need that
2:04patient represented on every day
2:07essentially
2:08in that record. And so the way I thought of
2:11doing this is actually quite a simple way.
2:13What I wanted
2:14to do is create another excel file. So what
2:17I did is I just go into head and did this.
2:19This is the
2:20only step I did that wasn't strictly in
2:22Tableau. And I just basically gave it an
2:24arbitrary date.
2:26So I just thought 01, 4 times 01. And what
2:29I did just to avoid any conflicts
2:31whatsoever is I just
2:32chose a year that definitely wasn't in the
2:35data set. So I said 2017. Okay. And not 2
2:38117. So 2017.
2:40Here we go. And essentially, Excel wouldn't
2:43recognize this as a date. So let's go ahead
2:46and
2:46just call this date. Okay. And then what
2:49you can do is if you just go ahead and
2:51start adding elements
2:53to this, you'll see that you get a day
2:54increment. So it's automatically starting
2:57to work by the day.
2:58So what I'm going to do here is I'm just
3:00going to keep going all the way down. I'll
3:02just drag this
3:03all the way down. You can see I can see the
3:05date on the very right hand side. And I
3:07just held this
3:08until I got to 2021. Actually, we want a
3:11few more dates. So this is going to be
3:13quite a big data set.
3:14So we'll say we'll stop at 30th of December
3:172022. That's actually my birthday. So we've
3:21got the dates
3:21filled out all the way until 2022. Then I
3:25save this as a simple file. So let's go
3:27ahead and save
3:27this. And we will call this I will put this
3:30in the dates folder, which is where I'm
3:32working.
3:33We'll call this date scaffold. Okay. And by
3:36doing this, all we're doing is we're
3:38basically creating
3:40a day on every single day, right through to
3:43the 30th of December 2022. Okay, now that
3:47we've done
3:47this, we're pretty much done with Excel.
3:49And we've got our original data set still
3:51untouched, we're
3:52going to close this. And we're going to
3:54save it, make sure that it's got the right
3:56formatting.
3:56And I'll leave that as a CSV. And we'll
3:58just leave that in there. And we'll go to
4:01Tableau. So let's
4:02go ahead and open Tableau. I'll open the
4:04latest version of Tableau just so we've got
4:06this working
4:06correctly. Okay, Tableau is now open, I'm
4:09going to connect to my data. Now, it's
4:11really important
4:12that you think about the data model in this
4:14case, because what I essentially want to do
4:17is I want to
4:17define a relationship that looks at the
4:20start date, and the end date and creates a
4:23relationship
4:24between those two things. So what I'm going
4:26to do is essentially I'm going to connect
4:28to the CSV
4:29file, which in this case is a text file. So
4:31let's go ahead and connect to the CSV file,
4:33you'll see
4:34that the date example comes up here. And
4:36there we go, we've got the dates, because I
4:38'm using the
4:39latest version of Tableau, this interface
4:42is slightly more modern and new. And I can
4:44actually go
4:45into a particular sheet. And I can just
4:47look at the dates and see where they start
4:50and end. So you
4:50can see the date start in 2020. And the end
4:53in 2022. I can of course, choose more
4:56detailed date
4:56parts here. So I get data all the way up
4:59until May 2022, which is good for the end
5:01date. And for the
5:02start date, let's go ahead and expand this.
5:05I'll remove these out the way. And we can
5:07expand this
5:08one more time, I get data from October to
5:11October 2020, to 2021. So that's all fine.
5:15Now for the
5:16actual scaffold, we need to think about
5:18this more carefully. So let's go back, go
5:21into our data
5:22model. And when I'm doing this, what I
5:24actually want to do is make this my
5:26secondary data set. Now,
5:28in all honesty, it doesn't actually matter
5:30which way around you define this, I could
5:33make this my
5:33primary data set. But I always find it
5:35easier to write logic when you have what's
5:38considered your
5:39primary data set in the right place. And so
5:41because I'm trying to visualize the number
5:44of patients on
5:45each day across the entire data set, what I
5:48actually need to do start with my scaff
5:50olding
5:51data set. So I'll go ahead and add that as
5:53another connection. So this was I believe,
5:55an Excel file,
5:56they get dates scaffold. And you can see
5:59that essentially, I'm using the connection
6:01capabilities
6:02to connect to a second file. So the first
6:04file is my text file, this could be your
6:06database.
6:06And this second file is an Excel file,
6:08which is where I actually created the file.
6:10And you can see
6:11it's just got the dates all the way down.
6:13So unfortunately, I didn't rename this. So
6:15what we
6:15can do is click on this drop down, and we
6:18can just say dates, scaffold, okay. And now
6:22we want to go
6:23and get the main date, and we want to bring
6:25it in. So this is our date example data set
6:27, we're going
6:28to bring it in. And this is where we kind
6:30of have to start getting creative, because
6:32this relationship
6:33needs to do a couple of things for us. For
6:35every day that a patient is technically in
6:38the hospital,
6:39we need to create a relationship with the
6:41date scaffold. And what we essentially need
6:44to do is
6:45we need to take a day in the date scaffold,
6:48and see if that date falls between the
6:50start and the
6:51end date. And so the relationship basically
6:54needs to create a relationship with all
6:57data items where
6:58two conditions are met. Number one, the
7:02start date is greater than or equal to the
7:05date scaffold.
7:07And then secondly, the end date is less
7:11than or equal to the end date, or the date
7:14scaffold
7:15essentially, sorry. So those are the two
7:17conditions we need to write. So I'm
7:19confusing people 100%.
7:20So let's go ahead and write this. So first
7:22one, we're going to choose a date. And we
7:24're going to
7:25say greater than or equal to the start date
7:29. Okay, so that is going to give us one set
7:34of relationship
7:35items. But it's only capturing the start,
7:37what we now need to do is to capture the
7:39end. So let's go
7:40ahead, add another field, we're going to
7:43get the same date again. And what we're
7:45going to say is
7:46less than or equal to the end date. Okay.
7:50So for each record, essentially, Tableau is
7:54checking this
7:54relationship logic. And if it finds the
7:57data, it's actually going to blend these
8:00two dates together.
8:01But it does it at query time, it doesn't do
8:03it through a join. So we don't get this
8:05massive
8:06exploded data set, we just have two data
8:08sets. So you can see our data set here on
8:11the left.
8:11And if I click on the date scaffold, you
8:13can see the other one also working
8:15completely fine. So my
8:17relationship has been defined, and we're
8:19pretty much ready to go. So that's all I
8:21need to do. Now
8:22I need to just go to sheet one. And we can
8:24start to test this out a little bit. Okay.
8:27So the first
8:27thing I'm going to do is just bring in the
8:29date, okay. And when I bring in the dates,
8:31you see that
8:32we get this now what I want to do is
8:34actually set this down to the month. And I
8:36'm going to put this
8:37on the columns just say we get this sort of
8:39nice chart. And just so we absolutely clear
8:42, if I bring
8:42the count of the date scaffold, you can see
8:45we have a pretty consistent number of days
8:47in every
8:47month. So that shows you that this date
8:50scaffold is really working every few
8:51February's, you can
8:53see here we have 2828. And then we have 29,
8:56and so on and so forth. So those are your
8:58leap years,
8:58everything is working pretty good. So let's
9:01remove this count of dates. And let's go
9:03back to what we
9:04wanted to do. Now for the next one, what we
9:07need to do is count how many patients are
9:10in on each
9:10of these days. So I can actually just bring
9:13the ID, I'm going to drop it here. And you
9:15'll notice
9:16that as soon as I did that, the date range
9:18here suddenly shrunk. And that's because
9:21our
9:21relationship is kicked in. It's now only
9:23bringing in data where both things are met,
9:26I could ask it
9:26to show me the nulls on either side, but it
9:28's not doing that. It's just showing me the
9:30range. And
9:31because these IDs aren't really that
9:33descriptive, the detail doesn't seem to
9:34work here, what I
9:35really need to do is to count the number of
9:38patients. So what I need to do is do a
9:40distinct
9:40count of the particular ID on each day. So
9:44let's go ahead and do that. And you can see
9:48that it's
9:48done the count, okay, it is actually there,
9:50you can see that it's all coming up as zero
9:52. But let's put
9:53this in on rows. And let's see what happens
9:55. Okay, so now you can see it's starting to
9:57work.
9:58You can see that it's 0000000. And then we
10:01have a number, and it climbs up and it goes
10:04down again.
10:05Okay, so this is starting to work. And
10:07because we've used a month as a primary
10:09data set, it only
10:10is going to kick in here when we actually
10:12have data for the first one. So remember,
10:14we said the
10:15data started in October 2020. And our last
10:18patient left the system on May of 2022,
10:21which we validated
10:22at the beginning of this. So now what I can
10:24do, let me just do something really simple,
10:26which I'll
10:27just I'll just exclude all these, I'm going
10:29to do in a very crude way, just to get rid
10:31of all this
10:31stuff, because we don't need all of this in
10:33our view, it's going to just make it more
10:35complex.
10:36And so you can see here that we've
10:37basically answered the question we have
10:39three of the
10:41highest number, we have 355 patients in our
10:44system. And what I can now start to do is
10:47just visualize
10:47this. So I can just break this down by
10:49state as in a simple example. And I can
10:51immediately see which
10:52states have the highest number of patients
10:54at any given time. So you can use this
10:56technique to solve
10:58a problem where you're trying to visualize
10:59things like support desk data, or you're
11:01trying to
11:02visualize how many people were in a
11:03particular system at any one time, I haven
11:06't done any complex
11:07scaffolding whatsoever outside of the Excel
11:09file that I created to give us a range of
11:12dates.
11:12And even that was very basic, I basically
11:15chose a start date, I chose an end date
11:17that was ridiculous.
11:18And then we put that into our data set to
11:20work with. Now, if you were trying to be a
11:22little bit
11:22more consistent, what you might do to that
11:24data set is expand it a lot more. But just
11:27keeping
11:27things simple, that's all you really need
11:29to do. So here, we've got everything set up
11:31, we've answered
11:32the question, and you're pretty much good
11:34to go. And it works just as fast, just as
11:36simple as if
11:37you'd done this in data prep, and it was
11:39all clean to get to get. Now, what you
11:41could then do is you
11:43could then build a visualization, say, well
11:45, look, which patients were in the hospital
11:47on this
11:47particular day, you could build another
11:49dashboard and click on that, and it would
11:51send you to see
11:52all the patients in the system and any on
11:54that day in this particular hospital, and
11:56you'd be able to
11:56sort of get that to work in a really easy
11:58way. But that's not what this video is
12:00about. I just wanted
12:01to show you this very simple scaffolding
12:03method that doesn't use any data prep apart
12:06from that one
12:06Excel file, and allows us to use the
12:08relationship and the data model to create
12:11something super
12:12interesting. If you've enjoyed this video,
12:14you know what to do. Thanks for watching,
12:16and I'll catch you in the next video.
Relationships in tableau open up new possibilities with scaffolding. In this video, I show you how to manipulate the way relationships work to build scaffolds in Tableau. In this example I show a typical use case you’ll see on support desk data or patient records in Tableau where you need to be able to display the daily total of open tickets or patients in treatment. This technique doesn’t require any table calcs and is record/row specific because of the way relationships work.
0:00 - Intro 1:34 - Setting up our scaffold 4:07 - Checking our main data set 5:17 - Setting up our data model to scaffold 8:20 - Checking our scaffold has worked