Handling MULTIPLE fact tables in Power BI - YouTube

Channel: Guy in a Cube

[0]
- Yooo! What's up, this is Patrick, from Guy in a Cube,
[3]
and in this video I wanna talk about
[5]
what do you do when you have multiple facts
[9]
in your data model?
[10]
Stay tuned.
[11]
(upbeat music)
[17]
If you're finding us for the very first time,
[19]
be sure to hit that subscribe button to stay up to date
[21]
on all the videos from both Adam and this guy.
[25]
Okay, you've connected to your data warehouse
[27]
and you look at the schema,
[29]
or you building out your own model against, you know,
[32]
some other source,
[33]
and you've realized that you have multiple tables
[36]
that contains your additive values,
[38]
things like sales amount, and the quantity sold,
[41]
and the number of enrollments and things like that.
[43]
But you have multiple tables at different grains,
[46]
and you need to bring them in at different grains.
[48]
How do you handle those in your model?
[51]
What do you do to make sure
[53]
that you can report against them effectively,
[55]
while at the same time not affecting refreshes
[58]
or introducing complicated DAX,
[60]
or missing values and things like that?
[62]
What do you do?
[63]
So in this video,
[64]
that's what I'm gonna talk about.
[65]
I'm gonna start by talking about two things I see people do
[70]
that they should try to avoid.
[71]
And at the end, I'm gonna show you what you should do.
[74]
And a lot of you guys probably are already looking
[76]
and go maybe I know, but I'm still gonna watch the video.
[79]
Alright, so enough of all this talking,
[81]
you guys know what I like to do,
[82]
let's do what, let's head over to my laptop.
[84]
So the first thing I see people do
[87]
is that they duplicate dimensions in their model,
[91]
let me show you what I'm talking about.
[92]
So let's say for example,
[94]
you have this model that has reseller sales
[97]
and internet sales, and you wanna report against them.
[100]
And you have some common things across those two tables
[104]
like date, you can see there's a date,
[106]
and then a product key.
[107]
Instead of trying to figure out how I can relate
[110]
one table to both,
[112]
what I see people do is this,
[113]
let's go to this layout
[114]
and what you'll see is they'll have a the fact table,
[117]
and then there they'll have a calendar table for reseller,
[120]
but this particular fact table and a product table
[123]
for this fact table.
[124]
And then for the other fact table for my internet table,
[126]
they'll go to Internet Sales,
[128]
and then they'll have a product table for that one
[130]
and an Internet Sales table for that one.
[132]
But why are you doing this?
[134]
Do you realize one,
[136]
you're increasing the size of your model?
[138]
Because now you have these duplicated tables?
[141]
And number two,
[142]
you could potentially slow down refreshes,
[145]
because now it's more tables that you have to refresh
[147]
that wanna refresh.
[148]
That's two things.
[149]
Number three, number three is a little more, you know,
[152]
something you might not even think about.
[153]
Let me show you.
[154]
So let's say you have a requirement
[156]
to create an element that displays both internet sales
[160]
and reseller sales.
[161]
Alright, so that works,
[162]
but you wanna look at it annually, by year.
[165]
Hmm, so I have these two calendar tables in my model,
[168]
which one do I use?
[169]
Do I use Internet Sales Calendar,
[171]
or Reseller Sales Calendar,
[173]
I'm gonna use Internet Sales
[174]
'cause it's first in the alphabet.
[176]
Hmm, something's wrong, what's wrong?
[178]
Well, if I look at this from a table perspective,
[180]
you can clearly see what's wrong.
[182]
It's repeating the total amount of reseller sales
[185]
over and over and over again,
[186]
it's because there's no relationship between
[189]
that measure, Reseller Sales,
[190]
and the Internet Sales Calendar table,
[193]
what you need to do is, take a quick copy of this,
[196]
what we would do is get rid of Reseller Sales here,
[199]
get rid of Internet Sales here and here and take the year
[202]
from the Reseller Calendar table.
[204]
Now you can see that I have these two values
[207]
in two separate tables.
[208]
But what if you wanna put them in the same table?
[210]
How would you do that?
[211]
You can't with this model.
[213]
So there's three problems with this model,
[215]
you're introducing more data,
[216]
so you're bloating the size of your model,
[217]
you're increasing the refresh time,
[219]
potentially increasing the refresh time,
[221]
and now I can't correlate values on the same axis
[224]
or using the same dimension.
[225]
I just can't do it
[226]
because they're across two different dimensions
[228]
that are completely unrelated to the corresponding table.
[231]
So what do I do, right?
[233]
So then people get really creative.
[235]
So number one, don't duplicate your dimensions.
[239]
The second thing people do is they create
[242]
what I like to call the consolidated fact table,
[245]
they pretty much stack facts on top of each other.
[247]
So if I have Internet Sales and Reseller Sales,
[250]
they'll create a view or use power query
[252]
to append the two tables together.
[254]
Let me show you what I'm talking about.
[255]
They'll create what I like to call the consolidated fact.
[258]
And so if I go over to my data view,
[260]
you'll see if I go to my all sales,
[262]
first they consolidate them and during that consolidation,
[264]
they'll add a type to each individual fact table
[267]
before the consolidation, so when I consolidate it,
[269]
I can distinguish between those two types.
[271]
So now I can see I have my Internet Sales here,
[273]
and if I sort this the other way,
[274]
I have my Reseller Sales,
[276]
and if you're paying attention,
[278]
you'll notice that I have blank values.
[280]
That's because there's no corresponding sales order number
[283]
or sales order line number, yada, yada yada
[285]
promotion key, currency key, for reseller
[288]
but if I look at Internet Sales,
[290]
there are corresponding promotions and sales order,
[293]
but there's no employee key.
[294]
So now I gotta deal with these blank values.
[297]
If I start reporting and blank starts showing up
[299]
in my reports, my end users, my report consumer is gonna go,
[301]
what the french toast is this?
[304]
Get rid of those blanks and you try to do some filtering.
[307]
There, you can get creative with DAX to omit 'em,
[310]
but why even introduce that problem?
[312]
The second thing is, now you got this huge table
[315]
that can take a long time to refresh,
[318]
you don't wanna introduce, you know,
[320]
something that could potentially cause problems
[322]
in your models like increasing your refresh, right?
[324]
Another thing is that when you write your measures,
[327]
you will have to add an additional filter on the measures,
[330]
not that this is a huge problem,
[331]
it could cause a huge problem if you start
[333]
if you have a lot, a lot of data in your model.
[335]
But you'll have to do this for every measure,
[336]
every time you wanna create a measure
[338]
that is corresponds to Internet Sales,
[340]
you'll have to add this filter.
[341]
Every time you wanna create one
[342]
that goes to Reseller Sales, you'll have to add this filter,
[344]
if you don't, if I have a table that looks like this,
[347]
and I just do a sum of total sales,
[349]
and I add it here,
[351]
what's gonna happen is I'll get this blank value
[353]
I'm like why why do I get this blank value?
[355]
Well, then they'll go, oh, Patrick,
[356]
I can use this cell type and what I can do is
[359]
use it as a filter,
[360]
and then I can toggle between Internet Sales
[362]
and Reseller Sales, now my Internet Sales is gone.
[365]
You can add this on other things in the in reports also.
[368]
But why jump through those unnecessary hoops?
[371]
Somebody tell me why would I do that?
[374]
You wouldn't, you don't need to do that, right?
[376]
So duplicating dimensions, bad idea.
[380]
Consolidating fact tables, another bad idea.
[384]
Patrick, what should I do?
[386]
It's really simple.
[387]
If you have someone that's building a data warehouse,
[389]
just follow the schema that you get they gave you,
[392]
they're probably a really good data warehouse developer,
[394]
that's why you hired 'em.
[395]
And so use the data warehouse that they develop.
[397]
A lot of times I see people with data warehouses,
[399]
and then they adopt these patterns I have no idea why
[402]
if your data warehouse doesn't do it,
[404]
then you need to probably get a better
[406]
data warehouse developer,
[407]
or at least shape the model a little better.
[409]
And let me show you how you shape your model.
[411]
So let's go to this model.
[412]
Let's create a new layout.
[413]
So I have two fact tables.
[415]
I have my Internet Sales,
[416]
and I have my Reseller Sales,
[418]
they both have a date,
[420]
what do you think I should do?
[421]
Use a consolidated date table that's related to both.
[425]
The same thing for product,
[428]
a central product table as a one-to-many relationship
[431]
with both.
[432]
Any dimensions that's just specific
[434]
to that particular fact table like employee, for example,
[438]
there should only be a relationship
[440]
between Employee and Reseller Sales.
[441]
If I do that consolidated table, then it's related to,
[444]
it doesn't matter, it's gonna relate to Internet Sales
[446]
and Reseller Sales, because they're in the same table.
[448]
Now, what I can do is,
[450]
remember the visual that I was creating earlier,
[453]
if I drop Internet Sales and Reseller Sales here,
[456]
and then I use my centralized calendar, boom, all right,
[459]
now I can see that I was selling some stuff
[461]
through resellers for a couple of years,
[463]
and then it just completely fell off.
[465]
But now you may be thinking, well, Patrick,
[467]
how are they gonna know what to, you know,
[469]
filter and aggregate by,
[470]
because I could also make the mistake of bringing
[473]
my employee name and drop in my Internet Sales right here
[476]
and there's no relationship.
[477]
What I like to do is I give my report consumers
[480]
and my authors, my report authors a little hint
[484]
and on those corresponding tables,
[486]
I write a little description that says,
[488]
hey, you can group by these tables.
[490]
I put it on the table,
[491]
and I put it on the measure.
[493]
So now when they're over in the Report View design,
[496]
in the report, they do a quick hover,
[498]
and they'll say "Oh, wait, I can't use employee
[500]
against none of these tables.
[502]
Oh, let me hover over reseller I can for reseller."
[505]
Okay, okay, great idea,
[507]
so I'm gonna get rid of internet sales.
[508]
I probably shouldn't be looking at this anyway.
[510]
And it just works.
[511]
So use the model, how it was designed.
[513]
Use your conformed or shared dimensions,
[516]
whatever you wanna call them,
[517]
connect them to both facts,
[518]
and then you can correlate to values in a single visual.
[521]
Okay, what do you guys think?
[523]
Are you running, have you ran into anything
[524]
like this before, you solved it a different way?
[527]
I love to know.
[528]
You know what to do, let's continue the conversation where?
[530]
In the comments below.
[532]
It's your first time visiting the Guy in a Cube Channel,
[533]
hit that Subscribe button, like my video,
[536]
big thumbs up.
[537]
As always, from Adam and myself,
[539]
thanks for watching.
[540]
See you in the next video.