đ
Datawarehousing Concepts Basics (Fact and Dimension Table) - YouTube
Channel: Tech Coach
[0]
HI guys my name is Vivek and today I will
be explaining you the data warehousing concepts
[5]
in a simplified manner
Before we go ahead letâs understand what
[9]
do we mean by a data warehouse ?
Data warehouse is a system which is used for
[15]
reporting and data analysis.
[17]
From this data analysis we derive some logical
conclusions that helps us in improving our
[23]
businesses.
[24]
We will be explaining the entire concept with
a simple example today.
[29]
Mark who is the CEO of an ecommerce website
is willing to increase his profit by streamlining
[35]
the process
[36]
This is the basic model of seller point
Seller points has 3 components customers,
[44]
products and courier services
[45]
So these customers can buy these products
and these products are shipped with these
[51]
Courier Services
So letâs take an example Vivek can buy a
[55]
shoe from boot makers and that will be shipped
by speed fast.
[62]
Mark is analyzing the data for last one month
to identify which products or which customer
[70]
has made money for him and what are the reasons
why he has incurred loss.
[75]
So based on that data analysis he will come
up with some logical conclusions to improve
[81]
his business line.
[83]
Letâs analyze the data as I just mentioned
we have three entities customers products
[90]
and careers and these are the details of the
item that have been sold say over the last
[96]
one month.
[97]
So if you see we have certain components that
have been highlighted in red these are the
[102]
items that have been returned an item that
has been returned has made no profit for the
[107]
company rather it has incurred losses and
the reason being an item that was returned
[115]
means it was shipped and then it was sent
back
[119]
That means it didnât make any money for
the company,so what are the reasons why these
[124]
items were returned
For the 1st 2 items the customer said he didnât
[129]
like the product,
Vivek is the customer who bought a shirt and
[132]
jeans and he said that he didn't like the
product also note that the Delivery status
[139]
for this particular jeans was delayed.
[143]
The other two items, the reason for return
were âThe items were damagedâ
[149]
These items were shipped by batman couriers,
That means 2 items that were delivered by
[155]
batman couriers were damaged also if you see
2 items that were delivered by local couriers
[160]
were delayed.
[163]
Based on this we can come up with some logical
conclusions like
[167]
Few items were delivered late
Few products were not sold at all
[171]
Vivek returned 2 out of his 3 items
And batman couriers delivered 2 broken items
[178]
out of 3
Obviously mark can come up with logical conclusions
[183]
with the data we just saw
[184]
He can remove the batman couriers services
He can remove the unsold product line with
[190]
new similar products
He should warn the local couriers service
[193]
to expedite their courier process
He should remove cash on delivery for vivek
[198]
as he is returning majority of his items
And he should give an additional discount
[203]
to neeraj on his next purchase as his previous
experience was not smooth.
[209]
This will save him the name of the company
because neeraj will not give him a bad review
[213]
in case if he gives him an additional discount.
[218]
If we want to understand the entire concept
in terms of data warehousing itâs very simple
[225]
The 3 entities customers, products and courier
services are dimensions
[230]
And the transaction table that we saw is the
fact table.
[235]
The fact table has keys of all the dimensions
and some additional columns we call measures
[243]
A fact table has the primary key of all the
associated dimensions and the measures.
[250]
So measures are used to evaluate the profit
and loss game
[253]
For example here the measure we can calculate
which product made me how much money
[262]
Dimensions are the true driving table as they
provide descriptive information for the measurements
[267]
recording in the fact
The vital thing to remember in a data warehouse
[271]
is that the dimensions will always be loaded
first.
[278]
Because you canât sell a product that doesnât
exist.
[281]
Also if you see dimensions are relatively
small in size because a customer can buy multiple
[288]
products.
[289]
Which means vivek he can buy 10 items and
that will be 10 entries in the fact table
[296]
but he still is a single customer
[298]
In a data warehouse 80% of the data usually
is consumed by a fact table
[305]
As I already explained the objective of data
warehouse is to leverage the data for precise
[310]
and effective decision making .
Also the data in warehouse is persistent.
[315]
If I try to explain the data flow of data
for reporting, First of all the data enters
[323]
in the OLTP system from there it goes to the
integration layer between OLTP and the data
[329]
warehouse which is ods (Operational data store).
[333]
From the data warehouse the data goes to data
mart where the reporting is usually done.
[339]
So we still have some open questions
[341]
What is an OLTP system ?
What is ODS?
[345]
What is data mart?
[346]
Why the reporting is not usually done on OLTP
or ODS and is done on warehouse or data mart?
[354]
What kind of reporting tools are being used
?
[356]
All these answers I will give in my next tutorial
You can go back to the homepage right here: Homepage





