Fact Table vs Dimension Table: Difference Between Fact Table and Dimension Table
By Rohit Sharma
Updated on Aug 23, 2023 | 9 min read | 7.3k views
In the simplest terms, a fact table is the central table in a data warehouse or business intelligence system. It stores quantitative data such as measurements, metrics, or facts related to a particular business process or event. A fact table typically has two types of columns: foreign keys, which link to the dimension tables, and measures, which hold the values to be analysed.
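As a minimal sketch of that structure, the snippet below uses Python's built-in sqlite3 module to create one fact table whose foreign keys point at two dimension tables. The table and column names (fact_sales, dim_product, dim_customer) are assumptions made for illustration, not a prescribed design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Hypothetical dimension tables: descriptive context, one row per entity.
conn.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT)")
conn.execute("CREATE TABLE dim_customer (customer_id TEXT PRIMARY KEY, customer_name TEXT)")

# Hypothetical fact table: foreign keys link to the dimensions,
# and the remaining columns hold the numeric measures.
conn.execute("""
    CREATE TABLE fact_sales (
        order_id    INTEGER PRIMARY KEY,
        product_id  INTEGER NOT NULL REFERENCES dim_product(product_id),
        customer_id TEXT    NOT NULL REFERENCES dim_customer(customer_id),
        quantity    INTEGER NOT NULL,   -- measure
        total_sales REAL    NOT NULL    -- measure
    )
""")

conn.execute("INSERT INTO dim_product VALUES (101, 'Laptop')")
conn.execute("INSERT INTO dim_customer VALUES ('C001', 'Example Customer')")
conn.execute("INSERT INTO fact_sales VALUES (1001, 101, 'C001', 2, 95.0)")

# A fact row that references a missing dimension row is rejected.
try:
    conn.execute("INSERT INTO fact_sales VALUES (1002, 999, 'C001', 1, 30.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

fact_rows = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
print(orphan_rejected, fact_rows)
```

Many warehouses skip enforced foreign keys for load speed, so treat the constraint here as a way of making the link visible rather than a requirement.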
Fact tables are important because they support the analysis and reporting of business performance by storing granular data that can be analysed across different dimensions. Common examples of fact tables include inventory levels, website traffic metrics, financial data, and sales transaction data.
Below are some of the key characteristics of fact tables in data warehousing and business intelligence systems.
By adhering to these characteristics, fact tables can efficiently store the quantitative data required for analytical processing, which makes them an important tool for data warehousing.
The granularity of a fact table is a crucial component of data analysis and reporting, as it defines the scope and accuracy of the insights that can be obtained from the data.
Simply put, granularity refers to the level of detail or specificity at which individual events or transactions get recorded in the fact table. When designing a fact table, granularity is the foremost factor that needs to be addressed.
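To illustrate, here is a small sketch with made-up sales rows. The same facts are stored at transaction grain (one row per order) and then rolled up to a coarser daily grain (one row per date and product). All field names and values are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts at transaction grain: one row per order.
transaction_grain = [
    {"date": "2023-08-01", "order_id": 1, "product": "Laptop", "amount": 950.0},
    {"date": "2023-08-01", "order_id": 2, "product": "Laptop", "amount": 900.0},
    {"date": "2023-08-01", "order_id": 3, "product": "Headphones", "amount": 58.0},
    {"date": "2023-08-02", "order_id": 4, "product": "Headphones", "amount": 60.0},
]

# Rolling up to a coarser grain: one row per (date, product).
daily_grain = defaultdict(float)
for row in transaction_grain:
    daily_grain[(row["date"], row["product"])] += row["amount"]

print(len(transaction_grain), len(daily_grain))   # 4 rows become 3
print(daily_grain[("2023-08-01", "Laptop")])      # 1850.0
```

The roll-up only goes one way: once the two laptop orders are stored as a single daily total, the individual order amounts cannot be recovered, which is why the grain is chosen before anything else.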
Settling the granularity usually involves two crucial steps,
In contrast to a fact table, a dimension table stores the attributes or characteristics of the data in the fact table. The information is descriptive in nature and provides context for the numeric data stored in the fact table.
There are different types of dimensions in a data warehouse. Some of the most commonly used include slowly changing dimensions, junk dimensions, role-playing dimensions, and shrunken dimensions, among others.
Like fact tables, dimension tables are an integral part of the star schema and snowflake schema data modelling techniques widely used in data warehousing. A dimension table usually has a column that serves as its primary key, allowing each dimension row or record to be uniquely identified.
Let’s explore the key characteristics of a dimension table.
The fact table and the dimension table are two important components of the dimensional model widely used for data warehousing. On that note, here are a few key points of difference between fact and dimension tables.
Fact Table | Dimension Table |
The primary purpose of a fact table is to record quantitative or numeric data and facts about a business process. | A dimension table stores descriptive attributes or characteristics related to the data in the fact table. |
A fact table holds more records than a dimension table. | A dimension table usually holds far fewer records than a fact table. |
Fact tables tend to be large because they store a vast amount of numeric data. | In comparison, dimension tables are usually much smaller because they do not contain detailed numeric data. |
It is mainly used for analysis and decision-making purposes. | It mainly stores descriptive information about the business and its processes. |
Fact tables do not have any hierarchical structure. | Dimension tables can have a hierarchical structure, with attributes organised into levels to facilitate drill-down and roll-up analysis. |
Now that you have a clear understanding of the notable differences between fact and dimension tables, let’s look at the different types of facts that can be captured in the dimensional model.
Facts can be categorised into various types depending on the nature or characteristics of the data they represent. Some of the most common types of facts include,
Additive facts are those that can be aggregated across all dimensions of a fact table. Summing them along any dimension, or any combination of dimensions, gives a meaningful total. A few examples of additive facts are sales revenue, total cost, and quantity sold.
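A quick sketch of what additivity means in practice, using made-up rows: the quantity sold can be summed by region or by product, and both routes lead to the same grand total. The region and product values are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quantity_sold).
facts = [
    ("North", "Laptop", 2),
    ("North", "Headphones", 3),
    ("South", "Laptop", 1),
    ("South", "Headphones", 4),
]

def total_by(rows, key_index):
    """Sum the quantity measure, grouped by the column at key_index."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)

by_region = total_by(facts, 0)    # {'North': 5, 'South': 5}
by_product = total_by(facts, 1)   # {'Laptop': 3, 'Headphones': 7}
grand_total = sum(row[2] for row in facts)

# Whichever dimension we group by, the sums agree with the grand total.
print(sum(by_region.values()), sum(by_product.values()), grand_total)
```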
In contrast to additive facts, non-additive facts cannot be aggregated at all, or can be aggregated only under specific conditions. They represent measurements that cannot be meaningfully summed across dimensions. A few examples are percentages, ratios, and averages.
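The sketch below shows why a ratio such as profit margin should not simply be averaged. It uses two hypothetical stores of very different size; a common workaround, assumed here, is to keep the additive components (revenue and cost) and recompute the ratio after summing them.

```python
# Hypothetical rows: (store, revenue, cost). Margin is a ratio, so it is non-additive.
rows = [
    ("A", 100.0, 50.0),     # margin 0.5
    ("B", 1000.0, 900.0),   # margin 0.1
]

# Naive approach: average the per-store margins.
margins = [(revenue - cost) / revenue for _, revenue, cost in rows]
naive_margin = sum(margins) / len(margins)

# Safer approach: sum the additive components first, then compute the ratio.
total_revenue = sum(revenue for _, revenue, _ in rows)
total_cost = sum(cost for _, _, cost in rows)
overall_margin = (total_revenue - total_cost) / total_revenue

print(round(naive_margin, 4), round(overall_margin, 4))
```

The naive figure overstates the margin because the small store counts as much as the large one.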
Factless fact tables are tables in a data warehouse that capture no measures. They only record that an event occurred, without any numeric data attached. For example, a factless fact table might contain only a date key and a product key, with no measure columns.
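Even without measures, such a table can still be analysed by counting its rows. The sketch below assumes a hypothetical attendance table that holds nothing but keys.

```python
from collections import Counter

# Hypothetical factless fact rows: (date_key, student_key, class_key).
# There is no measure column; each row only says "this event happened".
attendance = [
    (20230801, "S1", "MATH"),
    (20230801, "S2", "MATH"),
    (20230801, "S1", "PHYS"),
    (20230802, "S1", "MATH"),
]

# The only available "measure" is the row count, grouped by any key.
per_class = Counter(class_key for _, _, class_key in attendance)
per_day = Counter(date_key for date_key, _, _ in attendance)

print(per_class["MATH"], per_class["PHYS"], per_day[20230801])
```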
Snapshot facts store the state of a business process at a specific point in time. They are so called because each row represents a momentary snapshot of the data, such as daily sales, monthly inventory levels, or weekly website traffic.
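As a rough illustration, the sketch below stores hypothetical daily inventory snapshots and reads the state on the latest date. One caution that goes slightly beyond the description above: values like stock on hand can be summed across products on a single date, but adding them up across dates generally produces a number with no business meaning.

```python
# Hypothetical daily inventory snapshots: (date, product, units_on_hand).
snapshots = [
    ("2023-08-01", "Laptop", 40),
    ("2023-08-02", "Laptop", 35),
    ("2023-08-03", "Laptop", 38),
    ("2023-08-01", "Headphones", 100),
    ("2023-08-02", "Headphones", 90),
    ("2023-08-03", "Headphones", 95),
]

def units_on_hand(date):
    """Total stock across all products on one snapshot date."""
    return sum(units for snap_date, _, units in snapshots if snap_date == date)

latest_date = max(snap_date for snap_date, _, _ in snapshots)
latest_total = units_on_hand(latest_date)

# Summing across dates counts the same stock several times over.
summed_over_time = sum(units for _, _, units in snapshots)

print(latest_date, latest_total, summed_over_time)
```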
A data warehouse uses different types of dimensions to organise and describe the data in its fact tables. Some of the most commonly known dimensions are,
A normal dimension contains attributes related to a single logical entity. It has a business key, and all of its attributes depend on the surrogate key. For example, a customer dimension might include customer ID, name, location, and age.
A junk dimension contains no business key and is typically used to group boolean or binary attributes. It combines multiple indicators into a single dimension table and helps to reduce the number of dimension tables in the data warehouse and improve query performance. The attributes in a junk dimension are usually at the transaction level.
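One way to picture a junk dimension is as a small table holding every combination of a few yes/no indicators, with each combination given its own surrogate key. The flag names below are hypothetical, and pre-building all combinations is only one possible approach.

```python
from itertools import product

# Hypothetical boolean indicators that would otherwise each need a column or table.
flags = ("is_gift", "is_online", "used_coupon")

# Build the junk dimension: one row per combination, keyed by a surrogate key.
junk_dim = {
    key: dict(zip(flags, combo))
    for key, combo in enumerate(product((False, True), repeat=len(flags)), start=1)
}

# Reverse lookup used at load time to find the key for a transaction's flags.
lookup = {tuple(row[flag] for flag in flags): key for key, row in junk_dim.items()}

# A gift order placed in store with a coupon maps to a single key in the fact row.
junk_key = lookup[(True, False, True)]
print(len(junk_dim), junk_key, junk_dim[junk_key])
```

Three flags give only eight rows, so the fact table carries one small key instead of three separate indicator columns.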
A split dimension, as the name suggests, is one that has been split into multiple tables to reduce data redundancy and improve data management. It is used when a dimension is expected to be very large, e.g., 20 million rows. Dividing it into multiple smaller tables makes the data more manageable and more efficient to query.
A text dimension usually contains large amounts of textual data such as comments, descriptions or notes. It allows for a more detailed analysis of text-based information. The text dimension is especially useful when hierarchies or relationships exist between different dimensions.
A stacked dimension is one where multiple related dimensions are combined into a single table. It allows for simplification of the data model and makes it easier to navigate and analyse the data.
Below is a small example illustrating the difference between fact and dimension tables.
Order ID | Product ID | Customer ID | Quantity | Price | Discount | Total Sales |
1001 | 101 | C001 | 2 | $50 | $5 | $95 |
1002 | 102 | C002 | 1 | $30 | $0 | $30 |
1003 | 103 | C003 | 3 | $20 | $2 | $58 |
This fact table records three different sales transactions. It holds quantitative data such as the quantity sold, unit price, discount applied, and total sales for each transaction.
Product ID | Product Name | Category | Brand |
101 | Laptop | Electronics | ABC Inc. |
102 | Smartphone | Electronics | XYZ Corp |
103 | Headphones | Accessories | DEF Tech |
Here, we have information related to the various products sold by the company. Each row represents a unique product and includes attributes such as product name, category, and brand.
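To see the two example tables working together, the sketch below loads them into an in-memory SQLite database and joins them on Product ID to total the sales by category. The table names fact_sales and dim_product are assumptions; the rows are the ones shown above, with the dollar amounts stored as plain numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_product ("
    "product_id INTEGER PRIMARY KEY, product_name TEXT, category TEXT, brand TEXT)"
)
conn.execute(
    "CREATE TABLE fact_sales ("
    "order_id INTEGER PRIMARY KEY, product_id INTEGER, customer_id TEXT, "
    "quantity INTEGER, price REAL, discount REAL, total_sales REAL)"
)

conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?, ?)", [
    (101, "Laptop", "Electronics", "ABC Inc."),
    (102, "Smartphone", "Electronics", "XYZ Corp"),
    (103, "Headphones", "Accessories", "DEF Tech"),
])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1001, 101, "C001", 2, 50.0, 5.0, 95.0),
    (1002, 102, "C002", 1, 30.0, 0.0, 30.0),
    (1003, 103, "C003", 3, 20.0, 2.0, 58.0),
])

# The fact table supplies the numbers; the dimension supplies the grouping label.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.quantity), SUM(f.total_sales)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(sales_by_category)
```

The category never appears in the fact table; it is reached only through the Product ID key, which is the division of labour the comparison above describes.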
Hopefully, you now clearly understand the key differences between dimension tables and fact tables in a data warehouse. To sum up, both components play crucial roles in organising and storing data for efficient analysis and reporting.
Together they bring advantages ranging from performance optimisation and data organisation to query efficiency and simplified reporting. They enable business enterprises to harness the power of data-driven decision-making for improved business performance.
If you wish to learn more about data science topics such as fact and dimension tables in Power BI, check out the MS in Data Science program offered by Liverpool John Moores University in collaboration with upGrad. This 18-month course is tailored for IT professionals and sales experts who wish to venture into the vast, dynamic world of data science.