A data warehouse and a data mart are both data repositories and serve the same broad purpose. They differ mainly in the quantity and scope of the data they store. The key difference is that a data warehouse is a database that stores information oriented towards satisfying decision-making requests, whereas a data mart is a complete logical subset of an entire data warehouse.
In simple words, a data mart is a data warehouse limited in scope, whose data is obtained either by summarizing and selecting data from the data warehouse or through separate extract, transform and load (ETL) processes from the source data systems.
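The "summarizing and selecting" route described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table names (`warehouse_sales`, `marketing_mart`), the columns and the sample rows are all hypothetical, and a real warehouse would not live in an in-memory SQLite database.

```python
import sqlite3

# Hypothetical warehouse table: one row per individual sale (detailed data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales ("
    "sale_id INTEGER PRIMARY KEY, department TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales (department, region, amount) VALUES (?, ?, ?)",
    [
        ("marketing", "north", 120.0),
        ("marketing", "north", 80.0),
        ("marketing", "south", 50.0),
        ("finance", "north", 300.0),
    ],
)

# Data mart: select one department's rows and summarise them by region.
conn.execute(
    """
    CREATE TABLE marketing_mart AS
    SELECT region, COUNT(*) AS sale_count, SUM(amount) AS total_amount
    FROM warehouse_sales
    WHERE department = 'marketing'
    GROUP BY region
    """
)

mart_rows = conn.execute(
    "SELECT region, sale_count, total_amount FROM marketing_mart ORDER BY region"
).fetchall()
print(mart_rows)
```

The mart keeps only the marketing rows and stores one summary row per region, while the warehouse table still holds every individual sale.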
Content: Data Warehouse Vs Data Mart
|Basis for comparison||Data Warehouse||Data Mart|
|Basic||A data warehouse is application independent.||A data mart is specific to a decision support system application.|
|Type of system||Centralised||Decentralised|
|Form of data||Detailed||Summarized|
|Use of denormalisation||The data is slightly denormalised.||The data is highly denormalised.|
|Nature||Flexible, data-oriented and long life.||Restrictive, project-oriented and short life.|
|Type of schema used||Fact constellation||Star and snowflake|
|Ease of building||Hard to build||Simple to build|
Definition of Data Warehouse
A data warehouse is a subject-oriented, integrated, time-variant and nonvolatile collection of data that supports the management decision-making process. Alternatively, it is a repository of information gathered from multiple sources, stored under a unified schema at a single site, that allows integration of a variety of application systems. Once collected, this data is stored for a long time, so it has a long life and permits access to historical information.
Consequently, a data warehouse provides the user with a single integrated interface to the data, through which the user can easily write decision-support queries. A data warehouse helps turn data into information. It is designed using a top-down approach.
It gathers information about subjects that span the entire organisation, such as customers, items, sales, assets and personnel, and therefore its scope is enterprise-wide. It generally uses a fact constellation schema, which covers a wide variety of subjects. A data warehouse is not a static structure; it evolves continuously.
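A fact constellation schema has several fact tables that share dimension tables, which is how it covers more than one subject. The following is a minimal sketch using Python's `sqlite3` module; the table names (`fact_sales`, `fact_shipping`, `dim_item`, `dim_customer`) and the sample data are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Shared dimension tables (hypothetical).
    CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, item_name TEXT);
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);

    -- Two fact tables for two subjects; both reference dim_item.
    CREATE TABLE fact_sales (
        item_id INTEGER REFERENCES dim_item,
        customer_id INTEGER REFERENCES dim_customer,
        units_sold INTEGER
    );
    CREATE TABLE fact_shipping (
        item_id INTEGER REFERENCES dim_item,
        units_shipped INTEGER
    );

    INSERT INTO dim_item VALUES (1, 'keyboard'), (2, 'monitor');
    INSERT INTO dim_customer VALUES (10, 'Acme');
    INSERT INTO fact_sales VALUES (1, 10, 5), (1, 10, 3), (2, 10, 2);
    INSERT INTO fact_shipping VALUES (1, 6), (2, 2);
    """
)

# Aggregate each fact table separately, then combine through the shared
# item dimension so that rows from one fact do not multiply the other.
sold_vs_shipped = conn.execute(
    """
    SELECT i.item_name, s.sold, sh.shipped
    FROM dim_item AS i
    JOIN (SELECT item_id, SUM(units_sold) AS sold
          FROM fact_sales GROUP BY item_id) AS s ON s.item_id = i.item_id
    JOIN (SELECT item_id, SUM(units_shipped) AS shipped
          FROM fact_shipping GROUP BY item_id) AS sh ON sh.item_id = i.item_id
    ORDER BY i.item_name
    """
).fetchall()
print(sold_vs_shipped)
```

Because both fact tables point at the same `dim_item` rows, a single query can compare sales and shipping per item, which a single-subject schema would not support directly.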
Definition of Data Mart
A data mart can be described as a subset of a data warehouse, or a subset of corporate-wide data that is of value to a specific group of users. A data warehouse comprises several departmental and logical data marts, which must be consistent in their data representation to ensure the robustness of the data warehouse. A data mart is a set of tables that concentrate on a single task, and it is designed using a bottom-up approach. A data mart is confined to a few selected subjects, so its scope is department-wide. Data marts are usually implemented on low-cost departmental servers. Their implementation cycle is measured in weeks rather than months or years.
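The star and snowflake schemas are commonly used in data marts since both are geared towards single-subject modelling, although the star schema is more popular than the snowflake schema. Depending on the data source, data marts can be classified into two types: dependent and independent. A dependent data mart is sourced from the data warehouse, whereas an independent data mart is loaded directly from the source data systems through its own extract, transform and load processes.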
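The difference between the two schemas can be sketched as follows, again with Python's `sqlite3` module. In the star variant the dimension table is denormalised; in the snowflake variant the same dimension is normalised into two tables. All table names, columns and rows here are hypothetical, and both variants share one fact table purely to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Star: the product dimension keeps the category name inline (denormalised).
    CREATE TABLE star_dim_product (
        product_id INTEGER PRIMARY KEY, product_name TEXT, category_name TEXT
    );

    -- Snowflake: the category is split out into its own table (normalised).
    CREATE TABLE snow_dim_category (
        category_id INTEGER PRIMARY KEY, category_name TEXT
    );
    CREATE TABLE snow_dim_product (
        product_id INTEGER PRIMARY KEY, product_name TEXT,
        category_id INTEGER REFERENCES snow_dim_category
    );

    -- One fact table, shared by both variants for brevity.
    CREATE TABLE fact_sales (product_id INTEGER, amount REAL);

    INSERT INTO star_dim_product VALUES (1, 'pen', 'stationery'), (2, 'desk', 'furniture');
    INSERT INTO snow_dim_category VALUES (100, 'stationery'), (200, 'furniture');
    INSERT INTO snow_dim_product VALUES (1, 'pen', 100), (2, 'desk', 200);
    INSERT INTO fact_sales VALUES (1, 2.5), (1, 2.5), (2, 150.0);
    """
)

# Star schema: one join from the fact table to the dimension.
star_totals = conn.execute(
    """
    SELECT p.category_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN star_dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category_name
    ORDER BY p.category_name
    """
).fetchall()

# Snowflake schema: an extra join to reach the normalised category table.
snowflake_totals = conn.execute(
    """
    SELECT c.category_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN snow_dim_product AS p ON p.product_id = f.product_id
    JOIN snow_dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
    """
).fetchall()
print(star_totals, snowflake_totals)
```

Both queries return the same totals per category; the star version needs one join fewer, which is one plausible reason for its popularity in data marts.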
Key Differences Between Data Warehouse and Data Mart
- A data warehouse is application independent, whereas a data mart is specific to a decision support system application.
- In a data warehouse, the data is stored in a single, centralised repository. In contrast, a data mart stores data in a decentralised manner in the user area.
- A data warehouse contains a detailed form of data. In contrast, a data mart contains summarized and selected data.
- The data in a data warehouse is slightly denormalised, while in a data mart it is highly denormalised.
- A data warehouse is constructed using a top-down approach. Conversely, a data mart is constructed using a bottom-up approach.
- A data warehouse is flexible, information-oriented and long-lived. In contrast, a data mart is restrictive, project-oriented and has a shorter life.
- A fact constellation schema is usually used for modelling a data warehouse, whereas in a data mart the star schema is more popular.
A data warehouse provides an enterprise view, a single centralised storage system, an inherent architecture and application independence, while a data mart is a subset of a data warehouse that provides a department view and decentralised storage. Because a data warehouse is very large and integrated, it is difficult to build and carries a high risk of failure. A data mart, on the other hand, is easy to build and carries less risk of failure, but it can suffer from fragmentation.