Tuesday, January 28, 2014

Business Intelligence: An Introduction; Fact Tables and Dimensions

In today's world of abundant information and 'Big Data', business intelligence (BI) is a highly relevant topic of study. BI provides techniques and tools required for analyzing and reporting large amounts of related data in order to drive business decisions. Due to their purpose, BI tools and applications are used by people of different authority, spanning the depth of the organization. For this reason, the output of BI applications must deliver readily understandable business insights and a highly intuitive user experience.

In order to accomplish this, the design of the application/data warehouse must be simple. A confusing design will inevitably lead to a confusing user-experience and a lack of acceptance from end users. Therefore, we breakdown the fundamental data elements of the business into meaningful categories: fact tables and dimensions. The classification of data warehouse information into fact tables and dimensions was interesting to me as it provided a different perspective on the way we looked at entities and attributes. I would say this is a more relevant form of classification because it makes sense from a business reporting point of view.

In simple words, facts are business measures: they are used to provide measurement of a particular business process. For example, the total revenue of a product line comes under the fact table. The definition of fact must also be accompanied by the grain. Grain is the level of detail present in the fact table. The level of detail of the data is inversely proportional to the granularity of the data. In other words, the grain gives you an idea about the kind of data is present in the fact table.

Dimension is the other category of data present in a data warehouse model. Dimensions contrast with facts because they are not measurements of the data, but provide descriptive information of the data present in fact tables. As a result they are mostly textual in nature. Sometimes, there can be situations of ambiguity in determining whether a particular numerical attribute is a fact or a dimension. Under such circumstances, Visualizing the final report or result set we require will help us make this determination. It also helps to see if the numerical attribute is continuous or discrete ; continuous attributes are mostly facts and discrete attributes are mostly dimensions (this rule is bound to have exceptions!)

Hopefully, this brief introduction on facts and dimensions will provide us with a basic understanding of data warehouse models and guide us in the design of simple and flexible BI applications.

No comments:

Post a Comment