Considerations When Implementing a Data Warehouse Solution

 Jul 21, 2017

The foundation for a Business Intelligence (BI) solution for your company is a data warehouse. In today’s blog post we will go through some high-level considerations that you should take into account when you are implementing a data warehousing project. This is a short summary of what we cover in detail in our course, Implementing a Data Warehouse with Microsoft SQL Server

The data warehouse is a centralised repository of business data, which can be used for reporting and analytics, and that will support key business decisions. The very first step in the planning of a data warehouse is to find out what are the business questions that need to be answered. You should organise meetings with the key decision makers and arrive at a deeper understanding of the key business objectives and metrics used to drive the decision-making process.

Once you know the requirements for the reporting infrastructure and the analytics, you can make a list of the data that is required to support them. The business data is usually spread across many different systems throughout the company and you need to know how to obtain the data. You should assess the importance of each business question against the ability to answer it from existing data, and prioritise the business questions you will address in the data warehousing solution. You could classify the questions into four quadrants of a feasibility vs importance matrix:

 High importance, low feasibility  High importance, high feasibility
 Low importance, low feasibility   Low importance, high feasbility

 

The ones with high importance and high feasibility should be addressed first, and if there are a large number of them, you may consider breaking down the project into subprojects. The last quadrant to be addressed, if ever, should be the low importance, low feasibility.

Each subproject tackles the problem of implementing:

  1. A schema for the data warehouse database
  2. An Extract, Transform and Load (ETL) solution
  3. Data quality procedures

For the schema, you typically implement a star schema with one central Fact table surrounded by several Dimension tables. The fact table contains all the metrics required to answer the business questions such as sales amount, price, cost, discount, etc. It also contains keys to each one of the dimension tables representing business entities such as Customer, Employee, Product, Supplier, etc. The relationships between the Fact and the Dimension tables makes it possible to aggregate the metrics by attributes of each business entity. Therefore, we can generate sales amount by city of the customer, or average price per type of product, etc.  Besides the business entities there will also be present a Time dimension with a suitable granularity such as the day or the hour, allowing us to aggregate by year, quarter, month, etc., down to the most granular measure.

The dimensional model will be constantly refined as the requirements gathering process evolves. Once we get some stability with the Dimensional model we start implementing the ETL components that will connect with the several data sources where the data is located in existing systems. The ETL will typically pass this data through a pipeline of transformations and finish by inserting the modified or cleansed data into the final destination that is the data warehouse.

If you would like to know more, click here to find out about our course, Implementing a Data Warehouse with Microsoft SQL Server, or contact us to speak with one of our Learning Solutions Advisors today. 

How do your Excel skills stack up?   

Test Now  

About the Author:

Newton Godoy  

With over 17 years of in-class training experience and over 16 years of industry experience, Newton offers students a wealth of real-world technical knowledge and expertise in the areas of .NET application development, SQL Server and SharePoint Server. After spending several years lecturing as a professor, Newton found his true calling and began his career as a MCT. He worked as a technical trainer for some of Brazil’s and Australia’s largest corporate training organisations before finally finding a home with New Horizons where he is now one of our top trainers. Newton brings a thorough mentoring capability to the classroom where he can advise on technical issues and challenges often beyond the scope of the course curriculum. His combination of technical knowledge and instructor experience make him one of the most respected instructors within the IT training industry.

Read full bio
top