What is a Data Warehouse?

A data warehouse is a massive storage center for all of the data that an organization typically generates. It is distinct from a day-to-day operations system both in the amount of data that is stored and in how that data is treated.

Warehousing data carries several key advantages. One of those advantages is the ability to analyze and work with the data without disrupting or overwhelming your daily operations.

Data Warehousing Concepts

There are several key concepts wrapped up in the data warehouse definition. These concepts represent a series of criteria that a data storage center must meet in order to be classified as a data warehouse.

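First, the data in the warehouse must be “subject oriented.” That is, the data is organized around identifiable subjects, such as customers, products, or sales, rather than around the applications that happened to produce it. Organizing the data this way lets a query about a given subject find everything relevant in one place.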

The data must also be integrated. Typically, your operations data will not be integrated.

This concept is easier to understand if you think of your operations data as going into a series of boxes that are typically kept separate. Sales data doesn’t usually touch payroll data, for example.

In the warehouse, however, all of the boxes are first stored together and then linked together so that analysts can form relationships between different data sources. For example, warehousing the data makes it possible to correlate the marketing dollars that were spent in 2012 with the number of sales that were made. On a more complex level, you could track the marketing dollars spent on each media channel and then compare those results to sales figures as well as to the source of sales leads. In either case, managers can use the data to make informed, highly strategic decisions about how much marketing money should be spent in the future, neither overspending to the point of diminishing returns nor under-spending to the point of lost profits.
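As a rough sketch of what that linking looks like in practice, the example below joins marketing spend to sales by channel using Python's built-in sqlite3 module. The table names, column names, and every figure are hypothetical, invented purely for illustration, and the sketch assumes that each sale records the lead source it came from.

```python
import sqlite3

# All table names, column names, and figures here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE marketing_spend (year INTEGER, channel TEXT, dollars REAL);
    CREATE TABLE sales (year INTEGER, lead_source TEXT, revenue REAL);
    INSERT INTO marketing_spend VALUES
        (2012, 'email', 10000), (2012, 'radio', 40000), (2012, 'web', 25000);
    INSERT INTO sales VALUES
        (2012, 'email', 30000), (2012, 'email', 20000),
        (2012, 'radio', 60000), (2012, 'web', 100000);
""")

# Link the two formerly separate "boxes" on year and channel, then
# compare what each channel cost with the revenue traced back to it.
rows = conn.execute("""
    SELECT m.channel,
           m.dollars,
           SUM(s.revenue) AS revenue,
           SUM(s.revenue) / m.dollars AS revenue_per_dollar
    FROM marketing_spend AS m
    JOIN sales AS s
      ON s.year = m.year AND s.lead_source = m.channel
    GROUP BY m.channel, m.dollars
    ORDER BY revenue_per_dollar DESC
""").fetchall()

for channel, dollars, revenue, ratio in rows:
    print(f"{channel}: spent {dollars:.0f}, revenue {revenue:.0f}, "
          f"revenue per dollar {ratio:.2f}")
```

A query like this is only possible once both data sets live in the same place and share a common key, which is the point of integration.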

Next, the information in the warehouse has to be stable. Operations data may fluctuate. For example, you may have a system in one of your factories that measures the amount of heat that a given pump gives off. This reading could change from moment to moment, and your engineers or factory floor managers could be called upon to respond differently depending upon which reading is showing from hour to hour.

The data that would be stored in the warehouse, however, is never going to change. The warehoused piece of data would read: Pump A, Temperature 110°, July 22nd 2013 at 5:13 PM. The value of that recorded bit of information is never going to change, and it can be compared with other immutable values, such as the temperature of Pump A on June 22nd and on August 22nd.
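One way to picture this stability is as an append-only store of read-only records. The sketch below is a minimal illustration, not a description of how any particular warehouse product works: the `PumpReading` and `ReadingStore` names are hypothetical, and the June and August temperatures are made-up values added only so there is something to compare.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime


@dataclass(frozen=True)
class PumpReading:
    """One recorded measurement; frozen, so it cannot be altered later."""
    pump_id: str
    temperature: float
    recorded_at: datetime


class ReadingStore:
    """Hypothetical append-only store: readings are added, never updated."""

    def __init__(self):
        self._readings = []

    def append(self, reading):
        self._readings.append(reading)

    def history(self, pump_id):
        # Return a tuple so callers cannot reorder or remove stored readings.
        return tuple(r for r in self._readings if r.pump_id == pump_id)


store = ReadingStore()
# The July reading comes from the text; the other two are invented.
store.append(PumpReading("A", 108.0, datetime(2013, 6, 22, 17, 13)))
store.append(PumpReading("A", 110.0, datetime(2013, 7, 22, 17, 13)))
store.append(PumpReading("A", 112.0, datetime(2013, 8, 22, 17, 13)))

july = store.history("A")[1]
try:
    july.temperature = 95.0  # attempting to rewrite history
except FrozenInstanceError:
    print("Stored readings are read-only.")
```

New readings arrive as new records, so the three values for Pump A can be compared with confidence that none of them has been overwritten.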

A data warehouse must also be “time variant.” That is, you can look up historical data. Your operations center may only be concerned with the most current data, such as the customer’s most current phone number. The warehouse will also store the customer’s number from 3 years ago, or from 10 years ago, and the length of time that a customer keeps a phone number could be used as a variable in a predictive model. For example, perhaps customers who change phone numbers every six months represent a poor credit risk, meaning that it might be wise to raise the interest rate on their credit account in the future.
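To make the phone number example concrete, the sketch below keeps every number a customer has had, along with the dates it was valid, and derives the average time a customer keeps a number. The row layout, customer IDs, phone numbers, and dates are all hypothetical; a real warehouse would hold this history in tables rather than in a Python list.

```python
from datetime import date

# Hypothetical history rows: (customer_id, phone, valid_from, valid_to).
# A valid_to of None marks the number the customer has right now.
phone_history = [
    ("C1", "555-0101", date(2010, 1, 1), date(2010, 7, 1)),
    ("C1", "555-0102", date(2010, 7, 1), date(2011, 1, 1)),
    ("C1", "555-0103", date(2011, 1, 1), None),
    ("C2", "555-0201", date(2005, 3, 1), None),
]


def current_phone(history, customer_id):
    """What an operations system cares about: the latest number only."""
    for cid, phone, _valid_from, valid_to in history:
        if cid == customer_id and valid_to is None:
            return phone
    return None


def average_tenure_days(history, customer_id, as_of):
    """What the warehouse adds: how long, on average, each number was kept."""
    spans = [
        ((valid_to or as_of) - valid_from).days
        for cid, _phone, valid_from, valid_to in history
        if cid == customer_id
    ]
    return sum(spans) / len(spans)


as_of = date(2013, 1, 1)
for customer in ("C1", "C2"):
    print(customer,
          current_phone(phone_history, customer),
          round(average_tenure_days(phone_history, customer, as_of)))
```

The average tenure computed here is the kind of derived variable that could feed a predictive model. It exists only because the old numbers were kept instead of being overwritten.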

Finally, a data warehouse must make the data accessible to users. After all, it would make very little sense to store so much information if one does not also have the ability to find and use the data later! In fact, organizing data and “normalizing” it, that is, putting it into consistent formats, for ease of use is one of the key functions of any data warehouse.

Reaping the Benefits of a Data Warehouse

There are two keys to reaping the benefits of a data warehouse.

The first key is proper implementation. Creating an effective data warehouse is more complex than it may sound. Therefore, most businesses will benefit from working with a data warehouse vendor rather than attempting to turn the task over to the in-house IT staff. There are hundreds of providers to choose from.

The second key is to keep your eyes on the point of the exercise: data warehousing is a prerequisite for predictive modeling. It is the tool that allows you to locate the trends, patterns, and relationships that will allow you to make good decisions for your organization. It is not an end in itself.

Thus, you must have a plan for taking the next step. Storing mounds of data has no real purpose if you don’t also know how to analyze the data and develop predictive models that actually use it. Fortunately, that is just the kind of problem that The Modeling Agency (TMA) solves.