Did You Document Your Data Prep?

As you may know, data preparation is usually the most labor- and time-intensive part of a predictive analytics project. What you may not recognize is that the entire preparation phase needs to be documented on an ongoing basis, as you work through it, rather than reconstructed afterward.

It may not seem important as you’re doing it. But if your model ever needs revisions, you’re going to need to know exactly what you did, and you’re probably not going to be very accurate if you simply try to recall it from memory.

You may also have to use your model again at a later date, with a new set of data. That model simply won’t give you consistent results if you don’t prepare the new data set the same way you prepared the original. Remember, you may be translating unstructured data (such as the content of recorded customer service calls) into numbers so that your model can actually read what happened. If you assign different numbers the second time around, the model is going to produce very different results.
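To make the point about reusing the same numbers concrete, here is a minimal Python sketch. It assumes the unstructured calls have already been reduced to outcome labels; the labels, the function names `fit_encoding` and `apply_encoding`, and the choice of JSON as the record format are all hypothetical illustrations, not part of TMA’s template.

```python
import json


def fit_encoding(labels):
    """Assign each distinct label an integer code, in first-seen order."""
    mapping = {}
    for label in labels:
        if label not in mapping:
            mapping[label] = len(mapping)
    return mapping


def apply_encoding(labels, mapping):
    """Encode labels with a previously documented mapping.

    Raises ValueError on labels the mapping does not cover, so an
    undocumented category cannot slip into the model unnoticed.
    """
    unknown = sorted(set(labels) - set(mapping))
    if unknown:
        raise ValueError(f"labels not in documented mapping: {unknown}")
    return [mapping[label] for label in labels]


# Hypothetical outcome labels from the original project.
original_calls = ["complaint", "billing question", "complaint", "cancellation"]
mapping = fit_encoding(original_calls)

# The documentation step: keep this text with the project notes.
documented = json.dumps(mapping, indent=2)

# Later, with a new data set: reload the record instead of re-deriving it.
new_calls = ["cancellation", "complaint"]
restored = json.loads(documented)
new_codes = apply_encoding(new_calls, restored)

# What would have happened without the record.
refit_codes = apply_encoding(new_calls, fit_encoding(new_calls))
```

In this sketch the documented mapping encodes the new calls as `[2, 0]`, while re-deriving the codes from the new data alone gives `[0, 1]`. The calls are the same, but the model would see different numbers.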

Documentation shouldn’t stop at data preparation, either. Carry it through all six phases of the model’s development.

Fortunately, TMA has created a way to make your data prep documentation much easier. TMA offers an Excel workbook template that makes it easy to document and capture your progress on nearly any kind of predictive analytics project. You will receive this spreadsheet with your other course materials when you sign up for one of TMA’s training classes.

Documentation helps you deal with data preparation issues on an ongoing basis. If you don’t use TMA’s process, it’s still very important to develop your own method. Otherwise, you’ll be starting from scratch every time there’s a question about the model or models you’ve been developing.

