

Mining Your Web Data for Impactful Insight

If you have a web site for your business, and especially if you engage in e-commerce, every visit to your site generates important data about your visitors, even if they do not make a purchase. Every action a visitor takes on your site carries behavioral insight into preferences, tendencies and habits. Studying this data with data mining and predictive analytics methods reveals patterns and trends that you can use to build a predictive model and inform vital decisions about your site and marketing strategies.


Properly mining web data is not a simple process, largely because of the sheer volume and lack of organization that plague most web data. However, with clear objectives and the right recipe for tracking visitors on your site, you can learn a tremendous amount from your web logs.
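
To make the idea of learning from web logs concrete, here is a minimal sketch that turns raw log lines into per-visitor page counts. It assumes your server writes the widely used Common Log Format and treats the requesting host as a rough stand-in for a visitor. The sample lines, the function names and the "status 200 only" rule are illustrative assumptions, not a description of any particular site's setup.

```python
import re
from collections import Counter

# Assumed Common Log Format: host ident user [time] "request" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def pages_per_visitor(lines):
    """Count successfully served requests per host (a crude visitor proxy)."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record and record["status"] == "200":
            counts[record["host"]] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /products/42 HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2023:13:56:01 +0000] "GET /cart HTTP/1.1" 200 512',
    '198.51.100.7 - - [10/Oct/2023:13:57:12 +0000] "GET /missing HTTP/1.1" 404 128',
]
print(pages_per_visitor(sample))
```

In practice a host address is a weak visitor identifier (shared networks, changing addresses), so a real project would likely rely on session or cookie data where it is available.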


Here are the basic steps for successfully planning and implementing a web data mining project.


  • Identify your objective. Work with your web development, sales and marketing teams to determine which visitor demographics need to be captured and the right methods for capturing them.
  • Select and prepare your data. Determine which database you’ll work with and check whether the data needs any special preparation (for example, rescaling dollar fields by dividing by 1000, converting dates to continuous values, converting yes/no to 1/0, or applying log or square transformations to skewed data).
  • Evaluate your data’s structure. This will determine the data mining methods and tools you’ll use. Check the overall structure of the database and the condition of the data set, and see if it’s skewed.
  • Format your solution. In what format would you prefer the results to appear (e.g. graph, map or decision tree)? What is your overall goal for the solution (e.g. to gain insight or to increase sales)? How will you act on the results? The answers to these questions will help you plan the most efficient and effective data mining process while avoiding the need to retrofit, right-size or misapply an otherwise productive model.
  • Choose your data mining tool(s). Take into account both the structure and nature of your data and the best method(s) for achieving your desired results or meeting your overall goal. Other considerations are modeler experience, end-user needs, environmental integration and results translation.
  • Design your models. Examine your model error rates and improve them if possible, and see if additional data exists that could help your models’ performance. Decide how many models you’ll need for the job, then train and test your models using a fixed random number seed so that your results can be reproduced. Keep in mind that it’s far more effective to apply a good model to a solid strategy than to develop an excellent model that’s strategically misapplied.
  • Validate your findings. Double check your results and submit them to unbiased review to determine if they are correct. Be wary of great model results on your training and testing data: apply the model to a separate validation set to confirm that it generalizes and didn’t simply memorize the training data. Be prepared to launch a new analysis with adjusted models if the results do not hold up on validation.
  • Report your findings. Prepare a report that clearly documents the entire web data mining process you used, justifies the tools you selected, and presents the results along with your comments. If you can see clear ways to improve upon the process for future analysis, include your ideas. Be sure to document! You’ll be glad you did later.
  • Incorporate your findings into your business. They may impact best practices, marketing, sales techniques and strategic planning. Prepare to routinely monitor the performance of your models, because all models deteriorate. When your existing models are no longer accurate, you will have to make adjustments or develop new ones. This is a natural part of an ongoing model lifecycle management practice.
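
The data preparation and design steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the field names (order_dollars, visit_date, purchased, page_views), the 60/20/20 split and the seed value are all assumptions made for the example, and log1p stands in for whichever skew transformation suits your data.

```python
import math
import random
from datetime import date


def prepare(record):
    """Apply the example transformations to one raw record (a dict)."""
    return {
        # Rescale dollars to thousands.
        "order_k": record["order_dollars"] / 1000,
        # Convert the date to a continuous value (proleptic Gregorian ordinal).
        "visit_day": date.fromisoformat(record["visit_date"]).toordinal(),
        # Convert yes/no to 1/0.
        "purchased": 1 if record["purchased"].strip().lower() == "yes" else 0,
        # log1p compresses a right-skewed count and is defined at zero.
        "log_views": math.log1p(record["page_views"]),
    }


def split(rows, seed=42, train=0.6, test=0.2):
    """Shuffle with a fixed seed, then cut into train/test/validation lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # same seed, same order every run
    n_train = int(len(shuffled) * train)
    n_test = int(len(shuffled) * test)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_test],
        shuffled[n_train + n_test:],  # held back until the validation step
    )


# Hypothetical raw records, for illustration only.
raw = [
    {
        "order_dollars": 100.0 * i,
        "visit_date": "2023-10-10",
        "purchased": "yes" if i % 2 else "no",
        "page_views": i,
    }
    for i in range(10)
]
rows = [prepare(r) for r in raw]
train_rows, test_rows, validation_rows = split(rows)
print(len(train_rows), len(test_rows), len(validation_rows))
```

Fixing the seed means the same rows land in the same partitions on every run, which makes error rates comparable as you adjust your models. The validation rows stay untouched until the models are final, so they can show whether a model generalizes or merely memorized its training data.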


How often you need to mine your web data will depend on the industry you’re in and on factors like how frequently customer attributes change, or what we call ‘data velocity.’ If your industry is very dynamic, you may need to refresh your models often. Keeping your models fresh helps you keep your edge over your competitors.
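
One simple way to decide when a refresh is due is to compare a model's recent accuracy against the accuracy it had when it was deployed. The sketch below assumes a classification model scored on plain accuracy and a hypothetical tolerance of five percentage points. The prediction and outcome lists are made up for illustration, and a real monitoring routine would choose the metric and threshold to fit the business goal.

```python
def accuracy(predictions, actuals):
    """Fraction of predictions that match the observed outcomes."""
    if not actuals:
        raise ValueError("need at least one observation")
    hits = sum(1 for p, a in zip(predictions, actuals) if p == a)
    return hits / len(actuals)


def needs_refresh(baseline_accuracy, recent_accuracy, tolerance=0.05):
    """Flag a model whose recent accuracy fell more than `tolerance` below baseline."""
    return (baseline_accuracy - recent_accuracy) > tolerance


# Made-up predictions and outcomes, for illustration only.
predicted = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
outcomes_at_launch = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
outcomes_recent = [0, 0, 1, 0, 0, 1, 1, 1, 1, 1]

baseline = accuracy(predicted, outcomes_at_launch)
recent = accuracy(predicted, outcomes_recent)
print(baseline, recent, needs_refresh(baseline, recent))
```

Running a check like this on a schedule that matches your data velocity turns "all models deteriorate" into a concrete trigger for retraining.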


Ready to learn about data mining in greater detail? Register for an upcoming free data mining webinar, or one of The Modeling Agency’s intensive predictive analytics training sessions.