PREDICTIVE ANALYTICS & DATA MINING:
MODEL DEVELOPMENT
A Tactical Drill-Down Of Process,
Methods, Tools And Techniques
ABOUT THIS COURSE
The Modeling Agency’s “Model Development” course presents a deep dive into the data mining process at a tactical level. Attendees will observe demonstrations of machine learning methods and computer-guided analytical techniques for extracting and interpreting complex patterns and relationships from large volumes of data. If you desire an intensive tactical orientation to data mining concepts, tools, techniques and supporting methods, then this event is designed for you.
This vendor-neutral course broadly covers data-driven information discovery techniques and model-building tactics without restriction to any particular modeling tool. Popular open-source and commercial packages are used to illustrate methods, not to showcase the tools. There are no prerequisites for this course. However, participants will benefit from reviewing the CRISP-DM guide ahead of the training.
Each course in the series is designed to be taken independently or as a natural progression from tactics to strategy and practice. View the course series overview page to compare the two primary orientations and target the most fitting agenda for your experience, situation and objectives.
WHO SHOULD ATTEND
- IT Professionals: who wish to expand their skills in this increasingly visible area within the corporate IT agenda
- Project Leaders: who must report on developmental progress, resource requirements and system performance
- Decision Support System Architects: who require an understanding of the infrastructures required for supporting a data mining solution
- Business Analysts: who must develop and interpret the models, communicate the results and make actionable recommendations
- Functional Leaders: Customer Relationship Managers, Risk Analysts, Business Forecasters, Statistical Analysts, Inventory Flow Analysts, Direct Marketing Analysts, Medical Diagnostic Analysts, Market Timers, e-commerce System Architects and Web Data Analysts
BENEFITS OF ATTENDING
- Vendor-neutral exposure to tools and techniques that will place you months ahead in method planning and product surveying
- Examine which methods and tools are most effective for your needs
- Avoid pitfalls in data preparation, modeling, and results interpretation
- Leave with resources, contacts and actionable plans to substantially increase your analysis capabilities while minimizing dead ends
THE BUSINESS CHALLENGE
The rapid emergence of electronic data processing and collection methods has led some to call recent times the “Information Age.” However, it may be more accurately termed “The Age of the Data Glut.” Most businesses either possess a large database or have access to one. These databases contain so much data that it becomes very difficult to understand just what that data is telling us.
There is hardly a transaction that does not generate a computer record somewhere. All this data has meaning with respect to making better prospective business decisions and anticipating customer needs and preferences. But how do you discover those needs and preferences in a database that contains gigabytes of seemingly incomprehensible numbers and facts? Data mining and predictive analytics do just that.
The intent of this course is to give attendees a stronger grasp of data mining techniques, a solid understanding of how various methods and tools apply to different kinds of data-intensive problems, and the means to overcome limitations that cause predictive models to underperform.
WHAT YOU WILL LEARN
- The data mining process and general implementation
- How to prepare raw data and benefit from visualization
- Various data mining methods and how they compare
- Advanced model building techniques
- Results analysis and validation
- Technology and product selection
- Solution integration, ongoing performance and maintenance
- Where to begin and how to obtain resources and support
WHAT MAKES THIS COURSE UNIQUE
This course does not restrict or skew the presentation of data mining methods through a single product. Rather, the course gives consideration to all resources from a vendor-neutral position.
The instructor possesses a wealth of pragmatic experience in applying data mining technology across industries in real-world applications. This course insists upon making predictive analytics constructive and interpretable in a business or organizational setting.
In addition, live modeling demonstrations projected from the presenter’s machine will support the instructional sessions. The demonstrations will highlight superior performance as well as pitfalls. The instructor will show how to evaluate various packages based on strengths, limitations, value and general performance.
COURSE OUTLINE
Introduction
- What you will get in this course
- What is PA/DM?
- Definition
- Related terms and fields
- Machine learning
- Computer-aided pattern discovery
- Business analytics and statistics
- Others you have heard?
- Examples
- Differences
- How can you develop PA/DM opportunities?
- Generative questions
- Examples
- Nuts and bolts of a project
- Big Picture: Introduction to CRISP-DM
- What is it? What is it not?
- Why do we care? Why use it? What is it good for?
- Example: Tour of CRISP-DM in real-world context
- Team Exercise
- One Practitioner’s View
- Regarding PA/DM: What’s hype and what isn’t?
- How to be successful with PA/DM
- Tools and products
- People matter
- CRISP-DM Methodology: Parts 3, 4, 5
- Highlight CRISP-DM 1, 2, 6
CRISP 1, 2, 6 are detailed in the Strategic Implementation course:
- Business understanding
- Data understanding
- Deployment
- Data Preparation (CRISP 3)
- Rows: Select data
- How much data?
- Rows: Selecting the “unit of analysis”
- Determine what the record will look like
- Determine how many records we have to work with
- Site selection example
- Rows: Defining the population / outcome of interest
- Modeling Goals
- Simple: Response vs. Non-Response
- Uplift / Incremental Lift / Net Lift Modeling: Identifying those most receptive to a treatment or offer
- Rows: Sampling methods / oversampling
- Rows: Exclusions / rules of thumb
- Columns: Identifying types
- Need definitions (from clients or internal) so that we understand what the data represents. Don’t assume that an element isn’t important
- Categorical / Nominal (what does null mean?)
- Ordinal
- Interval / Ratio
- Date / Time
- Sub-Types (money, count, geo, id, etc., and why care?)
- Columns: Appropriate statistics and visualizations
- Univariate
- Multivariate
- Columns: Selection for modeling
- See “Clean Data” for pre-modeling elimination of redundant, constant, etc. columns
- Final selection is done during the Modeling phase
- Build and Execute Transformations
- Sources: Household File; Demographics; Derived Variables
- Transformations: Counts, Category, Binary, Logarithmic, etc.
- Document the above in a “Scorecard”
- Modeling (CRISP 4)
- Select modeling technique
- Taxonomies: An overview
- Supervised vs. Unsupervised
- Descriptive vs. Predictive
- Classification vs. Estimation
- Supervised — Constellation of methods with pros and cons
- Classification
- Decision Trees
- Logistic Regression
- Neural Networks
- K-Nearest Neighbor
- Prediction
- Linear Regression
- Neural Networks
- Exercise: Scenario revisited — What method(s) do we choose?
- Unsupervised — More methods with pros and cons
- Segmentation / Clustering
- Hierarchical clustering
- K-Means
- Decision trees
- Association Rules
- Team Exercise: Come up with an expert-derived decision tree to make a selection for supervised problems
- Advanced Topics
- Ensemble / Hybrid Models
- Bagging
- Boosting
- Parting remarks
- Models should be as simple as possible, but no simpler
- Why not both? (a low-res descriptive model and a high-res opaque accuracy model)
- Generate test design
- Data segregation
- Performance metrics: Whenever possible, go for the custom metric (“If you build it, they will come.”)
- Build Model
- Use a tool, select a method, set parameters (if any), select candidate columns, select outcome (if supervised)
- Variable selection techniques for supervised methods
- Variable selection techniques for unsupervised methods
- Assess Model (Tweaking)
- Predictors
- Manually removing or limiting
- Forcing predictors
- Structure
- Profiles
- Compared to What?
- Baseline model comparison
- Train/Test/Validation comparison
- Scoring the model
- What does scoring mean?
- How is it different from building the model?
- What are we looking for when scoring?
- Final Product
- Model(s)
- Description(s)
- Text Mining / Text Analysis
- Evaluation (CRISP 5)
- Evaluate results (from business perspective)
- Prelude to business use presentation
- Informal, low-risk setting
- Poke holes early, before business presentation
- Does the model or segmentation make sense?
- Does it contradict or reinforce the standard “lore”?
- Get support and buy-in from potential champions
- Candidate names for segments
- Present results to business users or clients
- BUs need to be convinced: Models, segments and analysis need to be marketed!
- Deployment will require change
- To processes
- To systems
- To ingrained mindsets
- Deployment costs (to each change area above)
- Results must have business value, not technical representations
- Performance results, in business terms
- Descriptions
- No equations
- Tell the story, paint the vision, what will life be like with or without the model in place?
- Review Process
- Follow-ups to the presentation
- Anticipate follow-up issues in planning and estimates
- Revisions to the model(s) or segments based upon feedback
- Final quality assurance
- Determine next steps
- Are you done?
- Will the model(s) be deployed? Why or why not?
- Document!
- Lessons learned meeting
- Consulting Exercise
- Wrap-up and Parting Thoughts
- Final Q&A
- Springboard exercise
- PA/DM Philosophy
- Understand the problem
- Understand the data
- Work on problems with specific business goals, specific hypotheses to be tested. Do NOT go prospecting for “data mining nuggets.”
- Next Steps
- Proceed to the Strategic Implementation course
- Certification Exam (for those who complete the series)
- Product training courses
- Keep learning!
- Supplementary materials and resources
- Conferences and communities
- Get started on a project!


Attendee Comments
“This is a must-attend course for those who would like to get started in the amazing world of Predictive Analytics & Data Mining. Without going into theoretical details, it covers every important step in the development of predictive models.”
Database Marketing & Predictive Analytics Expert
Cogeco Cable
“This course was very enlightening. I was surprised to see the many ways predictive analytics can be applied!”
Business Operations Analyst
Toshiba America Medical Systems
“The wealth of information covered in these courses, as well as the in-depth demonstrations of multiple software packages, made the sessions valuable from a wide range of perspectives. I will certainly recommend that others attend.”
AVP, Managed Care
Analytics / Business Development
Health Smart Preferred Care
“The instructor is knowledgeable, well organized, and interacts extremely well with participants. If you have only two days to learn about data mining, TMA’s Model Development course is the class you should attend.”
Marketing Department
Amica Insurance
“The instructor was great and the instructor was very knowledgeable. It was just at the right level, not too general and not too technical.”
Technical Consultant
Convergys
“The instructor and course material are first rate. Any organization that believes data mining should be a part of their business operations portfolio would be making a wise investment by attending this course.”
Information Computing Sciences
SRI International
“This course was excellent. It was technical and practical. The instructors were just amazing with their knowledge, experience and their ability to answer all our inquiries, also the resources they provided us with were very helpful. Well done TMA!”
Planning and Program Analyst
Saudi Aramco
“This course gave me a new perspective on techniques and applications software that our federal agency had not previously seen. The course content was great, and the very knowledgeable instructor kept the students’ attention by using real-life examples and discussion of additional resources. I highly recommend this course!”
Auditor
US Department of Education
“I was a bit apprehensive about attending and how I could apply data mining concepts to my particular industry, but the instructor put those fears to rest. Highly recommended! Thanks TMA!”
Planner / Analyst
The Mitchell Gold Company
“This class gives the Statistician a bunch of new tools to use in solving business problems. Once the limitations of statistics are reached, grab this data mining tool belt. You will be surprised how much further you can get.”
Wage and Investment Research
Internal Revenue Service
“The instructor was really great. Tim not only had a very clear understanding of what he was presenting, he was also a very good teacher who solicited a lot of audience participation. Most people with a highly technical background can’t always communicate effectively. Tim had no such problem.”
Economist
Current Employment Statistics
Bureau of Labor Statistics
“The Modeling Agency takes a daunting subject and brings it to a very understandable level to attack any problem you may be facing as an organization.”
Personal Lines Underwriting Manager
Brethren Mutual Insurance Company
“The instructor’s effective communication and presentation skills provided us the confidence to understand the proper use of data mining in our new roles as analysts. His genuine interest and concerns for adaptation to each student level in addition to his experience and understanding enabled proactive class participation conducive to learning. This was extremely helpful for those of us with no data mining background. Our sincere thanks for an overall excellent experience!”
Office of Special Investigations
US Air Force
“This course was fabulous. It was everything I hoped it would be – technical and practical. The instructor is amazing and the resources he gave to further my studies were also helpful.”
Solutions Architect
Bayshore Solutions
“The class was great! I was really impressed with the instructor’s knowledge, experience, and ability. He was able to answer everyone’s questions thoroughly and tailor the class to individual needs. I learned so much about the data mining process, the different methods, and available tools. I highly recommend this course to both technical and non-technical people interested in leading-edge data mining methodologies and the application of current data mining software to marketing, business, and research endeavors.”
Preventive Medicine
Kaiser Permanente
“This course is a great overview of available predictive analytics methods and techniques that provides an organizing framework for your model development efforts.”
Lead Workforce Analytics
CH2M HILL
“This course will fully help you understand what is really important for increasing performance from your data by introducing key concepts that are not taught in any business schools.”
Operations Research Analyst
Federal Aviation Administration
“This course is extremely helpful and should be attended by any and all new Data Analysts. It outlines the process completely and thoroughly.”
Data Scientist
Xoom Corporation






