The Modeling Agency

How Can You Get Management on Board with Data Mining?

This is a question that TMA hears a lot in the Q&A sessions held after each session of the free webinar. Usually, the person asking is already quite on board with the idea of using the insights that data mining can deliver, but their supervisors aren’t so sure that they want to invest resources in any predictive analytics projects.

If you’re in this position, take heart. It’s a sure bet that data mining and predictive analytics are at least on the radar of corporate leadership. Everyone’s hearing about “big data” and how data mining is being used successfully to create real business results. In fact, you can even see their caution as a positive. Too many management teams are all too “on board” too quickly, and so dive in with both feet without any plan or sense of what they are doing. The result is a lot of wasted time and money that does not drive actionable intelligence.

Your management team’s caution gives you the ability not only to sell them on the benefits of data mining and predictive analytics but to sell them on the benefits of launching these projects in the right way.

The Best Way

The absolute best way to get management on board would be to contact TMA to schedule a day-long visit with seasoned experts. The first half of the day would be spent on an in-depth presentation, while the second half would be spent on a round table: an open forum where stakeholders can air their questions and concerns.

Pilot Projects

If you don’t have the budget or the power to schedule a presentation, then a pilot project becomes the next logical step. You’ll need an experienced model builder to make the project as successful as possible.

Remember that you’re generally not going to get the chance to perform a full, comprehensive assessment of organizational and departmental goals as you launch your pilot. Nobody’s sold on using predictive analytics yet, and you will likely have other needs, goals, and projects tugging at your attention.

That means your results won’t be the best possible results. This is okay. You really only need results that are impressive enough to demonstrate the strength of the effort.

To avoid wasting too much time or money, choose a project that looks likely to provide the most benefit with the least risk.

What to Avoid

Avoid vendors.

Vendors may well sell your leadership on the benefits of data mining and predictive analytics. Vendors, unfortunately, are also there to sell software.

This is not to say that software is never going to be necessary. However, purchasing software at this stage is like purchasing a luxury sports car before learning how to drive. Software is not a magic bullet, and organizations that invest in software first tend to get caught up in hype. It then becomes difficult to extract the value from the data, which puts companies right back into the position of being information rich but knowledge poor.

TMA is vendor-neutral for this reason. Our goal is to get you trained without pushing any particular agenda.

Most pilot projects can be completed with Excel. You don’t require better software so much as you require the ability to think critically about the problem and the organization’s goals. Show results, get the buy-in. You can always get the budget for a fantastic and needful software package later.

Why “Data Scientist” is a Useless Designation

Looking for a data scientist is a little like going on a unicorn hunt.

Are you trying to hire a “data scientist” for your organization? You might want to think twice before you decide to place that job ad.

“Data scientist” is either a meaningless designation or a descriptor for a person who will prevent your organization from realizing the full value of the data that it currently owns. Here’s why.

Data scientists typically approach the problem from the wrong direction.

Most “data scientists” typically focus on building technically superior models. There’s nothing wrong with building a better rocket ship, but first you’d best make sure that the rocket is actually pointed in the right direction.

No “optimized model” has ever aligned with business objectives. No business has ever generated a significant benefit from merely building a better algorithm.

In fact, many so-called “data scientists” pooh-pooh strategic assessment and project planning as “fluff” that distracts them from the “real work” of writing ever-more complicated code.

Unfortunately, strategic assessment and project planning happen to be vital if you’re ever going to extract any value from your data.

The term (as most organizations use it) describes a unicorn.

It is impossible, or at least exceedingly rare, to find within a single human being all of the skill sets that most companies envision for a so-called “data scientist.” When organizations talk about “data scientists” they typically mean someone who:

  • Has a collection of advanced analytical skills
  • Has vast IT experience
  • Has and effectively uses a broad range of managerial soft skills
  • Can oversee analytic processes at the project level

This mythical human somehow has managerial acumen and technical skill all rolled up into one brilliant, convenient package. Someone like this might exist in the sense that anything is possible…kind of like the way unicorns might exist in a universe where anything is possible. Since most organizations don’t have the time or money to embark on a unicorn hunt it’s smarter to take a step back and to think about who or what can actually achieve what the organization hopes to achieve by hiring a “data scientist” in the first place.

Anyone can call themselves a data scientist.

Granted, if you are dead set on acquiring a certain skill set then it’s awfully hard to fake having the technical skills. However, there is simply no formal definition for the term, which means no certifications, no degrees, and no quality controls. An unemployed MBA can legally hang out his “data scientist” shingle tomorrow. Often, amateurs do just that, to the detriment of the organizations they attempt to help.

Your existing employees can probably give you what you need.

Believe it or not, your existing employees probably have what it takes to help you derive outstanding value from the data you possess. In fact, training strategic thinkers who are close to the problems your organization is facing is often the first step. Sending key employees to a vendor-neutral training regimen that takes just a few short weeks can help you begin transforming your data into actionable intelligence that offers solid benefits to your business. Doesn’t that sound far better than hiring an overpriced theoretical analytic specialist who is largely incapable of taking your organization where it needs to go?


Case Study: Building Models Without Historic Data

Sometimes you are in a position where you have a problem you need to solve, but you do not necessarily have all of the historic data you want or need to do so. You don’t have to allow this scenario to stop you in your tracks. There are always ways to back into a problem.

A large academic institution recently asked TMA to create a predictive modeling surveillance program to detect credit card fraud. The challenge here was that there were no known historical cases of fraud to work from. How, then, to train this model?

TMA solved this problem by using an unsupervised learning approach. The idea was to cluster behaviors by how far apart they sit in a multidimensional space, along with pattern matching.

After building the initial clustering, TMA worked with the users to settle on a number of clusters that they could work with.

The model was then prepared to stand by for any known cases of fraud to come through the system. The users could then see which segment or cluster the behavior mapped to. They would then target any auditing efforts on that cluster. The model was designed to grow more effective as more cases came in.
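The workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TMA's actual model: it uses plain k-means, and the transaction fields (amount, hour of day), the sample values, and the helper names `kmeans` and `nearest` are all hypothetical.

```python
import math
import random


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on tuples of numbers; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct observations
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(dim) / len(g) for dim in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


# Hypothetical card transactions: (amount in dollars, hour of day).
# A real project would scale the features first so one does not dominate.
transactions = [
    (12.0, 9), (15.5, 10), (9.9, 11), (14.2, 12),
    (950.0, 2), (1020.0, 3), (980.0, 1), (1100.0, 2),
]

# No fraud labels are needed to build the clusters.
centroids = kmeans(transactions, k=2)
labels = [nearest(t, centroids) for t in transactions]

# Later, a confirmed fraud case arrives: see which cluster it maps to
# and focus the audit on the other behavior in that cluster.
confirmed_fraud = (1005.0, 2)
flagged_cluster = nearest(confirmed_fraud, centroids)
to_audit = [t for t, label in zip(transactions, labels) if label == flagged_cluster]
print(to_audit)
```

The number of clusters (`k`) is the knob the users would agree on; each new confirmed case gives another point to map, which is how a setup like this can sharpen over time.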

Cluster analysis has many uses. It is the same sort of process that Netflix uses to predict your next movie. It’s also the same process Amazon uses to predict your next purchase.

It is, of course, just one tool out of many. If you want to learn how to match the right tool to the right task you need to seek the appropriate predictive analytics training program. Why not start with TMA’s free webinar? Or sign up for training courses today to learn how to solve sticky problems just like this one.

How Long Does It Take to See The Benefits of Data Analysis?

Here’s a question that came up during a recent TMA Q&A. It’s actually a very good question, as it is important to count the time commitment that investments will demand from you, just as it is important to count the initial investment cost.

The answer? It depends.

With the help of modern software you can certainly crank out a quick and dirty model in a matter of hours. This approach carries many strategic risks, but it can be done.

Going through the entire TMA process takes about six weeks on average. As for receiving results from following that process?

Well, that all depends on what you’re doing.

For example, if you’re using predictive analytics to improve the efficacy of your e-mail marketing campaigns then you’d typically see results very quickly. Someone who is using predictive analytics to improve the efficacy of a direct mail campaign might have to wait longer.

It also depends on whether you are putting together a single project or are attempting to methodically build an internal analytics practice. The difference in time here is the difference between building a single car and building an entire car factory.

Of course, there are plenty of benefits associated with the longer, harder project that you’ll never see if you remain fixated on building those individual cars!

If you’re not sure how and where to get started on your next project (big or small), why not register for TMA’s next free webinar? You’ll receive a thorough grounding in the concepts and strategies that help a predictive analytics project become successful, and you’ll get your own chance to ask your questions of gurus Tony and Scott after the webinar is complete. Sign up today.

Data Too Messy? Don’t Panic.

When you’re working with predictive modeling, you’re typically going to run into two different types of data sets.

Structured data sets are the easiest to deal with. Your financial data may be mostly structured, for example. You have a fixed dollar amount which came in during a specific period of time. The numbers always mean the same thing.

“Unstructured” data contains important information, but it might not live in any numeric form. You may have thousands of customer service calls or records to sort through. You may be tracking social media content, like the subject matter of your customers’ tweets. Before that data can be used, it will have to be prepared and converted into numeric form.
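As a rough illustration of what converting text into numbers can look like, here is a minimal bag-of-words sketch in Python. The customer service notes, the `tokenize` and `vectorize` helpers, and the choice of simple word counts are all assumptions made for the example; real projects often use richer text preparation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


# Hypothetical customer service notes (unstructured text).
notes = [
    "Customer called about a late delivery",
    "Refund requested, delivery was late again",
    "Happy with the new billing page",
]

# The vocabulary fixes one numeric column per distinct word.
vocabulary = sorted({word for note in notes for word in tokenize(note)})


def vectorize(text):
    """Turn one note into a row of word counts, one count per vocabulary word."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


matrix = [vectorize(note) for note in notes]

# Each note is now a fixed-length row of numbers that a model can consume.
late_column = vocabulary.index("late")
print([row[late_column] for row in matrix])
```

Once every record has the same numeric columns, the text can sit alongside structured fields such as dollar amounts.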

Most modern data environments are a very messy mix of structured and unstructured data, and usually that data is stored in huge quantities. It’s easy to get overwhelmed when you look at the sheer scope of the data that is available to you.

Fortunately, the problem is not insurmountable if you approach your data in the right way.

First and foremost, you must have a clear idea of what you are trying to accomplish with this data. What are your objectives? Once you know your objectives it’s easier to understand which data won’t make sense for this particular problem set.

Next, you’ll need to take a sample. You’re never going to analyze all of the data that’s available to you. You only need to put together a good enough representation of the whole solution space.

In fact, one of the biggest mistakes modelers make is going into the data assuming that “more” is better. In truth, adding more and more data is actually a good way to ensure the failure of your model.

Fortunately, knowing your targets makes it far easier to decide what you’re going to include. You’ll dive in, pick the data that is relevant to your objective, and then work with that data (and only that data).
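The two steps above, keeping only the fields that bear on the objective and working from a sample rather than the full data set, might look like this in Python. The record layout, the field names, and the `sample_relevant` helper are hypothetical, and a simple random sample is assumed to be representative enough for the example.

```python
import random


def sample_relevant(records, fields, n, seed=42):
    """Draw a random sample of n records and keep only the chosen fields."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    chosen = rng.sample(records, min(n, len(records)))
    return [{field: record[field] for field in fields} for record in chosen]


# Hypothetical customer records with more columns than the objective needs.
records = [
    {
        "customer_id": i,
        "monthly_spend": 20 + i % 50,
        "support_calls": i % 4,
        "favorite_color": "blue",
        "churned": i % 10 == 0,
    }
    for i in range(10_000)
]

# Objective (assumed for this example): understand churn.
# Keep the fields that plausibly relate to it and drop the rest.
relevant_fields = ["monthly_spend", "support_calls", "churned"]
sample = sample_relevant(records, relevant_fields, n=500)

print(len(sample), sorted(sample[0]))
```

If the outcome of interest is rare, a stratified sample (sampling each outcome group separately) may represent it better than a plain random draw.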

Don’t jump into the data without setting your course. That’s like starting a book on Chapter 5. Both actions all but ensure you won’t have a clue what’s going on–which means you’re likely to create an even bigger mess than the one you started with.