Predictive Analytics: No More the Way of the Analytics Ninjas

Since the beginning of time, we’ve been fascinated with knowing what lies ahead. We started by predicting the seasons and weather patterns, and used that information to decide when to sow seed and when to harvest. Though some have the attitude ‘Que sera sera, whatever will be, will be,’ people have a fundamental desire to stay informed. Businesses are no different. Business leaders love the idea of certainty when making decisions. Strategic decision makers want data to be the proverbial crystal ball that gives them foresight for key business decisions and directions. In recent years, predictive analytics has come into prominence as a promising solution to this quest for foresight.

In this series of posts on predictive analytics, we’ll discuss its important aspects and how it’s being used today to gain a competitive advantage. In this first post, we get an overview of what predictive analytics is, the different ways of performing it, and what to keep in mind when applying it in an organization.

The ‘Problem’ of Unstructured and Semi-structured Data

Traditionally, business intelligence was like the way of the ninjas: exclusively the domain of data scientists. Data was stored, queried, and analyzed by data scientists to understand the what, where, when, how, and why behind an event. Data was structured, and generated in much smaller volumes than today. However, the biggest difference between data then and now is the variety of data being generated, not just by data scientists but predominantly by consumers. Today, we also deal with unstructured and semi-structured data. Unstructured data resides on the billions of social media pages across the Web, and is fueled by the ease of access to the Internet from the multitude of connected devices that are integral to our lifestyle. There’s also the gray area of semi-structured data that exists in HTML, text files, and PDF documents, which may have some structure in the form of tags and markers, but for the most part is unstructured text.

While some business analysts may see this unstructured and semi-structured data as having little or no value, for those pushing the frontier of data analysis it is a gold mine of opportunity. That opportunity takes the form of predictive analytics.

Predictive Analytics for Desired Outcomes

Predictive analytics draws on three types of models:

  1. Descriptive model: This method analyzes past performance by mining historical and current data to decide a course of action. Descriptive models identify many different relationships between customers or products, and help decide what approach needs to be taken going forward. Almost all management reporting, such as sales, marketing, operations, and finance, uses this type of post-mortem analysis. It seeks to answer the questions: what is happening? how many? how often? where? when? what exactly is the problem? and what actions need to be taken?
  2. Predictive model: This method analyzes past performance to assess how likely a customer is to exhibit a specific behavior. The focus is on predicting a single customer behavior, such as credit risk. It addresses the questions: what could happen? what if these trends continue? and what will happen next if…?
  3. Prescriptive model: Also known as a decision model, this method describes the relationship between all the elements of a decision, including its variables, in order to predict the results of that decision. It asks the questions: how can we achieve the best outcome? how can we address variability? what other product would they be interested in?

Although predictive analytics has been gaining prominence recently with the explosion of data, it has long existed in traditional business intelligence, most frequently in the form of the descriptive model. An example would be to look at last year’s sales revenue and orders, and project targets for the upcoming year. It involves looking at past data, and charting a course going forward. This has been standard practice in businesses for decades.
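To make the sales example concrete, here is a minimal sketch in Python of projecting next year's target from past revenue. The function name `project_target`, the revenue figures, and the choice of average year-over-year growth as the projection rule are all assumptions made for illustration; real planning would weigh many more inputs.

```python
def project_target(yearly_revenue):
    """Project next year's revenue from the average year-over-year growth.

    `yearly_revenue` is a list of positive numbers, oldest year first.
    This averaging rule is an illustrative assumption, not a standard.
    """
    if len(yearly_revenue) < 2:
        raise ValueError("need at least two years of history")
    if any(value <= 0 for value in yearly_revenue):
        raise ValueError("revenue figures must be positive")

    # Growth rate between each pair of consecutive years.
    growth_rates = [
        current / previous - 1.0
        for previous, current in zip(yearly_revenue, yearly_revenue[1:])
    ]
    average_growth = sum(growth_rates) / len(growth_rates)

    # Apply the average growth to the most recent year.
    return yearly_revenue[-1] * (1.0 + average_growth)


# Hypothetical revenue history (in thousands), growing 10% each year.
history = [100.0, 110.0, 121.0]
target = project_target(history)
print(f"Projected target: {target:.1f}")
```

With a steady 10% growth history, the projection simply extends that trend by one more year, which is exactly the "look at past data and chart a course" approach described above.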

The more complex predictive and prescriptive models, which were previously restricted to science labs and data scientists, are now coming into play more and more in organizations of all sizes. This is because hardware has become cheaper, and because of the consequent proliferation of data discussed earlier, especially unstructured and semi-structured data.

While understanding these different models is key to doing predictive analytics the right way, just like with any other project, it begins with understanding business goals and objectives. Once there is clarity on the business goals and objectives, any or all of the three models can be applied in the BI system to serve those goals.

Rinse-and-Repeat Approach

We can’t ignore the possible pitfalls of predictive analytics. 100% accuracy is not possible in most, if not all, cases of predictive analysis, for the following reasons:

  1. Historical data cannot decisively predict the future
  2. There may be unknown variables that are not accounted for when training the predictive analytics model
  3. The models can be manipulated to show biased and unrealistic predictions

It follows that the margin of error needs to be kept in mind when training a model. The quality of a predictive analytics system usually improves or declines over time under the influence of the above three factors. Therefore, continual optimization of the predictive analytics model becomes necessary. Repeating the cycle of training a model, deploying it, retraining it, and redeploying it helps keep predictions as accurate as possible over time.
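The rinse-and-repeat cycle described above can be sketched in a few lines of Python. This is only one possible reading of it: the model is a simple least-squares line, and the rule "retrain when the error on new data exceeds a threshold" is an assumption, as are the names `fit_line`, `mean_abs_error`, `retrain_cycle`, and `max_error`, and the sample data.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return slope, y_bar - slope * x_bar


def mean_abs_error(model, xs, ys):
    """Average absolute gap between the model's predictions and reality."""
    slope, intercept = model
    return mean(abs(slope * x + intercept - y) for x, y in zip(xs, ys))


def retrain_cycle(batches, max_error):
    """Train on the first batch, then score each later batch against the
    deployed model and retrain on all data seen so far whenever the error
    exceeds max_error (a hypothetical trigger rule)."""
    seen_x, seen_y = list(batches[0][0]), list(batches[0][1])
    model = fit_line(seen_x, seen_y)
    retrain_count = 0
    for xs, ys in batches[1:]:
        error = mean_abs_error(model, xs, ys)  # how the deployed model fared
        seen_x.extend(xs)
        seen_y.extend(ys)
        if error > max_error:
            model = fit_line(seen_x, seen_y)  # retrain and "redeploy"
            retrain_count += 1
    return model, retrain_count


# Hypothetical data: the trend shifts in the third batch.
batches = [
    ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),
    ([4.0, 5.0], [8.0, 10.0]),
    ([6.0, 7.0], [15.0, 18.0]),
]
model, retrains = retrain_cycle(batches, max_error=1.0)
print(f"slope={model[0]:.2f}, retrained {retrains} time(s)")
```

The second batch still follows the original trend, so the deployed model is left alone; the third batch drifts away from it, which triggers a retrain. In practice the trigger, the error metric, and how much history to keep are all decisions that depend on the business goal.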

In the upcoming posts, we’ll take a look at the various real world applications of predictive analytics across a broad range of industry verticals. Finally, we’ll look at the most common chart types, and interactive features when visualizing data in predictive analytics.

P.S.: Take a look at one of our previous posts showing how SRI (Stanford Research Institute) uses predictive analytics as they invent the workstation of tomorrow. Oh, and for the record, SRI uses FusionCharts to power the dashboards in their futuristic workstations.

If you enjoyed reading this, be sure to check out the other posts in this series:

Part 2 – 5 Businesses on the Frontier of Predictive Analytics

Part 3 – 4 More Businesses on the Frontier of Predictive Analytics

Part 4 – 3 Insanely Great Dashboards from Recorded Future

Part 5 – Stripping Down the Gorgeous Sift Science Dashboard

Part 6 – 9 Ways We Use Predictive Analytics Without Even Knowing It

  • Asar September 6, 2013, 12:44 am

    Hi Twain,
    Nice post indeed! With tons of data (most of them unstructured) pouring by the milliseconds, there is a growing need to glean and mash them into digestible bits of useful information. With the deluge of Big Data (I’m still confused how big it is, or could get) today, end-users are no longer interested in looking at tables, they’d rather look at charts and graphs and decide things on the fly. This is where Data Visualization helps them extract information. So, in a way it is the visualization which leads them to delve deeper into the data and extract meaningful information.
    Waiting for the next part.. 😀
    Regards,
    Asar

  • twain September 13, 2013, 4:18 pm

    Couldn’t have said it better, Asar. Finding meaning from all the data is what it’s all about. I can’t wait to get to the visualization part of this series, which is coming up.  For now, I’ve posted the second part in this series which you should check out at http://ow.ly/oOxD9
