# Predictive modeling

### Quick definition

Predictive modeling is a statistical technique and process used to forecast future outcomes. It is most closely associated with predictive analytics, which focuses on using machine learning to predict what might happen next.

### Key takeaways

Predictive modeling can help a company understand how well the business is performing as a whole, mitigate potential risks, and build better customer experiences and overall lifetime value.

Predictive modeling does not require machine learning or artificial intelligence (AI), but they expand the available processing scale, improve efficiency, and allow for better data evaluation.

There are seven major steps in the predictive modeling process: understand the business objective, define the modeling goals, gather data, prepare the data, analyze the data, develop the model, and validate and deploy the model.

John Bates is the director of product management for predictive marketing solutions and for Adobe Analytics Premium in Adobe Experience Cloud. His core responsibility is to develop the product roadmap for all advanced statistics, data mining, predictive modeling, machine learning, and text mining/natural language processing solutions found within the products of Adobe's digital experience business unit.

### Q: What are the different classes of predictive modeling?

*A:* There are many different types of predictive models, and each fulfills a specific purpose or addresses a specific need. An organization must consider its business goals and the information it hopes to gather when selecting a model to use.

**Classification models** are best at answering yes or no questions and helping to guide decisive actions. A classification model may provide information on the risk of customer churn or how likely a customer is to convert.
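
As a rough illustration of a classification model, the sketch below fits a one-feature logistic regression with plain gradient descent to estimate churn risk. The data, the `days_inactive` feature, and the learning rate and epoch count are all made-up assumptions for illustration, not values from a real business.

```python
from math import exp

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))

def train_logistic(xs, labels, learning_rate=0.5, epochs=2000):
    """Fit a weight and bias for one feature with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        preds = [sigmoid(w * x + b) for x in xs]
        # Gradients of the average log-loss with respect to w and b.
        grad_w = sum((p - y) * x for p, y, x in zip(preds, labels, xs)) / n
        grad_b = sum(p - y for p, y in zip(preds, labels)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical history: days since last login, and whether the customer churned.
days_inactive = [1, 2, 3, 4, 20, 25, 30, 35]
churned = [0, 0, 0, 0, 1, 1, 1, 1]

# Scale the feature to the 0..1 range so gradient descent behaves well.
SCALE = max(days_inactive)
w, b = train_logistic([d / SCALE for d in days_inactive], churned)

def churn_probability(days):
    """Estimated probability that a customer inactive for `days` will churn."""
    return sigmoid(w * days / SCALE + b)

print(round(churn_probability(2), 3), round(churn_probability(30), 3))
```

A probability above 0.5 would be read as "likely to churn" in this sketch. A real model would typically use many features and a tested library rather than hand-written training code.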

**Clustering models** separate data into groupings of individuals based on common attributes or similarities. For example, you might want to separate your customers who churn into clusters based on similar behaviors and common traits or characteristics, and then proactively make recommendations to each of those clusters.
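
One common way to build such groupings is k-means clustering. The sketch below is a minimal version, assuming two hypothetical customer attributes (visits per month and average order value) and a naive initialization that uses the first `k` points as starting centroids.

```python
from math import dist

def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (labels, centroids)."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return labels, centroids

# Made-up customers: (visits per month, average order value)
customers = [(1, 20), (2, 22), (1, 25), (10, 200), (11, 210), (12, 190)]
labels, centroids = k_means(customers, k=2)
print(labels)
print(centroids)
```

In practice the attributes would usually be rescaled first so that one with a large range, such as order value, does not dominate the distance calculation.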

**Forecast models** deal in metric value predictions using historical data. They use numeric information and estimate the numeric values of new data based on learnings from historical data. Forecast models answer questions like, “How many customers are likely to convert next week?” or “What is the optimal price for this product?” Given a piece of information, they forecast or predict another piece of information.

**Anomaly detection models**, or **outlier models**, are used to identify odd or divergent behavior. They flag things like spikes in revenue that differ from the norm, or a fraudulent purchase or transaction, because these things are different from what we typically expect.
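
A simple way to flag "different from the norm" is a z-score check: measure how many standard deviations each value sits from the mean. The sketch below assumes made-up daily revenue figures and an arbitrary threshold of 2.0; both are illustrative choices.

```python
from statistics import mean, stdev

def find_outliers(values, threshold=2.0):
    """Return (index, value) pairs more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Hypothetical daily revenue with one unusual spike.
daily_revenue = [100, 102, 98, 101, 99, 103, 97, 100, 250, 101]
outliers = find_outliers(daily_revenue)
print(outliers)
```

Because a large spike also inflates the mean and standard deviation it is measured against, small samples like this one need a modest threshold. More robust variants use the median instead of the mean.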

**Time-series models** differ slightly from forecast models, even though those words are oftentimes used interchangeably. Time-series models use data over time to predict a numerical metric for future periods. They attempt to provide information like the expected temperature in the area over the next five days, or how many customers will likely convert over the next four weeks.
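
As one very basic time-series approach, a moving average predicts each future period from the average of the most recent periods. The sketch below assumes made-up weekly conversion counts and a three-week window, and feeds each forecast back in to project four weeks ahead.

```python
def moving_average_forecast(history, periods, window=3):
    """Forecast `periods` future values, each the mean of the latest `window` values."""
    series = list(history)
    forecasts = []
    for _ in range(periods):
        next_value = sum(series[-window:]) / window
        forecasts.append(next_value)
        series.append(next_value)  # reuse the forecast for later periods
    return forecasts

# Hypothetical conversions per week for the last six weeks.
weekly_conversions = [120, 130, 125, 135, 140, 138]
forecast = moving_average_forecast(weekly_conversions, periods=4)
print([round(value, 1) for value in forecast])
```

This ignores trend and seasonality, so it is a starting point rather than a production forecasting method.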

### Q: What is the business strategy of predictive modeling?

*A:* From a strategic standpoint, predictive modeling can be used in many different ways at many different levels within an organization. At the highest level, you may have executives who depend on the data to understand the performance of the overall business. This data is often used to determine forecasts, as well as targets and goals for your sales organization. Strategically, you'd use predictive modeling for forecasting business performance and adjusting your sentiment or outlook when you produce quarterly results.

Predictive modeling is also used to minimize future risks. If you're evaluating some significant new investments, product lines, ventures, or acquisitions, predictive modeling can be used to minimize future risk in those situations.

The marketing department may use predictive modeling to produce rapid, real-time, personalized, one-on-one experiences for customers. In those cases, you're trying to increase not only overall business performance, but also customer experience and lifetime value.

### Q: Are AI and machine learning required for predictive modeling?

*A:* In today’s world, AI, machine learning, and predictive modeling go hand-in-hand, but AI and ML are not an absolute requirement for predictive modeling.

The terms “predictive modeling” and “machine learning” are often used interchangeably, but they are distinct. Machine learning is the use of statistical techniques to allow a computer to construct predictive models. An individual can construct a predictive model on a napkin with a small set of information, but machine learning is when you allow a computer to construct those predictive models for you. Organizations often use machine learning when they’re trying to reach a certain level of scale in computation processing.

Machine learning itself is a branch of artificial intelligence, in which the machine displays the intelligence. The key difference is that AI systems are able to make assumptions, test, and learn in an autonomous way. AI uses a combination of technologies to get the right processing speed, as well as different inputs of data, and machine learning is one technique it uses. There are also other techniques, like deep learning, which is closely associated with neural networks, or algorithms designed to work like the human brain. The purpose of AI is to reassess the model and reevaluate the data, all without the intervention of a human, and then make assumptions based on that predictive analysis.

### Q: How does predictive modeling work?

*A:* Predictive modeling uses historical data to predict future events. That can be time-based data, or it can be data about the behavior and characteristics of past consumers. By using historical data to build a mathematical model that captures important events or trends, a predictive model then uses current data to predict what will happen next or suggest actions to take for optimal outcomes.

Generally, people will group the process of predictive modeling into either five or seven steps. The seven-step version is:

*Step 1: Understand the business objective.* You're not actually doing any predictive modeling yet, but if you don't have a clear understanding of what you are trying to achieve and what business question you need answered, then you're just modeling for modeling's sake.

*Step 2: Define the modeling goals.* Do you have certain expectations around the accuracy of your models? Are there other factors that you need to account for, like the speed of the predictions or the scores being produced, because of the way you plan to ultimately deploy that technology? Do you need to do it in milliseconds, or is it something that can be done over a week? That plays into which model you will select.

*Step 3: Gather data.* Gather any data that will be relevant to making those predictions, including both the historical data surrounding the event you’re trying to predict and the data surrounding characteristics or behaviors. Data mining is a common way to gather the necessary information.

*Step 4: Prepare the data.* Preparing the data is the longest and most tedious aspect of the entire process. Eighty percent of the total time will be spent cleaning the data, transforming the data, or turning the raw data into meaningful variables that fit better with your choice of predictive model.

*Step 5: Analyze the data.* At this step, you may do some sampling to test and evaluate.

*Step 6: Develop the model.* Next, you select and develop the actual model that you’ll use to get the business insights you need from your data. Once you've developed those models and you're pleased with the expected accuracy and the estimates that they are producing, then you validate the models.

*Step 7: Validate the model.* This is where you're testing, optimizing, and deploying the models into the areas of the business in which you may need to take action. Not all predictive models need to be deployed. Some simply produce an output that the business then interprets. But depending on the business objective and the intended goals, you may need to deploy the model. Deployment can require integrating the model into technologies that you're using, for example, to instantly personalize content for a customer as they engage with your website.
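
The testing part of the steps above can be sketched as a holdout check: fit the model on older data, then measure its error on recent data it has not seen. The example below assumes made-up weekly sign-up counts, a straight-line model, and a naive "predict the historical average" baseline for comparison. The function names are illustrative.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

# Hypothetical weekly sign-ups for ten weeks.
weeks = list(range(1, 11))
signups = [10, 12, 13, 15, 18, 19, 21, 24, 25, 27]

# Chronological split: train on the first seven weeks, hold out the last three.
train_x, test_x = weeks[:7], weeks[7:]
train_y, test_y = signups[:7], signups[7:]

slope, intercept = fit_line(train_x, train_y)
model_predictions = [intercept + slope * x for x in test_x]
baseline_predictions = [mean(train_y)] * len(test_y)

model_mae = mean_absolute_error(test_y, model_predictions)
baseline_mae = mean_absolute_error(test_y, baseline_predictions)
print(round(model_mae, 3), round(baseline_mae, 3))
```

If the model's holdout error is not clearly lower than the baseline's, that is a signal to revisit the data preparation or the choice of model before deploying anything.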

### Q: What tools are required for predictive modeling?

*A:* Predictive modeling usually requires several different types of software. You need software for data collection, for transforming the data, for analysis, for model building, and for the actual activation of the model.

A company can either use one solution for each piece, or find solutions that perform more than one task, depending on its budget and needs.

### Q: How do you build a simple predictive model?

*A:* If you're new to predictive modeling, begin with something familiar. Choose data where you already know the relationship and understand the underlying data itself. It's probably best to choose a smaller sample size that is relatively easy to explore.

If it's your first predictive model, start with a time-series model looking at something like the weather, or stock data, or market price data. These are sometimes the easiest to quickly grasp.

Most businesspeople and marketers are familiar with spreadsheet software like Microsoft Excel. If you're building a model for the very first time, take data trended over time, like total revenue by day for the last 24 months, and enter it into your spreadsheet. Then use the TREND function, which uses the least squares method to fit a straight line to a set of known Ys and known Xs. Your known Ys are the data points you have already collected, such as your daily revenue. Your known Xs are your days, weeks, or months, whatever the granularity is, as long as it is consistent. For any new Xs, which would be future days, the function returns the Y values, or revenue, along that line. With that, you're projecting out into the future.
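
For readers who prefer code to a spreadsheet, the same least-squares straight-line projection can be sketched in a few lines of Python. This is an illustrative equivalent, not Excel's implementation, and the revenue numbers are made up.

```python
from statistics import mean

def fit_linear_trend(known_xs, known_ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar, y_bar = mean(known_xs), mean(known_ys)
    sxx = sum((x - x_bar) ** 2 for x in known_xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(known_xs, known_ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def trend(known_ys, known_xs, new_xs):
    """Project Y values along the fitted line for each new X."""
    slope, intercept = fit_linear_trend(known_xs, known_ys)
    return [intercept + slope * x for x in new_xs]

# Hypothetical daily revenue for days 1 through 5.
days = [1, 2, 3, 4, 5]
revenue = [100.0, 110.0, 120.0, 130.0, 140.0]

# Project revenue for days 6 and 7.
forecast = trend(revenue, days, [6, 7])
print(forecast)  # [150.0, 160.0]
```

Because the sample revenue rises by exactly 10 per day, the fitted line continues that pattern. Real data will scatter around the line, and the projection is only as good as the assumption that the trend continues.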