Multivariate analysis — definition, methods, and examples
Multivariate analysis allows you to find patterns between variables, helping you better understand the effects that different factors have on each other and the relationships between them. It represents a critical tool for marketers looking for ways to get deeper insight into the outcome of campaign decisions.
Business owners can also use multivariate analysis to better determine the impact of complex market and global factors on everything from sales to consumer behavior.
This article will demonstrate how multivariate analysis can help you determine how different variables interact with each other in complex scenarios, and the power that comes with identifying patterns and the relationships between those variables. It will also detail multiple methods, examples, and possible use cases to help you understand the strengths of each.
Specifically, it will explain:
- What multivariate analysis is
- Multiple linear regression
- Multiple logistic regressions
- Multivariate analysis of variance (MANOVA)
- Factor analysis
- Cluster analysis
- Discriminant analysis
- Conjoint analysis
- Correlation analysis
What is multivariate analysis?
Multivariate analysis is used to find patterns and correlations between multiple factors by analyzing two or more variables at once. Experts generally group the various approaches of multivariate analysis into two camps — dependent techniques for those instances when the outlined variables depend on one another, and independent techniques for when they do not.
There are many ways to conduct a multivariate analysis. Below we’ll explore some of the most common methods.
Multiple linear regression
Multiple linear regression is an example of a dependent technique that looks at the relationship between one dependent variable and two or more independent variables. For instance, say a couple decides to sell their home. The price they can get for it depends as a variable on many independent factors, including location, time of year, interest rates, and the existence of similar listings in the area.
Let’s think about another example. You want to understand how social media spend across multiple platforms is likely to affect future traffic to your website. In this scenario, the amount and type of spend and platform type are all independent variables, and web traffic is the dependent variable.
Multiple logistic regressions
Multiple logistic regressions, meanwhile, examine the probability of one of two events occurring. Will one’s partner accept an offer of marriage? Will a child graduate from college? In both scenarios, myriad factors may contribute to the final outcome — has the partner expressed interest in marriage previously? Does the child study hard? Ultimately, both come down to a simple yes or no.
For a business owner trying to determine how likely a certain kind of client is to contract with her demolition company for a job, multiple logistic regression is the way to go.
Multivariate analysis of variance (MANOVA)
Multivariate analysis of variance (MANOVA) tests the difference in the effect of multiple independent variables on multiple dependent variables. Say, for example, a marketer wants to study the impact of pairing a price reduction with an increase in campaign budget — both independent variables — on the sales of a certain face cream.
Moving inventory isn’t her only concern, however. She also wants to know what the new price and spend will do to her total revenues on the cream for that period. Both sales and revenues represent dependent variables. In this case, a MANOVA would prove most helpful.
Sometimes too many variables find their way into a dataset. When that happens, it can be difficult to identify any clear patterns in all the information spread out in front of you. Trim back some of those variables using factor analysis, and you can begin to make sense of your data.
Do this by combining closely correlated variables, such as condensing a target audience’s income and education into “socioeconomic status” — or multiple behaviors, such as leaving a positive review and signing up for your newsletter into “customer loyalty.”
Like factor analysis, cluster analysis is a technique you can use to turn down the noise in your data to zero in on what’s most important. Only instead of grouping similar factors together, cluster analysis includes combining similar observations. A scientist performs cluster analysis when, upon discovering new types of deep-sea fish, she organizes them into a single species based on shared traits.
Applied to marketing, cluster analysis represents a powerful market segmentation tool that — when done regularly using powerful, automated software — can help teams deliver personalized experiences based on similar relevant behaviors.
In contrast to factor and cluster analysis, both of which represent independent techniques, discriminant analysis is a dependent technique. Put another way, cluster and factor analysis are exploratory, allowing you to determine the grouping pattern that maximizes similarities within groups — and, by extension, differences between groups.
Discriminant analysis is also concerned with simplifying datasets through classification. However, unlike the previous two types of analysis, it’s a dependent technique. This means it begins with a specific dependent variable, leaving you to focus on the impact your independent variable or variables have on distinguishing between groups. For example, let’s say you want to look at repeat buyers. The repeat buyers represent your dependent variable, a group that you can then examine more closely using any independent variables you have available to you — be it age, location, gender, or something else.
Conjoint analysis refers to the process of surveying consumers to identify which part of a product or service they value most. Used widely in market research, conjoint analysis provides real-world insight into which factors consumers consider most important when making a purchase. Such data is invaluable for minimizing guesswork when it comes to rolling out new products or services (or when tweaking old ones).
Not all feedback deserves equal weight, however. Only invest in a survey that is sure to reach consumers that match your target audience for the product. Just asking the right people isn’t enough. For a response of any real value, you’ll want to be sure to survey as many individuals as possible so as to root out outlying opinions and establish a statistically significant trend.
Equally important for actionable outputs is your survey’s format. Offer too many options, and the respondent becomes overwhelmed. Too few, however, and the survey may fail to pick up on key nuances in consumer preference. One solution researchers have developed to solve this problem is that of asking respondents to rank paired attributes. Do they value clothing that is “affordable and on trend” or “high quality and durable?” This way, the individual taking the survey only has to weigh two options at any given time.
Rarely does life ever offer up just two choices, however, which is why more and more marketers find themselves opting for choice-based conjoint analysis. In this approach, respondents must select one from a handful of full-profile concepts.
For example, say you want to test out an idea for a new winter jacket. Choice-based conjoint analysis would require you to design between three and five different jackets, each with distinct features for respondents to choose from. In the case of a clear winner, you are ready to move forward with confidence knowing that at least compared to the other possibilities, the jacket performs well with your target audience.
Oftentimes the responses are a bit more muddied, however, in which case iteration based on the results of a second or even third survey can help refine your data.
Just because two variables are correlated does not imply causation. Ice cream sales and pool drownings both increase in the summer, but one does not lead to the other. Correlation analysis is designed to help determine the relationship between two variables, including whether or not they are directly linked.
There are three main types of correlation analysis. Pearson’s correlation coefficient is used for linearly related variables — that is to say, as one variable goes up, the other follows (and vice versa). This relationship is measured using a number between 0 and 1, with 1 representing a perfectly linear relationship.
Spearman’s rank-order correlation is used to organize data defined by a set order, also known as ordinal association. For example, say the same runners face off in a series of matches and you want to determine the correlation between contests. You would use their results and Spearman’s rank-order correlation to calculate your answer.
Finally, Kendall’s tau correlation allows you to determine how dependent two variables are on one another, with 0 indicating complete independence and 1 the complete dependence. Note that this goes beyond occurrence. As in the ice cream and drowning example, just because two variables increase in frequency in tandem does not mean they are dependent on each other for happening. Similarly, just because your web traffic and web sales tend to rise and fall in tandem does not necessarily mean one causes another.
For marketers and business owners, correlation analysis can help take the guesswork out of where a company should invest in order to boost its bottom line. Suppose you notice that those consumers who test-drive the cars at your dealership for a minimum of 15 minutes are twice as likely to purchase a vehicle that same day. You want to know if it’s that extra time in the car that pushes them over the edge, or if customers are simply driving the cars longer because they already know they’re inclined to sign the dotted line.
Correlation analysis can help clarify the relationship between the two and possibly uncover a critical consumer behavior for driving sales.
Conducting your own multivariate analyses
Multivariate analysis helps you understand complex scenarios by uncovering patterns and relationships between multiple variables.
The term itself may sound like a highly technical and specialized skill. But in truth, most of us are engaged in some form of multivariate analysis all day long. New parents try to tease out how nap lengths, feeding intervals, and sleep environment combine to influence the number of times a child wakes up during the night. Homeowners must consider a dozen factors — some personal, some global — when deciding whether or not to refinance.
Likewise, marketers understand intuitively that sales and revenue are the result of many factors, ranging from price and campaign spending all the way up to inflation and global market trends.
Ready to get started on your own multivariate analysis? Cluster analysis is a great way to use your existing customer segmentation data to find correlations that might reveal a group that could benefit from a targeted campaign.
Adobe Target allows you to quickly perform multivariate testing, helping you to deliver more personalized experiences to your customers.
Watch an overview video to find out how Adobe Target can help you conduct your own multivariate analyses.