Marketing Mix Modeling at Adobe: Learn to Predict the Future Like We Did

[Music] [Kimberly Leung] And thank you for joining us today. I'm Kimberly Leung, the Group Product Manager of AI and ML at Adobe. And I'm excited to share how we approach Marketing Mix Modeling here.

So before we dive in, here's what you'll take away from today's session. We're going to talk about how we leveraged AI and machine learning to drive faster and more actionable marketing insights. We will talk about the engineering challenges that we faced in consolidating our measurement into a single unified solution for all our teams. And then we'll also talk about the key pitfalls and breakthroughs that we encountered along the way, so that you can apply these lessons to your own marketing strategy and to how you want to build your measurement stack. So let's jump in.

So I think a lot of you can agree that measuring incremental marketing performance is a balancing act. Marketers need speed and flexibility to make quick decisions, but they also need accuracy and complexity to ensure those decisions are sound. And, unfortunately, traditional measurement forces a tradeoff. Either we are meticulous, but we're super slow and rigid, with models that struggle to keep up with the need for quicker insights and action, or we get fast insights, but then we oversimplify reality in our results.

And this challenge became more and more complex for us at Adobe at the time because we were also dealing with fragmented data sources, different methodologies, and a lot of different vendors. We were piecing together multiple tools and publisher reports, only to receive conflicting results most of the time. Beyond that, data wrangling was really, really slow for us. It was very resource-intensive. And by the time we had insights, they were often way too late to act on.

And even when we had results, turning them into decisions was another hurdle. Static reports and the unintuitive tools that we had at the time created a lot of organizational misalignment, and this would lead to a lot of churn and decision paralysis.

So it was really clear to us at that time that we really needed a better approach internally.

So what was also frustrating is, as we were starting that journey to build our internal solution and our different measurement capabilities, another challenge emerged, and that was data erosion. Walled gardens, mobile device restrictions, evolving privacy laws, and the shift towards the post-cookie world all chipped away at the data we were able to capture. At the time, we were developing a very advanced multi-touch attribution solution, and then we started to lose some of the data that we needed, right, to be able to capture all our digital channels. And less data meant less precision, so decision making was getting even harder. So to stay ahead, we needed a measurement approach that wasn't just effective today but was also future-proofed against ongoing data deprecation.

We also needed a solution that could scale across all of Adobe's businesses. At that time, we were growing very quickly. We were building new products, we were acquiring new businesses, and we needed our solution to grow and adapt to our unique needs across those different regions and industries as well. A one-size-fits-all solution wouldn't work.

So at the time we asked ourselves, how do we build measurement models that are both fast and rigorous? How do we deal with fragmented data and conflicting measurement approaches? And with the rapidly changing market and AI advancing at an unprecedented pace, how do we future-proof our measurement strategies and harness AI's full potential? So the answer was for us to rethink how we do measurement completely from the ground up, not just as a tool but as a strategic capability that evolves alongside our technologies and our data landscapes. We embraced adaptive models that integrate diverse data sources and worked around walled gardens so we could get clearer, more actionable insights and a view of performance. The goal was no longer just to measure what happened but it was also to continuously refine our strategies with precision and foresight. And so what we built and what we needed was a flexible AI-driven system that could adjust to varying business factors, evolving regulations, and diverse marketing channels while still delivering actionable insights.

And this is how DDOM was built. This is our internal name for our data-driven operating model. We started building this internal measurement platform in 2014, and it's our centralized hub for measurement, optimization, and planning.

Instead of fragmented reports, DDOM provided near real-time insights in one place for all our business units, all our marketing teams, all our agencies and partners. This eliminated a lot of inefficiencies across our teams, so our teams are no longer trying to reconcile different publisher reports and the results from different measurement methodologies and experiments. And we now allow the team to just focus on answering the key strategic questions, such as where should we invest our next dollar for maximum impact, or how can we leverage our own channels for more growth? At the time, when we first started, Adobe was a $4 billion company, and we were scaling rapidly as one of the first companies to adopt and transition to a SaaS model with Creative Cloud. And as we expanded into Experience Cloud, Document Cloud, and all these different global markets, our internal tool DDOM also evolved alongside us.

We integrated Sensei, AI models, and the latest advancements to adapt to changing marketing landscapes along the way. And so today, the insights from DDOM have been instrumental in Adobe's growth to a $20 billion business, and they have helped us optimize marketing investments and stay ahead in an ever-changing industry.

And what's really, really cool is that, beyond marketing, DDOM has also let our team strengthen our strategic relationship with our finance teams and ensure that we deliver data-driven, aligned investment decisions together. And to highlight this impact, I want to quote our VP of Growth Marketing Performance at Adobe, Matt Scharf, who said: "We have a much deeper relationship with our finance team now. Now we sit down with them and map out how our targets align with the broader business KPIs. It's gotten to a point where they trust our projections almost as gospel. And we can have sophisticated conversations on how marketing can expand or adjust to meet our financial goals."

So today, I just want to summarize the impact that our internal tool had with Adobe. And our projections and recommendations from our models directly influenced billions of dollars in our marketing spend across our 50 plus business units and 29 plus countries that we're across.

And the results speak for themselves. We've been seeing an average of 19% year-over-year improvement in return on marketing spend. And beyond the performance, the rigor and transparency of our models have really created a deeper alignment between us and our finance and strategy teams, ensuring that every investment decision that we make at Adobe as a whole is backed by our AI-driven insights.

So what we found is that AI and analytics isn't just about understanding the past or predicting the future. It has helped us actively shape that future and build the future of measurement.

So that's why in October 2023, we launched Adobe Mix Modeler. And we brought over a decade of our innovation and our learnings from DDOM to our key customers.

Mix Modeler is our product that helps us disrupt that traditional measurement and combines AI-powered measurement and planning in an intuitive, marketer friendly platform.

Mix Modeler delivers on three key pillars that help Adobe internally as well as our key customers in their marketing and measurement strategy. First, we bring in rigor, using cutting-edge AI to ensure accuracy and make sure insights are transparent and explainable. Second, we have speed, with a self-service UI that enables faster data unification. And third, we have scale, with cloud-based technology that gives you the flexibility to expand across different businesses, different products, different regions, and the different data needs that you have.

So that's like a quick preamble about what our technology is about, how it impacted Adobe. Now I'm going to introduce Bowen Wang, our Manager of Machine Learning Engineering. He's played a key role in the measurement efforts in DDOM and was instrumental in building Mix Modeler. Today, he'll walk us through the technical considerations behind the tool, how we solve the specific business challenges, and then the key lessons he learned along the way. And then a fun fact is that Bowen is also an amazing woodworker. So let's welcome Bowen. Bowen, over to you. Thank you.

[Bowen Wang] Hello. All right. Thank you. Hello. Yeah.

Okay. All right. Thank you, Kim. And thank you, everyone, for being here. I'm not talking about woodworking today, even though that would be easier for me. So my name is Bowen Wang. I'm the Manager of Machine Learning here at Adobe. My team develops the machine learning models that power Mix Modeler. So today, I'm really excited to take you under the hood of our modeling methodology.

All right. So just now Kim covered all the measurement challenges we faced here at Adobe. And the solution from Adobe ultimately led to Mix Modeler. Given our time, I won't be able to go into every aspect of Mix Modeler, but I selected some of the key concepts that I'll walk you through today. Those are covered in the middle black box. So as Kim mentioned, Mix Modeler leverages two different measurement approaches under the hood: multi-touch attribution, MTA, and marketing mix modeling, MMM. So I will start from there and talk about the machine learning models that go into each one of them, how they handle real-life marketing behaviors such as adstock and diminishing returns, and why we make these modeling choices. And then we know that having two separate models is not enough. How do we unify them, and even go beyond the two of them, to provide reliable and consistent insights for measurement and planning? That's where the word-- Excuse me.

That's where the term transfer learning comes in. In generic terms, it basically means you take learnings from a different task to help your current task.

And last but not least, budget optimization in Mix Modeler is how we use whatever we learn through MTA and MMM to enable forward-looking decision making for marketers. And that's the last concept I will cover.

All right. Let me start with MTA. All right? So we know that multi-touch attribution is built on event-level data. So it's good to start by visualizing the data. This diagram shows two customers, customer A and B, and their online journeys over time. The x-axis represents time. We see that customer A converted in the end, but customer B didn't.

For us to build a multi-touch attribution model, both types of paths are important, both the non-conversion and the conversion paths, because it's the contrast between the two that essentially lets us learn whether a particular type of marketing touchpoint is effective at driving conversion. For instance, if there is one type of touchpoint that's just as likely to appear on a non-conversion path as on a conversion path, there isn't much signal from the data to suggest that it's impactful for conversion. But on the other side, if a marketing touchpoint is much more likely to appear on conversion paths, then that signals that it is impacting conversion. So I'll take customer A's journey and use that as a running example as I go through some of the intuitions of how we build up the multi-touch attribution model.

All right. So this diagram shows how, in the view of our MTA model, the marketing touchpoints actually impact the customer's behavior. The intuition is that a customer comes in with some baseline interest level towards making a conversion. That's on the y-axis, the cumulative interest level. On the x-axis is time. Over time, there are different marketing touchpoints on this customer's path. For instance, at time T1, there's an email that increased the interest level to some point. But as time progresses, that increase in interest level slowly fades. And given enough time, we assume it eventually comes back to the baseline level.

That's exhibiting the media adstock impact, right? But luckily for customer A, there are two subsequent touchpoints happening at time T2 and time T3, bumping up the interest level each time. And then at time TC, the conversion time, that's where the customer actually converted. So if we take a snapshot at time TC, that's where we can decompose this interest level into the baseline and the interest-level gains from each of the touchpoints, as shown in the thetas here on the right. The use of the letter theta suggests it's something we learn from the model. It's a model parameter. And for each of the touchpoints, you notice that there's a second subscript. It's because the time lapse from the exposure to the conversion time is also important. We need to take that into account. For someone's conversion path, we can take this decomposition at the conversion time. But more generally, you can take the decomposition at any time, right? Why that matters is, earlier I mentioned why it is important to consider the nonconversion path. On a nonconversion path you don't have a TC, right? But there's always a user-specified conversion window as a reference. And when you do the decomposition, you can also measure the touchpoints on the nonconversion path as well. So in general, through model training, what we try to learn are these thetas. And what comes out of the model is that, generally and systematically, if you evaluate the cumulative interest level on a conversion path, it will be higher than on a nonconversion path.
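To make the decomposition idea concrete, here is a minimal Python sketch. It is not the Mix Modeler implementation: the function name, the exponential fade, the decay rate, and all the numbers are assumptions chosen only to illustrate how a cumulative interest level at a given time splits into a baseline part and one faded part per touchpoint.

```python
import math

def interest_decomposition(baseline, touches, t_eval, decay_rate):
    """Split the cumulative interest level at t_eval into baseline and per-touch parts.

    touches: list of (name, touch_time, initial_gain) tuples.
    Each gain fades exponentially with the time elapsed since the touch
    (an assumed decay shape, used here only for illustration).
    """
    parts = {"baseline": baseline}
    for name, t_touch, gain in touches:
        if t_touch <= t_eval:  # touches after t_eval have not happened yet
            elapsed = t_eval - t_touch
            parts[name] = gain * math.exp(-decay_rate * elapsed)
    return parts

# Hypothetical journey for customer A: email at T1, email at T2, search at T3.
touches_a = [("email_1", 1.0, 1.0), ("email_2", 3.0, 1.0), ("search", 5.0, 1.5)]

# Snapshot at the (hypothetical) conversion time TC = 6.
parts_at_tc = interest_decomposition(baseline=-1.0, touches=touches_a,
                                     t_eval=6.0, decay_rate=0.3)
cumulative_interest = sum(parts_at_tc.values())
```

The same function can be evaluated at any time, for example at the end of a conversion window on a path that never converted, which is the point made above about nonconversion paths.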

So all this intuition behind interest level is clear, but interest level isn't something we readily observe. In fact, we observe just whether you converted or you didn't convert. So it's a zero-one outcome. Many of you may be familiar with supervised machine learning and classification, right? So very naturally, one way to handle that is to map the cumulative interest level to the probability of conversion, which is represented by this S-shaped curve right here. So in this graph, the x-axis represents the cumulative interest level, and the y-axis is the probability of conversion.

For this graph, we're focusing on that time TC where we made that decomposition. As you can see, the four thetas are on this graph in order. Each theta corresponds to a probability of conversion on the blue S-shaped curve. And you notice that it's S-shaped, so it's not a linear mapping, and order does matter here. Why is that important? For instance, take the last touchpoint on customer A's journey, which was the search. That increased the interest level quite a bit. That's the last horizontal arrow, theta sub-s.

That interest-level increase is bigger than the one from the second email touchpoint, which is represented by the second-to-last horizontal arrow, theta sub-e. But the increase in probability is actually smaller, if you compare those two red bars. And this view from the probability perspective is how we use the [INAUDIBLE] conversion back to each of the individual touchpoints. And this concave shape naturally accounts for diminishing returns. So the intuition is, once the customer has reached a very high interest level, if you keep bombarding them with media touchpoints, it shouldn't increase the probability meaningfully, basically.
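The S-curve argument can be sketched in a few lines of Python. This assumes the S-shaped curve is a standard logistic function and uses made-up theta values; neither is confirmed by the talk. The sketch stacks the contributions in path order and records how much conversion probability each one adds.

```python
import math

def sigmoid(level):
    """Map a cumulative interest level to a conversion probability."""
    return 1.0 / (1.0 + math.exp(-level))

def incremental_probabilities(contributions):
    """Return the probability gained as each contribution is stacked in order."""
    level = 0.0
    previous_probability = 0.0
    gains = {}
    for name, theta in contributions:
        level += theta
        probability = sigmoid(level)
        gains[name] = probability - previous_probability
        previous_probability = probability
    return gains

# Hypothetical interest-level parts at conversion time, in path order.
thetas = [("baseline", -1.0), ("email_1", 1.0), ("email_2", 1.0), ("search", 1.5)]
gains = incremental_probabilities(thetas)

# Search has the larger theta (1.5 vs 1.0) but the smaller probability gain,
# because it lands on the flatter, upper part of the S-curve.
search_gain_is_smaller = gains["search"] < gains["email_2"]
```

With these illustrative numbers the gains add up to the final conversion probability, which is what gives each touchpoint's score an incremental-probability reading.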

So this type of modeling we use for MTA is known as a discrete-time survival model in biomedical research. In a survival model, the main event of interest is the occurrence of death. In our case, it's the occurrence of a conversion. So where does this discrete time come in? Earlier, for showing adstock, I used those continuous blue curves. But there's another way to capture the adstock impact. Here, in this diagram, we use email as an example. The x-axis is time since exposure in weeks, the delta t, and the y-axis is the impact on the interest level. If we capture the email adstock as a continuous function, usually that continuous function is something like exponential decay, a one-parameter exponential function. Then what we're really estimating is that lambda, and we plug in the delta t, the observed value, at the time of decomposition to get the actual impact from that media touchpoint. But when we make it discrete, with discrete time windows, each time window has a different parameter assigned to it. We estimate those scalar parameters instead. And to make sure we enforce monotonicity, that's where the constraint comes from. If you're closer to the time of exposure, then the impact from that media touchpoint is higher in general. And that's handled through numerical optimization with constraints, basically. There are several benefits from making time discrete. Intuitively, we usually make observations at discrete times. It also increases the flexibility of the model to capture nuances in the decay. And it alleviates some of the more restrictive assumptions we have to make when we're using continuous time.
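As a rough illustration of the two options, the Python sketch below puts a one-parameter exponential decay next to a set of per-window parameters that are forced to be non-increasing. The talk says the constraint is handled inside a constrained numerical optimization; the pool-adjacent-violators projection used here is a simpler stand-in chosen for illustration, and the raw per-window values are invented.

```python
import math

def exponential_adstock(weeks_since_exposure, decay_rate):
    """Continuous option: a single parameter (lambda) controls the whole curve."""
    return math.exp(-decay_rate * weeks_since_exposure)

def project_non_increasing(values):
    """Discrete option: closest (least-squares) non-increasing sequence.

    Uses pool-adjacent-violators: whenever a later window would have a larger
    effect than an earlier one, the two are pooled and share their average.
    """
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([float(v), 1])
        while len(blocks) > 1 and (blocks[-2][0] / blocks[-2][1]
                                   < blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

# Hypothetical unconstrained estimates for weeks 0-1, 1-2, 2-3, 3-4 after an email.
raw_window_effects = [1.0, 0.6, 0.7, 0.2]
window_effects = project_non_increasing(raw_window_effects)
```

In this toy case the second and third windows violate monotonicity, so they are pooled to a shared value while the first and last windows are left alone. The discrete version can follow shapes a single exponential cannot, at the cost of more parameters.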

So a quick summary of the decision points, why we landed on this discrete-time survival model for our MTA, right? First of all, as we touched on earlier, it's important to look at both the conversion and nonconversion paths, right? Second, it's highly interpretable. It's very easy to use the model to capture real-life marketing behaviors such as adstock and diminishing returns. Both of these are things that rule-based methods cannot accommodate straight out of the box, right? Also, this method is very easy to scale, and it has a very intuitive scoring process with an incremental probability interpretation. Right? If you compare it to some other methods like the Hidden Markov Model, number three and number four are where they generally fall short compared to this model. And lastly, it has very high predictive accuracy. Even if for now we ignore all the requirements of interpretability and treat it just as a classification problem, looking at the predictive AUC, it's there, right? That opens the door to many other classification algorithms, but this model still performed very well against them. Those are the main reasons why over the years we landed on this method, even though we've tried so many.

So now switching gears a little bit, let's talk separately about Marketing Mix Modeling, or MMM, which is built on aggregate-level data. And again, let's start with a visualization of what the data looks like.

So after all the data ingestion, validation, and cleaning, it ultimately comes down to a tabular format, something like this. This is just a toy example. Each row corresponds to a time period, usually weekly, sometimes daily, and each column represents a variable. The outcome of interest is the conversion column, we have factors such as seasonality or promotion as control variables, and then there are the media channels. And the goal, of course, is to model the relationship between weekly sales or weekly conversions as a function of those factors and media inputs, right? So just a quick example. I highlighted week four because when we try to model and predict the week four conversion, the model will make use of the factor data from the same week and the media channel data from the weeks leading up to that week, because of adstock.

So one key thing to call out is that we use a multiplicative model. Basically what that means is weekly conversion is modeled as a product of baseline demand and the multipliers from each of the media channels. At a very high level in terms of notation, ignoring a lot of details, it's written in that form. So a quick note on the notation. Again, theta represents the model parameters we try to learn in the training process. X and Y represent the data we actually observe and use in training. And the subscripts correspond to the two channels in this toy example, Search and Display. And if we relate back to the week four example from the previous page, the actual conversion we observe in week four is something like 1730. What the model predicts will be slightly different, giving a residual. But we can decompose the model prediction into the part from the baseline and the multipliers from each of the marketing channels, basically.
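A small Python sketch of that multiplicative structure follows. Only the observed value of roughly 1730 comes from the talk; the baseline and the two multipliers are hypothetical numbers picked so the prediction lands near it, and the function name is an assumption.

```python
import math

def predict_conversions(baseline, multipliers):
    """Multiplicative form: prediction = baseline * product of channel multipliers."""
    prediction = baseline
    for multiplier in multipliers.values():
        prediction *= multiplier
    return prediction

# Hypothetical week-four values for the Search and Display toy example.
baseline_week4 = 1200.0
multipliers_week4 = {"search": 1.25, "display": 1.15}

predicted = predict_conversions(baseline_week4, multipliers_week4)
actual = 1730.0                # observed conversions mentioned in the talk
residual = actual - predicted  # the part the model does not explain

# On the log scale the product becomes a sum, so the prediction splits
# cleanly into a baseline part and one part per channel.
log_parts = {"baseline": math.log(baseline_week4)}
log_parts.update({ch: math.log(m) for ch, m in multipliers_week4.items()})
```

The log-scale view is one common way to read a multiplicative model as a sum of parts; whether Mix Modeler reports its decomposition this way is not stated in the talk.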

And earlier the equation was very high level, but the multiplier function is where the adstock and diminishing returns are all taken care of. So how do we take care of them in the MMM? For adstock, we use exponential decay. It can be either one-tail or two-tail. The difference is where the peak impact occurs after your media investment, as shown on this plot with the two different types of decay.
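Here is one way the two decay shapes could look in Python. The exact functional forms used in Mix Modeler are not given in the talk, so the weight formulas, the function names, and the parameter values below are assumptions: the one-tail version peaks in the week of the spend, and the two-tail version peaks a chosen number of weeks later and decays on both sides.

```python
def one_tail_weights(decay, max_lag):
    """Peak impact in the week of the spend, then geometric decay."""
    return [decay ** lag for lag in range(max_lag + 1)]

def two_tail_weights(decay, max_lag, peak_lag):
    """Peak impact peak_lag weeks after the spend, decaying on both sides."""
    return [decay ** abs(lag - peak_lag) for lag in range(max_lag + 1)]

def apply_adstock(spend, weights):
    """Carry each week's spend forward using the given lag weights."""
    adstocked = []
    for week in range(len(spend)):
        total = 0.0
        for lag, weight in enumerate(weights):
            if week - lag >= 0:
                total += weight * spend[week - lag]
        adstocked.append(total)
    return adstocked

# A single burst of spend in week 0, followed by nothing.
spend = [100.0, 0.0, 0.0, 0.0]
immediate_peak = apply_adstock(spend, one_tail_weights(decay=0.5, max_lag=3))
delayed_peak = apply_adstock(spend, two_tail_weights(decay=0.5, max_lag=3, peak_lag=1))
```

With a single burst of spend, the first series is highest in week 0 and the second is highest in week 1, which is the difference in peak location described above.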

To take care of diminishing returns, we use the power function applied at the right location. Basically, if you take the power function, x to the power of theta, and you keep theta in the range between zero and one, then it has this concave shape that captures the diminishing returns. And these are captured in the multiplier function in the MMM model.
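The concavity claim is easy to check numerically. This short Python sketch uses a hypothetical theta of 0.5 and illustrative spend levels; the function name is an assumption.

```python
def saturate(x, theta):
    """Power transform x**theta; concave for 0 < theta < 1."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must be strictly between 0 and 1")
    return x ** theta

theta = 0.5  # hypothetical value
first_hundred = saturate(100.0, theta) - saturate(0.0, theta)
second_hundred = saturate(200.0, theta) - saturate(100.0, theta)

# The second 100 units of (adstocked) spend add less than the first 100.
diminishing = second_hundred < first_hundred
```

Any theta strictly between zero and one gives the same qualitative picture; values closer to zero saturate faster.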

So why do we choose this modeling approach? Or more specifically, very often people ask, why didn't you use an additive model, which is very common out there? So very quickly, also at a very high level, the difference is how you model weekly sales or weekly conversions, right? In the additive model, the key, which is highlighted in blue, is basically the plus operator instead of a multiplication, right? And each of the terms is quite parallel to our model. So the decision is actually motivated by real-world marketing behavior that we try to handle within MMM. For instance, there's a belief in media synergy. Basically, media working harder together. One plus one is greater than two. And for this one, it's very easy to see that if it's additive, there's no room for the total to be greater than the sum. One plus one has to be equal to two, right? Then there are also two other considerations.

For example, there's the expectation that we should have a time-varying impact from a media channel. For instance, you have the same investment in two different time periods in a year. It can potentially have a different impact due to external factors, right? Another related concept, if we look forward a little to budget optimization, is this: if I foresee some change in economic factors, or whatever is going to change the baseline fluctuation in the future, how is that reflected in my budget recommendation? Because budget recommendation isn't just across channels. It can also be across time.

So the second and third are closely related. I'll use a simple example to illustrate the difference between the additive and multiplicative models in this case, and why this can be readily addressed in the multiplicative model but not in the additive model. I apologize for these super packed slides, but the idea is actually pretty simple. I'll walk you through this.

So we start with a very simple scenario. I'm a sneaker company selling sneakers online. Generally, the demand for sneakers is higher in the summer months but lower in the winter months, right? That's captured on the bottom left graph: over the whole year, what's the baseline demand for the sneakers? And my default strategy for planning is that I spend a fixed amount of marketing budget across the whole year. Each month gets the same budget. I made two simplifications for this example. First, I ignore adstock. Second, I treat paid media as a unit. These two are independent of what we're trying to answer here, the comparison between additive and multiplicative. So without loss of generality, I made these assumptions so that the example is tractable and easy to follow, right? So then we move up there for the additive model and down there for the multiplicative model. What does constant spend in each month mean under this setup? Constant budget in the additive model means constant contribution across each month, which is reflected on the top graph in the middle column, the A, for the additive model. It's the same contribution over time. But on the multiplicative side, constant budget means a constant multiplier each month. And the multiplier works with the baseline demand to give you a time-varying impact for the same amount of money you spend each month. That's on the bottom graph in the middle column. It matches the shape of the baseline demand, basically.

Then the next question is-- In this example, I made the total contributions from the additive model and from the multiplicative model the same, 840, right? But is there any capacity to move away from a fixed budget, shifting budget around but keeping the total budget the same? And can we do better, right? In the additive model, there isn't any incentive from the modeling perspective to motivate you to make that change because it doesn't require-- It just has nothing to do with the baseline. There's no interaction. And also, assuming diminishing returns have already kicked in, having a flat spend is actually optimal. If you move $1 from November to May, the gain in May is smaller than the drop in November because of diminishing returns, right?

But in a multiplicative model, there's actually quite a bit of room to shift around. Basically, we know the direction to shift: we should shift from the winter months to the summer months. And diminishing returns also apply. When they apply here, it means the average multiplier is smaller. It's going to drop across the year. But the gain in the summer months is more than able to compensate for the drop in the winter months. And in this case, that brings up the total contribution from marketing quite a bit. And how do we actually know the extent of the shift and where to shift? That's what's covered in the budget optimization algorithm, which I will talk about later. Yeah.
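The sneaker example can be reproduced in spirit with a short Python sketch. The monthly baseline values, the scale factors, the exponent, and the function names are all invented for illustration (the figure of 840 from the slide is not reproduced here), and adstock is ignored, as in the talk. The sketch compares a flat budget with one shifted toward high-demand months under each model.

```python
# Hypothetical baseline demand for sneakers, January to December.
baseline = [60, 60, 80, 100, 120, 140, 140, 120, 100, 80, 60, 60]
total_budget = 120.0
theta = 0.5  # diminishing-returns exponent (assumed)

flat = [total_budget / 12] * 12
# Same total budget, but allocated in proportion to baseline demand.
shifted = [total_budget * b / sum(baseline) for b in baseline]

def additive_contribution(spend, scale=10.0):
    """Additive model: media contribution does not depend on the baseline."""
    return sum(scale * x ** theta for x in spend)

def multiplicative_contribution(spend, scale=0.1):
    """Multiplicative model: multiplier = 1 + scale * x**theta, so the
    incremental contribution is baseline * (multiplier - 1)."""
    return sum(b * scale * x ** theta for b, x in zip(baseline, spend))

additive_flat = additive_contribution(flat)
additive_shifted = additive_contribution(shifted)
multiplicative_flat = multiplicative_contribution(flat)
multiplicative_shifted = multiplicative_contribution(shifted)
```

Under these assumptions the additive model scores the flat plan higher than the shifted one, while the multiplicative model scores the shifted plan higher, which matches the argument above. The proportional shift is just one feasible reallocation, not the optimum.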

So we very briefly covered MTA and MMM. For the next step, I'd like to borrow this Swiss cheese analogy from our internal stakeholder, which I use a lot. For those of you who haven't heard it, each slice of Swiss cheese represents one of the tools we have, and each of them has holes in it, representing limitations, whether they're due to data, due to your assumptions, or due to gaps in methodology. But when you stack different slices of cheese up the right way, you can actually cover the whole area without holes. And that's also how Adobe is trying to approach the whole measurement and planning problem. We know that MTA and MMM are intrinsic parts of Mix Modeler. Experimentation is another pillar of how we solve the marketing measurement problem at Adobe. Even though it's not implemented within the framework of Mix Modeler, the model does have to take into account whatever experimentation results you get, so that we can unify them to provide consistent and reliable insights for business decisions.

So how do we do the unification? Because we just touched on MTA and MMM, I'll start there. How do we do the bidirectional transfer learning that Kim alluded to earlier, right? So this is the workflow at a very high level. On the left side is MTA, on the right side is MMM. When we have both event-level data and aggregate-level data, in the first step, steps 1A and 1B, we build each model separately, a purely data-driven model based on the data and the configured setup, right? Then the MTA scores are produced at the event level in step two, and those event-level MTA scores feed into step three, together with the MMM model built in step one. These two together lead us to step number four, the transfer learning process, and I'll talk a little more on the next slide about how exactly it works. But the outcome of the transfer learning is that we take these two inputs and produce an output. The output is an updated MMM model. It's a new set of parameters compared to what we had in step 1B, and that is used for generating any aggregate-level insight, the scores and everything, which you see in step number five. And when we have these MMM scores updated, taking MTA into account, that's the second stage of transfer learning, going from step five to step six. At this step, the event-level MTA scores from step two also act as an input, so that we have consistent, updated MTA scores at the event level as well, and that's the output at step seven. That's how it works at a high level. So there are a couple of things I'd like to highlight here.

First of all, the MTA model and the MMM model can be built on different time windows, right? Typically, for MTA, the training window is shorter. Say, in this example, it's half a year. For MMM, it's typically at least two years.

But in practice, usually there should be a big overlap in the training window because that's where the information is flowing, right? The second highlight is-- The first step, what we're taking from MTA model into the transfer learning in the first stage of the Bi-directional Transfer Learning, the arrow from step two to step three is actually the relative performance of the media channels that's covered by MTA. The reason is that we do try to bring the best of all worlds in a sense.

Wow. Very exciting.

Yeah. We try to bring the best of all worlds, right? For MTA, we know the limitation is the channel coverage, right? Especially with cookieless and everything. But the advantage is the vast amount of data you have for analyzing the channels it does cover, right? So that's why the relative performance is the key. We really want to take that from MTA and then use it together with the MMM model, which sees the more holistic picture, all the factors and the online and offline channels you have. That is why that's the first step, right? And a third note is about the level at which we do this. Some of you may have noticed that transfer learning actually is where-- For the MMM side, we have to do that at the model level. Meaning, it will influence your training and influence how your parameters are selected, because the additional insights downstream, whether it's budget optimization or the response curves we try to plot, right? Those have to come from the model, the multiplicative model. But on the MTA side, it's not a requirement, and that has to do with the notion of sufficiency. What is the business problem we're trying to address, right? With MTA, usually when you produce the event-level score, that's what we call a sufficient statistic. You can basically ignore the model parameters. Every downstream insight can come from the score itself. So that's also a way to simplify the process, and we focus on the business problem we're trying to solve.

So beyond MTA and MMM, just now I alluded to experimentation, right? And even beyond experimentation, there are so many different sources of prior knowledge we can leverage for the measurement and planning problem. Different companies may have different sources of prior knowledge. MTA is one of them. Lift test results, experimentation results. There could be a very strong belief about the spend share, or an in-house model you built before using a different approach, or a strong belief from past research and experience in the industry, right? These can all be sources of information that can be leveraged by the model building process through the same process, called transfer learning. So in the middle, I updated the diagram a little bit to emphasize that all these sources of prior knowledge are actually optional. You don't need to have any of them for the MMM to work. If you don't have any of them, the default model will be used to generate scores and insights and then budget optimization. But if you do have them, one or more, we can accommodate everything so that they guide the model building process.

So I've been talking about this transfer learning for quite a bit, but what does it actually do? Basically, it's a numerical optimization algorithm at work. If you think about MMM model training, what we're essentially doing is setting up a loss function, right? There's a goodness-of-fit metric. There's regularization and everything. We solve the numerical optimization, and then we get the thetas. So what happens here is, if there are one or more sources of information, we augment that original optimization objective function with another term. That term measures the distance between what you supply to me as prior knowledge and what the model would produce to compare against what you just supplied. So basically a distance metric.
A lot of the work, the challenges, and the innovations actually come in how to set it up right, because to solve the problem efficiently, you need the right metric and you need the right representation. If you want to use gradient descent to solve the problem efficiently, the gradient has to be tractable, right? So there are a lot of considerations, including, when you have multiple sources of information, how you standardize those inputs so you can leverage them in a systematic way and balance all the information you have.
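
To make the augmented objective concrete, here is a minimal sketch rather than Adobe's actual formulation. It assumes a plain linear model, mean squared error as the goodness-of-fit term, and a squared Euclidean distance between the coefficients and a supplied prior as the extra term. The `fit` function, the `weight` parameter, and the toy data are all hypothetical. The squared distance is chosen here only because its gradient is simple.

```python
def fit(xs, ys, prior=None, weight=0.0, lr=0.01, steps=5000):
    """Gradient descent on mean squared error plus weight * ||theta - prior||^2."""
    n, k = len(xs), len(xs[0])
    theta = [0.0] * k
    for _ in range(steps):
        grad = [0.0] * k
        # Gradient of the goodness-of-fit term (mean squared error).
        for x, y in zip(xs, ys):
            err = sum(t * xi for t, xi in zip(theta, x)) - y
            for j in range(k):
                grad[j] += 2.0 * err * x[j] / n
        # Gradient of the added distance term, only when prior knowledge is supplied.
        if prior is not None:
            for j in range(k):
                grad[j] += 2.0 * weight * (theta[j] - prior[j])
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

# Two channels whose spend always moves together, so the data alone
# only pins down the sum of the two coefficients (here, 3).
xs = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
ys = [3.0, 6.0, 9.0]

theta_default = fit(xs, ys)                               # no prior knowledge
theta_guided = fit(xs, ys, prior=[2.0, 1.0], weight=1.0)  # prior favours channel 1
```

Without a prior, the fit splits the effect evenly between the two channels (about 1.5 each), because nothing in the data separates them. With the prior, the solution moves to about 2 and 1 while fitting the data equally well, which is the sense in which outside knowledge guides the model building process.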

The last big concept I want to touch on is budget optimization. Budget optimization is based on the MMM model we built, basically the multiplicative model. To help visualize what it looks like, I created this toy example. We have only two channels and only one time period. The channels are search and display. And we have a total budget that we want to split between the two to maximize our conversions in this case.

So in the multiplicative model, the optimization surface looks something like this. In terms of notation, the x in model training is the input, but here we turn it around: x is the optimization parameter. These are the question marks we're trying to solve for. I highlighted the equation in blue. Basically, we try to maximize the revenue, given those functions, over the parameters x sub s and x sub d.
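
The toy problem can be sketched in a few lines. This is an illustration under stated assumptions, not the Mix Modeler implementation: the response is assumed to take the form base * x_s^a * x_d^b, the exponents and budget are made up, and a coarse grid search stands in for the numerical optimizer.

```python
def conversions(x_s, x_d, base=10.0, a=0.3, b=0.2):
    """Assumed multiplicative response; a, b < 1 gives diminishing returns."""
    return base * x_s ** a * x_d ** b

def best_split(total, step=1000):
    """Grid search over search spend; display gets the rest of the budget."""
    candidates = range(0, total + step, step)
    x_s = max(candidates, key=lambda s: conversions(s, total - s))
    return x_s, total - x_s

# Hypothetical total budget of 100,000 split between search and display.
x_s, x_d = best_split(100_000)
```

For this assumed functional form, the optimum gives search the fraction a / (a + b) of the budget, so the grid search lands on 60,000 for search and 40,000 for display.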

And this budget optimization algorithm, yeah, typically we solve it under constraints, but there's a lot of flexibility in how you can use it. I listed two examples here. First, it very easily handles channel-level constraints, within one period or across time periods. And then we can also plan across multiple conversions. This surface shows just one conversion, but I'll show later that we can plan across multiple conversions, which makes it very flexible.

So this constraint part, right? Following on from the example on the previous page, what does it look like? We have search and display. Now we give it some channel-level constraints. Search can only spend somewhere between 10,000 and 80,000 during this time window. Display can spend between 5,000 and 70,000. And then we have a total budget constraint. Between the two, you only have $130,000 to spend. That gives this gray area, which is the set of all budget allocations that are possible under the constraints. And if you translate that into the optimization surface, you basically cut it off a little bit due to the constraints.

Basically, the numerical optimization is solving for that point, shown at the junction of the three red dotted lines here.
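
The constrained version of the toy problem can be sketched the same way, using the bounds quoted above. The response function and its exponents are assumptions chosen so that a constraint actually binds, and the exhaustive grid search is a stand-in for the numerical optimizer described in the talk.

```python
def conversions(x_s, x_d, base=10.0, a=0.6, b=0.2):
    """Assumed multiplicative response for search (x_s) and display (x_d)."""
    return base * x_s ** a * x_d ** b

def best_feasible_plan(search_bounds, display_bounds, total_budget, step=1000):
    """Search every grid point inside the channel bounds and the total budget."""
    best_plan, best_value = None, float("-inf")
    for x_s in range(search_bounds[0], search_bounds[1] + step, step):
        for x_d in range(display_bounds[0], display_bounds[1] + step, step):
            if x_s + x_d > total_budget:
                continue  # outside the feasible (gray) region
            value = conversions(x_s, x_d)
            if value > best_value:
                best_plan, best_value = (x_s, x_d), value
    return best_plan

# Search: 10,000 to 80,000. Display: 5,000 to 70,000. Total: at most 130,000.
plan = best_feasible_plan((10_000, 80_000), (5_000, 70_000), 130_000)
```

With these assumed exponents, the unconstrained split of 130,000 would put 97,500 into search. The channel cap cuts that off, so the plan comes out as 80,000 for search and 50,000 for display, on the edge of the feasible region.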

And planning across multiple conversions, right? Earlier we showed planning across one conversion, but there often comes up a use case where we have to plan across multiple conversions. One example is, I'm a retailer. I have online sales and in-store sales. I built two models, one for each, with the same budget pool, because I expect those channels to affect online sales and in-store sales differently. But when I do planning, it's the same budget pool. Where should I allocate money so I can maximize revenue across these two types of sales, right? That's where the equation comes in. Earlier, we only looked at one function F. Now we're looking at two, and we put them together as a weighted sum so we can optimize towards the total, right?

Another example is when you consider upper funnel and lower funnel. That's where it becomes really important, because of the two models you build, one could be on sales, the lower funnel, but for the other, the response variable may be some brand awareness KPI. When you do optimization, naturally they're not in the same unit, right? But depending on how you want to prioritize long-term versus short-term success, you can reflect that in the weights so that the optimization actually takes both into account. And, of course, the diagram on the right is basically showing that there's a lot of flexibility in channel coverage as well. These two conversions, or two different models you set up, don't have to cover the exact same channels. They can have their own. They can have overlap. It doesn't matter. Because the problem is basically a numerical optimization, with gradient propagation, it all works out together.
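
The weighted-sum idea can be sketched as follows. Both response functions, their exponents, the weights, and the `best_search_spend` helper are hypothetical, and grid search again stands in for the gradient-based optimizer mentioned in the talk.

```python
def online_sales(x_s, x_d):
    """Assumed online model: responds mostly to search."""
    return x_s ** 0.4 * x_d ** 0.1

def in_store_sales(x_s, x_d):
    """Assumed in-store model: responds mostly to display."""
    return x_s ** 0.1 * x_d ** 0.4

def best_search_spend(w_online, w_in_store, total=100_000, step=1000):
    """Maximize the weighted sum of both conversions over one shared budget."""
    def objective(x_s):
        x_d = total - x_s  # display gets whatever search does not use
        return (w_online * online_sales(x_s, x_d)
                + w_in_store * in_store_sales(x_s, x_d))
    return max(range(0, total + step, step), key=objective)

online_only = best_search_spend(1.0, 0.0)    # plan for online sales alone
in_store_only = best_search_spend(0.0, 1.0)  # plan for in-store sales alone
balanced = best_search_spend(1.0, 1.0)       # equal weight on both
```

Planning for online sales alone puts 80,000 into search, planning for in-store alone puts 20,000 into search, and equal weights settle at 50,000. Changing the weights is also where a difference in units or in long-term versus short-term priority would be expressed.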

So a quick recap of what I covered.

I removed the phrase black box to start.

And I talked about MTA and MMM by themselves: how they handle adstock and diminishing returns, and how they talk to each other through the process called transfer learning, right? And even beyond that, if you have test or experimentation results, you can also incorporate those through transfer learning, and then how all of that feeds into budget optimization.

I hope this deep dive has been informative and interesting and not a reminder of your least favorite stats class from college.

So with this, I'd like to invite my partner Kim on stage for final remarks. Thank you.

[Kimberly Leung] Thanks, Bowen.

So one thing I do want to underscore, after going through Bowen's in-depth detail on how we built Mix Modeler: I think it's really important that we all keep staying ahead with our marketing innovation roadmap. We believe that it's important not to just stop where we are but to continue to develop new ways to solve further problems and new use cases as technology allows.

And a well-maintained marketing innovation roadmap ensures that our brands remain relevant, resilient, and capable of capturing any marketing opportunities that come along, and also makes sure that we evolve with the market itself.

So to that point, I wanted to highlight some of our new innovations, and maybe these can be ideas for your marketing tech stack and what you're trying to develop on your side as well. We're coming out with a feature that lets you build plans based on your goals and targets. So instead of giving us a marketing budget, you give us a marketing goal, and we'll generate a suggested budget and plan that will hit that goal. We're also able to do portfolio planning. This is what Bowen was talking about with multiple conversions and figuring out the optimal budgets for them. It will allow you to build plans across different business portfolios, different business goals, different products. We're also coming out with granular incrementality insights, which let you get deeper into your marketing performance, so you no longer just look at touchpoints and channels. You can also slice and dice those details by campaigns, publishers, and segments, in any way that gets you deeper insights that you can actually act on in your marketing campaigns. And then we're also building AI agents into our critical workflows to help you scale and collaborate more effectively with teams. For example, you can have an agent help you run several models at once so you can compare and contrast them quickly.

So, I guess, to just wrap up, the keys to success, we believe, are these. First, make sure there's a unified measurement approach across all your teams: one place where everyone gets their insights and their planning capabilities, so that you reduce the churn and the work needed to reconcile between teams and can work together to answer the important questions. Also, use AI to drive accuracy and speed, because that's the tool you have now to balance all the different business requirements your teams have. And then always evolve for impact. Always watch out for what's out there and what's new, and see what makes sense to integrate into your marketing stack.

Here are some resources for your reference that you can take a look at: additional sessions for Mix Modeler, our website, as well as our Adobe on Adobe story in more detail. And then also just a reminder, everyone, please take the survey at the end of the session. Thank you, everyone. I think we're about out of time anyway. So thanks, all, and thanks for your time here. Thank you, all. [Music]


About the Session

Discover how Adobe Marketing achieved an 80% increase in return on media spend over five years by overcoming major measurement challenges. Hear from Adobe’s data scientists about their journey — from fragmented, conflicting results to a unified solution called Mix Modeler that quantified the incremental impact of marketing and also reliably predicted future ROI. Get an inside look at the innovative (patent-pending!) AI/ML techniques Adobe trialed and refined to meet the needs of modern marketers, plus upcoming innovations.

Key takeaways:

  • Gain insights into the engineering challenges that Adobe faced in consolidating down to a single solution
  • Learn how Adobe’s approach leverages advanced AI/ML to ensure faster marketing insights
  • Get a firsthand look at the pitfalls and breakthroughs encountered, and take away lessons for your own development

Technical Level: Intermediate to Advanced

Track: Developers

Presentation Style: Case/Use Study

Audience: Advertiser, Developer, Digital Analyst, Digital Marketer, Audience Strategist, Data Scientist, Web Marketer, Marketing Analyst, Data Practitioner, Marketing Technologist

This content is copyrighted by Adobe Inc. Any recording and posting of this content is strictly prohibited.


