= Attribution (marketing) =

In marketing, attribution, also known as multi-touch attribution (MTA), is the identification of a set of user actions ("events" or "touchpoints") that contribute to a desired outcome, followed by the assignment of a value to each of these events. Marketing attribution helps explain which combination of events, in which order, influences individuals to engage in a desired behavior, typically referred to as a conversion.

== History ==

The roots of marketing attribution can be traced to the psychological theory of attribution. By most accounts, the current application of attribution theory in marketing was spurred by the transition of advertising spending from traditional, offline ads to digital media and the expansion of data available through digital channels such as paid and organic search, display, and email marketing.

== Concept ==

The purpose of marketing attribution is to quantify the influence each advertising impression has on a consumer's decision to make a purchase, or convert. Visibility into what influences the audience, when and to what extent, allows marketers to optimize media spend for conversions and compare the value of different marketing channels, including paid and organic search, email, affiliate marketing, display ads, social media and more. Understanding the entire conversion path across the whole marketing mix reduces the inaccuracy that comes from analyzing each siloed channel in isolation. Typically, marketers use attribution data to plan future ad campaigns and to evaluate the performance of previous campaigns by analyzing which media placements (ads) were the most cost-effective and influential, as determined by metrics such as return on ad spend (ROAS) or cost per lead (CPL).

== Attribution versus causal measurement ==

While attribution models assign credit to marketing touchpoints, they do not establish causality. Attribution relies on observational data and correlational patterns to distribute value across the customer journey, but cannot determine whether a given touchpoint actually caused a conversion or merely correlated with it. This distinction is critical for understanding the true effectiveness of marketing investments.

Incrementality testing addresses this limitation by measuring counterfactual lift through controlled experiments. By comparing outcomes between a treatment group exposed to marketing and a control group that is not, incrementality tests isolate the causal effect of marketing interventions. Methods such as randomized controlled trials, geo-experiments, and holdout testing provide unbiased estimates of marketing impact by establishing proper counterfactuals.
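As a rough illustration of the treatment-versus-control comparison described above, the following Python sketch computes absolute and relative lift from a two-arm holdout test. The function name `incremental_lift` and the sample counts are hypothetical and are not taken from any particular study or tool; a real test would also report uncertainty (for example a confidence interval), which is omitted here.

```python
def incremental_lift(treat_conversions, treat_size, control_conversions, control_size):
    """Return (absolute lift, relative lift) of the treatment arm over the control arm."""
    treat_rate = treat_conversions / treat_size
    control_rate = control_conversions / control_size
    absolute = treat_rate - control_rate
    # Relative lift is undefined when the control arm has no conversions.
    relative = absolute / control_rate if control_rate > 0 else float("inf")
    return absolute, relative


# Hypothetical holdout test with 10,000 randomly assigned users per arm.
absolute, relative = incremental_lift(300, 10_000, 250, 10_000)
print(f"absolute lift: {absolute:.4f}, relative lift: {relative:.1%}")
```

Because users are assigned to the two arms at random, the difference in conversion rates can be read as the effect of the marketing exposure rather than as a correlation.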

The difference between attribution and causal measurement has important implications for marketing optimization. While attribution can be useful for understanding the customer journey, it may misallocate credit and lead to suboptimal budget decisions if treated as a measure of causal impact.

== Attribution models ==

Because of the rapid growth of online advertising over the last decade, marketing organizations have access to significantly more data to track effectiveness and return on investment (ROI). This change has affected how marketers measure the effectiveness of advertisements and has led to new metrics such as cost per click (CPC), cost per thousand impressions (CPM), cost per action/acquisition (CPA) and click-through conversion. Additionally, multiple attribution models have evolved over time as the proliferation of digital devices and the growth in available data have pushed the development of attribution technology.

- Single Source Attribution (also Single Touch Attribution) models assign all the credit to one event, such as the last click, the first click or the last channel to show an ad (post view). Simple or last-click attribution is widely considered less accurate than alternative forms of attribution because it fails to account for all the contributing factors that led to a desired outcome.

- Fractional Attribution includes equal weights, time decay, customer credit, and multi-touch/curve models. Equal weight models give the same amount of credit to every event, time decay models give more credit to events closer to the conversion, customer credit models use past experience and sometimes simply guesswork to allocate credit, and multi-touch/curve models assign set amounts of credit to each touchpoint in the buyer journey.

- Algorithmic or Probabilistic Attribution, also known as Data Driven Attribution, uses statistical modeling and machine learning techniques to derive the probability of conversion across all marketing touchpoints, which can then be used to weight the value of each touchpoint preceding the conversion. Google's DoubleClick and Analytics 360, for example, use algorithms that analyze all of the different paths in an account (both converting and non-converting) across all channels to determine which touchpoints help the most with conversions. With a probability assigned to each touchpoint, the touchpoint weights can be aggregated by a dimension of that touchpoint (channel, placement, creative, etc.) to determine a total weight for that dimension.

- Customer Driven Attribution models are developed through the collection of zero-party data and then use projective analytics to provide a full-view attribution picture. This method of attribution was developed in the belief that responses from customers should be the most heavily weighted data point in an attribution calculation. It is an attempt to simplify attribution and go back to basics.
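
The single-source and fractional rules in the list above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `attribute`, the `(channel, days_before_conversion)` path format, and the half-life form of time decay are all assumptions made for the example.

```python
from collections import defaultdict


def attribute(path, model="last_click", half_life=7.0):
    """Split one conversion's credit across a path of (channel, days_before_conversion).

    The path is ordered from the earliest touchpoint to the latest.
    """
    if not path:
        return {}
    n = len(path)
    if model == "first_click":
        weights = [1.0] + [0.0] * (n - 1)
    elif model == "last_click":
        weights = [0.0] * (n - 1) + [1.0]
    elif model == "linear":  # equal weights
        weights = [1.0] * n
    elif model == "time_decay":
        # Assumed form: credit halves for every `half_life` days before the conversion.
        weights = [0.5 ** (days / half_life) for _, days in path]
    else:
        raise ValueError(f"unknown model: {model}")

    total = sum(weights)
    credit = defaultdict(float)
    for (channel, _), weight in zip(path, weights):
        credit[channel] += weight / total  # normalize so the credits sum to 1
    return dict(credit)


# Hypothetical journey: display ad 14 days out, email 7 days out, search on the day.
path = [("display", 14), ("email", 7), ("paid_search", 0)]
for model in ("first_click", "last_click", "linear", "time_decay"):
    print(model, attribute(path, model))
```

Under the assumed seven-day half-life, the time decay rule gives this journey credits of 1/7, 2/7 and 4/7, while the single-source rules give all credit to one channel.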

=== Limitations and divergence from experimental results ===

Research has consistently shown that multi-touch attribution (MTA) and rules-based attribution models frequently diverge from lift measurements obtained through controlled experiments. Attribution models, being based on observational correlations rather than experimental manipulation, often overestimate the causal impact of marketing touchpoints that are merely associated with high-converting users.

Studies comparing MTA outputs to results from randomized experiments have found substantial discrepancies, with attribution models systematically misallocating credit across channels. This occurs because attribution cannot account for selection bias, where certain touchpoints appear effective simply because they are shown to users already likely to convert. The Platform Incrementality Evaluation (PIE) framework and similar research demonstrate that attribution-based optimization can lead to inefficient budget allocation compared to decisions guided by experimental lift measurements.
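
The selection-bias mechanism can be made concrete with a small worked example. The Python sketch below uses an entirely made-up population of two user segments, in which the ad is shown mostly to high-intent users; none of the numbers come from the PIE research or any other study.

```python
# Hypothetical population: the ad is shown mostly to users who already convert at a high rate.
segments = [
    # (users, share exposed, conversion rate if exposed, conversion rate if not exposed)
    (1_000, 0.8, 0.22, 0.20),  # high-intent users
    (9_000, 0.1, 0.02, 0.01),  # low-intent users
]

exposed = sum(n * s for n, s, _, _ in segments)
unexposed = sum(n * (1 - s) for n, s, _, _ in segments)
exposed_conv = sum(n * s * r1 for n, s, r1, _ in segments)
unexposed_conv = sum(n * (1 - s) * r0 for n, s, _, r0 in segments)

# Observational comparison: conversion rate of exposed users minus unexposed users.
naive_effect = exposed_conv / exposed - unexposed_conv / unexposed

# Effect among exposed users: within-segment lift, weighted by who was actually exposed.
true_effect = sum(n * s * (r1 - r0) for n, s, r1, r0 in segments) / exposed

print(f"naive: {naive_effect:.4f}, within-segment: {true_effect:.4f}")
```

In this made-up population the observational gap is roughly 0.10, while the within-segment effect is about 0.015, so a model that credits the ad with the observational gap would overstate its impact several times over.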

These findings underscore the importance of validating attribution models against experimental results and using incrementality testing as the primary method for measuring true marketing effectiveness.

=== Constructing an algorithmic attribution model ===

Binary classification methods from statistics and machine learning can be used to build appropriate models. However, an important element of the models is model interpretability; therefore, logistic regression is often appropriate due to the ease of interpreting model coefficients.
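
To illustrate why logistic regression coefficients are easy to read, the sketch below fits a two-feature model with plain batch gradient descent and reports the coefficients as odds ratios. The data, the feature names (`saw_display`, `saw_email`) and the hand-written fitting routine are assumptions made for the example; in practice a statistics or machine learning library would normally be used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit [intercept, coef_1, ...] by batch gradient descent on the mean log-loss."""
    w = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
            err = p - y  # gradient of the log-loss with respect to the linear score
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


# Hypothetical data: (saw_display, saw_email) -> (users, conversions)
cells = {(0, 0): (100, 5), (1, 0): (100, 8), (0, 1): (100, 15), (1, 1): (100, 22)}
rows, labels = [], []
for x, (users, conversions) in cells.items():
    rows += [x] * users
    labels += [1] * conversions + [0] * (users - conversions)

intercept, b_display, b_email = fit_logistic(rows, labels)
# exp(coefficient) is the multiplicative change in conversion odds for that exposure.
print(f"odds ratio display: {math.exp(b_display):.2f}, email: {math.exp(b_email):.2f}")
```

Each exponentiated coefficient can be read directly as "exposure multiplies the odds of conversion by this factor, holding the other exposure fixed", which is the kind of interpretability the paragraph above refers to.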

==== Behavioral model ====

Suppose observed advertising data are $\{(X_i, A_i, Y_i)\}^n_{i=1}$ where:
- $X \in \mathbb{R}$: covariates
- $A \in \{0,1\}$: whether the consumer saw the ad or not
- $Y \in \{0,1\}$: conversion, the binary response to the ad

===== Consumer choice model =====

$u(x, a) = \mathbb{E}(Y|X=x, A=a)$ for all covariates $x \in \mathbb{R}$ and all ads $a$

$u = \sum_{k}A\beta^k\psi(x) + \epsilon$

Covariates, $X$, generally include different characteristics about the ad served (creative, size, campaign, marketing tactic, etc.) and descriptive data about the consumer who saw the ad (geographic location, device type, OS type, etc.).

===== Utility theory =====

$y_i^* = \underset{y_i}{\max}\bigl(\mathbb{E}[u_i]\bigr)$

$\Pr(y = 1|x) = \Pr(u_1 > u_0)$

$= 1/[1+e^{-\sum_kA\beta^k\psi(x)}]$

==== Counterfactual procedure ====

An important feature of the modeling approach is estimating the potential outcome of consumers supposing that they were not exposed to an ad. Because marketing is not a controlled experiment, it is helpful to derive potential outcomes in order to understand the true effect of marketing.

The mean outcome if all consumers saw the same advertisement is given by:

$\mu_a = \mathbb{E}Y^*(a)$

$= \mathbb{E}\{\mathbb{E}(Y|X,A=a)\}$
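
The outer expectation above is taken over the observed covariates, so in practice $\mu_a$ can be approximated by setting $A = a$ for every consumer, predicting the outcome, and averaging. The Python sketch below shows this under stated assumptions: the logistic model `conversion_prob` and its coefficients are hypothetical stand-ins for a fitted model, and the covariate values are invented.

```python
import math


def conversion_prob(x, a, intercept=-3.0, beta_x=0.8, beta_a=0.5):
    """Hypothetical fitted model u(x, a) = E(Y | X = x, A = a); coefficients are made up."""
    return 1.0 / (1.0 + math.exp(-(intercept + beta_x * x + beta_a * a)))


def mean_potential_outcome(covariates, a, model=conversion_prob):
    """Approximate mu_a by averaging model(x, a) over the observed covariates."""
    return sum(model(x, a) for x in covariates) / len(covariates)


# Invented covariate values for five consumers.
covariates = [0.0, 0.5, 1.0, 1.5, 2.0]
mu_1 = mean_potential_outcome(covariates, a=1)  # everyone sees the ad
mu_0 = mean_potential_outcome(covariates, a=0)  # nobody sees the ad
print(f"mu_1 = {mu_1:.4f}, mu_0 = {mu_0:.4f}")
```

This estimate is only as good as the outcome model and the assumption that the covariates capture why some consumers were exposed and others were not.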

A marketer is often interested in understanding the 'base', or the likelihood that a consumer will convert without being influenced by marketing. This allows the marketer to estimate the incremental effectiveness of the marketing plan, or incrementality. The total number of conversions minus the 'base' conversions gives an estimate of the number of conversions driven by marketing. The 'base' can be approximated using the derived logistic function and potential outcomes, expressed as the share of predicted conversions that would occur without marketing.

$\text{Base} = \frac{\text{Predicted Conversions Without Observed Marketing}}{\text{Predicted Conversions With Observed Marketing}} = \frac{\mathbb{E}\{\mathbb{E}(Y|X,A = 0)\}}{\mathbb{E}\{\mathbb{E}(Y|X,A = 1)\}}$

Once the base is derived, the incremental effect of marketing can be understood to be the lift over the 'base' for each ad supposing the others were not seen in the potential outcome. This lift over the base is often used as the weight for that characteristic inside the attribution model.

$\text{Attribution Weight} = \frac{\mathbb{E}\{\mathbb{E}(Y|X,A = 1)\} - \mathbb{E}\{\mathbb{E}(Y|X,A = 0)\}}{\mathbb{E}\{\mathbb{E}(Y|X,A = 1)\}}$
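
As a worked example of the two ratios above, the following Python sketch plugs in made-up mean potential outcomes. The function name `base_and_weight` and all of the numbers are hypothetical.

```python
def base_and_weight(mu_without, mu_with):
    """Return (base, attribution weight) from the two mean potential outcomes.

    mu_without: predicted conversion rate with no marketing exposure, E{E(Y|X, A=0)}
    mu_with:    predicted conversion rate with marketing exposure,    E{E(Y|X, A=1)}
    """
    base = mu_without / mu_with
    weight = (mu_with - mu_without) / mu_with
    return base, weight


# Made-up rates: 3% of consumers convert without the ad, 5% with it.
base, weight = base_and_weight(0.03, 0.05)
total_conversions = 1_000
print(f"base share: {base:.0%}, attribution weight: {weight:.0%}")
print(f"conversions credited to marketing: {weight * total_conversions:.0f}")
```

By construction the base and the attribution weight sum to one, so the weight is simply the share of predicted conversions not explained by the base.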

With the weights constructed, the marketer can estimate the proportion of conversions driven by different marketing channels or tactics.

== Marketing mix and attribution models ==

Depending on its marketing mix, a company may use different types of attribution to track its marketing channels:
- Interactive Attribution refers to the measurement of digital channels only, while cross-channel attribution refers to the measurement of both online and offline channels.
- Account-based attribution refers to measuring and attributing credit to companies as a whole rather than individual people and is often used in B2B marketing.
