
Adaptive A/B Testing with Customer Features

When treatment effects depend on customer features, the choice of who to test can strongly affect how quickly you learn. Sequential design treats each assignment as a decision: pick the next customer \(x^*\) (and ad assignment) to reduce uncertainty about the A vs B decision boundary as fast as possible—so you reach a reliable targeting policy with fewer samples. In many settings this reduces the classic explore/exploit tension: customers who are confidently A (or B) are low-value information-wise, so the design naturally concentrates experiments near the boundary where the opportunity cost of exploration is small.

What you are seeing

Why this matters

Model details (for technical viewers)

We use a logistic interaction model: \(\eta(x,z) = x^\top\beta + z\,x^\top\gamma\), \(y \sim \mathrm{Bernoulli}(\sigma(\eta(x,z)))\). Here \(x\) denotes the feature vector \(\phi(x_1,x_2)=[1,\,x_1,\,x_2,\,x_1x_2]\). The heterogeneous A/B difference is \(\Delta(x)=\eta(x,1)-\eta(x,0)=x^\top\gamma\).
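
A minimal sketch of this model in NumPy; the function names (phi, eta, delta, p_convert) are illustrative, not taken from the demo code:

```python
import numpy as np

def phi(x1, x2):
    """Feature map phi(x1, x2) = [1, x1, x2, x1*x2]."""
    return np.array([1.0, x1, x2, x1 * x2])

def eta(x, z, beta, gamma):
    """Linear predictor eta(x, z) = x^T beta + z * x^T gamma (z is 0 or 1)."""
    return x @ beta + z * (x @ gamma)

def delta(x, gamma):
    """Heterogeneous A/B difference Delta(x) = eta(x, 1) - eta(x, 0) = x^T gamma."""
    return x @ gamma

def p_convert(x, z, beta, gamma):
    """Conversion probability sigma(eta(x, z)) for the Bernoulli outcome."""
    return 1.0 / (1.0 + np.exp(-eta(x, z, beta, gamma)))
```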

In this demo, the expected information gain view targets the interaction weights \(\gamma\) (i.e., the feature-dependent A/B difference), not the baseline \(\beta\). Any other quantity computable from the model could be targeted instead, depending on the goal (e.g., a policy boundary, a threshold, or a ranking), and each target leads to a different optimal design. The decision uncertainty view focuses on uncertainty in the sign of \(\Delta(x)\) ("is A better than B here?") rather than uncertainty in the absolute conversion rate.
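
A minimal sketch of the decision-uncertainty quantity, assuming posterior draws of \(\gamma\) are available (e.g., from MCMC or a Laplace approximation); the function name and the fragility score are illustrative, not the demo's code:

```python
import numpy as np

def decision_uncertainty(x, gamma_samples):
    """
    Uncertainty in the sign of Delta(x) = x^T gamma, given posterior
    samples of gamma (one row per draw, columns matching phi(x1, x2)).
    Returns P(A better than B at x) and a fragility score in [0, 0.5]:
    0 where the winner is essentially certain, 0.5 where it is a coin flip.
    """
    deltas = gamma_samples @ x            # Delta(x) under each posterior draw
    p_a_better = np.mean(deltas > 0.0)    # estimate of P(Delta(x) > 0 | data)
    fragility = min(p_a_better, 1.0 - p_a_better)
    return p_a_better, fragility
```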

Simulation setup for these precomputed frames: customer features \((x_1, x_2)\in[-3,7]^2\). Each coordinate of the proposal distribution is sampled independently as \(\mathcal{N}(\mu=3,\sigma=1)\) with rejection outside \([-3,7]\); after 10,000 rejected draws the implementation falls back to clamping a fresh normal draw into \([-3,7]\). We start with \(N_0=10\) customers sampled from this proposal. Each step then draws a candidate pool of 200 customers from the same proposal, and the OED policy chooses \(x^*\) from that pool to maximize expected information gain. Ad assignment at \(x^*\) is randomized with \(P(z=1)=0.5\); outcomes are simulated from fixed “true” parameters.
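
A minimal sketch of this sampling and selection loop, assuming NumPy; sample_coordinate, candidate_pool, and next_design are illustrative names, and the expected-information-gain estimator is left as a user-supplied eig_fn rather than reproduced from the demo:

```python
import numpy as np

rng = np.random.default_rng(0)
LOW, HIGH = -3.0, 7.0

def sample_coordinate(mu=3.0, sigma=1.0, max_rejects=10_000):
    """Draw one coordinate from N(mu, sigma) truncated to [LOW, HIGH] by
    rejection; after max_rejects failures, clamp a fresh normal draw."""
    for _ in range(max_rejects):
        v = rng.normal(mu, sigma)
        if LOW <= v <= HIGH:
            return v
    return float(np.clip(rng.normal(mu, sigma), LOW, HIGH))

def candidate_pool(n=200):
    """Candidate customers; each coordinate drawn independently from the proposal."""
    return np.array([[sample_coordinate(), sample_coordinate()] for _ in range(n)])

def next_design(pool, eig_fn):
    """Pick x* maximizing expected information gain over the candidate pool;
    eig_fn(x) is whatever EIG estimator the experimenter supplies."""
    scores = np.array([eig_fn(x) for x in pool])
    x_star = pool[int(np.argmax(scores))]
    z_star = int(rng.random() < 0.5)   # ad assignment randomized, P(z=1) = 0.5
    return x_star, z_star
```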

Browse the sequence

Plots are precomputed for a sequential design run (recommended next experiment at each step). Use the slider to move through the 41-frame run, and toggle between views.

Shortcuts: ←/→ step, 1/2/3 switch view.

Three views of the same state: where to test next, where the decision is fragile, and what we'd do now.

Where to test next: expected information gain · Decision uncertainty: where A vs B is fragile · Best choice now: the predicted winner

Readouts next to the plot track the run state:

Outcomes observed (cumulative): frames show observed outcomes = 10..50 (41 frames).
Next customer (x*): from the precomputed OED policy.
Ad to show at x*: from the precomputed OED policy.

Random sampling (no OED)

If you sample customers from the same fixed proposal distribution but without adaptive design (no OED), the resulting A/B recommendation can remain wrong for much longer. Random sampling often misses the few profiles that would resolve the decision quickly.
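
For contrast, a sketch of the random-sampling baseline, which draws from the same pool but skips the information-gain scoring (names again illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def next_design_random(pool):
    """Baseline: ignore the fitted model and pick a customer uniformly at
    random from the same candidate pool; ad assignment stays randomized."""
    x_next = pool[rng.integers(len(pool))]
    z_next = int(rng.random() < 0.5)
    return x_next, z_next
```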

Snapshots are precomputed at N ∈ {50, 60, 70, 80, 90, 100}.

Decision regions under random sampling

*Compared against random sampling under the same budget and the same evaluation criteria.

Key takeaway

Even in a simplified model, the optimal next test moves around. Intuition is not a reliable sampling strategy.

Contact

If you'd like to apply sequential design to your experiments, pricing tests, surveys, or labeling pipelines, email me.