Adaptive A/B Testing with Customer Features

When treatment effects depend on customer features, who you test matters. This demo walks through a single sequential design run: at each step, the design chooses the next customer \(x^*\) and ad assignment so as to learn about the A/B decision boundary. Customers for whom A or B is clearly better usually teach the model little, so the design tends to spend tests near the boundary. Not all boundary points are equally useful, however: the design focuses on points where the model is uncertain and the A/B decision is likely to change.

Model details (for technical viewers)

We use a logistic interaction model: \(\eta(x,z) = x^\top\beta + z\,x^\top\gamma\), \(y \sim \mathrm{Bernoulli}(\sigma(\eta(x,z)))\). Here \(x\) denotes the feature vector \(\phi(x_1,x_2)=[1,\,x_1,\,x_2,\,x_1x_2]\). The heterogeneous A/B difference is \(\Delta(x)=\eta(x,1)-\eta(x,0)=x^\top\gamma\).
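The model above can be written out in a few lines of NumPy. This is a minimal sketch, not the demo's actual code: the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def phi(x1, x2):
    """Feature map [1, x1, x2, x1*x2] from the model definition."""
    return np.array([1.0, x1, x2, x1 * x2])

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def eta(x, z, beta, gamma):
    """Linear predictor eta(x, z) = x'beta + z * x'gamma, with z in {0, 1}."""
    return x @ beta + z * (x @ gamma)

def delta(x, gamma):
    """Heterogeneous A/B difference on the logit scale: x'gamma."""
    return x @ gamma

# Hypothetical parameter values, for illustration only.
beta = np.array([-0.5, 0.2, 0.1, 0.0])
gamma = np.array([0.3, -0.4, 0.2, 0.05])

x = phi(1.0, 2.0)
p0 = sigmoid(eta(x, 0, beta, gamma))  # conversion probability with z = 0
p1 = sigmoid(eta(x, 1, beta, gamma))  # conversion probability with z = 1

rng = np.random.default_rng(0)
y = rng.binomial(1, p1)               # one simulated Bernoulli outcome under z = 1
```

Because \(\beta\) cancels in \(\eta(x,1)-\eta(x,0)\), the sign of `delta(x, gamma)` alone determines which assignment has the higher conversion probability at \(x\).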

In this demo, the expected information gain view focuses on learning the interaction weights \(\gamma\) (i.e., the feature-dependent A/B difference), not the baseline \(\beta\). Any other random quantity computable from the model could be targeted depending on the goal (e.g., a policy boundary, a threshold, or a ranking), each leading to different computed optimal designs. The decision uncertainty view focuses on uncertainty in the sign of \(\Delta(x)\) ("is A better than B here?") rather than uncertainty in the absolute conversion rate.
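One way to make "information gain about \(\gamma\)" concrete is the mutual information between the next outcome \(y\) and \(\gamma\), estimated from posterior samples. The sketch below is an illustration under a strong simplifying assumption that is not stated in the demo: \(\beta\) and \(\gamma\) are treated as independent under the current posterior, so \(\beta\) can be averaged out sample by sample. The function names and the stand-in "posterior" draws are hypothetical.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) outcome, clipped for stability."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def eig_gamma(x, z, beta_samples, gamma_samples):
    """Monte Carlo estimate of I(y; gamma | x, z).

    Assumption (not from the demo): beta and gamma are independent under
    the current posterior, so p(y=1 | gamma_i) averages over all beta draws.
    """
    a = beta_samples @ x                      # x'beta_j, shape (n_beta,)
    d = gamma_samples @ x                     # x'gamma_i, shape (n_gamma,)
    p = sigmoid(a[None, :] + z * d[:, None])  # p[i, j] for gamma_i, beta_j
    p_given_gamma = p.mean(axis=1)            # beta marginalized out
    p_marginal = p_given_gamma.mean()         # gamma marginalized out as well
    return binary_entropy(p_marginal) - binary_entropy(p_given_gamma).mean()

# Stand-in posterior draws (hypothetical; a real run would use the fitted posterior).
rng = np.random.default_rng(0)
beta_samples = rng.normal(size=(500, 4))
gamma_samples = rng.normal(size=(500, 4))
x = np.array([1.0, 1.0, 2.0, 2.0])            # features [1, x1, x2, x1*x2] at (1, 2)

eig_z1 = eig_gamma(x, 1, beta_samples, gamma_samples)
eig_z0 = eig_gamma(x, 0, beta_samples, gamma_samples)
```

Under the independence assumption, an outcome observed with \(z=0\) carries no information about \(\gamma\), so `eig_z0` is zero up to rounding. With a correlated joint posterior this would no longer hold exactly, and the estimator would need conditional draws of \(\beta\) given \(\gamma\).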

Simulation setup for these precomputed frames: customer features \((x_1, x_2)\in[-3,7]^2\). We start with \(N_0=10\) customers, then evaluate a fresh candidate pool of 200 customers drawn from the same distribution at each step. The OED policy chooses \(x^*\) from this candidate pool to maximize expected information gain. Ad assignment at \(x^*\) is randomized with \(P(z=1)=0.5\); outcomes are simulated from fixed “true” parameters.
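The data-collection step described above can be sketched as follows. Several things here are assumptions rather than details from the demo: customers are drawn uniformly on \([-3,7]^2\), the "true" parameter values are made up, and `score_fn` is a placeholder for whatever expected-information-gain estimate the design uses. The stand-in score simply prefers candidates near a guessed boundary; it is not an EIG computation, and no posterior update is shown.

```python
import numpy as np

def phi(X):
    """Row-wise feature map [1, x1, x2, x1*x2] for an (n, 2) array."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 * x2])

def design_step(score_fn, beta_true, gamma_true, rng, pool_size=200):
    """One step: draw a fresh pool, pick x*, randomize z, simulate y."""
    # Assumption: customer features are uniform on [-3, 7]^2.
    pool = rng.uniform(-3.0, 7.0, size=(pool_size, 2))
    x_star = pool[np.argmax(score_fn(pool))]  # highest-scoring candidate
    z = rng.binomial(1, 0.5)                  # randomized ad, P(z=1) = 0.5
    f = phi(x_star[None, :])[0]
    logit = f @ beta_true + z * (f @ gamma_true)
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
    return x_star, z, y

# Hypothetical "true" parameters and a hypothetical current guess for gamma.
beta_true = np.array([-0.5, 0.10, 0.05, 0.00])
gamma_true = np.array([0.3, -0.20, 0.10, 0.01])
gamma_guess = np.array([0.2, -0.15, 0.12, 0.00])

def score_fn(pool):
    """Placeholder score (NOT expected information gain): closeness to the guessed boundary."""
    return -np.abs(phi(pool) @ gamma_guess)

rng = np.random.default_rng(1)
history = [design_step(score_fn, beta_true, gamma_true, rng) for _ in range(40)]
```

Forty steps on top of the \(N_0=10\) initial customers give 50 observed outcomes in total. In a full run, the posterior and therefore `score_fn` would be refreshed after every new outcome.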

Browse the sequence

The plots are precomputed from one sequential design run. Use the slider to move through the 41 frames, and switch views to understand why points were chosen.

Shortcuts: ←/→ step, 1/2/3 switch view.

Three views of the same state: where to experiment next, where the model says the A/B choice is uncertain, and what the model would choose now if you had to stop training.

Where to test next: expected information gain · Decision uncertainty: where A vs B is least settled · Best choice now: the predicted winner
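The second and third views can both be derived from posterior samples of \(\gamma\). The sketch below is one plausible way to do this, not necessarily how the demo computes its plots: it estimates \(P(\Delta(x)>0)\) at each point, turns that into an uncertainty score that peaks at 0.5, and picks the winner by posterior majority. The function name, the uncertainty score, and the stand-in samples are assumptions.

```python
import numpy as np

def decision_views(X_feat, gamma_samples):
    """Return (p_pos, uncertainty, winner) for each row of features.

    X_feat: (n_points, 4) rows of [1, x1, x2, x1*x2].
    gamma_samples: (n_samples, 4) posterior draws of gamma.
    """
    deltas = X_feat @ gamma_samples.T              # Delta(x) per point and draw
    p_pos = (deltas > 0).mean(axis=1)              # estimate of P(Delta(x) > 0)
    uncertainty = 1.0 - np.abs(2.0 * p_pos - 1.0)  # 1 at p = 0.5, 0 at p in {0, 1}
    winner = (p_pos > 0.5).astype(int)             # 1: z = 1 arm, 0: z = 0 arm
    return p_pos, uncertainty, winner

# Evaluate on a coarse grid over the feature square, with stand-in posterior draws.
g = np.linspace(-3.0, 7.0, 21)
x1, x2 = [a.ravel() for a in np.meshgrid(g, g)]
X_feat = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])

rng = np.random.default_rng(2)
gamma_samples = rng.normal(loc=[0.3, -0.2, 0.1, 0.0], scale=0.1, size=(1000, 4))
p_pos, uncertainty, winner = decision_views(X_feat, gamma_samples)
```

Points with `uncertainty` near 1 are those where the sign of \(\Delta(x)\) is least settled, which is not the same as where the conversion rate itself is most uncertain.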

Outcomes observed (cumulative)

Frames show observed outcomes = 10..50 (41 frames).

Next customer (x*)

From the precomputed OED policy.

Ad to show at x*

From the precomputed OED policy.

Plot frame

Random sampling (no OED)

If you sample customers from the same distribution without adaptive design (no OED), the resulting A/B recommendation can stay wrong for longer. In this run, random sampling often misses the profiles that would clarify the boundary.

Snapshots are precomputed at N ∈ {50, 60, 70, 80, 90, 100}.

Decision regions under random sampling

*Compared against random sampling under the same budget and the same evaluation criteria.

Key takeaway

Even in this simple model, the best next test point moves around and depends strongly on the current state of the model. Note also that toward the bottom boundary more B ads were chosen, whereas near the left boundary A ads tended to be optimal. I would not want to rely on intuition alone here.

Contact

If you're thinking about sequential design for experiments, pricing tests, surveys, or labeling pipelines, email me.