Optimal Experiment Design Demos
Adaptive data collection that picks the next most-informative sample, so product, ML, and research teams can learn with fewer measurements.
Some demos are fully static (precomputed plots). Others are interactive and require a backend service; each demo indicates its status.
Featured demo: Adaptive A/B testing with customer features
What this is
Optimal experiment design provides a way to fit a statistical model with less data: start with some initial data, then iteratively choose the next measurement to reduce uncertainty about the quantity that matters for a decision.
The emphasis is not exploration for its own sake, but learning the specific quantity that drives a decision as quickly as possible.
Example: in deduplication, instead of labeling random record pairs, we label the pairs that most clarify cluster membership.
Typical applications include A/B testing with covariates, record linkage, pricing experiments, adaptive surveys, and ML data labeling.
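The loop is easy to picture in code. Below is a minimal sketch, assuming a toy Beta-Bernoulli model of two variants' success rates; the names and the variance-based acquisition rule are illustrative stand-ins, not the demos' actual API (the Method section describes the decision-aware objective used there).

    # Toy sequential design loop: measure whichever variant's success rate
    # is most uncertain, and stop once the uncertainty is tolerably small.
    # All names here are illustrative, not the demos' actual API.
    import random

    class BetaArm:
        """Conjugate Beta(1, 1) posterior over one variant's success rate."""
        def __init__(self):
            self.a, self.b = 1.0, 1.0

        def update(self, outcome):  # outcome is 0 or 1
            self.a += outcome
            self.b += 1 - outcome

        def var(self):  # posterior variance of the success rate
            n = self.a + self.b
            return self.a * self.b / (n * n * (n + 1))

    arms = {"A": BetaArm(), "B": BetaArm()}
    true_rates = {"A": 0.55, "B": 0.60}  # unknown in a real experiment

    for step in range(500):
        # Acquisition: pick the measurement expected to teach us the most
        # (here, simply the arm with the widest posterior).
        chosen = max(arms, key=lambda name: arms[name].var())
        outcome = 1 if random.random() < true_rates[chosen] else 0  # run it
        arms[chosen].update(outcome)
        if max(arm.var() for arm in arms.values()) < 2e-3:
            break  # remaining uncertainty is below tolerance

    print({name: round(arm.a / (arm.a + arm.b), 3) for name, arm in arms.items()})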
Demos
Precomputed plots and interactive demos.
Learn accurate clusters with far fewer labeled pairs by targeting the comparisons that matter most. Runs against a hosted backend API.
Focus experimentation where the A vs B decision is uncertain, not where data are easiest to collect. Browse a 41-step run showing decision regions, decision uncertainty, and expected information gain at each step.
Estimate a demand curve with fewer price experiments by choosing probes that target revenue uncertainty.
Reduce survey length by selecting questions that maximize information about the target construct.
Method
A high-level view of sequential design.
Sequential design makes data collection active: choose the next measurement to maximize expected information about a target quantity of interest, accounting for sampling uncertainty and uncertainty in the model.
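For orientation, one standard formalization (a textbook definition, not necessarily the demos' exact objective) scores a candidate measurement x by the expected reduction in posterior entropy over the target quantity θ after observing the outcome y:

    \mathrm{EIG}(x) \;=\; \mathbb{E}_{y \sim p(y \mid x)}\left[ H\big(p(\theta)\big) - H\big(p(\theta \mid x, y)\big) \right]

where H denotes entropy; this quantity equals the mutual information between y and θ.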
In practice, this is implemented as an approximation to Bayesian optimal experiment design: candidate measurements are scored with an acquisition objective (e.g., expected information gain about a decision boundary or treatment choice), and the loop repeats until the remaining decision uncertainty falls below a chosen tolerance.
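As a concrete sketch of scoring candidates, the snippet below uses expected posterior-variance reduction, a common tractable surrogate for expected information gain, for a Bernoulli outcome with a Beta posterior. The names and the candidate posteriors are illustrative assumptions, not demo internals.

    # Score a candidate measurement by how much one more observation is
    # expected to shrink the posterior variance (a surrogate for EIG).

    def beta_var(a, b):
        n = a + b
        return a * b / (n * n * (n + 1))

    def expected_gain(a, b):
        """Expected posterior-variance reduction from one more Bernoulli draw."""
        p_success = a / (a + b)  # posterior predictive P(y = 1)
        after = (p_success * beta_var(a + 1, b)
                 + (1 - p_success) * beta_var(a, b + 1))
        return beta_var(a, b) - after

    # Two candidates with the same estimated rate but different sample sizes.
    candidates = {"arm_A": (3.0, 2.0), "arm_B": (30.0, 20.0)}
    best = max(candidates, key=lambda k: expected_gain(*candidates[k]))
    print(best)  # arm_A: same mean, far fewer observations, more to learn

The conjugate update makes the expectation exact here; for richer models, the same expectation is typically estimated by simulating outcomes from the posterior predictive and refitting.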
Unlike standard A/B testing, bandits, or generic active learning, the objective here is explicitly tied to the decision-relevant quantity of interest and is approximately Bayesian: we do not reward uncertainty reduction in parts of the model that do not change the recommended action. We can also fit more complex models, likely ones you already use, and we evaluate the acquisition function in about the time it takes to fit the model itself.
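To make "decision uncertainty" concrete, the sketch below estimates the probability that variant A beats variant B from posterior draws; once that probability is pinned near 0 or 1, extra precision on either rate no longer changes the recommended action. The Beta posteriors are made-up inputs for illustration, not demo output.

    # Estimate P(rate_A > rate_B) by Monte Carlo over the two posteriors.
    import random

    def p_a_beats_b(post_a, post_b, n_draws=10_000):
        wins = sum(
            random.betavariate(*post_a) > random.betavariate(*post_b)
            for _ in range(n_draws)
        )
        return wins / n_draws

    prob = p_a_beats_b(post_a=(60, 40), post_b=(50, 50))
    print(round(prob, 2))  # not yet decisive: keep measuring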
Hosted vs local demos
The A/B demo is fully static (precomputed plots). The record linkage demo runs against a hosted backend API so visitors can upload data directly.
If you need the system to run locally (e.g., for privacy or infrastructure constraints), that is typically handled as part of an engagement. Email joe@josephsmiller.com and I can suggest an appropriate deployment option.
About
Background and contact.
These demos are maintained by Joseph S. Miller. For publications and technical notes, see josephsmiller.com.
Privacy note: when the demos run locally, your files stay on your machine; data uploaded to the hosted record linkage demo is processed by its backend.
Contact
Questions, collaborations, or pilots: email me.
If you prefer a short call, include a few times that work for you.