Question 1

How long should an incrementality test run?

Accepted Answer

Long enough to reach the required sample and to cover at least two full weekly cycles. Run a minimum of 14 days even when the math allows fewer, so weekday and weekend behavior are both represented. Avoid tests longer than about 8 weeks, where seasonality and promotions contaminate the result. Most DTC holdouts land in the two-to-four-week range.
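
The duration rule above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `plan_duration_days`, its parameters, and the daily traffic figures in the examples are assumptions made for the sketch. The 14-day floor and the roughly 8-week ceiling come from the answer itself.

```python
import math

def plan_duration_days(required_per_group, daily_users_per_group,
                       min_days=14, max_days=56):
    """Return (days, too_long) for a planned test.

    Hypothetical helper: min_days and max_days encode the 14-day floor
    and the roughly 8-week ceiling described in the answer.
    """
    if daily_users_per_group <= 0:
        raise ValueError("daily_users_per_group must be positive")
    # Days needed to accumulate the required sample in each group.
    days_for_sample = math.ceil(required_per_group / daily_users_per_group)
    # Never run fewer than min_days, even if the sample fills sooner.
    days = max(days_for_sample, min_days)
    # Flag plans that run past the ceiling, where seasonality creeps in.
    return days, days > max_days

# Traffic figures below are made up for illustration.
print(plan_duration_days(80_683, 5_000))   # (17, False)
print(plan_duration_days(80_683, 10_000))  # (14, False): sample fills in 9 days, floor applies
print(plan_duration_days(80_683, 1_000))   # (81, True): longer than 8 weeks
```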

Question 2

How do you calculate the sample size for an incrementality test?

Accepted Answer

Use a two-proportion sample size calculation at 95% confidence (two-sided) and 80% power. Express the baseline conversion rate as a proportion p1, set p2 to p1 times one plus the relative lift you want to detect, and apply the standard formula with z values of 1.96 (confidence) and 0.8416 (power). For a 2% baseline and a 10% relative lift, p1 is 0.02, p2 is 0.022, and each group needs about 80,683 users. Divide that figure by daily users per group to get the duration in days.
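
As a rough sketch, the calculation can be written in Python as follows. The answer does not say which variant of the two-proportion formula it uses. The pooled-variance form below is an assumption, chosen because it reproduces the quoted figure; the unpooled form gives a slightly different number. The function and constant names are illustrative.

```python
import math

Z_ALPHA = 1.96    # two-sided 95% confidence
Z_BETA = 0.8416   # 80% power

def sample_size_per_group(baseline_rate, relative_lift):
    """Users needed in each group (pooled-variance form, an assumption)."""
    p1 = baseline_rate
    p2 = p1 * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    # Standard deviation term under the null (both groups share p_bar).
    null_sd = math.sqrt(2 * p_bar * (1 - p_bar))
    # Standard deviation term under the alternative (rates differ).
    alt_sd = math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    n = (Z_ALPHA * null_sd + Z_BETA * alt_sd) ** 2 / (p2 - p1) ** 2
    return math.ceil(n)

n = sample_size_per_group(0.02, 0.10)
print(n)  # 80683
```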

Question 3

How big should a holdout group be?

Accepted Answer

Big enough to detect your target lift with confidence, which the sample size formula determines, and matched to the exposed group on the variables that drive conversion. A holdout below a few percent of the audience is usually too small to read reliably. Before comparing results, scale the holdout's conversions up to the exposed group's audience size, or compare conversion rates, so both groups are on the same footing.
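
The scaling step can be illustrated with a short Python sketch. The function name, its parameters, and the numbers in the example are all hypothetical. The sketch only shows the arithmetic of putting an unequal holdout on the same footing as the exposed group, and says nothing about whether the resulting lift is statistically significant.

```python
def incremental_conversions(exposed_users, exposed_conversions,
                            holdout_users, holdout_conversions):
    """Scale the holdout to the exposed audience and compare."""
    if exposed_users <= 0 or holdout_users <= 0:
        raise ValueError("group sizes must be positive")
    if holdout_conversions <= 0:
        raise ValueError("holdout needs at least one conversion to compute lift")
    # How many times larger the exposed group is than the holdout.
    scale = exposed_users / holdout_users
    # Conversions the exposed group would be expected to produce with no ads.
    expected_without_ads = holdout_conversions * scale
    incremental = exposed_conversions - expected_without_ads
    relative_lift = incremental / expected_without_ads
    return expected_without_ads, incremental, relative_lift

# Made-up example: a 10% holdout carved out of 1,000,000 users.
baseline, incremental, lift = incremental_conversions(
    exposed_users=900_000, exposed_conversions=19_800,
    holdout_users=100_000, holdout_conversions=2_000,
)
print(baseline, incremental, round(lift, 4))  # 18000.0 1800.0 0.1
```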

Question 4

What is a minimum detectable effect?

Accepted Answer

The minimum detectable effect, or MDE, is the smallest lift the test is designed to detect with the chosen confidence and power. A 10% MDE means the test can reliably catch a 10% relative improvement. Smaller MDEs require dramatically larger samples: because the required sample grows roughly with the inverse square of the lift, detecting a 5% lift takes roughly four times the users of a 10% lift. Set the MDE to the smallest lift that would change a decision.
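
To see how quickly the sample grows as the MDE shrinks, the following Python sketch compares three lifts at a 2% baseline. It assumes the pooled-variance two-proportion formula with z values of 1.96 and 0.8416. The helper name and the baseline are illustrative choices, not part of any particular tool.

```python
import math

def sample_size_per_group(baseline_rate, relative_lift,
                          z_alpha=1.96, z_beta=0.8416):
    """Per-group sample size (pooled-variance form, an assumption)."""
    p1 = baseline_rate
    p2 = p1 * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    null_sd = math.sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil((z_alpha * null_sd + z_beta * alt_sd) ** 2 / (p2 - p1) ** 2)

# Sample size for each candidate MDE, relative to the 10% case.
reference = sample_size_per_group(0.02, 0.10)
sizes = {}
for mde in (0.20, 0.10, 0.05):
    sizes[mde] = sample_size_per_group(0.02, mde)
    print(f"MDE {mde:.0%}: {sizes[mde]:,} users per group "
          f"({sizes[mde] / reference:.2f}x the 10% case)")
```

Halving the MDE roughly quadruples the requirement, which is why the MDE is worth setting deliberately rather than as small as possible.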

Question 5

What is a good lift in an incrementality test?

Accepted Answer

A good lift is one large enough to clear your break-even and your detectable threshold. Prospecting and upper-funnel campaigns typically show the strongest incremental lift because they reach people who were not already converting. Retargeting often shows little or no lift because it harvests existing intent. Judge the lift against your margin, not against an inflated platform ROAS.

Question 6

What if the test duration is impractical?

Accepted Answer

If a user-level holdout needs months at your traffic, switch to a geo test that splits matched markets into test and control, or raise the minimum detectable lift so the required sample drops. Both trade some precision for a feasible window. Never shorten the test by peeking and stopping early, which inflates false positives.
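
One way to explore the second option is to fix the window and ask what lift it can support. The Python sketch below is a simple search under stated assumptions: the pooled-variance two-proportion formula with z values of 1.96 and 0.8416, a hypothetical helper named `smallest_detectable_lift`, and made-up traffic numbers. It tries relative lifts in 1% steps and returns the first one whose required sample fits the window.

```python
import math

def sample_size_per_group(baseline_rate, relative_lift,
                          z_alpha=1.96, z_beta=0.8416):
    """Per-group sample size (pooled-variance form, an assumption)."""
    p1 = baseline_rate
    p2 = p1 * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    null_sd = math.sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil((z_alpha * null_sd + z_beta * alt_sd) ** 2 / (p2 - p1) ** 2)

def smallest_detectable_lift(baseline_rate, daily_users_per_group, window_days):
    """Smallest relative lift (in 1% steps, up to 100%) that fits the window."""
    available = daily_users_per_group * window_days
    for percent in range(1, 101):
        lift = percent / 100
        if sample_size_per_group(baseline_rate, lift) <= available:
            return lift
    return None  # even a 100% lift would not fit; consider a geo test

# Made-up traffic: 1,000 users per group per day over a 28-day window.
lift = smallest_detectable_lift(0.02, 1_000, 28)
print(lift)
```

If the returned lift is larger than any lift that would change a decision, the window is too short for a user-level test at that traffic, and a geo test is the more realistic route.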
