Sample Size Calculation - Two Independent Means

I am trying to design an email test to measure the demand lift obtained from a marketing promotion (treatment) versus no promotion (control). To do so, I want to calculate the per-group sample size required to detect a significant difference in average demand per customer, separately for different marketing segments.

For each segment, I am applying the following formula:

$$N = \frac{2(Z_{1-\alpha/2}+Z_{\pi})^2\sigma^2}{\Delta^2}$$


$Z_{1-\alpha/2}$ = percentile of the normal distribution used as the critical value in a two-tailed test (1.96)

$Z_{\pi}$ = percentile of the normal distribution where $\pi$ is the power of the test (0.84 for 80th percentile)

$\sigma$ = within-group standard deviation

$\Delta$ = expected mean difference between the treatment versus control population
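
As a rough sketch, the formula and definitions above can be written as a small R function. The function name `n_per_group` and its argument names are hypothetical, chosen only for illustration, and the inputs ($\sigma = 15$, $\Delta = 5$) are made-up values rather than figures from any real segment. `qnorm` supplies the normal percentiles in place of the rounded values 1.96 and 0.84.

```r
# Hypothetical helper: per-group sample size for comparing two independent means,
# using the normal-approximation formula N = 2 (z_{1-alpha/2} + z_pi)^2 sigma^2 / Delta^2.
n_per_group <- function(sigma, delta, alpha = 0.05, power = 0.80) {
  z_alpha <- qnorm(1 - alpha / 2)  # two-tailed critical value (about 1.96 for alpha = 0.05)
  z_power <- qnorm(power)          # percentile for the desired power (about 0.84 for 80%)
  2 * (z_alpha + z_power)^2 * sigma^2 / delta^2
}

# Illustrative inputs only (assumed, not taken from real data): sigma = 15, Delta = 5
n_raw <- n_per_group(sigma = 15, delta = 5)
n_raw           # unrounded value, a little above 141
ceiling(n_raw)  # round up to a whole number of customers per group
```

Note that $N$ depends on $\sigma$ and $\Delta$ only through the ratio $\sigma/\Delta$, so two segments with different means can require similar sample sizes whenever that ratio is similar.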

To calculate the standard deviation and expected mean difference above, I pulled historical response data from last year, covering the same calendar period in which the test will run. My question is this: should the group means and standard deviations be estimated from the total population that was exposed to the treatment (and control), respectively, or from respondents only? Put another way, should I use the mean/variance for the full audience exposed to a given treatment in the past, or the mean/variance for responders only, and then back-solve for the required full audience?

The results that I’m getting appear counter-intuitive, with similar required sample sizes among the most-engaged and least-engaged audiences, so I know I must be doing this wrong.

Most of the material that I’ve come across from the marketing community involves using a desired difference in response rate to solve for appropriate per-group sample sizes. In my case, however, the metric of interest is demand-based rather than raw response (average demand per customer). That said, the response rate is an important metric, as it is particularly low for certain groups of customers, but it does not directly reflect the metric of interest.

Thanks in advance!

Cross Validated Asked by user291972 on November 14, 2021

One Answer

Here is a simulation to show that your approximate formula for sample size $n$ gives a reasonable answer for a particular case, which may be realistic.

Suppose $\sigma^2/\Delta^2 = 9,$ significance level is 5% and desired power is 80%. Then the formula gives $n \approx 141.$ [An exact formula would use a noncentral t distribution, but with $n > 100,$ the approximate formula should be OK.]

n = 2*(1.96+.84)^2*9;  n
[1] 141.12
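
The bracketed remark about the noncentral t distribution can be checked with `power.t.test` from R's built-in stats package, which solves for the per-group sample size using the noncentral t distribution. The following is only a sketch: `delta = 1` and `sd = 3` are one arbitrary choice satisfying $\sigma^2/\Delta^2 = 9$, and the variable names are made up for illustration.

```r
# Exact per-group n from the noncentral t distribution (two-sided, two-sample test).
# delta = 1 and sd = 3 are an assumed choice giving sigma^2/Delta^2 = 9.
exact <- power.t.test(delta = 1, sd = 3, sig.level = 0.05, power = 0.80,
                      type = "two.sample", alternative = "two.sided")
n_exact  <- exact$n                                  # unrounded exact solution
n_approx <- 2 * (qnorm(0.975) + qnorm(0.80))^2 * 9   # normal approximation, unrounded z values
c(exact = n_exact, approx = n_approx)
```

The exact value should exceed the approximation by roughly one observation per group, which is consistent with treating the approximate formula as adequate at this sample size.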

Now suppose I do $m = 100,000$ two-sided pooled two-sample t tests using samples of size $n = 150$ to try to detect a significant difference (5% level) in sample means from populations $\mathsf{Norm}(\mu_1 = 100, 15)$ and $\mathsf{Norm}(\mu_2 = 105, 15),$ so that $\Delta = 5, \sigma = 15$ and $\sigma^2/\Delta^2 = (15/5)^2 = 9.$ [For the population means, only $\Delta=|\mu_1-\mu_2| = 5$ matters.]

Then I should reject at the 5% level a little more than 80% of the time. The simulation shows rejection 82% of the time, so the simulation is in substantial agreement with your formula.

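# p-values from 10^5 pooled two-sample t tests with n = 150 per group
pv = replicate(10^5, t.test(rnorm(150,100,15), rnorm(150,105,15), var.equal=TRUE)$p.value)
# proportion of tests rejecting at the 5% level (estimated power)
mean(pv <= .05)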
[1] 0.82189
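
As a cross-check that does not depend on random draws, the power at $n = 150$ can also be computed analytically with `power.t.test`. This sketch assumes the same setup as the simulation ($\Delta = 5$, $\sigma = 15$, two-sided 5% level); the variable names are arbitrary.

```r
# Analytic power of the two-sided pooled two-sample t test with n = 150 per group
check <- power.t.test(n = 150, delta = 5, sd = 15, sig.level = 0.05,
                      type = "two.sample", alternative = "two.sided")
pw <- check$power
pw  # should be close to the simulated rejection rate of about 0.82
```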

Answered by BruceET on November 14, 2021
