# Sample Size Calculation - Two Independent Means

I am designing an email test to measure the demand lift from a marketing promotion (treatment) versus no promotion (control). I want to calculate the per-group sample size required to detect a statistically significant difference in average demand per customer within each marketing segment.

I am applying the following formula for each segment:

$$N = \frac{2(Z_{1-\alpha/2}+Z_{\pi})^2\sigma^2}{\Delta^2}$$

Where:

$$Z_{1-\alpha/2}$$ = percentile of the normal distribution used as the critical value in a two-tailed test (1.96)

$$Z_{\pi}$$ = percentile of the normal distribution where $$\pi$$ is the power of the test (0.84 for 80th percentile)

$$\sigma$$ = within-group standard deviation

$$\Delta$$ = expected mean difference between the treatment versus control population
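
As a minimal sketch, the formula above can be wrapped in a small R function. The function name `n_per_group` and its argument names are illustrative and do not come from any package; `qnorm` supplies the normal percentiles in place of the rounded values 1.96 and 0.84, and the example inputs are hypothetical.

```r
# Per-group sample size for comparing two independent means
# (normal approximation). Function and argument names are illustrative only.
n_per_group <- function(sigma, delta, alpha = 0.05, power = 0.80) {
  z_alpha <- qnorm(1 - alpha / 2)  # two-sided critical value, about 1.96
  z_power <- qnorm(power)          # about 0.84 for 80% power
  2 * (z_alpha + z_power)^2 * sigma^2 / delta^2
}

# Hypothetical inputs: within-group sd of 15, expected difference of 5
ceiling(n_per_group(sigma = 15, delta = 5))
```

Because the result scales with the ratio of variance to squared difference, segments with a similar ratio will produce similar sample sizes regardless of how engaged they are.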

To calculate the standard deviation and expected mean difference above, I pulled historical response data from the same calendar period last year in which the test will run. My question is this: should the group means and standard deviations be estimated from the total population that was exposed to the treatment (and control), respectively, or should they be calculated from respondents only? Put another way, should I use the mean and variance for the full audience exposed to a given treatment in the past, or the mean and variance for responders only, and then back-solve for the required full audience?
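
The two bases can be related with a short sketch, under the assumption that every non-responder contributes exactly zero demand. With response rate `p`, responder mean `m`, and responder standard deviation `s` (all hypothetical inputs, as is the function name), the full-audience mean is `p * m` and, by the law of total variance, the full-audience variance is `p * s^2 + p * (1 - p) * m^2`.

```r
# Convert responder-only moments to full-audience moments,
# ASSUMING every non-responder has demand of exactly 0.
# Function name and inputs are hypothetical.
full_audience_moments <- function(p, m, s) {
  mean_full <- p * m                          # E[X] over the full audience
  var_full  <- p * s^2 + p * (1 - p) * m^2    # law of total variance
  list(mean = mean_full, sd = sqrt(var_full))
}

# Hypothetical segment: 2% response rate, responders average 80 with sd 60
full_audience_moments(p = 0.02, m = 80, s = 60)
```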

The results that I’m getting appear counter-intuitive, with similar required sample sizes among the most-engaged and least-engaged audiences, so I know I must be doing this wrong.

Most of the material that I’ve come across from the marketing community involves using a desired difference in response rate to solve for appropriate per-group sample sizes. In my case, however, the metric of interest is demand-based rather than raw response (average demand per customer). That said, the response rate is an important metric, as it is particularly low for certain groups of customers, but it does not directly reflect the metric of interest.

Cross Validated Asked by user291972 on November 14, 2021

Here is a simulation to show that your approximate formula for sample size $$n$$ gives a reasonable answer for a particular case, which may be realistic.

Suppose $$\sigma^2/\Delta^2 = 9,$$ significance level is 5% and desired power is 80%. Then the formula gives $$n \approx 141.$$ [An exact formula would use a noncentral t distribution, but with $$n > 100,$$ the approximate formula should be OK.]

    n = 2*(1.96+.84)^2*9;  n
    [1] 141.12
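
For comparison, base R's `power.t.test` (from the `stats` package) solves for the sample size using the noncentral t distribution. A quick check, where `sd = 15` and `delta = 5` are assumed values chosen only so that the ratio equals 9:

```r
# Per-group n from the noncentral t distribution.
# sd = 15 and delta = 5 are assumptions chosen so sigma^2/Delta^2 = 9.
res <- power.t.test(delta = 5, sd = 15, sig.level = 0.05, power = 0.80,
                    type = "two.sample", alternative = "two.sided")
res$n           # per-group sample size, not rounded
ceiling(res$n)  # round up to a whole number of units
```

The result should land close to the value from the approximate formula, slightly above it.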


Now suppose I do $$m = 100{,}000$$ two-sided pooled two-sample t tests using samples of size $$n = 150$$ to try to detect a significant difference (5% level) in sample means from populations $$\mathsf{Norm}(\mu_1 = 100, 15)$$ and $$\mathsf{Norm}(\mu_2 = 105, 15),$$ so that $$\Delta = 5, \sigma = 15$$ and $$\sigma^2/\Delta^2 = (15/5)^2 = 9.$$ [For the population means, only $$\Delta=|\mu_1-\mu_2| = 5$$ matters.]

Then I should reject at the 5% level a little more than 80% of the time. The simulation shows rejection 82% of the time, so the simulation is in substantial agreement with your formula.

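    set.seed(2020)
    # p-values from 100,000 pooled two-sample t tests, n = 150 per group
    pv = replicate(10^5, t.test(rnorm(150, 100, 15),
                                rnorm(150, 105, 15), var.equal = TRUE)$p.value)
    mean(pv <= .05)   # proportion of rejections at the 5% level
    [1] 0.82189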
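
The simulated rejection rate can also be compared with the theoretical power for this design. A brief sketch using `power.t.test` with the same $$n = 150,$$ $$\Delta = 5$$ and $$\sigma = 15$$:

```r
# Theoretical power of the pooled two-sample t test with n = 150 per group
pw <- power.t.test(n = 150, delta = 5, sd = 15, sig.level = 0.05,
                   type = "two.sample", alternative = "two.sided")$power
pw
```

The value should be close to the simulated proportion of rejections, up to simulation error.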


Answered by BruceET on November 14, 2021
