Unbiased estimator for $\bar{x}$ from a random sample with unequal selection probability

I have the following population:


Where the left column is the age of our individuals and the right column is their weight (in kg).

The exercise says the sample is drawn by random sampling without replacement, and that we are twice as likely to select an individual whose age is below 20.

I have to find 3 things:

  1. An unbiased estimator for $\bar{x}$.
  2. The probability of obtaining the sample $S = \{50, 35, 85\}$.
  3. With $S$ as our sample, estimate the total population weight with a 75% confidence interval.

Any help would be appreciated. I have worked on this for hours and gotten nowhere.

Cross Validated Asked by PLanderos33 on November 14, 2021

One Answer

A small population with twice as many children as adults. Suppose there are 4 each of children of ages 5, 10, and 15, and 3 each of adults of ages 25 and 45. Then the average weight in the population of $18$ is $[4(20+35+50) + 3(90+85)]/18 = 52.5$ kg.

What sample size? Also, suppose we take a random sample of $n = 3$ from this population. (A clue that we should use $n = 3$ is that the problem gives a sample of size three.)

The population of 18 weights, stored in the R vector kg, is as follows:

kg = c(rep(c(20,35,50), each=4), rep(c(90,85), each=3))
mean(kg)   # population mean weight
[1] 52.5

Simulation results. If we take many samples of size 3 from this population, we can get a good approximation of the sampling distribution.

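m = 10^6;  n = 3;  s.3 = wt = numeric(m)
for(i in 1:m) {
  x = sort(sample(kg, n))           # sample without replacement, sorted
  s.3[i] = sum(x == c(35,50,85))    # count of positions matching 35,50,85
  wt[i] = mean(x) }                 # mean weight of this sample
mean(s.3==3)    # prob sample has 35,50,85
[1] 0.058995    # aprx 1/17
1/17
[1] 0.05882353  # exact 1/17
mean(wt)
[1] 52.50809    # aprx 52.5
2*sd(wt)/sqrt(m)   # 95% margin of simulation error
[1] 0.02902334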

Probability of specified sample. With a million samples, one can expect 2 or 3 places of accuracy. One can show by simple combinatorics that the probability of getting one each of the weights $35, 50, 85$ (in some order) is $1/17,$ which is consistent with the simulation.
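
The combinatorial count can be written out as follows, assuming the 18-member population used in this answer (four individuals at each of 20, 35, and 50 kg, and three at each of 90 and 85 kg) and that all $\binom{18}{3}$ samples are equally likely:

$$P(\{35, 50, 85\}) = \frac{\binom{4}{1}\binom{4}{1}\binom{3}{1}}{\binom{18}{3}} = \frac{4 \cdot 4 \cdot 3}{816} = \frac{48}{816} = \frac{1}{17} \approx 0.0588.$$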

Unbiased estimator. Also, the mean weight in the population is $52.5.$ The simulation approximates $E(\bar X_3) = 52.508 \pm 0.029,$ with a 95% margin of simulation error.

If sampling had been with replacement, it is obvious that the mean of the sample of $n=3$ would be an unbiased estimate of the population mean weight $52.5.$ It is not hard to show that the same is true for sampling without replacement, and I will leave that to you.
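
Short of a proof, the claim for sampling without replacement can be checked numerically by listing every possible sample. The sketch below assumes the 18-weight population used in this answer; the name all.means is introduced here only for illustration. It averages the sample mean over all equally likely samples of size 3 and compares the result with the population mean.

```r
# Population assumed in this answer: 18 weights in kg
kg <- c(rep(c(20, 35, 50), each = 4), rep(c(90, 85), each = 3))

# Mean of every possible sample of size 3 drawn without replacement;
# combn() treats the 18 individuals as distinct, so all samples are equally likely
all.means <- combn(kg, 3, FUN = mean)

length(all.means)                      # number of possible samples, choose(18, 3)
mean(all.means)                        # exact expected value of the sample mean
mean(kg)                               # population mean
all.equal(mean(all.means), mean(kg))   # TRUE if the sample mean is unbiased here
```

This is a check for one particular population, not a general argument; the general result follows because each of the three draws is, marginally, a uniform draw from the population.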

Confidence intervals. I don't know what you have studied about confidence intervals. The sample mean of the specified sample of three observations is $\bar X_3 = 56.67;$ it should be the center of a CI for the true mean weight of the population. Using its standard error you should be able to get some style of CI.
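
As one possible style, here is a minimal sketch of a Student t interval built from the standard error. Several things in it are assumptions and do not come from the exercise: that a t interval is acceptable with only three observations, that a finite-population correction is applied, and that the population size is the N = 18 used in this answer. The last line scales the interval for the mean by N to get an interval for the total weight.

```r
s <- c(35, 50, 85)    # the specified sample
n <- length(s)
N <- 18               # population size assumed in this answer
conf <- 0.75          # confidence level asked for in the exercise

xbar <- mean(s)
# Standard error of the mean with finite-population correction (an assumption)
se <- sd(s) / sqrt(n) * sqrt(1 - n / N)
# Two-sided t quantile with n - 1 degrees of freedom
tq <- qt(1 - (1 - conf) / 2, df = n - 1)

ci.mean  <- xbar + c(-1, 1) * tq * se   # interval for the mean weight
ci.total <- N * ci.mean                 # interval for the total weight
ci.mean
ci.total
```

With $n = 3$ the t interval leans heavily on approximate normality of the weights, so treat it as a rough guide.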

Three observations are hardly enough for a good bootstrap CI, but if you know about bootstrapping this part of the problem may be a prompt to do whatever kind of bootstrap you may have studied. A naive percentile 75% nonparametric bootstrap CI can be found as follows (repeatedly re-sampling with replacement from the sample of three). This CI is $(40.0, 73.3),$ which does cover the known population mean.

re.avg = replicate(10^4, mean(sample(c(35,50,85), 3, replace=TRUE)))
quantile(re.avg, c(.125, .875))
   12.5%    87.5% 
40.00000 73.33333 

Answered by BruceET on November 14, 2021
