Monday, May 24, 2010

The following random sample was collected from a population whose values were normally distributed:

10 8 11 11


The 95% confidence interval for the mean is:


a) 8.52 to 10.98


b) 7.75 to 11.75


c) 9.75 to 10.75


d) 8 to 10

I get a different answer. Surely none of these answers can be right: in no case is the interval symmetric about the mean = 10.





Sample size n = 4


Sample mean xbar = 40/4 = 10


Population mean = mu (unknown)


Sample variance s^2 = (0 + 4 + 1 + 1)/(4-1) = 2


(Note: dividing here by n-1 = 4-1 is more accurate than by n because it makes s^2 an exactly unbiased estimator of the population variance sigma^2. This would matter little if n were large.)
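As a quick check of the two divisors, here is a minimal sketch using Python's standard `statistics` module. The variable names are mine, chosen for illustration only.

```python
import statistics

sample = [10, 8, 11, 11]

# Sample variance: sum of squared deviations divided by n - 1.
s2_unbiased = statistics.variance(sample)

# Population-style variance: the same sum divided by n
# (equivalent to "mean of squares - square of mean").
s2_biased = statistics.pvariance(sample)

print("divide by n-1:", s2_unbiased)
print("divide by n:  ", s2_biased)
```

The first value should match the s^2 = 2 worked out above; the second is the smaller figure you get when dividing by n.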





Std error = s/sqrt(n) = sqrt(2)/2 = 0.707


So xbar-mu is normally distributed with mean 0 and std dev 0.707 (as estimated from the sample; its exact but unknown value is sigma/sqrt(n) = sigma/2).


So the 95% confidence interval for mu is





[10 - 1.96*0.707, 10 + 1.96*0.707] = [8.614,11.386].
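The arithmetic for this normal-approximation interval can be reproduced with a short sketch. It assumes only Python's standard library; the variable names are illustrative.

```python
import math
import statistics

sample = [10, 8, 11, 11]
n = len(sample)

xbar = statistics.mean(sample)                        # sample mean
std_error = statistics.stdev(sample) / math.sqrt(n)   # s / sqrt(n), s uses n - 1

# 97.5th percentile of the standard normal distribution (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

lower = xbar - z * std_error
upper = xbar + z * std_error
print(f"normal-based 95% interval: [{lower:.3f}, {upper:.3f}]")
```

This only reproduces the calculation as written here; whether the normal multiplier is appropriate for n = 4 is a separate question.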








Remark 1: With such a small sample, you really have to divide by (n-1) = 3 instead of n = 4 in calculating the sample variance s^2. The formula "mean of squares - square of mean" under-estimates sigma^2 by the factor n/(n-1) = 4/3 here. This is not inconsiderable. The unbiased estimate of sigma^2 from this sample is 2, and NOT 1.5 as in Wal C's solution.


Even the value s^2 = 2 will lead to under-estimating the population std dev sigma itself (because sqrt is a concave function, s is a downward-biased estimator of sigma even though s^2 is an unbiased estimator of sigma^2).
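A rough simulation can make the point about s versus s^2 visible. This is only a sketch under assumptions of my own: a true sigma of 1, samples of size 4 as in the question, and an arbitrary number of trials with a fixed seed.

```python
import math
import random
import statistics

random.seed(0)   # fixed seed so the run is repeatable

SIGMA = 1.0      # assumed true population std dev (hypothetical)
N = 4            # sample size, as in the question
TRIALS = 20000   # arbitrary number of simulated samples

s2_values = []
s_values = []
for _ in range(TRIALS):
    sample = [random.gauss(0.0, SIGMA) for _ in range(N)]
    s2 = statistics.variance(sample)   # divides by n - 1
    s2_values.append(s2)
    s_values.append(math.sqrt(s2))

mean_s2 = statistics.fmean(s2_values)  # should land close to SIGMA**2
mean_s = statistics.fmean(s_values)    # should land noticeably below SIGMA

print(f"average s^2 = {mean_s2:.3f} (sigma^2 = {SIGMA**2})")
print(f"average s   = {mean_s:.3f} (sigma   = {SIGMA})")
```

With these settings the average of s^2 should sit near sigma^2, while the average of s comes out below sigma, which is the concavity effect described above.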





Remark 2: Wal C's solution is even more seriously wrong. She should have used the std error, the std dev of the sample mean, which is sigma/sqrt(n). But she used the std dev. of a SINGLE trial, sigma. If that were correct, there would never be any point in increasing the sample size! The point is of course that the std error sigma/sqrt(n) goes down as n increases -- but you've got to use this fact, not ignore it!
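To see how the std error shrinks as the sample grows, here is a small sketch. It plugs in sqrt(2) (the sample estimate s from above) as a stand-in for sigma, and the list of sample sizes is hypothetical.

```python
import math

sigma = math.sqrt(2)  # assumption: use the sample estimate s in place of sigma

def standard_error(sigma, n):
    """Std dev of the sample mean of n independent observations."""
    return sigma / math.sqrt(n)

sizes = [1, 4, 16, 64]  # hypothetical sample sizes for illustration
errors = [standard_error(sigma, n) for n in sizes]

for n, se in zip(sizes, errors):
    print(f"n = {n:2d}: std error = {se:.3f}")
```

For n = 1 the std error equals sigma itself, which is the quantity used in the other solution; for n = 4 it is half of that.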





Remark 3: My solution above is also faulty. The sample size is so small that you must use the Student t-distribution and not the normal distribution. (xbar-mu)/(s/sqrt(n)) is not normally distributed; it has the Student t-distribution with 4-1=3 degrees of freedom. The 97.5th percentile is 3.18 (and not 1.96 as for the normal distribution), and so the 95% confidence interval for mu is





[10 - 3.18*0.707, 10 + 3.18*0.707] = [7.75, 12.25].
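The t-based interval can be checked with the sketch below. The standard library has no t-quantile function, so the critical value is hard-coded from a t-table (3.182 for 3 degrees of freedom); that constant is my assumption here, and a statistics library such as SciPy could be used to compute it instead.

```python
import math
import statistics

sample = [10, 8, 11, 11]
n = len(sample)

xbar = statistics.mean(sample)
std_error = statistics.stdev(sample) / math.sqrt(n)   # s / sqrt(n)

# Assumed table value: 97.5th percentile of Student's t with n - 1 = 3 df.
T_CRIT_3DF = 3.182

lower = xbar - T_CRIT_3DF * std_error
upper = xbar + T_CRIT_3DF * std_error
print(f"t-based 95% interval: [{lower:.2f}, {upper:.2f}]")
```

Rounded to two decimals this should agree with the interval [7.75, 12.25] given above.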
Reply:

10 8 11 11





Mean = 40/4 = 10


Mean of squares = 406/4 = 101.5





Variance = Mean of squares - square of mean


= 101.5 - 100


= 1.5





So SD = σ = √1.5 ≈ 1.225


2σ = 2.45





95% Interval ≈ μ ± 2σ


i.e. from 10 - 2 × 1.225 to 10 + 2 × 1.225


i.e. 7.55 to 12.45





So none of them is correct





The 68% confidence interval is approximately μ ± σ


i.e. 8.775 to 11.225, so none of them is correct on this figure either.





And if you take into account that this is a small sample, the sample SD = √(4/3) × 1.225 ≈ 1.41, so the 68% interval grows to around 8.59 to 11.41 and the 95% interval to around 7.18 to 12.82.

