The central limit theorem (CLT) is one of the most powerful theorems in statistics. It states: let $X_1, X_2, \ldots, X_n$ be a random sample of i.i.d. random variables from any distribution with finite mean $\mu$ and finite variance $\sigma^2$. Then the limiting distribution of
$$\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \tag{5.3}$$
as $n \to \infty$ is the standard normal distribution, $N(0, 1)$,
where $\bar{X}_n$ is the average of the $n$ sampled observations. That is, for a sufficiently large sample size $n$, the sample mean $\bar{X}_n$ of i.i.d. random variables with finite mean and finite variance has an approximately normal distribution, $N(\mu, \sigma^2/n)$, and $(\bar{X}_n - \mu)/(\sigma/\sqrt{n})$ is approximately $N(0, 1)$. In real life $\sigma$ is almost always unknown, but we can calculate a sample variance, $s^2$. If $\bar{X}_n$ is the mean of a random sample $X_1, X_2, \ldots, X_n$ from a normal distribution with mean $\mu$ and finite variance $\sigma^2$, then
$$\frac{\bar{X}_n - \mu}{s/\sqrt{n}}$$
has a t-distribution with d.f. = $n - 1$, where d.f. stands for degrees of freedom. For a sufficiently large sample size $n$ from any distribution with finite mean and variance, the same statistic is approximately t-distributed with $n - 1$ degrees of freedom.
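The difference between standardizing with the true σ and with the sample standard deviation s can be checked by simulation. The sketch below is illustrative only: the function name `tail_fractions`, the seed, the sample sizes, and the 1.96 cutoff are assumptions chosen for this example, not part of the text. It writes Z = (x̄ − μ)/(σ/√n) and T = (x̄ − μ)/(s/√n) and counts how often each exceeds 1.96 in absolute value. Under N(0, 1) that should happen about 5% of the time.

```python
import math
import random
import statistics


def tail_fractions(sampler, mu, sigma, n, reps, cutoff=1.96):
    """Return the fractions of samples with |Z| > cutoff and |T| > cutoff.

    Z standardizes the sample mean with the true sigma,
    T standardizes it with the sample standard deviation s.
    """
    z_hits = 0
    t_hits = 0
    for _ in range(reps):
        xs = [sampler() for _ in range(n)]
        xbar = statistics.fmean(xs)
        s = statistics.stdev(xs)  # sample standard deviation (n - 1 divisor)
        z = (xbar - mu) / (sigma / math.sqrt(n))
        t = (xbar - mu) / (s / math.sqrt(n))
        z_hits += abs(z) > cutoff
        t_hits += abs(t) > cutoff
    return z_hits / reps, t_hits / reps


rng = random.Random(0)  # fixed seed so the run is repeatable

# Normal data with a small sample: Z is exactly N(0, 1), T follows t with 4 d.f.
z_norm, t_norm = tail_fractions(lambda: rng.gauss(0.0, 1.0), 0.0, 1.0, n=5, reps=20000)

# Skewed (exponential, mean 1, sd 1) data with a larger sample: the CLT regime.
z_exp, t_exp = tail_fractions(lambda: rng.expovariate(1.0), 1.0, 1.0, n=50, reps=5000)

print(f"normal,      n = 5 : P(|Z| > 1.96) ~ {z_norm:.3f}, P(|T| > 1.96) ~ {t_norm:.3f}")
print(f"exponential, n = 50: P(|Z| > 1.96) ~ {z_exp:.3f}, P(|T| > 1.96) ~ {t_exp:.3f}")
```

For the small normal sample, T lands in the tails noticeably more often than Z. This is the heavier-tailed behaviour of the t-distribution with few degrees of freedom. For the exponential sample, Z is already close to the nominal 5% even though the underlying data are far from normal.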
Typically a minimum of n > 30 is desired before assuming a t-distribution when the data are known to come from a non-normal distribution. The t-distribution converges to the normal distribution as $n \to \infty$, i.e.
$$t_{n-1} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty.$$
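The convergence of the t-distribution to the standard normal can be made concrete by comparing the two density functions as the degrees of freedom grow. The following is a minimal sketch using only the Python standard library. The helper names `t_pdf` and `max_gap`, the grid, and the chosen degrees of freedom are assumptions made for illustration. The density used is the standard Student's t density, evaluated through `math.lgamma` to avoid overflow for large d.f.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def max_gap(df, grid_step=0.01, limit=6.0):
    """Largest absolute gap between the t(df) and N(0, 1) densities on a grid."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    steps = int(round(2 * limit / grid_step))
    xs = [-limit + i * grid_step for i in range(steps + 1)]
    return max(abs(t_pdf(x, df) - std_normal.pdf(x)) for x in xs)


# d.f. = n - 1, so df = 29 corresponds to a sample of size n = 30.
gaps = {df: max_gap(df) for df in (1, 4, 29, 100, 1000)}
for df, gap in gaps.items():
    print(f"d.f. = {df:>4}: max |t pdf - normal pdf| = {gap:.5f}")
```

The printed gap shrinks steadily as the degrees of freedom increase, which is the numerical counterpart of the limit statement.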