6 Central Limit Theorem
The CLT states that the distribution of averages of iid variables (properly normalized) becomes that of a standard normal as the sample size increases.
\[\frac{\bar{X_n}-\mu}{\sigma/\sqrt{n}} = \frac{\sqrt{n}(\bar{X_n}-\mu)}{\sigma} = \frac{\text{Estimate} - \text{Mean of estimate}}{\text{Std. Err. of estimate}}\]
A useful way to think about the CLT is that \(\bar{X_n}\) is approximately \(N(\mu, \sigma^2/n)\) when \(n\) is large.
6.1 Examples
Let's simulate an approximately standard normal random variable by rolling \(n\) six-sided dice.
- Let \(X_i\) be the outcome for die \(i\).
- Then note that \(\mu = E[X_i] = 3.5\)
- \(Var(X_i) = 35/12 \approx 2.92\)
- Standard error is \(\sqrt{\frac{2.92}{n}} = \frac{1.71}{\sqrt{n}}\)
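As a quick check of the mean and variance above, assuming a fair die (each face has probability \(1/6\)):
\[E[X_i] = \frac{1+2+\cdots+6}{6} = 3.5, \qquad E[X_i^2] = \frac{1+4+9+16+25+36}{6} = \frac{91}{6}, \qquad Var(X_i) = \frac{91}{6} - 3.5^2 = \frac{35}{12} \approx 2.92\]
Taking the square root gives \(\sqrt{35/12} \approx 1.71\), the numerator of the standard error.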
Let's roll \(n\) dice, take their mean, subtract off 3.5, and divide by \(\frac{1.71}{\sqrt{n}}\).
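The dice experiment can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function names (`standardized_dice_mean`, `simulate`), the number of repetitions, the seed, and the choice of \(n\) values are all assumptions made for the example. If the CLT holds, the standardized means should have mean close to 0 and standard deviation close to 1.

```python
import random
import statistics

MU = 3.5                   # E[X_i] for a fair six-sided die
SIGMA = (35 / 12) ** 0.5   # SD of one die, about 1.71

def standardized_dice_mean(n, rng):
    """Roll n fair dice, then subtract 3.5 and divide by 1.71 / sqrt(n)."""
    rolls = [rng.randint(1, 6) for _ in range(n)]
    return (statistics.fmean(rolls) - MU) / (SIGMA / n ** 0.5)

def simulate(n, reps=10_000, seed=0):
    """Repeat the experiment reps times (reps and seed are arbitrary choices)."""
    rng = random.Random(seed)
    return [standardized_dice_mean(n, rng) for _ in range(reps)]

for n in (10, 20, 30):
    z = simulate(n)
    # Mean should be near 0 and SD near 1 for every n
    print(n, round(statistics.fmean(z), 3), round(statistics.pstdev(z), 3))
```

A histogram of the values returned by `simulate(n)` should look increasingly bell-shaped as \(n\) grows.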
6.1.1 Coin CLT
Let \(X_i\) be the 0 or 1 result of the \(i^{th}\) flip of a possibly unfair coin.
- The sample proportion, say \(\hat{p}\), is the average of the coin flips
- \(E[X_i] = p\) and \(Var(X_i) = p(1-p)\)
- Standard error of the mean is \(\sqrt{ \frac{p(1-p)}{n}}\)
If \(n\) is large enough, the following quantity should be approximately standard normal: \[\frac{\hat{p}-p}{\sqrt{ \frac{p(1-p)}{n}}}\]
The speed at which this quantity converges to normality depends on how biased the coin is (that is, on the skewness of the original distribution).
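To see the effect of bias, one can compare the skewness of the standardized proportion for a fair coin and a heavily biased one at the same \(n\). The sketch below is illustrative only: the helper names (`standardized_proportion`, `sample_skewness`, `skew_for`), the values \(p = 0.5\) and \(p = 0.9\), and \(n = 20\) are assumptions chosen for the example. A normal distribution has skewness 0, so a value far from 0 suggests the approximation is still poor at that \(n\).

```python
import random
import statistics

def standardized_proportion(n, p, rng):
    """Flip a coin with P(heads) = p n times and standardize the sample proportion."""
    flips = [1 if rng.random() < p else 0 for _ in range(n)]
    p_hat = sum(flips) / n
    return (p_hat - p) / (p * (1 - p) / n) ** 0.5

def sample_skewness(values):
    """Third standardized moment of the simulated values."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def skew_for(p, n=20, reps=5_000, seed=0):
    """Skewness of the standardized proportion over reps simulated experiments."""
    rng = random.Random(seed)
    z = [standardized_proportion(n, p, rng) for _ in range(reps)]
    return sample_skewness(z)

for p in (0.5, 0.9):
    print(p, round(skew_for(p), 3))
```

With these settings the fair coin should give a skewness near 0, while the biased coin should remain visibly left-skewed, which is consistent with the slower convergence described above.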