10 Hypothesis Tests
Deciding between two hypotheses is a core activity in scientific discovery. Statistical hypothesis testing is the formal inferential framework around choosing between hypotheses.
10.1 Hypothesis Testing
Hypothesis testing is concerned with making decisions using data.
The null hypothesis represents the status quo and is usually labeled \(H_0\).
The null hypothesis is assumed true, and statistical evidence is required to reject it in favor of a research or alternative hypothesis.
10.2 Example
A respiratory disturbance index (RDI) of more than 30 events per hour is considered evidence of severe sleep disordered breathing (SDB).
Suppose that in a sample of 100 overweight subjects with other risk factors for sleep disordered breathing at a sleep clinic, the mean RDI was 32 events / hour with a standard deviation of 10 events / hour.
We might want to test the hypothesis that:
- \(H_0:\mu=30\)
- \(H_a:\mu>30\)
- Where \(\mu\) is the population mean RDI.
The alternative hypothesis usually takes one of the forms \(<\), \(>\) or \(\ne\).
Note that there are four possible outcomes of our statistical decision process.
\[H_0 \text{ true},~~ \text{decide } H_0 ~~\therefore~ \text{correctly accept the null}\]
\[H_a \text{ true},~~ \text{decide } H_a ~~\therefore~ \text{correctly reject the null}\]
\[H_0 \text{ true},~~ \text{decide } H_a ~~\therefore~ \text{Type I error}\]
\[H_a \text{ true},~~ \text{decide } H_0 ~~\therefore~ \text{Type II error}\]
10.3 Discussion
- Consider a court of law; the null hypothesis is that the defendant is innocent
- We require a standard on the available evidence to reject the null hypothesis (convict)
- If we set a low standard, then we would increase the percentage of innocent people convicted (type 1 errors); however, we would also increase the percentage of guilty people convicted (correctly rejecting the null)
- If we set a high standard, then we would increase the percentage of innocent people let free (correctly accepting the null) while we would also increase the percentage of guilty people let free (type 2 errors)
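The trade-off can be made concrete with a small sketch. It borrows the numbers from the RDI example (null mean 30, standard error \(10/\sqrt{100}=1\)) and assumes, purely for illustration, that the true mean under the alternative is 32; the cutoff values are also hypothetical.

```r
# Assumed setup: the sample mean is N(mu, 1); mu = 30 under H0.
# The alternative mean of 32 and the cutoffs are illustrative assumptions.
cutoffs <- c(30.5, 31, 31.645, 32, 32.5)

# Type 1 error rate: P(sample mean > cutoff) when H0 is true
type1 <- pnorm(cutoffs, mean = 30, sd = 1, lower.tail = FALSE)

# Type 2 error rate: P(sample mean <= cutoff) when the mean is really 32
type2 <- pnorm(cutoffs, mean = 32, sd = 1)

data.frame(cutoff = cutoffs, type1 = round(type1, 3), type2 = round(type2, 3))
```

Raising the cutoff (a higher standard of evidence) lowers the type 1 error rate and raises the type 2 error rate, mirroring the courtroom analogy.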
10.4 Our Last Example
- A reasonable strategy would reject the null hypothesis if \(\bar{X}\) was larger than some constant, \(C\).
- Typically, \(C\) is chosen so that the probability of a type 1 error, \(\alpha\), is 0.05 (or some other relevant constant)
- The standard error of the mean is \(\frac{10}{\sqrt{100}}=1\)
- Under the null hypothesis \(H_0\), \(\bar{X}\sim N(30,1)\)
- We want to choose \(C\) so that \(P(\bar{X} > C;~H_0)\) is 5%
- The 95th percentile of a normal distribution is 1.645 standard deviations above the mean
- So if we set the value of the constant \(C = 30+ 1(1.645) = 31.645\) we are left with a cut point, so that the probability that a randomly drawn mean from this population is larger than this is 5%
- This rule: “Reject \(H_0\) when \(\bar{X} \ge 31.645\)” has the property that the probability of rejection is 5% when \(H_0\) is true (for the \(\mu_0, \sigma\) and \(n\) given)
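The cut point in the list above can be checked numerically. This is a minimal sketch using the values from the example; the variable names are chosen here for illustration.

```r
mu0   <- 30    # hypothesized mean under H0
sigma <- 10    # standard deviation from the example
n     <- 100   # sample size
alpha <- 0.05  # desired type 1 error rate

se <- sigma / sqrt(n)                       # standard error of the mean
C  <- qnorm(1 - alpha, mean = mu0, sd = se) # 95th percentile of N(30, 1)
C

# Probability that the sample mean exceeds C when H0 is true
pnorm(C, mean = mu0, sd = se, lower.tail = FALSE)
```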
In general, we don’t convert \(C\) back to the original scale. We would just reject because the Z-score, which is how many standard errors the sample mean is above the hypothesized mean, is greater than 1.645:
\[\frac{32-30}{\frac{10}{\sqrt{100}}}=2\]
More generally, we reject whenever \(\frac{\sqrt{n}(\bar{X}-\mu_0)}{s} > Z_{1-\alpha}\).
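The Z-score rule can be wrapped in a small helper. The function name `z_test_greater` and its arguments are hypothetical, introduced only for this sketch of the one-sided test.

```r
# Hypothetical helper: one-sided Z test of H0: mu = mu0 against Ha: mu > mu0
z_test_greater <- function(xbar, mu0, s, n, alpha = 0.05) {
  z        <- sqrt(n) * (xbar - mu0) / s  # standard errors above mu0
  critical <- qnorm(1 - alpha)            # upper alpha quantile of N(0, 1)
  list(z = z, critical = critical, reject = z > critical)
}

res <- z_test_greater(xbar = 32, mu0 = 30, s = 10, n = 100)
res$z       # test statistic
res$reject  # TRUE means reject H0
```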
10.5 T-Tests
10.5.1 Example Reconsidered
- Consider the last example again, but this time suppose that \(n=16\) (rather than 100)
- \(H_0: \mu=30~~~H_a:\mu>30\)
- The following statistic follows a \(T\) distribution with 15 df under \(H_0\).
\[\frac{\bar{X}-30}{\frac{s}{\sqrt{16}}}\]
- Under \(H_0\), the probability that it is larger than the 95th percentile of the \(T\) distribution is 5%.
- The 95th percentile of the \(T\) distribution with 15 df is 1.7531 (obtained via qt(.95, 15)).
- So our test statistic is now \(\frac{\sqrt{16}(32-30)}{10}=0.8\).
- As 0.8 is not greater than qt(.95, 15), we fail to reject the null hypothesis.
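The same calculation can be sketched in R from the summary statistics of the example (the variable names are illustrative). The last line also computes a one-sided p-value, which is an addition for context rather than part of the steps above.

```r
n    <- 16  # sample size
xbar <- 32  # sample mean RDI
mu0  <- 30  # hypothesized mean under H0
s    <- 10  # sample standard deviation

t_stat <- sqrt(n) * (xbar - mu0) / s  # test statistic
crit   <- qt(0.95, df = n - 1)        # 95th percentile of T with 15 df

t_stat > crit  # FALSE here, so we fail to reject H0

# One-sided p-value: P(T > t_stat) under H0
pt(t_stat, df = n - 1, lower.tail = FALSE)
```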
10.5.2 Two Sided Tests
- Suppose that we would reject the null hypothesis if in fact the mean was too large or too small.
- That is, we want to test the alternative \(H_a : \mu \ne 30\).
- We will reject if the test statistic, 0.8, is either too large or too small.
- Then we want the probability of rejecting under the null to be 5%, split equally as 2.5% in the upper tail and 2.5% in the lower tail.
- Thus we reject if our test statistic is larger than qt(.975, 15) or smaller than qt(.025, 15).
- This is the same as saying: reject if the absolute value of our statistic is larger than qt(.975, 15) = 2.1314.
- So in this case we also fail to reject with the two sided test.
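A short sketch of the two sided rule, reusing the statistic 0.8 and 15 degrees of freedom from the example. The two sided p-value on the last line is an addition for context.

```r
t_stat <- 0.8   # test statistic from the n = 16 example
df     <- 15    # degrees of freedom
alpha  <- 0.05  # total type 1 error rate, split across both tails

crit   <- qt(1 - alpha / 2, df)  # upper 2.5% cut point
reject <- abs(t_stat) > crit     # two sided rejection rule
reject

# Two sided p-value: probability of a statistic at least this extreme
p_two <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
p_two
```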
10.5.3 T Test in R
##
## One Sample t-test
##
## data: father.son$sheight - father.son$fheight
## t = 11.789, df = 1077, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## 0.8310296 1.1629160
## sample estimates:
## mean of x
## 0.9969728
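Output of this kind comes from calling t.test on a vector of paired differences. The sketch below uses simulated heights rather than the father.son data, so the vectors `parent` and `child` and all the numbers are hypothetical; it only illustrates the form of the call.

```r
set.seed(42)
n <- 50

# Simulated paired measurements (hypothetical data, in inches)
parent <- rnorm(n, mean = 68, sd = 3)
child  <- parent + rnorm(n, mean = 1, sd = 2.5)

# One sample T test on the differences, testing H0: mean difference = 0
fit_diff <- t.test(child - parent)

# Equivalent call using the paired interface
fit_paired <- t.test(child, parent, paired = TRUE)

fit_diff
```

Both calls give the same statistic, degrees of freedom and confidence interval, since a paired test is a one sample test on the differences.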
10.5.4 Connections with Confidence Intervals
- Consider testing \(H_0:\mu = \mu_0\) versus \(H_a:\mu\ne\mu_0\)
- Take the set of all possible values of \(\mu_0\) for which you fail to reject \(H_0\); this set is a \((1-\alpha)100\%\) confidence interval for \(\mu\)
- The same works in reverse; if a \((1-\alpha)100\%\) confidence interval contains \(\mu_0\), then we fail to reject \(H_0\)
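This duality can be checked numerically. The sketch below uses simulated data (hypothetical, not from the sleep clinic example) and a grid of candidate values for \(\mu_0\).

```r
set.seed(1)
x     <- rnorm(16, mean = 32, sd = 10)  # simulated sample (hypothetical)
alpha <- 0.05

# Two sided confidence interval for the mean
ci <- t.test(x, conf.level = 1 - alpha)$conf.int

# Two sided test of H0: mu = mu0 for each candidate mu0
mu0_grid <- seq(20, 45, by = 0.5)
p_vals   <- sapply(mu0_grid, function(m) t.test(x, mu = m)$p.value)

fail_to_reject <- p_vals >= alpha
in_ci          <- mu0_grid >= ci[1] & mu0_grid <= ci[2]

# The two sets of mu0 values coincide
all(fail_to_reject == in_ci)
```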
10.5.5 Two Group Intervals
- First, you now know how to do two group T tests, since we already covered independent group T intervals
- Rejection rules are exactly the same
- Test \(H_0:\mu_1=\mu_2\)
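As a sketch of what the two group test computes, the block below builds the equal variance (pooled) T statistic by hand and compares it with t.test. The two samples are simulated and hypothetical, and the pooled variance form is one common choice, assumed here for illustration.

```r
set.seed(2)
g1 <- rnorm(12, mean = 10, sd = 3)  # simulated group 1 (hypothetical)
g2 <- rnorm(15, mean = 12, sd = 3)  # simulated group 2 (hypothetical)
n1 <- length(g1)
n2 <- length(g2)

# Pooled variance estimate, assuming equal variances in the two groups
sp2 <- ((n1 - 1) * var(g1) + (n2 - 1) * var(g2)) / (n1 + n2 - 2)

# T statistic for H0: mu_1 = mu_2, with n1 + n2 - 2 degrees of freedom
t_manual <- (mean(g1) - mean(g2)) / sqrt(sp2 * (1 / n1 + 1 / n2))

fit <- t.test(g1, g2, var.equal = TRUE)
c(manual = t_manual, t.test = unname(fit$statistic))
```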
10.5.6 Example
This example will use the ChickWeight data.
# Load packages and the dataset
library(datasets); data(ChickWeight); library(reshape2); library(dplyr)
# Reshape from long to wide: one row per chick, one column per measurement time
wideCW <- dcast(ChickWeight, Diet + Chick ~ Time, value.var = "weight")
names(wideCW)[-(1:2)] <- paste("time", names(wideCW)[-(1:2)], sep = "")
# Weight gain from day 0 to day 21
wideCW <- mutate(wideCW, gain = time21 - time0)
# Now we can perform an equal variance T test comparing diets 1 and 4
# (the groups are independent, so the test is unpaired, which is the default)
wideCW14 <- subset(wideCW, Diet %in% c(1, 4))
t.test(gain ~ Diet, var.equal = TRUE, data = wideCW14)
##
## Two Sample t-test
##
## data: gain by Diet
## t = -2.7252, df = 23, p-value = 0.01207
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -108.14679 -14.81154
## sample estimates:
## mean in group 1 mean in group 4
## 136.1875 197.6667
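For comparison, the same data can be prepared with base R only and tested both ways. This is a sketch under the assumption that dropping chicks with no day 21 measurement matches the analysis above; the names `cw`, `m` and `d14` are illustrative. The Welch test (var.equal = FALSE) does not assume equal variances in the two diets.

```r
library(datasets)

# Plain data frame copy of the built-in ChickWeight data
cw <- as.data.frame(ChickWeight)

# Day 0 and day 21 weights, joined by chick (chicks without day 21 drop out)
w0  <- cw[cw$Time == 0,  c("Chick", "Diet", "weight")]
w21 <- cw[cw$Time == 21, c("Chick", "weight")]
m   <- merge(w0, w21, by = "Chick", suffixes = c("0", "21"))
m$gain <- m$weight21 - m$weight0

# Keep diets 1 and 4 only
d14 <- subset(m, Diet %in% c(1, 4))

pooled <- t.test(gain ~ Diet, data = d14, var.equal = TRUE)   # equal variance
welch  <- t.test(gain ~ Diet, data = d14, var.equal = FALSE)  # unequal variance

pooled
welch
```

The pooled call should reproduce the statistic and degrees of freedom shown above, while the Welch version uses a smaller, non-integer number of degrees of freedom.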