10 Hypothesis Tests

Deciding between two hypotheses is a core activity in scientific discovery. Statistical hypothesis testing is the formal inferential framework around choosing between hypotheses.

10.1 Hypothesis Testing

Hypothesis testing is concerned with making decisions using data.

  • The null hypothesis represents the status quo and is usually labeled \(H_0\).

  • The null hypothesis is assumed true, and statistical evidence is required to reject it in favor of a research or alternative hypothesis.

10.2 Example

  • A respiratory disturbance index (RDI) of more than 30 events per hour is considered evidence of severe sleep disordered breathing (SDB).

  • Suppose that in a sample of 100 overweight subjects with other risk factors for sleep disordered breathing at a sleep clinic, the mean RDI was 32 events / hour with a standard deviation of 10 events / hour.

  • We might want to test the hypothesis that:

    • \(H_0:\mu=30\)
    • \(H_a:\mu>30\)
    • Where \(\mu\) is the population mean RDI.
  • The alternative hypotheses are usually of the form \(<, >, \ne\)

  • Note that there are four possible outcomes of our statistical decision process.

\[H_0 \text{ true, decide } H_0 ~\therefore~ \text{correctly accept null}\]
\[H_a \text{ true, decide } H_a ~\therefore~ \text{correctly reject null}\]
\[H_0 \text{ true, decide } H_a ~\therefore~ \text{Type I error}\]
\[H_a \text{ true, decide } H_0 ~\therefore~ \text{Type II error}\]

10.3 Discussion

  • Consider a court of law, where the null hypothesis is that the defendant is innocent
  • We require a standard on the available evidence to reject the null hypothesis (convict)
  • If we set a low standard, then we would increase the percentage of innocent people convicted (type 1 errors); however, we would also increase the percentage of guilty people convicted (correctly rejecting the null)
  • If we set a high standard, then we would increase the percentage of innocent people let free (correctly accepting the null) while we would also increase the percentage of guilty people let free (type 2 errors)
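
The same trade-off can be sketched numerically with the RDI setup, where the sample mean has standard error \(10/\sqrt{100}=1\) under \(H_0:\mu=30\). The cutoffs below and the alternative mean of 32 are assumptions chosen only for illustration; they are not part of the example itself.

```r
# Setup from the RDI example: null mean 30, standard error 10 / sqrt(100) = 1.
mu0 <- 30
se  <- 10 / sqrt(100)

# Assumption for illustration only: the true mean under the alternative is 32.
mu_a <- 32

# Hypothetical cutoffs: a low, a moderate, and a high standard of evidence.
cutoffs <- c(30.5, 31.5, 33)

# Type 1 error: reject (sample mean above the cutoff) although H0 is true.
type1 <- pnorm(cutoffs, mean = mu0, sd = se, lower.tail = FALSE)

# Type 2 error: fail to reject (sample mean below the cutoff) although Ha is true.
type2 <- pnorm(cutoffs, mean = mu_a, sd = se)

tradeoff <- data.frame(cutoff = cutoffs, type1 = type1, type2 = type2)
print(round(tradeoff, 3))
```

Raising the cutoff lowers the type 1 error rate and raises the type 2 error rate, which mirrors the high standard of evidence in the courtroom analogy.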

10.4 Our Last Example

  • A reasonable strategy would reject the null hypothesis if \(\bar{X}\) was larger than some constant, \(C\).
  • Typically, \(C\) is chosen so that the probability of a type 1 error, \(\alpha\), is 0.05 (or some other relevant constant)
  • Standard error of the mean: \(\frac{10}{\sqrt{100}}=1\)
  • Under the null hypothesis \(H_0\), \(\bar{X}\sim N(30,1)\)
  • We want to choose \(C\) so that \(P(\bar{X} > C;~H_0)\) is 5%
  • The 95th percentile of a normal distribution is 1.645 standard deviations above the mean
  • So if we set the value of the constant \(C = 30+ 1(1.645) = 31.645\) we are left with a cut point, so that the probability that a randomly drawn mean from this population is larger than this is 5%
  • This rule: “Reject \(H_0\) when \(\bar{X} \ge 31.645\)” has the property that the probability of rejection is 5% when \(H_0\) is true (for the \(\mu_0, \sigma\) and \(n\) given)
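
As a quick check of the arithmetic above, the cut point can be computed in R with `qnorm`. This is a minimal sketch; the variable names are chosen here for illustration.

```r
mu0   <- 30     # hypothesized mean under H0
sigma <- 10     # standard deviation from the example
n     <- 100    # sample size
alpha <- 0.05   # desired type 1 error rate

se <- sigma / sqrt(n)               # standard error of the mean, equals 1
C  <- mu0 + qnorm(1 - alpha) * se   # cut point, about 31.645

# Probability that the sample mean exceeds C when H0 is true.
p_reject_under_null <- pnorm(C, mean = mu0, sd = se, lower.tail = FALSE)

print(C)
print(p_reject_under_null)
```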

In general, we don't convert \(C\) back to the original scale. We would simply reject because the Z-score, which is how many standard errors the sample mean is above the hypothesized mean, is

\[\frac{32-30}{\frac{10}{\sqrt{100}}}=2\]

which is greater than 1.645. More generally, we reject whenever \(\frac{\sqrt{n}(\bar{X}-\mu_0)}{s} > Z_{1-\alpha}\).
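
The Z-score version of the rule can be sketched in R as follows, using the numbers from the example. The variable names are illustrative.

```r
xbar  <- 32     # observed sample mean
mu0   <- 30     # hypothesized mean under H0
s     <- 10     # sample standard deviation
n     <- 100    # sample size
alpha <- 0.05

z      <- sqrt(n) * (xbar - mu0) / s   # standard errors above mu0
z_crit <- qnorm(1 - alpha)             # about 1.645
reject <- z > z_crit

print(z)
print(reject)
```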

10.5 T-Tests

10.5.1 Example Reconsidered

  • Consider the last example again, but this time suppose that \(n=16\) (rather than 100)
  • \(H_0: \mu=30~~~H_a:\mu>30\)
  • The following statistic follows a \(T\) distribution with 15 df under \(H_0\).

\[\frac{\bar{X}-30}{\frac{s}{\sqrt{16}}}\]

  • Under \(H_0\), the probability that it is larger than the 95th percentile of the \(T\) distribution is 5%.
  • The 95th percentile of the T distribution with 15 df is 1.7531 (obtained via qt(.95, 15))
  • So our test statistic is now \(\frac{\sqrt{16}(32-30)}{10}=0.8\)
  • As 0.8 is not greater than qt(.95, 15), we fail to reject the null hypothesis.
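
The one sided T test above can be reproduced from the summary statistics alone. The following is a minimal sketch with illustrative variable names; the one sided p-value is an addition for reference and is not reported in the text.

```r
xbar <- 32   # sample mean
mu0  <- 30   # hypothesized mean under H0
s    <- 10   # sample standard deviation
n    <- 16   # sample size

t_stat <- sqrt(n) * (xbar - mu0) / s   # equals 0.8
t_crit <- qt(0.95, df = n - 1)         # 95th percentile of T with 15 df
reject <- t_stat > t_crit              # FALSE, so H0 is not rejected

# One sided p-value: P(T > t_stat) under H0.
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)

print(c(t_stat = t_stat, t_crit = t_crit, p_value = p_value))
```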

10.5.2 Two Sided Tests

  • Suppose that we would reject the null hypothesis if in fact the mean was too large or too small.
  • That is, we want to test the alternative \(H_a : \mu \ne 30\).
  • We will reject if the test statistic, 0.8, is either too large or too small.
  • Then we want the probability of rejecting under the null to be 5%, split equally as 2.5% in the upper tail and 2.5% in the lower tail.
  • Thus we reject if our test statistic is larger than qt(.975, 15) or smaller than qt(.025, 15)
    • This is the same as saying: reject if the absolute value of our statistic is larger than qt(.975, 15)=2.1314.
    • So in this case, we also fail to reject the null hypothesis for the two sided test.
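
A short sketch of the two sided decision in R, using the test statistic of 0.8 with 15 degrees of freedom from the example. The two sided p-value is included for reference; it is not reported in the text.

```r
t_stat <- 0.8
df     <- 15

upper <- qt(0.975, df)   # about 2.1314
lower <- qt(0.025, df)   # equals -upper by symmetry of the T distribution

# Rejecting in either tail is the same as comparing |t| with the upper cutoff.
reject_tails <- (t_stat > upper) || (t_stat < lower)
reject_abs   <- abs(t_stat) > upper

# Two sided p-value: twice the upper tail area beyond |t|.
p_two_sided <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)

print(c(upper = upper, lower = lower, p_two_sided = p_two_sided))
```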

10.5.3 T Test in R

## 
##  One Sample t-test
## 
## data:  father.son$sheight - father.son$fheight
## t = 11.789, df = 1077, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  0.8310296 1.1629160
## sample estimates:
## mean of x 
## 0.9969728
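
Output of this form comes from a call to `t.test` on the vector of differences. The sketch below assumes that `father.son` is the data frame shipped with the UsingR package, which the text does not state; if that package is not installed, it falls back to the built-in `sleep` data purely so the code still runs. It also checks the reported t statistic against the formula \(\sqrt{n}\,\bar{X}/s\).

```r
# Assumption: father.son is taken from the UsingR package (not stated in the text).
if (requireNamespace("UsingR", quietly = TRUE)) {
  data(father.son, package = "UsingR")
  d <- father.son$sheight - father.son$fheight
} else {
  # Fallback for illustration only: paired differences from the built-in sleep data.
  d <- sleep$extra[sleep$group == "2"] - sleep$extra[sleep$group == "1"]
}

# One sample, two sided t-test of H0: mean difference = 0.
res <- t.test(d)
print(res)

# The same statistic computed by hand.
t_manual <- sqrt(length(d)) * mean(d) / sd(d)
print(t_manual)
```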

10.5.4 Connections with Confidence Intervals

  • Consider testing \(H_0:\mu = \mu_0\) versus \(H_a:\mu\ne\mu_0\)
  • Take the set of all possible values of \(\mu_0\) for which you fail to reject \(H_0\); this set is a \((1-\alpha)100\%\) confidence interval for \(\mu\)
  • The same works in reverse; if a \((1-\alpha)100\%\) confidence interval contains \(\mu_0\), then we fail to reject \(H_0\)
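
This duality can be checked numerically. The sketch below reuses the summary statistics from the \(n=16\) example (mean 32, standard deviation 10) and scans a grid of hypothesized means; the grid itself is an arbitrary choice for illustration.

```r
xbar  <- 32
s     <- 10
n     <- 16
alpha <- 0.05

t_crit <- qt(1 - alpha / 2, df = n - 1)

# 95% T confidence interval for mu.
ci <- xbar + c(-1, 1) * t_crit * s / sqrt(n)

# Hypothetical grid of null values mu0 to test.
mu0_grid <- seq(20, 45, by = 0.5)

# Two sided test for each mu0: fail to reject when |t| <= critical value.
t_stats        <- sqrt(n) * (xbar - mu0_grid) / s
fail_to_reject <- abs(t_stats) <= t_crit

# Membership of each mu0 in the confidence interval.
in_ci <- mu0_grid >= ci[1] & mu0_grid <= ci[2]

print(ci)
print(identical(fail_to_reject, in_ci))
```

The interval contains 30, which agrees with failing to reject \(H_0:\mu=30\) in the two sided test.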

10.5.5 Two Group Intervals

  • First, you now know how to do two group T tests, since we already covered independent group T intervals
  • Rejection rules are exactly the same
  • Test \(H_0:\mu_1=\mu_2\)

10.5.6 Example

This example will use the chick weight data.

## 
##  Two Sample t-test
## 
## data:  gain by Diet
## t = -2.7252, df = 23, p-value = 0.01207
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -108.14679  -14.81154
## sample estimates:
## mean in group 1 mean in group 4 
##        136.1875        197.6667
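
The preprocessing behind this output is not shown. The following is one way it could be reproduced with the built-in `ChickWeight` data, assuming that `gain` is the weight at day 21 minus the weight at day 0, that only diets 1 and 4 are compared, and that equal variances are assumed (which is what the "Two Sample t-test" header and df = 23 suggest). The object names are chosen here for illustration.

```r
cw <- as.data.frame(ChickWeight)

# Weight at day 0 and day 21 for each chick; chicks without a day 21
# measurement are dropped by the merge.
w0  <- subset(cw, Time == 0,  select = c(Chick, Diet, weight))
w21 <- subset(cw, Time == 21, select = c(Chick, weight))
wide <- merge(w0, w21, by = "Chick", suffixes = c("0", "21"))

# Assumed definition of gain: weight at day 21 minus weight at day 0.
wide$gain <- wide$weight21 - wide$weight0

# Keep diets 1 and 4 only and drop the unused factor levels.
cw14 <- subset(wide, Diet %in% c(1, 4))
cw14$Diet <- droplevels(cw14$Diet)

# Independent groups t-test with a pooled variance.
res <- t.test(gain ~ Diet, data = cw14, var.equal = TRUE)
print(res)
```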