6.1.1 Hypothesis testing

In hypothesis testing we wish to test a theory, a belief, or simply something of interest. The goal is to test whether a quantity concerning the population, called a parameter, is not equal to, greater than, or less than some value. Typically, but not always, the parameter is the population mean, μ, or the population proportion, π. The theory is turned into a null hypothesis, denoted H0, and an alternative hypothesis, denoted H1 or HA. One may want to compare one group/sample to a specific value, say μ0. Often one may instead want to compare two groups/samples to each other, such as comparing the average salary of men, say μ1, to the average salary of women, say μ2.

The alternative hypothesis is what one wishes to prove or show to be true, and the null hypothesis is the opposite. Examples: If it is desired to prove the ...

Table 6.1 lists various null and alternative hypothesis combinations for one- and two-sample tests of population mean(s) and proportion(s) and how to calculate their associated p-values. Note: for the population mean, μ, the p-value calculations in Table 6.1 assume the variance is known.






Investigate                  Null Hypothesis   Alternative Hypothesis   Calculate p-value

μ from one group/sample      μ = μ0            μ ≠ μ0                   2 × P(Z > |z|)
                             μ ≥ μ0            μ < μ0                   P(Z < z)
                             μ ≤ μ0            μ > μ0                   P(Z > z)

π from one group/sample      π = π0            π ≠ π0                   2 × P(Z > |z|)
                             π ≥ π0            π < π0                   P(Z < z)
                             π ≤ π0            π > π0                   P(Z > z)

μ from two groups/samples    μ1 = μ2           μ1 ≠ μ2                  2 × P(Z > |z|)
                             μ1 ≥ μ2           μ1 < μ2                  P(Z < z)
                             μ1 ≤ μ2           μ1 > μ2                  P(Z > z)

π from two groups/samples    π1 = π2           π1 ≠ π2                  2 × P(Z > |z|)
                             π1 ≥ π2           π1 < π2                  P(Z < z)
                             π1 ≤ π2           π1 > π2                  P(Z > z)





Table 6.1: The more common H0 and HA combinations and how to calculate their associated p-values.
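
The three p-value formulas in the last column of the table can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `z_test_p_value`, the labels used for the alternatives, and the value z = 1.96 are assumptions made for the example and do not come from the text. It uses the standard normal distribution from Python's standard library.

```python
from statistics import NormalDist


def z_test_p_value(z: float, alternative: str) -> float:
    """Return the p-value for an observed z statistic.

    alternative is one of:
      "two-sided" : HA says the parameter is not equal to the null value
      "less"      : HA says the parameter is less than the null value
      "greater"   : HA says the parameter is greater than the null value
    """
    std_normal = NormalDist()  # Z ~ N(0, 1)
    if alternative == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))  # 2 x P(Z > |z|)
    if alternative == "less":
        return std_normal.cdf(z)                 # P(Z < z)
    if alternative == "greater":
        return 1 - std_normal.cdf(z)             # P(Z > z)
    raise ValueError(f"unknown alternative: {alternative!r}")


z = 1.96  # hypothetical observed test statistic
for alt in ("two-sided", "less", "greater"):
    print(alt, round(z_test_p_value(z, alt), 4))
```

Note that the same z statistic gives very different p-values depending on the alternative hypothesis, which is why HA must be fixed before the p-value is calculated.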

In hypothesis testing a decision is made using what is known as a p-value. The p-value is the probability of observing what was observed, or something more extreme, assuming the null hypothesis is true. If this probability is "very small", the researcher rejects the null hypothesis, because data this extreme would be unlikely if the null hypothesis were true, and we trust the data over the null hypothesis. Typically, p-values less than 0.1, 0.05, or 0.01 are considered too small to be attributed to random chance, and the null hypothesis is rejected. The threshold below which the null hypothesis is rejected is called the level of significance and is denoted by α. For large data sets a significance level of 0.01 is often used. In the classroom setting, α = 0.05 is typically used.

Important:
If p-value < α then reject H0
If p-value ≥ α then fail to reject H0


The steps of a hypothesis test are generally the same regardless of the test chosen and the test statistic used. This book covers only the p-value approach to hypothesis testing. Other books also cover the rejection region approach, which is useful when a p-value cannot be calculated, for example when the researcher does not have access to a computer, as on exams. In practice, in this day and age, the researcher will most likely have access to a computer, and almost all, if not all, statistical software calculates a p-value for hypothesis tests. For this reason only the p-value approach is covered.

Steps Within Hypothesis Testing: P-value Approach

  1. Determine the null hypothesis, H0, and the alternative hypothesis, HA.
  2. Decide on the appropriate level of significance, α.
  3. Determine the sample size and sampling design to use.
    • The tests in this chapter are appropriate when the data comes from a simple random sample.
    • The tests in this chapter and other statistical tests are not appropriate when the data comes from a convenience or other type of non-probability sample.
  4. Determine the appropriate test statistic given the data and sampling design.
  5. Collect the data and calculate the appropriate test statistic.
  6. Calculate the p-value for the H0 and HA combination.
  7. Make a decision whether to reject H0 or fail to reject H0 by comparing the p-value to α.
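
The steps above can be sketched for a one-sample test of a mean with known variance. Everything numeric here is made up for illustration: the null value `mu0`, the standard deviation `sigma`, and the ten sample values are assumptions, not data from the text. The test statistic used is the standard one-sample z statistic, z = (x̄ - μ0) / (σ / √n).

```python
from math import sqrt
from statistics import NormalDist, mean

# Steps 1-2: H0: mu <= mu0 versus HA: mu > mu0, at significance level alpha.
mu0 = 50.0     # hypothetical null value
alpha = 0.05   # level of significance
sigma = 4.0    # population standard deviation, ASSUMED known

# Steps 3-5: data from a simple random sample (made-up values),
# then the one-sample z statistic.
sample = [52.1, 49.8, 55.3, 51.0, 48.7, 53.9, 50.4, 54.2, 52.8, 51.6]
n = len(sample)
x_bar = mean(sample)
z = (x_bar - mu0) / (sigma / sqrt(n))

# Step 6: for HA: mu > mu0 the p-value is P(Z > z).
p_value = 1 - NormalDist().cdf(z)

# Step 7: compare the p-value to alpha.
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"x_bar = {x_bar:.2f}, z = {z:.3f}, p-value = {p_value:.4f}: {decision}")
```

With these made-up numbers the p-value lands slightly above 0.05, so the decision is to fail to reject H0 even though the sample mean is above μ0.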

After making a decision there are two possible types of error, Type I and Type II. A Type I error occurs when you reject the null hypothesis and the null hypothesis is actually true; its probability is α, the level of significance. A Type II error occurs when you fail to reject the null hypothesis and the null hypothesis is actually false; its probability is denoted β. The power of a test equals 1 - β, which is the probability of rejecting the null hypothesis when the null hypothesis is false. All possible error/no error results of a hypothesis test are given in Table 6.2.





                     H0 is true                H0 is false

Fail to reject H0    P(No error) = 1 - α       P(Type II error) = β

Reject H0            P(Type I error) = α       P(No error) = 1 - β




Table 6.2: No error, type I and type II error
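
To make β and the power 1 - β concrete, the following sketch computes the power of a one-sided z-test (H0: μ ≤ μ0 versus HA: μ > μ0) with known variance. The function name `power_one_sided_z` and all numbers are hypothetical choices for illustration. It relies on the fact that, for this test, rejecting when the p-value is below α is the same as rejecting when z exceeds the upper α quantile of the standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of the z-test of H0: mu <= mu0 vs HA: mu > mu0 (sigma known)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)       # p-value < alpha  <=>  z > z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))  # mean of z when mu = mu_true
    return 1 - std_normal.cdf(z_crit - shift)    # P(reject H0 | mu = mu_true)


# Hypothetical setting: the true mean is 2 units above the null value.
power = power_one_sided_z(mu0=50, mu_true=52, sigma=4, n=25)
beta = 1 - power  # probability of a Type II error
print(f"power = {power:.3f}, beta = {beta:.3f}")

# When the true mean sits exactly at mu0, the rejection probability is alpha.
print(f"{power_one_sided_z(mu0=50, mu_true=50, sigma=4, n=25):.3f}")
```

Increasing n or moving the true mean further from μ0 increases the power, which lowers β.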

Important


When a hypothesis test is performed, the result is either to fail to reject the null hypothesis or to reject the null hypothesis. Do not say "accept" the null hypothesis. There is a huge difference between not having enough evidence to disprove something and proving something. Rejecting H0 is like disproving H0, and failing to reject H0 is like failing to disprove H0, but this is very different from accepting H0 or saying that H0 has been proved. This is a very important concept, and understanding it will help you avoid much confusion when performing hypothesis tests and working with data.

Scenario


You have a theory that on average men in Bangkok weigh more than 65 kilograms, so H0: μ ≤ 65 and HA: μ > 65. Data are collected from a simple random sample of size n = 100, and the average weight in the sample is x̄ = 67.3 kilograms. The statistician performs a statistical test and the p-value is 0.23, so he fails to reject H0. Is he saying he believes the average male weight in Bangkok is less than or equal to 65 kilograms? No! Were he to say he accepts H0, this would imply he believes the average weight is less than or equal to 65 kilograms. What the statistician is saying is that there is not enough evidence to show your theory beyond a reasonable doubt, so he cannot reject H0. This subtle difference is very important. Imagine saying to someone that the sample average is x̄ = 67.3 kilograms, so you believe the population average is less than or equal to 65 kilograms. That does not make any logical sense. Given the amount of data, the sample average, x̄ = 67.3, and the sample standard deviation, we are not confident in saying that the population average is greater than 65 kilograms. This is what is being shown by the hypothesis test. With more information it could possibly be shown that μ > 65. For this particular reason the author tends to prefer looking at confidence intervals for a deeper understanding of what the data collected are saying.
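
The scenario can be explored numerically with a rough sketch. The scenario does not state the standard deviation, so the value `sigma = 31.0` below is purely an assumption, chosen only because it gives a p-value close to the 0.23 quoted above, and it is treated as known for simplicity. The sample size of 900 is likewise hypothetical and is used only to illustrate the remark that more information could show μ > 65.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

mu0, x_bar, n = 65.0, 67.3, 100  # values taken from the scenario
sigma = 31.0                     # ASSUMPTION: not given in the scenario


def upper_tail_p(n_obs: int) -> float:
    """p-value for HA: mu > mu0 if x_bar came from a sample of size n_obs."""
    z = (x_bar - mu0) / (sigma / sqrt(n_obs))
    return 1 - std_normal.cdf(z)


p_100 = upper_tail_p(100)  # the scenario's sample size
p_900 = upper_tail_p(900)  # hypothetical: same sample mean, more data

# 95% confidence interval for mu based on the n = 100 sample.
half_width = std_normal.inv_cdf(0.975) * sigma / sqrt(n)
ci = (x_bar - half_width, x_bar + half_width)

print(f"p-value with n = 100: {p_100:.3f}")
print(f"p-value with n = 900: {p_900:.3f}")
print(f"95% CI with n = 100: ({ci[0]:.1f}, {ci[1]:.1f})")
```

Under these assumptions the interval covers values both below and above 65 kilograms, which matches the interpretation in the scenario: the data do not establish μ > 65, but they do not establish μ ≤ 65 either.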