In hypothesis testing we wish to test a theory, a belief, or simply something of interest. Specifically, we test whether a quantity describing the population, called a parameter, is not equal to, greater than, or less than some value. The parameter is typically, but not always, the population mean, μ, or the population proportion, π. In hypothesis testing the theory is turned into a null hypothesis, denoted H0, and an alternative hypothesis, denoted H1 or HA. A one-sample test compares one group/sample to a specific value, say μ0. A two-sample test compares two groups/samples to each other, such as comparing the average salary of men, say μ1, to the average salary of women, say μ2.
The alternative hypothesis is what the researcher wishes to prove or show to be true, and the null hypothesis is the opposite. Examples: If it is desired to prove the ...
Table ?? lists various null and alternative hypothesis combinations for one- and two-sample tests of population mean(s) and proportion(s) and how to calculate their associated p-values. (Note: the p-value calculations in Table ?? assume the variance is known when investigating the population mean, μ.)
In hypothesis testing a decision is made using what is known as a p-value. The p-value is the probability of observing what was observed, or something more extreme, assuming the null hypothesis is true. If this probability is "very small", the researcher rejects the null hypothesis, because data that would be so unlikely under the null hypothesis are trusted over the null hypothesis itself. Typically, p-values less than 0.1, 0.05, or 0.01 are considered too small to be attributed to random chance, and the null hypothesis is rejected. The threshold below which the null hypothesis is rejected is called the level of significance and is denoted by α. For large data sets a significance level of 0.01 is commonly used. In the classroom setting α = 0.05 is typically used.
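The decision rule just described can be sketched in code. This is only a minimal illustration, not the book's own procedure: it assumes a one-sample test of a mean with a known standard deviation (a z-test), and the function names `z_test_p_value` and `decide`, as well as the numbers in the usage lines, are hypothetical.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def z_test_p_value(sample_mean, mu0, sigma, n, alternative="greater"):
    """p-value of a one-sample z-test about mu0, assuming sigma is known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))  # test statistic
    if alternative == "greater":      # HA: mu > mu0
        return 1.0 - STD_NORMAL.cdf(z)
    if alternative == "less":         # HA: mu < mu0
        return STD_NORMAL.cdf(z)
    if alternative == "two-sided":    # HA: mu != mu0
        return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


def decide(p_value, alpha=0.05):
    """Compare the p-value with the level of significance alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Hypothetical numbers: sample mean 52 from n = 100, mu0 = 50, sigma = 10.
p = z_test_p_value(sample_mean=52, mu0=50, sigma=10, n=100)
print(round(p, 4), decide(p, alpha=0.05), "/", decide(p, alpha=0.01))
```

With these made-up numbers the p-value is about 0.023, so H0 is rejected at α = 0.05 but not at α = 0.01, which shows how the conclusion depends on the chosen level of significance.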
Important:
For hypothesis testing, regardless of the test chosen and the test statistic
used, the steps are generally the same. This book covers only the p-value
approach to hypothesis testing; other books cover the rejection region approach
as well. The rejection region approach is useful when a p-value cannot be
calculated, for example when the researcher does not have access to a computer,
as on exams. When working, in this day and age, the researcher will most likely
have access to a computer, and almost all, if not all, statistical software
calculates a p-value for hypothesis tests. For this reason only the p-value
approach is covered.
Steps Within Hypothesis Testing: P-value Approach
After making a decision there are two possible types of error, Type I and Type II. A Type I error occurs when you reject the null hypothesis and the null hypothesis is actually true. A Type II error occurs when you fail to reject the null hypothesis and the null hypothesis is actually false; the probability of a Type II error is denoted β. The power of a test equals 1 - β, which is the probability of rejecting the null hypothesis when the null hypothesis is false. All possible error/no error results of a hypothesis test are given in Table ??.
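To make β and power concrete, the following sketch computes them for one particular case. It is an illustration under stated assumptions, not a general recipe: the test is an upper-tailed z-test of H0 : μ ≤ μ0 against HA : μ > μ0 with known standard deviation, and the function name `power_upper_tail_z_test` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+

STD_NORMAL = NormalDist()


def power_upper_tail_z_test(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of the z-test of H0: mu <= mu0 vs HA: mu > mu0 when the true mean is mu_true."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)       # H0 is rejected when z > z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))    # true mean measured in standard errors
    beta = STD_NORMAL.cdf(z_crit - shift)          # P(fail to reject H0 | mean is mu_true)
    return 1.0 - beta


# Hypothetical numbers: mu0 = 50, true mean 52, sigma = 10.
for n in (25, 100, 400):
    print(n, round(power_upper_tail_z_test(50, 52, 10, n), 3))
```

For n = 100 the power is about 0.64, so β is about 0.36. Increasing the sample size raises the power, and when the true mean equals μ0 the rejection probability is simply α, the Type I error rate.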
Important:
When a hypothesis test is performed, the result is either to fail to reject the
null hypothesis or to reject the null hypothesis. Do not say "accept" the null
hypothesis. There is a huge difference between not having enough evidence to
disprove something and proving something. Rejecting H0 is like disproving H0,
and failing to reject H0 is like failing to disprove H0, but this is very
different from saying that H0 is accepted or that H0 has been proved. This is a
very important concept, and understanding it will help you avoid much confusion
when performing hypothesis tests and working with data.
Scenario:
You have a theory that on average men in Bangkok weigh more than 65 kilograms, so
H0 : μ ≤ 65 and HA : μ > 65. Data are collected from a simple random sample of size
n = 100, and the average weight in the sample is x̄ = 67.3 kilograms. The statistician
performs a statistical test and the p-value is 0.23, so he fails to reject H0. Is he
saying he believes the average male weight in Bangkok is less than or equal to 65
kilograms? No! Were he to say he accepts H0, this would imply he believes
the average weight is less than or equal to 65 kilograms. What the statistician
is saying is that there is not enough evidence to show your theory beyond a
reasonable doubt, so he cannot reject H0. This subtle difference is very important.
Imagine saying to someone that the sample average is x̄ = 67.3 kilograms, so
you believe the population average is less than or equal to 65 kilograms. That
does not make any logical sense. Given the amount of data, the sample average,
x̄ = 67.3, and the sample standard deviation, we are not confident in saying that
the population average is greater than 65 kilograms. This is what is being
shown by the hypothesis test. With more information it could possibly be shown
that μ > 65. For this particular reason the author tends to prefer looking at
confidence intervals for a deeper understanding of what the data collected are
saying.
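The scenario can be reproduced numerically as a rough sketch. The scenario does not state the standard deviation, so the value of 31 kilograms below is purely an assumption, chosen only so that the p-value lands near the 0.23 quoted above. It is also treated as known, so that a z-based test and interval apply. The variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+

STD_NORMAL = NormalDist()

n = 100              # sample size, from the scenario
sample_mean = 67.3   # sample average in kilograms, from the scenario
mu0 = 65.0           # value in H0: mu <= 65, from the scenario
assumed_sd = 31.0    # ASSUMPTION: not given in the scenario, chosen for illustration

standard_error = assumed_sd / sqrt(n)
z = (sample_mean - mu0) / standard_error
p_value = 1.0 - STD_NORMAL.cdf(z)        # upper tail, because HA: mu > 65

z_star = STD_NORMAL.inv_cdf(0.975)       # multiplier for a two-sided 95% interval
ci_low = sample_mean - z_star * standard_error
ci_high = sample_mean + z_star * standard_error

print(f"p-value = {p_value:.2f}")
print(f"95% confidence interval = ({ci_low:.1f}, {ci_high:.1f})")
```

Under this assumed standard deviation the interval runs from roughly 61.2 to 73.4 kilograms. It contains values both below and above 65, so the data are compatible with μ ≤ 65 and with μ > 65, which is why the result is "fail to reject H0" rather than "accept H0".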