7.2.1 One-way ANOVA

One-way ANOVA is used when you have a single categorical variable and a single continuous variable. It is called one-way ANOVA because there is only a single categorical variable. This chapter will also cover two-way ANOVA, which gets its name from the fact that it involves two categorical variables. This chapter does not go into depth about either one-way or two-way ANOVA, as both can be viewed as general linear models with a single continuous dependent variable and one or two categorical independent variables. General linear models will be covered in the following chapter. The assumptions for using a one-way ANOVA are:

  1. The observations in each group come from a simple random sample, and units are independent of one another.
  2. Groups differ only in the factor being studied; everything else is the same.
  3. The data within each group come from a normal distribution.
  4. The variance is equal across the groups. The test is more robust to departures from this assumption when the sample size is the same for all groups.

The overall variation, SST, can be broken into two parts:

  1. The variation among the groups or sums of squares among (SSA), resulting from the differences among the group means.
  2. The variation within the groups or sums of squared error (SSE), resulting from the unattributable randomness within the groups.

SST = SSA + SSE
\[
\sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_{..})^2 = \sum_{j=1}^{k} n_j (\bar{x}_{.j} - \bar{x}_{..})^2 + \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_{.j})^2
\]
where $\bar{x}_{..}$ is the overall average, $\bar{x}_{.j}$ is the average of the $j$th group, and $x_{ij}$ is the $i$th observation of the $j$th group. The total sample size is denoted $n$, which equals the sum of the sample sizes of all groups, $n = \sum_{j=1}^{k} n_j$, where $k$ is the number of groups.
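The decomposition can be checked numerically. The following is a minimal sketch in plain Python; the function name `sums_of_squares` and the three small groups of data are made up for illustration and do not come from the text.

```python
def sums_of_squares(groups):
    """Return (SST, SSA, SSE) for a list of groups of observations."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)       # overall average
    group_means = [sum(g) / len(g) for g in groups]      # average of each group

    # Total variation: every observation against the overall average.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Variation among groups: each group mean against the overall average,
    # weighted by the group's sample size.
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation within groups: every observation against its own group mean.
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    return sst, ssa, sse


# Hypothetical data: k = 3 groups with 3 observations each.
groups = [[4, 5, 6], [7, 8, 9], [1, 2, 3]]
sst, ssa, sse = sums_of_squares(groups)
print(sst, ssa, sse)  # 60.0 54.0 6.0
```

For this made-up data set the among-group part (54) dominates the within-group part (6), and the two parts add up to the total (60).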

The mean sums of squares among the groups and for the error are $MSA = \frac{SSA}{k-1}$ and $MSE = \frac{SSE}{n-k}$, respectively. The null hypothesis is that all population group means are equal. If the statistic $F = \frac{MSA}{MSE}$ is large, we reject the null hypothesis, where "large" is determined by the p-value in the computer output. As in the previous chapters on hypothesis testing, if the p-value is less than α we reject the null hypothesis. The concept is that if the variation among the groups is large relative to the variation within the groups, this is evidence that at least one of the population means differs. Figure 7.1 illustrates this concept. A typical one-way ANOVA table looks like Table 7.2.
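As a rough illustration of how the mean squares and the F statistic follow from the sums of squares, here is a minimal sketch in plain Python. The function name `one_way_anova`, the dictionary keys, and the data are hypothetical. The p-value itself is left to statistical software; if SciPy happens to be available, `scipy.stats.f.sf(F, k - 1, n - k)` returns the upper-tail probability of the F distribution.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table as a dictionary."""
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total sample size
    grand_mean = sum(x for g in groups for x in g) / n
    group_means = [sum(g) / len(g) for g in groups]

    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    msa = ssa / (k - 1)                      # mean square among groups
    mse = sse / (n - k)                      # mean square error
    return {
        "SSA": ssa, "df_among": k - 1, "MSA": msa,
        "SSE": sse, "df_error": n - k, "MSE": mse,
        "SST": ssa + sse, "df_total": n - 1,
        "F": msa / mse,
    }


# Hypothetical data: k = 3 groups with 3 observations each.
table = one_way_anova([[4, 5, 6], [7, 8, 9], [1, 2, 3]])
print(table["MSA"], table["MSE"], table["F"])  # 27.0 1.0 27.0
```

Here the group means (5, 8, and 2) are far apart relative to the spread inside each group, so F is large.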


Figure 7.1: Illustrating the concept behind ANOVA.








Source      Sum of Squares (SS)   Degrees of Freedom   Mean Square (MS)       F-value
Factor A    SSA                   k - 1                MSA = SSA / (k - 1)    F = MSA / MSE
Error       SSE                   n - k                MSE = SSE / (n - k)
Total       SST                   n - 1






Table 7.2: One-way ANOVA table, with a total of n observations.

From ANOVA we can determine whether the population group means differ, but if they do, we cannot determine which ones differ. Various post hoc tests exist to determine which population means differ among the groups. One common post hoc test is the Bonferroni test. It is a multiple comparison test that compares all possible pairs of groups. To determine whether the population means of two groups differ using a Bonferroni test, we again use the p-value. See Figures ?? and ?? for examples of Bonferroni test computer output.
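One common way to carry out a Bonferroni comparison is to multiply each pairwise p-value by the number of comparisons (capped at 1) and compare the result with α. Statistical packages may present the adjustment differently. The sketch below assumes that approach. The group names, the raw p-values, and the function name `bonferroni_adjust` are all hypothetical; in practice the raw p-values would come from pairwise tests in the computer output.

```python
from itertools import combinations


def bonferroni_adjust(raw_p_values, alpha=0.05):
    """Map each pair to (adjusted p-value, reject?) using a Bonferroni adjustment."""
    m = len(raw_p_values)                    # number of pairwise comparisons
    results = {}
    for pair, p in raw_p_values.items():
        adjusted = min(1.0, p * m)           # adjusted p-value, capped at 1
        results[pair] = (adjusted, adjusted < alpha)
    return results


group_names = ["A", "B", "C"]                # hypothetical group labels
pairs = list(combinations(group_names, 2))   # all k(k-1)/2 pairs of groups

# Hypothetical raw p-values, one per pair, for illustration only.
raw = dict(zip(pairs, [0.010, 0.030, 0.200]))

for pair, (adjusted, reject) in bonferroni_adjust(raw).items():
    print(pair, round(adjusted, 3), "differ" if reject else "no evidence of a difference")
```

With three groups there are three pairs, so each raw p-value is tripled. In this made-up example only the A and B comparison stays below α = 0.05 after the adjustment.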