Ed230B/C

Linear Statistical Models

Multiple Comparisons


Organization of Multiple Comparisons

The Problem with Multiple Comparisons

If n independent contrasts are each tested at α, then the probability of making at least one type I error is 1 - (1 - α)^n.

The table below gives the probability of making at least one type I error for four different numbers of comparisons, each tested at α = .05:

 n    probability
 3      .1426
 5      .2262
10      .4013
15      .5367
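
The table can be reproduced with a few lines of Python. This is only a sketch: the function name familywise_error is made up for illustration, and α = .05 is assumed because that value reproduces the tabled probabilities.

```python
def familywise_error(alpha: float, n: int) -> float:
    """Probability of at least one type I error across n independent tests."""
    return 1 - (1 - alpha) ** n

# Assumed alpha = .05 for every contrast.
for n in (3, 5, 10, 15):
    print(f"{n:2d}  {familywise_error(0.05, n):.4f}")
```

The formula holds only for independent tests; with correlated contrasts it is better read as an approximation.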

Conceptual Error Rates

  • Error rate contrastwise
    The probability that a contrast will be falsely declared significant.

  • Error rate experimentwise
    The probability that at least one contrast will be falsely declared significant in an experiment.

  • Error rate familywise
    The probability that at least one contrast will be falsely declared significant in a family of contrasts.

    Changing the critical value of the statistical test is what controls the conceptual error rate.

    Beware

    You are generally safe sticking with the following post-hoc comparison techniques: Dunnett, Fisher-Hayter, Tukey HSD, Tukey-Kramer, Scheffé, or Bonferroni. They are known to strongly protect the familywise error rate. However, post-hoc techniques such as Fisher's least significant difference (LSD), Student-Newman-Keuls, and Duncan's multiple range test fail to strongly protect the familywise error rate. Such procedures are said to protect the familywise error rate only in a weak sense; avoid them if possible.

    Contrasts

  • A contrast or comparison among means is a difference among the means.
  • Consider the following four group means: M1, M2, M3, & M4
  • A contrast can then be thought of as a set of weights, c, that sum to zero and are multiplied by the group means.
  • The Greek letter psi (ψ) is used to indicate contrasts.
  • Some examples:

    Group 1 vs Group 2: ψ1 = (1)M1 + (-1)M2 + (0)M3 + (0)M4
    c1 = 1 -1 0 0

    Group 1 vs Group 3: ψ2 = (1)M1 + (0)M2 + (-1)M3 + (0)M4
    c2 = 1 0 -1 0

    Group 3 vs Group 4: ψ3 = (0)M1 + (0)M2 + (1)M3 + (-1)M4
    c3 = 0 0 1 -1

    Groups 1 & 2 vs Groups 3 & 4: ψ4 = (1)M1 + (1)M2 + (-1)M3 + (-1)M4
    c4 = 1 1 -1 -1

    Group 1 vs Group 4: ψ5 = (1)M1 + (0)M2 + (0)M3 + (-1)M4
    c5 = 1 0 0 -1

    Orthogonal Contrasts

  • Contrasts are orthogonal when the dot product of their weights, c, equals zero.
  • Some examples:

    ψ1 & ψ2 = (1)(1) + (-1)(0) + (0)(-1) + (0)(0) = 1 [not orthogonal]

    ψ1 & ψ3 = (1)(0) + (-1)(0) + (0)(1) + (0)(-1) = 0 [orthogonal]

    ψ1 & ψ4 = (1)(1) + (-1)(1) + (0)(-1) + (0)(-1) = 0 [orthogonal]

    ψ2 & ψ4 = (1)(1) + (0)(1) + (-1)(-1) + (0)(-1) = 2 [not orthogonal]

    ψ3 & ψ4 = (0)(1) + (0)(1) + (1)(-1) + (-1)(-1) = 0 [orthogonal]
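
The dot products above can be checked mechanically. The sketch below assumes equal group sizes, as in the running example; with unequal group sizes each product of weights would also have to be divided by its group's n. The names weights, dot, and is_orthogonal are illustrative only.

```python
from itertools import combinations

# Contrast weights c1..c5 as defined in the Contrasts section.
weights = {
    "psi1": (1, -1, 0, 0),
    "psi2": (1, 0, -1, 0),
    "psi3": (0, 0, 1, -1),
    "psi4": (1, 1, -1, -1),
    "psi5": (1, 0, 0, -1),
}

def dot(c_a, c_b):
    """Dot product of two sets of contrast weights."""
    return sum(a * b for a, b in zip(c_a, c_b))

def is_orthogonal(c_a, c_b):
    """True when the two contrasts are orthogonal (equal n assumed)."""
    return dot(c_a, c_b) == 0

for (name_a, c_a), (name_b, c_b) in combinations(weights.items(), 2):
    label = "orthogonal" if is_orthogonal(c_a, c_b) else "not orthogonal"
    print(f"{name_a} & {name_b}: {dot(c_a, c_b):2d} [{label}]")
```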

    Planned Orthogonal Comparisons

  • Requirements
  • There are at most p-1 orthogonal comparisons possible in any set of comparisons, where p is the number of groups

    t Tests for Orthogonal Comparisons

    An Example

  • Using contrasts ψ1, ψ3 & ψ4
  • The sets of weights are:
    c1 = 1 -1 0 0
    c3 = 0 0 1 -1
    c4 = 1 1 -1 -1
  • MSerr = 2.179 and n = 8 for each group
  • The three t-tests are:
    ψ1: t = -0.68 --> t² = F = 0.46
    ψ3: t = -2.71 --> t² = F = 7.34
    ψ4: t = -3.83 --> t² = F = 14.69
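
A minimal sketch of how these values arise, assuming the usual equal-n contrast test t = ψ / sqrt(MSerr × Σc² / n). The formula is not spelled out in these notes, but it reproduces the numbers above. The group means (3.00, 3.50, 4.25, 6.25) are the ones reported in the Stata output for this data set; contrast_t is an illustrative name.

```python
import math

means = [3.00, 3.50, 4.25, 6.25]  # group means from the Stata output
ms_err = 2.179                    # MSerr from the example
n = 8                             # observations per group

def contrast_t(weights, means, ms_err, n):
    """t statistic for a contrast among means with equal group sizes n."""
    psi = sum(c * m for c, m in zip(weights, means))
    se = math.sqrt(ms_err * sum(c * c for c in weights) / n)
    return psi / se

contrasts = {
    "psi1": (1, -1, 0, 0),
    "psi3": (0, 0, 1, -1),
    "psi4": (1, 1, -1, -1),
}
for name, c in contrasts.items():
    t = contrast_t(c, means, ms_err, n)
    print(f"{name}: t = {t:.2f}, F = t^2 = {t * t:.2f}")
```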

    Using Stata

    This section makes use of the anovacontrast.ado file, which can be obtained from UCLA ATS via the Internet.

    use http://www.philender.com/courses/data/cr4new, clear
    
    table a, cont(freq mean y sd y)
    
    ----------------------------------------------
            a |      Freq.     mean(y)       sd(y)
    ----------+-----------------------------------
            1 |          8           3    1.511858
            2 |          8         3.5    .9258201
            3 |          8        4.25    1.035098
            4 |          8        6.25     2.12132
    ----------------------------------------------
    
    anova y a
    
    
                               Number of obs =      32     R-squared     =  0.4455
                               Root MSE      =   1.476     Adj R-squared =  0.3860
    
                      Source |  Partial SS    df       MS           F     Prob > F
                  -----------+----------------------------------------------------
                       Model |       49.00     3  16.3333333       7.50     0.0008
                             |
                           a |       49.00     3  16.3333333       7.50     0.0008
                             |
                    Residual |       61.00    28  2.17857143   
                  -----------+----------------------------------------------------
                       Total |      110.00    31   3.5483871  
    
    
    anovacontrast a, values(1 -1  0  0) title(1vs2)
    
    1vs2
    Contrast variable a (1 -1 0 0)                 Dep Var  =        y
    source           SS          df      MS        Contrast =    -0.50
    ---------+---------------------------------    N of obs =       32
    contrast |          1         1      1.0000    F        =     0.46
    error    |         61        28      2.1786    Prob > F =   0.5036
    ---------+---------------------------------    t        =     0.68
    
    anovacontrast a, values(0  0  1 -1) title(3vs4)
    
    3vs4
    Contrast variable a (0 0 1 -1)                 Dep Var  =        y
    source           SS          df      MS        Contrast =    -2.00
    ---------+---------------------------------    N of obs =       32
    contrast |         16         1     16.0000    F        =     7.34
    error    |         61        28      2.1786    Prob > F =   0.0114
    ---------+---------------------------------    t        =     2.71
    
    anovacontrast a, values(1  1 -1 -1) title(12vs34)
    
    12vs34
    Contrast variable a (1 1 -1 -1)                Dep Var  =        y
    source           SS          df      MS        Contrast =    -4.00
    ---------+---------------------------------    N of obs =       32
    contrast |         32         1     32.0000    F        =    14.69
    error    |         61        28      2.1786    Prob > F =   0.0007
    ---------+---------------------------------    t        =     3.83
    
    anovalator a, wgt(1 -1 0 0)
    
    Adjusted predictions                              Number of obs   =         32
    
    Expression   : Linear prediction, predict()
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               a |
              1  |          3   .5218443     5.75   0.000     1.977204    4.022796
              2  |        3.5   .5218443     6.71   0.000     2.477204    4.522796
              3  |       4.25   .5218443     8.14   0.000     3.227204    5.272796
              4  |       6.25   .5218443    11.98   0.000     5.227204    7.272796
    ------------------------------------------------------------------------------
    
    anovalator contrast for a  
    
    
     ( 1)  1bn.a - 2.a = 0
    
    ------------------------------------------------------------------------------
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             (1) |        -.5   .7379992    -0.68   0.498    -1.946452    .9464519
    ------------------------------------------------------------------------------
    
    anovalator a, wgt(0 0 1 -1) quietly
    
    anovalator contrast for a  
    
    
     ( 1)  3.a - 4.a = 0
    
    ------------------------------------------------------------------------------
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             (1) |         -2   .7379992    -2.71   0.007    -3.446452   -.5535481
    ------------------------------------------------------------------------------
    
    anovalator a, wgt(1 1 -1 -1) quietly
    
    anovalator contrast for a  
    
    
     ( 1)  1bn.a + 2.a - 3.a - 4.a = 0
    
    ------------------------------------------------------------------------------
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             (1) |         -4   1.043689    -3.83   0.000    -6.045592   -1.954408
    ------------------------------------------------------------------------------

    Recall

  • The group means for the 4 group one-way anova
    Group    1     2     3     4
    Mean    3.00  3.50  4.25  6.25
    
  • The MSerr = 2.179

    Dunnett's Test (Pairwise versus Control Group)

  • Compare p-1 treatment groups with a control group.

  • df for Dunnett's t is the same as the df associated with the MSerr
  • Critical values found in Dunnett's table of critical values using p and df for MSerr
  • In our four group example p = 4, dferr = 28, and the nearest critical value of Dunnett's t is 2.51.
  • Dunnett's t squared is an F and the critical value is 6.30.

    An Example

  • Using contrasts ψ1, ψ2 & ψ5
  • The sets of weights are:
    c1 = 1 -1 0 0
    c2 = 1 0 -1 0
    c5 = 1 0 0 -1
  • MSerr = 2.179 and n = 8 for each group
  • The three t-tests are:
    ψ1: t = -0.68 --> t² = F = 0.46 -- n.s.
    ψ2: t = -1.69 --> t² = F = 2.87 -- n.s.
    ψ5: t = -4.40 --> t² = F = 19.39 -- sig. at .05
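
The comparisons with the control group can be sketched as below. It treats group 1 as the control and takes 2.51 as the tabled Dunnett critical value quoted above; the helper dunnett_t is an illustrative name, not part of any package.

```python
import math

means = {1: 3.00, 2: 3.50, 3: 4.25, 4: 6.25}  # group 1 is the control
ms_err = 2.179                                # MSerr from the example
n = 8                                         # observations per group
dunnett_crit = 2.51                           # tabled value quoted in the notes

def dunnett_t(control_mean, treatment_mean, ms_err, n):
    """t statistic for control vs. one treatment group (equal n)."""
    return (control_mean - treatment_mean) / math.sqrt(2 * ms_err / n)

results = {}
for group in (2, 3, 4):
    t = dunnett_t(means[1], means[group], ms_err, n)
    results[group] = (t, abs(t) > dunnett_crit)
    verdict = "sig. at .05" if results[group][1] else "n.s."
    print(f"1 vs {group}: t = {t:.2f}, F = {t * t:.2f} -- {verdict}")
```

Only the critical value differs from an ordinary t-test; the statistic itself is the usual two-group t using the pooled MSerr.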

    Using Stata

    use http://www.philender.com/courses/data/cr4new, clear
    
    anova y a
    
                               Number of obs =      32     R-squared     =  0.4455
                               Root MSE      =   1.476     Adj R-squared =  0.3860
    
                      Source |  Partial SS    df       MS           F     Prob > F
                  -----------+----------------------------------------------------
                       Model |       49.00     3  16.3333333       7.50     0.0008
                             |
                           a |       49.00     3  16.3333333       7.50     0.0008
                             |
                    Residual |       61.00    28  2.17857143   
                  -----------+----------------------------------------------------
                       Total |      110.00    31   3.5483871  
    				   
    anovacontrast a, values(1 -1  0  0) title(1vs2)
    
    1vs2
    Contrast variable a (1 -1 0 0)                 Dep Var  =        y
    source           SS          df      MS        Contrast =    -0.50
    ---------+---------------------------------    N of obs =       32
    contrast |          1         1      1.0000    F        =     0.46
    error    |         61        28      2.1786    Prob > F =   0.5036
    ---------+---------------------------------    t        =     0.68
    
    anovacontrast a, values(1 0 -1 0) title(1vs3)
    
    1vs3
    Contrast variable a (1 0 -1 0)                 Dep Var  =        y
    source           SS          df      MS        Contrast =    -1.25
    ---------+---------------------------------    N of obs =       32
    contrast |       6.25         1      6.2500    F        =     2.87
    error    |         61        28      2.1786    Prob > F =   0.1014
    ---------+---------------------------------    t        =     1.69
    
    anovacontrast a, values(1 0 0 -1) title(1vs4)
    
    1vs4
    Contrast variable a (1 0 0 -1)                 Dep Var  =        y
    source           SS          df      MS        Contrast =    -3.25
    ---------+---------------------------------    N of obs =       32
    contrast |      42.25         1     42.2500    F        =    19.39
    error    |         61        28      2.1786    Prob > F =   0.0001
    ---------+---------------------------------    t        =     4.40
    
    /* use regression with appropriate reference group */
    
    regress y ib1.a
    
          Source |       SS       df       MS              Number of obs =      32
    -------------+------------------------------           F(  3,    28) =    7.50
           Model |          49     3  16.3333333           Prob > F      =  0.0008
        Residual |          61    28  2.17857143           R-squared     =  0.4455
    -------------+------------------------------           Adj R-squared =  0.3860
           Total |         110    31   3.5483871           Root MSE      =   1.476
    
    ------------------------------------------------------------------------------
               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               a |
              2  |         .5   .7379992     0.68   0.504    -1.011723    2.011723
              3  |       1.25   .7379992     1.69   0.101    -.2617229    2.761723
              4  |       3.25   .7379992     4.40   0.000     1.738277    4.761723
                 |
           _cons |          3   .5218443     5.75   0.000      1.93105     4.06895
    ------------------------------------------------------------------------------
    

    Recall

  • The group means for the 4 group one-way anova
    Group    1     2     3     4
    Mean    3.00  3.50  4.25  6.25
    
  • The MSerr = 2.179

    Fisher-Hayter Pairwise Comparisons

  • A post-hoc technique that tests all pairwise comparisons of means
  • Based on the Studentized Range Distribution; the critical value is q(α, k-1, dfe) = q(.05, 3, 28) = 3.4994064
  • Let nn = (1/n1 + 1/n2) = (1/8 + 1/8) = 2/8 = 0.25
  • Let the critical difference = sqrt(mse * nn / 2) * q = sqrt(2.18 * 0.25 / 2) * 3.5 ≈ 1.826
  • Compare the absolute value of each pairwise difference in means with the critical difference
    1vs2  -0.50  n.s.
    1vs3  -1.25  n.s.
    1vs4  -3.25  sig.
    2vs3  -0.75  n.s.
    2vs4  -2.75  sig.
    3vs4  -2.00  sig.
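
The critical difference and the six decisions can be sketched as follows. The value of q is taken as given from the notes rather than computed, MSerr is the residual mean square 61/28 from the ANOVA table, and critical_difference is an illustrative name.

```python
import math
from itertools import combinations

means = {1: 3.00, 2: 3.50, 3: 4.25, 4: 6.25}
sizes = {1: 8, 2: 8, 3: 8, 4: 8}
ms_err = 61 / 28      # residual MS from the ANOVA table (about 2.179)
q_crit = 3.4994064    # q(.05, k-1 = 3, 28) as quoted in the notes

def critical_difference(n_i, n_j, ms_err, q_crit):
    """Fisher-Hayter critical difference for one pair of groups."""
    nn = 1 / n_i + 1 / n_j
    return math.sqrt(ms_err * nn / 2) * q_crit

significant = []
for i, j in combinations(means, 2):
    crit = critical_difference(sizes[i], sizes[j], ms_err, q_crit)
    diff = means[i] - means[j]
    if abs(diff) > crit:
        significant.append((i, j))
    print(f"{i}vs{j}  {diff:6.2f}  critical dif = {crit:.4f}")
```

Because nn is computed per pair, the same function would also cover unequal group sizes.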
    

    Alternatively

  • Alternatively, you could compute a kind of t-test for each pairwise difference in means, t = (Mi - Mj) / sqrt(mse * nn / 2), and compare its absolute value to q.
  • From above q(.05, 3, 28) = 3.5

  • Using this formula the t values are:
    1vs2  -0.96  n.s.
    1vs3  -2.40  n.s.
    1vs4  -6.23  sig.
    2vs3  -1.44  n.s.
    2vs4  -5.27  sig.
    3vs4  -3.83  sig.
    

    Using Stata

    use http://www.philender.com/courses/data/cr4new, clear
    
    anova y a
    
                               Number of obs =      32     R-squared     =  0.4455
                               Root MSE      =   1.476     Adj R-squared =  0.3860
    
                      Source |  Partial SS    df       MS           F     Prob > F
                  -----------+----------------------------------------------------
                       Model |       49.00     3  16.3333333       7.50     0.0008
                             |
                           a |       49.00     3  16.3333333       7.50     0.0008
                             |
                    Residual |       61.00    28  2.17857143   
                  -----------+----------------------------------------------------
                       Total |      110.00    31   3.5483871  
    
    fhcomp a
    
     Fisher-Hayter pairwise comparisons for variable grp
    studentized range critical value(.05, 3, 28) = 3.4994064
    
                                          mean     critical
    grp vs grp       group means          dif        dif
    -------------------------------------------------------
      1 vs   2     3.0000     3.5000     0.5000    1.8261
      1 vs   3     3.0000     4.2500     1.2500    1.8261
      1 vs   4     3.0000     6.2500     3.2500*   1.8261
      2 vs   3     3.5000     4.2500     0.7500    1.8261
      2 vs   4     3.5000     6.2500     2.7500*   1.8261
      3 vs   4     4.2500     6.2500     2.0000*   1.8261

    Tukey's HSD Pairwise Comparisons

  • A post-hoc technique that tests all pairwise comparisons of means
  • Based on the Studentized Range Distribution; the critical value is q(α, k, dfe) = q(.05, 4, 28) = 3.8613586
  • If the sample sizes are unequal use the harmonic mean sample size
  • For the harmonic mean sample size let n = k/(1/n1 + 1/n2 + 1/n3 + 1/n4) = 4/(1/8 + 1/8 + 1/8 + 1/8) = 8
  • Let the critical difference = sqrt(mse / n) * q = sqrt(2.18 / 8) * 3.86 = 2.015
  • Compare the absolute value of each pairwise difference in means with the critical difference
    1vs2  -0.50  n.s.
    1vs3  -1.25  n.s.
    1vs4  -3.25  sig.
    2vs3  -0.75  n.s.
    2vs4  -2.75  sig.
    3vs4  -2.00  n.s.
    

    Alternatively

  • Alternatively, you could compute a kind of t-test for each pairwise difference in means, t = (Mi - Mj) / sqrt(mse / n), and compare its absolute value to q.
  • From above q(.05, 4, 28) = 3.86

  • Using this formula the t values are:
    1vs2  -0.96  n.s.
    1vs3  -2.40  n.s.
    1vs4  -6.23  sig.
    2vs3  -1.44  n.s.
    2vs4  -5.27  sig.
    3vs4  -3.83  n.s.
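
A sketch of this second form of the test: each difference is divided by sqrt(mse / n), with n the harmonic mean sample size, and the result is compared with q. The critical value is again taken from the notes, not computed, and q_statistic is an illustrative name.

```python
import math
from itertools import combinations

means = {1: 3.00, 2: 3.50, 3: 4.25, 4: 6.25}
sizes = [8, 8, 8, 8]
ms_err = 61 / 28      # residual MS from the ANOVA table (about 2.179)
q_crit = 3.8613586    # q(.05, k = 4, 28) as quoted in the notes

# Harmonic mean sample size; equals 8 here because all groups have n = 8.
n_h = len(sizes) / sum(1 / n for n in sizes)

def q_statistic(mean_i, mean_j, ms_err, n_h):
    """Pairwise difference scaled to the studentized range metric."""
    return (mean_i - mean_j) / math.sqrt(ms_err / n_h)

significant = []
for i, j in combinations(means, 2):
    stat = q_statistic(means[i], means[j], ms_err, n_h)
    if abs(stat) > q_crit:
        significant.append((i, j))
    print(f"{i}vs{j}  {stat:6.2f}")
```

The 3 vs 4 statistic (3.83 in absolute value) falls just short of 3.86, which is why that pair is not significant under Tukey's HSD.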
    

    Using Stata

    use http://www.philender.com/courses/data/cr4new, clear
    
    anova y a
    
                               Number of obs =      32     R-squared     =  0.4455
                               Root MSE      =   1.476     Adj R-squared =  0.3860
    
                      Source |  Partial SS    df       MS           F     Prob > F
                  -----------+----------------------------------------------------
                       Model |       49.00     3  16.3333333       7.50     0.0008
                             |
                           a |       49.00     3  16.3333333       7.50     0.0008
                             |
                    Residual |       61.00    28  2.17857143   
                  -----------+----------------------------------------------------
                       Total |      110.00    31   3.5483871  
                      
    tukeyhsd a
    
    Tukey HSD pairwise comparisons for variable a
    studentized range critical value(.05, 4, 28) = 3.8613586
    uses harmonica mean sample size =    8.000
    
                                          mean     critical
    grp vs grp       group means          dif        dif
    -------------------------------------------------------
      1 vs   2     3.0000     3.5000     0.5000    2.0150
      1 vs   3     3.0000     4.2500     1.2500    2.0150
      1 vs   4     3.0000     6.2500     3.2500*   2.0150
      2 vs   3     3.5000     4.2500     0.7500    2.0150
      2 vs   4     3.5000     6.2500     2.7500*   2.0150
      3 vs   4     4.2500     6.2500     2.0000    2.0150
      
    tkcomp a
    
    Tukey-Kramer pairwise comparisons for variable a
    studentized range critical value(.05, 4, 28) = 3.8613586
    
                                          mean 
    grp vs grp       group means          dif     TK-test
    -------------------------------------------------------
      1 vs   2     3.0000     3.5000      0.5000   0.9581 
      1 vs   3     3.0000     4.2500      1.2500   2.3954 
      1 vs   4     3.0000     6.2500      3.2500   6.2279*
      2 vs   3     3.5000     4.2500      0.7500   1.4372 
      2 vs   4     3.5000     6.2500      2.7500   5.2698*
      3 vs   4     4.2500     6.2500      2.0000   3.8326 
    

    Comparing Tukey's HSD with Tukey-Kramer

  • When cell sizes are equal Tukey HSD and Tukey-Kramer give the same results.
  • Tukey-Kramer handles unequal cell sizes better than Tukey HSD.

    Comparing Fisher-Hayter with Tukey's HSD

  • When the cell sizes are equal Fisher-Hayter and Tukey's HSD yield the same test statistic for each pair, i.e., tq = qT; only the critical values differ.
  • Fisher-Hayter is a little bit more powerful since the critical value from the Studentized Range is based on k-1 and not k as in Tukey's HSD.
  • Fisher-Hayter handles unequal cell sizes in a cleaner manner.

    Recall

  • The group means for the 4 group one-way anova
    Group    1     2     3     4
    Mean    3.00  3.50  4.25  6.25
    
  • The MSerr = 2.179

    Scheffé's Test

  • Can perform an unlimited number of contrasts
  • Usually used to perform non-pairwise comparisons.
  • Create a new critical value of F
    FS = (p-1)F where F is found using p-1 & dferror degrees of freedom
  • The critical value of F for the ANOVA at α = .05 was F3,28 = 2.95
  • The critical value for FS = (4-1)2.95 = 8.85
  • Formula for Scheffé

    From Our Example

  • Using contrasts ψa, ψb, ψc & ψd
  • The sets of weights are:
    ca = 3 -1 -1 -1
    cb = 2  0 -1 -1
    cc = 1  1 -1 -1 
    cd = 1  1 -2  0
    

  • MSerr = 2.179 and n = 8 for each group
  • The four F-tests are:
    Fa =   7.65 -- n.s.
    Fb =  12.39 -- sig.
    Fc =  14.69 -- sig.
    Fd =   2.45 -- n.s.
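
The four F-tests can be sketched as below, assuming the usual single-df contrast F = ψ² / (MSerr × Σc² / n), which reproduces the values above. The critical F of 2.95 is taken from the notes; contrast_f and the psi_a..psi_d keys are illustrative names.

```python
means = [3.00, 3.50, 4.25, 6.25]
ms_err = 61 / 28   # residual MS from the ANOVA table (about 2.179)
n = 8              # observations per group
p = 4              # number of groups
f_crit = 2.95      # F(.05; 3, 28) as quoted in the notes
f_scheffe = (p - 1) * f_crit

def contrast_f(weights, means, ms_err, n):
    """Single-df F for a contrast among means with equal group sizes n."""
    psi = sum(c * m for c, m in zip(weights, means))
    ss_contrast = psi ** 2 / (sum(c * c for c in weights) / n)
    return ss_contrast / ms_err

contrasts = {
    "psi_a": (3, -1, -1, -1),
    "psi_b": (2, 0, -1, -1),
    "psi_c": (1, 1, -1, -1),
    "psi_d": (1, 1, -2, 0),
}
significant = []
for name, c in contrasts.items():
    f = contrast_f(c, means, ms_err, n)
    if f > f_scheffe:
        significant.append(name)
    print(f"{name}: F = {f:5.2f} vs FS = {f_scheffe:.2f}")
```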
    
    Using Stata

  • When using coded contrasts the critical value should be 8.85

    use http://www.philender.com/courses/data/cr4new, clear
    
    anova y a
    
                               Number of obs =      32     R-squared     =  0.4455
                               Root MSE      =   1.476     Adj R-squared =  0.3860
    
                      Source |  Partial SS    df       MS           F     Prob > F
                  -----------+----------------------------------------------------
                       Model |       49.00     3  16.3333333       7.50     0.0008
                             |
                           a |       49.00     3  16.3333333       7.50     0.0008
                             |
                    Residual |       61.00    28  2.17857143   
                  -----------+----------------------------------------------------
                       Total |      110.00    31   3.5483871    
    
    anovacontrast a, values(3 -1  -1  -1) title(1vs234)
    
    1vs234
    Contrast variable a (3 -1 -1 -1)               Dep Var  =        y
    source           SS          df      MS        Contrast =    -5.00
    ---------+---------------------------------    N of obs =       32
    contrast | 16.6666667         1     16.6667    F        =     7.65
    error    |         61        28      2.1786    Prob > F =   0.0099
    ---------+---------------------------------    t        =     2.77
    
    anovacontrast a, values(2  0  -1  -1) title(1vs34)
    
    1vs34
    Contrast variable a (2 0 -1 -1)                Dep Var  =        y
    source           SS          df      MS        Contrast =    -4.50
    ---------+---------------------------------    N of obs =       32
    contrast |         27         1     27.0000    F        =    12.39
    error    |         61        28      2.1786    Prob > F =   0.0015
    ---------+---------------------------------    t        =     3.52
    
    anovacontrast a, values(1  1  -1  -1) title(12vs34)
    
    12vs34
    Contrast variable a (1 1 -1 -1)                Dep Var  =        y
    source           SS          df      MS        Contrast =    -4.00
    ---------+---------------------------------    N of obs =       32
    contrast |         32         1     32.0000    F        =    14.69
    error    |         61        28      2.1786    Prob > F =   0.0007
    ---------+---------------------------------    t        =     3.83
    
    anovacontrast a, values(1  1  -2   0) title(12vs3)
    
    12vs3
    Contrast variable a (1 1 -2 0)                 Dep Var  =        y
    source           SS          df      MS        Contrast =    -2.00
    ---------+---------------------------------    N of obs =       32
    contrast | 5.33333333         1      5.3333    F        =     2.45
    error    |         61        28      2.1786    Prob > F =   0.1289
    ---------+---------------------------------    t        =     1.56

    Bonferroni & Sidak Methods

  • Quick and dirty
  • Not very powerful for pairwise comparisons
  • Divide the alpha level by the number of contrasts to arrive at a new alpha level
  • Sidak is a modification of the Bonferroni approach

    For the non-pairwise contrasts in the section above the Bonferroni alpha level would be αB = .05/4 = .0125, which for a single-df contrast with 28 error df equates to a critical value of about FB = 7.13.

    The Sidak alpha level is αSi = 1 - (1 - .05)^.25 = .01274146, which equates to a critical value of about FSi = 7.08.

    Compare these critical values with the Scheffé critical value of 8.85.
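
The two adjusted alpha levels are simple to compute. The sketch below covers only the alpha adjustment, not the lookup of the corresponding critical F; the function names are illustrative.

```python
def bonferroni_alpha(alpha: float, k: int) -> float:
    """Per-contrast alpha when k contrasts share a familywise alpha."""
    return alpha / k

def sidak_alpha(alpha: float, k: int) -> float:
    """Sidak per-contrast alpha for k contrasts."""
    return 1 - (1 - alpha) ** (1 / k)

k = 4  # the four non-pairwise contrasts from the Scheffé example
print(f"Bonferroni: {bonferroni_alpha(0.05, k):.4f}")
print(f"Sidak:      {sidak_alpha(0.05, k):.8f}")
```

The Sidak level is always slightly larger than the Bonferroni level, so its critical value is slightly smaller.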

    Comparing the Comparisons

    Consider a four group design with error df=28. Here are the critical values for pairwise comparisons using various methods at α = 0.05.

    Method                     Critical Value of t*
    Ordinary Student's t                  2.048
    Dunnett's test                        2.157
    Fisher-Hayter                         2.474 requires rescaling studentized range statistic
    Tukey HSD                             2.730 requires rescaling studentized range statistic
    Tukey-Kramer                          2.730 requires rescaling studentized range statistic
    Sidak                                 2.830
    Bonferroni                            2.839
    Scheffé                               2.975
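
The rescaling noted in the table can be sketched as follows, assuming a studentized range value is put on the t metric by dividing by sqrt(2) and the Scheffé value is the square root of FS. The q and FS values are the ones quoted earlier in these notes.

```python
import math

q_fisher_hayter = 3.4994064  # q(.05, 3, 28)
q_tukey = 3.8613586          # q(.05, 4, 28)
f_scheffe = 8.85             # (p - 1) * F(.05; 3, 28)

def q_to_t(q: float) -> float:
    """Rescale a studentized range critical value to the t metric."""
    return q / math.sqrt(2)

t_fisher_hayter = q_to_t(q_fisher_hayter)
t_tukey = q_to_t(q_tukey)
t_scheffe = math.sqrt(f_scheffe)

print(f"Fisher-Hayter  {t_fisher_hayter:.3f}")
print(f"Tukey HSD      {t_tukey:.3f}")
print(f"Scheffé        {t_scheffe:.3f}")
```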


    Linear Statistical Models Course

    Phil Ender, 17sep10, 13apr06, 12Feb98