Introduction to Research Design and Statistics

Nonparametric Tests


Introduction

Many of the statistical tests that we have looked at so far make assumptions about the distribution of the observations. For example, the two-sample t-test assumes that the observations come from normal populations that have equal variances. Nonparametric statistics allow the researcher to test hypotheses without making these assumptions. They are very useful when the data are badly behaved (for example, strongly skewed or containing outliers), but they are somewhat less powerful than parametric tests when the distributional assumptions are met.

We will illustrate some nonparametric tests using the hsb2 dataset, looking at the differences in writing test scores for males and females.

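Student's t-test (Parametric)

We will begin with the standard Student's t-test (a parametric test) so that we have a baseline for comparing the nonparametric results. The t-test is statistically significant (t = -3.73, df = 198, two-tailed p = 0.0002): females score about 4.9 points higher than males on average.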

use http://www.philender.com/courses/data/hsb2, clear

ttest write, by(female)

Two-sample t test with equal variances

------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    male |      91    50.12088    1.080274    10.30516    47.97473    52.26703
  female |     109    54.99083    .7790686    8.133715    53.44658    56.53507
---------+--------------------------------------------------------------------
combined |     200      52.775    .6702372    9.478586    51.45332    54.09668
---------+--------------------------------------------------------------------
    diff |           -4.869947    1.304191               -7.441835   -2.298059
------------------------------------------------------------------------------
Degrees of freedom: 198

                  Ho: mean(male) - mean(female) = diff = 0

     Ha: diff < 0               Ha: diff ~= 0              Ha: diff > 0
       t =  -3.7341                t =  -3.7341              t =  -3.7341
   P < t =   0.0001          P > |t| =   0.0002          P > t =   0.9999
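
As a check on how the t statistic is built, the following Python sketch recomputes the pooled standard error and t from the group summaries printed above. It is not part of the Stata session; the variable names are illustrative, and the inputs are the rounded values from the output, so agreement is only up to rounding.

```python
import math

# Group summaries copied from the ttest output above (rounded as printed).
n_male, mean_male, sd_male = 91, 50.12088, 10.30516
n_female, mean_female, sd_female = 109, 54.99083, 8.133715

# Equal-variance assumption: both groups contribute to one pooled
# variance estimate, weighted by their degrees of freedom.
df = n_male + n_female - 2
pooled_var = ((n_male - 1) * sd_male**2 + (n_female - 1) * sd_female**2) / df

# Standard error of the difference in means, then the t statistic.
se_diff = math.sqrt(pooled_var * (1 / n_male + 1 / n_female))
diff = mean_male - mean_female
t_stat = diff / se_diff

print(f"df = {df}, se = {se_diff:.4f}, t = {t_stat:.3f}")
```

The recomputed standard error and t agree with the reported 1.304191 and -3.7341 up to rounding of the inputs.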

The Wilcoxon (Mann-Whitney) Test

The ranksum command performs the Wilcoxon rank-sum (Mann-Whitney) test, the nonparametric counterpart of the two-sample t-test. Like the t-test, it finds a significant difference with these data (z = -3.329, p = 0.0009).

ranksum write, by(female)

Two-sample Wilcoxon rank-sum (Mann-Whitney) test

      female |      obs    rank sum    expected
-------------+---------------------------------
        male |       91        7792      9145.5
      female |      109       12308     10954.5
-------------+---------------------------------
    combined |      200       20100       20100

unadjusted variance   166143.25
adjustment for ties     -852.96
                     ----------
adjusted variance     165290.29

Ho: write(female==male) = write(female==female)
             z =  -3.329
    Prob > |z| =   0.0009
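
The quantities in the ranksum output follow from the group sizes and the observed rank sum. The Python sketch below reproduces the expected rank sum, the unadjusted variance, and z by the normal approximation. It is an illustration rather than Stata's own code: the tie-adjusted variance is taken directly from the output, because recomputing it would require the raw scores.

```python
import math

# Values copied from the ranksum output above.
n_male, n_female = 91, 109
rank_sum_male = 7792
adjusted_variance = 165290.29  # tie-adjusted value as printed by Stata

n_total = n_male + n_female

# Under the null hypothesis, each observation has expected rank (N + 1) / 2,
# so the expected rank sum for a group of size n is n * (N + 1) / 2.
expected_male = n_male * (n_total + 1) / 2

# Variance of the rank sum when there are no ties.
unadjusted_variance = n_male * n_female * (n_total + 1) / 12

# Normal approximation using the tie-adjusted variance.
z = (rank_sum_male - expected_male) / math.sqrt(adjusted_variance)

# Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"expected = {expected_male}, variance = {unadjusted_variance}")
print(f"z = {z:.3f}, p = {p_two_sided:.4f}")
```

These values match the expected rank sum (9145.5), unadjusted variance (166143.25), z, and Prob > |z| shown in the output.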

Looking at K-groups

We will stay with the hsb2 dataset but turn our attention to differences in writing scores across the three program types (prog).

One-way ANOVA (Parametric)

The one-way ANOVA, a parametric test, will be used for comparison. Note that the results are statistically significant (F(2, 197) = 21.27, p < 0.0001).

anova write prog

                           Number of obs =     200     R-squared     =  0.1776
                           Root MSE      = 8.63918     Adj R-squared =  0.1693

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  3175.69786     2  1587.84893      21.27     0.0000
                         |
                    prog |  3175.69786     2  1587.84893      21.27     0.0000
                         |
                Residual |  14703.1771   197   74.635417   
              -----------+----------------------------------------------------
                   Total |   17878.875   199   89.843593 
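
The summary statistics in the ANOVA table can be rebuilt from the sums of squares and degrees of freedom. The Python sketch below does that arithmetic; the variable names are illustrative, and the inputs are the values printed in the table.

```python
import math

# Sums of squares and degrees of freedom copied from the anova table above.
ss_model, df_model = 3175.69786, 2
ss_resid, df_resid = 14703.1771, 197

ss_total = ss_model + ss_resid
df_total = df_model + df_resid

# Mean squares are sums of squares divided by their degrees of freedom;
# F is the ratio of the model mean square to the residual mean square.
ms_model = ss_model / df_model
ms_resid = ss_resid / df_resid
f_stat = ms_model / ms_resid

# Proportion of variance explained, with and without the df adjustment.
r_squared = ss_model / ss_total
adj_r_squared = 1 - (1 - r_squared) * df_total / df_resid

# Root MSE is the square root of the residual mean square.
root_mse = math.sqrt(ms_resid)

print(f"F = {f_stat:.2f}, R-squared = {r_squared:.4f}")
print(f"Adj R-squared = {adj_r_squared:.4f}, Root MSE = {root_mse:.5f}")
```

The results agree with the F, R-squared, Adj R-squared, and Root MSE reported by Stata.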

The Kruskal-Wallis Test

The Kruskal-Wallis test is the nonparametric equivalent of the one-way ANOVA. It also finds a significant difference among the three program types.

kwallis write, by(prog)

Test: Equality of populations (Kruskal-Wallis test)

     prog          _Obs   _RankSum
  general            45    4079.00
 academic           105   12764.00
 vocation            50    3257.00

chi-squared =    33.870 with 2 d.f.
probability =     0.0001

chi-squared with ties =    34.045 with 2 d.f.
probability =     0.0001
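
The Kruskal-Wallis statistic can be recomputed from the group sizes and rank sums in the output. The Python sketch below is an illustration under two stated assumptions. First, the tie correction factor is taken as the ratio of adjusted to unadjusted variance in the ranksum output above, which applies because both tests rank the same 200 writing scores. Second, the p-value uses the closed form exp(-x/2), which is valid only for a chi-squared distribution with 2 degrees of freedom.

```python
import math

# Group sizes and rank sums copied from the kwallis output above.
groups = {
    "general": (45, 4079.0),
    "academic": (105, 12764.0),
    "vocation": (50, 3257.0),
}

n_total = sum(n for n, _ in groups.values())

# H = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1)
h_stat = (
    12 / (n_total * (n_total + 1))
    * sum(rank_sum**2 / n for n, rank_sum in groups.values())
    - 3 * (n_total + 1)
)

# Assumption: tie factor borrowed from the ranksum output
# (adjusted variance / unadjusted variance for the same 200 scores).
tie_factor = 165290.29 / 166143.25
h_ties = h_stat / tie_factor

# Upper-tail probability for chi-squared with 2 d.f. only: exp(-x / 2).
df = len(groups) - 1
assert df == 2, "closed-form tail used below holds only for 2 d.f."
p_value = math.exp(-h_ties / 2)

print(f"H = {h_stat:.3f}, H with ties = {h_ties:.3f}, p = {p_value:.2e}")
```

The recomputed statistics agree with the reported 33.870 and 34.045. The computed tail probability is far smaller than 0.0001, which suggests that the probability shown in the output is best read as p < 0.0001.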



Phil Ender, 24Sep01