Introduction to Research Design and Statistics

Exploratory Data Analysis

The Five-Number Summary

  • Maximum
  • Q3 (75th percentile)
  • Median (Q2, 50th percentile)
  • Q1 (25th percentile)
  • Minimum

Stem-and-Leaf Plots (Stemplots)

    8| 3
    7| 89
    7| 0122
    6| 001133555
    5| 77788
    5| 222
    4| 56699
    4| 022
    3| 799
    3| 355
    2| 69

Box-and-Whisker Plots (Boxplots)

    Boxplot for multiple groups.

Stata Examples

    use, clear
    summarize write, detail
                            writing score
          Percentiles      Smallest
     1%           31             31
     5%         35.5             31
    10%           39             31       Obs                 200
    25%         45.5             31       Sum of Wgt.         200
    50%           54                      Mean             52.775
                            Largest       Std. Dev.      9.478586
    75%           60             67
    90%           65             67       Variance       89.84359
    95%           65             67       Skewness      -.4784158
    99%           67             67       Kurtosis       2.238527
    univar write
    /* univar is a user-written command; to locate and install it, type: net search univar */
                                            -------------- Quantiles --------------
    Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
       write     200    52.77     9.48    31.00    45.50    54.00    60.00    67.00
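
Since univar is not part of official Stata, it may help to see the same five numbers pulled from built-in results. The following is only a minimal sketch: it assumes a small made-up variable named score (hypothetical data, not the course file) and relies on summarize with the detail option leaving its results in r().

```stata
* Hypothetical toy data for illustration only (not the course dataset)
clear
input score
26
35
39
42
49
52
57
61
65
72
83
end

* summarize, detail stores the extremes and percentiles in r()
quietly summarize score, detail

* Assemble the five-number summary from the stored results
display "Minimum = " r(min)
display "Q1      = " r(p25)
display "Median  = " r(p50)
display "Q3      = " r(p75)
display "Maximum = " r(max)
```

Running the summarize and display lines with write in place of score on the course data should give the values in the Quantiles row above: 31, 45.5, 54, 60 and 67.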
    stem write, lines(2)
    Stem-and-leaf plot for write (writing score)
      3* | 11113333
      3. | 5566777899999
      4* | 0001111111111223444444444444
      4. | 56666666667799999999999
      5* | 00222222222222222344444444444444444
      5. | 5557777777777779999999999999999999999999
      6* | 000011112222222222222222223333
      6. | 55555555555555557777777
    graph box write, fxsize(50)
    table female, contents(freq mean write median write sd write)
       female |       Freq.  mean(write)   med(write)    sd(write)
         male |          91     50.12088           52     10.30516
       female |         109     54.99083           57     8.133716
    univar write, by(female)
    -> female=male 
                                            -------------- Quantiles --------------
    Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
       write      91    50.12    10.31    31.00    41.00    52.00    59.00    67.00
    -> female=female 
                                            -------------- Quantiles --------------
    Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
       write     109    54.99     8.13    35.00    50.00    57.00    62.00    67.00
    graph box write, over(female)
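
The contents() option of table shown above belongs to the older table syntax, which appears to have been replaced in Stata 17 and later. One possible alternative that uses only the built-in tabstat command is sketched here on hypothetical toy data; score and group are made-up names standing in for write and female.

```stata
* Hypothetical toy data for illustration only (not the course dataset)
clear
input score group
26 0
35 0
42 0
49 0
57 0
39 1
52 1
61 1
65 1
72 1
83 1
end

* Group size and five-number summary within each level of group
tabstat score, by(group) statistics(n min p25 median p75 max)

* Side-by-side boxplots for the same groups
graph box score, over(group)
```

With the course data in memory, the analogous call would be tabstat write, by(female) statistics(n min p25 median p75 max).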

      Notched Boxplot

    (Created in R)

    Phil Ender, 30Jun98