Education 231C

Applied Categorical & Nonnormal Data Analysis

Survey Logistic Regression


When analyzing data from a survey design, it is necessary to take into account aspects of the design such as stratification and cluster sampling. If you ignore these aspects of the sampling design, you may end up with biased coefficients and will almost certainly get incorrect standard errors. In the next example we will demonstrate a logistic analysis using a stratified random sampling design.

Survey Logit with Stratified Random Sampling

Using Academic Performance Index (API) data provided by the California State Department of Education, we will take a stratified random sample of 100 elementary schools, 50 middle schools, and 50 high schools, with school type as the stratum. These are drawn from a statewide total of 4,421 elementary schools, 1,018 middle schools, and 755 high schools.

The file apistrat.dta contains the data for the stratified random sample.
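To make the role of the design concrete, here is a minimal sketch (in Python, standard library only) of the sampling weights implied by the stratum counts quoted above. It assumes simple random sampling within each stratum, so the weight is just N_h / n_h. The names `population`, `sample`, and `stratum_weights` are illustrative and do not come from the data file, which may carry its own weight variable.

```python
# Population and sample counts per stratum, taken from the text above.
population = {"elementary": 4421, "middle": 1018, "high": 755}
sample = {"elementary": 100, "middle": 50, "high": 50}


def stratum_weights(population, sample):
    """Return the sampling weight N_h / n_h for each stratum.

    Assumes simple random sampling within strata (an assumption of this
    sketch, not something read from apistrat.dta).
    """
    return {h: population[h] / sample[h] for h in population}


weights = stratum_weights(population, sample)

for stratum, w in weights.items():
    print(f"{stratum}: each sampled school represents {w:.2f} schools")

# Sanity check: the weighted sample size should recover the population size.
estimated_total = sum(weights[h] * sample[h] for h in sample)
print(f"estimated number of schools: {estimated_total:.0f}")
```

The weights are unequal across strata: elementary schools are sampled at a much lower rate than high schools. This is why an unweighted analysis of the stratified sample can give misleading estimates.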

Survey Logit with One-Stage Cluster Sampling

Another type of sampling design is cluster sampling. In this example we will use school districts as the clusters, or primary sampling units (PSUs). We will take a random sample of 15 of the state's 757 school districts and include every school in each sampled district.

The file apiclus1.dta contains the data for the one-stage cluster sampling design.
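As a rough illustration of the one-stage cluster design, the sketch below (Python, standard library only) computes the design weight and sampling fraction implied by the district counts above. It assumes the 15 districts are drawn by simple random sampling and that every school in a sampled district is included, so each school inherits its district's weight. The function and variable names are hypothetical, and any weight stored in the data file may differ from this textbook value.

```python
# District counts taken from the text above.
TOTAL_DISTRICTS = 757
SAMPLED_DISTRICTS = 15


def one_stage_cluster_weight(n_clusters_total, n_clusters_sampled):
    """Design weight for one-stage cluster sampling.

    Assumes clusters are drawn by simple random sampling and all units
    within a sampled cluster are observed (assumptions of this sketch).
    """
    return n_clusters_total / n_clusters_sampled


weight = one_stage_cluster_weight(TOTAL_DISTRICTS, SAMPLED_DISTRICTS)
sampling_fraction = SAMPLED_DISTRICTS / TOTAL_DISTRICTS

# Finite population correction factor at the cluster level, 1 - n/N.
fpc = 1 - sampling_fraction

print(f"each sampled district represents {weight:.2f} districts")
print(f"sampling fraction: {sampling_fraction:.4f}")
print(f"finite population correction factor: {fpc:.4f}")
```

Unlike the stratified design, every sampled school here gets the same weight. The design matters mainly through the standard errors, because schools within one district tend to resemble each other.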



Phil Ender