Newspaper Story


Here is an interesting article in the field of education in which regression plays a major role.

School Test Scores Missing Key Parts of the Equation

By Doug Smith, Times Education Writer

October 28, 1998

  • Education: Some say poverty and other social factors should be considered when deciding how a school is doing.

    Measured by test scores alone, McKinley Elementary School in South-Central Los Angeles stands out as a disaster.

    The school last month was placed on academic probation for failing to show improvement in the year since it was placed on Supt. Ruben Zacarias' list of 100 worst-performing schools.

    Yet, from another perspective, that judgment appears unfair. McKinley's students are mostly poor, fewer than half speak English as their native language, and they frequently change schools. With all that, McKinley has one of the toughest assignments in the Los Angeles Unified School District.

    An analysis by The Times that takes those factors into account yields a much more favorable conclusion, elevating McKinley above the average district elementary school.

    At the same time, some other area schools that rank well when measured simply by test scores do less well when economic and social factors are considered. Some could be accused of simply coasting by on the advantages that students from middle- and upper-class homes bring to the classroom.

    This schizophrenic picture illustrates the treacherous questions awaiting statewide efforts to judge school performance. Should schools be held accountable to a single bottom line--standardized test scores? Or should they be given consideration for the socioeconomic factors known to influence student achievement?

    An evaluation that recognizes the impact of poverty or transiency would highlight campuses where students are exceeding academic expectations. It also would point out those that are underperforming.

    Because they now largely escape notice, underperforming schools, and their high-scoring students, may be hurt just as much as those at the bottom, said Bruce Fuller, a UC Berkeley associate professor of education.

    Without a method for factoring in demographics, Fuller points out, the district cannot evaluate the effectiveness of presumably high-performance schools and search out those that are underachieving.

    "I think it's a real middle-class issue, as the school choice movement catches on in L.A.," he said.

    Thorny Issues of Accountability

    A nationwide movement to enhance accountability in public education has given new urgency to the issue. The stakes are high for schools and their staffs, who are increasingly competing for rewards when they do well and face penalties, including losing their jobs, when they fail.

    Opinions are deeply divided. Some educators argue that it is unfair to brand schools as "the worst" mainly because of circumstances their teachers can't control. Others contend that it is unfair to poor students to use a measure that does not hold their schools to a single academic standard.

    As yet, there are no definitive answers to those questions anywhere in America. But some school systems are far ahead of Los Angeles and California in untangling the knot of educational assessment.

    The prime examples are Dallas and the state of Tennessee, which use complex computer models to level the test score playing field by taking into account demographic influences.

    The architects of these statistical models promote them as the answer to the inherent tyranny of ranking schools by their students' scores.

    The shortcomings of a simple performance-based ranking were dramatically illustrated this year in the derailing of Zacarias' groundbreaking initiative to put principals on the spot if their schools do not do well on the state's standardized test.

    Zacarias won plaudits when he announced his plan, but when the list came out in July 1997, all but five of the schools had at least two-thirds of their students meeting the poverty guideline for free and reduced-cost lunches.

    Critics quickly complained that the 100 schools plan ignored the reality that poor children generally do not score as well as those who are better off.

    "You can hold people accountable for things they can control, but you can't hold them accountable just for results," teachers union President Day Higuchi said. When the time came a year later to hold the 100 schools accountable, Zacarias took little more than symbolic action.

    He issued no new list of 100 schools, acknowledging that he needed to find a method of also identifying schools in middle- and upper-class neighborhoods that were performing below par.

    The basic tool that testing experts use to control for the effects of demographics is a statistical procedure called regression. In essence, regression examines whether test scores tend to increase or decrease as social characteristics such as overcrowding or transiency change.

    Regression studies have consistently shown that wealth is the strongest influence on test scores. As the percentage of poor students increases, scores decline.

    But given the same poverty level, some schools do better than others. A simple regression model uses data from a large group of schools to make predictions--given a certain level of poverty, the model forecasts the scores that one would expect to see. By comparing that prediction to a school's actual scores, one can measure whether a school is performing better than expected or worse.
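The predict-then-compare idea can be sketched in a few lines of Python. The numbers below are invented for illustration and are not the Times data: a straight line is fitted to (poverty, score) pairs by ordinary least squares, and a school's residual is its actual score minus the score the line predicts.

```python
# Minimal sketch of simple regression and residuals, with made-up numbers.

def fit_line(xs, ys):
    """Closed-form least-squares slope and intercept for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a, b

# Hypothetical schools: (percent of students in poverty, test percentile)
poverty = [10, 30, 50, 70, 90]
scores = [70, 60, 50, 40, 30]

a, b = fit_line(poverty, scores)

# A hypothetical school at 90% poverty scoring 38 beats its
# predicted score of 30 by 8 points -- it performs above expectations.
predicted = a + b * 90
residual = 38 - predicted
print(round(predicted), round(residual))  # 30 8
```

A positive residual means the school did better than demographically similar schools; a negative one means it did worse.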

    To illustrate how such an analysis might affect school rankings in Los Angeles, The Times, in consultation with a UCLA statistician, constructed a regression model for Los Angeles elementary schools. The analysis correlated test scores with two socioeconomic factors: poverty, which has the strongest correlation with scores, and the stability rate--a measure of the percentage of students who stay at a campus throughout the school year.

    About half the elementary schools on the list of the 100 worst-performing schools would have earned a much higher ranking if judged using that sort of model, the analysis found.

    Learning the Hard Way

    McKinley, near 78th Street and Avalon Boulevard, is a striking example because it has been placed on academic probation.

    It is one of the district's poorest schools, with 96.4% of its students qualifying for free or reduced-price lunches. It has also gone through a prolonged transition from predominantly African American to majority Latino. Two out of every 10 students who begin the school year at McKinley leave before it ends.

    Ranking schools on the basis of how many points they scored above what the poverty and stability rates predicted, McKinley placed 256th out of 435 elementary schools.

    Being placed on probation has been hard on her staff, McKinley Principal Gwendolyn Washington said.

    "As hard as we worked and as much time as we spent improving the program, it is disheartening to find ourselves on the list of 30," she said. "We know we don't deserve to be here."

    While lifting McKinley out of the cellar, the analysis boosted Solano Elementary in Chavez Ravine right to the top based on its scores in the 56th percentile despite high poverty and low English proficiency.

    Given a chance to rave about his school, Principal John Stoll said he has always known that Solano is a hidden gem.

    Adopted by the Dodgers, it has a mentor in the team front office for each sixth-grade student. A Cal State Fullerton program provides college students as literacy partners, and Cal State Los Angeles contributes teacher training. Students who aren't keeping up get tutoring, thanks to an Annenberg grant.

    Room for Improvement

    At the other extreme, highly regarded Dixie Canyon Elementary in upscale Sherman Oaks scored considerably lower than other schools with similar demographics. It posted a percentile score of 52, when its predicted score was 61.

    The rare negative review didn't faze Dixie Canyon Principal Melanie Deutsch, who said she welcomed the information as a tool for self-evaluation at a school that routinely scores above average for the district.

    "There is room to improve, and we're looking at that," Deutsch said.

    As he searches for ways to lift his accountability program beyond the 100-schools program, Superintendent Zacarias said he remains skeptical about socioeconomic rankings.

    "We're looking at all that," Zacarias said. "But when it's all said and done, I don't want to use all of these factors as excuses for any school not to improve. It still boils down to the fact that in some neighborhoods, schools are improving and others aren't."

    The subject will apparently be put on the district's agenda by a task force of experts he assembled last spring to study accountability practices across the nation. Zacarias is not alone in his concerns.

    "Our belief is you should hold all kids to the same standards," said Susan Sclafani, the Houston school district's chief of staff for educational services.

    "You shouldn't say that because kids are black or poor that you won't hold them to the same standard. The world is going to hold them to the same standard."

    Houston schools are expected to reach minimum scores on the statewide assessment exams regardless of their demographics, Sclafani said.

    She asserts that it's working. Five years ago, 55 of the district's 265 schools were classified as low-performing, meaning that fewer than 20% of their students passed the test. Even though the threshold passing rate has since been raised, only three schools received low ratings this year, Sclafani said.

    Some on the cutting edge of statistical testing assessment believe they have answered their critics with a different kind of model that ignores socioeconomic indicators, and instead follows individual students from year to year to measure how much they have learned.

    "We're measuring the direct influence the school is having on them," said the University of Tennessee's William Sanders, who designed and administers the state system.

    One student who lags in a particular content area may be insignificant, but if several in the same classroom do, it could indicate a problem. Sanders said his analyses have shown great performance variation among schools and also among teachers within a school.

    "The biggest difference we've found is the effectiveness of the individual classroom teacher," Sanders said. "It's not race. It's not poverty."

    In Tennessee, school reports are made public, but classroom reports go no further than the principal, Sanders said. The teachers union has supported the use of the information by principals to focus teacher training and, sparingly, to support removal of ineffective teachers, he said.

    The national discourse over statistical test score modeling has yet to reach Sacramento.

    At one time, the Department of Education had a token demographic component in its statewide test reports--ranking each school among a group of 100 similar schools.

    But that died in 1994 when California killed its statewide testing program.

    The legislation that restored statewide testing this year made no provision for interpreting scores based on demographics, or for ranking schools at all. Without a mandate from the Legislature, the Department of Education merely listed scores for each school, leaving parents and school districts to interpret for themselves how well schools are doing.

    How the Analysis Was Done

    The Times analysis of performance at 435 Los Angeles elementary schools was based on 1998 standardized tests taken in English and Spanish.

    The analysis used a statistical procedure called linear regression to assess how poverty and student transiency affect school performance. To measure poverty, the analysis used the percentage of students eligible for federally subsidized lunches. Transiency was measured by the percentage of each school's students who remained enrolled throughout the year.

    Limited English skills also have a strong effect on student performance. The Times did not use that variable in its analysis because it overlaps significantly with poverty.

    Schools are ranked according to how much their scores were above or below the predicted score. Most schools' scores were quite close to the model's predictions, but some clearly perform better than expectations, while others are below expectations. Richard Brown, project director with the UCLA Center for the Study of Evaluation, reviewed the work, noting one qualification:

    The reliability of the model depends upon the validity of the school district's practice of compiling the English- and Spanish-language test scores from each school into a single score for reading, math and language over all grade levels.
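The two-predictor analysis described above, a linear regression of scores on poverty and stability with schools ranked by their residuals, can be sketched roughly as follows. The data here are randomly generated stand-ins, not the district's figures:

```python
# Sketch of a two-predictor regression and residual ranking (invented data).
import numpy as np

rng = np.random.default_rng(0)
n = 8  # a handful of hypothetical schools

poverty = rng.uniform(0, 100, n)     # % eligible for subsidized lunch
stability = rng.uniform(50, 100, n)  # % enrolled throughout the year
scores = 80 - 0.4 * poverty + 0.2 * (stability - 75) + rng.normal(0, 3, n)

# Design matrix with an intercept column; solve by least squares.
X = np.column_stack([np.ones(n), poverty, stability])
coef, *_ = np.linalg.lstsq(X, scores, rcond=None)

predicted = X @ coef
residuals = scores - predicted

# Rank schools by residual: largest positive residual means the school
# is furthest above what its demographics predict.
ranking = np.argsort(-residuals)
```

Because the model includes an intercept, the residuals sum to zero by construction: the ranking is purely relative, identifying schools above or below the districtwide expectation for their demographics.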