The Test Generation: An Education Reform Experiment

On exam day in Sabina Trombetta's Colorado Springs first-grade art class, the 6-year-olds were shown a slide of Picasso's "Weeping Woman," a 1937 cubist portrait of the artist's lover, Dora Maar, with tears streaming down her face. It is painted in vibrant — almost neon — greens, bluish purples, and yellows. Explaining the painting, Picasso once said, "Women are suffering machines."

The test asked the first-graders to look at "Weeping Woman" and "write three colors Picasso used to show feeling or emotion." (Acceptable answers: blue, green, purple, and yellow.) Another question asked, "In each box below, draw three different shapes that Picasso used to show feeling or emotion." (Acceptable drawings: triangles, ovals, and rectangles.) A separate section of the exam asked students to write a full paragraph about a Matisse painting.

Trombetta, 38, a 10-year teaching veteran and winner of distinguished teaching awards from both her school district, Harrison District 2, and Pikes Peak County, would rather have been handing out glue sticks and finger paints. The kids would have preferred that, too. But the test wasn't really about them. It was about their teacher.

Trombetta and her students, 87 percent of whom come from poor families, are part of one of the most aggressive education-reform experiments in the country: a soon-to-be state-mandated attempt to evaluate all teachers — even those in art, music, and physical education — according to how much they "grow" student achievement. In order to assess Trombetta, the district will require her Chamberlin Elementary School first-graders to sit for seven pencil-and-paper tests in art this school year. To prepare them for those exams, Trombetta lectures her students on art elements such as color, line, and shape — bullet points on Colorado's new fine-art curriculum standards.

All of this left Trombetta pretty frustrated, and on a November afternoon, she really wanted to talk. As she ate lunch (a frozen TV dinner) in her cheery, deserted classroom plastered with bright posters, she recounted the events of the past week. She liked the idea of exposing her young students, many of whom had never visited a museum, to great works of art. But, Trombetta complained, preparing the children for the exam meant teaching them reductive half-truths about art — that dark colors signify sadness and bright colors happiness, for example. "To bombard these kids with words and concepts instead of the experience of art? I really struggle with that," she said. "It's kind of hard when they come to me and say, 'What are we going to make today?' and I have to say, 'Well, we're going to write about art.'"

Harrison District 2 spent about six months creating a test that turned out to be far too difficult for most first-graders, who are just learning to read full paragraphs, let alone write them. Yet the children's art-exam scores, along with results from classroom observations, will determine Trombetta's professional evaluation score and, consequently, her salary. If she "grows" her students' test scores over the course of the year, she could earn up to $90,000, nearly double the average for a Colorado teacher. But if her students score poorly two years in a row, her salary could drop by as much as $20,000, and she could eventually lose tenure.

Like many Harrison teachers, Trombetta isn't sure whether she wants to continue working in the district, despite the possibility of significant salary gains. She loves her school and its principal, and she supports evaluating teachers based, at least in part, on their ability to advance student achievement. But she's torn on the value of test prep: "I want to maintain a sense of integrity and be faithful to my values about art."

In the social sciences, there is an oft-repeated aphorism called Campbell's Law, named after Donald Campbell, the psychologist who pioneered the study of human creativity: "The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor." In short, incentives corrupt. Daniel Koretz, the Harvard education professor recognized as the country's leading expert on academic testing, writes in his book Measuring Up that Campbell's Law is especially applicable to education; there is a preponderance of evidence showing that high-stakes tests lead to a narrowed curriculum, score inflation, and even outright cheating among those tasked with scoring exams.

A number of state- and city-level studies from the No Child Left Behind era found that swiftly rising scores on high-stakes state tests were accompanied by appalling stagnation in students' actual knowledge as measured by the National Assessment of Educational Progress, the gold-standard exam administered to a sample set of students each year by the federal Department of Education. In 2005, for example, Alabama reported that 83 percent of its fourth-graders were proficient in reading, even though the NAEP found that only 22 percent of these children were proficient readers. The harsh punishments associated with NCLB had encouraged Alabama and most other states to dumb down their tests and then teach directly to them.

"The kind of motivation that results from pressure can get you certain kinds of test scores, but what happens is that the motivation and the learning don't persist over time," says Edward Deci, a social psychologist and expert on motivation. Deci has studied the effects of testing on teaching and learning since the early 1970s, and he is a firm opponent of tying teacher evaluation and pay to student test scores. "The kind of learning associated with pressure is rote learning, rather than conceptual learning," he says.

Despite these warnings from social science and the patent absurdity of first-graders writing critiques of Matisse, Harrison's culture of high-pressure testing could represent the wave of the future in Colorado, and across the country. In May, the Colorado Legislature passed Senate Bill 191, or "The Great Teachers and Leaders Bill." Taking cues from the Obama administration's education-reform agenda, a narrow bipartisan majority voted to overhaul the way Colorado's teachers are evaluated and granted tenure. Beginning in 2013, 51 percent of every teacher's annual professional evaluation score must be based on student-achievement data. Once the law goes into full effect, any teacher in the state can lose tenure if he earns unsatisfactory performance evaluations two years in a row. If he then fails remediation efforts and loses his job, he won't be guaranteed a new one; after one year without a classroom assignment, he will be let go.

An expert panel appointed by Colorado's previous Democratic governor, Bill Ritter, is working to develop tools for assessing student growth and incorporating the data into teacher-evaluation scores. The panel could push for mandatory testing across every grade and subject area, as Harrison does. (The district's students even sit for pencil-and-paper tests in gym class.) It could also ask the state to measure student growth using a combination of test scores, portfolios of student work, and in-class presentations, as union advocates and the Obama administration would prefer. The federal Department of Education is "developing guidance for states, so they appreciate it doesn't have to be a paper-and-pencil test," says Carmel Martin, the assistant secretary for planning, evaluation, and policy development. "In things like music and physical education, there are other ways."

The downside of more holistic evaluation systems is their subjectivity compared to test results, especially when livelihoods and reputations are at stake. Creating and implementing more complex evaluation tools would also be more expensive than simply writing and grading additional tests.

As New York, Louisiana, and other states revamp their own teacher-evaluation systems to incorporate student-achievement data, they are paying attention to how Colorado implements SB 191. New York City Mayor Michael Bloomberg visited the state in October and praised the legislation in several speeches. Meanwhile, state Sen. Mike Johnston, the former principal and Teach for America teacher who is the driving force behind the bill, travels the country promoting his efforts to lawmakers and education philanthropists. In December, he was enthusiastically received by the New Jersey Legislature, which is considering a similar reform agenda; in February he spoke on the keynote panel at the Washington, D.C., conference celebrating Teach for America's 20th anniversary.

    • Dana Goldstein
    • Dana Goldstein is a Brooklyn-based journalist covering education, public health, economic mobility, and women’s issues. She is a contributing writer to The Nation and The Daily Beast, and in 2010 was a recipient of the Spencer Fellowship in Education Journalism, a competitive award supporting the long-form work of mid-career e...