Saturday, August 07, 2010

Why randomization is so important

Kevin Carey calls for randomized trials in education:
The few randomized control trials that exist continue to be enormously influential–I recently heard Nobel prize-winning economist James Heckman give a fascinating presentation on his recent analysis of results from the Perry Preschool project, which involved 128 students (64 treatment, 64 control) from Ypsilanti who were randomly assigned to preschool 48 years ago. What if future Heckmans had a thousand times as many data sets from which to choose? Given that teacher assignment often unfairly reflects parental pressure, periodic random assignment could be a net increase in fairness for students.
Randomization can't answer every important question. Economists trying to understand the impact of a large-scale merger in the airline industry, for instance, can't randomly allow and disallow the merger 500 times each to see what happens. We only have one economy. But when it does apply, randomization is an incomparably useful empirical technique. Nothing else really comes close.

Suppose you want to answer a basic question in education policy, like the effect of class size on student performance. First you look at a simple correlation—how do students in small classes score compared with students in large classes? The problems with this strategy, however, quickly become apparent. What if students with academic disabilities tend to be assigned to smaller classes, dragging down the scores in that category? Or, alternatively, what if students with the most motivated parents (who will tend to be more successful anyway) squeeze into the smallest classes?
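
To make the first of those problems concrete, here is a minimal simulation sketch. Every number in it is invented for illustration (the share of students with disabilities, the 20-point score penalty, the assignment probabilities, the +5 point benefit of a small class), and the function name `naive_class_size_gap` is hypothetical. It only shows that a simple comparison of averages can get the sign wrong when placement into small classes is not random.

```python
import random
from statistics import mean


def naive_class_size_gap(n=20000, true_effect=5.0, seed=0):
    """Return mean(small-class scores) - mean(large-class scores).

    All parameters are assumptions made up for illustration.
    """
    rng = random.Random(seed)
    small, large = [], []
    for _ in range(n):
        # Assumption: 20% of students have an academic disability.
        disability = rng.random() < 0.2
        # Assumption: those students are far more likely to be placed in a small class.
        p_small = 0.7 if disability else 0.2
        in_small = rng.random() < p_small
        score = 70.0 + rng.gauss(0, 10)
        if disability:
            score -= 20.0          # assumed score penalty, unrelated to class size
        if in_small:
            score += true_effect   # the causal effect we would like to recover
        (small if in_small else large).append(score)
    return mean(small) - mean(large)


if __name__ == "__main__":
    gap = naive_class_size_gap()
    print(f"true effect: +5.00, naive small-minus-large gap: {gap:+.2f}")
```

Under these assumptions small classes truly help, yet the raw gap comes out negative, because the small classes hold a disproportionate share of students who would score lower in any setting.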

More generally, do you look at class size variation within schools, across schools, or both? If it's the former, you must contend with the fact that within-school variation in class size is quite small, and whatever differences in achievement are actually caused by the variation in class size will surely be swamped by the effects of whatever rule (say, the "whiny parent" rule) is used to assign students to those classes.

If it's the latter, you're faced with the enormous empirical challenge of separating the actual effect of average class size from all the confounding factors associated with it: type of school, level of resources, student population, general academic philosophy, and so on. If you're extremely, extremely smart and lucky, you might find that a tradition dating back to a 12th-century rabbinic scholar induces an arbitrary discontinuity in class size with respect to school size, which can be used to identify the effect of class size on student performance. More likely, however, you'll be forced to resort to running an ugly regression where you attempt to "control" for other factors to isolate the impact of class size. (This is the methodology behind the famously erratic epidemiological "studies" that generate breathless news stories about the connection between cauliflower and lung cancer.) The problem, of course, is that you can't control for everything; correlation with unobserved variables will inevitably distort your results. Meanwhile, in adding controls to your regression model you make strong assumptions about the functional form. If you assume linearity when it isn't appropriate, your results may be completely wrong, and it's very difficult to know whether this is happening.
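
The "you can't control for everything" problem can be sketched in a few lines. This is an illustrative simulation, not anyone's actual study: the variables (`resources` observed, `motivation` unobserved), all coefficients, and the helper names are assumptions. The true effect of one extra student per class is set to -0.5 points, and the data really are linear, so the only thing wrong with the regression is the variable it cannot see. Controlling for `resources` is done by partialling it out of both class size and score (the Frisch-Waugh-Lovell approach), which gives the same class-size coefficient as a regression of score on class size and resources.

```python
import random
from statistics import mean


def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(y, x):
    """Residuals from regressing y on x (with an intercept)."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def class_size_regressions(n=20000, true_effect=-0.5, seed=1):
    """Return (slope with no controls, slope controlling for resources)."""
    rng = random.Random(seed)
    resources, size, score = [], [], []
    for _ in range(n):
        r = rng.gauss(0, 1)   # observed: school resources
        m = rng.gauss(0, 1)   # UNOBSERVED: parental motivation
        # Assumption: richer schools and pushier parents both mean smaller classes.
        s = 25 - 2 * r - 2 * m + rng.gauss(0, 2)
        # Assumption: both also raise scores directly, apart from class size.
        y = 70 + true_effect * s + 3 * r + 4 * m + rng.gauss(0, 5)
        resources.append(r)
        size.append(s)
        score.append(y)
    no_controls = slope(size, score)
    with_control = slope(residuals(size, resources), residuals(score, resources))
    return no_controls, with_control


if __name__ == "__main__":
    no_controls, with_control = class_size_regressions()
    print(f"true effect per extra student: -0.50")
    print(f"no controls:                   {no_controls:+.2f}")
    print(f"controlling for resources:     {with_control:+.2f}")
```

In this setup the control moves the estimate only slightly toward the truth; the omitted motivation variable still inflates the apparent effect to roughly three times its real size, and nothing in the regression output would flag it.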

With rare exceptions, non-randomized evidence on questions like class size sucks.

Randomization, meanwhile, offers extraordinary credibility. Worried that class size will correlate with other important determinants of student performance? Worry not—randomization makes systematic correlation impossible, and to the extent that it happens by chance in finite samples, it's governed by the well-trod principles of basic statistics. Even if cagey parents convince school officials to switch their children to another class, as long as you retain data from the original randomization you can use it as an "instrument" for class assignment and recover a meaningful estimate.
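
Here is a rough sketch of that last point, again with invented numbers and hypothetical names. Students are randomly assigned to a small class (`z`), but an assumed 30% have motivated parents who get them into a small class regardless, so the class actually attended (`d`) is no longer random. Comparing students by the class they attended is biased; dividing the effect of the original assignment on scores by its effect on attendance (the Wald, or simple instrumental-variable, estimator) is not. The simulation assumes the same +5 effect for every student; with varying effects this ratio recovers an average for the students whose class was actually determined by the lottery.

```python
import random
from statistics import mean


def randomized_with_switching(n=40000, true_effect=5.0, seed=2):
    """Return (naive as-attended gap, Wald/IV estimate). Parameters are illustrative."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        motivated = rng.random() < 0.3   # unobserved by the researcher
        z = rng.random() < 0.5           # original random assignment to a small class
        d = True if motivated else z     # motivated parents always obtain a small class
        y = 70.0 + rng.gauss(0, 10)
        if motivated:
            y += 10.0                    # assumed advantage unrelated to class size
        if d:
            y += true_effect
        rows.append((z, d, y))

    # Naive: compare by the class actually attended.
    naive = (mean(y for z, d, y in rows if d)
             - mean(y for z, d, y in rows if not d))

    # Intention-to-treat: compare by the ORIGINAL random assignment.
    itt = (mean(y for z, d, y in rows if z)
           - mean(y for z, d, y in rows if not z))
    # First stage: how much assignment changed the share attending a small class.
    first_stage = (mean(1.0 if d else 0.0 for z, d, y in rows if z)
                   - mean(1.0 if d else 0.0 for z, d, y in rows if not z))
    return naive, itt / first_stage


if __name__ == "__main__":
    naive, iv = randomized_with_switching()
    print(f"true effect: +5.00, as-attended gap: {naive:+.2f}, IV estimate: {iv:+.2f}")
```

The design choice worth noticing is that the estimator never uses who the motivated parents are. It needs only the original assignment, the class attended, and the score, which is why retaining the randomization data matters.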

I've written before about the dangers of extrapolating from randomized experiments that don't resemble reality. But education is one of the fortunate cases where randomization is a perfectly appropriate way to gather evidence—it's best suited for understanding the effects of a "treatment", and public education is really just a long and expensive treatment. The fact that a nation of over 300 million people has no better evidence about the benefits of preschool for at-risk children than an experiment with 128 subjects, 48 years ago, is outrageous.

1 comment:

Anonymous said...

I agree that "treatment" randomization is important for the progression of efficacy analysis in policy decisions. In fact, I will flip a coin the next time my friend needs to go to the ER for a chronic seizure issue. Heads we'll take her to the closest inpatient ER, tails we'll take her to the nearest 24/7 private urgent care center. We will continue to flip each subsequent time it occurs (it is an almost monthly affair, with thus far a non-positive track record of improvement, just a requisite need to stabilize her in a controlled setting). Hopefully the seizures shall remain a martingale and the coin will guide us to the lowest-cost setting that guarantees safety. If nothing else, we can blame the fair coin for failure and credit the "decision" of the fair coin for success. Maybe we'll generate a data set that will help her employer-based insurance home in on an optimal reimbursement efficacy setting.