Wednesday, April 08, 2009

When experiments are too controlled...

As I've discovered more and more over the past few weeks, students around here are extremely fond of Dan Ariely, the behavioral economist, author of Predictably Irrational, and one of our more famous faculty members. And there's nothing wrong with an interest in behavioral economics: some outstanding economists dedicated to behavioral or experimental analysis have made enormous strides in our understanding of economic behavior.

But it's also important to note the weaknesses often associated with behavioral economics, and I can't think of a better example than an opinion article Dan Ariely contributed to the New York Times last November. His article, "What's the Value of a Big Bonus?", uses results from several lab experiments to suggest that bonuses may not have the incentive effects businesses and economists expect:
To look at this question, three colleagues and I conducted an experiment. We presented 87 participants with an array of tasks that demanded attention, memory, concentration and creativity... 

About a third of the subjects were told they’d be given a small bonus, another third were promised a medium-level bonus, and the last third could earn a high bonus. We did this study in India, where the cost of living is relatively low so that we could pay people amounts that were substantial to them but still within our research budget...

What would you expect the results to be? When we posed this question to a group of business students, they said they expected performance to improve with the amount of the reward. But this was not what we found. The people offered medium bonuses performed no better, or worse, than those offered low bonuses. But what was most interesting was that the group offered the biggest bonus did worse than the other two groups across all the tasks.
He goes on to describe a similar experiment conducted at MIT, along with a conclusion (and one enormous qualifier):
We found that as long as the task involved only mechanical skill, bonuses worked as would be expected: the higher the pay, the better the performance. But when we included a task that required even rudimentary cognitive skill, the outcome was the same as in the India study: the offer of a higher bonus led to poorer performance.

If our tests mimic the real world, then higher bonuses may not only cost employers more but also discourage executives from working to the best of their ability.
"If our tests mimic the real world" indeed! Although Ariely doesn't dwell on this point (which I italicize for emphasis), it's a fundamental weakness in his research. It is hard to imagine two more different situations: a lab experiment that lasts at most a few hours, and difficult financial work that lasts for months or years. Generalizing from the former to the latter isn't some small jump in reasoning: it's a massive leap of faith, with no logical or empirical basis to support it.

In fact, a little intuition suggests that Ariely's results are perfectly consistent with a world where ordinary bonuses incentivize productivity. Imagine you're in a lab, slated to participate in an experiment for the next few hours. You know that the experiment will last the same amount of time, and you'll be asked to complete the same set of tasks, no matter what you do. Even if you're at the "low" compensation level, the amount of money you'll be given is enough to justify putting in your best effort (or something close to it). After all, you're stuck in the lab anyway: why not give their stupid puzzle your best shot, when even as a "low" bonus recipient you can earn the equivalent of a low-skilled worker's daily wage?

It's likely, then, that the improvement in effort induced by larger bonuses in this experiment is pretty small. Meanwhile, there are other factors affecting your performance, like stress in a cognitively demanding challenge. Even a relatively small detrimental effect from stress may overwhelm the small improvement to effort induced by the bonus, and produce results like the ones we see in Ariely's experiment. 
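The intuition above can be captured in a toy model. Everything here is invented for illustration (the functional forms and every number are my own assumptions, not Ariely's data): effort responds to the bonus but saturates quickly, since even a small payment elicits near-maximum effort from someone already stuck in the lab, while stress grows steadily with the stakes.

```python
# Toy model of the effort/stress tradeoff. All functional forms and
# parameters are made up for illustration; nothing comes from
# Ariely's actual experiment.

def effort(bonus):
    # Effort rises with the bonus but saturates fast: approaches 1
    # as the bonus grows, so large bonuses buy little extra effort.
    return 1 - 0.5 ** bonus

def stress(bonus):
    # Stress ("choking" under high stakes) grows linearly with the
    # bonus -- a deliberately small per-unit effect.
    return 0.05 * bonus

def performance(bonus):
    # Net performance: effort gains minus stress losses.
    return effort(bonus) - stress(bonus)

# Low, medium, and high bonuses in arbitrary units.
for b in (1, 4, 16):
    print(b, round(performance(b), 3))
```

Even with a per-unit stress effect this small, the high-bonus group ends up performing worst, because the effort curve has already flattened out while the stress term keeps climbing. That is exactly the pattern Ariely observed, produced by a model in which bonuses genuinely do incentivize effort.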

This doesn't mean, of course, that we should expect to see the same pattern from all bonuses. When varying time commitments are involved, and your bonus depends on sustained dedication to work over a long period, its effect on your effort is likely to be much larger, and may swamp the negative effects from stress. I don't know that this will happen, but Ariely certainly hasn't shown that it won't, and his research does very little to advance our knowledge of incentives in the real world.

The more general philosophical issue here is the tradeoff between internal and external validity. If you're concerned about internal validity, Ariely's work is great. Small sample size notwithstanding, I have very little doubt that if I set up an identical experiment measuring the effects of bonuses on laboratory tasks in India, my results will be similar to Ariely's, and that if prodding lab subjects to perform contrived tasks ever becomes a critical policy goal, this knowledge will prove predictive and invaluable. In this limited sense, I have far more confidence in randomized economic experiments than I do in, say, the correctness of a particular regression specification.

Unfortunately, we are also concerned about external validity—whether our results extend to a more realistic setting—and here we are forced to indulge massive leaps in analysis. Near the end of his piece, Ariely argues that bankers were "too quick to discount" his results, but he never makes clear why a sane banker would base compensation policy on a few lab experiments. However unreliable financial executives' experience may be, it is much more relevant to the question at hand.

"Inflation-adjusted" school spending: not adjusted enough

I often hear conservatives claim that "inflation-adjusted" school spending has skyrocketed in the past few decades, and that the absence of any positive results from these "massive increases" demonstrates that money won't help our education system. This has a grain of truth: increases in funding don't magically translate into better outcomes, and some interventions are much more cost-effective than others. Many of the disparities we see in the system today stem from broader and more complex problems of poverty and social status, which are impossible to fix with education policy alone.

The argument itself, however, is just poorly informed nonsense. There is no single, pervasive inflation rate that is exactly the same for every type of product. Instead, different goods and services experience different price changes over time. Generally, the prices of manufactured goods, especially high-technology ones, either decrease or climb at a rate much slower than "overall" inflation. Since these goods are included in the basket used to calculate the overall rate, it is an arithmetic inevitability that the costs of other products -- particularly low-technology services like teaching -- increase at a rate higher than overall inflation. Even if we "adjust" for inflation using a standard measure like the CPI, we will overestimate the gains in school funding, and are likely to see "massive increases" even when the actual inputs are the same.
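The arithmetic is worth making explicit. The weights and inflation rates below are hypothetical round numbers chosen purely for illustration, not actual CPI figures:

```python
# Hypothetical CPI basket: half goods, half services. All numbers
# are invented for illustration, not real BLS data.
goods_inflation    = 0.01   # manufactured goods: prices nearly flat
services_inflation = 0.05   # low-tech services like teaching
w_goods, w_services = 0.5, 0.5

# Overall inflation is a weighted average, so it must sit between
# the two category rates.
overall = w_goods * goods_inflation + w_services * services_inflation
print(overall)  # 0.03

# A school whose spending merely keeps pace with teacher costs sees
# nominal spending rise 5%. Deflating by the 3% CPI makes that look
# like real growth, even though actual inputs are unchanged.
apparent_real_growth = (1 + services_inflation) / (1 + overall) - 1
print(round(apparent_real_growth, 4))  # roughly 0.0194, i.e. ~1.9%
```

In this made-up example, a school system buying exactly the same inputs year after year shows a "real," CPI-adjusted spending increase of almost 2% annually -- which compounds into a "massive increase" over a few decades.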

To obtain a real picture, we'd have to adjust for a "teacher price index": the salaries necessary to recruit and retain qualified teachers over time. But when your goal is to uncritically snap up whatever compelling factoid crosses your desk, rather than to engage in an honest discussion of American education, I guess that isn't really necessary.