Short version
Scientists have done a spectacularly poor job explaining to taxpayers what we do, in many ways. One way, which is perhaps not entirely our fault, is that we have done a poor job explaining just how cheap our research is. Here I describe a project my lab conducted, which suggests (a) that people vary dramatically in how they map the cost of objectively small budget items onto a number line, even when given numerical information about those costs, and (b) that support for these budget items is elastic in terms of psychological relative cost: people who accurately map the true cost of the programs onto number lines view them more favorably than those who don't.
Budgeting Science
It’s that time of year again. House Republicans have noticed that the National Science Foundation still exists, and have once again demanded that science research—and social science in particular—be cut substantially. It’s actually not as bad this time around as in some past years: social science is facing a 42% proposed cut; in past years, the starting proposal has been even higher. The proposal also puts heavy restrictions on climate change research.
And, once again, it's time to face the fact that we scientists have done a spectacularly bad job explaining what we do, and why it is worth public investment. Part of our failing is perhaps that we scientists feel entitled to do our work; part is that, objectively, science is an amazingly good investment, and social science has arguably contributed to growth in GDP, as well as to better outcomes for veterans, better evacuation planning in the face of natural disasters, and improved educational practice.
Nevertheless, support for public research is relatively low, and funding for the public universities that are the major site of this research is under pressure. One problem is the widespread misconception that professors spend the majority of their effort teaching in classrooms. Teaching is, of course, an important part of our job, but classroom-related teaching is about 20-30% of most faculty members' effort. The bulk of our time is spent doing research—research that creates much of the new knowledge we go on to teach in our courses. As a result, students end up paying for research that benefits the entire tax base, and taxpayers don't realize how this value is achieved.
But over the last few years my lab has been researching another likely cause of opposition to the NSF and other research budgets*. Budgets for NIH, NSF, IES, DARPA, and other large, famous federal research funders are typically expressed as bare numbers. For instance, the NSF budget is about $7 billion annually. And people don't know how much that is. Worse, they work with those numbers incorrectly, and when they do, they tend to make predictably bad judgments that likely mismatch their real desires.
Perception of Cost
I'll give a super-fast overview of our methods here; there are many more details in the published papers. If you want even less detail, here's the one-sentence version: about 40% of people are biased on number lines such that they systematically and hugely overestimate the value of smaller 'big' numbers relative to much larger ones, when those numbers cross between millions, billions, and trillions.
The major—but not the only—way we have examined large number use is the number-to-position task. Here, we ask people to place a number on a number line. For instance, we might ask people to put 280 million on a line from 1 thousand (or 0) to 1 billion. There is quite a bit of complex structure in how people respond to this task, and I won't explain it all in detail (but see our papers). The short version is that people divide the line up into 'chunks' based on the scale words used—for instance, a line from 1 thousand to 1 billion would be divided into a 'thousands' chunk and a 'millions' chunk**, like this:
[Figure: a number line from 1 thousand to 1 billion, divided into a 'thousands' chunk and a 'millions' chunk, with the 'million' boundary drawn far too high]
The thing is, the way I just drew this, it's very wrong. You see, there are 1,000 millions in a billion (that's what a billion is, right? 1,000 million, at least here in the US). But about 40% of our subjects do something quite like this, placing "million" somewhere between 20% and 50% of the way across the line. The rest also seem to divide the line up, but they do it more or less at the right place, which is about here:
[Figure: the same number line with the 'million' boundary in its correct position, almost at the left end]
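To make the mismatch concrete, here is a minimal Python sketch (illustrative only, not our experimental code) of where values actually fall under a correct linear mapping:

```python
def linear_position(x, lo, hi):
    """Fraction of the way along a line from lo to hi where x belongs."""
    return (x - lo) / (hi - lo)

THOUSAND, MILLION, BILLION = 1e3, 1e6, 1e9

# Where the 'million' boundary belongs on a thousand-to-billion line:
print(f"{linear_position(MILLION, THOUSAND, BILLION):.4%}")        # ~0.0999%

# Where 280 million belongs on the same line:
print(f"{linear_position(280 * MILLION, THOUSAND, BILLION):.1%}")  # ~28.0%
```

Under a linear mapping, the 'million' boundary belongs about 0.1% of the way along a thousand-to-billion line, essentially at the left edge—nowhere near the 20-50% placements just described.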
So a large fraction of people not only get big numbers wrong, but get them systematically, grossly wrong. Does that matter? It might: if these behaviors reflect something that happens when we compare costs. Let's look at how this might work in an example: the $11 million that was budgeted at one point in 2013 for political science research, out of the $7 billion total the NSF was getting that year.
If you're one of the more accurate, linear responders, $11 million looks (on the line!) like not that much money. But if you're one of the non-linear people, then $11 million looks like a lot. To verify this, I collected data from 50 Mechanical Turk workers. First I asked them to place 11 million on a line from 0 to 7 billion (the NSF was not mentioned). Then I gave them 8 other number line judgments on our standard "thousand to billion" line. I used those 8 judgments to bin people into the two groups, which I'll start calling linear (the people who get it right) and categorical. The difference is large and right in line with our model predictions: categorical responders place $11 million about 20% of the way along the $7 billion line, while linear responders place it closer to 1%.
[Figure: mean placement of $11 million on a line from 0 to $7 billion, for categorical vs. linear responders]
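For concreteness, here is one hypothetical way such a binning rule could look in Python. The classification model in our papers is more involved; the tolerance, items, and responses below are made up for illustration:

```python
def linear_position(x, lo, hi):
    return (x - lo) / (hi - lo)

def classify(placements, items, lo=1e3, hi=1e9, tol=0.10):
    """placements: fractions (0-1) a subject marked for each item.
    Call a subject 'linear' if their mean absolute error from the
    true linear position stays within tol; otherwise 'categorical'."""
    errors = [abs(p - linear_position(x, lo, hi))
              for p, x in zip(placements, items)]
    return "linear" if sum(errors) / len(errors) <= tol else "categorical"

items = [280e6, 3e6, 750e6, 45e6]       # numbers shown to the subject
subject = [0.45, 0.20, 0.80, 0.30]      # a categorical-looking response
print(classify(subject, items))         # -> 'categorical'
```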
Does this affect actual views on funding federal programs?
We don't know yet whether number line judgments actually causally impact people's political views. But we have some evidence that they might at least correlate with them***. Last summer Brian Guay conducted a study in my lab through Time-Sharing Experiments in Social Science (TESS). TESS conducts nationally representative online surveys, using standard polling methods, on topics important to social scientists, and that's just what they did for us. The survey sampled about 2,100 adults.
Here's what we did: first, we gave each person 4 number line judgments, and used those to divide them into the two groups. Then we asked people to make 4 judgments about the federal budget****. In each, we gave people a total budget for an agency, and the amount allocated to some particular program within that budget. These were actual spending figures that had been recently reported in the media. Then we asked whether the agency should spend "a lot less", "a little less", "about the same", "a little more", or "a lot more" on that particular program.
The four items were: spending on climate change research by the NSF ($133.53 million of a $5 billion NSF research budget); spending on weapons systems by the federal government ($114.9 billion of a $3.45 trillion federal budget); spending on unmanned drones by the U.S. Customs & Border Protection agency ($88.6 million of a $10.35 billion CBP budget); and US federal government foreign aid ($52 billion of a $3.45 trillion federal budget, and a fairly notorious budget item).
The results
Obviously the details depend on exactly how you measure things. We had decided to add together***** the numerically coded ratings to get a 'total support measure', because that seemed simple, and also to analyze separate effects for each question, because that seemed interesting. We included only people who answered all the questions. The graphs present something slightly easier on the eyes, but tell basically the same story. What they indicate is that, overall, responding linearly on the number line task was associated with a shift in support for maintaining or increasing funding for these government programs, i.e., in the proportion who gave a response of at least "about the same". The total raw shift in support was about 4 percentage points, from 59% supporting these programs on average among linear responders to 55% among categorical responders (standard error around 0.9%).
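As a rough illustration of the two measures just described, here is a Python sketch with hypothetical respondents. It assumes responses are coded 1 ('a lot less') through 5 ('a lot more') and, per footnote *****, treats the Likert scale as metric:

```python
# Hypothetical respondents: four budget ratings each, coded 1-5.
responses = {
    "r1": [4, 3, 3, 2],
    "r2": [2, 3, 1, 2],
    "r3": [5, 4, 3, 3],
}

def total_support(ratings):
    # The crude 'total support measure': just sum the coded ratings.
    return sum(ratings)

def supports_programs(ratings):
    # "Support" for an item = a response of at least 'about the same' (>= 3).
    return [r >= 3 for r in ratings]

for rid, ratings in responses.items():
    share = sum(supports_programs(ratings)) / len(ratings)
    print(rid, total_support(ratings), f"{share:.0%}")
```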
Of course, some of that is explained by correlations between the groups: accurate number line responding was moderately correlated with income, education, and gender. However, even when these were included as covariates in a multiple regression, linearity continued to carry unique variance; perhaps more importantly, a preliminary SEM analysis suggests that linearity is affected by overall education level, but also mediates education's effect on these judgments. There are lots of ways that education probably influences support for cheap government programs, of course; however, our Mechanical Turk studies suggest a possible causal intervention—training people on the number line affected the number line judgments they made immediately afterward.
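For readers who want to see the shape of the covariate-adjusted analysis, here is a hedged sketch using statsmodels. The data frame, column names, and values are hypothetical stand-ins, not our actual survey data:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Toy data: one row per respondent (hypothetical).
df = pd.DataFrame({
    "support":   [12, 9, 14, 8, 11, 13],    # summed Likert ratings
    "linear":    [1, 0, 1, 0, 0, 1],        # 1 = linear responder
    "income":    [55, 40, 80, 35, 60, 70],  # $K, made up
    "education": [16, 12, 18, 12, 14, 16],  # years, made up
    "female":    [0, 1, 1, 0, 1, 0],
})

# OLS with the demographic covariates included alongside linearity.
model = smf.ols("support ~ linear + income + education + female", data=df).fit()
print(model.params)  # the 'linear' coefficient is its unique association
```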
Nor does political affiliation easily explain the results: more linear people take a more liberal position in supporting increased NSF spending on climate research, but a more conservative one in approving more spending on drones to secure the US border.
If you want the full details, here's the same graph as above, broken down by question. You can see that support for climate change research and spending on drones are much more sensitive to these phenomena than foreign aid and weapons—is that a real difference? I don't know. It would be interesting to see how elastic people's support is across the different programs, but until these patterns are better replicated****** we won't really know for sure.
[Figure: support for each of the four budget items, broken down by responder group]
The Moral
The main moral is this: giving people context to help them understand the significance of large numbers may lead a fairly large proportion of them to misinterpret the relative values involved, in predictable ways. Practically, this matters, because contextualizing information is often used by the media to frame values, and it often crosses scales in just this way. It's important for people to realize that, even when such framing doesn't shift their position all that much, it may have a larger impact on how they interpret these statements. Saying that the NSF spends $11 million of its $7 billion budget on political science******* may sound like either of two very different things, depending on how the reader interprets the numbers.
This failure to correctly deal with large numbers impacts our support for cheap programs, but as my friend John Opfer points out, it also plausibly impairs our ability to cut deficits appropriately. Politicians often propose budget cuts which are objectively tiny—but which are probably accepted as moderate progress by a fairly large proportion of the population. Again, we think it's important to express numerical information in context in a way that avoids these typical misinterpretations.
The second moral is more fraught, and relates to the question of how we should present large numbers. We don't know—we don't yet have the right data to determine which methods of presentation will be most effective. Here are some guesses, though, some of which are informed by data:

1) Present all your numbers using the same base. That is, don't say "the proposal cuts $300 million in climate research, from $1.4 billion to $1.1 billion." Do say "the proposal cuts $300 million in climate research, from $1,400 million to $1,100 million" (as in xkcd); see the sketch below.

2) Present linear visualizations of your quantities.

3) Remind people of how the number system works, every time.

4) Give percentages where meaningful and possible.

Fanny Chevalier has collected a large number of scale representations, and done some interesting analysis of the kinds of scales people use. You might find her analyses helpful too.
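As a small illustration of guideline (1), here is a sketch of a helper that renders related figures in a shared base, so readers never have to cross scale words. The function name is mine, and the figures are just the example from the text:

```python
def in_millions(dollars):
    """Format a dollar amount in millions, the shared base."""
    return f"${dollars / 1e6:,.0f} million"

cut, before = 300e6, 1.4e9
after = before - cut
print(f"The proposal cuts {in_millions(cut)} in climate research, "
      f"from {in_millions(before)} to {in_millions(after)}.")
# -> The proposal cuts $300 million in climate research,
#    from $1,400 million to $1,100 million.
```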
* This research was partially (and indirectly) funded by the NSF: we weren't NSF-funded ourselves (and I never have been), but TESS, the group that funded and conducted our survey, is funded by the NSF.
** Before you ask, it doesn’t seem to matter much whether numbers are printed as numerals (e.g., 280,000,000) or hybrid number words (e.g., 280 million).
*** Full disclosure: this data has not yet been published in a peer-reviewed journal, or even presented at a peer-reviewed conference. (For that, we’ll do structural equation modeling, so the analysis won’t even be the same). You heard it here first. Lots of things you hear on the internet turn out to be wrong. Caveat emptor.
**** We were unable to counterbalance the order of the number line and political judgments in this experiment, though the internal scales were presented in random order. There is, of course, some possibility that mere exposure to the number lines changed people’s views. No study is final. As I said in ***, caveat emptor.
***** This treats the Likert scale as a fully metric scale, which is inappropriate. Better techniques exist, and our results generalize to most of them, but they are harder to describe, so here I'm sticking with the simple version.
****** In the lab, as part of piloting these materials, we replicated the effect of linearity on the NSF item three times with Mechanical Turk populations—that's three out of three attempts. In each, we also included foreign aid spending; as I recall, there was a significant effect in two, but not in the third. These were NOT preregistered trials, and they mix studies intended as exploratory with those intended as confirmatory. As I said, more care is needed.
******* Just to fully connect the dots here: this research was itself funded partially by NSF funding to TESS, a social sciences project! I wouldn't call that a conflict of interest, necessarily (I am not on the TESS grant, nor have I received any federal dollars through the NSF for any project—though heaven knows I've tried!), and I'm not claiming that these data by themselves demonstrate the intrinsic value of public funding for research. However, if you are inclined to see self-interest in this research line, I can only state that it wasn't my conscious motivation, and that I want to be clear and up-front with my readers about the concerns they might have. Nobody is free from implicit bias, and I want you to be able to scour my behavior for it.