
When is a test "biased"?

Data Colada recently posted a comment (written by Uri Simonsohn) on a supposed ‘bias’ lurking in a common default Bayesian alternative to the t-test. Point #1 was that the Bayesian t-test is ‘biased’ against low power results. Several smart people have made sophisticated and pretty sensible critiques of Simonsohn’s arguments, which I won’t rehash here. Instead, I want to point out the obvious problem: the claim of ‘bias’ is based on choosing the sample size in a way that is, well, biased.

Here’s the central graph under contention.


Now, there are several things to notice. First, not only does the probability that the results support the null increase as the effect size goes down, the probability of the results supporting the alternative goes down as well. This is barely noticeable in this graph, but that’s just a function of the particular power level the author chose. With a power of 0.85 (which is a better estimate of what a person ought to be doing in their experiment anyway), the size of these changes is reversed: the probability of supporting the alternative decreases 5 points, from .73 to .68, while the probability of supporting the null increases just 4 points, from .005 to .042. (I got these numbers using Simonsohn’s publicly available R code.) So it seems that only really low power tests are ‘biased’ against small effect sizes?

But this brings up the deeper question: why make this comparison at all? In particular, why use the notion of ‘power’ to set your sample size? ‘Power’ is supposed to control for type II errors in a t-test. Why should it have any particular privileged status in setting the size of an experiment? The answer is that it shouldn’t. These two concepts have been juxtaposed here without any clear justification. One could just as easily choose sample sizes to keep a constant probability that the default test favors the null. Then one would find that the t-test is ‘biased’ against large effect sizes, in that the power would be higher for small effects.
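
To make concrete what "using power to set your sample size" means, here is a minimal sketch in Python. It is not Simonsohn's R code; it uses the standard normal approximation for a two-sided, two-sample t-test, and the function name `n_per_group` and the default alpha of 0.05 are my own assumptions for illustration. The point is only that holding power fixed forces the sample size to change with the assumed effect size, so any other quantity computed from that sample (such as how often a Bayes factor favors the null) will drift along with it.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, power=0.85, alpha=0.05):
    """Approximate per-group sample size for a two-sided, two-sample t-test.

    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is the standardized effect size (Cohen's d). This is an
    illustrative approximation, not an exact t-distribution calculation.
    """
    z = NormalDist()  # standard normal
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Holding power constant, smaller effects demand much larger samples.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} participants per group")
```

A different rule for picking n (for example, holding constant the probability that the default Bayesian test favors the null) would trace out a different n for each effect size, and power would then vary instead.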

I might be missing something, but I don’t see what. You have different ways of using models to estimate the parameters for an experiment to control the probabilities of certain kinds of errors.  Parameters that control for one type of error won’t necessarily control for another, but it’s rather extreme to call this a ‘bias’.





I’ve been reading Matt Crawford’s fascinating op-ed in the New York Times, and thought I’d write a quick response. I should mention that I know Matt, and think of him as a friend; I’ve collaborated on projects with, and am a former colleague of, Beth Crawford, who is married to Matt. With that stipulation, here’s my initial response to the article. What I love here is the economic and adversarial analysis of advertising: the idea that a war is being waged for our attention, and that as individuals we need to fight back. This comes out in language like “a straightforward conflict between me and L’Oréal”.

It’s fascinating to think of attention in terms of combat: a war waged across our public spaces. However, there is an alternative viewpoint Matt also raises, which I ultimately find more compelling. This is that our attention is used transactionally, as fungible payment for goods and services. On this view, the war between “us” and “the companies” is no different than it has been, just fought over different resources. The obvious example motivating this perspective is the way we pay for phone apps. Many, many apps now have a ‘paid’ version and a ‘free’ version. Though the language is misleading, most of us seem to understand that we are paying either way: we just pay with our attention and vigilance, or we pay with cash up front. Depending on our needs, beliefs, and interests, we choose how to pay for our goods. In this, we are clear winners: we not only get the choice to pay for a service, we also get a choice over what currency we use. In many cases, I prefer paying in attention. It might seem that there is a difference between this example and the airport, but as Crawford points out, there really isn’t: at the airport as in the app store, we can pay for silence.

The risks of ubiquitous advertising have been actively discussed at least since billboards were a great new idea. The idea that attention is a resource is already actively in play among parties on both sides of this culture war. Has anything really changed lately, or has Crawford simply noticed what’s been going on for fifty years? I’m looking forward to reading the book to see (I hope) a more careful analysis of the relevant components of payment, of attention, and of social structure that underlie the emotions of frustration, stress, and anger which are foremost in the Times piece. It seems to me that a great contribution would be thinking through the economic character of this transaction, where we sell our attention for goods and services. Do we get something worth what we pay? Can we trust people to use good judgment in making these sorts of transactions? What kinds of laws or other institutional safeguards are important for understanding the interests of companies and

Instead of answering these sorts of questions, a large portion of the op-ed is occupied with tallying up supposed losses to society caused by this battle for our attentional space. I found this tallying less persuasive. For instance, among the losses Crawford claims is the sealing off of public spaces: the massive reduction in the capacity for spontaneous conversations and interactions in airports, train stations, shopping malls, and other places the public congregates. For my part, while I believe that such a sealing off has indeed occurred, I am skeptical of the link to advertising or other claims on our attention. I find that when a person sits next to me in an airport or on a plane, unless I’m in an unusual mood I quickly do the very same thing that they do: plug in earphones. I often don’t even turn anything on; the earphones simply indicate my personal preference for privacy over the intrusion of a ‘spontaneous interaction’. Sometimes people-watching is fun; sometimes conversations are rewarding (I am currently reading a book that was recommended to me by a stranger on an airplane); but mostly, we as a society seem to prefer to avoid these chance encounters and awkward interactions.

Looking around right now, in the Washington Dulles airport, I am in a nearly silent place: no ads, no televisions, no one talking. I didn’t have to pay for this space; I just found it. I travel about 50 times a year, leading to around 200 flights. I have found that nearly every airport I travel in has free silent spaces, where the people who prefer quiet tend to congregate, but which are generally uncrowded. One has to seek these spaces out, but they are always there, unmarked except by the absence of frantic people and televisions. Often they move with the traffic, but the same few places regularly become eddies in the flow of people, and so if one knows where to look one can quickly get to privacy. The simple fact that such silence exists raises a small point and a bigger point. The small point is that one can pay for experiences one wants in any of three currencies: money, attention, or time. The bigger point is evident in the activities people engage in in these spaces. At this moment, I see 16 people sharing my pseudo-private space. 14 of them (including one Amish gentleman) are silently consulting their phones or laptops. Reading, texting, playing games, working. Different people are doing different things. One person is staring at the other people blearily, perhaps wishing she was asleep. The one thing no one is doing? Chatting up the people around them, initiating spontaneous and unwanted intrusions into each other’s private lives.

It seems to me that spontaneous conversations have gotten rarer in public spaces for reasons that have little to do with the presence of advertising intruding into those spaces. My car is filled with advertisements (on billboards, on the radio), yet it seems that these never silence family conversation on a long car trip. It seems to me intuitively obvious that the change is not that advertising has blocked our capacity for social behavior, but that cheap, available electronic devices have allowed us to engage socially (via email, blogging, reading, writing) with communities of choice, freeing us from the tyranny of geographic proximity and frustrating conversations with people whom we would rather avoid. Or to do idle and pointless things other than conversing with random strangers about the weather, about their miraculous cures for colds, about my children, or whatever else we might discuss. The conversations we have now are the conversations we want to have, and it’s not obvious that this is a problem. Nor is it clear that it has anything to do with a conflict between individuals and corporations waged in the space of attention. Our culture is changing, AND we have become better at making our attention a fungible commodity, one we can effectively barter away in exchange for goods, generally if and when we want to. I don’t go to the gas station at the 7-Eleven near my home, because of the several close-by gas stations it is the only one to show me advertisements while I pump gas. We do have choices (at least for now), and my intuition is that all of these choices make us better off. Furthermore, they demonstrate that the same technology that seems to bother Crawford with its intrusiveness also allows us to escape from unwanted intrusions of the past, and to better allocate our attention to the things we most value.

So: Crawford has one intuition, and I have another. How do we resolve this difference? Unfortunately, the analytical tools on offer do not allow us to track down mechanisms with any confidence. I have no idea which of us is right about why our culture has shifted away from spontaneous discourse (or voyeurism) in public spaces, just as I have no idea whether this is a good or bad thing from the perspective of the broader cultural goals Crawford implicates in the debate. What I do think is clear is that these issues are rapidly changing and of central importance. Crawford provides value by raising them. I just wish they were raised with more of a focus on rigorous analysis over anecdote, and on a clear articulation of the public goods rather than emotional responses that may not be shared. At his worst moments, Crawford can come off as a cranky curmudgeon, annoyed with ‘kids these days’ and sure that every new direction is a wrong turn. At his best moments, he sees with clear, fresh eyes solutions to problems you didn’t know you had, and makes clear the dangers of cultural choices that seemed entirely safe, or even obvious. My belief and expectation is that the book itself will provide a deeper analysis of the economic tradeoffs involved, and (maybe more so) invoke a larger and more profound conceptual framework for understanding the personal and social implications of an increasingly transactional approach to attention, outside of the ‘corporations vs. the little guy’ framing presented in the article.

The ignorance of the Ipsos Mori Ignorance Index

A number of news outlets and otherwise sophisticated blogs are reporting on the Ipsos Mori "Index of Ignorance", where the polling group purports to gather data about how people misunderstand and ‘overestimate’ proportions of various values in their societies, such as the percentage of Muslims, or the rate of teen pregnancy.  For instance, British people estimated that 21% of the population of Britain is Muslim, where the actual figure is about 5%.  The poll goes so far as to rank these countries based on their ‘ignorance’.

There is a news story here, but it is the exact opposite of the reported one: people are shockingly good at estimating the proportion of events in their countries. I guess I can’t blame Ipsos for getting it so wrong, because in order to see what’s going on, you need a basic understanding of psychophysics.


Grumbles about "Grade Inflation" reveal different perspectives on what grades signify

Slate’s Matthew Yglesias has a nice post on grade inflation at top universities.  He observes that because top colleges and universities have become much more selective, part of why grades are getting higher may be that better students are attending them.  One way to think about it is that the grades are exchangeable by university–that same student that got an A in a Harvard course would have gotten an A anywhere–it’s just that the student who would have gotten a B- in that Harvard course didn’t get into Harvard in the first place.


Biases in Mathematical Perception in Mother Jones

Lydia Nichols recently pointed me to an article in Mother Jones that had some mildly interesting flaws whose cognitive origins can be largely traced.

The main point of the article was to talk about gerrymandering, and the clear damage that highly successful gerrymandering–together with increased segregation along political lines–has done to our representational democracy.  And the article was, as far as I could tell, basically factually correct. However, the implications drawn had two neat errors.


Analogical Mapping and Classroom Representations

Lately, I’ve been thinking a lot about mappings–particularly the kind of mappings we make across mathematical representations like algebraic notation.  Part of this is to build toward a cluster of automated mapping systems that allow for flexible enforceable mappings across structures–I’ll write about that project later.
Here, I’m just going to muse a little bit on the difficulty of pinning down some analogies that are very common in math classes, yet on their surface seem challenging for mapping processes (at least, the ones I know of) to handle.  I’m going to talk just about number lines and arithmetic notation, but the same points seem to hold for a wide variety of situations.
