IQ Cults, Nonlinearity, and Reality: a Bird-watcher’s Parable

Imagine a society obsessed by bird watching. Bird watching is not only a wonderful pleasure for the individual but also, let us say, the source of that society’s flourishing. Good bird-watchers are in high demand. Many people want to be bird-watchers. Aristotle has a section on bird watching in the Ethics. The National Academy of Sciences is named after John Audubon.

We worry about the next generation of bird-watchers. Can we identify them? Can we spot diamond bird-watchers in the rough? To help, some psychologists create a test. The test is based on introspecting on what bird watching is really about. The psychologists ponder it, watch some bird-watchers, and decide it looks like they’re really good at sitting still.

The test, therefore, is how long you can sit in a chair without moving. This is administered in controlled conditions. You have to put your hands in your lap, palms up, there’s a timer, and you don’t get to see the particular chair you’re sitting in ahead of time. Movement is judged by the person who administers the test, at first, but it’s now been upgraded to laser-ranging systems that eliminate sources of bias.

The test works! It turns out that if you can’t sit still in a chair for more than five minutes, you will never make it as a bird-watcher. Not only that, but if you can break the thirty-minute mark, you have an elevated probability of becoming a great bird-watcher. Sitting still captures bird-watching ability.

A bunch of other tests based on sitting still are created. They all strongly correlate with each other. Comfy chairs, couches, even a super-rigorous standing one used at Duke; they all seem to measure the same thing, s. It turns out that sitting-still scores move a little bit with training, but if someone can’t sit still for ten minutes, there’s almost nothing they, or a Head Start program, can do to get them past the thirty-minute mark, at least if you check a couple of years later. New sitting tests are created that are more resistant to people learning to sit still.

Even more than that, it turns out that sitting still is not just predictive of bird watching performance, it’s also predictive of a whole host of other life outcomes. People who can’t sit still for five minutes have more problems with addiction, for example. Conversely, someone who can sit still for twenty minutes is often able to avoid addiction, or to break it if he falls victim. Very, very few people who can sit still for three hours die of alcoholism. Same with divorce, automobile accidents, and being good at chess. Bird watching ability is protective. This fits with how important bird watching is in the culture.

Things start to get dark. For example, very few women are extreme performers on the sitting task. This is because sitting ability is bell-curve distributed, and the female variance is smaller than the male variance. Some men just can’t sit still, while others are massive overachievers and can sit still for days. Women just can’t hack it as elite bird-watchers because $e^{-\frac{x^2}{2}\left(\frac{1}{\sigma^2_\textrm{f}}-\frac{1}{\sigma^2_\textrm{m}}\right)}$ is very small for large $x$.
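For anyone who wants to watch the psychologists’ arithmetic run, here is a minimal sketch. It assumes both groups are normal with the same mean (zero) and differ only in spread; the standard deviations 0.9 and 1.0 are made-up illustration values, not anything measured, and the function names are hypothetical. The exponential above is the female-to-male density ratio up to a constant factor of sigma_m / sigma_f.

```python
import math


def upper_tail(x, sigma):
    """P(X > x) for X ~ Normal(0, sigma^2), via the complementary error function."""
    return 0.5 * math.erfc(x / (sigma * math.sqrt(2.0)))


def density_ratio(x, sigma_f, sigma_m):
    """Female-to-male density ratio at score x, assuming both means are zero.

    Equals (sigma_m / sigma_f) * exp(-(x^2 / 2) * (1/sigma_f^2 - 1/sigma_m^2)).
    """
    exponent = -0.5 * x * x * (1.0 / sigma_f**2 - 1.0 / sigma_m**2)
    return (sigma_m / sigma_f) * math.exp(exponent)


# Hypothetical spreads chosen only for illustration (assumption, not data).
SIGMA_F = 0.9
SIGMA_M = 1.0

for x in (1.0, 2.0, 3.0, 4.0):
    tail_ratio = upper_tail(x, SIGMA_F) / upper_tail(x, SIGMA_M)
    print(f"x = {x:.0f}: density ratio {density_ratio(x, SIGMA_F, SIGMA_M):.3f}, "
          f"tail ratio {tail_ratio:.3f}")
```

Under these assumptions both ratios shrink steadily as the cutoff rises, which is all the formula says. It says nothing about whether the cutoff has anything to do with birds.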

The psychologists caution that just because they’re saying that women are much, much less likely to be found in the elite sitting score percentiles, and that’s the best measure of true bird-watching ability we have, it doesn’t mean you should assume that any individual woman can’t be a great bird-watcher. That doesn’t make sense, they say. Most people realize that this is exactly what you should think given what they’re saying. If red apples are much less likely to taste good than green apples, you should cook with the green apple unless you’re racist. But everyone agrees to go along with the idea that this population-level stuff is super-innocent, and people who write papers on this get ruthlessly suppressed and there’s a whole Quillette thing.

Twin studies are done. Sitting task performance is genetically heritable.

Racial differences in the sitting task appear. Extremely sophisticated linear regressions are done to control for SES, age, educational background of parents, etc., and they refuse to go away. People write books about how the lack of black bird-watchers is due to their genetic inability to do well on the sitting test. (People notice that black female bird-watchers are over-represented in elite circles compared to black male bird-watchers, and that kind of clashes with the gender result, but explanations are forthcoming.)

There are some troubles in paradise, however.

To begin with, almost every great bird-watcher alive thinks the test is absolutely crazy. Bird watching is not about sitting perfectly still for hours, they say! No great bird-watcher wants to brag about their sitting score. A famously egotistical bird-watcher who writes books about how awesome he is at bird watching, how he totally crushed this other bird-watcher, etc., etc., is also really proud of the fact that he was, at best, at the bottom of the upper quartile of sitting still. Birdbloggers clamor to reveal their crappy sitting scores.

In fact, bird-watchers basically describe what they do in terms of anything other than sitting still. This is a dynamic, gestalt thing, they say. There are many different kinds of birders. Great birders are birders about birding. There is a world of Platonic birds I touch them with my mind at night. Bird-watching is ethological poetry, and I am Byron. Besides, those kids who do blow away the sitting task? We’re not surprised when only a small fraction of them actually blow away bird-watching.

What do bird-watchers know about bird-watching? the psychologists reply. A lot of the greatest bird-watchers are liberals who don’t like the race stuff which is totally true. Not only that, they add in a Parthian shot, but the sitting task test is actually a good, liberal thing! It really opened up bird watching back in the 30s. A lot of WASPs were getting grandfathered into the elite birding academies, and they couldn’t even sit still! If you oppose the sitting test, you are in favor of WASPy morons who scare away the birds. You oppose The Enlightenment itself.

Problems persist, however. When we actually look at the sitting-still performance of the elite bird-watcher population, it’s not so great. Yes, these people are good at sitting still, and some are really quite good. But not crazy good at it, even among the ultra elite. If you go by elite scores, in fact, it looks like literally a quarter of the population might meet the sitting-still bar for being a great bird-watcher, even though the test sample was admitted to the birding academies partly on sitting scores. Among other things, there’s basically no excuse for the differential representation of men and women in the birding world.

Crazy! A quarter of the population! We thought that there could only be a few great birders, but maybe there’s a huge untapped potential for a breakthrough in our species. The sitting still psychologists are not pleased.

Some well-intentioned educators show up. Could we at least split it, guys? We have this intuition that there are many different kinds of birders. Fine, the psychologists say. Make a test. The educators invent some tests, but inasmuch as they are predictive of bird-watching, they correlate with sitting score, and inasmuch as they aren’t, they don’t. Somehow, the other aspects of birding are resistant to isolated measurement in a test you take sitting down for a few hours. Grit doesn’t replicate.

What do people who teach bird-watching know about a person’s capacity to learn bird-watching? the psychologists say. Our best studies now show that we can isolate the ultimate essence of birding, the principal component of all the tests. It is a test conducted in a white room, with a chair of so-and-so-weight. All stimuli are excluded. It is totally silent. Nobody is present in the room. There are no windows.

Some birders hear about this test and are amazed. The test now excludes absolutely everything we think matters about bird-watching, they say: responsiveness to external stimuli, to other birders in the field, to dynamic upsets, false leads, the thrill of the chase, the intuitions, the third-sight. Doesn’t this disprove that the sitting-still task is a measure of bird watching?

Fine, if sitting still is not birding, the psychologists say, what else could it be? Could you define birding for us?

Many people think this is a good point, in part because the sitting-still score has been named the “Bird-watching Ability Quotient”. How could it do anything other than measure it? Parents tell kids who can sit really still, oh, you could make a great bird-watcher. In movies, bird-watchers save the Earth by sitting really really still while things explode all over the bird-watching complex. Young kids who are just mediocre at sitting still give up on bird watching and become psychologists.

We’d never do this kind of stuff in reality, of course. We’d never be so wrong about a thing we value so much. We’re a high-IQ society.

10 Replies to “IQ Cults, Nonlinearity, and Reality: a Bird-watcher’s Parable”

  1. In 2018, almost every great bird-watcher alive who thinks the test is “absolutely crazy” has himself performed well on the test and magically hails from an institution with extremely high median scores.

    These objections will be taken seriously when, for example, your and other top-flight technical institutions start admitting 500 SAT-M / GRE-M applicants and reliably continue to churn out similarly prestigious mathematicians and physicists. Until then, too many of us notice too strong and too robust a correlation between psychometric IQ and intelligence for us to take too seriously hyperbolic deflations like “absolutely crazy”.

    There is a need and a market for intellectual diagnostics. And such diagnostics will perforce be relatively brief and exam-like. We’re on a planet of more than 7 billion people. Examinations which allow us to quickly and efficiently separate the meal from the bran to a first approximation are useful.

    The critics of IQ are all too frequently feel-good egalitarians, morally opposed to any notion of innate superiority, and inclined to gripe when greeted with any unequal outcome.

    To be sure, they offer nothing better. And neuroscience, psychometrics, and genomics are proceeding apace and in a manner largely confirmatory of hereditarian, immutable, intelligence-as-tested-by-IQ-tests. I’m hopeful we’ll soon have a better, less correlative understanding of the essential nature of intelligence.

    Until then, it looks like IQ proxies (whether the AFQT, GRE, RAPM, WAIS-R, etc.) are necessary and do a good job at measuring something real and important and predictive.

    • Are you really Emmy Noether, the great mathematical physicist returned? This is great.

      I think you’ve represented the opposing viewpoint quite well. I query a few things:

      1. Per parable, the sitting task is, indeed, predictive. I don’t know what SD 500 SAT corresponds to, but if it’s above median, sure. The idea that physicists need an IQ of 150 to do great work is nonsense, however—see the great Twitter debate for statistics on one collection of physicists that puts them between one and two SD from median. And that’s after partial selection on test scores.

      2. Be careful of psychoanalyzing people who disagree with you. It’s tempting, I know.

      3. Your claims about hereditary elite performance, neuroscience, psychometrics, genomics have zero support in the data. You will disagree with this because you confuse high sitting score with being good at bird watching.

  2. “Are you really Emmy Noether, the great mathematical physicist returned? This is great.”

    Ha-ha. Indeed, the same, my son. I am the Ghost of Physics Problem Sets Past, granted temporary leave of Heaven to answer all your psychometric queries 😉

    “1. Per parable, the sitting task is, indeed, predictive. I don’t know what SD 500 SAT corresponds to, but if it’s above median, sure.”

    Sorry for the confusion — I meant an SAT (math) score of 500. Not sigma units.

    “2. The idea that physicists need an IQ of 150 to do great work is nonsense, however—see the great Twitter debate for statistics on one collection of physicists that puts them between one and two SD from median. And that’s after partial selection on test scores.”

    Thanks for the rec! I’m decently familiar with what I think is related differential psych lit: Roe, Terman, Hollingworth, etc. But this isn’t ringing a bell exactly. Sounds interesting though. I’ll try and locate this!

    “2. Be careful of psychoanalyzing people who disagree with you. It’s tempting, I know.”

    A wise and fair admonition! But, of all psychologizing, surely it’s safer to make a general claim about some fraction of the opposition. Rather than an absolute claim about a particular person. Or a claim about absolutely everyone.

    I myself try and hew studiously to Noether’s Razor: Never attribute to malice that which can be adequately explained by stupidity — or ordinary disagreement!

  3. “The idea that physicists need an IQ of 150 to do great work is nonsense, however.”

    I’m inclined to say, as usually deployed, this sort of statement is wrong/misleading. But to be fair, strictly speaking, it’s in need of some disambiguation because I’d have to admit for some values of “great” and “IQ = 150”, I agree with you.

    We need, of course, to specify our metric to really even begin to evaluate.
    150 IQ as measured by the MAT is not 150 IQ as measured by the WAIS-R. (Which, as a theoretical physicist, is likely partly your beef with this crude measure!) I would agree that you don’t need a 150 IQ as measured by these tests. A 150 IQ is not needed the way a passing score on a bar exam is needed to be a lawyer, much less the way being male is needed to be a bachelor.
    This is not how I think one should most constructively construe statements like “150 IQ is needed to do great work in physics.”

    You don’t think you need a 150 IQ to do great work in physics. Fair enough.
    But in order to motivate this discussion and get a better handle on where exactly you stand, some related questions:

    1. Do you need to meet some threshold of intelligence (suitably measured) to do physics successfully? I.e., Do you agree that a rare intelligence is required for good physics, but IQ is an imperfect measure of mental horsepower?

    2. Is this threshold of intelligence greater to be an eminent physicist? Do you need, e.g., 3-sigma+ intelligence to do great work in physics?

    3. Do you consider human intelligence to be functionally dissociable along verbal and mathematical domains?

    • Richard Feynman famously reported his IQ score: 120 (IIRC). That’s only about 1.3 sigma above the mean (at the usual 15 points per sigma), for one of the best. With the simplest correction for the Flynn effect, that means he would score around 100 today. If you believe that IQ is a measure of intelligence, this will seem insane. It appears to undermine the whole story that Charles Murray et al want to tell. So the impulse is to say it can’t be true. I’d urge people, instead, to take it seriously.

      RE: “Functionally dissociable”: I’d point to my remarks about beauty in music.
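The arithmetic behind the Feynman example can be sketched as follows. This is a back-of-the-envelope illustration only: the 15-point standard deviation is the usual modern convention, the Flynn drift of 3 points per decade is a commonly quoted round figure, and the 80-year gap between the test and 2018 is a guess. None of those three numbers comes from the thread, and the function names are hypothetical.

```python
def sigma_units(iq, mean=100.0, sd=15.0):
    """Distance of a score from the mean, in standard deviations.

    mean and sd follow the usual IQ convention (assumption).
    """
    return (iq - mean) / sd


def flynn_adjusted(iq, years_elapsed, points_per_decade=3.0):
    """Naive linear Flynn correction: subtract the norm drift since the test.

    points_per_decade is a round illustrative figure (assumption).
    """
    return iq - points_per_decade * years_elapsed / 10.0


REPORTED_IQ = 120.0   # the score quoted in the comment above ("IIRC")
YEARS_ELAPSED = 80    # hypothetical gap between the test and 2018

print(f"{sigma_units(REPORTED_IQ):.2f} sigma above the mean")
print(f"about {flynn_adjusted(REPORTED_IQ, YEARS_ELAPSED):.0f} on today's norms")
```

With these inputs the score sits well under two sigma, and the naive correction lands in the mid-90s, close to the "around 100" quoted above. A slower drift or a shorter gap moves it up toward 100.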

  4. Is there a possible tension in fundamental aims? Psychometricians are interested in bird-watching aptitude. They’re not interested in bird watching achievement per se.

    The force of your parable is “Myopic fools! Sitting-still has only the most tenuous connection to our praxis. Just do; just bird-watch!” Which is more or less in tension with the very notion of abstract intelligence and general aptitude, if not compressed, artificial examinations more broadly. Why do IQ tests exist? Let’s just look at people’s marks in school! Who needs tests for mathematical talent? We’ll just let people tackle Fermat’s theorem.

    I can’t help but think this parable would have a greater correspondence with the reality of psychometric practice if, instead of a caricature of Procrustean sitzfleisch, you analogized to something like visual acuity. Better still, if you wish to critique concepts of IQ as exemplified by the WAIS or Stanford-Binet, you could use a composite measure of bird-watching ability (one that included, e.g., sitting still + visual acuity + general bird knowledge) instead of a univariate score like time sitting still. A 150 IQ on the WAIS or Stanford-Binet is a *composite* measure summed over a dozen or so sub-tests. There’s more than one path to an X IQ destination!

    Your parable simply stipulates a priori this sitzfleisch fixation. But in reality, a lot more thought and validation have gone into intelligence proxies. We know we’re leaving out numerous contingencies like interest in birds and knowledge of birds. But we’re studying intelligence. Not personality variables. And not achievement, as such.

    E.g., there are reasonable, intelligent, and informed practitioners in cognitive science who think analogy is the core of cognition, like Hofstadter. Now, whether you think this is right is beside the point. It’s a defensible, not-crazy position. Analogies are also common elements of intelligence tests. There is clearly a greater correspondence between intelligence test-items and intelligence than sitting-still and bird-watching.

    Our understanding and measurement of intelligence may be imprecise. It could be improved–I think it will be. But this is a pretty provocative parody, right?

    I see value in the parable. And I concur that we should avoid a brittle over-reliance on these measures, especially as a society. I think there is a class of related problems that are going to attend any IQ measure we come up with, no matter how good. Cf. Goodhart’s Law.

    • So! What’s nice here is that we’ve honed down to a disagreement driven by a basic assumption: that intelligence is a thing like temperature, and all we have to do is work out the core (mean kinetic energy?), whether that’s IQ, marks in school, number of citations, etc.

      I think this is a fundamentally incoherent position. But it’s also a very compelling one to take—intelligence as brain temperature permeates our culture, the general understanding of what matters when it comes to this basic feature of our lives.

      I agree that intelligence has an abstract quality. Higgs happened to do phase transitions in QFT with a scalar order parameter, but he was not a phase-transitions-in-QFT-&c intelligence “type”.

      Perhaps a better analogy would be to beauty in music, say.

      We know that beauty in music is real (I’m a Platonist), it’s not convention, it’s not uni-axial, nor even “multidimensional”. To even know it is the outcome of a far more complex process/computation than an assay and, to a certain extent, is as difficult a process as the creation itself. That doesn’t exclude the possibility that a measure of musical complexity might be an interesting way to track the cultural evolution of the practice.

      I think this puts me in a tradition with David Deutsch (see, e.g., Beginning of Infinity), among others.

      Happy New Year!

    • Sorry to butt in uninvited. I feel, though, that this overcomplicates what is really a simple issue. The question that (I think) our good admin is asking, and that no one seems able to answer is this:

      Why does IQ *stop* mattering?

      It’s not hard to understand why, among a given population, people with a slight edge in a particular kind of abstract problem solving might be more successful on average.

      The real question is whether that edge matters because it directly enables success, or only because it serves to distinguish certain candidates from others. If the former were true, it would be natural to expect a monotonic effect: on average, more IQ would keep meaning more success at every level.

      The fact that we don’t see a monotonic effect suggests this hypothesis: what we’re really seeing is a gradient selection effect. There’s an IQ gradient, and resources wind up going to people on the “right side” of the gradient. But when it comes to actually doing science, the benefits of high IQ are modest at best. A similarly organized society with an average IQ ten points lower wouldn’t have fewer or noticeably worse scientists. Their IQs would be above average in their society, but merely average in ours. Moreover, this isn’t as hard to test as one might think, as our good admin’s example of Feynman hints.

      Now I don’t know if the above hypothesis is true. But it seems pretty consistent with the evidence. To disprove it, I think you’d have to come up with a better way to explain why IQ stops mattering.

  5. Don’t forget to mention that the psychologists who study this phenomenon happen to be exceptionally good at sitting still.

  6. An interesting question would be: What measurable skills actually go into birdwatching?

    My guess would be:

    – Energy to get out and tromp around after birds
    – Obsessiveness in running up tallies
    – Competitiveness
    – Sharp eyesight
    – Sensitive color vision (no color blindness)
    – General intelligence
    – Visual memory
    – Maybe sitting still

    Other interesting questions are how many of the traits are positively correlated and how many are negatively correlated. Probably your sitting still is negatively correlated with the high energy required to run up big counts.

    In contrast to IQ, where subtests surprisingly positively correlate, my guess would be that some birdwatching skills tend to negatively correlate with some other skills.
