Ad hoc explanations – a rejoinder to Hanson

A few days ago I wrote a rather lengthy review of The Elephant in the Brain, and I got a reply from Robin Hanson here. After his reply, I think we agree more than I initially thought, which is a good thing, but there are still a few points in it that I disagree with.

I will begin with something that surprised me a great deal, even though it is a tiny discussion point:

 (Note that our book never uses “hypocrisy”.)

It is just a comment Hanson makes in passing. Initially I thought that my memory had failed me yet again (it is, by my own admission, poor), since when I wrote the bit about hypocrisy in my review I hadn’t literally checked the use of the word. But then a commenter on Overcoming Bias pointed out that the book does use it. A few days later Bryan Caplan also made the same point (perhaps they discussed my review over lunch 🙂), but he notes that the concept itself pervades the book. Not only does the concept pervade the book, but the word itself (along with “hypocrite”) appears a couple of times in the relevant sense, not counting references or quotes. It is not a bad thing that the book uses the word, as hypocrisy is precisely what it is talking about, but it puzzles me that Bryan and Robin say that the book scrupulously avoids it. Maybe the intention was to avoid the word so as not to give the arguments a moralising tone, since hypocrisy sounds like a bad thing, and it somehow slipped into the book anyway. Hanson has been writing about people being hypocrites for a long time, after all.

Moving on to more relevant stuff,

  1. Other explanations can account for each of puzzling patterns we consider.
  2. We shouldn’t call hidden purposes “motives”, nor purposeful ignorance of them “self-deception”.

My comments went along similar lines, though not quite the same ones: I said that yes, some other explanations can account for the puzzling patterns, and I also denied that some of the puzzles even exist. I’ll discuss the second point first.

I’ve said this before, but let me repeat: Our focus in this book is on big puzzling patterns of behavior that don’t fit with the usual purposes people usually cite in the most public of forums. We point to other purposes that people better achieve via these behaviors. […]

We call these purposes “motives” and note that people seem suspiciously unaware that their behaviors achieve them, even though these are familiar purposes with simple connections to behavior. Such a lack of awareness, created on purpose, we call “self-deception.” We do not clarify the degree to which people lie or are unconscious of these motives, nor the degree to which behaviors are adaptive in particular circumstances. These things vary greatly by culture, person, and context, and it was hard to fit as much as we did into one book. We are focused on distal, not proximate, causes. […]

Kel mentions that a well-known author we revere (Trivers) prefers a definition of “self-deception” compatible with our usage. But Kel still complains that our claims are doubtful given his preferred usage. And even if he accepted our usage there, he says, other claims would remain doubtful given his preferred usage of terms like “aware”, “motive”, and “selfish”. (Note that our book never uses “hypocrisy”.) His preferred usage of these terms, you see, relies on distinctions regarding lying, awareness, consciousness, and adaptiveness. As we don’t make those distinctions, our claims just can’t use those terms correctly. And thus we are wrong, just wrong, admit it wrong! […]

And of course we haven’t proven the centrality of the hidden motives we postulate, especially if you require that we first disprove all possible ad hoc explanations for each behavior pattern.

The last paragraph I cite above tells me I was wrong about an assumption I made when I started writing the review: I overestimated how confident the authors were in the claims made in the book. Seen in this light, my 21k-word review must have seemed like overkill from their perspective. I had the impression that the claims were meant to be more solid than they were. Even though the authors admitted that the book would contain mistakes, the fact that it was published by a major academic publisher, together with the feedback I had read from people I highly respect, made me think that the book was making super-solid claims with a high probability of being true, rather than merely proposing an interesting theory that deserves further investigation.

I think the previous paragraphs misrepresent the points I made regarding “conceptual confusion”. I personally think that if one defines one’s terms clearly, there is little or no issue in accepting them within that particular argumentative framework.

The only trouble with choosing some words over others is that they can induce confusion about the topic under discussion if one is not clear enough. I may be too thick or take things too literally, but for me it was not clear enough, though by no means do I mean to imply that the authors deliberately sought to obscure what they wanted to say!

The points I made, however, were meant to be discussed within the authors’ preferred use of words, to facilitate conversation. Ultimately we have some meanings in our heads that we want to convey, and we use words for that, but different people may use the same words to mean or imply different things.

For example, when I said that there is little evidence of “adaptive self-deception”, this claim was made within the framework they discuss where they make claims along the lines of “There are different modules in the brain, the press secretary of which is your conscious awareness, there are many things going on you are not aware of, and your brain hides things from you while acting on the true information”.

So clearly the authors here mean self-deception in the “two inconsistent representations of knowledge” sense that Pinker and I endorse.

Now, it so happens that this meaning is covered by Trivers’ usage, so my concept of self-deception is subsumed by Trivers’. As I said in my review, this is all fine. But then one has to be clear when one uses that concept, because the word now covers different things. As it stands after my review, there is plenty of evidence of cognitive bias and little of adaptive self-deception (SD). So claiming that we engage in self-deception collapses into saying that our cognition is biased. That was my point. I thus grant that “self-deception is pervasive” if by that we mean “cognitive biases are pervasive”.

At this point one might think that I’m just being a punctilious word nazi of some sort, but it’s not that: I honestly care only about clarity, and I explicitly disregard anything like an intrinsic meaning for a word.

If the above wasn’t clear, imagine that we are considering the effects of drinking on health. It is found that heavy drinking worsens health. So I accept “Heavy drinking causes worse health”. Someone comes along and says “For me drinking is a more inclusive concept, we have to include drinking water along with alcohol too”. And this person continues: “Drinking (as defined by drinking water and/or alcohol) causes worse health, but there is some heterogeneity in the results in low-alcohol drinking samples”. I would accept this claim. But here it seems to me that it is clearer to study the drinking of water and the drinking of alcohol separately. Same for adaptive self-deception and other biases.

The first point of Hanson’s reply:

One can usually construct an ad hoc explanation for most any particular observed pattern. That’s the problem; its too easy. That’s why we held ourselves to the higher standard of trying in each area to suggest a single main purpose that could explain as many behavior puzzles as possible. Though we mention a few other plausible purposes.

Consider the example of explaining over-consumption of medicine as due to ignorance on the optimal spending level. This is plausible, but if all we knew is that people are ignorant, we would predict under-consumption just as easily as over-consumption. So to explain over-consumption we have to add in an auxiliary assumption about the typical direction of mistake. We also need another auxiliary assumption to explain the strong correlation of this mistake across time and space.

Each auxiliary assumption may be plausible, but it isn’t obvious. So the more of them you make, the more your theory loses on prior probability grounds. After all, you must multiply together the prior on each of assumptions to get the prior on a total model that explains the world. In contrast, while a single main explanation for all puzzles also typically need a few auxiliary assumptions, it needs fewer. Which is why a Bayesian analysis tends to favor a simpler explanation when that fits the data as well.

I agree with Robin that the Bayesian approach to arbitrating between competing theories is the right one, but the explanations I mentioned are not ad hoc: I see them as relying on general, well-established principles. For instance, among my assumptions are “People care about their health”, “People care about their family and friends” (not merely that they want to show it, but that they truly do!), and “The human body is an extremely complex system, hard to understand from first principles”. These claims may not be true with probability one, but they get close. Furthermore, these claims are not new: they were already there and we already believe them; I am just connecting them and drawing out their implications. In contrast, Hanson’s theory, while indeed ambitious, is a new one that would get a much lower initial prior probability because of its novelty.
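To make the disagreement concrete, here is a minimal sketch of the multiplication Hanson describes. The probabilities are made up purely for illustration (they come from neither the book nor the review), and the assumptions are treated as independent, which is itself a simplification.

```python
from math import prod


def joint_prior(assumption_priors):
    """Prior of a model that needs every listed assumption to hold.

    Treats the assumptions as independent, which is a simplification.
    """
    return prod(assumption_priors)


def posterior_odds(prior_a, prior_b, likelihood_ratio):
    """Odds of model A over model B after seeing the evidence.

    likelihood_ratio is P(evidence | A) / P(evidence | B).
    """
    return (prior_a / prior_b) * likelihood_ratio


# Hypothetical numbers: four long-standing, near-certain premises...
established = joint_prior([0.95, 0.95, 0.95, 0.95])
# ...versus one novel thesis plus a single auxiliary assumption.
novel = joint_prior([0.30, 0.90])

# If both models fit the data equally well, the likelihood ratio is 1
# and the comparison is decided entirely by the priors.
odds = posterior_odds(established, novel, likelihood_ratio=1.0)
```

With these invented inputs, many near-certain premises still multiply to a higher prior than one novel premise, which is the shape of the argument above; different inputs would of course give a different answer.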

The core thesis of the book, to restate it once more, is that “we are strategically blind to key aspects of our motives” (and that this explains a few puzzles). These key aspects relate to selfishness.

Throughout my review I rejected some of the puzzles. (If these rejections go through, they would be examples where the theory predicts something that is not observed, which counts against it.) For example, with body language I argued that we are substantially more aware of its workings than the chapter says, and that the reason it is not taught formally is precisely that we have a good intuitive grasp of it. I’m not claiming one cannot get better at it, but that for most intents and purposes our intuition suffices. Similarly, for charity the authors argued that people say charity is for maximising the good done, but that if that were so, people would behave differently (i.e. they would be EAs). I said, in contrast, that this is not what people aim for when they donate money. Another puzzle relies on a causal link between status and health mediated by stress. But as I mentioned in my previous post, that doesn’t seem to be the case either.

I also identified at least some cases where the theory would predict something that we do not observe and that was not mentioned in the book: Shouldn’t we observe healthcare being used as a gift if we use it to show that we care? It sounds like a vaguely plausible implication of the model, yet it is not an implication that Hanson would accept (I think).

For others I did offer an alternative explanation. I want to mention medicine here, as it is arguably one of the topics where getting it right matters most, given the amount of resources at stake.

My explanation for medicine involves the following points:

  1. People care about their health
  2. People care about others’ welfare, mostly friends and family
  3. The human body is a highly complex system
  4. The human body is one of the few complex systems that has been around for millennia that humans have had to deal with
  5. People are reluctant to do experiments on human bodies for ethical reasons (or even, in the past, to do experiments with dead bodies due to religious beliefs)
  6. Good health is most usually a prerequisite for a good life
  7. Trial-and-error learning is extremely hard with complex systems
  8. Such learning is even harder without the modern tools of science
  9. Humans have a tendency to see patterns where there are none

I don’t see these as ad hoc; I see them as general truths, not specific to medicine, and I see them, again, as extremely plausible.

Consider architecture and engineering: cathedrals and Roman engineering were top notch compared to the medicine of their time. Why? Because, as with most things in the world, classical mechanics, or an intuitive understanding thereof, makes them tractable. Classical mechanics is based on simple principles, and throughout human history most systems we have encountered or built were still explainable in terms of these simple principles.

With the human body, some things seem easy: (many kinds of) surgery, dentistry, or ophthalmology are easier to understand through classical mechanics. These have advanced the most, but many aspects of medicine are not like that: cancer, heart disease, infections, or diabetes are hard to understand that way, and thus they took longer to understand and properly cure.

That didn’t stop people from trying. But given that we tend to heal naturally, any random treatment (herbs, magic rituals) will seem to cure you. These anecdotes then become folk knowledge, and without science that’s the best you can do.

The effect of our expectations on perception (of pain, as in the placebo effect) also makes it hard to come up with remedies that “really work”, because remedies that “do not really work” actually work! (Meaning that they do relieve pain, but not for the reason you think they do.)

My theory predicts a bunch of things: that in the fields that are more like mechanics you will see more progress and better treatments, that over time the effectiveness of medicine will increase, and that supply will match demand. I thus predict that all the waste we see will disappear as knowledge of healthcare improves (especially for healthcare systems dominated by out-of-pocket payments that make prices visible, though arguably these are not most healthcare systems), and that this will happen even if neither the general population nor the subpopulation of clinicians becomes aware of the “healthcare is to show that we care” theory. (I also predict that they won’t become aware of it.) Healthcare spending as a % of GDP will go down in most developed countries.

However, if healthcare is, in part, to show that we care, and people subconsciously want to do that, we shouldn’t see this happening. Waste will accompany us forever unless people pay attention to the theory and we engage in some clever Hansonian institutional design to cater to those hidden motives. Or maybe one may still argue that yes, in the long run knowledge will triumph, but that given the current lack of knowledge, healthcare to show that we care explains overconsumption. As Robin said:

Consider the example of explaining over-consumption of medicine as due to ignorance on the optimal spending level. This is plausible, but if all we knew is that people are ignorant, we would predict under-consumption just as easily as over-consumption. So to explain over-consumption we have to add in an auxiliary assumption about the typical direction of mistake. We also need another auxiliary assumption to explain the strong correlation of this mistake across time and space.

I assume you will agree with me that medicine and healthcare are harder and more complex than classical mechanics. This, together with some of the other claims above, explains why we don’t know as much as we do in other fields. But it does not explain, as Hanson points out, why we see over- rather than under-consumption.

Well, we should ask the clinicians signing patients up for treatment, and the patients asking for it. What do they say? For one, clinicians say that they are aware that ~20% of provided care is unnecessary, but that they recommend it anyway to avoid being sued for malpractice, and because patients pressure them into it (Lyu et al., 2017). The situation is similar in at least the UK, Italy, Japan, and Australia (Ortashi et al., 2013). This motive most likely extends beyond economic issues: one of the articles points out that doctors may be willing to overtreat to avoid having to deal with the negative thoughts associated with a patient’s health worsening because treatment was not provided.

In addition to that, both physicians (Hoffman & Del Mar, 2017) and patients (Hoffman & Del Mar, 2015) overestimate, on net, how good treatment is and underestimate its potential harms. Errors in the opposite direction were also prevalent, but an optimistic bias prevailed.

If we continue through this trail of papers and ask again why both groups overestimate on net we end up at a set of cognitive biases: illusion of control and confirmation bias, and these two are general phenomena that have been observed across many domains, not just healthcare. Perhaps also complexity bias.

These two can arguably also be leveraged to explain puzzles in finance: why people pay such high fees and get shitty funds from financial advisors. They may also explain other similar things, like dowsing (believe it or not, some employees of UK water companies still do it), education (“See, this student went far because of me, education works”), or nutrition science (here the parallels to medicine are more striking).

(Maybe I should write a “How cognitive biases rule the world” book instead of reviewing other people’s? Or at least write a proper paper so that I can be at the receiving end of a review?)

What is more likely, these biases or hidden motives, all things considered, and thinking like a Bayesian? We don’t have to choose, of course; it can be both. But I see the weight of the evidence as resting much more heavily on the side of my explanation than on hidden motives (in healthcare).

Ideally we would want to bet on this: $200 that in 5 (or 7? 10?) years’ time the evidence will favour my explanation more strongly than a hidden-motives-based one. (Unless we want to call the biases mentioned above hidden motives! But then we reject healthcare as showing care…) As I think that at even odds the bet would be biased in my favor (research on hidden motives in health may take its time, and Hanson’s paper on the matter was published almost two decades ago and has gotten very little engagement), I would offer odds against myself: if I win I get $200, and if I lose I pay $400, or something like that. I volunteer Bryan Caplan and RCAFDM as arbiters.

EDIT: I got a brief reply at Hanson’s original post. Below is my reply to his reply to my reply to his reply to my review of his book:
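As a rough illustration of what the asymmetric stakes imply, the sketch below computes the win probability at which such a bet breaks even. The function names are hypothetical, and the $200 and $400 figures are the loosely stated ones from the paragraph above.

```python
def break_even_probability(win_amount, loss_amount):
    """Probability of winning at which the bet has zero expected value."""
    return loss_amount / (win_amount + loss_amount)


def expected_value(p_win, win_amount, loss_amount):
    """Expected payoff for the bettor who wins win_amount or pays loss_amount."""
    return p_win * win_amount - (1 - p_win) * loss_amount


# Win $200 or pay $400: the bettor needs to be right about two times in three.
p_star = break_even_probability(200, 400)
```

In other words, offering 2:1 odds against oneself only makes sense for someone who puts the chance of winning above roughly two thirds.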

I’m pretty sure that a) few are aware that even close friends use status motives to negotiate a non-equal relative status, and b) most will say that the point of their charity is to help others. (Few ever talk of “maximizing” anything.)

What I meant by people being aware of body language and not needing to study it is that we know when a person is happy or sad, when someone looks down, when someone is fidgeting with their hair and what that means, that touch signals closeness, what having the arms crossed means, or that looking into the eyes of a superior is more intimidating than looking into the eyes of an inferior in some hierarchy. We may not be able to verbalise why, and we won’t know all there is to know about body language by mere intuition, but my claim is that we know enough to get by without having to study it. I would be surprised if training in body language awareness had much of an effect (Cohen’s d>0.2) on any success-related outcome compared to a group of peers. I tried looking for studies on this but I failed.

For charity, they will say that, and they will be right. They help others, but not just any random others; one has to keep asking “why these others and not those others”, and they will confirm.

Most ad hoc explanations everywhere are based on reasonable long-standing assumptions. But they still require topic-specific auxiliary assumptions. For example, Kel invokes “biases” to explain why we over- rather than under-consume medicine:

Ask again why both groups overestimate on net we end up at a set of cognitive biases: illusion of control and confirmation bias, and these two are general phenomena that have been observed across many domains, not just healthcare.

We can’t over-consume everything. If we over-consume medicine relative to other things we need a more specific reason than a general bias that applies equally to everything.

Yes, but that is why I talked about the specifics of medicine: medicine has some peculiarities that I describe (so do financial advice, day trading, nutrition, etc.).

My theory predicts a bunch of things: … I thus predict that all the waste we see will disappear as knowledge of healthcare improves … Healthcare spending as a % of GDP will go down in most developed countries. … Ideally we would want to bet on this: $200 that in 5(or 7? 10?) years time that the evidence will favour my explanation more strongly than a hidden motives based one. (Unless we want to call the biases mentioned above as hidden motives!)

The claim “evidence will favor biases over hidden motives” seems harder to judge than whether the % of GDP to medicine goes down. I’d bet $10K at even odds that % of GDP to medicine goes up over the next ten years in the highest-income 1/4 of nations worldwide. Here are some relevant datasets.

That is quite a big ellipsis! The 10-year time frame is for the bet I proposed, not for the healthcare % going down. I see research as progressing faster than social structures: in healthcare, RCTs can take decades to have a significant effect on practice.

Healthcare costs have been increasing for decades everywhere and they are not going to start declining today. The international campaign Choosing Wisely to reduce waste in medicine is just 5 years old. This seems like the first serious attempt to tackle the problem (targeting specific issues, measuring effectiveness, teaching medical students to do better, etc.). For healthcare I accept the $10,000 bet if the time frame is 40 years: no developed country will spend in 2058 a larger % of its GDP on healthcare (private + public) than the US does today minus 10%. If one or more countries spend more than 16.1% of their GDP in 2058, I lose. For a shorter-term prediction, I also bet $10,000 that by the end of 2038 no country will be spending more than 21% of its GDP on healthcare. (For reference, a 1% yearly increase trend for the US leads us to 21.84%, and a 2% trend leads us to 27% in that year. Typical projections are all GDP+more than 1%, so my projection is lower than the lower end of these.)
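The reference figures above can be reproduced with a small compounding sketch. It assumes a current US healthcare share of 17.9% of GDP, a baseline the post does not state directly but which is implied by the 16.1% threshold (“the US today minus 10%”), and a 20-year horizon to 2038. The function name is hypothetical.

```python
def project_share(baseline_pct, yearly_growth, years):
    """Healthcare share of GDP after compounding a yearly relative increase."""
    return baseline_pct * (1 + yearly_growth) ** years


# Assumption: baseline inferred from the 16.1% threshold, not stated in the post.
US_SHARE_TODAY = 17.9
YEARS_TO_2038 = 20

threshold_2058 = US_SHARE_TODAY * 0.9                               # about 16.1
trend_1pct = project_share(US_SHARE_TODAY, 0.01, YEARS_TO_2038)     # about 21.84
trend_2pct = project_share(US_SHARE_TODAY, 0.02, YEARS_TO_2038)     # about 26.6

print(f"2058 threshold: {threshold_2058:.1f}%")
print(f"2038 at 1%/yr:  {trend_1pct:.2f}%")
print(f"2038 at 2%/yr:  {trend_2pct:.1f}%")
```

Under that assumed baseline the 1% trend matches the 21.84% quoted above, while the 2% trend comes out near 26.6%, which the figure of 27% rounds up.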

