Archive for category Probability and Risk

Intuitive Bayes and the preponderance of pessimism about human reasoning

(This is the 4th post on rational behavior of people unfairly judged to be irrational. See 1, 2, and 3.)

Most people believe they are better-than-average drivers. Is this a cognitive bias? Behavioral economists think so: “illusory superiority.” But a rational 40-year-old who has had no traffic accidents might notice that her car insurance premiums are still extremely high. She may then conclude that she is a better-than-average driver, since she’s apparently paying for a lot of other people’s smashups. Are illusory superiority and selective recruitment at work here? Or is this intuitive Bayesianism operating on the available evidence?

Bayesian philosophy is based on using a specific rule set for updating one’s beliefs in light of new evidence. Objective Bayesianism, in particular, if applied strictly, would require us to quantify every belief we hold – our prior credence – with a probability between zero and one, and to quantify the value of new evidence. That’s a lot of cognizing, which would lead to a lot more personal bookkeeping than most of us care to do.

As I mentioned last time, Daniel Kahneman and others in his field hold that we are terrible intuitive Bayesians. That is, they believe we’re not very good at doing the equivalent of Bayesian reasoning intuitively (“not Bayesian at all,” said Kahneman and Tversky in Subjective probability: A judgment of representativeness, 1972). But beyond the current wave of books and TED talks framing humans as sacks of cognitive bias (often with government-paternalistic overtones), many experts in social psychology have reached the opposite conclusion.

For example,

  • Edwards, W. 1968. “Conservatism in human information processing”. In Formal representation of human judgment.
  • Peterson, C. R. and L. R. Beach. 1967. “Man as an intuitive statistician”. Psychological Bulletin. 68.
  • Piaget, Jean. 1975. The Origin of the Idea of Chance in Children.
  • Anderson, J. R. 1990. The Adaptive Character of Thought.

Anderson makes a particularly interesting point. People often have reasonable but wrong understandings of base rates, and official data sources often vary wildly on some base rates. So what critics characterize as poor performance at Bayesian reasoning (e.g., ignoring base rates) is often in fact the use of incorrect base rates, not a failure to employ base rates at all.

Beyond the simple example above of the better-than-average-driver belief, many examples have been given (and ignored by those who see bias everywhere) of intuitive Bayesian reasoning that yields rational but incorrect results. These include not only single judgments but also people’s modification of belief across time – Bayesian updates.

For math-inclined folk seeking less trivial examples, papers like this one from Benoit and Dubra lay this out in detail (If a fraction x of the population believes that they rank in, say, the top half of the distribution with probability at least q > 1/2, then Bayesian rationality immediately implies that xq <= 1/2, not that x <= 1/2 [where q is the subject’s confidence that he is in the top half and x is the fraction who think they’re in the top half]).
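A toy version of their point, with made-up numbers: suppose half of all drivers are “good” (the top half) and half are “bad,” with assumed per-year accident probabilities of 0.1 and 0.5, and each driver updates a 50/50 prior using two years of personal accident history. A minimal sketch:

p_good, p_bad = 0.10, 0.50     # assumed per-year accident probability by driver type
years = 2

clean_good = (1 - p_good) ** years      # 0.81: chance a good driver stays accident-free
clean_bad  = (1 - p_bad) ** years       # 0.25: chance a bad driver stays accident-free

x = 0.5 * clean_good + 0.5 * clean_bad  # fraction of all drivers with a clean record: 0.53
q = clean_good * 0.5 / (clean_good * 0.5 + clean_bad * 0.5)  # their posterior of being "good": ~0.76

print(f"x = {x:.2f}, q = {q:.2f}, x*q = {x*q:.2f}")   # x*q = 0.41 <= 1/2, as the theorem requires

Here 53% of drivers, each reasoning impeccably from her own record, believe with about 76% confidence that they are better than average, yet the Benoit/Dubra bound xq <= 1/2 still holds. No bias required.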

A 2006 paper, Optimal Predictions in Everyday Cognition, by Thomas L. Griffiths and Joshua B. Tenenbaum warrants special attention. It is the best executed study I’ve ever seen in this field, and its findings are astounding – in a good way. They asked subjects to predict the duration or extent of common phenomena such as human lifespans, movie run times, and the box office gross of movies. They then compared the predictions given by participants with calculations from an optimal Bayesian model. They found that, as long as subjects had some everyday experience with the phenomena being predicted (like box office gross, unlike the reign times of Egyptian pharaohs), people predict extremely well.

The results of Griffiths and Tenenbaum showed people to be very competent intuitive Bayesians. Even more interesting, people’s implicit beliefs about data distributions, be they Gaussian (birth weights), Erlang (call-center hold times), or power-law (length of poems), were very consistent with real-world statistics, as was hinted at in The Adaptive Character of Thought.
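To give a flavor of the optimal model people were compared against, here is a minimal sketch under one illustrative assumption (mine, not the paper’s full model): a power-law prior over total durations and an encounter that is equally likely at any point within that duration.

def predict_total(t_so_far, gamma=1.0):
    # With prior p(t_total) ~ t_total**(-gamma) and likelihood 1/t_total for
    # 0 < t_so_far < t_total, the posterior is ~ t_total**(-(gamma + 1)) on
    # [t_so_far, infinity); its median solves (t_so_far / t_total)**gamma = 1/2.
    return t_so_far * 2 ** (1 / gamma)

print(predict_total(100))              # 200.0: gamma = 1 is the scale-invariant "guess double" rule
print(predict_total(100, gamma=2.0))   # ~141: a steeper prior pulls the guess in

The gamma = 1 case is the “know nothing but the age” rule that reappears in the Gott post below.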

Looking at the popular material judging people to be lousy Bayesians steeped in bias and systematic error, and at far less popular material like that from Griffiths/Tenenbaum, Benoit/Dubra and Anderson, makes me think several phenomena are occurring. To start, as noted in previous posts, those dedicated to uncovering bias (e.g., Kahneman, Ariely) strongly prefer confirming evidence over disconfirming evidence. This bias bias manifests itself both as ignoring cases where humans are good Bayesians reaching right conclusions (as in Griffiths/Tenenbaum and Anderson) and as a failure to grant that wrong conclusions don’t necessarily mean bad reasoning (the auto-driver example and the Benoit/Dubra cases).

Further, the pop-science presentation of human bias (Ariely TED talks, e.g.) makes newcomers to the topic feel like they’ve received a privileged view into secret knowledge. This gives the bias meme much stronger legs than the idea that humans are actually amazingly good intuitive Bayesians in most cases. As John Stuart Mill noted 200 years ago, those who despair when others hope are admired as sages while optimists are dismissed as fools. The best, most rigorous analyses in this realm, however, rest strongly with the optimists.

 

1 Comment

Daniel Kahneman’s Bias Bias

(3rd post on rational behavior of people too hastily judged irrational. See first and second.)

Daniel Kahneman has made great efforts to move psychology in the direction of science, particularly with his pleas for attention to replicability after research fraud around the priming effect came to light. Yet in Thinking, Fast and Slow Kahneman still seems to draw some broad conclusions from a thin layer of evidentiary icing upon a thick core of pre-formed theory. He concludes that people are bad intuitive Bayesians through flawed methodology and hypotheticals set up so that his experimental subjects can’t win. Like many in the field of behavioral economics, he’s inclined to find bias and irrational behavior in situations better explained by the subjects’ simply lacking complete information.

Like Richard Thaler and Dan Ariely, Kahneman sees bias as something deeply ingrained and hard-coded, programming that cannot be unlearned. He associates most innate bias with what he calls System 1, our intuitive, fast-thinking selves. “When called on to judge probability,” Kahneman says, “people actually judge something else and believe they have judged probability.” He agrees with Thaler, who finds that “our ability to de-bias people is quite limited.”

But who is the “we” (“our” in that quote), and how is it that “they” (Thaler, Ariely and Kahneman) are sufficiently unbiased to make this judgment? Are those born without the bias gene somehow drawn to the field of psychology, or can a few souls break free through sheer will? If behavioral economists somehow clawed their way out of the pit of bias, can they not throw down a rope for the rest of us?

Take Kahneman’s example of the theater tickets. He compares two situations:

A. A woman has bought two $80 tickets to the theater. When she arrives at the theater, she opens her wallet and discovers that the tickets are missing. $80 tickets are still available at the box office. Will she buy two more tickets to see the play?

B. A woman goes to the theater, intending to buy two tickets that cost $80 each. She arrives at the theater, opens her wallet, and discovers to her dismay that the $160 with which she was going to make the purchase is missing. $80 tickets are still available at the box office. She has a credit card. Will she buy the tickets and just charge them?

Kahneman says that the sunk-cost fallacy, a mental-accounting fallacy, and the framing effect account for the fact that many people view these two situations differently. Cases A and B, he says, are functionally equivalent.

Really? Finding that $160 is missing from a wallet would cause most people to say, “Darn, where did I misplace that money?” Surely no pickpocket removed the cash and stealthily returned the wallet to her purse. So the cost is unarguably sunk in case A (the tickets were bought and are gone), but reasonable doubt exists in case B: she probably left the cash at home. As with philosophy, many problems in psychology boil down to semantics. And as with the trolley-problem variants, the artificiality of the problem statement is a key factor in the perceived irrationality of subjects’ responses.

By framing effect, Kahneman means that people’s choices are influenced by whether options are presented with positive or negative connotations. Why is this a bias? The subject has assumed that some level of information is embedded in the framer’s problem statement. If the psychologist judges that the subject has given this information too much weight, we might consider demystifying the framing effect by rebranding it the gullibility effect. But at that point it makes sense to ask whether framing, in a broader sense, is at work in the thought problems themselves. In presenting such problems and hypothetical situations to subjects, the framers imply a degree of credibility that is then used against those subjects: they are judged irrational for accepting the conditions stipulated in the problem statement.

Bayesian philosophy is based on the idea of using a specific rule set for updating a “prior” (meaning prior belief – the degree of credence assigned to a claim or proposition) on the basis of new evidence. A Bayesian would interpret the framing effect, and the related biases Kahneman calls anchoring and priming, as either a logic error in processing the new evidence or a judgment error in the formation of an initial prior. The latter – how we establish initial priors – is probably the most enduring criticism of Bayesian reasoning. More on that issue later, but a Bayesian would say that Kahneman’s subjects need training in the use of uninformative and initial priors. Humans have proved quite trainable in this matter, contrary to the behavioral economists’ conclusion that we are hopelessly bound to innate bias.
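One way to make that concrete: if a subject treats the framer’s number as a weak but honest hint about an unfamiliar quantity, a textbook normal-normal update reproduces the “anchoring” pattern with no innate bias involved. A minimal sketch with made-up numbers:

def posterior_mean(prior_mean, prior_sd, hint, hint_sd):
    # Precision-weighted blend of a vague prior and a noisy hint.
    w = prior_sd**2 / (prior_sd**2 + hint_sd**2)   # weight given to the hint
    return prior_mean + w * (hint - prior_mean)

# Prior belief about some unfamiliar quantity: mean 300, sd 150; the hint is
# treated as a signal with noise sd 300.
print(posterior_mean(300, 150, 1000, 300))  # 440.0: pulled toward a high hint
print(posterior_mean(300, 150, 100, 300))   # 260.0: pulled toward a low hint

Different hints move the rational estimate in the hint’s direction, which is exactly what the anchoring experiments measure.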

One example Kahneman uses to show the framing effect presents different anchors to two separate test groups:

Group 1: Is the height of the tallest redwood more or less than 1200 feet? What is your best guess for the height of the tallest redwood?

Group 2: Is the height of the tallest redwood more or less than 120 feet? What is your best guess for the height of the tallest redwood?

Group 1’s average estimate was 844 feet; Group 2’s was 282 feet. The difference between the two anchors is 1,080 feet (1,200 – 120), and the difference between the two groups’ mean estimates was 562 feet. Kahneman defines the anchoring index as the ratio of the difference between mean estimates to the difference between anchors, and he uses this index to measure the robustness of the effect. He rules out the possibility that subjects take anchors to be informative, saying that obviously random anchors can be just as effective, and cites a 50% anchoring index when German judges rolled loaded dice (allowing only values of 3 or 9 to come up) before sentencing a shoplifter (a hypothetical shoplifter, of course). Kahneman reports that judges rolling a 3 gave 5-month sentences while those rolling a 9 assigned the shoplifter an 8-month sentence (index = 50%).
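In code, using the numbers quoted above:

def anchoring_index(mean_high, mean_low, anchor_high, anchor_low):
    # Kahneman's measure: difference in mean estimates over difference in anchors.
    return (mean_high - mean_low) / (anchor_high - anchor_low)

print(anchoring_index(844, 282, 1200, 120))  # ~0.52 for the redwood groups
print(anchoring_index(8, 5, 9, 3))           # 0.50 for the dice-rolling judges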

But the actual study (Englich et al.) cited by Kahneman has some curious aspects, beyond the fact that it was entirely hypothetical. The judges found the fictional case briefs realistic, but they were not judging from the bench; they were working a thought problem. Englich’s Study 3 (the one Kahneman cites) shows that the standard deviation in sentences was relatively large compared to the difference between the sentences assigned by the two groups. More curious is a comparison of Englich’s Study 2 with the Study 3 Kahneman describes in Fast and Slow. Study 2 did not involve throwing dice to create an anchor. Its participants were only told that the prosecutor was demanding either a 3-month or a 9-month sentence, those terms not having originated in any judicial expertise. In Study 2, the difference between the mean sentences from judges who received the two anchors was only two months (anchoring index = 33%).

Studies 2 and 3 therefore showed a roughly 50% higher anchoring index (50% vs. 33%) for an explicitly random anchor (clearly known to be random by participants) than for an anchor understood by participants to be minimally informative. This suggests either that subjects regard pure chance as more useful than potentially relevant information, or that something is wrong with the experiment, or that something is wrong with Kahneman’s inferences from the evidence. I’ll suggest that the last two are at work, and that Kahneman fails to see that he is preferentially selecting confirming evidence over disconfirming evidence because he assumed his model of innate human bias was true before he examined the evidence. That implies a much older, more basic fallacy might be at work: begging the question, where an argument’s premise assumes the truth of the conclusion.

That fallacy is not an innate bias, however. It’s a rhetorical sin that goes way back. It is eminently curable. Aristotle wrote of it often and committed it slightly less often. The sciences quickly began to learn the antidote – sometimes called the scientific method – during the Enlightenment. Well, some quicker than others.


Leave a comment

A Bayesian folly of J Richard Gott

Don’t get me wrong. J Richard Gott is one of the coolest people alive. Gott does astrophysics at Princeton and makes a good argument that time travel is indeed possible via cosmic strings. He’s likely way smarter than I, and he’s from down home. But I find big holes in his Copernicus Method, for which he first achieved fame.

Gott conceived his Copernicus Method for estimating the lifetime of any phenomenon when he visited the Berlin Wall in 1969. Wondering how long it would stand, Gott figured that, assuming there was nothing special about his visit, a best guess was that he happened upon the wall 50% of the way through its lifetime. Gott saw this as an application of the Copernican principle: nothing is special about our particular place (or time) in the universe. As Gott saw it, the wall would most likely come down eight years later (in 1977), since it had been standing for eight years in 1969. That’s not exactly how Gott did the math, but it’s the gist of it.
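For readers who want that gist in numbers, here is a sketch of the delta-t reasoning (my simplification, not Gott’s exact derivation): if your visit lands at a uniformly random fraction r of the total lifetime, the remaining lifetime is the past lifetime times (1 – r)/r.

def remaining_years(past_years, r):
    # r is the assumed (uniformly random) fraction of the lifetime already elapsed.
    return past_years * (1 - r) / r

past = 8  # the wall's age, in years, at Gott's 1969 visit
print(remaining_years(past, 0.50))   # 8.0 more years: the median guess, i.e., ~1977
print(remaining_years(past, 0.975))  # ~0.2 years: lower end of the 95% band
print(remaining_years(past, 0.025))  # ~312 years: upper end of the 95% band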

I have my doubts about the Copernican principle – in applications from cosmology to social theory – but that’s not my beef with Gott’s judgment of the wall. Had Gott thrown a blindfolded dart at a world map to select his travel destination, I’d buy it. But anyone who woke up at the Berlin Wall in 1969 did not arrive there by a random process. The wall was certainly among the 1,000 most interesting spots on earth in 1969. Chance alone didn’t lead him there. The wall was still news. Gott should have concluded that he was seeing the wall somewhere in the first half of its life, not at its midpoint.

Finding yourself at the grand opening of a Brooklyn pizza shop, it’s downright cruel to predict that it will last one more day. That’s a misapplication of the Copernican principle, unless you ended up there by rolling dice to pick the time you’d parachute in from the space station. More likely you saw Vini’s post on Facebook last night.

Gott’s calculation boils down to Bayes Theorem applied to a power-law distribution with an uninformative prior. That is, it assumes you have zero relevant knowledge. But from a Bayesian perspective, few situations warrant an uninformative prior. Surely he knew something of the wall and its peer group. Walls erected by totalitarian world powers tend to endure (the Great Wall of China, Hadrian’s Wall, the Aurelian Wall), but mean wall age isn’t the key piece of information. The distribution of wall ages is. And though I don’t think he stated it explicitly, Gott clearly judged wall longevity to be scale-invariant. So the math is good, provided he had no knowledge of this particular wall in Berlin.

But he did. He knew its provenance; it was Soviet. Believing the wall would last eight more years was the same as believing the Soviet Union would last eight more years. So without any prior expectation about the Soviet Union, Gott should have judged that the wall would come down when the USSR came down. Running that question through the Copernicus Method would have yielded the wall falling in the year 2016, not 1977 (i.e., 1969 + 47, the age of the USSR in 1969). But unless Gott was less informed than most, his prior expectation about the Soviet Union wasn’t uninformative either. The regime showed no signs of weakening in 1969, and no one, including George Kennan, Richard Pipes, and Gorbachev’s pals, saw its end coming. Given the power-law distribution, some time well after 2016 would have been the proper Bayesian estimate.

With any prior knowledge at all, the Copernican principle does not apply. Gott’s prediction was off by only 12 years. He got lucky.

Leave a comment

Representative Omar’s arithmetic

Women can’t do math. Hypatia of Alexandria and Émilie du Châtelet notwithstanding, this was asserted for thousands of years by men who controlled access to education. With men in charge it was a self-fulfilling prophecy. Women now make up the majority of college students and earn about 40% of math degrees. That’s progress.

Last week Marco Rubio caught hell for taking Ilhan Omar’s statement about double standards and unfair terrorism risk assessment out of context. The quoted fragment was: “I would say our country should be more fearful of white men across our country because they are actually causing most of the deaths within this country…”

Most news coverage of the Rubio story (e.g., Vox) notes that Omar did not mean that everyone should be afraid of white men as a group, but that, e.g., “violence by right-wing extremists, who are overwhelmingly white and male, really is a bigger problem in the United States today than jihadism.”

Let’s look at the numbers. Wikipedia, following the curious date-range choice of the US GAO, notes: “of the 85 violent extremist incidents that resulted in death since September 12, 2001, far-right violent extremist groups were responsible for 62 (73 percent) while radical Islamist violent extremists were responsible for 23 (27 percent).” Note that those are incident counts, not death counts. The fatality counts were 106 (47%) for white extremists and 119 (53%) for jihadists. Counting fatalities instead of incidents reverses the sense of the numbers.

Pushing the terminus post quem back one day adds the 2,977 9-11 fatalities to the category of deaths from jihadists. That makes 3% of fatalities from right wing extremists and 97% from radical Islamist extremists. Pushing the start date further back to 1/1/1990, again using Wikipedia numbers, would include the Oklahoma City bombing (white extremists, 168 dead), nine deaths from jihadists, and 14 other deaths from white wackos, including two radical Christian antisemites and professor Ted Kaczynski. So the numbers since 1990 show 92% of US terrorism deaths from jihadists and 8% from white extremists.
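The percentage splits above follow directly from the quoted counts:

# Fatality counts as quoted above (Wikipedia/GAO figures cited in the text).
ranges = {
    "since 9/12/2001": {"white extremists": 106,            "jihadists": 119},
    "including 9/11":  {"white extremists": 106,            "jihadists": 119 + 2977},
    "since 1/1/1990":  {"white extremists": 106 + 168 + 14, "jihadists": 119 + 2977 + 9},
}
for label, counts in ranges.items():
    total = sum(counts.values())
    print(label, {group: f"{n / total:.0%}" for group, n in counts.items()})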

Barring any ridiculous adverse selection of date range (in the 3rd week of April, 1995, 100% of US terrorism deaths involved white extremists), Omar is very, very wrong in her data. The jihadist death toll dwarfs that from white extremists.

But that’s not the most egregious error in her logic – and that of most politicians armed with numbers and a cause. The flagrant abuse of data is what Kahneman and Tversky termed base-rate neglect. Omar, in discussing profiling (sampling a population subset), is arguing about frequencies while citing raw incident counts. The base rate (an informative prior, to Bayesians) is crucial. Muslims make up about one percent of the US population, so even if white extremists caused most terrorism deaths, as she claimed, there would have to be something like a hundred times more deaths from white men (terrorists of all flavors are overwhelmingly male) than from Muslims for her profiling argument to hold.
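To see what the base rate does, compare per-capita rates rather than raw counts. A rough sketch, with the population shares treated as loudly labeled assumptions (roughly 1% Muslim, roughly 30% white male; the exact figures matter less than the order of magnitude):

US_POP = 330e6  # rough US population, assumed for illustration

def per_capita_rate(deaths, population_share):
    return deaths / (population_share * US_POP)

# Death counts from the GAO date range quoted above (9/11 excluded).
r_jihadist = per_capita_rate(119, 0.01)   # assumed ~1% Muslim population share
r_white    = per_capita_rate(106, 0.30)   # assumed ~30% white-male population share
print(round(r_jihadist / r_white))        # ~34: similar raw counts, very different per-capita rates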

The press overwhelmingly worked Rubio over for his vicious smear. 38 of the first 40 Google search results on “Omar Rubio” favored Omar. One favored Rubio and one was an IMDb link to an actor named Omar Rubio. None of the news pieces, including the one friendly to Rubio, mentioned Omar’s bad facts (bad data) or her bad analysis thereof (bad math). Even if she were right about the data – and she is terribly wrong – she’d still be wrong about the statistics.

I disagree with Trump about Omar. She should not go back to Somalia. She should go back to school.

6 Comments

Fallacies for Physicians and Prosecutors

For business reasons I’ve started a separate blog – “on risk of” – for topics involving risk analysis, probability, aerospace and process engineering, and the like.

Tonight I wrote a post there on logical fallacies that come up in medicine and in court. One example is confusing the probability of a match between the characteristics of a perpetrator, as reported by witnesses, and those of a specific suspect with the probability of a match with anyone in a large population – particularly when the prosecution presents the probability of a match as the probability that the defendant is not guilty. I also look at cases involving confusion between the conditional probability of A given B and the probability of B given A, e.g., the chance of having the disease given a positive test result vs. the chance of a positive test result given the disease – classic Bayes Theorem stuff.
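A quick numerical version of that second confusion, with illustrative values for a screening test:

# A test with 99% sensitivity, 95% specificity, and a 1-in-1,000 base rate.
sensitivity, specificity, base_rate = 0.99, 0.95, 0.001
p_positive = sensitivity * base_rate + (1 - specificity) * (1 - base_rate)
p_disease_given_positive = sensitivity * base_rate / p_positive
print(round(p_disease_given_positive, 3))  # ~0.019, versus the 0.99 chance of a positive given the disease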

Please join me at onriskof.com. Thanks for your interest.

 

Leave a comment

My Trouble with Bayes

In past consulting work I’ve wrestled with subjective probability values derived from expert opinion. Subjective probability is an interpretation of probability based on degree of belief (i.e., hypothetical willingness to bet on a position) as opposed to a value derived from measured frequencies of occurrence (related posts: Belief in Probability, More Philosophy for Engineers). Subjective probability is of interest when failure data is sparse or nonexistent, as was the data on catastrophic loss of a space shuttle due to seal failure. Bayesianism is one form of inductive logic aimed at refining subjective beliefs based on Bayes Theorem and the idea of rational coherence of beliefs. A NASA handbook explains Bayesian inference as the process of obtaining a conclusion based on evidence: “Information about a hypothesis beyond the observable empirical data about that hypothesis is included in the inference.” Easier said than done, for reasons listed below.

Bayes Theorem itself is uncontroversial. It is a mathematical expression relating the probability of A given that B is true to the probability of B given that A is true and the individual probabilities of A and B:

P(A|B) = P(B|A) x P(A) / P(B)

If we’re trying to confirm a hypothesis (H) based on evidence (E), we can substitute H and E for A and B:

P(H|E) = P(E|H) x P(H) / P(E)

To be rationally coherent, you’re not allowed to believe the probability of heads to be .6 while believing the probability of tails to be .5; the chances of all possible outcomes must sum to exactly one. Further, for Bayesians, the logical coherence just mentioned (i.e., avoidance of Dutch book arguments) must also hold across time (diachronic coherence), such that once new evidence E bearing on a hypothesis H comes in, your new probability for H should equal your prior conditional probability for H given E.
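The Dutch book behind that requirement is easy to spell out. With credences of .6 for heads and .5 for tails, you should regard $0.60 as a fair price for a bet paying $1 on heads, and $0.50 as fair for $1 on tails; buy both and you lose ten cents no matter how the coin lands:

price_heads, price_tails = 0.60, 0.50   # "fair" prices implied by the incoherent credences
for outcome in ("heads", "tails"):
    payout = 1.0  # exactly one of the two bets pays, whichever way the coin lands
    print(outcome, round(payout - (price_heads + price_tails), 2))  # -0.1 both times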

Plenty of good sources explain Bayesian epistemology and practice far better than I could do here. Bayesianism is controversial in science and engineering circles, for some good reasons. Its critics refer to it as a religion. This is unfair. Bayesianism is, however, like most religions, a belief system. My concern in this post is the problems with Bayesianism that I personally encounter in risk analyses. Adherents might rightly claim that the problems I encounter with Bayes stem from poor implementation rather than from flaws in the underlying program. Good horse, bad jockey? Perhaps.

1. Subjectively objective
Bayesianism is an interesting mix of subjectivity and objectivity. It imposes no constraints on the subject of belief and very few constraints on prior probability values. Hypothesis confirmation, for a Bayesian, is inherently quantitative, but initial hypothesis probabilities and the evaluation of evidence are purely subjective. For Bayesians, evidence E confirms or disconfirms hypothesis H only after we establish how probable H was in the first place. That is, we start with a prior probability for H. After the evidence, confirmation has occurred if the probability of H given E is higher than the prior probability of H, i.e., P(H|E) > P(H). Conversely, E disconfirms H when P(H|E) < P(H). These equations and their math leave business executives impressed with the rigor of objective calculation while directing their attention away from the subjectivity of both the hypothesis and its initial prior.

2. Rational formulation of the prior
Problem 2 follows from the above. Paranoid, crackpot hypotheses can still maintain perfect probabilistic coherence. Excluding the crackpots, rational thinkers – more accurately, those with whom we agree – may still have an extremely difficult time distilling their beliefs, observations, and knowledge of the world into a prior.

3. Conditionalization and old evidence
This is on everyone’s short list of problems with Bayes. In the simplest interpretation of Bayes, old evidence has zero confirming power. If evidence E was on the books long ago and it suddenly comes to light that H entails E, no change in the probability of H follows. This seems odd – to most outsiders anyway. The problem gives rise to a game in which we are expected to pretend we never knew about E and then judge how surprising (and therefore confirming) E would have been for H had we not known about it. As with the general matter of maintaining the logical coherence required by the Bayesian program, it is extremely difficult to detach your knowledge of E from the rest of what you know about the world. In engineering problem solving, discovering that H implies E is very common.

4. Equating increased probability with hypothesis confirmation
My having once met Hillary Clinton arguably increases the probability that I may someday be her running mate, but few would agree that it is confirming evidence that I will be. See Hempel’s raven paradox.

5. Stubborn stains in the priors
Bayesians, often citing success in the business of establishing and adjusting insurance premiums, report that the initial subjectivity (discussed in problem 1, above) fades away as evidence accumulates. They call this the washing out of priors. The frequentist might respond that with sufficient evidence your belief becomes irrelevant: with historical data (i.e., abundant evidence) you can calculate the probability P of an unwanted event in a frequentist way as P = 1 – e^(–RT), or roughly P ≈ RT for small products of failure rate R and exposure time T (the exponential distribution). When our ability to find new evidence is limited, i.e., when modeling unprecedented failures, the prior does not get washed out.
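For reference, the frequentist calculation just mentioned, with illustrative numbers:

import math

R, T = 1e-4, 160              # assumed failure rate (per hour) and exposure time (hours)
print(1 - math.exp(-R * T))   # 0.01587...: exponential-model probability of at least one failure
print(R * T)                  # 0.016: the small-RT approximation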

6. The catch-all hypothesis
The denominator of Bayes Theorem, P(E), must in practice be calculated as the probability of the evidence given the hypothesis plus the probability of the evidence given the negation of the hypothesis, each weighted by the prior probability of the corresponding hypothesis:

P(E) = [P(E|H) x P(H)] + [P(E|~H) x P(~H)]
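In miniature, with illustrative numbers and ~H treated as a single catch-all alternative:

def posterior(p_h, p_e_given_h, p_e_given_not_h):
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)   # the denominator above
    return p_e_given_h * p_h / p_e

# A hypothesis with prior 0.1, and evidence five times as likely under H as
# under everything lumped into ~H.
print(round(posterior(0.10, 0.50, 0.10), 2))   # 0.36 > 0.10, so E "confirms" H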

But ~H (“not H”) is not itself a valid hypothesis. It is a family of hypotheses likely containing what Donald Rumsfeld famously called unknown unknowns. Thus calculating the denominator P(E) forces you to pretend you’ve considered all contributors to ~H. So Bayesians can be lured into a state of false choice. The famous example of such a false choice in the history of science is Newton’s particle theory of light vs. Huygens’ wave theory of light. Hint: they are both wrong.

7. Deference to the loudmouth
This problem is related to number 1 above, but has a much more corporate, organizational component. It can’t be blamed on Bayesianism but nevertheless plagues Bayesian implementations within teams. In the group formulation of any subjective probability, normal corporate dynamics govern the outcome. The most senior or deepest-voiced actor in the room drives the assignments of subjective probability. Social influence rules, and the wisdom of the crowd succumbs to a consensus-building exercise precisely where consensus is unwanted. Seidenfeld, Kadane and Schervish begin “On the Shared Preferences of Two Bayesian Decision Makers” with the scholarly observation that an outstanding challenge for Bayesian decision theory is to extend its norms of rationality from individuals to groups. Their paper might have been illustrated with the famous photo of the exploding Challenger space shuttle. Bayesianism’s tolerance of subjective probabilities, combined with organizational dynamics and the shyness of engineers, can be a recipe for disaster of the Challenger sort.

All opinions welcome.


1 Comment

More Philosophy for Engineers

In a post on Richard Feynman and philosophy of science, I suggested that engineers would benefit from a class in philosophy of science. A student recently asked if I meant to say that a course in philosophy would make engineers better at engineering – or better philosophers. Better engineers, I said.

Here’s an example from my recent work as an engineer that drives the point home.

I was reviewing an FMEA (Failure Mode and Effects Analysis) prepared by a high-priced consultancy and encountered many cases where a critical failure mode had been deemed highly improbable on the basis that the FMEA was for a mature system with no known failures.

How many hours of operation has this system actually seen, I asked. The response indicated about 10 thousand hours total.

I said that on that basis we could assume a failure rate of about one per 10,001 hours. The direct cost of the failure was about $1.5 million. Thus the “expected value” (or “mathematical expectation” – the probabilistic cost of the loss) of this failure mode in a 160-hour mission is about $24,000, or roughly $300,000 per year (excluding any secondary effects such as damaged reputation). With that number in mind, I asked the client if they wanted to consider further mitigation by adding monitoring circuitry.
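The arithmetic, for the record (the roughly 2,000 operating hours per year is my assumption, chosen to match the annual figure):

rate = 1 / 10_001            # assumed failures per operating hour
cost = 1_500_000             # direct cost of one failure, in dollars
print(round(rate * 160 * cost))     # ~24,000 dollars per 160-hour mission
print(round(rate * 2_000 * cost))   # ~300,000 dollars per year at ~2,000 hours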

I was challenged on the failure rate I used. It was, after all, a mature, ten year old system with no recorded failures of this type.

Here’s where the analytic philosophy course those consultants never took would have been useful.

You simply cannot justify calling a failure mode extremely rare based on evidence that it is at least somewhat rare. All unique events – like the massive rotor failure that took out all three hydraulic systems of a DC-10 in Sioux City – were very rare before they happened.

The authors of the FMEA I was reviewing were using unjustifiable inductive reasoning. Philosopher David Hume debugged this thoroughly in his 1739 A Treatise of Human Nature.

Hume concluded that there simply is no rational or deductive basis for induction, the belief that the future will be like the past.

Hume understood that, despite the lack of justification for induction, betting against the sun rising tomorrow was not a good strategy either. But this is a matter of pragmatism, not of rationality. A bet against the sunrise would mean getting behind counter-induction; and there’s no rational justification for that either.

In the case of the failure mode not yet observed, however, there is ample justification for counter-induction. All mechanical parts and all human operations necessarily have nonzero failure or error rates. In the world of failure modeling, “known to be pretty good” does not support “probably extremely good,” no matter how natural the step between them feels.

Hume’s problem of induction, despite the efforts of Immanuel Kant and the McKinsey consulting firm, has not been solved.

A fabulously entertaining – in my view – expression of the problem of induction was given by philosopher Carl Hempel in 1965.

Hempel observed that we tend to take each new observation of a black crow as incrementally supporting the inductive conclusion that all crows are black. Deductive logic tells us that if a conditional statement is true, its contrapositive is also true, since the statement and its contrapositive are logically equivalent. Thus if all crows are black then all non-black things are non-crow.

It then follows that if each observation of black crows is evidence that all crows are black (compare: each observation of no failure is evidence that no failure will occur), then each observation of a non-black non-crow is also evidence that all crows are black.

Following this line, my red shirt is confirming evidence for the proposition that all crows are black. It’s a hard argument to oppose, but it simply does not “feel” right to most people.

Many try to salvage the situation by suggesting that observing that my shirt is red is in fact evidence that all crows are black, but provides only unimaginably small support to that proposition.

But pushing the thing just a bit further destroys even this attempt at rescuing induction from the clutches of analysis.

If my red shirt gives a tiny bit of evidence that all crows are black, it then also gives equal support to the proposition that all crows are white. After all, my red shirt is a non-white non-crow.


9 Comments