How the most common prescription doesn’t cure bias

One of the main messages of most psychological research into bias, and the hundreds of popular books that have followed, is that most people just aren’t intuitively good at thinking statistically. We make mistakes about the influence of sample size, and about representativeness and likelihood. We fail to understand regression to the mean, and often make mistakes about causation.

However, suggesting statistical competence as a universal cure can lead to new sets of problems. An emphasis on statistical knowledge and tests can introduce its own blind spots, some of which can be devastating.

The discipline of psychology is itself going through a “crisis” over the reproducibility of results, as this Bloomberg View article from the other week discusses. One recent paper found that only 39 out of a sample of 100 psychological experiments could be replicated. That would be disastrous for the position of psychology as a science: if results cannot be replicated by other teams, their validity must be in doubt. The p-value test of statistical significance is overused as a marketing tool or a way to get published. Naturally, there are some vigorous rebuttals in progress.

It is, however, a problem for other disciplines as well, which suggests the issues are genuine, deeper, and more pervasive. John Ioannidis has been arguing the same about medical research for some time.

He’s what’s known as a meta-researcher, and he’s become one of the world’s foremost experts on the credibility of medical research. He and his team have shown, again and again, and in many different ways, that much of what biomedical researchers conclude in published studies—conclusions that doctors keep in mind when they prescribe antibiotics or blood-pressure medication, or when they advise us to consume more fiber or less meat, or when they recommend surgery for heart disease or back pain—is misleading, exaggerated, and often flat-out wrong. He charges that as much as 90 percent of the published medical information that doctors rely on is flawed.

The same applies to economics, where many of the most prominent academics apparently do not understand some of the statistical measures they use. A paper (admittedly from the 1990s) found that 70% of the empirical papers in the American Economic Review, the most prestigious journal in the field, “did not distinguish statistical significance from economic, policy, or scientific significance.” The conclusion:

We would not assert that every economist misunderstands statistical significance, only that most do, and these are some of the best economic scientists.
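The distinction is easy to make concrete. Here is a minimal sketch (the effect size, noise level, and sample size are invented for illustration): with a large enough sample, an economically trivial effect produces a vanishingly small p-value, so statistical significance on its own says nothing about whether the effect matters.

```python
import math

def z_test(effect, sd, n):
    """Two-sided p-value for a one-sample z-test of 'true mean equals zero'."""
    z = effect / (sd / math.sqrt(n))          # standard error shrinks with n
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal tail probability
    return z, p

# An economically trivial effect (0.01 units against noise of sd 1.0)...
z, p = z_test(effect=0.01, sd=1.0, n=1_000_000)
# ...is overwhelmingly "statistically significant" at this scale.
print(f"z = {z:.1f}, p = {p:.1e}")  # prints z = 10.0, p = 1.5e-23
```

The p-value measures only how surprising the data would be under the null hypothesis; whether a 0.01-unit effect is worth a policy change is a separate, economic question.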

Of course, the problems and flaws in statistical models in the lead-up to the great crash of 2008 are also multiple and by now famous. If bank management and traders do not understand the “black box” models they are using, and their limits, tears and pain are the usual result.

The takeaway is not to impugn statistics. It is that people are nonetheless very good at making a whole set of different mistakes when they tidy up one aspect of their approach. More statistical rigor can also mean more blind spots to other issues or considerations, and use of technique in isolation from common sense.

The more technically proficient and rigorous you believe you are, often the more vulnerable you become to wishful thinking or blind spots. Technicians often have a remarkable ability to miss the forest for the trees, or twigs on the trees.

It also means there are (even more) grounds for mild skepticism about the value of many academic studies to practitioners.

By | March 30, 2016|Big Data, Economics, Expertise, Psychology, Quants and Models|Comments Off on How the most common prescription doesn’t cure bias

“Everyone was Wrong”

“From the New Yorker to FiveThirtyEight, outlets across the spectrum failed to grasp the Trump phenomenon.” – Politico


It’s the morning after Super Tuesday, when Trump “overwhelmed his GOP rivals”.

The most comprehensive losers (after Rubio) were media pundits and columnists, with their decades of experience and supposed ability to spot trends developing. And political reporters, with their primary sources and conversations with campaigns in late-night bars. And statisticians with models predicting politics. And anyone in business or markets or diplomacy or politics who was naive enough to believe confident predictions from any of the experts.

Politico notes how the journalistic eminences at the New Yorker and the Atlantic got it wrong over the last year.

But so did the quantitative people.

Those two mandarins weren’t alone in dismissing Trump’s chances. Washington Post blogger Chris Cillizza wrote in July that “Donald Trump is not going to be the Republican presidential nominee in 2016.” And numbers guru Nate Silver told readers as recently as November to “stop freaking out” about Trump’s poll numbers.

Of course it’s all too easy to spot mistaken predictions after the fact. But the same pattern has been arising after just about every big event in recent years. People make overconfident predictions, based on expertise, or primary sources, or big data, and often wishful thinking about what they want to see happen. They project an insidery air of secret confidences or confessions from the campaigns. Or disinterested quantitative rigor.

Then they mostly go horribly wrong. Maybe one or two through sheer luck get it right – and then get it even more wrong the next time. Predictions may work temporarily so long as nothing unexpected happens or nothing changes in any substantive way. But that means the forecasts turn out to be worthless just when you need them most.

The point? You remember the old quote (allegedly from Einstein) defining insanity: repeating the same thing over and over and expecting a different result.

Markets and business and political systems are too complex to predict. That means a different strategy is needed. But instead there are immense pressures to keep doing the same things which don’t work in media, and markets, and business. Over and over and over again.

So recognize and understand the pressures. And get around them. Use them to your advantage. Don’t be naive.


By | March 2, 2016|Adaptation, Expertise, Forecasting, Politics, Quants and Models|Comments Off on “Everyone was Wrong”

Goldman gets it wrong

One interesting thing about markets (and politics and foreign policy) is that predictions from gurus and self-important experts have a remarkable tendency to turn out wrong. Take this Bloomberg report:

Goldman Sachs to clients: whoops. Just six weeks into 2016, the New York-based bank has abandoned five of six recommended top trades for the year.

The dollar versus a basket of euro and yen; yields on Italian bonds versus their German counterparts; U.S. inflation expectations: Goldman Sachs Group Inc. was wrong on all that and more.

In fact, the best investors tend not to show significantly better predictive ability than anyone else either. The brutal truth is that no one is reliably good at predicting short-term market movements, although smarter players are more alert to some of the potential mistakes in their perspective.

Instead, they’re mainly better at changing their mind quickly – cutting their losses and getting out of positions that move against them. They manage their risks and survive to fight another day.  They figure out when they are wrong faster.

They don’t make huge overconfident predictions. Instead, they watch their exposure, as Nassim Nicholas Taleb argues.

Of course, it’s unlikely that Goldman was putting its own money behind these predictions to the bitter end. The successful traders at investment banks historically don’t pay much attention to their own economists and research in any case. As they see it, the research is really for the pension fund managers in Cleveland or Edinburgh. And press reporting of market swings is the lowest “dumb money” tier of all – something to bear in mind when the markets are scary like today.

By | February 11, 2016|Current Events, Expertise, Market Behavior|Comments Off on Goldman gets it wrong

“And no one saw it coming.” Again. And again.

Peggy Noonan, writing today about the state of US GOP primary race:

But really, what a year. Nobody, not the most sophisticated expert watching politics up close all his life, knew or knows what’s going to happen. Does it go to the convention? Do Mr. Trump’s roarers turn out? Does he change history?

And no one saw it coming.

But the press and TV and political and economic research firms will drown you in speculation and commentary and confident predictions. That’s yet another reason to distrust them, as I keep arguing. Instead, look for leverage, and resilience. Don’t get locked into a convenient narrative. It’s what you can do to change your own thinking and position that counts.

By | December 18, 2015|Assumptions, Expertise, Forecasting, Politics|Comments Off on “And no one saw it coming.” Again. And again.

The sun doesn’t go round the earth, after Paris

Evidence should be a fundamental part of any discussion of what to do in the wake of the Paris bombings, I said yesterday.  Do you agree with that? Instead we most often make assumptions about “what the terrorists want” or discuss things on such an abstract level (“they hate freedom”) that there’s little link to reality at all.

The trouble goes much deeper, though, because even when people use evidence (which is something to be grateful for), they cherry-pick it. It’s riddled with confirmation bias, and it mostly doesn’t prove anything at all.

Remember this in reading all the op-eds from experts on terrorism and the Middle East you’ll see in the next few weeks:  the success of expert predictions in this area is about as good as dart-throwing chimps.  Predictions from the most learned Syria and ISIS and intelligence experts are likely to be useless, just like most economic and political predictions. People can know almost everything about the issue – and still get things completely wrong.

If gathering information and evidence alone clearly isn’t enough, what do we do?

Here’s the further essential thing to grapple with: the most likely explanation or hypothesis is not the one with the most information lined up for it. It’s the one with the least information against it.

That rule is taken from Richards Heuer’s Psychology of Intelligence Analysis, and lies at the root of his method of Analysis of Competing Hypotheses.

The root of the problem is that most information can be consistent with a whole variety of explanations. You can integrate it into a number of completely different and satisfying and incompatible stories. That means the most genuinely useful information is diagnostic; that is, consistent with only one or a few explanations. It helps you choose between different explanations.
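Heuer’s rule can be sketched in a few lines of code. The hypotheses and the consistency matrix below are entirely hypothetical; the point is only the ranking criterion, which goes by the least evidence against each hypothesis rather than the most evidence for it.

```python
# Hypothetical evidence-vs-hypothesis matrix: for each hypothesis, one entry
# per piece of evidence: "C" consistent, "I" inconsistent, "N" not applicable.
matrix = {
    "H1": ["C", "C", "C", "I", "I"],   # lots of support, but twice contradicted
    "H2": ["C", "N", "C", "C", "C"],   # never contradicted
    "H3": ["C", "C", "I", "C", "N"],   # contradicted once
}

def rank_hypotheses(matrix):
    """Order hypotheses by how much evidence is inconsistent with them,
    fewest inconsistencies first - the core move in Heuer's ACH method."""
    return sorted(matrix, key=lambda h: matrix[h].count("I"))

print(rank_hypotheses(matrix))  # prints ['H2', 'H3', 'H1']
```

Note that H1 has the most consistent evidence, yet ranks last: consistent evidence is cheap, because it usually fits several stories at once, while inconsistent evidence is diagnostic.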

Think of it this way. There was plenty of seemingly obvious evidence for thousands of years that the sun went round the earth. The fact that the sun rises and sets could be read as consistent with either the sun or the earth at the center of the solar system. So that evidence doesn’t help very much. You need to find evidence that can’t be read in favor of both. (That’s another story.)

But that essential diagnostic information can be surprisingly difficult to find, especially because people rush to find facts that fit with their existing views.

What happens instead is the more information people gather, the more (over)confident they become of their point of view, regardless of the validity or reliability of the information.  They don’t think about the information. They just more or less weigh the total amount of it.

So what is needed instead is a kind of disconfirmation strategy.

Hold on, you might say. Doesn’t this mean we have to stop and think for a moment before jumping to our favorite recommendation? And isn’t that a pain which we’d rather avoid? Isn’t that uncomfortable?  Isn’t this a little austere and unglamorous compared with colorful and vivid stories and breathless reporting?

Yes. Repeat: Yes. All the information and opinion and sourcing and satellite photography in the world doesn’t help you if you ignore disconfirmation. It’s a lot less painful than wasting billions of dollars and potentially thousands of lives, and failing. There are some very practical ways to do it, too.



By | November 16, 2015|Assumptions, Confirmation bias, Expertise, Security|Comments Off on The sun doesn’t go round the earth, after Paris

One way to tell you’re getting into trouble

People have a tendency to think highly confident calls are a sign of expertise and credibility. More often, it’s a sign of ignorance and naiveté. That’s one of the lessons of scientific method.

Whenever a theory appears to you as the only possible one, take this as a sign you have neither understood the theory nor the problem which it was intended to solve.

Karl Popper, Objective Knowledge: An Evolutionary Approach

The trick is not to take confidence (or prominence) as a sign of credibility, or fall in love with a particular narrative. Instead, find a way to test assumptions and find the boundary conditions.

By | February 13, 2015|Assumptions, Decisions, Expertise, Human Error|Comments Off on One way to tell you’re getting into trouble

The Right Level of Detachment

One of the keys to why decisions fail is that we often have a tendency to look for universal, across-the-board rules for what to do. Yet success most often comes from maintaining the right kind of balance between different rules or requirements. Remember, it’s usually the assumptions people make which are likely to cause them to lose money or fail.

In fact, one of the best ways to look for those blind spots and hidden assumptions is to look for the balances that have been ignored.

One of the most important of these is the right balance between looking at a situation in general and specific terms.

Take the last post about medical errors. A physician or surgeon has to be alert to the immediate details of a particular patient. You can’t ignore the specifics. You can’t assume distinctive individual characteristics are inferable from the general.

This happens a lot in applied or professional situations. Good reporters or detectives or short-term traders also often tend to focus on “just the facts, ma’am”, and get impatient with anything that sounds abstract. Their eyes glaze over at anything which is not immediate and tangible. Foreign exchange traders have traditionally often known next to nothing about the actual underlying countries or economies in a currency pair, but are very attuned to short-term market conditions and sentiment.

But the essence of expertise is also being able to recognize more general patterns. The most seminal investigation of expertise was Chase and Simon’s description of “chunking” in 1973. They investigated chess grand masters, and found that they organized the information about pieces on the board into broader, larger, more abstract units. Years of experience gave grand masters a larger “vocabulary” of such patterns, which they quickly recognized and used effectively. More recent work also finds that

Experts see and represent a problem in their domain at a deeper (more principled) level than novices; novices tend to represent the problem at a superficial level. Glaser & Chi, quoted p.50, The Cambridge Handbook of Expertise and Expert Performance

Indeed, one of the biggest reasons experts fail is because they fail to use that more general knowledge: they get too close-up and engrossed in the particular details. They show off their knowledge of irrelevant detail and “miss the forest for the trees.” They tend to believe This Time Is Different.

This is why simple linear models most often outperform experts: they weigh information in a consistent way without getting caught up in the specifics. It is also why taking what Kahneman calls the “outside view” and base rates are essential in most situations, including project management. You have to be able to step back and think about what generally happens, and that requires skill in perceiving similarities and analogies and patterns.
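That consistency is the whole trick. Here is a minimal sketch of a Dawes-style “improper” unit-weight linear model (the candidates and cue scores are invented for illustration): standardize each cue so it’s on a common scale, weight every cue equally, and pick the highest sum. No expert judgment about which cue matters most, and no room for This Time Is Different.

```python
def standardize(xs):
    """Rescale a list of cue values to mean 0, standard deviation 1."""
    mean = sum(xs) / len(xs)
    sd = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5
    return [(x - mean) / sd for x in xs]

def unit_weight_scores(rows):
    """Score each case as the equal-weight sum of its standardized cues."""
    cols = list(zip(*rows))                          # one list per cue
    zrows = zip(*[standardize(col) for col in cols]) # back to one row per case
    return [sum(case) for case in zrows]

# Three candidates rated on three hypothetical cues (e.g. test score,
# experience, interview rating) - the data are made up.
candidates = [(3, 7, 5), (9, 4, 6), (5, 5, 9)]
scores = unit_weight_scores(candidates)
print(max(range(len(scores)), key=scores.__getitem__))  # prints 2
```

The model applies the same rule to every case, which is exactly what human judges fail to do when they get engrossed in a particular case’s details.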

People’s mindsets often tend to push them to one extreme or the other. Too little abstraction and chunking? You get lost in specific facts and noise and don’t recognize patterns. You don’t represent the problem accurately, if you see it at all. You get too close, and too emotive, and too involved. On the other hand, too much detachment, and you lose “feel” and color and tacit knowledge. You become the remote head office which knows nothing about front-line conditions, or the academic who is committed to the theory rather than the evidence. You lose the ability to perceive contrary information, or see risks that do not fit within your theory.

You need the right balance of general and specific knowledge. But maintaining that kind of balance is incredibly hard, because people and organizations tend to get carried away in one direction or the other. Looking for signs of that can tell you what kind of mistaken assumptions or blind spots are most likely. Yet people hardly ever monitor that.


By | December 15, 2014|Assumptions, Base Rates, Decisions, Expertise, Outside View|Comments Off on The Right Level of Detachment

Look for the balances

One of the most intractable policy problems is that people have a strong tendency to ignore or deny any downside or side effects of their actions. Instead, they prefer to talk in terms of unitary goals, universal rights, and clear, consistent principles. Much of the educational training of elites, especially economists, inclines them to deal with generalizations. Hedgehogs are temperamentally inclined to look for single overarching explanations.

In actual practice, however,

The need to adapt to particular circumstances … runs counter to our tendency to generalize and form abstract plans of action.

Dörner is the brilliant German psychologist I mentioned last week, who studies the “logic of failure” and the typical patterns of errors people make in decisions. Many of the essentials of better decision-making come down to threading your way between two opposite errors, and so maintaining a balance.

One problem is ignoring trade-offs and incompatibilities between goals:

Contradictory goals are the rule, not the exception, in complex situations.

It is also difficult to judge the right amount of information-gathering:

We combat our uncertainty either by acting hastily on the basis of minimal information, or by gathering excessive information, which inhibits action and may even increase our uncertainty. Which of these patterns we follow depends on time pressure, or the lack of it.

It is also difficult to judge the right balance of specific versus general considerations: i.e., how unique a situation is. Experts typically overestimate the unique factors (“this time it’s different”) at the expense of the general class of outcomes, or what Kahneman calls the “outside view” (“what usually happens in these situations?”). The necessities of political rhetoric and of motivating people to take action also often require politicians to ignore or downplay conflicting goals.

So in any complicated situation, judgment consists of striking many balances. One of the most useful ways to look for blind spots is, therefore, to look for the hidden or ignored balances, and trade-offs, and conflicts. And one sign of that is, according to Dörner, that good decision-makers and problem solvers tend to use qualified language: “sometimes” versus “every time”, “frequently” instead of “constantly” or “only”.

Everything at its proper time and with proper attention to existing conditions. There is no universally applicable rule, no magic wand, that we can apply to every situation and to all the structures we find in the real world. Our job is to think of, and then do, the right things at the right times in the right way. There may be rules for accomplishing this, but the rules are local – they are to a large extent dictated by specific circumstances. And that means in turn that there are a great many rules. (p. 192)

And that means in turn that any competent or effective policy institution, like a central bank, cannot easily describe its reaction function in terms of clear consistent principles or rules. When it tries, markets and audiences are going to be confused and disappointed.


By | September 11, 2014|Base Rates, Expertise, Foxes and Hedgehogs|Comments Off on Look for the balances

Two kinds of error (part 1)

How do we explain why rigorous, formal processes can be very successful in some cases, and disastrous in others? I was asking this in reference to Henry Mintzberg’s research on the disastrous performance of formal planning. Mintzberg cites earlier research on different kinds of errors in this chart (from Mintzberg, 1994, p327).

Mintzberg Diagram


.. the analytic approach to problem solving produced the precise answer more often, but its distribution of errors was quite large. Intuition, in contrast, was less frequently precise but more consistently close. In other words, informally, people get certain kinds of problems more or less right, while formally, their errors, however infrequent, can be bizarre.

This is important, because it lies underneath a similar distinction that can be found in many other places. And because the field of decision-making research is so fragmented, the similar answers usually stand alone and isolated.

Consider, for example, how this relates to Nassim Nicholas Taleb’s distinction between Fragile and Antifragile approaches and trading strategies. Think of exposure, he says, and the size and risk of the errors you may make.

A lot depends on whether you want to rigorously eliminate small errors, or watch out for really big errors.
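The contrast in the chart can be made concrete with a toy simulation (the failure rate and error magnitudes are invented, not taken from Mintzberg’s data): an “analytic” forecaster is usually exact but occasionally fails bizarrely when its model breaks, while an “intuitive” one is never exact but always close.

```python
import random
import statistics

random.seed(0)  # deterministic run for reproducibility

def analytic_error():
    """Mostly dead-on, but a rare model failure produces a huge miss."""
    return 0.0 if random.random() > 0.02 else random.gauss(0, 50)

def intuitive_error():
    """Never exact, but consistently close."""
    return random.gauss(0, 2)

analytic = [analytic_error() for _ in range(10_000)]
intuitive = [intuitive_error() for _ in range(10_000)]

# The analytic approach wins on the typical error but loses badly in the tail.
print("median |error|: analytic "
      f"{statistics.median(map(abs, analytic)):.2f}, "
      f"intuitive {statistics.median(map(abs, intuitive)):.2f}")
print("worst  |error|: analytic "
      f"{max(map(abs, analytic)):.1f}, "
      f"intuitive {max(map(abs, intuitive)):.1f}")
```

Which distribution you should prefer depends entirely on whether small routine errors or rare catastrophic ones are the greater danger in your domain.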

By | May 4, 2014|Assumptions, Decisions, Expertise, Foxes and Hedgehogs, Human Error, Quants and Models, Risk Management|Comments Off on Two kinds of error (part 1)

“Strategies grow like weeds in a garden”. So do trades.

How much should you trust “gut feel” or “market instincts” when it comes to making decisions or trades or investments? How much should you make decisions through a rigorous, formal process using hard, quantified data instead? What can move the needle on performance?

In financial markets, more mathematical approaches have been in the ascendant for the last twenty years, with older “gut feel” styles of trading increasingly left aside. Algorithms and linear models are much better at optimizing in specific situations than the most credentialed people are (as we’ve seen). Since the 1940s, business leaders have been content to have operational researchers (later known as quants) make decisions on things like inventory control or scheduling, or other well-defined problems.

But rigorous large-scale planning to make major decisions has generally turned out to be a disaster whenever it has been tried. It has generally been about as successful in large corporations as planning also turned out to be in the Soviet Union (for many of the same reasons). As one example, General Electric originated one of the main formal planning processes in the 1960s. The stock price then languished for a decade. One of the very first things Jack Welch did was to slash the planning process and planning staff.  Quantitative models (on the whole) performed extremely badly during the Great Financial Crisis. And hedge funds have increasing difficulty even matching market averages, let alone beating them.

What explains this? Why does careful modeling and rigor often work very well on the small scale, and catastrophically on large questions or longer runs of time? This obviously has massive application in financial markets as well, from understanding what “market instinct” is to seeing how central bank formal forecasting processes and risk management can fail.

Something has clearly been wrong with formalization. It may have worked wonders on the highly structured, repetitive tasks of the factory and clerical pool, but whatever that was got lost on its way to the executive suite.

I talked about Henry Mintzberg the other day. He pointed out that, contrary to myth, most successful senior decision-makers are not rigorous or hyper-rational in planning. Quite the opposite. In the 1990s he wrote a book, The Rise and Fall of Strategic Planning, which tore into formal planning and strategic consulting (and where the quote above comes from).

There were three huge problems, he said. First, planners assumed that analysis can provide synthesis or insight or creativity. Second, that hard quantitative data alone ought to be the heart of the planning process. Third, that the context for plans is stable, or predictable. All of these assumptions were just wrong. For example,

For data to be “hard” means that they can be documented unambiguously, which usually means that they have already been quantified. That way planners and managers can sit in their offices and be informed. No need to go out and meet the troops, or the customers, to find out how the products get bought or the wars get fought or what connects those strategies to that stock price; all that just wastes time.

The difficulty, he says, is that hard information is often limited in scope, “lacking richness and often failing to encompass important noneconomic and non-quantitative factors.” Often hard information is too aggregated for effective use. It often arrives too late to be useful. And it is often surprisingly unreliable, concealing numerous biases and inaccuracies.

The hard data drive out the soft, while that holy ‘bottom line’ destroys people’s ability to think strategically. The Economist described this as “playing tennis by watching the scoreboard instead of the ball.” ..  Fed only abstractions, managers can construct nothing but hazy images, poorly focused snapshots that clarify nothing.

The performance of forecasting was also woeful, little better than the ancient Greek belief in the magic of the Delphic Oracle, and “done for superstitious reasons, and because of an obsession with control that becomes the illusion of control.”

Of course, to create a new vision requires more than just soft data and commitment: it requires a mental capacity for synthesis, with imagination. Some managers simply lack these qualities – in our experience, often the very ones most inclined to rely on planning, as if the formal process will somehow make up for their own inadequacies. … Strategies grow initially like weeds in a garden: they are not cultivated like tomatoes in a hothouse.

Highly analytical approaches often suffered from “premature closure.”

.. the analyst tends to want to get on with the more structured step of evaluating alternatives and so tends to give scant attention to the less structured, more difficult, but generally more important step of diagnosing the issue and generating possible alternatives in the first place.

So what does strategy require?

We know that it must draw on all kinds of informational inputs, many of them non-quantifiable and accessible only to strategists who are connected to the details rather than detached from them. We know that the dynamics of the context have repeatedly defied any efforts to force the process into a predetermined schedule or onto a predetermined track. Strategies inevitably exhibit some emergent qualities, and even when largely deliberate, often appear less formally planned than informally visionary. And learning, in the form of fits and starts as well as discoveries based on serendipitous events and the recognition of unexpected patterns, inevitably plays a role, if not the key role, in the development of all strategies that are novel. Accordingly, we know that the process requires insight, creativity and synthesis, the very things that formalization discourages. [my bold]

If all this is true (and there is plenty of evidence to back it up), what does it mean for formal analytic processes? How can it be reconciled with the claims of Meehl and Kahneman that statistical models hugely outperform human experts? I’ll look at that next.

By | May 2, 2014|Adaptation, Assumptions, Big Data, Decisions, Expertise, Foxes and Hedgehogs, Insight & Creativity, Organizational Culture and Learning, Quants and Models, Risk Management|Comments Off on “Strategies grow like weeds in a garden”. So do trades.