Moloch's Toolbox (1/2)

Eliezer Yudkowsky

Previous: An Equilibrium of No Free Energy

There’s a toolbox of reusable concepts for analyzing systems I would call “inadequate”—the causes of civilizational failure, some of which correspond to local opportunities to do better yourself. I shall, somewhat arbitrarily, sort these concepts into three larger categories:

Decisionmakers who are not beneficiaries;
Asymmetric information;

and above all,

Nash equilibria that aren’t even the best Nash equilibrium, let alone Pareto-optimal.

In other words:

Cases where the decision lies in the hands of people who would gain little personally, or lose out personally, if they did what was necessary to help someone else;
Cases where decision-makers can’t reliably learn the information they need to make decisions, even though someone else has that information; and
Systems that are broken in multiple places so that no one actor can make them better, even though, in principle, some magically coordinated action could move to a new stable state.

I will then play fast and loose with these concepts in order to fit the entire Taxonomy of Failure inside them.

For example, “irrationality in the form of cognitive biases” wouldn’t obviously fit into any of these categories, but I’m going to shove it inside “asymmetric information” via a clever sleight-of-hand. Ready? Here goes:

If nobody can detect a cognitive bias in particular cases, then from our perspective we can’t really call it a “civilizational inadequacy” or “failure to pluck a low-hanging fruit.” We shouldn’t even be able to see it ourselves. So, on the contrary, let’s suppose that you and some other people can indeed detect a cognitive bias that’s screwing up civilizational decisionmaking.

Then why don’t you just walk up to the decision-maker and tell them about the bias? Because they wouldn’t have any way of knowing to trust you rather than the other five hundred people trying to influence their decisions? Well, in that case, you’re holding information that they can’t learn from you! So that’s an “asymmetric information problem,” in much the same way that it’s an asymmetric information problem when you’re trying to sell a used car and you know it doesn’t have any mechanical problems, but you have no way of reliably conveying this knowledge to the buyer because for all they know you could be lying.

That argument is a bit silly, but so is the notion of trying to fit the whole Scroll of Woe into three supercategories. And if I named more than three supercategories, you wouldn’t be able to remember them due to computational limitations (which aren’t on the list anywhere, and I’m not going to add them).

i. For want of docosahexaenoic acids, a baby was lost

My discussion of modest epistemology in Chapter 1 might have given the impression that I think of modesty mostly as a certain set of high-level beliefs: beliefs about how best to combat cognitive bias, about how individual competencies stack up against group-level competencies, and so on. But I predict that many of this book’s readers have high-level beliefs similar to those I outlined in Chapter 2, while employing a reasoning style that is really a special case of modest epistemology; and I think that this reasoning style is causing them substantial harm.

As reasoning styles, modest epistemology and inadequacy analysis depend on a mix of explicit principles and implicit mental habits. In inadequacy analysis, it’s one thing to recognize in the abstract that we live in a world rife with systemic inefficiencies, and quite another to naturally perceive systems that way in daily life. So my goal here won’t be to unkindly stick the label “inadequate” to a black box containing the world; it will be to say something about how the relevant systems actually operate.

For our central example, we’ll be using the United States medical system, which is, so far as I know, the most broken system that still works ever recorded in human history. If you were reading about something in 19th-century France which was as broken as US healthcare, you wouldn’t expect to find that it went on working when overloaded with a sufficiently vast amount of money. You would expect it to just not work at all.

In previous years, I would use the case of central-line infections as my go-to example of medical inadequacy. Central-line infections, in the US alone, killed 60,000 patients per year, and infected an additional 200,000 patients at an average treatment cost of $50,000/patient.

Central-line infections were also known to decrease by 50% or more if you enforced a five-item checklist that included items like “wash your hands before touching the line.”

Robin Hanson has old Overcoming Bias blog posts on that untaken, low-hanging fruit. But I discovered while re-Googling in 2015 that wider adoption of hand-washing and similar precautions is now finally beginning to occur, after many years, with an associated 43% nationwide decrease in central-line infections. After partial adoption.¹

So my new example is infants suffering liver damage, brain damage, and death in a way that’s even easier to solve, by changing the lipid distribution of parenteral nutrition to match the proportions in breast milk.

Background: Some babies have digestion problems that require direct intravenous feeding. Long ago, somebody created a hospital formula for this intravenous feeding that matched the distribution of “fat,” “protein,” and “carbohydrate” in breast milk.

Just like “protein” comes in different amino acids, some of which the body can’t make on its own and some of which it can, what early doctors used to think of as “fat” actually breaks down into metabolically distinct elements like short-chain triglycerides, medium-chain triglycerides, saturated fat, and omega-6, omega-9, and the famous “omega-3.” “Omega-3” is actually several different lipids in its own right; vegetable oils with “omega-3” usually just contain alpha-linolenic acids (ALA), which can only be inefficiently converted to eicosapentaenoic acids, which are then even more inefficiently converted to docosahexaenoic acids (DHA), which are the actual key structural components in the body. This conversion pathway is rate-limited by a process that also converts omega-6, so too much omega-6 can prevent you from processing ALA into DHA even if you’re getting ALA.

So what happens if your infant nutrition was initially designed based on the concept of “fat” as a natural category, and all the “fat” in the mix comes from soybean oil?

From a popular book by Jaminet and Jaminet:

Some babies are born with “short bowel syndrome” and need to be given parenteral nutrition, or nutrition delivered intravenously directly to the blood, until their digestive tracts grow and heal. Since 1961, parenteral nutrition has used soybean oil as its source of fat.[6] And for decades, babies on parenteral nutrition have suffered devastating liver and brain damage. The death rate on soybean oil is 30 percent by age four. […]

In a clinical trial, of forty-two babies given fish oil [after they had already developed liver damage on soybean oil], three died and one required a liver transplant; of forty-nine given soybean oil, twelve died and six required a liver transplant.[8] The death-or-liver-transplant rate was reduced from 37 percent with soybean oil to 9 percent with fish oil.²
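The percentages in the quoted passage follow from the counts it gives. A quick check in Python, assuming that “death-or-liver-transplant” simply adds the two counts within each arm of the trial (the function name is my own, for illustration):

```python
from fractions import Fraction

def adverse_rate(deaths: int, transplants: int, total: int) -> Fraction:
    """Share of babies in one arm who died or required a liver transplant."""
    return Fraction(deaths + transplants, total)

# Counts as quoted from Jaminet and Jaminet above.
fish_oil = adverse_rate(deaths=3, transplants=1, total=42)
soybean_oil = adverse_rate(deaths=12, transplants=6, total=49)

print(f"fish oil:    {float(fish_oil):.1%}")
print(f"soybean oil: {float(soybean_oil):.1%}")
```

This works out to roughly 9.5 percent against 36.7 percent, which the passage reports as 9 and 37.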

When Jaminet and Jaminet wrote the above, in 2012, there was a single hospital in the United States that could provide correctly formulated parenteral nutrition, namely the Boston Children’s Hospital; nowhere else. This formulation was illegal to sell across state lines.

A few years after the Boston Children’s Hospital developed their formula (keeping in mind the heap of dead babies continuing to pile up in the meanwhile), there developed a shortage of “certified lipids” (FDA-approved “fat” for adding to parenteral nutrition). For a year or two, the parenteral nutrition contained no fat at all, which is worse and can kill adults.

You see, although there’s nothing special about the soybean oil in parenteral nutrition, there was only one US manufacturer approved to add it, and that manufacturer left the market, so…

As of 2015, the state of affairs was as follows: The FDA eventually solved the problem with the shortage of US-certified lipids, by… allowing US hospitals to import parenteral nutrition bags from Europe. And it only took them two years’ worth of dead patients to figure that out!

As of 2016, if your baby has short bowel syndrome, and has already ended up with liver damage, and either you or your doctor is lucky enough to know what’s wrong and how to fix it, your doctor can apply for a special permit to use a non-FDA-approved substance for your child on an emergency basis. After this, you can buy Omegaven and hope that it cures your baby and that there isn’t too much permanent damage and that it’s not already too late.

This is an improvement over the prior situation, where the non-poisonous formulation was illegal to sell across state lines under any circumstances, but it’s still not good by any stretch of the imagination.

Now imagine trying to explain to a visitor from a relatively well-functioning world just why it is that your civilization has killed a bunch of babies and subjected other babies to pointless brain damage.

“It’s not that we’re evil,” you say helplessly, “it’s that… well, you see, it’s not that anyone wanted to kill those babies, it’s just the way the System ended up, somehow…”

ii. Asymmetric information and lemons problems

Three people have gathered in a blank white space:

The Visitor from a Better World;
Simplicio, who is attending a major university but hasn’t taken undergraduate economics;
Cecie, the Conventional Cynical Economist.

The Visitor speaks first.

Visitor: So I’ve listened to you explain about babies suffering death and brain damage from parenteral nutrition built on soybean oil. I have several questions here, but I’ll start with the most obvious one.

Cecie: Go ahead.

Visitor: Why aren’t there riots?

Simplicio: The first thing you have to understand, Visitor, is that the folk in this world are hypocrites, cowards, psychopaths, and sheep.

I mean, I certainly care about the lives of newborn children. Hearing about their plight certainly makes me want to do something about it. When I see the problem continuing in spite of that, I can only conclude that other people don’t feel the level of moral indignation that I feel when staring at a heap of dead babies.

Cecie: I don’t think that hypothesis is needed, Simplicio. As a start, Visitor, you have to realize that the picture I’ve shown you is not widely known. Maybe 10% of the population, at most, is walking around with the prior belief that the FDA in general is killing people; our government runs on majority rule and the 10% can’t unilaterally defy it.³ Maybe 0.1% of that 10% know that omega-3 ALA is converted into omega-3 DHA via a metabolic pathway that competes with omega-6. And then most of those aren’t aware of what’s happening to babies right now.

Visitor: Pointing to that state of ignorance is hardly a sufficient explanation! If a theater is on fire and only one person knows it, they yell “Fire!” and then more people know it. People from my civilization would scream “Babies are dying over here!” and other people from my civilization would whip around their heads and look.

Simplicio: Our world’s cowards and sheep would hear that and think that it’s (a) somebody else’s problem and (b) all part of the plan.

Cecie: In our world, Visitor, we have an economic phenomenon sometimes called the lemons problem. Suppose you want to sell a used car, and I’m looking for a car to buy. From my perspective, I have to worry that your car might be a “lemon”—that it has a serious mechanical problem that doesn’t appear every time you start the car, and is difficult or impossible to fix. Now, you know that your car isn’t a lemon. But if I ask you, “Hey, is this car a lemon?” and you answer “No,” I can’t trust your answer, because you’re incentivized to answer “No” either way. Hearing you say “No” isn’t much Bayesian evidence. Asymmetric information conditions can persist even in cases where, like an honest seller meeting an honest buyer, both parties have strong incentives for accurate information to be conveyed.

A further problem is that if the fair value of a non-lemon car is $10,000, and the possibility that your car is a lemon causes me to only be willing to pay you $8,000, you might refuse to sell your car. So the honest sellers with reliable cars start to leave the market, which further shifts upward the probability that any given car for sale is a lemon, which makes me less willing to pay for a used car, which incentivizes more honest sellers to leave the market, and so on.
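The unraveling Cecie describes can be sketched in a few lines of Python. This is a toy model under loud assumptions: the car values are made up, each seller refuses any price below what their own car is worth, and the buyer, unable to tell cars apart, offers the average value of the cars still on the market.

```python
def unravel(car_values, max_rounds=100):
    """Simulate honest sellers leaving a market the buyer cannot inspect.

    Each round the buyer offers the average value of the cars still for
    sale, and every seller whose car is worth more than that walks away.
    Returns a list of (price_offered, cars_on_market), one entry per round.
    """
    market = sorted(car_values)
    history = []
    for _ in range(max_rounds):
        price = sum(market) / len(market)
        history.append((price, len(market)))
        staying = [value for value in market if value <= price]
        if len(staying) == len(market):
            break  # nobody else wants to leave; the market has settled
        market = staying
    return history

# Hypothetical market: five cars, the best one worth $10,000.
history = unravel([2_000, 4_000, 6_000, 8_000, 10_000])
for price, cars_left in history:
    print(f"price offered: ${price:,.0f}, cars on the market: {cars_left}")
```

In this toy run the best cars exit first, the offered price falls each round, and the market settles with only the worst car left.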

Visitor: What does the lemons problem have to do with your world’s inability to pass around information about dead babies?

Cecie: In our world, there are a lot of people screaming, “Pay attention to this thing I’m indignant about over here!” In fact, there are enough people screaming that there’s an inexploitable market in indignation. The dead-babies problem can’t compete in that market; there’s no free energy left for it to eat, and it doesn’t have an optimal indignation profile. There’s no single individual villain. The business about competing omega-3 and omega-6 metabolic pathways is something that only a fraction of people would understand on a visceral level; and even if those people posted it to their Facebook walls, most of their readers wouldn’t understand and repost, so the dead-babies problem has relatively little virality. Being indignant about this particular thing doesn’t signal your moral superiority to anyone else in particular, so it’s not viscerally enjoyable to engage in the indignation. As for adding a further scream, “But wait, this matter really is important!”, that’s the part subject to the lemons problem. Even people who honestly know about a fixable case of dead babies can’t emit a trustworthy request for attention.

Simplicio: You’re saying that people won’t listen even if I sound really indignant about this? That’s an outrage!

Cecie: By this point in our civilization’s development, many honest buyers and sellers have left the indignation market entirely; and what’s left behind is not, on average, good.

Visitor: Your reply contains so many surprising postulates of weird civilizational dysfunction, I hardly know what to ask about next. So instead I’ll try to explain how my world works, and you can explain to me why your world doesn’t work that way.

Cecie: Sounds reasonable.

iii. Academic incentives and beneficiaries

Visitor: To start with, in my world, we have these people called “scientists” who verify claims experimentally, and other people trust the “scientists.” So if our “scientists” say that a certain formula seems to be killing babies, this would provoke general indignation without every single listener needing to study docohexa-whatever acids.

Simplicio: Alas, our so-called scientists are just pawns of the same medical-industrial complex that profits from killing babies.

Cecie: I’m afraid, Visitor, that although there are strong prior reasons to expect too much omega-6 and no omega-3 to be very bad for an infant baby, and there are now a few dozen small-scale studies which seem to match that prediction, this matter hasn’t had the massive study that would begin to produce confident scientific agreement—

Visitor: You’d better not be pointing to that as an exogenous fact that explains your civilization’s problem! See, on my planet, if somebody points to strong prior suspicion combined with confirming pilot studies saying that something is killing innocent babies and is fixable, and the pilot studies are not considered sufficient evidence to settle the issue, our people would do more studies and wouldn’t just go on blindly feeding the babies poison in the meantime. Our scientists would all agree on that!

Cecie: But people loudly agreeing on something, by itself, accomplishes nothing. It’s all well and good for everyone to agree in principle that larger studies ought to be done; but in your world, who actually does the big study, and why do they do it?

Visitor: Two subclasses within the profession of “scientist” are suggesters, whose piloting studies provide the initial suspicions of effects, and replicators whose job it is to confirm the result and nail things down solidly—the exact effect size and so on. When an important suggestive result arises, two replicators step forward to confirm it and nail down the exact conditions for producing it, being forbidden upon their honor to communicate with each other until they submit their findings. If both replicators agree on the particulars, that completes the discovery. The three funding bodies that sustained the suggester and the dual replicators would receive the three places of honor in the announcement. Do I need to explain how part of the function of any civilized society is to appropriately reward those who contribute to the public good?

Cecie: Well, that’s not how things work on Earth. Our world gives almost all the public credit and fame to the discoverer, as the initial suggester is called among us. Our scientists often say that replication is important, but our most prestigious journals won’t publish mere replications; nor do the history books remember them. The outcome is a lot of small studies that have just enough subjects to obtain “statistically significant” results—

Visitor: … What? Probability is quantitative, not qualitative. There’s no such thing as a “significant” or “insignificant” likelihood ratio—

Cecie: Anyway, while it might be good if larger studies were done, the decisionmaker is not the beneficiary—the people who did the extra work of a larger study, and funded the extra work of a larger study, would not receive fame and fortune thereby.

Visitor: I must be missing something basic here. You do have multiple studies, right? When you have multiple bodies of data, you can multiply the likelihood functions from the studies’ respective data to the hypotheses to obtain the meaning of the combined evidence—the likelihood function from all the data to the hypotheses.⁴
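What the Visitor is describing can be made concrete with a small sketch. The assumptions here are mine, not the Visitor’s: the three studies are invented, each is modeled as an independent binomial experiment run under the same conditions, and the hypotheses are candidate event rates on a coarse grid.

```python
import math

def log_likelihood(events: int, subjects: int, rate: float) -> float:
    """Binomial log-likelihood of the data under a hypothesized event rate.

    The binomial coefficient is dropped because it does not depend on the
    hypothesis and so does not change which hypotheses are favored.
    """
    return events * math.log(rate) + (subjects - events) * math.log(1 - rate)

# Hypothetical data: (events, subjects) from three small studies.
studies = [(3, 10), (5, 20), (9, 30)]

# Hypotheses: candidate event rates from 0.01 to 0.99.
grid = [i / 100 for i in range(1, 100)]

# Multiplying likelihood functions is the same as adding their logarithms.
combined = [sum(log_likelihood(k, n, rate) for k, n in studies) for rate in grid]

best = grid[max(range(len(grid)), key=combined.__getitem__)]
pooled = sum(k for k, _ in studies) / sum(n for _, n in studies)
print(f"best-supported rate on the grid: {best:.2f}, pooled rate: {pooled:.3f}")
```

Under these assumptions the combined likelihood function is the same one you would get from a single study with all sixty subjects, which is the sense in which several small studies can stand in for one large one.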

Cecie: I’m afraid you can’t do that on Earth.

Visitor: … Of course you can. It’s a mathematical theorem. You can’t possibly tell me that differs between our universes!

Yes, there are pitfalls for the especially careless. Sometimes studies end up being conducted under different circumstances, with the result that the naively computed likelihood functions don’t have uniform relations to the hypotheses under consideration. In that case, blindly multiplying will give you a likelihood function that’s nearly zero everywhere. But, I mean, if you just look at all the likelihood functions, it’s pretty obvious when some of them are pointing in different directions and then you can investigate that divergence.

Either it makes sense to multiply all the likelihood functions and get out one massive evidential pointer, or else you don’t get a sensible result when you multiply them and then you know something’s wrong with your methods—

Cecie: I’m afraid our scientific community doesn’t run on your world’s statistical methods. You see, during the first half of the twentieth century, it became conventional to measure something called “p-values” which imposed a qualitative distinction between “successful” and “unsuccessful” experiments—

Visitor: That is still not an explanation. Why not change the way you do things?

Cecie: Because somebody who tried using unconventional statistical methods, even if they were better statistical methods, wouldn’t be able to publish their papers in the most prestigious journals. And then they wouldn’t get hired. It’s similar to the way that the most prestigious journals don’t publish mere replications, only discoveries, so people focus on making discoveries instead of replications.

Visitor: Why would anyone pay attention to journals like that?

Cecie: Because university hiring departments care a lot about whether you’ve published in prestigious journals.

Visitor: No, I mean… how did these journals end up prestigious in the first place? Why do university hiring departments pay attention to them?

Simplicio: Why would university hiring departments care about real science? Shouldn’t it be you who has to explain why some lifeless cog of the military-industrial complex would care about anything except grant money?

Cecie: Okay… you’re digging pretty deep here. I think I need to back up and try to explain things on a more basic level.

Visitor: Indeed, I think you should. So far, every time I’ve asked you why someone is acting insane, you’ve claimed that it’s secretly a sane response to someone else acting insane. Where does this process bottom out?

iv. Two-factor markets and signaling equilibria

Cecie: Let me try to identify a first step on which insanity can emerge from non-insanity. Universities pay attention to prestigious journals because of a signaling equilibrium, which, in our taxonomy, is a kind of bad Nash equilibrium that no single actor can defy unilaterally.

In your terms, it involves a sticky, stable equilibrium of everyone acting insane in a way that’s secretly a sane response to everyone else acting insane.

Visitor: Go on.

Cecie: First, let me explain the idea of what Eliezer has nicknamed a “two-factor market.” Two-factor markets are a conceptually simpler case that will help us later understand signaling equilibria.

In our world there’s a crude site for classified ads, called Craigslist. Craigslist doesn’t contain any way of rating users, the way that eBay lets buyers and sellers rate each other, or that Airbnb lets renters and landlords rate each other.

Suppose you wanted to set up a version of Craigslist that let people rate each other. Would you be able to compete with Craigslist?

The answer is that even if this innovation is in fact a good one, competing with Craigslist would be far more difficult than it sounds, because Craigslist is sustained by a two-factor market. The sellers go where there are the most buyers; the buyers go where they expect to find sellers. When you launch your new site, no buyers will want to go there because there are no sellers, and no sellers will want to go there because there are no buyers. Craigslist initially broke into this market by targeting San Francisco particularly, and spending marketing effort to assemble the San Francisco buyers and sellers into the same place. But that would be harder to do for a later startup, because now the people it’s targeting are already using Craigslist.

Simplicio: Those sheep! Just mindlessly doing whatever their incentives tell them to!

Cecie: We can imagine that there’s a better technology than Craigslist, called Danslist, such that everyone using Craigslist would be better off if they all switched to Danslist simultaneously. But if just one buyer or just one seller is the first to go to Danslist, they find an empty parking lot. In conventional cynical economics, we’d say that this is a coordination problem—

Simplicio: A coordination problem? What do you mean by that?

Cecie: Backing up a bit: A “Nash equilibrium” is what happens when everyone makes their best move, given that all the other players are making their best moves from that Nash equilibrium—everyone goes to Craigslist, because that’s their individually best move given that everyone else is going to Craigslist. A “Pareto optimum” is any situation where it’s impossible to make every actor better off simultaneously, like “Cooperate/Cooperate” in the Prisoner’s Dilemma—there’s no alternative outcome to Cooperate/Cooperate that makes both agents better off. The Prisoner’s Dilemma is a coordination problem because the sole Nash equilibrium of Defect/Defect isn’t Pareto-optimal; there’s an outcome, Cooperate/Cooperate, that both players prefer, but aren’t reaching.
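Cecie’s two definitions can be checked mechanically on a small game. The sketch below uses a conventional set of Prisoner’s Dilemma payoffs (the specific numbers are an assumption for illustration) and the definition of Pareto optimum given above: no other outcome makes every player better off at once.

```python
from itertools import product

MOVES = ("C", "D")  # Cooperate, Defect

# Payoffs as (row player, column player); the numbers are illustrative.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def is_nash(outcome):
    """True if neither player can gain by changing only their own move."""
    row_move, col_move = outcome
    row_pay, col_pay = PAYOFFS[outcome]
    row_ok = all(PAYOFFS[(alt, col_move)][0] <= row_pay for alt in MOVES)
    col_ok = all(PAYOFFS[(row_move, alt)][1] <= col_pay for alt in MOVES)
    return row_ok and col_ok

def is_pareto_optimal(outcome):
    """True if no other outcome makes both players strictly better off."""
    row_pay, col_pay = PAYOFFS[outcome]
    return not any(r > row_pay and c > col_pay for r, c in PAYOFFS.values())

outcomes = list(product(MOVES, MOVES))
nash = [o for o in outcomes if is_nash(o)]
pareto = [o for o in outcomes if is_pareto_optimal(o)]
print("Nash equilibria:", nash)
print("Pareto-optimal outcomes:", pareto)
```

With these payoffs the only Nash equilibrium is Defect/Defect, and it is also the only outcome that fails the Pareto test, since Cooperate/Cooperate is better for both players.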

Simplicio: How stupid of them!

Cecie: No, it’s… ah, never mind. Anyway, the frustrating parts of civilization are the times when you’re stuck in a Nash equilibrium that’s Pareto-inferior to other Nash equilibria. I mean, it’s not surprising that humans have trouble getting to non-Nash optima like “both sides cooperate in the Prisoner’s Dilemma without any other means of enforcement or verification.” What makes an equilibrium inadequate, a fruit that seems to hang tantalizingly low and yet somehow our civilization isn’t plucking, is when there’s a better stable state and we haven’t reached it.

Visitor: Indeed. Moving from bad equilibria to better equilibria is the whole point of having a civilization in the first place.

Cecie: Being stuck in an inferior Nash equilibrium is how I’d describe the frustrating aspect of the two-factor market of buyers and sellers that can’t switch from Craigslist to Danslist. The scenario where everyone is using Danslist would be a stable Nash equilibrium, and a better Nash equilibrium. We just can’t get there from here. There’s no one actor who is behaving foolishly; all the individuals are responding strategically to their incentives. It’s only the larger system that behaves “foolishly.” I’m not aware of a standard term for this situation, so I’ll call it an “inferior equilibrium.”

Simplicio: Why do you care what academics call it? Why not just use the best phrase?

Cecie: The terminology “inferior equilibrium” would be fine if everyone else were already using that terminology. Mostly I want to use the same phrase that everyone else uses, even if it’s not the best phrase.
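Cecie’s Craigslist and Danslist scenario can be written as a small coordination game. This is a deliberately crude sketch: it collapses buyers and sellers into one kind of user, and it assumes each user’s payoff is a per-match value (the `QUALITY` numbers are invented) times the number of other users on the same site.

```python
# Hypothetical value of each match made on a site; Danslist is assumed better.
QUALITY = {"Craigslist": 1.0, "Danslist": 1.5}

def payoff(site, others):
    """Value to one user of choosing `site`, given everyone else's choices."""
    return QUALITY[site] * sum(1 for s in others if s == site)

def is_nash(profile):
    """True if no single user gains by switching sites alone."""
    for i, current in enumerate(profile):
        others = profile[:i] + profile[i + 1:]
        if any(payoff(alt, others) > payoff(current, others) for alt in QUALITY):
            return False
    return True

n = 10
all_craigslist = ["Craigslist"] * n
all_danslist = ["Danslist"] * n

print("everyone on Craigslist is stable:", is_nash(all_craigslist))
print("everyone on Danslist is stable:  ", is_nash(all_danslist))
print("payoff per user on Craigslist:", payoff("Craigslist", all_craigslist[1:]))
print("payoff per user on Danslist:  ", payoff("Danslist", all_danslist[1:]))
```

Both all-on-one-site profiles pass the Nash check, and every user does better in the Danslist one. The lone user who switches first gets a payoff of zero, which is the empty parking lot.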

Simplicio: Regardless, I’m not seeing what the grand obstacle is to people solving these problems by, you know, coordinating. If people would just act in unity, so much could be done!

I feel like you’re placing too much blame on system-level issues, Cecie, when the simpler hypothesis is just that the people in the system are terrible: bad at thinking, bad at caring, bad at coordinating. You claim to be a “cynic,” but your whole world-view sounds rose-tinted to me.

Visitor: Even in my world, Simplicio, coordination isn’t as simple as everyone jumping simultaneously every time one person shouts “Jump!” For coordinated action to be successful, you need to trust the institution that says what the action should be, and a majority of people have to trust that institution, and they have to know that other people trust the institution, so that everyone expects the coordinated action to occur at the critical time, so that it makes sense for them to act too.

That’s why we have policy prediction markets and… there doesn’t seem to be a word in your language for the timed-collective-action-threshold-conditional-commitment… hold on, this cultural translator isn’t making any sense. “Kickstarter”? You have the key concept, but you use it mainly for making video games?
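The mechanism the Visitor is groping for, a commitment that binds only if enough other people make it too, is simple to state in code. This is a minimal sketch with hypothetical names; it leaves out everything that makes the real thing hard, such as deadlines, trust in whoever counts the pledges, and enforcement.

```python
def settle(pledges, threshold):
    """Resolve a threshold-conditional commitment.

    pledges: mapping of participant name -> True if they pledged to act.
    The pledges bind only if at least `threshold` participants pledged;
    otherwise nobody is committed to anything.
    Returns the set of participants who are bound to act.
    """
    pledged = {name for name, said_yes in pledges.items() if said_yes}
    return pledged if len(pledged) >= threshold else set()

pledges = {"Alice": True, "Bob": True, "Carol": False, "Dave": True}
print(settle(pledges, threshold=3))  # three pledged, so the pledges bind
print(settle(pledges, threshold=4))  # only three pledged, so nobody is bound
```

The point of the design is that pledging is safe: nobody ends up as the lone person who jumped.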

Cecie: I’ll now introduce the concept of a signaling equilibrium.

To paraphrase a commenter on Slate Star Codex: suppose that there’s a magical tower that only people with IQs of at least 100 and some amount of conscientiousness can enter, and this magical tower slices four years off your lifespan. The natural next thing that happens is that employers start to prefer prospective employees who have proved they can enter the tower, and employers offer these employees higher salaries, or even make entering the tower a condition of being employed at all.⁵

Visitor: Hold on. There must be less expensive ways of testing intelligence and conscientiousness than sacrificing four years of your lifespan to a magical tower.

Cecie: Let’s not go into that right now. For now, just take as an exogenous fact that employers can’t get all of the information they want by other channels.

Visitor: But—

Cecie: Anyway: the natural next thing that happens is that employers start to demand that prospective employees show a certificate saying that they’ve been inside the tower. This makes everyone want to go to the tower, which enables somebody to set up a fence around the tower and charge hundreds of thousands of dollars to let people in.⁶

Visitor: But—

Cecie: Now, fortunately, after Tower One is established and has been running for a while, somebody tries to set up a competing magical tower, Tower Two, that also drains four years of life but charges less money to enter.

Visitor: … You’re solving the wrong problem.

Cecie: Unfortunately, there’s a subtle way in which this competing Tower Two is hampered by the same kind of lock-in that prevents a jump from Craigslist to Danslist. Initially, all of the smartest people headed to Tower One. Since Tower One had limited room, it started discriminating further among its entrants, only taking the ones that have IQs above the minimum, or who are good at athletics or have rich parents or something. So when Tower Two comes along, the employers still prefer employees from Tower One, which has a more famous reputation. So the smartest people still prefer to apply to Tower One, even though it costs more money. This stabilizes Tower One’s reputation as being the place where the smartest people go.

In other words, the signaling equilibrium is a two-factor market in which the stable point, Tower One, is cemented in place by the individually best choices of two different parts of the system. Employers prefer Tower One because it’s where the smartest people go. Smart employees prefer Tower One because employers will pay them more for going there. If you try dissenting from the system unilaterally, without everyone switching at the same time, then as an employer you end up hiring the less-qualified people from Tower Two, or as an employee, you end up with lower salary offers after you go to Tower Two. So the system is stable as a matter of individual incentives, and stays in place. If you try to set up a cheaper alternative to the whole Tower system, the default thing that happens to you is that people who couldn’t handle the Towers try to go through your new system, and it acquires a reputation for non-prestigious weirdness and incompetence.
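The two-sided lock-in can be sketched as a pair of best-response functions. All of the numbers below are invented for illustration, and the model assumes employers simply prefer whichever tower the smartest applicants attend and pay a fixed premium only for that tower.

```python
TUITION = {"Tower One": 200_000, "Tower Two": 50_000}  # hypothetical entry fees
PREMIUM = 400_000  # hypothetical salary premium for attending the preferred tower

def employee_best_tower(employers_prefer):
    """Tower that maximizes premium minus tuition for a smart applicant."""
    def net_gain(tower):
        premium = PREMIUM if tower == employers_prefer else 0
        return premium - TUITION[tower]
    return max(TUITION, key=net_gain)

def employer_best_tower(smart_people_attend):
    """Employers prefer whichever tower the smartest applicants attend."""
    return smart_people_attend

def is_stable(tower):
    """True if both sides' best responses keep everyone at `tower`."""
    return (employee_best_tower(tower) == tower
            and employer_best_tower(tower) == tower)

for tower in TUITION:
    net = PREMIUM - TUITION[tower]
    print(f"{tower}: stable={is_stable(tower)}, net gain to employee={net:,}")
```

In this sketch both towers are stable once everyone expects everyone else to use them, and Tower Two leaves employees better off, but no single employer or applicant can move the system there alone.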

Visitor: This all just seems so weird and complicated. I’m skeptical that this scenario with the magical towers could happen in real life.

Simplicio: I agree that trying to build a cheaper Tower Two is solving the wrong problem. The interior of Tower One boasts some truly exquisite architecture and decor. It just makes sense that someone should pay a lot to allow people entry to Tower One. What we really need is for the government to subsidize the entry fees on Tower One, so that more people can fit inside.

Cecie: Consider a simpler example: Velcro is a system for fastening shoes that is, for at least some people and circumstances, better than shoelaces. It’s easier to adjust three separate Velcro straps than it is to keep your shoelaces perfectly adjusted at all loops, it’s faster to do and undo, et cetera, and not everyone is running at high speeds that call for perfectly adjusted running shoes. But when Velcro was introduced, the earliest people to adopt Velcro were those who had the most trouble tying their shoelaces: very young children and the elderly. So Velcro became associated with kids and old people, and thus unforgivably unfashionable, regardless of whether it would have been better than shoelaces in some adult applications as well.

Visitor: I take it you didn’t have the stern and upright leaders, what we call the Serious People, who could set an example by donning Velcro shoes themselves?

Simplicio & Cecie: (in unison) No.

Visitor: I see.

Cecie: Now consider the system of scientific journals that we were originally talking about. Some journals are prestigious. So university hiring committees pay the most attention to publications in those journals. So people with the best, most interesting-looking publications try to send them to those journals. So if a university hiring committee paid an equal amount of attention to publications in lower-prestige journals, they’d end up granting tenure to less prestigious people. Thus, the whole system is a stable equilibrium that nobody can unilaterally defy except at cost to themselves.

Visitor: I’m still skeptical. Doesn’t your parable of the magical tower suggest that, if that’s actually true, somebody ought to rope off the journals too and charge insane amounts of money?

Cecie: Yes, and that’s exactly what happened. Elsevier and a few other profiteers grabbed the most prestigious journals and started jacking up the access costs. They contributed almost nothing—even the peer review and editing was done by unpaid volunteers. Elsevier just charged more and more money and sat back. This is standardly called rent-seeking. In a few cases, the scientists were able to kickstart a coordinated move where the entire editing board would resign, start a new journal, and everybody in the field would submit to the new journal instead. But since our scientists don’t have recognized kickstarting customs, or any software support for them, it isn’t easy to pull that off. Most of the big-name journals that Elsevier has captured are still big names, still getting prestigious submissions, and still capturing big-money rents.

Visitor: Well, I guess I understand why my cultural translator keeps putting air quotes around Earth’s version of “science.” The whole idea of science, as I understand the concept, is that everything has to be in the open for anyone to verify. Science is the part of humanity’s knowledge that everyone can potentially learn about and reproduce themselves. You can’t charge money in order for people to read your experimental results, or you lose the “everyone can access and verify your claims” property that distinguishes science from other kinds of information.

Cecie: Oh, rest assured that scientists aren’t seeing any of this money. It all goes to the third-party journal owners.

Simplicio: And this isn’t just scientists being stupid?

Cecie: No stupider than you are for going to college. It’s hard to beat signaling equilibria—because they’re “multi-factor markets”—which are special cases of coordination problems that create “inferior Nash equilibria”—which are so stuck in place that market controllers can seek rent on the value generated by captive participants.

Simplicio: Weren’t we talking about dead babies at some point?

Cecie: Yes, we were. I was explaining how our system allocated too much credit to discoverers and not enough credit to replicators, and the only socially acceptable statistics couldn’t aggregate small-scale trials in a way regarded as reliable. The Visitor asked me why the system was like that. I pointed to journals that published a particular kind of paper. The Visitor asked me why anyone paid attention to those journals in the first place. I explained about signaling equilibria, and that’s where we are now.

Visitor: I can’t say that I feel enlightened at the end of walking through all that. There must be particular scientists on the editorial boards who choose not to demand replications and who forbid multiplying likelihood ratios. Why are those particular scientists doing the non-sensible thing?

Cecie: Because people in the general field wouldn’t cite nonstandard papers, so if the editors demanded nonstandard papers, the journal’s impact factor would decrease.

Visitor: Why don’t the journal editors start by demanding that paper submitters cite dual replications as well as initial suggestions?

Cecie: Because that would be a weird unconventional demand, which might lead people with high-prestige results to submit those results to other journals instead. Fundamentally, you’re asking why scientists on Earth don’t adopt certain new customs that you think would be for the good of everyone. And the answer is that there’s this big, multi-factor system that nobody can dissent from unilaterally, and that people have a lot of trouble coordinating to change. That’s true even when there are forces like Elsevier that are being blatant about ripping everyone off. Implementing your proposed cultural shift to “suggesters” and “replicators,” or using likelihood functions, would be significantly harder than everyone just simultaneously ceasing to deal with Elsevier, since the case for it would be less obvious and would provoke more disagreement. All that we can manage is to make incremental shifts toward funding more replication and asking more for study preregistration.

To sum up, academic science is embedded in a big enough system with enough separate decisionmakers creating incentives for other decisionmakers that it almost always takes the path of least resistance. The system isn’t in the best Nash equilibrium because nobody has the power to look over the system and choose good Nash equilibria. It’s just in a Nash equilibrium that it wandered into, which includes statistical methods that were invented in the first half of the 20th century and editors not demanding that people cite replications.
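A minimal sketch of what "stuck in an inferior Nash equilibrium" means, using a two-player coordination game. The payoff numbers and the "old"/"new" journal labels are made up for illustration; nothing here is measured from real academic incentives.

```python
from itertools import product

# Hypothetical payoffs for a two-player coordination game.
# Each player picks "old" (the prestigious incumbent journal) or "new".
# PAYOFFS[(a, b)] = (payoff to player A, payoff to player B).
PAYOFFS = {
    ("old", "old"): (2, 2),  # everyone stays put: mediocre but safe
    ("old", "new"): (2, 0),  # B moves alone and loses prestige
    ("new", "old"): (0, 2),  # A moves alone and loses prestige
    ("new", "new"): (3, 3),  # everyone moves together: better for both
}
ACTIONS = ("old", "new")


def is_nash(profile):
    """True if neither player gains by changing only their own action."""
    a, b = profile
    pay_a, pay_b = PAYOFFS[profile]
    a_can_improve = any(PAYOFFS[(alt, b)][0] > pay_a for alt in ACTIONS)
    b_can_improve = any(PAYOFFS[(a, alt)][1] > pay_b for alt in ACTIONS)
    return not (a_can_improve or b_can_improve)


def nash_equilibria():
    """All pure-strategy Nash equilibria of the game."""
    return [p for p in product(ACTIONS, repeat=2) if is_nash(p)]


if __name__ == "__main__":
    for profile in nash_equilibria():
        print(profile, PAYOFFS[profile])
```

With these assumed payoffs, both `("old", "old")` and `("new", "new")` are equilibria. The second pays everyone more, but a player who moves alone from the first drops from 2 to 0, so nobody moves.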

Visitor: I see. And that’s why nobody in your world has multiplied the likelihood functions, or done a large-enough single study, or otherwise done whatever it would take to convince whoever needs to be convinced about the effects of feeding infants soybean oil.

Cecie: It’s one of the reasons. A large study would also be very expensive because of extreme paperwork requirements, generated by other systemic failures I haven’t gotten around to talking about yet—⁷

Visitor: How does anything get done ever, in your world?

Cecie: —and when it comes to funding or carrying out that bigger study, the decisionmaker would not significantly benefit under the current system, which is held in place by coordination problems. And that’s why people who already have a background grasp of lipid metabolic pathways have asymmetric information about what is worth becoming indignant about.
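For readers who want to see what "multiplying likelihood ratios" across small trials looks like in practice, here is a rough sketch. It assumes each trial reports a simple success count, that the trials are independent, and that only two point hypotheses about the success rate are being compared. This is a simplified special case of multiplying full likelihood functions, and the trial data below are invented.

```python
import math


def binomial_likelihood(successes, trials, p):
    """Probability of exactly `successes` out of `trials` if the true rate is p."""
    return math.comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)


def combined_likelihood_ratio(studies, p_h1, p_h0):
    """Multiply per-study likelihood ratios for H1 (rate p_h1) over H0 (rate p_h0).

    Works in log space so many small studies do not underflow.
    Both rates must be strictly between 0 and 1.
    """
    log_ratio = 0.0
    for successes, trials in studies:
        log_ratio += math.log(binomial_likelihood(successes, trials, p_h1))
        log_ratio -= math.log(binomial_likelihood(successes, trials, p_h0))
    return math.exp(log_ratio)


# Made-up data: (patients who recovered, patients enrolled) in three small trials.
STUDIES = [(8, 10), (7, 10), (9, 12)]

if __name__ == "__main__":
    for study in STUDIES:
        print(study, combined_likelihood_ratio([study], p_h1=0.8, p_h0=0.5))
    print("combined:", combined_likelihood_ratio(STUDIES, p_h1=0.8, p_h0=0.5))
```

Each of the hypothetical trials favors the higher recovery rate only modestly on its own. Because the ratios multiply, the three together favor it much more strongly than any single one does.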

v. Total market failures

Visitor: Even granting the things you’ve said already, I don’t feel like I’ve been told enough to understand why your society is killing babies.

Cecie: Well, no. Not yet. The lack of incentive to do a large-scale convincing study is only one thing that went wrong inside one part of the system. There’s a lot more broken than just that—which is why effective altruists shouldn’t be running out and trying to fund a big replication study for Omegaven, because that by itself wouldn’t fix things.

Visitor: Okay, suppose there had been a large enough study to satisfy your world’s take on “scientists.” What else would likely go wrong after that?

Cecie: Several things. For example, doctors wouldn’t necessarily be aware of the experimental results.

Visitor: Hold on, I think my cultural translator is broken. You used that word “doctor” and my translator spit out a long sequence of words for Examiner plus Diagnostician plus Treatment Planner plus Surgeon plus Outcome Evaluator plus Student Trainer plus Business Manager. Maybe it’s stuck and spitting out the names of all the professions associated with medicine.

Cecie: So, in your world, if there is a dual replication of results on Omegaven versus soybean oil, how does that end up changing the actual patient treatments?

Visitor: By informing the Treatment Planners who specialize in infant ailments that required parenteral nutrition, of course. The discovery would appear inside the “parenteral nutrition” pages in the Earthweb and show up in the feeds of everyone subscribed to that page. The statistics would appear inside the Treatment Planner’s decision-support software. And if all of those broke for some reason, every Treatment Planner for infant ailments that required parenteral nutrition would just use chatrooms. And anyone who ignored the chatrooms would have worse patient outcome ratings, and would lose status relative to Treatment Planners who were more attentive.

Cecie: It sounds like “Treatment Planners” in your world are much more specialized than doctors in this world. I suppose they’re also selected specifically for talent at… cost-benefit analysis and decision theory, or something along those lines? And then they focus their learning on particular diseases for which they are Treatment Planners? And somebody else tracks their outcomes?

Visitor: Of course. I’m… almost afraid to ask, but how do they do it in your world?

Cecie: Your translator wasn’t broken. In our world, “doctors” are supposed to examine patients for symptoms, diagnose especially complicated or obscure ailments using their encyclopedic knowledge and their keen grasp of Bayesian inference, plan the patient’s treatment by weighing the costs and benefits of the latest treatments, execute the treatments using their keen dexterity and reliable stamina, evaluate for themselves how well that went, train students to do it too, and in many cases, also oversee the small business that bills the patients and markets itself. So “doctors” have to be selected for all of those talents simultaneously, and then split their training, experience, and attention between them.

Visitor: Why in the name of—

Cecie: Oh, and before they go to medical school, we usually send them off to get a four-year degree in philosophy first or something, just because.

I don’t know if there’s a standard name for this phenomenon, but we can call it “failure of professional specialization.” It also appears when, for example, a lawyer has to learn calculus in order to graduate college, even though their job doesn’t require any calculus.

Visitor: Why. Why. Why why why—

Cecie: I’m not sure. I suspect the origin has something to do with status—like, a high-status person can do all things at once, so it’s insulting and lowers status to suggest that an esteemed and respectable Doctor should only practice one surgical operation and get very good at it. And once you yourself have spent twelve years being trained under the current system, you won’t be happy about the proposal to replace it with two years of much more specialized training. Once you’ve been through a painful initiation ritual and rationalized its necessity, you’ll hate to see anyone else going through a less painful one. Not to mention that you won’t be happy about the competition against your own human capital, by a cheaper and better form of human capital—and after the sunk cost in pain and time that you endured to build human capital under the old system…

Visitor: Do they not have markets on your planet? Because on my planet, when you manufacture your product in a crazy, elaborate, expensive way that produces an inferior product, someone else will come along and rationalize the process and take away your customers.

Cecie: We have markets, but there’s this unfortunate thing called “regulatory capture,” of which one kind is “occupational licensing.”

As an example, it used to be that chairs were carefully hand-crafted one at a time by carpenters who had to undergo a lengthy apprenticeship, and indeed, they didn’t like it when factories came along staffed by people who specialized in just carving a single kind of arm. But the factory-made chairs were vastly cheaper and most of the people who insisted on sticking to handcrafts soon went out of business.

Now imagine: What if the chair-makers had been extremely respectable—had already possessed very high status? What if their profession had an element of danger? What if they’d managed to frighten everyone about the dangers of improperly made chairs that might dump people on the ground and snap their necks?

Visitor: Okay, yes, we used to have Serious People who would go around and certify the making of some medicines where somebody might be tempted to cheat and use inferior ingredients. But that was before computers and outcome statistics and online ratings.

Cecie: And on our planet, Uber and Lyft are currently fighting it out with taxi companies and their pet regulators after exactly that development. But suppose the whole system was set up before the existence of online ratings. Then the carpenters might have managed to introduce occupational licensing on who could be a carpenter. So if you tried to set up a factory, your factory workers would have needed to go through the traditional carpentry apprenticeship that covered every part of every kind of furniture, before they were legally allowed to come to your factory and specialize in carving just one kind of chair-arm. And then your factory would also need a ton of permits to sell its furniture, and would need to inveigle orders from a handful of resellers who were licensed to buy and resell furniture at a fixed margin. That small, insular group of resellers might not benefit literally personally—in their own personal salary—from buying from your cheaper factory system. And so it would go.

Visitor: But why would the legislators go along with that?

Cecie: Because the carpenters would have a big, concentrated incentive to figure out how to make legislators do it—maybe by hiring very persuasive people, or by subtle bribery, or by not-so-subtle bribery.

Insofar as occupational licensing works to the benefit of professionals at the expense of consumers, occupational licensing represents a kind of regulatory capture, which happens when a few regulatees have a much more concentrated incentive to affect the regulation process. Regulatory capture in turn is a kind of commons problem, since every citizen shares the benefits of non-captured regulation, but no individual citizen has a sufficient incentive to unilaterally spend their life attending to that particular regulatory problem. So occupational licensing is regulatory capture is a commons problem is a coordination problem.

Visitor: Then… the upshot is that it’s impossible for your country to test a functional hospital design in the first place? The reformers can’t win the competition because they’re not legally allowed to try?

Cecie: But of course. Though in this case, if you did manage to set up a test hospital working along more reasonable lines, you still wouldn’t be able to advertise your better results relative to any other hospitals. With just a few isolated exceptions, all of the other hospitals on Earth don’t publish patient outcome statistics in the first place.

Visitor: … But… then—what are they even selling?

Simplicio: Hold on. If you reward the doctors with the highest patient survival rates, won’t they just reject all the patients with poor prognoses?

Visitor: Obviously you don’t evaluate raw survival rates. You have Diagnosticians who estimate prognosis categories and are rated on their predictive accuracy, and Treatment Planners and Surgeons who are rated on their relative outcomes, and you have the outcomes evaluated by a third party, and—

Cecie: In our world, there’s no separation of powers where one person assigns patients a prognosis category and has their prediction record tracked, and another person does their best to treat them and has their treatment record tracked. So hospitals don’t publish any performance statistics, and patients choose the hospital closest to their house that takes their workplace’s insurance, and nobody has any financial incentive to decrease the number of patient deaths from sloppy surgeons or central line infections. When anesthesiologists in particular did happen to start tracking patient outcomes, they adopted some simple monitoring standards and subsequently decreased their fatality rates by a factor of one hundred.⁸ But that’s just anesthesiologists, not, say, cardiac surgeons.

With cardiac surgeons, a group of researchers recently figured out how to detect when the most senior cardiac surgeons were at conferences, and found that the death rates went down while the most senior cardiac surgeons were away.⁹ But our scientists have to use special tricks if they want to find out any facts like that.
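The Visitor's answer to Simplicio, rating forecasters on predictive accuracy and treaters on outcomes relative to prognosis, can be sketched roughly as follows. The `Case` record, the two scoring functions, and all of the numbers are hypothetical illustrations, not a description of any real hospital rating system.

```python
from dataclasses import dataclass


@dataclass
class Case:
    predicted_death_risk: float  # Diagnostician's forecast, between 0 and 1
    died: bool                   # outcome as recorded by a third party


def raw_mortality(cases):
    """Fraction of patients who died, ignoring how sick they were."""
    return sum(c.died for c in cases) / len(cases)


def brier_score(cases):
    """Mean squared error of the forecasts; lower means more accurate forecasts."""
    return sum((c.predicted_death_risk - c.died) ** 2 for c in cases) / len(cases)


def observed_to_expected(cases):
    """Observed deaths divided by the deaths the forecasts implied.

    Below 1.0 means patients did better than their prognoses suggested.
    """
    expected = sum(c.predicted_death_risk for c in cases)
    return sum(c.died for c in cases) / expected


# Invented caseloads: A accepts very sick patients, B only easy ones.
SURGEON_A = [Case(0.5, True)] + [Case(0.5, False)] * 3
SURGEON_B = [Case(0.1, True)] * 2 + [Case(0.1, False)] * 8

if __name__ == "__main__":
    for name, cases in (("A", SURGEON_A), ("B", SURGEON_B)):
        print(name, raw_mortality(cases), observed_to_expected(cases))
```

On raw mortality the hypothetical surgeon B looks better, since fewer of B's patients die. On the observed-to-expected ratio A looks better, because A's patients die about half as often as forecast while B's die about twice as often. Under this kind of rating, turning away patients with poor prognoses does not improve a surgeon's score.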

Visitor: Do your patients not care if they live or die?

Cecie: Robin Hanson has a further thesis about how what people really want from medicine is reassurance rather than statistics. But I’m not sure that hypothesis is necessary to explain this particular aspect of the problem. If no hospital offers statistics, then you have no baseline to compare to if one hospital does start offering statistics. You’d just be looking at an alarming-looking percentage for how many patients die, with no idea of whether that’s a better percentage or a worse percentage. Terrible marketing! Especially compared to that other hospital across town that just smiles at you reassuringly.

No hospital would benefit from being the first to publish statistics, so none of them do.

Visitor: Your world has literally zero market demand for empirical evidence?

Cecie: Not zero, no. But since publishing scary numbers would be bad marketing for most patients, and hospitals are heavily regional, they all go by the majority preference to not hear about the statistics.

Visitor: I confess I’m having some trouble grasping the concept of a market consisting of opaque boxes allegedly containing goods, in which nobody publishes what is inside the boxes.

Cecie: Hospitals don’t publish prices either, in most cases.

Visitor: …

Cecie: Yeah, it’s pretty bad even by Earth standards.

Visitor: You literally don’t have a healthcare market. Nobody knows what outcomes are being sold. Nobody knows what the prices are.

Cecie: I guess we could call that Total Market Failure? As in, things have gone so wrong that there’s literally no supply-demand matching or price-equilibrating mechanism remaining, even though money is still changing hands.

And while I wish that this phenomenon of “you simply don’t have a market” were only relevant to healthcare and not to other facets of our civilization… well, it’s not.

vi. Absence of (meta-)competition

Visitor: I suppose I can imagine a hypothetical world in which one country screws things up as badly as you describe. But your planet has multiple governments, I thought. Or did I misunderstand that? Why wouldn’t patients emigrate to—or just visit—countries that made better hospitals legal?

Cecie: The forces acting on governments with high technology levels are mostly the same between countries, so all the governments of those countries tend to have their medical system screwed up in mostly the same way (not least because they’re imitating each other). Some aspects of dysfunctional insurance and payment policies are special to the US, but even the relatively functional National Health Service in Britain still has failure of professional specialization. (Though they at least don’t require doctors to have philosophy degrees.)

Visitor: Is there not one government that would allow a reasonably designed hospital staffed by specialists instead of generalists?

Cecie: It wouldn’t be enough to just have one government’s okay. You’d need some way to initially train your workers, despite none of our world’s medical schools being set up to train them. A majority of legislators won’t benefit personally from deciding to let you try your new hospital in their country. Furthermore, you couldn’t just go around raising money from rich countries for a venture in a poor country, because rich countries have elaborate regulations on who’s allowed to raise money for business ventures through equity sales. The fundamental story is that everything, everywhere, is covered with varying degrees of molasses, and to do any novel thing you have to get around all of the molasses streams simultaneously.

Visitor: So it’s impossible to test a functional hospital design anywhere on the planet?

Cecie: But of course.

Visitor: I must still be missing something. I just don’t understand why all of the people with economics training on your planet can’t go off by themselves and establish their own hospitals. Do you literally have people occupying every square mile of land?

Cecie: … How do I phrase this…

All useful land is already claimed by some national government, in a way that the international order recognizes, whether or not that land is inhabited. No relevant decisionmaker has a personal incentive to allow there to be unclaimed land. Those countries will defend even a very small patch of that claimed land using all of the military force their country has available, and the international order will see you as the aggressor in that case.

Visitor: Can you buy land?

Cecie: You can’t buy the sovereignty on the land. Even if you had a lot of money, any country poor enough and desperate enough to consider your offer might just steal your stuff after you moved in.

Negotiating the right to bring in weapons to defend yourself in this kind of scenario would be even more unthinkable, and would spark international outrage that could prevent you from trading with other countries.

To be clear, it’s not that there’s a global dictator who prevents new countries from popping up; but every potentially useful part of every land is under some system’s control, and all of those systems would refuse you the chance to set up your own alternative system, for very similar reasons.

Visitor: So there’s no way for your planet to try different ways of doing things, anywhere. You literally cannot run experiments about things like this.

Cecie: Why would there be? Who would decide that, and how would they personally benefit?

Visitor: That sounds extremely alarming. I mean, difficulties of adoption are one thing, but not even being able to try new things and see what happens… Shouldn’t everyone on your planet be able to detect at a glance how horrible things have become? Can this type of disaster really stand up to universal agreement that something is wrong?

Cecie: I’m afraid that our civilization doesn’t have a sufficiently stirring and narratively satisfying conception of the valor of “testing things” that our people would be massively alarmed by its impossibility. And now, Visitor, I hope we’ve bottomed out the general concept of why people can’t do things differently—the local system’s equilibrium is broken, and the larger system’s equilibrium makes it impossible to flee the game.

Visitor: Okay, look… despite everything you’ve said so far, I still have some trouble understanding why doctors and parents can’t just not kill the babies. I manage to get up every single morning and successfully not kill any babies. It’s not as hard as it sounds.

Cecie: I worry you’re starting to think like Simplicio. You can’t just not kill babies and expect to get away with it.

Simplicio: I actually agree with Cecie here. The evil people behind the system hate those who defy them by behaving differently; there’s no way they’d countenance anyone departing from the norm. What we really need is a revolution, so we can depose our corrupt overlords, and finally be free to coordinate, and…!

Cecie: There’s no need to add in any evil conspiracy hypotheses here.

It’s sufficient to note that the system is in equilibrium and it has causes for the equilibrium settling there—causes, if not justifications. You can’t go against the system’s default without going against the forces that underpin that default. A doctor who gives a baby a nutrition formula that isn’t FDA-approved will lose their job. A hospital that doesn’t fire that kind of doctor will be sued. A scientist who writes proposals for a big, expensive, definitive study won’t get a grant, and while they were busy writing those failed grant proposals, they’ll have lost their momentum toward tenure. So no, you can’t just try out a competing policy of not killing babies. Not more than once.

Visitor: Have you tried?

Cecie: No.

Visitor: But—

Cecie: Anyway, from my perspective, it’s no surprise if you don’t yet feel like you understand. We’ve only begun to survey the malfunctions of the whole system, which would further include the FDA, and the clinical trials, and the p-hacking. And the way venture capital is structured, and equity-market regulations. And the insurance companies, and the tax code. And the corporations who contract with the insurance companies. And the corporations’ employees. And the politicians. And the voters.

Visitor: … Consider me impressed that your planet managed to reach this level of dysfunction without actually physically bursting into flames.

Cross-posted to LessWrong and equilibriabook.com. Next: Moloch's Toolbox (2/2).

Carl Shulman notes that the Affordable Care Act linked federal payments to hospitals with reducing central-line infections (source), which was probably a factor in the change. ↩
Around a thousand infants are born with short bowel syndrome per year in the United States, of whom two-thirds develop parenteral nutrition-associated liver disease (source). See Park, Nespor, and Kerner Jr for a 2011 review of the academic literature, and Koch, Cohen, and Carroll and Madrzyk for news coverage. ↩
See Tabarrok’s “Assessing the FDA via the Anomaly of Off-Label Drug Prescribing,” which cites the widespread practice of off-label prescription as evidence that the FDA’s efficacy trial requirements are unnecessary. ↩
See the “Report Likelihoods, Not p-Values” FAQ, or, in dialogue form: “Likelihood Functions, p-Values, and the Replication Crisis.” ↩
From Schmidt and Hunter’s “Select on Intelligence”: “Intelligence is the major determinant of job performance, and therefore hiring people based on intelligence leads to marked improvements in job performance.” See also psychologist Stuart Ritchie’s discussion of IQ in Vox.

Software engineer Alyssa Vance adds:

I’ll note that, as far as I can tell, the informal consensus at least among the best-informed people in software is that hiring has tons of obvious irrationality even when there’s definitely no external cause; see [1] and [2]. In terms of Moloch’s toolbox, the obvious reason for that is that interviewers are rarely judged on the quality of the people they accept, and when they are, certainly aren’t paid more or less based on it. (Never mind the people they reject. “Nobody ever got fired because of the later performance of someone they turned down.”) Their incentive, insofar as they have one, is to hire people who they’d most prefer to be on the same floor with all day long. ↩
Compare psychiatrist Scott Alexander’s account, in “Against Tulip Subsidies”:

In America, aspiring doctors do four years of undergrad in whatever area they want (I did Philosophy), then four more years of medical school, for a total of eight years post-high school education. In Ireland, aspiring doctors go straight from high school to medical school and finish after five years. I’ve done medicine in both America and Ireland. The doctors in both countries are about equally good. When Irish doctors take the American standardized tests, they usually do pretty well. Ireland is one of the approximately 100% of First World countries that gets better health outcomes than the United States. There’s no evidence whatsoever that American doctors gain anything from those three extra years of undergrad. And why would they? Why is having a philosophy degree under my belt supposed to make me any better at medicine? […]

I’ll make another confession. Ireland’s medical school is five years as opposed to America’s four because the Irish spend their first year teaching the basic sciences—biology, organic chemistry, physics, calculus. When I applied to medical school in Ireland, they offered me an accelerated four year program on the grounds that I had surely gotten all of those in my American undergraduate work. I hadn’t. I read some books about them over the summer and did just fine.

Americans take eight years to become doctors. Irishmen can do it in four, and achieve the same result. Each year of higher education at a good school—let’s say an Ivy, doctors don’t study at Podunk Community College—costs about $50,000. So American medical students are paying an extra $200,000 for…what?

Remember, a modest amount of the current health care crisis is caused by doctors’ crippling level of debt. Socially responsible doctors often consider less lucrative careers helping the needy, right up until the bill comes due from their education and they realize they have to make a lot of money right now. We took one look at that problem and said “You know, let’s make doctors pay an extra $200,000 for no reason.”

For a more general discussion of the evidence that college is chiefly a costly signal of pre-existing ability, rather than a mechanism for building skills and improving productivity, see Bryan Caplan’s argument in “Is College Worth It?”, also summarized by Roger Barris. ↩
See, e.g., Scott Alexander’s “My IRB Nightmare.” ↩
From Hyman and Silver, “You Get What You Pay For”:

By the 1950s, death rates ranged between 1 and 10 per 10,000 encounters. Anesthesia mortality stabilized at this rate for more than two decades. Mortality and morbidity rates fell again after a 1978 article reframed the issue of anesthesia safety as one of human factor analysis. In the mid-1980s, the American Society of Anesthesiologists (ASA) promulgated standards of optimal anesthesia practice that relied heavily on systems-based approaches for preventing errors. Because patients frequently sued anesthetists when bad outcomes occurred and because deviations from the ASA guidelines made the imposition of liability much more likely, anesthetists had substantial incentives to comply.

[… W]e should consider why anesthesia mortality stabilized at a rate more than one hundred times higher than its current level for more than two decades. The problem was not lack of information. To the contrary, anesthesia safety was studied extensively during the period. A better hypothesis is that anesthetists grew accustomed to a mortality rate that was exemplary by health care standards, but that was still higher than it should have been. From a psychological perspective, this low frequency encouraged anesthetists to treat each bad outcome as a tragic but unforeseen and unpreventable event. Indeed, anesthetists likely viewed each individual bad outcome as the manifestation of an irreducible baseline rate of medical mishap.

Hyman and Silver note other possible factors behind the large change, e.g., the fact that the person responsible for mishaps was often easy to identify since there tended to be only one anesthetist per procedure, and that “because surgical patients had no on-going relationships with their anesthetist, victims were particularly likely to sue.” ↩
See Jena, Prasad, Goldman, and Romley, “Mortality and Treatment Patterns Among Patients Hospitalized With Acute Cardiovascular Conditions During Dates of National Cardiology Meetings.” ↩

Aaron Gertler, Nov 5 2017

One Molochian factor that was briefly mentioned in the dead-baby example: The people most skilled at generating outrage, at least until good-aligned organizations get good at training people to generate outrage, will typically generate outrage about more-or-less random topics that happen to affect them.

See, for example, the one-man campaign by a heart surgeon, whose wife died due to very rare complications, to reduce the odds of those rare complications ever happening -- and getting unusually rapid support from the FDA, because he made a Change.org petition and writes in a style that is accessible, yet sufficiently medical-sounding, to draw attention from many different groups.

(I'm no medical expert, but the surgeon's suggestions are controversial, and many doctors seem to think they'll cause more harm than good by squeezing out the good uses of the procedure which caused the complications.)

https://www.change.org/p/women-s-health-alert-deadly-cancers-of-the-uterus-spread-by-gynecologists-stop-morcellating-the-uterus-in-minimally-invasive-and-robot-assisted-hysterectomy

If this person had been the father of a child who died of parenteral nutrition-associated liver disease, the FDA might well have acted on that issue instead. But it's hard to point people like this in the "right direction".

Lila, Nov 4 2017

The p-value critique doesn't apply to many scientific fields. As far as I can tell, it mostly applies to social science and maybe epidemiological research. In basic biological research, a paper wouldn't be published in a good journal on the basis of a single p-value. In fact, many papers don't have any p-values. When p-values are presented, they're often so low (10^-15) that they're unnecessary confirmations of a clearly visible effect. (Silly, in my opinion.) Most papers rely on many experiments, which ideally provide multiple lines of evidence. It's also common to propose a mechanism that's plausible given the existing literature. In some cases, you can see the fingerprints of skeptical reviewers. For example, when I see "to exclude the possibility that", I assume that this experiment was added later at the demand of a reviewer. Published biology is often wrong, but for subtler reasons.

CarlShulman | Nov 5 2017 | 14

"The p-value critique doesn't apply to many scientific fields." I agree with this, or at least that it is vastly weaker when overwhelming data are available to pin down results.

"As far as I can tell, it mostly applies to social science and maybe epidemiological research."

I disagree with this.

For instance, p-value issues have been catastrophic in quantitative genetics. The vast bulk of candidate gene research in genetics was non-replicable p-hacking of radically underpowered studies. E.g., schizophrenia candidate genes replicate at chance levels in massive replication attempts, yet they had whole literatures of studies that were artifacts of p-hacking and publication bias. The field moved to requiring genome-wide significance of 5*10^-8 (i.e. Bonferroni corrections for multiple testing at all measured variants). Results obtained in huge genome-wide association studies that meet that criterion replicate reliably.
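As a rough illustration of where a threshold like 5*10^-8 comes from, here is a minimal sketch of a Bonferroni correction. It assumes the conventional approximation of about one million independent common variants being tested; that count and the names used below are illustrative assumptions, not figures taken from the comment above.

```python
# Minimal sketch of a Bonferroni correction for multiple testing.
# ASSUMED_INDEPENDENT_TESTS is an illustrative assumption (the conventional
# "about one million independent variants" approximation), not a number
# stated in the comment this follows.

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests

ALPHA = 0.05
ASSUMED_INDEPENDENT_TESTS = 1_000_000

threshold = bonferroni_threshold(ALPHA, ASSUMED_INDEPENDENT_TESTS)

# Expected number of false positives if every tested variant were truly null:
expected_fp_uncorrected = ALPHA * ASSUMED_INDEPENDENT_TESTS
expected_fp_corrected = threshold * ASSUMED_INDEPENDENT_TESTS

print(f"per-test threshold: {threshold:.0e}")
print(f"expected false positives at p < 0.05: {expected_fp_uncorrected:,.0f}")
print(f"expected false positives at corrected threshold: {expected_fp_corrected:.2f}")
```

Under these assumptions, an uncorrected 0.05 cutoff would be expected to flag tens of thousands of null variants, which is one way to see why small candidate-gene studies with conventional thresholds could fill a literature with non-replicable hits.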

ETA: It isn't basic biological research, but medical and drug trials routinely have severe p-hacking issues. And there have been a lot of reproducibility problems reported with, e.g. preclinical cancer research, often lacking slam dunk evidence. The Reproducibility Project: Cancer is working on that.

Medical studies take up the bulk of biomedical research funds, and Eliezer's example is at the intersection of medicine and nutrition.

ETA2: I don't think issues of p-hacking would be solved just by using Bayesian statistics: people can instead selectively report Bayes factors, i.e. posterior hacking. It's the selective use of analytic and reporting degrees of freedom that's central. Here's Daryl Bem and coauthors' Bayesian meta-analysis purporting to show psi in Bem's p-hacked experiments.

surfergirl | Nov 8 2017 | 3

medical and drug trials routinely have severe p-hacking issues. And there have been a lot of reproducibility problems reported with, e.g. preclinical cancer research, often lacking slam dunk evidence.

Due to my medical problems I have been reading medical literature for 25 years, and indeed it is a catastrophe of p-hacking and the like, incompetent statistical analysis, and very often a basic misunderstanding of what p-values mean. You routinely see researchers claiming "no effect" when the p-value is slightly over 0.05.
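To make the "no effect" point concrete, here is a small sketch using a normal approximation. The estimate and standard error are hypothetical numbers chosen so that the p-value lands just above 0.05; they do not come from any real study.

```python
# Sketch: a p-value slightly above 0.05 is not evidence of "no effect".
# The estimate and standard error below are hypothetical, chosen only to
# produce a two-sided p-value just over the conventional cutoff.
from statistics import NormalDist

estimate = 1.9    # hypothetical treatment effect, arbitrary units
std_error = 1.0   # hypothetical standard error of that estimate

std_normal = NormalDist()  # standard normal: mean 0, sd 1

# Two-sided p-value for the null hypothesis of zero effect.
z = estimate / std_error
p_value = 2 * (1 - std_normal.cdf(abs(z)))

# 95% confidence interval under the same normal approximation.
z_crit = std_normal.inv_cdf(0.975)
ci_low = estimate - z_crit * std_error
ci_high = estimate + z_crit * std_error

print(f"p-value: {p_value:.3f}")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```

With these made-up inputs the interval barely includes zero but also includes effects about twice the point estimate, so the data are compatible with a large effect as well as with none. Reporting such a result as "no effect" throws away most of what the data say.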

Usually, medical papers are misleading in some serious way. The best you can hope for is that they waste the vast majority of the value in the data.

People who read only abstracts and think they are learning something are deluding themselves. You have to go through the methods section carefully, and even then not all the shenanigans are disclosed. You also have to look very closely at the sponsorship of the parties to the study (researchers, journal editors, institutions, etc.) to pick up the extreme biases that result from sponsorship.

Lila | Nov 5 2017 | 1

I consider GWAS applied, not basic, because it's not mechanistic. Most biologists I've spoken to have a fairly poor opinion of GWAS, as do I. Much of the biological research that gets funded is basic.

Denkenberger | Nov 8 2017 | 2

As for the value of college for non-doctors, what about the study of randomly chosen GI Bill recipients, which found that college did have significant causal benefits (that is, it was not just a correlation arising from colleges choosing better-qualified people)?

RobBensinger | Nov 9 2017 | 1

I'm not an expert in this area and haven't seen that study, but I believe Eliezer generally defers to Bryan Caplan's analysis on this topic. Caplan's view, discussed in The Case Against Education (which is scheduled to come out in two months), is that something like 80% of the time students spend in school is signaling, and something like 80% of the financial reward students enjoy from school is due to signaling. So the claim isn't that school does nothing to build human capital, just that a very large chunk of schooling is destroying value.

Denkenberger | Nov 18 2017 | 0

Wow - is there a paper to this effect? I would be surprised if it is that high for the technical fields.

Ben Pace | Nov 18 2017 | 1

I haven't read Caplan's book, but I can imagine >50% of the math learned in a math course being not used in a technical career outside of research, and furthermore that the heuristics picked up in those courses are not generalisable (e.g. geometry heuristics not applying to differential equations).

Effective Altruism Forum

