Mistakes in the moral mathematics of existential risk (Part 1: Introduction and cumulative risk) - Reflective altruism

Eevee🔹; David Thorstad

This is a linkpost for https://ineffectivealtruismblog.com/2023/05/27/mistakes-in-the-moral-mathematics-of-existential-risk-part-1-introduction-and-cumulative-risk/

This is the first part of "Mistakes in the moral mathematics of existential risk", a series of blog posts by David Thorstad that aims to identify ways in which estimates of the value of reducing existential risk have been inflated. I've made this linkpost part of a sequence.

Even if we use … conservative estimates, which entirely ignor[e] the possibility of space colonization and software minds, we find that the expected loss of an existential catastrophe is greater than the value of 10¹⁶ human lives. This implies that the expected value of reducing existential risk by a mere one millionth of one percentage point is at least a hundred times the value of a million human lives.
Nick Bostrom, “Existential risk prevention as global priority”

1. Introduction

This is Part 1 of a series based on my paper, “Mistakes in the moral mathematics of existential risk.”

(Almost) everyone agrees that human extinction would be a bad thing, and that actions which reduce the chance of human extinction have positive value. But some authors assign quite high value to extinction mitigation efforts. For example:

Nick Bostrom argues that even on the most conservative assumptions, reducing existential risk by just one millionth of one percentage point would be as valuable as saving a hundred million lives today.
Hilary Greaves and Will MacAskill estimate that early asteroid-detection efforts saved lives at an expected cost of fourteen cents per life.

These numbers are a bit on the high side. If they are correct, then on many philosophical views the truth of longtermism will be (nearly) a foregone conclusion.

I think that these, and other similar estimates, are inflated by many orders of magnitude. My paper and blog series “Existential risk pessimism and the time of perils” brought out one way in which these numbers may be too high: they will be overestimates unless the Time of Perils Hypothesis is true.

My aim in this paper is to bring out three novel ways in which many leading estimates of the value of existential risk mitigation have been inflated. (The paper should be online as a working paper within a month.)

I’ll introduce the mistakes in detail throughout the series, but it might be helpful to list them now.

Mistake 1: Focusing on cumulative risk rather than per-unit risk.
Mistake 2: Ignoring background risk.
Mistake 3: Neglecting population dynamics.

I show how many leading estimates make one, or often more than one, of these mistakes.

Correcting these mistakes in the moral mathematics of existential risk has two important implications.

First, many debates have been mislocated, insofar as factors such as background risk and population dynamics are highly relevant to the value of existential risk mitigation, but these factors have rarely figured in recent debates.
Second, many authors have overestimated the value of existential risk mitigation, often by many orders of magnitude.

In this series, I review each mistake in turn. Then I consider implications of this discussion for current and future debates. Today, I look at the first mistake, focusing on cumulative rather than per-unit risk.

2. Bostrom’s conservative scenario

Nick Bostrom (2013) considers what he terms a conservative scenario in which humanity survives for a billion years on the planet Earth, at a stable population of one billion humans.

We will see throughout this series that this is far from a conservative scenario. Modeling background risk (correcting the second mistake) will put pressure on the likelihood of humanity surviving for a billion years. And modeling population dynamics (correcting the third mistake) will raise the possibility that humanity may survive at a population far below one billion people. However, let us put aside these worries for now and consider Bostrom’s scenario as described.

In this scenario, there are 10¹⁸ human life-years yet to be lived, or just over 10¹⁶ lives at current lifespans. Bostrom uses these figures to make a startling claim: reducing existential risk by just one millionth of one percent is, in expectation, as valuable as saving one hundred million people.

Even if we use … conservative estimates, which entirely ignor[e] the possibility of space colonization and software minds, we find that the expected loss of an existential catastrophe is greater than the value of 10¹⁶ human lives. This implies that the expected value of reducing existential risk by a mere one millionth of one percentage point is at least a hundred times the value of a million human lives.

It can seem obvious that Bostrom must be correct here. After all, a reduction of existential risk by one millionth of one percentage point gives a 10^-8 chance of saving just over 10¹⁶ people, and so in expectation it saves just more than 10⁸ lives, or a hundred million lives.
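The arithmetic behind this claim is a single multiplication. A minimal sketch, using only the figures quoted from Bostrom's scenario (the variable names are illustrative, not Bostrom's):

```python
# Figures from Bostrom's conservative scenario as quoted above.
future_lives = 10**16      # lower bound on future lives at stake
risk_reduction = 1e-8      # one millionth of one percentage point, absolute

# Expected lives saved = probability shift * lives at stake.
expected_lives_saved = risk_reduction * future_lives
print(f"{expected_lives_saved:.0e} expected lives")
```

Nothing in this calculation says how hard it is to achieve the 10^-8 shift, which is the question the rest of the post takes up.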

Today, we will see that this estimate is not strictly speaking false. It is rather worse than false: it is badly misleading. Once the estimate is described in more revealing terms, we will see that the seemingly small reduction of 10^-8 in the chance of existential catastrophe required to deliver an expected value equivalent to a hundred million lives saved is better described as a very large reduction in existential risk.

3. Relative and absolute risk reduction

To see the point, we need to make two distinctions. First, reductions in risk can be described in two ways.

Typically, we speak about relative reductions in risk, which chop a specified fraction off the current amount of risk. It is in this sense that a 10% reduction in risk takes us from 80% to 72% risk, from 20% to 18% risk, or from 2% to 1.8% risk. (Formally, relative risk reduction by f takes us from risk r to risk (1-f)r).

More rarely, we talk about absolute reductions, which subtract an absolute amount from the current level of risk. It is in this sense that a 10% reduction in risk takes us from 80% to 70% risk, from 20% to 10% risk, or from 10% to 0% risk. (Formally, absolute risk reduction by f takes us from risk r to risk r – f).
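The two senses of "a 10% reduction" can be put side by side in a few lines. This is only a sketch of the two formulas, (1 - f)r and r - f; the function names are made up for illustration:

```python
def relative_reduction(r: float, f: float) -> float:
    """Relative reduction by fraction f: risk r becomes (1 - f) * r."""
    return (1 - f) * r


def absolute_reduction(r: float, f: float) -> float:
    """Absolute reduction by amount f: risk r becomes r - f."""
    return r - f


# Apply the same nominal "10% reduction" to several starting risks.
for r in (0.80, 0.20, 0.10):
    rel = relative_reduction(r, 0.10)
    ab = absolute_reduction(r, 0.10)
    print(f"start {r:.0%}: relative -> {rel:.1%}, absolute -> {ab:.1%}")
```

The gap between the two columns widens as the starting risk falls, which is why the choice of description matters more when risk is low.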

Bostrom must be concerned with absolute risk reduction for his argument to make sense. Otherwise, the benefit of existential risk reduction would have to be multiplied by the current level of risk r.

In general, a focus on absolute risk isn’t especially nefarious, just a bit nonstandard. It does tend to overstate risk reduction a bit, since absolute risk reduction is equivalent to relative risk reduction from a starting point of 100% risk. In this way, stating risk reduction in absolute rather than relative terms overstates it by a factor of 1/r, where r is the starting level of risk. This can be quite a strong boost if starting levels of risk are low. However, many effective altruists think that levels of existential risk are rather high, in which case the overstatement may be one order of magnitude or less. Let’s not dwell on this.

4. Cumulative and per-unit risk

Over a long period (say, a billion years), we can report risk in two ways. On the one hand, we can report the cumulative risk r_C that a catastrophe will occur at least once throughout the period. Cumulative risk over a billion years can be quite high: it’s hard to go a billion years without catastrophe.

On the other hand, we can divide a long period into smaller units (say, centuries). Then we can report the per-unit risk r_U that a catastrophe will occur in any given unit.

How is cumulative risk related to per-unit risk? Well, if there are N units, then we have:

r_C = 1-(1-r_U)^N

Therein lies the rub, for if N is very high (in this case, N = ten million!), then for almost any value of r_U, the rightmost term will be driven exponentially towards zero, so that r_C is driven almost inescapably towards one. This means that over a long period, driving r_C meaningfully away from one requires very low values of r_U.
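To get a feel for how quickly cumulative risk saturates, here is a small sketch of the formula r_C = 1-(1-r_U)^N. It assumes, as the formula does, that per-unit risk is constant and independent across units; the function name and the sample values of r_U are illustrative choices.

```python
import math


def cumulative_risk(r_u: float, n: int) -> float:
    """Chance of at least one catastrophe across n independent units,
    each carrying the same per-unit risk r_u."""
    # Equals 1 - (1 - r_u)**n; log1p/expm1 keep it accurate for tiny r_u.
    return -math.expm1(n * math.log1p(-r_u))


N = 10_000_000  # a billion years divided into centuries

for r_u in (1e-1, 1e-3, 1e-5, 1e-6, 1e-7):
    print(f"r_U = {r_u:.0e}  ->  r_C = {cumulative_risk(r_u, N):.6f}")
```

Only once r_U gets down to around 1/N does r_C move noticeably away from one.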

Therein lies the rub, because Bostrom is concerned with cumulative risk. For an intervention to be credited with increasing the survival probability not only of current humans, but also of all future humans, by 10^-8, that intervention must reduce cumulative risk by 10^-8.

No problem, you say. Surely it cannot be so hard to reduce cumulative risk by a mere one millionth of one percent. However, an absolute reduction of cumulative risk by 10^-8 requires (by definition) driving cumulative risk at least below 1-10^-8. Again, you say, that must be easy. Not so. Driving cumulative risk this low requires driving per-century risk to about 1.6*10^-6, barely one in a million.
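The per-century figure can be checked by inverting r_C = 1-(1-r_U)^N. The sketch below assumes a constant, independent per-century risk over N = ten million centuries, and works with the survival probability 1 - r_C directly to avoid rounding problems; the function name is hypothetical.

```python
import math


def per_unit_risk_for_survival(survival: float, n: int) -> float:
    """Constant per-unit risk r_u such that (1 - r_u)**n == survival."""
    # Solve (1 - r_u)**n = survival  =>  r_u = 1 - survival**(1/n).
    return -math.expm1(math.log(survival) / n)


N = 10_000_000            # centuries in a billion years
target_survival = 1e-8    # cumulative risk of exactly 1 - 10^-8

r_u = per_unit_risk_for_survival(target_survival, N)
print(f"required per-century risk: {r_u:.2e}")
```

Under these assumptions the result comes out at roughly 1.8 × 10^-6, slightly above but of the same order as the figure quoted above, so the qualitative point (per-century risk of about one in a million) is unaffected.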

5. First mistake: Focusing on cumulative over per-unit risk

We can describe this intervention in two ways. On the one hand, we can describe it flatteringly, as Bostrom does, in terms of (absolute) cumulative risk reduction. All that’s needed, Bostrom says, is a reduction by one millionth of one percent.

On the other hand, we can describe it unflatteringly, in the more standard terms of relative per-unit risk reduction. Now what’s needed is driving risk to almost one in a million per century. By contrast, many effective altruists think that per-century risk is currently above 10%. This would put us at a relative risk reduction on the order of 100,000x: not a reduction by one part in a million, but a reduction to roughly one hundred-thousandth of current risk.
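The size of that relative reduction is easy to make explicit. A rough sketch, taking 10% as the starting per-century risk and one in a million as the target (both are round numbers from the paragraph above, not precise estimates):

```python
current_risk = 0.10   # assumed per-century risk today (the 10% figure above)
target_risk = 1e-6    # roughly one in a million per century

# How many times smaller the target is than the starting point.
fold_reduction = current_risk / target_risk

# The same change expressed as the fraction f in (1 - f) * r.
relative_cut = 1 - target_risk / current_risk

print(f"fold reduction: {fold_reduction:,.0f}x")
print(f"relative cut:   {relative_cut:.3%}")
```

Described this way, the intervention has to remove all but about one hundred-thousandth of per-century risk, and keep it there in every century.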

Our first mistake in the moral mathematics of existential risk is therefore focusing on cumulative rather than per-unit risk. This is a mistake for two reasons.

First, as we saw, focusing on cumulative risk significantly understates risk by describing very, very large changes in per-century risk as very, very small changes in cumulative risk. This gives the misleading sense that what is, in quite a natural and intuitive sense, an astronomically large change, is in fact only a tiny change.

Second, focusing on cumulative risk moves debates away from what we can affect. Our actions may reduce risk in our own century, and perhaps if we are lucky they will even affect risk in nearby centuries. But it is unlikely that we can predictably affect risk in far distant centuries with anything approaching the ease that we affect risk in nearby centuries. For this reason, in assessing the value of feasible acts taken to mitigate existential risk, we should focus on per-unit risk rather than cumulative risk in order to bring the focus back to quantities that our actions can predictably affect.

6. Conclusion

Today’s post introduced the series and discussed a first mistake in the moral mathematics of existential risk: focusing on cumulative over per-century risk. We saw that a leading paper by Nick Bostrom makes this mistake, and that once the mistake is corrected, what appeared to be a very small change in existential risk turned out in a natural sense to be a very large change. We will also see in the next post that another leading paper makes the same mistake.

We will also look at two more mistakes in the moral mathematics of existential risk: ignoring background risk, and neglecting population dynamics. We will see how these mistakes combine in leading papers to overestimate levels of existential risk, and serve to mislocate debates about the value of existential risk mitigation.

Comments

Joe Benton (Jul 3 2023)

I think a key point of contention here with many people (including me) who would endorse the argument for working to mitigate existential risks might be your background assumption that the per-unit risk rate is constant over time. I would put substantial probability (at least at a conservative minimum) on us being in a relatively short period of heightened existential risk, which will then be followed by a much longer and safer period. I think if you put substantial credence on this, then small relative reductions of the risk rate in this century still end up looking very good in expected value.

To make this concrete, consider the following simplified model. Suppose that we will face 10 centuries with a $10\%$ chance that we go extinct in each one, followed by a reduction of the per-century risk to $10^{-6}$. (Let’s suppose all these probabilities are independent for simplicity.) Under this model, the value of the future would be approximately $3.5 \times 10^{14}$ human lives.

Then, if we could decrease the relative risk of extinction this century by 1 in 100 million, this would be equivalent in expected value to saving approximately $1.4 \times 10^{6}$ lives. (To calculate this, consider the probability that we would have gone extinct this century, but our intervention prevented this, and that we would not have gone extinct in the following dangerous centuries.) Discounting by a further factor of 20 to account for my $5\%$ credence that this model is reasonable, this would give a lower bound on the value of our intervention of approximately $7 \times 10^{4}$ lives.

These are somewhat smaller than the numbers Bostrom gets (partly due to my conservative discounting for model uncertainty) but even so are large enough that I think his core point still stands.

I expect our key disagreement might be in whether we should assign non-negligible credence to humanity driving the background risk down to a very low level. While this might seem unlikely from our current perspective, I find it hard to justify putting a credence below $5\%$ on this happening. This largely comes from seeing several plausible pathways for this to happen (space colonisation, lock-in driven by TAI, etc.) plus quite a lot of epistemic modesty because predicting the future is hard.

Gideon Futerman (Jul 3 2023)

I think David has broadly addressed his views on this in "Existential Risk Pessimism and the Time of Perils" (https://globalprioritiesinstitute.org/wp-content/uploads/David-Thorstad-Existential-risk-pessimism-.pdf), which I believe this moral mathematics series is a follow-up to.

🔸Zachary Brown (Jul 3 2023)

I really appreciated this post and its sequel (and await the third in the sequence)! The "second mistake" was totally new to me, and I hadn't grasped the significance of the "first mistake". The post did persuade me that the case for existential risk reduction is less robust than I had previously thought.

One tiny thing. I think this should read "from 20% to 10% risk":

More rarely, we talk about absolute reductions, which subtract an absolute amount from the current level of risk. It is in this sense that a 10% reduction in risk takes us from 80% to 70% risk, from 20% to 18% risk, or from 10% to 0% risk. (Formally, relative risk reduction by f takes us from risk r to risk r – f).

David Thorstad (Jul 3 2023)

Whoops, thanks!

Dan_Keys (Jul 4 2023)

However, an absolute reduction of cumulative risk by 10^-8 requires (by definition) driving cumulative risk at least below 1-10^-8. Again, you say, that must be easy. Not so. Driving cumulative risk this low requires driving per-century risk to about 1.6*10^-6, barely one in a million.

I'm unclear on what this means. I currently think that humanity has better than a 10^-8 chance of surviving the next billion years, so can I just say that "driving cumulative risk at least below 1-10^-8" is already done? Is the 1.6*10^-6 per-century risk some sort of average of 10 million different per-century numbers (such that my views on the cumulative risk imply that this risk is similarly already below that number), or is this trying to force our thinking into an implausible-to-me model where the per-century risk is the same in every century, or is this talking about the first future century in which risk drops below that level?

On the whole, this decay-rate framing of the problem feels more confusing to me than something like a two-stage framing where there is some short-term risk of extinction (over the next 100 or 1000 years or similar) and then some probability of long-term survival conditional on surviving the first stage.

e.g., Suppose that someone thinks that humanity has a 10^-2 chance (1%) of surviving the next thousand years, and a 10^-4 chance (.01%) of surviving the next billion years conditional on surviving the next thousand years, and that our current actions can only affect the first of those two probabilities. Then increasing humanity's chances of surviving a billion years by 10^-8 (in absolute terms) requires adding 10^-4 to our 10^-2 chance of surviving the next thousand years (an absolute .01% increase), or, equivalently, multiplying our chances of surviving the next thousand years by x1.01 (a 1% relative increase).

David Thorstad (Jul 6 2023)

Thanks Dan! As mentioned, to think that cumulative risk is below 1-(10^-8) is to make a fairly strong claim about per-century risk. If you think we're already there, that's great!

Bostrom was actually considering something slightly stronger: the prospect of reducing cumulative risk by a further 10^(-8) from wherever it is at currently. That's going to be hard even if you think that cumulative risk is already lower than I do. So for example, you can ask what changes you'd have to make to per-century risk to drop cumulative risk from r to r-(10^-8) for any r in [0,1). Honestly, that's a more general and interesting way to do the math here. The only reasons I didn't do this are that (a) it's slightly harder, (b) most academic readers will already find per-century risk of ~one-in-a-million relatively implausible, and (c) my general aim was to illustrate the importance of carefully distinguishing between per-century risk and cumulative risk.

It might be a good idea, in rough terms, to think of a constant hazard rate as an average across all centuries. I suspect that if the variance of risk across centuries is low-ish, this is a good idea, whereas if the variance of risk across centuries is high-ish, it's a bad idea. In particular, on a time of perils view, focusing on average (mean) risk rather than explicit distributions of risk across centuries will strongly over-value the future, since a future in which much of the risk is faced early on is lower-value than a future in which risk is spread out.

Strong declining trends in hazard rates induce a time-of-perils-like structure, except that on some models they might make somewhat weaker assumptions about risk than leading time of perils models do. At least one leading time of perils model (Aschenbrenner) has a declining hazard structure. In general, the question will be how to justify a declining hazard rate, given a standard story on which (a) technology drives risk, and (b) technology is increasing rapidly. I think that some of the arguments against the time of perils hypothesis made in my paper "Existential risk pessimism and the time of perils" will be relevant here, whereas others may be less relevant, depending on your view.

In general, I'd like to emphasize the importance of arguing for views about future rates of existential risk. Sometimes effective altruists are very quick to produce models and assign probabilities to models. Models are good (they make things clear!) but they don't reduce the need to support models with arguments, and assignments of probability are not arguments, but rather statements in need of argument.