I recently wrote a philosophy paper that might be of interest to some EAs. The introduction is copied below, and the full paper is available here.
Suppose you face the following moral decision.
Dyson's Wager
You have $2,000 to use for charitable purposes. You can donate it to either of two charities.
The first charity distributes bednets in low-income countries in which malaria is endemic. With an additional $2,000 in their budget, they would prevent one additional death from malaria in the coming year. You are certain of this.
The second organisation does speculative research into how to do computations using ‘positronium’ - a form of matter which will be ubiquitous in the far future of our universe. If our universe has the right structure (which it probably does not), then in the distant future we may be able to use positronium to instantiate all of the operations of human minds living blissful lives, and thereby allow morally valuable life to survive indefinitely. (Footnotes omitted - see full text.) From your perspective as a good epistemic agent, there is some tiny, non-zero probability that, with (and only with) your donation, this research would discover a method for stable positronium computation and would be used to bring infinitely many (or just arbitrarily many) blissful lives into existence.
What ought you do, morally speaking? Which is the better option: saving a life with certainty, or pursuing a tiny probability of bringing about arbitrarily many future lives?
A common view in normative decision theory and the ethics of risk - expected value theory - says that it’s better to donate to the speculative research. Why? Each option has some probability of bringing about each of several outcomes, and each of those outcomes has some value, specified by our moral theory. Expected value theory says that the best option is whichever one has the greatest probability-weighted sum of value - the greatest expected value (distinct from expected utility - see footnotes). Here, the option with the greatest expected value is donating to the speculative research (at least on certain theories of value - more on those in a moment). So, plausibly, that’s what you should do.
This verdict is counterintuitive to many. All the more counterintuitive is that it's still better to donate to the speculative research no matter how low the probability is (short of being 0). For instance, the odds of your donation actually making the research succeed could be 1 in 10^100. (10^100 is greater than the number of atoms in the observable universe.) The chance that the research yields nothing at all would be 99.99... percent, with another 96 nines after that. And yet, says expected value theory, you ought to take the bet - despite it being almost guaranteed to turn out worse than the alternative; despite the fact that you will almost certainly have let a person die for no actual benefit. Surely not, says my own intuition. On top of that, suppose that $2,000 spent on preventing malaria would save more than one life. Suppose it would save a billion lives, or any enormous finite number. Expected value theory would say that it's still better to take the risky bet - that it would be better to risk those billion or more lives for a minuscule chance at much greater value. But endorsing that verdict, regardless of how low the probability of success and how high the cost, seems fanatical.
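To make the comparison concrete, here's a minimal sketch of the expected-value calculation in Python; the probability and the number of future lives are illustrative stand-ins of my own, not figures from the paper.

```python
# Toy expected-value comparison for Dyson's Wager. The specific numbers
# below are illustrative stand-ins, not figures from the paper.

P_SUCCESS = 1e-100        # tiny probability that the positronium research pays off
LIVES_IF_SUCCESS = 1e110  # blissful lives brought about if it does
LIVES_SAVED_SAFE = 1      # lives saved with certainty by the bednet donation

ev_safe = 1.0 * LIVES_SAVED_SAFE         # certain payoff
ev_risky = P_SUCCESS * LIVES_IF_SUCCESS  # probability-weighted payoff

print(ev_safe, ev_risky)  # 1.0 vs ~1e10, so expected value theory favours the risky donation
```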
That verdict does depend on more than just our theory of instrumental rationality, expected value theory. It also requires that our moral theory endorses totalism: that the ranking of outcomes can be represented by a total (cardinal) value of each outcome; and that this total value increases linearly, without bound, with the sum of value in all lives that ever exist. Then the outcome containing vastly more blissful lives is indeed a much better one than that in which one life is saved. And, as we increase the number of blissful lives, we can increase how much better it is without bound. No matter how low the probability of those many blissful lives, there can be enough such lives that the expected total value of the speculative research is greater than that of malaria prevention. But this isn't a problem unique to totalism. When combined with expected value theory, analogous problems face most competing views of value (axiologies), including: averageism, pure egalitarianism, maximin, maximax, and narrow person-affecting views. Those axiologies all allow possible outcomes to be unboundedly good, so it's easy enough to construct cases like Dyson's Wager for each. I'll focus on totalism here for simplicity, and also because it seems to me far more plausible than the others. But suffice it to say that just about any plausible axiology can deliver fanatical verdicts when combined with expected value theory.
A little more generally, we face fanatical verdicts if our theory of instrumental rationality (in conjunction with our theory of value) endorses Fanaticism. And to avoid fanatical verdicts it must, at minimum, avoid Fanaticism.
Fanaticism: For any (finite) probability ε > 0 (no matter how low), and for any finite value v on a cardinal scale, there is some value V which is large enough that: we are rationally required to choose the lottery L_risky over L_safe.
L_risky: (an outcome with) value V with probability ε; value 0 otherwise
L_safe: value v with probability 1
The comparison of lotteries L_risky and L_safe resembles Dyson’s Wager: one option gives a slim chance of an astronomical value V; the other a certainty of some modest value v. V need not be infinite, in case you think infinite value impossible. But v must be finite. And Fanaticism in this form implies the fanatical verdict in Dyson’s Wager, if we choose sufficiently many (perhaps infinitely many) blissful lives. Likewise, to reject the fanatical verdict in Dyson’s Wager, we must reject Fanaticism.
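To spell out why expected value theory endorses Fanaticism, here's the quick calculation, using only the lotteries defined above: EV(L_risky) = ε·V + (1 − ε)·0 = ε·V, while EV(L_safe) = v. So, for any probability ε > 0 and any finite v, picking any V > v/ε gives L_risky the greater expected value, and expected value theory then requires choosing it - which is exactly the structure exploited in Dyson's Wager.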
You might think it easy enough to reject Fanaticism. Many philosophers have done so, in the domains of practical reason and also moral decision-making. For instance, Bostrom (2009) presents a compelling reductio ad absurdum for all fanatical views in the prudential context. Bostrom (2011), Beckstead (2013), and Askell (2019) treat (a weak form of) Fanaticism as itself a reductio for theories in the moral context. Others propose theories of rationality by which we simply ignore small enough probabilities (e.g., D’Alembert 1761; Buffon 1777; Smith 2014; Monton 2019). And Tarsney (n.d.) goes to great lengths to develop a theory resembling expected value theory which specifically avoids Fanaticism.
Meanwhile, there are few defenders of Fanaticism. No philosopher I know of has explicitly defended it in print. And few philosophers have defended fanatical verdicts in cases like Dyson's Wager, with the exception of Pascal (1669) himself and those who, I suspect reluctantly, endorse his conclusion. And even they accept it only as a consequence of expected value theory, not because it has good independent justification. Even among those who endorse them, fanatical verdicts are, I suspect, seen as unfortunate skeletons in the closet of expected value theory.
I think that this situation is unfortunate. We have good reason to accept Fanaticism beyond just expected value theory. As I hope to show, there are compelling arguments in favour of Fanaticism in the moral context. As those arguments show, if we reject Fanaticism then we face disturbing implications.
The paper proceeds as follows. Section 2 addresses common motivations for rejecting Fanaticism. Section 3 introduces the formal framework needed for what follows. Sections 4 through 6 present arguments in favour of Fanaticism, each premised on weaker claims than expected value theory, and each (by my reckoning) more compelling than the last. The first is a basic continuum argument. The second is driven by the basic assumption that we can put at least some value on each lottery we face. The third shows that, to deny Fanaticism, we must accept either 'scale-inconsistency' or an absurd sensitivity to small probability differences, both of which are implausible. And the final nail in the coffin is what I will call the Indology Objection (a cousin of Parfit's classic Egyptology Objection), which shows that those who deny Fanaticism must make judgements that appear deeply irrational. Section 7 concludes.
Read the rest here.
One practical objection to Fanaticism is that, in cases where we believe in a small chance of a very large positive payoff, we should also suspect a chance of a very large negative payoff, and we may have deep uncertainty about the probabilities and payoffs, so that the sign of the expected value could be positive or negative, and could take extreme values.
So: what if this backfires spectacularly and we instantiate astronomical amounts of suffering instead? Or what if similar technology is used to instantiate astronomical suffering to threaten us into submission (strategic threats in conflict)?
Given how speculative this is and that we won't get feedback on whether we've done good or harm until it's too late, are there other ways this can go very badly that we haven't thought of?
In the case of a Pascal's mugging, when someone is threatening you, you should consider that giving in may encourage this behaviour further, leading to more threats and more threats being followed through on; and the kind of person who threatens you this way may use the resources for harm anyway.
In the case of Pascal's wager, there are multiple possible deities, each of which might punish or reward you or others (infinitely) based on your behaviour. It may turn out that you should take your chances with one or more of them, and that you should spend a lot of time and resources finding out which. This could end up being a case of unresolvable moral cluelessness, in which choosing any one of the deities isn't robustly good, nor is choosing none of them robustly bad, and you'll never be able to get past this uncertainty. Then you may have multiple permissible options (according to the maximality rule, say), but given a particular choice (or non-choice), there could still be things that appear to you to be robustly better than others, e.g. donating to charity X instead of charity Y, or not torturing kittens for fun instead of torturing kittens for fun. This doesn't mean anything goes.
Just a note on the Pascal's Mugging case: I do think the case can probably be overcome by appealing to some aspect of the strategic interaction between different agents. But I don't think it comes out of the worry that they'll continue mugging you over and over. Suppose you (morally) value losing $5 to the mugger at -5 and losing nothing at 0 (on some cardinal scale). And you value losing every dollar you ever earn in your life at -5,000,000. And suppose you have credence (or, alternatively, evidential probability) of p that the mugger can and will generate any amount of moral value or disvalue they claim they will. Then, as long as they claim they'll bring about an outcome worse than -5,000,000/p if you don't give them $5, or an outcome better than +5,000,000/p if you do, EV theory says you should hand it over. And likewise for any other fanatical theory, if the payoff is just scaled far enough up or down.
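To illustrate that arithmetic with a toy calculation (the -5 and -5,000,000 valuations are from the setup above; the credence p is a made-up stand-in):

```python
# Toy version of the mugging calculation above. Valuations follow the setup
# (-5 for handing over the $5, -5,000,000 for losing all lifetime earnings);
# the credence p is an illustrative stand-in.

p = 1e-12                                # credence the mugger can and will do as they claim
threatened_value = -5_000_000 / p - 1    # any outcome worse than -5,000,000/p

ev_pay = -5                          # hand over the $5: a certain small loss
ev_refuse = p * threatened_value     # refuse: probability p of the threatened outcome

print(ev_pay, ev_refuse)  # -5 vs roughly -5,000,000: EV theory says hand over the $5
```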
Yes, in practice that'll be problematic. But I think we're obligated to take both possible payoffs into account. If we do suspect the large negative payoffs, it seems pretty awful to ignore them in our decision-making. And then there's a weird asymmetry if we pay attention to the negative payoffs but not the positive.
More generally, Fanaticism isn't a claim about epistemology. A good epistemic and moral agent should first do their research, consider all of the possible scenarios in which their actions backfire, and put appropriate probabilities on them. If they do the epistemic side right, it seems fine for them to act according to Fanaticism when it comes to decision-making. But in practice, yeah, that's going to be an enormous 'if'.
If you accept Fanaticism, how do you respond to Pascal's wager and Pascal's mugging, etc.?
Both cases are traditionally described in terms of payoffs and costs just for yourself, and I'm not sure we have quite as strong a justification for being risk-neutral or fanatical in that case. In particular, I find it at least a little plausible that individuals should effectively have bounded utility functions, whereas it's not at all plausible that we're allowed to do that in the moral case - it'd lead to something a lot like the old Egyptology objection.
That said, I'd accept Pascal's wager in the moral case. It comes out of Fanaticism fairly straightforwardly, with some minor provisos. But Pascal's Mugging seems avoidable - for it to arise, we need another agent interacting with you strategically to get what they want. I think it's probably possible for an EV maximiser to avoid the mugging as long as we make their decision-making rule a bit richer in strategic interactions. But that's just speculation - I don't have a concrete proposal for that!
Great to see actual defences of fanaticism! As you say, the arguments get stronger as you proceed, although I personally only found section 6 compelling, and the argument there depends on your precise background uncertainty. Depending on your background uncertainty, you could instead reject a lottery with a positive probability of an infinite payoff (as per Tarsney or the last paragraph in this comment) for a more probable finite payoff. To me, this still seems like a rejection of a kind of fanaticism, defined slightly differently.
I think your Minimal Tradeoffs is not that minimal, since it's uniform in r, i.e. the same r can always be used. A weaker assumption would just be that for any binary lottery with payoff v or 0, there is another binary lottery with lower probability of nonzero payoff that's better. And this is compatible with a bounded vNM utility function. I would guess, assuming a vNM utility function, your Minimal Tradeoffs is equivalent to mine + unbounded above.
As someone who's sympathetic to bounded social welfare functions, Scale Consistency doesn't seem obvious (then again, neither does cardinal welfare to me, and with it, totalism), and it doesn't seem motivated in the paper. Maybe you could use Scale Invariance instead:
where the multiplication by k is not multiplying value directly, but duplicating the world/outcome k times (with perfect correlation between the duplicates). This is (I think) Scale Invariance from this paper, which generalizes Harsanyi's utilitarian theorem using weaker assumptions (in particular, it does not assume the vNM rationality axioms), and their representation theorems could also lead to fanaticism. Scale Invariance is a consequence of "stochastic" separability: if A > B, then A + C > B + C, for any lotteries A, B and C such that A and B act on populations disjoint from C's.
I'm confused about the following bit when you construct the background uncertainty in section 6, although I think your argument still goes through. First,
This seems fine. Next, you write:
They can't say this for all b, but they can for some b, right? Aren't they saying exactly this when they deny Fanaticism ("If you deny Fanaticism, you know that no matter how your background uncertainty is resolved, you will deny that L_risky plus b is better than L_safe plus b.")? Is this meant to follow from L_risky + B ≻ L_safe + B? I think that's what you're trying to argue after, though.
Then,
Aren't we comparing lotteries, not definite outcomes? Your vNM utility function could be arctan(∑_i u_i), where the function inside the arctan is just the total utilitarian sum. Let L_safe = π/2, and L_risky = ∞ with probability 0.5 (which is not small, but this is just to illustrate) and 0 otherwise. Then these have the same expected value without a background payoff (or b = 0), but with b > 0, the safe option has higher EV, while with b < 0, the risky option has higher EV. Of course, then, this utility function doesn't deny Fanaticism under all possible background uncertainty.
Thanks!
Good point about Minimal Tradeoffs. But there is a worry that, if you don't make it a fixed r, then you could have an infinite sequence of decreasing r's which nonetheless don't go arbitrarily low (e.g., 1, 3/4, 5/8, 9/16, 17/32, 33/64, ...).
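(A quick check of that example: the sequence is r_n = 1/2 + 1/2^n, i.e. 1 = 1/2 + 1/2, 3/4 = 1/2 + 1/4, 5/8 = 1/2 + 1/8, and so on, so the r's decrease forever but stay above 1/2 and never go arbitrarily low.)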
I agree that Scale-Consistency isn't as compelling as some of the other key principles in there. And, with totalism, it could be replaced with the principle you suggest, in which multiplication is just duplicating the world k times. Assuming totalism, that'd be a weaker claim, which is good. I guess one minor worry is that, if we reject totalism, duplicating a world k times wouldn't scale its value by k. So Scale-Consistency is maybe the better principle for arguing in greater generality. But yeah, not needed for totalism.
Nope, wasn't meaning for the statement involving little b to follow from the one about big B. b is a certain payoff, while B is a lottery. When we add b to either lottery, we're just adding a constant to all of the payoffs. Then, if lotteries can be evaluated by their cardinal payoffs, we've got to say that L_1 + b > L_2 + b iff L_1 > L_2.
Yep, that utility function is bounded, so using it and EU theory will avoid Fanaticism and bring on this problem. So much the worse for that utility function, I reckon.
And, in a sense, we're not just comparing lotteries here. L_risky + B is two independent lotteries summed together, and we know in advance that you're not going to affect B at all. In fact, it seems like B is the sort of thing you shouldn't have to worry about at all in your decision-making. (After all, it's a bunch of events off in ancient India or in far distant space, outside your lightcone.) In the moral setting we're dealing with, it seems entirely appropriate to cancel B from both sides of the comparison and just look at L_risky and L_safe, or to conditionalise the comparison on whatever B will actually turn out as: some b. That's roughly what's going on there.
Oh, also you wrote "L_a is better than L_b" in the definition of Minimal Tradeoffs, but I think you meant the reverse?
Isn't the problem if the r's approach 1? Specifically, for each lottery, take the infimum of the r's that work (it should be ≤ 1), and then take the supremum of those infima over all lotteries. Your definition requires that this supremum is < 1.
Hmm, I think this kind of stochastic separability assumption implies risk-neutrality (under the assumption of independence of irrelevant alternatives?), since it will force your rankings to be shift-invariant. If you do maximize the expected value of some function of the total utilitarian sum (you're a vNM-rational utilitarian), then I think it should rule out non-linear functions of that sum.
However, what if we maximize the expected value of some function of the difference we make (e.g. compared to a "business as usual" option, subtracting the value of that option)? This way, we have to ignore the independent background B since it gets cancelled, and we can use a bounded vNM utility function on what's left. One argument I've heard against this (from section 4.2 here) is that it's too agent-relative, but the intuition for stochastic separability itself seems kind of agent-relative, too. I suppose there are slightly different ways of framing stochastic separability, "What I can't affect shouldn't change what I should do" vs "What isn't affected shouldn't change what's best", with only the former agent-relative, although also more plausible given agent-relative ethics. If I reject agent-relative ethics, neither seems so obvious.
How about this: fanaticism is fine in principle, but in practice we never face any actual fanatical choices. For any actions with extremely large value V, we estimate p < 1/V, so that the expected value is <1, and we ignore these actions based on standard EV reasoning.
It could just always happen to have been the case, or you could have a strong prior like this, although I don't think you can just declare this to be necessarily true; you should accept that evidence can in principle overcome the prior. It would be motivated reasoning to decide what probabilities to assign to empirical questions just to make sure you don't accept a normative implication you don't like. (Then again, even choosing your prior this way may also be motivated reasoning, unless you can justify it another way.)
Also, I think there have been serious proposals for Pascalian cases, e.g. see this paper.
Furthermore, you'd have to assign p=0 when V=∞, which means perfect certainty in an empirical claim, which seems wrong.
Also related is this GiveWell post, where they model the standard deviation of the value/good accomplished as proportional (in fact equal) to the estimate of value: a normal distribution with mean X and standard deviation X, where X is your value estimate. In this way, larger estimates of X are so suspicious that they actually reduce the expected value.
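A minimal sketch of that kind of adjustment, assuming (as a stand-in of mine) a normal prior centred at 0 with standard deviation prior_sd, and an estimate error that's normal with standard deviation equal to the estimate X:

```python
# Sketch of a GiveWell-style Bayesian adjustment: your estimate X of the value
# is modelled as having normally distributed error with standard deviation X.
# The normal prior (mean 0, sd = prior_sd) is an illustrative stand-in.

def posterior_mean(estimate_x: float, prior_sd: float) -> float:
    """Standard conjugate update: prior N(0, prior_sd^2), likelihood N(X, X^2)."""
    prior_var = prior_sd ** 2
    estimate_var = estimate_x ** 2
    return (estimate_x / estimate_var) / (1 / prior_var + 1 / estimate_var)

for x in [1, 10, 100, 1000]:
    print(x, round(posterior_mean(x, prior_sd=10.0), 3))
# The adjusted value peaks around x = prior_sd and then shrinks: once X is much
# larger than the prior allows, bigger estimates yield a smaller posterior mean.
```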
Yes, I'm saying that it happens to be the case that, in practice, fanatical tradeoffs never come up.
Hm, doesn't claiming V=∞ also require perfect certainty? I.e., to know that V is literally infinite rather than some large number.
In real cases, I think V should have a distribution with support ranging over the whole real line, including both positive and negative infinity. This goes for both the fanatical and the non-fanatical option (compared to each other or to some fixed third option). The difference is that most of the (difference in) expected value of the fanatical option comes from a region of very low probability. This way, I'm not assigning perfect certainty to infinity, just a greater probability for a fanatical option than a non-fanatical option.
(I'm kind of skipping over some subtleties about dealing with infinities. I think there are reasonable approaches, although they aren't perfectly satisfying.)
I guess the problem is that V=∞ is nonsensical. We can talk about V→∞, but not equality.
I think V=∞ is logically possible when you aggregate over space and time, and I think we shouldn't generally assign probability 0 to anything that's logically possible (except where a measure is continuous; I think this requirement had a name, but I forget). Pascal's wager and Dyson's wager illustrate this.
We have reason to believe the universe is infinite in extent, and there's a chance that it's infinite temporally. You might claim that our lightcone is finite/bounded and we can't affect anything outside of it (setting aside multiverses), but this is an empirical claim, so we should give it some chance of being false. That we could affect an infinite region of spacetime is also not a logical impossibility, so we shouldn't absolutely rule it out.
Yep, we've got pretty good evidence that our spacetime will have infinite 4D volume and, if you arranged happy lives uniformly across that volume, we'd have to say that the outcome is better than any outcome with merely finite total value. Nothing logically impossible there (even if it were practically impossible).
That said, assigning value "∞" to such an outcome is pretty crude and unhelpful. And what it means will depend entirely on how we've defined ∞ in our number system. So, what I think we should do in such a case is not say that V equals such and such, but instead ditch the value function once we've left the domain where it works: just deal with your set of possible outcomes, your lotteries (probability measures over that set), and a betterness relation which might sometimes follow a value function but might also extend to outcomes beyond that function's domain. That's what people tend to do in the infinite aggregation literature (including the social choice papers that consider infinite time horizons), and for good reason.
You're probably (pun not intended) thinking of Cromwell's rule.
Yes, thanks!
That'd be fine for the paper, but I do think we face at least some decisions in which EV theory gets fanatical. The example in the paper - Dyson's Wager - is intended as a mostly realistic such example. Another one would be a Pascal's Mugging case in which the threat was a moral one. I know I put P>0 on that sort of thing being possible, so I'd face cases like that if anyone really wanted to exploit me. (That said, I think we can probably overcome Pascal's Muggings using other principles.)
1. I don't know much about probability and statistics, so forgive me if this sounds completely naive (I'd be interested in reading more on this problem, if it's as simple for you as saying "go read X").
Having said that, though, I may have an objection to fanaticism, or something in the neighborhood of it:
You could throw a lot of resources at the low-certainty bets, and if the certainty is low enough, you could get to the end of time and say "we got nothing for all that". If the individual bets are low-certainty enough, then even if you had a lot of them in your suite, you would still have a very high probability of getting nothing for your troubles (the state of coming up empty-handed).
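(For a rough illustration with made-up numbers: if each bet succeeds with probability 10^-9 and you make 1,000 independent such bets, the chance that every one of them fails is (1 - 10^-9)^1000 ≈ 0.999999 - you'd still almost certainly come up empty-handed.)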
That investment could have come at the cost of pursuing the short-term, high-certainty suite.
So you might feel regret at the end of time for not having pursued the safer bets, and with that in mind, it might be intuitively rational to pursue safe bets, even with less expected value. You could say "I should pursue high EV things just because they're high EV", and this "avoid coming up empty-handed" consideration might be a defeater for that.
You can defeat that defeater with "no, actually the likelihood of all these high-EV bets failing is low enough that the high-EV suite is worth pursuing."
2. It might be equally rational to pursue safety as it is to pursue high EV; it's just that the safety person and the high-EV person have different values.
3. I think in the real world, people do something like have a mixed portfolio, like Taleb's advice of "expose yourself to high-risk, high-reward investments/experiences/etc., and also low-risk, low-reward." And how they do that shows, practically speaking, how much they value super-great futures versus not coming up empty-handed. Do you think your paper, if it got its full audience, would do something like "get some people to shift their resources a little more toward high-risk, high-reward investments"? Or do you think it would have a more radical effect? (A big shift toward high-risk, high-reward? A real bullet-biting, where people do the bare minimum to survive and invest all other resources into pursuing super-high-reward futures?)