When taking expected values, the results can differ radically based on which common units we fix across possibilities. If we normalize relative to the value of human welfare, then other animals will tend to be prioritized more than by normalizing by the value of animal welfare or by using other approaches to moral uncertainty.

  1. For welfare comparisons and prioritization between different moral patients like humans, other animals, aliens and artificial systems, I argue that we should fix and normalize relative to the moral value of human welfare, because our understanding of the value of welfare is based on our own experiences of welfare, which we directly value. Uncertainty about animal moral weights is about the nature of our experiences and to what extent other animals have capacities similar to those that ground our value, and so empirical uncertainty, not moral uncertainty (more).
  2. I revise the account in light of the possibility of multiple different human reference points between which we don’t have fixed uncertainty-free comparisons of value, like pleasure vs belief-like preferences (cognitive desires) vs non-welfare moral reasons, or specific instances of these. If and because whatever moral reasons we apply to humans, (similar or other) moral reasons aren’t too unlikely to apply with a modest fraction of the same force to other animals, then the results would still be relatively animal-friendly (more).
    1. I outline why this condition plausibly holds across moral reasons and theories, so that it’s plausible we should be fairly animal-friendly (more).
  3. I describe and respond to some potential objections:
    1. There could be inaccessible or unaccessed conscious subsystems in our brains that our direct experiences and intuitions do not (adequately) reflect, and these should be treated like additional moral patients (more).
    2. The approach could lead to unresolvable disagreements between moral agents, but this doesn't seem any more objectionable than any other disagreement about what matters (more).
    3. Epistemic modesty about morality may push for also separately normalizing by the values of nonhumans or against these comparisons altogether, but this doesn't seem to particularly support the prioritization of humans (more).
  4. I consider whether similar arguments apply in cases of realism vs illusionism about phenomenal consciousness, moral realism vs moral antirealism, and person-affecting views vs total utilitarianism, and find them less compelling for these cases, because value may be grounded on fundamentally different things (more).


How this work has changed my mind: I was originally very skeptical of intertheoretic comparisons of value/reasons in general, including across theories of consciousness and the scaling of welfare and moral weights between animals, because of the two envelopes problem (Tomasik, 2013-2018) and the apparent arbitrariness involved. This lasted until around December 2023, and some arguments here were originally going to be part of a piece strongly against such comparisons for cross-species moral weights, which I now respond to here along with positive arguments for comparisons.



I credit Derek Shiller and Adam Shriver for the idea of treating the problem like epistemic uncertainty relative to what we experience directly. I’d also like to thank Brian Tomasik, Derek Shiller and Bob Fischer for feedback. All errors are my own.



On the allocation between the animal-inclusive and human-centric near-termist views, specifically, Karnofsky (2018) raised a problem:

The “animal-inclusive” vs. “human-centric” divide could be interpreted as being about a form of “normative uncertainty”: uncertainty between two different views of morality. It’s not entirely clear how to create a single “common metric” for adjudicating between two views. Consider:

  • Comparison method A: say that “a human life improved” is the main metric valued by the human-centric worldview, and that “a chicken life improved” is worth >1% of these (animal-inclusive view) or 0 of these (human-centric view). In this case, a >10% probability on the animal-inclusive view would lead chickens to be valued >0.1% as much as humans, which would likely imply a great deal of resources devoted to animal welfare relative to near-term human-focused causes.
  • Comparison method B: say that “a chicken life improved” is the main metric valued by the animal-inclusive worldview, and that “a human life improved” is worth <100 of these (animal-inclusive view) or an astronomical number of these (human-centric view). In this case, a >10% probability on the human-[centric] view would be effectively similar to a 100% probability on the human-centric view.

These methods have essentially opposite practical implications. Method A is the more intuitive one for me (it implies that the animal-inclusive view sees “more total value at stake in the world as a whole,” and this implication seems correct), but the lack of a clear principle for choosing between the two should give one pause, and there’s no obviously appropriate way to handle this sort of uncertainty. One could argue that the two views are “philosophically incommensurable” in the sense of dealing with fundamentally different units of value, with no way to identify an equivalence-based conversion factor between the two.[1]


For example, if one thinks there’s a 50% chance that one should be weighing the interests of chickens 1% as much as those of humans, and a 50% chance that one should not weigh them at all, one might treat this situation as though chickens have an “expected moral weight” of 0.5% (50% * 1% + 50% * 0) relative to humans. This would imply that (all else equal) a grant that helps 300,000 chickens is better than a grant that helps 1,000 humans, while a grant that helps 100,000 chickens is worse.
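The arithmetic in this example can be checked with a few lines of Python. This is only an illustrative sketch: the function name `chicken_grant_beats_human_grant` is hypothetical, and it assumes, as the example does, that value is linear in the number of individuals helped and that all else is equal.

```python
from fractions import Fraction

# (credence, moral weight of a chicken relative to a human) for each view,
# using the numbers from the example above.
views = [
    (Fraction(1, 2), Fraction(1, 100)),  # 50%: chickens weigh 1% as much
    (Fraction(1, 2), Fraction(0)),       # 50%: chickens do not count at all
]

# Expected moral weight of a chicken, in human units.
expected_weight = sum(p * w for p, w in views)

def chicken_grant_beats_human_grant(chickens_helped, humans_helped):
    """Compare two grants in human-equivalent units (hypothetical helper)."""
    return chickens_helped * expected_weight > humans_helped

print(expected_weight)                                  # 1/200, i.e. 0.5%
print(chicken_grant_beats_human_grant(300_000, 1_000))  # True: 1500 > 1000
print(chicken_grant_beats_human_grant(100_000, 1_000))  # False: 500 < 1000
```

The break-even point under these numbers is 200 chickens per human helped.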



We can define random variables to capture these statements more precisely via a formalization with expected values. Let $H$ denote the (average or marginal) moral value per human life improved by some intervention, and let $C$ denote the (average or marginal) moral value per chicken life improved by another intervention. Then,

  1. Method A could follow from assuming $H$ is constant and calculating the expected value per chicken life improved as $E[C] = E[C/H] \cdot H$. Indeed, if $H$ is assumed constant, then $E[C/H] = E[C]/H$. Furthermore, under linear views like utilitarianism, if $H$ is constant, we can normalize the value of all interventions by it, across species, and so we can just value the chicken welfare improvements proportionally to $E[C/H]$.
  2. Method B could follow from assuming $C$ is constant and calculating the expected value per human life improved as $E[H] = E[H/C] \cdot C$, or, after normalizing by $C$, $E[H/C]$.

Based on Karnofsky’s example, we could take $C/H$ to be 1% with probability 50% and (approximately) 0 otherwise, and $H/C$ to be 100 (100=1/(1%)) with probability 50% and astronomical (possibly infinite) otherwise. If $C/H$ is never 0, then $C/H$ and $H/C$ are multiplicative inverses of one another this way, i.e. $H/C = (C/H)^{-1}$. However, $E[C/H] = 0.5\%$, while $E[H/C]$ is astronomical or infinite, and $E[H/C] \neq 1/E[C/H] = 200$. In general, $E[H/C] > 1/E[C/H]$ as long as $C/H$ is defined, non-negative and not constant.[2] The fact that these two expected values of ratios aren’t inverses of one another is why the two methods give different results for prioritization.
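A small numerical sketch makes the asymmetry between the two methods concrete. It uses Karnofsky’s numbers for the chicken-to-human value ratio and, as an assumption made purely for illustration, replaces "astronomical" with the finite stand-in `ASTRONOMICAL = 10**12`; the variable names are hypothetical.

```python
from fractions import Fraction

ASTRONOMICAL = Fraction(10**12)  # hypothetical stand-in for "astronomical"

# Two equally likely scenarios: (probability, chicken/human ratio, human/chicken ratio).
scenarios = [
    (Fraction(1, 2), Fraction(1, 100), Fraction(100)),   # animal-inclusive view
    (Fraction(1, 2), 1 / ASTRONOMICAL, ASTRONOMICAL),    # human-centric view
]

# Method A: expected chicken value with the human value held fixed.
e_c_over_h = sum(p * c_h for p, c_h, _ in scenarios)
# Method B: expected human value with the chicken value held fixed.
e_h_over_c = sum(p * h_c for p, _, h_c in scenarios)

print(float(e_c_over_h))      # about 0.005
print(float(1 / e_c_over_h))  # about 200
print(float(e_h_over_c))      # about 5e11, nowhere near 200
```

In every scenario the two ratios are exact inverses of each other, yet their expected values are not, which is the source of the disagreement between the methods.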

Rather than specific welfare improvements in particular, $C$ and $H$ could denote welfare ranges, i.e. the difference between the maximum welfare at a time and the minimum welfare at a time of the average chicken or average human, respectively. Or, they may be the “moral weight” of the average chicken or the average human, respectively, as multipliers by which to weigh measures of welfare. We may let $H$ denote the moral value per unit of a human welfare improvement according to a measure of human welfare, like DALYs, QALYs, or measures of life satisfaction, and let $C$ denote the moral value per unit of chicken welfare improvement according to a measure of chicken welfare.[3] See Fischer, 2022 and Rethink Priorities’ Moral Weight Project Sequence for further discussion of welfare ranges, capacities for welfare and moral weights.


This problem has been called the two envelopes problem, in analogy with the original two envelopes problem (Tomasik, 2013-2018, Tomasik et al., 2009-2014). I use Karnofsky (2018)’s framing because of its more explicit connection to effective altruist cause prioritization.


I make a case here that we should fix and normalize by the (or a) human moral weight, using something like comparison method A, with some caveats and adjustments.


Welfare in human-relative terms

The strengths of our reasons to reduce human suffering or satisfy human belief-like preferences, say, don’t typically seem to depend on our understanding of their empirical or descriptive nature. This is not how we actually do ethics. If we found out more about the nature of consciousness and suffering, which we define in human terms, we typically wouldn’t decide it mattered less (or more) than we thought before.[4] Finding out that pleasure is mediated not by dopamine or serotonin but by a separate system, or that humans only have around 86 billion neurons instead of 100 billion doesn’t change how important our own experiences directly seem to us. Nor does changing our confidence between the various theories of consciousness.

Instead, we directly value our experiences, not our knowledge of what exactly generates them. Water didn’t become more or less important to human life from finding out it was H2O.[5] The ultimate causes of why we care about something may depend on its precise empirical or descriptive nature, but the proximal reasons — for example, how suffering feels to us and how bad it feels to us, say — do not change with our understanding of its nature. One might say we know (some of) these reasons by direct experience.[6] My own suffering just directly seems bad to me,[7] and how bad it directly seems does not depend on my beliefs about theories of consciousness or about how many neurons we have.

And, in fact, on utilitarian views using subjective theories of welfare like hedonism, desire theories and preference views, how bad my suffering actually (directly) is for me on those theories plausibly should just be how bad my suffering (directly) seems to me.[8] In that case, uncertainty about the nature of these “seemings” or appearances and how they arise and their extent in other animals is just descriptive uncertainty, like uncertainty about the nature and prevalence of any other physical or biological phenomenon, like gravity or cancer.[9] This is not a problem of comparisons of reasons across moral theories or moral uncertainty. It’s a problem of comparisons of reasons across theories of the empirical or descriptive nature of the things to which we assign moral value. There is, however, still moral uncertainty in deciding between hedonism, desire theories, preference views and objective list theories, and between variants of each, among other things.

Despite later warning about two-envelopes effects in Muehlhauser, 2018, one of Muehlhauser (2017)’s illustrations of how he understands moral patienthood is based on his own direct experience of pain:

What are the implications of illusionism for my intuitions about moral patienthood? In one sense, there might not be any. After all, my intuitions about (e.g.) the badness of conscious pain and the goodness of conscious pleasure were never dependent on the “reality” of specific features of consciousness that the illusionist thinks are illusory. Rather, my moral intuitions work more like the example I gave earlier: I sprain my ankle while playing soccer, don’t notice it for 5 seconds, and then feel a “rush of pain” suddenly “flood” my conscious experience, and I think “Gosh, well, whatever this is, I sure hope nothing like it happens to fish!” And then I reflect on what was happening prior to my conscious experience of the pain, and I think “But if that is all that happens when a fish is physically injured, then I’m not sure I care.” And so on.

It’s the still poorly understood “whatever this is”, i.e. his direct experience, and things “like it” that are of fundamental moral importance and for which he’s looking in other animals. Accounts of conscious pain under specific theories are just designed to track “whatever this is” and things “like it”, but almost all theories will be wrong. The example also seems best interpreted as an illustration of comparison method A, weighing fish pain relative to his experience of pain from spraining an ankle.

The relevant moral reasons are or derive directly from these direct experiences or appearances, and the question is just when, where (what animals and other physical systems) and to what extent these same (kinds of) appearances and resulting reasons apply. Whatever this is that we’re doing, to what extent do others do it or something like it, too? All of our views and theories of the value of welfare should already be or should be made human-relative, because the direct moral reasons we have to apply all come from our own individual experiences and modest extensions, e.g. assuming our experiences are similar to other humans’. As we find out more about other animals and the nature of human welfare, our judgements about where other animals stand in relation to our concept and direct impressions of human welfare — the defining cases — can change.

So I claim that we have direct access to the grounds for the disvalue of human suffering and human moral value, i.e. the variable $H$ in the previous section, and we understand the suffering and moral value of other beings, including the (dis)value in chickens as $C$ above, relative to humans. Because of this, we can fix $H$ and use comparison method A, at least across some theories, including at least separately across theories of the nature of unpleasantness, across theories of the nature of felt desires, and across theories of the nature of belief-like preferences.

On the other hand, it doesn’t make much sense for us to fix the moral value of chicken suffering or the chicken moral weight, because we (or you, the reader) only understand it in human-relative terms, and especially in reference to our (respectively, your) own experiences.[10]

And it could end up being the case — i.e. with nonzero probability — that chickens don’t matter at all, not even infinitesimally. They may totally lack the grounds to which we assign moral value, e.g. they may not be capable of suffering at all, even though I take it to be quite likely that they can suffer, or moral status could depend on more than suffering. Then, we aren’t even fixing the moral weight of a chicken at all, if it can be 0 with nonzero probability and nonzero with nonzero probability. And because of the possible division by 0 moral weight, the expected moral weights of humans and all other animals will be infinite or undefined.[11] It seems such a view wouldn’t be useful for guiding action.[12]

Similarly, we wouldn’t normalize by the moral weights of any other animals, artificial systems, plants or rocks.

We have the most direct access to (some) human moral reasons, can most reliably understand (some of) them and so typically theorize morally relative to (some of) them. How we handle uncertainty should reflect these facts.


Finding common ground

How intense or important suffering is could be quantified differently across theories, both empirical theories and moral theories. In some cases, there will be foundational metaphysical claims inherent to those theories that could ground comparisons between the theories. In many or even most important cases, there won’t be.

What common metaphysical facts could ground intertheoretic comparisons of value or reasons across theories of consciousness as different as Integrated Information Theory, Global Workspace Theory and Attention Schema Theory? Under their standard intended interpretations, they have radically different and mutually exclusive metaphysical foundations — or basic building blocks —, and each of these foundations is false, except possibly one. Similarly, there are very different and mutually exclusive proposals to quantify the empirical intensity of welfare and moral weights, like counting just-noticeable differences, functions of the number of relevant (firing) neurons or cognitive sophistication, direct subjective intrapersonal weighing, among others (e.g. Fischer, 2023 with model descriptions in the tabs of this sheet). How do the numbers of relevant neurons relate to the number of just-noticeable differences across all possible minds, not just humans? There’s nothing clearly inherent to these accounts that would ground intertheoretic comparisons between them, at least given our current understanding. But we can look outside the contents of the theories themselves to the common facts they’re designed to explain.

When it seemed that ice could turn out to be something other than the solid phase of water, we would have compared the options based on the common facts (the evidence or data) that the different possibilities were supposed to explain. And then by finding out that ice is water, you learn that there is much more water in the world, because you would then also have to count all the ice on top of all the liquid water.[13] If your moral theory took water to be intrinsically good and more of it to be better, this would be good news (all else equal).

For moral weights across potential moral patients, the common facts our theories are designed to explain are those in human experiences, our direct impressions and intuitions, like how bad suffering feels or appears to be to us. It’s these common facts that can be used to ground intertheoretic comparisons of value or reasons, and it’s these common facts or similar ones for which we want to check in other beings or systems. So, we can hold the strengths of reasons from these common facts constant across theories, if and because they ground value directly on these common facts in the same way, e.g. the same hedonistic utilitarianism under different theories of (conscious) pleasure and unpleasantness, or the same preference utilitarianism under different theories of belief-like preferences. And in recognizing animal consciousness, like finding out that ice is water, you could come to see the same kind of empirical facts and therefore moral value in some other animals, finding more of it in the world.


Multiple possible reference points

However, things aren’t so simple as fixing the human moral weight across theories. We should be unsure about that, too. Perhaps a given instance of unpleasantness matters twice as much as another given belief-like preference, or perhaps it matters half as much, with 50% probability each. We get the two envelopes problem here, too. If we were to fix the value of the unpleasantness, then the belief-like preference would have an expected value of 50%*0.5 + 50%*2 = 1.25 times the value of the unpleasantness. If we were to fix the value of the belief-like preference, then the unpleasantness would have an expected value of 50%*0.5 + 50%*2 = 1.25 times the value of the belief-like preference. Each looks more valuable in expectation than the other, depending on which one we fix.
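The symmetry can be verified directly. The sketch below is only illustrative and uses hypothetical variable names: it takes the two equally likely value ratios (unpleasantness to preference) and computes the expected value of each item with the other fixed at 1.

```python
from fractions import Fraction

half = Fraction(1, 2)
# Equally likely ratios: (value of the unpleasantness) / (value of the preference).
ratios = [Fraction(2), Fraction(1, 2)]

# Fix the unpleasantness at 1, so the preference is worth 1/ratio.
preference_given_unpleasantness_fixed = sum(half / r for r in ratios)
# Fix the preference at 1, so the unpleasantness is worth ratio.
unpleasantness_given_preference_fixed = sum(half * r for r in ratios)

print(preference_given_unpleasantness_fixed)  # 5/4
print(unpleasantness_given_preference_fixed)  # 5/4
```

Both come out at 1.25, so neither choice of unit is privileged by the arithmetic alone.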

We’re uncertain about which theory of wellbeing is correct and how to weigh human unpleasantness vs human pleasure vs human felt desires vs human belief-like preferences vs human choices vs objective goods and objective bads (and between each). The relative strengths of these different corresponding reasons are not in general fixed across theories.  Therefore, the strengths of our reasons can only be fixed for at most one of these at a time (if their relationships aren’t fixed). And the positive arguments for fixing any specific one and not the others seem likely to be weak, so it really is plausible that none should be fixed.

Similarly, we can also be uncertain about tradeoffs, strengths and intensities within the same type of welfare for a human, too, e.g. just degrees of unpleasantness, resulting in another two envelopes problem. For example, I’m uncertain about the relative intensities and moral disvalues of pains I’ve experienced.[14] In general, people may use multiple reference points with which they’re familiar, like multiple specific experiences or intensities, and be uncertain about how they relate to one another.

There could also be non-welfarist moral reasons to consider, like duties, rights, virtues, justifiability and reasonable complaints (under contractualism), special relationships, and specific instances of any of these. We can be uncertain about how they relate to each other and the various types of welfare, too.

So, what do we do? We could separately fix and normalize by each possible (typically human-based) reference point, i.e. a specific moral reason, and use intertheoretic comparisons relative to it, e.g. the expected value of belief-like preferences (cognitive desires) in us and other animals relative to the value of some particular (human) pleasure. I’ll elaborate here.

We pick a very specific reference point or moral reason $R$ and fix its moral weight $W_R$ as a common unit relative to which we measure everything else. $W_R$ takes the role of $H$ in the human-relative method A in the background section. We measure the moral weights of humans (or specific human welfare concerns) like $H_R = H/W_R$ and that of chickens like $C_R = C/W_R$, and we do the same for everything else. And we also do all of this separately for every possible (typically human-based) reference point $R$.


For uncertainty between choices of reference points, e.g. between a human pleasure and a human belief-like preference, we would apply a different approach to moral uncertainty that does not depend on intertheoretic comparisons of value or reasons, e.g. a moral parliament.[15] Or, when we can fix (or bound or get a distribution on) the ratios between all pairs of reference points in a subset of them, we could take a weighted sum across the reference points (or subsets of them), like in maximizing expected choiceworthiness, and calculate the expected moral weights of chickens and humans (on that subset of reference points) as $\sum_R w_R \, E[C_R]$ and $\sum_R w_R \, E[H_R]$, respectively, for weights $w_R$ on the reference points $R$.[16]

In either case, it’s essentially human-relative, if and because $R$ is almost always a human reference point.


There are some things we could say about $E[C_R]$ vs $E[H_R]$ with some constraints on the relationship between the distributions of $C_R$ and $H_R$. Using the same numbers as Karnofsky (2018)’s and assuming

  1. $C_R$ and $H_R$ are both nonnegative,
  2. $C_R$ is positive with probability at least 50%, i.e. $P[C_R > 0] \geq 0.5$, and
  3. Whatever value $H_R$ reaches, $C_R$ reaches a value at least 1/100th as high with at least 50% of the probability, i.e. $P[C_R \geq x/100] \geq 0.5 \cdot P[H_R \geq x]$ for all $x > 0$ (to replace $C_R/H_R \geq 1/100$ with probability 50%),

we get $E[C_R] \geq 0.5 \times \frac{1}{100} \times E[H_R] = 0.005 \, E[H_R]$,[17] like in Karnofsky (2018)’s illustration of the human-relative method A. In general, we multiply the probability ratio (50% here) by the value ratio (1/100 here) to get the ratio of expected moral weights (0.005 here). We can also upper bound $E[C_R]$ with a multiple of $E[H_R]$ by reversing the inequalities between the probabilities in 2 and 3.[18]
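One way to see where the 0.005 bound comes from is sketched below. The notation is an assumption made for illustration: $C_R$ and $H_R$ stand for the chicken and human moral weights measured relative to the fixed reference point, both nonnegative, and the third condition is read as $P[C_R \geq x/100] \geq 0.5 \cdot P[H_R \geq x]$ for all $x > 0$. The derivation uses the tail-integral formula for the expectation of a nonnegative random variable and the substitution $t = x/100$.

```latex
\begin{align*}
E[C_R] &= \int_0^\infty P[C_R \geq t]\,dt
        = \frac{1}{100}\int_0^\infty P\!\left[C_R \geq \tfrac{x}{100}\right]dx \\
       &\geq \frac{1}{100}\cdot 0.5 \int_0^\infty P[H_R \geq x]\,dx
        = 0.005\,E[H_R].
\end{align*}
```

Reversing the inequality in the assumption reverses the inequality in the conclusion, which gives the corresponding upper bound.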


What can we say about the ratio of expected moral weights?

Would we end up with a ratio of expected moral weights between chickens and humans that’s relatively friendly to chickens? This will depend on the details and our credences.

Consider a lower bound for the chicken’s expected moral weight relative to a human’s. Say we fix some human reference point and corresponding moral reason.

As in the inequality from the previous section, we might think that whatever reason applies to a given human and with whatever strength, a chicken has at least 50% of the probability of having the same or a similar reason apply, but with strength only at least 1/100th of the human’s (relative to the reference point). That would give a ratio of 0.005. Or something similar with different numbers.

We might expect something like this because the central moral reasons from major moral theories seem to apply importantly to farmed chickens with probability not far lower than they do to humans.[19] Let’s consider several:

  1. However intensely a human can suffer, can a chicken suffer at least 1/100th as intensely? I would go further and say a chicken has a decent chance of having the capacity to suffer similarly intensely, e.g. at least half as intensely.
    1. Consider some intensity of suffering, and take a physical pain in a human, say a bone break or burn, that would induce it in a typical human under typical circumstances. Then, it seems not very unlikely — e.g. at least a 5% probability — that a similar pain in a typical chicken (e.g. a similar bone break or burn with a similar whole-body proportion of affected pain signaling nerves) would result in a similar intensity of suffering. However intense the suffering in a human being burned or boiled alive, it doesn’t seem too unlikely a chicken could suffer similarly intensely under the same conditions.
    2. If intensity scales as a function of relative motivational salience or attention (i.e. relative to their maximum possible or the hypothetical full pull of their attention), or the proportion of suffering-contributing neurons firing (per second) or the proportion of just-noticeable differences away from indifference, then this doesn’t seem to favour humans in particular at all. In general, suppose that the intensity or disvalue of human suffering scales as some function $f$ relative to some underlying variable $x$, e.g. a measure of motivational salience, attention, neurons firing per second, or just-noticeable differences, as $f(x)$. $f$ could scale very aggressively, even exponentially. Still, the function can be reinterpreted as a function scaling in the proportion $p$ of the maximum of $x$ for humans, $x_{\max}$, as $f(p \cdot x_{\max})$. Then, we might assign a modest probability to the disvalue in chicken suffering also scaling roughly like $f(p \cdot x_{\max})$, with $p$ for the chicken being relative to the chicken’s own maximum value of $x$.
  2. Do chickens have important belief-like preferences? I will discuss this further in another piece, but defend it briefly here. If conscious hedonic states or conscious felt desires count as or ground belief-like preferences, as I discussed in a previous piece, then chickens probably do have belief-like preferences. Rats and pigs also seem to be able to discriminate anxiety from its absence generalizably across causes with a learned behaviour, like pressing a lever when they would apparently feel anxious.[20] Perhaps the same would hold for chickens (it hasn't been studied in birds, as far as I know). Perhaps they can generalize this further to unpleasantness or aversion, which would constitute their concepts of bad and worse. The strengths of animals’ belief-like preferences would be a separate issue, but interpersonal comparisons of belief-like preferences in general may be impossible (or extremely vague), even between humans, so it wouldn’t even be clear that any given human has more at stake for belief-like preferences than the typical farmed chicken, or vice versa, of course. There could be no fact of the matter.
  3. Do chickens have important rights or do we have important duties to them? Yes, according to Regan (1983, 1989) and Korsgaard (2018, 2020), the former modifying the Kantian position and the latter extending Kantian arguments to other animals and further claiming general interpersonal incomparability in Korsgaard, 2020. Furthermore, even if we did recognize duties only to rational beings who can recognize normative concepts like Kant originally did, as above, their conscious hedonic states and felt desires or generalizable discrimination could qualify as or ground their normative concepts. Other animals are also plausibly rational to some extent, too, even if minimal.
  4. Should our actions be justifiable to chickens, real or hypothetical trustees for them (Scanlon, 1998, p.183), or idealized rational versions of them? If yes, then chickens could be covered by contractualism, and what’s at stake for them seems reasonably large, given points 1 and 2 and their severe suffering on factory farms. See also the last two sections, on contractualist protections for animals and future people, in Ashford and Mulgan, 2018.
  5. Could the capacity to mount reasonable complaints be enough to be covered under contractualism? Can chickens actually mount reasonable complaints? If yes to both, then chickens could be covered by contractualism. Chickens can and do complain about their situations and mistreatment in their own ways (vocalizations, e.g. gakel-calls, feelings of unpleasantness and aversion, attempts to avoid, etc.), and what makes a complaint reasonable could just be whether the reasons for the complaint are strong enough relative to other reasons (e.g. under Parfit’s Complaint Model or Scanlon’s modified version, described in Scanlon, 1998, p.229), which does not require (much) rationality on the part of the complainant. Severe suffering, like what factory farmed chickens endure, seems like a relatively strong reason for complaint.
  6. How do virtues guide our treatment of chickens? The virtues of compassion, beneficence and justice seem applicable here, given their circumstances.
  7. Do we have any special obligations to chickens? We — or chicken farmers, at least, and perhaps as consumers indirectly — are responsible for their existences and lives, like we are for those of our companion animals and children.
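The proportional-scaling possibility raised under point 1 of the list above can be illustrated with a toy model. Everything in this sketch is hypothetical: the exponential form of `f`, the rate, and the two maxima are made-up values chosen only to contrast reading the underlying variable as a proportion of an individual’s own maximum with reading it in absolute units.

```python
import math

def f(x, rate=0.1):
    """Hypothetical aggressive (exponential) scaling of disvalue in x."""
    return math.expm1(rate * x)

HUMAN_X_MAX = 100.0    # made-up maximum of the underlying variable for humans
CHICKEN_X_MAX = 10.0   # made-up, much smaller maximum for chickens

def disvalue_proportional(p):
    """f(p * x_max): disvalue depends only on the proportion p of one's own maximum."""
    return f(p * HUMAN_X_MAX)

def disvalue_absolute(p, own_x_max):
    """Disvalue depends on the absolute level x = p * own maximum."""
    return f(p * own_x_max)

p = 0.8  # both individuals at 80% of their own maximum
print(disvalue_proportional(p))             # same for human and chicken
print(disvalue_absolute(p, HUMAN_X_MAX))    # human under the absolute reading
print(disvalue_absolute(p, CHICKEN_X_MAX))  # chicken under the absolute reading
```

Under the proportional reading, even exponential scaling gives no advantage to the species with the larger maximum. Only the absolute reading does.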

See also the articles by Animal Ethics on the status of nonhuman animals under various ethical theories and the weight of animal interests.

However, many of the comparisons here probably do in fact depend on comparisons across moral theories, e.g. Kant’s original animal-unfriendly position vs Regan and (perhaps) Korsgaard’s animal-friendly positions. The requirement of (sufficient) rationality for Kant’s reasons to apply could be an inherently moral claim, not a merely empirical one. If Regan and Korsgaard don’t require rationality for moral status, are they extending the same moral reasons Kant recognizes to other animals, or grounding different moral reasons? They might be the same intrinsically, if we see the restriction to rational beings as not changing the nature of the moral reasons. Perhaps the moral reasons come first, and Kant mistakenly inferred that they apply only to rational beings. Or, if they are different, are they similar enough that we can identify them anyway? On the other hand, could the kinds of reasons Regan and Korsgaard recognize as applying to other animals be far far weaker than Kant’s that apply to humans or incomparable to them? Could Kant’s apply to other animals directly with modest probability anyway?

Similar issues could arise between contractualist theories that protect nonrational (or not very rational) beings and those that only protect (relatively) rational beings. I leave these as open problems.



In this section, I describe and respond to some potential objections to the approach and rationale for intertheoretic comparisons of moral weights I’ve described.


Conscious subsystems

First, it should be human welfare standardly and simultaneously accessed for report that we fix. There could be multiple conscious (or otherwise intrinsically morally considerable) subsystems in a brain to worry about — whether inaccessible in general or not accessed at any particular time — effectively multiple moral patients with their own moral interests in each brain. Our basic moral intuitions about the value of human welfare and the common facts we’re trying to explain probably do not reflect any inaccessible conscious subsystems in our brains, and in general would plausibly only reflect conscious subsystems when they are actually accessed. So, we should normalize relative to what we actually access. It could then be that the number of such conscious subsystems scales in practice with the number of neurons in a brain, so that the average human would have many more of them in expectation, and so could have much greater expected moral weight than other animals with fewer neurons (Fischer, Shriver & St. Jules, 2023 (EA Forum post)).

In the most extreme case, we end up separately counting overlapping systems that differ only by a single neuron (Mathers, 2021) or even a single electron (Crummett, 2022), and the number of conscious subsystems may grow polynomially or even exponentially with the number of neurons or the number of particles, by considering all connected subsets of neurons and neural connections or “connected” subsets of particles.[21] Even a small probability on an aggressive scaling hypothesis could lead to large predictable expected differences in total moral weights between humans, and could give greater expected moral weight to the average whale with more neurons than the average human (List of animals by number of neurons - Wikipedia). With a small but sufficiently large probability on sufficiently fast scaling with the number of neurons or particles, a single whale could have more expected moral weight than all living humans combined. That seems absurd.
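
To make the arithmetic behind this worry concrete, here is a minimal Python sketch of an expected subsystem count under a mixture of scaling hypotheses. The three hypotheses, the credences, the whale neuron count and the population figure are all hypothetical numbers chosen for illustration, not estimates from this post. The human neuron count is the commonly cited figure of roughly 86 billion. The calculation is done in log space because the exponential hypothesis would overflow ordinary floating point numbers.

```python
import math

# Hypothetical credences over how the number of conscious subsystems scales
# with the neuron count n. Each entry is (credence, log10 of subsystem count).
HYPOTHESES = [
    (0.989, lambda n: 0.0),                # one subsystem per brain
    (0.010, lambda n: math.log10(n)),      # proportional to n
    (0.001, lambda n: n * math.log10(2)),  # exponential, 2**n
]

def log10_expected_subsystems(n):
    """log10 of the expected subsystem count, computed in log space."""
    terms = [math.log10(p) + log_count(n) for p, log_count in HYPOTHESES]
    largest = max(terms)
    # Terms far below the largest one underflow to 0.0, which is harmless here.
    return largest + math.log10(sum(10.0 ** (t - largest) for t in terms))

HUMAN_NEURONS = 8.6e10   # commonly cited estimate
WHALE_NEURONS = 1.5e11   # hypothetical, for illustration only
HUMAN_POPULATION = 8e9   # rough figure

one_whale = log10_expected_subsystems(WHALE_NEURONS)
all_humans = log10_expected_subsystems(HUMAN_NEURONS) + math.log10(HUMAN_POPULATION)

# Compare one whale against every living human combined.
print(one_whale > all_humans)
```

Under these assumed numbers, the 0.1% credence in exponential scaling dominates both expectations, and the single whale comes out ahead of all humans combined. Dropping that one hypothesis leaves the comparison roughly proportional to neuron counts.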

In this case, how we decide to individuate and count conscious systems seems to be a matter of moral uncertainty. Empirically, I am pretty confident both that the system that is my whole brain is conscious and that the system that is my whole brain excluding any single neuron or electron is conscious. I just don’t think I should count these systems separately to add up. And then, even if I should assign some non-negligible probability that I should count such systems separately and that the same moral reasons apply across views on counting conscious systems (this would be a genuine identification of moral reasons across different moral theories, not just identifying the same moral reasons across different empirical views), it seems far too fanatical to prioritize humans (or whales) because of the tiny probability I assign to the number of conscious subsystems of a brain scaling aggressively with the number of neurons or electrons. I outline some other ways to individuate and count subsystems in this comment, and I would expect these to give a number of conscious subsystems scaling at most roughly proportionally in expectation with the number of neurons.

There could be ways to end up with conscious subsystems scaling with the number of neurons that are more empirically based, rather than dependent on moral hypotheses. However, this seems unlikely, because the apparently valuable functions realized in brains seem to occur late in processing, after substantial integration and high-level interpretation of stimuli (see this comment and Fischer, Shriver & St. Jules, 2023 (EA Forum post)). Still, even a modest probability could make a difference, so the result will depend on your credences.


Unresolvable disagreements

Second, it could also be difficult for intelligent aliens and us, if both impartial, to agree on how to prioritize humans vs the aliens under uncertainty, if and because we’re using our own distinct standards to decide what matters and how much. Suppose the aliens have their own concept of a-suffering, which is similar to, but not necessarily identical to, our concept of suffering. It may differ from human suffering in that some functions are missing, or additional functions are present, or the number of times they’re realized differs, or the relative or absolute magnitudes of (e.g. cognitive) effects differ. Or, if they haven’t gotten that far in their understanding of a-suffering, it could just be the fact that a-suffering feels different or might feel different from human suffering, so their still vague concept picks out something potentially different from ours. Or vice versa.

In the same way chickens matter relatively more on the human-relative view than chickens do on the chicken-relative view, as above from Karnofsky, 2018, humans and the aliens could agree on (almost) all of the facts and have the same probability distributions for the ratio of the moral weight of human suffering to the moral weight of a-suffering, and yet still disagree on expected moral weights and about how to treat each other. Humans could weigh humans and aliens relative to human suffering, while the aliens could weigh humans and aliens relative to a-suffering. In relative terms and for prioritization, the aliens would weigh us more than we weigh ourselves, but we’d weigh them more than they weigh themselves.
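
The structure of this disagreement can be sketched numerically. The following minimal Python example assumes a shared two-point distribution over the ratio of the moral weight of human suffering to that of a-suffering. The distribution is hypothetical and chosen only to make the arithmetic easy to follow. Both parties use the same distribution and differ only in which quantity they fix at 1.

```python
from fractions import Fraction

# Shared, hypothetical distribution over the ratio
# R = (moral weight of human suffering) / (moral weight of a-suffering).
# Keys are possible values of R, values are their probabilities.
RATIO_DISTRIBUTION = {
    Fraction(1, 10): Fraction(1, 2),
    Fraction(10): Fraction(1, 2),
}

def expected(f):
    """Expected value of f(R) under the shared distribution."""
    return sum(p * f(r) for r, p in RATIO_DISTRIBUTION.items())

# Humans fix human suffering at 1, so a-suffering is worth 1/R.
alien_weight_for_humans = expected(lambda r: 1 / r)

# Aliens fix a-suffering at 1, so human suffering is worth R.
human_weight_for_aliens = expected(lambda r: r)

print(alien_weight_for_humans, human_weight_for_aliens)
```

With these assumed numbers, each side assigns the other an expected weight above 1 relative to itself, even though they agree on the distribution of the ratio. This is the same pattern as in the human-relative and chicken-relative comparisons.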

One might respond that this seems too agent-relative, and we should be able to agree on priorities if we agree on all the facts, and share priors and the same impartial utilitarian moral views. However, while consciousness remains unsolved, humans don't know what it's like to be the aliens or to a-suffer, and the aliens don't know what it's like to be us or suffer like us. We have access to different facts, and this is not a source of agent-relativity, or at least not an objectionable one. Furthermore, we are directly valuing our own experiences, human suffering, and the aliens are directly valuing their own, a-suffering, and if these differ enough, then we could also disagree about what matters intrinsically or how. This seems no more agent-relative than the disagreement between utilitarians that disagree just on whether hedonism or desire theory is true: a utilitarian grounding welfare based on human suffering and a utilitarian doing so based on a-suffering just disagree about the correct theory of wellbeing or how it scales.


Epistemic modesty about morality

Still, perhaps both we and the aliens should be more epistemically modest[22] about what matters intrinsically and how, and so give weight to the direct perspectives of the aliens. If we try to entertain and weigh all points of view, then we would need to make and agree on genuine intertheoretic comparisons of value, which seems hard to ground and justify, or else we’d use an approach that doesn’t depend on intertheoretic comparisons. This could bring us and the aliens closer to agreement about optimal resource allocation, and perhaps convergence under maximal epistemic modesty, assuming we also agree on how to weigh perspectives and an approach to normative uncertainty.

Doing this can take some care, because we’re uncertain about whether the aliens have any viewpoint at all for us to adopt, and similarly they could be uncertain about us having any such viewpoint. This could prevent full convergence.

On the other hand, chickens presumably don’t think at all about the moral value of human welfare in impartial terms, so there very probably is no such viewpoint to adopt on their behalf, or else only one that’s extremely partial, e.g. some chickens may care about some humans to which they are emotionally attached, and many chickens may fear or dislike humans. Chickens’ points of view therefore wouldn’t grant humans much or any moral weight at all, or may even grant us negative overall weight instead. However, the right response here may instead be against moral impartiality, not against humans in particular. Indeed, most humans seem to be fairly partial, too, and we might partially defer to them, too. Either way, this perspective doesn’t look like the chicken-relative comparison method B from Karnofsky, 2018 that grants humans astronomically more weight than chickens.

How might we get such a perspective? We might idealize: what would a chicken believe if they had the capacities and were impartial, while screening off the value from those extra capacities. Or, we might consider a hypothetical impartial human or other intelligent being whose capacities for suffering are like those of a chicken, whatever those may be. Rather than actual viewpoints for which we have specific evidence of their existence, we’re considering conceivable viewpoints.

I’ll say here that this seems pretty speculative and weird, so I have some reservations about this, but I’m not sure either way.

A plausibly stronger objection to epistemic modesty about moral (and generally normative) stances is that it can undermine whatever moral views you or I or anyone else holds too much, including the foundational beliefs of effective altruists or assumptions in the project of effective altruism, like impartiality and the importance of beneficence. I am strongly disinclined to practically abandon my own moral views this way. I think this is a more acceptable position than rejecting epistemic modesty about non-normative claims, especially for a moral antirealist, i.e. those who reject stance-independent moral facts. We may have no or only weak reasons for epistemic modesty about moral facts in particular.

On the other hand, rather than abandoning foundational beliefs, it may actually support them. It may capture impartiality in a fairly strong sense by weighing each individual’s normative stance(s). Any being who suffers finds their own suffering bad in some sense, and this stance is weighed. A typical parent cares a lot for their child, so the child gets extra weight through the normative stance of the parent. Some humans particularly object to exploitation and using others as means to ends, and this stance is weighed. Some humans believe it’s better for far more humans to exist, and this stance is weighed. Some humans believe it’s better for fewer humans to exist, and this stance is weighed. The result could look like a kind of impartial person-affecting preference utilitarianism, contractualism or Kantianism (see also Gloor, 2022),[23] but relatively animal-inclusive, because whether or not other animals meet some thresholds for rationality or agency, they could have their own perspectives on what matters, e.g. their suffering and its causes.

If normative stances across species, like even across humans, are often impossible to compare, then the implications for prioritization could be fundamentally indeterminate, or at least very vague. Or, they could be dominated by those with the most fanatical or lexical stances, who prioritize infinite value at stake without trading it off against mere finite stakes. Or, we might normalize each individual's values (or utility function) by their own range or variance in value (Cotton-Barratt et al., 2020), and other animals could outweigh humans through their numbers in the near term.
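
As a rough illustration of the last option, here is a minimal Python sketch of one simple reading of variance normalization. It assumes a finite set of options with equal weight on each, and the individuals and raw utilities are hypothetical. It is not a reconstruction of the specific method in Cotton-Barratt et al., 2020. Each stance is rescaled to mean 0 and variance 1 across the options before summing, so raw magnitudes no longer matter and numbers of individuals do.

```python
from statistics import mean, pstdev

def variance_normalize(utilities):
    """Rescale one stance's utilities to mean 0 and variance 1 across options."""
    mu, sigma = mean(utilities), pstdev(utilities)
    if sigma == 0:
        # An indifferent stance contributes nothing to the comparison.
        return [0.0] * len(utilities)
    return [(u - mu) / sigma for u in utilities]

# Two options: index 0 favours one human, index 1 favours ten other animals.
# Raw utilities are hypothetical and deliberately on very different scales.
human_stance = [100.0, 0.0]
animal_stance = [0.0, 1.0]
stances = [human_stance] + [animal_stance] * 10

normalized = [variance_normalize(s) for s in stances]
totals = [sum(option_values) for option_values in zip(*normalized)]

print(totals)
```

Here the human's raw stakes are 100 times larger, but after normalization each stance counts equally, so the ten animal stances outweigh the single human stance. Different choices of option set or measure over options would change the result, which is one reason to treat this only as a sketch.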


Other applications of the approach

What other intertheoretic comparisons of value could this epistemic approach apply to? I will consider:

  1. Realism vs illusionism about phenomenal consciousness.
  2. Moral realism vs moral antirealism.
  3. Person-affecting views vs total utilitarianism.


First, realism vs illusionism about phenomenal consciousness. Illusionists deny the phenomenal nature of consciousness and the existence of qualia as “Introspectable qualitative properties of experience that are intrinsic, ineffable, and subjective” (Frankish, 2012, preprint), introduced by Lewis (1929, pp.121, 124-125). Realists accept the phenomenal nature of consciousness and/or qualia. Illusionists do not deny that consciousness exists.[24] In section 5.2, Kammerer, 2019 argues that if phenomenal consciousness would ground moral value if it existed, it would be an amazing coincidence for pain to be as bad under (strong) illusionism, which denies the existence of phenomenal consciousness, as it is under realism which accepts the existence of phenomenal consciousness. However, if you're already a moral antirealist or take an epistemic approach to intertheoretic comparisons, then it seems reasonable to hold the strengths of your reasons to be the same, but just acknowledge that you may have misjudged their source or nature. Rather than phenomenal properties as their source, it could be quasi-phenomenal properties, where “a quasi-phenomenal property is a non-phenomenal, physical property (perhaps a complex, gerrymandered one) that introspection typically misrepresents as phenomenal” (Frankish, 2017, p. 18), or even the beliefs, appearances or misrepresentations themselves. Frankish (2012, preprint) proposed a theory-neutral explanandum for consciousness:

Zero qualia: The properties of experiences that dispose us to judge that experiences have introspectable qualitative properties that are intrinsic, ineffable, and subjective.

These zero qualia could turn out to be phenomenal, under realism, or non-phenomenal and so quasi-phenomenal under illusionism (Frankish, 2012, preprint), but the judgements to be captured are the same, so it seems reasonable to treat the resulting reasons as the same. Or, we could use a less precise common ground: consciousness, whatever it is.

A similar approach could be taken with respect to uncertainty between metaethical positions, using our moral judgements or intuitions as the common facts. Again, we may be wrong about the nature of what they’re supposed to refer to or even the descriptive reality of these moral judgements and intuitions — e.g. whether they express propositions, as in cognitivism, or desires, emotions or other pro-attitudes and con-attitudes, as in non-cognitivism (van Roojen, 2023), and, under cognitivism, whether they are stance-independent or stance-dependent —, but we will still have them in any case. I’d judge torture very negatively regardless of my metaethical stance. Even more straightforwardly, for any specific moral realist stance, there’s a corresponding subjectivist stance that recognizes the exact same moral facts (and vice versa?), but just interprets them as stance-dependent rather than stance-independent. Any non-cognitivist pro-attitude or desire could be reinterpreted as expressing a belief (or appearance) that something is better.[25] This could allow us to at least match identical moral theories, e.g. the same specific classical utilitarianism, under the different metaethical interpretations.

Riedener (2019) proposes a similar and more general constructivist approach based on epistemic norms.[26] He illustrates with person-affecting views vs total utilitarianism, arguing for holding the strengths of reasons to benefit existing people the same between welfarist person-affecting views and total utilitarianism,[27] which would tend to favour total utilitarianism under moral uncertainty. However, if we’re comparing a Kantian person-affecting view and total utilitarianism, he argues that we may have massively misjudged our reasons other than for beneficence between the two views. So the comparison is more complex, and reasons for beneficence could be stronger under total utilitarianism, while our other reasons could be stronger under Kantian views, and we should balance epistemic norms and the particulars to decide these differences.

To be clear, I’m much less convinced of the applications in these cases, and there are important reasons for doubt:

  1. Realist accounts of phenomenal consciousness are designed primarily to explain our actually (allegedly) phenomenal properties, which illusionists deny, while illusionism is designed primarily to explain our beliefs about consciousness or quasi-phenomenal properties that lead to them, so realist and illusionist accounts disagree about what is to be explained. That phenomenal properties are in practice or even by physical necessity quasi-phenomenal could be incidental to a realist, hence philosophical zombie (p-zombie) thought experiments (see Kirk, 2023 for a standard reference). If a moral position directly grants moral value to phenomenal properties in virtue of being phenomenal, then this is not a common ground with moral positions that instead grant moral value to quasi-phenomenal properties in virtue of being quasi-phenomenal or to the resulting dispositions. That being said, I think moral positions should generally not ground value on phenomenal consciousness specifically, but instead on consciousness, whatever it is.
  2. Similarly, moral realists take actually (allegedly) stance-independent moral facts as fundamental, which to them may not derive from (actual, hypothetical or idealized) moral judgements or intuitions, which are stances that could differ between people, while subjectivists and non-cognitivists seem to take our (actual, hypothetical or idealized) moral judgements or intuitions as fundamental. That moral judgements and intuitions are evidence about and sometimes track stance-independent moral facts isn’t enough for a moral realist, because they could be mistaken. There does not seem to be a common ground for granting moral value between these positions.
  3. Those holding apparently welfarist person-affecting views may disagree that total utilitarianism is designed to explain the same kinds of reasons or facts as their views are. They may understand their moral reasons in ways similar to a Kantian or contractualist or otherwise reject (standard conceptions of) axiology, but also deny act-omission distinctions and see in others the same kinds of reasons they have to promote their own interests. It’s not preference satisfaction or even welfare per se that matters, but that things go better or worse according to the preferences, points of view, ends or normative stances of individuals who have them.[28] And merely possible people don’t (actually) have them.

In each of the above cases, one view takes as fundamental and central measures or consequences of what the other view takes as fundamental and central.[29] This will look like Goodhart’s law to those who insist it’s not these measures or consequences that matter but what is being measured or the causes. Those holding one of the pairs of views could complain that the others are gravely mistaken about what matters and why, so Riedener (2019)’s conservatism may not tell us much about how to weigh the views. The comparisons seem less reasonable, and we could end up with two envelopes problems again, fixing one theory’s fundamental grounds and evaluating both theories relative to it.

On the other hand, while not every pair of theories of consciousness or the value of welfare will agree on common facts to explain, many will. For example, realists about phenomenal consciousness will tend to agree with each other that it’s (specific) phenomenal properties themselves that their theories are designed to explain, so we could compare reasons across realist theories. Illusionists will tend to agree that it’s our beliefs (or appearances) about consciousness that are the common facts to explain, so we could compare reasons across illusionist theories. And theories of welfare and its value are designed to explain, among other things, why suffering is bad or seems bad. So, many reason comparisons can be grounded in practice, even if not all. And regardless of the reasons and whether they can be compared across all views, the common facts from which comparable reasons derive are based on human experience, so our moral views are justifiably human-relative.

  1. ^

     Karnofsky (2018) wrote:

    In this case, a >10% probability on the human-inclusive view would be effectively similar to a 100% probability on the human-centric view.

    I assume he meant “human-centric view” instead of “human-inclusive view”, so I correct the quote with square brackets here.

  2. ^

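     $E\left[\frac{1}{X}\right] = \frac{1}{E[X]}$ if and only if $X$ is equal to a constant with probability 1, and $E\left[\frac{1}{X}\right] > \frac{1}{E[X]}$ if $X$ is nonnegative and not equal to a constant with probability 1. This follows from Jensen's inequality, because $f$ defined by $f(x) = \frac{1}{x}$ is convex.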

  3. ^

     I would either use per unit averages for chickens and humans, respectively, or assume here that the value scales in proportion (or at least linearly) with each unit of measured welfare for each of humans and chickens, separately.

  4. ^

     However, some may believe objective moral value is threatened by illusionism about phenomenal consciousness, which denies that phenomenal consciousness exists. These positions do still recognize that consciousness exists, but they deny that it is phenomenal. We could just substitute an illusionist account of consciousness wherever phenomenal consciousness was used in our ethical theories, although some further revisions may be necessary to accommodate differences. For further discussion, see Kammerer, 2019, Kammerer, 2022 or a later section in this piece. The difference here is because some ethical theories directly value phenomenal consciousness specifically, and not (or less) consciousness in general.

    Other examples could be free will, libertarian free will specifically or god(s) which may turn out not to exist, and so moral theories that tied some reasons specifically to them would lose those reasons.

    If a moral theory only places value on things that actually exist in some form, while being more agnostic about their nature, then the value can follow the vague and revisable concepts of those things.

  5. ^

    Except possibly for indirect and instrumental reasons. It’s useful to know water is H2O.

  6. ^

    This could be cashed out in terms of acquaintance, as in knowledge by acquaintance (Hasan, 2019, Duncan, 2021, Knowles & Raleigh, 2019), or appearance, as in phenomenal conservatism (Huemer, 2013). Adam Shriver made a similar point in conversation.

  7. ^

    This may be more illustrative than literal for me. Personally, it’s more that other people’s suffering seems directly and importantly bad to me, or indirectly and importantly bad through my emotional responses to their suffering.

  8. ^

    However, which kind of “seeming” or appearance should be used can depend on the theory of wellbeing, i.e. unpleasantness under hedonism, cognitive desires or motivational salience under desire theories and preferences under preference theories. I concede later that we may need to separate by these very broad accounts of welfare (and perhaps more finely) rather than treat them all as generating the same moral reasons.

  9. ^

    From conversation with multiple people, something like this seems to be the standard view.

  10. ^

    Our sympathetic responses to the suffering of another individual (chicken, human or otherwise) don’t necessarily reliably track how bad it is for them from their own perspective, but they probably track it more closely for other humans, because of greater similarity between humans (neurological, functional, cognitive, psychological, behavioural).

  11. ^

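     The expected value of the ratio of human moral weight to chicken moral weight is infinite (or undefined) if both moral weights are nonnegative and the chicken moral weight is 0 with nonzero probability, because we get division by 0 with nonzero probability. It is undefined if both moral weights are 0 together with nonzero probability, because we get 0/0 with nonzero probability.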

    However, in principle, humans in general or each proposed type of wellbeing could not matter with nonzero probability, so we could get a similar problem normalizing by human welfare or moral weights.

  12. ^

    There may be some ways to address the issue.

    You could treat the 0 moral weight like an infinitesimal and do arithmetic with it, but I think this entirely denies the possibility that chickens don’t matter at all. This seems ad hoc and to have little or no independent justification.

    You could first take conditional expected values in the denominator (and numerator) that give nonzero values, assuming Cromwell’s rule, before taking the ratio and the expected value of the ratio. In other words, you take the expected value of a ratio of conditional expected values of moral weights. Then, in effect, you’re treating the conditional expected value of chicken moral weight as equal across some views. Most naturally, you would take the conditional expected values over descriptive uncertainty, conditional on each fixed normative stance (so that the resulting prescriptions would agree with each normative stance), and then take the expected value of the ratio across these normative stances/theories (over normative uncertainty).

  13. ^

    If you had already measured all the liquid water directly and precisely, you wouldn’t expect any more or less liquid water from finding out ice is also water.

  14. ^

    I even doubt that there is any precise fact of the matter for the ratio of their intensities or moral disvalue.

  15. ^

    Approaches include Open Philanthropy’s worldview diversification approach (Karnofsky, 2018), variance voting (MacAskill et al., 2020, Ch4), moral parliaments (Newberry & Ord, 2021), a bargain-theoretic approach (Greaves & Cotton-Barratt, 2019), or the Property Rights Approach (Lloyd, 2022). For an overview of moral uncertainty, see MacAskill et al., 2020.

  16. ^

    With multiple values for a given , e.g. a distribution of values, we could get a distribution or set of expected moral weights for chickens and humans. To these, we could apply an approach to moral uncertainty that doesn’t depend on intertheoretic reason comparisons.

  17. ^

    Let  and  be the quantile functions of  and , respectively. Then, for p between 0 and 1,



  18. ^

     and  gives .

  19. ^

     However, some major moral theories don’t weigh reasons by summation, aggregate at all or take expected values. The expected moral weights of chickens and humans may not be very relevant in those cases.

  20. ^

     Carey and Fry (1995) showed that pigs generalize the discrimination between non-anxiety states and drug-induced anxiety to non-anxiety and anxiety in general, in this case by pressing one lever repeatedly with anxiety, and alternating between two levers without anxiety (the levers gave food rewards, but only if they pressed them according to the condition). Many more such experiments were performed on rats, as discussed in Sánchez-Suárez, 2016, summarized in Table 2 on pages 63 and 64 and discussed further across chapter 3. Rats could discriminate between the injection of the anxiety-inducing drug PTZ and saline injection, including at subconvulsive doses. Various experiments with rats and PTZ have effectively ruled out convulsions as the discriminant, further supporting that it’s the anxiety itself that they’re discriminating, because they could discriminate PTZ from control without generalizing between PTZ and non-anxiogenic drugs, and with the discrimination blocked by anxiolytics and not nonanxiolytic anticonvulsants. Rats further generalized between various pairs of anxiety(-like) states, like those induced by PTZ, drug withdrawal, predator exposure, ethanol hangover, “jet lag”, defeat by a rival male, high doses of stimulants like bemegride and cocaine, and movement restraint.

    However, Mason and Lavery (2022) caution:

    But could such results merely reflect a “blindsight-like” guessing: a mere discrimination response that need not reflect underlying awareness? After all, as we have seen for S.P.U.D. subjects, decerebrated pigeons can use colored lights as DSs (128), and humans can use subliminal visual stimuli as DSs [e.g., (121)]. We think several refinements could reduce this risk.

  21. ^

     There are exponential (non-tight) upper bounds for the number of connected subgraphs of a graph, and hence connected neural subsystems of a brain (Pandey & Patra, 2021, Filmus, 2018). However, not any such connected subsystem would be conscious. Also, with bounded degree, i.e. a bounded number of connections/synapses per neuron in your set of brains under consideration, the number of connected subgraphs can be bounded above by a polynomial function of the number of neurons (Eppstein, 2013).

  22. ^

    For a defense of epistemic modesty, see Lewis, 2017.

    Aumann's agreement theorem, which supports convergence in beliefs between ideally rational Bayesians with common priors about events of common knowledge, may not be enough for convergence here. This is because our conscious experiences are largely private and not common knowledge. Even if they aren’t inherently private, without significant advances in theory or technology that would resolve remaining factual disagreements or far more introspection and far more detailed introspective reports than are practical, they’ll remain largely private in practice.

    Or, our priors could differ, based on our distinct conscious experiences, which we use as references to understand moral patienthood and often moral value in general.

  23. ^

    I’d only be inclined to weigh the actual or idealized intrinsic/terminal values of actual moral patients, not any possible or conceivable moral patients or perspectives. The latter also seems particularly ill-defined. How would we weigh possible or conceivable perspectives?

  24. ^

    The term ‘illusionism’ seems prone to cause misunderstanding, and multiple illusionists have taken issue with the term, including Graziano (2016, ungated), Humphrey (2016) and Veit and Browning (2023, preprint).

  25. ^

    See my previous piece discussing how desires and hedonic states may be understood as beliefs or appearances of normative reasons. Others have defended desire-as-belief, desire-as-perception and generally desire-as-guise or desire-as-appearance of normative reasons, the good or what one ought to do. See Schroeder, 2015, 1.3 for a short overview of different accounts of desire-as-guise of good, and Part I of Deonna (ed.) & Lauria (ed), 2017 for more recent work on and discussion of such accounts and alternatives. See also Archer, 2016, Archer, 2020 for some critiques, and Milona & Schroeder 2019 for support for desire-as-guise (or desire-as-appearance) of reasons. A literal interpretation of Roelofs (2022, ungated)’s “subjective reasons, reasons as they appear from its perspective” would be as desire-as-appearance of reasons.

  26. ^

    Riedener, 2019 writes, where IRCs is short for intertheoretic reason-comparisons:

    So I’ll propose a version of this approach, on which ought-facts are grounded in epistemic norms. In other words, I’ll propose a form of constructivism about IRCs. If I’m right, IRCs are not facts out there that hold independently of facts about morally uncertain agents. They hold in virtue of being the result of an ideally reasonable deliberation, in terms of certain epistemic norms, about what you ought to do in light of your uncertainty.


    So very roughly, these norms suggest that without any explanation, you shouldn’t assume that you’ve always systematically and radically misjudged the strength of your everyday paradigm reasons. And they imply that you should more readily assume you may have misjudged some reasons if you have an explanation for why and how you may have done so, or if these reasons are less mundane and pervasive. This seems intuitively plausible. But Simplicity, Conservatism and Coherence might be false, or not quite correct as I’ve stated them, or there might be other and more important norms besides them. My aim is not to argue for these precise norms. I’m happy if it’s plausible that some such epistemic norms hold, and that they can constrain the IRCs or ought-judgements you can reasonably make. If that’s so, we can invoke a form of constructivism to ground IRCs. We can understand truth about IRCs as the outcome of ideally reasonable deliberation – in terms of principles like the above – about what you ought to do in light of your uncertainty. By comparison, consider the view that truth in first-order moral theory is simply the result of an ideal process of systematizing our pre-theoretical moral beliefs. On this view, it’s not that there’s some independent Platonic realm of moral facts, and that norms like simplicity and coherence are best at guiding us towards it. Rather, the principles are first, and ‘truth’ is simply the outcome of the principles. We can invoke a similar kind of constructivism about IRCs. On this view, principles like Simplicity, Conservatism and Coherence are not justified in virtue of their guiding us towards an independent realm of ought-facts or IRCs. Rather, they help constitute this realm.

    So this provides an answer to why some ought-facts or IRCs hold. It’s not because of mind-independent metaphysical facts about how theories compare, or how strong certain reasons would be if we had them. It’s simply because of facts about how to respond reasonably to moral evidence or have reasonable moral beliefs. Ultimately, we might say, it’s because of facts about us – about why we might have been wrong about morality, and by how much and in what way, and so on.

  27. ^

    Riedener (2019) writes:

    According to TU, you have all the reasons that you have according to PAD – reasons to benefit existing others – but also some additional reasons beyond them. So on this interpretation, the least radical change in your credences and the most simple ultimate credence distribution will be such that your reasons to benefit existing people are the same on both theories. Unless you have some additional beliefs that could render other beliefs more coherent, this IRC will be most reasonable in light of the above principles.

  28. ^

    Rabinowicz and Österberg (1996) describe similar accounts as object versions of preference views, contrasting them with satisfaction versions, which are instead concerned with preference satisfaction per se. Also similar are actualist preference-affecting views (Bykvist, 2007) and conditional reasons (Frick, 2020).

  29. ^

    Or in the case of illusionism vs realism about phenomenal consciousness on one interpretation of illusionism, the comparisons are grounded based on such measures or consequences for both, i.e. the (real or hypothetical) dispositions for phenomenality/qualia beliefs, but what matters are the quasi-phenomenal properties that lead to these beliefs, which are either actually phenomenal under realism or not under illusionism. On another interpretation of illusionism, it’s the beliefs themselves that matter, not quasi-phenomenal properties in general. For more on the distinction, see Frankish, 2021.


Re. the scenario with the intelligent aliens, you argue that we just have access to different facts, so it's unobjectionable that we reach different conclusions.

But the classic two-envelope problem is a problem because you get ~exploited. Offered a choice of two envelopes you pick one. And then when you open it you will predictably pay money to switch to the other envelope. Of course now you have extra facts — but that doesn't change that it looks like a mistake to predictably have this behaviour.

Similarly in this case we could set up an (admittedly contrived) situation where you start by doing a bunch of reasoning about what's best, under a veil of ignorance about whether you're human or alien. Then it's revealed which you are, you remember all your experiences and can reason about how big a deal they are, and then you will predictably pay some utility in order to benefit the other species more. It similarly looks like it's a mistake to predictably have this behaviour (in the sense that, if humans and aliens are equally likely to be put in this kind of contrived situation, then the world would be predictably better off if nobody had this behaviour), and I don't really feel like you've addressed this.

In the case of the classic two-envelope paradox the standard resolution is that you need to pay attention to your priors about how much money might be in envelopes. After you open your envelope and find $100, your probabilities of $50 vs $200 are no longer quite 50% — and for some values you could find, you should prefer not to switch.[1]

So in the case with the aliens, shouldn't we similarly be discussing priors? Shouldn't we be considering how much, on some kind of ur-prior, we should expect to experience, and then comparing what our actual experience is to that? And if we're doing this in the case of aliens, shouldn't we also do it in the case of chickens?

  1. ^

    At least with tame priors. If your prior over the amount of money in the envelope has an infinite expectation, it's possible for it to be always correct to switch. But in that case I imagine your complaint will be that you shouldn't start with a prior with infinite expectation.
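To make the role of priors concrete, here is a minimal sketch in Python. It assumes a hypothetical proper prior in which the smaller amount is equally likely to be 1, 2, 4, 8 or 16; these values and the function name `expected_other` are illustrative assumptions, not part of the discussion above. Under this prior, switching looks good for most observed amounts, but not for the largest one.

```python
from fractions import Fraction

# Hypothetical proper prior: the smaller amount is uniform over these values.
SMALLER_AMOUNTS = (1, 2, 4, 8, 16)

def expected_other(seen, smaller_amounts=SMALLER_AMOUNTS):
    """Posterior expected content of the unopened envelope, given the amount seen."""
    # Each (pair, which-envelope-you-picked) combination is equally likely,
    # so the two hypotheses below get weight 1 if possible and 0 otherwise.
    w_small = 1 if seen in smaller_amounts else 0  # other envelope holds 2 * seen
    w_large = 1 if seen % 2 == 0 and seen // 2 in smaller_amounts else 0  # other holds seen / 2
    if w_small + w_large == 0:
        raise ValueError("this amount is impossible under the assumed prior")
    return (w_small * Fraction(2 * seen) + w_large * Fraction(seen, 2)) / (w_small + w_large)

for seen in (1, 4, 32):
    print(seen, expected_other(seen))  # 1 -> 2, 4 -> 5, 32 -> 16
```

Seeing 4 gives an expected 5 in the other envelope (the familiar 1.25 factor), but seeing 32 means the other envelope must hold 16, so you should not switch.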

Similarly in this case we could set up an (admittedly contrived) situation where you start by doing a bunch of reasoning about what's best, under a veil of ignorance about whether you're human or alien. Then it's revealed which you are, you remember all your experiences and can reason about how big a deal they are, and then you will predictably pay some utility in order to benefit the other species more.

In this case, assuming you have no first-person experience with suffering to value directly (or memory of it), you would develop your concept of suffering third-personally (based on observations of and hypotheses about humans, aliens, chickens and others, say) and could base your ethics on that concept. This is not how humans or the aliens would typically understand and value suffering, which is largely first-personally. The human has their own vague revisable placeholder concept of suffering on which they ground value, and the alien has their own (and the chicken might have their own). Each also differs from the hypothetical third-personal concept.

Technically, we could say the humans and aliens have developed different ethical theories from each other, even if everyone's a classical utilitarian, say, because they're picking out different concepts of suffering on which to ground value.[1] And your third-personal account would give a different ethical theory from each, too. All three (human, alien, third-personal) ethical theories could converge under full information, though, if the concepts of suffering would converge under full information (and if everything else would converge).[2]

With the third-personal concept, I doubt there'd be a good solution to this two envelopes problem that actually gives you exactly one common moral scale and corresponding prior when you have enough uncertainty about the nature of suffering. You could come up with such a scale and prior, but you'd have to fix something pretty arbitrarily to do so. Instead, I think the thing to do is to assign credences across multiple scales (and corresponding priors) and use an approach to moral uncertainty that doesn't depend on comparisons between them. (EDIT: And these could be the alien stance and human stance which relatively prioritize the other and result in a two envelopes problem.) But what I'll say below applies even if you use a single common scale and prior.

When you have first-person experience with suffering, you can narrow down the common moral scales under consideration to ones based on your own experience. This would also have implications for your credences compared to the hypothetical third-person perspective.

If you started from no experience of suffering and then became a human, alien or chicken and experienced suffering as one of them, you could then rule out a bunch of scales (and corresponding priors). This would also result in big updates from your prior(s). You'd end up in a human-relative, alien-relative or chicken-relative account (or multiple such accounts, but for one species only).

  1. ^

    A typical chicken very probably couldn't be a classical utilitarian.

  2. ^

    A typical chicken's concept of suffering wouldn't converge, but we could capture/explain it. Their apparent normative stances wouldn't converge either, unless you imagine radically different beings.

I understand that you're explaining why you don't really think it's well modelled as a two-envelope problem, but I'm not sure whether you're biting the bullet that you're predictably paying some utility in unnecessary ways (in this admittedly convoluted hypothetical), or if you don't think there's a bullet there to bite, or something else?

Alternatively, you might assume you actually already are a human, alien or chicken, have (and remember) experience with suffering as one of them, but are uncertain about which you in fact are. For illustration, let's suppose human or alien. Because you're uncertain about whether you're an alien or human, your concept of suffering points to one that will turn out to be human suffering with some probability, p, and alien suffering with the rest of the probability, 1-p. You ground value relative to your own concept of suffering, which could turn out to be (or be revised to) the human concept or the alien concept with the respective probabilities.

Let H_H be the moral weight of human suffering according to a human concept of suffering, directly valued, and A_H be the moral weight of alien suffering according to a human concept of suffering, indirectly valued. Similarly, let A_A and H_A be the moral weights of alien suffering and human suffering according to the alien concept of suffering. A human would fix H_H, build a probability distribution for A_H relative to H_H and evaluate A_H in terms of it. An alien would fix A_A, build a probability distribution for H_A relative to A_A and evaluate H_A in terms of it.

You're uncertain about whether you're an alien or human. Still, you directly value your direct experiences. Assume A_A and H_H specifically represent the moral value of an experience of suffering you've actually had,[1] e.g. the moral value of a toe stub, and you're doing ethics relative to your toe stubs as the reference point. You therefore set A_A = H_H. You can think of this as a unit conversion, e.g. 1 unit of alien toe stub-relative suffering = 10 units of human toe stub-relative suffering.

This solves the two envelopes problem. You can either use A_A or H_H to set your common scale, and the answer will be the same either way, because you've fixed the ratio between them. The moral value of a human toe stub, H, will be H_H with probability p, and H_A with probability 1-p. The moral weight of an alien toe stub, A, will be A_H with probability p and A_A with probability 1-p. You can just take expected values in either the alien or human units and compare.
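Here is a minimal sketch of that calculation in Python. The credence `p` and the two within-stance ratios are hypothetical numbers chosen only for illustration, and the "naive" quantities at the end are my own contrast case (averaging ratios with no shared unit), not something defined above.

```python
from fractions import Fraction

p = Fraction(1, 2)          # hypothetical credence that you are human
r_human = Fraction(1, 10)   # hypothetical A_H / H_H: alien toe stub judged from the human stance
r_alien = Fraction(1, 10)   # hypothetical H_A / A_A: human toe stub judged from the alien stance

def expected_values(unit):
    """Return (E[H], E[A]) when your own toe stub (A_A = H_H) is worth `unit`."""
    h_h = a_a = unit
    e_h = p * h_h + (1 - p) * r_alien * a_a   # H is H_H w.p. p, H_A w.p. 1-p
    e_a = p * r_human * h_h + (1 - p) * a_a   # A is A_H w.p. p, A_A w.p. 1-p
    return e_h, e_a

def ratio(unit):
    """E[A] / E[H]; the choice of unit cancels out."""
    e_h, e_a = expected_values(unit)
    return e_a / e_h

# Contrast: with no shared unit, averaging the ratios favours the other species both ways.
naive_a_over_h = p * r_human + (1 - p) / r_alien
naive_h_over_a = p / r_human + (1 - p) * r_alien

print(ratio(1), ratio(10))             # same answer in either unit
print(naive_a_over_h, naive_h_over_a)  # both exceed 1
```

With these symmetric illustrative numbers, the fixed-unit ratio is 1 whichever unit you use, whereas the naive averages each say the other species matters about five times more.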

We could also allow you to have some probability of being a chicken under this thought experiment. Then you could set A_A = H_H = C_C, with C_C representing the value of a chicken toe stub to a chicken, and C_A, C_H, A_C and H_C defined like above.

But if you're actually a chicken, then you're valuing human and alien welfare as a chicken, which is presumably not much, since chickens are very partial (unless you idealize). Also, if you're a human, it's hard to imagine being uncertain about whether you're a chicken. There's way too much information you need to screen off from consideration, like your capacities for reasoning and language and everything that follows from these. And if you're a chicken, you couldn't imagine yourself as a human or being impartial at all.

So, maybe this doesn't make sense, or we have to imagine some hypothetically cognitively enhanced chicken or an intelligent being who suffers like a chicken. You could also idealize chickens to be impartial and actually care about humans, but then you're definitely forcing them into a different normative stance than the ones chickens actually take (if any).

  1. ^

    It would have to be something "common" to the beings under consideration, or you'd have to screen off information about who does and doesn't have access to it or use of that information, because otherwise you'd be able to rule out some possibilities for what kind of being you are. This will look less reasonable with more types of beings under consideration, in case there's nothing "common" to all of them. For example, not all moral patients have toes to stub.

(Replying back at the initial comment to reduce thread depth and in case this is a more important response for people to see.)

I understand that you're explaining why you don't really think it's well modelled as a two-envelope problem, but I'm not sure whether you're biting the bullet that you're predictably paying some utility in unnecessary ways (in this admittedly convoluted hypothetical), or if you don't think there's a bullet there to bite, or something else?

Sorry, yes, I realized I missed this bit (EDIT: and which was the main bit...). I guess then I would say your options are:

  1. Bite the bullet (and do moral trade).
  2. Entertain both the human-relative stance and the alien-relative stance even after finding out which you are,[1] say due to epistemic modesty. I assume these stances won't be comparable on a common scale, at least not without very arbitrary assumptions, so you'd use some other approach to moral uncertainty.
  3. Make some very arbitrary assumptions to make the problem go away.

I think 1 and 2 are both decent and defensible positions. I don't think the bullet to bite in 1 is really much of a bullet at all.

From your top-level comment:

Then it's revealed which you are, you remember all your experiences and can reason about how big a deal they are, and then you will predictably pay some utility in order to benefit the other species more. It similarly looks like it's a mistake to predictably have this behaviour (in the sense that, if humans and aliens are equally likely to be put in this kind of contrived situation, then the world would be predictably better off if nobody had this behaviour), and I don't really feel like you've addressed this.

The aliens and humans just disagree about what's best, and could coordinate (moral trade) to avoid both incurring unnecessary costs from relatively prioritizing each other. They have different epistemic states and/or preferences, including moral preferences/intuitions. Your thought experiment decides what evidence different individuals will gather (at least on my bullet-biting interpretation). You end up with similar problems generally if you decide behind a veil of ignorance what evidence different individuals are going to gather (e.g. fix some facts about the world and decide ahead of time who will discover which ones) and which epistemic states they'd end up in, even if they start from the same prior.

Maybe one individual comes to believe bednets are the best for helping humans, while someone else comes to believe deworming is. If the bednetter somehow ends up with deworming pills, they'll want to sell them to buy bednets. If the dewormer ends up with bednets, they'll want to sell them to buy deworming pills. They could both do this at deadweight loss in terms of pills delivered, bednets delivered, cash and/or total utility. Instead, they could just directly trade with each other, or coordinate and agree to just deliver what they have directly or to the appropriate third party.

EDIT: Now, you might say they can just share evidence and then converge in beliefs. That seems fair for the dewormer and bednetter, but it's not currently possible for me to fully explain the human experience of suffering to an alien, or to give an alien access to that experience. If and when that does become possible, we'd be able to agree much more.

Another illustration: suppose you don't know whether you'll prefer apples or oranges. You try both. From then on, you're going to predictably pay more for one than the other. Some other people will do the opposite. Whenever an apple-preferrer ends up with an orange for whatever reason, they would be inclined to trade it away to get an apple. Symmetrically for the orange-preferrer. They might both do so together at deadweight loss and benefit from directly trading with each other.

This doesn't seem like much of a bullet to bite.

  1. ^

    Or your best approximations of each, given you'll only have direct access to one.

I don't think that the apples and oranges case is analogous, since then it's really about different preferences. In this case I'm assuming that all the parties have the same ultimate preferences (to create more good morally relevant experiences and fewer bad ones), but different pieces of evidence.

I do think the deworming and bednets case is analogous. Suppose the two of us are in a room before we go out to gather evidence. We agree that there is a 50% chance that bednets are twice as good as deworming, and a 50% chance that deworming is twice as good. Neither of us has a great idea how good either of them is.

One of us goes off to study bednets. After that they have a reasonable sense of how good bednets are, and predictably prefer deworming (for 2-envelope reasons). The other goes to study deworming, and afterwards predictably prefers bednets. At this point we each have an expertise which makes our work 10% more effective on the thing we're expert in, but we each choose to eschew our expertise as the benefit from switching envelopes is higher.
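As a worked version of the arithmetic in this step (a sketch, taking the value $b$ of bednets as the fixed unit once you have studied them, and reading "10% more effective" as a multiplier of 1.1; both readings are assumptions):

$$E[\text{deworming}] = \tfrac{1}{2}(2b) + \tfrac{1}{2}\left(\tfrac{b}{2}\right) = 1.25\,b > 1.1\,b$$

So the bednet expert prefers to switch despite their expertise. By symmetry, the deworming expert, fixing deworming at $d$, gets $E[\text{bednets}] = 1.25\,d > 1.1\,d$ and switches too. If both switch, each gives up the 10% expertise bonus, which is the cost that a prior commitment or trade would avoid.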

We'd like to morally trade so that we each stay working in our domain of expertise. But suppose that later we'll be causally disconnected and unable to engage in moral trade. We'd still like to commit at the start to a trade where neither party switches.

Now suppose that there's only you, and you're about to flip a coin to decide if you'll go to study bednets or deworming. You'd prefer to commit to not then switching to the other thing.

But suppose you forgot to make that commitment, and are only thinking about this after having flipped the coin and discovered you're about to study bednets. Your epistemic position hasn't yet changed, only your expectation of future evidence. Surely(?) you'd still want to make the commitment at this point?

Now if you only think about it later, having studied bednets, I'm imagining that you think "well I would have wanted to commit earlier, but now that I know about how good bednets are I think deworming is better in expectation, so I'm glad I didn't commit". Is that right? (I prefer to act as though I'd made the commitment I predictably would have wanted to make.)

Now suppose that there's only you, and you're about to flip a coin to decide if you'll go to study bednets or deworming. You'd prefer to commit to not then switching to the other thing.

Maybe? I'm not sure I'd want to constrain my future self this way, if it won't seem best/rational later. I don't very strongly object to commitments in principle, and it seems like the right thing to do in some cases, like Parfit's hitchhiker. However, those assume the same preferences/scale after, and in the two envelopes problem, we may not be able to assume that. It could look more like preference change.

In this case, it looks like you're committing to something you will predictably later regret either way it goes (because you'll want to switch), which seems kind of irrational. It looks like violating the sure-thing principle. Plus, either way it goes, it looks like you'll fail to follow your own preferences later, and it will seem irrational then. Russell and Isaacs (2021) and Gustafsson (2022) also argue similarly against resolute choice strategies.

I'm more sympathetic to acausal trade with other beings that could simultaneously exist with you (even if you don't know ahead of time whether you'll find bednets or deworming better in expectation), if and because you'll expect the world to be better off for it at every step: ahead of time, just before you follow through and after you follow through. There's no expected regret. In an infinite multiverse (or given a non-negligible chance of one), we should expect such counterparts to exist, though, so we plausibly should do the acausal trade.

Also, I think you'd want to commit ahead of time to a more flexible policy for switching that depends on the specific evidence you'll gather.[1]

Now if you only think about it later, having studied bednets, I'm imagining that you think "well I would have wanted to commit earlier, but now that I know about how good bednets are I think deworming is better in expectation, so I'm glad I didn't commit". Is that right? (I prefer to act as though I'd made the commitment I predictably would have wanted to make.)

Ya, that seems mostly right on first intuition.

However, acausal trade with counterparts in a multiverse still seems kind of compelling.

Also, I see some other appeal in favour of committing ahead of time to stick with whatever you study (and generally making the commitment earlier, too, contra what I say above in this comment): you know there's evidence you could have gathered that would tell you not to switch, because you know you would have changed your mind if you did, even if you won't gather it anymore. Your knowledge of the existence of this evidence is evidence that supports not switching, even if you don't know the specifics. It seems like you shouldn't ignore that. Maybe it doesn't go all the way to support committing to sticking with your current expertise, because you can favour the more specific evidence you actually have, but maybe you should update hard enough on it?

This seems like it could avoid both the ex ante and ex post regret so far. But, still you either:

  1. can't be an EU maximizer, and so you'll be vulnerable to money pump arguments anyway or abandon completeness and often be silent on what to do (e.g. multi-utility representations), or
  2. have to unjustifiably fix a single scale and prior over it ahead of time.


The same could apply to humans vs aliens. Even if we're not behind the veil of ignorance now and never were, there's information that we'd be ignoring: what real or hypothetical aliens would believe and the real or hypothetical existence of evidence that supports their stance.

But, it's also really weird to consider the stances of hypothetical aliens. It's also weird in a different way if you imagine finding out what it's like to be a chicken and suffer like a chicken.

  1. ^

    Suppose you're justifiably sure that each intervention is at least not net negative (whether or not you have a single scale and prior). But then you find out bednets have no (or tiny) impact. I think it would be reasonable to switch to deworming at some cost. Deworming could be less effective than you thought ahead of time, but no impact is as bad as it gets given your credences ahead of time.

We should fix and normalize relative to the moral value of human welfare, because our understanding of the value of welfare is based on our own experiences of welfare

I used to think this for exactly the same reason, but I now no longer do. The basic reason I changed my mind is the idea that uncertainty in the amount of welfare humans (or chickens) experience is naturally scale invariant. This scale invariance means that observing any particular absolute amount of welfare (by experiencing it directly) shouldn't update you as to the relative amount of welfare under different theories.

The following is a fairly "heuristic" version of the argument. I spent some time trying to formalise it better but got stuck on the maths, so I'm giving the version that was in my head before I tried that. I'm quite convinced it's basically true though.

The argument

Consider only theories that allow the most aggregation-friendly version of hedonistic utilitarianism[1]. Under this constraint, the total amount of utility experienced by one or more moral patients is some real quantity that can be expressed in objective units ("hedons"), and this quantity is comparable across the theories that we are allowing. You might imagine that you could consult God as to the utility of various world states and He could say truthfully "ah, stubbing your toe is -1 hedon". In your post you also suppose that you can measure this amount yourself through direct experience, which I find reasonable.

From the perspective of someone who is unable to experience utility themselves, there is a natural scale invariance to this quantity. This is clearest when considering the "ought" side of the theory: the recommendations of utilitarianism are unchanged if you scale utility up or down by any positive factor, as this doesn't affect the rank ordering of world states.

Another way to get this intuition is to imagine an unfeeling robot that derives the concept of utility from some combination of interviewing moral patients and constructing a first principles theory[2]. It could even get the correct theory, and derive that e.g. breaking your arm is 10 times as bad as stubbing your toe. It would still be in the dark about how bad these things are in absolute terms though. If God told it that stubbing your toe was –1 hedons that wouldn't mean anything to the robot. God could play a prank on the robot and tell it stubbing your toe was instead –1 millihedons, or even temporarily imbue the robot with the ability to feel pain and expose it to –1 millihedons and say "that's what stubbing your toe feels like". This should be equally unsurprising to the robot as being told/experiencing –1 hedon.
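A minimal sketch of this scale invariance in Python, using made-up world states and utilities (all names and numbers are hypothetical, and the rescaling factor is assumed to be positive): the ranking the robot derives is the same whether a toe stub is -1 hedon or -1 millihedon.

```python
from fractions import Fraction

# Hypothetical world states, each a list of individual utilities in "hedons".
worlds = {
    "stub_toe": [Fraction(-1), Fraction(0)],
    "break_arm": [Fraction(-10), Fraction(0)],
    "nothing_happens": [Fraction(0), Fraction(0)],
}

def ranking(worlds, scale):
    """World states ordered from best to worst by total utility after rescaling."""
    totals = {name: scale * sum(utilities) for name, utilities in worlds.items()}
    return sorted(totals, key=totals.get, reverse=True)

in_hedons = ranking(worlds, Fraction(1))
in_millihedons = ranking(worlds, Fraction(1, 1000))
print(in_hedons == in_millihedons)  # the ordering does not depend on the scale
```

The relative claim (a broken arm is 10 times as bad as a stubbed toe) survives the rescaling; only the absolute magnitudes change.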

My claim is that the epistemic position of all the different theories of welfare is effectively that of this robot. And as a result of this, observing any absolute amount of welfare (utility) under theory A shouldn't update you as to what the amount would be under theory B, because both theories were consistent with any absolute amount of welfare to begin with. In fact they were "maximally uncertain" about the absolute amount; no amount should be any more or less of a surprise under either theory.

If you had a prior reason to think theory B gives say 5 times the welfare to humans as theory A (importantly in relative terms), then you should still think this after observing the absolute amount yourself, and this is what generates the thorny version of the two envelopes problem. I think there are sensible prior reasons to think there is such a relative difference for various pairs of theories.

For instance, suppose both A and B are essentially "neuron count" theories and agree on some threshold brain complexity for sentience, but then A says "amount of sentience" scales linearly with neuron count whereas B says it scales quadratically. It's reasonable to think that the amount of welfare in humans is much higher under B, maybe  times higher.

Other examples where arguments like this can be made are:

  • A and B are the same except B has multiple conscious subsystems
  • A and B are predicting chicken welfare rather than human, and A says they are sentient whereas B says they are not. Clearly B predicts 0 times the welfare of A (equivalently A predicts infinity times the welfare of B)

Putting this in two envelopes terms

If we say we have two theories, 1 and 2, which you might imagine are a human-centric (small $\frac{C_1}{H_1}$)[4] and an animal-inclusive (larger $\frac{C_2}{H_2}$) view, then, writing $p_i$ for the credence in theory $i$ and $C_i$ and $H_i$ for chicken and human welfare under theory $i$, we have:

$$E\left[\frac{C}{H}\right] = p_1 \frac{C_1}{H_1} + p_2 \frac{C_2}{H_2} \tag{1}$$

As we are used to seeing.

But as you point out in your post, the quantities $H_1$ and $H_2$ are not necessarily the same (though you argue they should be treated as such), which makes this a nonsensical average of dimensionless numbers. E.g. $H_1$ could be 0.00001 hedons and $H_2$ could be 10 hedons, which would mean we are massively overcounting theory 1. The quantities we actually care about are $E[C]$ and $E[H]$ (dimension-ed numbers in units of hedons), or their ratio $\frac{E[C]}{E[H]}$. We can write these as:

$$E[H] = p_1 H_1 + p_2 H_2, \qquad E[C] = p_1 \frac{C_1}{H_1} H_1 + p_2 \frac{C_2}{H_2} H_2 \tag{2}$$

This may seem like a roundabout way of writing these down, but remember that what we have from our welfare range estimates are values for $\frac{C_i}{H_i}$, so these can't be cancelled further and the $H_i$s are the minimum number of parameters we can add to pin down the equations. The ratio $\frac{E[C]}{E[H]}$ is then:

$$\frac{E[C]}{E[H]} = \frac{p_1 \frac{C_1}{H_1} H_1 + p_2 \frac{C_2}{H_2} H_2}{p_1 H_1 + p_2 H_2} \tag{3}$$

I find this easier to think about if the ratios are in terms of a specific theory, e.g. $\frac{H_2}{H_1}$, so you are always comparing what the relative amount of welfare is in theory X vs some definite reference theory. We can rearrange (3) to support this by dividing all the fractions through by $H_1$:

$$\frac{E[C]}{E[H]} = \frac{p_1 \frac{C_1}{H_1} + p_2 \frac{C_2}{H_2} \frac{H_2}{H_1}}{p_1 + p_2 \frac{H_2}{H_1}} \tag{4}$$

Again, maybe this seems incredibly roundabout, but in this form it is clearer that we now only need the ratios $\frac{H_i}{H_1}$, not their absolute values. This is good according to the previous claims I have made:

  1. Because of scale invariance, it's not possible to say anything about the absolute value of $H_i$
  2. It is possible to reason about the relative welfare values between theories, represented by $\frac{H_i}{H_1}$

So under this framing the "solution to the two envelopes problem for moral weights" is that you need to estimate the inter-theoretic welfare ratios for humans (or any reference moral patient), as well as the intra-theoretic ratios between moral patients. I.e. you have to estimate $\frac{H_2}{H_1}$ as well as $\frac{C_1}{H_1}$ and $\frac{C_2}{H_2}$ for each theory.

I think this is still quite a big problem because of the potential for arguing that some theories have combinatorially higher welfare than others, thus causing them to dominate even if you put a very low probability on them. The neuron count example above is like this. You could make it even worse by supposing a theory where welfare is exponential in neuron count.

Returning to the human-centric vs animal inclusive toy example

If we say we have two theories, 1 and 2, which you might imagine are a human-centric (small $\frac{C_1}{H_1}$)[4] and an animal-inclusive (larger $\frac{C_2}{H_2}$) view

Adding these $\frac{H_2}{H_1}$ numbers into this example we now have:

$$\frac{E[C]}{E[H]} = \frac{p_1 \frac{C_1}{H_1} + p_2 \frac{C_2}{H_2} \frac{H_2}{H_1}}{p_1 + p_2 \frac{H_2}{H_1}}$$

What should the value of $\frac{H_2}{H_1}$ be? Well in this case I think it's reasonable to suppose $H_1$ and $H_2$ are in fact equal, as we don't have any principled reason not to, so this still comes out to ~0.001. As in the original version we can flip this around to see if we get a wildly different answer if we make the inter-theoretic comparison be between chickens:

$$\frac{E[H]}{E[C]} = \frac{p_1 \frac{H_1}{C_1} + p_2 \frac{H_2}{C_2} \frac{C_2}{C_1}}{p_1 + p_2 \frac{C_2}{C_1}}$$

Now what should $\frac{C_2}{C_1}$ be, recalling that theory 1 says chickens are worth very little compared to humans? I think it's reasonable to say that $C_1$ is also very little compared to $C_2$, since the point of theory 1 is basically to suppose chickens aren't (or are barely) sentient, and not to say anything about humans. Supposing that none of the difference is explained by humans, we get $\frac{C_2}{C_1} = \frac{C_2 / H_2}{C_1 / H_1}$, and this also gives $\frac{H_2}{H_1} = 1$, so $\frac{E[H]}{E[C]}$ comes out to ~1000. This is the inverse of $\frac{E[C]}{E[H]}$, as we expect.

Clearly this is just rearranging the same numbers to get the same result, but hopefully it illustrates how explicitly including these $\frac{H_2}{H_1}$ ratios makes the two envelope problem that you get by naively inverting the ratios less spooky, because by doing so you are effectively wildly changing the estimates of $\frac{H_2}{H_1}$.

I agree with you that there are many cases where for the specific theories under consideration it is right to assume that $H_1$ and $H_2$ are equal (because we have no principled reason not to), but that this is not because we are able to observe welfare directly (even if we suppose that this is possible). And for many pairs of theories we might think $H_1$ and $H_2$ are very different.
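To illustrate the bookkeeping, here is a minimal Python sketch. It assumes a notation in which theory $i$ has credence $p_i$, a within-theory chicken-to-human ratio $C_i/H_i$, and an across-theory ratio $H_i/H_1$ relative to a reference theory. The function name and all numbers are hypothetical and chosen only for illustration. The sketch checks that taking a ratio of expectations gives answers that are exact inverses whichever species anchors the comparison, while naively averaging ratios does not.

```python
from fractions import Fraction as F

def ratio_of_expectations(p, within, across):
    """sum_i p_i * within_i * across_i  /  sum_i p_i * across_i."""
    numerator = sum(pi * wi * ai for pi, wi, ai in zip(p, within, across))
    denominator = sum(pi * ai for pi, ai in zip(p, across))
    return numerator / denominator

p = [F(1, 2), F(1, 2)]              # hypothetical credences in theories 1 and 2
c_over_h = [F(1, 1000), F(1, 2)]    # hypothetical C_i / H_i within each theory
h_rel = [F(1), F(1)]                # assumed H_i / H_1: human welfare equal across theories

# Implied chicken ratios across theories: C_i / C_1 = (C_i/H_i) * (H_i/H_1) / (C_1/H_1).
c_rel = [c * h / c_over_h[0] for c, h in zip(c_over_h, h_rel)]

chicken_per_human = ratio_of_expectations(p, c_over_h, h_rel)                   # E[C] / E[H]
human_per_chicken = ratio_of_expectations(p, [1 / c for c in c_over_h], c_rel)  # E[H] / E[C]

# Naive version: average the within-theory ratios directly, in each direction.
naive_c_over_h = sum(pi * c for pi, c in zip(p, c_over_h))
naive_h_over_c = sum(pi / c for pi, c in zip(p, c_over_h))

print(chicken_per_human * human_per_chicken)  # exactly 1
print(naive_c_over_h * naive_h_over_c)        # far from 1
```

Changing `h_rel` (for example, making human welfare much larger under theory 2) changes both answers, but they remain inverses of each other, because `c_rel` is derived from the same assumptions rather than set independently.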

(Apologies for switching back and forth between "welfare" and "utility", I'm basically treating them both like "utility")

  1. ^

    I think it's right to start with this case, because it should be the easiest. So if something breaks in this case it is likely to also break once we start trying to include things like non-welfare moral reasons

  2. ^

    "I've met a few of those"

  3. ^

    We can label the "true" theory as A, because we only get the chance to experience the true theory (we just don't know which one it is)

  4. ^

    You could make this actually zero, but I think adding infinity in makes the argument more confusing

There's a lot here, so I'll respond to what seems to be most cruxy to me.

Another way to get this intuition is to imagine an unfeeling robot that derives the concept of utility from some combination of interviewing moral patients and constructing a first principles theory[2]. It could even get the correct theory, and derive that e.g. breaking your arm is 10 times as bad as stubbing your toe. It would still be in the dark about how bad these things are in absolute terms though.

I agree with this, but I don't think this is our epistemic position, because we can understand all value relative to our own experiences. (See also a thread about an unfeeling moral agent here.)

My claim is that the epistemic position of all the different theories of welfare is effectively that of this robot. And as a result of this, observing any absolute amount of welfare (utility) under theory A shouldn't update you as to what the amount would be under theory B, because both theories were consistent with any absolute amount of welfare to begin with. In fact they were "maximally uncertain" about the absolute amount; no amount should be any more or less of a surprise under either theory.

I agree that directly observing the value of a toe stub, say, under hedonism might not tell you much or anything about its absolute value under non-hedonistic theories of welfare.[1]

However, I think we can say more under variants of closer precise theories. I think you can fix the badness of a specific toe stub across many precise theories. But then also separately fix the badness of a papercut and many other things under the same theories. This is because some theories are meant to explain the same things, and it's those things to which we're assigning value, not directly to the theories themselves. See this section of my post. And those things in practice are human welfare (or yours specifically), and so we can just take the (accessed) human-relative stances.

You illustrate with neuron count theories, and I would in fact say we should fix human welfare across those theories (under hedonism, say, and perhaps separately for different reference point welfare states), so evidence about absolute value under one hedonistic neuron count theory would be evidence about absolute value under other hedonistic theories.

I suspect conscious subsystems don't necessarily generate a two envelopes problem; you just need to calculate the expected number of subsystems and their expected aggregate welfare relative to accessed human welfare. But it might depend on which versions of conscious subsystems we're considering.

For predictions of chicken sentience, I'd say to take expectations relative to human welfare (separately with different reference point welfare states).

  1. ^

    I'd add a caveat that evidence about relative value under one theory can be evidence under another. If you find out that a toe stub is less bad than expected relative to other things under hedonism, then the same evidence would typically support that it's less bad for desires and belief-like preferences than you expected relative to the same other things, too.

I'm still trying to work through the maths on this, so I won't respond in much detail until I've got further with that; I may end up writing a separate post. I did start off at your position, so there's some chance I will end up there. I find this very confusing to think about.

Some brief comments on a couple of things:

I agree with this, but I don't think this is our epistemic position, because we can understand all value relative to our own experiences.

I think relative is the operative word here. That is, you experience that a toe stub is 10 times worse than a papercut, and this motivates the development of moral theories that are consistent with this, and rules out ones that are not (e.g. ones that say they are equally bad). But there is an additional bit of parameter fixing that has to happen to get from the theory predicting this relative difference to predicting the absolute amount. 

My claim is that at least generally speaking, and I think actually always, theories that are under consideration only predict these relative differences and not the absolute amounts. E.g. if a theory supposes that a certain pain receptor causes suffering when activated, then it might suppose that 10 receptors being activated causes 10 times as much suffering, but it doesn't say anything about the absolute amount. This is also true of more fundamental theories (e.g. more information processing => more sentience). I have some ideas about why this is[1], but mainly I can't think of any examples where this is not the case. If you can think of any then please tell me as that would at least partially invalidate this scale invariance thing (which would be good).

I think you would also say that theories don't need to predict this overall scale parameter because we can always fix it based on our observations of absolute utility... this is the bit of maths that I'm not clear on yet, but I do currently think this is not true (i.e. the scale parameter does matter still, especially when you have a prior reason to think there would be a difference between the theories).

I agree that directly observing the value of a toe stub, say, under hedonism might not tell you much or anything about its absolute value under non-hedonistic theories of welfare.... However, I think we can say more under variants of closer precise theories.

I was intending to restrict to only theories that fall under hedonism, because I think this is the case where this kind of cross theory aggregation should work the best. And given that I think this scale invariance problem arises there then it would be even worse when considering more dissimilar theories.

So I was considering only theories where the welfare relevant states are things that feel pretty close to pleasure and pain, and you can be uncertain about how good or bad different states are for common sense reasons[2], but you're able to tell at least roughly how good/bad at least some states are.


  1. ^

    Mentioned in the previous comment. One is that the prescriptions of utilitarianism have this scale invariance (only distinguish between better/worse), as do the behaviours associated with pleasure/pain (e.g. you can only communicate that something is more/less painful, or [for animals] show an aversion to a more painful thing in favour of a less painful thing).

  2. ^

    E.g. you might not remember them, you might struggle to factor in duration, the states might come along with some non-welfare-relevant experience which biases your recollection (e.g. a painfully bright red light vs a painfully bright green light)

My claim is that at least generally speaking, and I think actually always, theories that are under consideration only predict these relative differences and not the absolute amounts.


I have some ideas about why this is[1], but mainly I can't think of any examples where this is not the case. If you can think of any then please tell me as that would at least partially invalidate this scale invariance thing (which would be good).

I think what matters here is less whether they predict absolute amounts, but which ones can be put on common scales. If everything could be put on the same common scale, then we would predict values relative to that common scale, and could treat the common scale like an absolute one. But scale invariance would still depend on you using that scale in a scale-invariant way with your moral theory.

I do doubt all theories can be put on one common scale together this way, but I suspect we can find common scales across some subsets of theories at a time. I think there usually is no foundational common scale between any pair of theories, but I'm open to the possibility in some cases, e.g. across approaches for counting conscious subsystems, causal vs evidential decision theory (MacAskill et al., 2019), in some pairs of person-affecting vs total utilitarian views (Riedener, 2019, also discussed in my section here). This is because the theories seem to recognize the same central and foundational reasons, but just find that they apply differently or in different numbers. You can still value those reasons identically across theories. So, it seems like they're using the same scale (all else equal), but differently.

I'm not sure, though. And maybe there are multiple plausible common scales for a given set of theories, but this could mean a two envelopes problem between those common scales, not between the specific theories themselves.

And I agree that there probably isn't a shared foundational common scale across all theories of consciousness, welfare and moral weights (as I discuss here).

I think you would also say that theories don't need to predict this overall scale parameter because we can always fix it based on our observations of absolute utility

Ya, that's roughly my position, and more precisely that we can construct common scales based on our first-person observations of utility, although with the caveat that in fact these observations don't uniquely determine the scale, so we still end up with multiple first-person observation-based common scales.


this is the bit of maths that I'm not clear on yet, but I do currently think this is not true (i.e. the scale parameter does matter still, especially when you have a prior reason to think there would be a difference between the theories).

Do you think we generally have the same problem for other phenomena, like how much water there is across theories of the nature of water or the strength of gravity as we moved from the Newtonian picture to general relativity? So, we shouldn't treat theories of water as using a common scale, or theories of gravity as using a common scale? Again, maybe you end up with multiple common scales for water, and multiple for gravity, but the point is that we still can make some intertheoretic comparisons, even if vague/underdetermined, based on the observations the theories are meant to explain, rather than say nothing about how they relate.

In these cases, including consciousness, water and gravity, it seems like we first care about the observations, and then we theorize about them, or else we wouldn't bother theorizing about them at all. So we do some (fairly) theory-neutral valuing.

Since the heart of your case is "well we know what human experience is like so we can treat that as a fixed point", I'm just going to point out various ways in which we don't necessarily know what human experience is like, and some of the implications if we more narrowly try to anchor on what we know and otherwise adopt what I take to be your stance on the two-envelope problem:

  • We each only experience our own consciousness
    • It seems decently likely that humans vary in some dimension like degree- or intensity-of-consciousness
      • Generically, we won't know if we're above- or below-average on this
      • So in expectation, others' experiences all matter more than our own
        • But in aggregate, a society of fully altruistic people would make errors if they each act on the assumption that their own experience matters less in expectation than other people's
  • In the moment writing this, I don't know what intense pain or intense pleasure feel like
    • I can only base my judgement of these things on memory
      • But memory, as we know in many contexts, could be faulty
    • Because there is more at stake in worlds where my memory is minimizing rather than exaggerating my past experiences, I should act on the assumption that my memory is systematically skewed in this way
  • It's not unusual for people to lie to themselves about their own experiences
    • e.g. telling themselves things are fine while at some level experiencing significant psychological suffering
    • So we should assume that our top-level consciousness doesn't always have full access to our morally relevant experience even in the moment of experiencing it
    • Our uncertainty should presumably include some worlds where a large majority of our morally relevant experience is opaque to us; so in expectation the moral weight we assign ourselves should be rather higher than the one which is experienced and hence "known"
  • We're unable to tell how many times our experience is being instantiated
    • On accounts where that's morally relevant, this could have a big impact on the expectation of the moral worth of our experiences

To be clear, I don't endorse the conclusions here — but in each case my instinct is that I'm getting off the train by saying "seems like there's some two-envelope type phenomenon going on here, so I'm not happy straightforwardly taking expectations".
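The second-level bullet chain ("in expectation, others' experiences all matter more than our own") has the same shape as the classic two-envelope calculation. Here is a minimal sketch, assuming a purely hypothetical case where another person's intensity of consciousness is either double or half of mine with equal probability. The numbers are illustrative only and do not come from the thread.

```python
from fractions import Fraction

# Hypothetical toy case (an assumption for illustration): the other person's
# intensity of consciousness is 2x or 0.5x mine, each with probability 1/2.
ratios_other_over_own = [Fraction(2), Fraction(1, 2)]
probs = [Fraction(1, 2), Fraction(1, 2)]


def expectation(values, weights):
    """Probability-weighted average of the given values."""
    return sum(v * p for v, p in zip(values, weights))


# Normalizing by my own experience: how much does the other person matter?
e_other_over_own = expectation(ratios_other_over_own, probs)

# Normalizing by the other person's experience: how much do I matter?
e_own_over_other = expectation([1 / r for r in ratios_other_over_own], probs)

print(e_other_over_own)  # 5/4
print(e_own_over_other)  # 5/4
```

Both expectations exceed 1, so each person "matters more in expectation" from the other's point of view. That is the sign that the choice of normalizing unit is doing the work.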

I basically agree with all of this, and make some similar points in my sections Multiple possible reference points and Conscious subsystems. I think there are still two envelopes problems between the things we actually access, and we don't have a nice way of uniquely fixing comparisons. But I think it's defensible to do everything human-relative, or relative to your own experiences (which are human, so this is still human-relative), i.e. relative to what's accessed. You'll need to use multiple reference points.

Thanks for the exploration of this.

I'm concerned that this approach is structurally very vulnerable to fanaticism / muggings. This matters for insect experience, and for possible moral relevance of single-cell organisms (ok, before getting to this case we'd likely want to revisit your section on subsystems of the brain and consider the possibility of individual neurons having morally relevant experience that our consciousness doesn't get proper access to). It could matter especially for how much we chase after the possibility of artificial minds with far far greater capacity for morally relevant experience than humans.

I guess I see this as the central issue with normalizing this way, and was sort of hoping you'd say more about it. It gets discussed a little when you talk about the possibility of overlapping conscious subsystems of the brain, but I'm unclear what your stance is towards it in general, or what you would say to someone who objected to this approach because it seemed to give a fanatical weight to chickens in the human/chicken comparison? (perhaps having somewhat different probabilities than you on the likelihood of different levels of chicken moral relevance)

I agree that this approach, if you're something like a (risk neutral) expectational utilitarian, is very vulnerable to fanaticism / muggings, but that to me is a problem for expectational utilitarianism. To you and "to someone who objected to this approach because it seemed to give a fanatical weight to chickens in the human/chicken comparison", I'd say to put more weight on normative stances that are less fanatical than expectational utilitarianism.

I personally reserve substantial skepticism of expected value maximization in general (both within moral stances and for handling moral uncertainty between them), expected value maximization with unbounded value specifically, aggregation in general and aggregation by summation. I'd probably end up with "worldview buckets" based on different attitudes towards risk/uncertainty, aggregation and grounds for moral value (types of welfare, non-welfarist values, as in the problem of multiple (human) reference points). RP's CURVE sequence goes over attitudes to risk and their implications for intervention and cause prioritization. Then, I doubt these stances would be intertheoretically comparable. For uncertainty between them, I'd use an approach to moral uncertainty that didn't depend on intertheoretic comparisons, like a moral parliament, a bargain-theoretic approach, variance voting or just sizing worldview buckets proportionally to credences.

In practice, within a neartermist focus (and ignoring artificial consciousness), this could conceivably roughly end up looking like a set of resource buckets: a human-centric bucket, a bucket for mammals and birds, a bucket for all vertebrates, a bucket for all vertebrates + sufficiently sophisticated invertebrates, a bucket for all animals, and a ~panpsychist bucket.[1] However, the boundaries between these buckets would be soft (and softer), because the actual buckets don't specifically track a human-centric view, a vertebrate view, etc. My approach would also inform how to size the buckets and limit risky interventions within them.

For example, fix some normative stance, and suppose within it:

  1. you thought a typical chicken had a 1% chance of having roughly the same moral weight (per year) as a typical human (according to specific moral grounds), and didn't matter at all otherwise.
  2. you aggregate via summation.
  3. you thought helping chickens (much) at all would be too fanatical.

Then that view would also recommend against human-helping interventions with at most a 1% probability of success.[2] Or, you could include some chicken interventions alongside many more risky human-helping interventions that are roughly statistically independent, because many independent risky (positive expected value) bets together don't look as risky. Still, this stance shouldn't bet everything on an intervention helping humans with only a 1% chance of success, because otherwise it could just bet everything on chickens with a similar payoff distribution. This stance would limit risky bets. Every stance could limit risky bets, but the ones that end up human-centric in practice would tend to do so more than others.
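To make concrete the claim that many independent risky bets together don't look as risky, here is a rough sketch. It assumes each bet succeeds independently with the 1% probability from the example and that all bets have equal payoffs; the function names are hypothetical.

```python
import math

P_SUCCESS = 0.01  # per-bet success probability, taken from the 1% example


def prob_at_least_one_success(p, k):
    """Chance that at least one of k independent bets pays off."""
    return 1.0 - (1.0 - p) ** k


def relative_spread(p, k):
    """Std dev divided by mean of the number of successes among k bets.

    Assumes independent, identical bets (a binomial model).
    """
    return math.sqrt(k * p * (1.0 - p)) / (k * p)


for k in (1, 10, 100, 1000):
    print(
        k,
        round(prob_at_least_one_success(P_SUCCESS, k), 3),
        round(relative_spread(P_SUCCESS, k), 2),
    )
```

As the number of bets grows, the chance that the whole portfolio achieves nothing shrinks, and the outcome becomes more predictable relative to its mean. A single all-or-nothing 1% bet gets no such benefit.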

  1. ^

    Or, maybe some of the later buckets are just replaced with longtermist buckets, if and because longtermist bets could have similar probabilities of making a difference, but better payoffs when they succeed.

  2. ^

    Depending on the nature of your attitudes to risk. This could follow from difference-making risk aversion or probability difference discounting of some kind. On the other hand, if you maximized the expected utility of the arctan of total welfare, a bounded function, then you'd prioritize marginal local improvements to worlds with small populations and switching between big and small populations, while ignoring marginal local improvements to worlds with large populations. This could also mean ignoring chickens but not marginal local improvements for humans, because if chickens don't count and we go extinct soon (or future people don't count), then the population is much smaller.
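The arctan point in the footnote above can be checked numerically. This is only a sketch: the welfare units and the two world sizes are arbitrary assumptions chosen for illustration.

```python
import math


def marginal_gain(total_welfare, improvement=1.0):
    """Increase in arctan(total welfare) from adding a fixed improvement."""
    return math.atan(total_welfare + improvement) - math.atan(total_welfare)


# Hypothetical totals in arbitrary welfare units (assumptions, not estimates).
small_world = 1.0
large_world = 1_000_000.0

print(marginal_gain(small_world))  # sizeable gain
print(marginal_gain(large_world))  # vanishingly small gain
```

The same unit of improvement is worth far less, in arctan terms, once total welfare is already large. This is why such a view mostly ignores marginal local improvements in large-population worlds.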

Is the two-envelope problem, as you understand it, a problem for anything except expectational utilitarianism?

I'm asking because it feels to me like you're saying roughly "yes yes although I proposed a solution to the two-envelope problem I agree it's very much still a problem, so you also need an entirely different type of solution to address it". I think this is a bit of a caricature of what you're saying, and I suspect that it's an unfair one, but I can't immediately see how it's unfair, so I'm asking this way to try to get quickly to the heart of what's going on.

Is the two-envelope problem, as you understand it, a problem for anything except expectational utilitarianism?

I think it is or would have been a problem for basically any normative stance (moral theory + attitudes towards risk, etc.) that is at all sensitive to risk/uncertainty and stakes roughly according to expected value.[1]

I think I've given a general solution here to the two envelopes problem for moral weights (between moral patients) when you fix your normative stance but have remaining empirical/descriptive uncertainty about the moral weights of beings conditional on that stance. It can be adapted to different normative stances, but I illustrated it with versions of expectational utilitarianism. (EDIT: And I'm arguing that a lot of the relevant uncertainty actually is just empirical, not normative, more than some have assumed.)

For two envelopes problems between normative stances, I'm usually skeptical of intertheoretic comparisons, so would mostly recommend approaches that don't depend on them.

  1. ^

    (Footnote added in an edit of this comment.)

    For example, I think there's no two envelopes problem for someone who maximizes the median value, because the reciprocal of the median is the median of the reciprocal.

    But I'd take it to be a problem for anyone who roughly maximizes an expected value or counts higher expected value in favour of an act, e.g. does so with constraints, or after discounting small probabilities. They don't have to be utilitarian or aggregate welfare at all, either.
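The median claim in the footnote above can be illustrated with a small check. The sample of ratios is hypothetical. It is deliberately odd-length and strictly positive, because with an even-length sample the usual "average the two middle values" median would not commute with reciprocals exactly.

```python
import statistics

# Hypothetical human/chicken moral weight ratios (assumed values,
# positive and odd in number so the median is a single sample point).
ratios = [0.5, 2.0, 3.0, 10.0, 400.0]
reciprocals = [1.0 / r for r in ratios]

median_ratio = statistics.median(ratios)
median_reciprocal = statistics.median(reciprocals)
mean_ratio = statistics.fmean(ratios)
mean_reciprocal = statistics.fmean(reciprocals)

# The median commutes with taking reciprocals ...
print(1.0 / median_ratio, median_reciprocal)
# ... but the mean does not, which is where the two envelopes problem bites.
print(1.0 / mean_ratio, mean_reciprocal)
```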

OK thanks. I'm going to attempt a summary of where I think things are:

  • In trying to assess moral weights, you can get two-envelope problems for both empirical uncertainty and normative uncertainty
  • Re. empirical uncertainty, you argue that there isn't a two-envelope problem, and you can just treat it like any other empirical uncertainty
    • In my other comment thread I argue that just like the classic money-based two-envelope problem, there's still a problem to be addressed, and it probably needs to involve priors
  • Re. normative uncertainty, you would tend to advise approaches which help to dodge facing two-envelope problems in the first place, alongside dodging facing a bunch of other issues
    • I'm sympathetic to this, although I don't think it's uncontroversial
  • You argue that a lot of the uncertainty should be understood to be empirical rather than normative — but you also think quite a bit of it is normative (insofar as you recommend people allocating resources into buckets associated with different worldviews)
    • I kind of get where you're coming from here, although I feel that the lines between what's empirical and what's normative uncertainty are often confusing, and so I kind of want action-guiding advice to be available for actors who haven't yet worked out how to disentangle them. (I'm also not certain that the "different buckets for different worldviews" is the best approach to normative uncertainty, although as a pragmatic matter I certainly don't hate it, and it has some theoretical appeal.)

Does that seem wrong anywhere to you?

This all seems right to me.

(I wouldn't pick out the worldview bucket approach as the solution everyone should necessarily find most satisfying, given their own intuitions/preferences, but it is one I tend to prefer now.)

Ok great. In that case one view I have is that it would be clearer to summarize your position (e.g. in the post title) as "there isn't a two envelope problem for moral weights", rather than as presenting a solution.

Hi Michael,

I would be curious to know your thoughts on the approach I outlined here:

Let me try to restate your point [the 2 envelopes problem], and suggest why one may disagree. If one puts weight w on the welfare range (WR) of humans relative to that of chickens being N, and 1 - w on it being n, the expected welfare range of:

  • Humans relative to that of chickens is E("WR of humans"/"WR of chickens") = w*N + (1 - w)*n.
  • Chickens relative to that of humans is E("WR of chickens"/"WR of humans") = w/N + (1 - w)/n.

You are arguing that N can plausibly be much larger than n. For the sake of illustration, we can say N = 389 (ratio between the 86 billion neurons of a human and 221 M of a chicken), n = 3.01 (reciprocal of RP's median welfare range of chickens relative to humans of 0.332), and w = 1/12 (since the neuron count model was one of the 12 RP considered, and all of them were weighted equally). Having the welfare range of:

  • Chickens as the reference, E("WR of humans"/"WR of chickens") = 35.2. So 1/E("WR of humans"/"WR of chickens") = 0.0284.
  • Humans as the reference (as RP did), E("WR of chickens"/"WR of humans") = 0.305.

So, as you said, determining welfare ranges relative to humans results in animals being weighted more heavily. However, I think the difference is much smaller than suggested above. Since N and n are quite different, I guess we should combine them using a weighted geometric mean, not the weighted mean as I did above. If so, both approaches output exactly the same result:

  • E("WR of humans"/"WR of chickens") = N^w*n^(1 - w) = 4.49. So 1/E("WR of humans"/"WR of chickens") = (N^w*n^(1 - w))^-1 = 0.223.
  • E("WR of chickens"/"WR of humans") = (1/N)^w*(1/n)^(1 - w) = 0.223.

The reciprocal of the expected value is not the expected value of the reciprocal, so using the mean leads to different results. However, I think we should be using the geometric mean, and the reciprocal of the geometric mean is the geometric mean of the reciprocal. So the 2 approaches (using humans or chickens as the reference) will output the same ratios regardless of N, n and w as long as we aggregate N and n with the geometric mean. If N and n are similar, it no longer makes sense to use the geometric mean, but then both approaches will output similar results anyway, so RP's approach looks fine to me as a 1st pass. Does this make any sense?
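The arithmetic in the quoted comment can be reproduced with a short script. This is a sketch using the rounded inputs stated there (N = 389, n = 3.01, w = 1/12); the variable names are mine. The geometric-mean figures it prints may differ slightly from those quoted, depending on how the inputs are rounded.

```python
# Inputs as stated in the quoted comment (rounded values).
N = 389.0      # human/chicken welfare range ratio under the neuron count model
n = 3.01       # human/chicken ratio implied by RP's median (1 / 0.332)
w = 1.0 / 12   # weight on the neuron count model

# Weighted arithmetic means, one per choice of reference species.
arith_human_over_chicken = w * N + (1 - w) * n
arith_chicken_over_human = w / N + (1 - w) / n

# Weighted geometric means, one per choice of reference species.
geo_human_over_chicken = N ** w * n ** (1 - w)
geo_chicken_over_human = (1 / N) ** w * (1 / n) ** (1 - w)

print(f"arithmetic, chickens as reference: {1 / arith_human_over_chicken:.4f}")
print(f"arithmetic, humans as reference:   {arith_chicken_over_human:.3f}")
print(f"geometric, chickens as reference:  {1 / geo_human_over_chicken:.3f}")
print(f"geometric, humans as reference:    {geo_chicken_over_human:.3f}")
```

The two arithmetic results differ by roughly an order of magnitude, while the two geometric results coincide up to floating-point error. That matches the point that the reciprocal of a geometric mean is the geometric mean of the reciprocals.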

I think a weighted geometric mean is unprincipled and won't reflect expected value maximization (if w is meant to be a probability). It's equivalent to weighing by the following, where $X$ is the ratio of moral weights (or maybe conditional on being positive):

$$\exp\left(E[\log X]\right)$$

The expectation is in the exponent, but taking expectations is supposed to be the last thing we do, after aggregation, if we're maximizing an expected value.

It's not clear if it would be a good approximation of more principled approaches, but it seems like a compromise between the human-relative and animal-relative approaches and should (always?) give intermediate moral weights.

It and both the unmodified human-relative and animal-relative solutions also hide the differences between types of uncertainty. For example, I think conscious subsystems should be treated separately like the number of moral patients.

Also, you shouldn't be taking the square root in the weighted geometric mean. You need the exponents to sum to 1, not 0.5.

EDIT: And you need to condition on both humans and the other animal having nonzero moral weight before taking the weighted geometric mean, or else you'll get 0, infinite or undefined weighted geometric means. If you take the expected value of the conditional weighted geomean, you would have something like

but then  (and probably at least one of the two should be infinite, anyway), so you have a two envelopes problem again.

Thanks for the reply!

I think a weighted geometric mean is unprincipled and won't reflect expected value maximization (if w is meant to be a probability).

I agree it is unprincipled, and I strongly endorse expected value maximisation in principle, but maybe using the geometric mean is still a good method in practice?

  • The mean ignores information from extremely low predictions, and overweights outliers.
  • The weighted/unweighted geometric mean performed better than the weighted/unweighted mean on Metaculus' questions.
  • Samotsvety aggregated predictions differing a lot between them from 7 forecasters[1] using the geometric mean after removing the lowest and highest values.

Also, you shouldn't be taking the square root in the weighted geometric mean. You need the exponents to sum to 1, not 0.5.

Thanks! Corrected.

you need to condition on both humans and the other animal having nonzero moral weight

I think the welfare range outputted by any given model should always be positive.

  1. ^

    For the question "What is the unconditional probability of London being hit with a nuclear weapon in October?", the 7 forecasts were 0.01, 0.00056, 0.001251, 10^-8, 0.000144, 0.0012, and 0.001. The largest of these is 1 M (= 0.01/10^-8) times the smallest.
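For a sense of how much the choice of aggregation method matters on the seven forecasts quoted in the footnote above, here is a rough sketch. The trimming rule (drop the single lowest and single highest value) is my reading of the description, not a confirmed reproduction of Samotsvety's procedure.

```python
import statistics

# The seven forecasts quoted in the footnote.
forecasts = [0.01, 0.00056, 0.001251, 1e-8, 0.000144, 0.0012, 0.001]

# Assumed trimming rule: drop the single lowest and single highest forecast.
trimmed = sorted(forecasts)[1:-1]

mean_all = statistics.fmean(forecasts)
geo_all = statistics.geometric_mean(forecasts)
geo_trimmed = statistics.geometric_mean(trimmed)

print(f"mean of all forecasts:            {mean_all:.2e}")
print(f"geometric mean of all forecasts:  {geo_all:.2e}")
print(f"geometric mean after trimming:    {geo_trimmed:.2e}")
print(f"largest / smallest forecast:      {max(forecasts) / min(forecasts):.0f}")
```

The mean is dominated by the largest forecast, while the untrimmed geometric mean is pulled down hard by the 10^-8 outlier. Trimming before taking the geometric mean removes both effects.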

I agree it is unprincipled, and I strongly endorse expected value maximisation in principle, but maybe using the geometric mean is still a good method in practice?

I would want to know more about what our actual targets should plausibly be before making any such claim. I'm not sure we can infer much from your examples. Maybe an analogy is that we're aggregating predictions of different perspectives, though?

I think the welfare range outputted by any given model should always be positive.

Other animals could fail to be conscious, and so have welfare ranges of 0.

I would want to know more about what our actual targets should plausibly be before making any such claim. I'm not sure we can infer much from your examples.

I agree it would be good to know which aggregation methods perform better under different conditions, and performance targets. The geometric mean is better than the mean, in the sense of achieving a lower Brier and log score, for all Metaculus' questions. However, it might be this would not hold for a set of questions whose predictions are distributed more like the welfare ranges of the 12 models considered by Rethink Priorities. I would even be open to using different aggregation methods depending on the species, since the distribution of the 12 mean welfare ranges of each model varies across species.

Maybe an analogy is that we're aggregating predictions of different perspectives, though?

If the forecasts come from "all-considered views of experts", which I think is what you are calling "different perspectives", Jaime Sevilla suggests using the geometric mean of odds if poorly calibrated outliers can be removed, or the median otherwise. For the case of welfare ranges, I do not think one can say there are poorly calibrated outliers. So, if one interpreted each of the 12 models as one forecaster[1], I guess Jaime would suggest determining the cumulative distribution function (CDF) of the welfare range from the geometric mean of the odds of the CDFs of the welfare ranges of the 12 models, as Epoch did for judgment-based AI timelines. I think using the geometric mean is also fine, as it performed marginally better than the geometric mean of odds in Metaculus' questions.

Jaime agrees with using the mean if the forecasts come from "models with mutually exclusive assumptions":

If you are not aggregating all-considered views of experts, but rather aggregating models with mutually exclusive assumptions, use the mean of probabilities.


  • Models can have more or less mutually exclusive assumptions. The less they do, the more it makes sense to rely on the median, geometric mean, or geometric mean of odds instead of the mean.
  • There is not a strong distinction between all-considered views and the outputs of quantitative models, as the judgements of people are models themselves. Moreover, one should presumably prefer the all-considered views of the modellers over the models, as the former account for more information. 
    • Somewhat relatedly, Rethink recommends using the median (not mean) welfare ranges.

Other animals could fail to be conscious, and so have welfare ranges of 0.

Sorry for not being clear. I agree with the above if lack of consciousness is defined as having a null welfare range. However:

  • In practice, consciousness has to be operationalised as satisfying certain properties to a desired extent.
  • I do not think one can say that, conditional on such properties not being satisfied to the desired extent, the welfare range is 0.

So I would say one should put no probability mass on a null welfare range, and that the CDF of the welfare range should be continuous[2]. In general, I assume zeros and infinities do not exist in the real world, even though they are useful in maths and physics to think about limiting processes.

  1. ^

    This sounds like a moral parliament in some way?

    Side note. I sometimes link to concepts I know you are aware of, but readers may not be.

  2. ^

    In addition, I think the CDF of the welfare range should be smooth such that the probability density function (PDF) of the welfare range is continuous.

When ice seemed like it could have turned out to be something other than the solid phase of water, we would be comparing the options based on the common facts — the evidence or data — the different possibilities were supposed to explain. And then by finding out that ice is water, you learn that there is much more water in the world, because you would then also have to count all the ice on top of all the liquid water.[13] If your moral theory took water to be intrinsically good and more of it to be better, this would be good news (all else equal).

Suppose we measure amounts by mass. The gram was in fact originally defined as the mass of one cubic centimetre of pure water at 0 °C.[1] We could imagine having defined the gram as the mass of one cubic centimetre of liquid water, but using water that isn't necessarily pure, not fixing the temperature or using a not fully fixed measure for the centimetre. This introduces uncertainty about the measure of mass itself, and we'd later revise the definition as we understood more, but we could still use it in the meantime. We'd also aim to roughly match the original definition: the revised mass of one cubic centimetre of water shouldn't be too different from 1 gram under the new definition.

This is similar to what I say we'd do with consciousness: we define it first relative to human first-person experiences and measure relative to them, but revise the concept and measure with further understanding. We should also aim to make conservative revisions and roughly preserve the value in our references, human first-person experiences.

  1. ^

    In French:

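    Gramme, le poids absolu d'un volume d'eau pure égal au cube de la centième partie du mètre, et à la température de la glace fondante.

    (In English: "Gramme, the absolute weight of a volume of pure water equal to the cube of the hundredth part of the metre, and at the temperature of melting ice.")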

