Many thanks to Jo Anderson, Tobias Baumann, Jesse Clifton, Max Daniel, Michael Dickens, Persis Eskander, Daniel Filan, Kieran Greig, Zach Groff, Amy Halpern-Laff, Jamie Harris, Josh Jacobson, Gregory Lewis, Caspar Oesterheld, Carl Shulman, Gina Stuessy, Brian Tomasik, Johannes Treutlein, Magnus Vinding, Ben West, and Kelly Witwicki for helpful feedback. I also forwarded Ben Todd and Rob Wiblin a small section of the draft that discusses an 80,000 Hours article.
Abstract
When people in the effective altruism (EA) community have worked to affect the far future, they’ve typically focused on reducing extinction risk, especially risks from superintelligence or artificial general intelligence, addressed through work on AI alignment (AIA). I agree with the arguments that the far future is extremely important in our EA decisions, but I tentatively favor improving the quality of the far future by expanding humanity’s moral circle over increasing the likelihood that the far future exists (i.e. humanity’s continued existence) by reducing extinction risk through AIA, because: (1) the far future seems not to be very good in expectation, and there’s a significant likelihood of it being very bad, and (2) moral circle expansion seems highly neglected both in EA and in society at large. I also think considerations of bias are very important here, given that differences in opinion on far future cause prioritization come down largely to intuitive, subjective judgment calls. The counterargument to my position I find most compelling is the argument in favor of AIA that technical research might be more tractable than social change.
Context
This post largely aggregates existing content on the topic, rather than making original arguments. I offer my views, mostly intuitions, on the various arguments, but of course I remain highly uncertain given the limited amount of empirical evidence we have on far future cause prioritization.
Many in the effective altruism (EA) community think the far future is a very important consideration when working to do the most good. The basic argument is that humanity could continue to exist for a very long time and could expand its civilization to the stars, creating a very large amount of moral value. The main narrative has been that this civilization could be a very good one, and that in the coming decades, we face sizable risks of extinction that could prevent us from obtaining this “cosmic endowment.” The argument goes that these risks also seem like they can be reduced with a fairly small amount of additional resources (e.g. time, money), and therefore extinction risk reduction is one of the most important projects of humanity and the EA community.
(This argument also depends on a moral view that bringing about the existence of sentient beings can be a morally good and important action, comparable to helping sentient beings who currently exist live better lives. This is a contentious view in academic philosophy. See, for example, “'Making People Happy, Not Making Happy People': A Defense of the Asymmetry Intuition in Population Ethics.”)
However, one can accept the first part of this argument — that there is a very large amount of expected moral value in the far future and it’s relatively easy to make a difference in that value — without deciding that extinction risk is the most important project. In slightly different terms, one can decide not to work on reducing population risks, risks that could reduce the number of morally relevant individuals in the far future (of course, these are only risks of harm if one believes that more individuals existing is a good thing), and instead work on reducing quality risks, risks that could reduce the quality of morally relevant individuals’ existence. One specific type of quality risk often discussed is a risk of astronomical suffering (s-risk), defined as “events that would bring about suffering on an astronomical scale, vastly exceeding all suffering that has existed on Earth so far.”
This blog post makes the case for focusing on quality risks over population risks. More specifically, though also more tentatively, it makes the case for focusing on reducing quality risk through moral circle expansion (MCE), the strategy of impacting the far future through increasing humanity’s concern for sentient beings who currently receive little consideration (i.e. widening our moral circle so it includes them), over AI alignment (AIA), the strategy of impacting the far future through increasing the likelihood that humanity creates an artificial general intelligence (AGI) that behaves as its designers want it to (known as the alignment problem).[1][2]
The basic case for MCE is very similar to the case for AIA. Humanity could continue to exist for a very long time and could expand its civilization to the stars, creating a very large number of sentient beings. The sort of civilization we create, however, seems highly dependent on our moral values and moral behavior. In particular, it’s uncertain whether many of those sentient beings will receive the moral consideration they deserve based on their sentience, i.e. whether they will be in our “moral circle” or not, like the many sentient beings who have suffered intensely over the course of human history (e.g. from torture, genocide, oppression, war). It seems the moral circle can be expanded with a fairly small amount of additional resources (e.g. time, money), and therefore MCE is one of the most important projects of humanity and the EA community.
Note that MCE is a specific kind of values spreading, the parent category covering any effort to shift the values and moral behavior of humanity and its descendants (e.g. intelligent machines) in a positive direction to benefit the far future. (Of course, some people attempt to spread values in order to benefit the near future, but in this post we’re only considering far future impact.)
I’m specifically comparing MCE and AIA because AIA is probably the most favored method of reducing extinction risk in the EA community. AIA seems to be the default cause area to favor if one wants to have an impact on the far future, and I’ve been asked several times why I favor MCE instead.
This discussion risks conflating AIA with reducing extinction risk. These are two separate ideas, since an unaligned AGI could still lead to a large number of sentient beings, and an aligned AGI could still potentially cause extinction or population stagnation (e.g. if according to the designers’ values, even the best civilization the AGI could help build is still worse than nonexistence). However, most EAs focused on AIA seem to believe that the main risk is something quite like extinction, such as the textbook example of an AI that seeks to maximize the number of paperclips in the universe. I’ll note when the distinction between AIA and reducing extinction risk is relevant. Similarly, there are sometimes important prioritization differences between MCE and other types of values spreading, and those will be noted when they matter. (This paragraph is an important qualification for the whole post. The possibility of an unaligned AGI that still involves a civilization (and, less so because it seems quite unlikely, the possibility of an aligned AGI that causes extinction) is important to consider for far future cause prioritization. Unfortunately, elaborating on this would make this post far more complicated and far less readable, and would not change many of the conclusions. Perhaps I’ll be able to make a second post that adds this discussion at some point.)
It’s also important to note that I’m discussing specifically AIA here, not all AI safety work in general. AI safety, which just means increasing the likelihood of beneficial AI outcomes, could be interpreted as including MCE, since MCE plausibly makes it more likely that an AI would be built with good values. However, MCE doesn’t seem like a very plausible route to increasing the likelihood that AI is simply aligned with the intentions of its designers, so I think MCE and AIA are fairly distinct cause areas.
AI safety can also include work on reducing s-risks, such as specifically reducing the likelihood of an unaligned AI that causes astronomical suffering, rather than reducing the likelihood of all unaligned AI. I think this is an interesting cause area, though I am unsure about its tractability and am not considering it in the scope of this blog post.
The post’s publication was supported by Greg Lewis, who was interested in this topic and donated $1,000, conditional on this post being published to the Effective Altruism Forum, to Sentience Institute, the think tank I co-founded that researches effective strategies to expand humanity’s moral circle. Lewis doesn’t necessarily agree with any of its content. He decided on the conditional donation before the post was written; I did ask him to review the post prior to publication, and it was edited based on his feedback.
The expected value of the far future
Whether we prioritize reducing extinction risk partly depends on how good or bad we expect human civilization to be in the far future, given it continues to exist. In my opinion, the assumption that it will be very good is a tragically unexamined assumption in the EA community.
What if it’s close to zero?
If we think the far future is very good, that clearly makes reducing extinction risk more promising. And if we think the far future is very bad, that makes reducing extinction risk not just unpromising, but actively very harmful. But what if it’s near the middle, i.e. close to zero?[3] Addressing the possibility that one might conclude reducing extinction risk is not an EA priority on the basis of the expected moral value of the far future, 80,000 Hours wrote:
...even if you’re not sure how good the future will be, or suspect it will be bad, you may want civilisation to survive and keep its options open. People in the future will have much more time to study whether it’s desirable for civilisation to expand, stay the same size, or shrink. If you think there’s a good chance we will be able to act on those moral concerns, that’s a good reason to leave any final decisions to the wisdom of future generations. Overall, we’re highly uncertain about these big-picture questions, but that generally makes us more concerned to avoid making any irreversible commitments...
This reasoning seems mistaken to me because wanting “civilisation to survive and keep its options open” depends on optimism that civilization will do research, make good[4] decisions based on that research, and be capable of implementing those decisions.[5] While preventing extinction keeps options open for good things to happen, it also keeps options open for bad things to happen, and desiring this option value depends on an optimism that the good things are more likely. In other words, the reasoning assumes the optimism (thinking the far future is good, or at least that humans will make good decisions and be able to implement them[6]), which is also its conclusion.
Having that optimism makes sense in many decisions, which is why keeping options open is often a good heuristic. In EA, for example, people tend to do good things with their careers, which means career option value is a useful thing. This doesn’t readily translate to decisions where it’s not clear whether the actors involved will have a positive or negative impact. (Note 80,000 Hours isn’t making this comparison. I’m just making it to explain my own view here.)
There’s also a sense in which preventing extinction risk decreases option value because if humanity progresses past certain civilizational milestones that make extinction more unlikely — say, the rise of AGI or expansion beyond our own solar system — it might become harder or even impossible to press the “off switch” (ending civilization). However, I think most would agree that there’s more overall option value in a civilization that has gotten past these milestones because there’s a much wider variety of non-extinct civilizations than extinct civilizations.[7]
If you think that the expected moral value of the far future is close to zero, even if you think it’s slightly positive, then reducing extinction risk is a less promising EA strategy than if you think it’s very positive.
Key considerations
I think the considerations on this topic are best represented as questions where people’s beliefs (mostly just intuitions) vary along a long spectrum. I’ll list these roughly in order of how strongly I’d guess I disagree with people who believe the far future is highly positive in expected value (shortened as HPEV-EAs), and I’ll note where I don’t think I would disagree or might even have a more positive-leaning belief than the average such person.
- I think there’s a significant[8] chance that the moral circle will fail to expand to reach all sentient beings, such as artificial/small/weird minds (e.g. a sophisticated computer program used to mine asteroids, but one that lacks the features we normally use to recognize sentience, such as facial expressions). In other words, I think there’s a significant chance that powerful beings in the far future will have low willingness to pay for the welfare of many of the small/weird minds in the future.[9]
- I think it’s likely that the powerful beings in the far future (analogous to humans as the powerful beings on Earth in 2018) will use large numbers of less powerful sentient beings, such as for recreation (e.g. safaris, war games), a labor force (e.g. colonists to distant parts of the galaxy, construction workers), scientific experiments, threats (e.g. threatening to create and torture beings that a rival cares about), revenge, justice, religion, or even pure sadism.[10] I believe this because there have been less powerful sentient beings for all of humanity’s existence and well before (e.g. predation), many of whom are exploited and harmed by humans and other animals, and there seems to be little reason to think such power dynamics won’t continue to exist.
- Alternative uses of resources include simply working to increase one’s own happiness directly (e.g. changing one’s neurophysiology to be extremely happy all the time) and constructing large non-sentient projects like a work of art, though each of these types of project could still involve sentient beings, such as for experimentation or a labor force.
- Even in uses other than threats and sadism (where suffering is largely the point), the less powerful minds could suffer intensely, because their intense suffering could be instrumentally useful. For example, if the recreation is nostalgic, or human psychology persists in some form, we could see powerful beings causing intense suffering in order to see good triumph over evil or in order to satisfy curiosity about situations that involve intense suffering (of course, the powerful beings might not acknowledge the suffering as suffering, instead conceiving of it as simulated but not actually experienced by the simulated entities). For another example, with a sentient labor force, punishment could be a stronger motivator than reward, as indicated by the history of evolution on Earth.[11][12]
- I place significant moral value on artificial/small/weird minds.
- I think it’s quite unlikely that human descendants will find the correct morality (in the sense of moral realism, i.e. finding mind-independent moral facts), and I don’t think I would care much about that correct morality even if it existed. For example, I don’t think I would be compelled to create suffering if the correct morality said this is what I should do. Of course, such moral facts are very difficult to imagine, so I’m quite uncertain about what my reaction to them would be.[13]
- I’m skeptical about the view that technology and efficiency will remove the need for powerless, high-suffering, instrumental moral patients. An example of this predicted trend is that factory-farmed animals seem unlikely to be necessary in the far future because of their inefficiency at producing animal products. Therefore, I’m not particularly concerned about the factory farming of biological animals continuing into the far future. I am, however, concerned about similar but more efficient systems.
- An example of how technology might not render sentient labor forces and other instrumental sentient beings obsolete is that humans seem motivated to have power and control over the world, and in particular seem more satisfied by having power over other sentient beings than by having power over non-sentient things like barren landscapes.
- I do still believe there’s a strong tendency towards efficiency and that this has the potential to render much suffering obsolete; I just have more skepticism about it than I think is often assumed by HPEV-EAs.[14]
- I’m skeptical about the view that human descendants will optimize their resources for happiness (i.e. create hedonium) relative to optimizing for suffering (i.e. create dolorium).[15] Humans currently seem more deliberately driven to create hedonium, but creating dolorium might be more instrumentally useful (e.g. as a threat to rivals[16]).
- On this topic, I similarly do still believe there’s a higher likelihood of creating hedonium; I just have more skepticism about it than I think is often assumed by EAs.
- I’m largely in agreement with the average HPEV-EA in my moral exchange rate between happiness and suffering. However, I think those EAs tend to greatly underestimate how much the empirical tendency towards suffering over happiness (e.g. wild animals seem to endure much more suffering than happiness) is evidence of a future empirical asymmetry.
- My view here is partly informed by the capacities for happiness and suffering that have evolved in humans and other animals, the capacities that seem to be driven by cultural forces (e.g. corporations seem to care more about downsides than upsides, perhaps because it’s easier in general to destroy and harm things than to create and grow them), and speculation about what could be done in more advanced civilizations, such as my best guess on what a planet optimized for happiness and a planet optimized for suffering would look like. For example, I think a given amount of dolorium/dystopia (say, the amount that can be created with 100 joules of energy) is far larger in absolute moral expected value than hedonium/utopia made with the same resources.
- I’m unsure of how much I would disagree with HPEV-EAs about the argument that we should be highly uncertain about the likelihood of different far future scenarios because of how highly speculative our evidence is, which pushes my estimate of the expected value of the far future towards the middle of the possible range, i.e. towards zero.
- I’m unsure of how much I would disagree with HPEV-EAs about the persistence of evolutionary forces into the future (i.e. how much future beings will be determined by fitness, rather than characteristics we might hope for like altruism and happiness).[17]
- From the historical perspective, it worries me that many historical humans seem like they would be quite unhappy with the way human morality changed after them, such as the way Western countries are less concerned about previously-considered-immoral behavior like homosexuality and gluttony than their ancestors were in 500 CE. (Of course, one might think historical humans would agree with modern humans upon reflection, or think that much of humanity’s moral changes have been due to improved empirical understanding of the world.)[18]
- I’m largely in agreement with HPEV-EAs that humanity’s moral circle has a track record of expansion and seems likely to continue expanding. For example, I think it’s quite likely that powerful beings in the far future will care a lot about charismatic biological animals like elephants or chimpanzees, or whatever beings have a similar relationship to those powerful beings as humanity has to elephants and chimpanzees. (As mentioned above, my pessimism about the continued expansion is largely due to concern about the magnitude of bad-but-unlikely outcomes and the harms that could occur due to MCE stagnation.)
Unfortunately, we don’t have much empirical data or solid theoretical arguments on these topics, so the disagreements I’ve had with HPEV-EAs have mostly just come down to differences in intuition. This is a common theme for prioritization among far future efforts. We can outline the relevant factors and a little empirical data, but the crucial factors seem to be left to speculation and intuition.
Most of these considerations are about how society will develop and utilize new technologies, which suggests we can develop relevant intuitions and speculative capacity by studying social and technological change. So even though these judgments are intuitive, we could potentially improve them with more study of big-picture social and technological change, such as Sentience Institute’s MCE research or Robin Hanson’s book The Age of Em, which analyzes what a future of brain emulations would look like. (This sort of empirical research is what I see as the most promising future research avenue for far future cause prioritization. I worry EAs overemphasize armchair research (like most of this post, actually) for various reasons.[19])
I’d personally be quite interested in a survey of people with expertise in the relevant fields of social, technological, and philosophical research, in which they’re asked about each of the considerations above, though it might be hard to get a decent sample size, and I think it would be quite difficult to debias the respondents (see the Bias section of this post).
I’m also interested in quantitative analyses of these considerations — calculations including all of these potential outcomes and associated likelihoods. As far as I know, this kind of analysis has only been attempted so far by Michael Dickens in “A Complete Quantitative Model for Cause Selection,” in which Dickens notes that “Values spreading may be better than existential risk reduction.” While this quantification might seem hopelessly speculative, I think it’s highly useful even in such situations. Of course, rigorous debiasing is also very important here.
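To make this concrete, here is a toy sketch of what such a calculation might look like. This is not Dickens’ model; the scenario names, probabilities, and values below are placeholder assumptions chosen purely for illustration.

```python
# Toy expected-value calculation over hypothetical far future scenarios.
# All scenario names, probabilities, and values are illustrative placeholders,
# not estimates from this post or from Dickens' model.

scenarios = {
    # name: (probability, moral value in arbitrary units; 0 = nonexistence)
    "broadly utopian future":            (0.20,  100.0),
    "moral circle stalls, mixed future": (0.35,   10.0),
    "astronomical suffering (s-risk)":   (0.10, -150.0),
    "extinction / empty future":         (0.35,    0.0),
}

def expected_value(scens):
    """Return the probability-weighted sum of scenario values."""
    total_probability = sum(p for p, _ in scens.values())
    assert abs(total_probability - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(p * v for p, v in scens.values())

if __name__ == "__main__":
    ev = expected_value(scenarios)
    print(f"Expected value of the far future (toy numbers): {ev:+.1f}")
    # With these placeholder numbers the expectation is small relative to the
    # best and worst cases, which is the kind of result that would count
    # against prioritizing extinction risk reduction on its own.
```

A real version would of course need far more scenarios, explicit uncertainty over the inputs, and the kind of debiasing discussed below.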
Overall, I think the far future is close to zero in expected moral value, meaning it’s not nearly as good as is commonly assumed, implicitly or explicitly, in the EA community.
Scale
Range of outcomes
It’s difficult to compare the scale of far future impacts since they are all astronomical, and overall I don’t find the consideration of scale here to be very useful.
Technically, it seems like MCE involves a larger range of potential outcomes than reducing extinction risk through AIA because, at least from a classical consequentialist perspective (giving weight to both negative and positive outcomes), it could make the difference between some of the worst far futures imaginable and the best far futures. Reducing extinction risk through AIA only makes the difference between nonexistence (a far future of zero value) and whatever world comes to exist. If one believes the far future is highly positive, this could still be a very large range, but it would still be less than the potential change from MCE.
How much less depends on one’s views of how bad the worst future is relative to the best future. If their absolute values are the same, then MCE has a range twice as large as that of extinction risk reduction.
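As a rough way of writing out that comparison (my own shorthand, not notation used elsewhere in this post): let V_best and V_worst be the moral values of the best and worst achievable far futures, with nonexistence counted as zero. Then:

$$\text{range}_{\text{extinction risk}} = V_{\text{best}} - 0 = V_{\text{best}}, \qquad \text{range}_{\text{MCE}} = V_{\text{best}} - V_{\text{worst}} = 2V_{\text{best}} \;\; \text{when } V_{\text{worst}} = -V_{\text{best}}$$

So under the assumption that the worst future is exactly as bad as the best future is good, the MCE range is twice the extinction risk range; under other assumptions the ratio changes accordingly.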
As mentioned in the Context section above, the change in the far future that AIA could achieve might not exactly be extinction versus non-extinction. While an aligned AI would probably not involve the extinction of all sentient beings, since that would require the values of its creators to prefer extinction over all other options, an unaligned AI might not necessarily involve extinction. To use the canonical AIA example of a “paperclip maximizer” (used to illustrate how an AI could easily have a harmful goal without any malicious intention), the rogue AI might create sentient beings as a labor force to implement its goal of maximizing the number of paperclips in the universe, or create sentient beings for some other goal.[20]
This means that the range of AIA is the difference between the potential universes with aligned AI and unaligned AI, which could be very good futures contrasted with very bad futures, rather than just very good futures contrasted with nonexistence.
Brian Tomasik has written out a thoughtful (though necessarily speculative and highly uncertain) breakdown of the risks of suffering in both aligned and unaligned AI scenarios, which weakly suggests that an aligned AI would lead to more suffering in expectation.
All things considered, it seems that the range of quality risk reduction (including MCE) is larger than that of extinction risk reduction (including AIA, depending on one’s view of what difference AI alignment makes), but this seems like a fairly weak consideration to me because (i) it’s a difference of roughly two-fold, which is quite small relative to the differences of ten-times, a thousand-times, etc. that we frequently see in cause prioritization, (ii) there are numerous fairly arbitrary judgment calls (like considering reducing extinction risk from AI versus AIA versus AI safety) that lead to different results.[21]
Likelihood of different far future scenarios[22][23]
MCE is relevant for many far future scenarios where AI doesn’t undergo the sort of “intelligence explosion” or similar progression that makes AIA important; for example, if AGI is developed by an institution like a foreign country that has little interest in AIA, or if AI is never developed, or if it’s developed slowly in a way that makes safety adjustments quite easy as that development occurs. In each of these scenarios, the way society treats sentient beings, especially those currently outside the moral circle, seems like it could still be affected by MCE. As mentioned earlier, I think there is a significant chance that the moral circle will fail to expand to reach all sentient beings, and I think a small moral circle could very easily lead to suboptimal or dystopian far future outcomes.
On the other hand, some possible far future civilizations might not involve moral circles, such as if there is an egalitarian society where each individual is able to fully represent their own interests in decision-making and this societal structure was not reached through MCE because these beings are all equally powerful for technological reasons (and no other beings exist and they have no interest in creating additional beings). Some AI outcomes might not be affected by MCE, such as an unaligned AI that does something like maximizing the number of paperclips for reasons other than human values (such as a programming error) or one whose designers create its value function without regard for humanity’s current moral views (“coherent extrapolated volition” could be an example of this, though I agree with Brian Tomasik that current moral views will likely be important in this scenario).
Given my current, highly uncertain estimates of the likelihood of various far future scenarios, I would guess that MCE is applicable in somewhat more cases than AIA, suggesting it’s easier to make a difference to the far future through MCE. (This is analogous to saying the risk of MCE-failure seems greater than the risk of AIA-failure, though I’m trying to avoid simplifying these into binary outcomes.)
Tractability
How much of an impact can we expect our marginal resources to have on the probability of extinction risk, or on the moral circle of the far future?
Social change versus technical research
One may believe changing people’s attitudes and behavior is quite difficult, and direct work on AIA involves a lot less of that. While AIA likely involves influencing some people (e.g. policymakers, researchers, and corporate executives), MCE consists almost entirely of influencing people’s attitudes and behavior.[24]
However, one could instead believe that technical research is more difficult in general, pointing to potential evidence such as the large amount of money spent on technical research (e.g. by Silicon Valley) with often very little to show for it, while huge social change seems to sometimes be effected by small groups of advocates with relatively little money (e.g. organizers of revolutions in Egypt, Serbia, and Turkey). (I don’t mean this as a very strong or persuasive argument, just as a possibility. There are plenty of examples of tech done with few resources and social change done with many.)
It’s hard to speak so generally, but I would guess that technical research tends to be easier than causing social change. And this seems like the strongest argument in favor of working on AIA over working on MCE.
Track record
In terms of EA work explicitly focused on the goals of AIA and MCE, AIA has a much better track record. The past few years have seen significant technical research output from organizations like MIRI and FHI, as documented by user Larks on the EA Forum for 2016 and 2017. I’d refer readers to those posts, but as a brief example, MIRI had an acclaimed paper on “Logical Induction,” which used a financial market process to estimate the likelihood of logical facts (e.g. mathematical propositions like the Riemann hypothesis) that we aren’t yet sure of. This is analogous to how we use probability theory to estimate the likelihood of empirical facts (e.g. a dice roll). In the bigger picture of AIA, this research could help lay the technical foundation for building an aligned AGI. See Larks’ post for a discussion of more papers like this, as well as non-technical work done by AI-focused organizations such as the Future of Life Institute’s open letter on AI safety signed by leading AI researchers and cited by the White House’s “Report on the Future of Artificial Intelligence.”
Using an analogous definition for MCE, EA work explicitly focused on MCE (meaning expanding the moral circle in order to improve the far future) basically only started in 2017 with the founding of Sentience Institute (SI), though there were various blog posts and articles discussing it before then. SI has basically finished four research projects: (1) Foundational Question Summaries that summarize evidence we have on important effective animal advocacy (EAA) questions, including a survey of EAA researchers, (2) a case study of the British antislavery movement to better understand how they achieved one of the first major moral circle expansions in modern history, (3) a case study of nuclear power to better understand how some countries (e.g. France) enthusiastically adopted this new technology, but others (e.g. the US) didn’t, (4) a nationally representative poll of US attitudes towards animal farming and animal-free food.
With a broader definition of MCE that includes activities that people prioritizing MCE tend to think are quite indirectly effective (see the Neglectedness section for discussion of definitions), we’ve seen EA achieve quite a lot more, such as the work done by The Humane League, Mercy For Animals, Animal Equality, and other organizations on corporate welfare reforms to animal farming practices, and the work done by The Good Food Institute and others on supporting a shift away from animal farming, especially through supporting new technologies like so-called “clean meat.”
Since I favor the narrower definition, I think AIA outperforms MCE on track record, but the difference in track record seems largely explained by the greater resources spent on AIA, which makes it a less important consideration. (Also, when I personally decided to focus on MCE, SI did not yet exist, so the lack of track record was an even stronger consideration in favor of AIA (though MCE was also more neglected at that time).)
To be clear, the track records of all far future projects tend to be weaker than near-term projects where we can directly see the results.
Robustness
If one values robustness, meaning a higher certainty that one is having a positive impact, either for instrumental or intrinsic reasons, then AIA might be more promising because once we develop an aligned AI (that continues to be aligned over time), the work of AIA is done and won’t need to be redone in the future. With MCE, by contrast, progress could more easily be undone, assuming the advent of AI or similar developments won’t fix society’s values in place (known as “value lock-in”), especially if one believes there’s a social setpoint that humanity drifts back towards when moral progress is made.[25]
I think the assumptions of this argument make it quite weak: I’d guess an “intelligence explosion” has a significant chance of value lock-in,[26][27] and I don’t think there’s a setpoint in the sense that positive moral change increases the risk of negative moral change. I also don’t value robustness intrinsically at all or instrumentally very much; I think that there is so much uncertainty in all of these strategies and such weak prior beliefs[28] that differences in certainty of impact matter relatively little.
Miscellaneous
Work on either cause area runs the risk of backfiring. The main risk for AIA seems to be that the technical research done to better understand how to build an aligned AI will increase AI capabilities generally, meaning it’s also easier for humanity to produce an unaligned AI. The main risk for MCE seems to be that certain advocacy strategies will end up having the opposite effect as intended, such as a confrontational protest for animal rights that ends up putting people off of the cause.
It’s unclear which project has better near-term proxies and feedback loops to assess and increase long-term impact. AIA has technical problems with solutions that can be mathematically proven, but these might end up having little bearing on final AIA outcomes, such as if an AGI isn’t developed using the method that was advised or if technical solutions aren’t implemented by policy-makers. MCE has metrics like public attitudes and practices. My weak intuition here, and the weak intuition of other reasonable people I’ve discussed this with, is that MCE has better near-term proxies.
It’s unclear which project has more historical evidence that EAs can learn from to be more effective. AIA has previous scientific, mathematical, and philosophical research and technological successes and failures, while MCE has previous psychological, social, political, and economic research and advocacy successes and failures.
Finally, I do think that we learn a lot about tractability just by working directly on an issue. Given how little effort has gone into MCE itself (see Neglectedness below), I think we could resolve a significant amount of uncertainty with more work in the field.
Overall, considering only direct tractability (i.e. ignoring information value due to neglectedness, which would help other EAs with their cause prioritization), I’d guess AIA is a little more tractable.
Neglectedness
With neglectedness, we also face a challenge of how broadly to define the cause area. In this case, we have a fairly clear goal with our definition: to best assess how much low-hanging fruit is available. To me, it seems like there are two simple definitions that meet this goal: (i) organizations or individuals working explicitly on the cause area, (ii) organizations or individuals working on the strategies that are seen as top-tier by people focused on the cause area. How much one favors (i) versus (ii) depends largely on whether one thinks the top-tier strategies are fairly well-established and thus (ii) makes sense, or whether they will change over time such that one should favor (i) because those organizations and individuals will be better able to adjust.[29]
With the explicit focus definitions of AIA and MCE (recall this includes having a far future focus), it seems that MCE is much more neglected and has more low-hanging fruit.[30] For example, there is only one organization that I know of explicitly committed to MCE in the EA community (SI), while numerous organizations (MIRI, CHAI, part of FHI, part of CSER, even parts of AI capabilities organizations like Montreal Institute for Learning Algorithms, DeepMind, and OpenAI, etc.) are explicitly committed to AIA. Because MCE seems more neglected, we could learn a lot about MCE through SI’s initial work, such as how easily advocates have achieved MCE throughout history.
If we include those working on the cause area without an explicit focus, then that seems to widen the definition of MCE to include some of the top strategies being used to expand the moral circle in the near-term, such as farmed animal work done by Animal Charity Evaluators and its top-recommended charities, which had a combined budget of around $7.5 million in 2016. The combined budgets of top-tier AIA work are harder to estimate, but the Centre for Effective Altruism estimates all AIA work in 2016 was around $6.6 million. The AIA budgets seem to be increasing more quickly than the MCE budgets, especially given the grant-making of the Open Philanthropy Project. We could also include EA movement-building organizations that place a strong focus on reducing extinction risk, and even AIA specifically, such as 80,000 Hours. The categorization for MCE seems to have more room to broaden, perhaps all the way to mainstream animal advocacy strategies like the work of People for the Ethical Treatment of Animals (PETA), which might make AIA more neglected. (It could potentially go even farther, such as advocating for human sweatshop laborers, but that seems too far removed and I don’t know any MCE advocates who think it’s plausibly top-tier.)
I think there’s a difference in aptitude that suggests MCE is more neglected. Moral advocacy, while quite crowded, seems like a field in which deliberate, thoughtful people can relatively easily vastly outperform the average advocate,[31] which can lead to surprisingly large impact (e.g. EAs have already had far more success in publishing their writing, such as books and op-eds, than most writers hope for).[32] Additionally, despite centuries of advocacy, very little quality research has been done to critically examine what advocacy is effective and what’s not, while the fields of math, computer science, and machine learning involve substantial self-reflection and are largely worked on by academics who seem to use more critical thinking than the average activist (e.g. there’s far more skepticism in these academic communities, a demand for rigor and experimentation that’s rarely seen among advocates). In general, I think the aptitude of the average social change advocate is much lower than that of the average technological researcher, suggesting MCE is more neglected, though of course other factors also count.
The relative neglectedness of MCE also seems likely to continue, given the greater self-interest humanity has in AIA relative to MCE and, in my opinion, the net biases towards AIA described in the Biases section of this blog post. (This self-interest argument is a particularly important consideration for prioritizing MCE over AIA in my view.[33])
However, while neglectedness is typically thought to make a project more tractable, it seems that existing work in the extinction risk space has made marginal contributions more impactful in some ways. For example, talented AI researchers can find work relatively easily at an organization dedicated to AIA, while the path for talented MCE researchers is far less clear and easy. This points to the difference in tractability that might exist between labor resources and funding resources, as it currently seems like MCE is much more funding-constrained[34] while AIA is largely talent-constrained.
As another example, there are already solid inroads between the AIA community and the AI decision-makers, and AI decision-makers have already expressed interest in AIA, suggesting that influencing them with research results will be fairly easy once those research results are in hand. This means both that our estimation of AIA’s neglectedness should decrease, and that our estimation of its tractability apart from neglectedness should increase, in the sense that neglectedness is a part of tractability. (The definitions in this framework vary.)
All things considered, I find MCE to be more compelling from a neglectedness perspective, particularly due to the current EA resource allocation and the self-interest humanity has, and will most likely continue to have, in AIA. When I decided to focus on MCE, there was an even stronger case for neglectedness because no organization committed to that goal existed (SI was founded in 2017), though there was an increased downside to MCE — the even more limited track record.
Cooperation
Values spreading as a far future intervention has been criticized on the following grounds: People have very different values, so trying to promote your values and change other people’s could be seen as uncooperative. Cooperation seems to be useful both directly (e.g. how willing are other people to help us out if we’re fighting them?) and in a broader sense because of superrationality, an argument that one should help others even when there’s no causal mechanism for reciprocation.[35]
I think this is certainly a good consideration against some forms of values spreading. For example, I don’t think it’d be wise for an MCE-focused EA to disrupt the Effective Altruism Global conferences (e.g. yell on stage and try to keep the conference from continuing) if they have an insufficient focus on MCE. This seems highly ineffective because of how uncooperative it is, given the EA space is supposed to be one for having challenging discussions and solving problems, not merely advocating one’s positions like a political rally.
However, I don’t think it holds much weight against MCE in particular for two reasons: First, because I don’t think MCE is particularly uncooperative. For example, I never bring up MCE with someone and hear, “But I like to keep my moral circle small!” I think this is because there are many different components of our attitudes and worldview that we refer to as values and morals. People have some deeply-held values that seem strongly resistant to change, such as their religion or the welfare of their immediate family, but very few people seem to have small moral circles as a deeply-held value. Instead, the small moral circle seems to mostly be a superficial, casual value (though it’s often connected to the deeper values) that people are okay with — or even happy about — changing.[36]
Second, insofar as MCE is uncooperative, I think a large number of other EA interventions, including AIA, are similarly uncooperative. Many people, even in the EA community, are concerned with or even opposed to AIA, for example if they believe an aligned AI would create a worse far future than an unaligned AI, or think AIA is harmfully distracting from more important issues and gives EA a bad name. This isn’t to say I think AIA is bad because it’s uncooperative — on the contrary, this seems like a level of uncooperativeness that’s often necessary for dedicated EAs. (In a trivial way, basically all action involves uncooperativeness because it’s always about changing the status quo or preventing the status quo from changing.[37] Even inaction can involve uncooperativeness if it means not working to help someone who would like your help.)
I do think it’s more important to be cooperative in some other situations, such as if one has a very different value system than some of their colleagues, as might be the case for the Foundational Research Institute, which advocates strongly for cooperation with other EAs.
Cooperation with future do-gooders
Another argument against values spreading goes something like, “We can worry about values after we’ve safely developed AGI. Our tradeoff isn’t, ‘Should we work on values or AI?’ but instead ‘Should we work on AI now and values later, or values now and maybe AI later if there’s time?’”
I agree with one interpretation of the first part of this argument, that urgency is an important factor and AIA does seem like a time-sensitive cause area. However, I think MCE is similarly time-sensitive because of risks of value lock-in where our descendants’ morality becomes much harder to change, such as if AI designers choose to fix the values of an AGI, or at least to make them independent of other people’s opinions (they could still be amenable to self-reflection of the designer and new empirical data about the universe other than people’s opinions)[38]; if humanity sends out colonization vessels across the universe that are traveling too fast for us to adjust based on our changing moral views; or if society just becomes too wide and disparate to have effective social change mechanisms like we do today on Earth.
I disagree with the stronger interpretation, that we can count on some sort of cooperation with or control over future people. There might be some extent to which we can do this, such as via superrationality, but that seems like a fairly weak effect. Instead, I think we’re largely on our own, deciding what we do in the next few years (or perhaps in our whole career), and just making our best guess of what future people will do. It sounds very difficult to strike a deal with them that will ensure they work on MCE in exchange for us working on AIA.
Bias
I’m always cautious about bringing considerations of bias into an important discussion like this. Considerations easily turn into messy, personal attacks, and often you can fling roughly-equal considerations of counter-biases when accusations of bias are hurled at you. However, I think we should give them serious consideration in this case. First, I want to be exhaustive in this blog post, and that means throwing every consideration on the table, even messy ones. Second, my own cause prioritization “journey” led me first to AIA and other non-MCE/non-animal-advocacy EA priorities (mainly EA movement-building), and it was considerations of bias that allowed me to look at the object-level arguments with fresh eyes and decide that I had been way off in my previous assessment.
Third and most importantly, people’s views on this topic are inevitably driven mostly by intuitive, subjective judgment calls. One could easily read everything I’ve written in this post and say they lean in the MCE direction on every topic, or the AIA direction, and there would be little object-level criticism one could make against that if they just based their view on a different intuitive synthesis of the considerations. This subjectivity is dangerous, but it is also humbling. It requires us to take an honest look at our own thought processes in order to avoid the subtle, irrational effects that might push us in either direction. It also requires caution when evaluating “expert” judgment, given how much experts could be affected by personal and social biases themselves.
The best way I know of to think about bias in this case is to consider the biases and other factors that favor either cause area and see which case seems more powerful, or which particular biases might be affecting our own views. The following lists are presumably not exhaustive but lay out what I think are some common key parts of people’s journeys to AIA or MCE. Of course, these factors are not entirely deterministic and probably not all will apply to you, nor do they necessarily mean that you are wrong in your cause prioritization. Based on the circumstances that apply more to you, consider taking a more skeptical look at the project you favor and your current views on the object-level arguments for it.
One might be biased towards AIA if...
- They eat animal products, and thus assign lower moral value and lesser mental faculties to animals.
- They haven’t accounted for the bias of speciesism.
- They lack personal connections to animals, such as growing up with pets.
- They are or have been a fan of science fiction and fantasy literature and media, especially if they dreamed of being the hero.
- They have a tendency towards technical research over social projects.
- They lack social skills.
- They are inclined towards philosophy and mathematics.
- They have a negative perception of activists, perhaps seeing them as hippies, irrational, idealistic, “social justice warriors,” or overly emotion-driven.
- They are a part of the EA community, and therefore drift towards the status quo of EA leaders and peers. (The views of EA leaders can of course be genuine evidence of the correct cause prioritization, but they can also lead to bias.)
- The idea of “saving the world” appeals to them.
- They take pride in their intelligence, and would love if they could save the world just by doing brilliant technical research.
- They are competitive, and like the feeling/mindset of doing astronomically more good than the average do-gooder, or even the average EA. (I’ve argued in this post that MCE has this astronomical impact, but it lacks the feeling of literally “saving the world” or otherwise having a clear impact that makes a good hero’s journey climax, and it’s closely tied to lesser, near-term impacts.)
- They have little personal experience of extreme suffering, the sort that makes one pessimistic about the far future, especially regarding s-risks. (Personal experience could be one’s own experience or the experiences of close friends and family.)
- They have little personal experience of oppression, such as due to their gender, race, disabilities, etc.
- They are generally a happy person.
- They are generally optimistic, or at least averse to thinking about bad outcomes like how humanity could cause astronomical suffering. (Though some pessimism is required for AIA in the sense that they don’t count on AI capabilities researchers ending up with an aligned AI without their help.)
One might be biased towards MCE if...
- They are vegan, especially if they went vegan for non-animal or non-far-future reasons, such as for better personal health.
- Their gut reaction when they hear about extinction risk or AI risk is to judge it nonsensical.
- They have personal connections to animals, such as growing up with pets.
- They are or have been a fan of social movement/activism literature and media, especially if they dreamed of being a movement leader.
- They have a tendency towards social projects over technical research.
- They have benefitted from above-average social skills.
- They are inclined towards social science.
- They have a positive perception of activists, perhaps seeing them as the true leaders of history.
- They have social ties to vegans and animal advocates. (The views of these people can of course be genuine evidence of the correct cause prioritization, but they can also lead to bias.)
- The idea of “helping the worst off” appeals to them.
- They take pride in their social skills, and would love if they could help the worst off just by being socially savvy.
- They are not competitive, and like the thought of being a part of a friendly social movement.
- They have a lot of personal experience of extreme suffering, the sort that makes one pessimistic about the far future, especially regarding s-risks. (Personal experience could be one’s own experience or the experiences of close friends and family.)
- They have a lot of personal experience of oppression, such as due to their gender, race, disabilities, etc.
- They are generally an unhappy person.
- They are generally pessimistic, or at least don’t like thinking about good outcomes. (Though some optimism is required for MCE in the sense that they believe work on MCE can make a large positive difference in social attitudes and behavior.)
- They care a lot about directly seeing the impact of their work, even if the bulk of their impact is hard to see. (E.g. seeing improvements in the conditions of farmed animals, which can be seen as a proxy for helping farmed-animal-like beings in the far future.)
Implications
I personally found myself far more compelled towards AIA in my early involvement with EA before I had thought in detail about the issues discussed in this post. I think the list items in the AIA section apply to me much more strongly than the MCE list. When I considered these biases, in particular speciesism and my desire to follow the status quo of my EA friends, a fresh look at the object-level arguments changed my mind.
From my reading and conversations in EA, I think the biases in favor of AIA are also quite a bit stronger in the community, though of course some EAs — mainly those already working on animal issues for near-term reasons — probably feel a stronger pull in the other direction.
How you think about these bias considerations also depends on how biased you think the average EA is. If you, for example, think EAs tend to be quite biased in another way like “measurement bias” or “quantifiability bias” (a tendency to focus too much on easily-quantifiable, low-risk interventions), then considerations of biases on this topic should probably be more compelling to you than they will be to people who think EAs are less biased.
Notes
[1] This post attempts to compare these cause areas overall, but since that’s sometimes too vague, I specifically mean the strategies within each cause area that seem most promising. I think this is basically equal to “what EAs working on MCE most strongly prioritize” and “what EAs working on AIA most strongly prioritize.”
[2] There’s a sense in which AIA is a form of MCE simply because AIA will tend to lead to certain values. I’m excluding that AIA route to MCE from my analysis here to avoid overlap between these two cause areas.
[3] Depending on how close we’re talking about, this could be quite unlikely. If we’re discussing the range of outcomes from dystopia across the universe to utopia across the universe, then a range like “between modern earth and the opposite value of modern earth” seems like a very tiny fraction of the total possible range.
[4] I mean “good” in a “positive impact” sense here, so it includes not just rationality according to the decision-maker but also value alignment, luck, being empirically well-informed, being capable of doing good things, etc.
[5] One reason for optimism is that you might think most extinction risk is in the next few years, such that you and other EAs you know today will still be around to do this research yourselves and make good decisions after those risks are avoided.
[6] Technically one could believe the far future is negative but also that humans will make good decisions about extinction, such as if one believes the far future (given non-extinction) will be bad only due to nonhuman forces, such as aliens or evolutionary trends, but has optimism about human decision-making, including both that humans will make good decisions about extinction and that they will be logistically able to make those decisions. I think this is an unlikely view to settle on, but it would make option value a good thing in a “close to zero” scenario.
[7] Non-extinct civilizations could be maximized for happiness, maximized for interestingness, set up like Star Wars or another sci-fi scenario, etc. while extinct civilizations would all be devoid of sentient beings, perhaps with some variation in physical structure like different planets or remnant structures of human civilization.
[8] My views on this are currently largely qualitative, but if I had to put a number on the word “significant” in this context, it’d be somewhere around 5-30%. This is a very intuitive estimate, and I’m not prepared to justify it.
[9] Paul Christiano made a general argument in favor of humanity reaching good values in the long run due to reflection in his post “Against Moral Advocacy” (see the “Optimism about reflection” section), though he doesn’t specifically address concern for all sentient beings as a potential outcome, which might be less likely than other good values that are more driven by cooperation.
[10] Nick Bostrom has considered some of these risks of artificial suffering using the term “mind crime,” which specifically refers to harming sentient beings created inside a superintelligence. See his book, Superintelligence.
[11] The Foundational Research Institute has written about risks of astronomical suffering in “Reducing Risks of Astronomical Suffering: A Neglected Priority.” The TV series Black Mirror is an interesting dramatic exploration of how the far future could involve vast amounts of suffering, such as the episodes “White Christmas” and “USS Callister.” Of course, the details of these situations often veer towards entertainment over realism, but their exploration of the potential for dystopias in which people abuse sentient digital entities is thought-provoking.
[12] I’m highly uncertain about what sort of motivations (like happiness and suffering in humans) future digital sentient beings will have. For example, is punishment being a stronger motivator in earth-originating life just an evolutionary fluke that we can expect to dissipate in artificial beings? Could they be just as motivated to attain reward as we are to avoid punishment? I think this is a promising avenue for future research, and I’m glad it’s being discussed by some EAs.
[13] Brian Tomasik discusses this in his essay on “Values Spreading is Often More Important than Extinction Risk,” suggesting that, “there's not an obvious similar mechanism pushing organisms toward the things that I care about.” However, Paul Christiano notes in “Against Moral Advocacy” that he expects “[c]onvergence of values” because “the space of all human values is not very broad,” though this seems quite dependent on how one defines the possible space of values.
[14] This efficiency argument is also discussed in Ben West’s article on “An Argument for Why the Future May Be Good.”
[15] The term “resources” is intentionally quite broad. This means whatever the limitations are on the ability to produce happiness and suffering, such as energy or computation.
[16] One can also create hedonium as a promise to get things from rivals, but promises seem less common than threats because threats tend to be more motivating and easier to implement (e.g. it's easier to destroy than to create). However, some social norms encourage promises over threats because promises are better for society as a whole. Additionally, threats against powerful beings (e.g. other citizens in the same country) accomplish less than threats against less powerful or more distant beings, and the latter category might be increasingly common in the future. Finally, threats and promises matter less when one considers that they often go unfulfilled because the other party doesn't do the action that was the subject of the threat or promise.
[17] Paul Christiano’s blog post on “Why might the future be good?” argues that “the future will be characterized by much higher influence for altruistic values [than self-interest],” though he seems to just be discussing the potential of altruism and self-interest to create positive value, rather than their potential to create negative value.
Brian Tomasik discusses Christiano’s argument and others in “The Future of Darwinism” and concludes, “Whether the future will be determined by Darwinism or the deliberate decisions of a unified governing structure remains unclear.”
[18] One discussion of changes in morality on a large scale is Robin Hanson’s blog post, “Forager, Farmer Morals.”
[19] Armchair research is relatively easy, in the sense that all it requires is writing and thinking rather than also digging through historical texts, running scientific studies, or engaging in substantial conversation with advocates, researchers, and/or other stakeholders. It’s also more similar to the mathematical and philosophical work that most EAs are used to doing. And it’s more attractive as a demonstration of personal prowess to think your way into a crucial consideration than to arrive at one through the tedious work of research. (These reasons are similar to the reasons I feel most far-future-focused EAs are biased towards AIA over MCE.)
[20] These sentient beings probably won’t be the biological animals we know today, but instead digital beings who can more efficiently achieve the AI’s goals.
[21] The neglectedness heuristic involves a similar messiness of definitions, but the choices seem less arbitrary to me, and the different definitions lead to more similar results.
[22] Arguably this consideration should be under Tractability rather than Scale.
[23] There’s a related framing here of “leverage,” with the basic argument being that AIA seems more compelling than MCE because AIA is specifically targeted at an important, narrow far future factor (the development of AGI) while MCE is not as specifically targeted. This also suggests that we should consider specific MCE tactics focused on important, narrow far future factors, such as ensuring the AI decision-makers have wide moral circles even if the rest of society lags behind. I find this argument fairly compelling, including the implication that MCE advocates should focus more on advocating for digital sentience and advocating in the EA community than they would otherwise.
[24] Though plausibly MCE involves only influencing a few decision-makers, such as the designers of an AGI.
[25] Brian Tomasik discusses this in, “Values Spreading is Often More Important than Extinction Risk,” arguing that, “Very likely our values will be lost to entropy or Darwinian forces beyond our control. However, there's some chance that we'll create a singleton in the next few centuries that includes goal-preservation mechanisms allowing our values to be "locked in" indefinitely. Even absent a singleton, as long as the vastness of space allows for distinct regions to execute on their own values without take-over by other powers, then we don't even need a singleton; we just need goal-preservation mechanisms.”
[26] Brian Tomasik discusses the likelihood of value lock-in in his essay, “Will Future Civilization Eventually Achieve Goal Preservation?”
[27] The advent of AGI seems like it will have similar effects on the lock-in of values and alignment, so if you think AI timelines are shorter (i.e. advanced AI will be developed sooner), then that increases the urgency of both cause areas. If you think timelines are so short that we will struggle to successfully reach AI alignment, then that decreases the tractability of AIA, but MCE seems like it could more easily have a partial effect on AI outcomes than AIA could.
[28] In the case of near-term, direct interventions, one might believe that “most social programmes don’t work,” which suggests that we should have low, strong priors for intervention effectiveness that we need robustness to overcome.
[29] Caspar Oesterheld discusses the ambiguity of neglectedness definitions in his blog post, "Complications in evaluating neglectedness." Other EAs have also raised concern about this commonly-used heuristic, and I almost included this content in this post under the “Tractability” section for this reason.
[30] This is a fairly intuitive sense of the word “matched.” I’m taking the topic of ways to affect the far future, dividing it into population risk and quality risk categories, then treating AIA and MCE as subcategories of each. I’m also thinking in terms of each project (AIA and MCE) being in the category of “cause areas with at least pretty good arguments in their favor,” and I think “put decent resources into all such projects until the arguments are rebutted” is a good approach for the EA community.
[31] I mean “advocate” quite broadly here, just anyone working to effect social change, such as people submitting op-eds to newspapers or trying to get pedestrians to look at their protest or take their leaflets.
[32] It’s unclear what the explanation is for this. It could just be demographic differences such as high IQ, going to elite universities, etc. but it could also be exceptional “rationality skills” like finding loopholes in the publishing system.
[33] In Brian Tomasik’s essay on “Values Spreading is Often More Important than Extinction Risk,” he argues that “[m]ost people want to prevent extinction” while, “In contrast, you may have particular things that you value that aren't widely shared. These things might be easy to create, and the intuition that they matter is probably not too hard to spread. Thus, it seems likely that you would have higher leverage in spreading your own values than in working on safety measures against extinction.”
[34] This is just my personal impression from working in MCE, especially with my organization Sentience Institute. With indirect work, The Good Food Institute is a potential exception, since they have struggled to quickly hire talented people after receiving large amounts of funding.
[35] See "Superrationality" in "Reasons to Be Nice to Other Value Systems" for an EA introduction to the idea. See "In favor of 'being nice'" in "Against Moral Advocacy" as an example of cooperation being used as an argument against values spreading. In "Multiverse-wide Cooperation via Correlated Decision Making," Caspar Oesterheld argues that superrational cooperation makes MCE more important.
[36] This discussion is complicated by the widely varying degrees of MCE. While, for example, most US residents seem perfectly okay with expanding concern to vertebrates, there would be more opposition to expanding to insects, and even more to some simple computer programs that some argue should fit into the edges of our moral circles. I do think the farthest expansions are much less cooperative in this sense, though if the message is just framed as, “expand our moral circle to all sentient beings,” I still expect strong agreement.
[37] One exception is a situation where everyone wants a change to happen, but nobody else wants it badly enough to put the work into changing the status quo.
[38] My impression is that the AI safety community currently wants to avoid fixing these values, though they might still be trying to make them resistant to advocacy from other people, and in general I think many people today would prefer to fix the values of an AGI when they consider that they might not agree with potential future values.
I come back to this post quite frequently when considering whether to prioritize MCE (via animal advocacy) or AI safety. It seems that these two cause areas often attract quite different people with quite different objectives, so this post is unique in its attempt to compare the two based on the same long-term considerations.
I especially like the discussion of bias. Although some might find the whole discussion a bit ad hominem, I think people in EA should take seriously the worry that certain features common in the EA community (e.g., an attraction towards abstract puzzles) might bias us towards particular cause areas.
I recommend this post for anyone interested in thinking more broadly about longtermism.
Thank you for writing this post. An evergreen difficulty in discussing topics of such broad scope is the large number of considerations that are relevant, difficult to judge, and on which one's judgement (whatever it may be) can reasonably be challenged. I hope to offer a crisper summary of why I am not persuaded.
I understand from this that the primary motivation for MCE is avoiding AI-based dystopias, with the implied causal chain being along the lines of, “If we ensure the humans generating the AI have a broader circle of moral concern, the resulting post-human civilization is less likely to include dystopic scenarios involving great multitudes of suffering sentiences.”
There are two considerations that speak against this being a greater priority than AI alignment research: 1) Back-chaining from AI dystopias leaves relatively few occasions where MCE would make a crucial difference. 2) The current portfolio of ‘EA-based’ MCE is poorly addressed to averting AI-based dystopias.
Re. 1): MCE may prove neither necessary nor sufficient for ensuring AI goes well. On one hand, AI designers, even if speciesist themselves, might nonetheless provide the right apparatus for value learning such ... (read more)
In Human Compatible (2019), Stuart Russell advocates for AGI to follow preference utilitarianism, maximally satisfying the values of humans. As for animal interests, he seems to think they are sufficiently represented, since he writes that the AI will value them insofar as humans care about them. Reading this from Stuart Russell shifted me toward thinking that moral circle expansion probably does matter for the long-term future. It seems quite plausible (likely?) that AGI will follow this kind of value function, which does not directly care about animals, rather than broadly anti-speciesist values, since AI researchers are not generally anti-speciesist. In this case, moral circle expansion across the general population would be essential.
(Another factor is that Russell's reward modeling depends on receiving feedback occasionally from humans to learn their preferences, which is much more difficult to do with animals. Thus, under an approach similar to reward modeling, AGI developers probably won't bother to directly include animal preferences, when that involves all the extra work of figuring out how to get the AI to discern animal preferences. And how many AI researc... (read more)
Those considerations make sense. I don't have much more to add for/against than what I said in the post.
On the comparison between different MCE strategies, I'm pretty uncertain which are best. The main reasons I currently favor farmed animal advocacy over your examples (global poverty, environmentalism, and companion animals) are that (1) farmed animal advocacy is far more neglected, and (2) farmed animal advocacy is far more similar to potential far future dystopias, mainly just because it involves vast numbers of sentient beings who are largely ignored by most of society. I'm not relatively very worried about, for example, far future dystopias where dog-and-cat-like beings (e.g. small, entertaining AIs kept around for companionship) are suffering in vast numbers. And environmentalism is typically advocacy for non-sentient beings, which I think is quite different from MCE for sentient beings.
I think the better competitors to farmed animal advocacy are advocating broadly for antispeciesism/fundamental rights (e.g. Nonhuman Rights Project) and advocating specifically for digital sentience (e.g. a larger, more sophisticated version of People for the Ethical Treatment of Reinforcement L... (read more)
Wild animal advocacy is far more neglected than farmed animal advocacy, and it involves even larger numbers of sentient beings ignored by most of society. If the superiority of farmed animal advocacy over global poverty along these two dimensions is a sufficient reason for not working on global poverty, why isn't the superiority of wild animal advocacy over farmed animal advocacy along those same dimensions also a sufficient reason for not working on farmed animal advocacy?
I personally don't think WAS is as similar to the most plausible far future dystopias, so I've been prioritizing it less even over just the past couple of years. I don't expect far future dystopias to involve as much naturogenic (nature-caused) suffering, though of course it's possible (e.g. if humans create large numbers of sentient beings in a simulation, but then let the simulation run on its own for a while, then the simulation could come to be viewed as naturogenic-ish and those attitudes could become more relevant).
I think if one wants something very neglected, digital sentience advocacy is basically across-the-board better than WAS advocacy.
That being said, I'm highly uncertain here and these reasons aren't overwhelming (e.g. WAS advocacy pushes on more than just the "care about naturogenic suffering" lever), so I think WAS advocacy is still, in Gregory's words, an important part of the 'far future portfolio.' And often one can work on it while working on other things, e.g. I think Animal Charity Evaluators' WAS content (e.g. [guest blog post by Oscar Horta](https://animalcharityevaluators.org/blog/why-the-situation-of-animals-in-the-wild-should-concern-us/)) has helped them be more well-rounded as an organization, and didn't directly trade off with their farmed animal content.
Yes, terraforming is a big way in which close-to-WAS scenarios could arise. I do think it's smaller in expectation than digital environments that develop on their own and thus are close-to-WAS.
I don't think terraforming would produce wildlife very different from today's, e.g. I doubt it would be done without predation and disease.
Ultimately I still think the digital, not-close-to-WAS scenarios seem much larger in expectation.
Thanks for funding this research. Notes:
Ostensibly, much of Sentience Institute's (SI) current research is focused on identifying the MCE strategies that have historically turned out to be most effective among those that have been tried. I think SI as an organization builds on EA's experience as a movement of having significant success with MCE in a relatively short period of time. Successfully spreading the meme of effective giving, increasing concern for the far future in notable ways, and corporate animal welfare campaigns are all dramatic achievements for a young social movement like EA. While these aren't on the scale of shaping MCE over the course of the far future, these achievements make it seem more possible that EA and allied movements can have an outsized impact by pursuing neglected strategies for values-spreading.
On terminology, it is inaccurate to say the focus is on non-human animals, or even on the moral patients which typically come to mind when describing 'animal-like' minds, i.e., familiar vertebrates. "Sentient being," "moral patient," or "non-human agents/beings" are terms which are inclusive of non-human animals and the other types of potential moral patients posited. Admittedly these aren't catchy terms.
Hm, yeah, I don't think I fully understand you here either, and this seems somewhat different than what we discussed via email.
My concern is with (2) in your list. "[T]hey do not wish to be convinced to expand their moral circle" is extremely ambiguous to me. Presumably you mean that they -- without MCE advocacy being done -- wouldn't put wide-MC* values, or values that lead to a wide MC, into an aligned AI. But I think it's being conflated with "they actively oppose it" or "they would answer 'no' if asked, 'Do you think your values are wrong when it comes to which moral beings deserve moral consideration?'"
I think they don't actively oppose it, they would mostly answer "no" to that question, and it's very uncertain whether they will put the wide-MC-leading values into an aligned AI. I don't think CEV or similar reflection processes reliably lead to wide moral circles. I think they can still be heavily influenced by their initial set-up (e.g. what the values of humanity are when reflection begins).
This leads me to think that you only need (2) to be true in a very weak sense for MCE to matter. I think it's quite plausible that this is the case.
*Wide-MC meaning an extremely wide moral circle, e.g. includes insects, small/weird digital minds.
I think that there's an inevitable tradeoff between wanting a reflection process to have certain properties and worries about this violating goal preservation for at least some people. This blogpost is not about MCE directly, but if you think of "BAAN thought experiment" as "we do moral reflection and the outcome is such a wide circle that most people think it is extremely counterintuitive" then the reasoning in large parts of the blogpost should apply perfectly to the discussion here.
That is not to say that trying to fine tune reflection processes is pointless: I think it's very important to think about what our desiderata should be for a CEV-like reflection process. I'm just saying that there will be tradeoffs between certain commonly mentioned desiderata that people don't realize are there because they think there is such a thing as "genuinely free and open-ended deliberation."
Thanks for commenting, Lukas. I think Lukas, Brian Tomasik, and others affiliated with FRI have thought more about this, and I basically defer to their views here, especially because I haven't heard any reasonable people disagree with this particular point. Namely, I agree with Lukas that there seems to be an inevitable tradeoff here.
I tend to think of moral values as being pretty contingent and pretty arbitrary, such that what values you start with makes a big difference to what values you end up with even on reflection. People may "imprint" on the values they receive from their culture to a greater or lesser degree.
I'm also skeptical that sophisticated philosophical-type reflection will have significant influence over posthuman values compared with more ordinary political/economic forces. I suppose philosophers have sometimes had big influences on human politics (religions, Marxism, the Enlightenment), though not necessarily in a clean "carefully consider lots of philosophical arguments and pick the best ones" kind of way.
I'd qualify this by adding that the philosophical-type reflection seems to lead in expectation to more moral value (positive or negative, e.g. hedonium or dolorium) than other forces, despite overall having less influence than those other forces.
I thought this piece was good. I agree that MCE work is likely quite high impact - perhaps around the same level as X-risk work - and that it has been generally ignored by EAs. I also agree that it would be good for there to be more MCE work going forward. Here's my 2 cents:
You seem to be saying that AIA is a technical problem and MCE is a social problem. While I think there is something to this, I think there are very important technical and social sides to both of these. Much of the work related to AIA so far has been about raising awareness of the problem (e.g. the book Superintelligence), and this is more a social solution than a technical one. Also, avoiding a technological race for AGI seems important for AIA, and this also is more a social problem than a technical one.
For MCE, the 2 best things I can imagine (that I think are plausible) are both technical in nature. First, I expect clean meat will lead to the moral circle expanding more to animals. I really don't see any vegan social movement succeeding in ending factory farming anywhere near as much as I expect clean meat to. Second, I'd imagine that a mature science of consciousness would increase MCE significantly. Many ... (read more)
Thanks for the comment! A few of my thoughts on this:
If one is convinced non-extinction civilization is net positive, this seems true and important. Sorry if I framed the post too much as one or the other for the whole community.
Maybe. My impression from people working on AIA is that they see it as mostly technical, and indeed they think much of the social work has been net negative. Perhaps not Superintelligence, but at least the work that's been done to get media coverage and widespread attention without the technical attention to detail of Bostrom's book.
I think the more important social work (from a pro-AIA perspective) is about convincing AI decision-makers to use the technical results of AIA research, but my impression is that AIA proponents still think getting those technical results is proba... (read more)
Thanks for writing this, I thought it was a good article. And thanks to Greg for funding it.
My pushback would be on the cooperation and coordination point. It seems that a lot of other people, with other moral values, could make a very similar argument: that they need to promote their values now, as the stakes are very high with possible upcoming value lock-in. To people with those values, these arguments should seem roughly as important as the above argument is to you.
Yeah, I think that's basically right. I think moral circle expansion (MCE) is closer to your list items than extinction risk reduction (ERR) is because MCE mostly competes in the values space, while ERR mostly competes in the technology space.
However, MCE is competing in a narrower space than just values. It's in the MC space, which is just the space of advocacy on what our moral circle should look like. So I think it's fairly distinct from the list items in that sense, though you could still say they're in the same space because all advocacy competes for news coverage, ad buys, recruiting advocacy-oriented people, etc. (Technology projects could also compete for these things, though there are separations, e.g. journalists with a social beat versus journalists with a tech beat.)
I think the comparably narrow space of ERR is ER, which also includes people who don't want extinction risk reduced (or even want it increased), such as some hardcore environmentalists, antinatalists, and negative utilitarians.
I think these are legitimate cooperation/coordination perspectives, and it's not really clear to me how they add up. But in general, I think this matters mostly in situations where you... (read more)
Fortunately, not everyone does take this advice literally :).
This is very similar to the tragedy of the commons. If everyone acts out of their own self-interest, then everyone will be worse off. However, the situation as you described it does not fully reflect reality, because none of the groups you mentioned are actually trying to influence AI researchers at the moment. Therefore, MCE has a decisive advantage. Of course, this is always subject to change.
I find that it is often the case that people will dismiss any specific moral recommendation for AI except this one. Personally I don't see a reason to think that there are certain universal principles of minimal alignment. You may argue that human extinction is something that almost everyone agrees is bad -- but now the principle of minimal alignment has shifted to "have the AI prevent things that almost everyone agrees is bad" which is another privileged moral judgement that I see no intrinsic reason to hold.
In truth, I see no neutral assumptions to ground AI alignment theory in. I think this is made even more difficult because even relatively small differences in moral theory from the point of view of information theoretic descriptions of moral values can lead to drastically different outcomes. However, I do find hope in moral compromise.
As EA has grown as a movement, the community appears to have converged, through a shared reasoning process, on the view that what is centrally morally important is the well-being of a relatively wide breadth of moral patients, with relatively equal moral weight assigned to the well-being of each moral patient. The difference between SI and those who focus on AIA is primarily their differing estimates of the expected value of the far future in terms of average or total well-being. Among the examples you provided, it seems some worldviews are more amenable to this kind of reasoning, which lends itself to consequentialism and EA. Many community members were egalitarians and libertarians who now find common cause in trying to figure out whether to focus on AIA or MCE. I think your point is important in that ultimately advocating for this type of values spreading could be bad. However, what appears to be an extreme amount of diversity could end up looking less fraught in a competition among values, as divergent worldviews converge on similar goals.
Since different types of worldviews, like any amenable to aggregate consequentialist frameworks, can collate around a sing... (read more)
Thanks for this post. Some scattered thoughts:
This doesn't seem like a big consideration to me. Even if unfriendly AI comes sooner by an entire decade, this matters little on a cosmic timescale. An argument I find more compelling: If we plot the expected utility of an AGI as a function of the amount of effort put into aligning it, there might be a "valley of bad alignment" that is worse than no attempt at alignment at all. (A paperclip maximizer will quickly kill us and not generate much long-term suffering, whereas an AI that understands the importance of human survival but doesn't understand any other values will imprison us for all eternity. Something like that.)
I'd like to know more about why people think that our moral circles have expanded. I suspect activism plays a smaller role than you think. Steven Pinker talks about possible reasons for declining violence in his book The Better Angels of Our Nature. I'm guessing this is highly relat... (read more)
In addition to the counterpoints mentioned by Gregory Lewis, I think there is a further reason why MCE seems less effective than more targeted interventions to improve the quality of the long-term future: gains from trade between humans with different values become easier to implement as the reach of technology increases. As long as a non-trivial fraction of humans end up caring about animal wellbeing or digital minds, it seems likely it would be cheap for other coalitions to offer trades. So whether 10% of future people end up with an expanded moral circle or 100% may not make much of a difference to the outcome: it will be reasonably good either way if people reap the gains from trade.
One might object that it is unlikely that humans would be able to cooperate efficiently, given that we don't see this type of cooperation happening today. However, I think it's reasonable to assume that staying in control of technological progress beyond the AGI transition requires a degree of wisdom and foresight that is very far away from where most societal groups are at today. And if humans do stay in control, then finding a good solution for value disagreements may be the easier problem... (read more)
So I'm curious for your thoughts. I see this concern about "incidental suffering of worker-agents" stated frequently, which may be likely in many future scenarios. However, it doesn't seem to be a crucial consideration, specifically because I care about small/weird minds with non-complex experiences (your first consideration).
Caring about small minds seems to imply that "Opportunity Cost/Lost Risks" are the dominant consideration - if small minds have moral value comparable to large minds, then... (read more)
This post is extremely valuable - thank you! You have caused me to reexamine my views about the expected value of the far future.
What do you think are the best levers for expanding the moral circle, besides donating to SI? Is there anything else outside of conventional EAA?
Thanks! That's very kind of you.
I'm pretty uncertain about the best levers, and I think research can help a lot with that. Tentatively, I do think that MCE ends up aligning fairly well with conventional EAA (perhaps it should be unsurprising that the most important levers to push on for near-term values are also most important for long-term values, though it depends on how narrowly you're drawing the lines).
A few exceptions to that:
Digital sentience probably matters the most in the long run. There are good reasons to be skeptical we should be advocating for this now (e.g. it's quite outside of the mainstream so it might be hard to actually get attention and change minds; it'd probably be hard to get funding for this sort of advocacy (indeed that's one big reason SI started with farmed animal advocacy)), but I'm pretty compelled by the general claim, "If you think X value is what matters most in the long-term, your default approach should be working on X directly." Advocating for digital sentience is of course neglected territory, but Sentience Institute, the Nonhuman Rights Project, and Animal Ethics have all worked on it. People for the Ethical Treatment of Reinforceme...
I thought this was very interesting, thanks for writing it up. Two comments:
It was useful to have a list of reasons why you think the EV of the future could be around zero, but I still found it quite vague/hard to imagine - why exactly would more powerful minds be mistreating less powerful minds? etc. - so I would have liked to see that sketched in slightly more depth.
It's not obvious to me it's correct/charitable to draw the neglectedness of MCE so narrowly. Can't we conceive of a huge amount of moral philosophy, as well as social activism, both new and old, as MCE? Isn't all EA outreach an indirect form of MCE?
I'm sympathetic to both of those points personally.
1) I considered that, and in addition to time constraints, I know others haven't written on this because there's a big concern that talking about it could make it more likely to happen. I err more towards sharing it despite this concern, but I'm pretty uncertain. Even the detail of this post was more than several people wanted me to include.
But mostly, I'm just limited on time.
2) That's reasonable. I think all of these boundaries are fairly arbitrary; we just need to try to use the same standards across cause areas, e.g. considering only work with this as its explicit focus. Theoretically, since Neglectedness is basically just a heuristic to estimate how much low-hanging fruit there is, we're aiming at "The space of work that might take such low-hanging fruit away." In this sense, Neglectedness could vary widely. E.g. there's limited room for advocating (e.g. passing out leaflets, giving lectures) directly to AI researchers, but this isn't affected much by advocacy towards the general population.
I do think moral philosophy that leads to expanding moral circles (e.g. writing papers supportive of utilitarianism), moral-circle-foc... (read more)
A very interesting and engaging article indeed.
I agree that people often underestimate the value of strategic value spreading. Oftentimes, proposed moral models that AI agents will follow have some lingering narrowness to them, even when they attempt to apply the broadest of moral principles. For instance, in Chapter 14 of Superintelligence, Bostrom highlights his common good principle:
Clearly, even something as broad as tha... (read more)
This is the main reason I think the far future is high EV. I think we should be focusing on p(Hedonium) and p(Dolorium) more than anything else. I'm skeptical that, from a hedonistic utilitarian perspective, byproducts of civilization could come close to matching the expected value from deliberately tiling the universe (potentially multiverse) with consciousness optimized for pleasure or pain. If p(H)>p(D), the future of humanity is very likely positive EV.
You say
Could you elaborate more on why this is the case? I would tend to think that a prior would be that they're equal, and then you update on the fact that they seem to be asymmetrical, and try to work out why that is the case, and whether those factors will apply in future. They could be fundamentally asymmetrical, or evolutionary pressures may tend... (read more)
My current position is that the amount of pleasure/suffering that conscious entities will experience in a far-future technological civilization will not be well-defined. Some arguments for this:
Generally utility functions or reward functions are invariant under affine transformations (with suitable rescaling of the learning rate for reward functions). Therefore they cannot be compared between different intelligent agents as a measure of pleasure. (See the sketch after this list.)
The clean separation of our civilization into many different individuals is an artifact of how evolution op
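A minimal sketch of that first point, using standard (purely illustrative) decision-theoretic notation: if an agent's preferences are represented by a utility function \(U\), then any positive affine transformation

\[
U'(x) = aU(x) + b, \qquad a > 0,
\]

represents exactly the same preferences, since \(U'(x_1) > U'(x_2)\) if and only if \(U(x_1) > U(x_2)\). Because behavior only pins \(U\) down up to the choice of \(a\) and \(b\), the absolute magnitude of "utility" is not well-defined, so it cannot by itself be read as an amount of pleasure or suffering that is comparable across agents.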
Thanks very much for writing this, and thanks to Greg for funding it! I think this is a really important discussion. Some slightly rambling thoughts below.
We can think about 3 ways of improving the EV of the far future:
1: Changing incentive structures experienced by powerful agents in the future (e.g. avoiding arms races, power struggles, selection pressures)
2: a) Changing the moral compass of powerful agents in the future in specific directions (e.g. MCE).
b) Indirect ways to improve the moral compass of powerful agents in the future (e.g. philosophy r... (read more)
Thank you for providing an abstract for your article. I found it very helpful.
(and I wish more authors here would do so as well)
Random thought: (factory farm) animal welfare issues will likely eventually be solved by cultured (lab grown) meat when it becomes cheaper than growing actual animals. This may take a few decades, but social change might take even longer. The article even suggests technical issues may be easier to solve, so why not focus more on that (rather than on MCE)?
Thank you for this piece. I enjoyed reading it and I'm glad that we're seeing more people being explicit about their cause-prioritization decisions and opening up discussion on this crucially important issue.
I know that it's a weak consideration, but I hadn't, before I read this, considered the argument for the scale of values spreading being larger than the scale of AI alignment (perhaps because, as you pointed out, the numbers involved in both are huge) so thanks for bringing that up.
I'm in agreement with Michael_S that hedonium and dolorium should be... (read more)
While the central thesis of expanding one's moral circle may be well received by the community, this post is not selling it well. This is exemplified by the “One might be biased towards AIA if…” section, which makes assumptions about individuals who focus on AI alignment. Further, while the post includes a section on cooperation, it discourages it. [Edit: Prima facie,] the post does not invite critical discussion. Thus, I would not recommend this post to any readers interested in moral circle expansion, AI alignment, or cooperation. Thus, I would recommend this post to readers interested in moral circle expansion, AI alignment, and cooperation, as long as they are interested in a vibrant discourse.

Do you think there's a better way to discuss biases that might push people to one cause or another? Or that we shouldn't talk about such potential biases at all?
What do you mean by this post discouraging cooperation?
What do you expect an invitation for critical discussion to look like? I usually take that to be basically implicit when something is posted to the EA Forum, unless the author states otherwise.
Impressive article - I especially liked the biases section. I would recommend building a quantitative cost-effectiveness model comparing MCE to AIA, as I have done for global agricultural catastrophes, especially because neglectedness is hard to define in your case.
Thanks for writing it.
Here are my reasons for believing the wild animal/small minds/... suffering agenda is based mostly on errors and uncertainties. Some of the uncertainties should warrant research effort, but I do not believe the current state of knowledge justifies prioritization of any kind of advocacy or value spreading.
1] The endeavour seems to be based on extrapolating intuitive models far outside the scope for which we have data. The whole suffering calculus is based on extrapolating the concept of suffering far away from the domain for which we have... (read more)
You raise some good points. (The following reply doesn't necessarily reflect Jacy's views.)
I think the answers to a lot of these issues are somewhat arbitrary matters of moral intuition. (As you said, "Big part of it seems arbitrary.") However, in a sense, this makes MCE more important rather than less, because it means expanded moral circles are not an inevitable result of better understanding consciousness/etc. For example, Yudkowsky's stance on consciousness is a reasonable one that is not based on a mistaken understanding of present-day neuroscience (as far as I know), yet some feel that Yudkowsky's view about moral patienthood isn't wide enough for their moral tastes.
Another possible reply (that would sound better in a political speech than the previous reply) could be that MCE aims to spark discussion about these hard questions of what kinds of minds matter, without claiming to have all the answers. I personally maintain significant moral uncertainty regarding how much I care about what kinds of minds, and I'm happy to learn about other people's moral intuitions on these things because my own intuitions aren't settled.
... (read more)

@Matthew_Barnett As a senior electrical engineering student, proficient in a variety of programming languages, I do think and believe that AI is important to think about and discuss. The theoretical threat of a malevolent strong AI would be immense. But that does not mean one has cause or a valid reason to support CS grad students financially.
A large, significant, asteroid collision with Earth would also be quite devastating. Yet, to fund and support aerospace grads does not follow. Perhaps I really mean this: AI safety is an Earning to Give non sequitur.... (read more)