An aspiration in my life is to make the biggest positive impact in the world that I can. I started working toward this goal in 2018 as a junior paramedic and in 2019 by beginning training as a physiotherapist. My perspective shifted significantly after reading Factfulness by Hans Rosling, which inspired me to explore larger-scale global issues. This led me to pursue an interdisciplinary degree in Global Studies and to discover the research field and social community of Effective Altruism.
Since 2022, I’ve been actively involved in projects ranging from founding a local EA university group to launching an AI safety field-building organization. Through these experiences and the completion of my bachelor's programme, I discovered that my strengths seem to align best with AI governance research, a field I believe is fundamental to ensuring the responsible development of artificial intelligence.
Moving forward, my goal is to deepen my expertise in AI governance as a researcher and contribute to projects that advance this critical area. I am excited to connect with like-minded professionals and explore opportunities that allow me to make a meaningful impact.
I enjoyed reading your insightful reply! Thanks for sharing, Guillaume. You don’t make any arguments I strongly disagree with, and you’ve added many thoughtful suggestions with caveats. The distinction you make between the two sub-questions is useful.
I am curious, though, about what makes you view capacity building (CB) in a more positive light compared to other interventions within AI safety. As you point out, CB also has the potential to backfire. I would even argue that the downside risk of CB might be higher than that of other interventions because it increases the number of people taking the issue seriously and taking proactive action—often with limited information.
For example, while I admire many of the people working at PauseAI, I believe there are quite a few worlds in which those initially involved in setting up the group have had a net-negative impact in expectation. Even early on, there were indications that some people were okay with using violence or radical methods to stop AI (which the organizers then banned). However, what happens if these tendencies resurface when “shit hits the fan”? To push back on my own thinking, it still might be a good idea to work on PauseAI because of the community diversification argument within AI safety (see footnote two).
I agree that other forms of CB, such as MATS, seem more robust. But even here, I can always find compelling arguments for why I should be clueless about the expected value. For instance, an increased number of AI safety researchers working on solving an alignment problem that might ultimately be unsolvable could create a false sense of security.
Thank you for flagging this. I agree, and I had actually tried to remove the tag before you mentioned it. I've attempted this multiple times already (when I wanted to replace it with a better tag), but it isn't working ...
Just before I posted it, I selected the tag because it stated: "The community topic covers posts about the effective altruism community, as well as applying EA in one's personal life". But the subsequent points didn't match too well, and I didn't think that it would be shown in a separate section.
@Toby Tremlett🔹 or @Sarah Cheng, could you please remove the tag?
Mo, thank you for chiming in. Yes, you understood the key point, and you summarised it very well! In my reply to Jan, I expanded on your point about why I think calculating the expected value is not possible for AI safety. Feel free to check it out.
I am curious, though: do you disagree with the idea that a worldview diversification approach at an individual level is the preferred strategy? You understood my point, but how true do you think it is?
Hi Jan, I appreciate the kind words and the engagement!
You correctly summarized the two main arguments. I will start my answer by making sure we are on the same page regarding what expected value is.
Here is the formula I am using:
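Spelling it out under a simple two-outcome model (where p is the probability of the good outcome +V, and 1 − p is the probability of the bad outcome −W):

$$\text{EV} = p \cdot (+V) + (1 - p) \cdot (-W)$$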
As EAs, we are trying to maximize EV.
Given that I believe we are extremely morally uncertain about most causes, here is the problem I have encountered: Even if we could reliably estimate the probability of how good or bad a certain outcome is, and even how large +V and −W are, we still don’t know how to evaluate the overall intervention due to moral uncertainty.
For example, while the Risk-Averse Welfarist Consequentialism worldview would “aim to increase the welfare of individuals, human or otherwise, without taking long-shot bets or risking causing bad outcomes,” the Total Welfarist Consequentialism worldview would aim to maximize the welfare of all individuals, present or future, human or otherwise.
In other words, the two worldviews would interpret the formula (even if we perfectly knew the value of each variable) quite differently.
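As a toy illustration of that divergence (the numbers, and the specific risk-weighting used for the risk-averse view, are made up purely for the example, not a claim about how either view is actually formalised):

```python
# Toy illustration: the same p, +V, and -W evaluated under two worldviews.
# The factor-of-2 downside penalty is an arbitrary stand-in for risk aversion.

p, V, W = 0.5, 100, 80  # hypothetical probability, value of the good outcome, disvalue of the bad outcome

# Total Welfarist Consequentialism: plain expected value.
ev_total = p * V + (1 - p) * (-W)

# Risk-Averse Welfarist Consequentialism: weight the bad outcome more heavily.
ev_risk_averse = p * V + (1 - p) * (-W) * 2

print(ev_total)        # 10.0  -> under this view the intervention looks worth pursuing
print(ev_risk_averse)  # -30.0 -> under this view it looks not worth pursuing
```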
Which one is true? I don’t know. And this does not even take into account other worldviews that exist, such as Egalitarianism, Nietzscheanism, or Kantianism.
To make matters worse, we are not only morally uncertain, but I am also saying that we can’t reliably estimate the probability of how likely a certain +V or −W is to come into existence through the objectives of AI safety. This is linked to my discussion with Jim about determinate credences (since I didn’t initially understand this concept well, ChatGPT gave me a useful explanation).
I think complex cluelessness leads to the expected value formula breaking down (for AI safety and most other longtermist causes) because we simply don’t know what p is, and I, at least, don’t have a determinate credence higher than 50%.
Even if we were to place a bet on a certain worldview (like Total Welfarist Consequentialism), this wouldn’t solve the problem of complex cluelessness or our inability to determine p in the context of AI safety (in my opinion).
This suggests that this cause shouldn’t be given any weight in my portfolio and implies that even specializing in AI safety on a community level doesn’t make sense.
Having said that, there are quite likely some causes where we can have a determinate credence for p well above 50% in the EV formula. And in these cases, a worldview diversification strategy seems to be the best option. This requires making a bet on one worldview or a set of worldviews, though. Otherwise, we can’t interpret the EV formula, and doing good might not be possible.
Here is an (imperfect) example of how this might look and why a WDS could be the best strategy to pursue. The short answer to why WDS is preferred appears to be that, given moral uncertainty, we don't need to choose one out of 10 different worldviews and hope that it is correct; instead, we can diversify across them. Hence, this seems to be a much more robust approach to doing good.
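To make the mechanics a bit more concrete, here is a toy sketch of diversifying at the individual level (the worldviews, causes, credences, and scores are all invented for illustration):

```python
# Toy sketch of worldview diversification: instead of betting everything on the
# single cause favoured by one worldview, allocate effort in proportion to
# credence-weighted scores across worldviews. All numbers are made up.

credences = {"total_welfarist": 0.4, "risk_averse_welfarist": 0.4, "egalitarian": 0.2}

# How promising each cause looks (on some common scale) under each worldview.
cause_scores = {
    "global_health":  {"total_welfarist": 5, "risk_averse_welfarist": 8, "egalitarian": 7},
    "animal_welfare": {"total_welfarist": 7, "risk_averse_welfarist": 6, "egalitarian": 4},
}

weighted = {
    cause: sum(credences[w] * score for w, score in scores.items())
    for cause, scores in cause_scores.items()
}
total = sum(weighted.values())
allocation = {cause: round(v / total, 2) for cause, v in weighted.items()}
print(allocation)  # {'global_health': 0.52, 'animal_welfare': 0.48}
```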
What do you make of this?
One of the reasons I wrote this post was to reflect on excellent comments like yours. Thank you for posting and spotting this inconsistency!
You rightly point out that I jump between i) and ii). The short answer is that, at least for AI safety, I feel clueless or agnostic about whether this cause is positive in expectation. @Mo Putera summarised this nicely in their comment. I am happy to expand on the reasons as to why I think that.
What is your perspective here? If you do have a determinate credence above 50% for AI safety work, how do you arrive at that conclusion? I know you have also been doing some in-depth thinking on the topic of cluelessness.
Next, I want to push back on your claim that if ii) is correct, everything collapses. I agree that this would lead to the conclusion that we are probably entirely clueless about longtermist causes, and probably about the vast majority of causes in the world. However, it would make me lean toward near-term areas with much shorter causal chains, where there is a smaller margin of error: for example, caring for your family or local animals, which carries a low risk of backfiring.
Although, to be fair, this is unclear as well if one is also clueless about different moral frameworks. For example, helping a young child who fell off their skateboard might seem altruistic but could inadvertently increase their ambition, leading them to become the next Hitler or a power-seeking tech CEO. And to take this to the next level: not taking an action also has downsides (e.g., not addressing the ongoing suffering in the world). Yaay!
If conclusion ii) is correct for all causes, altruism would indeed seem not possible from a consequentialist perspective. I don’t have a counterargument at the moment.
I would love to hear your thoughts on this!
Thank you for engaging :)
Thank you for your reply!
Summary: My main intention in my previous comment was to share my perspective on why relying too much on the outside view is problematic (and, to be fair, that wasn’t clear because I addressed multiple points). While I think your calculations and explanation are solid, the general intuition I want to share is that people should place less weight on the outside view, as this article seems to suggest.
I wrote this fairly quickly, so I apologize if my response is not entirely coherent.
Emphasizing the definition of unemployment you use is helpful, and I mostly agree with your model of total AI automation, where no one is necessarily looking for a job.
Regarding your question about my estimate of the median annual unemployment rate: I haven’t thought deeply enough about unemployment to place a bet or form a strong opinion on the exact percentage points. Thanks for the offer, though.
To illustrate the main point in my summary, I want to share a basic reasoning process I'm using.
Assumptions:
Worldview implications of my assumptions:
To articulate my intuition as clearly as possible: the lack of action we’re currently seeing from various stakeholders in addressing the advancement of frontier AI systems seems to be, in part, because they rely too heavily on the outside view for decision-making. While this doesn’t address the crux of your post (though it is what prompted me to write my comment initially), I believe it’s dangerous to place significant weight on an approach that attempts to make sense of developments for which we have no clear reference classes. AGI hasn’t happened yet, so I don’t understand why we should lean heavily on historical data to assess such a novel development.
What’s currently happening is that people are essentially throwing their hands up and saying, “Uh, the probabilities are so low for X or Y impact of AGI, so let’s just trust the process.” If people placed more weight on assumptions like those above, or reasoned more from first principles, the situation might look very different. Do you see what I mean? My issue is with putting too much weight on the outside view, not with your object-level claims.
I am open to changing my mind on this.
Thank you for the post! How much weight do you think one should allocate to the inside and outside view, respectively, in order to develop a comprehensive estimate of the potential future unemployment rate?
Your calculations look fancy and all of that, but it seems inappropriate to me to put as much weight on historical data as you are doing, especially because this ignores the fact that the development of intelligent systems more capable than humans has never occurred in history. This fundamentally changes the game.
The more the world changes, the less weight I think one should put on the outside view (this claim needs more nuance). People are scared, people don't update in the face of new evidence, and people dislike change.
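To make the weighting question concrete, here is a toy blend of the two views (these numbers are purely illustrative and are not my actual estimates):

```python
# Toy blend of an outside-view (historical base rate) and an inside-view
# estimate of peak annual unemployment under advanced AI. Numbers are made up.
outside_view = 0.10  # roughly in line with historical downturns
inside_view = 0.60   # a first-principles guess under broad automation

for w_inside in (0.1, 0.5, 0.9):  # weight placed on the inside view
    blended = w_inside * inside_view + (1 - w_inside) * outside_view
    print(f"inside-view weight {w_inside:.1f} -> blended estimate {blended:.0%}")
# 0.1 -> 15%, 0.5 -> 35%, 0.9 -> 55%
```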
I know you are not saying that the inside view doesn't matter, but I am concerned that a post like this anchors people toward a base rate that is a lot lower than what things will actually be like. It reinforces status quo bias. And this is frustrating to me because so many people don't seem to understand the seriousness of our situation.
I think it makes a lot of sense to reason bottom-up when thinking about topics like these, and I actually disagree with you a lot. It seems to me that there is a deeply correlated failure happening in the AI safety community: in my view, people are putting way too much weight on the outside view. I am happy to elaborate.
Thank you for sparking this discussion.
Would you consider adding your ideas for 2 minutes? - Creating a comprehensive overview of AI x-risk reduction strategies
------
Motivation: To identify the highest impact strategies for reducing the existential risk from AI, it’s important to know what options are available in the first place.
I’ve just started creating an overview and would love for you to take a moment to contribute and build on it with the rest of us!
Here is the work page: https://workflowy.com/s/making-sense-of-ai-x/NR0a6o7H79CQpLYw
Some thoughts on how we collaborate:
I agree with your reasoning, and the way you’ve articulated it is very compelling to me! It seems that the bar this evidence would need to reach is, quite literally, impossible.
I would even take this further and argue that your chain of reasoning could be applied to most causes (perhaps even all?), which seems valid.
Would you disagree with this?
Your reply also raises a broader question for me: What criteria must an intervention meet for our determinate credence that its expected value is positive to exceed 50%, thereby justifying work on it?