Skimmed it and mostly agree, thanks for writing. Especially takeover and which capabilities are needed for that is a crux for me, rather than human-level. Still, one realistically needs a shorthand for communication and AGI/human-level AI is time tested and understood relatively easily. For policy and other more advanced comms, and as more details become available on what capabilities are and aren't important for takeover, making messaging more detailed is a good next step.
High impact startup idea: make a decent carbon emissions model for flights.
Current ones simply use flight emissions which makes direct flights look low-emission. But in reality, some of these flights wouldn't even be there if people could be spread over existing indirect flights more efficiently, which is why they're cheaper too. Emission models should be relative to counterfactual.
The startup can be for-profit. If you're lucky, better models already exist in scientific literature. Ideal for the AI for good-crowd.
My guess is that a few man-years work could have a big carbon emissions impact here.
Great work, thanks a lot for doing this research! As you say, this is still very neglected. Also happy to see you're citing our previous work on the topic. And interesting finding that fear is such a driver! A few questions:
- Could you share which three articles you've used? Perhaps this is in the dissertation, but I didn't have the time to read that in full.
- Since it's only one article per emotion (fear, hope, mixed), perhaps some other article property (other than emotion) could also have led to the difference you find?
- What follow-up research would you recommend?
- Is there anything orgs like ours (Existential Risk Observatory) (or, these days, MIRI, that also focuses on comms) should do differently?
As a side note, we're conducting research right now on where awareness has gone after our first two measurements (that were 7% and 12% in early/mid '23, respectively). We might also look into the existence and dynamics of a tipping point.
Again, great work, hope you'll keep working in the field in the future!
Congratulations on a great prioritization!
Perhaps the research that we (Existential Risk Observatory) and others (e.g. @Nik Samoylov, @KoenSchoen) have done on effectively communicating AI xrisk, could be something to build on. Here's our first paper and three blog posts (the second includes measurement of Eliezer's TIME article effectiveness - its numbers are actually pretty good!). We're currently working on a base rate public awareness update and further research.
Best of luck and we'd love to cooperate!
It's definitely good to think about whether a pause is a good idea. Together with Joep from PauseAI, I wrote down my thoughts on the topic here.
Since then, I have been thinking a bit on the pause and comparing it to a more frequently mentioned option, namely to apply model evaluations (evals) to see how dangerous a model is after training.
I think the difference between the supposedly more reasonable approach of evals and the supposedly more radical approach of a pause is actually smaller than it seems. Evals aim to detect dangerous capabilities. What will need to happen when those evals find that, indeed, a model has developed such capabilities? Then we'll need to implement a pause. Evals or a pause is mostly a choice about timing, not a fundamentally different approach.
With evals, however, we'll move precisely to the brink, look straight into the abyss, and then we plan to halt at the last possible moment. Unfortunately, though, we're in thick mist and we can't see the abyss (this is true even when we apply evals, since we don't know which capabilities will prove existentially dangerous, and since an existential event may already occur before running the evals).
And even if we would know where to halt: we'll need to make sure that the leading labs will practically succeed in pausing themselves (there may be thousands of people working there), that the models aren't getting leaked, that we'll implement the policy that's needed, that we'll sign international agreements, and that we gain support from the general public. This is all difficult work that will realistically take time.
Pausing isn't as simple as pressing a button, it's a social process. No one knowns how long that process of getting everyone on the same page will take, but it could be quite a while. Is it wise to start that process at the last possible moment, namely when the evals turn red? I don't think so. The sooner we start, the higher our chance of survival.
Also, there's a separate point that I think is not sufficiently addressed yet: we don't know how to implement a pause beyond a few years duration. If hardware and algorithms improve, frontier models could democratize. While I believe this problem can be solved by international (peaceful) regulation, I also think this will be hard and we will need good plans (hardware or data regulation proposals) for how to do this in advance. We currently don't have these, so I think working on them should be a much higher priority.
Thanks for the comment. I think the ways an aligned AGI could make the world safer against unaligned AGIs can be divided in two categories: preventing unaligned AGIs from coming into existence or stopping already existing unaligned AGIs from causing extinction. The second is the offense/defense balance. The first is what you point at.
If an AGI would prevent people from creating AI, this would likely be against their will. A state would be the only actor who could do so legally, assuming there is regulation in place, and also most practically. Therefore, I think your option falls under what I described in my post as "Types of AI (hardware) regulation may be possible where the state actors implementing the regulation are aided by aligned AIs". I think this is indeed a realistic option and it may reduce existential risk somewhat. Getting the regulation in place at all, however, seems more important at this point than developing what I see as a pretty far-fetched and - at the moment - intractable way to implement it more effectively.
I sympathize with working on a topic you feel in your stomach. I worked on climate and switched to AI because I couldn't get rid of a terrible feeling about humanity going to pieces without anyone really trying to solve the problem (~4 yrs ago, but I'd say this is still mostly true). If your stomach feeling is in climate instead, or animal welfare, or global poverty, I think there is a case to be made that you should be working in those fields, both because your effectiveness will be higher there and because it's better for your own mental health, which is always important. I wouldn't say this cannot be AI xrisk: I have this feeling about AI xrisk, and I think many eg. PauseAI activists and others do, too.