(Posting in a personal capacity unless stated otherwise.) I help allocate Open Phil's resources to improve the governance of AI with a focus on avoiding catastrophic outcomes. Formerly co-founder of the Cambridge Boston Alignment Initiative, which supports AI alignment/safety research and outreach programs at Harvard, MIT, and beyond, co-president of Harvard EA, Director of Governance Programs at the Harvard AI Safety Team and MIT AI Alignment, and occasional AI governance researcher. I'm also a proud GWWC pledger and vegan.
Thanks for running this survey. I find these results extremely implausibly bearish on public policy -- I do not think we should be even close to indifferent between improving the AI policy of the country that can make binding rules on all of the leading labs plus many key hardware inputs and has a $6 trillion budget and the most powerful military on earth by 5% and having $8.1 million more for a good grantmaker, or having 32.5 "good video explainers," or having 13 technical AI academics. I'm biased, of course, but IMO the surveyed population is massively overrating the importance of the alignment community relative to the US government.
Fwiw, I think the main thing getting missed in this discourse is that even if only 3 out of your 50 speakers (especially if they're near the top of the bill) are mostly known for a cluster of edgy views that are not welcome in most similar spaces, people who really want to gather to discuss those edgy and typically unwelcome views will be a seriously disproportionate share of attendees, and this will have significant repercussions for the experience of the attendees who were primarily interested in the other 47 speakers.
I recommend the China sections of this recent CNAS report as a starting point for discussion (it's definitely from a relatively hawkish perspective, and I don't think of myself as having enough expertise to endorse it, but I did move in this direction after reading).
From the executive summary:
Taken together, perhaps the most underappreciated feature of emerging catastrophic AI risks from this exploration is the outsized likelihood of AI catastrophes originating from China. There, a combination of the Chinese Communist Party’s efforts to accelerate AI development, its track record of authoritarian crisis mismanagement, and its censorship of information on accidents all make catastrophic risks related to AI more acute.
From the "Deficient Safety Cultures" section:
While such an analysis is of relevance in a range of industry- and application-specific cultures, China’s AI sector is particularly worthy of attention and uniquely predisposed to exacerbate catastrophic AI risks [footnote]. China’s funding incentives around scientific and technological advancement generally lend themselves to risky approaches to new technologies, and AI leaders in China have long prided themselves on their government’s large appetite for risk—even if there are more recent signs of some budding AI safety consciousness in the country [footnote, footnote, footnote]. China’s society is the most optimistic in the world on the benefits and risks of AI technology, according to a 2022 survey by the multinational market research firm Institut Public de Sondage d’Opinion Secteur (Ipsos), despite the nation’s history of grisly industrial accidents and mismanaged crises—not least its handling of COVID-19 [footnote, footnote, footnote, footnote]. The government’s sprint to lead the world in AI by 2030 has unnerving resonances with prior grand, government-led attempts to accelerate industries that have ended in tragedy, as in the Great Leap Forward, the commercial satellite launch industry, and a variety of Belt and Road infrastructure projects [footnote, footnote, footnote]. China’s recent track record in other high-tech sectors, including space and biotech, also suggests a much greater likelihood of catastrophic outcomes [footnote, footnote, footnote, footnote, footnote].
From the "Further Considerations" section:
In addition to having to grapple with all the same safety challenges that other AI ecosystems must address, China’s broader tech culture is prone to crisis due to its government’s chronic mismanagement of disasters, censorship of information on accidents, and heavy-handed efforts to force technological breakthroughs. In AI, these dynamics are even more pronounced, buoyed by remarkably optimistic public perceptions of the technology and Beijing’s gigantic strategic gamble on boosting its AI sector to international preeminence. And while both the United States and China must reckon with the safety challenges that emerge from interstate technology competitions, historically, nations that perceive themselves to be slightly behind competitors are willing to absorb the greatest risks to catch up in tech races [footnote]. Thus, even while the United States’ AI edge over China may be a strategic advantage, Beijing’s self-perceived disadvantage could nonetheless exacerbate the overall risks of an AI catastrophe.
Yes, but it's kind of incoherent to talk about the dollar value of something without having a budget and an opportunity cost; it has to be your willingness-to-pay, not some dollar value in the abstract. Like, it's not the case that the EA funding community would pay $500B even for huge wins like malaria eradication, end to factory farming, robust AI alignment solution, etc, because it's impossible: we don't have $500B.
And I haven't thought about this much but it seems like we also wouldn't pay, say, $500M for a 1-in-1000 chance for a "$500B win" because unless you're defining "$500B win" with respect to your actual willingness-to-pay, you might wind up with many opportunities to take these kinds of moonshots and quickly run out of money. The dollar size of the win still has to ultimately account for your budget.
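To make the moonshot point concrete, here is a rough sketch of the arithmetic. The $500M price and 1-in-1000 odds come from the comment above; the $5B budget is a purely hypothetical number chosen for illustration, not a claim about any actual funder.

```python
# Sketch: how fast "fairly priced" moonshots exhaust a fixed budget.
# PRICE and P_WIN come from the comment; HYPOTHETICAL_BUDGET is an assumption.

PRICE = 500_000_000                   # $500M per moonshot
P_WIN = 1 / 1000                      # 1-in-1000 chance of the "$500B win"
HYPOTHETICAL_BUDGET = 5_000_000_000   # assumed $5B total budget (illustrative only)


def bets_affordable(budget, price):
    """Number of whole moonshots the budget can fund."""
    return int(budget // price)


def p_at_least_one_win(n_bets, p_win):
    """Chance that at least one of n independent bets pays off."""
    return 1 - (1 - p_win) ** n_bets


n = bets_affordable(HYPOTHETICAL_BUDGET, PRICE)
print(f"bets affordable: {n}")
print(f"P(at least one win): {p_at_least_one_win(n, P_WIN):.2%}")
```

Under these assumed numbers the whole budget buys ten bets and about a 1% chance of any win at all, which is the sense in which the dollar size of the win has to be defined relative to what you can actually spend.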
Well, it implies you could change the election with those amounts if you knew exactly how close the election would be in each state and spent optimally. But if you figure the estimates are off by an OOM, and half of your spending goes to states that turn out not to be useful (which matches a ~30 min analysis I did a few months ago), and you have significant diminishing returns such that $10M-$100M is 3x less impactful than the first $10M and $100M-$1B is another 10x less impactful, you still get:
I think if you think there's a major difference between the candidates, you might put a value on the election in the billions -- let's say $10B for the sake of calculation; so the first $10M would be worth it if there's a 0.1% chance the election is decided by <1000 votes (which of course happened 6 elections ago!), the next $90M is worth it if there's a 0.9% chance the election is decided by >1000 but <4000 votes, and the next $900M is worth it if there's a 9% chance the election is decided by >4000 but <14000 votes. IMO the first two probably pass and the last one probably doesn't, but idk.
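The break-even probabilities above are just each tranche's cost divided by the assumed value of the election. A minimal sketch of that calculation, using the $10B value and the three tranches from the comment (the function and variable names are my own, for illustration only):

```python
# Sketch: break-even probability per spending tranche = cost / value of outcome.
# The $10B value and the tranche sizes and vote bands are the comment's numbers.

ELECTION_VALUE = 10_000_000_000  # $10B, assumed "for the sake of calculation"

# (label, tranche cost in dollars, vote-margin band the tranche could flip)
TRANCHES = [
    ("first $10M", 10_000_000, "<1000"),
    ("next $90M", 90_000_000, "1000-4000"),
    ("next $900M", 900_000_000, "4000-14000"),
]


def break_even_probability(cost, value=ELECTION_VALUE):
    """Minimum chance of being decisive for the spending to be worth it."""
    return cost / value


def break_even_table():
    """Return (label, break-even probability, vote band) for each tranche."""
    return [(label, break_even_probability(cost), band)
            for label, cost, band in TRANCHES]


for label, p, band in break_even_table():
    print(f"{label}: worth it if P(margin {band} votes) >= {p:.1%}")
```

This only reproduces the thresholds; whether the real probabilities clear them is the judgment call in the comment.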
It seems like you might be under-weighing the cumulative amount of resources - even if you have some pretty heavy decay rate (which it's unclear you should -- usually we think of philanthropic investments compounding over time), avoiding nuclear war was a top global priority for decades, and it feels like we have a lot of intellectual and policy "legacy infrastructure" from that.
I think some of the AI safety policy community has over-indexed on the visual model of the "Overton Window" and under-indexed on alternatives like the "ratchet effect," "poisoning the well," "clown attacks," and other models where proposing radical changes can make you, your allies, and your ideas look unreasonable.
I'm not familiar with a lot of systematic empirical evidence on either side, but it seems to me like the more effective actors in the DC establishment overall are much more in the habit of looking for small wins that are both good in themselves and shrink the size of the ask for their ideal policy than of pushing for their ideal vision and then making concessions. Possibly an ideal ecosystem has both strategies, but it seems possible that at least some versions of "Overton Window-moving" strategies executed in practice have larger negative effects via associating their "side" with unreasonable-sounding ideas in the minds of very bandwidth-constrained policymakers, who strongly lean on signals of credibility and consensus when quickly evaluating policy options, than the positive effects of increasing the odds of ideal policy and improving the framing for non-ideal but pretty good policies.
In theory, the Overton Window model is just a description of what ideas are taken seriously, so it can indeed accommodate backfire effects where you argue for an idea "outside the window" and this actually makes the window narrower. But I think the visual imagery of "windows" actually struggles to accommodate this -- when was the last time you tried to open a window and accidentally closed it instead? -- and as a result, people who rely on this model are more likely to underrate these kinds of consequences.
Would be interested in empirical evidence on this question (ideally actual studies from psych, political science, sociology, econ, etc literatures, rather than specific case studies due to reference class tennis type issues).
Yes, some regulations backfire, and this is a good flag to keep in mind when designing policy, but to actually make the reference-class argument here work, you'd have to show that this is what we should expect from AI policy, which would include showing that failures like NEPA are either much more relevant for the AI case or more numerous than other, more successful regulations, like (in my opinion) the Clean Air Act, Sarbanes-Oxley, bans on CFCs or leaded gasoline, etc. I know it's not quite as simple as "I would simply design good regulations instead of bad ones," but it's also not as simple as "some regulations are really counterproductive, so you shouldn't advocate for any." Among other things, this assumes that nobody else will be pushing for really counterproductive regulations!
I hope to eventually/maybe soon write a longer post about this, but I feel pretty strongly that people underrate specialization at the personal level, even as there are lots of benefits to pluralization at the movement level and large-funder level. There are just really high returns to being at the frontier of a field. You can be epistemically modest about what cause or particular opportunity is the best, not burn bridges, etc, while still "making your bet" and specializing; in the limit, it seems really unlikely that e.g. having two 20 hr/wk jobs in different causes is a better path to impact than a single 40 hr/wk job.
I think this applies to individual donations as well; if you work in a field, you are a much better judge of giving opportunities in that field than if you don't, and you're more likely to come across such opportunities in the first place. I think this is a chronically underrated argument when it comes to allocating personal donations.