Interested in AI safety talent search and development.
Making and following through on specific concrete plans.
This is a thoughtful post, so it's unfortunate it hasn't gotten much engagement here. Do you have cruxes around the extent to which centralization is favorable or feasible? Small models that can run on a phone or laptop (~50GB) are becoming quite capable, and decentralized training runs already work for 10-billion-parameter models, which are close to that size range. I don't know its exact size, but Gemini Flash 2.0 seems much better than I would have expected a model of that size to be in 2024.
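For a rough sense of why a 10-billion-parameter model sits in the same ballpark as a ~50GB download, here is a back-of-the-envelope sketch; the bytes-per-parameter figures are just the standard precisions, not anything claimed in this thread, and runtime overhead (KV cache, activations) is ignored:

```python
# Rough memory footprint of model weights at common precisions.
def weight_gb(params_billion: float, bytes_per_param: float) -> float:
    # GB of weights = billions of params * bytes per param
    return params_billion * bytes_per_param

for precision, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"10B params at {precision}: ~{weight_gb(10, bytes_per_param):.0f} GB")

# ~40 GB (fp32), ~20 GB (fp16), ~10 GB (int8), ~5 GB (int4):
# comfortably within the ~50 GB laptop-scale range mentioned above.
```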
Interesting. People probably aren't at peak productivity, or even working at all, for some part of those hours, so you could probably cut the effective hours by about a quarter. This narrows the gap between what GPT2030 could achieve in a day and what all humans could achieve together.
Assuming 9 billion people each work 8 hours, that's ~8.22 million years of work per day (72 billion person-hours divided by the ~8,760 hours in a year). But given slowdowns in productivity throughout the day, we might want to round that down to ~6 million years.
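Spelling out the arithmetic behind those figures; the 25% discount is just the rough off-peak adjustment suggested above, not a measured number:

```python
# Back-of-the-envelope: total human work per day, expressed in person-years.
PEOPLE = 9e9               # assumed working population
HOURS_PER_DAY = 8          # assumed hours worked per person per day
HOURS_PER_YEAR = 24 * 365  # 8,760 hours in a calendar year

person_hours_per_day = PEOPLE * HOURS_PER_DAY          # 7.2e10 person-hours
person_years_per_day = person_hours_per_day / HOURS_PER_YEAR
print(f"Raw: ~{person_years_per_day / 1e6:.2f} million person-years/day")   # ~8.22

# Discount ~25% for time spent below peak productivity or not working at all.
adjusted = person_years_per_day * 0.75
print(f"Adjusted: ~{adjusted / 1e6:.1f} million person-years/day")          # ~6.2
```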
Additionally, GPT2030 might be more effective than even the best human workers at their peak hours. If it's 3x as good at learning as a PhD student, which it might be thanks to better retention and a better ability to draw connections, it would be learning more each day than all PhD students in the world combined. The quality of its work might be 100x or 1,000x better, which is difficult to compare in the abstract. For some tasks, like clearing rubble, more work time translates fairly directly into more output, so extra hours can easily make up the difference in outcomes.
With things like scientific breakthroughs, though, more time might not translate into proportionally more breakthroughs. From that perspective, GPT2030 might still end up doing more work than all of humanity, since huge breakthroughs are uncommon anyway.
Interesting post - I particularly appreciated the part about how Szilard's silence didn't really affect Germany's technological development. This was recently cited in Leopold Aschenbrenner's manifesto as an analogy for why secrecy is important, but I guess it wasn't that simple. I wonder how many other analogies, there and elsewhere, don't quite hold. Checking them could be a useful analysis for anyone with the background or interest.
"Something relevant to EAs that I don't focus on in the paper is how to think about the effect of campaigning for a policy given that I focus on the effect of passing one conditional on its being proposed. It turns out there's a method (Cellini et al. 2010) for backing this out if we assume that the effect of passing a referendum on whether the policy is in place later is the same on your first try is the same as on your Nth try. Using this method yields an estimate of the effect of running a successful campaign on later policy of around 60% (Appendix Figure D20).
So it seems like you're saying there are at least two conditions:
1) Someone with enough resources would have to want to release a frontier model with open weights, maybe Meta, or a very large coalition of the open-source community if distributed training continues to scale.
2) It would need enough dangerous-capability mitigations (e.g. unlearning, tamper-resistant weights, or cloud inference monitoring), or be far enough behind the frontier that governments don't try to stop it.
Does that seem right? What do you think is the likely price range for AGI?
I'm not sure the government is moving fast enough, or is even interested in locking down the labs very much, given that doing so might slow the labs down more than it increases their lead, and given that officials may not fully buy the risk arguments for now. I'm also not sure what the key factors to watch here are. I expected reasoning systems next year, but open-weight ones at roughly o1-preview level were released this year, just a few weeks after o1 itself, which suggests multiple parties are pursuing similar lines of AI research somewhat independently.