Do you have any material on this? It sounds plausible to me but I couldn't find anything with a quick search.
Nope, it's just an unsubstantiated guess based on seeing what small teams can build today vs 30 years ago. Also based on the massive improvement in open-source libraries and tooling compared to then. Today's developers can work faster at higher levels of abstraction compared to folks back then.
In this world we have AIs that cheaply automate half of work. That seems like it would have immense economic value and promise, enough to inspire massive new investments in AI companies....
Ah, I think we have a crux here. I think that, if you could hire -- for the same price as a human -- a human-level AGI, that would indeed change things a lot. I'd reckon the AGI would have a 3-4x productivity boost from being able to work 24/7, and would be perfectly obedient, wouldn't be limited to working in a single field, could more easily transfer knowledge to other AIs, could be backed up and/or replicated, wouldn't need an office or a fun work environment, could be "hired" or "fired" ~instantly without difficulty, etc.
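A quick back-of-the-envelope on the 24/7 factor alone (the 40-hour human work week below is my assumed baseline, not a figure from the thread):

```python
# Rough uptime multiplier from round-the-clock work alone
# (assumes a 40-hour human work week as the baseline).
agi_hours_per_week = 24 * 7          # 168
human_hours_per_week = 40            # assumption; pick your own baseline

uptime_multiplier = agi_hours_per_week / human_hours_per_week
print(round(uptime_multiplier, 1))   # 4.2 -- in the ballpark of 3-4x once you allow for overhead
```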
That feels somehow beside the point, though. I think in any such scenario, there's also going to be very cheap AIs with sub-human intelligence that would have broad economic impact too.
Absolutely agree. AI and AGI will likely provide immense economic value even before the threshold of transformative AGI is crossed.
Still, supposing that AI research today is:
...then even a 4x labor productivity boost may not be all that path-breaking when you zoom out enough. Things will speed up, surely, but they might not create transformative AGI overnight. Even AGI researchers will need time and compute to do their experiments.
Let me replay my understanding to you, to see if I understand. You are predicting that...
IF:
THEN:
WHERE:
ASSUMING:
Is this a correct restatement of your prediction?
And what are your confidence levels for this resulting in AGI on the first try? Within ten tries? Within a year of trial and error? Within a decade of trial and error?
(Rounding to the nearest tenth of a percent, I personally am 0.0% confident we'd get AGI on our first try with a system like this, even with 10^50 FLOPS.)
Confidence intervals over probabilities don’t make much sense to me. The probability itself is already the confidence interval over the binary domain [event happens, event doesn’t happen].
I guess to me the idea of confidence intervals over probabilities implies two different kinds of probabilities. E.g., a reducible flavor and an irreducible flavor. I don’t see what a two-tiered system of probability adds, exactly.
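One way to make that collapse concrete (the Beta(2, 8) below is just an arbitrary stand-in for a "distribution over the probability", not anything from this thread):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary stand-in for a "confidence interval over the probability":
# a Beta(2, 8) distribution, whose mean is 0.2.
p_samples = rng.beta(2.0, 8.0, size=1_000_000)

# For a single binary event, the predictive probability is just the mean of
# that distribution (law of total probability):
print(p_samples.mean())   # ~0.2

# A plain point estimate of 0.2 makes the identical prediction for that one
# event; the spread only starts to matter once you update on repeated,
# related outcomes.
```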
No, it's not just extrapolating base rates (that would be a big blunder). We assume that the development of proto-AGI or AGI will rapidly accelerate progress and investment, and our conditional forecasts are much more optimistic about progress than they would be otherwise.
However, it's totally fair to disagree with us on the degree of that acceleration. Even with superhuman AGI, for example, I don't think we're moving away from semiconductor transistors in less than 15 years. Of course, it really depends on how superhuman this superhuman intelligence would be. We discuss this more in the essay.
despite current models learning vastly faster than humans (training time of LLMs is not a human lifetime, and covers vastly more data)
Some models learning some things faster than humans does not imply AGI will learn all things faster than humans. Self-driving cars, for example, are taking much longer to learn to drive than teenagers do.
Agree that:
We didn't explicitly forecast 2053 in the paper, just 2043 (0.4%) and 2100 (41%). If I had to guess without much thought I might go with 3%. It's a huge advantage to get 10 extra years to build fabs, make algorithms efficient, collect vast training sets, train from slow/expensive real-world feedback, and recover from rare setbacks.
My mental model is some kind of S curve where progress in the short term is extremely unlikely, progress in the medium term is more likely, and after a while, the longer it takes to happen, the less likely it is to happen in any given year, as that suggests that some ingredient is still missing and hard to get.
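A toy numerical version of that shape (the logistic form and every number below are placeholders of mine, not forecasts from the essay):

```python
import numpy as np

# Cumulative probability of transformative AGI by year, modeled as an S-curve
# that asymptotes below 100% to reflect worlds where a key ingredient is
# missing. All parameters are illustrative placeholders.
years = np.arange(2024, 2101)
ceiling = 0.6                                     # assumed chance the ingredient exists at all
cdf = ceiling / (1 + np.exp(-(years - 2050) / 8))

# Implied per-year hazard: chance it happens this year, given it hasn't yet.
pdf = np.gradient(cdf, years)
hazard = pdf / (1 - cdf)

# The hazard rises through the medium term, peaks, then falls: the longer we
# go without AGI, the more weight shifts to "something is still missing."
for y in (2030, 2043, 2060, 2090):
    i = int(y - years[0])
    print(y, round(float(cdf[i]), 3), round(float(hazard[i]), 4))
```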
I think you may be right that twenty years is before the S of my S curve really kicks in. Twenty just feels so short with everything that needs to be solved and scaled. I'm much more open-minded about forty.
The end-to-end training run is not what makes learning slow. It's the iterative reinforcement learning process of deploying in an environment, gathering data, training on that data, and then redeploying with a new data collection strategy, etc. It's a mistake, I think, to focus only on the narrow task of updating model weights and omit the critical task of iterative data collection (i.e., reinforcement learning).
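To make that distinction concrete, here's the shape of the loop I mean, sketched with placeholder stubs (none of this is anyone's actual pipeline):

```python
# The point of the sketch: the weight update is the fast inner step; the wall
# clock is dominated by repeatedly deploying and collecting data.

def deploy_and_collect(model, strategy):
    """Run the current model in the world under a data-collection strategy."""
    # In reality this runs at the speed of the environment (users, robots,
    # markets), often weeks or months per cycle.
    return []

def update_weights(model, batch):
    """The 'end-to-end training run': comparatively cheap and fast."""
    return model

def revise_strategy(strategy, model, batch):
    """Decide what data to gather next, based on what the model got wrong."""
    return strategy

model, strategy = "initial-model", "collect-everything"
for cycle in range(5):
    batch = deploy_and_collect(model, strategy)         # slow, real-world step
    model = update_weights(model, batch)                # fast step
    strategy = revise_strategy(strategy, model, batch)  # the step you can't skip
```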