Disclaimer
Written quickly[1]. It's better to draft my objections poorly than not to draft them at all.
Introduction
I am sceptical of "foom"[2]; I suspect it is some combination of: not physically possible, not feasible, or not economically viable.
[Not sure yet what level of scepticism I endorse.]
I have a few object level beliefs that bear on it. I'll try to express them succinctly below (there's a summary at the end of the post for those pressed for time).
Note that my objections to foom are more disjunctive than they are conjunctive. Each is independently a reason why foom looks less likely to me.
Beliefs
I currently believe/expect the following to a sufficient degree that they inform my position on foom.
Diminishing Marginal Returns
1.0. Marginal returns to cognitive investment (e.g. compute) decay at a superlinear rate (e.g. exponential) across some relevant cognitive domains (e.g. some of near human, human spectrum, superhuman, strongly superhuman).
1.1. Marginal returns to real world capabilities from cognitive amplification likewise decay at a superlinear rate across relevant cognitive domains.
Among humans, +6 SD g factor humans do not in general seem as much more capable than +3 SD g factor humans as +3 SD g factor humans are compared to median humans.
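To make claim 1.0 concrete, here is a minimal sketch of what "superlinearly decaying marginal returns" could look like. It assumes, purely for illustration, that capability grows logarithmically with compute; the specific functional form is my assumption, not something the post establishes.

```python
import math

def capability(compute: float) -> float:
    """Toy model (an assumption, not established fact): capability grows
    logarithmically with compute, so marginal returns decay exponentially."""
    return math.log2(compute)

def marginal_return(compute: float, extra: float) -> float:
    """Capability gained per unit of additional compute."""
    return (capability(compute + extra) - capability(compute)) / extra

# Under this model, each doubling of compute buys the same absolute
# capability gain, so the gain *per unit of compute* halves each time:
for c in (1, 2, 4, 8):
    print(f"compute={c}: marginal return per unit = {marginal_return(c, c):.3f}")
# prints 1.000, 0.500, 0.250, 0.125
```

If returns really do decay like this (or faster), each round of recursive self-improvement buys less than the last, which is the crux of the diminishing-returns objection.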
Broad Human Cognitive Spectrum
2. The human cognitive spectrum (1st percentile human to peak human) is broad in an absolute sense.
On many useful cognitive tasks (chess, theoretical research, invention, mathematics, etc.), beginner/dumb/unskilled humans are closer to a chimpanzee/rock than to peak humans. For some fields, only a small minority of humans are able to perform the task at all, or to perform it in a useful manner[3]; for others, like chess, beginners are simply closer to the lowest attainable ratings than to the ratings obtained by peak humans (600 - 800 is a lot closer to 0 than to 2700 - 2900).
Median humans are probably also closer to a rock than to peak humans (on e.g. inventing general relativity pre 1920).
Peak humans may be closer to bounded superintelligences than beginner/median humans.
E.g. Magnus Carlsen is closer in Elo to Stockfish than to a median human.
I expect Magnus Carlsen to be closer in Elo to a bounded superintelligence than to a median human.
Narrow Optimisers Outperform General Optimisers on Narrow Domains
3.0. I believe that, for similar levels of cognitive investment, narrow optimisers outperform general optimisers on narrow domains.
This is because they are not constrained by the Pareto frontier across many domains and are more able to pursue the optimum in their narrow domain.
I expect this to translate to many narrow domains. I wouldn't be surprised if we get superhuman language performance without "dangerously capable" systems (we got superhuman art without dangerously capable systems). E.g. future LLMs may be able to write very compelling ("bestseller" status) long form fiction in an hour.
I expect a superintelligence to not win against dedicated chess/Go bots with comparable cognitive endowments (compute budgets, comparably efficient cognitive algorithms/architectures).
"Not win" is too conservative: I expect the ASI to lose unless it adopts the strategy of just running the bot (or depending on the level of superhuman, it might be able to force a tie). I simply do not think a general optimiser (no matter how capable) with comparable cognitive endowment can beat a narrow optimiser at their own game. Optimisation across more domains constrains the attainable optimum in any domain; the pareto frontier is an absolute limit.
I wouldn't be surprised if this generalises somewhat beyond Go.
Are narrow AI superhuman real world strategists viable?
The answer is not obviously "no" to me.
3.1. I believe that general intelligence is not compact.
Deployment Expectations and Strategic Conditions
4.0. I expect continuous progress in cognitive capabilities for several years/decades more.
There may be some paradigm shifts/discontinuous jumps, but I expect that the world would have already been radically transformed when superhuman agents arrive.
4.1. I expect it to be very difficult for any single agent to attain decisive cognitive superiority over civilisation, or over a relevant subset of civilisation.
Especially given 3.
Superhuman agents may not be that much more capable than superhuman narrow AI amplified humans.
4.2. Specifically, I expect a multipolar world in which many actors have a suite of superhuman narrow AIs that make them "dangerously capable" relative to 2020s earth, but not relative to their current time (I expect the actors to be in some sort of equilibrium).
I'm not convinced the arrival of superhuman agents in such a world would necessarily shatter such an equilibrium.
Or be unilateral "existentially dangerous" relative to said world.
Hence, I expect failure to materialise as dystopia not extinction.
"Superintelligence" is a High Bar
5. "Superintelligence" requires a "very high" level of strongly superhuman cognitive capabilities
Reasons:
- The arguments above
- Attaining decisive strategic advantage seems difficult.
- E.g. I doubt:
- A +12 SD human could do so during most of human history
- Human intelligence in a chimpanzee body would easily take over a chimpanzee tribe
My intuition is that the level of cognitive power required to achieve absolute strategic dominance is crazily high.
And it's a moving target that would rise with the extant effective level of civilisation.
Summary
Courtesy of ChatGPT:
The author presents several objections to the idea of a rapid, exponential increase in AI capabilities known as an "intelligence explosion" or "foom". The objections include the belief that marginal returns to cognitive investment decay at a superlinear rate, that narrow optimizers outperform general optimizers on narrow domains, and that it will be difficult for a single agent to attain decisive cognitive superiority over civilization. The author also believes that the arrival of superhuman agents in a world with multiple actors possessing superhuman narrow AI will not necessarily shatter the existing equilibrium.
- ^
Half an hour to touch up a stream of consciousness Twitter thread I wrote yesterday.
- ^
An "intelligence explosion" scenario where there's a very short time period where AI systems rapidly grow in intelligence until their cognitive capabilities far exceed humanity's.
- ^
E.g. inventing the dominant paradigm in a hard science seems beyond the ability of most humans. I'm under the impression that pre-1920, fewer than 1,000 (and plausibly fewer than 100) people could have invented general relativity.
Some have claimed that without Einstein we may not have gotten general relativity for decades.
I am skeptical of the FOOM idea too, but I don't think most of this post argues effectively against it. Some responses here:
1.0/1.1 - This seems nonobvious to me. Do you have examples of these superlinear decays? This seems like the best argument of the entire piece if true, and I'd love to see this specific point fleshed out.
2 - An Elo of 0 is not the floor of capability. Elo is a relative ranking of competitors, not an objective measure of chess ability. It doesn't start with "you don't know how to play" at 0: rather, an average chess player who plays enough rated matches to be stably ranked sits around 1200, and ratings go up and down from there based on victories or losses.
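The relativity of the scale follows directly from the standard Elo expected-score formula, which depends only on the *difference* between two ratings. A short sketch (the specific ratings plugged in are rough illustrative figures, not exact values for any player or engine):

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score for player A against player B.
    Only the rating difference matters; absolute values carry no meaning."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

# A 400-point gap gives roughly a 91% expected score,
# regardless of where on the scale the two ratings sit:
print(expected_score(1600, 1200))  # ~0.909
print(expected_score(3000, 2600))  # ~0.909, same gap, same expectation
```

This is why comparing absolute distances to 0 on the Elo scale (as in "600 is closer to 0 than to 2700") doesn't measure distance in underlying capability.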
You also have to compare potential, not just actual skill. A beginner might be close to a chimpanzee in Chess ability, but give me a median human who has never played Chess before and a chimpanzee, then give me a few hours or a few days to train them both, and I predict we will see a very swift capabilities jump for the human relative to the chimp. Similarly, I bet there was a point in its training process where I was a superior Go player to AlphaGo. That didn't last long.
As for the general relativity example - I think you're looking at the wrong measurement here. Large language models often have "emergent" properties where they suddenly become able to do something (e.g., 3-digit multiplication) that they couldn't do before, but if you measured their general maths ability (say, via a 2-digit addition task), you would find it was increasing with scale long before the model reached the level of mathematics required to do 3-digit multiplication correctly even occasionally. "Ability to invent general relativity" is impossible for both a rock and the median human, and yet the median human would still outperform a rock on a cognitive test to measure scientific acumen. If two agents are equally incapable or capable of performing a specific task, that does not make them equal in the underlying domain of that task.
In addition, both Chess and Go are bounded domains. I wouldn't be surprised if Magnus Carlsen was closer in Elo to an unbounded superintelligence than a median human is, because once you've solved chess you've hit an Elo ceiling, and adding more intelligence doesn't help. It is possible that an Elo of 4500+ is legitimately impossible. This probably does not generalise to real-world domains.
3 - I don't see how this acts as an argument against recursive self-improvement. It doesn't matter if an AGI would lose to a specialised Chess bot of the same level. That does not stop it from FOOMing. If recursive self-improvement is possible, the AI just has to reach a certain capability level in each of X domains. This can be achieved eventually even if each X could be individually reached faster by a specialised machine.
4 - This is not an argument against FOOM, this is just a description of what the world might look like if FOOM does not happen and is not possible.
5 - I agree with you that superintelligence requires a very high level of strongly superhuman cognitive abilities. In a world where FOOM-level improvement is possible, these levels will be reached anyway. This is, again, not an argument that argues against FOOM - it seems more an argument of "If FOOM is not possible, we won't get a singleton through ordinary capabilities improvements".
In short - point 1 is a reasonable crux and I agree that if 1.0 and 1.1 were true, this would mean a recursively self-improving AI would quickly hit a ceiling of diminishing returns. I'd like to see a more detailed argument for why this would be expected.
Point 2 appears to have several misconceptions that are pretty fundamental to the understanding of what capabilities are and how they scale.
Points 3-5 do not, to me, seem to argue against the possibility of FOOM at all. They are interesting points, in that they argue against the concept of a singleton in the event that FOOM is not possible, but they don't present an argument in favor of that being the case.
Thanks for the detailed reply.
I'll try and address these objections later.
I think a big problem with FOOM is error propagation. For example, if an initial AI believes the earth is flat with probability ~1, then that belief will form part of the benchmark for evaluating the "next step up" AI. Non-flat-earther AIs will be rejected for their "crazy" (from the AI's perspective) beliefs that the earth is round. Continue this up the chain, and you end up with a "super" AGI with an unwavering belief in flat earth conspiracy theories.
The same would apply to other, less noticeable errors in beliefs and reasoning, including beliefs about how to build better AI. Every attempt at foom is going to be hampered by the initial flaws of the origin AI.
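The dynamic described above can be sketched as a toy model. The setup (a scalar "belief", a fixed truth, a rejection tolerance) is entirely my invented illustration of the commenter's argument, not anything from the post:

```python
TRUTH = 0.0  # the true value of some belief (unused by the parent's evaluation!)

def select_successor(parent_belief: float, candidate_belief: float, tol: float) -> float:
    """The parent evaluates a candidate successor against its OWN belief,
    not against the truth; a candidate that disagrees too much is rejected
    as 'crazy' and the parent's belief carries forward instead."""
    if abs(candidate_belief - parent_belief) <= tol:
        return candidate_belief
    return parent_belief

belief = 1.0  # the origin AI's flawed belief
for _ in range(10):
    # each generation, a perfectly accurate candidate (belief 0.0) is proposed...
    belief = select_successor(belief, 0.0, tol=0.1)
print(belief)  # 1.0 - the accurate candidate is rejected every generation
```

The point of the sketch: because the evaluation benchmark is the parent's own belief rather than the truth, the initial error never decays, no matter how many self-improvement steps occur.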
Natural selection is the ultimate selector. Any meta-system that faces this problem will be continually outcompeted and replaced by meta-systems that either have an architecture that can better mitigate this failure mode, or that have beliefs that are minimally detrimental to their unfolding when this failure mode occurs (in my evaluation, the first is much more likely than the second).
Indeed, better programs will outcompete worse ones, so I expect AI to improve gradually over time. Which is the opposite of foom. I don't quite understand what you mean by a "meta-system that faces this problem". Do you mean the problem of having imperfect beliefs, or not being perfectly rational? That's literally every system!
In this case, "improve gradually over time" could take place over the course of a few days or even a few hours. So it's not actually antithetical to FOOM.
So, for natural selection to function, there has to be a selection pressure towards the outcome in question, and sufficient time for the selection to occur. Can you explain how a flat-earther AI designed to solve mathematical equations would lose that belief in a matter of hours? Where is the selection pressure?