Aaron_Scher

560 karmaJoined Claremont, CA, USA

Bio

I'm Aaron, I've done Uni group organizing at the Claremont Colleges for a bit. Current cause prioritization is AI Alignment.

Comments
92

Thanks for writing this. I agree that this makes me nervous. Various thoughts:

I think I’ve slowly come to believe something like, ‘sufficiently smart people can convince themselves that arbitrary morally bad things are actually good’. See, e.g., the gymnastics meme, but there’s also something deeper, like ‘many of the evil people throughout history have believed that what they’re doing is good actually’. I think the response to this should be deep humility and moral risk aversion. Having a big brain argument that sounds good to you about why what you’re doing is good is actually extremely weak evidence about the goodness of the thing. I think it would probably be better if EAs took this more seriously and didn’t do things like starting an AGI company or starting an AGI hedge fund. An AGI hedge fund seems even worse than Anthropic (where I think the argument for cutting edge research is medium brained and at least somewhat true empirically). The reasons Chana lists for why a hedge fund could be a good idea all seem fairly weak; they would be stronger if Leopold were saying these were part of the plan.

The unilateralist nature and relationship to race dynamics also worries me. Maybe there would have been AGI hedge funds anyway, and maybe there would have been lengthy blog posts that tell the USG and China that they should be in a massive race on AI, but those things sure weren’t being done before Leopold did them.

I don’t think I have strong reasons to actively trust Leopold. I don’t know him and I think my baseline trust isn’t super high nowadays. By “trust” I mean some combination of being of good character, having correct judgment, and having good epistemic practices to make up for poor judgment. Choosing to lose OpenAI equity is a positive sign, but I’m not sure how big. So this cashes out in not making much of an update on the value of an AGI hedge fund, something that seems initially medium bad.

I think it’s sus to write up a blog post telling people AGI is coming soon while starting an investment firm that will benefit from people thinking AGI is coming soon. This is clearly a case of conflicting interests. It’s not necessarily a bad thing — there are good arguments around putting your money where your mouth is and taking actions based on big if true ideas, but it is a warning flag.

I could imagine a normal person reading Situational Awareness, including the part about Superalignment, and then hearing that the author is starting an AGI hedge fund, and their response being “WTF?! You believe all this about the intelligence explosion and how there are critical safety problems we’re not on track to solve, and you’re starting a hedge fund?” This response makes a lot of sense to me (and I do think I’ve heard it somewhere, though I’m not sure where). I think ‘starting an AGI hedge fund’ is really low on the list of things somebody who cares a lot about superintelligence safety should be doing. So either I’m misunderstanding something, or this is an update that Leopold isn’t as serious about ASI safety as I thought.

I have yet to see any replies from Leopold to people commenting on or responding to Situational Awareness. This seems like bad form for truth seeking and getting buy-in from EAs, but it may be the norm for general intellectual content.

The paper that introduces the test is probably what you're looking for. Based on a skim, it seems to me that it spends a lot of words laying out the conceptual background that would make this test valuable. Obviously it's heavily selected for making the overall argument that the test is good. 

Elaborating on point 1 and the "misinformation is only a small part of why the system is broken" idea: 

The current system could be broken in many ways but at some equilibrium of sorts. Upsetting this equilibrium could have substantial effects because, for instance, people's built-up immune response to current misinformation is not as well trained as their built-up immune response to traditionally biased media.

Additionally, intervening on misinformation could be far more tractable than other methods of improving things. I don't have a solid grasp of what the problem is and what makes it worse, but a number of potential causes do seem much harder to intervene on than misinformation: general ignorance, poor education, political apathy. It can be the case that misinformation makes the situation merely 5% worse but is substantially easier to fix than these other issues.

I appreciate you writing this, it seems like a good and important post. I'm not sure how compelling I find it, however. Some scattered thoughts:

  • In point 1, it seems like the takeaway is "democracy is broken because most voters don't care about factual accuracy, don't follow the news, and elections are not a good system for deciding things; because so little about elections depends on voters getting reliable information, misinformation can't make things much worse". You don't actually say this, but this appears to me to be the central thrust of your argument — to the extent modern political systems are broken, it is not in ways that are easily exacerbated by misinformation. 
  • Point 3 seems to be mainly relevant to mainstream media, but I think the worries about misinformation typically focus on non-mainstream media. In particular, when people say they "saw X on Facebook", they're not basing their information diet on trustworthiness and reputation. You write, "As noted above (#1), the overwhelming majority of citizens get their political information from establishment sources (if they bother to get such information at all)." I'm not sure what exactly you're referencing here, but it looks to me like people are getting news from social media about ⅔ as much as from news websites/apps (see "News consumption across digital platforms"). This is still a lot of social media news, which should not be discounted. 
  • I don't think I find point 4 compelling. I expect the establishment to have access to slightly, but not massively better AI. But more importantly, I don't see how this helps? If it's easy to make pro-vaccine propaganda but hard to make anti-vax propaganda, I don't see how this is a good situation? It's not clear that propaganda counteracts other propaganda in an efficient way such that those with better AI propaganda will win out (e.g., insularity and people mostly seeing content that aligns with their beliefs might imply little effect of counter-propaganda existing). You write "Anything that anti-establishment propagandists can do with AI, the establishment can do better", but propaganda is probably not a zero-sum, symmetric weapon. 
  • Overall, it feels to me like these are decent arguments about why AI-based disinformation is likely to be less of a big deal than I might have previously thought, but they don't feel super strong. They feel largely handwavy in the sense of "here is an argument which points in that direction", but it's really hard to know how hard they push in that direction. There is ample opportunity for quantitative and detailed analysis (which I generally would find more convincing), but that isn't provided here, and is instead obfuscated in links to other work. It's possible that the argument I would actually find super convincing here is just way too long to be worth writing.
  • Again, thanks for writing this, I think it's a service to the commons. 

Because what is currently being outsourced is data labeling, I think one of the issues you express in the post is very unlikely:

My general worry is that in future, the global south shall become the training ground for more harmful AI projects that would be prohibited within the Global North. Is this something that I and other people should be concerned about?

Maybe there's an argument about how: 

  • current practices are evidence that AI companies are trying to avoid following the laws (note I mostly don't believe this), 
  • and this is why they're outsourcing parts of development, 
  • so then we should be worried they'll do the same to get around other (safety-oriented) laws. 

This is possible, but my best guess is that low wages are the primary reason for current outsourcing. 

Additionally, as noted by Larks, outsourcing data-centers is going to be much more difficult, or at least take a long time, compared to outsourcing data-labeling, so we should be less worried that companies could effectively get around laws by doing this. 

This line of argument suggests that slow takeoff is inherently harder to steer. Because pretty much any version of slow takeoff means that the world will change a ton before we get strongly superhuman AI.

I'm not sure I agree that the argument suggests that. I'm also not sure slow takeoff is harder to steer than other forms of takeoff — they all seem hard to steer. I think I messed up the phrasing because I wasn't thinking about it the right way. Here's another shot:

Widespread AI deployment is pretty wild. If timelines are short, we might get attempts at AI takeover before we have widespread AI deployment. I think attempts like this are less likely to work than attempts in a world with widespread AI deployment. This is thinking about takeoff in the sense of deployment impact on the world (e.g., economic growth), rather than in terms of cognitive abilities. 

 

On a related note, slow takeoff worlds are harder to steer in the sense that the proportion of influence on AI from x-risk oriented people probably goes down because the rest of the world gets involved. The neglectedness of AI safety research probably drops too. This is why some folks have considered conditioning their work on, e.g., high p(doom).

Thanks for your comments! I probably won't reply to the others as I don't think I have much to add, they seem reasonable, though I don't fully agree. 

I think these don’t bite nearly as hard for conditional pauses, since they occur in the future when progress will be slower

Your footnote is about compute scaling, so presumably you think that's a major factor for AI progress, and why future progress will be slower. The main consideration pointing the other direction (imo) is automated researchers speeding things up a lot. I guess you think we don't get huge speedups here until after the conditional pause triggers are hit (in terms of when various capabilities emerge)? If we do have the capabilities for automated researchers, and a pause locks these up, that's still pretty massive (capability) overhang territory. 

While I’m very uncertain, on balance I think it provides more serial time to do alignment research. As model capabilities improve and we get more legible evidence of AI risk, the will to pause should increase, and so the expected length of a pause should also increase [footnote explaining that the mechanism here is that the dangers of GPT-5 galvanize more support than GPT-4]

I appreciate flagging the uncertainty; this argument doesn't seem right to me. 

One factor affecting the length of a pause would be the ratio of costs to benefits for marginal pause days: (opportunity cost from pause) / (risk of catastrophe from unpause). I expect both the costs and the benefits of AI pause days to go up in the future, because risks of misalignment/misuse are greater, and because AIs will be deployed in a way that adds a bunch of value to society (whether the marginal improvements are huge remains unclear, e.g., GPT-6 might add tons of value, but it's unclear how much more GPT-6.5 adds on top of that; seems hard to tell). I don't know how the ratio will change, and the ratio is probably what actually matters. But I wouldn't be surprised if that numerator (opportunity cost) shot up a ton.

I think it's reasonable to expect that marginal improvements to AI systems in the future (e.g., scaling up 5x) could map on to automating an additional 1-7% of a nation's economy. Delaying this by a month would be a huge loss (or a benefit, depending on how the transition is going). 

What relevant decision makers think the costs and benefits are is what actually matters, not the true values. So even if right now I can look ahead and see that an immediate pause pushes back future tremendous economic growth, this feature may not become apparent to others until later. 

To try and say what I'm getting at a different way: you're suggesting that we get a longer pause if we pause later than if we pause now. I think that "races" around AI are going to ~monotonically get worse and that the perceived cost of pausing will shoot up a bunch. If we're early on an exponential of AI creating value in the world, it just seems way easier to pause for longer than it will be later on. If this doesn't make sense I can try to explain more. 

Sorry, I agree my previous comment was a bit intense. I think I wouldn't get triggered if you instead asked "I wonder if a crux is that we disagree on the likelihood of existential catastrophe from AGI. I think it's very likely (>50%), what do you think?" 

P(doom) is not why I disagree with you. It feels a little like I'm arguing with an environmentalist about recycling and they go "wow do you even care about the environment?" Sure, that could be a crux, but in this case it isn't, and the question is asked in a way that is trying to force me to agree with them. I think asking about AGI beliefs is much less bad, but it feels similar.

I think it's pretty unclear if extra time now positively impacts existential risk. I wrote about a little bit of this here, and many others have discussed similar things. I expect this is the source of our disagreement, but I'm not sure. 

I don't think you read my comment:

I don't think extra time pre-transformative-AI is particularly valuable except its impact on existential risk

I also think it's bad how you (and a bunch of other people on the internet) ask this p(doom) question in a way that (in my read of things) is trying to force somebody into a corner of agreeing with you. It doesn't feel like good faith so much as bullying people into agreeing with you. But that's just my read of things without much thought. At a gut level I expect we die, my from-the-arguments / inside view is something like 60%, and my "all things considered" view is more like 40% doom. 
