This is a linkpost for https://scottaaronson.blog/?p=6821

Scott Aaronson makes the argument that an Orthodox vs Reform analogy works for AI Alignment. I don't think I've really heard it framed this way before. He gives his takes on the beliefs of each group and there's a lot of discussion in the comments around that, but what's more interesting to me is thinking about what the benefits and risks of framing it this way might be. 

Perhaps the reform option gives people a way to take the arguments seriously without feeling like they are aligning themselves with something too radical. If your beliefs tend reformist, you probably like differentiating yourself from those with more radical-sounding beliefs. If your beliefs are more on the Orthodox side, maybe this is the "gateway drug" and more talent would find its way to your side. This has a little bit of the "bait-and-switch" dynamic I sometimes hear people complain about (but I do not at all endorse) with EA: that it pitches people on global health and animal welfare, but it's all really about AI safety. As long as people really do hold beliefs along reformist lines, though, I don't see how that would be an issue.

Maybe the labels are just too sloppy: most people don't really fit into either camp, and it's bad to pretend that they do?

Not coming up with much else, but I'd be surprised if I wasn't missing something. 





As a factual question, I'm not sure if people's opinions on the shape of AI risk can be divided into two distinct clusters, or even distributed along a spectrum (that is, that factor analysis on the points of opinion-space would find a good general factor), though I suspect it may quite weakly be the case. For instance, I found myself agreeing with six of the statements on one side of Scott's dichotomy and two on the other.

As a public epistemic health question, I think issuing binary labels is harmful for further progress in the field, especially if they borrow terminology from religious groups and the author identifies with one of the proposed camps in the same post where he raises the distinction. See the comment by xarkn on LW.

Even if the range of current opinions could be well-described by a single general factor, we should certainly use less divisive terminology for such a spectrum and be mindful that truth may well lie orthogonal to it.

I agree with most of this - clusters probably not very accurate, divisive religious terminology, him identifying with one of the camps while describing them.

Can you elaborate a bit more on why you think binary labels are harmful for further progress? Would you say they always are? How much of your objection here is these particular labels and how Scott defines them, and how much of it is that you don't think the shape can be usefully divided into two clusters?

I find that, on topics that I understand well, I often object intuitively to labels on...

Milan Weibel
Having thought more about this, I suppose you can divide opinions into two clusters and be pointing at something real. That's because people's views on different aspects of the issue correlate, often in ways that make sense. For instance, people who think AGI will be achieved by scaling up current (or very similar to current) neural net architectures are more excited about practical alignment research on existing models.

However, such clusters would be quite broad. My main worry is that identifying two particular points as prototypical of them would narrow their range. People would tend to let their opinions drift toward whichever point is closest to them. This need not be caused by tribal dynamics. It could be something as simple as availability bias. This narrowing of the clusters would likely be harmful, because the AI safety field is quite new and we've still got exploring to do. Another risk is that we may become too focused on the line between the two points, neglecting other potentially more worthwhile axes of variation.

If I were to divide current opinions into two clusters, I think that Scott's two points would in fact fall in different clusters. They would probably not even be too far off their centers of mass. However, I strongly object to pretending the clusters are points, and then getting tribal about it. I think labeling clusters could be useful, if we made it clear that they are still clusters.

As for paths to understanding AI risk without accepting weird arguments, getting people worried about ML unexplainability may be worth exploring, though I suspect most people would think you were pointing to algorithmic bias and the like.
Thank you!

I want to make an analogy to personality types. Lots of humans believe there is one single personality type: "Everyone thinks and reacts more or less like me." Given this starting point, upgrading to thinking there are 4 or 16 or whatever types of people is a great update. Lists of different conflict resolution styles, or different love languages, etc. are helpful in the same way.

However, the same system can become harmful if, after learning it, a person gets stuck, refuses to move on to more nuanced understandings, and insists that the dimensions covered by the system they learned are the only ones that exist.

Overall, I think Scott Aaronson's post is good.

I expect outsiders who read it will update from thinking there is 1 AIS camp to thinking there are 2 AIS camps, which is an update in the right direction.

I expect insiders who read it to notice "hey, I agree with one side on some points and the other side on some points" and correctly conclude that the picture of two camps is an oversimplification.
