
Winston

11 karma · Joined

Comments (3)

Yeah, there might be a correlation in practice, but I think intelligent agents could have basically any random values. There are no fundamentally incorrect values, just some values that we don't like or that you'd say lack nuance about importance. Even under moral realism, intelligent systems don't necessarily have to care about the moral truth (even if they're smart enough to figure out what the moral truth is). Cf. the orthogonality thesis.

But AIs could value anything. They don’t have to value some metric of importance that lines up with what we care about on reflection. That is, failing to do so wouldn’t be a blunder in an epistemic sense. AIs could know their values lack nuance and go against human values, and just not care.

Or maybe you’re just saying that, on the path we’re currently on, it looks like powerful AIs will in fact end up with nuanced values in line with humanity’s. I think this could still constitute a value lock-in, though, just not one that you consider bad. And I expect there would still be value disagreements between humans even if we had perfect information, so I’m skeptical we could ever instill values into AIs that everyone is happy with.

I’m also not sure AI would cause a value lock-in, but more because powerful AIs may be widely distributed such that no single AGI takes over everything.

Yeah. Though for a utilitarian it could still be instrumentally good to believe the points in this post, at least on an emotional level. But lying to yourself is plausibly a bad norm for other reasons, and in any case, this kind of instrumental reasoning is the exact thing the post is arguing against.