I find points 4, 5, and 6 really unconvincing. Are there any stronger arguments for these, that don't consist of pointing to a weird example and then appealing to the intuition that "it would be weird if this thing was conscious"?
I'm not particularly sympathetic to arguments that rely on intuitions to tell us about the way the world is, but unfortunately, I don't think we have much else to go on when we think about consciousness in very different systems. It is too unclear what empirical evidence would be relevant, and theory only gets us so far on its own.
That said, I think there are some thought experiments that should be compelling, even though they just elicit intuitions. I believe that the thought experiments I provide are close enough to this for it to be reasonable to put weight on them. The mirror grid, in particular, just seems to me to be the kind of thing where, if you accept that it is conscious, you should probably think everything is conscious. There is nothing particularly mind-like about it; it is just complex enough that you can read any structure you want into it. And lots of things are complex. (Panpsychism isn't beyond the pale, but it is not what most people are on board with when they endorse functionalism or wonder if computers could be conscious.)
Another way to think about my central point: there is a history in philosophy of trying to explain why random objects (rocks, walls) don't count as properly implementing the functional roles that characterize conscious states. Some accounts have been offered, but it is not clear that they don't also rule out consciousness in contemporary computers. There are plausible readings of those accounts on which contemporary computers would not be conscious no matter what programs they run. If you don't particularly trust your intuitions, and you don't want to accept that rocks and walls properly implement the functional roles of conscious states, you should probably be uncertain about exactly which view is correct. Since many of the candidate views would rule out consciousness in contemporary computers, you should lower the probability you assign to contemporary computers being conscious.
I don't get the impression that EAs are particularly motivated by morality. Rather, they are motivated to produce things they see as good. Some moral theories, like contractualism, see producing a lot of good things (within the bounds of our other moral duties) as morally optional. You're not doing wrong by living a normal decent life. It seems perfectly aligned with EA to hold one of those theories and still personally aim to do as much good as possible.
A moral theory matters more for what it tells you you can't do in pursuit of the good. Generally, effectively pursuing the good and abiding by the standard moral rules of society (e.g. don't steal money to give to charity) go hand in hand, so I would expect to see less discussion of this on the forum. Where they come apart, discussing them probably carries a significant reputational risk.
I like this take: if AI is dangerous enough to kill us in three years, no feasible amount of additional interpretability research would save us.
Our efforts should instead go to limiting the damage that initial AIs could do. That might involve work on securing dangerous human-controlled technologies. It might involve creating clever honeypots to catch unsophisticated-but-dangerous AIs before they can fully get their act together. It might involve lobbying for processes or infrastructure to quickly shut down Azure or AWS.
Even in humans, language production is generally subconscious. At least, my experience of talking is that I generally first become conscious of what I say as I'm saying it. I have some sense of what I might want to say before I say it, but the machinery that selects specific words is not conscious. Sometimes, I think of a couple of different things I could say and consciously select between them. But often I don't: I just hear myself speak. Language generation may often produce conscious perceptions of inner speech, but it doesn't seem to depend on them.
All of this suggests that the possibility of non-conscious chatbots should not be surprising. It may be that chatbots provide pretty good evidence that cognitive complexity can come apart from consciousness. But introspection alone should provide sufficient evidence for that.
EA should be willing to explore all potentially fruitful avenues of mission fulfillment without regard to taboo.
In general, where it doesn't directly relate to cause areas of principal concern to effective altruists, I think EAs should strive to respect others' sacred cows as much as possible. Effective Altruism is a philosophy promoting practical action. It would be harder to find allies who will help us achieve our goals if we are careless about the things other people care a lot about.
The theory is actually doing well on its own terms.
Can you expand on what you mean by this? I would think that expected utility maximization is doing well insofar as your utility is high. If you take a lot of risky bets, you're doing well if a few pay off. If you always pay the mugger, you probably think your decision theory is screwing you unless you find yourself in one of those rare situations where the mugger's promises are real.
I'm very interested, though: do you know a better justification for Occam's razor than usability?
I don't. I'm more or less in the same boat: I wish there were a better justification, but I'm inclined to keep using it because I have to (there is no clear alternative, it is human nature, etc.).
dogmatism is the most promising way to justify the obvious fact that it is not irrational to refuse to hand over your wallet to a Pascal mugger. (If anyone disagrees that this is an obvious fact, please get in touch, and be prepared to hand over lots of cash).
There is another way out. We can agree that it is rational to hand over the wallet and thank heavens that we’re lucky not to be rational. I’m convinced by things like Kavka’s toxin puzzle and Newcomb’s paradox that sometimes it sucks to be rational. Maybe Pascal’s mugger is one of those cases.
Occam's razor seems to be necessary in order to learn anything at all about the world from experience, but it remains an assumption.
There are plenty of other assumptions that would allow learning. For any specific complex way the world might be, x, we could still learn under a prior that favors simplicity for every hypothesis except x but is biased toward x itself. If all you have to justify Occam’s razor is overall usability, you’ve got very little reason to prefer it to nearby aberrations like these.
I worry about the effect that AI friends and partners could have on values. It seems plausible that most people could come to have a good AI friend in the coming decades. Our AI friends might always be there for us. They might get us. They might be funny and insightful and eloquent. How would it play out if their opinions are crafted by tech companies, or the government, or are simply reflections of what we want our friends to think? Maybe AI will develop fast enough and be powerful enough that it won't matter what individuals think or value, but I see reasons for concern potentially much greater than the individual harms of social media.