
richard_ngo

6243 karma

Bio

Former AI safety research engineer, now AI governance researcher at OpenAI. Blog: thinkingcomplete.blogspot.com

Sequences (2)

Replacing Fear
EA Archives Reading List

Comments (274)

Great post! Not much to add, it all seems to make sense. I'd consider adding a more direct summary of the key takeaways at the top for easier consumption, though.

Impressed by the post; I'd like to donate! Is there a way to do so that avoids card fees? And if so, at what donation size do you prefer that people start using it?

One question: I am curious to hear anyone's perspective on the following "conflict": 

The former is more important for influencing labs, the latter is more important for doing alignment research.

And yet, as I say, I believe both of these are necessary.

FWIW when I talk about the "specific skill", I'm not talking about having legible experience doing this, I'm talking about actually just being able to do it. In general I think it's less important to optimize for having credibility, and more important to optimize for the skills needed. Same for ML skill—less important for gaining credibility, more important for actually just figuring out what the best plans are.

Also are there good online courses anyone would recommend? 

See the resources listed here.

My guess is that this post is implicitly aimed at Bay Area EAs, and that roughly every perk at Trajan House/other Oxford locations is acceptable by these standards.

Perhaps worth clarifying this explicitly, if true—it would be unfortunate if the people who were already most scrupulous about perks were the ones who updated most from this post.

Good comment, consider cross-posting to LW?

I think there’s a sort of “LessWrong decision theory black hole” that makes people a bit crazy in ways that are obvious from the outside, and this comment thread isn’t the place to adjudicate all that.

From my perspective it's the opposite: epistemic modesty is an incredibly strong skeptical argument (a type of argument that often gets people very confused), extreme forms of which have been popular in EA despite leading to conclusions which conflict strongly with common sense (like "in most cases, one should pay scarcely any attention to what you find the most persuasive view on an issue").

In practice, fortunately, even people who endorse strong epistemic modesty don't actually implement it, and thereby manage to still do useful things. But I haven't yet seen any supporters of epistemic modesty provide a principled way of deciding when to act on their own judgment, in defiance of the conclusions of (a large majority of) the 8 billion other people on earth.

By contrast, I think that focusing on policies rather than all-things-considered credences (which is the thing I was gesturing at with my toy example) basically dissolves the problem. I don't expect that you believe me about this, since I haven't yet written this argument up clearly (although I hope to do so soon). But in some sense I'm not claiming anything new here: I think that an individual's all-things-considered deferential credences aren't very useful for almost the exact same reason that it's not very useful to take a group of people and aggregate their beliefs into a single set of "all-people-considered" credences when trying to get them to make a group decision (at least not using naive methods; doing it using prediction markets is more reasonable).

I don't follow. I get that acting on low-probability scenarios can let you get in on neglected opportunities, but you don't want to actually get the probabilities wrong, right?

I reject the idea that all-things-considered probabilities are "right" and inside-view probabilities are "wrong", because you should very rarely be using all-things-considered probabilities when making decisions, for reasons of simple arithmetic (as per my example). Tell me what you want to use the probability for and I'll tell you what type of probability you should be using.

You might say: look, even if you never actually use all-things-considered probabilities in the real world, at least in theory they're still normatively ideal. But I reject that too—see the Anthropic Decision Theory paper for why.

If it informs you that EA beliefs on some question have been unusual from the get-go, it makes sense to update the other way, toward the distribution of beliefs among people not involved in the EA community.

I'm a bit confused by this. Suppose that EA has a good track record on an issue where its beliefs have been unusual from the get-go. For example, I think that by temperament EAs tend to be more open to sci-fi possibilities than others, even before having thought much about them; and that over the last decade or so we've increasingly seen sci-fi possibilities arising. Then I should update towards deferring to EAs because it seems like we're in the sort of world where sci-fi possibilities happen, and it seems like others are (irrationally) dismissing these possibilities.

On a separate note: I currently don't think that epistemic deference as a concept makes sense, because defying a consensus has two effects that are often roughly the same size: it means you're more likely to be wrong, and it means you're creating more value if you're right.* But if so, then using deferential credences to choose actions will systematically lead you astray, because you'll neglect the correlation between likelihood of success and value of success.

Toy example: your inside view says your novel plan has 90% chance of working, and if it does it'll earn $1000; and experts think it has 10% chance of working, and if it does it'll earn $100. Suppose you place as much weight on your own worldview as experts'. Incorrect calculation: your all-things-considered credence in your plan working is 50%, your all-things-considered estimate of the value of success is $550, your all-things-considered expected value of the plan is $275. Better calculation: your worldview says that the expected value of your plan is $900, the experts think the expected value is $10, average these to get expected value of $455—much more valuable than in the incorrect calculation!

Note that in the latter calculation we never actually calculated any "all-things-considered credences". For this reason I now only express such credences with a disclaimer like "but this shouldn't be taken as action-guiding".
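
A minimal sketch in Python of the two calculations in the toy example above, using the same numbers (the variable names are just illustrative):

```python
# Toy example: two ways of combining your inside view with expert opinion.
my_prob, my_value = 0.9, 1000          # inside view: 90% chance of success, $1000 if it works
expert_prob, expert_value = 0.1, 100   # expert view: 10% chance of success, $100 if it works

# Incorrect calculation: average the credences and the values separately, then
# multiply. This discards the correlation between probability and value.
avg_prob = (my_prob + expert_prob) / 2        # 0.5
avg_value = (my_value + expert_value) / 2     # $550
naive_ev = avg_prob * avg_value               # $275

# Better calculation: compute the expected value under each worldview,
# then average those expected values.
better_ev = (my_prob * my_value + expert_prob * expert_value) / 2   # ($900 + $10) / 2 = $455

print(f"naive EV: ${naive_ev:.0f}, worldview-averaged EV: ${better_ev:.0f}")
```

The gap between the two outputs is exactly the neglected correlation between likelihood of success and value of success.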

 

* A third effect which might be bigger than either of them: it motivates you to go out and try stuff, which will give you valuable skills and make you more correct in the future.

Thanks! I'll update it to include the link.

Only when people face starvation, illness, disasters, or warfare can they learn who they can really trust.

Isn't this approximately equivalent to the claim that trust becomes much more risky/costly under conditions of scarcity?

only under conditions of local abundance do we see a lot of top-down hierarchical coercion

Yeah, this is an interesting point. I think my story here is that we need to talk about abundance at different levels. E.g. at the highest level (will my country/civilization survive?) you should often be in a scarcity mindset, because losing one war is disastrous. Whereas at lower levels (e.g. will my city survive?) you can have more safety: your city is protected by your country (and your family is protected by your city, and you're protected by your family, and so on).

And so even when we face serious threats, we need to apply coercion only at the appropriate levels. AI is a danger at the civilizational level; but the best way to deal with danger at that level is by cultivating abundance at the level of your own community, since that's the only way your community will be able to make a difference at that higher level.
