
Buck

Chief Technology Officer @ Redwood Research
6123 karma · Berkeley, CA, USA

Bio

I'm Buck Shlegeris. I am the CTO of Redwood Research, a nonprofit focused on applied alignment research. Read more about us here: https://www.redwoodresearch.org/

Comments (300)

I found Ezra's grumpy complaints about EA amusing and useful. Maybe 80K should arrange to have more of their guests' children get sick the day before they tape the interviews.

I agree that we should tolerate people who are less well read than GPT-4 :P

For what it’s worth, GPT-4 knows what "rat" means in this context: https://chat.openai.com/share/bc612fec-eeb8-455e-8893-aa91cc317f7d


I think this is a great question. My answers:

  • I think some plausible alignment schemes could involve causing suffering to the AIs. Inflicting huge amounts of suffering on AIs seems pretty bad, both because it's unethical and because it seems potentially inadvisable to make AIs justifiably mad at us.
  • If unaligned AIs are morally valuable, then it's less bad to get overthrown by them, and perhaps we should be aiming to produce successors whom we're happier to be overthrown by. See here for discussion. (Obviously plan A is to align the AIs, but it seems good to know how important it is to succeed at this, and making unaligned but valuable successors seems like a not-totally-crazy plan B.)
Answer by Buck

My attitude, and the attitude of many of the alignment researchers I know, is that this problem seems really important and neglected, but we overall don't want to stop working on alignment in order to work on this. If I spotted an opportunity for research on this that looked really surprisingly good (e.g. if I thought I'd be 10x my usual productivity when working on it, for some reason), I'd probably take it.

It's plausible that I should spend a weekend sometime trying to really seriously consider what research opportunities are available in this space.

My guess is that a lot of the skills involved in doing a good job of this research are the same as the skills involved in doing good alignment research.

Thanks Lizka. I think you mean to link to this video: 

Holden's beliefs on this topic have changed a lot since 2012. See here for more.


I really like this frame. I feel like EAs are somewhat too quick to roll over and accept attacks from dishonest bad actors who hate us for whatever unrelated reason.

Answer by Buck

Yes, I think this is very scary. I think this kind of risk is at least 10% as important as the AI takeover risks that I work on as an alignment researcher.


I don't think Holden agrees with this as much as you might think. For example, he spent a lot of his time in the last year or two writing a blog.
