Paul Christiano on Dwarkesh Podcast

ESRogs

This is a linkpost for https://www.dwarkeshpatel.com/p/paul-christiano

Dwarkesh's summary:

Paul Christiano is the world’s leading AI safety researcher. My full episode with him is out!
We discuss:
Does he regret inventing RLHF, and is alignment necessarily dual-use?
Why he has relatively modest timelines (40% by 2040, 15% by 2030),
What do we want the post-AGI world to look like (do we want to keep gods enslaved forever)?
Why he’s leading the push to get labs to develop responsible scaling policies, and what it would take to prevent an AI coup or bioweapon,
His current research into a new proof system, and how this could solve alignment by explaining a model's behavior,
and much more.

Effective Altruism Forum