Holden Karnofsky

I expected readers to assume that my wife owned significant equity in Anthropic; I've now edited the post to state this explicitly (and also added a mention of her OpenAI equity, which I should've included before and have included in the past). I don't plan to disclose the exact amount and don't think this is needed for readers to have sufficient context on my statements here.

A Playbook for AI Risk Reduction (focused on misaligned AI)

Holden Karnofsky3y11

Sorry, I didn't mean to dismiss the importance of the conflict of interest or say it isn't affecting my views.

I've sometimes seen people reason along the lines of "Since Holden is married to Daniela, this must mean he agrees with Anthropic on specific issue X," or "Since Holden is married to Daniela, this must mean that he endorses taking a job at Anthropic in specific case Y." I think this kind of reasoning is unreliable and has been incorrect in more than one specific case. That's what I intended to push back against.

Seeking (Paid) Case Studies on Standards

Holden Karnofsky3y2

Thanks! I'm looking for case studies that will be public; I'm agnostic about where they're posted beyond that. We might consider requests to fund confidential case studies, but this project is meant to inform broader efforts, so confidential case studies would still need to be cleared for sharing with a reasonable set of people, and the funding bar would be higher.

Taking a leave of absence from Open Philanthropy to work on AI safety

Holden Karnofsky3y2

I think this was a goof due to there being a separate hardcover version, which has now been removed - try again?

My takes on the FTX situation will (mostly) be cold, not hot

Holden Karnofsky3y2

To give a rough idea, I basically mean anyone who is likely to harm those around them (using a common-sense idea of doing harm) and/or "pollute the commons" by having an outsized and non-consultative negative impact on community dynamics. It's debatable what the best warning signs are and how reliable they are.

Time Article Discussion - "Effective Altruist Leaders Were Repeatedly Warned About Sam Bankman-Fried Years Before FTX Collapsed"

Holden Karnofsky3y75

Re: "In the weeks leading up to that April 2018 confrontation with Bankman-Fried and in the months that followed, Mac Aulay and others warned MacAskill, Beckstead and Karnofsky about her co-founder’s alleged duplicity and unscrupulous business ethics" -

I don't remember Tara reaching out about this, and I just searched my email for signs of this and didn’t see any. I'm not confident this didn't happen, just noting that I can't remember or easily find signs of it.

In terms of what I knew/learned 2018 more generally, I discuss that here.

Taking a leave of absence from Open Philanthropy to work on AI safety

Holden Karnofsky3y13

For context, my wife is the President and co-founder of Anthropic, and formerly worked at OpenAI.

80% of her equity in Anthropic is (not legally bindingly) pledged for donation. None of her equity in OpenAI is. She may pledge more in the future if there is a tangible compelling reason to do so.

I plan to be highly transparent about my conflict of interest, e.g. I regularly open meetings by disclosing it if I’m not sure the other person already knows about it, and I’ve often mentioned it when discussing related topics on Cold Takes.

I also plan to discuss the implications of my conflict of interest for any formal role I might take. It’s possible that my role in helping with safety standards will be limited to advising with no formal powers (it’s even possible that I’ll decide I simply can’t work in this area due to the conflict of interest, and will pursue one of the other interventions I’ve thought about).

But right now I’m just exploring options and giving non-authoritative advice, and that seems appropriate. (I’ll also note that I expect a lot of advice and opinions on standards to come from people who are directly employed by AI companies; while this does present a conflict of interest, and a more direct one than mine, I think it doesn’t and can’t mean they are excluded from relevant conversations.)

Some comments on recent FTX-related events

Holden Karnofsky3y24

There was no one with official responsibility for the relationship between FTX and the EA community. I think the main reason the two were associated was via FTX’s/Sam having a high profile and talking a lot about EA - that’s not something anyone else was able to control. (Some folks did ask him to do less of this.)

It’s also worth noting that we generally try to be cautious about power dynamics as a funder, which means we are hesitant to be pushy about most matters. In particular, I think one of two major funders in this space attacking the other, nudging grantees to avoid association and funding from it, etc. would’ve been seen as strangely territorial behavior absent very strong evidence of misconduct.

That said: as mentioned in another comment, with the benefit of hindsight, I wish I’d reasoned more like this: “This person is becoming very associated with effective altruism, so whether or not that’s due to anything I’ve done, it’s important to figure out whether that’s a bad thing and whether proactive distancing is needed.”

Some comments on recent FTX-related events

Holden Karnofsky3y20

In 2018, I heard accusations that Sam had communicated in ways that left people confused or misled, though often with some ambiguity about whether Sam had been confused himself, had been inadvertently misleading while factually accurate, etc. I put some effort into understanding these concerns (but didn’t spend a ton of time on it; Open Phil didn’t have a relationship with Sam or Alameda).

I didn’t hear anything that sounded anywhere near as bad as what has since come out about his behavior at FTX. At the time I didn’t feel my concerns rose to the level where it would be appropriate or fair to publicly attack or condemn him. The whole situation did make me vaguely nervous, and I spoke with some people about it privately, but I never came to a conclusion that there was a clearly warranted (public) action.

My takes on the FTX situation will (mostly) be cold, not hot

Holden Karnofsky3y5

I don’t believe #1 is correct. The Open Philanthropy grant is a small fraction of the funding OpenAI has received, and I don’t think it was crucial for OpenAI at any point.

I think #2 is fair insofar as running a scaling lab poses big risks to the world. I hope that OpenAI will avoid training or deploying directly dangerous systems; I think that even the deployments it’s done so far pose risks via hype and acceleration. (Considering the latter a risk to society is an unusual standard to hold a company to, but I think it’s appropriate here.)

#3 seems off to me - “regulatory capture” does not describe what’s at the link you gave (where’s the regulator?) At best it seems like a strained analogy, and even there it doesn’t seem right to me - I don’t know of any sense in which I or anyone else was “captured” by OpenAI.

I can’t comment on #4.

#5 seems off to me. I don’t know whether OpenAI uses nondisparagement agreements; I haven’t signed one. The reason I am careful with public statements about OpenAI is (a) it seems generally unproductive for me to talk carelessly in public about important organizations (likely to cause drama and drain the time and energy of me and others); (b) I am bound by confidentiality requirements, which are not the same as nondisparagement requirements. Information I have access to via having been on the board, or via being married to a former employee, is not mine to freely share.

Holden Karnofsky

Posts 92

Sequences 4

Comments194

Posts
92

Sequences
4

Comments
194