
Linch

@ EA Funds
24,630 karma · Joined Dec 2015 · Working (6-15 years) · openasteroidimpact.org

Comments (2659)

This feels really suss to me:

Many people at OpenAI get more of their compensation from PPUs than from base salary. PPUs can only be sold at tender offers hosted by the company. When you join OpenAI, you sign onboarding paperwork laying all of this out.

And that onboarding paperwork says you have to sign termination paperwork with a 'general release' within sixty days of departing the company. If you don't do it within 60 days, your units are cancelled. No one I spoke to at OpenAI gave this little line much thought.

And yes this is talking about vested units, because a separate clause clarifies that unvested units just transfer back to the control of OpenAI when an employee undergoes a termination event (which is normal).

There's a common legal definition of a general release, and it's just a waiver of claims against each other. Even someone who read the contract closely might assume they will only have to sign such a waiver of claims.

But when you actually quit, the 'general release'? It's a long, hardnosed, legally aggressive contract that includes a confidentiality agreement (which covers the release itself), as well as arbitration, nonsolicitation, nondisparagement, and broad 'noninterference' provisions.

And if you don't sign within sixty days, your units are gone. And it gets worse: OpenAI can also, at its discretion, deny you access to the annual events that are the only way to sell your vested PPUs, leaving ex-employees constantly worried they'll be shut out.

How do careful startups happen? Basically I think it just takes safety-minded founders. 

Thanks! I think this is the crux here. I suspect what you say isn't enough, but it sounds like you have a lot more experience than I do, so I'm happy to (tentatively) defer.

Thank you! You might also like the 3-minute YouTube version.

Fwiw I think the website played well with at least some people in the open-source faction (in OP's categorization). E.g., see here on the LocalLlama subreddit. 

I would do it but my LTFF funding does not cover this

(Speaking as someone on LTFF, but not on behalf of LTFF) 

How large of a constraint is this for you? I don't have strong opinions on whether this work is better than what you're funded to do, but usually I think it's bad if LTFF funding causes people to do things that they think are less (positively) impactful! 

We probably can't fund people to do things that are lobbying or lobbying-adjacent, but I'm keen to figure out or otherwise brainstorm an arrangement that works for you.

I agree that it's possible for startups to have a safety-focused culture! The question that's interesting to me is whether it's likely / what the prior should be.

Finance is a good example of a situation where you can often get a safety culture despite no prior experience with your products (or your predecessors' products, etc.) killing people. I'm not sure why that happened; perhaps some combination of 2008 making people aware of systemic risks and regulations successfully creating a stronger safety culture?

I'm interested in what people think are the strongest arguments against this view. Here are a few counterarguments that I'm aware of: 

1. Empirically, the AI-focused scaling labs seem to care quite a lot about safety and make credible commitments to safety. If anything, they seem to be "ahead of the curve" compared to larger tech companies or governments.

2. Government/intergovernmental agencies, and to a lesser degree larger companies, are bureaucratic, sclerotic, and generally less competent. 

3. The AGI safety issues that EAs worry about the most are abstract and speculative, so having a "normal" safety culture isn't as helpful as buying into the more abstract arguments, which you might expect to be easier for newer companies to do.

4. Scaling labs share "my" values. So AI doom aside, all else equal, you might still want scaling labs to "win" over democratically elected governments/populist control.

We should expect the incentives and culture of AI-focused companies to make them uniquely terrible for producing safe AGI. 
 

From a “safety from catastrophic risk” perspective, I suspect an “AI-focused company” (e.g. Anthropic, OpenAI, Mistral) is abstractly pretty close to the worst possible organizational structure for getting us towards AGI. I have two distinct but related reasons:

  1. Incentives
  2. Culture

From an incentives perspective, consider realistic alternative organizational structures to “AI-focused company” that nonetheless have enough firepower to host successful multibillion-dollar scientific/engineering projects:

  1. As part of an intergovernmental effort (e.g. CERN’s Large Hadron Collider, the ISS)
  2. As part of a governmental effort of a single country (e.g. Apollo Program, Manhattan Project, China’s Tiangong)
  3. As part of a larger company (e.g. Google DeepMind, Meta AI)

In each of those cases, I claim that there are stronger (though still not ideal) organizational incentives to slow down, pause/stop, or roll back deployment if there is sufficient evidence or reason to believe that further development can result in major catastrophe. In contrast, an AI-focused company has every incentive to go ahead on AI when the case for pausing is uncertain, and minimal incentive to stop or even take things slowly. 

From a culture perspective, I claim that without knowing any details of the specific companies, you should expect AI-focused companies to be more likely than the plausible alternatives to have the following cultural elements:

  1. Ideological AGI Vision: AI-focused companies may have a large contingent of “true believers” who are ideologically motivated to make AGI at all costs; and
  2. No Pre-existing Safety Culture: AI-focused companies may have minimal or no strong “safety” culture where people deeply understand, have experience in, and are motivated by a desire to avoid catastrophic outcomes. 

The first one should be self-explanatory. The second one is a bit more complicated, but basically I think it’s hard to have a safety-focused culture just by “wanting it” hard enough in the abstract, or by talking a big game. Instead, institutions tend (relatively speaking) to have more of a safe and robust culture if they have previously suffered the (large) costs of not focusing enough on safety.

For example, engineers who aren’t software engineers understand fairly deep down that their mistakes can kill people, and that their predecessors’ fuck-ups have indeed killed people (think bridges collapsing, airplanes falling, medicines not working, etc.). Software engineers rarely have such experience.

Similarly, governmental institutions have institutional memories of major historical fuck-ups, in a way that new startups very much don’t.

Introducing Ulysses*, a new app for grantseekers. 


 

We (Austin Chen, Caleb Parikh, and I) built an app! You can test the app out if you’re writing a grant application: put in sections of your grant application** and the app will try to give constructive feedback on your application. Right now we're focused on the "Track Record" and "Project Goals" sections of the application. (The main hope is to save back-and-forth time between applicants and grantmakers by asking you questions that grantmakers might want to ask.)

Austin, Caleb, and I hacked together a quick app as a fun experiment in coworking and LLM apps. We wanted a short project that we could complete in ~a day. Working on it was really fun! We mostly did it for our own edification, but we’d love it if the product is actually useful for at least a few people in the community!

As grantmakers in AI Safety, we’re often thinking about how LLMs will shape the future; the idea for this app came out of brainstorming, “How might we apply LLMs to our own work?”. We reflected on common pitfalls we see in grant applications, and I wrote a very rough checklist/rubric and graded some Manifund/synthetic applications against the rubric.  Caleb then generated a small number of few shot prompts by hand and then used LLMs to generate further prompts for different criteria (e.g., concreteness, honesty, and information on past projects) using a “meta-prompting” scheme. Austin set up a simple interface in Streamlit to let grantees paste in parts of their grant proposals. All of our code is open source on Github (but not open weight 😛).***

This is very much a prototype, and everything is very rough, but please let us know what you think! If there’s sufficient interest, we’d be excited about improving it (e.g., by adding other sections or putting more effort into prompt engineering). To be clear, the actual LLM feedback isn’t necessarily good or endorsed by us, especially at this very early stage. As usual, use your own best judgment before incorporating the feedback.

*Credit for the name to Saul, who originally got the Ulysses S. Grant pun from Scott Alexander.

** Note: Our app will not save your data locally. We are using the OpenAI API for our LLM feedback. OpenAI says that it won’t use your data to train models, but you may still wish to be cautious with highly sensitive data anyway. 

*** Linch led a discussion on the potential capabilities insights of our work, but we ultimately decided that it was asymmetrically good for safety; if you work on a capabilities team at a lab, we ask that you pay $20 to LTFF before you look at the repo.


 

The broader question I'm confused about is how much to update on the local/object-level question of whether the labs are doing "kind of reasonable" stuff, vs. what their overall incentives and positions in the ecosystem point them toward doing. 

E.g., your site puts OpenAI and Anthropic as the least-bad options based on their activities, but from an incentives/organizational perspective, their place in the ecosystem is just really bad for safety. Contrast with, e.g., being situated within a large tech company[1] where having an AI scaling lab is just one revenue source among many, or Meta's alleged "scorched-earth" strategy, where they are trying very hard to commoditize the complement of LLMs.

  1. ^

    E.g., GDM employees have Google/Alphabet stock; most of the variance in their earnings isn't going to come from AI, at least in the short term.
