
Open Philanthropy is launching a big new Request for Proposals for technical AI safety research, with plans to fund roughly $40M in grants over the next 5 months, and available funding for substantially more depending on application quality. 

Applications start with a simple 300-word expression of interest (EOI) and are open until April 15, 2025.

Apply now

Overview

We're seeking proposals across 21 different research areas, organized into five broad categories:

  1. Adversarial Machine Learning
    • *Jailbreaks and unintentional misalignment
    • *Control evaluations
    • *Backdoors and other alignment stress tests
    • *Alternatives to adversarial training
    • Robust unlearning
  2. Exploring sophisticated misbehavior of LLMs
    • *Experiments on alignment faking
    • *Encoded reasoning in CoT and inter-model communication
    • Black-box LLM psychology
    • Evaluating whether models can hide dangerous behaviors
    • Reward hacking of human oversight
  3. Model transparency
    • Applications of white-box techniques
    • Activation monitoring (see the toy sketch after this list)
    • Finding feature representations
    • Toy models for interpretability
    • Externalizing reasoning
    • Interpretability benchmarks
    • More transparent architectures
  4. Trust from first principles
    • White-box estimation of rare misbehavior
    • Theoretical study of inductive biases
  5. Alternative approaches to mitigating AI risks
    • Conceptual clarity about risks from powerful AI
    • New moonshots for aligning superintelligence
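
For readers less familiar with these areas, here is a toy sketch of the flavor of one of them, activation monitoring: training a simple linear probe to flag a behavior from a model's internal activations. Everything below (the synthetic "activations", the shapes, and the probe) is hypothetical and purely illustrative; real projects would probe an actual LLM's hidden states. See the full RFP for actual project descriptions.

```python
# Toy sketch of "activation monitoring": fit a linear probe that
# predicts a behavior label from hidden activations. The activations
# here are synthetic stand-ins (hypothetical), not from a real model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
d_model = 64   # hypothetical hidden size
n = 2000       # number of prompts

# Pretend the flagged behavior shifts activations along one direction.
direction = rng.normal(size=d_model)
labels = rng.integers(0, 2, size=n)  # 0 = benign, 1 = flagged
acts = rng.normal(size=(n, d_model)) + np.outer(labels, direction)

X_tr, X_te, y_tr, y_te = train_test_split(
    acts, labels, test_size=0.25, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"held-out probe accuracy: {probe.score(X_te, y_te):.2f}")
```

In real work the probe would read activations from a forward pass of an actual model, and the interesting questions include whether such monitors generalize and whether models can evade them.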

We're willing to make a range of grant types, including:

  • Research expenses (compute, APIs, etc.)
  • Discrete research projects (typically lasting 6-24 months)
  • Academic start-up packages
  • Support for existing nonprofits
  • Funding to start new research organizations or new teams at existing organizations

The full RFP provides much more detail on each research area, including eligibility criteria, example projects, and nice-to-haves. 

Read more

We want the bar to be low for submitting expressions of interest: even if you're unsure whether your project fits perfectly, we encourage you to submit an EOI. This RFP is partly an experiment to understand the demand for funding in AI safety research.

Please email aisafety@openphilanthropy.org with questions, or just submit an EOI.

Comments

I'm new to applying for an AIS grant, so I have some common questions that might have been answered elsewhere:

(1) What are some failure modes I might need to consider when writing a proposal, specifically for a research project?

(2) Will research expenses include stipends for the researchers?

(3) Can I apply for a grant to do a research project with my university AI safety group? I'm not sure whether this would be considered a field-building grant or a technical AI safety grant.

Some common failure modes:

  • Not reading the eligibility criteria
  • Not clearly distinguishing your project from prior work on the topic you're interested in
  • Not demonstrating a good understanding of prior work (it's worth reading some or all of the papers we link to in this doc for whichever section you're applying to)
  • Not demonstrating that you/your team has prior experience doing ML projects. If you don't have such experience, it's good to work with, or be mentored by, someone who does.

"Research expeneses" does not include stipends, but you can apply for a project grant, which does.

If you're looking for money to spend on ML experiments or to pay people who are spending their time doing ML research, then that may fall within this RFP. If you're looking for money to do other things (e.g., reading groups or events), then that may fall under the capacity-building team's RFPs.

Has Open Phil (or anyone else) conducted a comprehensive analysis aimed at both understanding and building the AI safety field?

If yes, could you share some leads to add to my research?

If not, would Open Phil consider funding such work? (either under this RFP or other funds)

Here is a recent example: Introducing SyDFAIS: A Systemic Design Framework for AI Safety Field-Building
