Esben Kran

Co-director @ Apart Research
737 karma · Joined · Working (0-5 years) · apartresearch.com

Bio

🔶

Comments (48)

Topic contributions (1)

The main effect of regulation is to control certain net-negative outcomes and hence slow down harmful AGI development. RSPs that require halting development at ASL-4 or another threshold also fall under the pausing agenda. Perhaps it's a question of semantics, given how PauseAI and the Pause AI Letter have become the memetic sink for the term "pause AI"?

Great post, thank you for laying out the realities of the situation.

In my view, there are currently three main strategies pursued to solve X-risk:

  1. Slow / pause AI: Regulation, international coordination, and grassroots movements. Examples include UK AISI, EU AI Act, SB1047, METR, demonstrations, and PauseAI.
  2. Superintelligence security: From infrastructure hardening, RSPs, security at labs, and new internet protocols to defense of financial markets, defense against slaughterbots, and civilizational hedging strategies. Examples include UK ARIA, AI control, and some labs.
  3. Hope in AGI: Developing the aligned AGI and hoping it will solve all our problems. Examples include Anthropic and arguably most other AGI labs.

(3) seems weirdly overrated in AI safety circles. (1) seems incredibly important right now and radically under-emphasized. And in my eyes, (2) is the direction most new technical work should go. I'll defer to Anthropic's safety researchers on whether the labs have a plan beyond (3).

Echoing @Buck's point that there is now less need to be inside a lab for model-access reasons. And if the aim is to guide the organization from within, that has historically been somewhat futile in the face of capitalist incentives.

Answering on behalf of Apart Research!

We're a non-profit research and community-building lab with a strategic focus on high-volume frontier technical research. Apart is currently raising a round to run the lab through 2025 and 2026, but here I'll describe what your marginal donation may enable.

In just two years, Apart Research has established itself as a unique and efficient part of the AI safety ecosystem. Our research output includes 13 peer-reviewed papers published since 2023 at top venues including NeurIPS, ICLR, ACL, and EMNLP, with six main conference papers and nine workshop acceptances. Our work has been cited by OpenAI's Superalignment team, and our team members have contributed to significant publications like Anthropic's "Sleeper Agents" paper.

With this track record, we're able to capitalize on our position as an AI safety lab and direct our work toward impactful frontiers of technical work in governance, research methodology, and AI control.

Besides our ability to accelerate a Lab fellow's research career at an average direct cost of around $3k, to support research sprint participants for as little as $30, and to grow local groups at similarly high impact-to-cost ratios, your marginal donation can fund further impactful projects:

  1. Improved access to our programs ($7k-$25k): A professional revamp of our website and documentation would make our programs and research outputs more accessible to talented researchers worldwide. Building on the credibility established by our paper acceptances, a redesign would also help us appeal to institutional funders and technical professionals, scaling our impact through valuable counterfactual funding and talent discovery. At the higher end, we could also make our internal resources publicly available; these are specifically designed to accelerate technical AI safety careers.
  2. Higher conference attendance support ($20k): Currently, we only support one fellow per team to attend conferences. Additional funding would enable a second team member to attend, at approximately $2k per person.
  3. Improving worldview diversity in AI safety ($10k-$20k): We now work across all continents and find a lot of value in our approach of enabling international and underrepresented professional talent (alongside our work at organizations such as 7 of the top 10 universities). With this funding, you would enable more targeted outreach from Apart and support existing lab members' participation in conferences to discuss and represent AI safety to otherwise underrepresented professional groups.
  4. Continuing impactful research projects ($15k-$30k): We would be able to extend timely and critical research projects. For instance, we're looking to port our cyber-evaluations work to Inspect, making it a permanent part of UK AISI's catastrophic risk evaluations (a minimal sketch of such a port follows this list). Our recent paper also introduces novel methods for testing whether LLMs game public benchmarks, and we would like to expand that work to run the same tests on other high-impact benchmarks while making the results more accessible. These projects have a direct impact on AI evaluation methodology, and we see other opportunities like these for extending projects at reasonable follow-up costs.
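To make the Inspect port in item 4 concrete, here is a minimal sketch of what an evaluation task looks like in UK AISI's open-source Inspect framework (inspect_ai). The task name, dataset contents, and model choice below are hypothetical placeholders rather than Apart's actual cyber-evaluations, and the exact API may differ slightly between Inspect versions:

```python
# Minimal sketch of an evaluation task in Inspect (inspect_ai).
# The dataset and task below are hypothetical placeholders, not
# Apart's actual cyber-evaluation suite.
from inspect_ai import Task, task, eval
from inspect_ai.dataset import Sample
from inspect_ai.scorer import includes
from inspect_ai.solver import generate

@task
def cyber_eval_demo():
    # In a real port, the dataset would be loaded from the existing
    # cyber-evaluations benchmark rather than defined inline.
    dataset = [
        Sample(
            input="Which port does HTTPS use by default?",
            target="443",
        )
    ]
    return Task(
        dataset=dataset,
        solver=[generate()],   # ask the model for a completion
        scorer=includes(),     # check whether the target appears in the answer
    )

if __name__ == "__main__":
    # Run against any supported model provider, e.g.:
    eval(cyber_eval_demo(), model="openai/gpt-4o-mini")
```

Once a benchmark is expressed as a Task like this, it can be run against any supported model provider from Inspect's tooling, which is what makes the framework attractive as a permanent home for these evaluations.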
Donate to Apart Research

You'll be supporting a growing organization: the Apart Lab fellowship has already doubled from Q1 2024 to Q3 2024 (17 to 35 fellows), and our research sprints have moved thousands of participants closer to AI safety.

Given current AGI development timelines, the need to scale and improve safety research is urgent. In our view, Apart seems like one of the better investments to reduce AI risk.

If this sounds interesting and you'd like to hear more (or have a specific marginal project you'd like to see happen), my inbox is open.

Very interesting! We had a submission for the evals research sprint in August last year on the same topic. Check it out here: Turing Mirror: Evaluating the ability of LLMs to recognize LLM-generated text (apartresearch.com)

You are completely right. My main point is that the field of AI safety is under-utilizing commercial markets, while commercial AI indeed prioritizes reliability and security to a healthy degree.

Thank you so much for the talk, Paul! It was exciting to see the vignettes alongside the very practical first case. It will be interesting to see Straumli's entry onto the evaluations scene, since I think you have a solid case for success.

CoI statement: Straumli donated the prize money for the Governance Sprint, though none of it goes to me or Apart, only to the AI safety community.

I work as co-director of Apart Research, specifically with research management, AI safety research consulting, and field-building. I'm entrepreneurially focused.

Thank you for hosting this! I'll repost a question from Asya's retrospective post regarding response times for the fund.

our median response time from January 2022 to April 2023 was 29 days, but our current mean (across all time) is 54 days (although the mean is very unstable)

I would love to hear more about the numbers here. For instance, how did the median and mean change over time? What does the overall distribution look like? The disparity between the mean and median suggests there might be significant outliers; how are these addressed? I assume many applications are desk rejects; do you have the median and mean response times for accepted applications?
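As a toy illustration of why that disparity points to outliers (the numbers below are made up, not the Funds' actual data), a handful of long-tail decisions is enough to pull the mean far above the median:

```python
# Hypothetical response times in days -- not EA Funds' actual data.
from statistics import mean, median

response_times = [20, 22, 25, 27, 29, 31, 33, 35, 180, 240]  # two long-tail outliers

print(f"median: {median(response_times):.0f} days")  # 30 days
print(f"mean:   {mean(response_times):.0f} days")    # 64 days
```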

I was incredibly impressed by the tables of numbers in their impact evaluation. After speaking with the team, I've seen first-hand their ability to produce results, and their evaluation research methods certainly attest to this. This appears to be one of those rare opportunities where donations could have a significant counterfactual impact.

Edit: I am not in any way affiliated with FEM and randomly met one of the co-founders on a flight where we had a conversation about their work.

Thank you for sharing your reflections and for the work you've done on the EA Funds, Asya! I appreciate the role the Funds have played over the past years.

our median response time from January 2022 to April 2023 was 29 days, but our current mean (across all time) is 54 days (although the mean is very unstable)

A few questions arise from your mention of the Funds' response times. I would love to hear more about the numbers here. For instance, how did the median and mean change over time? What does the overall distribution look like? The disparity between the mean and median suggests there might be significant outliers; how are these addressed? I assume many applications are desk rejects; do you have the median and mean response times for accepted applications?
