utilistrutil

Eliminating the malice or recklessness requirement and allowing punitive damages calculations to account for unrealized uninsurable risk are both big asks to make of common law courts.

Would it make a difference if the risks were insured?

A Case for Superhuman Governance, using AI

utilistrutil 4mo 3

Welsh government commits to making lying in politics illegal

A Case for Superhuman Governance, using AI

utilistrutil 5mo 3

My main objection is that people working in government need to be able to get away with a mild level of lying and scheming to do their jobs (e.g., broker compromises, meet with constituents). AI could upset this equilibrium in a couple of ways, making it harder to govern.

If the AI is just naive, it might do things like call out a politician for telling a harmless white lie, jeopardizing, e.g., an international agreement that was about to be signed.
1. One response is that human overseers will discipline these naive mistakes, but the more human oversight is required, the more you run into the typical problems of human oversight you outlined above. "These evaluators can do so while not seeing critical private information" is not always true. (E.g., if the AI realizes that Biden is telling a minor lie to Xi based on classified information, revealing the existence of the lie to the overseer would necessarily reveal classified information.)
Even if the AI is not naive and can distinguish white lies from, say, outright misinformation, I still worry that it undermines the current equilibrium. The public would call for stricter and stricter oversight standards, while government workers would struggle to fight back because
1. That's a bad look, and
2. The benefits of a small level of deception are hard to identify and articulate.

TLDR: Government needs some humans in the loop making decisions and working together. To work together, humans need some latitude to behave in ways that would become difficult with greater AI integration.

Analogy Bank for AI Safety

utilistrutil 9mo 1

Thanks, Agustín! This is great.

Analogy Bank for AI Safety

utilistrutil 9mo 1

Please submit more concrete ones! I added "poetic" and "super abstract" as an advantage and disadvantage for fire.

Impact Assessment of AI Safety Camp (Arb Research)

utilistrutil 9mo 7

If the organization chooses to directly support the new researcher, then the net value depends on how much better their project is than the next-most-valuable project.

This is nit-picky, but if the new researcher proposes, say, the best project the org could support, it does not necessarily mean the org cannot support the second-best project (the "next-most-valuable project"), but it might mean that the sixth-best project becomes the seventh-best project, which the org then cannot support.

In general, adding a new project to the pool of projects does not trade off with the next-best project; instead, it pushes out the nth-best project, which would have received support but now does not meet the funding bar. So the marginal value of adding projects that receive support depends on the quality of the projects around the funding bar.

Another way you could think about this is that the net value of the researcher depends on how much better this bundle of projects is than the next-most-valuable bundle.

Essentially, this is the marginal value of new projects in AI safety research, which may be high or low depending on your view of the field.

So I still agree with this next sentence if "marginal" means the funding margin, i.e., the marginal project is one that sits right at the funding bar. I do not agree if "marginal" means producing a new researcher, who might be way above the funding bar.
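To make the funding-bar point concrete, here is a minimal sketch with made-up numbers. It assumes, purely for illustration, that the org funds a fixed number of projects and always picks the most valuable ones; the project values, the slot count, and the function names are all hypothetical.

```python
def funded_bundle(values, slots):
    """Return the `slots` most valuable projects, best first."""
    return sorted(values, reverse=True)[:slots]


def net_value_of_new_project(existing, new_value, slots):
    """Value of the best bundle with the new project minus the best bundle without it."""
    before = funded_bundle(existing, slots)
    after = funded_bundle(existing + [new_value], slots)
    return sum(after) - sum(before)


# Hypothetical project values; the org can fund six of them (assumption).
existing = [10, 9, 8, 7, 6, 5, 4]
slots = 6

# A new researcher proposes the best project in the pool (value 12).
gain = net_value_of_new_project(existing, 12, slots)

# The project pushed out is the old sixth-best (value 5), not the old best (value 10).
displaced = funded_bundle(existing, slots)[-1]

print(gain, displaced)
```

Under these assumptions the gain is 12 minus the value of the displaced project at the funding bar, not 12 minus the value of the next-most-valuable project. A new project that falls below the bar adds nothing.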

Farewell messages from the EA Philippines Core Team

utilistrutil 11mo 1

These are beautiful!! Made my day :))

Apply for MATS Winter 2023-24!

utilistrutil 1y 3

Update: We have finalized our selection of mentors.

Policy ideas for mitigating AI risk

utilistrutil 1y 3

I'll be looking forward to hearing more about your work on whistleblowing! I've heard some promising takes about this direction. Strikes me as broadly good and currently neglected.

What I would do if I wasn’t at ARC Evals

utilistrutil 1y 1

This is so well-written!

utilistrutil

Posts 12

Comments 51