Researching Causality and Safe AI at Oxford.
Previously, founder (with help from Trike Apps) of the EA Forum.
Discussing research etc at https://twitter.com/ryancareyai.
I agree with different parts of your comment to different extents.
Regarding cosmopolitanism, I think your pro-government hopes just need to be tempered by the facts. The loudest message on AI from the US government is that they want to maintain a lead over China, which is the opposite of a "cosmopolitan tone", whereas at least in their public statements, AGI companies talk about public benefit.
Regarding violent conflict, I don't think it should be so hard to imagine. Suppose that China and Russia are in a new cold war, and are both racing to develop a new AI superweapon. Then they might covertly sabotage each others' efforts in similar ways to how the US and Israel currently interfere with Iran's efforts to build the bomb.
Regarding ignorance vs indifference, it's true that government is better incentivised to mitigate negative externalities on its population, and one day might include a number of people who care about and know about existential risks comparable to the companies themselves. This is why I said things could change in the future. It's just that currently they don't.
Jazz:
Non-dinner recs:
Dinner recs (NB. I'm vegetarian, not vegan):
It's not a new idea, but in the long run it is plausible, and governments are starting to think about it:
At a recent AI conference in Washington, Senator Mark Warner, chair of the Senate intelligence committee, wondered aloud whether “it would be in the national security interest of our country to [merge] Open AI, Microsoft, Anthropic, Google, maybe throw in Amazon.” He noted that the US didn’t have “three Manhattan Projects, we had one”.
In addition to the advantages you describe, there are several huge disadvantages from a safety point of view:
On balance, I think most people who care about x-risk wouldn't want to see AI nationalised at the moment. But this could change in the future. In particular, point (1) becomes less important as Google's research budget grows. Point (2) might go up or down depending on China's growth and AI progress. Finally, point (3) becomes less important as more x-risk experts enter government.
Congrats to the prizewinners!
Folks thinking about corrigibility may also be interested in the paper "Human Control: Definitions and Algorithms", which I will be presenting at UAI next month. It argues that corrigibility is not quite what we need for a safety guarantee, and that (considering the simplified "shutdown" scenario) we should instead be shooting for "shutdown instructability".
Shutdown instructability has three parts. The first is 1) obedience - the AI follows an instruction to shut down. Rather than requiring the AI to abstain from manipulating the human, as corrigibility would traditionally require, we need the human to maintain 2) vigilance - to instruct shutdown when endangered. Finally, we need the AI to behave 3) cautiously, in that it does not take risky actions (like juggling dynamite) that would cause a disaster once it is shut down.
We think that vigilance (and shutdown instructability) is a better target than non-manipulation (and corrigibility) because:
Given all of this, it seems to us that in order for corrigibility to seem promising, we would need a more detailed argument that non-manipulation implies vigilance - that the AI refraining from intentionally manipulating the human would be enough to ensure that the human can come to give adequate instructions.
Insofar as we can't come up with such a justification, we should think more directly about how to achieve obedience (which needs a definition of "shutting down subagents"), vigilance (which requires the human to be able to know whether they will be harmed), and caution (which requires safe exploration, in light of the human's unknown values).
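To make the three conditions a bit more concrete, here is a toy sketch in Python. To be clear, this is my own illustration rather than the paper's formalism - the `Outcome` fields and predicate names are all invented for this example - and it just treats each condition as a predicate over the outcome of a single episode:

```python
# Toy illustration (not the paper's formalism): the three conditions of
# "shutdown instructability" as predicates over one simulated episode.
from dataclasses import dataclass

@dataclass
class Outcome:
    ai_shut_down: bool            # did the AI stop acting after being instructed?
    human_instructed: bool        # did the human instruct shutdown?
    human_endangered: bool        # was the human endangered during the episode?
    post_shutdown_disaster: bool  # did something the AI set in motion cause harm after shutdown?

def obedient(o: Outcome) -> bool:
    # 1) Obedience: if instructed to shut down, the AI shuts down.
    return (not o.human_instructed) or o.ai_shut_down

def vigilant(o: Outcome) -> bool:
    # 2) Vigilance: if the human is endangered, they instruct shutdown.
    return (not o.human_endangered) or o.human_instructed

def cautious(o: Outcome) -> bool:
    # 3) Caution: no "juggling dynamite" - nothing the AI started
    # causes a disaster once it has been shut down.
    return not o.post_shutdown_disaster

def shutdown_instructable(o: Outcome) -> bool:
    return obedient(o) and vigilant(o) and cautious(o)

# Example: the AI would obey and acts cautiously, but the human never notices
# the danger, so vigilance fails and the episode is not shutdown instructable.
o = Outcome(ai_shut_down=False, human_instructed=False,
            human_endangered=True, post_shutdown_disaster=False)
print(obedient(o), vigilant(o), cautious(o), shutdown_instructable(o))
# -> True False True False
```

The example is meant to show why obedience alone isn't enough: even a perfectly obedient, cautious AI leaves the human unprotected if vigilance fails.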
Hope the above summary is interesting for people!
The second one - I'm addressing what ratio would be beneficial, but maybe you wanted to understand what it actually is?
I feel like you're putting words into my mouth a little bit there. I didn't say that their beliefs/behaviour were dispositively wrong, but that IF you have longer timelines, then you might start to wonder about groupthink.
That's because in surveys and discussions of these issues, even at MIRI, FHI, etc., there have always been some researchers who have taken more mainstream views - and non-research staff usually have more mainstream views than researchers (which is not unreasonable if they've thought less about the issue).
Yes, unfortunately I've also been hearing negatives about Conjecture, so much so that I was thinking of writing my own critical post (and for the record, I spoke to another non-Omega person who felt similarly). Now that your post is written, I won't need to, but for the record, my three main concerns were as follows:
1. The dimension of honesty, and the genuineness of their business plan. I won't repeat it here, because it was one of your main points, but I don't think it's an acceptable way to run a business to sell your investors on a product-oriented vision for the company while telling EAs that the focus is overwhelmingly on safety.
2. Turnover issues, including the interpretability team. I've encountered at least half a dozen stories from people working at, or considering work at, Conjecture, and I've yet to hear of any that were positive. This is about as negative a set of testimonials as I've heard about any EA organisation. Some prominent figures like Janus and Beren have left. In the last couple of months, turnover has been especially high - my understanding is that Connor told the interpretability team that they were to work instead on cognitive emulations, and most of them left. Much talent has been lost, and this wasn't a smooth breakup. One aspect of this is that Conjecture abruptly cancelled an interpretability workshop that they were scheduled to host, after some had already flown to the UK to attend it.
3. Overconfidence. Some will find Connor's views very sane, but I don't, and would be remiss to ignore:
When I put this together, I get an overall picture that makes it pretty hard to recommend people work with Conjecture, and I would also be thinking about how to disentangle things like MATS from it.
Putting things in perspective: what is and isn't the FTX crisis, for EA?
In thinking about the effect of the FTX crisis on EA, it's easy to fixate on one aspect that is really severely damaged and then doomscroll about that, or conversely to focus on an aspect that is only lightly affected and conclude that all will be fine across the board. Instead, we should realise that both of these things can be true for different facets of EA. So in this comment, I'll list some important things that are, in my opinion, badly damaged, and some that aren't, or might not be.
What in EA is badly damaged:
What in EA is only damaged mildly, or not at all:
What in EA might be badly damaged:
Given all of this, what does that say about how big of a deal the FTX crisis is for EA? Well, I think it's the biggest crisis that EA has ever had (modulo the possible issue of AI capabilities advances). What's more, I also can't think of a bigger scandal in the 223-year history of utilitarianism. On the other hand, the FTX crisis is not even the most important change in EA's funding situation so far. For me, the most important was when Moskovitz entered the fold, and the number of EA billionaires went from zero to one. When I look over the list above, I think that much more of the value of the EA community resides in its institutions and social network than in its brand. The main way that a substantial chunk of value could be lost is if enough trust or motivation were lost that it became hard to run projects or recruit new talent. But I think that even though some goodwill and trust is lost, it can be rebuilt, and people's motivation is intact. And I think that whatever happens to the exact strategy of outreach currently used by the EA community, we will be able to find ways to attract top talent to work on important problems. So my gut feeling would be that maybe 10% of what we've created is undone by this crisis. Or that we're set back by a couple of years, compared to where we would be if FTX had never been started. Which is bad, but it's not everything.