@ EA Funds
25422 karma · Joined · Working (6-15 years) · openasteroidimpact.org


(I didn't think your comment was primarily referring to them fwiw)


A meta-norm I'd like commenters[1] to have is to Be Kind, When Possible. Some subpoints that might be helpful for enacting what I believe to be the relevant norms:

  • Try to understand/genuinely grapple with the awareness that you are talking to/about actual humans on the other side, not convenient abstractions/ideological punching bags. 
    • For example, most saliently to me, the Manifest organizers aren't an amorphous blob of bureaucratic institutions. 
      • They are ~3 specific people, all of whom are fairly young, new to organizing large events, and under a lot of stress as it is.
      • Rachel in particular played a (the?) central role in organizing, despite being 7(?) months pregnant. Organizing a new, major multiday event under such conditions is stressful enough as it is, and I'm sure the Manifest team in general, and Rachel in particular, were hoping they could relax a bit at the end.
      • It seems bad enough that a hit piece in the Guardian was written about them, but it's worse when "their" community wants to pile on, etc.
    • I'm not saying that you shouldn't criticize people. Criticism can be extremely valuable! But there are constructive, human, ways to criticize, and then there are...other ways.
  • Try to make your critiques as minimally personal as possible. Engage with arguments, don't attack individual people if at all possible. 
    • For example, I really appreciate Isa's comments. Judging by upvotes and karma, many other people did as well. 
    • It's reasonable if you disagree with her comments. But you can engage with them on the merits, rather than making comments personal, as some of the replies seem to.
    • Some of the replies implied that Isa said things she did not say. Some of the other replies implied that Isa's critiques are born out of projection or unnecessarily taking things personally.
      • On the object-level, I read her comments and I think that reading is just wrong. On the meta-level, even if that reading is correct, this is the type of thing that you should handle delicately, rather than harshly packaging it in an attack to make it harder to argue against you.
  • Relatedly, try to have decent theory of mind
    • People you argue with likely have different beliefs, values, and preferences from you. They are also likely to have different beliefs, values, or preferences than what a caricature of them will have, especially caricatures of the tier that you see in political cartoons, or on Twitter.
    • If you find yourself arguing with implausibly stereotyped caricatures, rather than real people, you should a) consider trying to flesh out enough details about the people you're arguing with to be at all plausible and/or b) take a step back and rethink your life choices.
    • I know this is difficult for some people, but you should at least try.
  • Try not to be ideologically captured, or at least notice when you are and don't take yourself too seriously. 
    • One of my greater failures as an internet commentator/dispassionate observer was during the whole SSC/NYT thing, where I failed to notice myself becoming increasingly tribal/using arguments as soldiers, etc.
      • The level of outrage was made worse by just how petty the underlying disputes were.
      • I do not think I thought or acted honorably according to my own values, and I'd prefer to minimize such tribalism going forward.
    • If you notice yourself in the grips of white-hot rage, consider logging off and doing something else: take a shower, watch Netflix, go back to work, touch grass, crochet, or anything that's not as corrosive to either your own soul or that of the community.
    • One dynamic I haven't seen other people mention so far is the prevalence of what my friend calls "Rationality Justice Warriors," people who (at least online) seem to champion a fairly uncompromising and aggressive stance on defending and invoking the norms and cultures of the rationality community whenever it appears to be under attack.
      • This comes across as fairly childish and frankly irrational to me, and I suspect most people acting under such attitudes will not reflectively endorse them.
      • Of course their opponents hardly do better.  
  • Model the effects of your sentences on other people. 
    • I'm worried that many comments will predictably push away valuable contributors to the community, or be very costly in other ways (eg wasted time/emotions).  
    • If you expect your sentences to predictably cause more heat than light, consider rephrasing what you said. Or perhaps, again, consider logging off. 
    • This is not to say you can't ever offer harsh criticism, or you need to always be nice. Being kind is not the same as being nice, and sometimes even true kindness has to be sacrificed for higher goals.
      • But you should be careful about what trades you make, and maybe don't sell out your kindness for too cheap a price (like the short-lived approval of your not-very-kind peers, or the brief euphoria of righteous rage) 
    • It might be the case that what "needs to be said" can't be said nicely, or rephrasing things in a diplomatic way takes more skill with the language than you have, or more time than you can afford. Under those circumstances, it is usually (not always) better to voice such criticisms than to leave them unsaid.
    • But you ought to be consciously making those tradeoffs, not falling into them blindly.  
    • Effects include epistemic effects. If you say technically true things that make people dumber, or worse, make obviously wrong and/or illogical arguments in pursuit of a "higher" purpose, you are at least a little bit responsible for poisoning the discourse, and you should again ask yourself whether it's worth it.
  • Consider the virtue of silence[2]
    • We are all busy people, and some of us have very high opportunity costs.
    • There are times to speak up, and times to stay silent, and let us all pray for the wisdom to know the difference.
  1. ^ I'm far from flawless on these grounds, myself. But I try my best. Or at least, I mostly try to try.
  2. ^ This is a virtue I'm personally exceedingly bad at practicing.


I think this is both rather uncharitable and implausible

(I replied more substantively on Metaculus)

Thanks! If anybody thinks they're in a good position to do so and would benefit from any or all of my points being clearer/more spelled out, feel free to DM me :)

I do regret using the Holocaust example. The example was loosely based on one speaker who appeared to be defending eugenics by saying that the Holocaust was actually considered a dysgenic event by top Nazi officials.

That sounds like an obviously invalid argument! Now, a) I didn't attend that talk, b) many people are bad at making arguments, and c) I've long suspected that poor reasoning especially is positively correlated with racism (and this is true even after typical range restriction). So it's certainly possible that the argument they made was literally that bad. 

But I think it's more likely that you misunderstood their argument. 

This is a rough draft of questions I'd be interested in asking Ilya et al. re: their new ASI company. It's a subset of questions that I think are important to get right for navigating the safe transition to superhuman AI.

(I'm only ~3-7% confident that this will reach Ilya or a different cofounder organically, eg because they read LessWrong or from a vanity Google search. If you do know them and want to bring these questions to their attention, I'd appreciate you telling me so I have a chance to polish the questions first.)

  1. What's your plan to keep your model weights secure, from i) random hackers/criminal groups, ii) corporate espionage and iii) nation-state actors?
    1. In particular, do you have a plan to invite e.g. the US or Israeli governments for help with your defensive cybersecurity? (I weakly think you have to, to have any chance of successful defense against the stronger elements of iii)). 
    2. If you do end up inviting gov't help with defensive cybersecurity, how do you intend to prevent gov'ts from building backdoors? 
    3. Alternatively, do you have plans to negotiate with various nation-state actors (and have public commitments about this in writing, to the degree that any gov't actions are legally enforceable at all) about which things they categorically should not do with AIs you develop?
      1. (I actually suspect the major AGI projects will be nationalized anyway, so it might be helpful to plan in advance for that transition)
  2. If you're banking on getting to safe AGI/ASI faster than other actors because of algorithmic insights and conceptual breakthroughs, how do you intend to keep your insights secret? This is a different problem from securing model weights, as your employees will inevitably leak information at SF parties, in ways that are much more ambiguous than exfiltrating all the weights on a flash drive.
  3. What's your planned corporate governance structure? We've seen utter failures of corporate governance in AI before, as you know. My current guess is that "innovations in corporate governance" is a red flag, and you should aim for a corporate governance structure that's as close to tried-and-tested systems as possible (I'll leave it to actual corporate governance lawyers to suggest a good alternative).
  4. We know that the other AGI labs tend to publicly claim they're in favor of regulations that have teeth and then secretly take actions (lobbying) to weaken significant regulations/limitations on frontier labs. Can you publicly commit in advance that you will not do that? Either commit to:
    1. Not lobbying against good safety regulations privately, or
    2. Not publicly saying you are pro-regulation for regulations you don't actually like, and generally avoiding talking about politics in ways that will leave a deceptive impression.
  5. What's your plan to stop if things aren't going according to plan, eg because capability gains outstrip safety? I don't think "oh we'll just stop because we're good, safety-concerned, people" is a reasonable belief to have, given the evidence available.
    1. Your incentives are (in my opinion) massively pointed towards acceleration, your VCs will push you to acceleration, your staff will be glory-seeking, normal competitive dynamics will cause you to cut corners, etc, etc. 
    2. You probably need very strong, legal, unambiguous and (probably) public commitments to have any chance of turning on the brakes when things get crazy
  6. I personally suspect that you will be too slow to get to AGI, because AGI is bottlenecked on money (compute) and data, not algorithmic insights and genius conceptual breakthroughs. And I think you'll be worse at raising money than the other players, despite being a top scientist in the field (from my perspective this is not obviously bad news). If you end up deciding I'm correct, at what point do you a) shutter your company and stop working on AI, or b) fold and entirely focus on AI safety, either independently or as a lab, rather than capabilities + safety?
  7. Suppose on the other hand you actually have a viable crack at AGI/ASI. In the event that another actor(s) is ahead in the race towards ASI, and they're very close, can you commit in advance to the conditions under which you'd be willing to shut down and do something similar to "merge and assist" (eg after specific safety guarantees from the leading actor)?
  8. If you end up deciding your company is net bad for the world, and that problem is irrecoverable, do you have a plan to make sure it shuts down, rather than you getting ousted (again) and the employees continuing on with the "mission" of hurtling us towards doom?
  9. Do you have a whistleblower policy? If not, do you have plans to make a public whistleblower policy, based on a combination of best practices from other fields and stuff Christiano writes about here? My understanding is that you have first-hand experience with how whistleblowing can go badly, so it seems valuable to make sure it gets done well. 
  10. (out of curiosity) Why did you decide to make your company one focused on building safe AGI yourself, rather than a company or nonprofit focused on safety research? 
    1. Eg I'd guess that Anthropic and maybe Google DeepMind would be happy to come up with an arrangement to lease their frontier models to you for you to focus on developing safety tools.

I'll leave other AGI-safety relevant questions like alignment, evaluations, and short-term race dynamics, to others with greater expertise. 

I do not view the questions I ask as ones I'm an expert on either, just ones where I perceive relatively few people are "on the ball" so to speak, so hopefully a generalist paying attention to the space can be helpful.

Are you sure there are basically no wins?

Nope, not sure at all. Just vague impression. 

Kaj Sotala has an interesting anecdote about the game DragonBox in this blog post. Apparently it's a super fun puzzle game that incidentally teaches kids basic algebra.

@Kaj_Sotala wrote that post 11 years ago, titled "Why I’m considering a career in educational games." I'd be interested to see if he still stands by it and/or has more convincing arguments by now.

Thanks. I appreciate your kind words.

IMO if EA Funds isn't representative of EA, I'm not sure what is.

I think there's a consistent view where EA is about doing careful, thoughtful analysis with uniformly and transparently high rigor, communicating those analyses transparently and legibly, and (almost) always making decisions entirely according to such analyses as well as strong empirical evidence. Under that view GiveWell, and for that matter, JPAL, are much more representative of what EA ought to be about than what at least LTFF tries to do in practice.

I don't know how popular the view I described above is. But I definitely have sympathy towards it.

Thanks. I agree here that "criminals" seem a more plausible interpretation of what he said than "woke activists." I also definitely sympathize with an unthinking tweet written in the moment being misinterpreted, especially by people on the EA Forum.

I think generally though it's easy to misunderstand people, and if people respond to clarify, you should believe what they say they meant to say, not your interpretation of what they said. 

I agree this is true in general. I think we might have different underlying probabilities of how accurate that model is however. In particular, I find it rather plausible that people pushing for "edgy" political beliefs will intentionally backtrack when challenged. I also have a cached view that this type of strategic ambiguity is particularly popular among the alt-right (not saying that other political factions are innocent here). 

And in this particular case, I'd note that the incentive for falsifying what he meant is massive.

Again, I don't know Richard and how strong his desire is to always be consistently candid about what he means. It's definitely possible that he's unusually truth-seeking (my guess is that some of his defenders will point to that as one of his chief virtues). I'm just saying that you should not exclude deception from the hypothesis space in situations similar to this one.
