L Rudolf L

You discuss three types of AI safety ventures:

  • Infrastructure: Tooling, mentorship, training, or legal support for researchers.
  • New AI Safety Organizations: New labs or fellowship programs.
  • Advocacy Organizations: Raising awareness about the field.

Where would, for example, insurance for AI products fit into this? This is a for-profit idea that creates a natural business incentive to understand & research risks from AI products at a very granular level, and if it succeeds, it puts you in a position to influence the entire industry (e.g. "we will lower your premiums if you implement safety measure X").

I agree that if you restrict yourself to either supporting AIS researchers, launching field-building projects or research labs, or doing advocacy, then you will in fact not find good startup ideas, for the structural reasons you do a good job of listing in your post, as well as the fact that these are all things people are already doing.

METR is a very good AIS org. In addition to just being really solid and competent, a lot of why they succeeded was that they started doing something that few people were thinking about at the time. Everyone and their dog is launching an evals startup today, but the real value is finding ideas like METR before they are widespread. If the startup ideas you consider are all about doing the same thing that existing orgs do, you will miss out on the most important ones.

I do agree that the intersection of impact & profit & bootstrappability is small and hard to hit, and there's no law of nature that says something should definitely exist there. But if something does exist in that corner, it will be a novel type of thing.

(reposted from a Slack thread)

I've been thinking about this space. I have some ideas for hacky projects in the direction of "argument type-checkers"; if you're interested in this, let me know.

I'd like to add an asterisk. It is true that you can and should support things that seem good while they seem good and then retract support, or express support on the margin but not absolutely. But sometimes supporting things for a period has effects you can't easily take back. This is especially the case if (1) added marginal support summons some bigger version of the thing that, once in place, cannot be re-bottled, or (2) increased clout for that thing changes the culture significantly (I think cultural changes are very hard to reverse; culture generally doesn't go back, only moves on).

I think there are many cases where, before throwing their lot in with a political cause for instrumental reasons, people should've first paused to think more about whether this is the type of thing they'd like to see more of in general. Political movements also tend to have an enormous amount of inertia, and often end up very influenced by path-dependence and memetic fitness gradients.

I think it's worth trying hard to stick to strict epistemic norms. The main argument you bring against this is that it's more effective to be more permissive about bad epistemics. I doubt this. It seems to me that people overstate the track record of populist activism at solving complicated problems. If you're considering populist activism, I would think hard about where, how, and on what it has worked.

Consider environmentalism. It seems quite uncertain whether the environmentalist movement has been net positive (!). This is an insane admission to have to make, given that the science is fairly straightforward, environmentalism is clearly necessary, and the movement has had huge wins (e.g. a massive shift in public opinion, pushing governments to make commitments, & many mundane environmental improvements in developed-country cities over the past few decades). However, the environmentalist movement has repeatedly spent enormous effort on directly harming its own stated goals through things like opposing nuclear power and GMOs. These failures seem very directly related to bad epistemics.

In contrast, consider EA. It's not trivial to imagine a movement much worse along the activist/populist metrics than EA. But EA seems quite likely positive on net, and the loosely-construed EA community has gained a striking amount of power despite its structural disadvantages.

Or consider nuclear strategy. It seems a lot of the influence came from e.g. the staff of RAND and other sober-minded, highly selected, epistemically strong actors. Do you want more insiders at think-tanks and governments and companies, and more people writing thoughtful pieces that swing elite opinion, all working in a field widely seen as credible and serious? Or do you want more loud activists protesting on the streets?

I'm definitely not an expert here, but by thinking through what I understand about the few cases I can think of, the impression I get is that activism and protest have worked best to fix the wrongs of simple and widespread political oppression, but that on complex technical issues higher-bandwidth methods are usually how actual progress is made.

I think there are also some powerful but abstract points:

  1. Choosing your methods is not just a choice over methods, but also a choice over who you appeal to. And who you appeal to will change the composition of your movement, and therefore, in the long run, the choice of methods. Consider carefully before summoning forces you can't control (this applies to superhuman AI as well as to epistemically-shoddy charismatic activist-leaders).
  2. If we make the conversation about AIS more thoughtful, reasonable, and rational, it increases the chances that the right thing (whatever that ends up being - I think we should have a lot of intellectual humility here!) ends up winning. If we make it more activist, political, and emotional, we privilege the voice of whoever is better at activism, politics, and narratives. I think you basically always want to push the thoughtfulness/reasonableness/rationality. This point is made well in one of Scott Alexander's best essays (see section IV in particular, for the concept of asymmetric vs symmetric weapons). There is a spirit here, of truth-seeking and liberalism and building things, of fighting Moloch rather than sacrificing our epistemics to him for +30% social clout. I admit that this is partly an aesthetic preference on my part. But I do believe in it strongly.

(A) Call this "Request For Researchers" (RFR). OpenPhil has tried a more general version of this in the form of the Century Fellowship, but they discontinued it. That in turn is a Thiel Fellowship clone, like several other programs (e.g. Magnificent Grants). The early years of the Thiel Fellowship show that this can work, but I think it's hard to do well, and it does not seem like OpenPhil wants to keep trying.

(B) I think it would be great for some people to get support for multiple years. PhDs work like this, and good research can be hard to do over a series of short few-month grants. But the long durations do also make them pretty high-stakes bets, and you need to select hard not just on research skill but also on the character traits that mean people don't need external incentives.

(C) I think "agenda-agnostic" and "high quality" might be hard to combine. It seems like there are three main ways to select good people: rely on competence signals (e.g. lots of cited papers, a job at a selective organisation), rely on more-or-less standardised tests (e.g. a typical programming interview, SATs), or rely on inside-view judgements of what's good in some domain. New researchers are hard to assess by the first, and I don't think there's a cheap programming-interview-but-for-research-in-general that spots research talent at high rates, so it seems you have to rely a bunch on the third. And this is very correlated with agendas; a researcher in domain X will be good at judging ideas in that domain, but less so in others.

The style of this that I'd find most promising is:

  1. Someone with a good overview of the field (e.g. at OpenPhil) picks a few "department chairs", each with some agenda/topic.
  2. Each department chair picks a few research leads who they think have promising work/ideas in the direction of their expertise.
  3. These research leads then get collaborators/money/ops/compute through the department.

I think this would be better than a grab-bag of people selected according to credentials and generic competence, because I think an important part of the research talent selection process is the part where someone with good research taste endorses someone else's agenda takes on agenda-specific, inside-view grounds.

Yes, letting them specifically set a distribution, especially as this was implicitly done anyway in the data analysis, would have been better. We'd want to normalise this somehow, either by trusting and/or checking that it's a plausible distribution (i.e. sums to 1), or by just letting them rate things on a scale of 1-10 and then getting an implied "distribution" from that.
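As a rough illustration of what I mean, here is a minimal sketch (my own, not from the actual analysis; the function names and tolerance are hypothetical) of the two options: validating an explicitly stated distribution, or deriving an implied one from 1-10 ratings:

```python
# Hypothetical helpers illustrating the two normalisation options described above.

def validate_distribution(probs, tol=0.01):
    """Check that a respondent's stated probabilities form a plausible distribution."""
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probs) - 1.0) > tol:
        raise ValueError(f"probabilities sum to {sum(probs):.3f}, expected ~1")
    return list(probs)

def implied_distribution(ratings):
    """Turn 1-10 ratings into an implied 'distribution' by normalising them to sum to 1."""
    total = sum(ratings)
    return [r / total for r in ratings]

# Example: a respondent rates three sources of value 8, 4, and 2 out of 10.
print(implied_distribution([8, 4, 2]))  # [0.571..., 0.286..., 0.143...]
```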

I agree that this is confusing. Also note:

 Interestingly, the increase in perceived comfort with entrepreneurial projects is larger for every org than that for research. Perhaps the (mostly young) fellows generally just get slightly more comfortable with every type of thing as they gain experience.

However, this is additional evidence that ERI programs are not increasing fellows' self-perceived comfort with research any more than they increase fellows' comfort with anything. It would be interesting to see if mentors of fellows think they have improved overall; it may be that changes in self-perception and actual skill don't correlate very much.

And also note that fellows consistently rated the programs as providing, on average, slightly higher research skill gain than standard academic internships (an average of 5.7 on a 1-10 scale where 5 = the skill gain from a standard academic internship; see the "perceived skills and skill changes" section).

I can think of many possible theories, including:

  • fellows don't become more comfortable with research despite gaining competence at it because the competence does not lead to feeling good at research (e.g. maybe they update towards research being hard, or there is some form of Dunning-Kruger type thing here, or they already feel pretty comfortable as you mention); therefore self-rated research comfort is a bad indicator and we might instead try e.g. asking their mentors or looking at some other external metric
  • fellows don't actually get better at research, but still rate it as a top source of value because they want to think they did; their comfort with research not increasing (relative to everything else) is then a more reliable indicator than their marking it as a top source of value (and also they either have a low opinion of skill gain from standard academic internships, or haven't experienced those and are just (pessimistically) imagining what it would be like)

The main way to answer this seems to be getting a non-self-rated measure of research skill change.

For "virtual/intellectual hub", the central example in my mind was the EA Forum, and more generally the way in which there's a web of links (both literal hyperlinks and vaguer things) between the Forum, EA-relevant blogs, work put out by EA orgs, etc. Specifically in the sense that if you stumble across and properly engage with one bit of it, e.g. an EA blog post on wild animal suffering, then there's a high (I'd guess?) chance you'll soon see a lot of the other stuff too, like the centralised infrastructure (the Forum, 80k advising) and the central ideas (cause prio, x-risk). Therefore maybe the virtual/physical distinction was a bit misleading, and the real distinction is more like "Schelling point for intellectual output / ideas" vs "Schelling point for meeting people".

That being said, a point that comes to mind is that geographic dispersion is one of the most annoying things for real-world Schelling points and totally absent* if you do it virtually, so maybe there's some perspective like "don't think of EAGx Virtual as recreating an EAG but virtually, but rather as a chance to create a meeting-people Schelling point without the traditional constraints, and maybe this ends up looking more ambitious"?

(*minus timezones, but you can mail people melatonin beforehand :) )

I mentioned the danger of bringing in people mostly driven by personal gain (though very briefly). I think your point about niche weirdo groups finding some types of coordination and trust very easy is underrated. As another post points out, the transition to positive personal incentives to do EA stuff is a new thing that will cause some problems, and it's unclear what to do about it (though as that post also says, "EA purity" tests are probably a bad idea).

I think the maximally-ambitious view of the EA Schelling point is one that attracts anyone who fits into the intersection of altruistic, ambitious / quantitative (in the sense of caring about the quantity of good done and wanting to make that big), and talented/competent in relevant ways. I think hardcore STEM weirdness becoming a defining EA feature (rather than just a hard-to-avoid incidental feature of a lot of it) would prevent achieving this.

In general, the wider the net you want to cast, the harder it is to become a clear Schelling point, both for cultural reasons (subgroup cultures tend to be more specific than their purpose strictly implies, and broad cultures tend to split), and for capacity reasons (it's harder to get many people to hear about something than a few, plus simple practical things like big conferences costing more money and effort).

There is definitely an entire other post (or more) that could be written about how much of EA, and which parts, should be a Schelling-point or platform-type thing, comparing the pros and cons. In this post I don't even attempt to weigh this kind of choice.

I agree that in practice x-risk involves different types of work and people than e.g. global poverty or animal welfare. I also agree that there is a danger of x-risk / long-termism cannibalizing the rest of the movement, and this might easily lead to bad-on-net things like effectively trading large amounts of non-x-risk work for very little x-risk / long-termist work (because the x-risk people would have found their way to this work anyway had x-risk been a smaller fraction of the movement, but as a consequence of x-risk preeminence a lot of other people are not sufficiently attracted to even start engaging with EA ideas).

However, I worry about something like intellectual honesty. Effective Altruism, both the term and the concept, is about effective forms of helping other people, and lots of people keep coming to the conclusion that preventing x-risks is one of the best ways of doing so. It seems almost intellectually dishonest to try to cut off or "hide" (in the weak sense of reducing the public salience of) that connection. One of the main strengths of EA is that it keeps pursuing that whole "impartial welfarist good" thing even if it leads to weird places, and I think EA should be open about the fact that weird things seem to follow from trying to do charity rigorously.

I think ideally this looks like global poverty, animal welfare, x-risk, and other cause areas all sitting under the EA umbrella, and engaged EAs in all of these areas being aware that the other causes are also things that people following EA principles have been drawn towards (and therefore prompted to weigh them against each other in their own decisions and cause prioritization). Of course this also requires that one cause area does not monopolize the EA image.

I agree with your concern about the combination seeming incongruous, but I think there are good ways to pitch this while tying them all into core EA ideas, e.g. something like:

If you start thinking quantitatively about how to do the most good, you might realize that some especially promising ways of doing good are cases where:
- there is clear evidence that a small amount of money goes far, like helping extremely poor people in developing countries
- some suffering has clearly not historically been taken into account, like animal welfare
- the stakes are absolutely huge, like plausible catastrophes that might affect the entire world

I think you also overestimate the cultural congruence between non-x-risk causes like, for example, global poverty and animal welfare. The people drawn to these causes span from hard-nosed veteran economists who think everything vegan is hippy nonsense and only people matter morally, to young non-technical vegans with concern for everything right down to worms. Grouping these causes together only looks normal because you're so used to EA.

(likewise, I expect low weirdness gradients between x-risk and non-x-risk causes, e.g. nuclear risk reduction policy and developing-country economic development, or GCBRs and neglected diseases)
