The first point seems to be saying that we should factor the chance that a program works into cost-effectiveness analyses. Isn't this already a part of all such analyses? If it isn't, I'm very surprised, and think it would be a much more important topic for an essay than anything about PEPFAR in particular.
The second point, that people should consider whether a project is politically feasible, is well taken. It sounds like the lesson here is: if you find yourself having to recommend either project A or project B, and both are good, and A is better than B, but activism for A still won't make it happen, while activism for B would push it over the edge into happening, then do activism for B. I agree with this as far as it goes, but there seem to be some important caveats:
Update: I think Bing passes the high school essay bar, based on the section "B- Essays No More" at https://oneusefulthing.substack.com/p/i-hope-you-werent-getting-too-comfortable
I find this interesting, but I also have a hard time identifying any meaningful patterns. For example, one might expect red points to be clustered at the top for Manifold, indicating that more forecasts mean better performance, but we don't see that here. The comparison may be somewhat limited anyway: in the eyes of the Metaculus community prediction, all forecasts are created equal, whereas on Manifold users can invest different amounts of money, so a single user can in principle have an outsized influence on the overall market price if they are willing to spend enough. I'd be interested to see more on how accuracy on Manifold changes with the number of traders and overall trading volume. Who knows, maybe Manifold would be ahead if it had a similar number of forecasters to Metaculus?
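To be concrete about the kind of analysis I'd like to see, here's a rough sketch in Python (the file name and column names are made up, not Manifold's actual export format): bin resolved markets by trader count and compare average Brier scores per bin.

```python
import pandas as pd

# Hypothetical dataset of resolved Manifold markets, one row per market, with
# columns: "prob" (market probability at some fixed time before resolution),
# "outcome" (1 if resolved YES, 0 if NO), and "n_traders".
markets = pd.read_csv("manifold_resolved_markets.csv")

# Brier score per market: squared error of the forecast against the outcome.
markets["brier"] = (markets["prob"] - markets["outcome"]) ** 2

# Bin markets by number of traders and compare average Brier scores
# (lower is better).
bins = [0, 10, 30, 100, 300, float("inf")]
labels = ["<10", "10-29", "30-99", "100-299", "300+"]
markets["trader_bin"] = pd.cut(markets["n_traders"], bins=bins,
                               labels=labels, right=False)

print(markets.groupby("trader_bin", observed=True)["brier"]
             .agg(["mean", "count"]))
```

If the high-trader bins look clearly better, that would suggest the gap with Metaculus is mostly about participation rather than mechanism.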
Does this mean that if you controlled for the number of forecasters, you still think Metaculus would beat Manifold? If not, do you have any opinion on this question? (Sorry if I missed it.)
Thank you for doing this! I was working on a similar project and mostly came up with the same headline finding as you: the experts seemed well-calibrated. I did judge a few of the milestones a little differently, though, and would like to hear why you chose the way you did, so I can decide whether or not to change mine.
Beyond those quibbles: I was also looking at https://aiimpacts.org/2022-expert-survey-on-progress-in-ai/#Data (the dataset itself; the summary doesn't include the milestones). This new version seems like total garbage. The experts continue to predict that several of the milestones are five years out, including milestones that were achieved by ChatGPT (i.e. a few months after the survey) and at least one milestone that had already clearly been achieved by the time the survey was released! Unless there's some reason to think the new crop of experts is worse than the old one, this makes me think they only did okay last time by luck/coincidence, and actually they have no idea what they're doing.
(I don't think it works to say that the period 2017-2022.5 was predictable, but the period 2022.5-2023 wasn't, because part of what the 2017 experts were right about was ChatGPT, which came out in late 2022.)
Thanks for asking. One reason we decided to start with forecasting is that we think it carries lower risks than other fields like AI or biotech.
If this goes well and we move on to a more generic round, we'll include our thoughts on this, which will probably involve a commitment not to oracular-fund projects that seemed risky when proposed, and maybe banning some extremely risky projects from the market entirely. I realize we didn't explicitly say that here; that's because this is a simplified test round and we think the forecasting focus makes risks pretty unlikely.
In the unlikely event that someone proposes a forecasting project under $20,000 that we think carries significant risk, we're prepared to take those steps this time too.
In 2018, I collected data about several types of sexual harassment on the SSC survey, which I will report here to help inform the discussion. I'm going to simplify by assuming that only cis women are victims and only cis men are perpetrators, even though that's bad and wrong.
Women who identified as EA were less likely to report lifetime sexual harassment at work than other women (18% vs. 20%). They were also less likely to report being sexually harassed outside of work (57% vs. 61%).
Men who identified as EA were less likely to admit to sexually harassing people at work (2.1% vs. 2.9%) or outside of work (16.2% vs. 16.5%).
The sample was 270 non-EA women, 99 EA women, 4,940 non-EA men, and 683 EA men. None of these results were statistically significant, although all of them trended in the direction of less sexual harassment among EAs.
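If anyone wants to double-check the significance claim, here's a quick sketch using the numbers above (the counts are reconstructed by rounding the reported percentages, so they're approximate):

```python
from statsmodels.stats.proportion import proportions_ztest

# Approximate counts reconstructed from the percentages reported above.
# Format: (label, EA count, EA n, non-EA count, non-EA n)
comparisons = [
    ("women harassed at work",      round(0.18 * 99),    99, round(0.20 * 270),   270),
    ("women harassed outside work", round(0.57 * 99),    99, round(0.61 * 270),   270),
    ("men harassing at work",       round(0.021 * 683), 683, round(0.029 * 4940), 4940),
    ("men harassing outside work",  round(0.162 * 683), 683, round(0.165 * 4940), 4940),
]

for label, ea_count, ea_n, other_count, other_n in comparisons:
    # Two-sided two-proportion z-test: is the EA rate different from the non-EA rate?
    z, p = proportions_ztest([ea_count, other_count], [ea_n, other_n])
    print(f"{label}: z = {z:.2f}, p = {p:.2f}")
```

None of these come close to p = 0.05, which is what you'd expect with subgroups this small.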
This doesn't prove that EA environments have less harassment than the average environment, since it could be that EAs are predisposed toward less sexual harassment for other reasons, and whatever additional harassment they get in EA isn't enough to make up for it; the vast majority of EAs have the vast majority of their interactions in non-EA environments. I tried to sort of get around this by limiting my analysis to people living in California, on the grounds that they were more likely to be plugged into EA communities and jobs. Conditional on being a woman in California, being EA did make someone more likely to experience sexual harassment, consistently, as measured in many different ways. But Californian EAs were also younger, much more bisexual, and much more polyamorous than Californian non-EAs; adjusting for sexuality and polyamory didn't remove the gap, but age was harder to adjust for and I didn't try. EAs who said they were working at charitable jobs whose effectiveness they had explicitly calculated had lower harassment rates than the average person, but those working at charitable jobs whose effectiveness they hadn't explicitly calculated had higher rates. All of these subgroup analyses had very small sample sizes.
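For anyone who wants to redo the adjustment including age (which I skipped), a minimal sketch with statsmodels would look something like this; the column names are hypothetical and you'd need to map them onto the actual survey fields:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical column names: "harassed" (0/1), "is_ea" (0/1), "bisexual" (0/1),
# "poly" (0/1), "age" (years), "sex", "state".
df = pd.read_csv("ssc_2018_survey.csv")
california_women = df[(df["state"] == "California") & (df["sex"] == "F")]

# Logistic regression of harassment on EA identification, adjusting for
# sexuality, polyamory, and age all at once. The coefficient on is_ea is the
# adjusted (log-odds) gap of interest.
model = smf.logit("harassed ~ is_ea + bisexual + poly + age",
                  data=california_women).fit()
print(model.summary())
```

With only ~100 EA women in the whole sample, the confidence interval on is_ea will be wide, which is part of why I don't think much can be concluded either way.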
Overall I am not sure that anything can be concluded from these results either way.
I would urge everyone thinking about this question to read my original discussion of the sexual harassment survey results. It mostly focuses on professions but I think the overall conclusion is extremely relevant here too. You can also find the link to the data there in case you want to double-check my results.
Minor object-level objection: you say we should expect crypto exchanges like FTX to fail, but I tried to calculate the risk of this in the second part of my post, and the average FTX-sized exchange fails only very rarely.
I don't think this is our main point of disagreement though. My main point of disagreement is about how actionable this is and what real effects it can have.
I think that the main way EA is "affiliated with" crypto is that it has accepted successful crypto investors' money. Of the people who have donated the most to EA, I think about 5-7 of the top ten made their money in something crypto-related (even counting all the FTX people as one donor). Some of those people (example: Vitalik Buterin) are well-liked, honest, and haven't done anything to embarrass us. I think it would be practically bad to stop accepting their money, and morally bad (as a betrayal) to denounce them and write them out of the movement based on guilt by association. (CoI note: I have benefited from non-FTX crypto money)
I see you're not recommending that EA stop taking crypto money. But then I'm not sure what you do want, other than what's already happening:
Although the point of "don't invite random crypto scammers to serve on your board and become the public face of EA for no reason" is obviously correct, I don't know of anyone actually doing this, and so I worry that the real effect of posts like this will be to slowly make crypto so toxic in this community that EA leaders feel pressured to refuse crypto donations for PR reasons, and then we lose more than half of our potential money. I'm especially worried about some kind of purity spiral, where after crypto is toxified, the next level is people arguing that Facebook has also been a pretty evil company at various points and so maybe we shouldn't accept Dustin's money either. I don't see a good Schelling fence here and would prefer not to start down that slope. I think we should avoid associating with (including taking money from) anyone who seems likely to be an outright fraud or to be breaking the law, and maybe some extremely harmful industries like tobacco, but not try to more generally be the arbiters of which industries are vs. aren't socially productive.
Thanks for your thoughtful response.
I'm trying to figure out how much of a response to give, and how to balance saying what I believe against avoiding any chance of making people feel unwelcome, or of inflicting an unpleasant politicized debate on people who don't want to read it. This comment is a bad compromise between all of these things and I apologize for it, but:
I think the Kathy situation is typical of how effective altruists respond to these issues and what their failure modes are. I think "everyone knows" (in Zvi's sense of the term, where it's such strong conventional wisdom that nobody ever checks whether it's true) that the typical response to rape accusations is to challenge and victim-blame survivors. And although this may be true in some times and places, I think the typical response in this community is the one that, in fact, actually happened: immediate belief by anyone who didn't know the situation, and a culture of fear preventing those who did know the situation from speaking out. I think it's useful to acknowledge and push back against that culture of fear.
(this is also why I stressed the existence of the amazing Community Safety team; I think "everyone knows" that EA doesn't do anything to hold men accountable for harm, whereas in fact it tries incredibly hard to do this, and I'm super impressed by everyone involved)
I acknowledge that this makes it sound like we have opposing cultural goals: you want to increase the degree to which people feel comfortable pointing out that EA's culture might be harmful to women, and I want to increase the degree to which people feel comfortable pushing back against claims to that effect which aren't true. I think there is some subtle, complicated sense in which we might not actually have opposing cultural goals, but I agree that to a first-order approximation they sure do seem different. And I realize this is an annoyingly stereotypical situation: I, as a cis man, coming into a thread like this and saying I'm worried about false accusations and chilling effects. My only two defenses are, first, that I only got this way because of specific real and harmful false accusations, that I tried to do an extreme amount of homework on them before calling them false, and that I only ever bring them up in the context of defending my decision there; and second, that I hope I'm possible to work with and feel safe around, despite my cultural goals, because I want to have a firm deontological commitment to promoting true things and opposing false things, in a way that doesn't refer to my broader cultural goals at any point.
Okay, so GWWC, LW, and GiveWell, what are we going to do to reverse the trend?
Seriously, should we be thinking of this as "these sites are actually getting less effective at recruiting EAs" or as "there are so many more recruitment pipelines now that it makes sense that each one would drop in relative importance" or as "any site will naturally do better in its early years as it picks the low-hanging fruit in converting its target population, then do worse later"?