I work on the 1-on-1 team at 80,000 Hours, talking to people about their careers; the opinions I've shared here (and will share in the future) are my own.
My comment wasn't about whether there are any positives in using WELLBYs (I think there are), it was about whether I thought that sentence and set of links gave an accurate impression. It sounds like you agree that it didn't, given you've changed the wording and removed one of the links. Thanks for updating it.
I think there's room to include a little more context around the quote from TLYCS.
> In short, we do not seek to duplicate the excellent work of other charity evaluators. Our approach is meant to complement that work, in order to expand the list of giving opportunities for donors with strong preferences for particular causes, geographies, or theories of change. Indeed, we will continue to rely heavily on the research done by other terrific organizations in this space, such as GiveWell, Founders Pledge, Giving Green, Happier Lives Institute, Charity Navigator, and others to identify candidates for our recommendations, even as we also assess them using our own evaluation framework.
>
> We also fully expect to continue recommending nonprofits that have been held to the highest evidentiary standards, such as GiveWell’s top charities. For our current nonprofit recommendations that have not been evaluated at that level of rigor, we have already begun to conduct in-depth reviews of their impact. Where needed, we will work with candidate nonprofits to identify effective interventions and strengthen their impact evaluation approaches and metrics. We will also review our charity list periodically and make sure our recommendations remain relevant and up to date.
[Speaking for myself here]
I also thought this claim by HLI was misleading. I clicked several of the links and don't think James is the only person being misrepresented. I also don't think this covers all the "major actors in EA's GHW space" - TLYCS, for example, meets reasonable definitions of "major", but their methodology makes no mention of WELLBYs.
(I'm straight-up guessing, and would be keen for an answer from someone familiar with this kind of study.)
This also confused me. Skimming the study, I think they're calculating efficacy from something like how long it takes people to get malaria after the booster, which makes sense given that you can get it more than once. Simplifying a lot (and still guessing): if, say, people get malaria once a week on average and the booster reduces that to once every 10 weeks, you could call that 90% efficacy, even though if you looked at how many people in each group got malaria at all across a year, it would just be 'everyone' in both groups.
This graph seems to back this up:
https://www.thelancet.com/cms/attachment/2eddef00-409b-4ac2-bfea-21344b564686/gr2.jpg
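To make the distinction concrete, here's a minimal sketch with made-up rates (it also assumes a constant-rate infection process, which is a big simplification of whatever survival analysis the study actually uses):

```python
import math

# All rates below are invented for illustration; they are not from the study.
control_rate = 1.0   # assumed infections per week, control group
vaccine_rate = 0.1   # assumed infections per week, boosted group

# Efficacy defined on the rate (hazard) of infection, as in
# time-to-first-event analyses:
hazard_efficacy = 1 - vaccine_rate / control_rate
print(f"Rate-based efficacy: {hazard_efficacy:.0%}")  # 90%

# Probability of getting malaria at least once in a year, if infections
# arrive at a constant rate (exponential waiting times):
weeks = 52
p_control = 1 - math.exp(-control_rate * weeks)
p_vaccine = 1 - math.exp(-vaccine_rate * weeks)
print(f"P(infected within a year), control: {p_control:.1%}")  # ~100%
print(f"P(infected within a year), boosted: {p_vaccine:.1%}")  # ~99.4%
```

So a 90% rate reduction is compatible with nearly everyone in both groups getting malaria at some point during the year, which is why the headline efficacy number and the cumulative-incidence curves can look so different.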
This is a useful consideration to point out, thanks. I push back a bit below on some specifics, but this effect is definitely one I'd want to include if I do end up carving out time to add a bunch more factors to the model.
I don't think having skipped the neglectedness considerations you mention is enough to call the specific example you quote misleading, though: it's very far from the only thing I skipped, and many of the other omissions point the other way. Some other things that were skipped:
Work after AGI likely isn't worth 0, especially with e.g. Metaculus definitions.
While in the community-building examples you're talking about, shifting work later doesn't change the quality of that work, this isn't true of PhDs: doing a PhD looks more like truncating the most junior n years of work than shifting all years of work n years later (see the sketch after this list).
Work that happens just before AGI can be done with a much better picture of what AGI will look like, which pushes against the neglectedness effect.
Work from research leads may actually increase in effectiveness as the field grows, if the growth is mostly coming from junior people who need direction and/or mentorship, as has historically been the case.
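Since the truncation-vs-shifting point is easy to misread, here's a toy sketch of it. Everything here is an assumption for illustration: the linear seniority-value function, the career length, the deadline, and (contra my first bullet, just to keep the sketch simple) work after the deadline being worth 0.

```python
# Toy model: each year of work is worth more the more senior you are,
# and (pessimistically) work after an assumed AGI deadline is worth 0.
def value_of_year(seniority: int) -> float:
    return 1.0 + 0.5 * seniority  # assumed: later years are more valuable

DEADLINE = 10  # assumed years until AGI
CAREER = 15    # assumed total career length in years
PHD = 4        # assumed PhD length in years

# Baseline: direct work starts now; seniority s happens in calendar year s.
baseline = sum(value_of_year(s) for s in range(CAREER) if s < DEADLINE)

# "Shifting": the same career happens PHD years later, so the years lost to
# the deadline are the most senior (most valuable) ones.
shifted = sum(value_of_year(s) for s in range(CAREER) if s + PHD < DEADLINE)

# "Truncating": the PhD replaces the most junior PHD years; afterwards you
# work at seniority PHD onwards, in calendar years PHD onwards.
truncated = sum(value_of_year(s) for s in range(PHD, CAREER) if s < DEADLINE)

print(f"baseline:  {baseline}")   # 32.5
print(f"shifted:   {shifted}")    # 13.5
print(f"truncated: {truncated}")  # 25.5
```

On these made-up numbers, modelling the PhD as truncation rather than shifting recovers most of the baseline value, which is the direction of the effect I'm pointing at.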
And then there's something about changing your mind, but it's unclear to me which direction this shifts things:
I'm a little confused about what "too little demand" means in the second paragraph. Both of the below seem like they might be the thing you are claiming:
I'd separately be curious to see more detail on why your guess at the optimal structure for providing the kind of services you're interested in is an "EA-specific provider". I'm not confident that it's not, but my low-confidence guess would be that "EA orgs" are not similar enough to each other for "context on how to work with EA orgs" to become a hugely important factor.
I think "different timelines don't change the EV of different options very much" plus "personal fit considerations can change the EV of a PhD by a ton" does end up resulting in an argument for the PhD decision not depending much on timelines. I think that you're mostly disagreeing with the first claim, but I'm not entirely sure.
In terms of your point about optimal allocation, my guess is that we disagree to some extent about how much the optimal allocation has changed, but that the much more important disagreement is about whether some kind of centrally planned 'first decide what fraction of the community should be doing what' approach is a sensible way of allocating talent, where my take is that it usually isn't.
I have a vague sense of this talent allocation question having been discussed a bunch, but no write-up immediately comes to mind that I want to point to. I might write something about this at some point, but I'm afraid it's unlikely to be soon. I realise that I haven't argued for my talent allocation claim at all, which might be frustrating, but it seemed better to highlight the disagreement than to ignore it, given that I didn't have the time to explain in detail.
So there's now a bunch of speculation in the comments here about what might have caused me and others to criticise this post.
I think this speculation puts me (and, FWIW, HLI) in a pretty uncomfortable spot for reasons that I don't think are obvious, so I've tried to articulate some of them:
- There are many reasons people might want to discuss others' claims but not accuse them of motivated reasoning/deliberately being deceptive/other bad faith stuff, including (but importantly not limited to):
a) not thinking that the mistake (or any other behaviour) justifies claims about motivated reasoning/bad faith/whatever
b) not feeling comfortable publicly criticising someone's honesty or motivations for fear of backlash
c) not feeling comfortable publicly criticising someone's honesty or motivations because that's a much more hurtful criticism to hear than 'I think you made this specific mistake'
d) believing it violates forum norms to make this sort of public criticism without lots of evidence
- In situations where people are speculating about what I might believe but haven't said, I do not have good options for moving that speculation closer to the truth, given that this probably won't be the only time I post a comment or correction on something someone says.
Examples:
- If I provide positive reassurance that a comment which didn't mention bad faith also wasn't implying it, that makes it pretty clear what I think in the situations where I don't offer that reassurance.
- If I give my honest take on someone's motivation in every case where I don't think there's any backlash risk, but stay quiet where there is, then I'm effectively publicly identifying the places where I'd be worried about backlash, which feels like the sort of thing that might itself cause backlash from them.
If you think for a few minutes about various actions I might take in various situations, either to correct misunderstanding or to confirm correct understanding, I'm sure you'll get the idea. To start with, you might want to think about why it doesn't make sense to only correct speculation that seems false.
That's a very long-winded way of saying "I posted a correction, you can make up your own mind about what that correction is evidence of, but I'd rather you didn't spend a ton of time publicly discussing what I might think that correction is evidence of, because I won't want to correct you if you're wrong or confirm if you're right".