Cross-posted to my blog. Thanks to Ajeya Cotra and Jeff Kaufman for feedback on a draft of this post. Any remaining errors are my own. Comments on this post may be copied to my blog.
Last year, Peter Hurford wrote a post titled ‘EA risks falling into a "meta trap". But we can avoid it.’ Ben Todd wrote a followup that clarified a few points. I have a few more meta traps to add to the list.
What do I mean by "meta"?
In the original two posts, I believe "meta" means "promoting effective altruism in the abstract, with the hope that people do good object-level projects in the future". This does not include cause prioritization research, which is typically also lumped under "meta". In this post, I'll use "meta" in much the same way -- I'm talking about work that promotes effective altruism in the abstract and/or fundraises for effective charities. Example organizations include most of the Centre for Effective Altruism (Giving What We Can, EA Outreach, EA Global, Chapters team), 80,000 Hours, most of .impact, Raising for Effective Giving, Charity Science, Center for Applied Rationality, Envision, Students for High Impact Charity and local EA groups. However, it does not include GiveWell or the Foundational Research Institute.
What is this post not about?
I am not suggesting that we should put less money into meta; actually I think most meta-charities are doing great work and should get more money than they have. I am more worried that in the future, meta organizations will be funded more than they should be because of leverage ratio considerations that don't take into account all of the potential downsides.
Potential Biases
Most of the work I do falls under "meta"—I help coordinate across local EA groups, and I run a local group myself. As a result, I've become pretty familiar with the landscape of certain meta organizations, which means that I've thought much more about meta than other cause areas. I expect that if I spent more time looking into other cause areas, I would find issues with them as well that I don't currently know about.
I have become most familiar with local EA groups, the CEA chapters team, Giving What We Can, and 80,000 Hours, so most of my examples focus on those organizations. I'm not trying to single them out or suggest that they suffer most from the issues I'm highlighting—it's just easiest for me to use them as examples since I know them well.
Summary of considerations raised before
In Peter Hurford’s original post, there were several points made:
- Meta Orgs Risk Not Actually Having an Impact. Since meta organizations are often several levels removed from direct impact, if even one of the "chains of impact" between levels fails to materialize, the meta org will not have any impact.
- Meta Orgs Risk Curling In On Themselves. This is the failure mode where meta organizations are optimized to spread the meta-movement but fail to actually have object-level impact.
- You Can’t Use Meta as an Excuse for Cause Indecisiveness. Donating to meta organizations allows you to skip the work of deciding which object-level cause is best, which is not good for the health of the movement.
- At Some Point, You Have to Stop Being Meta. This is the worry that we do too much meta work and don't get around to doing object-level work soon enough.
- Sometimes, Well Executed Object-Level Action is What Best Grows the Movement. For example, GiveWell built an excellent research product without focusing on outreach, but still ended up growing the movement.
It seems to me that meta trap #4 is simply a consequence of several other traps -- there are many meta traps that will cause us to overestimate the value of meta work, and as a result we may do more meta work than is warranted, and not do enough object-level work. So, I will be ignoring meta trap #4 for the rest of this post.
I'm also not worried about meta trap #2, because I think that meta organizations will always have a “chain of impact” that bottoms out with object-level impact. (Either you start a meta organization that helps an object-level organization, or you start a meta organization that helps some other organization that inductively eventually helps an object-level organization.) To fall into meta trap #2, at some point in the chain one of the organizations would have to change direction fairly drastically to not have any object-level impact at all. However, the magnitude of the object-level impact may be much smaller than we expect. In particular, this "chain of impact" is what causes meta trap #1.
The Chain of Impact
Let's take the example of the Chapters team at the Centre for Effective Altruism (CEA), which is one of the most meta organizations -- they are aimed at helping and coordinating local EA groups, which are helping to spread EA, which is both encouraging more donations to effective charities and encouraging more effective career choice, which ultimately causes object-level impact. At each layer, there are several good metrics which are pretty clear indicators that object-level impact is happening:
- The Chapters team could use "number of large local EA groups" or "total GWWC pledges/significant plan changes from local groups"
- Each local group could use "number of GWWC pledges", "number of significant plan changes", and "number of graduated members working at EA organizations"
- GWWC and 80,000 Hours themselves have metrics to value a pledge and a significant plan change, respectively.
- The object-level charities that GWWC recommends have their own metrics to evaluate their object-level impact.
Like I said above, this “chain of impact” makes it pretty unlikely that meta organizations will end up curling in on themselves and become detached from the object level. However, a long chain does mean that there is a high risk of having no impact (meta trap #1). In fact, I would generalize meta trap #1:
Meta Trap #1. Meta orgs amplify bad things too
The whole point of meta organizations is to have large leverage ratios through a "chain of impact", which is usually operationalized using a "chain of metrics". However, this sort of chain can lead to other amplifications as well:
Meta Trap #1a. Probability of not having an impact
This is the point made in the original post -- the more meta an organization is, the more likely that one of the links in the chain fails and you don't have any impact at all.
One counterargument is that often, meta organizations help many organizations one level below them (eg. GWWC). In this case, it is extremely unlikely that none of these organizations have any impact, and so the risk is limited to just the risk that the meta organization itself fails at introducing enough additional efficiency to justify its costs.
Meta Trap #1b. Overestimating impact
It is plausible that organizations systematically overestimate their impact (something like the overconfidence effect). If most or all of the organizations along the chain overestimate their impact, then the organization at the top of the chain will vastly overestimate its object-level impact. Note that if an organization phrases its impact in terms of its effect on the organization directly below it in the chain, this does not apply to that number. It only applies to the total estimated object-level impact of the organization.
You can get similar problems with selection effects, where the people who start meta organizations are more likely to think that meta organizations are worthwhile. Each selection effect leads to more overconfidence in the approach, and once again the negative effects of the bias grow as you go further up the chain.
Meta Trap #1c. Issues with metrics
A commonly noted issue with metrics is that an organization will optimize to do well on its metrics, rather than what we actually care about. In this way, badly chosen metrics can make us far less effective.
As a concrete example, let's look at the CEA Chapters Team chain of metrics:
- One of many plausible metrics for the CEA Chapters team is "number of large local groups". (I'm not sure what metrics they actually use.) They may focus on getting smaller local groups to grow, but this may result in local groups holding large speaker events and counting vaguely interested students as "members", earning a classification of "large" even though not much has actually changed at that local group.
- Local groups themselves often use "number of GWWC pledges" as a metric. They may start to promote the pledge very widely with external incentives (eg. free food for event attendees, followed by social pressure to sign the pledge). As a result, more students may take the pledge, but they may be much more likely to drop out quickly.
- GWWC has three metrics on its home page -- number of pledges, amount of money already donated, and amount of money pledged. They may preferentially build a community for pledge takers who already make money, since they will donate sooner and increase the amount of money already donated. Students may then lose motivation due to the lack of community, forget about the pledge, and not donate when the time comes. GWWC would still be happy to have such pledges, because they increase both the number of pledges and the amount of money pledged.
- The various object-level organizations that GWWC members donate to could have similar problems. For example, perhaps the Against Malaria Foundation would focus on areas where bednets can be bought and distributed cheaply, giving a better “cost per net” metric, rather than spending slightly more to distribute nets in regions with much higher malaria burdens, where the money would save more lives.
If all of these were true, you'd have to question whether the CEA Chapters team is having much of an impact at all. On the other hand, even though the metrics for AMF lead to suboptimal behavior, you can still be quite confident that they have a significant impact.
I don't think that these are true, but I could believe weaker versions (in particular, I worry that GWWC pledges from local groups are not as good as the average GWWC pledge).
In addition to the 5 traps that Peter Hurford mentions, I believe there are other meta traps that tend to make us overestimate the impact of meta work:
Meta Trap #6. Marginal impact may be much lower than average impact.
The GiveWell top charities generally focus on implementing a single intervention. As a result, we can roughly expect that the marginal impact of a dollar donated is about the same as the average impact of a dollar, since both dollars go to the same intervention. It's still likely lower (for example, AMF may start funding bednets in areas with lower malaria burden, reducing marginal impact), but not that much lower.
However, meta organizations typically have many distinct activities for the same goal. These activities can have very different cost-effectiveness. The marginal dollar will typically fund the activity with the lowest (estimated) cost-effectiveness, and so will likely be significantly less impactful than the average dollar.
Note that this assumes that the activities are not symbiotic -- that is, the argument only works if stopping one of the activities would not significantly affect the cost-effectiveness of other activities. As a concrete example of such activities in a meta organization, see this comment about all the activities that get pledges for GWWC.
The numbers that are typically publicized are for average impact. For example, on average, GWWC claims a 104:1 multiplier, or 6:1 for their pessimistic calculation, and claims that the average pledge is worth $73,000. On average the cost to 80,000 Hours of a significant plan change is £1,667. (These numbers are outdated and will probably be replaced by newer ones soon.) What about the marginal impact of the next dollar? I have no idea, except that it's probably quite a bit worse than the average. I would not be surprised if the marginal dollar didn't even achieve a 1:1 multiplier under GWWC's assumptions for their pessimistic impact calculation. (But their pessimistic impact calculation is really pessimistic.) To be fair to meta organizations, marginal impact is a lot harder to assess than average impact. I myself focus on average impact when making the case for local EA groups because I actually have a reasonable estimate for those numbers.
One counterargument against this general argument is that each additional activity further shares fixed costs, making everything more cost-effective. For example, GWWC has to maintain a website regardless of which activities it runs. If it adds another activity to get more pledges that relies on the website, then the cost of maintaining the website is spread over the new activity as well, making the other activities more cost effective. Similar considerations could apply to office space costs, legal fees, etc. I would guess that this is much less important than the inherent difference in cost effectiveness of different activities.
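To make trap #6 concrete, here is a toy calculation. All numbers are invented for illustration (not estimates for any real organization); it shows how a publicized average multiplier can coexist with a much lower marginal multiplier when an organization's activities differ in cost-effectiveness:

```python
# Hypothetical meta org running three activities with very different
# cost-effectiveness (all figures made up for illustration).
activities = [
    {"name": "outreach talks", "cost": 10_000, "money_moved": 200_000},
    {"name": "online ads",     "cost": 10_000, "money_moved": 60_000},
    {"name": "cold emails",    "cost": 10_000, "money_moved": 15_000},
]

total_cost = sum(a["cost"] for a in activities)
total_moved = sum(a["money_moved"] for a in activities)

# The number that typically gets publicized: average multiplier.
average_multiplier = total_moved / total_cost  # ~9.2:1

# Assuming activities are funded in order of cost-effectiveness and are
# not symbiotic, the marginal dollar funds the least effective activity.
marginal_multiplier = min(a["money_moved"] / a["cost"] for a in activities)  # 1.5:1
```

Under these assumptions, a donor looking at the ~9:1 average figure would be off by a factor of six about what their marginal dollar accomplishes.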
Meta Trap #7. Meta suffers more from coordination problems.
As the movement grows and more people and meta organizations become a part of it, it becomes more important to consider the group as a whole, rather than the impact of just your actions. This 80,000 Hours post applies this idea to five different situations.
Especially in the area of promoting effective altruism in the abstract, there are a lot of organizations working toward the same goal and targeting the same people. For example, perhaps Alice learns about EA from a local EA group, goes to a CFAR workshop, starts a company because of 80,000 Hours and takes the Founders Pledge and GWWC pledge. We now have five different organizations that can each claim credit for Alice's impact, not to mention Alice herself. In addition, from a "single player counterfactual analysis", it is reasonable for all the organizations to attribute nearly all the impact to themselves -- if Alice was talking to these organizations while making her decision, each organization could separately conclude that Alice would not have made these life changes without them, and so counterfactually they get all the credit. (And this could be a reasonable conclusion by each organization.) However, the total impact caused would then be smaller than the sum of the impacts each organization thinks they had.
Imagine that Alice will now have an additional $2,000 of impact, and each organization spent $1,000 to accomplish this. Then each organization would (correctly) claim a leverage ratio of 2:1, but the aggregate outcome is that we spent $5,000 to get $2,000 of benefit, which is clearly suboptimal. These numbers are completely made up for pedagogical purposes and not meant to be actual estimates. In reality, even in this scenario I suspect that the ratio would be better than 1:1, though it would be smaller than the ratio each organization would compute for itself.
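The arithmetic of this scenario can be written out directly (using the same made-up pedagogical numbers as above):

```python
# Five organizations each spend $1,000 on Alice; her additional impact is
# $2,000. Each org's single-player counterfactual analysis credits it with
# the full $2,000.
n_orgs = 5
cost_per_org = 1_000
alice_impact = 2_000

per_org_ratio = alice_impact / cost_per_org  # 2:1 -- looks good to each org
sum_of_claims = n_orgs * alice_impact        # $10,000 of claimed impact...
total_cost = n_orgs * cost_per_org           # ...for $5,000 of actual spending
actual_ratio = alice_impact / total_cost     # 0.4:1 in aggregate
```

The sum of the individually correct 2:1 claims overstates the aggregate outcome by a factor of five.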
Note that the recent changes at CEA have helped with this problem, but it still matters.
Meta Trap #8. Counterfactuals are harder to assess.
It's very unclear what would happen in the absence of meta organizations -- I would expect the EA movement to grow anyway simply by spreading through word of mouth, but I don't know how much. If Giving What We Can didn’t exist, perhaps EA would grow at the same rate and EAs would publicize their donations on the EA Hub. If 80,000 Hours didn’t exist, perhaps some people would still make effective career choices by talking to other EAs about their careers. It is hard to properly estimate the counterfactual for impact calculations -- for example, GWWC asks pledge takers to self-report their counterfactual donations, which is fraught with uncertainties and biases, and as far as I know 80,000 Hours does not try to estimate the impact that a person would have had before their plan change.
This isn't a trap in and of itself -- it becomes a trap when you combine it with biases that lead us to overestimate how much counterfactual impact we have by projecting the counterfactual as worse than it actually would have been. We should care about this for the same reasons that we care about robustness of evidence.
I think that with most object-level causes this is less of an issue. When RCTs are conducted, they eliminate the problem, at least in theory (though you do run into problems when trying to generalize from RCTs to other environments). I think that this is a problem in far future areas (would the existential risk have happened, or would it have been solved anyway?), but people are aware of the problem and tackle it (research into the probabilities of various existential risks, looking for particularly neglected existential risks such as AI risk). I haven't seen anything similar for meta organizations.
What should we do?
Like I mentioned before, I’m not worried about meta traps #2 and #4. I agree that meta traps #3 (cause indecisiveness) and #5 (object-level action as a meta strategy) are important but I don't have concrete suggestions for them, other than making sure that people are aware of them.
For meta trap #1, I agree with Peter's suggestion: The more steps away from impact an EA plan is, the more additional scrutiny it should get. In addition, I like 80,000 Hours’ policy of publishing some specific, individual significant plan changes that they have caused. Looking at the details of individual cases makes it clear what sort of impact the organization has underneath all the metrics, and ideally would directly show the object-level impact that the organization is causing (even if the magnitude is unclear). It seems to me that most meta organizations can do some version of this.
I worry most about meta trap #6 (marginal vs. average impact). It also applies to animal welfare organizations, but I'm less worried there because Animal Charity Evaluators (ACE) does a good job of thinking about that consideration deeply, creating cost-effectiveness estimates for each activity, and basing its recommendations on that. We could create "Meta Charity Evaluators", but I'm not confident that this is actually worthwhile, given that there are relatively few meta organizations, and not much funding flows to them. However, this is similar to the case for ACE, so there's some reason to do this. This could also help with meta trap #7 (coordination problems), if it took on coordination as an explicit goal.
I would guess that we don't need to do anything about meta trap #8 (hard counterfactuals) now. I think most meta organizations have fairly strong cases for large counterfactual impact. I would guess that we could make good progress by research into what the counterfactuals are, but again since there is not much funding for meta organizations now, this does not seem particularly valuable.
Apparently this post has been nominated for the review! And here I thought almost no one had read it and liked it.
Reading through it again 5 years later, I feel pretty happy with this post. It's clear about what it is and isn't saying (in particular, it explicitly disclaims the argument that meta should get less money), and is careful in its use of arguments (e.g. trap #8 specifically mentions that counterfactuals being hard isn't a trap until you combine it with a bias towards worse counterfactuals). I still agree that all of the traps mentioned here are worth keeping in mind when working on "meta".
The biggest critique of this post is that it doesn't demonstrate that any of these traps actually happen(ed) in practice. It has several examples, but most are of the form "such-and-such bad thing could be happening, I can't tell from the outside". This comment makes some more speculative claims about what bad things actually happened, but they are speculative and the response mentions that they were probably already taken into account.
I think this does in fact make the post less valuable than it otherwise could be. Nonetheless, I still find the post important, because it's the closest we get to criticism of "meta" work. In theory, we could have better criticisms from people who are actually doing the work themselves, who can say more definitively whether in practice there are cases of these "traps", but in practice I have not seen such critiques.
If I were rewriting this post today, I'd make a few changes:
Some miscellaneous thoughts:
I'll add two more potential traps. There's overlap with some of the existing ones but I think these are worth mentioning on their own.
9) Object level work may contribute more learning value.
I think it's plausible that the community will learn more if it's more focused on object level work. There are several plausible mechanisms. For example (not comprehensive): object level work might have better feedback loops, object level work may build broader networks that can be used for learning about specific causes, or developing an expert inside view on an area may be the best way to improve your modelling of the world. (Think about liberal arts colleges' claim that it's worth having a major even if your educational goals are broad "critical thinking" skills.)
I'm eliding lots of open questions here about how to model the learning of a community. For example: is it more efficient for a community to learn by its current members learning, or by recruiting new members with preexisting knowledge/skills?
I don't have an answer to this question, but when I think about it I try to take the perspective of a hypothetical EA community ten years from now and ask whether it would prefer to be made up primarily of people with ten years' experience working on meta causes, or of a biologist, a computer scientist, a lawyer, etc.
10) The most valuable types of capital may be "cause specific"
I suppose (9) is a subset of (10). But it may be that it's important to invest today in capital that will pay off tomorrow. (E.g. see 80k on career capital.) And cause-specific opportunities may be better developed (and have higher returns) than meta ones. So, learning value aside, it may be valuable for EA to have lots of people who have invested in graduate degrees or built professional networks. But these types of opportunities may sometimes require you to do object level work.
These traps are fairly compelling.
My broader claim would be this: if we had a model where most of the activities that can usefully be augmented will come from folks with (i) great expertise in one of several fields, (ii) excellent epistemics, and (iii) low risk aversion, then the movement would de-prioritize grassroots meta and change its emphasis, upweighting direct activities and subfield-specific meta.
9) seems pretty compelling to me. To use some analogies from the business world: it wouldn't make sense for a company to hire lots of people before it had a business model figured out, or run a big marketing campaign while its product was still being developed. Sometimes it feels to me like EA is doing those things. (But maybe that's just because I am less satisfied with the current EA "business model"/"product" than most people.)
"But maybe that's just because I am less satisfied with the current EA "business model"/"product" than most people."
Care to elaborate (or link to something?)
https://www.facebook.com/groups/effective.altruists/permalink/1263971716992516/
Hi Rohin,
I agree these are good concerns.
I partially address 7 and point 8 in the post that you link to: https://80000hours.org/2016/02/the-value-of-coordination/#attributing-impact Note that it is possible for the credit to sum to more than 100%.
I discuss point 6 here: https://80000hours.org/2015/11/stop-talking-about-declining-returns-in-small-organisations/
Issues with how to assess impact, metrics etc. are discussed in depth in the organisations' impact evaluations.
What I'm keen to see is a detailed case arguing that these are actually problems, rather than just pointing out that they might be problems. This would help us improve.
Just to clarify, you'd like to see funding to meta-charities increase, so don't think these worries are actually sufficient to warrant a move back to first order charities?
Cheers,
Ben
PS. One other small thing – it's odd to class GiveWell as not meta, but 80k as meta. I often think of 80k as the GiveWell of career choice. Just as GiveWell does research into which charities are most effective and publicises it, we do research into which career strategies are most effective and publicise it.
Yes, I agree that this is possible (this is why I said it could be "a reasonable conclusion by each organization"). My point is that because of this phenomenon, you can have the pathological case where from a global perspective, the impact does not justify the costs, even though the impact does justify the costs from the perspective of every organization.
Yeah, I agree that potential economies of scale are much greater than diminishing marginal returns, and I should have mentioned that. Mea culpa.
My impression is that organizations acknowledge that there are issues, but the issues remain. I'll write up an example with GWWC soon.
That's correct.
I agree that 80k's research product is not meta the way I've defined it. However, 80k does a lot of publicity and outreach that GiveWell for the most part does not do. For example: the career workshops, the 80k newsletter, the recent 80k book, the TedX talks, the online ads, the flashy website that has popups for the mailing list. To my knowledge, of that list GiveWell only has online ads.
I've got a speculative one for GWWC, and a more concrete one for chapter seeding.
GWWC pledges: I've mentioned that I don't worry about traps #2 and #4, and traps #3 and #5 don't apply to a specific organization, so I'll skip those.
I don't think this is a problem for GWWC.
Here are some potential ways that GWWC could be overestimating impact:
Now since GWWC talks about their impact in terms of their impact on the organizations directly beneath them on the chain, you don't see any amplification of overestimates. However, consider the case of local EA groups. They could be overestimating their impact too:
My original post explained how this would be the case for GWWC. I agree though that economies of scale will probably dominate for some time.
I think local EA groups and GWWC both take credit for pledges originating from local groups. (It depends on what those pledge takers self-reported as the counterfactual.) If they came from an 80,000 Hours career workshop, then we now have three organizations claiming the impact.
There's also a good Facebook thread about this. I forgot about it when writing the post.
I've mentioned this above -- GWWC uses self-reported counterfactuals. If you agree that you should penalize expected value estimates if the evidence is not robust, then I think you should do the same here.
Here's the second example:
There was an effort to "seed" local EA groups, and in the impact evaluation we see "each hour of staff time generated between $187 and $1270 USD per hour."
First problem: the entire point of this activity was to get other people to start a local EA group, but the time spent by these other people isn't included as a cost (meta trap #7, kind of). Those other people would probably have to put in ~20 person-hours per group to get this impact. If you include these hours as costs, then the estimated cost-effectiveness becomes something more like $167-890 per hour.
Second problem: I would bet money at even odds that the pessimistic estimate for cold-email seeding was too optimistic (meta trap #1b). (I'm not questioning the counterfactual rate, I'm questioning the number of chapters that "went silent".)
Taking these two into account, I think that the chapter seeding was probably worth ~$200 per hour. Now if GWWC itself is too optimistic in its impact calculation (as I think it is), this falls even further (meta trap #1b), and this seems just barely worthwhile.
That said, there are other benefits that aren't incorporated into the calculation (both for GWWC pledges in general and chapter seeding). So overall it still seems like it was worthwhile, but it's not nearly as exciting as it initially seemed.
These are all reasonable concerns. I can't speak for the details of the two estimates you mention, though my impression is that the points listed have probably already been considered by the people making the estimates. Though you could easily differ from them in your judgement calls.
With LEAN not including the costs of the chapter heads, they might have just decided that the costs of this time are low. Typically, in these estimates, people are trying to work out something like GiveWell dollars in vs. GiveWell dollars out. If a chapter head wouldn't have worked on an EA project or earned to give to GiveWell charities otherwise, then the opportunity cost of their time could be small when measured in GiveWell dollars. In practice, it seems like much chapter time comes out of other leisure activities.
With 80k, we ask people taking the pledge whether they would have taken it if 80k never existed, and only count people who say "probably not". These people might still be biased in our favor, but on the other hand, there are people we've influenced who were pushed over the edge by another org. We don't count these people towards our impact, even though we made it easier for the other org.
(We also don't count people who were influenced by us indirectly, so don't know they were influenced)
Zooming out a bit, ultimately what we do is make people more likely to pledge.
Here's a toy model. Suppose Amy, Bob and Carla start out 10%, 80% and 90% likely to take the pledge, respectively.
80k shows them a workshop, which makes each of them 10 percentage points more likely to take it, so at time 1 the probabilities are: Amy 20%, Bob 90%, Carla 100% (she pledges).
Then GWWC shows them a talk, which has the same effect on everyone who hasn't yet pledged. So at time 2: Amy 30%, Bob 100% (he pledges), Carla has already pledged.
Given current methods, 80k gets zero impact. Although they got Carla to pledge, Carla tells them she would have taken it otherwise due to GWWC, which is true.
GWWC counts both Carla and Bob as new pledgers in their total, but when they ask them how much they would have donated otherwise, Carla says zero (80k had already persuaded her) and Bob probably gives a high number too (~90%), because he was already close to doing it. So this reduces GWWC's estimate of the counterfactual value per pledge. In total, GWWC adds 10% of the value of Bob's donations to their estimates of counterfactual money moved.
This is pessimistic for 80k, because without 80k, GWWC wouldn't have persuaded Bob, but this isn't added to our impact.
It's also a bit pessimistic for GWWC, because none of their effect on Amy is measured, even though they've made it easier for other organisations to persuade her.
In either case, what's actually happening is that 80k is adding 30 percentage points of pledge probability and GWWC 20 percentage points. The current method of asking people what they would have done otherwise is a rough approximation of this, but it can both overcount and undercount what's really going on.
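The toy model above can be written out explicitly. The starting probabilities here are illustrative assumptions chosen to match the description (Bob near 90% before GWWC's talk, 30 points added by 80k, 20 by GWWC):

```python
# Toy model of pledge attribution. Starting pledge probabilities, in
# percentage points, are illustrative assumptions, not real data.
start = {"Amy": 10, "Bob": 80, "Carla": 90}

def nudge(probs, delta=10):
    # Each intervention makes everyone who hasn't yet pledged `delta`
    # percentage points more likely to pledge (capped at 100 = pledged).
    return {name: min(p + delta, 100) for name, p in probs.items()}

after_80k = nudge(start)       # Carla reaches 100 and pledges
after_gwwc = nudge(after_80k)  # Bob reaches 100 and pledges

# What each org actually added, in percentage points of pledge probability:
points_80k = sum(after_80k[n] - start[n] for n in start)        # 30
points_gwwc = sum(after_gwwc[n] - after_80k[n] for n in start)  # 20 (Carla capped)

# Survey-based attribution instead records: 80k gets zero (Carla says GWWC
# would have gotten her anyway) and GWWC gets ~10% of Bob's donations --
# both undercounting and overcounting relative to the 30/20 split above.
```

The design point is that "probability points added" is the quantity the counterfactual surveys are trying to approximate, and the approximation error grows as more organizations touch the same person.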
Re: Leisure time. I think I would have probably either taken another class, gotten a part-time paying job as a TA, or done technical research with a professor if I weren't leading EAB (which took ~10 hours of my time each week). I'm not positive how representative this is across the board, but I think this is likely true of at least some other chapter leaders, and more likely to be true of the most dedicated (who probably produce a disproportionate amount of the value of student groups).
Hmm, my comment about this was lost.
On second thoughts, "leisure time" isn't quite what I meant. I more thought that it would come out of other extracurriculars (e.g. chess society).
Anyway, I think there are 3 main types of cost:
- Immediate impact you could have had doing something else, e.g. a part-time job and donating the proceeds.
- Better career capital you could have gained otherwise. I think this is probably the bigger issue. However, I also think running a local group is among the best options for career capital while a student, especially if you're into EA. So it's plausible the opportunity cost is near zero. If you want to do research and give up doing a research project, though, it could be pretty significant.
- More fun you could have had elsewhere. This could be significant on a personal level, but it wouldn't be a big factor in a calculation measured in terms of GiveWell dollars.
Based on other students I know who put time into rationalist or EA societies this seems right.
Okay, this makes more sense. I was mainly thinking of the second point -- I agree that the first and third points don't make too much of a difference. (However, some students can take on important jobs, e.g. Oliver Habryka working at CEA while being a student.)
Another possibility is that you graduate faster. Instead of running a local group, you could take one extra course each semester. Aggregating this, for every two years of not running a local group, you could graduate a semester earlier.
(This would be for UC Berkeley, I think it should generalize about the same to other universities as well.)
I strongly disagree.
This is exactly why I focused on general high-level meta traps. I can give several plausible ways in which the meta traps may be happening, but it's very hard to actually prove that it is indeed happening without being on the inside. If GWWC has an issue where it is optimizing metrics instead of good done, there is no way for me to tell since all I can see are its metrics. If GWWC has an issue with overestimating their impact, I could suggest plausible ways that this happens, but they are obviously in a better position to estimate their impact and so the obvious response is "they've probably thought of that". To have some hard evidence, I would need to talk to lots of individual pledge takers, or at least see the data that GWWC has about them. I don't expect to be better than GWWC at estimating counterfactuals (and I don't have the data to do so), so I can't show that there's a better way to assess counterfactuals. To show that coordination problems actually lead to double-counting impact, I would need to do a comparative analysis of data from local groups, GWWC and 80k that I do not have.
There is one point that I can justify further. It's my impression that meta orgs consistently don't take into account the time spent by other people/groups, so I wouldn't call that one a judgment call. Some more examples:
Yes, I agree that there is impact that isn't counted by these calculations, but I expect this is the case with most activities (with perhaps the exception of global poverty, where most of the impacts have been studied and so the "uncounted" impact is probably low).
The main issue is that I don't expect that people are performing these sorts of counterfactual analyses when reporting outcomes. It's a little hard for me to imagine what "90% chance" means so it's hard for me to predict what would happen in this scenario, but your analysis seems reasonable. (I still worry that Bob would attribute most or all of the impact to GWWC rather than just 10%.)
However, I think this is mostly because you've chosen a very small effect size. Under this model, it's impossible for 80k to ever have impact -- people will only say they "probably wouldn't" have taken the GWWC pledge if they started under 50%, but if they started under 50%, 80k could never get them to 100%. Of course this model will undercount impact.
Consider instead the case where a general member of a local group comes to a workshop and takes the GWWC pledge on the spot (which I think happens not infrequently?). The local group has done the job of finding the member and introducing her to EA, maybe raising the probability to 30%. 80K would count the full impact of that pledge, and the local group would probably also count a decent portion of that impact.
More generally, my model is that there are many sources that lead to someone taking the GWWC pledge (80k, the local group, online materials from various orgs), and a simple counterfactual analysis would lead to every such source getting nearly 100% of the credit, and based on how questions are phrased I think it is likely that people are actually attributing impact this way. Again, I can't tell without looking at data. (One example would be to look at what impact EA Berkeley members attribute to GWWC.)
I can't speak for the other orgs, but 80k probably wouldn't count this as "full impact".
First, the person would have to say they made the pledge "due to 80k". Whereas if they were heavily influenced by the local group, they might say they would have taken it otherwise.
Second, as a first approximation, we use the same figure GWWC does for a value of a pledge in terms of donations. IIRC this already assumes only 30% is additional, once counterfactually adjusted. This % is based on their surveys of the pledgers. (Moreover, for the largest donors, who determine 75% of the donations, we ask them to make individual estimates too).
Taken together, 80k would attribute at most 30% of the value.
Third, you can still get the undercounting issue I mentioned. If someone later takes the pledge due to the local group, but was influenced by 80k, 80k probably wouldn't count it.
What would you estimate is the opportunity cost of student group organiser time per hour?
How would it compare to time spent by 80k staff?
Yes, I'm predicting that they would say that almost always (over 90% of the time).
That does make quite a difference. It seems plausible then that impact is mostly undercounted rather than overcounted. This seems more like an artifact of a weird calculation (why use GWWC's counterfactual instead of having a separate one?). And you still have the issue that impact may be double counted, it's just that since you tend to undercount impact in the first place the effects seem to cancel out.
That's a little uncharitable of me, but the point I'm trying to make is that there is no correction for double-counting impact -- most of your counterarguments seem to be saying "we typically underestimate our impact so this doesn't end up being a problem". You aren't using the 30% counterfactual rate because you're worried about double counting impact with GWWC. (I'm correct about that, right? It would be a really strange way to handle double counting of impact.)
Nitpick: This spreadsheet suggests 53%, and then adds some more impact based on changing where people donate (which could double count with GiveWell).
I accept that impact is often undercounted, to such a degree that double counting would not get you over 100%. I still worry that people think "Their impact numbers are great and probably significant underestimates" without thinking about the issue of double counting, especially since most orgs make sure to mention how their impact estimates are likely underestimates.
Even if people just donated on the basis of "their impact numbers are great" without thinking about both undercounting and overcounting, I would worry that they are making the right decision for the wrong reasons. We should promote more rigorous thinking.
My perspective is something like "donors should know about these considerations", whereas you may be interpreting it as "people who work in meta don't know/care about these considerations". I would only endorse the latter in the one specific case of not valuing the time of other groups/people.
The number I use for myself is $20, mostly just made up so that I can use it in Fermi estimates.
Unsure. Probably a little bit higher, but not much. Say $40?
(I have not thought much about the actual numbers. I do think that the ratio between the two should be relatively small.)
I also don't care too much that 80k doesn't include costs to student groups because those costs are relatively small compared to the costs to 80k (probably). This is why I haven't really looked into it. This is not the case with GWWC pledges or chapter seeding.
Hey Rohin, without getting into the details, I'm pretty unsure whether correcting for impacts from multiple orgs makes 80,000 Hours look better or worse, so I'm not sure how we should act. We win out in some cases (we get bragging rights from someone who found out about EA from another source then changes their career) and lose in others (someone who finds out about GiveWell through 80k but doesn't then attribute their donations to us).
There's double counting, yes, but the orgs are also legitimately complementary to one another - not sure if the double counting exceeds the real complementarity.
We could try to measure the benefit/cost of the movement as a whole - this gets rid of the attribution and complementarity problem, though loses the ability to tell what is best within the movement.
I'm a little unclear on what you mean here. I see three different factors:
Various orgs are undercounting their impact because they don't count small changes that are part of a larger effort, even though in theory from a single player perspective, they should count the impact.
In some cases, two (or more) organizations both reach out to an individual, but either one of the organizations would have been sufficient, so neither of them get any counterfactual impact (more generally, the sum of the individually recorded impacts is less than the impact of the system as a whole)
Multiple orgs have claimed the same object-level impact (eg. an additional $100,000 to AMF from a GWWC pledge) because they were all counterfactually responsible for it (more generally, the sum of the individually recorded impacts is more than the impact of the system as a whole).
Let's suppose:
X is the impact of an org from a single player perspective
Y is the impact of an org taking a system-level view (so that the sum of Y values for all orgs is equal to the impact of the system as a whole)
Point 1 doesn't change X or Y, but it does change the estimate we make of X and Y, and tends to increase it.
Point 2 can only tend to make Y > X.
Point 3 can only tend to make Y < X.
Is your claim that the combination of points 1 and 2 may outweigh point 3, or just that point 2 may outweigh point 3? I can believe the former, but the latter seems unlikely -- it doesn't seem very common for many separate orgs to all be capable of making the same change, it seems more likely to me that in such cases all of the orgs are necessary which would be an instance of point 3.
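To illustrate why point 3 seems like the common case, here's a toy model (the `impact` functions and the 0/1 impact values are hypothetical, chosen only to show the two extremes):

```python
# Toy model: impact(S) is the system-level impact when the set of orgs S acts.
# "Both necessary" (point 3): the pledge only happens if both orgs reach Bob.
impact_both_needed = lambda S: 1.0 if {"80k", "GWWC"} <= S else 0.0
# "Either sufficient" (point 2): any one org on its own would have sufficed.
impact_either = lambda S: 1.0 if S else 0.0

def single_player_credit(org, orgs, impact):
    """Counterfactual credit X: total impact minus impact without this org."""
    return impact(orgs) - impact(orgs - {org})

orgs = {"80k", "GWWC"}
for name, f in [("both necessary", impact_both_needed),
                ("either sufficient", impact_either)]:
    credits = {o: single_player_credit(o, orgs, f) for o in orgs}
    print(name, credits, "sum =", sum(credits.values()))
# both necessary: each org claims 1.0, so the X values sum to 2.0 (200%)
# either sufficient: each org claims 0.0, so the X values sum to 0.0
```

When both orgs are necessary, each one's single-player counterfactual X equals the whole impact, so the sum of X values is double the system-level impact (Y < X for each org); when either suffices, each X is zero (Y > X).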
Yeah, this is the best idea I've come up with so far, but I don't really like it much. (Do you include local groups? Do you include the time that EAs spend talking to their friends? If not, how do you determine how much of the impact to attribute to meta orgs vs. normal network effects?) It would be a good start though.
Another possibility is to cross-reference data between all meta orgs, and try to figure out whether for each person, the sum of the impacts recorded by all meta orgs is a reasonable number. Not sure how feasible this actually is (in particular, it's hard to know what a "reasonable number" would be, and coordinating among so many organizations seems quite hard).
I agree the double-counting issue is pretty complex. (I think maybe the "fraction of value added" approach I mention in the value of coordination post is along the right lines)
I think the key point is that it seems unlikely that (given how orgs currently measure impact) they're claiming significantly more than 100% in aggregate. This is partly because there's already lots of adjustments that pick up some of this (e.g. asking people if they would have done X due to another org) and because there are various types of undercounting.
Given this, adding a further correction for double counting doesn't seem like a particularly big consideration - there are more pressing sources of uncertainty.
Yes, I agree with this. (See also my reply to Rob above.)
Maybe instead of talking about "meta traps" we should talk about "promotion traps" or something?
Yeah, that does seem to capture the idea better.
I don't think you'll want to equate being a meta org with what proportion of your time you spend on outreach. Some object level charities do a lot of outreach too. If AMF started spending 25% of its budget on marketing, would it become a meta-charity?
Sure 80k puts more effort into outreach than GiveWell, but the core model is very similar.
Fair enough, I don't particularly care about which organization is a "meta org" and which one is not, I mostly care about where these meta traps apply and where they don't. Probably should have talked about "meta work" instead of "meta org". Anyway, it does seem like the traps apply to the outreach portion of 80k and not to GiveWell (since they barely have any outreach).
If AMF started spending a lot on marketing, I would count that as "meta work", though I think a lot of these traps would not apply to that specific scenario.
At a glance, it seems like most of the meta-traps don't apply to stuff like promotion of object-level causes.
That's why Peter Hurford distinguished between second-level and first-level meta, and focused his criticism on the second-level.
80,000 Hours and GiveWell are both mainly doing first-level meta (i.e. we promote specific first order opportunities for impact); though we also do some second-level meta (promoting EA as an idea). 80k does more second-level meta day-to-day than GiveWell, though GiveWell explains their ultimate mission in second-level meta terms:
One other quick point is that I don't think coordination problems arise especially from meta-work. Rather, coordination problems can arise anywhere in which the best action for you depends on what someone else is going to do. E.g. you can get coordination problems among global health donors (GiveWell has written a lot about this). The points you list under "coordination problems" seem more like examples of why the counterfactuals are hard to assess, which is already under trap 8.
I mostly agree, but I think a lot of them do apply to first-level meta in many cases. For example I talked about how they apply to GWWC, which is first-level meta (I think).
Yes, and I specifically didn't include that kind of first-level meta work. I think the parts of first-level meta that are affected by these traps are efforts to fundraise for effective organizations, mainly ones that target EAs specifically. Even for general fundraising though, I think several traps still do apply, such as trap #1, #6 and #8.
I agree, I think it's just disproportionately the case that donors to meta work are not taking into account these considerations. GiveWell and ACE take these considerations into account when making recommendations, so anyone relying on those recommendations has already "taken it into account". This may arise in X-risk, I'm not sure -- certainly it seems to apply to the part of X-risk that is about convincing other people to work on X-risk.
Well, even if each organization assesses counterfactuals perfectly, you still have the problem that the sum of the impacts across all organizations may be larger than 100%. The made-up example with Alice was meant to illustrate a case where each organization assesses their impact perfectly, comes to a ratio of 2:1 correctly, but in aggregate they would have spent more than was warranted.
What makes you think this? I found this post interesting, but not new; it's all stuff I've thought about quite hard before. I wouldn't have thought I was roughly representative of meta donors here (I certainly know people who have thought harder), though I'd be happy for other such donors to contradict me.
I've had conversations with people who said they've donated to GWWC because of high leverage ratios, and my impression based on those conversations is that they take the multiplier fairly literally ("even if it's off by an order of magnitude it's still worthwhile") without really considering the alternatives.
In addition, it's really easy to find all of the arguments in favor of meta, including (many of) the arguments that impact is probably being undercounted -- you just have to read the fundraising posts by meta orgs. I don't know of any post other than Hurford's that suggests considerations against meta. It took me about a year to generate all of the ideas not in that post, and it certainly helped that I was working in meta myself.
I think the arguments in favor of meta are intuitive, but not easy to find. For one thing, the orgs' posts tend to be org-specific (unsurprisingly) rather than a general defense of meta work. In fact, to the best of my knowledge the best general arguments have never been made on the forum at the top level because it's sort-of-assumed that everybody knows them. So while you're saying Peter's post is the only such post you could find, that's still more than the reverse (and with your post, it's now 2 - 0).
At the comment level it's easy to find plenty of examples of people making anti-meta arguments.
I think it's not quite what you're looking for, but I wrote How valuable is movement growth?, which is an article analysing the long-term counterfactual impact of different types of short-term movement growth effects. (It doesn't properly speak to the empirical question of how short-term effort into meta work translates into short-term movement growth effects.)
Huh, there is a surprising lack of a canonical article that makes the case for meta work. (Just tried to find one.) That said, it's very common when getting interested in EA to hear about GiveWell, GWWC and 80K, and to look them up, which gives you a sense of the arguments for meta.
Also, I would actually prefer that the arguments against also be org-specific, since that's typically more decision-relevant, but a) that's more work and b) it's hard to do without actually being a part of the organization.
Anyway, even though there's not a general article arguing for meta (which I am surprised by), that doesn't particularly change my belief that a lot of people know the arguments for but not the arguments against. This has increased my estimate of the number of people who know neither the arguments for nor the arguments against.
Sure, I think we're on the same page here.
I'm hoping/planning to plug both of those holes (a lack of org-specific criticism, and the uncompiled general arguments in favour) in the next few weeks, so did want to double-check that there wasn't a canonical piece that I was missing.
Hypothesis: there's lots of good, informal meta work to be done, like convincing your aunt to donate to GiveWell rather than Heifer International or your company to do a cash fundraiser rather than a canned food drive. But the marginal returns diminish real quickly: once you've convinced all the relatives that are amenable, it is really hard to convince the holdouts or find new relatives. But the remaining work isn't just lower expected value, it has much slower, more ambiguous feedback loops, so it's easy to miss the transition.
Object level work is hard, and there are few opportunities to do it part time. Part time meta work is easy to find and sometimes very high value. My hypothesis is that when people think about doing direct work full time, these facts conspire to make meta work the default choice. In fact full time meta work is the most difficult thing, because of poor feedback loops, and the easiest to be actively harmful with, because you risk damaging the reputation of EA or charity as a whole.
I think we need to flip the default so that people look to object-level work, not meta, when they have exhausted their personal low hanging fruit.
I'm confused about what you mean by the "default". Do you mean the default career choice?
My impression was that most people don't even do their personal low hanging fruit because of the social awkwardness around it. What sorts of things do you think people do after exhausting their personal low hanging fruit?
If you mean that the default choice is to work at a meta organization, that seems unlikely -- most meta organizations are small, and it's my impression that CEA often has trouble filling positions. According to the annual survey, 512 people said they were going to earn to give, while only 190 people said "direct charity/non-profit work", and only a portion of those would be at meta organizations. So it seems like earning to give is the default choice.
The feedback loops don't seem poor to me. If you're trying to do outreach, you can see exactly how your techniques are working based on how many people you get interested and how interested they are and what they go on to do.
If you're working in animal welfare, you could turn off a lot of people ("those preachy vegans, they're all crazy"), harming animals. If you're working in x-risk, there can often be the chance that you actually increase x-risk (for example, you show how to have some basic AI safety features, and then people think the safety problem is solved and build an AGI with those safety features which turns out not to be enough). Even in global poverty, we have stories like PlayPump, though in theory we should be able to avoid that.
If you include long-run far future effects there are tons of arguments that action X could actually be net negative.
I’m closest to the EA Foundation and know that their strategy rests to a great part on focusing on hard-to-quantify high risk–high return projects because these are likely to be neglected. I don’t know if other meta organizations are doing something similar, but it is possible.
Yes. Good point and another reason fund ratios are silly (and possibly toxic). The other one is this one. I’ve written an article on a dangerous phenomenon that has been limiting the work in some cause areas that is also related to this attribution problem.
Huh, interesting. I don't know much about the EA Foundation, but my impression is that this is not the case for other meta orgs.
Yeah, I forgot about evaluating from a growth perspective, despite reading and appreciating that article before. Whoops.
Re #6: The only object-level cause discussed is global poverty and health interventions. However, other object-level causes seem much more structurally similar to meta-level work. For instance, this description would seem to hold true of much of the work on X-risk:
Hence insofar as this is an issue (though see Rob's and Ben's comments) it's not unique to meta-level work.
Re #7: Much of the impact from current work on X-risk plausibly derives from getting more valuable actors involved. Since there are several players in the EA X-risk space, this means that it may be hard to estimate which EA X-risk org caused more valuable actors to get involved in X-risk, just like it may be hard to estimate which EA meta-org caused EA movement growth. Thus this problem doesn't seem to be unique to meta-orgs. (Also, I agree with Ben that one would like to see detailed case arguing that these are actually problems, rather than just pointing out that they might be problems.)
This points to the fact that much of the work within object-level causes is "meta" in the sense that it concerns getting more people involved, rather than in doing direct work. However, it is not "meta" in the sense used in this post. (Ben discussed this distinction in his reply to Hurford - see his remark on 'second level meta'.)
Generally, I think that the discussion on "meta" vs "object-level" work would gain from more precise definitions and more conceptual clarity. I'm currently working on that.
Re #8:
I don't understand why global poverty and health RCTs (which I suppose is what you refer to) would make a difference. What you're discussing is whether someone's investment makes a difference, or whether what they're trying to do would have occurred anyway. For instance, whether their donating to AMF leads to fewer people dying from malaria. I think that's plausibly the case, but the question of RCTs vs other kinds of evidence - e.g. observational studies - seems orthogonal to that issue.
Current neglectedness of an existential risk is not necessarily a good guide to future neglectedness. Hence focussing on currently neglected risks does not guarantee that you have a large counterfactual impact.
I'm currently looking into the issue of future investment into X-risk and there doesn't seem to be that much research done on it, so it's not clear to me that people have tackled this problem. It's generally very difficult.
It thus seems to me the situation regarding X-risk is quite analogous to that regarding meta-work on this score, too, but I am not sure I have understood your argument.
Re #3 (which you support, though you don't comment on it further):
To the contrary, meta-work can be a wise choice in the face of uncertainty about what the best cause is. Meta-work is supposed to give you resources which can be flexibly allocated across a range of causes. This means that if we're uncertain of what object-level cause is the best, meta-work might be our best choice (whereas being sure what the best cause is is a reason to work on that cause instead of doing meta-level work).
One of the more ingenious aspects of effective altruism is that it fits an uncertain world so well. If the world were easy to predict, there would be less of a need for a movement which can shift cause as we gather more evidence of what the top cause is. However, that is not the world we're living in, as, e.g., the literature on forecasting shows.
Given what we know about human overconfidence, I think there is more reason to be worried that people are overconfident about their estimates of the relative marginal expected value of object-level causes, than that they withhold judgement of what object-level cause is best for too long.
It is definitely true for animal welfare, but in this case ACE takes this into account when making its recommendations, which defuses the trap. I'm not too familiar with X-risk organizations so I don't know to what extent it is true there -- it seems plausible that it is also an issue for X-risk organizations.
I would in fact count this as "meta" work -- it would fall under "promoting effective altruism in the abstract".
My point is that an RCT proves to you that distributing bed nets in a certain situation causes a reduction in child mortality. There is no uncertainty about the counterfactual -- that's the whole point of the control. (Yes, there are problems with generalizing to new situations, and there can be problems with methodology, but it is still very good evidence.)
On the other hand, when somebody takes the GWWC pledge, you have next to no idea how much and where they would have donated had they not taken the pledge. (GWWC
In both cases you can have concerns about funding counterfactuals ("What if someone else had donated and my donation was useless?") but with meta work you often don't even know the counterfactuals for the actual intervention you are implementing.
Given what you say about future investment into X-risk, it makes sense that the situation is analogous for X-risk. I wasn't aware of this.
Maybe, I'm not sure. It feels more to me like we should be worried about overconfidence once someone makes a decision, but I haven't seriously thought about it.
I don't think that to promote X-risk should be counted as "promoting effective altruism in the abstract".
There are two kinds of issues here:
1) Does the intervention have the intended effect, or would that effect have occurred anyway?
2) Does the donation make the intervention occur, or would that intervention have occurred anyway (for replaceability reasons)?
Bednet RCTs help with the first question, but not with the second. For meta-work and X-risk both questions are very tricky.
Yes, I agree.
"On average the cost to 80,000 Hours of a significant plan change is £1,667."
80,000 Hours' average cost per plan change this year was more like £300. It's a good example of increasing returns to scale/quality being a more important factor than shrinking opportunities - the marginal cost is way lower than the long term average.
The cost of getting an extra plan change by running additional workshops is about the same, which suggests the short term marginal is about the same as the short-term average.
Yeah, I was looking for this year's numbers because I figured it would have gone below £1000. But that's impressively low, damn. One more point to economies of scale.
What drove the improvement to £300?
The online career guide and workshop keep getting better, and we've become more skilled at promoting them. The fixed costs of producing it all are now spread over a much larger number of readers (~1 million a year).
Approximately what is that £300 spent on? Is it staff time, and if so, what is most of that staff time spent on? Talking to individuals and persuading them? Researching what careers would work for them and telling them? Or something completely different?
Just a small comment. Shouldn't we really be calling this worry about 'movement building' rather than 'meta'? Meta to me means things like cause prioritisation.
Yeah I didn't have a great term for it so I just went with the term that was used previously and made sure to define what I meant by it. I think this is a little broader than movement building -- I like the suggestion of "promotion traps" above.
Relevant to #1b. Overestimating impact:
http://lesswrong.com/lw/76k/the_optimizers_curse_and_how_to_beat_it/