Hi everyone,

Recently, I decided to read one of ACE’s charity evaluations in detail, and I was extremely disappointed with what I read. I felt that ACE's charity evaluation was long and wordy, but said very little. 

Upon further investigation, I realized that ACE’s methodology for evaluating charities often rates charities as more cost-effective for spending more money to achieve the exact same results. This rewards charities for being inefficient and punishes them for being efficient.

ACE’s poor evaluation process leads to ineffective charities receiving recommendations, and many animals are suffering as a result. After realizing this, I decided to start a new charity evaluator for animal charities called Vetted Causes. We wrote our first charity evaluation assessing ACE, and you can read it by clicking the attached link.       

 

Best,

Isaac


Thank you for spending time analyzing our methods. We appreciate those who are willing to engage with our work and help us improve the accuracy of our recommendations and reduce animal suffering as much as possible.

Based on previously received feedback and internal reflection, we have significantly updated our evaluation methods in the past year and will be publishing the details next Tuesday when we release our charity recommendations for 2024. From what we can tell from a quick skim, we think that our changes largely address Vetted Causes’ concerns here, as well as the detailed feedback we received last year from Giving What We Can (see also our response at the time) as part of their program that evaluates evaluators. Our cost-effectiveness analyses no longer use achievement or intervention scores, but rather directly calculate cost-effectiveness by dividing impact by cost, as you suggest. That being said, our work will never be perfect so we invite anyone reading this with the expertise to improve the rigor of our work to reach out, now or in the future.

Although your comments are related to methods that we no longer use, we’d like to spend more time understanding and engaging with them, learning from them, and potentially correcting any misconceptions. Unfortunately, we won’t have the opportunity to do so until after our charity recommendations are released next week. Additionally, it might be a comfort to know that for the past few months, Giving What We Can has been assessing ACE’s new evaluation methods along with a panel of other experts and that they intend to publish the results later this month.

Thank you.

- The ACE team

Hi,

Thank you for your response!

we have significantly updated our evaluation methods in the past year and will be publishing the details next Tuesday when we release our charity recommendations for 2024. From what we can tell from a quick skim, we think that our changes largely address Vetted Causes’ concerns here, as well as the detailed feedback we received last year from Giving What We Can (see also our response at the time) as part of their program that evaluates evaluators. Our cost-effectiveness analyses no longer use achievement or intervention scores, but rather directly calculate cost-effectiveness by dividing impact by cost, as you suggest.

We are glad to hear that ACE has changed their evaluation methods, and we hope that the changes effectively address the concerns listed in our review. 

We look forward to seeing ACE’s new charity recommendations when they are released next week. 

Hi Isaac! Now that we’ve announced our 2024 Recommended Charities, we’ve had more time to process your feedback. Thanks again for engaging with our work.

As mentioned before, we’ve substantively updated our evaluation methods this year. This was informed in part by detailed feedback we received as part of Giving What We Can’s 2023 ‘Evaluating the Evaluators’ project, some of which aligns with your feedback. 

One of these changes is that we now seek to conduct more direct cost-effectiveness analyses, rather than the 1-7 scoring method that we used last year. This more direct approach is possible in part thanks to Ambitious Impact’s recent work to allow quantification of animal suffering averted per dollar. Of course, these kinds of calculations are still extremely challenging, limited, and subject to significant uncertainties; we describe our methods and their limitations on our website. For example, while cost-effectiveness = impact divided by cost, it can be difficult to measure impact meaningfully in a way that is also quantifiable, so we rely on other criteria to help us make our assessments.
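
To illustrate the direct approach with purely hypothetical numbers (these are not our actual estimates, nor Ambitious Impact’s), the core calculation reduces to a single division:

```python
# A minimal sketch of the direct calculation; all figures are hypothetical,
# for illustration only (not actual estimates from ACE or Ambitious Impact).
suffering_averted = 500_000  # hypothetical units of animal suffering averted
program_cost = 250_000       # hypothetical program cost in dollars

cost_effectiveness = suffering_averted / program_cost
print(cost_effectiveness)  # 2.0 (hypothetical units averted per dollar)
```

The difficulty, as noted above, is in estimating the numerator meaningfully, which is why other criteria still inform our assessments.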

Another major change was introducing a formal Theory of Change assessment to understand the reasoning, evidence base, and limitations around each charity’s main programs. In our 2023 Evaluations, we discussed these considerations in our Recommendations Decisions meetings but did not systematically incorporate them into our public reviews. Together, we think these changes allow for a more nuanced assessment of charities’ work and (we hope) more informative and accessible reviews.

Regarding the impact of our recommendations, this year, we conducted an assessment of ACE’s programs and our counterfactual influence on funding. As part of this work, we surveyed donors to our Recommended Charity Fund (RCF) and asked them where they’d donate if ACE didn’t exist. This indicated that over 60% of our RCF donors would donate less to animal charities if ACE were not to exist, of whom around 12% would not donate to animal charities at all. We aim to publish these influenced-giving reports on November 29th. We hope this reassures you that animals are not worse off because of ACE’s charity recommendations.  

In terms of your specific feedback on last year’s methodology:

  • ‘Charities can receive a worse Cost-Effectiveness Score by spending less money to achieve the exact same results’ / ‘Charities can receive a better Cost-Effectiveness Score by spending more money to achieve the exact same results’ / ‘Charities can rearrange their budget and achieve the exact same results (with the exact same total expenditures), but their Cost-Effectiveness Score can significantly change.’
    • Your findings here are correct. Because the weighted averages in this model depended on the percentage of expenditure for each factor, they sometimes produced unintended and unhelpful results (see the numeric sketch after this list). In part due to this, we interrogated the outputs of our models in our Recommendation Decisions meetings at the time and considered cost-effectiveness scores alongside other decision-relevant factors (such as their Impact Potential and Room For More Funding), rather than taking cost-effectiveness as the only relevant factor to consider when evaluating charities or prioritizing giving opportunities. This was informed in part by each charity’s uncertainty scores, which helped determine how much weight to assign to cost-effectiveness and other criteria in our final recommendations decisions. As you would expect given their work, Legal Impact for Chickens’ uncertainty scores were among the highest of our 2023 evaluated charities. Of all our 2023 Recommended Charities, our Recommendations Decisions discussions played the biggest role for Legal Impact for Chickens, given that our models were not as well suited to their work compared to those for other charities.
    • How we addressed this in our 2024 Evaluations: As noted above, we now do direct cost-effectiveness analysis rather than using a weighted factor model. We think the role of our Theory of Change-focused discussions in our 2023 Recommendation Decisions meetings should have been more systematic and more clearly communicated in our 2023 reviews, which is one reason why we introduced the new Theory of Change assessment this year.
  • ‘Charities can have 1,000,000 times the impact at the exact same price, and their Normalized Achievement Scores and Cost-Effectiveness Score can remain the same.’
    • This isn’t the case, but we didn’t publish the full details about our method for assessing the impact of books, podcasts, and other interventions, so we see why this wasn’t clear. Essentially for each intervention in our Menu of Interventions we identified proxies for its likely impact. For books, we had intended to include sales/views as well as a rating of the overall audience response/reviews. In practice, this wasn’t possible for various reasons given the wide variation in types of publication (e.g., some publications had not been released yet, or had been provided directly to the audience with no feedback collected), so we had to factor in such considerations on a more case-by-case basis in our Recommendations Decisions discussions. Issues such as this highlighted to us the inherent limitations of seeking to distill a charity’s work in a weighted factor model, given e.g. the large variation in tactics used by animal advocacy charities and the challenges involved in obtaining the necessary data to score their achievements based on pre-set criteria.
    • We used additive rather than multiplicative scoring because our objective was to create weighted factor models that reflect the quality of achievements (rather than, e.g., estimating the number of animals helped by books being written). Since we’ve transitioned to directly estimating cost-effectiveness, we now use a straightforward multiplication for factors like “likelihood of implementation.”
    • How we addressed this in our 2024 Evaluations: Same as the point above: we have now updated to a more direct cost-effectiveness analysis rather than using this weighted factor model, and have also introduced a new Theory of Change assessment.
  • ‘Charities can increase their Normalized Achievement Scores and Cost-Effectiveness Score by breaking down actions into smaller steps, even if the overall results remain unchanged.’
    • This actually isn’t the case (sorry if this wasn’t clear). Breaking down an achievement into smaller steps would drive up the ‘Achievement quantity’ score, but would be offset by lower ‘Achievement quality’ scores for each achievement. However, there was still a risk of this introducing inconsistency into the model, which is another reason why we updated our methods this year.
  • ‘The most important factor in determining the Normalized Achievement Score of an intervention (Impact Potential Score) is decided before the intervention even begins. This makes the maximum Normalized Achievement Score for certain interventions relatively low, even if they have extremely high impact.’
    • We developed this model because past evaluations have shown that the intervention type drives much of the impact of a charity’s achievements. Starting with a baseline intervention score and adjusting it still allows for particularly strong implementations to at least partially make up for a lower intervention score. That said, we agree with you on this model’s shortcomings. As with the cost-effectiveness model, we interrogated the model’s outputs in our Recommendation Decisions meetings and had a mechanism to weight Impact Potential lower in our decision-making when we were less certain about its relevance. 
    • How we addressed this in our 2024 Evaluations: We have updated this model and now only use it in a very limited way to supplement a qualitative assessment of charities’ work during the charity selection phase, rather than during the Evaluations themselves.
  • ‘Legal Impact for Chickens did not achieve any favorable legal outcomes, yet ACE rated them a Recommended Charity.’
    • When ACE considers impact for animals, we consider all the ways that animal suffering might be reduced when interventions are implemented. While they did not secure a litigation win, Legal Impact for Chickens’ Costco lawsuit garnered significant media attention that put pressure on the companies being litigated against. While their work is more ‘hits-based’ than some of our other Recommended Charities, we think the considerable impact of any future legal wins means high expected value for this work overall, especially now that funding from ACE, Open Philanthropy, and the EA Animal Welfare Fund has allowed them to hire more litigators. Check out Alene Anello’s recent EA Forum post for an update on Legal Impact for Chickens’ latest achievements.
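
Here is the numeric sketch referenced in the first bullet above. The scores and the two-item budget breakdown are hypothetical and heavily simplified (this is not our actual 2023 model, which had more factors), but they reproduce the effect Vetted Causes identified: under a budget-share-weighted average, spending far less to achieve the same high-scoring result can lower the overall score.

```python
def budget_weighted_score(items):
    """Average the per-achievement scores, weighted by share of spending.

    items: list of (dollars_spent, score) pairs.
    """
    total_spent = sum(dollars for dollars, _ in items)
    return sum((dollars / total_spent) * score for dollars, score in items)

# Hypothetical figures; only the two lawsuit costs echo the review's example.
other_programs = (800_000, 3.0)  # rest of the budget, middling score
costly_lawsuit = (204_428, 6.0)  # expensive achievement with a high score
cheap_lawsuit = (1_566, 6.0)     # same achievement and score, far cheaper

print(round(budget_weighted_score([other_programs, costly_lawsuit]), 2))  # 3.61
print(round(budget_weighted_score([other_programs, cheap_lawsuit]), 2))   # 3.01
```

Achieving the identical outcome for roughly $200,000 less shrinks the high-scoring item’s budget share and drags the weighted average down, whereas a direct impact-divided-by-cost calculation would improve as the denominator shrinks.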

Thanks again for your engagement with our evaluations. We hope you get in touch with us directly if you come across new evidence-based methods to meaningfully capture cost-effectiveness or to improve the evaluation of animal charities. We might also reach out to you via email in the coming weeks as we go through retrospectives and plan for next year’s evaluation. Because of the complexity of the animal welfare cause area, the many uncertainties and knowledge gaps in the field of charity evaluation, and the urgency and scope of suffering, we embrace productive collaboration.  

Thank you.

-  The ACE team

Hi,

Thank you for taking the time to read our review and for responding to each of our points. We really appreciate ACE’s willingness to engage with feedback and acknowledge problems.

Regarding your clarifications related to the calculation of Normalized Achievement Scores:

‘Charities can have 1,000,000 times the impact at the exact same price, and their Normalized Achievement Scores and Cost-Effectiveness Score can remain the same.’

  • This isn’t the case, but we didn’t publish the full details about our method for assessing the impact of books, podcasts, and other interventions, so we see why this wasn’t clear. Essentially for each intervention in our Menu of Interventions we identified proxies for its likely impact. For books, we had intended to include sales/views as well as a rating of the overall audience response/reviews. In practice, this wasn’t possible for various reasons given the wide variation in types of publication (e.g., some publications had not been released yet, or had been provided directly to the audience with no feedback collected), so we had to factor in such considerations on a more case-by-case basis in our Recommendations Decisions discussions.

We are glad to hear that ACE was accounting for these factors behind the scenes.

‘Charities can increase their Normalized Achievement Scores and Cost-Effectiveness Score by breaking down actions into smaller steps, even if the overall results remain unchanged.’

  • This actually isn’t the case (sorry if this wasn’t clear). Breaking down an achievement into smaller steps would drive up the ‘Achievement quantity’ score, but would be offset by lower ‘Achievement quality’ scores for each achievement. However, there was still a risk of this introducing inconsistency into the model, which is another reason why we updated our methods this year.

Thank you for clarifying this. From the publicly available rubrics for calculating Achievement Quality Scores, it did not seem like breaking down an achievement into smaller steps would decrease the Achievement Quality Score at all. However, given that ACE was accounting for factors outside of the publicly available rubrics, it makes sense that this decrease could occur.

That being said, we believe it is important for ACE to fully disclose its methodology to the public and avoid relying on hidden evaluation criteria. This transparency would allow people from outside the organization to understand how ACE's charity evaluation metrics (i.e. Normalized Achievement Scores) were calculated. 

We might also reach out to you via email in the coming weeks as we go through retrospectives and plan for next year’s evaluation. Because of the complexity of the animal welfare cause area, the many uncertainties and knowledge gaps in the field of charity evaluation, and the urgency and scope of suffering, we embrace productive collaboration.  

We appreciate your openness to collaboration. Feel free to reach out to us at any time at hello@vettedcauses.com.

Did you ask ACE to review this before publishing? It seems like the kind of thing that would be worth getting feedback on before publishing. I didn't look at this for more than a couple minutes, but I saw immediately that there might be some conceptual disagreements between you and ACE - for example, I noticed that in your first example, you assume (I believe) that if LIC didn't spend 200k on the lawsuit against Costco, they wouldn't spend it on anything else. It's unclear to me that this is the counterfactual, or how ACE is conceptualizing those funds. There might be reasoning behind their decision-making that they could share, which would be useful to your critiques.

I also felt like this was pretty politically motivated. Not sure if that is your intention, but paragraphs like this:

ACE's recommendations determine which animal charities receive millions of dollars in donations.[1] Thus far, we have reviewed 5 of ACE's "Top 11 Animal Charities to Donate to in 2024" and only one of them (Shrimp Welfare Project) appears to be an effective charity for helping animals. ACE's poor evaluation process leads to ineffective charities receiving recommendations, and many animals are suffering as a result.

Without any evidence, this feels pretty intense. ACE is kind of low hanging fruit to pick on in the EA space, so this read to me like more of that, without necessarily the evidence base to back it. Reading your report, I felt kind of like "oh, there are interesting assumptions here, would be interested to learn more", and not "ACE is doing an extremely bad job."

E.g. I think the questions that would be good to ask in a critique of ACE might be:

  • If ACE didn't exist, how would the funds they direct be spent otherwise? Would that be better or worse for animals?
  • Is historical track record / cost-effectiveness the only lens on which to evaluate charities?
    • If the answer is yes, seems very hard to start new things!
    • I don't know if the LIC legal case is this, but celebrating the potential impact of promising bets that didn't pan out seems good to me.

I also think getting feedback on statements like this would be really helpful:

The correct formula for calculating cost-effectiveness is simply impact divided by cost. Rather than using this simple formula, ACE has elected to create a methodology that does not properly account for impact or cost.

I think ACE has wanted to do this at points in their history — my impression is just that it is incredibly difficult, so they've approached it from other angles instead. I also don't think it's clear to me that ACE's goal is to report cost-effectiveness. I think clarifying this with them, and getting a sense of why they don't do what you see as the simple approach would be useful for making this critique stronger. And, I don't think people should make giving decisions based only on historic cost-effectiveness - just because an opportunity was impactful doesn't mean the organization needs more funds to do that work, that it will scale, work in the future, etc.

I don't disagree that ACE might be directing funds to ineffective charities! I don't really think non-OpenPhil EA donors should give to farmed animal welfare, for example. But, I don't think it is obvious to me that ACE going away means money going to more effective charities - I expect it would mostly be worse - people giving to animal charities with basically no vetting.

That being said, critique of critical organizations is great in my opinion, so appreciate you putting this out there!

"I don't really think non-OpenPhil EA donors should give to farmed animal welfare, for example." Wow, this is interesting! I would love to know what you mean by this?

(I responded privately to this but wrote up some related reflections a while ago here). 

Having read your reflections, I'm still curious as to why you don't think non-OpenPhil donors should give to farmed animal welfare, if you feel comfortable sharing it publicly. I guessed four options, ordered from most to least likely, but I might have misunderstood the post.

  1. We should donate to wild animal welfare instead, as it's more cost-effective
  2. There are no donation opportunities that counterfactually help a significant amount of farmed animals
  3. There is no strong moral obligation to improve future lives, and donations to farmed animal welfare necessarily improve future lives, as farmed animal lives are very short
  4. Tomasik-style arguments on the impact of animal farming on the amount of wild animal suffering

Is it a combination of these? As a concrete example, I'm curious if you believe that the Shrimp Welfare Project shouldn't be funded, should be funded by "non-EA" donors, or will be funded anyway and donors shouldn't worry about it.

 

By the way, thank you for nudging towards sharing evaluations with the evaluated organization before posting, I think it's a really valuable norm.

Thanks! My wording in the above message was imprecise, but I mean something like farmed vertebrates. SWP is probably among the two most important things to fund, in my opinion.

Basically I think the size of good opportunities in farmed animal advocacy is smaller than OpenPhil's grantmaking budget and there are few scalable interventions, though I don't think I want to go into most of the reasons publicly. Given that they've stopped funding many of what I believe are more cost-effective projects, and that EA donors are basically the only people willing to fund those, EA donors should be mostly inclined to fund things OpenPhil can't fund instead.

So some combination of 1+2 (for farmed vertebrates) + other factors.

I also felt like this was pretty politically motivated. Not sure if that is your intention, but paragraphs like this:

ACE's recommendations determine which animal charities receive millions of dollars in donations.[1] Thus far, we have reviewed 5 of ACE's "Top 11 Animal Charities to Donate to in 2024" and only one of them (Shrimp Welfare Project) appears to be an effective charity for helping animals. ACE's poor evaluation process leads to ineffective charities receiving recommendations, and many animals are suffering as a result.

Without any evidence, this feels pretty intense. ACE is kind of low hanging fruit to pick on in the EA space, so this read to me like more of that, without necessarily the evidence base to back it. Reading your report, I felt kind of like "oh, there are interesting assumptions here, would be interested to learn more", and not "ACE is doing an extremely bad job." 

 

What claims did we make that we did not provide evidence for? 

…we have reviewed 5 of ACE's "Top 11 Animal Charities to Donate to in 2024" and only one of them (Shrimp Welfare Project) appears to be an effective charity for helping animals. ACE's poor evaluation process leads to ineffective charities receiving recommendations, and many animals are suffering as a result.

I understand these are forthcoming, but no evidence is provided for this entire part - part of the reason I pushed on this is that I think seeing your alternative evaluations would be very helpful for interpreting the strength of the critique of ACE. Without seeing them, I can’t evaluate the latter half of the quoted text. And in my eyes, if these are similar to the evaluation here of LIC, it’s pretty far from demonstrating that ineffective charities are receiving recommendations, etc. And, given that you’ve only evaluated <50% of their charities so far, it seems premature to make the overall claim. I think the overall claim is very possibly true, but again, I think to make the argument that animals are directly suffering as a result of this, you’d have to demonstrate that those charities are worse than other donation options, that donors would give to the better options, etc.

Note: this reply addresses everything Abraham claims we did not provide evidence for. 

“ACE's poor evaluation process leads to ineffective charities receiving recommendations”

Our review covered how under ACE’s evaluation process:

  1. Charities can receive a worse Cost-Effectiveness Score by spending less money to achieve the exact same results.
  2. Charities can have 1,000,000 times the impact at the exact same price, and their Cost-Effectiveness Score can remain the same.
  3. The most important factor in determining the impact of an intervention is decided before the intervention even begins. 

This is clear evidence that ACE uses a poor evaluation process. Is the fact that ACE’s evaluation process rewards inefficiency, and punishes efficiency, “no evidence” for ACE recommending ineffective charities?

If you’d like me to get even more specific, let’s look at Problem 1 of our review:

We go on to detail how if LIC had spent less than $2,000 on the lawsuit (saving over $200,000) and achieved the exact same outcome, ACE would have assigned LIC a Cost-Effectiveness Score of 1.8. The lowest Cost-Effectiveness Score ACE assigned to any charity in 2023 was 3.3. This means if LIC had spent less than $2,000 on the lawsuit, LIC's Cost-Effectiveness Score would have been significantly worse than any charity ACE evaluated in 2023. 

Instead, LIC spent over $200,000 on the lawsuit, and ACE rewarded them for this inefficiency by giving them a Cost-Effectiveness Score of 3.7, and deeming LIC a top 11 animal charity.

As we noted in our review, these Cost-Effectiveness Scores are defined by ACE as “how cost effective we think the charity has been”. LIC achieved no favorable legal outcomes despite receiving over a million dollars in funding. As we also noted in our review, every lawsuit LIC filed was dismissed for failing to state a valid legal claim.

If I provided evidence that a Law Firm Rating Organization rewards law firms for losing lawsuits and wasting money, and punishes law firms for winning lawsuits and saving money, would this be no evidence that the Law Firm Rating Organization is recommending ineffective law firms?

ACE's poor evaluation process leads to ineffective charities receiving recommendations, and many animals are suffering as a result.

Our review details how ACE’s recommendations direct the flow of millions of dollars. Are you asking for evidence that directing millions of dollars toward ineffective animal charities, rather than effective ones, leads to animal suffering?

we have reviewed 5 of ACE's "Top 11 Animal Charities to Donate to in 2024" and only one of them (Shrimp Welfare Project) appears to be an effective charity for helping animals.

Imagine a film critic watches 5 of the 11 films that received a 'Best Films' award and writes, “Of the five films I’ve seen, only one appears to deserve the award. I plan to release my reviews of the films shortly.” Does this statement by the film critic require evidence? 

(Responding because this is inaccurate): My claim in the comment above was that you haven't provided any evidence that:

  • 5 / 11 (or more) ACE top charities are not effective
  • That animals are suffering as a result of ACE recommendations

Which remains the case — I look forward to you producing it. 

(Responding because this is inaccurate)

I don't know what you're saying is inaccurate. My reply addressed every single word from the section you claimed I didn't provide evidence for. 

My claim in the comment above was that you haven't provided any evidence that:

  • 5 / 11 (or more) ACE top charities are not effective

We never made this claim. 

  • That animals are suffering as a result of ACE recommendations

I’ll ask again. Our review details how ACE is rewarding charities for inefficiency (and punishing them for efficiency), and how LIC was rewarded for their inefficiency with the designation "Top 11 Animal Charities to Donate to in 2024." Our review also details how ACE's recommendations direct the flow of millions of dollars. Are you asking for evidence that directing millions of dollars toward ineffective animal charities, rather than effective ones, leads to animal suffering?

This is starting to feel pretty bad faith, so I'm actually going to stop engaging. 

Hi Abraham,

I didn't look at this for more than a couple minutes

Thank you for reading some of the article. I hope that you find some time to read the rest.

Did you ask ACE to review this before publishing?

No, I did not ask ACE. I hope that this article inspires a public discussion.

I noticed that in your first example, you assume (I believe) that if LIC didn't spend 200k on the lawsuit against Costco, they wouldn't spend it on anything else. It's unclear to me that this is the counterfactual, or how ACE is conceptualizing those funds.

What do you mean by conceptualizing funds? In this hypothetical, they simply spend $200k less on the lawsuit. LIC did not spend their entire budget, and charities oftentimes do not. Under ACE’s methodology, LIC’s cost-effectiveness would worsen if they spent $200k less and achieved the exact same total outcomes as a charity. The calculations we’ve done are 100% objective, and if you can find an error that we made, please let us know. You can find those calculations here: 

Reading your report, I felt kind of like "oh, there are interesting assumptions here, would be interested to learn more"

What assumptions are you referring to? 

If ACE didn't exist, how would the funds they direct be spent otherwise? Would that be better or worse for animals?

If ACE didn’t exist, I would hope more funds would go to effective charities instead of ineffective ones. I hope to take part in this positive change.

  • Is historical track record / cost-effectiveness the only lens on which to evaluate charities?
    • If the answer is yes, seems very hard to start new things!
    • I don't know if the LIC legal case is this, but celebrating the potential impact of promising bets that didn't pan out seems good to me.

LIC has a historical track record, and it is a bad one. People should have the opportunity to start something new. However, they shouldn’t be rated a top 11 animal charity after receiving over a million dollars in funding and failing to achieve any positive legal outcomes.

I also felt like this was pretty politically motivated.

I saw that you call into question the integrity of the article. I want to be clear in saying that we have no relationship with any charity. However, I noticed that you co-founded the Wild Animal Initiative, which is a charity endorsed by ACE. Still, I don’t question your feedback on the article. I hope that going forward you will evaluate our reviews for what they are, rather than suggest something “political” is going on.

That being said, critique of critical organizations is great in my opinion, so appreciate you putting this out there!

Thank you, and I appreciate your feedback! 

Thank you for reading some of the article. I hope that you find some time to read the rest.

To be clear, I read the whole thing - I meant that the fact that a pretty important issue jumped out at me within a few minutes of starting to read struck me as a reason that getting feedback from ACE seems really important.

 

Did you ask ACE to review this before publishing?

I really think you should! I also really think you should ask for feedback from other people who have done charity evaluations, and the charities you evaluate. You should definitely still publish them, but they'll be better critiques for having engaged with the best case for the thing you're critiquing!

 

What do you mean by conceptualizing funds? In this hypothetical, they simply spend $200k less on the lawsuit. LIC did not spend their entire budget, and charities oftentimes do not. Under ACE’s methodology, LIC’s cost-effectiveness would worsen if they spent $200k less and achieved the exact same total outcomes as a charity. The calculations we’ve done are 100% objective, and if you can find an error that we made, please let us know. You can find those calculations here:

Yep, this seems right, but it's also the case that if they did something else with that funding, the effectiveness of that action would be rated much more highly, which also seems correct. I think the issues you point to are interesting, but they strike me as intentional decisions, which ACE may have internal views on, and for which I think getting their feedback might be really important. You are correct about a mathematical fact, but both you and ACE seem to have different goals (calculating historic cost-effectiveness vs marginal impact of future dollars), and there are assumptions underlying your analysis that, if changed, might change the output.

 

What assumptions are you referring to? 

I meant ACE's assumptions - I thought your post raised some really good questions. They are issues that, if I saw them, I'd email to ACE and ask why they made the choices they made, then choose whether or not to publish them publicly based on their response. Maybe these choices are reasonable, and maybe they aren't - you raised some really good points, I think. But it just seems hard to evaluate in a vacuum.

 

LIC has a historical track record, and it is a bad one. People should have the opportunity to start something new. However, they shouldn’t be rated a top 11 animal charity after receiving over a million dollars in funding and failing to achieve any positive legal outcomes.

Again, I don't really see good evidence for this - what is the typical track record for legal campaigns? How much do they cost? How long do they take to work? These all would be important questions to answer before claiming cost-effectiveness or lack thereof. In this case, I could easily be persuaded to agree with you, but not for any of the reasons in your analysis: the fact that they spent some money on some lawsuits and it didn't work isn't the only evidence I'd want when thinking about whether or not donations to them will be useful.

Yep, this seems right, but it's also the case that if they did something else with that funding, the effectiveness of that action would be rated much more highly, which also seems correct. 

Our hypothetical in Problem 1 of the review is about two scenarios:

  1. LIC spending $204,428 on the Costco lawsuit and achieving outcome X. (this is what actually happened)
  2. LIC spending $1,566 on the Costco lawsuit and achieving outcome X. (this is what happened in the hypothetical)

Note that both scenarios achieve the exact same result, which is outcome X. ACE would rate LIC less cost-effective for spending $1,566 (saving $202,862) to achieve outcome X.

The hypothetical is not about something else they could do with the funding. The hypothetical is about assessing what happens to a charity’s Cost-Effectiveness Score if they save money to achieve the same outcome.

it's also the case that if they did something else with that funding, the effectiveness of that action would be rated much more highly, which also seems correct. 

How do you know that if they did something else with that funding the effectiveness of that action would be rated much more highly? According to ACE, the Costco lawsuit was a particularly cost effective intervention in spite of the fact that the lawsuit was dismissed for failing to state a valid legal claim:

  • "We think that out of all of Legal Impact for Chickens' achievements, the Costco shareholder derivative case is particularly cost effective because it scored high on achievement quality."

I’m also not sure how it would even be possible to evaluate a hypothetical in which LIC does something else with the funding. Could you explain how this hypothetical would work, and how it would be evaluated? All of the examples in our review are 100% objective and based on ACE’s own methodology. There is no subjectivity in our examples, and this was done intentionally. 

You are correct about a mathematical fact, but both you and ACE seem to have different goals (calculating historic cost-effectiveness vs marginal impact of future dollars)

I’m not sure how ACE’s goals could align with the principles of effective altruism if they intentionally created a methodology that contains the problem described above. 

Again, I don't really see good evidence for this - what is the typical track record for legal campaigns? How much do they cost? How long do they take to work? 

Imagine there is a law firm that has received over a million dollars in funding, existed for multiple years, and failed to secure any favorable legal outcomes. Also imagine that their most cost-effective lawsuit was one in which they spent over $200,000, and the lawsuit was dismissed for failing to state a valid legal claim.

If this law firm were rated one of the top 100 law firms in the world, what would you think of the organization that assigned this rating? Would you say there is not good evidence for this being an incorrect rating? 

ACE rated LIC as one of the Top 11 Animal Charities to Donate to in 2024. Prior to being reviewed, LIC received over a million dollars in funding, existed for multiple years, and failed to secure any favorable legal outcomes. According to ACE, LIC’s most cost-effective intervention was one in which they spent over $200,000, and the lawsuit was dismissed for failing to state a valid legal claim.

Are these factors poor evidence for LIC not being one of the 11 best animal charities to donate to? 

I don’t really have a strong view about LIC - as I’ve mentioned elsewhere in the comments, I’m skeptical in general that very EA donors should give to farmed vertebrate welfare issues in the near future. But I don’t find this level of evidence particularly compelling on its own. I think I feel confused about the example you’re giving because it isn’t about hypothetical cost-effectiveness, it’s about historic cost-effectiveness, where what matters are the counterfactuals.

I broadly think the critique is interesting, and again, seems like probably an issue with the methodology, but on its own doesn’t seem like reason to think that ACE isn’t identifying good donation opportunities, because things besides cost-effectiveness also matter here.

But I don’t find this level of evidence particularly compelling on its own.

You don't find these facts particularly compelling evidence that LIC is not historically cost-effective?

  1. LIC’s most cost-effective intervention was one in which they spent over $200,000, and the lawsuit was dismissed for failing to state a valid legal claim.
  2. LIC received over a million dollars in funding prior to being reviewed
  3. LIC existed for multiple years prior to being reviewed
  4. LIC failed to secure any favorable legal outcomes, or to file any lawsuit that stated a valid legal claim.

What would be compelling evidence for LIC not being historically cost-effective? 

I think I feel confused about the example you’re giving because it isn’t about hypothetical cost-effectiveness, it’s about historic cost-effectiveness, where what matters are the counterfactuals.

ACE does 2 separate analyses for past cost-effectiveness, and room for future funding. For example, those two sections in ACE's review of LIC are:

  • Cost Effectiveness: How much has Legal Impact for Chickens achieved through their programs?
  • Room For More Funding: How much additional money can Legal Impact for Chickens effectively use in the next two years?

Our review focuses on ACE's Cost-Effectiveness analysis, not on their Room For More Funding analysis. In the future, we may evaluate ACE's Room For More Funding Analysis, but that is not what our review focused on. 

However, I would like to pose a question to you: Given that ACE often gives charities a worse historic cost-effectiveness rating for spending less money to achieve the exact same outcomes (see Problem 1), how confident do you feel in ACE's ability to analyze future cost-effectiveness (which is inherently more difficult to analyze)?

I don't find that evidence particularly compelling on its own, no. Lots of projects cost more than 1M or take more than a few years to have success. I don't see why those things would be cause to dismiss a project out of hand. I don't really buy social movement theories of change for animal advocacy, but many people do, and it just seems like many social movement-y things take a long time to build momentum, and legal and research-focused projects take forever to play out. Things I'd want to look at to form a view on this (though to be clear, I plausibly agree with you!):

  • How much lawsuits of this type typically cost
  • What the base rate for success is for this kind of work
  • How long this kind of work typically takes to get traction
  • Has anyone else tried similar work on misleading labelling or whatever? Was it effective or not?
  • Has LIC's work inspired other lawsuits, as ACE reported might be a positive side effect?
     

I don't think we disagree that much here, except how much these things matter — I don't really care about ACE's ability to analyze cost-effectiveness outside broad strokes, because I think the primary benefit of organizations like ACE is shifting money to more cost-effective things within the animal space, which I do believe ACE does. I also don't mind ACE endorsing speculative bets that don't pay off — I think there are many things that were worth paying for in expectation that don't end up helping any animals, and will continue to be, because we don't really know very many effective ways to help animals, so the information value of trying new things is high.

But to answer your question specifically, I'd be very skeptical of anyone's numbers on future cost-effectiveness, ACE's or yours or my own, because this is an issue for which it has historically been extremely difficult to estimate cost-effectiveness. I'm not convinced that's the right way to approach identifying effective animal interventions, in part because it is so hard to do well. I don't really think ACE is making cost-effectiveness estimates here though - it seems much more like trying to get a rough sense of relative cost-effectiveness, which, putting aside the methodological issues you've raised, seems like the right approach to me, but it's only a small part of the information I'd want to decide where money should move in animal advocacy.

  • How much lawsuits of this type typically cost
  • What the base rate for success is for this kind of work
  • How long this kind of work typically takes to get traction

The Nonhuman Rights Project provides a possible point of comparison. From 2013 to 2023, they raised $13.2 million. As far as I know, they have never won a case.

I don't find that evidence particularly compelling on its own, no. Lots of projects cost more than 1M or take more than a few years to have success. I don't see why those things would be cause to dismiss a project out of hand.

The question I asked was: "You don't find these facts particularly compelling evidence that LIC is not historically cost-effective?"

The question was not about whether these facts are compelling evidence that LIC won't be successful in the future, or if the project should be dismissed. 

Wait, those are related to each other though - if we haven't seen the full impact of their previous actions, we haven't yet seen their historical cost-effectiveness in full! Also, you cite these as reasons the project should be dismissed in your post - you have a section literally called "Legal Impact for Chickens Did Not Achieve Any Favorable Legal Outcomes, Yet ACE Rated Them a Top Charity", which reads to me as saying you believe it is bad that they were rated a Top Charity, and you make these same arguments (and no others) in the section, suggesting that you think this evidence means they should be dismissed.

Wait, those are related to each other though - if we haven't seen the full impact of their previous actions, we haven't yet seen their historical cost-effectiveness in full!

No, they are not. Historical cost-effectiveness refers to past actions and outcomes—what has already occurred.

All of LIC's legal actions have already been either dismissed or rejected. What are you suggesting we need to wait for before we can analyze LIC's historical cost-effectiveness in full? 

You are conflating the issue of past cost-effectiveness with future potential.

Also, you cite these as reasons the project should be dismissed in your post - you have a section literally called "Legal Impact for Chickens Did Not Achieve Any Favorable Legal Outcomes, Yet ACE Rated Them a Top Charity" which reads to me that you believe that it is bad they were rated a Top Charity, and make these same arguments (and no others) in the section, suggesting that you think this evidence means they should be dismissed.

Did I claim that I don't think LIC "should be dismissed"? 

both you and ACE seem to have different goals (calculating historic cost-effectiveness vs marginal impact of future dollars) 

ACE states (under Criterion 2) that a charity's Cost-Effectiveness Score "indicates, on a 1-7 scale, how cost effective we think the charity has been [...] with higher scores indicating higher cost effectiveness." 

Would you mind clarifying what you believe ACE's goal is, and what you believe my goal is? 

The analysis in my review is entirely about calculating historic cost-effectiveness. ACE's Cost-Effectiveness Scores are also entirely about calculating historic cost-effectiveness. 

From this post, it seems like you’re trying to calculate historic cost-effectiveness and rate charities exclusively on that (since you haven’t published an evaluation of an animal charity yet, I could be wrong here though). My understanding of what ACE is trying to do with its evaluations as a whole is to identify where marginal dollars might be most useful for animal advocacy, and move money from less effective opportunities to those. Cost-effectiveness might be one component of that, but is far from the only one (e.g. intervention scalability might matter, having a diversity of types of opportunities to appeal to different donors, etc.). It’s pretty easy to imagine scenarios where you wouldn’t prefer to only look at cost-effectiveness of individual charities when making recommendations, even if that’s what matters in the end. It’s also easy to imagine scenarios where recommending less effective opportunities leads to better outcomes for animals - maybe installing shrimp stunners is super effective, but only some donors will give to it. Maybe it can only scale to a few M per year but you influence more money than that. Depending on your circumstances, a lot more than cost-effectiveness of specific interventions matters for making the most effective recommendations.

My understanding is also that ACE doesn’t see EAs as its primary audience (but I’m less certain about this). This is a reason I’m excited about your project - seems nice to have “very EA” evaluations of charities in addition to ACE’s. But, I also imagine it would be hard to get charities to participate in your evaluation process if you don’t run the evaluations by them in advance, which could make it hard for you to get the information to do what you’re trying to do, unless you rely on the information ACE collects, which then puts you in an awkward position of making a strong argument against an organization you might need in order to conduct evaluations.

My understanding is ACE has tried to do something that’s just cost-effectiveness analysis in the past (they used to give probability distributions for how many animals were helped, for example). But it’s really difficult to do confidently for animal issues, and that’s part of the reason it’s only a portion of the whole picture (along with other factors like I mention above).

Thank you for your response!

From this post, it seems like you’re trying to calculate historic cost-effectiveness and rate charities exclusively on that (since you haven’t published an evaluation of an animal charity yet, I could be wrong here though)

This is not what we are trying to do. We simply critiqued the way that ACE calculated historic cost-effectiveness, and how ACE gave Legal Impact for Chickens a relatively high historic cost-effectiveness rating despite having no historic success.

My understanding of what ACE is trying to do with its evaluations as a whole is to identify where marginal dollars might be most useful for animal advocacy, and move money from less effective opportunities to those.

ACE does 2 separate analyses for past cost-effectiveness, and room for future funding. For example, those two sections in ACE's review of LIC are:

  • Cost Effectiveness: How much has Legal Impact for Chickens achieved through their programs?
  • Room For More Funding: How much additional money can Legal Impact for Chickens effectively use in the next two years?

Our review focuses on ACE's Cost-Effectiveness analysis, not on their Room For More Funding analysis. In the future, we may evaluate ACE's Room For More Funding Analysis, but that is not what our review focused on. We wanted to keep our review short enough that people could read it without a huge time investment, so we could not include an assessment of every single part of ACE's evaluation process in our review. 

It is also less reasonable to hold ACE accountable for their Room For More Funding analysis, since this is inherently more subjective and difficult to do. It is far easier for ACE (or any charity evaluator) to analyze historic cost-effectiveness than to analyze future cost-effectiveness. However, I would like to pose a question to you: Given that ACE often gives charities a worse historic cost-effectiveness rating for spending less money to achieve the exact same outcomes (see Problem 1), how confident do you feel in ACE's ability to analyze future cost-effectiveness?

My understanding is ACE has tried to do something that’s just cost-effectiveness analysis in the past (they used to give probability distributions for how many animals were helped, for example).

ACE responded to this thread acknowledging that the problems listed in our review needed to be addressed, and that they changed their methodology (to a cost-effectiveness calculation of simply impact divided by cost) to do so: 

This is not what we are trying to do. We simply critiqued the way that ACE calculated historic cost-effectiveness, and how ACE gave Legal Impact for Chickens a relatively high historic cost-effectiveness rating despite having no historic success.

FWIW this seems great - excited to see more comprehensive evaluations. Yeah, I agree with many of your comments here on the granular level — it seems you found something that is a potential issue for how ACE does (or did) some aspects of their evaluations, and publishing that is great! I think we just disagree on how important it is?

By the way, I'm ending further engagement on this (though feel free to leave a response if useful!) just because I already find the EA Forum distracting from other work, and don't have time this week to think about this more. Appreciate you going through everything with me!

Appreciate you going through everything with me!

No problem. Thank you for your replies! 

Thank you for doing this work. I’m very supportive of productive criticism on the Forum. As a moderator, I’d like to recommend this post for tips on how to make criticism more productive. EA is a collective project, and I think that steps such as sharing this feedback with ACE directly and writing a less aggressive title for your post would improve the outcomes of this work.

Thank you for your feedback. We will review the tips and keep them in mind during our future reviews!

Edit: I have also changed the title of the post. For transparency, the original title was: Animal Charity Evaluators (ACE) is Extremely Bad at Evaluating Charities. 

Their evaluation process has been updated (e.g. here), and I'm inclined to wait to see their new evaluations and recommendations before criticizing much, because any criticism based on last year's work may no longer apply. Their new recommendations come out November 12th.

FWIW, I am sympathetic to your criticisms, as applied to last year's evaluations. I previously left some constructive criticism here, too.

Hi Michael,

ACE re-evaluates their Recommended Charities every two years. In our review of ACE, all charities mentioned were evaluated in 2023 (the most recent published review cycle). Therefore, every charity mentioned in our review will still be recommended in ACE's upcoming list of Recommended Charities. 

When the new reviews come out, we will be sure to read them though!

Thanks for writing this. I feel like the following is the crux of your criticism of LIC:

ACE acknowledges the lawsuit was dismissed, but still celebrates this achievement. They note that this achievement would inspire similar lawsuits. Would it be good to inspire more lawsuits that cost $200,000 and are dismissed?

You state this as though the answer is "obviously no", but the answer feels extremely nonobvious to me. I note that you excluded some key things when quoting ACE:

LIC’s first lawsuit, a shareholder derivative case against Costco’s executives for chicken neglect, was featured on TikTok and in multiple media outlets, including CNN Business, Fox Business, The Washington Post, and Meatingplace... We thought the achievement has strong potential for indirect impact, and it received a high amount of media attention. - ACE

The Facebook fan page, as of this writing, still has a post about the lawsuit pinned to the top because apparently the owner decided to boycott after learning about the cruelty.

It sounds like the Costco board also had to take official action:

In a letter dated August 15, 2023, Costco’s board stated that it had “formed a Board committee to review and investigate the demand’s allegations.” LIC’s shareholder clients then met with investigators retained by the committee. - LIC

Is it worth $200k to get a bunch of bad publicity for Costco, force the board to form a committee and hire an investigator, etc.?

I don't know, I'm pretty willing to believe that the answer is "no", but it doesn't seem obvious to me. I could pretty easily believe that the CEO of the next company they sue would change their policies instead of having to deal with the embarrassment of asking the board to form a committee to investigate.

Hi Ben,

Thank you for your response!

I will address your points, but first I would like to clarify what we believe the crux of the problem is with LIC being deemed a top 11 animal charity by ACE. 

In Problem 1 of our review, we state the following:

We go on to detail how if LIC had spent less than $2,000 on the lawsuit (saving over $200,000) and achieved the exact same outcome, ACE would have assigned LIC a Cost-Effectiveness Score of 1.8. The lowest Cost-Effectiveness Score ACE assigned to any charity in 2023 was 3.3. This means if LIC had spent less than $2,000 on the lawsuit, LIC's Cost-Effectiveness Score would have been significantly worse than any charity ACE evaluated in 2023.

Instead, LIC spent over $200,000 on the lawsuit, and ACE rewarded them for this inefficiency by giving them a Cost-Effectiveness Score of 3.7, and deeming LIC a top 11 animal charity.

This is the crux of the problem, and it is really an issue with ACE deeming LIC a top 11 animal charity, not with LIC itself. ACE elected to give LIC this distinction, and LIC merely accepted it. 

I would also like to note that encouraging or valuing lawsuits that fail to state valid legal claims (but burden defendants or garner publicity) risks causing the legal system to take animal rights/welfare cases less seriously. If courts observe a pattern of weak or legally insufficient cases being filed for publicity or to burden the defendant, they will become skeptical of all animal rights/welfare lawsuits, even those with strong legal merit. Prior to being deemed a top 11 animal charity by ACE, every single lawsuit filed by LIC failed to state a valid legal claim.

I note that you excluded some key things when quoting ACE: LIC’s first lawsuit, a shareholder derivative case against Costco’s executives for chicken neglect, was featured on TikTok and in multiple media outlets, including CNN Business, Fox Business, The Washington Post, and Meatingplace... We thought the achievement has strong potential for indirect impact, and it received a high amount of media attention. - ACE

ACE’s review of LIC contains a section titled “Our Assessment of Legal Impact for Chickens’ Cost Effectiveness”, and the quote you have provided is not part of this section. Our entire review of ACE is about ACE incorrectly calculating cost-effectiveness; consequently, this is the section we decided to focus on. ACE’s review of LIC is over 5,000 words, and we cannot include every quote from it.

Additionally, the quote you’ve provided gives no metrics to gauge how much media attention was received. If media attention is a strong justification for calling a $200,000 lawsuit that failed to state a valid legal claim “particularly cost-effective” (as ACE put it), ACE should provide metrics on how much media attention was received. Ironically, the Facebook post you mentioned appears to contain more metrics on the media attention caused by the Costco lawsuit than ACE's review of LIC does, since the Facebook post at least lists the numbers of likes and comments it received.

As of this writing, the Facebook fan page still has a post about the lawsuit pinned to the top, because the owner apparently decided to boycott after learning about the cruelty.

The Facebook post you referred to received 56 likes and 83 comments. To my understanding, the post is also not pinned to the top; it is simply the last post the Facebook page has made (it appears that the page has not posted in over 2 years). I do not think this is very strong evidence that LIC’s $200,000 lawsuit, which was dismissed for failing to state a valid legal claim, was “particularly cost-effective” (as ACE put it).

It sounds like the Costco board also had to take official action

Correct, the Costco board took official action by rejecting LIC’s demands. 

Is it worth $200k to get a bunch of bad publicity for Costco [...]?

Could you please define what “a bunch of bad publicity for Costco” means? And could you provide evidence that this level of publicity was caused by LIC’s lawsuit? 

Is it worth $200k to [...] force the board to form a committee and hire an investigator, etc.?

Costco’s board formed a committee to review and investigate LIC’s demands. The committee then recommended that the board reject the demand, which they did. This does not appear to be a very good outcome. 

Is it worth $200k to get a bunch of bad publicity for Costco, force the board to form a committee and hire an investigator, etc.?

I don't know; I'm pretty willing to believe that the answer is "no", but it doesn't seem obvious to me. I could pretty easily believe that the CEO of the next company they sue would choose to change their policies rather than deal with the embarrassment of asking the board to form a committee to investigate.

It is ACE’s job to write charity reviews that provide the empirics necessary to answer questions like the one you’ve asked. From your own statement, it seems like ACE has failed to do this. ACE did not provide metrics on how much media attention the Costco lawsuit caused, and did not provide any insight into how much of a burden it was to form a committee to review and investigate LIC's demands (I don’t recall ACE’s review even mentioning this). 

  1. Yes, thank you, I understand that weighting by budget results in the phenomenon you described. I didn't comment on this since it sounds like ACE is planning to change it anyway.
  2. I was referring to the publicity listed in ACE's review. The stories appear to be about the lawsuit, so I am not entirely sure what you mean by "could you provide evidence that this level of publicity was caused by LIC’s lawsuit". See e.g. CNN, Fox.
  3. To clarify: I don't care about causing burdens to Costco per se. The reason that burdens are relevant is because future companies might prefer to avoid that burden and instead change their policies. I agree it would be good to have a better model of when this would happen and would be excited for someone to make such a model!

So, I have some mixed views about this post. Let's start with the positive.

In terms of agreement: I do think organizational critics are valuable, and specifically, critics of ACE in the past have been helpful in improving their direction and impact. I also love the idea of having more charity evaluators (even in the same cause area) with slightly different methods or approaches to determining how to do good, so I’m excited to see this initiative. I also have quite a bit of sympathy for giving higher weight to explicit cost-effectiveness models when it comes to animal welfare evaluations.

I can personally relate to the feeling of being disappointed after digging deeper into the numbers of well-respected EA meta organizations, so I understand the tone and frustration. However, I suspect your arguments may get a lot of pushback on tone alone, which could distract from the more important substance of the post and concepts (I’ll leave that for others to address, as it feels less important, in my opinion).

In terms of disagreement: I will focus on what I think is the crux of the issue, which I would summarize as: (a) ACE uses a methodology that yields quite different results than a raw cost-effectiveness analysis; (b) this methodology seems to have major flaws, as it can lead to clearly incoherent conclusions and recommendations easily; and (c) thus, it is better to use a more straightforward, direct CEA.

I agree with points A and B, but I am much less convinced about point C. To me, this feels a bit like an isolated demand for methodological rigor. Every methodology has flaws, and it’s easy to find situations that lead to clearly incoherent conclusions. Expected value theory itself, in pure EV terms, has well-known issues like the St. Petersburg paradox, the optimizer's curse, and general model mistakes. CEAs in general share these issues and have additional flaws (see more on this here). I think CEAs are a super useful tool, but they are ultimately a model of reality, not reality itself, and I think EA can sometimes get too caught up in them (whereas the rest of the world probably doesn’t use them nearly enough). GiveWell, which has ~20x the budget of ACE, still finds model errors and openly discusses how softer judgments on ethics and discount factors influence outcomes (and they consider more than just a pure CEA calculation when recommending a charity).

Overall, being pretty familiar with ACE’s methodology and CEAs, I would expect, for example, that a 10-hour CEA of the same organizations would land quite a bit further from the truth of an organization's actual impact or effectiveness than ACE's current evaluations do. It's not clear to me that spending equal time on pure CEAs versus a mix of evaluative techniques (as ACE currently does) would lead to more accurate results (I would probably weakly bet against it). I think this post overstates the case for discarding a model due to a flaw that can be exploited.

A softer argument, such as “ACE should spend double the percentage of time it currently spends on CEAs relative to other methods” or “ACE should ensure that intervention weightings do not overshadow program-level execution data,” is something I have a lot of sympathy for.

Hi Joey,

Thank you for taking the time to read our review!

(a) ACE uses a methodology that yields quite different results than a raw cost-effectiveness analysis; (b) this methodology seems to have major flaws, as it can lead to clearly incoherent conclusions and recommendations easily; and (c) thus, it is better to use a more straightforward, direct CEA.

I agree with points A and B, but I am much less convinced about point C. 

I would like to point to Problem 1 and Problem 4 from the review:

  1. Charities can receive a worse Cost-Effectiveness Score by spending less money to achieve the exact same results.
  2. Charities can have 1,000,000 times the impact at the exact same price, and their Cost-Effectiveness Score can remain the same.

Effective giving is all about achieving the greatest impact at the lowest cost. ACE’s methodology is not properly accounting for impact, or for cost. 

Using the equation impact / cost at least results in impact being in the numerator, and cost being in the denominator. To me, this alone makes a straightforward, direct CEA a better methodology than the one used by ACE.  
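To make that concrete: writing $I$ for total impact and $C$ for total cost, the direct score $CE = I/C$ cannot produce either problem above, since

$$C \to C/100 \implies CE \to 100 \cdot CE, \qquad I \to 10^6 \cdot I \implies CE \to 10^6 \cdot CE.$$

Spending less to achieve the same results always improves the score, and a million times the impact at the same price always improves the score a million-fold.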

To me, this feels a bit like an isolated demand for methodological rigor. Every methodology has flaws, and it’s easy to find situations that lead to clearly incoherent conclusions. 

I absolutely agree that every methodology has flaws, and we did not mean to imply otherwise. However, the incoherent conclusions described in our review of ACE's methodology are not one-off instances. They are pervasive problems that affect all of ACE's reviews.

Thank you for your feedback! 

If you're correct in the linked analysis, this sounds like a really important limitation in ACE's methodology, and I'm very glad you've shared this!

In case anyone else has the same confusion as me when reading your summary: I think there is nothing wrong with calculating a charity's cost effectiveness by taking the weighted sum of the cost-effectiveness of all of their interventions (weighted by share of total funding that intervention receives). This should mathematically be the same as (Total Impact / Total cost), and so should indeed go up if their spending on a particular intervention goes down (while achieving the same impact).
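Spelling out the algebra behind that claim: if intervention $i$ produces impact $I_i$ at cost $c_i$, and the charity's total spending is $C = \sum_i c_i$, then

$$\sum_i \frac{c_i}{C} \cdot \frac{I_i}{c_i} = \frac{\sum_i I_i}{C} = \frac{\text{Total Impact}}{\text{Total Cost}}.$$

The funding-share weight cancels each intervention's cost exactly, so the weighted sum collapses to the direct ratio.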

The (claimed) cause of the problem is just that ACE's cost-effectiveness estimate does not go up by anywhere near as much as it should when the cost of an intervention is reduced, leading the cost-effectiveness of the charity as a whole to actually change in the wrong direction when doing the above weighted sum!

If this is true it sounds pretty bad. Would be interested to read a response from them.

Of course, the other thing that could be going on here, is that average cost-effectiveness is not the same as cost-effectiveness on the margin, which is presumably what ACE should care about. Though I don't see why an intervention representing a smaller share of a charity's expenditure should automatically mean that this is not where extra dollars would be allocated. The two things seem independent to me.
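In symbols: average cost-effectiveness is $I/C$, while cost-effectiveness on the margin is $dI/dC$, the extra impact produced by the next dollar; the two can come apart whenever returns to spending diminish.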

Hi Toby,

Thank you for your reply! 

Of course, the other thing that could be going on here, is that average cost-effectiveness is not the same as cost-effectiveness on the margin, which is presumably what ACE should care about.

I'm not certain if by cost-effectiveness on the margin, you meant cost-effectiveness in the future if additional funding is obtained. If that's the case, the following information could be helpful. 

ACE does two separate analyses, one for past cost-effectiveness and one for room for more funding. For example, those two sections in ACE's review of LIC are:

  • Cost Effectiveness: How much has Legal Impact for Chickens achieved through their programs?
  • Room For More Funding: How much additional money can Legal Impact for Chickens effectively use in the next two years?

Our review focuses on ACE's Cost-Effectiveness analysis. Additionally, ACE states (under Criterion 2) that a charity's Cost-Effectiveness Score "indicates, on a 1-7 scale, how cost effective we think the charity has been [...] with higher scores indicating higher cost effectiveness." 

This is very helpful, thanks!

Great analysis, Isaac! I worry the Animal Welfare Fund (AWF) has similar problems (see below), but they are way less transparent than ACE about their evaluations, and therefore much less scrutable. Instead of mostly deferring to AWF, I would rather have donors look over ACE's evaluations, discuss their findings with others, and eventually publish them online, even if they spend much less time on these activities than you did.

AWF only runs cost-effectiveness analyses (CEAs) for a minority of applications. According to a comment by Karolina Sarek, AWF's chair, on June 28 (this year):

In the past, we tended to do CEAs more often if: a) The project is relatively well-suited to a back-of-the-envelope calculation b) A back-of-the-envelope calculation seems decision-relevant. At that time, a) and b) seem true in a minority of cases, maybe ~10%-20% of applications depending on the round, to give some rough sense. However, note that there tends to be some difference between projects in areas or by groups we have already evaluated versus projects/groups/areas that are newer to us. I'd say newer projects/groups/areas are more likely to receive a back-of-the-envelope style estimate.

Comparisons across grants also seem to be lacking. From Giving What We Can's (GWWC's) evaluation of AWF in November 2023 (emphasis mine):

Fourth, we saw some references to the numbers of animals that could be affected if an intervention went well, but we didn’t see any attempt at back-of-the-envelope calculations to get a rough sense of the cost-effectiveness of a grant, nor any direct comparison across grants to calibrate scoring. We appreciate it won’t be possible to come up with useful quantitative estimates and comparisons in all or even most cases, especially given the limited time fund managers have to review applications, but we think there were cases among the grants we reviewed where this was possible (both quantifying and comparing to a benchmark) — including one case in which the applicant provided a cost-effectiveness analysis themselves, but this wasn’t then considered by the PI in their main reasoning for the grant.

GWWC looked into 10 applications:

Of the 10 grant investigation reports we reviewed, three were provided by the AWF upon our general request for representative grants; two were selected by us from their grants database; two were selected by the AWF after we provided specifications; and three were selected by the AWF based on our request for grant applications by organisations that applied to both the AWF and ACE’s MG.

Karolina also said on June 28 that AWF has improved their methodology since GWWC's evaluation:

However, since then, we've started conducting BOTEC CEA more frequently and using benchmarking in more of our grant evaluations. For example, we sometimes use this BOTEC template and compare the outcomes to cage-free corporate campaigns (modified for our purposes from a BOTEC that accompanied RP's Welfare Range Estimates).

I do not doubt AWF has taken the above steps, but I have no way to check it. I think donating to ACE over AWF is a good way of incentivising transparency, which ultimately can lead to more impact.

Hey Vasco! I agree that AWF should be more transparent, and since I started working on it full-time, we have more capacity for that, and we are planning to communicate about our work more proactively.

In light of that, we just published a post summarizing how 2024 went, what changes we recently introduced, and what we are planning. We touched on updates to our evaluation process as well. Here is the relevant section from that post: 

"Grant investigations:
Updated grant evaluation framework: We've updated our systematic review process, enabling us to evaluate every application using standardized templates that vary based on the required depth of investigation. This framework ensures a thorough assessment of key factors while maintaining flexibility for grant-specific considerations. For example, for the deep evaluations, (which are the vast majority of all evaluations), key evaluation areas include assessment of the project’s Theory of Change, scale of counterfactual impact, likelihood of success, back-of-the-envelope cost-effectiveness and benchmarking, and the expected value of receiving funding. It also includes forecasting grant outcomes. You can read more about our process in the FAQ.
Introduced new decision procedures for marginal grants: We introduced an additional step in our evaluation that enables us to make better decisions about grants that are just below or just above our funding bar. Since AWF gives grants on a rolling basis rather than in rounds, it is important to have a process for this to ensure decisions are consistent."

We also slightly updated our website and added a new question to the FAQ - I'm copying that below: 

"How Does the EA Animal Welfare Fund Make Grant Decisions?

Our grantmaking process consists of the following stages:

Stage 1: Application Processing. When we receive an application, it's entered into our project management system along with the complete application details, history of previous applications from the applicant, evaluation rubrics, investigator assignments, and other relevant documentation.

Stage 2: Initial Screening. We conduct a quick scope check to ensure applications align with our fund's mission and show potential for high impact. About 30% of applications are filtered out at this stage, typically because they fall outside our scope or don't demonstrate sufficient impact potential.

Stage 3: Selecting Primary Grant Investigator and Depth of the Evaluation. For applications that pass the initial screening, we assign investigators who are most suitable for a given evaluation. Based on various heuristics, such as the size of the grant, uncertainty, and potential risk, the Fund’s Chair also determines the depth of the evaluation.

Stage 4: In-Depth Evaluation. Every grant application undergoes a systematic review. For each level of depth of investigation required, AWF has an evaluation template that fund managers follow. The framework balances ensuring that all key factors have been considered and that evaluations are consistent, while leaving space for additional, grant-specific crucial considerations. For the deep evaluations, (which are the vast majority of all evaluations), the primary investigator typically examines:
 

  • Theory of Change (ToC) - examining how activities translate into improvements for animals and whether the evidence supports its merits
  • Scale of counterfactual impact - assessing the problem's scale, neglectedness, and strategic importance
  • Likelihood of success - evaluating track record, team competence, and concrete plans
  • Cost-effectiveness and benchmarking - conducting calculations to estimate impact per dollar and compare it to relevant benchmarks
  • Value of funding - analyzing counterfactuals and long-term sustainability
  • Forecasting - forecasting the probability that the project will succeed or fail and due to what reasons (validity of the ToC or performance in achieving planned outcomes)
  • In the case of evaluations that require the maximum level of depth, a secondary investigator critically reviews the completed write-up, raises additional questions and concerns, and provides alternative perspectives or recommendations.

Stage 5: Collective Review and Voting. After the evaluation, each application undergoes a thorough collective assessment. The Fund Chair and at least two Fund Managers review the analysis. All Fund Managers without conflicts of interest can contribute additional insights and discuss key questions through dedicated channels. Finally, each Fund Manager assigns a score, which helps us systematically compare the most promising grants.

Stage 6: Final Recommendation. Looking at the average score, the Fund Chair approves grants that are clearly above our funding bar and rejects those clearly below it. For grants near our funding threshold, we conduct another step where all fund managers compare those marginal grants against each other to select the strongest proposals.

Once decisions are finalized, approved grants move to our grants team for contracting and reporting setup.

Throughout this process, we maintain detailed documentation and apply consistent standards to ensure we select the most promising opportunities to help animals most effectively."

Thanks, Karolina! Great updates.

I strongly upvoted this post because I'm extremely interested in seeing it get more attention and, hopefully, a potential rebuttal. I think this is extremely important to get to the bottom of!

At first glance your critiques seem pretty damning, but I would have to put a bunch of time into understanding ACE's evaluations before I could conclude whether I agree with your critiques (I can spend a weekend day doing this and writing up my own thoughts in a new post if there is interest).

My expectation is that if I were to do this I would come out feeling less confident than you seem to be. I'm a bit concerned that you haven't made an attempt at explaining why ACE might have constructed their analyses this way.

But, like, I'm pretty confused too. It's hard to think of much justification for the choice of numbers in the 'Impact Potential Score', and deciding the impact of a book based on the average of all books doesn't seem like the best way to approach things?

Hi Mathias,

Thank you for your comment!

I can spend a weekend day doing this and writing up my own thoughts in a new post if there is interest

We would definitely be interested in hearing your thoughts. We've set post notifications on for your profile, and look forward to seeing your post!

It feels like this needs a response from both ACE and Legal Impact for Chickens. (I'm not suggesting it should be a quick one; some things are important enough to warrant careful review. I agree with @abrahamrowe that it would probably have been better to ask for their comments before publishing.)

  • I think it is possible for a charity focusing on taking legal action to be impactful without [consistent] legal success, which the review doesn't really acknowledge. A large part of the theory of change around suing corporate bad behaviour is the idea that it will deter bad behaviour in future, by making standards compliance more cost-effective than defending lawsuits.
  • Deterrent effects however are a more complicated theory of change than actually winning cases and forcing actors to change. And it may be very difficult to have a deterrent effect if cases are typically dismissed.
  • To that extent I'm quite surprised to learn that Legal Impact for Chickens apparently hasn't yet had any victories, based on what I had heard about that organization. I don't think this necessarily reflects badly on the organization, which is a young charity focused on a legal process which inevitably takes time. But it does mean the error bars for their impact are rather large, and could mean a nonzero possibility they aren't [yet] having an impact at all. It would be interesting to hear more about metrics used (both by LIC and ACE, and other charities with similar theory of change for that matter) to evaluate the impact of an unsuccessful lawsuit, and how substantial those are.
  • Some of the questions raised about ACE's weightings are quite independent of the example given. It would be interesting to hear from ACE if and how the evaluation criteria for their [apparently mostly subjective] impact scoring take into account the idea that a charity could achieve a higher score by subdividing campaigns, and if and how they intend to update impact assessments in cases like the example of books either failing to reach a non-trivial number of people or being phenomenally successful even if the case they make for veganism was not originally assessed as particularly evidence-based.

I think this would have been an interesting contribution to the Animal Welfare vs GHD debate week. From the limited amount I read of it, it seemed that even people (on different sides of the debate) whose analysis was very thorough weren't taking into account the more straightforward possibility that some of the highlighted top animal advocacy charities simply weren't close to being as effective [yet] at achieving their goals as suggested, regardless of philosophical positions and empirical claims about welfare levels.

Hi David,

Thank you for your reply!

I think it is possible for a charity focusing on taking legal action to be impactful without [consistent] legal success, which the review doesn't really acknowledge. A large part of the theory of change around suing corporate bad behaviour is the idea that it will deter bad behaviour in future, by making standards compliance more cost-effective than defending lawsuits.

I definitely agree that this is possible! However, as you said:

it may be very difficult to have a deterrent effect if cases are typically dismissed.

ACE evaluated 3 “legal actions” in their review of LIC:

  • 2 of the legal actions were dismissed under Rule 12(b)(6) for failing to state a valid legal claim. 12(b)(6) dismissals occur very early on in the legal process, making any legal expenses incurred by the Defendants relatively low. Additionally, encouraging or valuing lawsuits that fail to state valid legal claims but cost the defendant money risks causing the legal system to take animal rights/welfare cases less seriously. If courts observe a pattern of weak or legally insufficient cases being filed to burden defendants, they will become skeptical of all animal rights/welfare lawsuits--even those with strong legal merit.
  • The 3rd legal action ACE evaluated was not actually a legal action, but rather a public comment submission (ACE still classified it as a legal action). The public comment was rejected, and it is difficult to see how this would have a positive impact. 

I don't think this necessarily reflects badly on the organization, which is a young charity focused on a legal process which inevitably takes time. 

ACE endorses LIC as a top charity. Currently, I don’t think this endorsement is justified given LIC’s track record, and I don't think ACE provided a very strong justification for it. Here is a quote from ACE's review of LIC: 

  • "We think that out of all of Legal Impact for Chickens' achievements, the Costco shareholder derivative case is particularly cost effective because it scored high on achievement quality." 

The Costco shareholder derivative case cost LIC over $200,000 and was dismissed for failing to state a valid legal claim. It is difficult to understand why ACE thinks this is a particularly cost effective achievement. 

Some of the questions raised about ACE's weightings are quite independent of the example given.

Could you elaborate on what you mean by this?

I think this would have been an interesting contribution to the Animal Welfare debate week.

I wasn’t aware of that week. Maybe we’ll be able to prepare something for it next year!

Thank you for your feedback!

ACE endorses LIC as a top charity. Currently, I don’t think this endorsement is justified given LIC’s track record, and I don't think ACE provided a very strong justification for it.

I agree with this, and particularly agree that the quote you highlighted below does not seem like good justification. I also think your comment (elsewhere in this thread) that their track record is a "bad one" might be going a little too far.[1] As I say, I was surprised to find that LIC had not yet had any legal success, given that I'd heard about them mostly through positive commentary on their cost-effectiveness.

Could you elaborate on what you mean by this?

I meant that there were criticisms you raised about the overall methodology that had wider implications than just LIC. Possibly I could have worded that better.

I wasn’t aware of that week. Maybe we’ll be able to prepare something for it next year!

There was an animal welfare vs GHD debate week on this forum. Honestly, I hope they don't repeat it![2]

  1. ^

    I think a charity aiming to encourage compliance that never filed any lawsuits unless they were almost certain to succeed would probably underperform too, and $200k is not an especially expensive legal case, though there are certainly more proven, cost-effective ways to save lives for that sort of money. That said, I haven't read the lawsuit and wouldn't know enough about relevant law to know whether the basis for dismissal was blindingly obvious or not...

  2. ^

    I think there are probably more specific and less polarizing topics for debate. Polarizing topics are less likely to yield concrete results, and this one probably qualifies as polarizing.

I recently explored a charity from ACE’s 2024 Recommended Charity list that initially seemed like a good fit for the region and cause I want to support. However, upon closer examination, I found significant gaps in the evidence provided to justify their inclusion. Specifically, the impact analysis lacked clear data on cost-effectiveness and how the total number of animals impacted was calculated.

The organization’s focus on corporate pledges, while valuable, is hard to verify. Additionally, there was no evidence of systematic follow-up to ensure these commitments are implemented. Given that their last audited financial and activity reports were from 2022 (as published on the organisation's website), I am concerned about whether ACE’s evaluation relied on up-to-date and complete information.

As someone passionate about animal welfare, I appreciate ACE’s mission, but I believe their evaluation process must ensure greater transparency, robust evidence, and accountability to maintain credibility and guide donors effectively.
