Doctor from NZ, now doing Global Health & Development Research @ Rethink Priorities, but interested and curious about most EA topics.
Outside of RP work, I spend some time doing independent "grand futures"/ GPR research (Anders Sandberg/BERI) and very sporadic grantmaking-assisting work. Also looking to re-engage with UN processes for OCA/Summit of the Future.
Feel free to reach out if you think there's anything I can do to help you or your work, or if you have any Qs about Rethink Priorities! If you're a medical student / junior doctor reconsidering your clinical future, or if you're quite new to EA / feel uncertain about how you fit in the EA space, have an especially low bar for reaching out.
Outside of EA, I do a bit of end-of-life care research and climate change advocacy; outside of work I enjoy casual basketball, board games, and good indie films. (Very) washed-up classical violinist and oly-lifter.
All comments in personal capacity unless otherwise stated.
Thanks for writing this post!
I feel a little bad linking to a comment I wrote, but the thread is relevant to this post, so I'm sharing it in case it's useful for other readers - though there's definitely a decent amount of overlap here.
TL;DR
I personally default to being highly skeptical of any mental health intervention that claims a ~95% success rate and a PHQ-9 reduction of 12 points over 12 weeks, as this is a clear outlier among treatments for depression. The effectiveness figures from StrongMinds are also based on studies that are non-randomised and poorly controlled, and there are other questionable methodological choices, e.g. around adjusting for social desirability bias. The topline cost-effectiveness figure of $170 per head is also possibly an underestimate: ~48% of clients were treated through SM partners in 2021, and Q2 results (pg 2) suggest StrongMinds is on track for ~79% of clients being treated through partners in 2022, yet the expenses and operating costs of the partners responsible for these clients were not included in the methodology.
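To illustrate the mechanics of that concern, here's a minimal sketch - every number below is hypothetical except the ~$170 headline and the ~79% partner share quoted above:

```python
# Hypothetical illustration of how excluding partner costs can understate
# cost per person treated. Only the ~$170 headline and ~79% partner share
# come from the sources above; everything else is made up for illustration.

sm_own_expenses = 3_570_000      # hypothetical StrongMinds operating costs ($)
total_treated = 21_000           # hypothetical total clients treated
partner_share = 0.79             # ~79% of clients treated via partners (2022 trajectory)

# Headline-style figure: SM's own costs spread over everyone treated,
# including clients actually treated by partner organisations.
naive_cost = sm_own_expenses / total_treated                  # = $170 per person

# If partners spend, say, $100 per client they treat (hypothetical),
# the all-in cost per person treated is noticeably higher.
partner_cost_per_client = 100                                 # hypothetical
partner_expenses = partner_cost_per_client * partner_share * total_treated
all_in_cost = (sm_own_expenses + partner_expenses) / total_treated

print(f"naive: ${naive_cost:.0f}, all-in: ${all_in_cost:.0f}")  # naive: $170, all-in: $249
```

The exact numbers don't matter; the point is that as the partner-treated share grows, omitting partner expenses from the numerator increasingly flatters the cost-effectiveness figure.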
(This mainly came from a cursory review of StrongMinds documents, and not from examining HLI analyses, though I do think "we’re now in a position to confidently recommend StrongMinds as the most effective way we know of to help other people with your money" seems a little overconfident. This is also not a comment on the appropriateness of recommendations by GWWC / FP)
(commenting in personal capacity etc)
Edit:
Links to existing discussion on SM. Much of this ends up touching on HLI's methodology / analyses rather than the strength of evidence in support of StrongMinds, but I'm including it as this is ultimately relevant for the topline conclusion about StrongMinds (inclusion =/= endorsement etc):
If I write a message like that because I find someone attractive (in some form), does that seem wrong to you? :) Genuinely curious about your reaction and am open to changing my mind, but this seems currently fine to me. I worry that if such a thing is entirely prohibited, so much value in new beautiful relationships is lost.
Yes, you're still contributing to harm (at least probabilistically), because the current norm and expectation is that EAG / swapcard shouldn't be used as a speed-dating tool. So if you're reaching out only because you find someone attractive despite that, you are explicitly going against what other parties expect when engaging with swapcard, and they don't have a way to opt out of receiving your norm-breaking message.
I'll also mention that you're arguing for the scenario of asking people for 1-1s at EAGs "only because you find them attractive". This means it would also allow for messages like, "Hey, I find you attractive and I'd love to meet." Would you also defend this? If not, what separates the two messages, and why did you choose the example you gave?
Sure, a new beautiful relationship is valuable, but how many non-work swapcard messages lead to one? Put yourself in the shoes of an undergrad attending EAG for the first time, hoping to learn more about a potential career in biosecurity, animal welfare, or AI safety. Now imagine they receive a message from you, and from 50 other people who also find them attractive. That doesn't seem like a good conference experience, nor a good introduction to the EA community. It also complicates things with the people they do want to reach out to, because it increases uncertainty about whether those people are responding in a purely professional sense or are just being opportunistic. Then there's an additional layer of complexity when you add in power dynamics etc. Having shared professional standards and norms goes some way to reducing this uncertainty, but people need to actually follow them.
If you are worried that you'll lose the opportunity for beautiful relationships at EAGs, then there's nothing stopping you from attending something after the conference wraps up for the day, or even organising some kind of speed-dating thing yourself. But note how your organised speed-dating event would be something people choose to opt in to, unlike sending solicitation DMs via an app intended to be used for professional / networking purposes (or some other purpose explicit on their profile - i.e. if you're sending that DM to someone whose profile says "DM me if you're interested in dating me", then this doesn't apply. The appropriateness of that is a separate convo though).
Some questions for you:
I'll also note Kirsten's comment above, which already talks about why it could plausibly be bad "in general":
"The EAG team have repeatedly asked people not to use EAG or the Swapcard app for flirting. 1-1s at EAG are for networking, and if you're just asking to meet someone because you think they're attractive, there's a good chance you're wasting their time. It's also sexualizing someone who presumably doesn't want to be because they're at a work event."
And Lorenzo's comment above:
"Because EAG(x) conferences exist to enable people to do the most good, conference time is very scarce, misusing a 1-1 slot means someone is missing out on a potentially useful 1-1. Also, these kinds of interactions make it much harder for me to ask extremely talented and motivated people I know to participate in these events, and for me to participate personally. For people that really just want to do the most good, and are not looking for dates, this kind of interaction is very aversive."
While I agree that there are valuable considerations on both sides, I side with the anon here - I don't think these tradeoffs are particularly relevant to a community health team investigating interpersonal harm cases with the goal of "reduc[ing] risk of harm to members of the community while being fair to people who are accused of wrongdoing".
One downside of having the bad-ness of, say, sexual violence[1] be mitigated by the perceived impact of the person responsible (how is the community health team actually measuring this? how good someone's forum posts are? whether they work at an EA org? whether they are "EA leadership"?) when considering what the appropriate action should be (if this is happening) is that it plausibly leads to different standards for bad behaviour. By the community health team's own standards, taking someone's potential impact into account as a mitigating factor seems like it could increase the risk of harm to members of the community (by not taking sufficient action, with perceived impact as the justification), while also being more unfair to people who are accused of wrongdoing. To be clear, I'm basing this off the forum post, not any non-public information.
Additionally, a common theme about basically every sexual violence scandal that I've read about is that there were (often multiple) warnings beforehand that were not taken seriously.
If there is a major sexual violence scandal in EA in the future, it will be pretty damning if the warnings and concerns were clearly raised, but the community health team chose not to act because they decided it wasn't worth the tradeoff against the person/people's impact.
Another point is that people who are considered impactful are likely to be somewhat correlated with people who have gained respect and power in the EA space, have seniority or leadership roles etc. Given the role that abuse of power plays in sexual violence, we should be especially cautious of considerations that might indirectly favour those who have power.
More weakly, even if you hold the view that it is in fact the community health team's role to "take the talent bottleneck seriously; don’t hamper hiring / projects too much" when responding to, say, a sexual violence allegation, it seems easy to overweight the immediate cost of acting against the person (their lost impact) and underweight the cost of many more people opting not to get involved, or distancing themselves from the EA movement, because they perceive it to be an unsafe place for women with unreliable ways of holding perpetrators accountable.
That being said, I think the community health team has an incredibly difficult job, and while they play an important role in mediating community norms and dynamics (and thus carry a corresponding amount of responsibility), it's always easier to make critical comments than to make the difficult decisions they have to make. I'm grateful they exist, and don't want my comment to come across as an attack on the community health team or its individuals!
(commenting in personal capacity etc)
If this comment is more about "how could this have been foreseen", then this comment thread may be relevant. I should note that hindsight bias makes it much easier to look back and assess problems as obvious and predictable ex post, especially when powerful investment firms and individuals with skin in the game also missed this.
TL;DR:
1) There were entries that were relevant (this one also touches on it briefly)
2) They were specifically mentioned
3) There were comments relevant to this (notably, one of these was apparently deleted because it received a lot of downvotes when initially posted)
4) There have been at least two other posts on the forum prior to the contest that engaged with this specifically
My tentative take is that these issues were in fact identified by various members of the community, but there isn't a good way of turning identified issues into constructive action - the status quo is that we just have to trust that organisations have good systems in place for this, and that EA leaders are sufficiently careful and willing to consider changes seriously and make them, such that all the community needs to do is "raise the issue". I think the systems within the relevant EA orgs and leadership are what investigations or accountability questions going forward should focus on - all individuals are fallible, and we should be looking at how to build systems such that the community doesn't have to simply trust that the people who have power and are steering the EA movement will get it right, and such that there are ways for the community to hold them accountable to their ideals or stated goals if these appear not to be playing out in practice, or risk not doing so.
i.e. if there are good processes and systems in place, and documentation of these processes and decisions, that's more acceptable (because other organisations with presumably very good due diligence processes also missed it). But if there weren't good processes, or if these decisions weren't careful and intentional, that's comparatively more concerning, especially in the context of specific criticisms that have been raised,[1] or previous precedent. For example, I'd be especially curious about the events surrounding Ben Delo,[2] and the processes that were implemented in response. I'd also be curious whether there are people in EA orgs involved in steering who keep track of potential risks and early warning signs for the EA movement, in the same way the EA community advocates for in the case of pandemics, AI, or even general ways of finding opportunities for impact. For example, SBF, who is listed as an EtG success story on 80,000 Hours, has publicly stated he's willing to go 5x over the Kelly bet, and described yield farming in a way that Matt Levine interpreted as a Ponzi. Again, I'm personally less interested in the object-level decision (e.g. whether we take SBF's Kelly bet comments seriously, or consider Levine's interpretation appropriate) and more in what the process was, and how this was considered at the time with the information available. I'd also be curious about the documentation of any SBF-related concerns raised by the community, if any, and how these concerns were managed and considered (as opposed to critiquing the final outcome).
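As a side note on why "5x over the Kelly bet" reads as a red flag: for a simple repeated even-money bet (a toy model, not anything SBF actually traded), staking the Kelly fraction maximises expected log growth of the bankroll, while staking several times that fraction makes expected log growth negative. A minimal sketch with hypothetical numbers:

```python
import math

# Toy model: repeated even-money bet with a 55% win probability (hypothetical).
p = 0.55
kelly_fraction = 2 * p - 1       # Kelly stake for an even-money bet = 0.10

def expected_log_growth(f, p):
    """Expected log growth of bankroll per bet when staking a fraction f."""
    return p * math.log(1 + f) + (1 - p) * math.log(1 - f)

print(expected_log_growth(kelly_fraction, p))      # ≈ +0.005  (bankroll compounds)
print(expected_log_growth(5 * kelly_fraction, p))  # ≈ -0.089  (bankroll shrinks over time)
```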
Outside of due diligence and ways to facilitate whistleblowing, the decision-making processes around the steering of the EA movement are crucial as well. When orgs make decisions that bring clear benefits to one part of the EA community while creating risks that are shared across wider parts of the EA community,[3] it would probably be valuable to look at how those decisions were made and what tradeoffs were considered at the time. Going forward, it would be worth thinking about how to either diversify those risks or make decision-making more inclusive of a wider range of stakeholders,[4] keeping in mind the best interests of the EA movement as a whole.
(this is something I'm considering working on in a personal capacity along with the OP of this post, as well as some others - details to come, but feel free to DM me if you have any thoughts on this. It appears that CEA is also already considering this)
If this comment is about "are these red-teaming contests in fact valuable for the money and time put into them, if they miss problems like this":
I think my view here (speaking only for the red-teaming contest) is that even if this specific contest was framed in a way that meant it missed these classes of issues, the value of the very top submissions[5] may still have made the effort worthwhile. The potential value of a different framing was mentioned by another panelist. If red-teaming contests are systematically missing this class of issues regardless of framing, then I agree that would be pretty useful to know, but I don't have a good sense of how we would investigate this.
This tweet seems to have aged particularly well. Despite supportive comments from high-profile EAs on the original forum post, the author seemed disappointed that nothing came of it in that direction. Again, without getting into the object-level discussion of the claims of the original paper, it's still worth asking questions about the processes. If there were actions planned, what did they look like? If not, was that because of a disagreement over the suggested changes, or over the extent to which this was an issue at all? How were these decisions made, and what was considered?
Apparently a previous EA-aligned billionaire (and possibly donor) who got rich by starting a crypto trading firm, and who pleaded guilty to violating the Bank Secrecy Act
Even before this, I had heard from a primary source in a major mainstream global health organisation that there were staff who wanted to distance themselves from EA because of misunderstandings around longtermism.
This doesn't have to be a lengthy deliberative consensus-building project, but it should at least include internal comms across different EA stakeholders to allow discussions of risks and potential mitigation strategies.
As requested, here are some submissions that I think are worth highlighting, or that were considered for awards but ultimately did not make the final cut. (This list is non-exhaustive, and should be taken more lightly than the Honorable Mentions, because by definition these posts are less strongly endorsed by those who judged them. Also commenting in personal capacity, not on behalf of other panelists, etc):
Bad Omens in Current Community Building
I think this was a good-faith description of some potential / existing issues that are important for community builders and the EA community, written by someone who "did not become an EA" but chose to go to the effort of providing feedback with the intention of benefitting the EA community. While these problems are difficult to quantify, they seem important if true, and pretty plausible based on my personal priors/limited experience. At the very least, this starts important conversations about how to approach community building that I hope will lead to positive changes, and a community that continues to strongly value truth-seeking and epistemic humility, which is personally one of the benefits I've valued most from engaging in the EA community.
Seven Questions for Existential Risk Studies
It's possible that the length and academic tone of this piece detracts from the reach it could have, and it (perhaps aptly) leaves me with more questions than answers, but I think the questions are important to reckon with, and this piece covers a lot of (important) ground. To quote a fellow (more eloquent) panelist, whose views I endorse: "Clearly written in good faith, and consistently even-handed and fair - almost to a fault. Very good analysis of epistemic dynamics in EA." On the other hand, this is likely less useful to those who are already very familiar with the ERS space.
Most problems fall within a 100x tractability range (under certain assumptions)
I was skeptical when I read the headline, and while I'm not yet convinced that a 100x tractability range should be used as a general heuristic when thinking about tractability, I certainly updated in that direction, and I think this is a valuable post that may help guide cause prioritisation efforts.
The Effective Altruism movement is not above conflicts of interest
I was unsure about including this post, but I think this post highlights an important risk of the EA community receiving a significant share of its funding from a few sources, both for internal community epistemics/culture considerations as well as for external-facing and movement-building considerations. I don't agree with all of the object-level claims, but I think these issues are important to highlight and plausibly relevant outside of the specific case of SBF / crypto. That it wasn't already on the forum (afaict) also contributed to its inclusion here.
I'll also highlight one post that was awarded a prize, but I thought was particularly valuable:
Red Teaming CEA’s Community Building Work
I think this is particularly valuable because of the unique and difficult-to-replace position that CEA holds in the EA community, and, as Max acknowledges, it benefits the EA community for important public organisations to be held accountable (to a standard that is appropriate for their role and potential influence). Thus, even if the listed problems aren't all fully on the mark, or are less relevant today than when the mistakes happened, a thorough analysis of these mistakes and an attempt at providing reasonable suggestions at least provides a baseline against which CEA can be held accountable for similar mistakes in the future, and helps with assessing trends and patterns over time. I would personally be happy to see something like this on at least a semi-regular basis (though I'm unsure exactly what time-frame would be most appropriate). On the other hand, it's important to acknowledge that this analysis is possible in large part because of CEA's commitment to transparency.
I was a participant and largely endorse this comment.
One contributor to a lack of convergence was attrition of effort and incentives: by the time there was superforecaster-expert exchange, we'd been at it for months, and there weren't requirements for forum activity (unlike in the first team stage).
[Edit: wrote this before I saw lilly's comment, would recommend that as a similar message but ~3x shorter].
============
I would consider Greg's comment as "brought up with force", but would not consider it an "edge case criticism". I also don't think James / Alex's comments are brought up particularly forcefully.
I do think it's worth making the case that pushing back on comments that are easily misinterpreted or misleading is also not an edge-case criticism, especially when those comments directly benefit your organisation.
Given that the stated goal of the EA community is "to find the best ways to help others, and put them into practice", it seems especially important that strong claims are well supported and made carefully and cautiously. This is in part because the EA community should reward research outputs for being helpful in finding the best ways to do good, not for being strongly worded; in part because EA donors who don't have capacity to engage at the object level may be happy to defer to EA organisations' recommendations; and in part because the counterfactual impact of money diverted from an EA donor is likely higher than that of money diverted from the average donor.
For example:
While I wouldn't want to exclude careless communication / miscommunication, I can understand why others might feel less optimistic about this, especially if they have engaged more deeply at the object level and found additional reasons to be skeptical.[2] I do feel like I subjectively have a lower bar for investigating strong claims by HLI than I did 7 or 8 months ago.
(commenting in personal capacity etc)
============
Adding a note RE: Nathan's comment below about bad blood:
Just for the record, I don't consider there to be any bad blood between me and any members of HLI. I previously flagged a comment I wrote with two HLI staff, worrying that it might be misinterpreted as uncharitable or unfair. Based on positive responses there and from other private discussions, my impression is that this is mutual.[3]
-This was the claim that originally prompted me to look more deeply into the StrongMinds studies. After <30 minutes on StrongMinds' website, I stumbled across a few things that stood out as surprising, which prompted me to look deeper. I summarise some thoughts here (which has been edited to include a compilation of most of the relevant critical EA forum commentary I have come across on StrongMinds), and include more detail here.
-I remained fairly cautious about the claims I made, because the research process behind the recommendation reportedly took three years / 10,000 hours, so I assumed by default that I was missing information or that there was a reasonable explanation.
-However, after some discussions on the forum / in private DMs with HLI staff, I found it difficult to update meaningfully towards believing this statement was sufficiently well justified. I think a fairly charitable interpretation would be something like: "this claim was too strong, but it is attributable to careless communication and was unintentional."
Quotes above do not imply any particular views of the commenters referenced.
I have not done this for this message, as I view it as largely a compilation of existing messages that may help provide more context.
A commonly used model in the trust literature (Mayer et al., 1995) is that trustworthiness can be broken down into three factors: ability, benevolence, and integrity.
RE: domain specific, the paper incorporates this under 'ability':
The domain of the ability is specific because the trustee may be highly competent in some technical area, affording that person trust on tasks related to that area. However, the trustee may have little aptitude, training, or experience in another area, for instance, in interpersonal communication. Although such an individual may be trusted to do analytic tasks related to his or her technical area, the individual may not be trusted to initiate contact with an important customer. Thus, trust is domain specific.
There are other conceptions but many of them describe something closer to trust that is domain specific rather than generalised.
...All of these are similar to ability in the current conceptualization. Whereas such terms as expertise and competence connote a set of skills applicable to a single, fixed domain (e.g., Gabarro's interpersonal competence), ability highlights the task- and situation-specific nature of the construct in the current model.
This is a conversation I have fairly often when I talk to non-EA, non-medical friends about work. Some quick thoughts:
If someone asks me Qs around DALYs at all (i.e. "why measure"), I would point to general cases where this happens fairly uncontroversially, e.g.:
-If you were in charge of the health system, how would you choose to distribute the resources you get?
-If you were building a hospital, how would you go about choosing how to allocate your wards to different specialties?
-If you were in an emergency waiting room and you had 10 people in the waiting room, how would you choose who to see first?
These kinds of questions entail some kind of "diverting resources from one person to another" in a way that is pretty understandable (though they also point to reasonable considerations for why you might not only use DALYs in those contexts)
If someone is challenging me over using DALYs in context of it being a measurement system that is potentially ableist, then I generally just agree - it is indeed ableist by some framings![1]
Though often in these conversations the underlying theme isn't necessarily "I have a problem with healthcare prioritisation" but a general sense that disabled folk aren't receiving enough resources for their needs - so when having these conversations it's important to acknowledge that disabled folk do face a lot more challenges navigating the healthcare system (and society generally) through no fault of their own, and that we haven't worked out the answers to prioritising accordingly or to solving the barriers that disabled folk face.
If the claim goes further and explicitly says that interventions for disabilities are more cost-effective than the current DALY approach gives them credit for, then that's also worth considering - though the standard would correspondingly increase if they are suggesting a new approach to resource allocation. As Larks' comment illustrates, it is difficult to find a single approach / measure that doesn't push against intuitions or have something problematic at the policy level.[2]
On how you're feeling when talking about prioritising:
But then I feel like I'm implicitly saying something about valuing some people's lives less than others, or saying that I would ultimately choose to divert resources from one person's suffering to another's.
This makes sense, though I do think there is a decent difference between the claim "some people's lives are worth more than others" and the claim "some healthcare resources go further in one context than in another (and thus the diversion is justified)". For example, if you never actively deprioritised anyone, you would end up implicitly/passively prioritising based on things like [who can afford to go to the hospital / who lives closer / other access constraints]. But these are much less correlated with what people care about when they say "all lives are equal".
But if we have data on what the status quo is, then "not prioritising" / "letting the status quo happen" is still a choice we are making! And so we try to improve on the status quo and save more lives, precisely because we don't think the 1,000 patients on diabetes medication are worth less than the one cancer patient on a third-line immunotherapy.
E.g., for DALYs, the disability weight for one person with both condition A and condition B is mathematically forced to be lower than the combined disability weight of two separate individuals with condition A and condition B respectively. That means that, for any cure for condition A, those who have only condition A would theoretically be prioritised under the DALY framework over those who have other health issues (e.g. a disability). While I don't have a good sense of when/if this specific part of the DALY framework has affected resource allocation in practice, it is important to acknowledge the (many!) limitations of the measures we use.
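To make this concrete, here is a minimal numerical sketch using the multiplicative combination of disability weights used in GBD-style calculations (the weights themselves are hypothetical):

```python
# Hypothetical disability weights for two conditions.
w_a, w_b = 0.3, 0.4

# One person with both conditions (multiplicative combination): 0.58
combined_one_person = 1 - (1 - w_a) * (1 - w_b)

# Two separate people, one condition each: 0.3 + 0.4 = 0.70
combined_two_people = w_a + w_b

assert combined_one_person < combined_two_people

# Curing condition A averts 0.30 of disability weight for the person with A only,
# but only 0.58 - 0.40 = 0.18 for the person who also has B - hence the
# prioritisation wrinkle described above.
```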
Also, different folks within the disability community hold a wide range of views about what it means to live with a disability / be a disabled person (e.g. functional vs social models of disability), so it's not actually clear that, e.g., WELLBYs would necessarily lead to more healthcare resources in that direction, depending on which groups you were talking to.
Hey Ollie! Hope you're well.
I want to gently push back on this a bit - I don't think this is necessarily a tradeoff. It's not clear to me that the guidelines have to be "all-inclusive or nothing". As an example: just because the guidelines say you can't use the swapcard app for dating purposes, it would be pretty unreasonable for people to interpret them as "oh, the guidelines don't say I can't use the swapcard app to scam people, that must mean this is endorsed by CEA".
And even if it's the case that the current guidelines don't explicitly comment against using swapcard to scam other attendees, and this contributes to some degree of "failing to prevent harmful behaviour that isn't on the list", that seems like a bad reason to choose to not state "don't use swapcard for sexual purposes".
RE: guidelines that include helpful examples, here's one that I found from 10 seconds of googling.
As I responded to Julia's comment that you linked, I think these lists can be helpful because most reported cases likely don't come from people intentionally wishing to cause harm, but from differences in norms, communication, or expectations around what might be considered harmful. Having an explicit list of actions helps get around these differences by being more precise about the actions that are likely to be considered net negative in expectation. If there are a lot of examples in a grey area, then this may be an argument to exclude those examples, but it isn't really an argument against having a list of less ambiguous examples.
Ditto RE: different settings - this is an argument for giving the guidelines a narrower scope, and for not writing a single guideline intended to cover both the career fair and the afterparty, but it isn't an argument against expressing what's unacceptable in one specific setting (especially when that setting is something as crucial as "EAG conference time").
Lastly, RE: "Responses should be shaped by the wishes of the person who experienced the problem" - of course they should be! But a list of possible actions can be helpful without committing the team to a set response, and knowing which potential actions are on the table is still reassuring and helpful.
Again, this was just the first link I clicked, and I don't think it's perfect, but there are multiple aspects of it that CEA could use to help with further iterations of its guidelines.
I think it's fine to start from CEA's circle of influence and have good guidelines and norms for CEA events - if things go well, this may incentivise other organisers to adopt these practices (or perhaps they won't, because the context is sufficiently different, which is fine too!). But even if other organisers don't adopt better guidelines, that doesn't seem like a particularly strong argument against adopting clearer guidelines for CEA events. The UNFCCC presumably doesn't use "oh, we can't control what happens in UN Youth events globally, and we can't force them to agree to follow our guidelines" as an excuse not to have guidelines. And because the UNFCCC does have its own guidelines, and many UN Youth events try to emulate the UN event proper, those events will (at least try to) adopt a similar level of formality.
One last reason to err on the side of more precise guidelines echoes point 3 in what lilly shared above - if guidelines are vague and more open to interpretation by the Community Health team, this requires a higher level of trust in the CH team's track record and decision-making and management of CoIs, etc. To whatever extent recent events may reflect actual gaps in this process or even just a change in the perception here, erring on the side of clearer guidelines can help with accountability and trust building.