Max H

Bio

I am mostly active on LessWrong. See my profile and my self-introduction there for more.

Comments

As someone who is exploring a transition to full-time TAIS work, I appreciate this series of posts and other efforts like it. Having a detailed public critique and "adversarial summary" of multiple organizations will make any future due-diligence process (by me or others) much quicker and easier.

That said, I am pretty unconvinced of the key points of this post.

  • I suspect some of the criticism is more convincing and more relevant to those who share the authors' views on specific technical questions related to TAIS.
  • My own personal experience (very limited, detailed below) engaging with Conjecture and their public work conflicts with the claim that it is low quality and that they react defensively to criticism.
  • Some points are vague, and I personally do not find them damning even if true.

Below, I'll elaborate on each bullet.

Criticism which seems dependent on technical views on TAIS

I agree with Eliezer's views on the difficulty and core challenges posed by the development of AGI.

In particular, I agree with his 2021 assessment of the AI safety field here, and with items 38-41 of his List of Lethalities.

I realize these views are not necessarily consensus among EAs or the TAIS community, and I don't want to litigate them on the object level here. I merely want to remark that, having personally accepted them, I find some of the criticism and suggestions offered in this post unconvincing, overly generic, or even misguided (hire more experienced ML researchers, engage with the broader ML community, "average ML conference paper" as a quality benchmark, etc.) for reasons that aren't specific to Conjecture.

My own experience engaging directly with Conjecture

I myself am skeptical of Conjecture's current alignment plan and much of their current and past research, as I understand it. However, in engaging with some of their published work, I have not found it to be low quality or disconnected from relevant work of others, and some of the views I disagree most strongly with are actually shared by other prominent TAIS researchers or funders.

I commented (obliquely) on their CoEm strategy in the thread starting here, and on a post by one of their researchers starting here.

Several of my own posts cite the work of researchers at Conjecture, and some of them are partly criticisms of, or counters to, views I perceive them to hold:

Conjecture has not engaged directly with most of my posts or comments, but this is explained by the fact that they have received very little engagement in general. My point here is mainly that I would not have written many of the posts and comments above at all if I did not personally find the work they cite and/or criticize to be above a pretty high quality threshold.

To the very limited degree that Conjecture's researchers have engaged directly with my own work, I did not find it to be defensive.

I think Conjecture is hopelessly confused and doomed to fail, but mostly for inside-view technical reasons mentioned in the previous section, and my own criticism is not really specific to Conjecture. When the criteria I most care about are graded on a curve, I think Conjecture and their research stack up well against most other organizations. My views on this are low-confidence and based on scant firsthand evidence, but the information in this post was not a meaningful update.

Other points

Regarding:

  • CEO trustworthiness and consistency
  • funding sources / governance structure / profit motive
  • scaling too quickly


These are important points to consider when performing due diligence on any organization, and even more critical when the organization in question is working on TAIS. I appreciate the authors compiling and summarizing their views on these topics.

However, there is limited detail to the accusations and criticism in these sections. Even if all of the points were true and exactly as bad as the authors claim or imply, none of them are severe enough that I would consider them damning, or even far outside the norm for a non-TAIS organization.

I think that EA and particularly TAIS organizations should strive to meet a higher standard, and I agree Conjecture has room for improvement in these areas, but there is nothing in these sections that I would consider a dealbreaker if I were choosing to work for, collaborate with, or fund them.

Given the sparsity and viewpoint diversity of TAIS organizations, unless the issues on these topics are extremely serious, I personally would weigh the following factors much more heavily when evaluating TAIS organizations:

  • Clarity of thought, epistemic hygiene, and general sanity of the researchers at the organization.
  • The organization's operational adequacy (relative to other orgs).
  • Understanding of, and a plan to actually work on (or at least engage with), the most difficult and important problems.

Are there other technologies besides AGI whose development has been slowed by social stigma or backlash?

Nuclear power and certain kinds of genetic engineering (e.g. GoF research) seem like plausible candidates off the top of my head. OTOH, we still have nuclear bombs and nuclear power plants, and being a nuclear scientist or a geneticist is not widely stigmatized. Polygenic screening is apparently available to the general public, though there are some who would call the use and development of such technology immoral.

I think this is an interesting point overall, but I suspect the short-term benefits of AI will be too great to create a backlash which results in actual collective / coordinated action to slow frontier capabilities progress, even if the backlash is large. One reason is that AI capabilities research is currently a lot easier to do in private without running afoul of any existing regulations, compared to nuclear power or genetic engineering, which require experimentation on controlled materials, human test subjects, or a high-biosafety-level lab.

So, given the current state of for-profit corporate governance, and for-power nation-state governance, that seems very unlikely.

Yep. I think in my ideal world, there would be exactly one operationally adequate organization permitted to build AGI. Membership in that organization would require a credible pledge to altruism and a test of oath-keeping ability.

Monopoly power of this organization to build AGI would be enforced by a global majority of nation states, with monitoring and deterrence against defection.

I think a stable equilibrium of that kind is possible in principle, though obviously we're pretty far away from it being anywhere near the Overton Window. (For good reason - it's a scary idea, and probably ends up looking pretty dystopian when implemented by existing Earth governments. Alas! Sometimes draconian measures really are necessary; reality is not always nice.)

In the absence of such a radically different global political order, we might have to take our chances on the hope that the decision-makers at OpenAI, DeepMind, Anthropic, etc. will all be reasonably nice and altruistic, and not power- or profit-seeking. Not great!

There might be worlds in between the most radical one sketched above and our current trajectory, but I worry that any "half measures" end up being ineffective and costly and worse than nothing, mirroring many countries' approach to COVID lockdowns.

There are a lot more than two decision theories. Most are designed to do at least as well as both causal and evidential decision theory in Newcomb-like problems, and in even more exotic setups.

The basic idea in all of them is that, instead of choosing the best action at any particular decision point, the agent chooses the best decision-making algorithm across possible world states.
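To make the contrast concrete, here is a minimal sketch (my own hypothetical illustration, not taken from any particular paper) of the expected-value calculation in Newcomb's problem when the predictor is modeling the agent's policy rather than its isolated act. The function name, the standard $1M / $1k payoffs, and the 99% predictor accuracy are all assumptions for illustration:

```python
# Hypothetical illustration: expected payoffs in Newcomb's problem when the
# predictor anticipates the agent's *policy*. Assumes the standard payoffs
# ($1M in the opaque box, $1k in the transparent box) and predictor accuracy p.

def expected_payoff(policy: str, p: float = 0.99) -> float:
    """Expected dollars for an agent whose policy the predictor models."""
    if policy == "one-box":
        # With probability p the predictor foresaw one-boxing and filled
        # the opaque box; otherwise the opaque box is empty.
        return p * 1_000_000 + (1 - p) * 0
    if policy == "two-box":
        # With probability p the predictor foresaw two-boxing and left the
        # opaque box empty; otherwise the agent takes both boxes full.
        return p * 1_000 + (1 - p) * (1_000_000 + 1_000)
    raise ValueError(f"unknown policy: {policy}")

for policy in ("one-box", "two-box"):
    print(policy, expected_payoff(policy))
# one-box 990000.0
# two-box 11000.0
```

An agent that grades policies picks one-boxing, because its policy is what the predictor's fill decision correlates with; an agent that grades only the causal consequences of the act at the decision point two-boxes and predictably walks away with less.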

I think the original CEV paper from 2003 addresses (or at least discusses) a lot of these concerns. Basically, the thing that a group attempting to build an aligned AI should try to align it with is the collective CEV of humanity, not the CEV of any individual human.

On anti-natalism, religious extremism, voluntary extinction, etc. - if those values end up being stable under reflection and under faster, more coherent thinking, and don't end up dominated by other values of the people who hold them, then the Future may indeed include things which satisfy or maximize those values.

(Though those values, and the people who hold them, don't necessarily get more say than people who believe the opposite. If some interests and values are truly irreconcilable, a compromise might look like dividing up chunks of the lightcone.)

Of course, the first group that attempts to build a superintelligence might try to align it with something else - their own personal CEV (which may or may not have a component for the collective CEV of humanity), or some kind of equal or unequal split between the individual CEVs of every human, or every sentient being, etc., or something else entirely.

This would be inadvisable for various reasons discussed in the paper, and I agree it is a real danger / problem. (Mostly, though, I think anyone who tries to build any kind of CEV sovereign right now just fails, and we end up with tiny molecular squiggles.)