
Holden Karnofsky describes the claims of his “Most Important Century” series as “wild” and “wacky”, but at the same time purports to be in the mindset of “critically examining” such “strange possibilities” with “as much rigour as possible”. This emphasis is mine, but for what is supposedly an important piece of writing in a field that has a big part of its roots in academic analytic philosophy, it is almost ridiculous to suggest that this examination has been carried out with 'as much rigour as possible'. My main reactions - which I will expand on in this essay - are that Karnofsky’s writing is in fact distinctly lacking in rigour; that his claims are too vague or even seem to shift around; and that his writing style - often informal or sensationalist - aggravates the lack of clarity while simultaneously putting the goal of persuasion above that of truth-seeking. I also suggest that his emphasis on the wildness and wackiness of his own "thesis" is tantamount to an admission of bias on his part in favour of surprising or unconventional claims.

I will start with some introductory remarks about the nature of my criticisms and of such criticism in general. Then I will spend some time trying to point to various instances of imprecision, bias, or confusion. And I will end by asking whether any of this even matters or what kind of lessons we should be drawing from it all. 

Notes: Throughout, I will quote from the whole series of blog posts by treating them as a single source rather than referencing the posts separately. Note that the series appears as a single pdf here (so one can always Ctrl/Cmd+F to jump to the part I am quoting).

It is plausible that some of this post comes across quite harshly, but none of it is intended to constitute a personal attack on Holden Karnofsky or an accusation of dishonesty. Where I have made errors or have misrepresented others, I welcome any and all corrections. I also generally welcome feedback on the writing and presentation of my own thoughts, either privately or in the comments.

Acknowledgements: I started this essay a while ago and so during the preparation of this work, I have been supported at various points by FHI, SERI MATS, BERI and Open Philanthropy. The development of this work benefitted significantly from numerous conversations with Jennifer Lin.

1. Broad Remarks About My Criticisms

If you felt and still feel convinced by Karnofsky's writings, then upon hearing about my reservations, your instinct may be to respond with reasonable-seeming questions like: 'So where exactly does he disagree with Karnofsky?' or 'What are some specific things that he thinks Karnofsky gets wrong?'. You may well want to look for wherever it is that I have carefully categorized my criticisms, to scroll through to find all of my individual object-level disagreements so that you can see if you know the counterarguments that mean that I am wrong. And so it may be frustrating that I will often sound like I am trying to weasel out of having to answer these questions head-on, or that I am not putting much weight on the fact that I have not laid out my criticisms in that way.

Firstly, I think that the main issues to do with clarity and precision that I will highlight occur at a fundamental level. It is not that they are 'more important' than individual, specific, object-level disagreements, but I claim that Karnofsky does a sufficiently poor job of explaining his main claims, the structure of his arguments, the dependencies between his propositions, and in separating his claims from the verifications of those claims, that it actually prevents detailed, in-depth discussions of object-level disagreements from making much sense. I also contend that this in itself is a rhetorical technique (and that Karnofsky is not the only person in the EA ecosystem who employs it). The principal example here is that without a clear notion of 'the importance' of a given century, or clear criteria for what would in theory make a century 'the most important' (stated in a way that is independent of specific facts about this century), it is impossible for anyone to compare the importance of two different centuries or to evaluate whether or not this century meets the criteria. Thus it is impossible for a critic to precisely explain why Karnofsky's 'arguments' fail to show that this century meets the criteria.

Secondly, I'd invite you to consider a point made by Philip Trammell in his blog post But Have They Engaged with the Arguments? He considers the situation in which a claim is argued for via a long series of fuzzy inferences, each step of which seems plausible by itself. And he asks us to suppose that most people who try to understand the full argument will ‘drop out’ and reject it at some random step along the chain. Then:

Believers will then be in the extremely secure-feeling position of knowing not only that most people who engage with the arguments are believers, but even that, for any particular skeptic, her particular reason for skepticism seems false to almost everyone who knows its counterargument.

This suggests that when a lot of people disagree with the 'Believers', we should (perhaps begrudgingly, in practice) give weight to the popular disagreement, even when each person’s particular disagreement sounds incorrect to those who know all the counterarguments. The kicker - to paraphrase Trammell - is that although

They haven't engaged with the arguments, … there is information to be extracted from the very fact that they haven't bothered engaging with them.

I believe I am one of many such people in the present context, i.e. although I have at least taken the time to write this essay, it may seem to a true believer that I am not 'engaging with arguments' enough. But there is yet a wider context into which this fits, which forms my third point: To form detailed, specific criticisms of something that one finds to be vague and disorganized, one typically has to actually add clarity first. For example, by picking a precise characterization for a term that was left undefined, as MacAskill had to do in his Are we living at the hinge of history? essay when responding to Parfit, or by rearranging a set of points that were made haphazardly into an argument that is linear enough to be cleanly attacked. But I have not set out to do these things. In particular, if one finds the claims to be generally unconvincing and can't see a good route to bolstering them with better versions of the arguments, then it is hard to find motivation to do this.

Readers will no doubt notice that the previous two points apply to the AI x-risk argument itself (i.e. independently of most important century-type claims) and indeed, yes, I do sympathize with skeptics who are constantly told they are not engaging with the arguments. It often seems like they are being subjected to a device wherein their criticism is held to a higher standard of clarity and rigour than the original arguments.

In fact, I think much of this is part of a wider issue in which vague and confusing ‘research’ (in the EA/LessWrong/Alignment Forum space) often does not adequately receive the signal that it is in fact vague and confusing: It takes a lot of effort to produce high-quality criticisms of low-quality work; and if the work is so low-quality, then why would you want to bother? This lack of pushback can then be interpreted as a positive signal, i.e. a set of ideas or a research agenda can gain credibility from the fact that it is being written about regularly without being publicly criticized, when actually part of the reason it isn’t being criticized enough is that its confusing or vague nature is putting people off bothering to try.

All this having been said, let's now start to turn to my more specific points.

2. Precision of the main claim

Philosophers or mathematicians will often devote many paragraphs to carefully explaining a main claim (what it is, what it isn’t, what it does or doesn’t imply, what is stronger, what is weaker, giving examples, etc.), but in my opinion, a sufficient unpacking of Karnofsky's central claim is absent. The version that appears early on in the summary is just: “we could be in the most important century of all time for humanity”. The later post Some additional detail on what I mean by "most important century" then adds two "different senses" of what the phrase means. The following remarks apply to each of these.

The foremost thing that is needed in order to consider any version of such a claim is an answer to: By what criteria can a given century be shown to be the most important? One obvious way of doing this would be to define some quantity called 'the importance' of a given century and then to argue that this century has the greatest importance. To do this, we'd want to know: What measure of importance is being used to compare centuries and how can we estimate it? Or perhaps you can't actually compare some quantity called 'importance', but there is still some clear criterion which, if satisfied by a given century, would be sufficient to show that it is 'the most important century'. Neither of these things, nor any equivalently useful definition or operationalization, is given.

Next: What exactly do we mean by “could”? Note that Karnofsky has elsewhere espoused the idea of assigning probabilities to claims as part of the “Bayesian mindset”, so could "could" refer to a quantifiable level of certainty (that he happened to omit)? Probabilities do in fact appear in later posts, e.g. he writes "a 15-30% chance that this is the "most important century" in one sense or another". But note that this is quite different from stating upfront what he claims the probability to be in the context of some framework and then demonstrating the truth of that claim (not to mention the addition of the qualification "in one sense or another"). Instead, the numbers are dropped in later, unsystematically, and we are left to suppose that however convinced we feel in the end is what he must have originally meant by “could”.

There are various rhetorical mechanisms that are facilitated by this lack of precision. Firstly, vague claims blur together many different precise claims. One can often phrase a vague claim in such a way that people more easily agree with the vague version than they would with many of the individual, more precise versions of the claim that have been lumped together. So, a wide set of people will comfortably feel that they more or less agree with the general sentiment of 'this could be the most important century' and will nod along, but it seems reasonable to suppose that any given detailed and specific version of the thesis that one might commit to would garner much more disagreement.

Secondly, the lack of precision allows the claim to take on different forms at different times, even in one reading, so that the claim can locally fit the arguments at hand and so that after the fact, it feels like it fits whatever it is that you've been convinced of. When you first see the main claim, you may not think too hard about the "could" or the fact that you don't have a precise notion of "most important" etc., but as you give the piece a charitable reading, your interpretation of the appropriate notion of 'importance' can readily shift and mould to fit the present argument or whatever subset of the arguments you actually find convincing.

Thirdly, in Some additional detail on what I mean by "most important century", Karnofsky states that the first possible meaning of the phrase is:

Meaning #1: Most important century of all time for humanity, due to the transition to a state in which humans as we know them are no longer the main force in world events.

We can guess that it might mean something like 'a century is the most important century if it is the case that in that century, there is a transition to a state in which...'. But the use of "due to" is a bit confusing. One reading is that he has started arguing for the claim while in the midst of defining his terms. And indeed, when he tries to expand on the meaning, he starts off by saying "Here the idea is that: During this century civilization could..." and "This century is our chance to shape just how this happens." I don't want to get too bogged down in trying to go through all of it line by line, but to summarize this point: There is no clean separation of the definition that underpins the main claim from the arguments in favour of that claim. The explanation of the criteria by which a century can be judged to be the most important is blurred together with the specific points about this century that will be used. Compare with: '4 is the most important number because, as I will demonstrate, it is equal to 2+2...'.

3. Expecting the Unexpected

In Reasons and Persons (1984), Parfit wrote that "the next few centuries will be the most important in human history" and (according to MacAskill's essay Are we living at the hinge of history?) said much more recently, in 2015: "I think that we are living now at the most critical part of human history...we may be living in the most critical part of the history of the universe".

We do not need to assume that Karnofsky has a clear notion of importance or one that matches Parfit's in order to bring one of MacAskill's main criticisms to bear. The point is that any reasonable operationalization of the main claim must contend with the very low base rate. We will not go into technical detail here (see MacAskill's essay for more discussion) but, for example, two reasonable ways of setting priors are using the self-sampling assumption or using a uniform prior of importance over centuries. In both cases, the prior probability that the claim is true is very low. Karnofsky does in fact write that he

has talked about civilization lasting for billions of years... so the prior probability of "most important century" is less than 1/10,000,000.
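For concreteness, here is the arithmetic behind that figure - a minimal sketch, assuming the uniform-prior operationalization (my gloss, not a calculation Karnofsky spells out): if civilization lasts $N$ centuries and no century is privileged a priori, then

$$P(\text{this century is the most important}) = \frac{1}{N} \leq \frac{1}{10^{7}} \quad \text{whenever } N \geq 10^{7},$$

and $10^{7}$ centuries is a billion years, so 'billions of years' indeed forces a prior below 1/10,000,000.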

And to be fair to him, he adds that:

This argument feels like it is pretty close to capturing my biggest source of past hesitation about the "most important century" hypothesis. 

But immediately afterwards, he falls back on a dubious sort of argument that we will now discuss in more depth:

However, I think there are plenty of markers that this is not an average century, even before we consider specific arguments about AI.

The emphasis is mine.

To dig into this a bit, consider also the following quotation:

When someone forecasts transformative AI in the 21st century… a common intuitive response is something like: "It's really out-there and wild to claim that transformative AI is coming this century. So your arguments had better be really good."

I think this is a very reasonable first reaction to forecasts about transformative AI (and it matches my own initial reaction). But… I ultimately don't agree with the reaction.

Note that this is different from saying 'If you disagree with the claim, then I disagree with you (because I agree with the claim)'. He is saying that he doesn’t agree with the reaction that the arguments in favour of his claim need to be really good. And why would he think this? Presumably it is because he believes that the prior probability that transformative AI is coming this century is not low to begin with. Indeed, he goes on:

  • I think there are a number of reasons to think that transformative AI - or something equally momentous - is somewhat likely this century, even before we examine details of AI research, AI progress, etc. 

Again we see phrasing - "even before we examine details of AI..." - that seems to suggest that these "reasons" are not so much part of the object-level arguments and evidence in favour of the claim, but are part of an overarching framing in which the prior probability for the claim is not too small. He continues:

  • I also think that on the kinds of multi-decade timelines I'm talking about, we should generally be quite open to very wacky, disruptive, even revolutionary changes. With this backdrop, I think that specific well-researched estimates of when transformative AI is coming can be credible, even if they involve a lot of guesswork and aren't rock-solid.

The emphasis is mine. What is expressed here is in the same vein of either shifting the priors or somehow getting round the fact that they might be low. It ventures into a questionable argument that seems to say: Given how uncertain and difficult-to-predict everything is, maybe we should just generally be more open to unlikely things than we normally would? Maybe we should think of less-than-solid "guesswork" as more credible than we usually would? 

I must point out that there is actually a sense in which something like this can be technically true: A specific extreme outcome may be more likely in some specific higher-variance worlds than in a given low-variance world. But without much more detailed knowledge or assumptions about what 'distributions' one is dealing with, one cannot pull off the kind of argument he is attempting here, which is to appeal to uncertainty in order to back up predictions. And when talking about something like a date for the arrival of transformative AI, since such estimates can 'go either way', the high-variance "backdrop" being referred to makes it more important to have strong arguments aimed specifically at bounding the estimate from above.
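To see the 'cuts both ways' point concretely, here is a minimal sketch with made-up numbers - the central dates and spreads below are purely illustrative assumptions of mine, not figures from Karnofsky:

```python
# A toy illustration: two forecasters both centre their guess for the arrival
# of transformative AI on the year 2300, but differ in how uncertain they are.
from statistics import NormalDist

low_variance = NormalDist(mu=2300, sigma=50)    # a confident, "sober" forecast
high_variance = NormalDist(mu=2300, sigma=300)  # a "wild, turbulent times" forecast

# Higher variance does raise the probability of the extreme early outcome...
print(f"P(arrival before 2100), low variance:  {low_variance.cdf(2100):.5f}")   # ~0.00003
print(f"P(arrival before 2100), high variance: {high_variance.cdf(2100):.2f}")  # ~0.25

# ...but it raises the probability of an extremely late arrival by exactly as
# much, since the normal distribution is symmetric about its mean. Appealing
# to uncertainty alone therefore cannot bound the estimate from above.
print(f"P(arrival after 2500), high variance:  {1 - high_variance.cdf(2500):.2f}")  # ~0.25
```

The early tail only comes out large because of the assumed centre and shape of the distribution; change those assumptions and the same appeal to 'wildness' supports a very late arrival just as well.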

He writes elsewhere, speaking about his main claims:

These claims seem too "wild" to take seriously. But there are a lot of reasons to think that we live in a wild time, and should be ready for anything

And

I know what you're thinking: "The odds that we could live in such a significant time seem infinitesimal; the odds that Holden is having delusions of grandeur (on behalf of all of Earth, but still) seem far higher."

This is exactly the kind of thought that kept me skeptical for many years of the arguments I'll be laying out in the rest of this series... [but]... Grappling directly with how "wild" our situation seems to ~undeniably be has been key for me.

And consider:

There are further reasons to think this particular century is unusual. For example...

  • The current economic growth rate can't be sustained for more than another 80 centuries or so.

Again, it is worth pointing out that arguments of a not-dissimilar form can be valid and useful: For example, perhaps you want to argue for claim A and although P(A) is small, you know that P(A|B) is not small. So perhaps you are pointing out that we are in a world where we already know that B holds - i.e. essentially that we can take P(B) = 1 - in which case P(A) is not the relevant quantity; P(A|B) is.
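In symbols - a standard identity, not something Karnofsky himself invokes - Bayes' rule gives

$$P(A \mid B) = P(A) \cdot \frac{P(B \mid A)}{P(B)},$$

so a tiny prior $P(A)$ can only be overcome if the evidence $B$ is vastly more likely in worlds where $A$ holds than in general; a valid argument of this form has to actually establish that large likelihood ratio.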

However, once again we notice that that isn't what Karnofsky is doing. He is not making arguments of this form. For example, his point about the “current economic growth rate" is not developed into an argument that a century with a fast growth rate is necessarily one with an increased probability of the development of transformative AI. Time and again, this aspect of his overall argument seems only to say that the general situation we find ourselves in is so ‘special’ - "a wild time", "not... average", "a significant time" - that this permits things that were generally unlikely by default to now be much more likely "even before" considering the details of AI development. I can just about imagine the possibility of a very carefully argued-for framing in which the prior probability of the main claim is not small, but we do not find that here. This particular point of his ends up being not even wrong; it's the absence of a solid and relevant argument.

Notice, however, that it does work in favour of the persuasiveness of the writing. When one gives the piece a charitable reading and tries to parse all of the colour and added detail that could make this century special or "wild" (PASTA, an economic productivity explosion, seeding a galaxy-wide civilization, digital people, misaligned AI, the prospect of having a huge impact on future people), one feels broadly like one is reading about things that seem to be characteristic or representative of a very important century. And one's uncertainty about how the whole argument actually fits together doesn't really surface unless one is challenged. But when we stop to think about it: Presumably certain collections of claims have to all hold simultaneously, or certain chains of implications have to hold, in order for the overall argument to work? And so - as Yudkowsky explained here - there's a form of the conjunction fallacy that creeps in as a rhetorical device: "Adding detail can make a scenario sound more plausible, even though the event necessarily becomes less probable."
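The formal fact behind that 'necessarily' is just the conjunction inequality: for any claims $A_1, \dots, A_n$,

$$P(A_1 \wedge A_2 \wedge \dots \wedge A_n) \leq \min_i P(A_i),$$

so, to whatever extent the overall thesis needs several of these vivid sub-claims to hold simultaneously, each added detail can only lower the probability of the whole, even as it makes the story feel more representative of a very important century.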

4. 60% of the Time, It Works Every Time

When introducing a graph that shows the size of the global economy over the past 75 years or so, Karnofsky writes "When you think about the past and the future, you're probably thinking about something kind of like this:" and then, after displaying the graph, he follows up with:

I live in a different headspace, one with a more turbulent past and a more uncertain future.

Elsewhere later, he writes:

When some people imagine the future, they picture the kind of thing you see in sci-fi films. But… The future I picture is enormously bigger, faster, weirder,

He regularly chooses to emphasize how strange, weird, or "wacky" his thesis is, or that it has a "sci-fi feel".

It sure sounds more sexy and exciting to share Karnofsky’s headspace than to remain in a sober, skeptical one. Would it not just be so much... well, cooler... if this were all true, rather than if we had to play the poindexter and say 'hmm, after careful consideration, this just seems too unlikely'? And do we not think, perhaps, that young, idealistic EAs, often arriving in this community as part of a secular search for meaning and purpose, find themselves motivated to share his views not by the weight of evidence and quality of argument in favour of the claims, but in order to get in on this exciting feeling of living in the most important time - to 'get a piece of the action'?

But yet another purpose is served by this sort of language. In fact, with different wording, I'm sure many will recognize the format: "What I'm about to say might sound crazy, but...". It's a common rhetorical device, a type of expectation management intended to disarm our natural reaction of skepticism. By being reminded of something to the effect of 'You cannot reject the claim just because it sounds crazy', you are primed to be more receptive to the arguments that follow. I contend that if one wants to prioritize the rigorous consideration of a claim, then one has something of a duty to omit this kind of language. Truth-seeking is not the same as persuasion. A critical weighing of the arguments needs to be done dispassionately.

On a similar theme, one final specific point I want to draw attention to is seen in All Possible Views About Humanity's Future Are Wild, when Karnofsky writes: 

Let's say you agree with me about where humanity could eventually be headed - that we will eventually have the technology to create robust, stable settlements throughout our galaxy and beyond. But you think it will take far longer than I'm saying.

....

You don't think any of this is happening this century - you think, instead...it will take 100,000 years.

He goes on to say that:

In the scheme of things, this "conservative" view and my view are the same.

Really? What does he mean that these things are the same "in the scheme of things"? I thought that the precise "scheme of things" was that this century is the most important? And it sort of sounded like the development of technology to create settlements throughout the galaxy was somehow part of the argument. He ends the post with:

the choices made in the next 100,000 years - or even this century - could determine whether that galaxy-scale civilization comes to exist, and what values it has, across billions of stars and billions of years to come.

He genuinely appears not to be narrowing down this particular point beyond a period of 1,000 centuries. This of course makes us wonder: Is there a sub-claim about technologies that lead to galaxy-scale civilizations that forms a necessary step in the overall argument? If there is, then we should be understandably confused as to how it can not really matter in which of the next 1,000 centuries this technology emerges. If there is no such necessary sub-claim, then why are we spending so much time discussing and analyzing it, and what then is the real structure of the argument?

This oddity is part of the fact that, apparently, Karnofsky doesn't place much weight on whether or not the titular claim is even true. At the end of the Some additional detail on what I mean by "most important century" post, he writes that if he is correct about the general picture of the near future that he has been describing but "wrong about the 'most important century' for some reason" then he'd "still think this series's general idea was importantly right".

Not only does this exemplify some of the rhetorical devices alluded to earlier ('If you think there are specific versions of the claim that are wrong, don't worry, just make it a bit more vague or change the claim a little bit until it seems right'), but could he really be ending the series by more or less retroactively absolving himself from ever having had to verify the main claim? Or is he indeed admitting that seeking out the truth or otherwise of the main claim was never even really his point? And if we weren't even trying to figure out whether the main claim was true, then what were we doing? Perhaps he is sticking with the phrase partly because he's in the mindset of a fundraiser and overly accustomed to presenting an eye-catching or - dare we say - exaggerated version of a cause's importance, urgency, and worth in order to 'sell' the idea of funding it. Indeed, he does also write that he chose the phrase 'most important century'

as a wake-up call about how high the stakes seem to be.

and even that his "main intent" is just:

 to call attention to the "Holy !@#$" feeling of possibly developing something like PASTA this century...

But he shows no signs of abandoning the phrase: As recently as a couple of months ago, he was writing about what we can do to "help with the most important century" or make it "go well", e.g. How major governments can help with the most important century. So I cannot help but feel that the quotations above act as a way of avoiding criticism. He is going to keep using the phrase with a straight face, but if you try to pin him down on it in order to criticize it, the comeback is that he never really meant it in the first place (chill out, it's only meant "holistically").

5. Concluding Remarks

My first real contact with this community - i.e. the EA, rationalist, and Alignment Forum communities - was when I started on the Research Scholars Program at the Future of Humanity Institute a little more than two years ago. As part of one of the introductory meetings for the incoming cohort, and in order to stimulate a discussion session, we'd been granted permission to read draft versions of some of Karnofsky's most important century posts. One of my reactions, I remember well. It was a feeling that has resurfaced on multiple occasions since, and in multiple different contexts, during my immersion into this community: One of incredulity that I was surrounded by ostensibly very smart people, in this case people who had come from traditional degrees and were now technically members of the Oxford philosophy department, who seemed to be totally buying into an unusual worldview that was being argued for in only fairly lax, informal ways.

What I was witnessing then, and have witnessed many times since, was a level of deference that is not fully explained by an appeal to the expertise or track record of those or that which was being deferred to. In fact, I would describe it less as deference and more as a kind of susceptibility towards big, exciting, and unlikely claims that has its roots in the social and cultural forces of the community. Note here that the unlikeliness is part of the appeal: It sure makes you feel clever and important to be part of a club that (you believe) has a track record for uncovering urgent truths about the world that many other smart people have failed to see. But we must be wary of simply building this into our identity, of internalizing it to the point that we are primed to accept certain kinds of arguments that are made by the right kind of people in the right sort of way.

But that is where my fixation with these posts stems from. For me, they became emblematic of the culture shock that I experienced and were the context for my first big disappointment with the intellectual and critical standards within the EA/LW/AF ecosystem. I emphasize 'critical' as well as 'intellectual' because it isn't just that I'd come across some content that I disagreed with or thought was low-quality; it was that this was content from a respected and powerful figure in the field and that (perhaps through no fault of his own) too many others seemed to have given him too much benefit of the doubt and swallowed the message without doing him the honour of providing decent criticism.

Later, I saw people recommend the series of posts as a standard way for someone who is curious about AI Safety to get more of an idea of how the community approaches the issue. Coming from an academic background myself, I would think: If I wanted to come across as a convincing, credible authority on this subject, I would be embarrassed to recommend this series of posts or to cite it as one of the sources of my own knowledge on the subject. And I continue to be frustrated by the fact that many of those doing the recommending or defending can be quite so oblivious to how off-putting this style of writing can be to a skeptical outsider, particularly when part of the assumption going in is that this subject lies between two paradigms of rather extreme intellectual achievement: One that exemplifies rigorous, analytic thought - academic analytic philosophy - and one that exemplifies the bleeding edge of technical sophistication - the AI research labs of the Bay Area. We must not use the fact that our scholarship doesn't fully belong to either world as an excuse; we ought to be seeking the best of both worlds, with all that comes with it, including the holding of our work, even when or especially when it is speculative and unusual, to standards that are recognizable from both vantage points.

Comments

Holden Karnofsky describes the claims of his “Most Important Century” series as “wild” and “wacky”, but at the same time purports to be in the mindset of “critically examining” such “strange possibilities” with “as much rigour as possible”. This emphasis is mine, but for what is supposedly an important piece of writing in a field that has a big part of its roots in academic analytic philosophy, it is almost ridiculous to suggest that this examination has been carried out with 'as much rigour as possible'. 

(Probably unimportant note: I don't understand why you don't link to the specific post(s) you're referring to; could you please do that? I ask not simply to nitpick, but because I want to acknowledge that perhaps you're not referring to https://www.cold-takes.com/most-important-century/ , in which case the main comment below may not apply.)

I fear you may have (seriously) mischaracterized/misquoted Holden.[1] The following is the actual quote from that post:

But part of the mindset I've developed through GiveWell and Open Philanthropy is being open to strange possibilities, while critically examining them with as much rigor as possible. And after a lot of investment in examining the above thesis, I think it's likely enough that the world urgently needs more attention on it.
By writing about it, I'd like to either get more attention on it, or gain more opportunities to be criticized and change my mind.

When I read the quotes from your opening paragraph, I developed the strong impression that Holden wrote something like "This series on 'The Most Important Century' is my attempt at critically examining some wild claims with as much rigor as possible." JoshuaBlake also appears to have developed this impression, given that he commented "Then Karnofsky shouldn't claim that he was arguing with "as much rigour as possibly"." 

However, this is not what Holden claimed, and I never interpreted The Most Important Century as an attempt to be "as rigorous as possible"—which should be implied given that this is a blog post series, not a series of academic papers.

Ultimately, I would like for OP to clarify the situation (e.g., are you just referring to a quote from a different post?) before I go into more detail.

 

  1. ^

    Caveat: I can't myself claim to know exactly what Holden intended, all I can say is the way you made me think Holden described his writing is very different from how I actually interpreted Holden's claims both before and after I read this post.

I appreciate the comment.

I'll be honest, I'm probably not going to go back through all of the quotations now and give the separate posts they come from. Karnofsky did put the whole series into a single pdf which can function as a single source (but as I say, I haven't gone through again and checked myself).

I do recognize the bit you are quoting though - yes, that is indeed where I got my quotations for that part from - and I did think a bit about the issue you are bringing up. To me, this seems like a fair paraphrase/description of Karnofsky in that section:

a) He's developed a mindset in which he critically examines things like this with as much rigour as possible;

and

b) He has made a lot of investment in examining this thesis.

So it seemed - and still seems - to me that he is implying that he has invested a lot of time in critically examining this thesis with as much rigour as possible.

However, I would not have used the specific phrasing that JoshuaBlake used, i.e. I deliberately did not say something like 'Karnofsky claims to have written these posts with as much rigour as possible'. So I do think that one could try to give him the benefit of the doubt and say something like:

'Although he spent a lot of time examining the thesis with as much rigor as possible, it does not necessarily follow that he wrote the posts in a way that shows that. So criticising the writing in the posts is kind of an unfair way to attack his use of rigour.'

But I think to me this just seemed like I would be trying a bit too hard to avoid criticising him. This comes back to some of my points in my post: i.e. I am suggesting that his posts are not written in a way that invites clear criticism, despite his claim that this is one of his main intentions, and I suggest that Karnofsky's rhetorical style aims for the best of both worlds: He wants his readers to think both that he has thought about this very rigorously and critically for a long time but also - wherever it seems vague or wrong - to give him the benefit of the doubt and say 'well, it was never meant to be taken too seriously or to be 100% rigorous, they're just blog posts, etc.'.

(Just a note: Of course you are free to go into more detail in comments but I'm not sure I have much bandwidth to devote to writing long replies.)

 

Thanks for writing this and posting it - it is surprising that a text which is increasingly becoming a foundational piece of 'introductory' reading for people interested in x-risk reduction (and TAI more specifically) hasn't been rigorously or critically examined to the extent we'd probably want it to be. Hopefully there are more to come after this.

On a different note @mods - it doesn't seem like tagging this as a community post is appropriate, and it would lead to fewer people (who'd probably want to see it) seeing it. It might even be worth posting on lesswrong if you're feeling brave!

@mods - it doesn't seem like tagging this as a community post is appropriate, and it would lead to fewer people (who'd probably want to see it) seeing it.

It was tagged community by the author and, after a quick read, I agree with keeping the tag.

I agree that if the post was only about criticism of Karnofsky's Most Important Century series, it definitely shouldn't be tagged community. But the post seems to me to be in large part about the "intellectual and critical standards within the EA/LW/AF ecosystem", especially (but not exclusively) in the concluding remarks.

I think that a post that's not entirely about biorisk, but significantly touches on biorisk, should be tagged biorisk, and that the same applies in this case about "community".

I agree about your biorisk point. However, tagging community has the additional effect of greatly reducing visibility, which is fine when the content is primarily about the community, but not here, where the community aspect is a corollary to the main point critiquing a fundamental text in EA thinking.

Thanks for the comment.

Yeah, it wasn't too clear to me how to think about using the community tag but I decided to go with it in the end. This exchange however makes it look like people tend to disagree and think I shouldn't have used it. Hmmm, I'm not sure.

Well done on voicing harsh criticism. The general conclusion about lax intellectual standards feels plausible to me.

One specific point of pushback on Karnofsky's behalf: You complain: 'He goes on to say that In the scheme of things, this "conservative" view and my view are the same.'

In context, I think the structure of his argument here is as follows: "people reject my claim that space colonization might well start this century because it means we live in a very unusual time, but even if space colonization starts in the next 100k years, we live in nearly as unusual a time, because the period before colonization is still really short compared to the period after. But the objection seems bad against the latter view, so maybe we shouldn't trust it against my view either.' Though if that is what he meant, then yes, it could be spelled out more clearly, and would have to be in an analytic philosophy journal. 

He does specifically point out that the conservative view and his view have quite different implications for what we should do now: 'It’s true that the “conservative” view doesn’t have the same urgency for our generation in particular. But it still places us among a tiny proportion of people in an incredibly significant time period. And it still raises questions of whether the things we do to make the world better - even if they only have a tiny flow-through to the world 100,000 years from now - could be amplified to a galactic-historical-outlier degree.' (quoting https://www.cold-takes.com/all-possible-views-about-humanitys-future-are-wild/)

Here's what I think Spencer is saying (somewhat abridged):

  • Holden's writing does not have the careful rigour and precision you would get from a philosophy paper
  • This makes it harder to engage with the arguments
  • This means you should be less willing to trust Holden's conclusions, if you had formed your conclusions based on those posts alone

This all seems reasonable, but I don't think Holden was aiming for a carefully reasoned philosophy paper, I think he was aiming for accessible blog posts.

I would be more excited to see Spencer steelman the strongest case for the Most Important Century hypothesis, and then argue why it's incorrect.

I don't think Holden was aiming for a carefully reasoned philosophy paper, I think he was aiming for accessible blog posts

Then Karnofsky shouldn't claim that he was arguing with "as much rigour as possibly".

EDIT: It is not clear Karnofsky was claiming that, see linked comment thread in the replies.

I would be more excited to see Spencer steelman the strongest case for the Most Important Century hypothesis, and then argue why it's incorrect.

This is placing a very high burden on Spencer. Karnofsky claims that the argument is so strong that it should update you from a roughly 1 in 10 million prior to around 1 in 5, a Bayes factor of 2.5 million. If this is the case, it should be on Karnofsky to lay it out clearly, not Spencer.
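(Spelling out that arithmetic in odds form: a 1-in-10-million prior is prior odds of roughly $10^{-7}$; a 1-in-5 posterior is posterior odds of $0.2/0.8 = 0.25$; and $0.25/10^{-7} = 2.5 \times 10^{6}$.)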

In fact, this type of requirement is directly addressed by Spencer in his essay.

Firstly, I think that the main issues to do with clarity and precision that I will highlight occur at a fundamental level. It is not that they are 'more important' than individual, specific, object-level disagreements, but I claim that Karnofsky does a sufficiently poor job of explaining his main claims, the structure of his arguments, the dependencies between his propositions, and in separating his claims from the verifications of those claims, that it actually prevents detailed, in-depth discussions of object-level disagreements from making much sense.

And

To form detailed, specific criticisms of something that one finds to be vague and disorganized, one typically has to actually add clarity first. For example, by picking a precise characterization for a term that was left undefined, as MacAskill had to do in his Are we living at the hinge of history? essay when responding to Parfit, or by rearranging a set of points that were made haphazardly into an argument that is linear enough to be cleanly attacked. But I have not set out to do these things. In particular, if one finds the claims to be generally unconvincing and can't see a good route to bolstering them with better versions of the arguments, then it is hard to find motivation to do this.

Regarding

Then Karnofsky shouldn't claim that he was arguing with "as much rigour as possibly".

See my comment

Thank you for the correction; I had not checked the full context of the quote myself. I have now edited my comment to clarify that Karnofsky did not claim this, in response to this and Spencer's reply.

What do you think a better way of writing would have been? 

Just flagging uncertainties more clearly or clarifying when he is talking about his subjective impressions? 

Also, while that doesn't invalidate your criticism, I always read the most important century as something like "Holden describes in one piece why he thinks we are potentially in a very important time. It's hard to define what that means exactly but we all kind of get the intention. The arguments about AI are hard to precisely make because we don't really understand AI and its implications yet but the piece puts the current evidence together so that we get slightly less confused." 

I explicitly read the piece NOT as something that would be written in academic analytical philosophy and much more as something that points to this really big thing we currently can't articulate precisely but all agree is important.  

I don’t really understand the response of asking me what I would have done. I would find it very tricky to put myself in the position of someone who is writing the posts but who also thinks what I have said I think.

Otherwise, I don’t think you’re making unreasonable points, but I do think that in the piece itself I already tried to directly address much of what you’re saying.

I.e. you talk about his ideas being hard to define or hard to precisely explain but say that we all kind of get the intention. Among other things, I write about how his vagueness allows him to lump together many versions of his claims under one heading, which hides the true level of disagreement and uncertainty that is likely to exist about any particular precise version of a claim. And about how it means he can focus on persuading you (and potentially persuading you to act or donate, etc.) above figuring out what’s true. Etc.

(It feels weird to me to have to repeat that we should be skeptical of - or at least very careful with - ideas/arguments that seem to continue to be “hard to explain” but that “we all kinda get what we mean”.)

And I don’t expect it to actually be written like analytic philosophy either: e.g. one of my points here is that it isn’t reasonable to suppose that he is unaware of the standards of academic philosophy, and so it doesn’t feel right for him to suggest that he is using a high level of rigour, etc.

I struggle with this stuff so please take this lightly.

But doesn't he give some clearer stuff?

  • Doesn't he talk about how you might consider every century as one century out of 100,000, but alternatively you could look at centuries of rapid economic growth, of which there are three? Suddenly the prior probability that this is the most important one seems to go up a lot
  • Likewise, perhaps a more important tech is possible after PASTA, but if not (a medium-sized 'if'), then it's not that unlikely we develop it this century. We haven't tried for that long, we've recently made big breakthroughs, and a lot more resources are going to it.
  • The creation of an intelligence that much surpasses us is probably the biggest event in the history of any species, after its own evolution. So again, this could put it in the top two.

I guess this doesn't seem that robust to me, sure, but it doesn't seem not at all robust either. And maybe I've missed it (it's possible my comprehension is pretty poor), but you don't seem to have mentioned these.

Are these the kinds of claims you want or do you want others?

So I haven't read your whole post (apologies), but:

  • I also found Holden's argument related to the "Most Important Century" rather confusing. My theory was that Holden was responding to arguments along the lines of what Robin Hanson and some other economists have made, but without having taken the time to step through and explain those arguments. So for me the experience was one of jumping into the middle of a conversation (even though I had heard some of Hanson's arguments before, I was somewhat vague on them and it took me some time to figure out that I needed to understand Holden's argument in that context).
  • I've generally found Holden's other posts to be much less confusing (although it sounds like you differ here). So from my perspective, I see Holden as a skilled communicator who in this instance was probably so enmeshed in a particular set of conversations that are happening that it didn't occur to him that he needed to explain certain details.

In that case, it's weird that the post is highlighted in the navigation on Karnofsky's blog and described as "the core of [his] AI content". This strongly implies it is a foundational argument from one of the most influential people in EA about why we should be concerned about AI. As such, it should either be held to fairly high standards or be replaced.

[comment deleted]