The "go extinct" condition is a bit fuzzy. It seems like it would be better to express what you want to change your mind about as something more like (forget the term for this). P(go extinct| AGI)/P(go extinct).
I know you've written the question in terms of going extinct because of AGI, but I worry this invites ways of shifting that probability upward that are relatively trivial and uninformative about AI.
For instance, consider a line of argument:
1. AGI is quite likely (probably by your own lights) to be developed by 2070.
2. If AGI is developed, either it will suffer from serious alignment problems (so, reason to think we go extinct) or it will seem reliable and extremely capable, and so will quickly be placed into key roles controlling things like nukes, military responses, etc.
3. The world is a dangerous place, and there is a good possibility of a substantial nuclear exchange between countries before 2070, which would substantially curtail our future potential (e.g. by causing a civilizational collapse which, because we've used up much of the easily available fossil fuels, minerals, etc., we can't recover from).
4. By (2), that exchange will, with high probability, have AGI serving as a key element in the causal pathway that leads to it. Even though the exchange may well have happened without AGI, it will be the case that the people who press the button relied on critical intel collected by an AGI, or that an AGI was placed directly in charge of some of the weapons systems involved in one of the escalating incidents, etc.
I think it might be wise to either
a) Shift to a condition in terms of the ratio between the chance of extinction conditional on AGI and the unconditional chance of extinction, so the focus is on the effect of AGI on the likelihood of extinction.
b) If not that, at least clarify the kind of causation required. Is it sufficient that the particular causal pathway that occurred include AGI somewhere in it? Can I play even more unfairly and simply point out that, by a butterfly-effect-style argument, the particular incident that leads to extinction is probably but-for caused by almost everything that happens before it (if not for some random AI thing years ago, the soldiers who provoked the initial confrontation would probably have been different people or behaved differently, and instead of that year and that incident it would have been one a year earlier or later)?
But hey, if you aren't going to clarify away these issues, or say that you'll evaluate to the spirit of the question rather than its technical formulation, I'm going to include in my submission (if I find I have the time for one) a whole bunch of technically responsive but not-really-what-you-want arguments about how extinction from some cause is relatively likely and how AGI will appear in that causal chain in a way that makes it a cause of the outcome.
I mean, I hope you actually judge on something that ensures you're really learning about the impact of AGI, but one's gotta pick up all the allowed percentage points one can ;-).
I'm not sure I completely followed #1, but maybe this will answer what you are getting at.
I agree that the following argument is valid:
Either the time discounting rate is 0, or it is morally preferable to use your money/resources to produce utility now rather than to freeze yourself and produce utility later.
However, I still don't think you can argue that I can't simultaneously think that time discounting is irrelevant to what I selfishly prefer and believe that you shouldn't apply discounting when evaluating what is morally preferable. And I think this substantially reduces just how compelling the point is. I mean, I do lots of things I'm aware are morally non-optimal (I probably should donate more of my earnings to EA causes, etc.), but sometimes I choose to be selfish, and when I consider cryonics it's entirely as a selfish choice (I agree that even without discounting it's a waste in utilitarian terms).
(Note that I'd distinguish between saying some alternative is morally preferable and saying it is bad or blameworthy to do otherwise, but that's getting a bit into the weeds.)
—-
Regarding the theoretical problems, I agree that they aren't enough of a reason to accept a pure time discounting rate. Indeed, I'd go further and say that one is making a mistake to infer things about what's morally good from the fact that we'd like our notion of morality to have certain nice properties. We don't get to assume that morality is going to behave like we would like it to ... we've just got to do our best with the means of inference we have.
I ultimately agree with you (pure time discounting is wrong, even if our increasing wealth makes it a useful practical assumption), but I don't think your argument is quite as strong as you think (nor is Cowan's argument very good).
In particular, I'd distinguish my selfish emotional desires regarding my future mental states from my ultimate judgements about the goodness or badness of particular world states, and I think we can show these have to be distinct notions[1]. Someone defending pure time discounting could just say: while, as far as my selfish preferences go, I don't care whether I have another 10 happy years now or in 500 years, it's nevertheless true that, morally speaking, the world in which that utility is realized now is much better than the one in which it is realized later.
This is also where Cowan's argument falls apart. The Pareto principle is only violated if a world in which one person is made better off and everyone else's position is unchanged isn't preferable to the default. But he then makes the unjustified assumption that Sarah isn't 'made worse off' by having her utility moved into the future. That just begs the question since, if we believe in pure time discounting, Sarah's future happiness really is worth only a fraction of what it would be worth now. In other words, we are just being asked to assume that only Sarah's subjective experience, and not the time at which it happens, affects her contribution to overall utility/world value.
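To make that concrete (this is my own rendering, using a standard exponential form of pure discounting, which is only one way the discounter might cash it out): suppose welfare u realized at time t contributes δ^t · u to overall world value, with 0 < δ < 1. Then

contribution if realized at time t: δ^t · u
contribution if delayed to time t + k: δ^(t+k) · u = δ^k · (δ^t · u), which is strictly less than δ^t · u

So by the discounter's own lights the delay really does shrink the value Sarah contributes, and the premise that she isn't made worse off is precisely the point in dispute.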
Having said all this, I think that every reason one has for adopting something like utilitarianism (or, hell, any form of consequentialism) screams out against accepting pure time preferences, even if rejecting them isn't formally required. The only reason people are even entertaining pure discounting is that they are worried about the paradoxes you get into if you end up with infinite total utility (and yes, difficulties remain even if you just try to directly define a preference relation on possible worlds).
—-
[1]: I mean, your argument basically assumes that, other things being equal, a world where my selfish desires are satisfied is better than one in which they are not. While that is a coherent position to hold (it's basically what preference-satisfaction accounts of morality hold), it's not (absent some a priori derivation of morality) required.
For instance, I'm a pure utilitarian, so what I'd say is that while I selfishly wish to continue existing, I realize that if I suddenly disappeared in a poof of smoke (suppose I'm a hermit with no affected friends or relatives) and was replaced by an equally happy individual, that would be just as good a possible world as the one in which I continued to exist.
Could you provide some evidence that this rate of growth is unusual in history? I mean, it wouldn't shock me if we looked back at the last 5,000 years and saw that most societies' real production grew at similar rates during times of peace/tranquility, but that this resulted in small absolute growth that was regularly wiped out by invasion, plague or other calamity. In which case the question becomes whether or not you believe that our technological accomplishments make us more resistant to such calamities (another discussion entirely).
Moreover, even if we didn't see similar levels of growth in the past, there are plenty of simple models which explain this apparent difference as the result of a single underlying phenomenon. For instance, consider the theory that real production over and above the subsistence agricultural level grows at a constant rate per year. As this surplus was almost 0 for most of the past 5,000 years, that growth wouldn't be very noticeable until recently. And this isn't just some arbitrary mathematical fit; it has a good justification: productivity improvements require free time, invention, etc., so they only happen in the fraction of people's time not devoted to avoiding starvation.
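Here's a toy version of that model, just to show the shape it produces (all constants are invented purely for illustration, nothing is calibrated to real data): output is subsistence plus a surplus, and only the surplus compounds at a constant proportional rate.

```python
# Toy "constant growth on the surplus" model. All numbers are invented
# for illustration only.

SUBSISTENCE = 100.0      # hypothetical per-capita output needed just to not starve
INITIAL_SURPLUS = 0.01   # tiny surplus above subsistence 5,000 years ago
GROWTH_RATE = 0.002      # surplus compounds at a constant 0.2% per year

def total_output(years: int) -> float:
    """Total output after `years`, with only the surplus compounding."""
    return SUBSISTENCE + INITIAL_SURPLUS * (1 + GROWTH_RATE) ** years

for years in (0, 1000, 2000, 3000, 4000, 5000):
    print(f"after {years:>4} years: output ~ {total_output(years):7.2f}")
```

The measured growth rate of total output starts out indistinguishable from zero and only recently approaches the underlying 0.2%, even though the mechanism never changes, which is the sense in which a single underlying phenomenon can explain the apparent difference.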
Also, it's kinda weird to describe the constant-rate-of-growth assumption as business as usual but then pick a graph where we have an economic singularity (a constant rate of growth gives an exponential curve, which doesn't escape to infinity at any finite time). Having said all that, sure, it seems wrong to just assume things will continue this way forever, but it seems equally unjustified to reach any other conclusion.
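For what it's worth, the distinction I'm pointing at is just standard textbook stuff (nothing specific to the post): a constant growth rate g gives dY/dt = gY, so Y(t) = Y0·e^(gt), which is finite at every finite time; to get a curve that goes vertical at a finite date you need something super-exponential, e.g. dY/dt = aY^2, which gives Y(t) = Y0/(1 − a·Y0·t) and blows up at t* = 1/(a·Y0).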
I thought the archetypal example was one where everyone had a mild preference to be with other members of their race (even if just because of somewhat more shared culture) and didn't personally much care if they ended up not being in a mixed group. But I take your point to be that, at least in the gender case, we do have a preference not to be entirely divided by gender.
So yes, I agree that if the effect leads to too much sorting then it could be bad, but it seems like a tough empirical question whether we are at a point where the utility gains from more sorting are greater or smaller than the losses.
Re your first point: yup, they won't try to recruit others to that belief, but so what? That's already a bullet any utilitarian has to bite, thanks to examples like the aliens who will torture the world if anyone believes utilitarianism is true or tries to act as if it is. There is absolutely nothing self-defeating here.
Indeed, if we define utilitarianism as simply the belief that one's preference relation on possible worlds is dictated by the total utility in them, then it follows by definition that the best acts an agent can take are just the ones which maximize utility. So maybe the better way to phrase this is: why care what an agent who pledges allegiance to utilitarianism and wants to recruit others might need to do? That's a distraction from the simple question of what in fact maximizes utility. If that means convincing everyone not to be utilitarians, then so be it.
--
And yes, re the rest of your points, I guess I just don't see why it matters what would be good to do if other agents responded in some way you argue would be reasonable. Indeed, what makes consequentialism consequentialism is that you aren't acting based on what would happen if you imagine interacting with idealized agents, as a Kantian-esque theory might consider, but on what actually happens when you actually act.
I agree the caps were aggressive and I apologize for that. And I agree that I'm not trying to produce evidence which shows that, in fact, how people respond to supposed signals of integrity tends to match what they see as evidence that you follow the standard norms. That's just something people need to consult their own experience about and ask themselves whether, in their experience, it tends to be true. Ultimately, I think it's just not true that a priori analysis of what should make people see you as trustworthy (or have any other social reaction) is a good guide to what they will actually do.
But I guess that just returns us to point 1 and our different conceptions of what utilitarianism requires.
Yes, and reading this again now I think I was way too harsh. I should have been more positive about what was obviously an earnest concern and desire to help, even if I don't think it's going to work out. A better response would have been to suggest other ideas to help, though I don't have much to offer other than reforming how medical practice works so that mental suffering isn't treated as less important than being physically debilitated (docs will agree to risky procedures to avoid physical loss of function but won't with mental illness... likely because the family doesn't see the suffering from the inside but does see the loss in a death, so they're liable to sue/complain if things go badly).
I feel like there is some definitional slipping going on when you suggest that a painful experience is less bad when you are also experiencing a pleasurable one at the same time. Rather, it seems to me the right way to describe this situation is that the experience is simply not as painful as it would be otherwise.
To drive this intuition home, consider S&M play. It's not that the pain of being whipped is just as bad but offset; it literally feels different than being whipped in a different context, in a way that simply makes it less painful.
Better yet, notice the way opiates work: they leave you aware of the physical sensation of the pain but make you mind it less. Isn't it just that when we experience a pleasure and a pain at the same time, the neurochemicals created by the pleasure literally blunt the pain, much like opiates would?
On a related note, I'm a bit uncomfortable about inferring too much about the structure of pain/pleasure from our evolved willingness to seek it out or endure it, and I worry that this also conflates reward and pleasure, but it's a hard problem.