riceissa

1070 karma

Bio

I am Issa Rice. https://issarice.com/

Posts
11

Comments
97

Topic contributions
2

I think a general and theoretically sound approach would be to build a single composite game to represent all of the games together

Yeah, I did actually have this thought, but I guess I turned it around and thought: shouldn't an adequate notion of value be invariant to how I decide to split up my games? The linearity property on Wikipedia even seems to be inviting us to split games up in whatever manner we want.
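(For reference, the linearity/additivity axiom I have in mind, written in my own notation rather than quoted from Wikipedia: for any two games $v$ and $w$ on the same player set,

$$\varphi_i(v + w) = \varphi_i(v) + \varphi_i(w) \quad \text{for every player } i,$$

where $(v + w)(S) = v(S) + w(S)$. So if I carve one big game into a sum of smaller games, the Shapley values of the pieces are supposed to add back up, which is what makes the choice of split feel like it shouldn't matter.)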

And yeah, I agree that in the real world games will overlap and so there will be double counting going on by splitting games up. But if that's all that's saving us from reaching absurd conclusions then I feel like there ought to be some refinement of the Shapley value concept...

I asked my question because the problem with infinities seems unique to Shapley values (e.g. I don't have this same confusion about the concept of "marginal value added"). Even with a small population, the number of cooperative games seems infinite: for example, there are an infinite number of mathematical theorems that could be proven, an infinite number of Wikipedia articles that could be written, an infinite number of films that could be made, etc. If we just use "marginal value added", the total value any single person adds is finite across all such cooperative games because in the actual world, they can only do finitely many things. But the Shapley value doesn't look at just the "actual world", it seems to look at all possible sequences of ways of adding people to the grand coalition and then averages the value, so people get non-zero Shapley value assigned to them even if they didn't do anything in the "actual world".
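To be concrete, the definition I'm going by is the standard permutation form (this is just the textbook formula, nothing specific to this post): player $i$'s Shapley value in a single game $v$ with player set $N$ is the average of $i$'s marginal contribution over all orderings of $N$,

$$\varphi_i(v) = \frac{1}{|N|!} \sum_{\pi} \left[ v(P_i^\pi \cup \{i\}) - v(P_i^\pi) \right],$$

where $P_i^\pi$ is the set of players preceding $i$ in the ordering $\pi$. Nothing in this formula asks whether $i$ actually did anything in the real world; it only asks what $i$ would contribute at each position.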

(There's maybe some sort of "compactness" argument one could make that even if there are infinitely many games, in the real world only finitely many of them get played to completion and so this should restrict the total Shapley value any single person can get, but I'm just trying to go by the official definition for now.)

I don't think the example you give addresses my point. I am supposing that Leibniz could have also invented calculus, so he gets a nonzero Shapley value in the calculus game. But Leibniz could have also invented lots of different things (infinitely many things!), and his claim to each invention would be valid (although in the real world he only invents finitely many things). If each invention is worth at least a unit of value, his Shapley value across all inventions would be infinite, even if Leibniz was "maximally unlucky" and in the actual world got scooped every single time and so did not invent anything at all.
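Spelling out the arithmetic, in my own (made-up) notation: if $v_1, v_2, v_3, \ldots$ are the invention games and Leibniz's Shapley value in each one is bounded below by some fixed $\varepsilon > 0$ (say because each invention is worth at least a unit and he credibly could have been its inventor), then

$$\sum_{k=1}^{\infty} \varphi_{\text{Leibniz}}(v_k) \;\geq\; \sum_{k=1}^{\infty} \varepsilon \;=\; \infty,$$

and this conclusion doesn't depend at all on which inventions he actually produced.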

I don't understand the part about self-modifications - can you spell it out in more words/maybe give an example?

Disagree-voting a question seems super aggressive and also nonsensical to me. (Yes, my comment did include some statements as well, but they were all scaffolding to present my confusion. I wasn't presenting my question as an opinion, as my final sentence makes clear.) I've been unhappy with the way the EA Forum has been going for a long time now, but I am noting this as a new kind of low.

What numerator and denominator? I am imagining that a single person could be a player in multiple cooperative games. The Shapley value for the person would be finite in each game, but if there are infinitely many games, the sum of all the Shapley values (adding across all games, not adding across all players in a single game) could be infinite.

Example 7 seems wild to me. If the applicants who don't get the job also get some of the value, does that mean people are constantly collecting Shapley value from the world, just because they "could" have done a thing (even if they do absolutely nothing)? If there are an infinite number of cooperative games going on in the world and someone can plausibly contribute at least a unit of value to any one of them, then it seems like their total Shapley value across all games is infinite, and at that point it seems like they are as good as one can be, all without having done anything. I can't tell if I'm making some sort of error here or if this is just how the Shapley value works.
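To check my reading, here is a small brute-force computation of a toy version of the applicants situation (my own reconstruction, not necessarily the exact setup of Example 7): one job worth 1 unit, and any non-empty set of applicants can fill it.

```python
# Toy "job applicants" game: the job (worth 1 unit) gets filled
# as long as at least one applicant is in the coalition.
from itertools import permutations

def v(coalition):
    """Characteristic function: 1 if anyone can take the job, else 0."""
    return 1.0 if coalition else 0.0

def shapley_values(players):
    """Brute-force Shapley values: average each player's marginal
    contribution over all orderings of the players."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for p in order:
            totals[p] += v(coalition | {p}) - v(coalition)
            coalition.add(p)
    return {p: total / len(orderings) for p, total in totals.items()}

print(shapley_values(["hired", "rejected_1", "rejected_2"]))
# -> each applicant gets 1/3, including the two who never get the job
```

Each applicant ends up with 1/3 of the value even though only one of them is hired, and that is exactly the feature that seems to blow up once someone can be a "could-have-done-it" player in infinitely many such games.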

Do you know of any ways I could experimentally expose myself to extreme amounts of pleasure, happiness, tranquility, and truth?

I'm not aware of any way to expose yourself to extreme amounts of pleasure, happiness, tranquility, and truth that is cheap, legal, time-efficient, and safe. That's part of the point I was trying to make in my original comment. If you're willing to forgo some of those requirements, then as Ian/Michael mentioned, for pleasure and tranquility I think certain psychedelics (possibly illegal depending on where you live, possibly unsafe, and depending on your disposition/luck may be a terrible idea) and meditation practices (possibly expensive, takes a long time, possibly unsafe) could be places to look into. For truth, maybe something like "learning all the fields and talking to all the people out there" (expensive, time-consuming, and probably unsafe/distressing), though I realize that's a pretty unhelpful suggestion.

I'd be willing to expose myself to whatever you suggest, plus extreme suffering, to see if this changes my mind. Or we can work together to design a different experimental setup if you think that would produce better evidence.

I appreciate the offer, and think it's brave/sincere/earnest of you (not trying to be snarky/dismissive/ironic here - I really wish more people had more of this trait that you seem to possess). My current thinking though is that humans need quite a benign environment in order to stay sane and be able to introspect well on their values (see discussion here, where I basically agree with Wei Dai), and that extreme experiences in general tend to make people "insane" in unpredictable ways. (See here for a similar concern I once voiced around psychedelics.) And even a bunch of seemingly non-extreme experiences (like reading the news, going on social media, or being exposed to various social environments like cults and Cultural Revolution-type dynamics) seem to have historically made a bunch of people insane and continue to make people insane. Basically, although flawed, I think we still have a bunch of humans around who are still basically sane or at least have some "grain of sanity" in them, and I think it's incredibly important to preserve that sanity. So I would probably actively discourage people from undertaking such experiments in most cases.

It may end up being that such intensely positive values are possible in principle and matter as much as intense pains, but they don’t matter in practice for neartermists, because they're too rare and difficult to induce. Your theory could symmetrically prioritize both extremes in principle, but end up suffering-focused in practice. I think the case for upside focus in longtermism could be stronger, though.

If by "neartermism" you mean something like "how do we best help humans/animals/etc who currently exist using only technologies that currently exist, while completely ignoring the fact that AGI may be created within the next couple of decades" or "how do we make the next 1 year of experiences as good as we can while ignoring anything beyond that" or something along those lines, then I agree. But I guess I wasn't really thinking along those lines since I find that kind of neartermism either pretty implausible or feel like it doesn't really include all the relevant time periods I care about.

It's also conceivable that pleasurable states as intense as excruciating pains in particular are not possible in principle after refining our definitions of pleasure and suffering and their intensities.

I agree with you that that is definitely conceivable. But I think that, as Carl argued in his post (and elaborated on further in the comment thread with gwern), our default assumption should be that efficiency (and probably also intensity) of pleasure vs pain is symmetric.

I am worried that exposing oneself to extreme amounts of suffering without also exposing oneself to extreme amounts of pleasure, happiness, tranquility, truth, etc., will predictably lead one to care a lot more about reducing suffering compared to doing something about other common human values, which seems to have happened here. And the fact that certain experiences like pain are a lot easier to induce (at extreme intensities) than other experiences biases which values people end up caring about most.

Carl Shulman made a similar point in this post: "This is important to remember since our intuitions and experience may mislead us about the intensity of pain and pleasure which are possible. In humans, the pleasure of orgasm may be less than the pain of deadly injury, since death is a much larger loss of reproductive success than a single sex act is a gain. But there is nothing problematic about the idea of much more intense pleasures, such that their combination with great pains would be satisfying on balance."

Personally speaking, as someone who has been depressed and anxious most of my life and has at times (unintentionally) experienced extreme amounts of suffering, I don't currently find myself caring more about pleasure/happiness compared to pain/suffering (I would say I care about them roughly the same). There's also this thing I've noticed where sometimes when I'm suffering a lot, the suffering starts to "feel good" and I don't mind it as much, and symmetrically, when I've been happy the happiness has started to "feel fake" somehow. So overall I feel pretty confused about what terminal values I am even optimizing for (but thankfully it seems like on the current strategic landscape I don't need to figure this out immediately).

Has Holden written any updates on outcomes associated with the grant?

Not to my knowledge.

I don't think that lobbying against OpenAI, or other adversarial action, would have been that hard.

It seems like once OpenAI was created and had disrupted the "nascent spirit of cooperation", even if OpenAI went away (like, the company and all its employees magically disappeared), the culture/people's orientation to AI stuff ("which monkey gets the poison banana" etc.) wouldn't have been reversible. So I don't know if there was anything Open Phil could have done to OpenAI in 2017 to meaningfully change the situation in 2022 (other than like, slowing AI timelines by a bit). Or maybe you mean some more complicated plan like 'adversarial action against OpenAI and any other AI labs that spring up later, and try to bring back the old spirit of cooperation, and get all the top people into DeepMind instead of spreading out among different labs'.
