JamesDrain

11 karma

Comments
7

I posted a couple months ago that I was working on an effective altruism board game. You can now order a copy online!

To recap:

  • it's a cooperative game where you start out as a random human sampled from the real-world distribution of income

  • you try to get lots of human QALYs and animal QALYs and reduce existential risk

  • all while answering EA-related trivia questions, donating to effective charities, partaking in classic philosophy thought experiments, and realizing your own private morality

  • and you try to avoid being turned into a chicken.

Ha, I think the problem is just that your formalization of Newcomb's problem is defined so that one-boxing is always the correct strategy, and I'm working with a different formulation. There are four forms of Newcomb's problem that jibe with my intuition, and they're all different from the formalization you're working with.

  1. Your source code is readable. Then the best strategy is whatever the best strategy is when you get to publicly commit, e.g. you should tear off the steering wheel when playing chicken if you have the opportunity to do so before your opponent.
  2. Your source code is readable and so is your opponent's. Then you get mathy things like mutual simulation and Löb's theorem.
  3. We're in the real world, so the only information the other player has to guess your strategy is information like your past behavior and reputation. (This is by far the most realistic situation in my opinion.)
  4. You're playing against someone who's an expert in reading body language, say. Then it might be impossible to fool them unless you can fool yourself into thinking you'll one-box. But of course, after the boxes are actually in front of you, it would be great for you if you had a change of heart.

Your version is something like:

  1. Your opponent can simulate you with 100% accuracy, including unforeseen events like something unexpected causing you to have a change of mind.

If we're creating AIs that others can simulate, then I guess we might as well make them immune to retro blackmail. I still don't see the implications for humans, who cannot be simulated with 100% fidelity and already have ample intuition about their reputations and know lots of ways to solve coordination problems.

Newcomb's problem isn't a challenge to causal decision theory. I can solve Newcomb's problem by committing to one-boxing in any of a number of ways, e.g. signing a contract or building a reputation as a one-boxer. After the boxes have already been placed in front of me, however, I can no longer influence their contents, so it would be good if I two-boxed whenever the rewards outweighed the penalty, e.g. if it turned out the contract I signed was void, or if I didn't care about my one-boxing reputation because I didn't expect to play this game again.
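To make the timing point concrete, here is a minimal sketch. The dollar amounts are the conventional ones from the thought experiment and are an assumption here, as are the function names; nothing in it models how the predictor actually works.

```python
# Conventional Newcomb payoffs (assumed, not taken from the comment above).
BIG = 1_000_000  # opaque box contents if one-boxing was predicted
SMALL = 1_000    # transparent box contents, always present


def payoff(predicted_one_box: bool, takes_both: bool) -> int:
    """Money received, given the prediction already made and the final choice."""
    opaque = BIG if predicted_one_box else 0
    return opaque + (SMALL if takes_both else 0)


def gain_from_two_boxing(predicted_one_box: bool) -> int:
    """Extra money from taking both boxes once their contents are fixed."""
    return payoff(predicted_one_box, True) - payoff(predicted_one_box, False)


if __name__ == "__main__":
    # After the boxes are placed, two-boxing adds the same amount either way.
    for predicted in (True, False):
        print("predicted one-box:", predicted, "gain:", gain_from_two_boxing(predicted))
    # Before the prediction, a binding commitment changes which row you land in.
    print("committed one-boxer:", payoff(True, False))
    print("known two-boxer:", payoff(False, True))
```

Under these assumed numbers, the causal argument for two-boxing applies only once the contents are fixed, while the argument for committing applies only beforehand.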

The "wishful thinking" hypothesis might just apply to me then. I think it would be super cool if we could spontaneously cooperate with aliens in other universes.

Edit: Wow, ok I remember what I actually meant about wishful thinking. I meant that evidential decision theory literally prescribes wishful thinking. Also, if you made a copy of a purely selfish person and then told them of the fact, then I still think it would be rational to defect. Of course, if they could commit to cooperating before being copied, then that would be the right strategy.

I’m worried that people’s altruistic sentiments are ruining their intuition about the prisoner’s dilemma. If Bob were an altruist, then there would be no dilemma. He would just cooperate. But within the framework of the one-shot prisoner’s dilemma, defecting is a dominant strategy – no matter what Alice does, Bob is better off defecting.
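The dominance claim can be checked mechanically. This is only a sketch: the payoff numbers are illustrative assumptions (think of them as years in prison, negated), and the names are made up for the example.

```python
# PAYOFFS[(bob_move, alice_move)] is Bob's payoff.
# The numbers are illustrative assumptions, not values from the comment above.
PAYOFFS = {
    ("cooperate", "cooperate"): -1,
    ("cooperate", "defect"): -3,
    ("defect", "cooperate"): 0,
    ("defect", "defect"): -2,
}

MOVES = ("cooperate", "defect")


def is_dominant(move: str, payoffs=PAYOFFS) -> bool:
    """True if `move` beats every other move for Bob, whatever Alice does."""
    return all(
        payoffs[(move, alice)] > payoffs[(other, alice)]
        for alice in MOVES
        for other in MOVES
        if other != move
    )


if __name__ == "__main__":
    for move in MOVES:
        print(move, "is dominant:", is_dominant(move))
```

Any payoff table with the usual prisoner's dilemma ordering gives the same answer; an altruistic Bob would simply have a different table, which is the point about there being no dilemma for him.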

I’m all for caring about other value systems, but if there’s no causal connection between our actions and aliens’, then it’s impossible to trade with them. I can pump someone’s intuition by saying, “Imagine a wizard produced a copy of yourself and had the two of you play the prisoner’s dilemma. Surely you would cooperate?” But that thought experiment is messed up because I care about copies of myself in a way that defies the set up of the prisoner’s dilemma.

One way to get cooperation in the one-shot prisoner’s dilemma is if Bob and Alice can inspect each other’s source code and prove that the other player will cooperate if and only if they do. But then Alice and Bob can communicate with each other! By having provably committed to this strategy, Alice and Bob can cause other players with the same strategy to cooperate.
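A toy version of the source-code idea, as a rough sketch. It swaps the proof step for something much cruder: each program just checks whether the opponent's source text is identical to its own. The `Program` type, the strategy names, and the use of a plain string as "source code" are all assumptions made for illustration.

```python
from typing import Callable, NamedTuple


class Program(NamedTuple):
    source: str  # stand-in for the program's readable source code
    decide: Callable[[str, str], str]  # (own source, opponent source) -> move


def mirror_decide(own_source: str, opponent_source: str) -> str:
    """Cooperate only with an exact copy of this program."""
    return "cooperate" if opponent_source == own_source else "defect"


def always_defect(own_source: str, opponent_source: str) -> str:
    """Ignore the opponent's source and defect."""
    return "defect"


MIRROR = Program("mirror-v1", mirror_decide)
DEFECTOR = Program("always-defect", always_defect)


def play(a: Program, b: Program) -> tuple:
    """Each program reads the other's source, then both moves are returned."""
    return a.decide(a.source, b.source), b.decide(b.source, a.source)


if __name__ == "__main__":
    print(play(MIRROR, MIRROR))
    print(play(MIRROR, DEFECTOR))
```

The readable source is what does the work here: showing it to the other player is a causal channel, so this kind of cooperation does not need anything acausal.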

Evidential decision theory also preys on our sentiments. I’d like to live in a cool multiverse where there are aliens outside my light cone who do what I want them to, but it’s not like my actions can cause that world to be the one I was born into.

I’m all for chasing after infinities and being nice to aliens, but acausal trade makes no sense. I’m willing to take many other infinite gambles, like theism or simulationism, before I’m willing to throw out causality.

I could really have benefited from a list like this three or four years ago! I wasted a lot of time reading prestigious fiction (Gravity’s Rainbow, Infinite Jest, In Search of Lost Time, Ulysses) and academic philosophy – none of which I liked or understood – as well as a lot of sketchy pop psych.

If Doing Good Better, 80,000 Hours, The Life You Can Save, Animal Liberation, and Superintelligence are already taken, then I’d say the five most influential works I’ve read are:

  • All of Steven Pinker’s books

  • The Art and Craft of Problem Solving

  • Here Be Dragons: Science, Technology and the Future of Humanity

  • Nick Bostrom’s Are You Living in a Computer Simulation?, Infinite Ethics, and Astronomical Waste (I would add Anthropic Bias if I could understand it)

  • The Tell-Tale Brain

Runners-up include: To Be a Machine, Freakonomics, 1984, Information Theory, Inference and Learning Algorithms, The One World Schoolhouse, Computability Theory, Human Accomplishment, Linear Algebra and Its Applications, Spivak’s Calculus, The Willpower Instinct, Thinking Fast and Slow, The Nurture Assumption, Introduction to Algorithms, Practical Programming (this is about weightlifting – not CS), Surely You’re Joking, Innumeracy and also Beyond Numeracy, An Anthropologist on Mars, The God Delusion, The Righteous Mind, Poor Economics, Mathematics 1001, A Short History of Nearly Everything, The Selfish Gene, and Reasons and Persons. It’s not a book, but I feel like SlateStarCodex also belongs on this list.

Another good vegan book is Eating Animals.

I imagine I would also have been enthralled by a book like Soonish about emerging technologies (it hasn’t come out yet).

I have a fully-formed EA board game that I debuted at EA Global in San Francisco a couple weeks ago. EAs seem to really like it! You can see over one hundred of the game's cards here https://drive.google.com/open?id=0Byv0L8a24QNJeDhfNFo5d1FhWHc

The way the game works is that every player has a random private morality that they want to satisfy (e.g. preference utilitarianism, hedonism, sadism, nihilism) and all players also want to collaboratively achieve normative good (accumulating 1000 human QALYs, 10,000 animal QALYs, and 10 x-risk points). Players get QALYs and x-risk points by donating to charities and answering trivia questions.
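For anyone thinking about a digital version, the shared win condition could be tracked with something like the sketch below. Only the three targets come from the description above; the class, its field names, and the reading of "accumulating" as "at least" are assumptions.

```python
from dataclasses import dataclass

# Targets quoted in the description above.
HUMAN_QALY_TARGET = 1_000
ANIMAL_QALY_TARGET = 10_000
X_RISK_TARGET = 10


@dataclass
class SharedScore:
    """Hypothetical running total shared by all players."""

    human_qalys: int = 0
    animal_qalys: int = 0
    x_risk_points: int = 0

    def add(self, human: int = 0, animal: int = 0, x_risk: int = 0) -> None:
        """Record gains from a donation or a trivia answer."""
        self.human_qalys += human
        self.animal_qalys += animal
        self.x_risk_points += x_risk

    def normative_good_achieved(self) -> bool:
        """Assumes all three targets must be met or exceeded."""
        return (
            self.human_qalys >= HUMAN_QALY_TARGET
            and self.animal_qalys >= ANIMAL_QALY_TARGET
            and self.x_risk_points >= X_RISK_TARGET
        )


if __name__ == "__main__":
    score = SharedScore()
    score.add(human=1_000, animal=10_000)
    print(score.normative_good_achieved())  # x-risk target not yet met
    score.add(x_risk=10)
    print(score.normative_good_achieved())
```

Private moralities are left out on purpose, since each one would need its own scoring rule.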

The coolest part of the game is the reincarnation mechanic: every player has a randomly chosen income taken from the real-world global distribution of wealth. Players also unlock animal reincarnation mode after stumbling upon the bad giant pit of suffering (the modal outcome of unlocking animal reincarnation is to be stuck as a chicken until the pit of suffering is destroyed, or until a friendly human acquires a V(eg*n) card.)

I'm also thinking about turning the game into an app or computer game, but I'll probably need an experienced coder to help me with that.