Hello Effective Altruism Forum, I am Nate Soares, and I will be here to answer your questions tomorrow, Thursday the 11th of June, 15:00-18:00 US Pacific time. You can post questions here in the interim.

Last week Monday, I took the reins as executive director of the Machine Intelligence Research Institute. MIRI focuses on studying technical problems of long-term AI safety. I'm happy to chat about what that means, why it's important, why we think we can make a difference now, what the open technical problems are, how we approach them, and some of my plans for the future.

I'm also happy to answer questions about my personal history and how I got here, or about personal growth and mindhacking (a subject I touch upon frequently in my blog, Minding Our Way), or about whatever else piques your curiosity. This is an AMA, after all!

EDIT (15:00): All right, I'm here. Dang there are a lot of questions! Let's get this started :-)

EDIT (18:00): Ok, that's a wrap. Thanks, everyone! Those were great questions.

Comments (128)

What are some of the most neglected sub-tasks of reducing existential risk? That is, what is no one working on which someone really, really should be?

Policy work / international coordination. Figuring out how to build an aligned AI is only part of the problem. You also need to ensure that an aligned AI is built, and that’s a lot harder to do during an international arms race. (A race to the finish would be pretty bad, I think.)

I’d like to see a lot more people figuring out how to ensure global stability & coordination as we enter a time period that may be fairly dangerous.

0
capybaralet
Nailed it. (anyone have) any suggestions for how to make progress in this area?
0
gfloyd
Hi from Melbourne, Oz, Nate! Law and ethics research interests me; e.g., AI in drone aircraft and self-driving cars has huge potential for harm. What 'formal' inputs does MIRI have to 'global' governments to ensure that relevant expertise sets the 'ethics-risks' framework? Ryan Calo proposes a Federal Robotics Commission: http://www.brookings.edu/research/reports2/2014/09/case-for-federal-robotics-commission Glenn Floyd www.reachers.org

What is the top thing you think you'll do differently now that you're Executive Director?

What do you think is the biggest mistake MIRI has made in its past? How have you learned from it?

What do you think has been the biggest success MIRI has had? How have you learned from that?

5
So8res
(1) Things Executive!Nate will do differently from Researcher!Nate? Or things Nate!MIRI will do differently from Luke!MIRI? For the former, I'll be thinking lots more about global coordination & engaging with interested academics etc, and lots less about specific math problems. For the latter, the biggest shift is probably going to be something like "more engagement with the academic mainstream," although it's a bit hard to say: Luke probably would have pushed in that direction too, after growing the research team a bit. (I have a lot of opportunities available to me that weren't available to Luke at this time last year.)
(2) The old SIAI definitely made some obvious mistakes; see e.g. Holden Karnofsky’s 2012 critique. Luke tried to transfer a number of the lessons learned to me, but it remains to be seen whether I actually learned them :-) The concrete list includes things like (a) constantly drive to systematize, automate, and outsource the busywork; (b) always attack the biggest constraint (by contrast, most people seem to have a default mode of "try and do everything that meets a certain importance level"); (c) put less emphasis on explicit models that you've built yourself and more emphasis on advice from others who have succeeded in doing something similar to what you're trying to do.
(3) MIRI played a pretty big role in getting long-term AI alignment issues onto the world stage. There are lots and lots of things I've learned from that particular success. Perhaps the biggest is "don't disregard intellectual capital."

What metrics does MIRI use to internally measure its own success?

9
So8res
(1) number of FAIs produced ;-)
Other important metrics include:
* number of agent foundations forum posts produced
* number of papers written
* number of papers published in conferences/journals
* number of papers published in high-prestige conferences/journals (a fuzzy metric)
* number of conferences attended
* number of collaborative papers written
* number of research associates
* number of people who have attended a workshop
* number of non-MIRI-employees who have produced a technical result
* amount of progress on core technical problems (a very fuzzy metric, which is why it’s important to also track the more concrete numbers above)
* size of research team
I also of course keep my eye on "number of dollars available."

Congrats on the new position!

My question: what advances does MIRI hope to achieve in the next 5 years?

Short version: FAI. (You said "hope", not "expect" :-p)

Longer version: Hard question, both because (a) I don't know how you want me to trade off between how nice the advance would be and how likely we are to get it, and (b) my expectations for the next five years are very volatile. In the year since Nick Bostrom released Superintelligence, there has been a huge wave of interest in the future of AI (due in no small part to the efforts of FLI and their wonderful Puerto Rico conference!), and my expectations of where I'll be in five years ... (read more)

What question should we be asking you?

5
So8res
I don't even know what that word means ;-)
3
RyanCarey
Haha, what useful and interesting question are we missing?

Which uncertainties about the trajectory to AI do you regard as of key strategic importance?

9
So8res
(a) how many major insights remain between us and strong AI? (b) how many of those insights will come from thinking hard, and how many will come from examining the brain? (c) how many more AI winters will there be? (d) how far ahead will the frontrunner be? (e) will there be an arms race?, to name a few.

Working without concrete feedback, how are you planning on increasing the chance that MIRI's work will be relevant to the AI developers of the future?

That’s a good question: we don’t have a practical AGI to poke at, so why do we expect that we can do work today that’s likely to be relevant many years down the line?

I’ll answer in part with an analogy: Say you went back in time and dropped by to visit Kolmogorov back when he was trying to formalize probability theory, and you asked "working without concrete feedback, how are you planning to increase the chance that your probability theory will be relevant to people trying to reason probabilistically in the future?" It seems like the best response is for him to sort of cock his head and say "well, uh, I’m still trying to formalize what I mean by "chance" and "probability" and so on; once we’ve got those things ironed out, then we can chat."

Similarly, we’re still trying to formalize the theory of advanced agents: right now, if you handed me unlimited computing power, I wouldn’t know how to program it to reliably and "intelligently" pursue a known goal, even a very simple goal, such as "produce as much diamond as possible." There are parts of the problem of designing highly reliable advanced agents that we don’t understand e... (read more)

Asking on behalf of Daniel Satanove, former intern at MIRI (summer 2014):

What do other people who are concerned with AI safety (e.g., Elon Musk, Bill Gates, Stuart Russell, etc.) think the path to friendly AI is? Are there other people who are working directly on Friendly AI research other than MIRI?

7
So8res
(1) I don't want to put words in their mouths. I'm guessing that most of us have fairly broad priors over what may happen, though. The future's hard to predict. (2) Depends what you mean by "Friendly AI research." Does AI boxing count? Does improving the transparency of ML algorithms count? Once the FLI grants start going through, there will be lots of people doing long-term AI safety research that may well be useful, so if you count that as FAI research, then the answer is "there will be soon." But if by "FAI research" you mean "working towards a theoretical understanding of highly reliable advanced agents," then the answer is "not to my knowledge, no."

Welcome, everybody!

Nate: on behalf of the EA community, thanks very much for showing up here. I think I speak for a lot of EAs when I say that since MIRI has such ambitious goals, it's really valuable to keep things grounded with open conversations about why you're doing what you're doing, and how it's turning out. So I think you've already won a lot of respect by making yourself available to answer questions here! Rest assured, you're not expected to answer every single question!

Everyone else: feel free to ask more questions in the next couple of hours, and to comment, and to upvote the questions you find the most interesting. We're lucky to have Nate around, so enjoy! :)

2
RyanCarey
Wow, cheers for answering over 30 questions here, Nate! What a heroic effort. Thanks for your questions everybody. That is a LOT of reading to go through for anyone interested in this problem, and plenty of interesting thoughts to be absorbed. If EAs want to support MIRI financially or with relevant technical skills, it's good that they now know more about what research they would be helping, and have an idea about the kind of person who will be leading it. Here is a link to how to get involved with MIRI as a researcher or donor: https://intelligence.org/get-involved/ Thanks very much Nate, and on behalf of the EA community, good luck in the new job!

How does MIRI plan to interface with important AI researchers that disagree with key pieces in the argument for safety?

6
So8res
There's a big spectrum, there. Some people think that no matter what the AI does that's fine because it's our progeny (even if it turns as much matter as it can into a giant computer so it can find better YouTube recommendations). Other people think that you can't actually build a superintelligent paperclip maximizer (because maximizing paperclips would be stupid, and we're assuming that it's intelligent). Other people think that yeah, you don't get good behavior by default, but AI is hundreds and hundreds of years off, so we don't need to start worrying now. Other people think that AI alignment is a pressing concern now but that improving our theoretical understanding of what we're trying to do isn't the missing puzzle piece. I interface with each of these different types of people in very different ways. To actually answer your question, though, the default interface is "publish papers, attend conferences," with a healthy dose of "talk to people in person when they're in town" mixed in :-)

1) I see a trend in the way new EAs concerned about the far future think about where to donate money, and it seems dangerous. It goes:

I am an EA and care about impactfulness and neglectedness -> Existential risk dominates my considerations -> AI is the most important risk -> Donate to MIRI.

The last step frequently involves very little thought; it borders on a cached thought.

How would you be conceiving of donating your X-risk money at the moment if MIRI did not exist? Which other researchers or organizations should be being scrutinized by donors who are X-risk concerned, and AI persuaded?

5
So8res
1) Huh, that hasn't been my experience. We have a number of potential donors who ring us up and ask who in AI alignment needs money the most at the moment. (In fact, last year, we directed a number of donors to FHI, who had much more of a funding gap than MIRI did at that time.)
2) If MIRI disappeared and everything else was held constant, then I'd be pretty concerned about the lack of people focused on the object level problems. (I'll talk more about why I think this is so important in a little bit; I'm pretty sure at least one other person asks that question more directly.) There'd still be a few people working on the object level problems (Stuart Russell, Stuart Armstrong), but I'd want lots more. In fact, that statement is also true in the actual world! We only have three people on the research team right now, remember, with a fourth joining in August. In other words, if you were to find yourself in a world like this one except without a MIRI, then I would strongly suggest building something like a MIRI :-)

It seems easy to imagine scenarios where MIRI's work is either irrelevant (e.g., mainstream AI research keeps going in a neuromorphic or heuristic trial-and-error direction and eventually "succeeds" that way) or actively harmful (e.g., publishes ideas that eventually help others to build UFAIs). I don't know how to tell whether MIRI's current strategy overall has positive expected impact. What's your approach to this problem?

9
So8res
All right, I'll come back for one more question. Thanks, Wei. Tough question. Briefly, (1) I can't see that many paths to victory. The only ones I can see go through either (a) aligned de-novo AGI (which needs to be at least powerful enough to safely prevent maligned systems from undergoing intelligence explosions) or (b) very large amounts of global coordination (which would be necessary to either take our time & go cautiously, or to leap all the way to WBE without someone creating a neuromorph first). Both paths look pretty hard to walk, but in short, (a) looks slightly more promising to me. (Though I strongly support any attempts to widen path (b)!) (2) It seems to me that the default path leads almost entirely to UFAI: insofar as MIRI research makes it easier for others to create UFAI, most of that effect isn't replacing wins with losses, it's just making the losses happen sooner. By contrast, this sort of work seems necessary in order to keep path (a) open. I don't see many other options. (In other words, I think it's net positive because it creates some wins and moves some losses sooner, and that seems like a fair trade to me.) To make that a bit more concrete, consider logical uncertainty: if we attain a good formal understanding of logically uncertain reasoning, that's quite likely to shorten AI timelines. But I think I'd rather have a 10-year time horizon and be dealing with practical systems built upon solid foundations that come from a decade's worth of formally understanding what good logically uncertain reasoning looks like, rather than a 20-year time horizon where we have to deal with systems built using 19 years of hacks and 1 year of patches bolted on at the end. (In other words, the possibility of improving AI capabilities is the price you have to pay to keep path (a) open.) A bunch of other factors also play into my considerations (including a heuristic which says "the best way to figure out which problems are the real problems is to start sol

In the past, people like Eliezer Yudkowsky (see 1, 2, 3, 4, 5) have argued that MIRI has a medium probability of success.

What is this probability estimate based on and how is success defined?

(Note that I've asked this before, but I'm curious for more perspective.)

To what degree is MIRI now restricted by lack of funding, and is there any amount of funding beyond which you could not make effective use of it?

Among recruiting new talent and having funding for new positions, what is the greatest bottleneck?

7
So8res
Right now we’re talent-constrained, but we’re also fairly well-positioned to solve that problem over the next six months. Jessica Taylor is joining us in August. We have another researcher or two pretty far along in the pipeline, and we’re running four or five more research workshops this summer, and CFAR is running a summer fellows program in July. It’s quite plausible that we’ll hire a handful of new researchers before the end of 2015, in which case our runway would start looking pretty short, and it’s pretty likely that we’ll be funding constrained again by the end of the year.
3
mhpage
A modified version of this question: Assuming MIRI's goal is saving the world (and not MIRI), at what funding level would MIRI recommend giving elsewhere, and where would it recommend giving?
1
So8res
I’m not sure how to interpret this question: are you asking how much money I'd like to see dumped on other people? I’d like to see lots of money dumped on lots of other people, and for now I’m going to delegate to the GiveWell, Open Philanthropy Project, and Good Ventures folks to figure out who and how much :-)
1
Diffractor
I think they mean "what is the quantity of funding at MIRI which would cause a shift in the best marginal use of money, and what organization would it switch to." mhpage, if this is not what you mean, let me know.
1
RobBensinger
I'm not sure what the answer to this is going forward, but relevant things Nate said in response to other questions on this page: +
0
mhpage
Indeed, that is what I meant. I was assuming that MIRI's position is that it presently is the most-effective recipient of funds, but that assumption might not be correct (which would itself be quite interesting).
  1. What are your plans for taking MIRI to the next level? What is the next level?

  2. Now that MIRI is focused on math research (a good move) and not on outreach, there is less of a role for volunteers and supporters. With the donation from Elon Musk, some of which will presumably get to MIRI, the marginal value of small donations has gone down. How do you plan to keep your supporters engaged and donating? (The alternative, which is perhaps feasible, could be for MIRI to be an independent research institution, without a lot of public engagement, funded by a few big donors.)

6
So8res
1. (a) grow the research team, (b) engage more with mainstream academia. I'd also like to spend some time experimenting to figure out how to structure the research team so as to make it more effective (we have a lot of flexibility here that mainstream academic institutes don't have). Once we have the first team growing steadily and running smoothly, it's not entirely clear whether the next step will be (c.1) grow it faster or (c.2) spin up a second team inside MIRI taking a different approach to AI alignment. I'll punt that question to future-Nate.
2. So first of all, I'm not convinced that there's less of a role for supporters. If we had just ten people earning-to-give at the (amazing!) level of Ethan Dickinson, Jesse Liptrap, Mike Blume, or Alexei Andreev (note: Alexei recently stopped earning-to-give in order to found a startup), that would bring in as much money per year as the Thiel Foundation. (I think people often vastly overestimate how many people are earning-to-give to MIRI, and underestimate how useful it is: the small donors taken together make a pretty big difference!) Furthermore, if we successfully execute on (a) above, then we're going to be burning through money quite a bit faster than before. An FLI grant (if we get one) will certainly help, but I expect it's going to be a little while before MIRI can support itself on large donations & grants alone. As for how I plan to keep supporters engaged & donating, I don't expect it will be that much of a problem: I think that many of our donors are excited to see us publish peer-reviewed papers, attend conferences, and engage in the ongoing global conversation. It's hard for me to say for sure, but it seems quite likely that the last year has been much more exciting for MIRI donors than the previous few years, even though there was no Singularity Summit and most of our output was math.
0
Ervin
Any links on this?
2
RyanCarey
https://intelligence.org/2014/06/11/mid-2014-strategic-plan/

What's your response to Peter Hurford's arguments in his article Why I'm Skeptical Of Unproven Causes...?

That post mixes a bunch of different assertions together; let me try to distill a few of them out and answer them in turn:


One of Peter's first (implicit) points is that AI alignment is a speculative cause. I tend to disagree.

Imagine it's 1942. The Manhattan Project is well under way, Leo Szilard has shown that it's possible to get a neutron chain reaction, and physicists are hard at work figuring out how to make an atom bomb. You suggest that this might be a fine time to start working on nuclear containment, so that, once humans are done bombing the everloving breath out of each other, they can harness nuclear energy for fun and profit. In this scenario, would nuclear containment be a "speculative cause"?

There are currently thousands of person-hours and billions of dollars going towards increasing AI capabilities every year. To call AI alignment a "speculative cause" in an environment such as this one seems fairly silly to me. In what sense is it speculative to work on improving the safety of the tools that other people are currently building as fast as they can? Now, I suppose you could argue that either (a) AI will never work or (b) it will be safe by defaul... (read more)

4
John_Maxwell
Great article. My thoughts: The smallpox vaccine was the first ever vaccine... a highly unproven cause. This site says it saved over half a billion lives. If there was an EA movement when Edward Jenner was alive hundreds of years ago, would it have sensibly advised Jenner to work on a different project because the idea of vaccines was an unproven one? Note that most of the top lifesavers on ScienceHeros.com did research work, which is an inherently unprovable cause, but managed to save many more lives than a person donating to Givewell's top charities can expect to save. Of course, scientific research can also backfire and cost lives. So one response to this might be to say: "scientific research is an unproven cause that's hard to know the sign of, so we should ignore scientific research in favor of proven causes". But to me this sounds like a head-in-the-sand approach. Scientific research is going to be by far the most significant bit affecting the future of life on Earth. I would rather see the EA movement try to develop tools to get better at predicting science impacts, or at least save money to nudge science when it's more clear what impacts it might have.
3
Peter Wildeford
I regret talking mainly about what is "unproven" when I really meant to talk about what (a) has tight feedback loops and (b) is approached experimentally. See the clarification in http://lesswrong.com/lw/ic0/where_ive_changed_my_mind_on_my_approach_to/ I think MIRI can fit this description in some ways (I'm particularly excited about the AI Impacts blog), but it doesn't in other ways.
5
John_Maxwell
What do you think of the stability under self-modification example in this essay? I haven't taken the time to fully understand MIRI's work. But my reading is that MIRI's work is incremental without being empirical--like most people working in math & theoretical computer science, they are using proofs to advance their knowledge rather than randomized controlled trials. So this might meet the "tight feedback loops" criterion without meeting the "approached experimentally" criterion. BTW, you might be interested in this comment of mine about important questions for which it's hard to gather relevant experimental data. Here are some related guesses of mine if anyone is interested: The importance of the far future is so high that there's nothing to do but bite the bullet and do the best we can to improve it. MIRI represents a promising approach to improving the far future, but it shouldn't be the only approach we investigate. For example, I would like to see an organization that attempted to forecast a broad variety of societal and technological trends, predict how they'll interact, and try to identify the best spots to apply leverage. The first thing to do is to improve our competency at predicting the future in general. The organization I describe could evolve out of a hedge fund that learned to generate superior returns through long-term trading, for instance. The approach to picking stocks that Charlie Munger, Warren Buffett's partner, describes in Poor Charlie's Almanack sounds like the sort of thing that might work for predicting other aspects of how the future will unfold. Munger reads a ton of books and uses a broad variety of mental frameworks to try to understand the assets he evaluates (more of a fox than a hedgehog). (Interesting to note that the Givewell founders are ex-employees of Bridgewater, one of the world's top hedge funds.) A meta-level approach to predictions: push for the legalization of prediction markets that would let us aggregate the vie
1
RomeoStevens
I've never understood this argument. There has always been a latent incentive to off CEOs or destroy infrastructure and trade on the resulting stock price swings. In practice this is very difficult to pull off. Prediction markets would be under more scrutiny and thus harder to game in this manner. To take a step back, this objection is yet another example of one that gets trotted out against prediction markets all the time but which has been addressed in the white papers on the topic.

1) Your current technical agenda involves creating a math of logical uncertainty and forming world-models out of this. When (if possible) do you predict that such a math will be worked out, and will MIRI's focus move to the value learning problem then?

2) How long do you estimate that formal logic will be the arena in which MIRI's technical work takes place - that is, how long will knowing formal logic be of use to a potential researcher before the research moves to new places?

5
So8res
(1) That's not quite how I'd characterize the current technical agenda. Rather, I'd say that in order to build an AI aligned with human interests, you need to do three things: (a) understand how to build an AI that's aligned with anything (could you build an AI that reliably builds as much diamond as possible?), (b) understand how to build an AI that assists you in correcting things-you-perceive-as-flaws (this doesn't come for free, but it's pretty important, because humans are bad at getting software right on the first try), and (c) figure out how to build a machine that can safely learn human values & intentions from training data. We're currently splitting our time between all these problems. It's not that we haven't focused on the value learning problem yet; rather, it's that the value learning problem is only a fraction of the whole problem. We'll keep working on all the parts, and I'm not sure which parts will yield first. I can't give you a timeline on how long various parts will take; scientific progress is very hard to predict.
(2) I wouldn't currently say that "formal logic is the arena in which MIRI's technical work takes place" -- if anything, "math in general" is the arena, and that will probably remain the case until we have a much better understanding of the problems we're trying to solve (and how to solve simplified versions of them), at which point computer programming will become much more essential. Again, it's hard to say how long it will take to get there, because scientific progress is hard to predict. Formal logic is one of many tools useful in mathematics (alongside probability theory, statistics, linear algebra, etc.) that shows up fairly frequently in our work, but I don’t think of our work as "focused on formal logic." I don't think we'll "move away from formal logic" at a particular time; rather, we'll just use whichever mathematical tools look useful for the problems at hand. That will change as the problems change :-)
2
Ben Pace
Thank you for the response; it was helpful :^)
2
Paul_Crowley
It seems a bit like the question behind the question might be "I'd like to help, but I don't know formal logic, when will that stop being a barrier". In which case it's worth saying that I'm attending a MIRI decision theory workshop at the moment, and I don't really know formal logic, but it isn't proving too much of a barrier; I can think about the assertion "Suppose PA proves that A implies B" without really understanding exactly what PA is.

Hi Nate,

Thanks for the AMA. I’m most curious as to what MIRI’s working definition is for what has intrinsic value. The core worry of MIRI has been that it’s easy to get the AI value problem wrong, to build AIs that don’t value the correct thing. But how do we humans get the value problem right? What should we value?

Max Tegmark alludes to this in Friendly Artificial Intelligence: the Physics Challenge:

Quantum effects aside, a truly well-defined goal would specify how all particles in our Universe should be arranged at the end of time. [But] what particle

... (read more)
3
So8res
We don't have a working definition of "what has intrinsic value." My basic view on these hairy problems ("but what should I value?") is that we really don't want to be coding in the answer by hand. I'm more optimistic about building something that has a few layers of indirection, e.g., something that figures out how to act as intended, rather than trying to transmit your object-level intentions by hand. In the paper you linked, I think Max is raising a slightly different issue. He's talking about what we would call the ontology identification problem. Roughly, imagine building an AI system that you want to produce lots of diamond. Maybe it starts out with an atomic model of the universe, and you (looking at its model) give it a utility function that scores one point per second for every carbon atom covalently bound to four other carbon atoms (and then time-discounts or something). Later, the system develops a nuclear model of the universe. You do want it to somehow deduce that carbon atoms in the old model map onto six-proton atoms in the new model, and maybe query the user about how to value carbon isotopes in its diamond lattice. You don't want it to conclude that none of these six-proton nuclei pattern-match to "true carbon", and then turn the universe upside down looking for some hidden cache of "true carbon." We have a few different papers that mention this problem, albeit shallowly: Ontological Crises in Artificial Agents' Value Systems, The Value Learning Problem, Formalizing Two Problems of Realistic World-Models. There's a lot more work to be done here, and it's definitely on our radar, though also note that work on this problem is at least a little blocked on attaining a better understanding of how to build multi-level maps of the world.
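(Editorial illustration, not part of the original answer.) The diamond example can be made concrete with a toy sketch. Everything below is an assumption made for illustration: the world is a handful of labelled atoms plus a set of bonds, one list entry stands for one second, and the names `diamond_score`, `discounted_utility`, and `translate_nuclear_to_atomic` are hypothetical. It only shows how a utility function written against one ontology silently scores zero under another until someone supplies the mapping between them.

```python
from typing import Dict, FrozenSet, Hashable, List, Set, Tuple

Bond = FrozenSet[int]                      # an unordered pair of atom ids
World = Tuple[Dict[int, Hashable], Set[Bond]]


def diamond_score(labels: Dict[int, Hashable], bonds: Set[Bond]) -> int:
    """One point per atom labelled "C" that is bonded to exactly four atoms labelled "C"."""
    score = 0
    for atom, label in labels.items():
        if label != "C":
            continue
        carbon_neighbours = 0
        for bond in bonds:
            if atom in bond:
                (other,) = bond - {atom}   # the atom at the other end of the bond
                if labels[other] == "C":
                    carbon_neighbours += 1
        if carbon_neighbours == 4:
            score += 1
    return score


def discounted_utility(trajectory: List[World], discount: float = 0.99) -> float:
    """Time-discounted sum of per-step diamond scores (one step = one second, by assumption)."""
    return sum((discount ** t) * diamond_score(labels, bonds)
               for t, (labels, bonds) in enumerate(trajectory))


def translate_nuclear_to_atomic(protons: Dict[int, int]) -> Dict[int, str]:
    """Hypothetical ontology map: six-proton nuclei in the new model count as "C"."""
    return {atom: ("C" if count == 6 else "other") for atom, count in protons.items()}


# The same five-atom cluster described in two ontologies.
bonds = {frozenset({0, n}) for n in (1, 2, 3, 4)}
atomic = {atom: "C" for atom in range(5)}    # old model: element symbols
nuclear = {atom: 6 for atom in range(5)}     # new model: proton counts

print(diamond_score(atomic, bonds))                                # 1
print(diamond_score(nuclear, bonds))                               # 0: no "true carbon" found
print(diamond_score(translate_nuclear_to_atomic(nuclear), bonds))  # 1
```

The hand-written translation function is exactly the part that the ontology identification problem asks the system to work out (or ask about) for itself.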
1
Alex_Altair
That diamond/carbon scenario is an excellent concrete example of the ontology problem.

What is your AI arrival timeline? Once we get AI, how quickly do you think it will self-improve? How likely do you think it is that there will be a singleton vs. many competing AIs?

4
So8res
(1) Eventually. Predicting the future is hard. My 90% confidence interval conditioned on no global catastrophes is maybe 5 to 80 years. That is to say, I don't know. (2) I fairly strongly expect a fast takeoff. (Interesting aside: I was recently at a dinner full of AI scientists, some of them very skeptical about the whole long-term safety problem, who unanimously professed that they expect a fast takeoff -- I'm not sure yet how to square this with the fact that Bostrom's survey showed fast takeoff was a minority position). It seems hard (but not impossible) to build something that's better than humans at designing AI systems & has access to its own software and new hardware, which does not self-improve rapidly. Scenarios where this doesn't occur include (a) scenarios where the top AI systems are strongly hardware limited; (b) scenarios where all operators of all AI systems successfully remove all incentives to self-improve; or (c) the first AI system is strong enough to prevent all intelligence explosions, but is also constructed such that it does not itself self-improve. The first two scenarios seem unlikely from here; the third is more plausible (if the frontrunners explicitly try to achieve it) but still seems like a difficult target to hit. (3) I think we're pretty likely to eventually get a singleton: in order to get a multi-polar outcome, you need to have a lot of systems that are roughly at the same level of ability for a long time. That seems difficult but not impossible. (For example, this is much more likely to happen if the early AGI designs are open-sourced and early AGI algorithms are incredibly inefficient such that progress is very slow and all the major players progress in lockstep.) Remember that history is full of cases where a better way of doing things ends up taking over the world -- humans over the other animals, agriculture dominating hunting & gathering, the Brits, industrialization, etc. (Agriculture and arguably industrialization emerg
3
AlexMennen
Perhaps the first of them to voice a position on the matter expected a fast takeoff and was held in high regard by the others, so they followed along, having not previously thought about it?
0
RyanCarey
Couldn't it be that the returns on intelligence tend to not be very high for a self-improving agent around the human area? Like, it could be that modifying yourself when you're human-level intelligent isn't very useful, but that things really take off at 20x the human level. That would seem to suggest a possible d) the first superhuman AI system self-improves for some time and then peters out. More broadly, the suggestion is that since the machine is presumably not yet superintelligent, there might be relevant constraints other than incentives and hardware. Plausible or not?
4
So8res
Seems unlikely to me, given my experience as an agent at roughly the human level of intelligence. If you gave me a human-readable version of my source code, the ability to use money to speed up my cognition, and the ability to spawn many copies of myself (both to parallelize effort and to perform experiments with) then I think I'd be "superintelligent" pretty quickly. (In order for the self-improvement landscape to be shallow around the human level, you'd need systems to be very hardware-limited, and hardware currently doesn't look like the bottleneck.) (I'm also not convinced it's meaningful to talk about "the human level" except in a very broad sense of "having that super powerful domain generality that humans seem to possess", so I'm fairly uncomfortable with terminology such as "20x the human level.")

(1) What is the probability of mankind, or a "good" successor species we turn into, surviving for the next 1000 years? (2) What is the probability of MIRI being the first organization to create an AGI smart enough to, say, be better at computer programming than any human?

6
So8res
(1) Not great. (2) Not great. (To be clear, right now, MIRI is not attempting to build an AGI. Rather, we're working towards a better theoretical understanding of the problem.)

1) Which are the implicit assumptions, within MIRI's research agenda, of things that "currently we have absolutely no idea of how to do that, but we are taking this assumption for the time being, and hoping that in the future either a more practical version of this idea will be feasible, or that this version will be a guiding star for practical implementations"?

I mean things like

  • UDT assumes it's ok for an agent to have a policy ranging over all possible environments and environment histories

  • The notion of agent used by MIRI assumes to some ex

...
7
So8res
1) The things we have no idea how to do aren't the implicit assumptions in the technical agenda, they're the explicit subject headings: decision theory, logical uncertainty, Vingean reflection, corrigibility, etc :-) We've tried to make it very clear in various papers that we're dealing with very limited toy models that capture only a small part of the problem (see, e.g., basically all of section 6 in the corrigibility paper). Right now, we basically have a bunch of big gaps in our knowledge, and we're trying to make mathematical models that capture at least part of the actual problem -- simplifying assumptions are the norm, not the exception. All I can easily say is that common simplifying assumptions include: you have lots of computing power, there is lots of time between actions, you know the action set, you're trying to maximize a given utility function, etc. Assumptions tend to be listed in the paper where the model is described. 2) The FLI folks aren't doing any research; rather, they're administering a grant program. Most FHI folks are focused more on high-level strategic questions (What might the path to AI look like? What methods might be used to mitigate xrisk? etc.) rather than object-level AI alignment research. And remember that they look at a bunch of other X-risks as well, and that they're also thinking about policy interventions and so on. Thus, the comparison can't easily be made. (Eric Drexler's been doing some thinking about the object-level FAI questions recently, but I'll let his latest tech report fill you in on the details there. Stuart Armstrong is doing AI alignment work in the same vein as ours. Owain Evans might also be doing object-level AI alignment work, but he's new there, and I haven't spoken to him recently enough to know.) Insofar as FHI folks would say we're making assumptions, I doubt they'd be pointing to assumptions like "UDT knows the policy set" or "assume we have lots of computing power" (which are obviously simplifying assu

Asking for a friend:

"What would it take to get hired by MIRI, if not in a capacity as a researcher? What other ways can I volunteer to help MIRI, operationally or otherwise?"

4
So8res
We're actually going to be hiring a full-time office manager soon: someone who can just Make Stuff Happen and free up a lot of our day-to-day workload. Keep your eyes peeled, we'll be advertising the opening soon. Additionally, we're hurting for researchers who can write fast & well, and before too long we'll be looking for a person who can stay up to speed on the technical research but spend most of their time doing outreach and stewarding other researchers who are interested in doing AI alignment research. Both of these jobs would require a bit less technical ability than is required to make new breakthroughs in the field.

Many years ago, SIAI's outlook seemed to be one of desperation - the world was mad, and probably doomed. Only nine coders, locked in a basement, could save it. Now things seem much more optimistic, the Powers That Be are receptive to AGI risk, and MIRI's job is to help understand the issues. Is this a correct impression? If so, what caused the change?

It appears that the phrase "Friendly AI research" has been replaced by "AI alignment research". Why was that term picked?

3
So8res
Luke talks about the pros and cons of various terms here. Then, long story short, we asked Stuart Russell for some thoughts and settled on "AI alignment" (his suggestion, IIRC).

How does existential risk affect you emotionally? If negatively, how do you cope?

What do you think of popular portrayals of AI-risk in general? Do you think there's much of a point in trying to spread broad awareness of the issue? Or do you think that any such efforts ultimately do more harm than good, and that we should try to keep AI-risk more secretive?

For example, are things like Ex Machina, which doesn't really present the full AI argument, but does make it obvious that AI is a risk, or Wait But Why's AI posts good?

Thanks!

What is your stance on whole brain emulation as a path to a positive singularity?

3
So8res
Hard to get there. Highly likely that we get to neuromorphic AI along the way. (Low-fidelity images or low-speed partial simulations are likely very useful for learning more about intelligence, and I currently expect that the caches of knowledge unlocked on the way to WBE probably get you to AI before the imaging/hardware supports WBE.)

What are your biggest flaws, skill gaps, areas to grow?

Hi, I'm a software developer with good knowledge of basic algorithms and machine learning techniques. What mathematics and computer science fields should I learn to be able to make a significant impact in solving the AGI problem?

2
So8res
Great question! I suggest checking out either our research guide or our technical agenda. The first is geared towards students who are wondering what to study in order to eventually gain the skills to be an AI alignment researcher, the latter is geared more towards professionals who already have the skills and are wondering what the current open problems are. In your case, I'd guess maybe (1) get some solid foundations via either set theory or type theory, (2) get solid foundations on AI, perhaps via AI: A Modern Approach, (3) brush up on probability theory, formal logic, and causal graphical models, and then (4) dive into the technical agenda and figure out which open problems pique your interest.

Let's assume that an AGI is, indeed, created sometime in the future. Let us also assume that MIRI achieves its goal of essentially protecting us from the existential dangers that stem from it. My question may well be quite naive, but how likely is it for a totalitarian "New World Order" to seize control of said AGI and use it for their own purposes, deciding who gets to benefit from it and to what degree?
This is something I myself get asked a lot, and while it takes into account the current state of society, which probably looks nothing like future ones will, I can't seem to properly reject it as a possibility.

0
RobBensinger
I wouldn't reject it as a possibility. MIRI wants AGI to have good consequences for human freedom, happiness, etc., but any big increase in power raises the risk that the power will be abused. Ideally we'd want the AI to resist being misused, but there's a tradeoff between 'making the AI more resistant to misuse by its users (when the AI is right and the user is wrong)' and 'making the AI more amenable to correction by its users (when the AI is wrong and the user is right).' I wouldn't say it's inevitable either, though. It doesn't appear to me that past technological growth has tended to increase how totalitarian the average state is.

Do you think a fast takeoff is more likely?

1
So8res
Than a slow takeoff? Yes :-)

What are MIRI's plans for publication over the next few years, whether peer-reviewed or arxiv-style publications?

More specifically, what are the a) long-term intentions and b) short-term actual plans for the publication of workshop results, and what kind of priority does that have?

4
So8res
Great question! The short version is, writing more & publishing more (and generally engaging with the academic mainstream more) are very high on my priority list. Mainstream publications have historically been fairly difficult for us, as until last year, AI alignment research was seen as fairly kooky. (We've had a number of papers rejected from various journals due to the "weird AI motivation.") Going forward, it looks like that will be less of an issue. That said, writing capability is a huge bottleneck right now. Our researchers are currently trying to (a) run workshops, (b) engage with & evaluate promising potential researchers, (c) attend conferences, (d) produce new research, (e) write it up, and (f) get it published. That's a lot of things for a three-person research team to juggle! Priority number 1 is to grow the research team (because otherwise nothing will ever be unblocked), and we're aiming to hire a few new researchers before the year is through. After that, increasing our writing output is likely the next highest priority. Expect our writing output this year to be similar to last year's (i.e., a small handful of peer reviewed papers and a larger handful of technical reports that might make it onto the arXiv), and then hopefully we'll have more & higher quality publications starting in 2016 (the publishing pipeline isn't particularly fast).

Hi Nate!

Daniel Dewey at FHI outlined some strategies to mitigate existential risk from a fast take-off scenario here: http://www.danieldewey.net/fast-takeoff-strategies.pdf

I expect you to agree with the exponential decay model, if not – why?

I would also like your opinion on his four strategic categories, namely:

  • International coordination
  • Sovereign AI
  • AI-empowered project
  • Other decisive technological advantage

Thanks for your attention!

2
So8res
I mostly agree with Daniel's paper :-)
0
AlexLundborg
That was my guess :) To be more specific: do you (or does MIRI) have any preferences for which strategy to pursue, or is it too early to say? I get the sense from MIRI and FHI that aligned sovereign AI is the end goal. Thanks again for doing the AMA!
4
Daniel_Dewey
I am not Nate, but my view (and my interpretation of some median FHI view) is that we should keep options open about those strategies and as-yet unknown other strategies instead of fixating on one at the moment. There's a lot of uncertainty, and all of the strategies look really hard to achieve. In short, no strongly favored strategy. FWIW, I also think that most current work in this area, including MIRI's, promotes the first three of those goals pretty well.
3
Daniel_Dewey
Follow-up: this comment suggests that Nate weakly favors strategies 2 and/or 3 over 1.

Are you single? What are some strategic methods that would make one successful at seducing you? (I'm giving a very liberal interpretation to "I'm also happy to answer questions about (...) whatever else piques your curiosity" :P)

3
So8res
The most reliable strategy to date is "ask me" :-)

1) What was the length of time between you reading the sequences and doing research on the value alignment problem?

2) What portion of your time will now be spent on technical research? Also, what is Eliezer Yudkowsky spending most of his work-time on? Is he still writing up introductory stuff like he said in the HPMOR author notes?

3) What are any unstated pre-requisites for researching the value-alignment problem that aren't in MIRI's research guide? e.g. could include Real Analysis or particular types of programming ability

What is your best characterisation of Robin Hanson's arguments against FOOM, and what is your analysis of the strengths and weaknesses of his argument?

I remember reading that you had plans to change the world via economic/political influence, and then you realized that existential risk was more important. The same thing happened to me.

What was that experience like for you? How long did it take you to change your mind? Other thoughts?

What are some ways in which you've changed your mind? Recently, important things, things that come to mind, whatever you want.

What path did MIRI's staff take there? How many came from other charities?

Three questions:

1: As a past MIRI researcher, which one of the technical problems in the technical research agenda currently looks like the biggest pain in the ass/the one requiring the most lead time to solve?

2: When you become executive director, will that displace all of your research work, or will you still have a few thought cycles left over to contribute mathematically to workshops/do some part-time research?

3: My current life plan is "speedrun college in 3 years (mostly done), speedrun employment by living in a van and spending under 14k/year s...

Kieran Allen asks:

If we create an artificial intelligence, what right do we have to call it anything other than a life? Further, what right do we have to restrict it to benefiting humans? When a young couple created a man and named him Adolf Hitler, we had no right to restrict his life or actions, until he breached the law. Why should machines be any different?

1
RobBensinger
I'll take a stab at this question too. There are two different schools of thought about what the goal of AI as a field is. One is that the goal is to build a machine that can do everything humans can -- possibly including experiencing emotions and other conscious states. On this view, a "full AI" would plausibly be a person, deserving of moral rights like any other. The more common view within contemporary AI is that the goal of AI is to build machines that can effectively achieve a variety of practical goals in a variety of environments. Think Nate's Deep Blue example, but generalized: instead of steering arrangements of chess pieces on a board toward some goal state, a "full" AI steers arbitrary arrangements of objects in space toward some goal state. Such an AI might not be conscious or have real preferences; it might have "goals" only in the limited sense that Deep Blue has "goals." This is the kind of AI MIRI has in mind, and the kind we're trying to plan for: a system that can draw inferences from sensor inputs and execute effective plans, but not necessarily one that has more moral weight than Google's search engine algorithms do. If it turns out that you do need to make AI algorithms conscious in order to make them effective at scientific and engineering tasks, that does make our task a lot harder, because, yes, we'll have to take into account the AI's moral status when we're designing it, and not just the impact its actions have on other beings. For now, though, consciousness and intelligent behavior look like different targets, and there are obvious economic reasons why mainstream AI is likely to prioritize "high-quality decision making" over "emulating human consciousness." A better analogy to MIRI's goal than "we build Hitler and then put him in chains" is "we build a reasonably well-behaved child and teach the child non-Hitler-ish values." But both of those ways of thinking are still excessively anthropomorphized. A real-world AI, of the "high-quali
1
So8res
(1) I suspect it's possible to create an artificial system that exhibits what many people would call "intelligent behavior," and which poses an existential threat, but which is not sentient or conscious. (In the same way that Deep Blue wasn't sentient: it seems to me like optimization power may well be separable from sentience/consciousness.) That's no guarantee, of course, and if we do create a sentient artificial mind, then it will have moral weight in its own right, and that will make our job quite a bit more difficult. (2) The goal is not to build a sentient mind that wants to destroy humanity but can't. (That's both morally reprehensible and doomed to failure! :-p) Rather, the goal is to successfully transmit the complicated values of humanity into a powerful optimizer. Have you read Bostrom's The Superintelligent Will? Short version is, it looks possible to build powerful optimizers that pursue goals we might think are valueless (such as an artificial system that, via very clever long-term plans, produces extremely large amounts of diamond, or computes lots and lots of digits of pi). We'd rather not build that sort of system (especially if it's powerful enough to strip the Earth of resources and turn them into diamonds / computing power): most people would rather build something that shares some of our notion of "value," such as respect for truth and beauty and wonder and so on. It looks like this isn't something you get for free. (In fact, it looks very hard to get: it seems likely that most minds would by default have incentives to manipulate & deceive in order to acquire resources.) We'd rather not build minds that try to turn everything they can into a giant computer for computing digits of pi, so the question is how to design the sort of mind that has things like respect for truth and beauty and wonder? In Hollywood movies, you can just build something that looks cute and fluffy and then it will magically acquire a spark of human-esque curio

I know that in the past LessWrong, HPMOR, and similar community-oriented publications have been a significant source of recruitment for areas that MIRI is interested in, such as rationality, EA, awareness of the AI problem, and actual research associates (including yourself, I think). What, if anything, are you planning to do to further support community engagement of this sort? Specifically, as a LW member I'm interested to know if you have any plans to help LW in some way.

I have a friend studying for a master's degree in artificial intelligence, and he says:

I went to grad school almost completely and specifically so I can work [at MIRI], but I have loans to pay off and would really like a Tesla.

How much does an internship at MIRI as a researcher pay?

Is MIRI's hope/ambition that CEV (http://wiki.lesswrong.com/wiki/Coherent_Extrapolated_Volition) or something resembling CEV will be implemented, or is this not something you have a stance on?

(I'm not asking whether you think CEV should be the goal-system of the first superintelligence. I know it's possible to have strategies such as first creating an oracle and then at some later point implementing something CEV-like.)

3
So8res
First, I think that civilization had better be really dang mature before it considers handing over the reins to something like CEV. (Luke has written a bit about civilizational maturity in the past.) Second, I think that the CEV paper (which is currently 11 years old) is fairly out of date, and I don't necessarily endorse the particulars of it. I do hope, though, that if humanity (or posthumanity) ever builds a singleton, that they build it with a goal of something like taking into account the extrapolated preferences of all sentients and fulfilling some superposition of those in a non-atrocious way. (I don't claim to know how to fill in the gaps there.)

As someone who's spent a significant amount of time thinking about possible rearrangements of civilization, reading On Saving The World was both tantalizing and frustrating (as well as cementing your position as one of the most impressive people I am aware of). I understand building up from the ground, covering all the pre-requisites and inferential distance, would be a huge effort and currently not worth your time, but I feel like even a terse summary without any detailed justifications for suggestions based on all those years of thought would be highl...

1
tomstocker
Or at least laying out the inferential steps you see most lacking within EA groups you meet? Or less-wrongians

What are your contrarian beliefs?

Cheeky question:

You probably believe in many strange things that most people do not. Nonetheless, I think you are very clever and trust you a lot. Can you think of any unusual beliefs you have that have implications for asset prices?

There are different inputs needed to advance AI safety: money, research talent, executive talent, and others. How do you see the tradeoff between these resources, and which seems most like a priority right now?

1
RobBensinger
Looks like a few of Nate's other answers partly address your question: "Right now we're talent-constrained..." and "grow the research team..."

anonymous question from a big fan of yours on tumblr:

"Re: Nate Soares (thanks for doing this btw, it's really nice of you), two questions. First, I understand his ethical system described in his recent "should" series and other posts to be basically a kind of moral relativism; is he comfortable with that label? Second, does he only intend it for a certain subset of humans with agreeable values, or does it apply to all value systems, even ones we would find objectionable?"

(I'm passing on questions without comment from anyone without an e-a.com account or who wants anonymity here.)

6
So8res
You could call it a kind of moral relativism if you want, though it's not a term I would use. I tend to disagree with many self-proclaimed moral relativists: for example, I think it's quite possible for one to be wrong about what they value, and I am not generally willing to concede that Alice thinks murder is OK just because Alice says Alice thinks murder is OK. Another place I depart from most moral relativists I've met is by mixing in a healthy dose of "you don't get to just make things up." Analogy: we do get to make up the rules of arithmetic, but once we do, we don't get to decide whether 7+2=9. This despite the fact that a "7" is a human concept rather than a physical object (if you grind up the universe and pass it through the finest sieve, you will find no particle of 7). Similarly, if you grind up the universe you'll find no particle of Justice, and value-laden concepts are human concoctions, but that doesn't necessarily mean they bend to our will. My stance can roughly be summarized as "there are facts about what you value, but they aren't facts about the stars or the void, they're facts about you." (The devil's in the details, of course.)
3
Alex_Altair
igotthatreference.jpg

What are some of your techniques for doing good research?

So as I understand it, what MIRI is doing now is to think about theoretical issues and strategies and write papers about this, in the hope that the theory you develop can be made use of by others?

Does MIRI think of ever:

  1. Developing AI yourselves at some point?
  2. Creating a goal-alignment/safety framework to be used by people developing AGI? (Where e.g. reinforcement learners or other AI components can be "plugged in", but in some sense are abstracted away.)

Also (feel free to skip this part of the question if it is too big/demanding):

Personally, I ...

3
So8res
Kinda. The current approach is more like "Pretend you're trying to solve a much easier version of the problem, e.g. where you have a ton of computing power and you're trying to maximize diamond instead of hard-to-describe values. What parts of the problem would you still not know how to solve? Try to figure out how to solve those first." (1) If we manage to (a) generate a theory of advanced agents under many simplifying assumptions, and then (b) generate a theory of bounded rational agents under far fewer simplifying assumptions, and then (c) figure out how to make highly reliable practical generally intelligent systems, all before anyone else gets remotely close to AGI, then we might consider teching up towards designing AI systems ourselves. I currently find this scenario unlikely. (2) We're currently far enough away from knowing what the actual architectures will look like that I don't think it's useful to try to build AI components intended for use in an actual AGI at this juncture. (3) I think that making theorem provers easier to use is an important task and a worthy goal. I'm not optimistic about attempts to merge natural language with Martin-Löf type theory. If you're interested in improving theorem-proving tools in ways that might make it easier to design safe reflective systems in the future, I'd point you more towards trying to implement (e.g.) Marcello's Waterfall in a dependently typed language (which may well involve occasionally patching the language, at this stage).

I guess I am way late to the party, but.....

What part of the MIRI research agenda do you think is the most accessible to people with the least background?

How could AI alignment research be made more accessible?

Are there any areas of the current software industry that developing expertise in might be useful to MIRI's research agenda in the future?

Do you believe a terminal value could ever be "rational"? Or is that a Wrong Question?

1
RobBensinger
Could you say more about what you mean by "rational" in this context? Do you have a particular kind of rationality in mind?

Hey Nate, congratulations! I think we briefly met in the office in February when I asked Luke about his plans; now it turns out I should have been quizzing you instead!

I have a huge list of questions; basically the same list I asked Seth Baum, actually. Feel free to answer as many or as few as you want. Apologies if you've already written on the subject elsewhere; feel free to just link if so.

What are your current marginal project(s)? How much will they cost, and what's the expected output (if they get funded)?

What is the biggest mistake you've made?

What is...

0
RyanCarey
Hey Larks, that's a huge set of questions. It might be helpful to take some themed bundles of questions from here and split them off into their own comments, so that others can upvote and read the questions according to their interest.

Will you still be answering questions now, or in future?

0
RyanCarey
Nate will answer questions in an hour and a half:
0
Ervin
Ah, I meant would he still be answering questions that got asked later.
1
RyanCarey
Ah, there are no plans, though I imagine Rob Bensinger wouldn't mind me saying that if you have any useful follow-on questions, you can find his contact details on the MIRI website.
0
Ervin
It could be useful to mention that sort of thing on future AMAs

I usually see MIRI's goal in its technical agenda stated as "to ensure that the development of smarter-than-human intelligence has a positive impact on humanity." Is there any chance of expanding this to include all sentient beings? If not, why not? Given that nonhuman animals vastly outnumber the human ones, I would think the most pressing question for AI is its effect on nonhuman animals rather than on human ones.

7
So8res
Yep :-) The official mission statement is just "has a positive impact." I'll encourage people to also use phrasing that's more inclusive to other sentients in future papers/communications.
4
Tor_Barstad
Unless there are strategic concerns I don't fully understand, I second this. I cringe a little every time I see such goal-descriptions. Personally I would argue that the issue of largest moral concern is ensuring that new beings that can have good experiences and have a meaningful existence are put into existence, as the quality and quantity of consciousness experienced by such not-yet-existent beings could dwarf what is experienced by currently existing beings on our small planet. I understand that MIRI doesn't want to take a stance on all controversial ethical issues, but I would also wonder if MIRI has considered replacing "a positive impact on humanity" with "a positive impact on humanity and ...", e.g. "a positive impact on humanity and other sentient beings" or "a positive impact on humanity and the universe".
3
Buck
I am not worried as much as you about the effect of AI on nonhuman animals, but I agree that it would maybe be nice if MIRI was slightly more explicitly anti-speciesist in its materials. I think they have a pretty good excuse for not being clearer about this, though. FWIW, MIRI people seem pretty un-speciesist to me, in the strict sense of not being biased based on species. (Eliezer is AFAIK alone among MIRI employees in his confidence that chickens etc are morally irrelevant.) I have had a few conversations with Nate about nonhuman animals, and I've thought his opinions were thoroughly reasonable. (Nate can probably respond to this too, but I think it's possible that I'm a more unbiased source on MIRI's attitude to non-human animals.)
1
tomstocker
P[humans and animals survive a long time]: large. P[humans survive with animal life with super AI]: small. P[humans survive without animal life with super AI]: much smaller. P[animals survive without humans but with super AI]: nearly none? It seems to me that by focusing on protecting humanity and its society, you're pretty much protecting animals by implication. Promoting animal liberation carries a lot of weirdness points. MIRI's efforts are already hampered by weirdness points. So using MIRI as a platform to promote animal liberation is probably not a wise move?