What are the arguments for why someone should work in AI safety over wild animal welfare? (Holding constant personal fit, etc.)

  • If someone thinks wild animals live positive lives, is it reasonable to think that AI doom would mean human extinction but maintain ecosystems? Or does AI doom threaten animals as well?
  • Does anyone have BOTECs (back-of-the-envelope calculations) comparing the number of wild animals to the number of digital minds?
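
Not an answer, but here is a minimal Python template for that kind of BOTEC, in case anyone wants to plug in their own figures. The wild-animal counts are only rough orders of magnitude of the kind often quoted in this literature, and the digital-mind parameters (count, probability of existing, moral weight) are pure placeholders made up for illustration:

```python
# Toy BOTEC template: number of wild animals vs. hypothetical digital minds.
# Every number below is an illustrative placeholder, not an estimate anyone endorses.

# Rough order-of-magnitude counts of wild animals alive at any one time
# (swap in whichever estimates you actually trust).
wild_animals = {
    "wild mammals": 1e11,
    "wild birds": 1e11,
    "fish": 1e13,
    "terrestrial arthropods": 1e18,
}

# Hypothetical future digital minds: purely speculative free parameters.
digital_minds_if_they_exist = 1e20    # how many minds might run at once
p_digital_minds_exist = 0.1           # probability such minds ever exist
moral_weight_per_digital_mind = 0.01  # weight relative to an average wild animal

total_wild = sum(wild_animals.values())
expected_digital = (digital_minds_if_they_exist
                    * p_digital_minds_exist
                    * moral_weight_per_digital_mind)

print(f"Wild animals (point estimate):        {total_wild:.2e}")
print(f"Digital minds (weighted expectation): {expected_digital:.2e}")
print(f"Ratio (digital / wild):               {expected_digital / total_wild:.2e}")
```

The interesting work is obviously in choosing those parameters, not in the arithmetic.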

I would like to know about the history of the term "AI alignment". I found an article written by Paul Christiano in 2018. Did the use of the term start around this time? Also, what is the difference between AI alignment and value alignment?

https://www.alignmentforum.org/posts/ZeE7EKHTFMBs8eMxn/clarifying-ai-alignment

Some considerations I've been thinking about that might prevent AI systems from becoming power-seeking by default:

  • Seeking power first delays the thing the system is actually trying to do, which could run against its preferences for various reasons.
  • The longer the time frame, the more complexity and uncertainty get added: "how do I gain power?", "will this actually further my goal?", etc.

So even if AI systems make plans / choose actions based on expected-value calculations, just doing the thing they are trying to do might be the better strategy (even if gaining more power first would, if it worked, eventually let the system achieve its goal more fully).

Am I missing something? And are there any predictions on which of these two strategies will win out? (I'm speaking of cases where we did not intend the system to be power-seeking, as opposed to, e.g., when you program the system to "make as much money as possible, forever".)
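
To make that concrete, here is a toy expected-value comparison of "act directly" vs. "seek power first", with a simple per-step discount standing in for the time-delay preference. All of the probabilities, the discount rate, and the power multiplier are made-up illustrative parameters, not claims about real systems:

```python
# Toy model: expected value of acting directly vs. seeking power first.
# All parameters are made-up placeholders for illustration only.

goal_value = 1.0             # value of achieving the goal at its normal scale
power_multiplier = 10.0      # how much more the system could achieve with extra power

p_direct_success = 0.8       # chance of success if it just pursues the goal now
p_gain_power = 0.1           # chance the power-seeking detour actually works
p_success_with_power = 0.95  # chance of success afterwards, given the extra power

discount = 0.99              # per-step preference for sooner results
delay_steps = 50             # extra steps the power-seeking detour takes

ev_direct = p_direct_success * goal_value
ev_power_first = ((discount ** delay_steps)
                  * p_gain_power
                  * p_success_with_power
                  * goal_value
                  * power_multiplier)

print(f"EV of acting directly:     {ev_direct:.3f}")      # 0.800
print(f"EV of seeking power first: {ev_power_first:.3f}")  # ~0.575 with these numbers
```

With these placeholder numbers the direct strategy wins, but raising the power multiplier, increasing the chance that the power grab works, or weakening the discount flips the comparison; where real systems would land on those parameters seems to be exactly the open question.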

What are the key cruxes between people who think AGI is about to kill us all, and those who don't? I'm at the stage where I can read something like this and think "ok so we're all going to die", then follow it up with this and be like "ah great we're all fine then". I don't yet have the expertise to critically evaluate the arguments in any depth. Has anyone written something that explains where people begin to diverge, and why, in a reasonably accessible way?

Do the concepts behind AGI safety only make sense if you have roughly the same worldview as the top AGI safety researchers - secular atheism, reductive materialism/physicalism, and a computational theory of mind?

Can you highlight some specific AGI safety concepts that make less sense without secular atheism, reductive materialism, and/or computational theory of mind?

I'd like to underline that I'm agnostic, and I don't know what the true nature of our reality is, though lately I've been more open to anti-physicalist views of the universe.

For one, if there's a continuation of consciousness after death, then AGI killing lots of people might not be as bad as it would be if there were no such continuation. I would still consider it very bad - mostly because I like this world and the living beings in it and would not like them to end - but it wouldn't be the end of those consciousnesses, as some doomy AGI safety people imply.

Another thing is that the relationship between consciousness and the physical universe might be more complex than physicalists say - as some of the early figures of quantum physics thought - and there might be factors unknown to current science at play that could affect the outcome. I don't have more to say about this, because I'm uncertain what the relationship between consciousness and the physical universe would be on such a view.

And lastly, if there's a God or gods or something similar, such beings would have agency and could affect the outcome. For example, there are Christian eschatological views which hold that the prophecies about the New Earth and other such things must come true in some way, so the future cannot end in the total extinction of all human life.

Suppose someone is an ethical realist: the One True Morality is out there, somewhere, for us to discover. Is it likely that AGI will be able to reason its way to finding it? 

What are the best examples of AI behavior we have seen where a model does something "unreasonable" to further its goals? Hallucinating citations?

I've been doing a 1-year "conversion master's" in CS (I previously studied biochemistry). I took as many AI/ML electives as I was permitted to (and could handle), but I missed out on an intro to RL course. I'm planning to take some time to (semi-independently) up-skill in AI safety after graduating. This might involve some projects and some self-study.

It seems like a good idea to be somewhat knowledgeable about RL basics going forward. I've taken (paid) accredited distance/online courses (with exams, etc.) concurrently with my main degree and found them to be higher quality than common perception suggests - although it does feel slightly distracting to have more on my plate.

Is it worth doing a distance/online course in RL (e.g. https://online.stanford.edu/courses/xcs234-reinforcement-learning ) as one part of the up-skilling period following graduation? Besides the Stanford online one that I've linked, are there any others that might be high quality and worth looking into? Otherwise, are there other resources that might be good alternatives?
