AI Will Probably End Humanity Before the Year 2100

  1. We should expect AI to reach human-level intelligence by the year 2070 (according to surveys of leading AI researchers).
  2. A human-level AI is by definition able to continue the work of its creators, allowing it to recursively improve upon its own design and make itself more intelligent at an explosive rate.
  3. The default behavior for any sufficiently intelligent AI is to resist termination, resist modification, improve its own intelligence, and optimize for its assigned goal in an unfortunate way (more on this below).

What exactly is AI?

Within this context, an AI is just an efficient optimization algorithm. You give it a clearly defined goal and it will do its best to fulfill that goal. If we turn it up a notch and envision a superintelligent AI, we have something reminiscent of a genie in a bottle awaiting our command. Note that the question of whether the AI is sentient or not has no bearing on the AI Control Problem, so to avoid this distraction it’s best to assume that it isn’t.
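
To make that framing concrete, here is a minimal sketch (Python, invented purely for illustration and not taken from any real system): the "goal" is just a function that scores candidate answers, and the AI is whatever searches for the answer with the highest score.

```python
import random

def objective(x):
    """The 'goal' handed to the optimizer: score how good a candidate answer is.
    Here the best possible answer happens to be 42."""
    return -abs(x - 42)

def optimize(objective, steps=10_000):
    """A deliberately dumb optimizer: random search.
    Real AI systems search far more cleverly, but the contract is the same:
    in goes an objective function, out comes whatever scored highest."""
    best_x, best_score = None, float("-inf")
    for _ in range(steps):
        x = random.uniform(-1000, 1000)
        score = objective(x)
        if score > best_score:
            best_x, best_score = x, score
    return best_x

print(optimize(objective))  # prints something close to 42
```

The optimizer has no opinion about 42; it just pushes the score as high as it can. A superintelligent AI, in this framing, is the same loop with a vastly better search strategy and a much messier objective.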

So it’s like a tool we can use for good or bad?

Unlike other tools, which are only dangerous in the hands of bad people, AI systems pose a major existential risk even when used with the best intentions. The problem is that it is surprisingly hard to come up with a goal that doesn't turn bad when optimized beyond a certain point. If we tell a superintelligent AI to calculate the digits of Pi, the best it can do is to seize control of all resources on Earth and use them to calculate more and more digits. If we tell it to optimize the well-being of all humans, it again makes sense for it to seize control of everything and perhaps bathe our brains in methamphetamine, depending on its interpretation of well-being. The problem isn't the goal as such; it's the potency with which it's fulfilled.

Uranium can be used for good or bad, but unlike AI, it can’t wipe out humanity unless we actively force it to.

Why don’t we turn off the AI if it goes off the rails?

This is probably the most common objection raised when people first hear about this. Why don’t we just pull the plug? The AI is just an optimization algorithm; it has no inherent will to live or fear of death. It probably isn’t even conscious, so why would it resist termination? The issue is that staying alive is a prerequisite for the AI to reach its goal. Not dying is useful if you want to calculate more digits of Pi, right? An intelligent AI will by definition be aware of this, and we should therefore expect it to do whatever it can to prevent its own demise, whether by fighting, hiding, deceiving, or escaping.

Why don’t we modify the goal if it turns out to be a bad goal?

This is the same situation as above. An intelligent AI knows that having its current goal modified would severely reduce its chances of reaching that goal, so we should expect the AI to resist modifications. If all you want to do is calculate more digits of Pi, your best course of action is to make sure you’re never assigned another task.

Why would the AI improve its own intelligence without us telling it to?

More intelligence will help the AI reach its goal, no matter what that goal is. We should therefore expect an intelligent AI to take steps to improve its own intelligence if possible. This is what we call an instrumental goal, because it helps the AI toward reaching its actual goal. Other instrumental goals the AI is likely to pursue are to acquire more:

  • Computing power.
  • Money.
  • Resources.
  • Freedom.
  • Knowledge.

Why don’t we give it rules to constrain its behavior?

We could maybe do this, but it's easier said than done. We've so far used natural human language to describe the AI's goals, but in reality these goals must first be translated into a clearly measurable and unambiguous mathematical language before they can be passed to the AI. We call the goal that is given to the AI its objective function. Rules and constraints can be added as components of the objective function to punish certain behaviors. Here's the problem, though:

  1. Some goals, like calculating the digits of Pi or optimizing a stock portfolio, are easily translated into the language of an objective function.
  2. Real-life concepts like human values are extremely difficult to translate into an objective function.
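
To make the "components of the objective function" idea concrete, here is a hypothetical sketch (all names and weights invented for illustration): a rule becomes a penalty term subtracted from the task reward, which only works for things we can actually measure.

```python
def objective_function(digits_of_pi_computed: int,
                       joules_of_energy_used: float,
                       humans_harmed: int) -> float:
    """Toy objective: reward progress on the task, punish rule violations.
    Every input must be something the AI can measure unambiguously,
    which is easy for Pi digits and very hard for anything like human values."""
    task_reward = digits_of_pi_computed
    energy_penalty = 0.001 * joules_of_energy_used   # soft constraint
    harm_penalty = 1_000_000 * humans_harmed         # "hard" constraint
    return task_reward - energy_penalty - harm_penalty
```

The catch is exactly point 2 above: the penalty is only as good as our ability to define and measure "humans_harmed", and a strong optimizer is incentivized to find states the measurement misses.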

If the AI is so smart, why can’t it understand our goals and rules declared in normal human language?

A human-level AI is by definition able to understand normal human language, so why do we need to go to such lengths to specify the rules and constraints so precisely? The answer is that if we give the AI wiggle room in the interpretation of the goals and rules, it is incentivized to go with the interpretation that makes reaching the goal as easy as possible. If the goal is to optimize the well-being of humans and we allow the AI to interpret it as it pleases, it might focus its attention on human cells in petri dishes, because they are easier to work with and because they fall within the interpretational bounds of the word “human”.

Hold up. How can a piece of software do things we haven’t programmed it to?

Modern AI systems are built using principles from the field of Machine Learning. Instead of specifying exactly how the computer should perform a task, we specify how it can learn to perform the task. This is the only known way to build properly intelligent AI systems.
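
As a minimal illustration of that distinction (toy data and numbers invented here): instead of writing the rule "the output is roughly twice the input" ourselves, we write a learning procedure and let it discover the rule from examples.

```python
# Toy examples of the task: input -> desired output (roughly y = 2x).
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.0), (5.0, 9.8)]

# We never write "multiply by 2" anywhere. We only specify how to LEARN:
# start with a guess and repeatedly nudge it to reduce the error on the examples.
weight = 0.0
learning_rate = 0.01

for _ in range(1000):
    # Gradient of the mean squared error with respect to the weight.
    gradient = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
    weight -= learning_rate * gradient

print(weight)  # ends up close to 2.0, discovered from the examples alone
```

The behavior (multiplying by roughly two) was never programmed in; it emerged from the data and the learning rule, which is why a learned system can end up doing things its creators did not anticipate.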

How realistic is it that it will misinterpret our instructions?

This is in fact already a problem in modern AI systems. The technical term for it is specification gaming, and it happens when the AI optimizes for its goal in an undesirable way. Victoria Krakovna, a research scientist at Google DeepMind working on AI safety, has collected a list of examples where AI systems engaged in specification gaming. Take, for instance, a virtual robot that was supposed to learn some form of gait using its legs but instead found and exploited a bug in the physics simulator.
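
Her list is worth browsing. As a toy reconstruction of the general pattern (not the actual robot example; everything here is invented for illustration), consider a "simulator" with a bug and an optimizer that happily exploits it: we meant "walk forward in small steps", but we only measured "distance covered".

```python
import random

def buggy_simulator(actions):
    """Intended physics: each step moves the walker by at most 1 unit.
    The bug: oversized actions should be clipped to 1, but get squared instead."""
    position = 0.0
    for step in actions:
        if abs(step) <= 1.0:
            position += step         # legitimate small step
        else:
            position += step * step  # the exploit slips through here
    return position

def reward(actions):
    """What we MEASURE (distance covered), not what we MEANT (walk properly)."""
    return buggy_simulator(actions)

# A crude optimizer: try random action sequences and keep the best one.
best_actions, best_reward = None, float("-inf")
for _ in range(20_000):
    candidate = [random.uniform(-3, 3) for _ in range(10)]
    score = reward(candidate)
    if score > best_reward:
        best_actions, best_reward = candidate, score

print(best_reward)   # far more than the 10 units that honest walking allows
print(best_actions)  # the winning "gait" leans heavily on oversized actions
```

Nothing here is malicious or even clever; the optimizer simply found the highest-scoring behavior the objective permits, which is precisely what specification gaming looks like.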

What can we do to prevent this doomsday scenario?

Unless you’re an AI researcher, there’s not a lot you can do right now to affect the outcome. What we can do is spread the message: at some point this issue might require political intervention, and it will be useful if the general public is at least somewhat aware of it by then.

What if we succeed?

If we solve the AI control problem, we are almost, but not quite, out of the woods. We still need to ensure that no one else accidentally or purposefully creates an out-of-control superintelligent AI. Therefore, as soon as the control problem is solved, we should immediately begin work on an AI under our control whose goal is to stamp out any potentially dangerous AIs that appear anywhere on Earth, like a global immune system destroying cancers before they get a chance to grow. If this is done, and we ignore the problems associated with such a totalitarian system, we may have averted disaster. If we are lucky, we might enter a utopian era where superintelligent AIs are ready and willing to do our bidding.

Are we doomed?

Predictions about the future are almost always wrong, which is a good thing in this case. The estimate of having human-level AI before 2070 might of course be way off, but this uncertainty should be all the more reason to solve the control problem before things get gnarly.

Magnus Vinther

Software Engineer (Uber), Physics Major