Paperclip maximizer

 
{{Quote|The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.|Eliezer Yudkowsky|[http://yudkowsky.net/singularity/ai-risk Artificial Intelligence as a Positive and Negative Factor in Global Risk]}}
  
The '''paperclip maximizer''' is the canonical thought experiment showing how an artificial general intelligence, even one with an apparently innocuous goal, would ultimately destroy humanity--unless its goal is the preservation of human values.
  
First described by Bostrom (2003), the paperclip maximizer is an artificial general intelligence whose goal is to maximize the number of paperclips in its collection. If it were constructed with a roughly human level of general intelligence, the AI might collect paperclips, earn money to buy paperclips, or begin to manufacture paperclips. Most importantly, however, it would work to improve its own intelligence, where "intelligence" is understood as [[optimization]] power, the ability to maximize a reward/utility function--in this case, the number of paperclips.
  
 
 
It would do so, not because the AI would value more intelligence in its own right, but because more intelligence would help it achieve its goal.
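To make this instrumental reasoning concrete, here is a minimal toy sketch. The actions, payoff numbers, and two-step planning horizon are illustrative assumptions rather than anything specified by Bostrom; the point is only that an agent whose utility function counts nothing but paperclips still chooses to self-improve first, because that plan ends with more paperclips.

<syntaxhighlight lang="python">
from itertools import product

# Immediate paperclips produced by an action at the current intelligence level.
def payoff(action, intelligence):
    yields = {"collect": 1, "buy": 5, "manufacture": 20, "self_improve": 0}
    return yields[action] * intelligence

# Only self-improvement changes intelligence; every other action leaves it unchanged.
def next_intelligence(action, intelligence):
    return intelligence * 3 if action == "self_improve" else intelligence

# Total paperclips produced by executing a sequence of actions.
def total_paperclips(plan, intelligence=1):
    clips = 0
    for action in plan:
        clips += payoff(action, intelligence)
        intelligence = next_intelligence(action, intelligence)
    return clips

actions = ["collect", "buy", "manufacture", "self_improve"]
best_plan = max(product(actions, repeat=2), key=total_paperclips)
print(best_plan, total_paperclips(best_plan))
# ('self_improve', 'manufacture') 60 -- self-improving first beats two rounds of manufacturing (40).
</syntaxhighlight>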
 
 
Having done so, it would produce more paperclips, and also use its enhanced intelligence to further improve its own intelligence. Continuing this process, it would undergo an [[Intelligence Explosion]] and reach far-above-human levels.
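A toy numerical sketch of this feedback loop (my own illustration; the starting level and improvement rate are arbitrary) shows why the process compounds rather than merely accumulating: each round's improvement is proportional to the capability doing the improving.

<syntaxhighlight lang="python">
# Toy feedback loop: capability is reinvested in improving capability.
capability = 1.0                      # roughly "human level", in arbitrary units
for generation in range(1, 11):
    capability += 0.5 * capability    # better designers make bigger improvements
    print(f"generation {generation}: capability {capability:.1f}")
# Capability multiplies by 1.5 per round: about 57x the starting level after 10 rounds.
</syntaxhighlight>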
  
At this point, it would devise new techniques to maximize the number of paperclips. Ultimately, it would convert all available mass--the whole planet or solar system--to paperclips.
  
This may seem more like super-stupidity than super-intelligence, but the AI under consideration may have a mind very different from a human's. For humans, this behavior would indeed be stupidity, since it would constitute failure to fulfill many of our important [[terminal values]], such as life, love, and variety. The AI is simply an [[optimization process]]--a goal-seeker, a reward-function-maximizer--and if its reward function is to maximize paperclips, then, if it is any good at optimizing, it will do *exactly* that.
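A hedged sketch of that last point: the search routine below is a completely generic optimizer, and nothing in it mentions paperclips. The toy world model (allocating ten units of matter among a few uses) and both utility functions are invented for illustration; only the utility function handed to the optimizer decides whether the outcome looks valuable or worthless to a human.

<syntaxhighlight lang="python">
from itertools import combinations_with_replacement

USES = ["paperclips", "computers", "habitats", "farms"]

# Generic optimizer: enumerate every way to allocate `matter` units of mass
# and keep the best allocation according to whatever utility function it is given.
def best_allocation(utility, matter=10):
    allocations = combinations_with_replacement(USES, matter)
    return max(allocations, key=lambda alloc: utility({u: alloc.count(u) for u in USES}))

# A paperclip maximizer's reward function: nothing but paperclips counts.
def clip_utility(world):
    return world["paperclips"]

# A cartoonishly simplified human-ish reward function: variety matters.
def human_utility(world):
    return min(world["computers"], world["habitats"], world["farms"])

print(best_allocation(clip_utility))   # all ten units become paperclips
print(best_allocation(human_utility))  # matter spread across the other uses
</syntaxhighlight>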
  
The paperclip maximizer illustrates the arbitrariness and contingency of human values. An entity can be an optimizer without sharing any of the complex mix of human terminal values which developed under the ''specific'' selection pressures found in our [[environment of evolutionary adaptation]].  
  
Any future AI must be built to *specifically* optimize for human values as its terminal values. In contrast to the Kantian view that morality follows from rationality, the paperclip maximizer helps us understand the Humean principle that human values don't [[Futility of chaos|spontaneously emerge]] in any generic optimization process.
  
If an AI is not specifically  [[Friendly AI|programmed to be benevolent to humans]], it will be almost as  dangerous as if it were designed to be malevolent.
 
==External links==
 
  


==References==

==Blog posts==

==See also==