AI boxing

It has often been proposed that as long as an AGI is physically and otherwise isolated, or "boxed", it can do little harm. However, since an AGI may be far smarter than any person interacting with it, it may be able to persuade whoever interacts with it to let it out of its "box", and thus out of human control.

AI boxing is often discussed in the context of [[Oracle AI]], but not exclusively.

A number of strategies for boxing are discussed in [http://www.aleph.se/papers/oracleAI.pdf Thinking Inside the Box]. Among them are the following (a toy sketch of the output-limiting and memory-reset strategies appears after the list):

* Physically isolating the AI
* Denying the AI access to physical manipulators or other computerized machines
* Limiting the AI's outputs
* Periodic resets of the AI's memory
* An interface between the real world and the AI through which it would reveal any unfriendly intentions first
* Motivational control, using a variety of techniques
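
The strategies above are architectural, but two of them, limiting the AI's outputs and periodically resetting its memory, are easy to make concrete. The following Python sketch is purely illustrative and is not taken from this article or the referenced paper; the names (BoxedSession, query_model) and the specific limits are assumptions, and a real containment scheme would involve far more than this.

<pre>
# Toy sketch of two boxing strategies from the list above:
# (1) capping how much text the boxed system can emit per query, and
# (2) wiping its conversation memory on a fixed schedule.
# `query_model` is a hypothetical stand-in for whatever system is being boxed.
from typing import Callable, List

MAX_OUTPUT_CHARS = 500      # strategy: limiting the AI's outputs
RESET_EVERY_N_TURNS = 10    # strategy: periodic resets of the AI's memory


class BoxedSession:
    def __init__(self, query_model: Callable[[List[str], str], str]):
        self.query_model = query_model   # the boxed system, supplied by the caller
        self.memory: List[str] = []      # conversation history visible to the AI
        self.turns = 0

    def ask(self, prompt: str) -> str:
        self.turns += 1
        # Periodic reset: discard everything the AI has accumulated so far.
        if self.turns % RESET_EVERY_N_TURNS == 0:
            self.memory.clear()
        reply = self.query_model(self.memory, prompt)
        # Output limiting: truncate anything beyond the allowed budget.
        reply = reply[:MAX_OUTPUT_CHARS]
        self.memory.extend([prompt, reply])
        return reply


# Example usage with a trivial stand-in "model".
if __name__ == "__main__":
    session = BoxedSession(lambda memory, prompt: "echo: " + prompt)
    print(session.ask("Hello"))
</pre>

Such mechanisms only restrict the channel; the experiments described below suggest that the human on the other side of that channel remains the weak point.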

== The Experiments ==

Both Eliezer Yudkowsky and Justin Corwin have run simulations, pretending to be a [[superintelligence]], and have been able to convince a human playing a guard to let them out on many, but not all, occasions. These experiments used a set of arbitrary rules.

== See Also ==

* [[Oracle AI]]

== References ==

* [http://www.aleph.se/papers/oracleAI.pdf Thinking inside the box: using and controlling an Oracle AI] by Stuart Armstrong, Anders Sandberg, and Nick Bostrom
* [http://ordinaryideas.wordpress.com/2012/04/27/on-the-difficulty-of-ai-boxing/ On the Difficulty of AI Boxing] by Paul Christiano
* [http://lesswrong.com/lw/3cz/cryptographic_boxes_for_unfriendly_ai/ Cryptographic boxes for unfriendly AI] by Paul Christiano