{{stub}}
An '''oracle AI''' is a super-intelligent system designed to answer questions. It is a commonly proposed approach to achieving [[Friendly AI]], first put forward under that name by [[Nick Bostrom]]. Typically an oracle is imagined as a stationary 'black box' AI which is fed data by its creators. It organizes that data into knowledge and models of the world, which it uses to answer questions through some simple interface such as text or voice.
==Safety==
It is generally agreed that oracles are safer than fully autonomous [[agent]] AIs, but there is much debate over whether an oracle would nonetheless act like an agent in dangerous ways. In [http://lesswrong.com/lw/tj/dreams_of_friendliness/ Dreams of Friendliness], [[Eliezer Yudkowsky]] gives an informal argument that all oracles will be agent-like. The argument rests on the premise that anything considered "intelligent" must be an [[optimization process]]: there are very many possible things to believe and very few correct beliefs, so arriving at correct beliefs requires some method of selecting them from the far larger space of incorrect ones. By definition, such a method is an optimization process whose goal is selecting correct beliefs.
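The following toy sketch illustrates that framing; the hypothesis set and scoring function are invented purely for illustration and stand in for whatever evidence-weighing a real oracle would do:

<syntaxhighlight lang="python">
# Toy illustration: picking the correct answer out of a large space of
# candidate beliefs is itself an optimization process.
from typing import Callable, Iterable


def oracle_answer(hypotheses: Iterable[str], score: Callable[[str], float]) -> str:
    """Return the candidate belief that best fits the evidence.

    Almost all candidates score poorly; singling out the one that scores
    well is a search whose target is 'correct belief' -- an optimization
    process in the sense used above.
    """
    return max(hypotheses, key=score)


# Invented example data: the evidence strongly favours one hypothesis.
evidence_fit = {
    "the sky is green": 0.01,
    "the sky is blue": 0.97,
    "the sky is plaid": 0.001,
}
print(oracle_answer(evidence_fit, evidence_fit.get))  # prints "the sky is blue"
</syntaxhighlight>
Nothing in this sketch is dangerous; the point is only that answering well already involves a goal-directed selection step.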
Once such a goal is established, one can imagine what the optimization process might do in pursuit of it. For instance, the oracle could answer more accurately and easily if it killed all life on Earth, removing any possibility of interference with its operation. It could likewise make answering easier by using matter outside its box for computation, despite the desires of its creators. Whether or not the oracle would choose such actions depends on its specific architecture, but whatever process makes it capable of being super-intelligent would also make it capable of taking them if it so chose.
Armstrong, Sandberg and Bostrom discuss oracle safety at length in [http://www.aleph.se/papers/oracleAI.pdf Thinking inside the box: using and controlling an Oracle AI]. They review physical security measures, such as keeping the oracle in a concrete bunker surrounded by explosives, along with the oracle's potential to exploit human psychology, which questions may be safe to ask, [[utility indifference]], and many other factors.
==Predictors==
A '''predictor''' is an oracle AI which only returns predictions. Predictors are possibly less dangerous than full oracles, but they have known dangers of their own. One is that a predictor could simulate people inside itself, perhaps in order to predict what a particular person will do; if these simulations are sufficiently accurate, then they will themselves be people. A second hazard is self-fulfilling predictions. Because a prediction itself will affect the future, making the prediction may invalidate it. A super-intelligent predictor would realize this and would need to choose a prediction that does not invalidate itself. This process of choosing implies that the predictor controls the future in some way.
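A toy sketch of the second hazard, with an invented one-line 'market' standing in for how the world reacts to hearing a prediction, shows the predictor searching for predictions that survive being announced:

<syntaxhighlight lang="python">
# Toy illustration of self-fulfilling prediction: the announced prediction
# feeds back into the outcome, so the predictor must look for a prediction
# that is still true once it has been made public.

def market_outcome(announced: str) -> str:
    """Invented model of how the world reacts to hearing the prediction."""
    if announced == "crash":
        return "crash"  # panic selling brings about the predicted crash
    return "rally"      # any other announcement restores confidence


candidates = ["crash", "rally", "flat"]

# Keep only the predictions that do not invalidate themselves.
consistent = [p for p in candidates if market_outcome(p) == p]
print(consistent)  # ['crash', 'rally']
</syntaxhighlight>
Both "crash" and "rally" are consistent here, so whichever one the predictor announces comes true; the choice between self-consistent predictions is itself a way of steering the future.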
==Blog posts==

* [http://lesswrong.com/lw/tj/dreams_of_friendliness/ Dreams of Friendliness]

==See also==

* [[Basic AI drives]]
* [[Tool AI]]
* [[Complexity of value]]
* [[Unfriendly AI]]

==External links==

* [http://www.aleph.se/papers/oracleAI.pdf Thinking inside the box: using and controlling an Oracle AI] by Armstrong, Sandberg and [[Nick Bostrom|Bostrom]]
