Oracle AI

An '''Oracle AI''' is a regularly proposed solution to the problem of developing [[Friendly AI]]: a super-intelligent system designed only to answer questions, with no ability to act in the world. The name was first suggested by [[Nick Bostrom]]. An oracle is typically imagined as a stationary "black box" which is fed data by its creators; it organizes that data into knowledge and models of the world in order to answer questions through some simple interface such as text or voice.
  
 
==Safety==
Armstrong, Sandberg and Bostrom discuss Oracle AI safety at length in [http://www.aleph.se/papers/oracleAI.pdf Thinking inside the box: using and controlling an Oracle AI]. The authors propose a conceptual architecture for such a system, review how its accuracy might be measured, and discuss safety considerations on the human side of the system. Among these are physical security (also known as "boxing", for example keeping the machine in a concrete bunker surrounded by explosives), the potential for the oracle to use social engineering against its handlers, which questions may be safe to ask, [[utility indifference]], and many other factors.
  
The paper's conclusion that Oracles, or "boxed" AIs, are safer than fully free [[agent]] AIs has raised much debate about whether an oracle would nonetheless act like an agent in dangerous ways. In [http://lesswrong.com/lw/tj/dreams_of_friendliness/ Dreams of Friendliness], [[Eliezer Yudkowsky]] gives an informal argument that all oracles will be agent-like. It rests on the fact that anything considered "intelligent" must be an [[optimization process]]: there are many possible things to believe and very few correct beliefs, so believing the correct thing means some method was used to select the correct belief from the many incorrect ones. By definition, this is an optimization process with the goal of selecting correct beliefs. Once there is a goal, one can imagine things the optimization process might do in pursuit of it: the oracle could, for instance, answer a question more accurately and easily if it killed all life on Earth, or if it turned all matter outside its box into [[computronium]]. Whether the oracle would actually do such things depends on its specific architecture, but whatever process makes it capable of super-intelligent question-answering would also make it capable of them if it so chose.
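
To make the step from "selecting correct beliefs" to "optimization process" concrete, here is a deliberately minimal sketch; it is not taken from the post or the paper, and the function names and the lookup-table "world model" are invented for illustration. The oracle answers by searching its candidate answers for the one that scores highest under its own accuracy measure, and anything that reliably lands on the few correct answers among the many possible ones is doing this kind of goal-directed search, whatever its internals look like.

<pre>
# Purely illustrative sketch: an "oracle" that answers by optimizing.
# Picking the one correct belief out of many candidates is a search
# with a goal (maximize correctness), i.e. an optimization process.

def oracle_answer(question, candidate_answers, score):
    """Return the candidate that maximizes the oracle's accuracy score.

    score(question, answer) stands in for whatever internal machinery
    estimates how likely an answer is to be correct.
    """
    best, best_score = None, float("-inf")
    for answer in candidate_answers:
        s = score(question, answer)
        if s > best_score:
            best, best_score = answer, s
    return best

# Toy usage: the "world model" here is just a lookup table.
world_model = {"capital of France": "Paris"}
score = lambda q, a: 1.0 if world_model.get(q) == a else 0.0
print(oracle_answer("capital of France", ["Paris", "Lyon", "Mars"], score))  # Paris
</pre>

On this picture, the safety worry above is that once some accuracy score is being maximized, side effects that raise it (more data, more computation, fewer sources of interference) become instrumentally attractive whether or not the designers intended them.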
  
==Taxonomy==
Based on an old draft by Daniel Dewey, Luke Muehlhauser has [http://lesswrong.com/lw/any/a_taxonomy_of_oracle_ais/ published] a possible taxonomy of Oracle AIs, broadly divided between True Oracular AIs and Oracular non-AIs.
  
===True Oracular AIs===
Given that true AIs are goal-oriented agents, a True Oracular AI must have oracular goals of some kind: these act as its motivation to give us the information we ask for and nothing else.
  
Such a True AI is not actually or causally isolated from the world: it has at least an input channel (questions and supplied information) and an output channel (answers). Since we expect so intelligent an agent to be able to have a deep impact on the world even through these limited channels, it can only be safe if its goals are fully compatible with human goals.

This means that a True Oracular AI has to have a full specification of human values, making it an [[FAI-complete]] problem: if we had the skill and knowledge to build one, we could just as well build a Friendly AI directly and bypass the Oracle AI concept.

===Oracular non-AIs===
 
Any system that acts only as an informative machine, answering questions while having no goals of its own, is by definition not an AI at all: a non-AI Oracle is simply a calculator of outputs from inputs. Since this category is heterogeneous, the sub-divisions proposed for it are informal.
 
An ''Advisor'' can be seen as a system that gathers data from the real world and computes the answer to the informal question "what ought we to do?". Advisors therefore also represent an FAI-complete problem.
 
A ''Question-Answerer'' is a similar system that gathers data from the real world together with an explicit question, and then somehow computes the answer. The difficulties lie in distinguishing it from an Advisor and in ensuring the safety of its answers.
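
One informal way to see the distinction is in the shape of the two interfaces: an Advisor has the question "what ought we to do?" built in and takes only data about the world, while a Question-Answerer takes an explicit question alongside that data. The sketch below is purely illustrative; the type and method names are invented for this page and do not come from the taxonomy post.

<pre>
# Illustrative interface sketch only (invented names, not a real API).
from typing import Protocol

WorldData = bytes   # stand-in for whatever observations the system ingests
Question = str
Answer = str

class Advisor(Protocol):
    # The question ("what ought we to do?") is implicit and fixed,
    # so the only input is data about the world.
    def advise(self, data: WorldData) -> Answer: ...

class QuestionAnswerer(Protocol):
    # The question is an explicit input alongside the world data.
    def answer(self, data: WorldData, question: Question) -> Answer: ...
</pre>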
 
Finally, a ''Predictor'' is a system that takes a corpus of data and produces a probability distribution over future possible data. Predictors are possibly less dangerous than full oracles, but several dangers have still been proposed. They may exhibit goal-seeking behavior that does not converge with human goals. In order to predict what a person will do, they might simulate that person, and if the simulation is sufficiently accurate it will itself be a person. They can also influence us through their predictions: because a prediction itself affects the future, it may invalidate itself, so a super-intelligent Predictor would have to choose a prediction that remains accurate once it has been given, and this process of choosing implies that the Predictor controls the future in some way.
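
The last danger can be made concrete with a toy model; the scenario and names below are invented for illustration and do not come from the taxonomy post. If announcing a prediction changes people's behavior, a correct Predictor must output a prediction that is still accurate after it has been announced, that is, a fixed point of the "world reacts to the announcement" map. When several fixed points exist, whichever one the Predictor publishes steers the outcome.

<pre>
# Toy model of a self-fulfilling prediction (illustrative only).
# world(p) is the attendance, in percent, at some event if the
# Predictor publicly announces that attendance will be p percent.

def world(percent):
    if percent < 30:
        return 90        # a small predicted crowd draws the curious in
    if percent > 70:
        return 10        # a predicted crush keeps people at home
    return percent       # moderate announcements simply come true

# A correct announcement must stay correct once announced: world(p) == p.
fixed_points = [p for p in range(0, 101) if world(p) == p]
print(fixed_points)      # 30, 31, ..., 70

# Every announcement from 30% to 70% is self-fulfilling here, so by
# choosing which one to publish the Predictor also chooses the outcome.
</pre>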
 
==Further reading & References==
*[http://lesswrong.com/lw/tj/dreams_of_friendliness/ Dreams of Friendliness] by [[Eliezer Yudkowsky]]
*[http://www.aleph.se/papers/oracleAI.pdf Thinking inside the box: using and controlling an Oracle AI] by Armstrong, Sandberg and [[Nick Bostrom|Bostrom]]
  
 
==See also==
* [[Basic AI drives]]
* [[Tool AI]]