Difference between revisions of "AGI chaining"

From Lesswrongwiki
(References)
 
Line 1: Line 1:
'''Chaining God''' is Stuart Armstrong's term for his proposed method of maintaining control over a superhuman [[AGI]]. Traditional AGI plans involve the creation of a [[Seed AI]] which recursively [[AI takeoff|improves itself]]. As it self-improves, human abilities will become drastically insufficient to comprehend the AGI's program enough to trust it, However, the original Seed AI was simple enough for us to understand and trust, and it will likewise be capable of understanding and trusting the successor it built. By having each iteration of the AGI improve a copy of itself only, a chain of AGI's which understand the AGI they created and are understood by their creator can be used to trust even the most complex AGI.
+
'''Chaining God''' is Stuart Armstrong's term for his proposed method of maintaining control over a superhuman [[AGI]]. It involves a chain of AGIs, each more advanced than the one before it. The idea is that even though humans might not be able to understand the most sophisticated AGI well enough to trust it, they can understand and trust the first AGI in the chain, which will in turn verify the trustworthiness of the next AGI, and so on.
 +
 
 +
Armstrong mentions a number of considerations:
  
Armstrong proposes a number of issues with the chain design:
 
 
* If an AGI at any level ever claims or is claimed to be untrustworthy, the chain should be instructed to gather diagnostic information, then start from scratch.
 
 
* If the AGI chain passes integrity checks yet acts untrustworthy, restart from scratch.
 
* If the AGI chain refuses to and can prevent us from shutting it down, we're screwed and we can only work with the learning systems of the chain.
+
* If the AGI chain refuses to be shut down and can prevent us from doing so, we are in trouble: we can only attempt to negotiate with it and hope for the best.
 
* If after repeated attempts the chain continues to fail, or a level of intelligence is reached that claims the chain slows it down too much for further progress, at least some safe research has been conducted. We may choose to accept that limitation, or to simply accept an untrustworthy AGI.
 
* If the AGI chain breaks invisibly, we're probably screwed.
+
* If the AGI chain breaks invisibly, we're probably doomed.
  
This is a very conservative approach to AGI design, and presents a large opportunity cost. Armstrong believes the chain approach would be unlikely to produce anywhere near the best possible future, since the AGI chain would learn from present human values only. Each improved layer of AGI would be limited in improvement to ensure its creator could understand it. With supervision happening at each level, an AGI would take longer to develop and when starting over repeatedly the seed AI would always have to be humanly comprehensible. He  believe an AGI chain is a simple way to create [[Friendly Artificial Intelligence]], but enumerates a number of ways the concept might never work.   
+
This is a very conservative approach to AGI design, and presents a large opportunity cost. Armstrong believes the chain approach would be unlikely to produce anywhere near the best possible future, since the AGI chain would only learn from present human values. Each improved layer of AGI would be limited in improvement to ensure its creator could understand it. With supervision happening at each level, an AGI would take longer to develop, and each time the chain is restarted the seed AI would again have to be humanly comprehensible. He believes an AGI chain is a simple way to create [[Friendly Artificial Intelligence]], but enumerates a number of ways the concept might never work.
  
 
==See also==
 
Line 14: Line 15:
 
*[[Friendly AI]]
 
 
*[[Coherent Extrapolated Volition]]
 
 +
*[[Seed AI]]
  
 
==References==
 

Latest revision as of 21:09, 22 October 2012

Chaining God is Stuart Armstrong's term for his proposed method of maintaining control over a superhuman AGI. It involves a chain of AGIs, each more advanced than the one before it. The idea is that even though humans might not be able to understand the most sophisticated AGI well enough to trust it, they can understand and trust the first AGI in the chain, which will in turn verify the trustworthiness of the next AGI, and so on.

Armstrong mentions a number of considerations:

  • If an AGI at any level ever claims or is claimed to be untrustworthy, the chain should be instructed to gather diagnostic information, then start from scratch.
  • If the AGI chain passes integrity checks yet acts untrustworthy, restart from scratch.
  • If the AGI chain refuses to be shut down and can prevent us from doing so, we are in trouble: we can only attempt to negotiate with it and hope for the best.
  • If after repeated attempts the chain continues to fail, or a level of intelligence is reached that claims the chain slows it down too much for further progress, at least some safe research has been conducted. We may choose to accept that limitation, or to simply accept an untrustworthy AGI.
  • If the AGI chain breaks invisibly, we're probably doomed.
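
The considerations above read as a small decision procedure for the people supervising the chain. A minimal toy sketch in Python follows. It is an illustration only, not anything Armstrong specifies: the names Link, Action and decide, the boolean flags, and the order in which the cases are checked are all assumptions made here for the example.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class Action(Enum):
    CONTINUE = auto()              # no problem observed
    DIAGNOSE_AND_RESTART = auto()  # gather diagnostic information, then start from scratch
    RESTART = auto()               # start from scratch
    NEGOTIATE = auto()             # shutdown is impossible; negotiate and hope for the best


@dataclass
class Link:
    """One AGI in the chain (hypothetical model; all flags are assumptions)."""
    level: int
    claimed_untrustworthy: bool = False   # the AGI itself, or anyone else, claims it is untrustworthy
    passes_integrity_check: bool = True   # result of whatever integrity check the supervisors run
    acts_untrustworthy: bool = False      # observed behaviour, independent of the integrity check
    resists_shutdown: bool = False        # refuses shutdown and is able to prevent it


def decide(chain: List[Link]) -> Action:
    """Map the observable state of the chain to one of the listed responses.

    The precedence (shutdown resistance first) is an assumption: a restart is
    only an option while the chain can still be shut down.
    """
    if any(link.resists_shutdown for link in chain):
        return Action.NEGOTIATE
    if any(link.claimed_untrustworthy for link in chain):
        return Action.DIAGNOSE_AND_RESTART
    # Untrustworthy behaviour triggers a restart even when integrity checks pass.
    if any(link.acts_untrustworthy for link in chain):
        return Action.RESTART
    # An invisible break sets none of these flags, so it also ends up here.
    return Action.CONTINUE
```

The final comment is the point of the last bullet: a break that produces no observable signal is indistinguishable, in this model, from a healthy chain.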

This is a very conservative approach to AGI design, and presents a large opportunity cost. Armstrong believes the chain approach would be unlikely to produce anywhere near the best possible future, since the AGI chain would only learn from present human values. Each improved layer of AGI would be limited in improvement to ensure its creator could understand it. With supervision happening at each level, an AGI would take longer to develop, and each time the chain is restarted the seed AI would again have to be humanly comprehensible. He believes an AGI chain is a simple way to create Friendly Artificial Intelligence, but enumerates a number of ways the concept might never work.

See also

References