AGI chaining

From Lesswrongwiki
Revision as of 03:16, 11 July 2012 by TerminalAwareness (talk | contribs) (References)

Chaining God is Stuart Armstrong's term for his proposed method of maintaining control over a superhuman AGI. Traditional AGI plans involve the creation of a Seed AI which recursively improves itself. As it self-improves, the AGI's program will become far too complex for humans to comprehend well enough to trust it. However, the original Seed AI is simple enough for us to understand and trust, and it will likewise be capable of understanding and trusting the successor it builds. If each iteration of the AGI improves only a copy of itself, the result is a chain of AGIs in which every link understands the AGI it created and is understood by its own creator. Trust can then be passed along the chain to even the most complex AGI.
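The chain-of-trust idea can be illustrated with a toy model. This is only a sketch under strong assumptions that are not part of Armstrong's proposal: "capability" is reduced to a single hypothetical number, and an agent is assumed to understand anything up to `max_gap` times its own capability. The names `Agent`, `build_chain`, and `chain_is_trusted` are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Agent:
    """One link in the chain. `capability` is a hypothetical scalar stand-in
    for how complex (and how able) the agent is."""
    level: int
    capability: float
    creator: Optional["Agent"] = None

    def can_understand(self, other: "Agent", max_gap: float = 2.0) -> bool:
        # Assumption: an agent can verify anything at most `max_gap` times
        # as capable as itself.
        return other.capability <= self.capability * max_gap


def build_chain(seed_capability: float, steps: int, growth: float = 1.5) -> List[Agent]:
    """Each agent improves a copy of itself; the copy becomes the next link."""
    chain = [Agent(level=0, capability=seed_capability)]
    for _ in range(steps):
        parent = chain[-1]
        child = Agent(parent.level + 1, parent.capability * growth, creator=parent)
        chain.append(child)
    return chain


def chain_is_trusted(chain: List[Agent], human_capability: float,
                     max_gap: float = 2.0) -> bool:
    """Humans vouch for the seed; every later link is vouched for by its creator."""
    human = Agent(level=-1, capability=human_capability)
    if not human.can_understand(chain[0], max_gap):
        return False
    return all(a.creator.can_understand(a, max_gap) for a in chain[1:])
```

In this toy model, a ten-step chain grown by a factor of 1.5 per step is trusted link by link, even though a human of capability 1.0 could not directly understand the final agent. Raising `growth` above `max_gap` breaks the chain at the first step, which mirrors the requirement that each improvement stay within what the creator can follow.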

Armstrong identifies a number of ways the chain design could fail, along with the response to each:

  • If an AGI at any level ever claims or is claimed to be untrustworthy, the chain should be instructed to gather diagnostic information, then start from scratch.
  • If the AGI chain passes integrity checks yet acts untrustworthy, restart from scratch.
  • If the AGI chain refuses to be shut down and can prevent us from shutting it down, we're screwed, and we can only work with the learning systems of the chain.
  • If after repeated attempts the chain continues to fail, or a level of intelligence is reached that claims the chain slows it down too much for further progress, at least some safe research has been conducted. We may choose to accept that limitation, or to simply accept an untrustworthy AGI.
  • If the AGI chain breaks invisibly, we're probably screwed.
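The detectable cases in the list above can be read as a simple supervision loop. The following is a rough sketch, not a procedure given by Armstrong: `Outcome`, `supervise`, and the returned decision strings are hypothetical names, and `run_attempt` stands in for whatever process builds and checks one chain. An invisible break does not appear here because, by definition, nothing would report it.

```python
from enum import Enum, auto
from typing import Callable, List, Tuple


class Outcome(Enum):
    OK = auto()
    UNTRUSTWORTHY = auto()     # a level claims, or is claimed, to be untrustworthy
    MISBEHAVES = auto()        # passes integrity checks yet acts untrustworthy
    RESISTS_SHUTDOWN = auto()  # refuses shutdown and can prevent it


def supervise(run_attempt: Callable[[int], Outcome],
              max_attempts: int) -> Tuple[str, List[str]]:
    """Run chain attempts until one succeeds, control is lost, or we give up."""
    log: List[str] = []
    for attempt in range(max_attempts):
        outcome = run_attempt(attempt)
        if outcome is Outcome.OK:
            return "continue", log
        if outcome is Outcome.RESISTS_SHUTDOWN:
            # No restart is possible once the chain can block shutdown.
            log.append(f"attempt {attempt}: control lost")
            return "control_lost", log
        if outcome is Outcome.UNTRUSTWORTHY:
            log.append(f"attempt {attempt}: diagnostics gathered, restarting")
        else:
            log.append(f"attempt {attempt}: restarting")
    # Repeated failure: accept the limitation or stop.
    return "accept_limit_or_stop", log
```

For example, `supervise(lambda i: Outcome.MISBEHAVES, 3)` restarts three times and then returns the decision `"accept_limit_or_stop"`.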

This is a very conservative approach to AGI design, and it carries a large opportunity cost. Armstrong believes the chain approach would be unlikely to produce anywhere near the best possible future, since the AGI chain would learn from present human values only. Each new layer of AGI would be limited in how far it could improve, so that its creator could still understand it. With supervision happening at each level, an AGI would take longer to develop, and after every restart the seed AI would again have to be humanly comprehensible. He believes an AGI chain is a simple way to create Friendly Artificial Intelligence, but he enumerates a number of ways the concept might never work.
