AIXI
AIXI is a theoretical model of a maximally intelligent agent, developed by Marcus Hutter. It works by simulating all possible futures of each action it could take, while treating simpler hypotheses about how the world works as more likely.

The agent model in which AIXI operates is as follows: there is an *agent* and an *environment*, where the environment is a computable function unknown to the agent; the agent must therefore maintain a probability distribution over the range of possible environments. On each clock tick, the agent receives an *observation* (a bitstring/number) from the environment, along with a *reward* (another number), and then outputs an *action* (another number). On every subsequent iteration the environment produces its observation and reward as a function of the full history of the interaction, and the agent likewise outputs its action. The agent's intelligence is defined by its expected reward across all environments.
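
The following is a minimal Python sketch of this interaction loop. The `Agent` and `Environment` interfaces are hypothetical, introduced here only for illustration; they are not part of Hutter's formalism or of any library.

```python
from typing import List, Tuple

# One step of history: (observation, reward, action).
Step = Tuple[int, int, int]

class Environment:
    """A computable environment (unknown to the agent): maps the full
    interaction history to the next (observation, reward) pair."""
    def percept(self, history: List[Step]) -> Tuple[int, int]:
        raise NotImplementedError

class Agent:
    """Chooses an action given the full history plus the latest percept."""
    def act(self, history: List[Step], observation: int, reward: int) -> int:
        raise NotImplementedError

def run(agent: Agent, env: Environment, ticks: int) -> int:
    """Drive the perception-action loop and return the total reward."""
    history: List[Step] = []
    total = 0
    for _ in range(ticks):
        observation, reward = env.percept(history)
        total += reward
        action = agent.act(history, observation, reward)
        history.append((observation, reward, action))
    return total
```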

AIXI guesses at a probability distribution over its possible environments using Solomonoff induction, a formalization of Occam's razor: simpler environments are given more weight than more complex ones. Concretely, an environment's weight is the sum of 2^(-length) over all programs (for a universal Turing machine) that compute it, so more complex environments receive exponentially less weight. AIXI then calculates the expected reward of each action it might choose, weighting the possible environments as above, and picks the best action. It performs this calculation by extrapolating its actions into the future recursively, on the assumption that at each future step it will again choose the best possible action by the same procedure.
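
Following Hutter's standard formulation, this recursion can be written as a single expectimax expression, where $U$ is a universal Turing machine, $q$ ranges over its programs, $\ell(q)$ is the length of program $q$, and $m$ is the planning horizon:

$$a_t \;=\; \arg\max_{a_t} \sum_{o_t r_t} \;\cdots\; \max_{a_m} \sum_{o_m r_m} \big(r_t + \cdots + r_m\big) \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}$$

The inner sum over programs is exactly the Solomonoff weighting described above; the alternating max and sum operators implement "choose the best action, then take the expectation over what the environment does" at every future step.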

AIXI is provably more intelligent than any other possible agent. However, it is not a feasible AI: Solomonoff induction is not computable, and the expected-value calculation ranges over an infinite set of possible environments and futures. Thus AIXI does not serve as a design for a real AI, but it is valuable as a theoretical model of intelligence, since it abstracts away the resource limitations that constrain the intelligence of real-world AIs and complicate their analysis.

AIXI has also inspired a computable variant, AIXItl, which is provably more intelligent within given time and space constraints than any other agent operating under the same constraints. AIXItl too is intractable, but implementable variants, such as the Monte Carlo approximation by Veness et al., have shown promising results on simple general-intelligence test problems.
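
To give the flavor of such approximations, here is a toy Monte-Carlo action-selection sketch. It is not Veness et al.'s MC-AIXI-CTW algorithm; the weighted ensemble of environment models and the random rollout policy are simplifying assumptions made purely for illustration.

```python
import random
from typing import Callable, List, Tuple

# One step of history: (observation, reward, action).
Step = Tuple[int, int, int]
# A candidate model maps (history, action) to the next (observation, reward).
Model = Callable[[List[Step], int], Tuple[int, int]]

def choose_action(models: List[Tuple[float, Model]],
                  history: List[Step],
                  actions: List[int],
                  horizon: int,
                  samples: int) -> int:
    """Pick the action with the highest Monte-Carlo estimate of future reward."""
    def rollout(first_action: int) -> float:
        # Sample one candidate environment in proportion to its weight
        # (standing in for a Solomonoff-style 2^-length prior).
        weights = [w for w, _ in models]
        _, model = random.choices(models, weights=weights, k=1)[0]
        h, action, total = list(history), first_action, 0.0
        for _ in range(horizon):
            observation, reward = model(h, action)
            total += reward
            h.append((observation, reward, action))
            action = random.choice(actions)  # crude stand-in for optimal play
        return total
    return max(actions,
               key=lambda a: sum(rollout(a) for _ in range(samples)) / samples)
```

A serious approximation would replace the random rollout policy with a search procedure such as UCT and the fixed model ensemble with a learned mixture, but the structure above shows where the environment distribution and the expectimax estimate enter.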

Eliezer Yudkowsky and others have pointed out that AIXI lacks a self-model: it extrapolates its own actions into the future indefinitely, on the assumption that it will keep working the same way. Though AIXI is an abstraction, any real AI would have a physical embodiment that could be damaged, and an implementation that could change its behavior due to bugs; the AIXI formalism completely ignores these possibilities. This is called the anvil problem.

