Reflective decision theory

From Lesswrongwiki
Revision as of 09:40, 7 October 2012 by Pedrochaves (talk | contribs)
Jump to: navigation, search

Reflective decision theory is a term occasionally used to refer to a decision theory that would allow an agent to take actions in a way that they do not trigger regret. This regret is conceptualized, according to the Causal Decision Theory, as a Reflective inconsistency, a divergence between the agent who took the action and the same agent reflecting upon it after.

Many hypothesized AGIs are expected to be powerful specifically due to an ability to access their own source code and self-modify. Because such an AGI could change its decision algorithm in a situation like Newcomb's Problem, it is necessary to develop a reflectively consistent decision theory to understand the AGI's behavior. Particularly, reflective consistency would be needed to ensure that an AGI preserved a Friendly value system throughout its self-modifications.

For the reasons above, this is a topic of interest to SIAI's research team. Proposed solutions include Eliezer Yudkowsky's Timeless Decision Theory.

See also

External links