Difference between revisions of "Reflective decision theory"

From Lesswrongwiki
Line 1: Line 1:
 
'''Reflective decision theory''' is a term occasionally used to refer to a decision theory that would allow an agent to act in a way that does not trigger regret. This regret is conceptualized, according to [[Causal Decision Theory]], as a [[Reflective inconsistency]]: a divergence between the agent who took the action and the ''same'' agent reflecting upon it afterwards.
  
Many hypothesized [[AGI]]s are expected to be powerful specifically due to an ability to access their own source code and self-modify. Because such an AGI could change its decision algorithm in a situation like Newcomb's Problem, it is necessary to develop a reflectively consistent decision theory to understand the AGI's behavior. Particularly, reflective consistency would be needed to ensure that an AGI preserved a [[Friendly Artificial Intelligence|Friendly]] value system throughout its self-modifications.
+
When considering thought experiments such as Newcomb's Problem, it has been suggested that a sufficiently powerful [[AGI]] would be able to access its own source code and self-modify. This would allow the AGI to alter its own behavior and decision process, resolving the paradox by precommitting to a certain choice. To understand the AGI's behavior in this and other situations, and to be able to implement it, we will need a reflectively consistent decision theory. In particular, reflective consistency would be needed to ensure that an AGI preserves a [[Friendly Artificial Intelligence|friendly]] value system throughout its self-modifications.
  
For the reasons above, this is a topic of interest to SIAI's research team. Proposed solutions include Eliezer Yudkowsky's [[Timeless Decision Theory]].
+
Eliezer Yudkowsky has proposed a theoretical solution to the problem in his [[Timeless Decision Theory]].
  
 +
==Further Reading & References==
 +
*[http://intelligence.org/upload/TDT-v01o.pdf Timeless Decision Theory] by Eliezer Yudkowsky
 +
*[http://johncarlosbaez.wordpress.com/2011/03/07/this-weeks-finds-week-311/ interview] of Eliezer Yudkowsky by John Baez, March 7th, 2011
  
 
==See also==
 
Line 12: Line 15:
 
*[[Timeless Decision Theory]]
 
 
*[[Complexity of value]]
 
 
==External links==
 
 
*[http://intelligence.org/research/researchareas SIAI research areas]
 
*[http://intelligence.org/upload/TDT-v01o.pdf Timeless Decision Theory] by Eliezer Yudkowsky
 
*[http://johncarlosbaez.wordpress.com/2011/03/07/this-weeks-finds-week-311/ interview] of Eliezer Yudkowsky by John Baez, March 7th, 2011
 
  
 
[[Category:Concepts]]
 

Revision as of 11:14, 7 October 2012

Reflective decision theory is a term occasionally used to refer to a decision theory that would allow an agent to act in a way that does not trigger regret. This regret is conceptualized, according to Causal Decision Theory, as a reflective inconsistency: a divergence between the agent who took the action and the same agent reflecting upon it afterwards.

When considering thought experiments such as Newcomb's Problem, it has been suggested that a sufficiently powerful AGI would be able to access its own source code and self-modify. This would allow the AGI to alter its own behavior and decision process, resolving the paradox by precommitting to a certain choice. To understand the AGI's behavior in this and other situations, and to be able to implement it, we will need a reflectively consistent decision theory. In particular, reflective consistency would be needed to ensure that an AGI preserves a friendly value system throughout its self-modifications.
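
The precommitment argument can be illustrated with a small expected-value calculation. The sketch below is not taken from the article or from Timeless Decision Theory itself. It assumes the conventional Newcomb payoffs (1,000,000 in the opaque box, 1,000 in the transparent box) and a hypothetical `predictor_accuracy` parameter giving the probability that the predictor correctly foresees the agent's policy. The function names are likewise illustrative.

```python
# Conventional Newcomb payoffs (an assumption; the article does not state them).
OPAQUE_PRIZE = 1_000_000   # in the opaque box only if one-boxing was predicted
TRANSPARENT_PRIZE = 1_000  # always present in the transparent box

POLICIES = ("one-box", "two-box")


def expected_payoff(policy: str, predictor_accuracy: float) -> float:
    """Expected winnings for an agent precommitted to `policy`.

    `predictor_accuracy` is the hypothetical probability that the
    predictor correctly foresees which policy the agent follows.
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    if not 0.0 <= predictor_accuracy <= 1.0:
        raise ValueError("predictor_accuracy must be in [0, 1]")

    if policy == "one-box":
        # The opaque box is full only when the prediction was correct.
        return predictor_accuracy * OPAQUE_PRIZE
    # Two-boxing: the opaque box is full only when the prediction was wrong,
    # and the transparent box is collected either way.
    return (1.0 - predictor_accuracy) * OPAQUE_PRIZE + TRANSPARENT_PRIZE


def best_precommitment(predictor_accuracy: float) -> str:
    """Policy with the higher expected payoff if fixed before the prediction."""
    return max(POLICIES, key=lambda p: expected_payoff(p, predictor_accuracy))
```

Under these assumed payoffs, one-boxing has the higher expected payoff whenever the predictor's accuracy exceeds 0.5005, so an agent able to fix its policy in advance would, on this simple model, commit to one-boxing against any reasonably reliable predictor.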

Eliezer Yudkowsky has proposed a theoretical solution to the problem in his Timeless Decision Theory.

Further Reading & References

See also