Coherent Extrapolated Volition
On the topic of Friendly AI, Eliezer Yudkowsky argues that it would not be sufficient to explicitly program our desires into an AI. To truly act in our best interests, a superintelligent AI must do what we want it to do, rather than what we tell it to do. What is needed, therefore, is a meta-level method for determining what the object-level goal should be. Yudkowsky has presented Coherent Extrapolated Volition (CEV) as an example of what a solution to this problem might look like.
In calculating CEV, an AI would predict what an idealized version of us would want, "if we knew more, thought faster, were more the people we wished we were, had grown up farther together". It would recursively iterate this prediction for humanity as a whole and determine which desires converge. This initial dynamic would be used to form the AI's utility function.
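The description above is informal, but its "extrapolate each person's volition, then keep what converges" structure can be made concrete. The following is a minimal, purely illustrative Python sketch of that structure; the extrapolate and coherent_desires functions, the desire sets, and the convergence threshold are all hypothetical stand-ins, not anything specified in the CEV proposal itself.

```python
# Purely illustrative toy sketch of the CEV "extrapolate, then keep what
# converges" structure. All names and parameters here are hypothetical
# stand-ins; nothing corresponds to an actual proposed implementation.
from collections import Counter
from typing import List, Set


def extrapolate(desires: Set[str]) -> Set[str]:
    """Stand-in for idealizing a person's desires ("if we knew more,
    thought faster..."). Here it is just the identity; a real system
    would need a model of reflection and growth."""
    return set(desires)


def coherent_desires(people: List[Set[str]], threshold: float = 1.0) -> Set[str]:
    """Keep only the extrapolated desires shared by at least `threshold`
    of the population (the "convergent" desires)."""
    counts: Counter = Counter()
    for desires in people:
        counts.update(extrapolate(desires))
    return {d for d, n in counts.items() if n / len(people) >= threshold}


def utility(outcome: Set[str], cev: Set[str]) -> float:
    """Toy utility function: fraction of the convergent desires satisfied."""
    return len(outcome & cev) / len(cev) if cev else 0.0


if __name__ == "__main__":
    humanity = [
        {"flourishing", "autonomy", "novelty"},
        {"flourishing", "autonomy", "tradition"},
        {"flourishing", "autonomy", "security"},
    ]
    cev = coherent_desires(humanity)
    print(cev)                                      # {'flourishing', 'autonomy'}
    print(utility({"flourishing", "wealth"}, cev))  # 0.5
```

The two functions left as identity or simple counting are exactly the parts the proposal leaves unspecified: how to idealize a person's volition, and what counts as convergence across humanity.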
Problems with CEV include the great difficulty of implementing such a program and the possibility that human values may not converge. Yudkowsky considered CEV obsolete almost immediately after its publication in 2004. Despite this, CEV has remained a popular subject of criticism on LessWrong.
Blog posts
- A Short Introduction to Coherent Extrapolated Volition by Michael Anissimov
- Hacking the CEV for Fun and Profit by Wei Dai
- Two questions about CEV that worry me by Vladimir Slepnev
- Beginning resources for CEV research by Luke Muehlhauser
- Cognitive Neuroscience, Arrow's Impossibility Theorem, and Coherent Extrapolated Volition by Luke Muehlhauser
- Objections to Coherent Extrapolated Volition by Alexander Kruel
References
- Yudkowsky, Eliezer (2004) Coherent Extrapolated Volition
- Tarleton, Nick (2010) Coherent Extrapolated Volition: A Meta-Level Approach to Machine Ethics