Coherent Extrapolated Volition

On the topic of Friendly AI, Eliezer Yudkowsky argues that it would not be sufficient to explicitly program our desires into an AI. Rather, to truly act in our best interests, a superintelligent AI must do what we want it to do, not merely what we tell it to do. What is needed, therefore, is a meta-level method of determining what the object-level goal will be. Yudkowsky presented Coherent Extrapolated Volition as an example of what a solution to this problem might look like.

In calculating CEV, an AI would predict what an idealized version of us would want, "if we knew more, thought faster, were more the people we wished we were, had grown up farther together". It would recursively iterate this prediction for humanity as a whole, and determine the desires which converge. This initial dynamic would be used to form the AI's utility function.
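The following toy sketch is not an implementation of CEV; no one knows how to write the extrapolation step, and the extrapolate function below is a hypothetical stand-in for "what this person would want if they knew more, thought faster, were more the person they wished they were, and had grown up farther together with others". It only illustrates the abstract shape of the dynamic described above: idealize each volition, iterate, and keep the desires on which the extrapolated volitions converge.

from typing import Callable, List, Set

Preference = str
Volition = Set[Preference]


def coherent_extrapolated_volition(
    current_volitions: List[Volition],
    extrapolate: Callable[[Volition, List[Volition]], Volition],
    iterations: int = 10,
) -> Volition:
    """Return the preferences on which extrapolated volitions converge."""
    volitions = current_volitions
    for _ in range(iterations):
        # Idealize each person's volition in the context of everyone
        # else's ("had grown up farther together"), then repeat.
        volitions = [extrapolate(v, volitions) for v in volitions]
    # Keep only the desires shared by every extrapolated volition;
    # desires that fail to converge drop out of the initial dynamic.
    return set.intersection(*volitions) if volitions else set()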

Problems with CEV include the great difficulty of implementing such a program and the possibility that human values may not converge. Yudkowsky considered CEV obsolete almost immediately after its publication in 2004. Despite this, CEV has remained a popular target of criticism on LessWrong.
