Terminal value
Revision as of 06:43, 26 August 2012

A terminal value is an ultimate goal, an end-in-itself. In an AI with a utility or reward function, the terminal value is the maximization of that function.
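
As a rough illustration of this definition (a toy sketch, not drawn from any particular system), an expected-utility maximizer can be written in a few lines of Python; the action names, outcome model, and utility numbers are all invented:

    # Minimal sketch of an agent whose terminal value is "maximize this function".
    # The actions, outcome model, and utilities below are invented placeholders.

    def expected_utility(action, outcome_model, utility):
        # Probability-weighted sum of the utility of each possible outcome.
        return sum(p * utility(outcome) for outcome, p in outcome_model(action))

    def choose_action(actions, outcome_model, utility):
        # The utility function is the agent's only ultimate criterion.
        return max(actions, key=lambda a: expected_utility(a, outcome_model, utility))

    # Toy usage with made-up numbers.
    utility = lambda outcome: {"good": 1.0, "bad": 0.0}[outcome]
    outcome_model = lambda action: ([("good", 0.8), ("bad", 0.2)] if action == "work"
                                    else [("good", 0.3), ("bad", 0.7)])
    print(choose_action(["work", "idle"], outcome_model, utility))  # -> "work"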

In Eliezer Yudkowsky's earlier writings, the non-standard term "supergoal" is used instead.

Terminal values vs. instrumental values

Terminal values stand in contrast to instrumental values, which are means to an end, mere tools for achieving terminal values. For example, if a given university student does not enjoy studying but does so merely to obtain a professional qualification, his terminal value is getting a job, while getting good grades is an instrument to that end.

Some values may be called "terminal" merely in relation to an instrumental goal, yet themselves serve instrumentally towards a higher goal. In the previous example, the student may want the job only to gain social status and money; if he could get the prestige and money without working, he would, and in that case the job is instrumental to these other values. However, in considering future AI, the phrase "terminal value" is generally used only for the top level of the goal hierarchy: the true ultimate goals of a system, those which do not serve any higher value.
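
To make the distinction concrete, the following toy Python sketch scores an option purely by how much it is expected to advance a terminal goal; the goal, the option names, and the probabilities are invented for the student example above:

    # Toy model of the student example: "get_job" is terminal, grades are instrumental.
    TERMINAL_UTILITY = {"get_job": 1.0}        # an end in itself
    P_JOB_GIVEN = {"good_grades": 0.7,         # invented conditional probabilities
                   "poor_grades": 0.2}

    def instrumental_value(option):
        # The option's value is entirely derived from its expected contribution
        # to the terminal goal; nothing about the option is valued for its own sake.
        return P_JOB_GIVEN[option] * TERMINAL_UTILITY["get_job"]

    print(instrumental_value("good_grades"))   # 0.7
    print(instrumental_value("poor_grades"))   # 0.2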

Human terminal values

Humans' system of terminal values is quite complex. These values were forged by evolution in the ancestral environment to maximize inclusive genetic fitness; they include survival, life, health, friendship, social status, love of various kinds, joy, aesthetic pleasure, curiosity, and much more. Evolution's implicit goal is inclusive genetic fitness, but humans do not have inclusive genetic fitness as a goal. Rather, these values, which were *instrumental* to inclusive genetic fitness, have become humans' terminal values (an example of subgoal stomp).

Humans cannot fully introspect their terminal values. Humans' values are often mutually contradictory and change over time.

Non-human terminal values

Future artificial general intelligences may have the maximization of a utility function or of a reward function (as in reinforcement learning) as their terminal value. The precise nature of this function will likely be chosen by the designers.
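
As an illustrative sketch of "maximizing a designer-chosen reward function as a terminal value" (a toy, not a description of any actual system), the following Python loop implements a simple epsilon-greedy bandit agent whose only criterion is cumulative reward; the reward function and action names are hypothetical:

    import random

    # The designer fixes the reward function; the agent's terminal value is
    # simply to accumulate as much of this reward as possible.
    def designer_reward(action):
        return 1.0 if action == "intended_behaviour" else 0.0   # hypothetical choice

    def run_agent(steps=100, epsilon=0.1):
        actions = ["intended_behaviour", "other_behaviour"]
        estimates = {a: 0.0 for a in actions}   # running estimate of reward per action
        counts = {a: 0 for a in actions}
        total = 0.0
        for _ in range(steps):
            # Mostly exploit the action with the highest estimated reward,
            # occasionally explore at random.
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda x: estimates[x])
            r = designer_reward(a)
            counts[a] += 1
            estimates[a] += (r - estimates[a]) / counts[a]
            total += r
        return total

    print(run_agent())   # typically well above 90 with these toy settings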

The paperclip maximizer is a thought experiment about an artificial general intelligence assigned the apparently innocuous terminal value of maximizing the number of paperclips in its collection, with consequences disastrous to humanity.
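
A toy numerical version of the thought experiment (invented here for illustration) shows the mechanism: the utility function below counts only paperclips, so the plan that destroys everything else still wins as long as it produces more paperclips:

    # Toy paperclip maximizer: utility counts paperclips and nothing else.
    def utility(world):
        return world["paperclips"]           # human welfare never enters the calculation

    def best_plan(world, plans):
        # Pick whichever plan yields the most paperclips, ignoring all side effects.
        return max(plans, key=lambda plan: utility(plan(world)))

    world = {"paperclips": 0, "spare_resources": 100, "human_welfare": 100}

    def convert_spare_resources(w):
        return {"paperclips": w["paperclips"] + w["spare_resources"],
                "spare_resources": 0, "human_welfare": w["human_welfare"]}

    def convert_everything(w):
        return {"paperclips": w["paperclips"] + w["spare_resources"] + w["human_welfare"],
                "spare_resources": 0, "human_welfare": 0}

    print(best_plan(world, [convert_spare_resources, convert_everything]).__name__)
    # -> "convert_everything": more paperclips, even though human welfare drops to zero.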

AIXI is a mathematical formalism for modeling intelligence. It illustrates that terminal values can be arbitrary as far as intelligence is concerned: AIXI is provably more intelligent than any other agent for *any* computable reward function.
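
For reference, here is a sketch of Hutter's expectimax formulation of AIXI in LaTeX notation (a transcription that should be checked against the original source); $U$ is a universal Turing machine, $\ell(q)$ is the length of program $q$, $a_i$, $o_i$, $r_i$ are actions, observations, and rewards, and $m$ is the horizon:

    a_k \;:=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
        \big[\, r_k + \cdots + r_m \,\big]
        \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}

The reward terms $r_k, \ldots, r_m$ come from whatever reward function is plugged in; nothing in the formalism constrains what is being rewarded, which is the sense in which the terminal value is arbitrary.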

  • Benevolence as an instrumental and as a terminal value
  • Shifts in human terminal values
    • Kantian
    • Other
    • Humans as adaptation executors

Links

Eliezer Yudkowsky, Terminal Values and Instrumental Values: http://lesswrong.com/lw/l4/terminal_values_and_instrumental_values/