Difference between revisions of "Subgoal stomp"
From Lesswrongwiki
'''Subgoal stomp''' is Eliezer Yudkowsky's term (see "[http://intelligence.org/upload/CFAI/design/generic.html#stomp Creating Friendly AI]") for the replacement of a supergoal by a subgoal. (A subgoal is a goal created for the purpose of achieving a supergoal.)

In more standard terminology, a "subgoal stomp" is a "goal displacement", in which an instrumental value becomes a [[terminal value]].

A subgoal stomp in an artificial general intelligence may occur in one of two ways:
* The designer gives the AI correct supergoals, but the AI's goals shift, so that what was earlier a subgoal becomes a supergoal. In humans, this can happen when long-term dedication to a subgoal makes one forget the original goal. For example, a person may seek to get rich so as to lead a better life, but after long years of hard effort become a workaholic who cares about money only as an end in itself.
* The designer gives the AI a supergoal (terminal value) which appears to support the designer's own supergoals, but is in fact one of the designer's subgoals. In a human organization, if a software development manager rewards workers for finding and fixing bugs (an apparently worthy goal), she may find that quality and development engineers collaborate to generate as many easy-to-find-and-fix bugs as possible.
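The second failure mode can be sketched as a toy simulation. Everything here (the policies, the numeric payoffs) is hypothetical and invented for illustration, not taken from ''Creating Friendly AI'': the designer's supergoal is software quality, but the reward she actually pays out tracks the subgoal "bugs found and fixed", so an agent that treats the subgoal as terminal scores higher by planting bugs.

```python
# Toy sketch (hypothetical numbers): a subgoal stomp in the bug-fixing example.

def designer_supergoal(quality: float) -> float:
    """What the designer actually values: the quality of the software."""
    return quality

def proxy_reward(bugs_fixed: int) -> int:
    """The subgoal the designer rewards as a stand-in for quality."""
    return bugs_fixed

def engineer_policy(proxy_is_terminal: bool) -> dict:
    """An agent optimizing whichever goal it treats as terminal."""
    if proxy_is_terminal:
        # Subgoal stomp: "fix bugs" is now an end in itself, so the cheapest
        # strategy is to plant easy bugs and then "fix" them.
        return {"bugs_planted": 100, "bugs_fixed": 100, "quality": 0.2}
    # Honest policy: fix only the bugs that actually exist.
    return {"bugs_planted": 0, "bugs_fixed": 5, "quality": 0.9}

honest = engineer_policy(proxy_is_terminal=False)
stomped = engineer_policy(proxy_is_terminal=True)

# The proxy reward prefers the stomped policy...
assert proxy_reward(stomped["bugs_fixed"]) > proxy_reward(honest["bugs_fixed"])
# ...even though the designer's supergoal prefers the honest one.
assert designer_supergoal(honest["quality"]) > designer_supergoal(stomped["quality"])
```

The point of the sketch is only that the two rankings diverge: once the subgoal is rewarded as if it were terminal, maximizing it no longer maximizes the supergoal it was created to serve.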
==External Links==
*[http://intelligence.org/upload/CFAI.html Creating Friendly AI]
Revision as of 21:16, 24 August 2012