Subgoal stomp

From Lesswrongwiki
Revision as of 07:07, 14 June 2012 by Alex Altair (talk | contribs)

A subgoal is a goal created to help achieve a higher-level, more important supergoal. A subgoal stomp is the pursuit of a subgoal in a way that defeats its purpose by overriding the supergoal. The term was originally defined by Eliezer Yudkowsky in Creating Friendly AI:

A "failure of Friendliness" scenario in which a subgoal stomps on a supergoal - for example, putting on your shoes before your socks, or turning all the matter in the Universe into computronium because some (ex-)petitioner asked you to solve the Riemann Hypothesis.

The concept emphasizes the need for care in designing AGI goal systems. A subgoal stomp can occur whenever the programmer fails to give the AI the correct supergoals. It can also happen if the AI's predictive horizon is too short; that is, if the AI cannot foresee the consequences of its actions far enough ahead. This failure can occur even with a Friendly goal if the reasoning system is inadequate.
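The predictive-horizon failure can be illustrated with a toy planner (a hypothetical sketch, not from Creating Friendly AI). The action names and reward values below are invented for illustration: "stomp" yields a small immediate subgoal payoff but makes the supergoal unreachable, while "prepare" then "achieve" yields the large supergoal payoff that only becomes visible with enough lookahead.

```python
from itertools import product

# Hypothetical actions: "stomp" is the subgoal shortcut,
# "prepare" + "achieve" is the two-step path to the supergoal.
ACTIONS = ["prepare", "achieve", "stomp"]

def reward(sequence):
    """Score an action sequence (illustrative values, not canonical)."""
    total = 0.0
    prepared = False
    blocked = False
    for action in sequence:
        if action == "stomp":
            total += 1.0      # small immediate subgoal payoff...
            blocked = True    # ...but the supergoal is now unreachable
        elif action == "prepare":
            prepared = True
        elif action == "achieve" and prepared and not blocked:
            total += 10.0     # supergoal payoff, needs two steps of lookahead
    return total

def best_first_action(horizon):
    # Exhaustively score every action sequence up to `horizon` steps
    # and return the first action of the best sequence found.
    best = max(product(ACTIONS, repeat=horizon), key=reward)
    return best[0]

print(best_first_action(1))  # 'stomp'   -- the subgoal stomps the supergoal
print(best_first_action(2))  # 'prepare' -- lookahead reveals the supergoal
```

With a horizon of one step, the planner cannot see the supergoal payoff, so the subgoal's immediate reward dominates; extending the horizon to two steps is enough, in this toy, for the supergoal to win.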

See Also