Friendly artificial intelligence

From Lesswrongwiki
Revision as of 14:40, 1 November 2009 by Zack M. Davis (talk | contribs) (_specifically designed_ to have a positive effect)

A Friendly Artificial Intelligence (FAI) is an artificial general intelligence (AGI) specifically designed to have a positive effect on humanity. Friendly AI also refers to the field of knowledge required to build such an AI. Note that Friendly (capital-F) is used as a term of art, referring specifically to AIs that protect humans and humane values; an FAI need not be "friendly" in the conventional sense and need not even be sentient. Any AGI that is not Friendly is said to be Unfriendly.

An AI that underwent an intelligence explosion could exert unprecedented optimization power over its future; therefore, a Friendly AI could very well create an unimaginably good future. Conversely, an Unfriendly AI could represent an existential risk: destroying all humans, not out of hostility, but as a side effect of trying to do something entirely different. (Beware of the giant cheesecake fallacy.)

Requiring Friendliness doesn't make AGI any easier, and almost certainly makes it harder. Most approaches to AGI aren't amenable to implementing precise goals, and so don't even constitute subprojects for FAI; for such approaches, Unfriendly AI is the only possible "successful" outcome. Specifying Friendliness also presents unique technical challenges: humane moral value is very complex, and many simple-sounding moral concepts conceal hidden complexity that is not "inherent" in the universe itself. It is likely impossible to specify humane values by explicitly programming them in; one needs a technique for extracting them automatically.


See also

Blog posts