Moral divergence

From Lesswrongwiki

Latest revision as of 08:30, 18 August 2013

Moral divergence is a characterization of the current state of consequentialist and utilitarian moral theories. Specifically, it refers to the extent of disagreement among scholars along several dimensions of these theories.

Carl Shulman, Henrik Jonsson and Nick Tarleton discuss the issue in their paper Which Consequentialism? Machine Ethics and Moral Divergence. Several authors have suggested these kinds of theories as the basis for developing machine ethics. Shulman et al., however, discuss the variation among the theories, the problems this variation causes, and possible ways to address them.

Consequentialism is currently seen as having a great number of "free variables", that is, dimensions along which its versions vary. Hedonistic utilitarianism, for example, requires a description of experiential states, from pleasure to pain, but utilitarians disagree over how much to value each state. Preference utilitarians, meanwhile, debate the definition of even simple and crucial terms like preference, and dispute the value of the actual satisfaction of preferences versus the mere experience of satisfaction. More broadly, all utilitarians accept the need for interpersonal comparison and aggregation of utility, which itself introduces more free variables, such as whether to sum or average utilities. The picture emerging from the field is therefore one of considerable confusion, with no clear answer as to what kind of utility function to implement in AMAs (Artificial Moral Agents).
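To make the aggregation "free variable" concrete, here is a minimal sketch in Python. It is not taken from Shulman et al.; the population names and welfare numbers are hypothetical and chosen only for illustration. It shows that summing and averaging the same individual utilities can rank the same two outcomes in opposite orders.

```python
from statistics import mean

# Hypothetical welfare levels for two possible populations
# (illustrative numbers only, not drawn from the paper).
small_happy = [9, 9, 9]
large_modest = [4, 4, 4, 4, 4, 4, 4, 4]


def total_utility(welfare):
    """Total aggregation: add up every individual's welfare."""
    return sum(welfare)


def average_utility(welfare):
    """Average aggregation: mean welfare per individual."""
    return mean(welfare)


def preferred(aggregate, options):
    """Return the name of the option that the aggregation rule ranks highest."""
    return max(options, key=lambda name: aggregate(options[name]))


options = {"small_happy": small_happy, "large_modest": large_modest}

print(preferred(total_utility, options))    # large_modest (32 vs 27)
print(preferred(average_utility, options))  # small_happy (9 vs 4)
```

The two rules agree on the individual utilities and differ only in how they aggregate them, yet they recommend different outcomes. This is one small instance of the kind of divergence described above.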

The inadequacy of current moral theories for machine ethics is further explored through an analysis of the behavior that these different consequentialist views produce: despite their differences, they seem to prescribe roughly similar actions. This could imply that AMAs with diverse consequentialist views would converge in their behavior. This view, however, is seen as mistaken, since most current research in neuroscience and moral psychology indicates that moral decisions emerge from unconscious intuitions and that the verbal justifications for those intuitions are mostly rationalizations. It would follow that AMAs developed with explicit moral systems could be oblivious to any informal intuitions that the designers failed to specify.

In their conclusions, the authors reject views that favor these kinds of top-down, explicit implementations of moral systems, especially because of the difficulty of reaching a consensus purely through philosophical debate. At the same time, they urge caution about bottom-up approaches, since the AI could learn the wrong values through improper generalization. Instead, they propose that knowledge from neuroscience, experimental philosophy, and moral psychology be used to aid in the generation of ethical theories.

Further Reading & References

  • Chalmers, David John. (1996). The conscious mind: In search of a fundamental theory. Philosophy of Mind Series.
  • Haidt, Jonathan. (2001). The emotional dog and its rational tail: A social intuitionist approach to moral judgment. Psychological Review, 108(4), 814–834.
  • Moor, James H. (2006). The nature, importance, and difficulty of machine ethics. IEEE Intelligent Systems, 21(4), 18–21.
  • Wallach, Wendell, Colin Allen, and Iva Smit. (2008). Machine morality: Bottom-up and top-down approaches for modelling human moral faculties. In Ethics and artificial agents. Special issue, AI & Society, 22(4).
  • Yudkowsky, Eliezer. (2008). Artificial intelligence as a positive and negative factor in global risk. In Global catastrophic risks, ed. Nick Bostrom and Milan M. Ćirković, 308–345.

See also