Computation Hazard

From Lesswrongwiki
Revision as of 05:21, 3 October 2012 by Joaolkf (talk | contribs)

A computation hazard is a large negative consequence that may arise merely from vast amounts of computation [1]. It is a risk inherent in any sufficiently vast computation, rather than one tied to a specific computation such as an unfriendly AI. For example, imagine an advanced computer program that was designed to understand human pain. In order to do so, it might end up accurately simulating a large group of humans feeling pain, in such detail that it instantiates actual conscious human suffering.

There are many other hazards involved with computation. More generally, the more computations and algorithms are run, the more likely it is that some of them are a serious hazard. Fewer and simpler computations are less likely to be hazardous: a very short and simple program will most likely not be a computation hazard on a normal computer. Some computations are especially hazardous because they either (1) run almost all possible computations, and hence almost all possible computation hazards (e.g. a Solomonoff induction algorithm or any Turing-complete game simulation), or (2) are particularly likely to run algorithms that are computation hazards (e.g. agents, predictors and oracles).
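
The relationship between the number of computations and the chance of a hazard can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that each computation is independently hazardous with the same small probability `p`. The independence assumption, the function name `prob_at_least_one_hazard`, and any numeric values are hypothetical and do not come from the article.

```python
import math


def prob_at_least_one_hazard(p: float, n: int) -> float:
    """Chance that at least one of n computations is a hazard.

    ASSUMPTION (illustrative only): every computation is hazardous
    independently, with the same probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 0.0
    if p == 1.0:
        return 1.0
    # 1 - (1 - p)**n, computed with log1p/expm1 so that very small p
    # and very large n do not lose precision.
    return -math.expm1(n * math.log1p(-p))


if __name__ == "__main__":
    # Hypothetical per-computation hazard probability.
    for n in (1, 10**3, 10**6, 10**9):
        print(n, prob_at_least_one_hazard(1e-9, n))
```

Under this assumption the probability grows monotonically with `n` and approaches 1 as `n` becomes very large, which matches the qualitative claim that vast amounts of computation are riskier than a single short program.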

Agents can be a hazard since they are defined by the intention of maximizing a goal, and this goal may be detrimental to humanity. The classic example is the paperclip maximizer, an AGI whose sole goal is to maximize the total number of paperclips. Recursively self-improving agents are especially dangerous since their power can grow rapidly and unpredictably. They will also probably need to simulate the behavior of other agents (e.g. humans), and so they also present the hazard of simulating a great deal of conscious suffering.

A predictor is a computation that takes data as input and predicts what data will come next. Oracles are computations designed to answer questions, which can be predictions or questions about predictions. At first glance they may seem harmless, but they might have to simulate the agents whose behavior they are trying to predict. These agents may eventually start to self-improve and come to dominate the behavior of the predictor or oracle. A predictor may also emit self-fulfilling prophecies, that is, predictions that are more likely to come true because they are emitted, as a way of improving its prediction accuracy. An oracle might also simulate the minds of its creators in order to better answer their questions.
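
To make the definition of a predictor concrete, here is a minimal sketch of one: it takes a sequence of symbols as input and guesses the next symbol from how often each symbol has followed the most recent one. The function name `predict_next` and the frequency-counting approach are assumptions chosen for illustration, not a method described in the article. A predictor this simple simulates nothing; the concerns above apply to far more capable predictors.

```python
from collections import Counter, defaultdict
from typing import Optional


def predict_next(history: str) -> Optional[str]:
    """Guess the symbol that follows `history` (hypothetical toy predictor).

    Strategy: count which symbol has most often followed the last symbol
    seen so far. If the last symbol has never had a successor, fall back
    to the most frequent symbol overall. Returns None for empty input.
    """
    if not history:
        return None

    # successors[a][b] = number of times symbol b directly followed symbol a
    successors = defaultdict(Counter)
    for current, following in zip(history, history[1:]):
        successors[current][following] += 1

    last = history[-1]
    if successors[last]:
        return successors[last].most_common(1)[0][0]

    # No observed successor for the last symbol: use overall frequency.
    return Counter(history).most_common(1)[0][0]


if __name__ == "__main__":
    print(predict_next("abababa"))
```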

There are two main strategies for avoiding these kinds of risk. The first is to keep computations small and simple until there is clear assurance of their safety. The second is to use some kind of agent detector (a non-person predicate), which would ensure that a computation doesn't contain agents or persons.