Naturalized induction

From Lesswrongwiki

Naturalized induction is an open problem in Friendly AI: build an algorithm that produces accurate generalizations and predictions from data sets while treating itself, its data inputs, and its hypothesis outputs as reducible to its physical posits. More broadly, the goal is to design a workable reasoning method that allows the reasoner to treat itself as fully embedded in the world it is reasoning about.

Naturalized inductors are associated with naturalism, in contrast to 'Cartesian' inductors, which are reasoners that assume a strict boundary between themselves and their environments. The standard example of an idealized Cartesian inductor is Solomonoff induction, an uncomputable but theoretically fruitful specification of a hypothesis space, a prior probability distribution, and a consistent rule for reassigning probabilities given data inputs. Because Solomonoff induction is currently the leading contender for a formalization of universally correct (albeit physically unrealizable) inductive reasoning, an essential step in formally defining the problem of naturalized induction will be evaluating the limitations of Solomonoff-based reasoners such as AIXI.
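
The three ingredients named above (a hypothesis space, a prior, and a rule for reassigning probabilities given data) can be illustrated with a small, computable toy. This is only a sketch under strong assumptions: real Solomonoff induction ranges over all programs for a universal Turing machine and is uncomputable, whereas here the hypotheses are just short bit patterns repeated forever. The function names (`make_hypotheses`, `posterior`, `prob_next_bit_is_one`) and the cutoff `max_len` are invented for this illustration.

```python
from fractions import Fraction
from itertools import product


def make_hypotheses(max_len):
    """Toy hypothesis space: every bit pattern of length 1..max_len,
    read as 'the data stream repeats this pattern forever'.
    (Assumption: a stand-in for 'all programs', which is not enumerable here.)"""
    return [bits
            for n in range(1, max_len + 1)
            for bits in product((0, 1), repeat=n)]


def predicts(pattern, data):
    """True if repeating `pattern` reproduces the observed bits exactly."""
    return all(bit == pattern[i % len(pattern)] for i, bit in enumerate(data))


def posterior(hypotheses, data):
    """Start from a length-based prior weight of 2**-len(pattern), discard
    hypotheses the data has refuted, and renormalize the remainder."""
    weights = {h: Fraction(1, 2 ** len(h))
               for h in hypotheses if predicts(h, data)}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no hypothesis in this toy space fits the data")
    return {h: w / total for h, w in weights.items()}


def prob_next_bit_is_one(hypotheses, data):
    """Posterior-weighted prediction for the next bit in the stream."""
    post = posterior(hypotheses, data)
    nxt = len(data)
    return sum(p for h, p in post.items() if h[nxt % len(h)] == 1)


if __name__ == "__main__":
    hyps = make_hypotheses(3)
    print(prob_next_bit_is_one(hyps, [1, 1]))  # prints 7/8
```

Note that this toy is Cartesian in the sense described above: the inductor sits outside the bit stream it predicts, and no hypothesis in its space says anything about the inductor itself or about how its inputs are physically produced.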

Naturalized induction is a particular angle of approach on larger Friendly AI superproblems such as the problem of hypotheses ('what formalism should a Friendly AI's hypotheses be expressed in? how wide a range of possibilities should a Friendly AI be able to consider?') and the problem of priors ('before receiving any data, what prior probabilities should a Friendly AI assign to its hypotheses?'). Here the emphasis is on making sure the AI has a realistic conception of nature and of its own place in nature. Other angles of approach to the problem of hypotheses and the problem of priors put the emphasis on issues like computational tractability, leverage penalties, logical uncertainty, or epistemic stability under self-modification. Subproblems specific to naturalized induction include:

  1. Solomonoff bug-spotting: finding limits on the robustness of AIXI approximations, e.g., formalizing or generalizing the anvil problem
  2. hypothesis idiom selection: selecting the right formalism for representing hypotheses, e.g., algorithmic, automata-theoretic, or model-theoretic
  3. expressivity: setting upper and lower bounds on the diversity of hypotheses given human uncertainty about exotic physics scenarios (e.g., time-travel, hypercomputation, or unusual mathematical structures)
  4. first-person reductionism: formalizing and defining reasonable priors for bridge hypotheses linking agent-internal representations to physical posits
  5. anthropics: conditioning on the reasoner's existence, e.g., in scenarios of indexical uncertainty or self-replication

Blog posts