Solomonoff induction
From Lesswrongwiki
Ray Solomonoff defined an inference system that will learn to correctly predict any computable sequence with only the absolute minimum amount of data. This system, in a certain sense, is the perfect universal prediction algorithm. To summarize it very informally, Solomonoff induction works by:
- Starting with all possible hypotheses (sequences) as represented by computer programs (that generate those sequences), weighted by their simplicity (2-n, where n is the program length);
- Discarding those hypotheses that are inconsistent with the data.
Weighting hypotheses by simplicity, the system automatically incorporates a form of Occam's razor, which is why it has been playfully referred to as Solomonoff's lightsaber.
Solomonoff induction gets off the ground with a solution to the
"problem of the priors". Suppose that you stand before a
universal prefix Turing machine
. You are interested in a certain finite output
string
. In particular, you want to know the
probability that
will produce the output
given a random input tape. This probability is
the Solomonoff a priori probability of
.
More precisely, suppose that a particular infinite input string
is about to be fed into
. However, you know nothing about
other than that each term of the string is either
or
. As far as your
state of knowledge is concerned, the
th digit of
is as likely to be
as it is to
be
, for all
. You
want to find the a priori probability
of
the following proposition:
(*) If takes in
as input, then
will produce output
and then halt.
Unfortunately, computing the exact value of would require solving the halting problem, which is undecidable. Nonetheless, it is easy to derive an expression for
. If
halts on an infinite input string
, then
must read only a finite initial segment of
, after which
immediately halts. We call a finite string
a self-delimiting program if and only if there exists an infinite input string
beginning with
such that
halts on
immediately after reading to the end of
. The set
of self-delimiting programs is the prefix code for
. It is the determination of the elements of
that requires a solution to the halting problem.
Given , we write "
" to express the proposition that
begins with
, and we write "
" to express the proposition that
produces output
, and then halts, when fed any input beginning with
. Proposition (*) is then equivalent to the exclusive disjunction
Since was chosen at random from
, we take the probability of
to be
, where
is the length of
as a bit string. Hence, the probability of (*) is
Blog posts
- An Intuitive Explanation of Solomonoff Induction by lukeprog and Alex_Altair
See also
References
- Algorithmic probability on Scholarpedia