# Solomonoff induction

Ray Solomonoff defined an inference system that will learn to correctly predict any computable sequence with only the absolute minimum amount of data. This system, in a certain sense, is the perfect universal prediction algorithm. To summarize it very informally, Solomonoff induction works by:

• Starting with all possible hypotheses (sequences) as represented by computer programs (that generate those sequences), weighted by their simplicity (2-n, where n is the program length);
• Discarding those hypotheses that are inconsistent with the data.

Weighting hypotheses by simplicity, the system automatically incorporates a form of Occam's razor, which is why it has been playfully referred to as Solomonoff's lightsaber.

Solomonoff induction gets off the ground with a solution to the "problem of the priors". Suppose that you stand before a universal prefix Turing machine $U$. You are interested in a certain finite output string $y_{0}$. In particular, you want to know the probability that $U$ will produce the output $y_{0}$ given a random input tape. This probability is the Solomonoff a priori probability of $y_{0}$.

More precisely, suppose that a particular infinite input string $x_{0}$ is about to be fed into $U$. However, you know nothing about $x_{0}$ other than that each term of the string is either $0$ or $1$. As far as your state of knowledge is concerned, the $i$th digit of $x_{0}$ is as likely to be $0$ as it is to be $1$, for all $i = 1, 2, \ldots$. You want to find the a priori probability $m(y_{0})$ of the following proposition:

(*) If $U$ takes in $x_{0}$ as input, then $U$ will produce output $y_{0}$ and then halt.

Unfortunately, computing the exact value of $m(y_{0})$ would require solving the halting problem, which is undecidable. Nonetheless, it is easy to derive an expression for $m(y_{0})$. If $U$ halts on an infinite input string $x$, then $U$ must read only a finite initial segment of $x$, after which $U$ immediately halts. We call a finite string $p$ a self-delimiting program if and only if there exists an infinite input string $x$ beginning with $p$ such that $U$ halts on $x$ immediately after reading to the end of $p$. The set $\mathcal{P}$ of self-delimiting programs is the prefix code for $U$. It is the determination of the elements of $\mathcal{P}$ that requires a solution to the halting problem.

Given $p \in \mathcal{P}$, we write "$\operatorname{prog}(x_{0}) = p$" to express the proposition that $x_{0}$ begins with $p$, and we write "$U(p) = y_{0}$" to express the proposition that $U$ produces output $y_{0}$, and then halts, when fed any input beginning with $p$. Proposition (*) is then equivalent to the exclusive disjunction

$\bigvee_{p \in \mathcal{P} \colon U(p) = y_{0}} (\operatorname{prog}(x_{0}) = p).$

Since $x_{0}$ was chosen at random from $\{0, 1\}^{\omega}$, we take the probability of $\operatorname{prog}(x_{0}) = p$ to be $2^{-\ell(p)}$, where $\ell(p)$ is the length of $p$ as a bit string. Hence, the probability of (*) is

$m(y_{0}) := \sum_{p \in \mathcal{P} \colon U(p) = y_{0}} 2^{-\ell(p)}.$