IID Deficiency

The notion of randomness, in the sense of a property of individual sequences satisfying the randomness assumption, has played a role in the foundations of probability since at least von Mises's work (see, e.g., Mises 1919). Kolmogorov's measure-theoretic axioms were more successful than von Mises's approach based on random sequences as basis for mathematical probability, but Kolmogorov still felt that von Mises's random sequences (or collectives) were indispensable for the applications of probability. In 1963 Kolmogorov made his first attempt to define the notion of randomness formally (this attempt was "incomplete", according to Kolmogorov 1965). He proposed another, much more satisfactory, definition in the last section of his 1965 paper. His new definition was based on his universal notion of complexity.

The main papers in which Kolmogorov discussed his notion of randomness were Kolmogorov (1968, 1983). The 1968 paper is an imperfect translation from Russian; the Russian version was published in 1969 in Проблемы передачи информации. In those papers Kolmogorov uses "random" in informal discussions and "Bernoulli" (adjectivally) in more formal contexts. This article uses "IID" in the same sense, since:

"Bernoulli" is only applicable in the case of binary sequences. (Kolmogorov only considered the binary case in his papers, and discouraged his students from moving to more general cases before they understand the binary case really well.)
"Random" is nowadays used in a very wide sense, not necessarily meaning IID (following Martin-Lof's 1966 paper).

Kolmogorov defined a binary sequence ⚠ $x$ of length ⚠ $n$ and with ⚠ $k$ 1s to be ⚠ $m$-Bernoulli if ⚠ $K(x\mid n,k)\ge\log\binom{n}{k}-m$, where ⚠ $K$ is Kolmogorov complexity and ⚠ $\log$ is binary logarithm. Therefore, ⚠ $d(x)=\log\binom{n}{k}-K(x\mid n,k)$ measures how Bernoulli the sequence ⚠ $x$ is. However, in the modern language (see, e.g., Bienvenu et al. 2011), ⚠ $d(x)$ is the randomness deficiency of ⚠ $x$ with respect to the exchangeability model rather than the Bernoulli model. The randomness deficiency with respect to the class of all Bernoulli measures (see, e.g., Bienvenu et al. 2011, Section 4) is slightly different. In 1986 Kolmogorov's student (Vovk, 1986) found an exact relation between the deficiency of exchangeability (i.e, Kolmogorov's notion of the Bernoulliness deficiency) and the Bernoulliness deficiency in the modern sense. It has turned out that the latter is, essentially, equal to the former plus the binomiality deficiency of the number ⚠ $k$ of 1s in ⚠ $x$ (the binomiality deficiency of ⚠ $k$ being defined as its randomness deficiency with respect to the class of all binomial measures on ⚠ $\{0,1,\ldots,n\}$). It remains an open problem to extend this result to the general observation spaces.

If the IID deficiency was computable, the problem of prediction under the randomness assumption would be straightforward. Suppose the observations ⚠ $z_i=(x_i,y_i)$ consist of objects ⚠ $x_i$ and binary labels ⚠ $y_i\in\{0,1\}$. Given a training sequence ⚠ $z_1,\ldots,z_n$ and a test object ⚠ $x_{n+1}$, we can consider both possible completions, ⚠ $z_1,\ldots,z_n,(x_{n+1},0))$ and ⚠ $z_1,\ldots,z_n,(x_{n+1},1))$. A confident prediction is possible if and only if only one of the completions has a small IID deficiency. Historically, this was the origin of conformal prediction (Vovk et al., 2005, Section 2.7).