Gauss Linear Model

The Gauss statistical model assumes that the observations $(x_n,y_n)\in\mathbb{R}^p\times\mathbb{R}$ are generated as follows:

  • there are no restrictions on how the $x_n$ are generated;
  • given $x_n$, each $y_n$ is generated as $y_n = w\cdot x_n + \xi_n$, where $w\in\mathbb{R}^p$ is a vector of parameters, the noise variables $\xi_n$ are independent and distributed as $N(0,\sigma^2)$, and $\sigma>0$ is another parameter.
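The generating mechanism above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `generate_gauss_linear`, the example inputs `xs`, and the values chosen for $w$ and $\sigma$ are hypothetical and are not taken from the cited works.

```python
import random

def generate_gauss_linear(xs, w, sigma, seed=0):
    """Return y_n = w . x_n + xi_n with independent xi_n ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    ys = []
    for x in xs:
        mean = sum(wi * xi for wi, xi in zip(w, x))  # the dot product w . x_n
        ys.append(mean + rng.gauss(0.0, sigma))      # add Gaussian noise
    return ys

# The model places no restriction on the x_n; here they are an arbitrary fixed list.
xs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]
w = (0.5, -2.0)  # hypothetical parameter vector in R^2
ys = generate_gauss_linear(xs, w, sigma=0.1)
```

Only the $y_n$ are random in this sketch; the $x_n$ are passed in as given, which mirrors the first bullet above.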

See Vovk et al. (2005), Section 8.5, and Vovk et al. (2009) for the formulation of this model as an on-line compression model.

In the most basic version of this model there are no $x$s, and the model reduces to $y_n\sim N(0,\sigma^2)$. The summary of $y_1,\ldots,y_n$ is $t_n:=y_1^2+\cdots+y_n^2$, and the Gauss repetitive structure postulates that, given $t_n$, the distribution of $(y_1,\ldots,y_n)$ is uniform on the sphere of radius $t_n^{1/2}$ centred at the origin of $\mathbb{R}^n$. Borel (1914) noticed that the Gauss statistical model (used by Maxwell as a model in statistical physics) is equivalent to the Gauss repetitive structure (used for a similar purpose by Gibbs). For further historical comments, see Vovk et al. (2005), Section 8.8, and Diaconis and Freedman (1987), Section 6.
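As a rough illustration of the repetitive structure in the case without $x$s, the following Python sketch computes the summary $t_n$ and draws a new sequence uniformly from the sphere of radius $t_n^{1/2}$. It relies on the standard fact that a vector of independent standard Gaussians, once normalised, is uniformly distributed on the unit sphere. The function names `summary` and `sample_given_summary` and the value of `sigma` are hypothetical choices made for this example.

```python
import math
import random

def summary(ys):
    """t_n = y_1^2 + ... + y_n^2."""
    return sum(y * y for y in ys)

def sample_given_summary(t, n, rng):
    """Draw (y_1, ..., y_n) uniformly from the sphere of radius sqrt(t) in R^n."""
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]  # rotationally symmetric direction
    norm = math.sqrt(summary(z))
    radius = math.sqrt(t)
    return [radius * zi / norm for zi in z]

rng = random.Random(0)
sigma = 3.0  # hypothetical value; the sampler below never uses it
ys = [rng.gauss(0.0, sigma) for _ in range(5)]
t = summary(ys)
ys_new = sample_given_summary(t, len(ys), rng)  # same summary t, no sigma needed
```

The sampler takes only $t_n$ and $n$ as inputs, which reflects the point that the conditional distribution given the summary does not depend on $\sigma$.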


  • Persi Diaconis and David Freedman (1987). A dozen de Finetti-style results in search of a theory. Annales de l'Institut Henri Poincaré B 23:397-423.
  • Vladimir Vovk, Alexander Gammerman and Glenn Shafer (2005). Algorithmic learning in a random world. Springer, New York.
  • Vladimir Vovk, Ilia Nouretdinov, and Alexander Gammerman (2009). On-line predictive linear regression. Annals of Statistics 37:1566-1590.