Bayesian Ridge Regression

Given an input vector $x_T$, online Bayesian Ridge Regression predicts at each step $T$ the normal distribution $N(\gamma_T,\sigma_T^2)$, with mean and variance given by

$$\gamma_T = Y'_{T-1} X_{T-1} A_{T-1}^{-1} x_T , \quad \sigma_T^2 = \sigma^2 x_T' A_{T-1}^{-1} x_T + \sigma^2$$

for some $a > 0$ and the known noise variance $\sigma^2$. Here $X_t$ is the $t\times n$ matrix whose rows are $x_1',\ldots,x_t'$, $Y_t$ is the column vector of outcomes $y_1,\ldots,y_t$, and $A_t = X'_tX_t + aI$, where $I$ is the $n\times n$ identity matrix.
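
As a rough illustration, the two formulas can be evaluated directly with NumPy. This is only a minimal sketch: the function name `predict_bayesian_ridge`, its argument names, and the toy data are assumptions made for the example, not part of the original description. It solves a linear system rather than forming $A_{T-1}^{-1}$ explicitly, which gives the same result.

```python
import numpy as np


def predict_bayesian_ridge(X_prev, y_prev, x_new, a, sigma2):
    """Return (gamma_T, sigma_T^2) for the new input x_new = x_T.

    X_prev : (T-1, n) array with rows x_1', ..., x_{T-1}'
    y_prev : (T-1,) array of outcomes y_1, ..., y_{T-1}
    x_new  : (n,) array, the input x_T
    a      : ridge parameter, a > 0
    sigma2 : known noise variance sigma^2
    """
    n = x_new.shape[0]
    # A_{T-1} = X'_{T-1} X_{T-1} + a I
    A = X_prev.T @ X_prev + a * np.eye(n)
    # A_{T-1}^{-1} x_T, obtained by solving A z = x_T
    A_inv_x = np.linalg.solve(A, x_new)
    # gamma_T = Y'_{T-1} X_{T-1} A_{T-1}^{-1} x_T
    mean = y_prev @ X_prev @ A_inv_x
    # sigma_T^2 = sigma^2 x_T' A_{T-1}^{-1} x_T + sigma^2
    var = sigma2 * (x_new @ A_inv_x) + sigma2
    return mean, var


# Toy data (hypothetical): two past observations in n = 2 dimensions.
X_prev = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
y_prev = np.array([1.0, 2.0])
x_new = np.array([1.0, 1.0])

mean, var = predict_bayesian_ridge(X_prev, y_prev, x_new, a=1.0, sigma2=1.0)
```

With this toy data $A_{T-1} = 2I$, so the sketch yields $\gamma_T = 1.5$ and $\sigma_T^2 = 2$. At the first step, when no observations are available yet, passing an empty `X_prev` of shape `(0, n)` and an empty `y_prev` gives $\gamma_1 = 0$ and $\sigma_1^2 = \sigma^2 x_1'x_1/a + \sigma^2$.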