# On-line Linear Optimization

An on-line linear optimization problem is defined as the following repeated game between the learner (player) and the environment (adversary, or Reality). Let `⚠ $\mathcal{K} \in \mathbb{R}^n$`

be a compact closed convex set. The protocol of the game is

`⚠ $t=1,2,\dots,T$`

}
`⚠ $x_t \in \mathcal{K}$`

`⚠ $f_t \in \mathbb{R}^n$`

`⚠ $f'_t x_t$`

, and observes some feedback.
The goal of the Player is to minimize his regret `⚠ $R_T = \sum_{t=1}^T f'_t x_t - \min_{x^* \in \mathcal{K}} \sum_{t=1}^T f'_t x^*$`

.
In the *full information* setting of the game, the player may observe the entire function `⚠ $f_t$`

as his feedback, and can exploit this in making decisions. The other setting is a *Bandit* setting, when the player observes only a scalar value `⚠ $f'_t x_t$`

. In practice this means, that we do not know "what would be if we changed our action", but only current feedback. Such a setting is useful for a various range of applications.

To prove the theoretical bound for the regret, Follow the Leader algorithm is sometimes used, and other ones like Follow the regularized leader.