On-line Linear Optimization
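An on-line linear optimization problem is defined as the following repeated game between the learner (player) and the environment (adversary, or Reality). Let $K \subset \mathbb{R}^d$ be a compact convex set. The protocol of the game is as follows: in each round $t = 1, \dots, T$, the player chooses an action $x_t \in K$, the adversary chooses a loss vector $f_t \in \mathbb{R}^d$, and the player suffers the loss $f_t^\top x_t$.
The goal of the player is to minimize his regret, the gap between his cumulative loss and that of the best fixed action in hindsight: $\sum_{t=1}^{T} f_t^\top x_t - \min_{x \in K} \sum_{t=1}^{T} f_t^\top x$. In the full information setting of the game, the player observes the entire loss function $f_t$ as his feedback and can exploit this in making later decisions. The other setting is the bandit setting, in which the player observes only the scalar value $f_t^\top x_t$. In practice this means that we do not know what the loss would have been had we changed our action; we see only the feedback for the action actually taken. Such a setting is useful in a wide range of applications.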
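The protocol, the regret, and the two feedback settings described above can be made concrete with a small simulation. The following is a minimal sketch under stated assumptions, not a reference implementation: the decision set is taken to be the probability simplex (one possible compact convex set), the loss vector in round t is written `f` and the action `x`, and the names `play`, `uniform`, and `follow_the_leader` are hypothetical helpers introduced only for illustration. Follow-the-leader is used merely as an example of a strategy that needs full-information feedback; the text does not prescribe it.

```python
def play(loss_vectors, choose_action, bandit=False):
    """Run the repeated game on the probability simplex; return the regret."""
    d = len(loss_vectors[0])
    cumulative = [0.0] * d  # running sum of loss vectors, for the hindsight benchmark
    total_loss = 0.0
    history = []            # feedback revealed to the player so far
    for f in loss_vectors:
        x = choose_action(history, d)                 # player commits to an action
        loss = sum(fi * xi for fi, xi in zip(f, x))   # linear loss <f, x>
        total_loss += loss
        cumulative = [c + fi for c, fi in zip(cumulative, f)]
        # Full information reveals the whole vector f; bandit reveals only the scalar loss.
        history.append(loss if bandit else list(f))
    # A linear function on the simplex attains its minimum at a vertex,
    # so the best fixed action in hindsight costs min(cumulative).
    return total_loss - min(cumulative)


def uniform(history, d):
    """Ignores feedback, so it is usable in both settings."""
    return [1.0 / d] * d


def follow_the_leader(history, d):
    """Needs full information: plays the vertex with the smallest past total loss."""
    if not history:
        return [1.0 / d] * d
    totals = [sum(f[i] for f in history) for i in range(d)]
    best = totals.index(min(totals))
    return [1.0 if i == best else 0.0 for i in range(d)]


losses = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
print(play(losses, uniform))                # 1.5
print(play(losses, uniform, bandit=True))   # 1.5
print(play(losses, follow_the_leader))      # 0.5
```

In the bandit run, `history` holds only the scalar losses, so a strategy such as `follow_the_leader` cannot be applied as written: it has no way to tell what the other coordinate would have cost. This is exactly the missing counterfactual information that distinguishes the two settings.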