In this talk, I will propose a contextual-bandit approach to demand-side management based on price incentives. More precisely, a target mean consumption is set at each round, and the mean consumption is modeled as a complex function of the distribution of prices sent and of contextual variables such as temperature and weather. The performance of our strategies (inspired by contextual-bandit algorithms) is measured with quadratic losses through a regret criterion which will be [...]