Reinforcement learning

Lecturer : Matteo Pirotta (INRIA)

Objective of the course :

Introduction to the models and mathematical tools used to formalize the problem of learning and decision-making under uncertainty. In particular, the course focuses on the frameworks of reinforcement learning and multi-armed bandits.

Topics :

  • Historical multi-disciplinary basis of reinforcement learning

  • Markov decision processes and dynamic programming

  • Stochastic approximation and Monte-Carlo methods

  • Function approximation and statistical learning theory

  • Approximate dynamic programming

  • Introduction to stochastic and adversarial multi-armed bandits

  • Learning rates and finite-sample analysis

Prerequisites :

Basics of probability and statistics (level of a third-year undergraduate mathematics degree, L3, or of a grande école, GE)

Organization of courses :

  • 8 lectures of 2 hours each

  • 3 tutorial sessions (travaux dirigés) of 3 hours each

Validation :

A project involving the reading of papers of interest and the implementation or theoretical analysis of reinforcement learning algorithms. The project is evaluated on the basis of a short report and an oral presentation.

References :

  • O. Sigaud and O. Buffet (eds.), Processus décisionnels de Markov en intelligence artificielle, 2008.

  • Bertsekas and Tsitsiklis, Neuro-Dynamic Programming, 1996.