GAYA Jean-Baptiste

PhD student
Team : MLIA
Arrival date : 03/01/2021
    Sorbonne Université - LIP6
    Boîte courrier 169
    Couloir 26-00, Étage 5, Bureau 513
    4 place Jussieu
    75252 PARIS CEDEX 05

Tel: +33 1 44 27 88 07, Email: Jean-Baptiste.Gaya (at)

Supervision : Laure SOULIER

Co-supervision : Ludovic DENOYER (Facebook)

Reinforcement Learning for Human-Machine Collaboration

The objective of this Ph.D. project, a collaboration between Sorbonne Université and Facebook AI Research in Paris, is to propose new practical RL algorithms for the human-machine collaboration setting, where a user and an agent interact together with an environment to solve a user-specific problem, such as the taxi setting described previously. To propose new solutions, we tackle the different particularities of this setting, defining three research axes:

- Unsupervised/semi-supervised learning of a space of controllable policies: the objective is to exploit interactions between the agent and the environment (without any user feedback, or with very little feedback) to discover a space of policies well suited to user-machine collaboration. The underlying idea is to pre-build a set of interesting policies before the interaction with a user begins.

- User-in-the-loop reinforcement learning: the objective is to study the different kinds of feedback a user can provide, from simple (user preferences) to complex (natural language), inside the learning loop. This stage is meant to be applied to the space of policies built in the previous step.

- Defining concrete use cases and evaluation metrics: while RL algorithms are usually evaluated on classic benchmarks with purely reward-driven metrics, we propose to create benchmarks allowing the evaluation of the proposed methods when facing real users.
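The user-in-the-loop axis can be illustrated with a minimal sketch: fitting a reward model from pairwise user preferences with a Bradley-Terry model, a standard building block in preference-based RL. This is an illustrative toy under assumed synthetic data, not the project's actual method; the feature vectors, hidden utility, and labels below are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: each trajectory is summarized by a feature vector, and a
# simulated user prefers the trajectory with the higher hidden utility.
true_w = np.array([1.0, -2.0, 0.5])          # hidden user utility (assumption)
trajs = rng.normal(size=(200, 3))            # synthetic trajectory features

# Simulate noisy preference labels over random trajectory pairs
# (Bradley-Terry: P(i preferred over j) = sigmoid(r_i - r_j)).
pairs = rng.integers(0, len(trajs), size=(500, 2))
logits = trajs[pairs[:, 0]] @ true_w - trajs[pairs[:, 1]] @ true_w
prefs = (rng.random(500) < 1 / (1 + np.exp(-logits))).astype(float)

# Fit reward weights by gradient ascent on the preference log-likelihood.
w = np.zeros(3)
for _ in range(2000):
    d = trajs[pairs[:, 0]] - trajs[pairs[:, 1]]   # feature differences
    p = 1 / (1 + np.exp(-(d @ w)))                # predicted P(first preferred)
    w += 0.05 * d.T @ (prefs - p) / len(pairs)    # logistic-regression gradient

# The learned weights should align with the hidden utility direction.
cos = w @ true_w / (np.linalg.norm(w) * np.linalg.norm(true_w))
print(f"cosine similarity to hidden utility: {cos:.2f}")
```

In a real user-in-the-loop system, the learned reward model would then drive policy selection or fine-tuning within the pre-built policy space from the first axis.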