Supervised learning for distribution of centralised multiagent patrolling strategies.
Intervenant(s) : Mehdi Othmani-Guibourg
For nearly two decades, patrolling has received significant attention from the multiagent community. Multiagent patrolling (MAP) consists in modelling a patrol task to optimise as a multiagent system. The problem of optimising a patrol task is to distribute agents over the area to patrol in space and time the most efficiently, which constitutes a decision-making problem. A range of algorithms based on reactive, cognitive, reinforcement learning, centralised and decentralised strategies, among others, have been developed to make such a task ever more efficient. However, the existing patrolling-specific approaches based on supervised learning were still at preliminary stages, although a few works addressed this issue. Central to supervised learning, which is a set of methods and tools that allow inferring new knowledge, is the idea of learning a function mapping any input to an output from a sample of data composed of input-output pairs; learning, in this case, enables the system to generalise to new data never observed before. Until now, the best online MAP strategy, namely without precalculation, has turned out to be a centralised strategy with a coordinator. However, as for any centralised decision process in general, such a strategy is hardly scalable. The purpose of this work is then to develop and implement a new methodology aimed at turning any high-performance centralised strategy into a distributed strategy. Indeed, distributed strategies are by design resilient, more adaptive to changes in the environment, and scalable. In doing so, the centralised decision process, generally represented in MAP by a coordinator, is distributed into patrolling agents by means of supervised learning methods, so that agents of the resultant distributed strategy tend to capture each a part of the algorithm executed by the centralised decision process. The outcome is a new distributed decision-making algorithm based on machine learning. In this thesis therefore, such a procedure of distribution of centralised strategy is established, then concretely implemented using some artificial neural networks architectures.
cedric.herpson (at) nulllip6.fr