DULAC-ARNOLD Gabriel

PhD
Research group: MLIA
Departure date: 05/01/2014
https://lip6.fr/Gabriel.Dulac-Arnold

Advisor: Patrick GALLINARI

Co-advisor: Ludovic DENOYER

A General Sequential Model for Constrained Classification

This thesis introduces a body of work on sequential models for classification. These models allow for a much more flexible and general approach to classification tasks. Many tasks ultimately require the classification of some object, but cannot be handled with a single atomic classification step. This is the case for tasks where information is either not immediately available upfront, or where the act of accessing different aspects of the object being classified may present various costs (due to time, computational power, monetary cost, etc.). The goal of this thesis is to introduce a new method, which we call datum-wise classification, that is able to handle these more complex classification tasks by modelling them as sequential processes. We begin with a brief overview of supervised classification and reinforcement learning. We build on these two subjects to show how the supervised classification task can be modelled as a sequential decision process, which we call a ‘datum-wise classifier’. The datum-wise classifier can be trained using standard reinforcement learning algorithms. We introduce two example situations, based on text classification and image classification tasks, and demonstrate the advantages of classification as a sequential process. We continue by showing that with a slight modification of the reward function defining our datum-wise classification tasks, we can impose sparsity-like penalties on the model, thus allowing us to handle a much larger family of tasks. In effect, by imposing a negative penalty on certain information acquisition actions the classifier performs, we show that we are imposing an L0-like penalty on the usage of features by the model. We begin by demonstrating how our model performs relative to more standard linear sparse classifiers such as the L1-regularized linear SVM and the LASSO.
We then demonstrate how, by including more complex (and not necessarily convex) regularizations on the model, we can handle more complex tasks such as classification with costly features, hard limits on feature usage, and even spatially constrained features. Of course, the quasi-combinatorial aspect of these tasks means that training time can be problematic for certain tasks, especially if the number of features (and therefore actions) is high. We finish by presenting two new reinforcement learning algorithms that are able to drastically reduce the complexity of our original RL training algorithm, and show that they are able to find ‘good’ policies for problems that are not tractable with the current set of algorithms.
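The idea of classification as a sequential decision process with a per-feature acquisition penalty can be illustrated with a minimal sketch. This is a hypothetical toy implementation, not the thesis's actual model: the function names (`run_episode`, `sign_policy`) and the cost value are invented for illustration. At each step the agent either pays a fixed cost to reveal one more feature of the datum (the negative penalty on information acquisition that acts as an L0-like regularizer) or stops and emits a class label.

```python
# Hypothetical sketch of one datum-wise classification episode.
# Names and values here are illustrative assumptions, not from the thesis.

FEATURE_COST = 0.1  # penalty per acquired feature (assumed value)

def run_episode(x, y, policy):
    """Roll out one episode on a single datum (x, y).

    `policy(values)` receives a dict {feature_index: value} of the
    features revealed so far and returns either ('acquire', j) to
    reveal feature j, or ('classify', label) to terminate.
    """
    values = {}    # features the agent has chosen to look at
    reward = 0.0
    while True:
        action, arg = policy(values)
        if action == 'acquire':
            values[arg] = x[arg]      # reveal feature arg
            reward -= FEATURE_COST    # sparsity-inducing penalty
        else:
            # terminal classification: +1 if correct, -1 otherwise
            reward += 1.0 if arg == y else -1.0
            return reward, len(values)

# Toy policy: look only at feature 0, then predict its sign.
def sign_policy(values):
    if 0 not in values:
        return ('acquire', 0)
    return ('classify', int(values[0] > 0))

reward, n_used = run_episode([0.5, -1.0], 1, sign_policy)
# reward == 0.9 (correct label, minus one feature's cost); n_used == 1
```

Because each acquisition subtracts a constant from the return, the total penalty is proportional to the *count* of features used, which is what makes the regularization L0-like rather than magnitude-based as in the L1/LASSO baselines mentioned above.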

Defense: 02/07/2014

Committee members:

Stéphane Canu, INSA de Rouen [Rapporteur]
Balázs Kégl, Laboratoire de l'Accélérateur Linéaire, Paris Sud [Rapporteur]
Ludovic Denoyer, LIP6-UPMC
Stéphane Doncieux, ISIR-UPMC
Damien Ernst, Université de Liège
Patrick Gallinari, LIP6-UPMC
Philippe Preux, INRIA Nord / LIFL
Bruno Scherrer, INRIA Nancy


Publications 2011-2014
