CISSE Mouhamadou Moustapha
Team : MLIA
Departure date : 09/30/2014
Supervision : Thierry ARTIÈRES
Co-supervision : Patrick GALLINARI
Efficient Extreme Classification
Humans naturally and instantly recognize relevant objects in images despite the large number of potential visual concepts. They can also instantly tell which topics are relevant to a given text document, even though these topics are chosen from among thousands of semantic concepts. This ability to quickly categorize information is an important aspect of high-level intelligence, and endowing machines with it is an important step towards artificial intelligence.
In this thesis, we propose new methods for classification problems with a large number of labels, also called extreme classification. The proposed approaches aim to reduce inference complexity compared with classical methods (such as one-versus-rest), in order to make learning machines usable in real-life scenarios. We propose two types of methods, designed for single-label and multilabel classification respectively.
The first proposed method uses existing hierarchical information among the categories to learn low-dimensional binary representations of the categories. The second type of approach, dedicated to multilabel problems, adapts the framework of Bloom filters to represent subsets of labels with sparse, low-dimensional binary vectors. For both methods, binary classifiers are learned to predict the new low-dimensional representation of the categories, and several algorithms are proposed to recover the set of relevant labels. Large-scale experiments validate the methods.
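To make the Bloom filter idea concrete, here is a minimal sketch of how a subset of labels could be encoded as a sparse binary vector and then recovered. It shows a standard Bloom filter only, not the specific encoding or decoding algorithms developed in the thesis. The function names, the salted SHA-256 hashing, and the parameters `num_bits` and `num_hashes` are assumptions chosen for illustration. In the setting described above, the bits passed to `decode` would come from the learned binary classifiers rather than from `encode`.

```python
import hashlib


def bit_positions(label: str, num_bits: int, num_hashes: int) -> list[int]:
    """Map a label to num_hashes bit positions (salted SHA-256 is an assumed hash choice)."""
    positions = []
    for seed in range(num_hashes):
        digest = hashlib.sha256(f"{seed}:{label}".encode("utf-8")).digest()
        positions.append(int.from_bytes(digest[:8], "big") % num_bits)
    return positions


def encode(labels, num_bits: int, num_hashes: int) -> list[int]:
    """Represent a subset of labels as a binary vector of length num_bits."""
    bits = [0] * num_bits
    for label in labels:
        for pos in bit_positions(label, num_bits, num_hashes):
            bits[pos] = 1
    return bits


def decode(bits, vocabulary, num_hashes: int) -> set:
    """Return every label in the vocabulary whose bit positions are all set.

    A standard Bloom filter has no false negatives when the bits are exact,
    but it may return extra labels (false positives).
    """
    num_bits = len(bits)
    return {
        label
        for label in vocabulary
        if all(bits[pos] for pos in bit_positions(label, num_bits, num_hashes))
    }


# Hypothetical label vocabulary and one relevant subset.
vocabulary = [f"topic_{i}" for i in range(1000)]
relevant = {"topic_3", "topic_42", "topic_977"}

code = encode(relevant, num_bits=64, num_hashes=3)
recovered = decode(code, vocabulary, num_hashes=3)
```

With these assumed settings, 1000 labels are represented by 64 bits, so 64 binary predictions would be needed instead of 1000 one-versus-rest predictions. The trade-off is that `recovered` always contains the relevant labels but may also contain a few false positives.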
Defense : 07/25/2014 - 10h - Site Jussieu 25-26/105
Jury members
Eric Gaussier, LIG (Grenoble-France) [Rapporteur]
Georges Paliouras, Demokritos (Athens-Greece) [Rapporteur]
Christophe Marsala, LIP6 (Paris-France)
Nicolas Usunier, UTC/CNRS (Compiègne-France)
Thierry Artières, LIP6 (Paris-France)
Patrick Gallinari, LIP6 (Paris-France)