DURAND Thibaut

PhD graduate
Team : MLIA
Departure date : 12/31/2017

Supervision : CORD Matthieu

Co-supervision : THOME Nicolas

Weakly Supervised Learning for Visual Recognition

This thesis studies image classification, where the goal is to predict from the visual content of an image whether a semantic category (e.g., car) is present. With the massive use of smartphones and social networks, images are ubiquitous in our daily lives. Processing and exploiting this mass of data requires recognition systems that analyze and interpret the visual content of images. In this manuscript, we propose to learn localized representations with weakly supervised learning methods. In the image classification setting, this can be seen as a problem of pooling over regions. Building on the Multiple Instance Learning (MIL) formalism, we propose SyMIL, a symmetric model for the binary classification of bags. SyMIL uses a pooling function that seeks discriminative instances for each category. We then generalize SyMIL to structured prediction problems by introducing MANTRA, which seeks discriminative regions for the class as well as regions indicating the absence of the class (negative evidence). Next, we integrate the negative evidence model into a deep architecture and extend the pooling function to several regions for greater robustness. In the last part, we propose a new architecture that learns several modalities for each class to improve prediction. We also propose a unified model for pooling and an experimental comparison on 6 datasets.

Defence : 09/20/2017 - 11h - Site Jussieu 25-26/105

Jury members :

PEREZ Patrick (Technicolor) [Rapporteur]
BACH Francis (INRIA - Ecole Normale Superieure)
CORD Matthieu (UPMC - LIP6)
SERFATY Véronique (DGA)

2013-2019 Publications