AVILA Sandra

Doctor
Team : MLIA
Departure date : 30/09/2013
https://lip6.fr/Sandra.Avila

Thesis supervisor : Matthieu CORD

Co-supervision : ARAÚJO Arnaldo, THOME Nicolas

Extended Bag-of-Words Formalism for Image Classification

Visual information, in the form of digital images and videos, has become so omnipresent in computer databases and repositories that it can no longer be considered a "second-class citizen", eclipsed by textual information. In that scenario, image classification has become a critical task. In particular, the pursuit of automatic identification of complex semantic concepts represented in images, such as scenes or objects, has motivated researchers in areas as diverse as Information Retrieval, Computer Vision, Image Processing and Artificial Intelligence. Nevertheless, in contrast to text documents, whose words carry semantic meaning, images consist of pixels that have no semantic information by themselves, which makes the task very challenging.
In this dissertation, we have addressed the problem of representing images based on their visual information. Our aim is content-based concept detection in images and videos, with a novel representation that enriches the Bag-of-Words model. Relying on the quantization of highly discriminant local descriptors by a codebook, and the aggregation of those quantized descriptors into a single pooled feature vector, the Bag-of-Words model has emerged as the most promising approach for image classification. We propose BossaNova, a novel image representation which offers a more information-preserving pooling operation based on a distance-to-codeword distribution.
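As a rough illustration of the two pooling ideas mentioned above, the following minimal sketch contrasts a classical Bag-of-Words count with a per-codeword histogram of descriptor-to-codeword distances. It is not the BossaNova implementation from the dissertation: the function names, the fixed distance range `max_dist`, the number of bins `n_bins`, and the toy 2-D descriptors are all assumptions made for this example, and normalization steps are omitted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bow_histogram(descriptors, codebook):
    """Classical Bag-of-Words: count descriptors per nearest codeword."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda m: euclidean(d, codebook[m]))
        counts[nearest] += 1
    return counts


def distance_distribution_pooling(descriptors, codebook, n_bins=2, max_dist=2.0):
    """For each codeword, histogram the descriptor-to-codeword distances.

    Hypothetical simplification: distances are binned uniformly over
    [0, max_dist) and descriptors farther away are ignored.
    """
    pooled = []
    for c in codebook:
        hist = [0] * n_bins
        for d in descriptors:
            dist = euclidean(d, c)
            if dist < max_dist:
                # Clamp to guard against floating-point rounding at the edge.
                b = min(int(dist / max_dist * n_bins), n_bins - 1)
                hist[b] += 1
        pooled.extend(hist)
    return pooled


# Toy data (assumed): two codewords and three 2-D local descriptors.
codebook = [(0.0, 0.0), (4.0, 0.0)]
descriptors = [(0.0, 0.5), (0.0, 1.5), (4.0, 0.5)]

print(bow_histogram(descriptors, codebook))                  # [2, 1]
print(distance_distribution_pooling(descriptors, codebook))  # [1, 1, 1, 0]
```

In this toy case the Bag-of-Words vector only records that two descriptors fall near the first codeword, whereas the distance histogram also records that one of them is close and the other farther away, which is the kind of information a distance-to-codeword distribution can preserve.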
The experimental evaluations on many challenging image classification benchmarks, such as ImageCLEF Photo Annotation, MIRFLICKR, PASCAL VOC and 15-Scenes, have shown the advantage of BossaNova when compared to traditional techniques, even without using complex combinations of different local descriptors.
An extension of our approach has also been studied. It concerns the combination of the BossaNova representation with another highly competitive representation based on Fisher Vectors. The results consistently match other state-of-the-art representations on many datasets. They also experimentally demonstrate the complementarity of the two approaches. This study allowed us to achieve 2nd place among the 28 visual submissions in the ImageCLEF 2012 Flickr Photo Annotation Task.
Finally, we have explored our BossaNova representation in the challenging real-world application of pornography detection. Once again, the results validated the relevance of our approach compared to standard techniques on a real application.

Defence : 14/06/2013

Jury members :

PERRONNIN Florent (Xerox Research Centre Europe) [Reviewer]
CAMPOS Mario (Federal University of Minas Gerais, Brazil) [Reviewer]
SCHMID Cordelia (INRIA Grenoble)
PÉREZ Patrick (Technicolor Research & Innovation)
GALLINARI Patrick (Université Pierre et Marie Curie)
THOME Nicolas (Université Pierre et Marie Curie)
CORD Matthieu (Université Pierre et Marie Curie)
ARAÚJO Arnaldo (Federal University of Minas Gerais, Brazil)

In addition, two guests from the UFMG side, Brazil :
VALLE Eduardo, State University of Campinas [Examiner]
SCHWARTZ William, Federal University of Minas Gerais, Brazil [Examiner]

Publications 2011-2016

2016
- M. Carvalho, M. Cord, S. Avila, N. Thome, E. Valle : “Deep Neural Networks Under Stress”, IEEE International Conference on Image Processing (ICIP 2016), Phoenix, AZ, United States (2016)
2013
- S. Avila : “Extension du Modèle par Sac de Mots Visuels pour la Classification d’Images”, thesis, defence 14/06/2013, supervision Cord, Matthieu, co-supervision : Araújo, Arnaldo, Thome, Nicolas (2013)
- Th. Durand, N. Thome, M. Cord, S. Avila : “Image classification using object detectors”, ICIP 2013 : IEEE International Conference on Image Processing, Melbourne, Australia, pp. 4340-4344 (2013)
- S. Avila, N. Thome, M. Cord, E. Valle, A. De Albuquerque Araújo : “Extended Bag-of-Words Formalism for Image Classification”, Brazilian Symposium on Computer Graphics and Image Processing, Arequipa, Peru (2013)
- S. Avila, N. Thome, M. Cord, E. Valle, A. De Albuquerque Araújo : “Pooling in Image Representation: the Visual Codeword Point of View”, Computer Vision and Image Understanding, vol. 117 (5), pp. 453-465, (Elsevier) (2013)
2011
- S. Avila, N. Thome, M. Cord, E. Valle, A. De Albuquerque Araújo : “BOSSA: extended BoW formalism for image classification”, IEEE International Conference on Image Processing (ICIP), Brussels, Belgium, pp. 2909-2912, (IEEE) (2011)