Team: LFI
Departure date: 09/30/2015
Supervision: Marcin DETYNIECKI
Co-supervision: TOLLARI Sabrina
Diversity by Clustering in Image Retrieval: Experimental Study
Conventional search engines return relevant results, but the retrieved items are often similar to one another, and similar results tend to appear together in the ranking. The user may instead want documents that are both relevant and diverse.
In this thesis, we consider the problem of diversity in image retrieval. We focus on diversity by clustering, in particular on an approach based on agglomerative hierarchical clustering (AHC) that addresses the hierarchical nature of diversity. Furthermore, we propose a novel approach that exploits richer description resources, such as a "tree of concepts", to increase diversity.
The approaches are compared on a highly relevant, manually annotated benchmark, the XiloDiv benchmark, and on two more general benchmarks, ImageClef2008 and MediaEval2013. The experimental results show that a hierarchical exploitation of the AHC results increases diversity compared with two flat clustering methods and with a method of diversity by optimization. The results also show that, from a diversity point of view, concept features work better than visual features. In addition, on the MediaEval2013 benchmark, we show that an interesting strategy for improving diversity is to first increase relevance using the text, and then to apply visual-based clustering to diversify the results.
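As an illustration of the general idea of diversity by clustering, the following minimal Python sketch groups a relevance-ranked list with average-linkage AHC and then re-ranks it by taking the best remaining item from each cluster in turn. It is not the method evaluated in the thesis: the function names (`ahc_clusters`, `diversify`), the average-linkage criterion, the fixed number of clusters `k`, the round-robin selection and the toy feature vectors are all assumptions made for the example.

```python
from math import dist

def ahc_clusters(points, k):
    """Average-linkage agglomerative clustering, stopped when k clusters remain.

    Returns a list of clusters, each a list of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Find and merge the two closest clusters.
        _, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def diversify(ranked_ids, features, k):
    """Re-rank `ranked_ids` (sorted by decreasing relevance) for diversity.

    `features` maps each id to a numeric vector (hypothetical descriptors).
    """
    points = [features[i] for i in ranked_ids]
    clusters = ahc_clusters(points, k)
    # Inside each cluster keep the relevance order; visit clusters in the
    # order of their most relevant member.
    queues = sorted(sorted(c) for c in clusters)
    result = []
    while any(queues):
        for q in queues:
            if q:
                result.append(ranked_ids[q.pop(0)])
    return result

# Toy data: three near-duplicates (a1, a2, a3) ranked first, then two distinct images.
ranked = ["a1", "a2", "a3", "b1", "c1"]
features = {
    "a1": (0.0, 0.0), "a2": (0.1, 0.0), "a3": (0.0, 0.1),
    "b1": (5.0, 5.0), "c1": (10.0, 0.0),
}
reranked = diversify(ranked, features, k=3)
print(reranked)  # ['a1', 'b1', 'c1', 'a2', 'a3']
```

In this toy run the near-duplicates no longer occupy the first three positions: one representative of each cluster appears before the second item of any cluster.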
Finally, we developed a complete prototype that takes into account, in particular, the strict response-time constraints, which makes it suitable for use in the company's search engine.
Defence: 08/31/2015 - 14:00 - Site Jussieu 25-26/105
Jury members:
MULHEM Philippe (LIG) [Reviewer]
IONESCU Bogdan (LAPI) [Reviewer]
CORD Matthieu (LIP6)
POPESCU Adrian (CEA LIST)
DETYNIECKI Marcin (LIP6)
TOLLARI Sabrina (LIP6)