Team : DELYS
Arrival date : 09/01/2016
Location : Campus Pierre et Marie Curie
Sorbonne Université - LIP6
Boîte courrier 169
Couloir 26-00, Étage 2, Bureau 209
4 place Jussieu
75252 PARIS CEDEX 05
Tel: +33 1 44 27 45 15, Dimitrios.Vasilas (at) null
Supervision : Marc SHAPIRO
Scalable indexing for large-scale distributed storage systems
The initial research problem of this PhD is the design and implementation of a highly scalable, specialized indexing and search system focused on metadata queries. The system will be implemented as an extension to Scality’s storage system. Scality’s distributed storage system, which stores petabytes of data and is frequently updated, poses significant challenges for an indexing and search subsystem:
● The primary design challenge for the indexing subsystem is to enable fast queries on billions of objects. The index should also remain small relative to the data size, despite indexing petabytes of data. Furthermore, the index design should support queries on multiple data types, since object metadata contains text (user-defined attributes), integers (file size), and more complex data types (access control lists).
● Track updates incrementally as they occur. Data updates should complete with low latency. Updating the index must not become a bottleneck for the storage system itself.
● Geo-distributed index. The indexing subsystem needs to receive concurrent updates and queries from a large number of clients, located in different geographic locations, and remain available in the presence of network partitions.
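The first two requirements above can be illustrated with a minimal, single-node sketch. It is not Scality's design or the system proposed in this PhD: the class `MetadataIndex`, its method names, and the attribute names (`owner`, `size`, `acl`) are all hypothetical, chosen only to show how one index can serve text, integer, and list-valued metadata while each update touches only the entries of the object that changed. Geo-distribution, persistence, and index compaction are deliberately left out.

```python
from bisect import bisect_left, insort
from collections import defaultdict

# Illustrative assumption: list-like metadata values (e.g. an ACL) are
# indexed member by member; numbers go into a sorted list for range queries.
COLLECTION_TYPES = (list, tuple, set, frozenset)


class MetadataIndex:
    """Toy in-memory secondary index over object metadata (hypothetical API)."""

    def __init__(self):
        self._objects = {}                # object key -> current metadata
        self._exact = defaultdict(set)    # (attribute, value) -> object keys
        self._sorted = defaultdict(list)  # attribute -> sorted (value, key) pairs

    def put(self, key, metadata):
        """Index one object incrementally: only this object's entries change."""
        self.delete(key)  # drop entries left by a previous version, if any
        self._objects[key] = dict(metadata)
        for attr, value in metadata.items():
            if isinstance(value, COLLECTION_TYPES):
                for member in value:
                    self._exact[(attr, member)].add(key)
            elif isinstance(value, (int, float)):
                insort(self._sorted[attr], (value, key))
            else:
                self._exact[(attr, value)].add(key)

    def delete(self, key):
        """Remove every index entry that belongs to one object."""
        old = self._objects.pop(key, None)
        if old is None:
            return
        for attr, value in old.items():
            if isinstance(value, COLLECTION_TYPES):
                for member in value:
                    self._exact[(attr, member)].discard(key)
            elif isinstance(value, (int, float)):
                entries = self._sorted[attr]
                i = bisect_left(entries, (value, key))
                if i < len(entries) and entries[i] == (value, key):
                    del entries[i]
            else:
                self._exact[(attr, value)].discard(key)

    def find_equal(self, attr, value):
        """Exact match on a text attribute or on one member of a list attribute."""
        return set(self._exact.get((attr, value), ()))

    def find_range(self, attr, low, high):
        """Keys whose numeric attribute lies in the closed interval [low, high]."""
        entries = self._sorted.get(attr, [])
        result = set()
        # (low,) sorts before every (low, key), so this finds the first value >= low.
        for value, key in entries[bisect_left(entries, (low,)):]:
            if value > high:
                break
            result.add(key)
        return result


index = MetadataIndex()
index.put("photo.jpg", {"owner": "alice", "size": 2048, "acl": ["alice", "bob"]})
index.put("log.txt", {"owner": "bob", "size": 120, "acl": ["bob"]})
small_files = index.find_range("size", 0, 1024)
readable_by_bob = index.find_equal("acl", "bob")
```

In this sketch a write costs time proportional to the number of attributes of the updated object (plus a list insertion for numeric attributes), which is the property the second requirement asks for. A real system would additionally need a compact on-disk representation and a replication strategy that stays available under network partitions.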