- Computer Science Laboratory

LE GUILLOU Ève

PhD student at Sorbonne University
Team : APR
    Sorbonne Université - LIP6
    Boîte courrier 169
    Couloir 25-26, Étage 3, Bureau 302
    4 place Jussieu
    75252 PARIS CEDEX 05
    FRANCE

+33 1 44 27 88 79
Eve.Le-Guillou (at) lip6.fr
https://lip6.fr/Eve.Le-Guillou

Supervisors : Julien TIERNY, Pierre FORTIN

Topological Data Analysis, High performance computing, Data science, Visualization

Topological Data Analysis (TDA) tackles the complexity of large-scale data by capturing its structural characteristics in a concise encoding for analysis and visualization. As datasets grow, a single dataset increasingly exceeds the memory capacity of one machine, making distributed-memory systems, with their much larger capacities, a necessary solution. However, adapting an algorithm to distributed-memory systems requires substantial changes to ensure correctness and performance. TDA algorithms are particularly challenging in this context, as they rely on global data accesses and multiple traversals with minimal computation, a combination that often scales poorly in distributed memory. Furthermore, existing distributed-memory implementations are each tailored to a single topological representation, which induces practical drawbacks. The Topology ToolKit (TTK) aims at providing a unified framework for TDA algorithms with a reusable and efficient data structure. Until now, however, TTK was limited to shared-memory parallelism. In this thesis, we add distributed support to TTK using the Message Passing Interface (MPI). First, we adapt TTK’s core data structure and add distributed-memory support to several existing algorithms, both to demonstrate the new features and to highlight their performance. Performance tests showcase the efficiency of each algorithm as well as of the overall software infrastructure. Additionally, we apply a real-life topological analysis pipeline to two massive datasets to demonstrate our software’s effectiveness at scale. Then, we focus our effort on a much more complex abstraction: the persistence diagram. Its robustness and reliability make it one of the most used topological representations. The Discrete Morse Sandwich (DMS) is currently the most efficient algorithm for computing the diagram on a single node.
Our new method, the Distributed Discrete Morse Sandwich (DDMS), builds upon DMS and introduces tailored, step-specific modifications, resulting in a hybrid MPI+thread implementation. Performance tests demonstrate the gain of our approach over the original DMS method as well as over Dipha, the reference method for persistence diagram computation in a distributed-memory context. Our method successfully computes persistence diagrams on datasets containing up to 6 billion vertices.


Thesis defence : 10.10.2025 - 15h - Bâtiment ESPRIT, avenue Paul Langevin, 59650 Villeneuve-d'Ascq (Room : Atrium)

Jury members :

Tom PETERKA, Argonne National Laboratory [Rapporteur]
David COEURJOLLY, CNRS [Rapporteur]
Julien TIERNY, CNRS
Isabelle BLOCH, Sorbonne Université
Bruno RAFFIN, Inria
Federico IURICICH, Clemson University
Christophe CALVIN, CEA
Pierre FORTIN, Université de Lille

Publications 2024

  • 2024
    • E. Le Guillou, M. Will, P. Guillou, J. Lukasczyk, P. Fortin, Ch. Garth, J. Tierny : “TTK is Getting MPI-Ready”, IEEE Transactions on Visualization and Computer Graphics, pp. 1-18, (Institute of Electrical and Electronics Engineers) (2024)