VACHERET Romain

PhD Student at Sorbonne University
Team: MoVe
    Sorbonne Université - LIP6
    Boîte courrier 169
    Couloir 25-26, Étage 2, Bureau 203
    4 place Jussieu
    75252 PARIS CEDEX 05
    FRANCE

+33 1 44 27 87 71
Romain.Vacheret (at) lip6.fr
https://lip6.fr/Romain.Vacheret

Supervision: Tewfik ZIADI

Automatic Fault Localization in the context of Automatic Program Repair

Software has become ubiquitous in nearly every aspect of modern life, from critical infrastructure to daily applications.

Despite significant advances in development methodologies and testing practices, software systems continue to suffer from bugs. The debugging process required to locate and fix these faults remains expensive, requiring significant time and financial resources.

To address this challenge, the research field of Fault Localization (FL) has emerged, aiming to automatically identify faulty elements in a program.

Depending on the type of data leveraged for localization, several families of approaches have been developed.

Among the most prominent are Spectrum-Based Fault Localization (SBFL), which relies on test executions, and Information Retrieval Fault Localization (IRFL), which leverages textual artifacts such as bug reports.

Granularity is a key dimension of Fault Localization.

Existing techniques often operate at coarse-grained levels such as files or methods, leaving statements relatively underexplored. This gap is particularly critical for IRFL, as the amount of textual information contained in a single statement is limited, thereby reducing the effectiveness of Information Retrieval (IR) techniques.

In this thesis, we address this limitation by investigating both SBFL and IRFL at the statement level.

Our first contribution introduces a hybrid approach that combines SBFL and IRFL in order to overcome their respective limitations. Specifically, we integrate Ochiai from SBFL and Latent Dirichlet Allocation (LDA) from IRFL.

This combination allows the localization process to benefit from the complementary strengths of both techniques, thus improving Fault Localization precision at the statement level.
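As an illustration of the idea above, here is a minimal Python sketch. The Ochiai formula is the standard one from the SBFL literature. The function names, the weight `alpha`, and the linear combination are assumptions made for this example and are not taken from the thesis. The IRFL score is assumed to be a similarity value already normalized to [0, 1], for instance one derived from LDA topic distributions.

```python
import math

def ochiai(ef, ep, nf):
    """Standard Ochiai suspiciousness of one statement.

    ef: failing tests that execute the statement
    ep: passing tests that execute the statement
    nf: failing tests that do not execute the statement
    """
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom > 0 else 0.0

def hybrid_score(sbfl, irfl, alpha=0.5):
    # Assumption: a simple weighted sum; the thesis may combine scores differently.
    return alpha * sbfl + (1 - alpha) * irfl

def rank_statements(scores):
    """Return statement ids sorted from most to least suspicious."""
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical coverage counts (ef, ep, nf) and IRFL similarities per statement.
coverage = {"s1": (2, 0, 0), "s2": (1, 1, 1), "s3": (0, 3, 2)}
similarity = {"s1": 0.4, "s2": 0.9, "s3": 0.1}
combined = {
    s: hybrid_score(ochiai(*coverage[s]), similarity[s]) for s in coverage
}
ranking = rank_statements(combined)
```

With equal weights, a statement that is only moderately suspicious according to the tests can still rise in the ranking when its text closely matches the bug report, which is the complementarity the hybrid approach relies on.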

Building upon this foundation, our second contribution tackles the challenge of multi-Fault Localization.

We propose grouping statements into code fragments and applying an Evolutionary Algorithm (EA) to explore the search space of possible fragments. Each fragment is evaluated using a fitness function that combines both Ochiai and LDA scores.

The best-performing fragment is then transformed into a ranking consistent with standard Fault Localization outputs.
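The search described above can be sketched as follows. This is a toy example under stated assumptions, not the algorithm used in the thesis: fragments are modelled as sets of statement indices, the fitness sums the combined score of each statement minus a hypothetical `threshold`, the only variation operator is a single-statement toggle, and the final ranking simply lists the statements of the best fragment first.

```python
import random

def combine(ochiai, lda, alpha=0.5):
    """Per-statement combined score (assumed weighted sum)."""
    return [alpha * o + (1 - alpha) * l for o, l in zip(ochiai, lda)]

def fitness(fragment, scores, threshold=0.5):
    # Assumption: reward statements scoring above the threshold, penalize the others.
    return sum(scores[s] - threshold for s in fragment)

def mutate(fragment, n_statements, rng):
    """Toggle one random statement in or out of the fragment."""
    child = set(fragment) ^ {rng.randrange(n_statements)}
    return frozenset(child) if child else fragment

def evolve(scores, pop_size=20, generations=50, seed=0):
    rng = random.Random(seed)
    n = len(scores)
    # Start from every single-statement fragment.
    population = [frozenset({s}) for s in range(n)]
    for _ in range(generations):
        children = [mutate(f, n, rng) for f in population]
        pool = set(population) | set(children)
        # Elitist selection: keep the fittest fragments.
        population = sorted(
            pool, key=lambda f: fitness(f, scores), reverse=True
        )[:pop_size]
    return population[0]

def to_ranking(fragment, scores):
    """Statements of the fragment first, then the rest, each by descending score."""
    by_score = lambda s: scores[s]
    inside = sorted(fragment, key=by_score, reverse=True)
    outside = sorted(
        (s for s in range(len(scores)) if s not in fragment),
        key=by_score, reverse=True,
    )
    return inside + outside

# Hypothetical Ochiai and LDA scores for four statements.
scores = combine([0.1, 0.9, 0.2, 0.8], [0.2, 0.8, 0.0, 0.6])
best = evolve(scores)
ranking = to_ranking(best, scores)
```

Because a fragment can contain several statements, a single candidate can cover more than one faulty location, which is what makes this representation suitable for the multi-fault setting.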

Our third contribution addresses the lack of contextual information at the statement level, a particularly critical issue for IRFL approaches.

Although statements are often treated as independent entities during localization, they naturally belong to larger semantic blocks defined by their structural and dataflow context. To exploit this context, we introduce a document expansion strategy composed of two complementary components.

The first leverages structural context by expanding each statement with the preprocessed names of its enclosing class and method, as well as an optional associated comment. The second incorporates terms from related lines identified by a graph of variable relations.
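The two expansion components described above can be sketched in Python as follows. All function names are hypothetical, and the preprocessing (splitting identifiers into lowercase terms) and the graph construction (linking lines that share at least one variable) are simplifying assumptions. The thesis may define variable relations differently, for example through def-use chains.

```python
import re

def split_identifier(name):
    """Split camelCase or snake_case identifiers into lowercase terms."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]

def tokenize(text):
    """Extract identifier-like words from text and split them into terms."""
    terms = []
    for ident in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text):
        terms.extend(split_identifier(ident))
    return terms

def variable_graph(variables_per_line):
    """Link each line to the lines sharing at least one variable (assumed relation)."""
    graph = {i: set() for i in range(len(variables_per_line))}
    for i, vi in enumerate(variables_per_line):
        for j, vj in enumerate(variables_per_line):
            if i != j and vi & vj:
                graph[i].add(j)
    return graph

def expand_statement(stmt, class_name, method_name, comment="", related=()):
    """Build the expanded document (list of terms) for one statement."""
    terms = tokenize(stmt)
    # Structural context: enclosing class, enclosing method, optional comment.
    terms += split_identifier(class_name) + split_identifier(method_name)
    terms += tokenize(comment)
    # Variable context: terms from related lines.
    for other in related:
        terms += tokenize(other)
    return terms

# Hypothetical three-line method body.
lines = ["int total = 0;", "total = total + price;", "count++;"]
graph = variable_graph([{"total"}, {"total", "price"}, {"count"}])
doc = expand_statement(
    lines[1], "ShoppingCart", "computeTotal",
    comment="add item price",
    related=[lines[j] for j in sorted(graph[1])],
)
```

The expanded document contains terms such as "shopping", "cart" and "compute" that the original statement lacks, giving an IR model more vocabulary to match against a bug report.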

Experimental results show that this expansion improves localization quality compared to IRFL approaches that do not apply an expansion process.

Together, these three contributions address fundamental challenges of statement-level Fault Localization: the limitations of individual data sources, the complexity of multiple faults, and the lack of contextual information in short statements.

This work therefore provides new insights for enhancing statement-level localization and suggests additional directions for future research in automated Fault Localization.


PhD defence: 12/08/2025 - 10:00 - Campus Pierre et Marie Curie, room Jacques Pitrat (25-26/105)

Jury members:

Nadjib Lazaar, Full Professor [Reviewer]
Nawal Guermouche, Associate Professor (HDR) [Reviewer]
Chouki Tibermacine, Full Professor
Jean-François Pradat-Peyre, Full Professor
Tewfik Ziadi, Associate Professor (HDR)

2024 Publications