MOYSE Gilles

PhD graduate
Team : LFI
Departure date : 09/30/2016

Supervision : Marie-Jeanne LESOT

Linguistic summaries of numerical data : interpretability and series periodicity

Our research is in the field of fuzzy linguistic summaries (FLS), which generate natural language sentences to describe very large amounts of numerical data, providing concise and intelligible views of these data.
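As an illustration of what such a summary evaluates, the sketch below computes the truth degree of a sentence of the form "Most values are high" in the classical Zadeh style (a fuzzy quantifier applied to the average membership degree). It is only a minimal example and not the model developed in the thesis: the membership functions `high` and `most`, their thresholds, and the sample data are hypothetical.

```python
def high(x, low=50.0, full=80.0):
    """Hypothetical fuzzy set 'high': 0 below `low`, 1 above `full`, linear in between."""
    if x <= low:
        return 0.0
    if x >= full:
        return 1.0
    return (x - low) / (full - low)


def most(p, start=0.3, full=0.8):
    """Hypothetical fuzzy quantifier 'most' defined on a proportion p in [0, 1]."""
    if p <= start:
        return 0.0
    if p >= full:
        return 1.0
    return (p - start) / (full - start)


def truth_degree(data, summarizer, quantifier):
    """Truth degree of 'Q of the data are S': quantifier of the mean membership."""
    if not data:
        raise ValueError("data must not be empty")
    proportion = sum(summarizer(x) for x in data) / len(data)
    return quantifier(proportion)


# Illustrative data only.
values = [90, 85, 60, 95, 40, 88]
t = truth_degree(values, high, most)
print(f"'Most values are high' holds to degree {t:.2f}")
```

A sentence whose truth degree is close to 1 is a good candidate for inclusion in the summary; how candidate sentences are selected and kept mutually consistent is the subject of the work described next.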
We first focus on the interpretability of FLS, which is crucial to provide end-users with an easily understandable text, but hard to achieve because of their linguistic form. Going beyond existing work on this topic, which is based on the basic components of FLS, we propose a general approach for the interpretability of summaries, considering them globally as groups of sentences. We focus more specifically on their consistency. In order to guarantee it in the framework of standard fuzzy logic, we introduce a new model of oppositions between increasingly complex sentences. The model allows us to show that these consistency properties can be satisfied by selecting a specific negation approach. Moreover, based on this model, we design a 4-dimensional cube displaying all the possible oppositions between sentences in an FLS and show that it generalises several existing logical opposition structures.
We then consider the case of data in the form of numerical series and focus on linguistic summaries about their periodicity: the sentences we propose indicate the extent to which the series are periodic and offer an appropriate linguistic expression of their periods. The proposed extraction method, called DPE, standing for Detection of Periodic Events, splits the data in an adaptive manner and without any prior information, using tools from mathematical morphology. The segments are then exploited to compute the period and the periodicity, measuring the quality of the estimation and the extent to which the series is periodic. Lastly, DPE returns descriptive sentences of the form "Approximately every 2 hours, the customer arrival is important". Experiments with artificial and real data show the relevance of the proposed DPE method.
From an algorithmic point of view, we propose an incremental and efficient implementation of DPE, based on established update formulas. This implementation makes DPE scalable and allows it to process real-time streams of data.
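The abstract does not state which update formulas are used. As one well-known example of the kind of formula that makes such statistics incremental, the sketch below applies Welford's online algorithm to maintain the mean and variance of a stream of values in constant time and memory per new value. Feeding it segment durations and reading the mean as a period estimate is an assumption made for illustration only; the class name `RunningStats` and the sample durations are hypothetical.

```python
class RunningStats:
    """Online mean and population variance (Welford's algorithm)."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        """Incorporate one new value in O(1) time, without storing past values."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of the values seen so far (0.0 if none)."""
        return self._m2 / self.n if self.n > 0 else 0.0


# Hypothetical segment durations (in hours) arriving one at a time.
stats = RunningStats()
for duration in [2.0, 2.1, 1.9, 2.0]:
    stats.update(duration)
    print(f"n={stats.n} mean={stats.mean:.3f} variance={stats.variance:.4f}")
```

Because each update touches only three numbers, the statistics can be refreshed as every new value arrives, which is the property a stream-processing implementation relies on.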
We also present an extension of DPE based on the concept of local periodicity, which identifies locally periodic subsequences in a numerical series using an original statistical test. The method, validated on artificial and real data, returns natural language sentences of the form "Every two weeks during the first semester of the year, sales are high".

Defence : 07/19/2016

Jury members :

Janusz Kacprzyk, Polish Academy of Sciences [Rapporteur]
Trevor Martin, University of Bristol [Rapporteur]
Bernadette Bouchon-Meunier, Université Pierre et Marie Curie
Jean-Gabriel Ganascia, Université Pierre et Marie Curie
Anne Laurent, Université Montpellier 2
Adrien Revault d'Allonnes, Université Paris 8
Marie-Jeanne Lesot, Université Pierre et Marie Curie

2012-2016 Publications