LIP6 CNRS Sorbonne Université Tremplin Carnot Interfaces

RENARD Xavier

PhD graduate
Research team: LFI
Departure date: 30-09-2017
Thesis supervisor: Marcin DETYNIECKI
Co-supervisor: RIFQI Maria

Dynamic knowledge extraction from complex temporal data

This thesis addresses the learning of a motif-based representation from time series for automatic classification. Meaningful information in time series can be encoded across time through trends, shapes or subsequences, usually with distortions. Approaches have been developed to cope with these distortions, often at the price of high computational complexity. Two families of techniques stand out: distance measures and time series representations.
We focus on the representation of the information contained in the time series. We propose a framework that generates a new time series representation for classical feature-based classification, based on the discovery of discriminant sets of time series subsequences (motifs). The framework transforms a set of time series into a feature space using subsequences enumerated from the time series, distance measures and aggregation functions. One particular instance of this framework is the well-known shapelet approach.
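The transformation described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation from the thesis: the function names `sliding_distances` and `subsequence_transform` are hypothetical, the distance is assumed to be Euclidean, and the default aggregation is assumed to be the minimum, which corresponds to the usual shapelet-style feature. Every series is assumed to be at least as long as every subsequence.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliding_distances(series, subsequence):
    """Euclidean distance between `subsequence` and every window of `series`.

    Assumes len(series) >= len(subsequence).
    """
    series = np.asarray(series, dtype=float)
    subsequence = np.asarray(subsequence, dtype=float)
    # One row per window of the same length as the subsequence.
    windows = sliding_window_view(series, len(subsequence))
    return np.linalg.norm(windows - subsequence, axis=1)


def subsequence_transform(dataset, subsequences, aggregate=np.min):
    """Map each series to one feature per subsequence.

    Feature (i, j) is the aggregated distance profile between series i and
    subsequence j. With `aggregate=np.min` (an assumption made here), the
    feature is the distance to the best-matching window.
    """
    features = np.empty((len(dataset), len(subsequences)))
    for i, series in enumerate(dataset):
        for j, subsequence in enumerate(subsequences):
            features[i, j] = aggregate(sliding_distances(series, subsequence))
    return features


# Toy data: the first series contains the bump [1, 2, 1], the second is flat.
dataset = [
    np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0]),
    np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
]
subsequences = [np.array([1.0, 2.0, 1.0])]
features = subsequence_transform(dataset, subsequences)
```

The resulting matrix has one row per time series and one column per subsequence, so it can be passed to any standard feature-based classifier. Other distances or aggregation functions can be swapped in through the `aggregate` argument or by replacing `sliding_distances`.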
The potential drawback of such an approach is the large number of subsequences to enumerate, which induces a very large feature space and a very high computational complexity. We show that most subsequences in a time series dataset are redundant. Random sampling can therefore be used to draw a very small fraction of the exhaustive set of subsequences, which preserves the information needed for classification and yields a much smaller feature space that is compatible with common machine learning algorithms and tractable to compute. We also demonstrate that the number of subsequences to draw does not depend on the number of instances in the training set, which guarantees the scalability of the approach.
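As an illustration of the sampling idea, the sketch below draws a fixed number of subsequences at random instead of enumerating them all. It is a hedged example under stated assumptions, not the sampling scheme of the thesis: the helper names `count_all_subsequences` and `sample_subsequences` are hypothetical, and the series, length and start position are each assumed to be drawn uniformly. Every series is assumed to be at least `min_length` long.

```python
import numpy as np


def count_all_subsequences(dataset):
    """Size of the exhaustive set: a series of length n has n * (n + 1) / 2
    contiguous subsequences (all lengths from 1 to n, all start positions)."""
    return sum(len(series) * (len(series) + 1) // 2 for series in dataset)


def sample_subsequences(dataset, n_samples, min_length, max_length, seed=None):
    """Draw `n_samples` random contiguous subsequences from `dataset`.

    Uniform choice of series, length and start position is an assumption
    made for this sketch. Requires min_length <= len(series) for all series.
    """
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(n_samples):
        series = dataset[int(rng.integers(len(dataset)))]
        longest = min(max_length, len(series))
        # `integers` excludes the upper bound, hence the + 1.
        length = int(rng.integers(min_length, longest + 1))
        start = int(rng.integers(0, len(series) - length + 1))
        samples.append(np.array(series[start:start + length], dtype=float))
    return samples


# Toy data: three series of length 50 whose values are their own indices.
dataset = [np.arange(50, dtype=float) for _ in range(3)]
samples = sample_subsequences(dataset, n_samples=20, min_length=5, max_length=10, seed=0)
total = count_all_subsequences(dataset)
```

Note that `n_samples` is chosen independently of `len(dataset)`, which mirrors the claim that the number of subsequences to draw is not tied to the number of training instances. The size of the feature space is then fixed by `n_samples` rather than by the exhaustive count.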
Combining this random sampling with our framework enables us to take advantage of advanced techniques (such as multivariate feature selection) to discover richer motif-based time series representations for classification, for example by taking into account the relationships between the subsequences.
These theoretical results have been extensively tested on more than one hundred classical benchmarks from the literature, with univariate and multivariate time series. Moreover, since this research was conducted under an industrial research agreement (CIFRE) with ArcelorMittal, our work has been applied to the detection of defective steel products from production-line sensor measurements.
Thesis defense: 15-09-2017 - 14h30 - Site Jussieu 25-26/105
Jury members:
DOUZAL Ahlame (Université Joseph Fourier, Grenoble 1) [Reviewer]
WEHENKEL Louis (Université de Liège) [Reviewer]
GALLINARI Patrick (Université Pierre et Marie Curie, Paris 6)
MARTEAU Pierre-François (Université Bretagne Sud)
PALPANAS Themis (Université Paris Descartes, Paris 5)
DETYNIECKI Marcin (Université Pierre et Marie Curie, Paris 6) [Supervisor]
RIFQI Maria (Université Panthéon-Assas, Paris 2) [Supervisor]
FRICOUT Gabriel (ArcelorMittal Research) [Industrial supervisor]

Publications 2015-2019
