CARON Clément
Thesis supervisor: Bernd AMANN
Co-supervisor: CONSTANTIN Camelia
Provenance and Quality in Data Oriented Workflows : Application to the WebLab Platform
The WebLab platform is an application used to define and execute media-mining workflows. It is an open-source platform, developed by the IPCC section of Airbus Defence and Space, for the integration of external components. A designer can create complex media-mining workflows from components whose internal operation is not always known (black-box services). Such complex workflows can, however, suffer from data quality problems, and before this work no tool existed to analyse and improve the quality of WebLab workflows.
To deal with black-box services, we tackle this quality problem with a non-intrusive approach: we enhance the definition of a WebLab workflow with provenance rules and quality propagation rules. After the execution of a WebLab workflow, the provenance rules generate fine-grained data dependency links between data and services. The quality propagation rules then use these links to reason about how the quality of the data consumed by a component influences the quality of the data it produces.
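The two-step idea above (dependency links first, quality propagation second) can be illustrated with a minimal Python sketch. It is not the thesis's actual rule language or the WebLab API: the link triples, the item and service names, the quality scores, and the choice of `min` as the propagation rule are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical provenance links, as they might look after a workflow run:
# (output item, service that produced it, input item it depends on).
links = [
    ("text1", "Normaliser", "doc1"),
    ("translation1", "Translator", "text1"),
    ("translation1", "Translator", "lexicon1"),
    ("entities1", "EntityExtractor", "translation1"),
    ("entities1", "EntityExtractor", "text1"),
]


def propagate_quality(links, source_quality, rule=min):
    """Propagate quality values along provenance links.

    `rule` combines the qualities of all inputs an item depends on.
    `min` is an assumed rule (an output is no better than its worst
    input); the provenance graph is assumed to be acyclic.
    """
    inputs = defaultdict(set)
    for output, _service, inp in links:
        inputs[output].add(inp)

    quality = dict(source_quality)

    def resolve(item):
        if item not in quality:
            if item not in inputs:
                raise KeyError(f"no quality known for source item {item!r}")
            quality[item] = rule(resolve(i) for i in inputs[item])
        return quality[item]

    for item in list(inputs):
        resolve(item)
    return quality


quality = propagate_quality(links, {"doc1": 0.9, "lexicon1": 0.6})
```

With these assumed scores, the low-quality `lexicon1` caps `translation1` at 0.6, and that value carries over to `entities1`, while `text1` keeps the 0.9 of `doc1`. The services are never inspected, which is the point of a non-intrusive approach: only the links and the source values are needed.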
The contributions of this thesis are:
- a model for generating provenance links based on data dependency rules;
- a propagation model for quality values over a provenance graph;
- an extension of the WebLab architecture that implements both models, together with a user interface.
Thesis defense: 03.11.2015
Jury members:
VIDAL Maria Esther, Université Simon Bolivar, Venezuela (PR, CV attached) [Rapporteur]
GRIGORI Daniela, PR/HDR Université de Dauphine [Rapporteur]
VARGAS-SOLAR Genoveva, CR CNRS/HDR LIG Grenoble
MARSALA Christophe, PR UPMC (EDITE)
AMANN Bernd, PR UPMC (EDITE)
CONSTANTIN Camelia, MCF UPMC (EDITE)