Supervision : Bernd AMANN
Co-supervision : Mohammed Amine BAAZIZI
A pattern model and algebra for representing and querying relative information completeness
Information incompleteness is a major data quality issue which is amplified by the increasing amount of data collected from unreliable sources. Assessing the completeness of data is crucial for determining the quality of the data and the validity of query answers.
In this thesis, we tackle the problem of extracting and reasoning about complete and missing information in the relative information completeness setting. We make two contributions: a pattern model that provides minimal covers summarizing the extent of complete and missing data partitions, and a pattern algebra that derives minimal pattern covers for query answers in order to analyze their validity.
The completeness pattern framework opens the way to many applications, particularly those that aim to improve the quality of tasks affected by missing data. Data imputation is a well-known technique for repairing missing data values, but it can incur a prohibitive cost when applied to large data sets. Query-driven imputation offers a better alternative. We adopt a rule-based query rewriting technique for imputing the answers of aggregation queries that are missing or incorrect due to data incompleteness. We present a novel query rewriting mechanism that is guided by the completeness pattern model and algebra.
We also investigate the generalization of our pattern model to summarize arbitrary data fragments. Summaries can be queried to analyze and compare data fragments in a concise and flexible way.
Defence : 06/28/2019
Jury members :
Pr. Nicole Bidoit-Tollu, Université Paris-Sud [Rapporteur]
Pr. Dimitris Kotzinos, Université de Cergy [Rapporteur]
Pr. Ladjel Bellatreche, ENSAM
Pr. Laure Berti-Equille, Université Aix-Marseille
Pr. Christophe Marsala, Sorbonne Université
Pr. Bernd Amann, Sorbonne Université
Dr. Mohamed-Amine Baazizi, Sorbonne Université
- F. Hannou : “Un modèle et une algèbre de motifs pour la représentation et l’interrogation de la complétude de l’information relative”, thesis, defence 06/28/2019, supervision Amann, Bernd, co-supervision : Baazizi, Mohammed Amine (2019)
- F.‑Z. Hannou, B. Amann, M.‑A. Baazizi : “Explaining Query Answer Completeness and Correctness with Partition Patterns”, 30th International Conference on Database and Expert Systems Applications - DEXA 2019, vol. 11707, Lecture Notes in Computer Science, Linz, Austria, pp. 47-62 (2019)
- F.‑Z. Hannou, B. Amann, M.‑A. Baazizi : “Query-Oriented Answer Imputation for Aggregate Queries”, Advances in Databases and Information Systems - 23rd European Conference, ADBIS 2019, vol. 11695, Lecture Notes in Computer Science, Bled, Slovenia, pp. 302-318, (Springer) (2019)
- F.‑Z. Hannou, B. Amann, M.‑A. Baazizi : “Exploring and Comparing Table Fragments With Fragment Summaries”, Eleventh International Conference on Advances in Databases, Knowledge, and Data Applications (DBKDA), Athens, Greece, pp. 31-38, (IARIA) (2019)
- F.‑Z. Hannou, B. Amann, M.‑A. Baazizi : “Explaining Query Answer Completeness and Correctness with Minimal Pattern Covers”, (2019)
- F.‑Z. Hannou, B. Amann, M.‑A. Baazizi : “Une Algèbre de Motifs pour l’Évaluation et l’Analyse de la Complétude des Données et l’Éxactitude des Requêtes”, BDA 2018 Gestion de Données–Principes, Technologies et Applications, Bucharest, Romania (2018)