From query processing to information integration: the power of decomposition techniques
Speaker(s): Zoltan Miklos, EPFL, Switzerland
The prevalent use of information technology has created a number of new challenges. Applications often need to cope with massive amounts of data, integrated from multiple, heterogeneous sources. We discuss various types of decomposition techniques that are useful for addressing these challenges. Decomposition techniques make it possible to focus on smaller sub-problems and then to combine their partial results to obtain a solution to the original problem. First, we elaborate on decomposition techniques for Boolean conjunctive query processing over relational databases and present a comprehensive picture of the corresponding complexity questions. Then, we discuss entity resolution problems in Web data collections, where we rely on machine learning methods. In this case, a different type of problem decomposition led to effective techniques. Finally, we report on our ongoing work on P2P data integration, in particular in the context of business networks. We present our model, which focuses on some specific, important, yet widely ignored aspects of the integration problem. We demonstrate how yet another type of decomposition method can help here.
Short bio: Zoltan Miklos is a postdoctoral researcher at EPFL, Switzerland. He completed his doctoral studies in computer science at the University of Oxford. Before moving to Oxford, he worked as a research assistant at the Technical University of Vienna and the Vienna University of Economics, in Austria. He completed his undergraduate studies at ELTE University in Budapest. He also worked for four years as a software developer at Siemens. He is interested in both theoretical and applied aspects of computer science. In particular, his work focuses on databases (database theory, data mining, data integration, entity resolution), artificial intelligence (constraint satisfaction problems, managing uncertainties, machine learning, the Semantic Web), distributed information systems (P2P and publish/subscribe systems, cloud computing), graph theory, and information retrieval.
Sahar.Changuel (at) lip6.fr