GAO Sheng

PhD graduate
Departure date : 09/30/2012

Supervision : Patrick GALLINARI

Co-supervision : DENOYER Ludovic

Latent Factor Models for Link Prediction Problems

With the rise of the Internet and modern social media, relational data, in which objects are linked to each other by various types of relations, has become ubiquitous. Accordingly, relational learning techniques have been studied in a wide variety of applications, such as recommender systems, social network analysis, Web mining, and bioinformatics. Among the wide range of tasks encompassed by relational learning, this thesis addresses the problem of link prediction.
Link prediction has emerged as a fundamental task in relational learning: it aims to predict the presence or absence of links between objects in relational data, based on the topological structure of the network and/or the attributes of the objects. However, the complexity and sparsity of network structure make this a very challenging problem. In this thesis, we propose solutions that reduce the difficulty of learning and adapt various models to their corresponding applications.
In Chapter 3, we present a unified framework of latent factor models to address the generic link prediction problem, and discuss the various model configurations from both a computational and a probabilistic perspective. Then, according to the applications addressed in this dissertation, we propose different latent factor models for two classes of link prediction problems: (i) structural link prediction and (ii) temporal link prediction.
For the structural link prediction problem, in Chapter 4 we define a new task called Link Pattern Prediction (LPP) in multi-relational networks. By introducing a specific latent factor for the different relation types, in addition to the latent feature factors that characterize objects, we develop a computational tensor factorization model, together with a probabilistic version and its Bayesian treatment, to reveal the intrinsic causality of interaction patterns in multi-relational networks. Moreover, considering the complex structural patterns in relational data, in Chapter 5 we propose a novel model that simultaneously incorporates the effect of latent feature factors and the impact of the latent cluster structures in the network, and we develop an optimization transfer algorithm to facilitate model learning.
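To make the idea of a relation-specific latent factor concrete, here is a minimal sketch of a CP-style tensor factorization, in which the score of a link from object i to object j under relation k is the sum over latent dimensions of U[i, d] * V[j, d] * R[k, d]. This is an illustration only and not the model from the thesis: the function names (`cp_scores`, `fit_cp`), the squared loss, the plain gradient descent, the hyperparameters, and the toy tensor are all assumptions, and unobserved entries are simply treated as zeros.

```python
import numpy as np

def cp_scores(U, V, R):
    """score[i, j, k] = sum_d U[i, d] * V[j, d] * R[k, d]."""
    return np.einsum("id,jd,kd->ijk", U, V, R)

def objective(X, U, V, R, reg):
    """Squared reconstruction error plus an L2 penalty on all factors."""
    err = cp_scores(U, V, R) - X
    penalty = np.sum(U ** 2) + np.sum(V ** 2) + np.sum(R ** 2)
    return 0.5 * np.sum(err ** 2) + 0.5 * reg * penalty

def fit_cp(X, rank=2, lr=0.02, reg=0.01, epochs=300, seed=0):
    """Fit object factors U, V and relation factors R by gradient descent.

    All hyperparameters are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_src, n_dst, n_rel = X.shape
    U = 0.5 * rng.standard_normal((n_src, rank))
    V = 0.5 * rng.standard_normal((n_dst, rank))
    R = 0.5 * rng.standard_normal((n_rel, rank))
    history = [objective(X, U, V, R, reg)]
    for _ in range(epochs):
        err = cp_scores(U, V, R) - X
        # Gradients of the objective with respect to each factor matrix.
        grad_U = np.einsum("ijk,jd,kd->id", err, V, R) + reg * U
        grad_V = np.einsum("ijk,id,kd->jd", err, U, R) + reg * V
        grad_R = np.einsum("ijk,id,jd->kd", err, U, V) + reg * R
        U, V, R = U - lr * grad_U, V - lr * grad_V, R - lr * grad_R
        history.append(objective(X, U, V, R, reg))
    return U, V, R, history

# Hypothetical toy network: 4 objects, 2 relation types, X[i, j, k] = 1 for a link.
X = np.zeros((4, 4, 2))
X[0, 1, 0] = X[1, 2, 0] = X[2, 3, 0] = 1.0
X[0, 2, 1] = X[1, 3, 1] = 1.0

U, V, R, history = fit_cp(X)
scores = cp_scores(U, V, R)  # scores[i, j, :] is a predicted link pattern for the pair (i, j)
```

Reading `scores[i, j, :]` across all relation types gives one score per relation for a pair of objects, which is one simple way to think about predicting a pattern of links rather than a single link.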
For the temporal link prediction problem in time-evolving networks, in Chapter 6 we propose a unified latent factor model that integrates multiple information sources in the network, including the global network structure, the content of objects, and graph proximity information, to capture the time-evolving patterns of links. This joint model is built on matrix factorization and graph regularization techniques.
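As a rough illustration of how matrix factorization and graph regularization can be combined, the sketch below factorizes a single adjacency snapshot A as U V^T while a Laplacian penalty tr(U^T L U) pulls the latent vectors of proximate nodes together. It is not the thesis's joint model: it ignores object content and the temporal dimension, and the function names, the common-neighbour proximity matrix W, the squared loss, and all hyperparameters are assumptions made for the example.

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized Laplacian L = D - W of a symmetric proximity matrix W."""
    return np.diag(W.sum(axis=1)) - W

def objective(A, U, V, L, lam, reg):
    """Reconstruction error + graph smoothness of U + L2 penalty."""
    err = U @ V.T - A
    smooth = np.trace(U.T @ L @ U)
    penalty = np.sum(U ** 2) + np.sum(V ** 2)
    return 0.5 * np.sum(err ** 2) + 0.5 * lam * smooth + 0.5 * reg * penalty

def fit_graph_regularized_mf(A, W, rank=2, lam=0.1, reg=0.01,
                             lr=0.02, epochs=300, seed=0):
    """Gradient descent on the objective above (illustrative settings)."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    L = graph_laplacian(W)
    U = 0.5 * rng.standard_normal((n, rank))
    V = 0.5 * rng.standard_normal((m, rank))
    history = [objective(A, U, V, L, lam, reg)]
    for _ in range(epochs):
        err = U @ V.T - A
        grad_U = err @ V + lam * (L @ U) + reg * U  # L is symmetric
        grad_V = err.T @ U + reg * V
        U, V = U - lr * grad_U, V - lr * grad_V
        history.append(objective(A, U, V, L, lam, reg))
    return U, V, L, history

# Hypothetical toy snapshot: an undirected path graph 0-1-2-3-4.
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0

# Assumed proximity: number of common neighbours, with the diagonal zeroed.
W = A @ A
np.fill_diagonal(W, 0.0)

U, V, L, history = fit_graph_regularized_mf(A, W)
link_scores = U @ V.T  # higher score = more plausible link
```

The penalty is equivalent to half the sum over node pairs of W[i, j] times the squared distance between their latent vectors, so larger `lam` makes proximate nodes more alike in latent space.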
Each model proposed in this thesis achieves state-of-the-art performance; extensive experiments on real-world datasets demonstrate significant improvements over baseline methods. Almost all of the models have been published in international or national peer-reviewed conference proceedings.

Defence : 06/19/2012

Jury members :

GAUSSIER Eric (Professor, Université Joseph Fourier) [Reviewer]
YVON François (Professor, Université Paris-Sud) [Reviewer]
ROSSI Fabrice (Professor, Université Paris 1 Panthéon-Sorbonne)
NADIF Mohamed (Professor, Université Paris Descartes)
DENOYER Ludovic (Associate Professor, Université Pierre et Marie Curie)
GALLINARI Patrick (Professor, Université Pierre et Marie Curie)


2010-2017 Publications
