PhD student
Team : BD
Arrival date : 10/01/2021
    Sorbonne Université - LIP6
    Boîte courrier 169
    Couloir 25-26, Étage 5, Bureau 502
    4 place Jussieu
    75252 PARIS CEDEX 05

Tel: +33 1 44 27 87 56, Hamed.Rahimi (at) lip6.fr

Supervision : Bernd AMANN

Co-supervision : Hubert NAACKE

Semantization of large-scale scientific corpora - Application to the interactive analysis of the evolution of science

Today, Knowledge Graphs (KGs) are booming. DBpedia, Wikidata, and YAGO provide encyclopedic knowledge on many concepts and entities, and they are only the most prominent of many knowledge graphs that have reached a very high level of reliability. Together they constitute an immense resource for better understanding both past information and the information produced in mass every day. KGs describe knowledge using the RDF standard and can be queried with the SPARQL query language. On the other hand, a lot of knowledge is produced in textual form and published as articles in scientific archives such as Web of Science, ISTEX, CiteSeer, and arXiv. To facilitate the exploration and global analysis of the knowledge in these archives, existing approaches describe (index) the documents by sets of terms. These term sets do not capture all the conceptual nuances needed to build maps describing the semantic and temporal (evolution) relationships between research topics and scientific domains. Structuring this knowledge so that it can be integrated into a knowledge graph is difficult and still requires significant human effort. As a result, deeper semantic analysis of scientific corpora is hampered by the lack of a semantic representation of scientific concepts. In this study, we aim to build semantic topic evaluations in order to characterize the temporal evolution of science by building topic evolution graphs. We will also investigate evolution pattern queries by combining various models, tools, and resources from text mining, machine learning, databases, and the semantic web (RDF/SPARQL, Wikidata, DBpedia, YAGO).