Deep Neural Networks achieve outstanding performance on many benchmarks, yet the key ingredient of their success remains unknown. This is mainly due to the high-dimensional nature of these objects: they have a large number of parameters $D$ and operate on high-dimensional inputs $d$. From now on, without loss of generality, we focus on Neural Networks $\Phi$ trained for a classification task on $N$ samples. The weights of a Neural Network are learned via supervision, and such networks tend to generalize well to a new test set: this implies that these architectures have memorized important attributes of the dataset. During this PhD, we propose to study those attributes from both a theoretical and a numerical point of view: what is their nature, how are they learned, and how are they stored? We propose to study two types of mechanisms, which can be addressed independently while being closely connected: memorization through the symmetries of a supervised or unsupervised task, and memorization through the data. Interestingly, any progress on one aspect will benefit the other.
Keywords: deep learning, statistical learning, deep learning theory
This PhD research project has been submitted for funding to the "Sorbonne Center for Artificial Intelligence (SCAI)". The PhD candidate selected by the project leader will therefore take part in the funding selection process (including an application file and an interview).