Temporal data constitute a large part of the data collected digitally. Predicting their future values is an important and challenging task in domains such as climatology, optimal control, and natural language processing. Standard statistical methods are based on linear models and are often limited to low-dimensional data. We instead use deep learning methods, which can handle high-dimensional structured data and leverage large quantities of examples.
In this thesis, we are interested in latent variable models. Unlike autoregressive models, which directly use past data to perform prediction, latent models infer low-dimensional vector representations of the data, on which prediction is then performed.
First, we propose a structured latent model for spatiotemporal data forecasting. Next, we focus on predicting data distributions, rather than point estimates, through a diachronic language modeling task. Finally, we propose a stochastic prediction model applied to video prediction.