CIS Network Seminar: IDIAP
“Understanding Language as Bayesian Inference of Bayesian Representations”
Dr. James Henderson, Senior Researcher, IDIAP
Monday April 24, 2023 | 15:15-16:15 CEST
The astounding recent advances in AI have been enabled by the Transformer deep learning architecture and its incredibly effective inductive bias for Large Language Models. We argue that this architecture is fundamentally Bayesian, in that its learned representations are best thought of as probability distributions. We then propose an extension to Transformers that adds nonparametric Bayesian inference of these Bayesian representations. Our Nonparametric Variational Information Bottleneck can regularise any attention-based representation, resulting in generative models with variable-sized sparse latent representations. Existing large Transformers pretrained on text can be reinterpreted as approximating this inference, suggesting that the success of Transformers is due to their ability to approximate nonparametric Bayesian inference of Bayesian representations.
Dr. James Henderson is a Senior Researcher at the Idiap Research Institute, where he heads the Natural Language Understanding group. He was previously a Maître d'Enseignement et de Recherche (MER) at the University of Geneva, and has held a number of other research positions since his PhD at the University of Pennsylvania and BSc at MIT. He investigates machine learning methods for natural language processing, with a long history of work on neural network structured prediction and on variational Bayesian approaches to deep learning.