Razvan Pascanu
I'm currently a Research Scientist at DeepMind.
I grew up in Romania and studied computer science and electrical engineering for my undergraduate degree in Germany. I received my MSc from Jacobs University, Bremen in 2009 under the supervision of prof. Herbert Jaeger. I hold a PhD from the University of Montreal (2014), which I did under the supervision of prof. Yoshua Bengio. My PhD thesis can be found here.
I was involved in developing Theano and helped write some of the deep learning tutorials for Theano. I've published several papers on topics surrounding deep learning and deep reinforcement learning (see my scholar page). I'm one of the organizers of EEML (www.eeml.eu) and part of the organizing team of AIRomania. As part of the AIRomania community, I have organized RomanianAIDays since 2020 and helped build a course on AI aimed at high school students.
Specifically, my research interests include topics such as:
Optimization & Learning -- During my PhD I was fascinated by second-order methods and natural gradient, on which I spent quite a bit of time. I am interested in:
understanding how we can efficiently optimize deep models at scale;
how we can make learning more data-efficient (particularly in RL);
understanding the dynamics of learning (particularly under gradient descent);
Memory & RNNs -- I'm particularly interested in how memory is formed in recurrent models, and what kind of structures can help us utilize memory better (a topic I've been working on since my Master's). I'm interested both in looking at the learning process to achieve this goal and in structural changes to the model. A few of my published works center on understanding and exploring alternative recurrent model formulations.
Learning with multiple tasks: Continual Learning, Transfer Learning, Multi-task Learning, Curriculum Learning, Meta-learning -- With the explicit goal of improving data efficiency, I have been working on multiple problems formulated around training with multiple tasks: from Continual Learning, where the tasks are encountered sequentially, to Transfer Learning, Multi-task Learning, Curriculum Learning and Meta-learning. I've published several works that try to address these problems from different angles -- some of which had considerable impact -- as well as organized workshops on these topics and, more recently, a new conference (lifelong-ml.cc).
Graph Neural Networks -- Adding meaningful structure to neural networks is definitely an important future direction that I believe we need to understand. I have looked at the impact of graph-structured neural networks and at how to apply neural models to graph-structured data, including a survey of the field. Recently I was involved in organizing a conference specifically on this topic (logconference.org).
Theory for deep networks (representation/learning) -- I'm interested in understanding how neural networks work. I've been looking in this direction both from a representational angle (what family of functions a fixed-size model can represent) and from a learning perspective (the structure of the loss surface of these models), and have published works on both topics.
Reinforcement Learning, Generative Models -- While my work does not focus directly on these topics, I have been involved in a few works in this space, and it is a topic of interest to me that comes in and out of focus.
Students
Thomas Schmied, jointly with Sepp Hochreiter at Johannes Kepler University
Previous Students
Tudor Berariu, jointly with Claudia Clopath at Imperial College London
Siddhant Jayakumar, jointly with Thore Graepel at University College London