James Ryan Requeima
Email: james.requeima@gmail.com
My Curriculum Vitae.
Machine Learning
I'm a PhD student studying machine learning at the University of Cambridge in the Computational and Biological Learning Lab. My advisor is Dr. Richard Turner. I'm interested in Bayesian optimization, meta-learning, and approximate inference methods.
I'm currently a visiting student at MILA under the supervision of Yoshua Bengio.
Previously, I completed a Master's in machine learning, speech and language technology at the University of Cambridge where my advisor was Dr. Zoubin Ghahramani. During my Master's, I worked on an information-theoretic acquisition function for Bayesian optimization called IPES.
Invenia
I’m also a researcher at Invenia Technical Computing based in Winnipeg, Manitoba. We use machine learning techniques to forecast demand for power in the electricity grid, energy production from wind farms, and electricity prices in wholesale power markets. I helped set up our research offices in Montréal, Canada and Cambridge, England.
My colleagues at Invenia and I did some analysis on electricity price time series for a couple of North American wholesale electricity markets. You can find our paper here.
Mathematics
I am a tenured member of the Department of Mathematics at Dawson College in Montréal. If you're looking for CEGEP-level materials and online resources, my colleagues and I maintain this website.
When studying mathematics, my specialization was geometric group theory, combinatorial group theory, and algebraic topology. I studied under Dani Wise at McGill University, who was recently awarded a Guggenheim Fellowship and the Oswald Veblen Prize in Geometry.
Publications
Fast and Flexible Multi-Task Classification Using Conditional Neural Adaptive Processes
The goal of this paper is to design image classification systems that, after an initial multi-task training phase, can automatically adapt to new tasks encountered at test time. We introduce a conditional neural process based approach to the multi-task classification setting for this purpose, and establish connections to the meta-learning and few-shot learning literature. The resulting approach, called CNAPs, comprises a classifier whose parameters are modulated by an adaptation network that takes the current task's dataset as input. We demonstrate that CNAPs achieves state-of-the-art results on the challenging Meta-Dataset benchmark, indicating high-quality transfer learning. We show that the approach is robust, avoiding both over-fitting in low-shot regimes and under-fitting in high-shot regimes. Timing experiments reveal that CNAPs is computationally efficient at test time as it does not involve gradient-based adaptation. Finally, we show that trained models are immediately deployable to continual learning and active learning where they can outperform existing approaches that do not leverage transfer learning.
James Requeima, Jonathan Gordon, John Bronskill, Sebastian Nowozin, Richard E. Turner
To appear as a spotlight paper in the Conference on Neural Information Processing Systems, 2019.
paper  bibtex

The Gaussian Process Autoregressive Regression Model (GPAR)
Multi-output regression models must exploit dependencies between outputs to maximise predictive performance. The application of Gaussian processes (GPs) to this setting typically yields models that are computationally demanding and have limited representational power. We present the Gaussian Process Autoregressive Regression (GPAR) model, a scalable multi-output GP model that is able to capture nonlinear, possibly input-varying, dependencies between outputs in a simple and tractable way: the product rule is used to decompose the joint distribution over the outputs into a set of conditionals, each of which is modelled by a standard GP. GPAR’s efficacy is demonstrated on a variety of synthetic and real-world problems, outperforming existing GP models and achieving state-of-the-art performance on established benchmarks.
James Requeima, Will Tebbutt, Wessel Bruinsma, Richard E. Turner
To appear in International Conference on Artificial Intelligence and Statistics, 2019.
paper  bibtex
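As a rough illustration of the decomposition described in the abstract above (the notation is assumed here for illustration, not taken from the paper): for M outputs y_1, ..., y_M observed at an input x, the product rule gives

```latex
p(y_1, \dots, y_M \mid x) = \prod_{m=1}^{M} p(y_m \mid y_1, \dots, y_{m-1}, x)
```

Each factor on the right would then be modelled by a GP whose inputs are x together with the preceding outputs, so the ordering of the outputs is a modelling choice.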

Characterizing and Warping the Function Space of Bayesian Neural Networks
In this work we develop a simple method to construct priors for Bayesian neural networks that incorporates meaningful prior information about functions. This method allows us to characterize the relationship between weight space and function space.
Daniel Flam-Shepherd, James Requeima, David Duvenaud
NeurIPS Bayesian Deep Learning Workshop, 2018.
paper  bibtex

Parallel and distributed Thompson sampling for large-scale accelerated exploration of chemical space
Chemical space is so large that brute force searches for new interesting molecules are infeasible. High-throughput virtual screening via computer cluster simulations can speed up the discovery process by collecting very large amounts of data in parallel, e.g., up to hundreds or thousands of parallel measurements. Bayesian optimization (BO) can produce additional acceleration by sequentially identifying the most useful simulations or experiments to be performed next. However, current BO methods cannot scale to the large numbers of parallel measurements and the massive libraries of molecules currently used in high-throughput screening. Here, we propose a scalable solution based on a parallel and distributed implementation of Thompson sampling (PDTS). We show that, in small-scale problems, PDTS performs similarly to parallel expected improvement (EI), a batch version of the most widely used BO heuristic. Additionally, in settings where parallel EI does not scale, PDTS outperforms other scalable baselines such as a greedy search, ϵ-greedy approaches and a random search method. These results show that PDTS is a successful solution for large-scale parallel BO.
José Miguel Hernández-Lobato, James Requeima, Edward O. Pyzer-Knapp, Alán Aspuru-Guzik
International Conference on Machine Learning, 2017.
paper  bibtex
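The batch selection idea in the abstract above can be sketched as follows. This is a minimal toy, not the paper's implementation: it assumes a Bayesian linear model as a stand-in for whatever predictive model is actually used, the function names `posterior` and `thompson_batch` are hypothetical, and a sequential loop stands in for the parallel workers.

```python
import numpy as np


def posterior(X, y, noise_var=0.1, prior_var=1.0):
    """Gaussian posterior over the weights of a Bayesian linear model (assumed stand-in model)."""
    d = X.shape[1]
    precision = np.eye(d) / prior_var + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    cov = (cov + cov.T) / 2.0  # symmetrize against round-off before sampling
    mean = cov @ X.T @ y / noise_var
    return mean, cov


def thompson_batch(features, observed_idx, observed_y, batch_size, rng):
    """Select batch_size unevaluated candidates, using one posterior sample per selection."""
    mean, cov = posterior(features[observed_idx], observed_y)
    taken = set(observed_idx)
    batch = []
    for _ in range(batch_size):  # each iteration stands in for one parallel worker
        w = rng.multivariate_normal(mean, cov)  # draw one plausible model
        scores = features @ w                   # score every candidate under that draw
        scores[list(taken)] = -np.inf           # skip evaluated or already chosen candidates
        choice = int(np.argmax(scores))
        batch.append(choice)
        taken.add(choice)
    return batch


# Toy usage: 50 synthetic candidates with 3 features, 5 of them already evaluated.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 3))
observed_idx = [0, 1, 2, 3, 4]
observed_y = features[observed_idx] @ np.array([1.0, -2.0, 0.5])
batch = thompson_batch(features, observed_idx, observed_y, batch_size=4, rng=rng)
```

Because every selection uses a different posterior draw, the batch is diverse without any explicit coordination between workers, which is what makes this style of selection easy to distribute.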

Mapping Gaussian Process Priors to Bayesian Neural Networks
Currently, BNN priors are specified over network parameters with little thought given to the distributions over functions that are implied. What do N(0, 1) parameter priors look like in function space and is this a reasonable assumption? We argue that we should be thinking about priors over functions, and that the network architecture should be an approximation strategy for these priors. Gaussian Processes offer an elegant mechanism in the kernel to specify properties we believe our underlying function has. In this work we propose a method that uses a BNN to approximate the distribution over functions given by a GP prior.
Daniel Flam-Shepherd, James Requeima, David Duvenaud
NIPS Bayesian Deep Learning Workshop, 2017.
paper  bibtex

Master's Thesis: Integrated Predictive Entropy Search for Bayesian Optimization
Predictive Entropy Search (PES) is an information-theoretic acquisition function that has been demonstrated to perform well on several applications. PES harnesses our estimate of the uncertainty in our objective to recommend query points that maximize the amount of information gained about the local maximizer. It cannot, however, harness the potential information gained in our objective model hyperparameters for better recommendations. This dissertation introduces a modification to the Predictive Entropy Search acquisition function called Integrated Predictive Entropy Search (IPES) that uses a fully Bayesian treatment of our objective model hyperparameters. The IPES acquisition function is the same as the original PES acquisition function except that the hyperparameters have been marginalized out of the predictive distribution, and so it is able to recommend points taking into account the uncertainty and reduction in uncertainty in the hyperparameters. It can recommend queries that yield more information about the local maximizer through information gained about hyperparameter values.
James Requeima, Advisor: Zoubin Ghahramani
paper  bibtex  code

Master's Thesis: Relative sectional curvature in compact angled 2-complexes
We define the notion of relative sectional curvature for 2-complexes, and prove that a compact angled 2-complex that has negative sectional curvature relative to planar sections has coherent fundamental group. We analyze a certain type of 1-complex that we call flattenable graphs Γ → X for a compact angled 2-complex X, and show that if X has nonpositive sectional curvature, and if for every flattenable graph π1(Γ) → π1(X) is finitely presented, then X has coherent fundamental group. Finally we show that if X is a compact angled 2-complex with negative sectional curvature relative to π-gons and planar sections then π1(X) is coherent. Some results are provided which are useful for creating examples of 2-complexes with these properties, or for testing a 2-complex for these properties.
James Requeima, Advisor: Daniel Wise
paper  bibtex