NeurIPS 2022 has announced the shortlisted research papers for this year’s conference, to be held at the New Orleans Convention Center starting 26 November, 2022. Let’s take a look at some of the most interesting papers from prominent tech organisations that will be presented:
- ‘Bayesian Optimization for a Better Dessert’ by Greg Kochanski, Daniel Golovin, John Karro, Benjamin Solnik, Subhodeep Moitra, and D. Sculley
A group of researchers from Google Brain submitted a study applying Bayesian Optimization to a unique real-world problem: chocolate chip cookies. The paper concluded that the cookie recipe could be optimised by tuning its parameters, i.e. the ingredients, such as the amounts of flour and sugar.
The researchers wanted a task familiar enough that everyone could understand the real complications involved and appreciate the outcome. The study ran a total of 144 experiments in a mixed-initiative system involving chefs, tasters and a machine optimiser. Each experiment was relatively costly, requiring manual labour to mix and bake the cookies, taste them and assign scores.
The research was a success, producing highly rated cookies that surprised tasters in different cities: the Californian batches contained much less sugar than usual, while the Pittsburgh batches included cayenne.
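The kind of loop the study describes can be sketched as follows. Everything here is illustrative: the two-ingredient "taste" function stands in for human tasters, and the Gaussian-process surrogate with an upper-confidence-bound rule is one common Bayesian-optimisation recipe, not necessarily the exact method the paper used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical taste score over two recipe parameters (sugar and
# flour fractions); in the study, scores came from human tasters.
def taste(x):
    sugar, flour = x
    return -(sugar - 0.3) ** 2 - (flour - 0.6) ** 2

def rbf(A, B, ls=0.2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls ** 2))

# Gaussian-process posterior mean and std at candidate recipes.
def gp_posterior(X, y, Xc, noise=1e-4):
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(Xc, X)
    Kinv = np.linalg.inv(K)
    mu = Ks @ Kinv @ y
    var = 1.0 - np.einsum("ij,jk,ik->i", Ks, Kinv, Ks)
    return mu, np.sqrt(np.maximum(var, 1e-12))

# Bayesian-optimisation loop: propose the recipe with the highest
# upper confidence bound, "bake" it, and update the surrogate.
X = rng.uniform(0, 1, (3, 2))          # three initial recipes
y = np.array([taste(x) for x in X])
for _ in range(15):
    Xc = rng.uniform(0, 1, (200, 2))   # candidate recipes
    mu, sd = gp_posterior(X, y, Xc)
    x_next = Xc[np.argmax(mu + 2.0 * sd)]
    X = np.vstack([X, x_next])
    y = np.append(y, taste(x_next))

best = X[np.argmax(y)]
print("best recipe (sugar, flour):", best.round(2))
```

The expensive step in the real study, baking and tasting a batch, is exactly the `taste` call, which is why each of the 144 experiments had to count.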
- ‘Matryoshka Representations for Adaptive Deployment’ by Aditya Kusupati, Gantavya Bhatt, Aniket Rege, Matthew Wallingford, Aditya Sinha, Vivek Ramanujan, William Howard-Snyder, Kaifeng Chen, Sham Kakade, Prateek Jain, Ali Farhadi
When training learned representations in ML systems, the statistical demands of each downstream task are unknown in advance. Representations with a rigid, fixed capacity therefore rarely fit a task exactly; they either over- or under-accommodate it. The researchers set out to ask whether a single, more flexible representation could adapt itself to multiple downstream tasks.
The paper presented Matryoshka Representation Learning (MRL), which encodes information at different granularities so that a single embedding can adjust to the computational limits of a downstream task. The flexibility of MRL offers up to a 14-times smaller embedding size for ImageNet-1K classification at the same level of accuracy, up to 14-times faster large-scale retrieval on ImageNet-1K and ImageNet-4K, and up to a 2% improvement in accuracy for few-shot classification.
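The deployment-side idea, one stored vector whose prefixes are themselves usable embeddings, can be sketched as below. The dimensions, item count and toy retrieval task are illustrative assumptions; in MRL the training objective is what makes every prefix a good representation, whereas here the vectors are just random.

```python
import numpy as np

rng = np.random.default_rng(0)

# Five items with one stored 16-dim "Matryoshka" embedding each.
full = rng.normal(size=(5, 16))

def truncate(emb, m):
    """Keep the first m dimensions and re-normalise -- how an MRL
    embedding is shrunk at deployment to meet a compute budget."""
    sub = emb[:, :m]
    return sub / np.linalg.norm(sub, axis=1, keepdims=True)

# The same stored vectors serve budgets of 2, 4, 8 or 16 dimensions.
for m in (2, 4, 8, 16):
    print(m, truncate(full, m).shape)

# Toy retrieval with the cheap 4-dim prefix: item 0 retrieves itself.
db4 = truncate(full, 4)
nearest = int(np.argmax(db4 @ db4[0]))
```

Because the granularities are nested, no extra vectors need to be stored per budget, which is where the claimed savings in embedding size and retrieval speed come from.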
- ‘Scalable Interpretability via Polynomials’ by Abhimanyu Dubey, Filip Radenovic, Dhruv Mahajan
Generalised Additive Models or GAMs, which help linear models learn non-linear relationships, have become the common choice for fully interpretable ML tools. But unlike Deep Neural Networks or DNNs, they do not scale easily to real-world applications. The paper demonstrated a new class of GAMs built on tensor rank decompositions of polynomials. The approach, called Scalable Polynomial Additive Models or SPAM, yields models that are powerful and interpretable while remaining scalable. SPAMs surpassed other benchmarks, making them a likely replacement for DNNs where interpretability matters.
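A minimal sketch of the additive-polynomial idea is below, with toy data and a least-squares fit standing in for SPAM's tensor-decomposed, gradient-trained polynomials; all numbers and the target function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: target is a non-linear additive function of two features.
X = rng.uniform(-1, 1, (200, 2))
y = np.sin(2 * X[:, 0]) + X[:, 1] ** 2

# Degree-2 polynomial basis per feature (x, x^2) plus an intercept,
# fit jointly by least squares -- a tiny stand-in for SPAM's
# higher-order, tensor-decomposed polynomials.
Phi = np.column_stack([X[:, 0], X[:, 0] ** 2,
                       X[:, 1], X[:, 1] ** 2,
                       np.ones(len(X))])
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
mse = float(np.mean((Phi @ w - y) ** 2))

# Interpretability: each feature's contribution is read off directly
# from its own polynomial, e.g. feature 2's shape function below.
contrib_x2 = Phi[:, 2:4] @ w[2:4]
print("train MSE:", mse)
```

The model stays a sum of per-feature terms, so the prediction decomposes into inspectable pieces, which is exactly the property GAMs are valued for.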
- ‘Neural Basis Models for Interpretability’ by Filip Radenovic, Abhimanyu Dubey, Dhruv Mahajan
The second paper also involves work around GAMs and the need for explainability behind model predictions. Many ML models are black-box neural networks. GAMs reduce this opacity by learning a non-linear shape function for each feature individually, with a linear model on top. Even then, these models are usually hard to train, and their huge number of parameters makes them unscalable.
The paper suggests a new family of GAMs that uses a small number of basis functions shared among all features, learned jointly for a given task. This makes the model far more scalable and better able to handle large-scale data with high-dimensional features. The architecture, called the Neural Basis Model or NBM, uses a single neural network to learn all these bases. The research concludes that NBMs achieve state-of-the-art accuracy on tabular and image datasets.
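The shared-basis construction can be illustrated with fixed sinusoids standing in for the single learned network; this is a toy sketch under that substitution, not the paper's architecture, and the data is made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy additive target over three features.
X = rng.uniform(-1, 1, (300, 3))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 3 - np.abs(X[:, 2])

# A SMALL set of basis functions shared by every feature. In NBM a
# single network learns these; here they are fixed sinusoids.
def bases(x, n=4):
    freqs = np.arange(1, n + 1)
    return np.concatenate([np.sin(np.outer(x, freqs)),
                           np.cos(np.outer(x, freqs))], axis=1)

# Each feature's shape function is a linear combination of the SAME
# 8 bases, so the basis computation is shared rather than repeated
# per feature.
Phi = np.concatenate([bases(X[:, j]) for j in range(3)], axis=1)
Phi = np.column_stack([Phi, np.ones(len(X))])   # intercept
coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
mse = float(np.mean((Phi @ coef - y) ** 2))

# The shape function for feature j is bases(x) @ coef[8*j : 8*(j+1)],
# so per-feature contributions stay inspectable, as in any GAM.
print("train MSE:", mse)
```

Sharing the bases is what keeps the parameter count from blowing up with the number of features, which is the scalability claim in the paper.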
- ‘Graph Neural Networks as Gradient Flows’ by Francesco Di Giovanni, James Rowbottom, Benjamin P. Chamberlain, Thomas Markovich and Michael M. Bronstein
Graph Neural Networks or GNNs have become the usual ML tool for a variety of applications, from social networking platforms to particle physics and drug design. GNNs are often accused of poor explainability, which makes it harder to understand why and how certain predictions are made, and the situations in which they might fail. Researchers have tried to work through GNN limitations such as over-smoothing, over-squashing and bottlenecks.
The new paper proposes a Gradient Flow Framework (GRAFF) in which the GNN equations follow the direction of steepest descent of a learnable energy. This lets GNNs be interpreted as multi-particle dynamics in which the learned parameters determine attractive and repulsive pairs in the feature space.
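The gradient-flow view can be illustrated with a quadratic energy of the general form used in this line of work: with a symmetric weight matrix, one message-passing step becomes one gradient-descent step on the energy. The toy graph, features, energy and step size below are all illustrative assumptions, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: 4 nodes, symmetrically normalised adjacency.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
deg = A.sum(1)
A = A / np.sqrt(np.outer(deg, deg))

F = rng.normal(size=(4, 3))            # node features

# Symmetric channel-mixing matrix: symmetry is what makes the
# update below a gradient-descent step on a single energy.
W0 = rng.normal(size=(3, 3))
W = 0.5 * (W0 + W0.T)

def energy(F):
    # E(F) = -1/2 * tr(F^T A F W): the quantity the dynamics descend.
    return -0.5 * np.trace(F.T @ A @ F @ W)

tau = 0.05                             # small step size
energies = [energy(F)]
for _ in range(20):
    # grad E = -A F W, so one gradient step is one propagation step.
    F = F + tau * (A @ F @ W)
    energies.append(energy(F))

# With a small enough step, the energy should be non-increasing.
print(energies[0], "->", energies[-1])
```

The sign pattern of `W`'s eigenvalues is what encodes attraction versus repulsion between connected nodes, which is how the framework speaks to over-smoothing: purely attractive dynamics smooth features out, while repulsive components counteract it.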