Last updated December 29, 2021

Top 10 Research Papers On Federated Learning

Federated learning was first introduced by Google in 2016


Published on August 30, 2021

by Amit Raja Naik

With growing awareness of user privacy across devices and platforms, common centralised learning techniques are often no longer appropriate, because users are less likely to share their data. When the data is available only on the devices themselves, federated learning comes into play.

What is Federated Learning?

Simply put, federated learning is a decentralised form of machine learning. Google first introduced it in 2016 in a paper titled ‘Communication-Efficient Learning of Deep Networks from Decentralized Data,’ which provided the first definition of federated learning, along with another research paper on federated optimisation titled ‘Federated Optimization: Distributed Machine Learning for On-Device Intelligence.’

Then, in 2017, Google, in a blog post, ‘Federated Learning: Collaborative Machine Learning without Centralized Training Data,’ explained in detail the nuances of this technique.
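To make the idea concrete, here is a minimal sketch of federated averaging, the algorithm proposed in the 2016 paper, on a toy one-parameter linear model. The client datasets, learning rate, round counts and function names are illustrative assumptions and are not taken from the paper.

```python
# Toy federated averaging sketch.
# Data, names and hyperparameters are illustrative assumptions.

def local_update(w, data, lr=0.05, epochs=5):
    """Run a few epochs of gradient descent on one client's private data for y ~ w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_round(w_global, clients):
    """One round: each client trains locally, the server averages weighted by dataset size."""
    total = sum(len(d) for d in clients)
    return sum(len(d) / total * local_update(w_global, d) for d in clients)

# Three clients whose raw (x, y) pairs never leave the device; all follow y = 3x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5)],
    [(1.5, 4.5), (3.0, 9.0), (2.5, 7.5)],
]

w = 0.0
for _ in range(30):
    w = federated_round(w, clients)
print(round(w, 3))  # close to 3.0
```

Only the model parameter travels between the clients and the server here; the training pairs stay inside each client's list.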

Since then, a lot has changed. In this article, we will list some of the top research papers on federated learning.

Advances and Open Problems in Federated Learning

Following the workshop on federated learning and analytics held on 17 and 18 June 2019, Google, in collaboration with researchers from top universities, came up with a broad paper surveying the many open challenges in the area of federated learning.

The researchers noted that these problems are inherently interdisciplinary: solving them requires not just machine learning, but also techniques from distributed optimisation, cryptography, security, differential privacy, fairness, compressed sensing, systems, information theory, statistics and more. Many of the most challenging problems sit at the intersection of these areas, and the researchers believe that collaboration is the way to make progress. According to the researchers, one of the goals of this work was to highlight techniques from these fields that can potentially be combined, raising both interesting possibilities and new challenges.

Although federated learning began with a primary emphasis on mobile and edge devices, the researchers believe that interest in applying it to other settings is growing rapidly, including some that might involve a small number of relatively reliable clients, for example, multiple organisations collaborating to train a model. The researchers call these two federated learning environments ‘cross-device’ and ‘cross-silo,’ respectively.

Check out the complete research paper here.

Generative Models for Effective ML on Private, Decentralised Datasets

In this research paper, Google researchers show that generative models trained using federated methods and formal differential privacy guarantees can effectively debug many commonly occurring data issues even when the data is not directly inspected. Further, they explore these methods in applications to text with differentially private federated RNNs, and images using a novel algorithm for differentially private federated GANs.
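Differential privacy in federated training is commonly obtained by clipping each client's update and adding Gaussian noise to the aggregate. The sketch below shows only that mechanism, assuming illustrative values for the clipping norm and noise multiplier. It does not compute a privacy budget and is not the paper's training code; the function names are hypothetical.

```python
import math
import random

def clip(update, clip_norm):
    """Scale an update vector down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]

def dp_average(updates, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Average clipped client updates and add Gaussian noise to each coordinate."""
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch reproducible
    n = len(updates)
    clipped = [clip(u, clip_norm) for u in updates]
    # One client can shift the sum by at most clip_norm, so the noise is
    # calibrated to that bound and divided by n for the average.
    sigma = noise_multiplier * clip_norm / n
    return [
        sum(u[i] for u in clipped) / n + rng.gauss(0.0, sigma)
        for i in range(len(updates[0]))
    ]

updates = [[3.0, 4.0], [0.0, 0.5], [-0.2, 0.1]]
noisy_mean = dp_average(updates)
```

Clipping bounds how much any single client can influence the result, which is what allows the added noise to mask an individual contribution.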

Check out the research paper here.

Federated Learning for Mobile Keyboard Prediction

In this paper, Google researchers demonstrate the feasibility and benefits of training language models on client devices without exporting sensitive user data to servers. The federated learning environment gives users greater control over the use of their data. Furthermore, it simplifies incorporating privacy by default with distributed training and aggregation across a population of client devices.

Check out the research paper here.

FedML: A Research Library and Benchmark for Federated Machine Learning

In this paper, researchers from Tencent and top universities introduced FedML, an open research library and benchmark, to facilitate federated learning algorithm development and fair performance comparison. FedML supports three computing paradigms: on-device training for edge devices, distributed computing, and single-machine simulation.

Plus, it promotes diverse algorithmic research with flexible and generic API design and comprehensive reference baseline implementations (optimiser, models, and datasets). Finally, the researchers believe that their library and benchmarking framework provides an efficient and reproducible means for developing and evaluating federated learning algorithms.

Check out the research paper, code, documents and user community here.

Moshpit SGD: Communication-Efficient Decentralised Training on Heterogeneous Unreliable Devices

Researchers from Yandex, the University of Toronto, the Moscow Institute of Physics and Technology, and the National Research University Higher School of Economics proposed Moshpit All-Reduce, an iterative averaging protocol that converges exponentially to the global average. In this research paper, the researchers demonstrated the efficiency of their protocol for distributed optimisation with strong theoretical guarantees, along with experiments that show impressive results.
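One way to build intuition for why averaging inside small groups can reach the global average: if peers are laid out on a grid and each round forms groups along a different axis, every peer holds the global mean after one round per axis. The sketch below is a simplified, failure-free illustration with an assumed 3 x 3 grid. It is not the authors' protocol, which also has to handle unreliable peers.

```python
def group_average_rounds(values, rows, cols):
    """Average along rows, then along columns, of a rows x cols grid of peers."""
    grid = [values[r * cols:(r + 1) * cols] for r in range(rows)]
    # Round 1: each row forms a group and every member adopts the row mean.
    grid = [[sum(row) / cols] * cols for row in grid]
    # Round 2: each column forms a group and every member adopts the column mean.
    col_means = [sum(grid[r][c] for r in range(rows)) / rows for c in range(cols)]
    return [[col_means[c] for c in range(cols)] for _ in range(rows)]

# Nine peers, each holding one local value; the global mean is 5.0.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
result = group_average_rounds(values, 3, 3)
print(result[0][0], result[2][1])
```

Each peer only ever communicates with the two other members of its current group, yet after two rounds all nine hold the same global mean.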

Check out the complete research paper here.

Group Knowledge Transfer: Federated Learning of Large CNNs at the Edge

Researchers from the University of Southern California reformulated federated learning as a group knowledge transfer training algorithm called FedGKT. It uses an alternating minimisation approach to train small CNNs on edge nodes and periodically transfer their knowledge, via knowledge distillation, to a large server-side CNN. This algorithm consolidates several advantages into a single framework: reduced demand for edge computation, lower communication bandwidth for large CNNs, and asynchronous training, all while maintaining model accuracy comparable to FedAvg.
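Knowledge distillation generally means training one model to match the softened output distribution of another. The sketch below shows a generic distillation loss of that kind. The temperature value and the example logits are assumptions, and this is not FedGKT's full training objective.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max logit for numerical stability."""
    m = max(logits)
    exps = [math.exp((z - m) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits: here the edge model plays the teacher for the server model.
edge_logits = [2.0, 0.5, -1.0]
server_logits = [1.0, 1.0, 0.0]
loss = distillation_loss(server_logits, edge_logits)
```

The loss is zero when both models produce the same distribution and grows as they disagree, so minimising it pulls the student's predictions toward the teacher's.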

The source code is available on FedML. Check out the research paper here.

Central Server Free Federated Learning over Single-sided Trust Social Networks

In this paper, researchers from WeBank, Kwai, the University of Southern California, the University of Michigan, and the University of Rochester proposed a federated learning algorithm that needs no central server, the Online Push-Sum (OPS) method, to handle various challenges in a generic setting. They also provide a rigorous regret analysis, which shows how users can benefit from communicating with trusted users in the federated learning environment.
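The method's name refers to push-sum, a classic way of averaging over a directed graph without a coordinator. The sketch below shows only that averaging primitive on a small hypothetical trust graph, where an edge from i to j means i is willing to send to j. It is not the online learning algorithm from the paper or its regret analysis.

```python
def push_sum(values, out_neighbors, rounds=50):
    """Each node repeatedly splits its (value, weight) pair among itself and its out-neighbours."""
    n = len(values)
    x = list(values)
    w = [1.0] * n
    for _ in range(rounds):
        new_x = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            targets = [i] + out_neighbors[i]  # a node always keeps a share for itself
            share = 1.0 / len(targets)
            for j in targets:
                new_x[j] += x[i] * share
                new_w[j] += w[i] * share
        x, w = new_x, new_w
    # The ratio value / weight is each node's estimate of the global average.
    return [xi / wi for xi, wi in zip(x, w)]

# Hypothetical one-directional trust graph: 0 -> 1, 1 -> 2, 2 -> 0 and 2 -> 1.
out_neighbors = {0: [1], 1: [2], 2: [0, 1]}
estimates = push_sum([1.0, 5.0, 9.0], out_neighbors)
print([round(e, 6) for e in estimates])
```

The weights correct for the fact that links are one-directional, so every node's estimate approaches the true average of 5.0 even though no node ever receives a reply from the peers it sends to.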

Check out the research paper here.

Label Leakage and Protection in Two-party Split Learning

Researchers from ByteDance and Carnegie Mellon demonstrated a ‘norm attack,’ a simple method that uses the norm of the gradients communicated between the parties to reveal the participants’ ground-truth labels.

Further, the researchers discuss several protection techniques to mitigate this issue. They designed a principled approach that directly maximises the worst-case error of label detection. As per the researchers, this method is proved to be more effective in countering norm attacks. In this work, the researchers experimentally demonstrate the competitiveness of their proposed method compared to several other baseline techniques.
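To see why gradient norms can leak labels, consider a toy setting that assumes the label-holding party's model is a single linear layer with a sigmoid cross-entropy loss. The gradient sent back for each example is then (p - y) times the layer weights, so on an imbalanced batch the rare positive examples produce much larger norms. The logits, weights and midpoint threshold below are illustrative assumptions, not the paper's experimental setup.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient_norms(logits, labels, head_weights):
    """L2 norm of the loss gradient w.r.t. each cut-layer embedding, as sent to the other party."""
    w_norm = math.sqrt(sum(w * w for w in head_weights))
    return [abs(sigmoid(z) - y) * w_norm for z, y in zip(logits, labels)]

def norm_attack(norms):
    """Guess 'positive' for examples whose gradient norm exceeds the midpoint of the observed range."""
    threshold = (min(norms) + max(norms)) / 2
    return [1 if g > threshold else 0 for g in norms]

# Imbalanced toy batch: the model is confidently negative on most examples.
logits = [-3.0, -2.5, -2.0, -3.5, -1.5, -2.8]
labels = [0, 0, 1, 0, 0, 0]
norms = gradient_norms(logits, labels, head_weights=[0.4, -0.3, 0.5])
guessed = norm_attack(norms)
print(guessed == labels)
```

The attacker never sees the labels, only the per-example norms, which is why protection methods in this setting focus on perturbing the communicated gradients.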

Check out the full research paper here.

Learning Private Neural Language Modeling with Attentive Aggregation

Researchers from Monash University, the University of Queensland, and the University of Technology Sydney proposed a novel model aggregation method for mobile keyboard suggestion. It combines an attention mechanism, which takes into account the contribution of each client model to the global model, with an optimisation technique applied during server aggregation.

Their proposed attentive aggregation method minimises the weighted distance between the server model and client models by iteratively updating parameters while attending to the distance between the server model and client models. Further, their experiments on two popular language modelling datasets and a social media dataset show that their proposed method outperforms its counterparts in terms of perplexity and communication cost in most comparison settings.
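One plausible reading of this update rule, sketched under simplifying assumptions: each model is treated as a single flat parameter vector, attention weights are a softmax over each client's Euclidean distance to the server model, and the server takes a step toward the clients weighted by that attention. The step size and example values are assumptions, and the paper's exact formulation may differ.

```python
import math

def attentive_aggregate(server, clients, step_size=1.0):
    """One server step toward the clients, each weighted by softmax of its distance to the server."""
    dists = [
        math.sqrt(sum((s - c) ** 2 for s, c in zip(server, client)))
        for client in clients
    ]
    m = max(dists)
    exps = [math.exp(d - m) for d in dists]  # shift by the max for numerical stability
    total = sum(exps)
    attn = [e / total for e in exps]
    return [
        s - step_size * sum(a * (s - client[i]) for a, client in zip(attn, clients))
        for i, s in enumerate(server)
    ]

# Hypothetical two-parameter models from three clients.
server = [0.0, 0.0]
clients = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]]
new_server = attentive_aggregate(server, clients, step_size=0.5)
```

With a step size below 1 the server moves only part of the way toward the attention-weighted client average, which is the iterative flavour described above.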

Check out the research paper here.

Flower: A Friendly Federated Learning Research Framework

Researchers from University College London, the University of Cambridge, and Avignon Université presented Flower, a novel federated learning framework aimed at both engineers and researchers. It is an open-source framework that supports heterogeneous environments, including mobile and edge devices, and scales to many distributed clients.

Flower allows engineers to port existing workloads with little overhead regardless of the ML framework used while enabling researchers to experiment with novel approaches to advance the state-of-the-art. In addition to this, the researchers described the design goals and architecture of Flower. Finally, they used it to evaluate the impacts of scale and heterogeneity on common federated learning methods in experiments with up to a thousand clients.

Check out the research paper here.


Amit Raja Naik

Amit Raja Naik is a seasoned technology journalist who covers everything from data science to machine learning and artificial intelligence for Analytics India Magazine, where he examines the trends, challenges, ideas, and transformations across the industry.