In 2018, Uber paid $148 million to settle investigations into a 2016 data breach that exposed the personal information of more than half a million drivers. In 2019, Google was fined $57 million for a GDPR violation. The rise of on-device machine learning, coupled with growing concerns about data privacy, has nudged developers and researchers towards techniques such as federated learning, a collaborative learning method that operates without exchanging users’ original data.
The stringent GDPR makes data sharing among European organisations challenging. Meanwhile, federated learning systems (FLS) have shown promise, delivering good predictive accuracy while complying with privacy rules. According to Li et al., federated learning systems are game changers in the way deep learning frameworks such as PyTorch and TensorFlow have been.
As illustrated above, the number of FL-related papers has increased rapidly, reaching about 4,400 last year.
Federated learning (FL) enables multiple parties to jointly train a machine learning model without exchanging their local data. It leverages techniques from multiple research areas, such as distributed systems, machine learning, and privacy, and is best applied in situations where the on-device data is more relevant than the data that exists on servers. However, FLS face various challenges around effectiveness, efficiency, and privacy. According to Li et al., FLS are being adopted in various domains, including mobile edge computing, natural language processing, recommender systems, and healthcare.
A typical federated learning protocol, according to Google AI:
- Devices check in with the federated learning servers.
- The servers read a model checkpoint from storage.
- The model is sent to the selected devices.
- Each device trains the model locally and sends an update to the server.
- The server aggregates these updates into a global model and writes it into storage.
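The five steps above can be sketched as a single training round. This is a minimal illustration, not Google's implementation: the in-memory `storage` dictionary, the `local_update` and `run_round` helpers, and the tiny linear model are all assumptions made for the example.

```python
import random


def local_update(weights, data, lr=0.1):
    """One pass of SGD on a device for the toy linear model y = w0 + w1 * x."""
    w0, w1 = weights
    for x, y in data:
        err = (w0 + w1 * x) - y
        w0 -= lr * err
        w1 -= lr * err * x
    return [w0, w1]


def run_round(storage, devices, num_selected, rng):
    """Run one round; `devices` maps a device id to its private (x, y) pairs."""
    # Steps 1-2: devices have checked in; the server reads the checkpoint.
    global_model = list(storage["global_model"])
    # Step 3: the model is sent to a randomly selected subset of devices.
    selected = rng.sample(sorted(devices), num_selected)
    # Step 4: each selected device trains locally and reports its weights
    # together with the number of examples it trained on.
    updates = [
        (local_update(global_model, devices[d]), len(devices[d]))
        for d in selected
    ]
    # Step 5: the server averages the weights, weighted by example count,
    # and writes the new global model back to storage.
    total = sum(n for _, n in updates)
    new_model = [
        sum(w[i] * n for w, n in updates) / total
        for i in range(len(global_model))
    ]
    storage["global_model"] = new_model
    return new_model
```

Calling `run_round` repeatedly with a `storage` dictionary such as `{"global_model": [0.0, 0.0]}` and a `random.Random` instance moves the stored model towards a fit of the devices' combined data, while the raw `(x, y)` pairs never leave the `devices` mapping.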
Federated learning provides a decentralised computation strategy to train a neural model. Modern mobile devices churn out swathes of personal data, which can be used for training. Instead of uploading data to servers for centralised training, phones process their local data and share only model updates with the server. The server aggregates the updates from a large population of phones and combines them to create an improved global model. This distributed approach has been shown to work with unbalanced datasets and with data that are not independent and identically distributed (non-IID) across clients.
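The aggregation step is commonly written as a weighted average, as in the federated averaging algorithm. The symbols here are notation assumed for illustration: $K$ participating phones, $n_k$ training examples on phone $k$, and $w_{t+1}^{k}$ the weights phone $k$ reports after local training in round $t$.

```latex
w_{t+1} = \sum_{k=1}^{K} \frac{n_k}{n} \, w_{t+1}^{k},
\qquad n = \sum_{k=1}^{K} n_k
```

For example, with two phones holding 10 and 30 examples and reporting scalar weights 1.0 and 2.0, the global weight becomes $0.25 \times 1.0 + 0.75 \times 2.0 = 1.75$. Weighting by $n_k$ is how the server accounts for unbalanced datasets.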
On-device machine learning comes with a privacy challenge. Data recorded by cameras and microphones can put individuals at great risk in the event of a hack. For example, apps might expose a search mechanism for information retrieval or in-app navigation.
Researchers from Kyoto University implemented federated averaging in practical mobile edge computing (MEC) frameworks, using an operator of the MEC framework to manage the resources of heterogeneous clients. Both distributed deep reinforcement learning (DRL) and federated learning have also been adopted in MEC systems, where their combined use has the potential to optimise mobile edge computing, caching, and communication. FL has also been performed on resource-constrained MEC systems, where the researchers addressed the problem of how to efficiently utilise the limited computation and communication resources at the edge. Using federated averaging, they implemented many machine learning algorithms, including linear regression, SVM, and CNN. In addition, a technique called FedGKT was proposed, in which each device trains only a small part of a whole ResNet to reduce the computation overhead.
For natural language processing
Companies like Google use federated averaging in their smartphone keyboards for text prediction. FL has been applied to mobile keyboard next-word prediction, where the federated averaging method was used to train a variant of LSTM called Coupled Input and Forget Gate (CIFG). According to the researchers, the FL method can achieve better prediction recall than server-based training on logged data.
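As a brief illustration of what "coupled" means here, the usual CIFG formulation ties the forget gate to the input gate instead of learning it separately. The notation below ($W$, $U$, $b$ for weights and biases, $\sigma$ for the sigmoid, $\odot$ for element-wise product) is assumed for this sketch and is not taken from the keyboard work itself.

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= 1 - i_t \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c)
\end{aligned}
```

Because $f_t$ is derived from $i_t$, the cell needs no separate forget-gate weights, which means fewer parameters to train on the device and to send back to the server.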
Companies like Apple also use FL techniques and variants such as federated evaluation (FE) and federated tuning (FT) in their products to combine on-device computing and recommendations with user privacy. For Apple, applications around FE and FT occupy a large percentage of system usage. FE runs on user interaction history, which significantly reduces turnaround times compared to live A/B experimentation. It can help quickly identify the most promising ML system or model candidates before end users are exposed to them via live A/B experimentation.
For recommender systems
The federated collaborative filtering method is popular for building recommendation systems. Based on a stochastic gradient approach, the item-factor matrix is trained on a global server by aggregating the local updates. The method is said to have no accuracy loss compared to the centralised method. In another method, a federated matrix factorisation framework is used, in which federated SGD learns the matrices. The federated recommender system (FedRecSys) has implemented popular recommendation algorithms with secure multi-party computation (SMC) protocols. The algorithms include matrix factorisation, singular value decomposition (SVD), factorisation machine, and deep learning.
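To make the division of labour concrete, here is a minimal sketch of federated matrix factorisation under simplifying assumptions: full-batch gradient steps, no encryption, and hypothetical names (`Client`, `server_round`, `mean_squared_error`) that do not come from any of the systems mentioned above. Each client keeps its ratings and its user-factor vector private and shares only gradients for the items it has rated.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class Client:
    """One user's device: ratings and the user-factor vector stay local."""

    def __init__(self, ratings, k, rng):
        self.ratings = ratings  # {item_id: rating}
        self.user_vec = [rng.uniform(-0.1, 0.1) for _ in range(k)]

    def local_step(self, item_factors, lr, reg):
        """Update the private user vector; return gradients for rated items."""
        item_grads = {}
        user_grad = [reg * u for u in self.user_vec]
        for item, rating in self.ratings.items():
            q = item_factors[item]
            err = dot(self.user_vec, q) - rating
            item_grads[item] = [err * u for u in self.user_vec]
            user_grad = [g + err * qj for g, qj in zip(user_grad, q)]
        self.user_vec = [u - lr * g for u, g in zip(self.user_vec, user_grad)]
        return item_grads


def server_round(item_factors, clients, lr=0.02, reg=0.01):
    """Sum the clients' item gradients and take one gradient step per item."""
    total = {item: [0.0] * len(q) for item, q in item_factors.items()}
    for client in clients:
        grads = client.local_step(item_factors, lr, reg)
        for item, grad in grads.items():
            total[item] = [t + g for t, g in zip(total[item], grad)]
    for item, q in item_factors.items():
        item_factors[item] = [
            qj - lr * (tj + reg * qj) for qj, tj in zip(q, total[item])
        ]


def mean_squared_error(item_factors, clients):
    """Evaluation helper for this demo only; it reads every client's ratings."""
    errs = [
        (dot(c.user_vec, item_factors[i]) - r) ** 2
        for c in clients
        for i, r in c.ratings.items()
    ]
    return sum(errs) / len(errs)
```

The server only ever sees `item_grads`. As the article notes later, exchanged gradients can still leak information, which is why systems such as FedRecSys combine this pattern with SMC protocols.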
Though the local data are not exposed in FL, the exchanged model parameters may still leak sensitive information. Attacks such as model inversion and membership inference can potentially infer the raw data by accessing the ML model. This is a major concern, especially in domains such as healthcare.
Modern health systems require cooperation among research institutes, hospitals, and federal agencies. Moreover, in a pandemic-like situation, collaborative research among countries is vital, but not at the expense of privacy. FL makes this cooperation possible because it helps preserve privacy. In a healthcare federation, there is probably no central server, so another challenge is the design of a decentralised FLS that is also robust against malicious participants. The privacy concern can be mitigated with additional mechanisms such as secure multi-party computation and differential privacy. According to a survey on FLS by Li et al., explainability of FL models remains an open problem.
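One idea behind secure aggregation, a form of secure multi-party computation, can be shown with a toy example: each pair of clients agrees on a random mask that one adds and the other subtracts, so the masks cancel in the sum and the aggregator learns only the total. This is a simplified sketch under stated assumptions, not a production protocol: updates are assumed to be already quantised to non-negative integers, a single seeded generator stands in for pairwise shared secrets, client dropouts are ignored, and `MODULUS`, `pairwise_masks`, `mask_update`, and `aggregate` are hypothetical names.

```python
import random

MODULUS = 2**32  # assumed modulus; all arithmetic wraps around it


def pairwise_masks(client_ids, length, seed=0):
    """For each pair (a, b), a adds a random mask and b subtracts the same one."""
    masks = {cid: [0] * length for cid in client_ids}
    rng = random.Random(seed)  # stand-in for secrets shared between pairs
    for i, a in enumerate(client_ids):
        for b in client_ids[i + 1:]:
            shared = [rng.randrange(MODULUS) for _ in range(length)]
            masks[a] = [(m + s) % MODULUS for m, s in zip(masks[a], shared)]
            masks[b] = [(m - s) % MODULUS for m, s in zip(masks[b], shared)]
    return masks


def mask_update(update, mask):
    """What a client sends: its integer update hidden behind its mask."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]


def aggregate(masked_updates):
    """What the aggregator computes: the masks cancel, leaving the true sum."""
    length = len(masked_updates[0])
    return [sum(u[i] for u in masked_updates) % MODULUS for i in range(length)]
```

An individual masked update looks random to the aggregator, yet `aggregate` over all clients returns the element-wise sum of the original updates, provided that sum stays below `MODULUS`. Differential privacy is complementary: it adds calibrated noise so that even the aggregate reveals little about any single participant.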