“Artificial intelligence will have a more profound impact on humanity than fire, electricity and the internet,” said Sundar Pichai, Google’s CEO, on a BBC Radio 4 podcast with Amol Rajan. Pichai said AI would fundamentally transform the way we live through its applications in healthcare, education and manufacturing. Despite AI’s evolution over the past few years, the technology is still believed to be in its early stages, with heavy research under way to find more efficient and accessible implementations that need less computational power and training. Google AI is one of the leading research groups in this space.
In a recent blog post, Jeff Dean, Senior Fellow and SVP of Google Research, laid out the research themes the company is focusing on. Dean made the case for continued research and highlighted the key areas in which Google is advancing its work. Analytics India Magazine has curated a list of the broader themes and study topics Google aims to explore.
Creating general-purpose, large scale multi-modal ML through ‘Pathways’
One of the biggest trends for 2022, at Google and beyond, is training larger and more capable ML models, especially for natural language processing (NLP). The trend is driven by growing dataset and model sizes, which yield better accuracy on NLP benchmarks. Research on Transformer models is also accelerating, including work that combines Transformers with convolutional operations for better results on vision and speech recognition tasks.
According to Google AI, large-scale multi-modal models are also gaining momentum. They are some of the most advanced models to date, given their ability to accept different input modalities and produce different output modalities. “This is an exciting direction because, like the real world, some things are easier to learn in data that is multi-modal,” noted Jeff Dean. Examples include pairing images and text for multi-lingual retrieval tasks, jointly training models on visual and textual data to increase accuracy on classification tasks, and co-training on image, video and audio tasks to improve generalisation across all modalities.
Google is also exploring natural language as an input for applications such as image manipulation and instructing robots on how to interact with the world, which foreshadows potential changes in how user interfaces are developed. These models will be capable of handling speech, sounds, images, video and language, and possibly structured data, knowledge graphs and time-series data.
Additionally, these models will increasingly be trained with self-supervised learning, reducing the effort needed to build a separate model for each task. Together, these trends can enable a general-purpose model that handles multiple modalities of data and solves millions of tasks. Google is pursuing this next-generation architecture through an umbrella research effort called Pathways.
Pathways to generalise millions of tasks
Improving ML accelerator performance
Google aims to keep improving each new generation of ML accelerators, delivering faster per-chip performance and larger overall system scale.
Driving better model architectures through humans and machines
Google foresees continued improvement in model architectures through both human creativity and machine-driven search. To reduce the compute and energy that models need, the company combines human design with ML techniques such as neural architecture search (NAS), which automatically discovers more efficient ML architectures.
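To make the idea of architecture search concrete, here is a minimal toy sketch in Python. It uses plain random search, one of the simplest NAS baselines, and is not the method Google uses. The search space, the `proxy_accuracy` stand-in and the cost penalty are all hypothetical; a real system would train and validate each candidate instead of calling a formula.

```python
import random

# Hypothetical search space: each architecture is one choice per dimension.
SEARCH_SPACE = {"layers": [2, 4, 8], "width": [64, 128, 256], "kernel": [3, 5]}


def estimated_cost(arch):
    """Crude, made-up proxy for compute: layers * width * kernel^2."""
    return arch["layers"] * arch["width"] * arch["kernel"] ** 2


def proxy_accuracy(arch):
    """Hypothetical stand-in for 'train the model and measure accuracy'.

    Bigger models score higher, with diminishing returns.
    """
    capacity = arch["layers"] * arch["width"]
    return capacity / (capacity + 512)


def score(arch, cost_weight=1e-5):
    """Reward accuracy, penalise compute, so efficient designs win."""
    return proxy_accuracy(arch) - cost_weight * estimated_cost(arch)


def random_search(trials=20, seed=0):
    """Sample architectures at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        arch = {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}
        if best is None or score(arch) > score(best):
            best = arch
    return best


if __name__ == "__main__":
    best = random_search()
    print(best, round(score(best), 4))
```

The design point the sketch illustrates is the objective: because the score trades accuracy against an estimated compute cost, the search favours architectures that are efficient rather than merely large.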
Enabling better personal use-cases while maintaining privacy
Google has leveraged ML innovations and custom silicon to allow mobile devices to sense their surroundings effectively, as with the Google Tensor processor on the Pixel 6. Google aims to advance ML further in this context, improving ease of use while increasing the on-device computational power available for personal features such as photography, video recording, communication, live translation, live captions and more. Google AI has also been combining ML with traditional codec approaches in the Lyra speech codec and the SoundStream audio codec for better fidelity of communication.
The company aims to do so while strengthening privacy safeguards. Android’s Private Compute Core is an open-source, secure environment isolated from the rest of the operating system, so that the data it processes is not shared with any other application on the phone. Features communicate with Private Compute Core over a small set of open-source APIs that strip out private information. Google’s plan for 2022 is to make these interactions more secure while enabling more on-device computation for personal use. The company also aims to broaden its technology stack to support neural computing and to provide access to interactive, intelligent interfaces that can act as a social entity. The key to enabling this, according to the company, is a federated unsupervised learning approach.
Broader applications of computer vision
Google aims to leverage computer vision to create tools that can address global challenges at a large scale. One example is keeping an accurate record of building footprints, an essential layer for many applications today. This information feeds population estimates, humanitarian response, and environmental and urban planning, yet it is hard to obtain in developing or under-developed nations. With computer vision, it can now be derived from satellite imagery. Google has taken a step in this direction with its Open Buildings dataset, which locates more than 500 million buildings in Africa, and the company aims to use it to support humanitarian aid after natural disasters.
Automate design space exploration for better applications
Google is exploring letting an ML algorithm automatically explore and evaluate a problem’s design space for possible solutions. The company has already made a start on this research with a Transformer-based variational autoencoder that creates document layouts, an algorithm for computer architecture decisions, and another that focuses on game playability. Beyond these use-cases, the technology has been used for material discovery in chemistry. The company is focused on accelerating the technology’s use in scientific research for better applications.
Deploy assistive ML for healthcare use-cases
Google foresees the healthcare sector deploying assistive ML systems to improve breast cancer screening, detect lung cancer, accelerate radiotherapy treatments for cancer, mark abnormal X-rays, stage prostate cancer biopsies and even assist in colonoscopies to help in quality assurance, ensure all polyps are identified and detect elusive polyps.
Leverage ML to help people in daily health management
An emerging trend in the ML world is using machines to support people’s daily health needs. These include health metrics such as heart rate assessment and sleep wellness, speech recognition for people with speech impairments, support for people with vision impairments and more. The company sees these as just the start of new use-cases and plans to conduct further research.
Mitigate climate change: EV-friendly satellite maps, fusion as an energy source, natural disasters and sustainability
Google believes in the power of accurate data to help mitigate climate challenges. One example is eco-friendly routing in Google Maps, which is estimated to save 1 million tons of CO2 emissions per year, partly by being EV-friendly. The company is also furthering research on fusion as a renewable energy source. Additionally, Google is working on addressing wildfires and floods, which are becoming more common, with initiatives such as a satellite-powered wildfire boundary map that outlines the affected area. The company is currently launching this option in Maps, along with an optimisation algorithm for fire evacuation routes. Lastly, as part of its sustainability initiatives, Google is working to run its data centres on carbon-free energy by 2030, alongside reducing the footprint of ML training through more efficient model architectures and ML accelerators.
Broadening the definition of Responsible AI beyond Western contexts to sociotechnical ML systems
In one of its major steps on AI ethics, Google is working to think beyond Western contexts when addressing the ethical needs of AI. Noting that the assumptions behind conventional algorithmic fairness frameworks fail in non-Western contexts, Dean stressed that Google is conducting surveys across continents to understand AI preferences and address the gap. The company is also working on enabling ML applications for smallholder farmers in the Global South and on involving community stakeholders in the various stages of the ML pipeline. Google is furthering this area of research through a community-based method, listening to citizens and their needs for sociotechnical ML systems.
Addressing privacy concerns in large ML models
Because privacy and security concerns grow with the size of ML models, Google is also researching ways to protect private information. It is doing so with techniques such as federated learning, private clustering, private personalisation, private matrix completion, private weighted sampling, private quantiles, robust private learning of halfspaces and, more generally, sample-efficient private PAC learning.
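Of the techniques listed, federated learning is the easiest to illustrate. The sketch below is a minimal, assumed example of federated averaging on a toy linear model, not Google's implementation. Each simulated client trains on its own data and sends back only updated weights, which a server averages, so raw data never leaves the client. The function names, the synthetic data (`y = 3x + 1` plus noise) and the hyperparameters are all hypothetical.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one client's private data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by local dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def make_client_data(rng, n):
    """Synthetic private data for one client: y = 3x + 1 plus small noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3 * x + 1 + rng.gauss(0, 0.05)) for x in xs]


def train(rounds=50, seed=0):
    rng = random.Random(seed)
    clients = [make_client_data(rng, n) for n in (20, 40, 60)]
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        # Only weights travel between the clients and the server.
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights


if __name__ == "__main__":
    print(train())
```

Note that plain federated averaging keeps data on the device but does not by itself give formal privacy guarantees; production systems typically combine it with further protections such as secure aggregation or differential privacy.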