Last updated September 26, 2022

Google AI Launches an Open Source Library to Store & Manipulate Large Multi-Dimensional Arrays

The library aims to address key engineering challenges in scientific computing through better management and processing of large datasets.

Published on September 26, 2022
by Bhuvana Kamath

In a blog post published last week, Google AI introduced TensorStore, an open-source C++ and Python library for storing and manipulating n-dimensional data.

Many contemporary applications of computer science and machine learning (ML) manipulate multidimensional datasets that span a single, expansive coordinate system. One example is the use of atmospheric measurements over a geographical grid to forecast the weather.

Another could be making medical imaging predictions using multi-channel image intensity values from a 2D or 3D scan.

A single dataset in such settings can require petabytes of storage, and working with datasets of this size is challenging because users may read and write data at varying scales and at unpredictable intervals.

Researchers at Google AI say that TensorStore has already been used to solve key engineering challenges in scientific computing, such as managing and processing large neuroscience datasets, including petascale 3D electron microscopy data and “4D” videos of neuronal activity.

Additionally, the library was used in the creation of PaLM, a large-scale machine learning model, to address the problem of managing model parameters (checkpoints) during distributed training.

The library natively supports storage systems such as Google Cloud Storage, HTTP servers, and local and network filesystems, among others, and offers a unified API for reading and writing array formats such as zarr and N5. It provides read/writeback caching and transactions with strong atomicity, consistency, isolation, and durability (ACID) guarantees. It also supports safe, efficient access from multiple processes and machines via optimistic concurrency.
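To make the idea of optimistic concurrency concrete, here is a minimal, illustrative sketch in plain Python. It is not TensorStore's implementation or API: `VersionedStore`, `write_if_unchanged`, `atomic_update`, and `run_demo` are hypothetical names invented for this example. The sketch only shows the general pattern, in which a writer reads a value together with a generation number, computes an update without holding a lock, and commits only if the generation has not changed in the meantime.

```python
import threading


class VersionedStore:
    """Toy in-memory key-value store (hypothetical, not TensorStore's API).

    Every key carries a generation number that increases on each write.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (generation, value)

    def read(self, key):
        """Return (generation, value); a missing key is (0, None)."""
        with self._lock:
            return self._data.get(key, (0, None))

    def write_if_unchanged(self, key, expected_generation, value):
        """Commit only if nobody else has written since our read."""
        with self._lock:
            current_generation, _ = self._data.get(key, (0, None))
            if current_generation != expected_generation:
                return False  # conflict: the caller must re-read and retry
            self._data[key] = (current_generation + 1, value)
            return True


def atomic_update(store, key, update_fn):
    """Optimistic read-modify-write loop: retry until the commit succeeds."""
    while True:
        generation, value = store.read(key)
        if store.write_if_unchanged(key, generation, update_fn(value)):
            return


def run_demo(num_threads=8, increments_per_thread=200):
    """Increment a shared counter from several threads without lost updates."""
    store = VersionedStore()

    def worker():
        for _ in range(increments_per_thread):
            atomic_update(store, "counter", lambda v: (v or 0) + 1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return store.read("counter")
```

In this sketch no writer blocks another while computing its update. A conflicting write is detected at commit time and simply retried, which is why the approach suits workloads where conflicts are rare.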

TensorStore also offers an asynchronous API that enables high-throughput access even to high-latency remote storage, along with a simple Python API for loading and manipulating large array data. In the example given on the Google AI Blog, a TensorStore object is created to represent a 56-trillion-voxel 3D image of a fly brain, and a small 100×100 patch of the data is then accessed as a NumPy array.

The blog claims, “No actual data is accessed or stored in memory until the specific 100×100 slice is requested; hence arbitrarily large underlying datasets can be loaded and manipulated without having to store the entire dataset in memory, using indexing and manipulation syntax largely identical to standard NumPy operations.”
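The lazy-access behaviour described in the quote can be illustrated with a small, self-contained sketch. This is plain Python written for this article, not TensorStore code: the `LazyChunkedArray` class, the `synthetic_chunk` loader, and the 64×64 chunk size are all assumptions made for illustration. The array declares a very large shape but loads a chunk only when a requested slice actually touches it.

```python
class LazyChunkedArray:
    """Toy 2D array that loads data chunk by chunk, only on demand.

    Hypothetical illustration; not TensorStore's API or implementation.
    """

    def __init__(self, shape, chunk_shape, load_chunk):
        self.shape = shape
        self.chunk_shape = chunk_shape
        self._load_chunk = load_chunk  # callable: (chunk_row, chunk_col) -> rows
        self.chunks_loaded = 0         # counts how much data was really fetched

    def __getitem__(self, index):
        # Only a pair of slices is supported, e.g. array[10:20, 30:40].
        row_slice, col_slice = index
        rows = range(*row_slice.indices(self.shape[0]))
        cols = range(*col_slice.indices(self.shape[1]))
        chunk_height, chunk_width = self.chunk_shape
        cache = {}  # chunks fetched for this request only
        result = []
        for r in rows:
            out_row = []
            for c in cols:
                key = (r // chunk_height, c // chunk_width)
                if key not in cache:
                    cache[key] = self._load_chunk(*key)
                    self.chunks_loaded += 1
                out_row.append(cache[key][r % chunk_height][c % chunk_width])
            result.append(out_row)
        return result


def synthetic_chunk(chunk_row, chunk_col, chunk_shape=(64, 64)):
    """Stand-in for a storage read: cell (row, col) holds row * 1_000_000 + col."""
    chunk_height, chunk_width = chunk_shape
    return [
        [
            (chunk_row * chunk_height + r) * 1_000_000 + (chunk_col * chunk_width + c)
            for c in range(chunk_width)
        ]
        for r in range(chunk_height)
    ]


# Declaring a 10^12-element array allocates nothing and loads nothing.
array = LazyChunkedArray((1_000_000, 1_000_000), (64, 64), synthetic_chunk)
chunks_before = array.chunks_loaded

# Requesting a 100x100 patch touches only the chunks that overlap it.
patch = array[15000:15100, 15000:15100]
```

Here the 100×100 patch overlaps two chunk rows and two chunk columns, so only four 64×64 chunks are ever materialised, regardless of the declared size of the full array.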

To get started, the TensorStore package can be installed from PyPI with a single pip command. The tutorials and API documentation provide further usage details.

Bhuvana Kamath

I am fascinated by technology and AI’s implementation in today’s dynamic world. Being a technophile, I am keen on exploring the ever-evolving trends around applied science and innovation.
