MITB Banner

This Theoretical Breakthrough At MIT Could Significantly Increase Data Storage

Linear probing hash tables from CSAIL at the Massachusetts Institute of Technology (MIT) could lead to more efficient computer data storage and retrieval.

Share

A team of MIT researchers has made a breakthrough in the storage and retrieval of data in computers. New research from MIT’s CSAIL on linear-probing hash tables may result in more efficient computer data storage and retrieval. In addition, a theoretical discovery could increase data storage. Data structures, known as “linear-probing hash tables”, are a standard strategy of organising and storing data in computers using data structures. Linear-probing hash tables allow data to be stored at any point along with an array.

For instance, William Kuszmaul, a computer science PhD student at MIT, proposes that the database contains 10,000 social security numbers. “Obtain the social security number x and compute its hash function, h (x), which yields a random number between 1 and 10,000. The following step is to take that random number, h (x). Then, to navigate to that position in the array and insert the social security number x.”

Handling Deletion

The deletion of personal information like social security numbers follows a particular set of rules. It can be difficult to find something later if the hash table is left empty after deleting information. When you’re looking for something, it’s easy to have the impression that there are no other places where you can get it. To overcome this issue, Kuszmaul adds, “one can go to the location of the piece that was removed and place a small marker called a ‘tombstone,’ which shows there was once an element here, but it has since been removed.”

Linear Probing Features

However, most people who utilise linear probing hash tables anticipated that when they were too crowded. Linear probing would conduct lengthy stretches of occupied locations in a “cluster” fashion. As a result, the time required to discover an empty seat increases considerably — to the point of quadratic growth — as long as the expectation is false. As a result, individuals have been educated to work with hashtables on a limited capacity basis. They are a strategy that can be costly in terms of the quantity of hardware that a business must purchase and maintain.

However, the study of Kuszmaul and his colleagues, Michael Bender of Stony Brook University, and Bradley Kuszmaul of Google has utterly disproved this long-held notion that has been used to counteract high-load issues. Instead, the linear probing hash table provides great storage capacity without losing performance. Additionally, the team developed a novel approach named “Graveyard Hash.” This hashtable is accomplished by intentionally increasing the number of tombstones in the array until it fills approximately half of the available space. These tombstones provide additional areas for future inscriptions. Kuszmaul says this method is taken in the opposite direction of how people are typically educated and “may result in the best performance in a linear probing hash table.” 

Conclusion

Companies demand new technologies for more effective data storage and retrieval in computers in today’s era of information overload. According to researchers at MIT, linear probing can reintroduce older technology such as linear probing hash tables since they are more efficient. The researchers compared modern and historical storage systems. Along with the traditional method, the researchers developed a new technique called cemetery hashing, which entails artificially increasing the number of tombstones in an array until they occupy approximately half of the available locations. As a result, linear probing will improve data storage, retrieval speed and efficiency from computers.
For more information, refer to the article.

Share
Picture of Dr. Nivash Jeevanandam

Dr. Nivash Jeevanandam

Nivash holds a doctorate in information technology and has been a research associate at a university and a development engineer in the IT industry. Data science and machine learning excite him.
Related Posts

CORPORATE TRAINING PROGRAMS ON GENERATIVE AI

Generative AI Skilling for Enterprises

Our customized corporate training program on Generative AI provides a unique opportunity to empower, retain, and advance your talent.

Upcoming Large format Conference

May 30 and 31, 2024 | 📍 Bangalore, India

Download the easiest way to
stay informed

Subscribe to The Belamy: Our Weekly Newsletter

Biggest AI stories, delivered to your inbox every week.

AI Courses & Careers

Become a Certified Generative AI Engineer

AI Forum for India

Our Discord Community for AI Ecosystem, In collaboration with NVIDIA. 

Flagship Events

Rising 2024 | DE&I in Tech Summit

April 4 and 5, 2024 | 📍 Hilton Convention Center, Manyata Tech Park, Bangalore

MachineCon GCC Summit 2024

June 28 2024 | 📍Bangalore, India

MachineCon USA 2024

26 July 2024 | 583 Park Avenue, New York

Cypher India 2024

September 25-27, 2024 | 📍Bangalore, India

Cypher USA 2024

Nov 21-22 2024 | 📍Santa Clara Convention Center, California, USA

Data Engineering Summit 2024

May 30 and 31, 2024 | 📍 Bangalore, India

Subscribe to Our Newsletter

The Belamy, our weekly Newsletter is a rage. Just enter your email below.