A team of MIT researchers has made a breakthrough in the storage and retrieval of data in computers. New research from MIT’s CSAIL on linear-probing hash tables may result in more efficient computer data storage and retrieval. In addition, a theoretical discovery could increase data storage. Data structures, known as “linear-probing hash tables”, are a standard strategy of organising and storing data in computers using data structures. Linear-probing hash tables allow data to be stored at any point along with an array.
For instance, William Kuszmaul, a computer science PhD student at MIT, proposes that the database contains 10,000 social security numbers. “Obtain the social security number x and compute its hash function, h (x), which yields a random number between 1 and 10,000. The following step is to take that random number, h (x). Then, to navigate to that position in the array and insert the social security number x.”
Handling Deletion
The deletion of personal information like social security numbers follows a particular set of rules. It can be difficult to find something later if the hash table is left empty after deleting information. When you’re looking for something, it’s easy to have the impression that there are no other places where you can get it. To overcome this issue, Kuszmaul adds, “one can go to the location of the piece that was removed and place a small marker called a ‘tombstone,’ which shows there was once an element here, but it has since been removed.”
Subscribe to our Newsletter
Join our editors every weekday evening as they steer you through the most significant news of the day, introduce you to fresh perspectives, and provide unexpected moments of joy
Your newsletter subscriptions are subject to AIM Privacy Policy and Terms and Conditions.
Linear Probing Features
However, most people who utilise linear probing hash tables anticipated that when they were too crowded. Linear probing would conduct lengthy stretches of occupied locations in a “cluster” fashion. As a result, the time required to discover an empty seat increases considerably — to the point of quadratic growth — as long as the expectation is false. As a result, individuals have been educated to work with hashtables on a limited capacity basis. They are a strategy that can be costly in terms of the quantity of hardware that a business must purchase and maintain.
However, the study of Kuszmaul and his colleagues, Michael Bender of Stony Brook University, and Bradley Kuszmaul of Google has utterly disproved this long-held notion that has been used to counteract high-load issues. Instead, the linear probing hash table provides great storage capacity without losing performance. Additionally, the team developed a novel approach named “Graveyard Hash.” This hashtable is accomplished by intentionally increasing the number of tombstones in the array until it fills approximately half of the available space. These tombstones provide additional areas for future inscriptions. Kuszmaul says this method is taken in the opposite direction of how people are typically educated and “may result in the best performance in a linear probing hash table.”
Conclusion
Companies demand new technologies for more effective data storage and retrieval in computers in today’s era of information overload. According to researchers at MIT, linear probing can reintroduce older technology such as linear probing hash tables since they are more efficient. The researchers compared modern and historical storage systems. Along with the traditional method, the researchers developed a new technique called cemetery hashing, which entails artificially increasing the number of tombstones in an array until they occupy approximately half of the available locations. As a result, linear probing will improve data storage, retrieval speed and efficiency from computers.
For more information, refer to the article.