Download our Mobile App
In a traditional distributed database setting, sharding is a way to scale a larger database to be more accessible to querying. In the context of blockchain, sharding becomes a much more powerful, and more complex, tool to use.
Sharding is a process wherein a large database is horizontally partitioned into multiple, smaller databases that do not share information. This, in turn, makes it easier to query and scale if there is a need for more data to be added.
Why Is Sharding Useful?
In a larger database that functions on a query language, the time required to query the database directly depends on the size of the database. This leads to a scalability issue, wherein it becomes more and more difficult to constantly query an ever-growing database.
While data can be split over multiple databases and sorted according to specific factors, the size of the discrete databases will also begin to grow. Moreover, the infrastructural requirements for maintaining a set of large databases are huge.
A single central database will not only require a large amount of computing power, but equal costs in order to ensure the redundancy of data contained in the system. These factors come together to create a scaling problem in a database setting.
Sharding aims to correct this by splitting up not only the data, but also the infrastructural costs. As the size of the database decreases, less processing power and redundancy mechanisms are required in order for it to run efficiently.
Sharded databases are also easier to query, owing to their smaller size. They can also be distributed across less expensive hosting solutions and can scale to a limitless scale as long as proper sharding rules are implemented.
Stay ConnectedGet the latest updates and relevant offers by sharing your email.
Why Do Blockchains Need Sharding
In traditional distributed databases, it is easier to implement sharding by setting simple rules to sort the data. Since a central party is managing all the shards, there accurate information regarding the position of data in the shards.
However, in the blockchain, there is no central party that monitors the presence of data on the chain. This causes a lot of problems in terms of shards.
For example, Ethereum, one of the most widely-used blockchains for distributed applications and tokenization, is suffering from scalability issues. This is due to the transaction throughput being capped at 15-20 transactions per second, which is not enough to keep the blockchain functioning smoothly.
This is due to its proof-of-work consensus algorithm, which decides how the transactions are ordered to avoid network failure. The algorithm requires that every node in the network is required to have a copy of the entire blockchain, and synchronize transactions as well.
This system cannot be scaled, as the transaction throughput does not scale with the number of nodes on the network. This is the reason the Ethereum Foundation is looking into sharding to scale the chain.
Blockchain Sharding Implementation And Its Pitfalls
Just as in a traditional database, nodes on the blockchain are grouped into subsets and sharded depending on sorting rules. All of the shards are still able to communicate with the main blockchain, allowing the decentralization and common record to be maintained.
This would, theoretically, allow for nodes to scale exponentially, as each shard would process transactions in parallel, as opposed to processing them in synchronization. According to Ethereum’s creator, Vitalik Buterin, said at a conference last year
“Imagine that Ethereum has been split into thousands of islands. Each island can do its own thing. Each of the islands has its own unique features and everyone belonging on that island i.e., the accounts, can interact with each other AND they can freely indulge in all its features. If they want to contact other islands, they will have to use some sort of protocol.”
This is one of the biggest downfalls of implementing sharding in a blockchain, as individual shards cannot communicate with each other. Moreover, the wide geographical spread and general distribution of the blockchain nodes, there would be no logic to split them, with a random approach being currently proposed.
Randomly sharding the nodes would require communication to be implemented in a seamless way between the nodes, which is where the Ethereum Foundation ran into a pitfall.
There are also alternative approaches to sharding, such as what is seen with the network known as Zilliqa. Zilliqa implements sharding in a novel way that ensures that there is no problem in sharing information across shards.
Instead of sharding the nodes completely, Zilliqa implements two kinds of data. One is to store the ‘state’ of the blockchain, which is smaller in size and can be synchronized across all the nodes easily.
The other is the ‘history’ of the blockchain, which is larger in size and hence sharded. If there is a need for the history to be queried, there can be inter-node communication to obtain the necessary information from the relevant shard.
This allows for the blockchain to be scaled to a large extent without sacrificing on decentralization. However, being an emerging technology in its blockchain implementation, its interoperability with consensus mechanisms is yet to be seen.
If you loved this story, do join our Telegram Community.
Also, you can write for us and be one of the 500+ experts who have contributed stories at AIM. Share your nominations here.
What's Your Reaction?
I am an AI enthusiast and love keeping up with the latest events in the space. I love video games and pizza.