Partitioning and Sharding | System Design Interview Concept
For more comprehensive system design content and try your the design questions yourself, check out System Design School.
What is Partitioning?
Imagine your business is growing from 0 to millions of users, your database is filled up and you are running out of space quickly. What do you do? You have two choices
- make the machine and database bigger, aka vertical scaling
- slice the data into pieces and save into multiple databases and possibly multiple machines, aka horizontal scaling
Vertical scaling can only get you so far. You are limited by the memory and disk size of a single machine. And if something goes wrong with that machine, you are toast. That's why horizontal scaling is often more desirable for large web applications.
We can partition a large database into smaller databases(shards) according to certain rules. Naturally there are two ways to partition the database. We can partition the data by row or by column into smaller databases:
Horizontal database partition or sharding is the mostly commonly used partitioning method in SQL databases. Many modern databases have built-in sharding system.
A shard key is selected to decide which shard a data row should go into. Here's is a figure from MySQL's official documentation on shard key. Imagine a sales database, we can partition it by region, date or simply by hash of customer ids.
In the following sections, we will demo the concept of sharding with a SQL database and a NoSQL database.
Sharding a SQL database - MySQL(MariaDB)
To understand how data partitioning works with SQL database, let's try an real example with MySQL.
Suppose we have a comment system with a table of comments. We can partition the table by the year the comment was created. This method is called RANGE Partitioning.
1CREATE TABLE comments 2 ( 3 comment_id INT NOT NULL, 4 page_id INT NOT NULL, 5 user_id INT NOT NULL, 6 content TEXT NOT NULL, 7 created_time DATETIME NOT NULL 8 ) 9PARTITION BY RANGE (year(created_time)) 10 ( 11 PARTITION pold VALUES LESS THAN (2019), 12 PARTITION p19 VALUES LESS THAN (2020), 13 PARTITION p20 VALUES LESS THAN (2021), 14 PARTITION p21 VALUES LESS THAN (2022), 15 PARTITION p22 VALUES LESS THAN (2023) 16 );
Then insert 2 comments, the first comment is from 2009 and the second comment is from 2021.
1INSERT INTO `comments` ( 2 `comment_id`, `page_id`, `user_id`, 3 `content`, `created_time` 4) 5VALUES 6 ( 7 '1', '1', '1', 8 'The Times 03/Jan/2009 Chancellor on brink of second bailout for banks', 9 '2009-01-03 18:15:05' 10 ), 11 ( 12 '2', '2', '2', 13 'Hello algo.monster', 14 '2021-10-11 18:45:02' 15 )
To check the result, we can use the following queries.
1SELECT * FROM comments; 2SELECT * FROM comments PARTITION (pold); 3SELECT * FROM comments PARTITION (p21);
The first query should print all 2 comment. The second query shows the comment from 2009, and the third query shows the comment from 2021. You can try the code here http://sqlfiddle.com/#!9/fccec/3.
Sharding a NoSQL database - Redis Cluster
Redis is a key-value NoSQL database with builtin sharding support. Redis official website has a nice tutorial about Redis Cluster.
For scalability, Redis Cluster distribute data to nodes by the CRC16 of the key modulo 16384, so each node is responsible for a subset of the hash slots.
For redundancy, Redis Cluster simply use replica nodes to store copies the data in master nodes as failovers.
Here are the bash commands for running a minimal cluster with three masters and three replicas from the tutorial.
1mkdir redis-cluster 2cd redis-cluster 3mkdir 7000 7001 7002 7003 7004 7005 4 5# create redis config files 6cat >> ./7000/redis.conf << EOF 7port 7000 8daemonize yes 9cluster-enabled yes 10cluster-config-file nodes.conf 11cluster-node-timeout 5000 12appendonly yes 13EOF 14 15cat >> ./7001/redis.conf << EOF 16port 7001 17daemonize yes 18cluster-enabled yes 19cluster-config-file nodes.conf 20cluster-node-timeout 5000 21appendonly yes 22EOF 23 24cat >> ./7002/redis.conf << EOF 25port 7002 26daemonize yes 27cluster-enabled yes 28cluster-config-file nodes.conf 29cluster-node-timeout 5000 30appendonly yes 31EOF 32 33cat >> ./7003/redis.conf << EOF 34port 7003 35daemonize yes 36cluster-enabled yes 37cluster-config-file nodes.conf 38cluster-node-timeout 5000 39appendonly yes 40EOF 41 42cat >> ./7004/redis.conf << EOF 43port 7004 44daemonize yes 45cluster-enabled yes 46cluster-config-file nodes.conf 47cluster-node-timeout 5000 48appendonly yes 49EOF 50 51cat >> ./7005/redis.conf << EOF 52port 7005 53daemonize yes 54cluster-enabled yes 55cluster-config-file nodes.conf 56cluster-node-timeout 5000 57appendonly yes 58EOF 59 60# start redis servers 61cd 7000 && redis-server redis.conf && cd .. 62cd 7001 && redis-server redis.conf && cd .. 63cd 7002 && redis-server redis.conf && cd .. 64cd 7003 && redis-server redis.conf && cd .. 65cd 7004 && redis-server redis.conf && cd .. 66cd 7005 && redis-server redis.conf && cd .. 67 68# create redis cluster 69redis-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 --cluster-replicas 1
You should see a message
[OK] All 16384 slots covered after everything is done.
1>>> Performing Cluster Check (using node 127.0.0.1:7000) 2M: 91cf3c5c135cf17c0202ac0041a1842d69633d13 127.0.0.1:7000 3 slots:[0-5460] (5461 slots) master 4 1 additional replica(s) 5M: b1d224ff349351dcc2b770e21226cf35697fffbd 127.0.0.1:7002 6 slots:[10923-16383] (5461 slots) master 7 1 additional replica(s) 8S: fef256a968b814eedaf65a0a4f397e97807d1f16 127.0.0.1:7003 9 slots: (0 slots) slave 10 replicates b1d224ff349351dcc2b770e21226cf35697fffbd 11S: 8dd78cf46e709045ef0b72dddf49ac4f91407309 127.0.0.1:7005 12 slots: (0 slots) slave 13 replicates 063b3ed1f0ea6888401025cc37f24291a631fc65 14S: 1a82aa2d68becbabb878913fbd380f2d288ac053 127.0.0.1:7004 15 slots: (0 slots) slave 16 replicates 91cf3c5c135cf17c0202ac0041a1842d69633d13 17M: 063b3ed1f0ea6888401025cc37f24291a631fc65 127.0.0.1:7001 18 slots:[5461-10922] (5462 slots) master 19 1 additional replica(s) 20[OK] All nodes agree about slots configuration. 21>>> Check for open slots... 22>>> Check slots coverage... 23[OK] All 16384 slots covered.
Then we can use
redis-cli -c -p 7000 to connect to the first node of the cluster.
1$ redis-cli -c -p 7000 2127.0.0.1:7000> set algo monster 3OK 4127.0.0.1:7000> set hello world 5OK 6127.0.0.1:7000> set foo bar 7-> Redirected to slot  located at 127.0.0.1:7002 8OK 9127.0.0.1:7002> set a a 10OK 11127.0.0.1:7002> set hello hello 12-> Redirected to slot  located at 127.0.0.1:7000 13OK
At first, both
hello fit in
[0-5460], the hash slots of the node 7000. Then
foo has hash
12182, which is covered by the node 7002, so we are redirected and the prompt is changed. When we set
hello again, we are redirected back to the node 7000 because that's where the key
hello is stored.
As the dataset grows, we might need to add new nodes to provide more storage. In this case, we need to reshard the cluster since we need to rearrange the hash slots.
Got a question? Ask the Teaching Assistant anything you don't understand.