Queues are a great way to control the amount of queries to the database and bring some order in the world of micro-service communication.
When you are designing distributed systems at scale, one question that always comes up is how you scale the databases. This is truly one of the hardest things to do in the system design world from my experience.
The Problem in Hand
It doesn’t matter whether your database is SQL or NoSQL, ACID-compliant or not, when your user base grows and the system becomes larger and larger, eventually, your database is gonna get overloaded.
Imagine this scenario, you are a startup and you have just created an amazing piece of software that processes and handles some kind of transaction. For now, the number of users using your system is small. let us say 1000. So, you decide to go for the following architecture.
You got a couple of instances of your backend service running inside Kubernetes pods which interact with a single instance of Postgres. Pretty simple right? The system runs well.
But suddenly you see a massive growth in your user base (10k users) and you notice a huge spike in the traffic. So, you decide to scale your system. Scaling the backend service is as simple as increasing the number of pods. Scaling the database is not so straightforward. But before we go there let’s see what happens to our database when we scale our backend service.
You have doubled the number of pods for your service, but now because of that, you have also doubled the number of queries coming to your Postgres database.
Replica Sets
When it comes to scaling the database, the first approach is a replica set. Simply put, in a replica set, we have several database instances (usually 3, 5, 7, etc) all of which store the same data.
The catch here is, only one of the database instances handles all the write, update, and delete operations. The rest of the databases are there to handle the read traffic. The database which handles the writes is called the Primary node or the Leader. It syncs the data across all the other nodes.
As you can imagine, if your application has many reads the writes, this approach is great. But it’s not a good choice and vice versa as all the writes still hit a single instance of the database.
Sharding
The next option for scaling databases is sharding. Sharding is the process where you split the data across multiple instances of the database. The data stored in each instance is different from the others. Every instance handles both the read and write operations. Usually, there is some kind of load balancer which sits in front of the instances and directs the read/ write traffic to the correct database.
There are several proven strategies to evenly distribute the data across all the databases. One of them is consistent hashing.
As you can see, sharding works well for writing heavy use cases as well. There is only one problem tho. COST. Sharding databases take up a lot of effort and hosting multiple databases doesn’t come cheap.
A simple solution: Persistent Queues
Perhaps one of the simplest and easiest ways to de-couple the database from the backend service is to use some kind of distributed and persistent queue. We simply push the transactions into the queue and a separate service will write to the database at a constant rate so that it doesn’t get overloaded.
This is the approach on which many of the best systems in the world are based. Paypal, Amazon, Twitter, Uber you name it. They all employ some kind of queueing mechanism to reduce their load on the database. This is a much cheaper option compared to hosting multiple databases.
There are many popular queues available out there that are extensively used. One of the most popular ones is Apache Kafka. It is widely used in probably more than 80% of the companies out there. Another one of my favorites is NATS.
But do note that, queues are not a magical solution to every problem. They are great in asynchronous scenarios where the result might not need to be immediately reflected.
’The programming language can be changed, and the UI can be changed, but if you don’t design your data well, the entire system has to be changed.’
0 Comments
Leave a Comment