Scaling apps to a million users

Prasad Bylapudi
10 min read · Sep 11, 2023

STEP 1: Single Server Setup:

When a client tries to reach the site using its URL, DNS resolves the URL to the web server's public IP address, and the client's request then reaches the web server via that public IP.
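This resolution step can be sketched in a few lines of Python. "localhost" is used here only so the snippet runs offline; a real client would resolve the site's own domain (e.g. "example.com"):

```python
import socket

def resolve(hostname: str) -> str:
    # Ask the system resolver (which consults DNS) for the host's IPv4 address,
    # just as a browser does before connecting to the web server.
    return socket.gethostbyname(hostname)

ip = resolve("localhost")
print(ip)  # typically 127.0.0.1
```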

STEP 2: Separate out Database from the web server:

STEP 3: Add Load Balancer

In this setup the load balancer has a public IP and each of the multiple web servers has a private IP. The DNS returns the public IP of the site which is used to reach the load balancer and then the load balancer uses the private IPs of the web servers to distribute the requests among the web servers.
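A minimal sketch of this distribution logic, using round-robin (the simplest balancing strategy); the private IPs are hypothetical, and a real load balancer would also health-check its backends:

```python
import itertools

class LoadBalancer:
    def __init__(self, backends):
        # Cycle through the web servers' private IPs in order.
        self._cycle = itertools.cycle(backends)

    def route(self, request):
        # Pick the next backend in round-robin order and forward the request to it.
        return next(self._cycle)

lb = LoadBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print([lb.route(f"req-{i}") for i in range(4)])
# ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1']
```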

STEP 4: Data Replication : Add Secondary/Slave Databases

Our user base has grown even further, and a single database server is no longer able to handle the load.

We need to segregate read operations from write/update operations to reduce the pressure on the database server, serve users better, and achieve low latency.

We need to have multiple Slave Database Servers where data would be replicated from the Master Database Server.

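The read/write split can be sketched as a small router that sends writes to the master and spreads reads across the replicas. The connection names are illustrative stand-ins for real database connections:

```python
import itertools

class Router:
    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # round-robin over replicas

    def connection_for(self, sql: str):
        # Writes and updates must go to the master; reads go to a replica.
        if sql.lstrip().upper().startswith(("INSERT", "UPDATE", "DELETE")):
            return self.master
        return next(self._slaves)

r = Router("master-db", ["slave-1", "slave-2"])
print(r.connection_for("SELECT * FROM users"))       # slave-1
print(r.connection_for("UPDATE users SET name='x'")) # master-db
```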

Some complexities arise in this step:

  • Data synchronization between the master and slave servers.
  • A slave database going offline: if there is only one slave database and it goes down, read operations are redirected to the master database and a new slave database is set up as soon as possible. If there is more than one slave database, there is no need to redirect reads to the master; the remaining slave databases handle all the read operations while a new slave server is set up.
  • The master database going down: if there is only one master database and it goes down, a slave database is promoted to master.
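This failover behavior can be sketched as a tiny cluster model, where reads fall back to the master only when no replica is left and a dead master is replaced by promoting a replica. The names are illustrative:

```python
class Cluster:
    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def read_target(self):
        # Reads go to a replica if any remain; otherwise fall back to the master.
        return self.slaves[0] if self.slaves else self.master

    def slave_down(self, slave):
        self.slaves.remove(slave)

    def master_down(self):
        # Promote the first replica to be the new master.
        self.master = self.slaves.pop(0)

c = Cluster("master", ["slave-1"])
c.slave_down("slave-1")
print(c.read_target())  # reads redirected to "master"

c2 = Cluster("master", ["slave-1", "slave-2"])
c2.master_down()
print(c2.master)        # "slave-1" has been promoted
```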

STEP 5: Add Cache Tier

Database calls are expensive and often slow. To reduce read latency we need to implement caching and add a cache tier. This efficiently lightens the database load.

Things to keep in mind while implementing Caching:

  • An expiration policy should be implemented for the cache so that data is removed when it has not been accessed for a specified period of time, often called the TTL or Time To Live. The expiration period should not be too short, otherwise the application would continually fetch data from the database. It should not be too long either, otherwise the data would often become stale.
  • Maintaining consistency between the cache and the database is very important. An item in the database may be changed or updated at any time, and that change might not be reflected in the cache until the next time the item is loaded into the cache. When scaling across multiple regions, maintaining consistency between the data in the cache and the persistent storage becomes a big technical challenge.
  • Data Eviction: The process of purging data from the cache when the cache reaches its maximum size and new data needs to be inserted in the cache is called Data Eviction. Some of the popular cache data eviction policies are Least Recently Used (LRU), Least Frequently Used (LFU), First In First Out (FIFO).
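A small cache can combine the two ideas above: a TTL expiration policy and LRU eviction when the cache is full. This is a minimal sketch (the tiny size and TTL are only for illustration; production systems would use Redis or Memcached):

```python
import time
from collections import OrderedDict

class TTLCache:
    def __init__(self, max_size=2, ttl=60.0):
        self.max_size, self.ttl = max_size, ttl
        self._data = OrderedDict()  # key -> (value, expires_at)

    def get(self, key):
        item = self._data.get(key)
        if item is None or item[1] < time.time():
            # Miss or expired: drop the entry; caller falls back to the database.
            self._data.pop(key, None)
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return item[0]

    def set(self, key, value):
        self._data[key] = (value, time.time() + self.ttl)
        self._data.move_to_end(key)
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict the least recently used item

cache = TTLCache(max_size=2)
cache.set("a", 1); cache.set("b", 2)
cache.get("a")         # touch "a", so "b" is now least recently used
cache.set("c", 3)      # cache is full: evicts "b"
print(cache.get("b"))  # None
```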

STEP 6: Use Content Delivery Network (CDN)

When users are highly distributed geographically, it is better to use a Content Delivery Network (CDN) to serve static content such as CSS files, images, videos, and JavaScript files, as well as certain dynamic content. CDNs are globally distributed, and resources are fetched from the CDN edge nearest to the client. This greatly reduces the load/response time.
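Conceptually, serving from a CDN just means rewriting the asset URL to point at the edge host nearest the client. The regions and hostnames below are made up for illustration:

```python
# Hypothetical mapping from client region to nearest CDN edge host.
EDGES = {
    "eu": "eu.cdn.example.com",
    "us": "us.cdn.example.com",
    "asia": "asia.cdn.example.com",
}

def asset_url(path: str, client_region: str) -> str:
    # Unknown regions fall back to a default edge.
    edge = EDGES.get(client_region, "us.cdn.example.com")
    return f"https://{edge}{path}"

print(asset_url("/static/app.js", "eu"))
# https://eu.cdn.example.com/static/app.js
```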

STEP 7: Make The Web Tier Stateless (If possible)

As we scale the web tier horizontally we need to move the state such as session data out of the web tier.
If we don’t make the web tier stateless then we won’t be able to send the requests of a user to different servers when state needs to be maintained.
A very naïve solution to this problem is the Sticky Session, which binds a user's session to a specific server so that all subsequent requests from that user for that session are sent to that specific server. This potentially means that we are not taking full advantage of multiple web servers, so it is not a very elegant solution.
Making the web tier stateless would help to auto-scale the web tier, i.e., scale the web tier independently.
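The stateless pattern can be sketched as session data living in a shared store (a dict here, standing in for Redis or a database), so any server can handle any request. The names are illustrative:

```python
import uuid

class SessionStore:
    """Shared session storage, external to all web servers."""
    def __init__(self):
        self._sessions = {}

    def create(self, data):
        sid = str(uuid.uuid4())
        self._sessions[sid] = data
        return sid

    def load(self, sid):
        return self._sessions.get(sid)

store = SessionStore()  # shared by every web server

def handle_request(server_name, sid):
    # Any server can serve the request, because state comes from the shared store.
    return server_name, store.load(sid)

sid = store.create({"cart": ["book"]})
print(handle_request("server-1", sid))
print(handle_request("server-2", sid))  # same session, different server
```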

Let me simplify this in beginner-friendly terms.

Imagine a big online store like Amazon. To handle all the people shopping there, they use lots of computers (servers) working together. These servers are like cashiers in a store, helping customers make their purchases online.

Now, let’s talk about “session data.” Think of it as a shopping cart in a real store. When you put items in your cart, that’s like storing information about your shopping session. On a website, session data might include things like what you’ve added to your cart, your login status, or your preferences.

When you’re shopping on Amazon, sometimes the cashier (server) you start with might get too busy. So, it would be great if you could switch to a different cashier who has the same information about your cart and shopping session.

The problem is, some websites “stick” you to one cashier (server) throughout your shopping session. This is like having a shopping cart that’s tied to a specific cashier. If that cashier gets busy, you have to wait in their line, even if other cashiers are available and not busy. This isn’t very efficient because it doesn’t use all the available cashiers effectively.

To make things work better, we want to allow customers (users) to switch to different cashiers (servers) easily if one gets too busy. This is where making the web tier “stateless” comes in.

Making the web tier stateless means that the servers don’t hold onto your shopping session. Instead, they store that information somewhere else, like in a special storage area. So, no matter which server you talk to, they can always find your shopping cart information because it’s not tied to just one server.

This helps the online store work more efficiently. If one server is busy, they can send you to another one, and you won’t lose your shopping cart. It’s like being able to switch to a faster cashier without losing any of your stuff.

So, making the web tier stateless allows the online store to use all its servers effectively, like having many cashiers ready to help you, and it helps them scale up and down easily when there are more or fewer customers. It’s a smarter and more efficient way to run a big online store.

STEP 8: Implement Message Queue : LOOSE COUPLING

We have come a long way from where we began. Now we want to further decouple our web tier into smaller services. The key is to achieve loose coupling and build components that are not tightly coupled. If one component fails, the other components can continue working as if no failure occurred, since even though they depend on each other, the coupling between them is very loose.

A message queue is a key strategy employed in many distributed systems to achieve loose coupling.

Some of the technologies used to implement a message queue are the Java Message Service (JMS), Amazon SQS, and Azure Queue Storage, among many others.
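The producer/consumer decoupling a message queue provides can be sketched in-process with Python's standard `queue.Queue`, standing in for a real broker like SQS or JMS:

```python
import queue

orders = queue.Queue()

def place_order(order):
    # Producer (the web tier): enqueue the order and return immediately,
    # without waiting for downstream systems.
    orders.put(order)

def process_orders():
    # Consumer (a background worker): drain whatever is currently queued.
    shipped = []
    while not orders.empty():
        shipped.append(orders.get())
        orders.task_done()
    return shipped

place_order({"id": 1, "item": "book"})
place_order({"id": 2, "item": "pen"})
print(process_orders())  # both orders processed asynchronously
```

Because the producer never calls the consumer directly, either side can be scaled, or even briefly fail, without the other noticing.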

Let’s simplify the above by explaining the concept of a message queue in the context of an e-commerce application like an online store.

Imagine you’re running a big online store like Amazon, and many things are happening behind the scenes to make it work smoothly:

1. **Order Processing**: When customers place orders, there’s a lot of work to be done. You need to check product availability, handle payments, update inventory, and arrange for shipping.

2. **Customer Support**: Customers might have questions or issues. They send messages for help, and your support team needs to respond promptly.

3. **Inventory Management**: You need to keep track of how much of each product you have in stock and reorder when items are running low.

4. **Notifications**: Customers want to know when their orders ship or when new products arrive.

Now, without a Message Queue, all these tasks might have to talk to each other directly, and that can lead to problems:

- If the order processing system tries to talk directly to the customer support system every time there’s an issue, it could slow everything down.

- If there’s a sudden surge in orders, the order processing system might get overwhelmed, and other parts of your application could suffer.

This is where a Message Queue comes in:

- **Order Processing**: When a customer places an order, instead of immediately telling the shipping system, it drops a message in the Message Queue, like a note in a shared mailbox.

- **Customer Support**: Your support team checks the Message Queue regularly. When they see a new message about a customer issue, they take care of it.

- **Inventory Management**: The inventory system keeps an eye on the Message Queue for updates about product availability. When it sees a message about a product being sold, it updates the inventory.

- **Notifications**: The system responsible for sending notifications checks the Message Queue. When it sees a message about an order being shipped, it sends an email to the customer.

Here’s why this helps in an e-commerce context:

- **Efficiency**: Different parts of your e-commerce system can work independently. The order processing system doesn’t have to wait for customer support to handle issues before it can move on.

- **Scalability**: If there’s a sudden surge in orders, you can add more resources to the order processing part without affecting other parts of the system. The Message Queue acts as a buffer.

- **Fault Tolerance**: If one part of your system has a problem (for example, the customer support system goes down temporarily), the Message Queue can hold messages until the issue is resolved. Other parts of the system can continue to function normally.

So, in the context of an e-commerce application, a Message Queue is like a communication system that helps different parts of the application work together efficiently, handle high demand, and stay resilient even when there are hiccups. It’s like a behind-the-scenes coordinator that keeps your online store running smoothly.

STEP 9: Database Scaling: Sharding

Now the time has come to scale out our database through sharding.

Let’s dig into sharding with an e-commerce example.

Imagine you have a gigantic warehouse full of products to sell, like clothes, electronics, and toys. This warehouse is like your database, where you keep track of all the products, their prices, and customer orders.

Now, your online store is becoming super popular, and you’re getting tons of customers and products. Managing this huge warehouse all by yourself is getting tricky because you can’t keep up with everything.

So, what do you do?

Database Sharding is like splitting your big warehouse into smaller, more manageable sections. Each section has its own manager, like a mini-warehouse supervisor.

  1. Product Shards: You might decide to divide the products into categories. For example, one section (shard) is just for clothes, another for electronics, and another for toys. Each shard has its own supervisor who knows everything about the products in that category.
  2. Order Shards: You do the same for orders. One shard handles orders from Monday to Wednesday, another from Thursday to Saturday, and another for Sunday. Each shard has its order managers who keep track of orders for their time period.

Now, why is this helpful?

  • Faster Service: When a customer looks for clothes, they don’t have to wait for the manager of the electronics section to find the information. They go straight to the clothes section, where things are organized just for them. This makes the online store run faster.
  • Easier to Manage: Each mini-warehouse (shard) is easier to handle because it’s smaller and has its own manager. If one category gets super popular, you can hire more managers or split that category into even smaller sections.
  • But… Joining Data: Here’s the catch. If a customer wants to buy a shirt and a tablet, you now need to talk to both the clothes manager and the electronics manager. This can be a bit tricky because they’re in different sections. So, sometimes, you might need to do some extra work to get all the information you need.
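In practice, a common way to pick a shard is hash-based sharding: a stable hash of the key decides which shard owns the row. A minimal sketch (four shards is an arbitrary choice for the example):

```python
import hashlib

NUM_SHARDS = 4

def shard_for(key: str) -> int:
    # A stable hash of the key, reduced modulo the shard count, so the
    # same key always maps to the same shard on every server.
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# All operations on the same user land on the same shard.
print(shard_for("user:42") == shard_for("user:42"))  # True
```

Note that plain modulo sharding requires moving a lot of data when `NUM_SHARDS` changes; schemes like consistent hashing reduce that cost.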

STEP 10: Add Automation, Logging, Monitoring and Metrics

Now that the business has grown substantially, collecting different types of metrics is very important as they can give useful business insights about the site. Some important metrics are:

  • Host Level Metrics: CPU, Memory, Disk I/O
  • Aggregated Level Metrics: Performance of entire database tier
  • Key Business Metrics: Daily Active Users, Revenue etc.
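A tiny sketch of what collecting such metrics looks like, with counters for business events and gauges for host-level readings (a stand-in for real clients like Prometheus or StatsD):

```python
from collections import defaultdict

class Metrics:
    def __init__(self):
        self.counters = defaultdict(int)  # monotonically increasing events
        self.gauges = {}                  # point-in-time readings

    def incr(self, name, by=1):
        self.counters[name] += by

    def gauge(self, name, value):
        self.gauges[name] = value

m = Metrics()
m.incr("orders.placed")            # key business metric
m.incr("orders.placed")
m.gauge("host.cpu_percent", 37.5)  # host-level metric
print(m.counters["orders.placed"], m.gauges["host.cpu_percent"])
```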

Next comes Automation. Our infrastructure and code base are getting bigger and bigger. We need to leverage different automation tools to improve efficiency. Continuous Integration is a very common practice these days.

STEP 11: Multiple Data Centers

The site has grown rapidly and has attracted millions of users internationally. To improve availability and provide a better user experience across wider geographical areas, deploying the site to more than one data center is crucial.

User requests are geoDNS-routed to the nearest data center. GeoDNS is a DNS service that allows a domain name to be resolved to an IP address based on the user's location.
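Conceptually, geoDNS resolves the same domain to a different data-center IP depending on the client's region. The regions and IPs below are made up for illustration:

```python
# Hypothetical region -> data-center IP mapping held by the geoDNS service.
DATACENTERS = {"us": "203.0.113.10", "eu": "198.51.100.20"}

def geo_resolve(domain: str, client_region: str) -> str:
    # Same domain, different answer per region; unknown regions get a default.
    return DATACENTERS.get(client_region, DATACENTERS["us"])

print(geo_resolve("example.com", "eu"))  # 198.51.100.20
```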

Summary:

The above steps, from step 1 through step 11, would help to build a system from scratch and scale it from a single user to 10 million users.

So, while building distributed scalable systems things to keep in mind are:

  • Keep Web Tier stateless
  • Have data replication or redundancy at every tier
  • Cache data as much as possible
  • Host static resources on CDN
  • Scale data tier by sharding
  • Split tiers into individual services, e.g., by using a message queue
  • Monitor your system and use automation tools
  • Support multiple data centers

Happy Learning!
