Source: Canva Team

Beyond Names & Labels: Load Balancer

Ananya
8 min read · Oct 2, 2023


“I learned very early the difference between knowing the name of something and knowing something.”
Richard P. Feynman

This blog series captures the essence of Feynman’s quote, emphasizing the distinction between superficial knowledge (knowing the name of something) and deep understanding (knowing something). If you want to learn beyond the name of the Load Balancer, then read away 😊

Problem Statement

You open a bookstore. The store is new, and so far you get around 50 customers per day, all of whom you are able to serve satisfactorily. Eventually, the store grows in popularity, and the number of customers per day increases so much that customers now experience a minimum 30-minute wait for service. This leads to a poor customer experience. One way to solve this problem is to increase the capacity of the current store by expanding its floor area and hiring more salespersons. But there is a limit to how far one store can stretch. So you decide to open a second store and distribute the customers between the two stores. What you just did is load balancing.

Load Balancing

Load Balancing is the idea of distributing incoming traffic across multiple servers efficiently.

Just like your bookstore, today’s websites experience high internet traffic (concurrent customer requests). To serve all these requests, you could start with a single-server setup built around one very powerful application server.

Single Server Application

Cons of the above setup:

  1. There is a limit to the capacity of a single machine. A single application server cannot cater to very high traffic with minimal wait time.
  2. This server becomes a single point of failure. When this server is down, your whole application goes down.

Load Balancing is the act of introducing multiple servers to serve the high concurrent traffic.

Multiple Servers Application

Now that you have multiple servers and concurrent clients making requests to them, who decides which server serves each incoming request? The load balancer is the black box in the picture above.

What is a Load Balancer?

The load balancer is an entity that does the following operations:

  1. Routes your client requests to the appropriate server that can serve the request.
  2. If a server is not available, the load balancer redirects traffic to a different healthy server, ensuring high availability.
  3. If you need to add a server to scale your website, the load balancer routes traffic to the newly added server too.

Load Balancer in Multi-Server Setup

Benefits of Load Balancer

  • Availability: The load balancer distributes traffic between multiple servers. It also stops forwarding requests to any server that is not healthy thereby ensuring high availability.
  • Resource Utilization: Ensures good resource utilization by distributing the traffic across all servers.
  • Scaling: Helps in horizontally scaling applications by adding and removing servers as needed.
  • Maximize performance: A load balancer enables a multi-server application, allowing it to serve more concurrent requests optimally and to minimize customer wait time.

Hardware vs. Software Load Balancers

The load balancer can be hardware or software.

  1. Hardware Load Balancers: A hardware load balancer is a dedicated standalone physical appliance designed specifically for load-balancing network traffic across multiple servers.
  2. Software Load Balancers: Software load balancers can be deployed on a variety of platforms, including virtual machines, containers, and cloud instances. They offer more flexibility in terms of where and how they are deployed.

Layer 7 & Layer 4 Load Balancers

Before talking about Layer 4 or Layer 7 load balancers, let's refresh the layers of the Open Systems Interconnection (OSI) model.

Layer 4 Load Balancer (Network Load Balancer)

  1. A Layer 4 load balancer operates at the Transport Layer of the OSI model. At the transport layer, only the IP address and port number are available.
  2. A Layer 4 load balancer simply forwards network packets to and from the upstream server without inspecting the content of the packets.
  3. Layer 4 load balancing can make routing decisions based only on IP addresses and port numbers.
  4. Layer 4 load balancers also perform Network Address Translation (NAT) while forwarding requests to and from the upstream server.
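To make the idea concrete, here is a minimal sketch of a Layer 4 routing decision in Python. Only the client IP and port are visible at this layer, so the choice must come from the connection endpoints alone, never the payload. The hostnames are placeholders, not from any real setup.

```python
# A Layer 4 decision sees only the connection endpoints, not the payload.
servers = ["backend1.example.com", "backend2.example.com"]

def route_l4(client_ip: str, client_port: int) -> str:
    # Hash the (ip, port) pair so every packet of one TCP connection
    # keeps going to the same backend server.
    return servers[hash((client_ip, client_port)) % len(servers)]
```

Because the decision is a pure function of the endpoints, the balancer needs no per-request state to keep a connection pinned to one server.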

Layer 7 Load Balancer (Application Load Balancer)

  1. A Layer 7 load balancer operates at the Application Layer of the OSI model.
  2. Layer 7 load balancing can make routing decisions based on various characteristics of the HTTP header and the actual contents of the message, such as the URL or information in a cookie.
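A rough Python sketch of such a Layer 7 decision is below: the pool names, paths, and the cookie rule are illustrative assumptions, not any real balancer's rule set.

```python
def route_l7(path: str, cookies: dict) -> str:
    """Pick a backend pool for an HTTP request by inspecting its contents."""
    # Rule 1: a cookie can pin the session to a specific pool.
    if "pool" in cookies:
        return cookies["pool"]
    # Rule 2: URL-prefix rules, checked in order.
    if path.startswith("/api/"):
        return "api-pool"
    if path.startswith("/static/"):
        return "static-pool"
    # Default pool for everything else.
    return "web-pool"
```

Note that this requires parsing the HTTP request, which is exactly what a Layer 4 balancer avoids.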

Load Balancer Configuration

In this section, I will show you one sample load balancer config file. For this demo, I will use the Nginx load balancer configuration. Nginx is open-source software that provides load-balancing capabilities.

A simple load balancer config
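A minimal Nginx load-balancing configuration looks roughly like this (the hostnames are placeholders): an `upstream` block names the pool of servers, and `proxy_pass` forwards incoming requests to it.

```nginx
http {
    upstream backend {
        server backend1.example.com;
        server backend2.example.com;
    }

    server {
        listen 80;
        location / {
            # forward every request to one of the upstream servers
            proxy_pass http://backend;
        }
    }
}
```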

Load Balancer Algorithms

In this section, we discuss the different load-balancing algorithms. A sample Nginx configuration is also provided with each algorithm.

Static Algorithms

Static algorithms distribute traffic to servers without taking the servers' real-time stats, such as response time or active connections, into account.

Round Robin

The load balancer sends requests in a cyclic order between different servers. It sends the first request to the first server, the second to the second server, and so on till the end of the server list. When it reaches the end of the list, the load balancer loops the list again from the beginning.

Pros:

  1. Simple and easy to implement

Cons:

  1. It can overload a server, since requests are assigned without regard to current load.
  2. The algorithm assumes that all servers have the same capacity in terms of CPU cores, memory, etc., and distributes load evenly.

upstream backend {
    # no load-balancing method specified: Round Robin is the default
    server backend1.example.com;
    server backend2.example.com;
}
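The rotation itself can be sketched in a few lines of Python; the hostnames mirror the config above and are placeholders.

```python
from itertools import cycle

servers = ["backend1.example.com", "backend2.example.com"]
rotation = cycle(servers)  # restarts from the first server after the last

def pick_server() -> str:
    return next(rotation)

# four requests alternate: backend1, backend2, backend1, backend2
assignments = [pick_server() for _ in range(4)]
```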

Sticky Round Robin

This is an extension of the round-robin algorithm where all subsequent requests of one user go to the same server.

Pros:

  1. This technique helps in leveraging in-memory caching as all the requests of one user go to the same server.

Cons:

  1. This can lead to uneven resource utilization. For example, if one user sends bulk requests, that user's server will get overloaded while other servers sit idle.

upstream backend {
    server backend1.example.com;
    server backend2.example.com;
    # 'sticky' is available in the commercial NGINX Plus distribution
    sticky cookie srv_id expires=1h domain=.example.com path=/;
}
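A sketch of the stickiness logic: a user's first request is assigned round-robin, and the choice is remembered so later requests go to the same server. In the Nginx config above the mapping lives in the `srv_id` cookie; here a plain dict stands in for it.

```python
from itertools import cycle

rotation = cycle(["backend1.example.com", "backend2.example.com"])
sessions: dict[str, str] = {}

def pick_server(user_id: str) -> str:
    if user_id not in sessions:
        sessions[user_id] = next(rotation)  # first visit: round-robin
    return sessions[user_id]                # repeat visit: same server
```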

Weighted Round Robin

Weight is assigned to each server by the admin. Traffic is distributed based on the weightage of a server. The server with a higher weight gets more requests compared to the one with a lower weight.

Pros:

  1. You can configure machines with higher processing capacity (more cores, more memory, etc.) to accept more requests. This leads to better resource utilization.

Cons:

  1. These weights are configured manually which might take some trials to reach the correct weight distribution.

upstream backend {
    server backend1.example.com weight=5;
    server backend2.example.com weight=1;
    # 'backup' marks a server that receives traffic only when the others are down
    server 192.0.0.1 backup;
}
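A naive Python sketch of weighted rotation: repeat each server according to its weight and cycle through the expanded list. With weights 5 and 1, backend1 receives five of every six requests. (Nginx interleaves the picks more smoothly, but the long-run ratio is the same.)

```python
from itertools import cycle

weights = {"backend1.example.com": 5, "backend2.example.com": 1}
# expand each server into `weight` slots, then rotate over the slots
rotation = cycle([server for server, w in weights.items() for _ in range(w)])

def pick_server() -> str:
    return next(rotation)

first_six = [pick_server() for _ in range(6)]
```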

Hash

The hash algorithm uses a hash function to map the request to servers. The hash function can use the client IP address or request URL as input to the hash function.

In the IP Hash technique, the client IP address is taken as input to the hash function. This method guarantees that requests from the same address reach the same server unless that server is unavailable.

Pros:

  1. When hashing is done on the client IP, all of a client's requests go to one server. This helps in leveraging in-memory caching.

Cons:

  1. Choosing an optimal hash function to evenly distribute load among all servers can be tricky.

upstream backend {
    # IP Hash: the client IP address is the input to the hash function
    ip_hash;
    server backend1.example.com;
    server backend2.example.com;
}

upstream backend {
    # generic hash: the request URI is the input to the hash function
    hash $request_uri consistent;
    server backend1.example.com;
    server backend2.example.com;
}
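The IP Hash idea can be sketched as follows: hash the client address and take it modulo the number of servers, so every request from one client lands on the same server. MD5 here is only a stable, evenly distributing hash choice for illustration, not a security measure.

```python
import hashlib

servers = ["backend1.example.com", "backend2.example.com"]

def pick_server(client_ip: str) -> str:
    # hash the address to a large integer, then map it onto a server slot
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

Note that adding or removing a server changes the modulus, which remaps most clients; the `consistent` flag in the generic hash example switches Nginx to consistent hashing precisely to minimize that remapping.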

Dynamic Algorithms

These algorithms route requests by taking real-time performance metrics and server stats, such as average response time, latency, and number of active connections, into account.

Least Connection

This load balancer keeps track of the active connections of each server and sends each new request to the server with the fewest concurrent connections.

Pros:

  1. This technique does not overload a busy application server with excessive requests. A new request is sent to a server with the least open connections.

Cons:

  1. It can lead to an increase in load on certain servers if the connections are long-running and pile up eventually.
  2. This requires efficient tracking of ongoing connections on each server.

upstream myapp1 {
    least_conn;
    server srv1.example.com;
    server srv2.example.com;
    server srv3.example.com;
}
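A sketch of the bookkeeping involved: track active connections per server and send each new request to the server with the fewest. The hostnames mirror the config above.

```python
active = {"srv1.example.com": 0, "srv2.example.com": 0, "srv3.example.com": 0}

def open_connection() -> str:
    server = min(active, key=active.get)  # server with fewest connections
    active[server] += 1
    return server

def close_connection(server: str) -> None:
    active[server] -= 1
```

The balancer must observe both connection open and close events, which is the tracking overhead mentioned in the cons above.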

Least Response Time

An intelligent load balancer keeps track of the number of active connections and the average latency of each server. It sends each new request to the server with the lowest average latency and the fewest active connections.

Pros:

  1. Better resource utilization and optimized response time. The server load is distributed and adjusted in real time based on the available metrics.

Cons:

  1. Complex implementation as it requires continuous monitoring of the latency of each server.
  2. This type of load balancer is mostly available as paid software.

upstream backend {
    # 'least_time' is available in the commercial NGINX Plus distribution;
    # 'header' measures the time to receive the response header
    least_time header;
    server backend1.example.com;
    server backend2.example.com;
}
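One way to combine the two signals is to score each server and pick the lowest; the numbers and the scoring formula below are illustrative assumptions, not Nginx's exact internals.

```python
stats = {
    "backend1.example.com": {"avg_latency_ms": 20.0, "active": 3},
    "backend2.example.com": {"avg_latency_ms": 50.0, "active": 1},
}

def pick_server() -> str:
    def score(name: str) -> float:
        s = stats[name]
        # penalize both slow responses and a long queue of open connections
        return s["avg_latency_ms"] * (s["active"] + 1)
    return min(stats, key=score)
```

With these numbers, backend1 scores 20 × 4 = 80 and backend2 scores 50 × 2 = 100, so backend1 wins despite having more open connections.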


Ananya

Software Developer | Technical Writer | Technology Enthusiast