A load balancer distributes network traffic across a cluster of servers. This ensures that no single server is overloaded, which improves both the responsiveness and the availability of an application.
In a distributed application, load balancers are commonly placed between different layers:
Between the client and web servers
Between the web servers and application servers
Between the application servers and database servers
The key benefits of a load balancer are:
Application Responsiveness - A load balancer distributes network traffic across a cluster of servers so that no single server is overloaded, which improves the responsiveness of the application.
Application Availability - A load balancer checks the health of a server before routing requests to that server, hence improving the availability of the application.
SSL Termination - A load balancer can decrypt SSL/TLS (Secure Sockets Layer / Transport Layer Security) traffic before passing the request on to the web server, which saves the web server the processing time and CPU cycles that decryption would cost. This is called SSL Termination.
Protection from DDoS attacks - A load balancer can help defend against distributed denial-of-service (DDoS) attacks by offloading the attack traffic from the servers to the cloud provider.
Predictive Analytics - A load balancer can provide traffic insights and predictive analytics that help identify bottlenecks before they happen.
A load balancer considers two factors to determine which server to route a request to.
Server Health Check - A load balancer pings the servers at regular intervals to determine their health. If a server fails a health check, it is removed from the healthy server pool and no requests are routed to it until it passes the health check again.
Load Balancing Algorithm - Out of the pool of healthy servers, a load balancer determines which server to send a request to by using one of the many load balancing algorithms.
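The health check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ServerPool class, the probe callback and the server names are hypothetical, and a real load balancer would run the checks on a timer and usually require several consecutive failures before removing a server.

```python
from typing import Callable, Dict, List


class ServerPool:
    """Tracks which servers are currently considered healthy (hypothetical helper)."""

    def __init__(self, servers: List[str], probe: Callable[[str], bool]) -> None:
        self.servers = list(servers)
        # Assumption: probe(server) returns True when the server answers the check.
        self.probe = probe
        self.healthy: Dict[str, bool] = {s: True for s in self.servers}

    def run_health_checks(self) -> None:
        """Probe every server once and record the result."""
        for server in self.servers:
            self.healthy[server] = self.probe(server)

    def healthy_servers(self) -> List[str]:
        """Return only the servers that passed their most recent check."""
        return [s for s in self.servers if self.healthy[s]]


# Simulated probe: servers listed in `down` fail the check.
down = {"app-2"}
pool = ServerPool(["app-1", "app-2", "app-3"], probe=lambda s: s not in down)

pool.run_health_checks()
print(pool.healthy_servers())  # app-2 is left out of the pool

down.clear()                   # app-2 recovers
pool.run_health_checks()
print(pool.healthy_servers())  # app-2 is routed to again
```

Requests would then be routed only to the servers returned by healthy_servers(), using one of the algorithms below.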
Some of the common load balancing algorithms are:
Least Connection Method - Routes the request to the server with the fewest active connections.
Least Response Time Method - Routes the request to the server with the fewest active connections and the lowest average response time.
Round Robin Method - Routes the request to the first server in the queue and then moves that server to the end of the queue.
Weighted Round Robin Method - Each server is assigned a weight, an integer number, based on its processing capacity. Requests are rotated across the servers as in round robin, but servers with a higher weight receive proportionally more requests.
IP Hash Method - The hash of the client's IP address determines which server receives the request, so requests from the same client are consistently routed to the same server.
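A minimal Python sketch of four of these algorithms may make the differences concrete. The function names and server names are illustrative assumptions, not the API of any particular load balancer, and the weighted rotation shown is the naive form (production implementations typically interleave servers more smoothly).

```python
import hashlib
import itertools
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Cycle through the servers in order, forever."""
    return itertools.cycle(servers)


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Naive weighted rotation: each server appears `weight` times per cycle."""
    expanded = [server for server, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def least_connections(active: Dict[str, int]) -> str:
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)


def ip_hash(client_ip: str, servers: List[str]) -> str:
    """Map a client IP to a server. The same IP maps to the same server
    as long as the server list does not change."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


servers = ["app-1", "app-2", "app-3"]

rr = round_robin(servers)
print([next(rr) for _ in range(4)])    # ['app-1', 'app-2', 'app-3', 'app-1']

wrr = weighted_round_robin({"big": 2, "small": 1})
print([next(wrr) for _ in range(3)])   # ['big', 'big', 'small']

print(least_connections({"app-1": 12, "app-2": 3, "app-3": 7}))  # app-2

# The same client IP is always sent to the same server.
print(ip_hash("203.0.113.7", servers) == ip_hash("203.0.113.7", servers))  # True
```

In practice the server list passed to each function would be the pool of servers that currently pass their health checks.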
Load balancers are also classified by the network layer at which they make routing decisions.
Layer 4 - A Layer 4 load balancer uses the information defined at the transport layer (Layer 4) as the basis for deciding how to route traffic across a cluster of servers. For web traffic, Layer 4 information includes source and destination IP addresses and ports.
Layer 7 - A Layer 7 load balancer uses the information defined at the application layer (Layer 7) as the basis for deciding how to route traffic across a cluster of servers. For web traffic, Layer 7 information includes HTTP headers and message content, such as the destination URL, content type (text, video, image), and cookie information.
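The difference between the two can be sketched as follows. This is a simplified illustration, assuming a hypothetical Request record and made-up pool names: the Layer 4 function looks only at addresses and ports, while the Layer 7 function inspects the URL path.

```python
import hashlib
from typing import Dict, List, NamedTuple


class Request(NamedTuple):
    """Hypothetical request record used only for this sketch."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    path: str = "/"  # Layer 7 detail; a Layer 4 balancer never sees it


def route_l4(req: Request, servers: List[str]) -> str:
    """Layer 4: choose a server using only addresses and ports."""
    key = f"{req.src_ip}:{req.src_port}->{req.dst_ip}:{req.dst_port}"
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def route_l7(req: Request, pools: Dict[str, List[str]],
             default_pool: List[str]) -> List[str]:
    """Layer 7: choose a server pool based on the URL path prefix."""
    for prefix, pool in pools.items():
        if req.path.startswith(prefix):
            return pool
    return default_pool


req = Request("198.51.100.4", 51234, "192.0.2.10", 443, path="/video/intro.mp4")

# Layer 7 can send video requests to a dedicated pool.
print(route_l7(req, {"/video": ["media-1", "media-2"]}, ["web-1", "web-2"]))
# ['media-1', 'media-2']

# Layer 4 picks a server without looking at the path at all.
print(route_l4(req, ["web-1", "web-2"]))
```

Layer 4 routing is cheaper because the request body is never parsed, whereas Layer 7 routing allows content-aware decisions such as the path-based pool selection shown here.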