How a Load Balancer Works

Large websites may be “load balanced” across multiple machines. In many load balanced setups, a user may hit any of the backend machines during a session. Because of this, a number of methods exist to allow many machines to share user sessions.

The method chosen will depend on the style of load balancing employed, as well as the availability/capacity of backend storage:

Session information stored in cookies only: Session information (not just a session identifier) is stored in a user’s cookie. For example, the user’s cookie might contain the contents of their shopping basket. To prevent users from tampering with the session data, an HMAC may be provided along with the cookie. This method is probably the least suitable for most applications, although it has advantages as well as drawbacks:

  • No backend storage is required
  • The user does not need to hit the same machine each time, so DNS load balancing can be employed
  • There is no latency associated with retrieving the session information from a database machine (as it is provided with the HTTP request). This is useful if your site is load balanced across machines on different continents.
  • The amount of data that can be stored in the session is limited (by the cookie size limit of roughly 4 KB)
  • Encryption has to be employed if a user should not be able to see the contents of their session
  • An HMAC (or similar) has to be employed to prevent users from tampering with session data
  • Since the session data is not stored server side, it’s more difficult for developers to debug
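
As a rough illustration of the HMAC approach described above, here is a minimal Python sketch using only the standard library. The names `SECRET_KEY`, `encode_session` and `decode_session` are hypothetical and chosen for this example; it assumes every backend machine shares the same secret. Note that the data is only signed, not encrypted, so the user can still read it.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical secret; in this sketch every backend machine must share it.
SECRET_KEY = b"replace-with-a-long-random-secret"
# Approximate per-cookie size limit mentioned in the list above.
MAX_COOKIE_BYTES = 4096


def encode_session(data: dict) -> str:
    """Serialise session data and append an HMAC-SHA256 tag."""
    raw = json.dumps(data, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw)
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    cookie = payload.decode("ascii") + "." + tag
    if len(cookie) > MAX_COOKIE_BYTES:
        raise ValueError("session too large for a cookie")
    return cookie


def decode_session(cookie: str) -> dict:
    """Return the session data, or raise ValueError if the tag does not match."""
    payload, sep, tag = cookie.rpartition(".")
    if not sep:
        raise ValueError("malformed cookie")
    expected = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking where the first mismatch occurs.
    if not hmac.compare_digest(expected.encode("utf-8"), tag.encode("utf-8")):
        raise ValueError("cookie signature mismatch")
    return json.loads(base64.urlsafe_b64decode(payload))
```

Any backend holding the secret can call `decode_session` on the cookie it receives, which is why the user does not need to hit the same machine each time.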

Load balancer always directs user to the same machine: Many load balancers may set their own session cookie, indicating which backend machine a user is making requests to, and direct them to that machine in the future. Because the user is always directed to the same machine, session sharing between multiple machines is not required. This may be good in some situations:

  • An existing application’s session handling may not need to be changed to become multi-machine aware
  • No shared database system (or similar) is required for storing sessions, possibly increasing reliability, but at the cost of added complexity in the load balancer
  • If a backend machine goes down, any user sessions started on it are lost with it.
  • Taking machines out of service is more difficult. Users with sessions on a machine that is to be taken down for maintenance should be allowed to complete their tasks before the machine is turned off. To support this, web load balancers may have a feature to “drain” requests to a certain backend machine.
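
The routing and “drain” behaviour described above can be sketched as a toy Python class. This is not how any particular load balancer product is implemented; `StickyBalancer`, its methods, and the idea of using the backend name as the cookie value are all assumptions made for illustration.

```python
import itertools
from typing import Optional, Tuple


class StickyBalancer:
    """Toy load balancer that pins each client to one backend via a cookie value."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.draining = set()
        self._counter = itertools.count()

    def drain(self, backend: str) -> None:
        """Stop assigning new clients to `backend`; pinned clients keep using it."""
        self.draining.add(backend)

    def route(self, sticky_cookie: Optional[str]) -> Tuple[str, str]:
        """Return (backend to use, cookie value to send back to the client)."""
        # A client that already carries a valid cookie stays on its machine,
        # even if that machine is draining.
        if sticky_cookie in self.backends:
            return sticky_cookie, sticky_cookie
        # New clients are spread round-robin over non-draining machines.
        available = [b for b in self.backends if b not in self.draining]
        if not available:
            raise RuntimeError("no backend available for new sessions")
        backend = available[next(self._counter) % len(available)]
        return backend, backend


lb = StickyBalancer(["web1", "web2"])
backend, cookie = lb.route(None)          # first request: no cookie yet
same_backend, _ = lb.route(cookie)        # later requests carry the cookie
lb.drain(backend)                         # maintenance: no new clients here
```

Once a machine is draining, it can be switched off after its remaining pinned users have finished; the sketch leaves out how a real balancer would detect that.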

Shared backend database or key/value store: Session information is stored in a backend database, which all of the web servers can query and update. The user’s browser stores a cookie containing an identifier (such as the session ID), pointing to the session information. This is probably the cleanest method of the three:

  • The user never needs to be exposed to the stored session information.
  • The user does not need to hit the same machine each time, so DNS load balancing can be employed
  • One disadvantage is that whichever backend storage system is employed can become a bottleneck.
  • Session information may be expired and backed up consistently.
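
A minimal Python sketch of this arrangement follows. An in-memory dictionary stands in for the shared database or key/value store, and `SessionStore`, its methods, and the 30-minute default lifetime are hypothetical choices for this example. Only the random session ID would travel in the user’s cookie.

```python
import secrets
import time


class SessionStore:
    """Stand-in for a shared database or key/value store used by all web servers."""

    def __init__(self, ttl_seconds: float = 1800.0):
        self.ttl_seconds = ttl_seconds
        self._rows = {}  # session_id -> (expires_at, data)

    def create(self, data: dict, now: float = None) -> str:
        """Store the data and return a random session ID for the cookie."""
        now = time.time() if now is None else now
        session_id = secrets.token_urlsafe(32)
        self._rows[session_id] = (now + self.ttl_seconds, dict(data))
        return session_id

    def load(self, session_id: str, now: float = None):
        """Return the session data, or None if it is unknown or has expired."""
        now = time.time() if now is None else now
        row = self._rows.get(session_id)
        if row is None:
            return None
        expires_at, data = row
        if expires_at <= now:
            del self._rows[session_id]  # expire consistently for every server
            return None
        return data


store = SessionStore(ttl_seconds=60)
sid = store.create({"user": "alice"}, now=0)   # handled by one web server
session = store.load(sid, now=30)              # any other server can read it
```

Because every web server talks to the same store, expiry is decided in one place, at the cost of a store lookup on each request.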

Overall, most dynamic web applications perform a number of database queries or key/value store requests, so the database or key/value store is the logical storage location of session data.
