Sharing the Load

Load sharing with DNS

by Andrew Kessler
Mar. 27, 1998

Publicity for a Web site can be a boon or a disaster. When a site is featured on a television program or in a newspaper article, it can easily be overwhelmed with requests. That kind of response can bring even the most powerful Web server to its knees. One way to prevent this problem is called "load sharing."

In this article, we will examine two methods of managing traffic to your site, one by using the Domain Name System (DNS), and the other by adding an active load balancing product to your existing server system.

Load sharing, or load balancing, is the concept of distributing service requests over multiple resources to avoid congestion and bottlenecks. Applied to Web service, this usually means a system of several mirrored Web servers, each configured to respond directly to the same requests.

The ideal solution is to have a single address for your Web server that's visible to customers -- say, www.yourcompany.com. This name is actually a "virtual" Web server that seamlessly connects users to one of several mirrored servers. This can be accomplished in a number of ways.

Load Sharing with DNS

One of the most common ways to do this is by using the Domain Name System (DNS). DNS is an Internet service that translates domain names into IP addresses. Domain names are alphabetic and easy for humans to remember (like www.webreview.com), but information on the Internet is delivered using IP addresses. Every time you use a URL that contains a domain name, DNS translates the name into an IP address. For example, www.webreview.com would be translated into 208.201.239.35.
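
You can watch this translation happen with the nslookup utility included with most operating systems. The query below is only a sketch; the exact output varies by platform, and just the relevant lines are shown:

  % nslookup www.webreview.com
  Name:    www.webreview.com
  Address: 208.201.239.35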

On the system side, DNS allows you to enter multiple IP addresses for the same Web server name. Each time a request is made to resolve that name, DNS cycles through the available IP addresses in a round-robin fashion. This is how load sharing is accomplished.

Below is an example that uses DNS to distribute HTTP requests over three Web servers. A company whose Web site is www.yourcompany.com has set up three mirrored servers: www1.yourcompany.com, www2.yourcompany.com, and www3.yourcompany.com.

Figure 1. Round-robin DNS distributing requests for www.yourcompany.com among the three mirrored servers.

At the first request for www.yourcompany.com, DNS will reply with the address of www1 (128.1.1.1). The second request will be answered with www2 (128.1.1.2), the third with www3 (128.1.1.3), and for the fourth, DNS will cycle back to www1. This is the round-robin approach at work.

One of the most common implementations of DNS is the Berkeley Internet Name Domain (BIND). Below is an example of the DNS Resource Records entered for three mirrored servers at www.yourcompany.com:

  www.yourcompany.com  IN  A  128.1.1.1
  www.yourcompany.com  IN  A  128.1.1.2
  www.yourcompany.com  IN  A  128.1.1.3

Depending on the size of your organization, you might maintain your own DNS server or have it maintained by your Internet Service Provider (ISP). The administrator of the DNS service is responsible for adding the necessary addresses. One of the best sources of information on this topic is DNS and BIND, from O'Reilly & Associates.

The obvious advantage of using DNS for load distribution is that it is seamless to the user and simple to implement. There are, however, some limitations.

Problems and Limitations

Imagine that the hard disk on www2 suddenly fails. The server is getting read errors every time it tries to access an HTML file. For our example, let's say that the failure is so severe that the server completely stops responding to requests.

The DNS has no way of knowing about the failure. As requests come in for www.yourcompany.com, the DNS will continue to send one out of every three requests to www2 -- which, of course, will fail. Effectively, 33% of all requests to www.yourcompany.com are now connecting to a black hole. This is still better than having a single Web server and losing every request to the hardware failure, but only to a degree.

In this failure state, the end users might be able to get around the problem. When they notice the server isn't responding, they could hit Stop and then Reload in their browser. This might cause another request to be sent out to the DNS. With some luck, the new request will be directed to www1 or www3. The success of this technique depends on how long the end user's computer holds onto the IP address for www.yourcompany.com. Their machine might cache the IP address and never make another request for that session. The IP address could also be cached at intermediate name servers along the way -- which is the next problem we will discuss.

It is very common for a computer to request data from the same host several times in a given session. It is also common for many hosts at the same site to make requests of the same servers. Usually, these requests go to a local name server, which in turn asks another name server to resolve the domain. To minimize network traffic, these responses are cached -- possibly on each name server along the way. This helps the network respond quickly with domain name resolution, but it can also defeat the round-robin load distribution.

There is, however, a solution to this problem. The DNS protocol includes a time to live (TTL) for each entry. The TTL is the maximum time that the information should be considered reliable. By setting the TTL to a small value -- 30 seconds, for example -- you can maximize the effectiveness of the distribution pattern. The protocol requires that local and intermediate DNS servers drop these entries from their caches when the TTL expires.
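
In a BIND zone file, one way to express this is to put an explicit 30-second TTL on each of the Resource Records shown earlier (a sketch -- your zone may set default TTLs elsewhere):

  www.yourcompany.com  30  IN  A  128.1.1.1
  www.yourcompany.com  30  IN  A  128.1.1.2
  www.yourcompany.com  30  IN  A  128.1.1.3

The trade-off is extra lookup traffic to your name server, since cached answers expire quickly.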

Load sharing can also be useful when performing system maintenance on a Web server. When you know you are going to take down one of the servers, it can be done gracefully. First, remove the server (www2, for example) from the DNS. This might require resetting the name server, but it can be done without any loss of service. After a short period of time, www2 should not be servicing any HTTP requests -- remember the caching problem discussed above. Now www2 can be taken out of service for maintenance. When the work is complete, just add www2 back into the DNS and you're back in business.
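
In terms of the zone file above, taking www2 out of rotation is just a matter of removing (or commenting out) its A record and reloading the name server -- a sketch using the same addresses as before:

  www.yourcompany.com  IN  A  128.1.1.1
  ; www.yourcompany.com  IN  A  128.1.1.2   ; down for maintenance
  www.yourcompany.com  IN  A  128.1.1.3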

Alternatively, you could bring up a virtual interface on a different Web server (say www1) with the same IP address as www2. If you coordinate it properly, you should be able to take down www2 and bring up the virtual interface without losing service. When the maintenance on www2 is complete, just reverse the process.
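
How you create the virtual interface depends on the operating system; on many Unix systems it is an interface alias. The commands below are only a sketch, assuming a Linux-style ifconfig, an interface named eth0, and the addresses from our example:

  # On www1, answer for www2's address while www2 is down.
  # The interface name and netmask are assumptions -- adjust for your system.
  ifconfig eth0:0 128.1.1.2 netmask 255.255.255.0 up

  # When maintenance on www2 is finished, remove the alias.
  ifconfig eth0:0 down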

