Clustering Tomcat Servlet Engines is interesting for two reasons: load balancing and failover.
Failover is probably the most important issue for web applications. If the front-end load balancer detects that one of the nodes has gone down, it will redirect all traffic to the second instance and your clients, apart from any on the failed node, won’t even notice. Well, that’s the theory: if the site is under high load you may find that the surviving Tomcat cannot cope with the increased load and fails in short order. You need to size and configure your hardware and software correctly.
You can also fail over a system manually to upgrade the hardware, JVM or other application without end users being aware. If your management doesn’t tolerate any downtime this is an extremely important feature. Your JVMs can be installed on separate physical hosts or on the same one. The latter leaves you exposed to hardware problems.
There are two easy ways to increase application performance without tuning the application software itself: buy faster hardware or add more hardware. If you add a second server running Tomcat you can use load balancing to distribute the work between the two servers. Requests come in to the Apache front end, static content is served directly by Apache and any dynamic requests are forwarded to the Tomcats based on some algorithm. The Apache Jakarta mod_jk connector is the most popular way of doing this.
In this exercise we will configure two physical computers, each running one Apache web server and two Tomcats, for both failover and load balancing. The example ignores any back-end data storage considerations but will discuss load balancing between the two computers.
The simplest technique for load balancing between the two Apache front-end servers is the DNS round robin strategy. This performs load balancing by returning a different server IP address each time a (DNS) name server is queried by a client. It is usually found as a front-end service on large clusters of servers to provide a server address that is geographically close to the client. Normally an intelligent name server such as lbnamed is used. This will poll the web servers in the list to check if they are online and not overloaded.
DNS load balancing has some significant drawbacks. Addresses may be cached by clients and local DNS resolvers, and it can be hard to manage load accurately. A plain round robin name server simply shuffles the order of the address records each time it is queried. No account is taken of real server load or network congestion.
The advantage of using a load balancer compared to using round robin DNS is that it takes account of the load on the web server nodes and will direct requests to the node with the least load. It can also remove failed nodes from the list. Some round robin DNS software can do this too; the problem is that it has little control over how clients cache IP addresses.
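The behaviour described above can be modelled in a few lines. This is only a toy sketch, not a real name server: the class name `RoundRobinResolver` and the two addresses (taken from the documentation range 192.0.2.0/24) are assumptions for illustration, and rotation stands in for whatever reordering a real server applies.

```python
from collections import deque


class RoundRobinResolver:
    """Toy model of a round-robin name server (hypothetical, not real DNS)."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def resolve(self):
        """Return the records in their current order, then rotate them."""
        answer = list(self._records)
        self._records.rotate(-1)  # move the first address to the back
        return answer


# Hypothetical addresses for the two Apache front ends.
resolver = RoundRobinResolver(["192.0.2.10", "192.0.2.20"])
first = resolver.resolve()
second = resolver.resolve()
third = resolver.resolve()
print(first[0], second[0], third[0])
```

Note that the sketch never looks at server load or health, which is exactly the drawback: a failed or overloaded node keeps receiving its share of clients.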
Many web applications (e.g. forum software, shopping carts, etc.) make use of sessions. If you are in a session on Apache node 1, you would lose that session if suddenly node 2 served your requests. This assumes that the web servers don’t share their session data, say via a database.
For DNS load balancing this is not a problem, because the name will resolve to a single server for any given client. However, a front-end load balancer must be aware of sessions and always redirect requests from the same user to the same back-end server.
There are a number of packages that support front-end load balancing, for example RedHat Piranha or Ultramonkey. There are also hardware/firmware solutions from companies such as Cisco.
Apache is configured using the httpd.conf file. Apache-Tomcat workers are defined in a properties file called workers.properties. When starting up, the web server plug-in will instantiate the workers whose names appear in the worker.list property; these are also the workers to which you can map requests. Tomcat is configured using the server.xml file.
The values given here are a starting point. They will need to be reviewed and tuned as system load changes.
MaxClients limits the maximum simultaneous connections that Apache can serve. This value should be tuned carefully based on the amount of memory available on the system.
MaxClients ≈ (RAM - size_all_other_processes)/(size_apache_process(es))
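As a rough worked example of this estimate, the sketch below applies the formula with integer megabytes. The function name and all the figures (4096 MB of RAM, 1024 MB for other processes, 12 MB per Apache process) are hypothetical; measure your own system before choosing a value.

```python
def estimate_max_clients(total_ram_mb, other_processes_mb, apache_process_mb):
    """Apply MaxClients ~ (RAM - other processes) / size of one Apache process."""
    if apache_process_mb <= 0:
        raise ValueError("apache_process_mb must be positive")
    available_mb = total_ram_mb - other_processes_mb
    # Round down: a fractional process cannot be started.
    return max(available_mb // apache_process_mb, 0)


# Hypothetical figures, for illustration only.
max_clients = estimate_max_clients(
    total_ram_mb=4096, other_processes_mb=1024, apache_process_mb=12
)
print(max_clients)  # 256
```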
If Apache is using the Prefork MPM a separate Apache process will be created for each connection, so the total number of processes will be MaxClients + 1. Each process will have a single connection in the mod_jk pool to talk to a Tomcat instance. The Worker MPM uses threads to handle client requests and should be slightly more efficient.
As a guide Apache MaxClients and the number of available Tomcat threads configured in server.xml should be equal. Mod_jk maintains persistent connections between Apache and Tomcat. If the Apache front end accepts more users than Tomcat can handle you will see connection errors. In a clustered environment we have to take account of errors the load balancer makes when it tracks the number of concurrent threads to the Tomcat back-ends so the following formula is a starting point:
Apache.MaxClients = 0.8 * (Tomcat-1.maxThreads + … Tomcat-n.maxThreads)
400 = 0.8 * (250 + 250)
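The same starting-point calculation can be written as a small helper. The function name `apache_max_clients` is an assumption; the 80% factor and the two 250-thread Tomcats come from the formula above. Integer arithmetic is used so the result is always a whole number of clients.

```python
def apache_max_clients(tomcat_max_threads, headroom_percent=80):
    """Starting value for MaxClients: a percentage of all Tomcat maxThreads."""
    total_threads = sum(tomcat_max_threads)
    return total_threads * headroom_percent // 100


two_tomcats = apache_max_clients([250, 250])
print(two_tomcats)  # 400
```

Treat the result as a first guess to be revised under real load, in line with the tuning advice above.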
The Sezame Apache front end also serves static content and a number of client connections will be used for this purpose for each requested page of content.
MaxRequestsPerChild sets the total number of requests that each Apache process will handle before it quits. This is useful for containing memory leaks in loaded modules. Because mod_jk maintains permanent connections in its pool between Apache and Tomcat, we will set this to zero, which means that a process can handle an unlimited number of requests.
We will need to load the mod_jk module (a shared library) into Apache when it starts and tell the module where to find the workers.properties file and where to write logs (useful for debugging/crashes).
LoadModule jk_module modules/mod_jk-apache-2.0.55.so
JkWorkersFile "/etc/apache2/workers.properties"
JkLogFile "/var/logs/apache/mod_jk.log"
JkLogLevel info
JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
We then have to tell Apache to redirect any requests for JSPs and servlets (in this case resources in the servlet virtual directory ending in .do) to the load-balancer for processing.
<VirtualHost *:80>
  …
  LoadModule rewrite_module modules/mod_rewrite.so
  Alias /image/ "/var/localhost/htdocs/images/"
  # Delegate the following paths to Tomcat
  JkMount /*.jsp loadbalancer
  JkMount /servlet/*.do loadbalancer
</VirtualHost>
The following is a suggested configuration for two physical servers with a pair of Tomcat instances running on each. The rationale is that the front-end load balancer will distribute load between the two physical hosts. The front-end load balancer is itself configured for failover using a virtual IP address and a heartbeat-type operation to determine if the balancer is still operational.
Each physical server has an Apache web server and two Tomcat servlet engines. Apache will serve static content either from an NFS-mounted disk or from a replicated file system using something like RepliWeb. Apache will load balance between the two local Tomcats, but if these become overloaded or crash it can forward requests on to the other physical server.
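The Apache mod_jk load-balancer can distribute work using one of three methods, selected with the method property: by number of requests (R), by network traffic (T) or by busyness (B), which picks the worker currently serving the fewest requests. This configuration uses B.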
The sticky_session is set to true to maintain a session between the client and Tomcat.
worker.list=loadbalancer
worker.loadbalancer.type=lb
worker.loadbalancer.method=B
worker.loadbalancer.balance_workers=worker1,worker2,worker3,worker4
worker.loadbalancer.sticky_session=true
This is the common configuration shared by all workers. Note that if the Prefork MPM is used you will want to set the cachesize property, which should reflect the estimated average number of concurrent users for the Tomcat instance. cache_timeout should be set so that cached sockets are closed, which reduces the number of idle threads on the Tomcat server. With mod_jk 1.2.16 and later, connection_pool_size and connection_pool_timeout should be used.
If there is a firewall between Apache and Tomcat the socket_keepalive property should be used. Socket_timeout will tell Apache to close an ajp13 connection after some inactivity time. This also reduces idle threads on Tomcat.
# common config
worker.basic.type=ajp13
worker.basic.socket_timeout=300
worker.basic.lbfactor=5
These are the four workers, two on each server. Notice the use of the local_worker property. The load-balancer will always favour workers that have local_worker set to 1.
worker.worker1.host=server1
worker.worker1.port=8009
worker.worker1.local_worker=1
worker.worker1.reference=worker.basic

worker.worker2.host=server1
worker.worker2.port=8010
worker.worker2.local_worker=1
worker.worker2.reference=worker.basic

worker.worker3.host=server2
worker.worker3.port=8009
worker.worker3.local_worker=0
worker.worker3.reference=worker.basic

worker.worker4.host=server2
worker.worker4.port=8010
worker.worker4.local_worker=0
worker.worker4.reference=worker.basic
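With four near-identical worker sections it is easy to leave one incomplete. The following is a minimal sketch of a sanity check, assuming you want every member of balance_workers to have an explicit host and port (mod_jk does fall back to defaults when they are omitted). The helper names `parse_properties` and `missing_worker_settings` are hypothetical, and the sample text deliberately omits the port for worker4.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_worker_settings(props, balancer="loadbalancer"):
    """List host/port keys that balanced workers do not set explicitly."""
    members = props.get(f"worker.{balancer}.balance_workers", "")
    missing = []
    for name in (m.strip() for m in members.split(",") if m.strip()):
        for setting in ("host", "port"):
            key = f"worker.{name}.{setting}"
            if key not in props:
                missing.append(key)
    return missing


# Sample input for illustration; worker4 has no port on purpose.
SAMPLE = """\
worker.list=loadbalancer
worker.loadbalancer.balance_workers=worker1,worker4
worker.worker1.host=server1
worker.worker1.port=8009
worker.worker4.host=server2
"""

problems = missing_worker_settings(parse_properties(SAMPLE))
print(problems)  # ['worker.worker4.port']
```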
HTTP access is only required by administrators, who can check whether a Tomcat instance is still running by connecting directly to the specified port. The firewall should block client access to this port.
<Connector port="8899" minSpareThreads="5" maxThreads="10" enableLookups="false" redirectPort="8443" acceptCount="10" debug="0" connectionTimeout="60000"/>
The Tomcat instance must listen on the same port as is specified in the corresponding worker’s section in workers.properties.
<Connector port="8009" enableLookups="false" redirectPort="8443" protocol="AJP/1.3" maxThreads="400" minSpareThreads="50" maxSpareThreads="300" />
maxThreads should be set high enough that the CPU can be driven up to 100% load without the JVM logging error messages about insufficient threads.
The Engine jvmRoute property must match the worker name in the workers.properties file, or the load balancer will not be able to handle stickiness.
<Engine name="Catalina" defaultHost="localhost" jvmRoute="worker1">
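The reason the names must match is that Tomcat appends the jvmRoute to each session ID after a dot, and the balancer reads that suffix to find the worker that owns the session. The sketch below is a simplified model of that lookup, not mod_jk's actual code; the function name, the sample session IDs and the fallback worker are all hypothetical.

```python
def route_for_session(session_id, workers, fallback):
    """Pick the worker named by the route suffix of a session ID, if any."""
    _, dot, route = session_id.partition(".")
    if dot and route in workers:
        return route
    # No suffix, or a suffix that matches no worker: stickiness is lost
    # and the balancer has to choose a worker some other way.
    return fallback


workers = {"worker1", "worker2", "worker3", "worker4"}
sticky = route_for_session("A1B2C3D4E5.worker1", workers, fallback="worker2")
unrouted = route_for_session("A1B2C3D4E5", workers, fallback="worker2")
mismatch = route_for_session("A1B2C3D4E5.tomcatA", workers, fallback="worker2")
print(sticky, unrouted, mismatch)  # worker1 worker2 worker2
```

The last call shows what a mismatched jvmRoute looks like from the balancer's side: the suffix is present but names no known worker, so the request can land on a Tomcat that has never seen the session.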