PHP Clustering using Apache httpd mod_proxy
For readers who are new to scaling terminology, here is a quick summary:
1. Vertical Scaling – Also called scaling up, it means adding hardware resources to a single node of the server stack, e.g. adding more memory or CPU to existing hardware, increasing disk space, increasing the number of processes, etc.
2. Horizontal Scaling – Also called scaling out, it means increasing the number of nodes in the existing deployment, e.g. adding more application servers or database servers to the existing stack, creating server clusters, etc.
In this post, I will explain how to apply horizontal scaling to a LAMP server, i.e. how to set up a PHP cluster to improve the performance and load-handling capability of PHP applications.
Overview of a PHP Cluster
Generally, a PHP cluster is composed of multiple web servers, each running PHP individually, which are load-balanced by either a hardware or a software load balancer. Common examples of hardware load balancers are F5 BigIP, Nortel’s (Alteon), AceDirector AD3 etc., and examples of software load balancers are Apache httpd server, Nginx etc.
A request coming from a client is first received by the load balancer, which forwards it to an available server node based on some algorithm such as round robin, biased, random etc.
In this post, I will be using Apache httpd as the software load balancer to distribute load on different PHP nodes.
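To make the selection algorithms mentioned above a little more concrete, here is a minimal Python sketch of round-robin and random node selection. It is only an illustration of the idea, not how Apache httpd implements its balancer; the function names are hypothetical and the node names are borrowed from the a1/a2 naming used later in this post.

```python
import itertools
import random

# Node names follow the a1/a2 naming used in this post.
NODES = ["a1", "a2"]


def round_robin(nodes):
    """Return an endless iterator that cycles through the nodes in order."""
    return itertools.cycle(nodes)


def pick_random(nodes, rng=random):
    """Pick any node with equal probability."""
    return rng.choice(nodes)


# Show which node each of four consecutive requests would go to.
picker = round_robin(NODES)
for request_id in range(1, 5):
    print(f"request {request_id} -> {next(picker)}")
```

With two nodes, round robin simply alternates between them, which is roughly the behaviour you should observe when testing the cluster later on.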
Requirements for PHP Cluster
To set up a PHP cluster, the following are required:
1. Three Linux servers – One server as the load balancer and the other two as the PHP application nodes. They can run any Linux OS. We will be using the CentOS 5.3 64-bit edition on all three. From now on, we will refer to the load balancer node as LB and to the application nodes as a1 and a2 respectively.
2. Apache httpd v2.2 with mod_rewrite, mod_proxy – Apache httpd should be installed on all three servers with support for the mod_rewrite, mod_proxy and mod_proxy_http modules. The LB server also needs mod_proxy_balancer, which provides the balancer:// scheme used in the configuration below.
3. PHP – PHP should be installed on the two Linux servers acting as application nodes (a1 and a2). You can refer to my earlier post, The Perfect LAMP Stack, to set up the application nodes. We will use PHP only for testing our cluster setup.
Theory of Operation
In this setup, the load balancer (LB) acts as a reverse proxy in front of the application nodes (a1 and a2). When a client sends a request to our server, the LB passes it to one of the application nodes based on the method we define in the Apache httpd configuration.
Setting up PHP Cluster
After installing the applications mentioned above, all that remains to set up the PHP cluster is to create the Apache httpd configuration on the LB server that makes it act as a reverse proxy.
Open httpd-vhosts.conf and add the following configuration inside the VirtualHost block:
1. ServerName holds the URL or IP address of the load balancer (LB) server.
2. DocumentRoot is the path to your publicly available documents.
3. ProxyRequests Off prevents your load balancer from acting as a forward proxy server.
Note: When mod_proxy is compiled into Apache httpd, the server can act either as a forward proxy or as a reverse proxy. The setting above is only a precaution against the former.
4. The <Proxy> directive lets you enforce rules on the proxied content. Here you can set the constraints you want to apply, e.g. you can allow only a few IP ranges to access the load balancer. In our case we allow access from all, as this load balancer faces the public.
<Proxy *>
    Order deny,allow
    Allow from all
</Proxy>
5. The ProxyPass directive configures the URL mapping from the reverse proxy server to the application nodes. You can also configure which URLs should be translated and sent to the backend and which should not. In the following settings, a request for /balancer-manager is not sent to the backend (the ! target excludes it), whereas all other requests are forwarded there. Rules are checked in the order they are written, so the exclusion has to come before the general rule. The balancer://mycluster/ target names the cluster that will receive the load, which we will configure in the next step. The stickysession parameter is used here for sessions, which I will explain in a later post. The nofailover parameter is also related to sessions.
ProxyPass /balancer-manager !
ProxyPass / balancer://mycluster/ stickysession=PHPSESSID nofailover=On
6. Once again, we define a <Proxy> directive, but this time we use it to hold our balancer configuration. The BalancerMember directive adds the application nodes to the cluster; in our configuration, we add our two application nodes a1 and a2. The route parameter gives each node a name. For sticky sessions, the balancer expects this name as a suffix of the session id (after a dot) so that it can send the request back to the same node. The ProxySet directive defines additional balancer configuration parameters. In our case, we set the load balancing method in the lbmethod parameter, and that is byrequests. The other available methods are bytraffic and bybusyness. To know more about load balancing methods, please refer to http://httpd.apache.org/docs/2.1/mod/mod_proxy_balancer.html
<Proxy balancer://mycluster>
    BalancerMember http://a1.example.com route=a1
    BalancerMember http://a2.example.com route=a2
    ProxySet lbmethod=byrequests
</Proxy>
7. Next, we define the settings for the balancer-manager, which lets us view and dynamically update the balancer members, i.e. the application nodes. You need to enable the mod_status module in Apache httpd to use the balancer-manager.
<Location /balancer-manager>
    SetHandler balancer-manager
    Order deny,allow
    Allow from all
</Location>
Note: We have explained only the very basic settings needed to set up Apache httpd as a reverse proxy load balancer. You can fine-tune the settings according to your needs. To know more about the available options, please refer to http://httpd.apache.org/docs/current/mod/mod_proxy.html
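The ProxyPass mapping described in step 5 can be sketched in a few lines of Python. This is a simplified model under the assumption of plain prefix matching with the first matching rule winning; the RULES list and the resolve function are hypothetical names for illustration and are not part of Apache httpd.

```python
# Rules in the same order as the ProxyPass lines in this post.
# None stands for the "!" target, meaning "do not proxy this path".
RULES = [
    ("/balancer-manager", None),
    ("/", "balancer://mycluster/"),
]


def resolve(path, rules=RULES):
    """Return the proxied target for path, or None if it is handled locally."""
    for prefix, target in rules:
        if path.startswith(prefix):
            if target is None:
                return None  # excluded from proxying
            # Swap the matched prefix for the backend target.
            return target + path[len(prefix):]
    return None  # no rule matched


for path in ("/test.php", "/balancer-manager"):
    print(path, "->", resolve(path))
```

If the two rules were listed in the opposite order, the general / rule would match first and /balancer-manager would be forwarded to the backend, which is why the exclusion is written first.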
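The complete settings explained above can be seen below. They also include ProxyPassReverse lines, which rewrite the Location headers of redirects issued by the application nodes so that they point back to the load balancer: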
NameVirtualHost *:80
<VirtualHost *:80>
    ServerName lb.example.com
    DocumentRoot /usr/local/apache2/htdocs/
    ProxyRequests Off

    <Proxy *>
        Order deny,allow
        Allow from all
    </Proxy>

    ProxyPass /balancer-manager !
    ProxyPass / balancer://mycluster/ stickysession=PHPSESSID nofailover=On
    ProxyPassReverse / http://a1.example.com/
    ProxyPassReverse / http://a2.example.com/

    <Proxy balancer://mycluster>
        BalancerMember http://a1.example.com route=a1
        BalancerMember http://a2.example.com route=a2
        ProxySet lbmethod=byrequests
    </Proxy>

    <Location /balancer-manager>
        SetHandler balancer-manager
        Order deny,allow
        Allow from all
    </Location>
</VirtualHost>
Save httpd-vhosts.conf after making the above settings and restart the server. That completes our task of setting up Apache httpd as a reverse proxy load balancer.
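Since sticky sessions are left for a later post, here is only a rough Python sketch of the idea behind stickysession=PHPSESSID and the route values in the configuration above. It assumes the session id carries the route name as a suffix after a dot (the value abc123.a1 below is made up), and the MEMBERS table and sticky_member function are hypothetical names for illustration, not part of Apache httpd.

```python
# Route name -> backend URL, mirroring the BalancerMember lines in this post.
MEMBERS = {
    "a1": "http://a1.example.com",
    "a2": "http://a2.example.com",
}


def sticky_member(session_id, members=MEMBERS):
    """Return the backend a session is pinned to, or None if it is not pinned.

    Assumes the route is whatever follows the first dot in the session id.
    """
    if not session_id or "." not in session_id:
        return None  # no route suffix: the balancer is free to pick any node
    route = session_id.partition(".")[2]
    return members.get(route)  # unknown route names also yield None


print(sticky_member("abc123.a1"))  # pinned to the a1 node
print(sticky_member("abc123"))     # no route information
```

A session id without a route suffix gives the balancer nothing to pin on, in which case requests are simply distributed by the configured lbmethod.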
Testing PHP Cluster
Now, to test your PHP cluster setup, create a file test.php on each of your application nodes a1 and a2. The source code for the file is very simple, as shown below:
For a1 node,
<?php echo 'Hi, I am from http://a1.example.com'; ?>
For a2 node,
<?php echo 'Hi, I am from http://a2.example.com'; ?>
After creating these files, load http://lb.example.com/test.php through your load balancer several times. If your server setup is correct, each request will show one of the following outputs:
For node a1,
Hi, I am from http://a1.example.com
For node a2,
Hi, I am from http://a2.example.com
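Instead of reloading the page by hand, the same check can be scripted. The following Python sketch requests the test page a number of times and counts how often each distinct response comes back. The URL is the one used in this post; the function names and the default of 10 attempts are assumptions made for illustration.

```python
from collections import Counter
from urllib.request import urlopen

# Test page served through the load balancer, as used in this post.
URL = "http://lb.example.com/test.php"


def fetch(url):
    """Fetch a URL and return the response body as stripped text."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8", errors="replace").strip()


def tally(fetcher, url=URL, attempts=10):
    """Call fetcher(url) repeatedly and count each distinct response body."""
    return Counter(fetcher(url) for _ in range(attempts))


def main():
    """Print how many responses came from each node."""
    for body, count in tally(fetch).items():
        print(f"{count:3d} x {body}")
```

Calling main() against a working cluster should list both messages. If only one message ever appears, the second node is probably not reachable from the load balancer, or requests are being pinned to one node.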
To see how your load balancer is performing, and to check that you have set it up as explained earlier, you can visit http://lb.example.com/balancer-manager, which lists the balancer members and their current status.
That's all. To add more application nodes to the load balancer, you only have to add a BalancerMember line with the node's URL, as configured above. Post any questions if you experience problems in setting it up. All comments and suggestions are welcome.