

PHP Clustering using Apache httpd mod_proxy

Often for my clients, I have to prepare the deployment strategy for their LAMP-based web applications. Some of them are small to medium businesses that are just starting up, so a single-server setup works for them. But there are a few large web applications too, which keep growing in terms of users and need to scale either horizontally or vertically.

For people who are new to these scaling terms, here is a quick overview:

1. Vertical Scaling – Also called scaling up, it means adding hardware resources to a single node of the server stack, e.g. adding more memory or CPU to the existing hardware, increasing disk space, increasing the number of processes etc.

2. Horizontal Scaling – Also called scaling out, it means increasing the number of nodes in the existing deployment, e.g. adding more application servers or db servers to the existing stack, creating server clusters etc.

In this post, I will explain how to apply horizontal scaling to a LAMP server, i.e. setting up a PHP cluster to improve the performance and load-handling capability of PHP applications.

Overview of a PHP Cluster

Generally, a PHP cluster is composed of multiple web servers, each running PHP, which are load-balanced by either a hardware or a software load balancer. Common examples of hardware load balancers are F5 BigIP, Nortel’s (Alteon), AceDirector AD3 etc., and examples of software load balancers are Apache httpd server, Nginx etc.

[Figure: load balancer]

A request coming from a client is first received by the load balancer, which then forwards it to an available server node based on an algorithm such as round robin, biased, random etc.

In this post, I will be using Apache httpd as the software load balancer to distribute load on different PHP nodes.

Requirements for PHP Cluster

To set up a PHP cluster, the following are required:

1. Three Linux servers – One server as the load balancer and the other two as the PHP application nodes. They can run any Linux distribution. We will be using the 64-bit edition of CentOS 5.3 on all three. From now on, we will refer to the load balancer node as LB and to the application nodes as a1 and a2 respectively.

2. Apache httpd v2.2 with mod_rewrite, mod_proxy – Apache httpd should be installed on all three servers with support for the mod_rewrite, mod_proxy and mod_proxy_http modules. The LB server additionally needs mod_proxy_balancer, which provides the balancer:// scheme used in the configuration below.

3. PHP – PHP should be installed on the two Linux servers acting as application nodes (a1 and a2). You can refer to my earlier post, The Perfect LAMP Stack, to set up the application nodes. We will use PHP only for testing our cluster setup.
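If your Apache httpd build ships these modules as shared objects, they are typically enabled with LoadModule lines in httpd.conf on the LB server. The following is only a sketch, assuming the default modules/ directory; if the modules were compiled statically into httpd, these lines are not needed.

```apache
# Sketch: modules the LB server needs when they are built as shared objects.
# The modules/ path is an assumption; adjust it to your installation.
LoadModule proxy_module          modules/mod_proxy.so
LoadModule proxy_http_module     modules/mod_proxy_http.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
# Needed for the balancer-manager page
LoadModule status_module         modules/mod_status.so
```

Running httpd -M lists the modules that are already loaded, which is a quick way to check before editing anything.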

Theory of Operation

In this setup, the load balancer (LB) acts as a reverse proxy in front of the application nodes (a1 and a2). When a client sends a request to our server, the LB passes it to one of the application nodes based on the method we define in the Apache httpd configuration.

Setting up PHP Cluster

After installing the applications mentioned above, all that remains to set up the PHP cluster is to create the Apache httpd configuration on the LB server that makes it act as a reverse proxy.

Open httpd-vhosts.conf and add the following configuration inside the VirtualHost block:

1. ServerName is the hostname or IP of the load balancer (LB) server.


ServerName lb.example.com

2. DocumentRoot is the path to your publicly available documents.


DocumentRoot /usr/local/apache2/htdocs/

3. ProxyRequests Off prevents your load balancer from acting as a forward proxy server.

Note: mod_proxy lets Apache httpd act as either a forward proxy or a reverse proxy. A reverse proxy does not need forward proxying, so the setting above simply keeps it disabled as a precaution.


ProxyRequests Off

4. The <Proxy> directive lets you enforce rules on the proxied content. Here you can set the constraints you want to apply, e.g. you can allow only a few IP ranges to access the load balancer. In our case we are going to allow access from all, as this load balancer faces the public.

<Proxy *>
 Order deny,allow
 Allow from all
</Proxy>
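For comparison, here is a sketch of the restricted variant mentioned above, where only one IP range may use the proxy. The 192.168.0.0/24 range is a made-up example, not part of our setup.

```apache
# Sketch: allow only one (hypothetical) internal range to use the proxy.
# With "Order deny,allow", Deny rules are evaluated first and Allow rules can override them.
<Proxy *>
    Order deny,allow
    Deny from all
    Allow from 192.168.0.0/24
</Proxy>
```

The same Order/Deny/Allow pattern can be used inside the <Location /balancer-manager> block shown later, which is a sensible thing to lock down on a public server.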

5. The ProxyPass directive configures the URL mapping from the reverse proxy server to the application nodes. You can also configure which URLs should be translated and sent to the backend and which should not. In the following settings, requests for /balancer-manager are not sent to the backend (the ! excludes that path), whereas all other requests are forwarded there. The exclusion has to come before the general / mapping, because ProxyPass rules are checked in the order of configuration and the first match wins. balancer://mycluster/ names our cluster, to which the load is sent and which we will configure in the next step. The stickysession parameter is used here for sessions, which I will explain in a later post. The nofailover parameter is also related to sessions.

ProxyPass /balancer-manager !
ProxyPass / balancer://mycluster/ stickysession=PHPSESSID nofailover=On

6. Once again we define a <Proxy> block, but this time we use it to house our balancer configuration. The BalancerMember directive adds the application nodes to the cluster; in our configuration, we add our two application nodes a1 and a2. The value of the route parameter is the one appended to the session id. The ProxySet directive defines additional balancer configuration parameters. In our case, we use it to set the load balancing method through the lbmethod parameter, and that is byrequests. The other available methods are bytraffic and bybusyness. To know more about load balancing methods, please refer to http://httpd.apache.org/docs/2.1/mod/mod_proxy_balancer.html

<Proxy balancer://mycluster>
BalancerMember http://a1.example.com  route=a1
BalancerMember http://a2.example.com  route=a2
ProxySet lbmethod=byrequests
</Proxy>
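Sticky sessions depend on the balancer finding a route name in the request. As far as I know, PHP's default session ids do not carry such a route suffix, so one option is to let the load balancer set a routing cookie of its own. The following is a sketch of that pattern, adapted from the example in the mod_proxy_balancer documentation. It assumes mod_headers is loaded on LB, and the cookie name ROUTEID is just a conventional choice.

```apache
# Sketch only: requires mod_headers on the LB server.
# Set a ROUTEID cookie whenever the balancer picks (or changes) the node for a client.
Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED

<Proxy balancer://mycluster>
    BalancerMember http://a1.example.com route=a1
    BalancerMember http://a2.example.com route=a2
    # Stick on the ROUTEID cookie instead of PHPSESSID
    ProxySet lbmethod=byrequests stickysession=ROUTEID
</Proxy>

ProxyPass /balancer-manager !
ProxyPass / balancer://mycluster/
```

With this variant the PHP application does not have to know anything about routes; the cookie alone keeps a client on the node that served it first.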

7. Next, we define the settings of the balancer-manager, which lets us view and dynamically update the balancer members, i.e. the application nodes. You need to enable the mod_status module in Apache httpd to use the balancer-manager. Since Allow from all leaves the manager open to everyone, restrict it to trusted addresses on a production server.

<Location /balancer-manager>
SetHandler balancer-manager
Order deny,allow
Allow from all
</Location>

Note: We have explained only the very basic settings needed to make Apache httpd act as a reverse proxy load balancer. You can fine-tune the settings according to your needs; to know more about the available options, please refer to http://httpd.apache.org/docs/current/mod/mod_proxy.html

The complete configuration explained above is shown below. It also contains ProxyPassReverse lines, which rewrite the Location headers of redirects issued by the application nodes so that clients keep talking to the load balancer:


NameVirtualHost *:80
<VirtualHost *:80>
 ServerName lb.example.com
 DocumentRoot /usr/local/apache2/htdocs/
 ProxyRequests Off

 <Proxy *>
 Order deny,allow
 Allow from all
 </Proxy>

 ProxyPass /balancer-manager !
 ProxyPass / balancer://mycluster/ stickysession=PHPSESSID nofailover=On
 ProxyPassReverse / http://a1.example.com/
 ProxyPassReverse / http://a2.example.com/
 <Proxy balancer://mycluster>
 BalancerMember http://a1.example.com  route=a1
 BalancerMember http://a2.example.com  route=a2
 ProxySet lbmethod=byrequests
 </Proxy>

 <Location /balancer-manager>
 SetHandler balancer-manager
 Order deny,allow
 Allow from all
 </Location>
</VirtualHost>

Save httpd-vhosts.conf after making the above settings and restart the server. That completes our task of setting up Apache httpd as a reverse proxy load balancer.

Testing PHP Cluster

Now, to test your PHP cluster setup, create a file test.php on each of your application nodes a1 and a2. The source code for the file is very simple, as shown below:

For a1 node,


<?php

echo 'Hi, I am from http://a1.example.com';

?>

For a2 node,


<?php

echo 'Hi, I am from http://a2.example.com';

?>

After creating these files, load them through your load balancer URL, i.e. http://lb.example.com/test.php, several times. If your server setup is correct, you will see either of the outputs shown in the screenshots below:

For node a1,

[Screenshot: PHP application node a1]

For node a2,

[Screenshot: PHP application node a2]

To see how your load balancer is performing, and to check that it is set up as explained earlier, visit http://lb.example.com/balancer-manager. The output should look like the one shown below:

[Screenshot: Apache httpd mod_proxy balancer-manager]

That's all. To add more application nodes to the load balancer, you only have to add a BalancerMember line with the node's URL, as configured above. Post any questions if you experience any problems in setting it up. All comments and suggestions are welcome.
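As a sketch of what adding a node could look like, here is the balancer block with a third member. The node a3.example.com is hypothetical, and the loadfactor value is only there to show how a stronger machine could be given a larger share of the requests.

```apache
<Proxy balancer://mycluster>
    BalancerMember http://a1.example.com route=a1
    BalancerMember http://a2.example.com route=a2
    # Hypothetical third node; loadfactor=2 gives it roughly twice the share of a1 or a2
    BalancerMember http://a3.example.com route=a3 loadfactor=2
    ProxySet lbmethod=byrequests
</Proxy>
```

If you keep the ProxyPassReverse lines from the complete configuration, remember to add one for the new node as well.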


  • 20 responses to "PHP Clustering using Apache httpd mod_proxy"

  • Edds
    20:40 on October 22nd, 2010

    Thanx for a this post, this is the easiest way of doing load balancing that i have seen.

    worth to try!

  • Jeremy Glover
    22:43 on October 22nd, 2010

    Great post. I would be very interested to see what methods you use to push application updates to a1, a2, etc. I’ve read a lot of articles about the best way to do dev, staging, production, etc., but it seems like the setup and pushes would be difficult with a load-balanced setup.

  • Deepesh
    23:56 on October 22nd, 2010

    @Edd Thanks for the appreciation.

  • Deepesh
    23:57 on October 22nd, 2010

    @Jeremy, definitely I would be interested to cover that in my next posts.

  • Grayson
    7:07 on October 23rd, 2010

    wow this is a really killer post. first end-to-end load balancing tutorial i’ve came across. thanks a bunch.

  • dörte
    2:51 on October 24th, 2010

    @Jeremy: cvs export the system and use the power of rsync :) Your script should push files to a second directory, and then switch directories plus doing db updates and restart caches…

    Thx. Varnish does a nice job as well, you might whant to look into thatas well…

  • Slavi Marinov
    11:25 on October 24th, 2010

    Hi nice, article.

    PHPSESSIONID should be PHPSESSID unless you’ve customized it.

    just wondering do you really have to use “lb.” prefix ?
    if yes, then do you have CNAMEs for domain.com and http://www.domain.com pointing to lb.domain.com

  • john
    15:28 on October 24th, 2010

    Excellent post guys thanks for sharing! This is exactly what I was thinking about doing for a large PHP project I have in mind, and now I have a clear example to follow :-)

  • Deepesh
    9:01 on October 25th, 2010

    @Slavi, Thanks for heads up. I will edit PHPSESSIONID to PHPSESSID.

    No it is not required to setup with “lb.” prefix. I have just used it for better understanding. So, if you are using the similar setup as I wrote, yes, you have to setup the CNAMEs or redirection.

  • Glenn
    6:59 on October 29th, 2010

    Could the proxy mod be used to split requests to two different servers based on different zip codes placed in the url? So you may have requests with embedded zip codes 50000 or lower go to one server and 50000 or higher go to another? Example http://www.mysite.com/zip83992
    And can you then get apache to route all requests to one server if the other fails? Thanks for the help.

  • Deepesh
    18:46 on October 30th, 2010

    Yes, you can do it. You have to use mod_proxy ProxyMatch (http://httpd.apache.org/docs/current/mod/mod_proxy.html#proxymatch) directive. So you can say like

    
    <ProxyMatch "/zip[0-9]{1,4}">
    ProxyPass.............
    </ProxyMatch>
    

    I have just put a sample regex. Make sure your’s is compatible with the PCRE notation. Hope this helps you.

  • Lluis
    13:24 on December 2nd, 2010

    Any one have the experience of multiple front-ends and only one file system? It will be useful if you have N frontends and you one only one point where to upload your application so you don’t need to synchronize N filesystems. also if you upload anything from one frontend it will be available instantly from every server. I did this with virtual folders in IIS but in apache I can’t define alias where the target folder is a remote file system.

  • arfie
    13:45 on February 16th, 2011

    Great article !!!
    But i have a question for you about load balancer.

    In your article, there are 3 servers, 1 server for LB, and the rest are for web servers. But, how if we have 2 servers only and Load balancer combine with main server.
    Example, there are 2 servers: Server A and Server B.
    Server A is Main website and also for Load Balancer.
    Server B is backup website.

    How to configure load balancer to condition like that?
    Thank You for your answer.

  • Deepesh
    18:09 on February 16th, 2011

    @arfie, You will require a external reverse proxy server (LB) to route your requests to the following webserver’s as that will align your request based on unavailability/load of the system. If you host your main website and LB on same server, and if same server goes down due to load that would mean your LB would be down too and will not be available to route requests anywhere.

  • murali
    14:17 on May 3rd, 2011

    Hi,

    I have a similiar sort of application as mentioned above, But with
    multiple virtual hosts, Will this algorithm work for multiple virtual hosts
    too, if not please do help me in doing it

    Thanks,

    Murali..

  • Deepesh
    8:29 on May 4th, 2011

    @murali, Do you mean single server with multiple virtual hosts?
