Reverse proxy for high traffic website


As previously explained, part of my mission for a customer was to propose a solution to accelerate a website. The challenge was to replace an aging Juniper appliance with a reverse proxy for a high traffic website. This appliance accelerated traffic by caching data (100 MB at most) and balanced traffic between 4 front-end servers. I also wanted to reorganise the whole DMZ infrastructure to improve security. In the existing architecture, all servers (front-ends, database and application servers) sat in the same LAN behind the firewall and the Junipers (see the HLD image of the As-Is solution below).

As-Is solution


Don’t laugh: those websites receive more than 45,000,000 visits per month, with more than 4 pages viewed per visit (in normal traffic), and are expected to receive more than 100,000,000 visits a month on special occasions. So, to improve on the flat As-Is layout, I went for a classic layered architecture (see HLD To-be solution below).

HLD To-be solution
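
To put the traffic figures quoted above in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch under stated assumptions: a 30-day month, exactly 4 pages per visit, and traffic spread evenly over time. The names `SECONDS_PER_MONTH` and `average_page_rate` are made up for this illustration. Since the source figures are all "more than" values, the results are lower bounds on the average, and real peaks will be well above it.

```python
# Rough sizing sketch; every constant here is an assumption, not a measurement.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def average_page_rate(visits_per_month, pages_per_visit=4):
    """Average page views per second, assuming evenly spread traffic."""
    return visits_per_month * pages_per_visit / SECONDS_PER_MONTH


normal = average_page_rate(45_000_000)    # normal traffic
special = average_page_rate(100_000_000)  # special occasions

print(f"normal:  {normal:.1f} page views/s")
print(f"special: {special:.1f} page views/s")
```

Keep in mind that a single page view usually triggers several HTTP requests (images, stylesheets, scripts), so the request rate seen by the reverse proxy is a multiple of these numbers.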


Once the design for this reverse proxy for a high traffic website was accepted, the evaluation process for the acceleration platform began.
While evaluating the possible platforms for this project, I decided it would be wise to also be ready to let protocols other than HTTP reach the front-end server zone.

My credo in consulting is to be solution-minded rather than product-minded. This means that each time I have to propose a new solution, I survey the existing solutions, test them, and verify how well they fulfil the assigned mission. I always promote the solutions with the shortest learning curve (taking into account the knowledge and preferences of the existing IT team) and, where possible, those built on open source products. (This is the opposite of the product-minded approach, which proposes the product one knows best, or the one with the best margin.) One drawback is that designing a proposed solution takes longer (because of the testing and the probable learning), but the result is generally better matched to the customer’s needs.

In this case, I had to find a good reverse caching proxy for HTTP and a load balancer for other protocols (open source if possible). I started from a blank page and began looking for a product able to fulfil both roles. Believe it or not, I couldn’t find a distro dedicated to this role. So I started by evaluating different “reverse proxy” solutions.

After that, I would evaluate the best solution to load balance traffic other than HTTP.

To be continued…

This article was entirely written on my iPad, and the drawings were designed with the QuickDiag app.
