HAProxy Load Balancer for Effective Uptime and Horizontal Scaling

May 2, 2010

I’ve been toying around with horizontal scaling in Amazon EC2 for distributed searches. In short, I was looking for an effective software load balancer and came across HAProxy and Pound. Both are very nice software-based load balancers.

HAProxy Software Load Balancer

HAProxy is a bit more bare metal, as it targets a very specific set of scenarios focused on TCP/IP more than HTTP. You can use cookie-based injection with HAProxy to do round robin and stick users to a specific server. However, you cannot do this if your site is running SSL traffic, because HAProxy cannot decrypt it. This comes from the author’s steadfast belief that SSL should not be terminated on the load balancer: the CPU load gets in the way of scaling, since at some point you would need to scale the load balancers themselves (we’re talking millions of requests, Facebook style).

Pound Software Load Balancer

Pound is a bit more specific to HTTP/web scenarios. It functions as a 100% layer-7 load balancer, as it does full HTTP(S) integration and has full access to the HTTP stack. What this means is that you can do some fancy routing based on cookies and URL regexes, and do all of this with SSL termination. You can combine the two and have Pound decrypt the traffic and forward it to HAProxy, but you’re setting up a scaling issue if you do.

Overview

[Diagram: haproxy]

In the above context diagram we can see that we should have a firewall (software and/or hardware based) that only allows ports 80 and 443. We use Pound (see the references below) to decrypt the SSL traffic and then pass the request on to HAProxy. HAProxy then inspects or injects the server ID cookie used to stick a client to a particular server, and otherwise proceeds with round robin.

Note that the above assumes all servers/app domains are up. HAProxy will detect if a server is down and redirect any new or existing clients from the failed server to a different one (sometimes causing a session loss event, but there is not much you can do about that). If you are performing an upgrade and would like to “bleed” users off the box that you want to recycle, see the article linked below, as HAProxy can support this as well.
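
To make the sticky-cookie behaviour concrete, here is a toy model in Python. It is only a sketch of the idea, not how HAProxy is implemented: the class name, the health dictionary, and the server names (borrowed from the configuration further down) are all assumptions for illustration.

```python
from itertools import cycle


class StickyRoundRobin:
    """Toy model of cookie stickiness with round-robin fallback (illustrative only)."""

    def __init__(self, servers):
        # servers maps a cookie value (e.g. "sharepoint01") to an "is up" flag.
        self.health = dict(servers)
        self._ring = cycle(list(servers))

    def _next_healthy(self):
        # Walk the ring at most once, skipping servers that are marked down.
        for _ in range(len(self.health)):
            name = next(self._ring)
            if self.health[name]:
                return name
        raise RuntimeError("no healthy servers")

    def route(self, cookie=None):
        # A returning client stays on its server while that server is up.
        if cookie in self.health and self.health[cookie]:
            return cookie
        # New client, or the pinned server is down: pick the next healthy
        # server. For an existing client this is where a session can be lost.
        return self._next_healthy()


lb = StickyRoundRobin({"sharepoint01": True, "sharepoint02": True, "sharepoint03": True})
first = lb.route()            # new client: assigned by round robin
again = lb.route(first)       # same client with cookie: same server
lb.health[first] = False      # the server goes down
moved = lb.route(first)       # client is sent to a different server
print(first, again, moved)
```

The value returned by route() is what would be written back to the client as the server ID cookie.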

Basic HAProxy Configuration

Below is an example configuration that shows a SharePoint farm (using NTLM), an ASP.NET farm (using ASP.NET forms/basic authentication), and a Tomcat farm.

# /etc/haproxy.cfg
global
        user haproxy
        group haproxy

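defaults
        mode http
        # add an X-Forwarded-For header carrying the client IP
        option forwardfor
        # if the server named in the cookie is down, send the client elsewhere
        option redispatch
        retries 3
        maxconn 2000
        # timeouts are in milliseconds
        contimeout 5000
        clitimeout 50000
        srvtimeout 50000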

backend sharepoint_farm
        balance roundrobin
        option redispatch
        cookie SERVERID insert nocache indirect
        server myServer1 192.168.1.1:80 cookie sharepoint01 check
        server myServer2 192.168.1.2:80 cookie sharepoint02 check
        server myServer3 192.168.1.3:80 cookie sharepoint03 check

backend aspnet_farm
        balance roundrobin
        option redispatch
        cookie ASP.NET_SessionId prefix
        option httpclose
        option forwardfor
        server myServer1 192.168.1.3:80 cookie aspnet01 check

backend tomcat_farm
        balance roundrobin
        option redispatch
        option httpclose
        cookie SERVERID insert nocache indirect
        server farmsvr1 192.168.1.4:8080 cookie tomcat01 check
        stats uri /stats
        stats realm haproxy
        stats auth admin:P@ss0wrd
        stats scope .
        stats scope aspnet_farm
        stats scope sharepoint_farm

frontend httpid
        bind *:80
        acl is_sharepoint_farm hdr_end(host) -i sharepoint.itcontoso.com
        acl is_aspnet_farm hdr_end(host) -i aspnet.itcontoso.com
        use_backend sharepoint_farm if is_sharepoint_farm
        use_backend aspnet_farm if is_aspnet_farm
        default_backend tomcat_farm

Now what is really nice about HAProxy is that we can have a zero-downtime restart/deploy of all servers. For a detailed guide you can see the below article. How I miss Linux sometimes…

Zero-Downtime Restarts with HAProxy

SSL Termination with Pound + HAProxy

One of the things that might cause a problem is needing to insert a cookie while running on port 443. If this is the case and you really have no other alternative (risk vs. reward, total cost of ownership, etc.), then you can configure HAProxy + Pound to deliver SSL termination: the SSL decrypt/encrypt happens on the load balancer and calls are forwarded down to the web servers as plain HTTP. This can increase CPU load, but last I checked some Intel Celeron CPUs can handle 700+ transactions per second with this configuration.

We can test this by generating a self-signed certificate…

# install openssl
sudo apt-get install openssl

# create a self-signed cert; the key and the certificate both go into local.server.pem
cd /etc/ssl
sudo openssl req -x509 -newkey rsa:2048 -keyout local.server.pem -out local.server.pem -days 365 -nodes

And then updating the /etc/pound/pound.cfg to include an SSL terminator:

ListenHTTPS
  Address 192.168.1.4
  Port    443
  Cert    "/etc/ssl/local.server.pem"
End

Full Layer-7 Load Balancer Scenarios

One of the huge benefits we get by having a layer-7 load balancer is that we can perform intelligent routing and have it be seamless to our applications.

[Diagram: pound]

Examples

  • Load balance all png, jpg, gif requests to specific server(s).
  • Load balance all mov, avi, mkv requests to specific server(s).
  • Load balance users in South East US IP range to southeast.yoursite.com
  • Load balance users by ranking (gold customers, silver, bronze)
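
The examples above boil down to a routing decision made from the request itself. Here is a rough Python sketch of that decision. The pool names, the 10.20.0.0/16 range standing in for "South East US", and the choose_pool function are all made up for illustration; a real setup would express the same rules in the load balancer's own configuration.

```python
import ipaddress
from posixpath import splitext
from urllib.parse import urlparse

IMAGE_EXT = {".png", ".jpg", ".gif"}
VIDEO_EXT = {".mov", ".avi", ".mkv"}
# Hypothetical address range standing in for "South East US" clients.
SOUTHEAST_NET = ipaddress.ip_network("10.20.0.0/16")
# Hypothetical pools per customer ranking.
TIER_POOLS = {"gold": "gold_pool", "silver": "silver_pool", "bronze": "bronze_pool"}


def choose_pool(url, client_ip, tier=None):
    """Pick a backend pool from the URL path, the client IP, and the customer tier."""
    # Only the path is inspected, so query strings do not affect the extension.
    ext = splitext(urlparse(url).path)[1].lower()
    if ext in IMAGE_EXT:
        return "image_pool"
    if ext in VIDEO_EXT:
        return "video_pool"
    if ipaddress.ip_address(client_ip) in SOUTHEAST_NET:
        return "southeast_pool"
    # Fall back to the tier pool, or the default pool for unknown tiers.
    return TIER_POOLS.get(tier, "default_pool")


print(choose_pool("http://yoursite.com/logo.png", "192.0.2.1"))
print(choose_pool("http://yoursite.com/account", "192.0.2.1", tier="gold"))
```

The rules are checked in order, so a gold customer asking for a png still lands on the image pool. Whether that ordering is right depends on the site.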

Additional Resources

Transparent proxy of SSL traffic using Pound to HAProxy backend patch and howto

Linux install and configure pound reverse proxy for Apache http / https web server

Pound Configuration Information

Azure Memcached Dashboard

March 25, 2010

Another post about Azure; this time it is a custom-developed Memcached dashboard demo app that allows you to run Memcached in the cloud. The screenshot below says a lot.

[Screenshot: azure]

This Azure demo contains a fully functional sample of running a complete cache tier within Azure. The main dashboard gives current health and statistics in real time.

Features

  • Multiple Memcached Clusters
  • Memcached HeatMap for memory allocation and diagnosis
  • Azure Performance Counter Monitoring
  • Real-Time Monitoring of Servers in Memcached Cluster
  • Cache Tier Web Service Enables On-Premise and In-Cloud consumption

Future

  • REST interface into cache tier
  • iPhone Application Dashboard
  • Backup/Restore Your Cache to Blob Storage
  • Snapshots of Cache for shutdown/restore of Azure

Live Demo

You can hit this service right now at http://cloudcache.cloudapp.net/. (Sorry! Had to bring it down due to charges.)

Overview

For those curious about how this is accomplished, I’ll be posting a step-by-step walk-through for this application over the next couple of days, but I thought that with the release so near it was better to go live now rather than later.

Codeplex Site Download

“Thought Experiment” – High Performance Distributed Cache

December 18, 2009

I have been thinking a lot about the current structure of data and storage locations across an enterprise and have been feeling like something was wrong with the picture I had formed.

First let me explain that I have come to love simplicity. There is nothing more beautiful than writing 10 lines of code that does something so simple and so native that you have to look at it twice just because it looks too simple to be true.

There is, however, an area that feels like a small battle is being waged: distributed cache. This problem arises after 90% of your read operations are humming along and you want your cache latency to be very small, say within 5 seconds. You have an issue because your legacy systems run on mainframes and you have older VB applications writing and reading data directly from a database and mainframe. There is no governance around information, and it seems any application can go in and update sources of data at any time.

The above constraints mean that you have no means of receiving notifications of events or data actions, which means that you can’t wire into a bus/MSMQ drop to get notified of informational changes. You also cannot update these applications because nobody really knows how they work and the risk is seen as too high by management (and they are right).

The “Perfect” solution

In a perfect world we would be running cache as a first-level tier. Your SOA (REST) interfaces would communicate with some enterprise bus (BizTalk, ActiveMQ, etc.) and that bus would exist to broker to any legacy tier and database tier that existed.

Let’s trace a call in this architecture…

  1. A request is made from the UI/APP to update a contact’s address
  2. The request is forwarded to a REST interface which delegates the call to an ESB
  3. The ESB picks up the request and forwards it to the primary implementation (see parallel branch)
  4. The ESB returns back to the REST/UI layer success

[Parallel Branch]

  • Thread 1: The ESB identifies a simple object update and saves the data to distributed cache
  • Thread 2: The ESB in a separate instance/thread drops a message into a transactional MSMQ with reliable messaging and ordered delivery

The reason this is so perfect is that barely any IO hit is incurred by this read/write operation. All read/write operations were primarily done in memory (the fastest option at the moment). We also get failover in case our cache goes down, because of the MSMQ drop.

The primary hit from the web/app layer would land on the service broker in thread 2 (a transactional MSMQ write hits disk on the target machine), but this would be executing well after the result was actually received. Updating data would primarily be as fast as network/CPU/memory allowed: the data would immediately be updated in cache and then an MSMQ message dropped into some further back end for a physical read/write to disk(s).
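
The write path described above can be sketched in a few lines of Python. This is a thought-experiment model only: the WriteBehindBus class is invented for illustration, a plain dict stands in for the distributed cache, an in-memory deque stands in for the transactional, ordered MSMQ queue, and a second dict stands in for the legacy back end.

```python
from collections import deque


class WriteBehindBus:
    """Toy ESB: writes go to cache and an ordered queue, the back end catches up later."""

    def __init__(self):
        self.cache = {}        # stands in for the distributed cache
        self.queue = deque()   # stands in for the ordered, durable message queue
        self.store = {}        # stands in for the legacy database/mainframe

    def update(self, key, value):
        # "Thread 1": the object is saved to cache immediately.
        self.cache[key] = value
        # "Thread 2": a message is dropped for the back end to process later.
        self.queue.append((key, value))
        # Success is returned before anything has touched the back-end store.
        return "success"

    def read(self, key):
        # Reads are served from cache first and fall back to the store.
        if key in self.cache:
            return self.cache[key]
        return self.store.get(key)

    def drain(self):
        # Back-end consumer: apply queued writes in the order they arrived.
        applied = 0
        while self.queue:
            key, value = self.queue.popleft()
            self.store[key] = value
            applied += 1
        return applied


bus = WriteBehindBus()
bus.update("contact:42", "1 Main St")
bus.update("contact:42", "9 Elm Ave")
print(bus.read("contact:42"), bus.store)   # cache is current, store is still empty
bus.drain()
print(bus.store)                           # store has caught up, in order
```

Ordered delivery is what keeps the store correct here: if the two updates were applied out of order, the back end would end up with the older address.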

This implementation is also scalable as distributed cache scales extremely well and the ESB would likely be network/CPU bound which could be solved by horizontal scaling (to no known limit).

In a worst-case scenario where your data center goes down, all that would happen is that your ESB would continue to write ordered messages to MSMQ and the cache would still be running. You’ll just have to hope that when you bring the system back up the back end can keep up :) . The business side wouldn’t even notice (provided 99.9% of data is in cache and written/read from as primary). It’s an amazing architecture, one that is used by Facebook, YouTube, Amazon, etc. Very high performance and very scalable. However, I don’t exist in a perfect world.

The “Not-So-Perfect” Solution

With the constraints in place our primary issue is that of notification of data updates. The primary store is still treated as the database by 99% of the applications housed in your enterprise and updates are made in no structured manner. The closest it comes to structured is the database tables themselves. And to be honest, that is good enough for us.

We can keep the ESB like we have before at the cost of complexity (99% of the other applications won’t use it, but they each have a lifespan of about 5 years). The core issue is that of cache integrity. If data has changed since it was last cached, then modifying data at the cache level could cause problems when writing back to the store.

This can be avoided by introducing a “Cache Invalidation Service”. The primary goal of this service is to receive REST/SOAP-based messages that it subscribes to from an ESB. Since 99% of applications write/read from SQL instances, we can work with SQL Server 2005/2008 query notifications to flush specific items from our cache when they are updated.

Notice I said FLUSH not UPDATE! We wouldn’t want to perform a read “just because”. We can delay the read until somebody actually needs it, which allows for higher throughput.
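
The flush-not-update idea is small enough to show in Python. This is a sketch under assumptions: the InvalidatingCache class and its on_change_notification hook are invented for illustration, and in the scenario above that hook would be driven by a SQL Server query notification rather than called by hand.

```python
class InvalidatingCache:
    """Toy cache that is flushed on change notifications and reloaded lazily."""

    def __init__(self, loader):
        self._loader = loader   # function that reads one item from the real store
        self._items = {}
        self.loads = 0          # counts how many reads actually hit the store

    def get(self, key):
        # Only go to the store when somebody actually asks for a missing item.
        if key not in self._items:
            self._items[key] = self._loader(key)
            self.loads += 1
        return self._items[key]

    def on_change_notification(self, key):
        # Flush only: drop the stale item, do not read the new value yet.
        self._items.pop(key, None)


db = {"contact:42": "1 Main St"}          # stands in for the SQL instance
cache = InvalidatingCache(db.get)
cache.get("contact:42")                   # first read loads from the store
db["contact:42"] = "9 Elm Ave"            # some legacy app updates the row
cache.on_change_notification("contact:42")
print(cache.loads)                        # the flush itself cost no read
print(cache.get("contact:42"), cache.loads)
```

If an item is updated ten times before anyone reads it again, this costs ten cheap flushes and one read instead of ten reads.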

At the end of the day

So this will work but there is something bothering me… This all seems far too much like a well defined pattern and this should already be solved by a vendor. The hardest part is identifying the objects throughout the enterprise and then exposing them via REST while leveraging distributed caching. The actual infrastructure after that is easy.

I can imagine that in time somebody might come up with a tool/app/service that maps between physical layers and objects and then does all the heavy lifting. However, caching is still something that is hand-crafted, and I don’t like handcrafted solutions.

Useful Links for Query Notification

Understanding When Query Notifications Occur

Planning for Notifications

Using SqlNotificationRequest to Subscribe to Query Notifications