Blog from November, 2011

The Apache integration day at W-JAX was a big success. We had sessions about CXF, Camel, Karaf, TESB and Continuous Delivery. Most of the presentations and examples are now finally available.

In enterprise environments a typical requirement is that an integration has to be highly available. Typically you will use at least two nodes to achieve that. Depending on the requirements you will want either all nodes to be active or only one. The problem with having more than one active node is that messages can get out of order. So if your messages have to stay in sequence, sometimes the only way to achieve that is to make sure only one node is active at any time.

By default Apache Camel has no mechanism to achieve this. As I had this requirement from some customers, I decided to create an extension to Apache Camel that provides it.

SimpleCluster

The idea is to use a database table lock to coordinate the nodes. Databases are really good at such things, so a database lock is very reliable. As I did not want to reinvent the wheel I started with the database lock code from ActiveMQ (http://activemq.apache.org/jdbc-master-slave.html). Basically the idea is to do a "select * from mytable for update" in a transaction. This locks the table's rows for as long as the transaction stays open, so only one node can hold the lock at a time.
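
As a rough illustration of the idea (not the actual DbLockManager code), the locking step could look like this in plain JDBC. The class name TableLockSketch, its methods and the table name check are assumptions made for this sketch; only the "select ... for update" statement comes from the text above. Support for query timeouts varies between JDBC drivers.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical sketch: holds a database lock for as long as its transaction stays open. */
final class TableLockSketch {
    private final Connection connection;
    private final String sql;

    TableLockSketch(Connection connection, String tableName) {
        this.connection = connection;
        this.sql = lockSql(tableName);
    }

    /** Builds the locking statement; table names cannot be bound as parameters, so validate them. */
    static String lockSql(String tableName) {
        if (!tableName.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Invalid table name: " + tableName);
        }
        return "SELECT * FROM " + tableName + " FOR UPDATE";
    }

    /** Returns true if the lock was acquired, false on timeout or database error. */
    boolean tryAcquire(int timeoutSeconds) {
        try {
            connection.setAutoCommit(false); // the lock lives inside an explicit transaction
            try (PreparedStatement statement = connection.prepareStatement(sql)) {
                statement.setQueryTimeout(timeoutSeconds); // give up if another node holds the lock
                statement.execute();
            }
            return true; // the lock is held until the transaction ends
        } catch (SQLException e) {
            release();
            return false;
        }
    }

    /** Ends the transaction and thereby gives up the lock. */
    void release() {
        try {
            connection.rollback();
        } catch (SQLException ignored) {
            // the connection is unusable; a broken connection ends the transaction anyway
        }
    }
}
```

A node would call tryAcquire on a dedicated connection and keep that connection open while it is active; calling release or losing the connection frees the lock for the next node.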

[Diagram: dblock]

The locking is encapsulated in the class DbLockManager. The interface FailoverHandler then lets you register a callback in your own code that is notified when the node should start or stop. This part is completely independent of Apache Camel and can also be used for other use cases.
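
To make the callback idea more concrete, here is a minimal sketch. The post does not show the methods of FailoverHandler, so the interface StartStopCallback and the class LockStateNotifier below are hypothetical stand-ins, not the real API. The sketch only shows the pattern: your code registers a callback and is told to start or stop when the lock state changes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback: the real FailoverHandler methods may look different. */
interface StartStopCallback {
    void start();
    void stop();
}

/** Hypothetical notifier that turns lock results into start/stop callbacks. */
final class LockStateNotifier {
    private final List<StartStopCallback> callbacks = new ArrayList<>();
    private boolean active;

    void register(StartStopCallback callback) {
        callbacks.add(callback);
    }

    /** Report the result of a lock attempt; callbacks fire only when the state changes. */
    void lockStateChanged(boolean lockHeld) {
        if (lockHeld == active) {
            return; // nothing changed, so nobody needs to be notified
        }
        active = lockHeld;
        for (StartStopCallback callback : callbacks) {
            if (lockHeld) {
                callback.start();
            } else {
                callback.stop();
            }
        }
    }

    boolean isActive() {
        return active;
    }
}
```

Because the callback knows nothing about Camel, the same pattern can start and stop a scheduler, a consumer thread or anything else that must run on one node only.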

Behaviour

The node that is started first will acquire the lock and start the route. All nodes will try to get the lock after the sleep interval. If the lock cannot be acquired, the db call blocks until the transaction timeout is reached. So the interval between two attempts to get the lock is a little longer than the sleep interval. In case of a connection or db failure the node will stop. So if the DB goes down, all nodes will stop. That means you should make sure the DB is also HA.
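
The behaviour described above can be sketched as a simple polling loop. This is a simplified model under stated assumptions, not the project's code: LockAttempt and FailoverLoopSketch are hypothetical names, the lock attempt is passed in as a function so the loop can run without a database, and any exception is treated as a connection or db failure.

```java
import java.util.ArrayList;
import java.util.List;

/** One attempt to take the lock; a real one may block up to the transaction timeout. */
interface LockAttempt {
    boolean tryLock() throws Exception;
}

/** Hypothetical model of one node: poll the lock, then start or stop accordingly. */
final class FailoverLoopSketch {
    private final LockAttempt lock;
    private final long sleepMillis;
    private final List<String> events = new ArrayList<>();
    private boolean active;

    FailoverLoopSketch(LockAttempt lock, long sleepMillis) {
        this.lock = lock;
        this.sleepMillis = sleepMillis;
    }

    /** Runs a single cycle: try the lock and react to the result. */
    void pollOnce() {
        boolean held;
        try {
            held = lock.tryLock();
        } catch (Exception e) {
            held = false; // connection or db failure: treat the lock as lost
        }
        if (held && !active) {
            active = true;
            events.add("start");
        } else if (!held && active) {
            active = false;
            events.add("stop");
        }
    }

    /** Polls the given number of times, sleeping between attempts. */
    void run(int cycles) {
        for (int i = 0; i < cycles; i++) {
            pollOnce();
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    List<String> events() {
        return events;
    }

    boolean isActive() {
        return active;
    }
}
```

In this model a passive node waits for the sleep interval plus however long its lock attempt blocked, which is why the time between two attempts is longer than the sleep interval alone.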

Configuration

The DbLockManager is configured like this:

You can also set the sleep time and the lock table name.

The Camel integration is done with a RoutePolicy. Such a policy can easily be added to a Camel route and can control the status of the route. So the FailoverRoutePolicy simply needs to be added as a bean:

All that remains is to add the policy to the route and to make sure the route does not start on its own:

Code

https://github.com/cschneider/simplecluster