An Architect's View

CFML, Clojure, Software Design, Frameworks and more...

Transfer Cache Synchronization in a Cluster

April 14, 2008 · 19 Comments

If you use Mark Mandel's awesome Transfer ORM in a cluster, you've probably wondered what to do about keeping the cache in sync across servers in the cluster. I've had to solve this problem a couple of times now and I figured I should publish an example of how to do this.

First off, what's the problem we're solving? Transfer maintains a cache of objects so that it doesn't have to hit the database all the time. It can automatically manage a cache that contains as many objects as will fit in memory and discards them when it needs more memory, based on age and usage. Come to Mark Mandel's Transfer ORM Caching Mechanics session at cf.Objective() to learn more! The problem is that in a cluster of servers, each server caches objects and doesn't know when another server has updated an object. This causes each server's cache to get out of sync with the database. Not good.

So, what is the solution? We need to arrange for the other servers in the cluster to be notified when an object is updated, so they can drop the object from cache (and fetch it again when needed). Transfer has a method, discardByClassAndKey(), that drops the specified object from cache, if it is present. Transfer also has an event model that allows us to register listener objects that are invoked whenever an object is updated or deleted (on the local server). Finally, we use JMS - Java Message Service - to send cache update messages from one server to all the others in the cluster.
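To make the problem concrete, here is a minimal sketch in Python (not CFML, and not Transfer's real implementation). The `Server` class, the `user.User` class name and the sample data are all hypothetical stand-ins; only the idea of a per-server cache and a discard-by-class-and-key operation comes from the description above.

```python
# Illustrative simulation only: Python stand-ins, not Transfer or CFML.

class Server:
    """One clustered server with its own local object cache (hypothetical)."""

    def __init__(self, name, database):
        self.name = name
        self.database = database   # shared by every server in the cluster
        self.cache = {}            # local to this server

    def get(self, class_name, key):
        # Serve from the local cache when possible, else load and cache.
        cache_key = (class_name, key)
        if cache_key not in self.cache:
            self.cache[cache_key] = dict(self.database[cache_key])
        return self.cache[cache_key]

    def save(self, class_name, key, values):
        # Write to the database and refresh only this server's cache.
        cache_key = (class_name, key)
        self.database[cache_key] = dict(values)
        self.cache[cache_key] = dict(values)

    def discard_by_class_and_key(self, class_name, key):
        # Drop the object if it is cached; the next get() reloads it.
        self.cache.pop((class_name, key), None)


database = {("user.User", 1): {"name": "Ann"}}
server_a = Server("a", database)
server_b = Server("b", database)

server_b.get("user.User", 1)                     # B caches the old row
server_a.save("user.User", 1, {"name": "Anne"})  # A updates the row

stale = server_b.get("user.User", 1)["name"]     # B still serves its cached copy
server_b.discard_by_class_and_key("user.User", 1)
fresh = server_b.get("user.User", 1)["name"]     # B reloads from the database

print(stale, fresh)
```

Server B keeps returning its cached copy until something tells it to discard that object, which is exactly the notification the rest of this post is about.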
OMG! How scary! You're going to use event gateways and JMS and complicated stuff I don't understand!
Relax, ColdFusion makes this easy. Trust me.

Players in the solution
  • ActiveMQ 4.1.0 - Apache's open source JMS server. Support for ActiveMQ is built into ColdFusion 8 as one of the "example" event gateways.
  • CacheSynchronizer.cfc - A new CFC from my Google Project, available under the Apache License 2.0. It acts as both the listener object for the Transfer events and as the listener for JMS messages.
  • Configuration files for the JMS event gateway. See the comments in the CacheSynchronizer CFC.
Each clustered server will run an instance of ActiveMQ. On Mac OS X or Linux, this is as simple as:
  • cd path/to/ActiveMQ-4.1.0
  • bin/activemq
I have no idea what's involved on Windows - you're on your own there - but it shouldn't be too hard.

Each clustered server will be configured to have an instance of the ActiveMQ event gateway called CacheSync, using the CacheSynchronizer CFC and an ActiveMQ configuration file. In addition, each server will have one more gateway instance for each other server in the cluster, with a slightly different configuration file. Again, see the comments in the CacheSynchronizer CFC for more details. Bottom line: a server notifies its own JMS server about changes and listens to all the other JMS servers to pick up changes on other servers.

For most folks, this is going to be perfectly manageable because you'll only have a few event gateways on each of your few servers. For large clusters, this might be more problematic. When I have to address that issue, I'll post an updated solution (but will likely go to a cluster of JMS servers and have each CF server publish / subscribe to that central cluster).

Initialization

Your coldspring.xml file should define a cacheSynchronizer bean, using the CacheSynchronizer CFC. The init() method accepts the transfer bean and automatically registers itself as the listener for the afterUpdate and afterDelete events in Transfer. You can either declare this lazy-init="false" to have ColdSpring auto-initialize it or you can explicitly getBean("cacheSynchronizer") to force initialization when you are loading your bean factory.

Updating or Deleting an Object

When an object is updated (or deleted), Transfer automatically calls the registered listener method. This creates a message structure containing the hostname (just for information), the object's class name (package.object) and the primary key (assuming getId() returns that). The message is sent to the local event gateway (CacheSync). Note: this assumes that all your Transfer objects have a primary key called "id" - if you don't do that, the process would be more complex. You could always add a getId() method to your Transfer object decorator to retrieve the actual primary key.

Other servers in the cluster receive the message and onIncomingMessage() is automatically invoked in the CacheSynchronizer CFC. It reads the hostname, class name and ID and then asks Transfer to discard that object from cache. How simple is that?
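The wiring described above can be sketched as a small in-process simulation. This is Python rather than CFML, and everything in it is an assumption made for illustration: the `Broker` class stands in for each server's ActiveMQ instance, the message keys (`hostname`, `className`, `id`) are invented names for the three fields mentioned above, and each cache is reduced to a set of (class name, id) pairs. It is not the CacheSynchronizer CFC itself.

```python
# Illustrative simulation only: no real JMS, ActiveMQ, or Transfer calls.

class Broker:
    """Stand-in for one server's local message broker (hypothetical)."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        for callback in self.subscribers:
            callback(message)


class CacheSynchronizer:
    """Plays both roles: update/delete listener and message listener."""

    def __init__(self, hostname, cache, local_broker):
        self.hostname = hostname
        self.cache = cache                # set of (class name, id) pairs
        self.local_broker = local_broker

    def _notify(self, class_name, object_id):
        # Message fields follow the post; the key names are assumptions.
        self.local_broker.publish(
            {"hostname": self.hostname, "className": class_name, "id": object_id}
        )

    def after_update(self, class_name, object_id):
        self._notify(class_name, object_id)

    def after_delete(self, class_name, object_id):
        self._notify(class_name, object_id)

    def on_incoming_message(self, message):
        # Another server changed this object: drop our cached copy if any.
        self.cache.discard((message["className"], message["id"]))


hosts = ["a", "b", "c"]
brokers = {h: Broker() for h in hosts}
caches = {h: {("user.User", 1), ("user.User", 2)} for h in hosts}
syncs = {h: CacheSynchronizer(h, caches[h], brokers[h]) for h in hosts}

# Each server listens to every *other* server's broker, never its own.
for listener in hosts:
    for publisher in hosts:
        if listener != publisher:
            brokers[publisher].subscribe(syncs[listener].on_incoming_message)

syncs["a"].after_update("user.User", 1)   # server "a" saved user 1

for h in hosts:
    print(h, sorted(caches[h]))
```

With three servers, each one publishes to a single local broker and holds two subscriptions, which mirrors the one-gateway-per-buddy cost mentioned earlier: the number of subscriptions per server grows with the size of the cluster.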

Tags: architecture · cfobjective · coldfusion · coldspring · j2ee · orm · oss

19 responses so far ↓

  • 1 Rob Brooks-Bilson // Apr 15, 2008 at 11:08 AM

    Hi Sean,

    I was wondering about an alternate approach instead of using queues. What do you think about setting up a cluster of dedicated distributed caching servers (memcached, ehcache, etc.), and using that as the caching layer for Transfer instead of synchronizing transfer's built-in caching mechanism. I admit I haven't taken a look at Transfer in a while, but it seems to me there might be additional advantages to using an external cache provider (freeing up memory for the CF jvm, etc.).

    What do you think?
  • 2 Sami Hoda // Apr 15, 2008 at 11:37 AM

    That's quite impressive, Sean. Good show of different technologies working together.
  • 3 Sean Corfield // Apr 15, 2008 at 12:10 PM

    @Rob, yes, that would certainly be a possibility if Transfer supported an external caching system.
  • 4 Brian // Apr 17, 2008 at 12:49 PM

    Sean - thanks for posting this; we will be needing to use it soon. I was wondering; is there anything that prevents this from working on CF7 besides having the sample gateway provided in CF8?
  • 5 Sean Corfield // Apr 17, 2008 at 1:41 PM

    @Brian, I have no idea what it would take to get this running on CFMX 7 (I have no CFMX 7 servers to test on). It doesn't use much functionality in the AMQ gateway so you may be able to use the JMS gateway instead which ships with 7.
  • 6 Mark Mandel // Apr 21, 2008 at 2:28 AM

    @Rob,

    Just to come in late on this -

    This is actually limited by CF, simply because CFC Serialisation is deep serialisation, and there is no way to control it.

    So as soon as you want to serialise a TransferObject for a clustered cache, it clusters the TransferObject.. then Transfer.. and then Transfer's cache.. and then .. well.. it tries to cache EVERYTHING (and it fails).

    So until that can be resolved in CF, there isn't much we can do about Serialising TransferObjects across clusters.

    (Maybe it's something that needs addressing?)
  • 7 Rob Brooks-Bilson // Apr 23, 2008 at 8:19 AM

    @Mark,

    Interesting. I didn't realize it was structured in such a way that trying to serialize a TransferObject would result in serialization of the whole shebang.

  • 8 Sean Corfield // Apr 23, 2008 at 8:35 AM

    @Rob, FWIW, you'd see the same issue with duplicate() on a TransferObject - and that's why I think duplicate() should still throw an exception, just like it did in CFMX7. Duplicating a CFC instance with a built-in function is a loaded gun and people will get caught out by it (and it will be very hard to debug).
  • 9 Jared Langdon // Apr 24, 2008 at 5:54 AM

    So what's the verdict? Mark says that there is a fatal flaw, and that the approach will fail because of how CF does serialization. Yet Sean says that he's had to solve the problem a couple of times now, which suggests that the solution has been field tested. Which is it?
  • 10 Sean Corfield // Apr 24, 2008 at 7:19 AM

    @Jared, Mark and I are totally in agreement: both duplicate(cfc) and serialization of CFCs are fatally flawed - because they both deep copy the entire CFC graph.

    That's why "solving" the issue in a cluster is non-trivial and worth blogging about. You cannot rely on CF/J2EE session replication (I've talked a lot about why I think that is the wrong approach anyway, even without CFCs in the mix). You cannot automatically copy CFCs yourself (structCopy() is too shallow, duplicate() is too deep).

    My solutions in this particular space - using Transfer ORM - rely on Transfer managing its own cache and use inter-server notification to tell Transfer that a given object was changed somewhere else on the cluster.
  • 11 matias elgart // May 12, 2008 at 9:49 AM

    hey Sean:

    thanks for the post and interesting to read about CF's inherent serialization problems, as well as duplicate()'s.

    i was curious about the set up of JMS servers you describe, you mention that each server in a cluster would have its own instance of apache MQ, etc.

    was wondering, why not just have a single MQ server for the entire cluster, make it a Topic, and anytime a bean is updated (or whatever other op that might invalidate a cache entry), publish to the topic. ?

    unless i'm misunderstanding something, always possible. an instance of MQ on each cluster member seems kind of heavy.
  • 12 Sean Corfield // May 12, 2008 at 10:13 AM

    @Matias, well, for redundancy, I'd need a cluster of MQ servers and since I already have a cluster of CF servers, I'm letting them do double duty.

    Since we're expecting to move to zoned databases and server clusters at some point, we'd need separate topics (ugh) or separate clusters of MQ servers for each zone. This configuration allows for "buddy" servers to just listen to their buddies' MQ stream - and can easily be reconfigured.

    Also, if a server goes down, it will no longer broadcast changes (because it won't be making any) so its corresponding MQ server is not needed (another argument for running MQ on each server).

    ActiveMQ is pretty lightweight but the downside is needing an event gateway instance for each buddy in the cluster.

    I'll post updates about our choices (and changes) in this area as we move forward.
  • 13 Chris Phillips // May 24, 2008 at 1:50 PM

    PSA: Use the "ActiveMQ" gateway type if you are trying this out. Don't be like me and try doing it with the "JMS" event gateway. It will not work with the "JMS" event gateway type.

    Now I return you to your regularly scheduled programming.
  • 14 Sean Corfield // May 24, 2008 at 2:13 PM

    Reading back through this, I realize that I could have been clearer that this was all about the AMQ / ActiveMQ event gateway introduced in CF8 (and associated documentation - yes, there's actually a PDF of developer documentation in the gateway examples tree!).

    The old JMS event gateway only handles topics and doesn't have most of the functionality in the new AMQ event gateway. The reason that the old JMS gateway wasn't simply *replaced* by the new gateway is that because the new AMQ gateway is more generic, I had to make some changes to how the configuration file is written. For backward compatibility, the CF team had to ship both gateways. Sorry for any confusion.

    I highly recommend just using the ActiveMQ event gateway and totally ignoring the old JMS event gateway.
  • 15 Sean Corfield // May 24, 2008 at 2:15 PM

    Oh, and in case folks are wondering, this solution is operating pretty well in production with eleven servers in the cluster!
  • 16 Brian // Jul 1, 2008 at 5:06 PM

    Sean, I don't currently use discard much so forgive me if this is a non-question. I'm wondering what happens if one server has a reference to a transfer object with relationships, say, a person with addresses, and in the middle of some operation, the server is notified to discard either the person object or one of the address objects. Would you get "transfer object not initialised" or other "composition not set" errors?

    Or another case where you get a transfer object, the server is instructed to discard it via the JMS call, and using that original reference to the transfer object you then try to save it?

    You say it's working well in an 11-server cluster; have you run into this or have you done anything specific to avoid it?
  • 17 Sean Corfield // Jul 1, 2008 at 5:20 PM

    @Brian, I'll have to defer to Mark on that but I will note that Mark has indicated (on the Transfer mailing list) that discarding an object also discards all parent objects (presently, he's working to make that less brutal).

    Since Transfer works just fine with cache set to none, I don't believe it should affect operations if an object is discarded from cache in one request while another request is manipulating it (but there may be edge cases).
  • 18 Mark Mandel // Jul 1, 2008 at 5:27 PM

    @Brian -
    Nope, that wouldn't happen, either on the 1.0 release, or the SVN version.

    Reason being, if you retrieve a Person with an Address, if you discard the Address, it doesn't mean the Person object removes the Address from itself, it will either:

    1.0) Discard both the person and the address, in which case, the objects still know about each other, and if you are working on the Person object, you will never know that the object got dropped out of the cache. (Transfer already does cache synchronisation)

    1.1 / BER ) Discard just the address, and unload the address from the Person. If you then needed the Address, it implicitly reloads the object back up, and, again, you wouldn't know any difference, as it works exactly like a lazy load.

    Does that make sense?
  • 19 Brian // Jul 1, 2008 at 6:50 PM

    Sweetness Mark - that is perfect!
