Friday, July 10, 2009

Cluster-wide search

Well, after some refactoring, TripTrap is finally committed to the git repository.

Besides a lot of refactoring, rearrangement, fixes for broken tests and optimizations, the main recent improvement is that each TripTrap instance in a cluster environment is now, in some sense, aware of what the other nodes are doing. We had a fundamental problem there: any cluster node could perform the same search even if it had already been launched on some other node. That's not a big issue, as it doesn't happen often, but such redundancy overloads the system and can cost us some money. Nobody likes paying money for exactly nothing.

Some synchronization through the DB could be implemented, but that's not a scalable solution: the DB would become a bottleneck after a while. The solution, again, is to use memcached and place a mark there like "I'm searching this and that for that supplier, account and request", and to delete that mark after the search is over. Thus, for polling requests redundancy is avoided altogether. For synchronous requests, which wait for all possible responses, there's a more complex strategy, along the lines of "if such searches are running on other nodes, launch the local ones first, then check again whether some of the remote searches have completed", and so on. Memcached is rather fast and scales very well, so no bottlenecks are expected, I guess.
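To make the "mark" idea a bit more concrete, here is a minimal sketch of it in Java. It is not the actual TripTrap code: the names MarkStore, InMemoryMarkStore and ClusterSearchGuard are hypothetical, and the store is an in-memory stand-in so the example runs without a memcached server. The only property it relies on is that memcached's "add" command stores a key only if it is not there yet, which is what lets exactly one node claim a search.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical minimal view of a memcached-like store (an assumption, not a real client API). */
interface MarkStore {
    /** Stores the value only if the key is absent; returns true if it was stored. */
    boolean add(String key, String value);

    void delete(String key);
}

/** In-memory stand-in for memcached, used only to keep the sketch runnable. */
class InMemoryMarkStore implements MarkStore {
    private final ConcurrentMap<String, String> marks = new ConcurrentHashMap<String, String>();

    public boolean add(String key, String value) {
        // putIfAbsent is atomic, mirroring the "add only if missing" behaviour
        return marks.putIfAbsent(key, value) == null;
    }

    public void delete(String key) {
        marks.remove(key);
    }
}

/** Runs a search only if no other node has already marked the same search as in progress. */
class ClusterSearchGuard {
    private final MarkStore store;
    private final String nodeId;

    ClusterSearchGuard(MarkStore store, String nodeId) {
        this.store = store;
        this.nodeId = nodeId;
    }

    /** The key layout is an assumption; any unique combination of the three parts would do. */
    static String markKey(String supplier, String account, String request) {
        return "search:" + supplier + ":" + account + ":" + request;
    }

    /** Returns true if this node ran the search, false if another node already holds the mark. */
    boolean runIfNotRunningElsewhere(String supplier, String account, String request, Runnable search) {
        String key = markKey(supplier, account, request);
        if (!store.add(key, nodeId)) {
            return false; // someone else is "searching this and that" already
        }
        try {
            search.run();
            return true;
        } finally {
            store.delete(key); // remove the mark once the search is over
        }
    }
}
```

With a real memcached client the mark would probably also get an expiry time, so that a node that dies in the middle of a search does not leave its mark behind forever.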

Now there are two different implementations of search, configured in the Spring context: a cluster-aware one, which uses cluster-wide synchronization, and a single-instance one, which is simpler and faster.
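The split into two implementations could look roughly like the sketch below. Again, this is an illustration under assumptions, not the real code: SearchLauncher, SingleInstanceLauncher, ClusterAwareLauncher and SearchService are made-up names, a shared Set stands in for memcached, and the implementation is passed in through a plain constructor where the real thing would be wired in the Spring context.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical strategy interface: one implementation per deployment mode. */
interface SearchLauncher {
    /** Returns true if the search was actually executed by this call. */
    boolean launch(String searchKey, Runnable search);
}

/** Single-instance mode: no synchronization at all, just run the search. */
class SingleInstanceLauncher implements SearchLauncher {
    public boolean launch(String searchKey, Runnable search) {
        search.run();
        return true;
    }
}

/** Cluster-aware mode: skip the search if the shared mark is already present. */
class ClusterAwareLauncher implements SearchLauncher {
    private final Set<String> sharedMarks; // stand-in for marks kept in memcached

    ClusterAwareLauncher(Set<String> sharedMarks) {
        this.sharedMarks = sharedMarks;
    }

    public boolean launch(String searchKey, Runnable search) {
        if (!sharedMarks.add(searchKey)) {
            return false; // the same search is already running somewhere
        }
        try {
            search.run();
            return true;
        } finally {
            sharedMarks.remove(searchKey); // delete the mark when the search is over
        }
    }
}

/** The service does not care which launcher it gets; that choice lives in configuration. */
class SearchService {
    private final SearchLauncher launcher;

    SearchService(SearchLauncher launcher) {
        this.launcher = launcher;
    }

    boolean search(String supplier, String account, String request, Runnable work) {
        return launcher.launch(supplier + ":" + account + ":" + request, work);
    }
}
```

Switching between the two modes is then a matter of which launcher gets injected, for example new SearchService(new SingleInstanceLauncher()) for a single node.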

So it goes.
