Saturday, June 27, 2009

Martin mentioned the unique ID generation in previous post, so I recalled the problem which was solved in similar fashion. At the beginning of developing TripTrap the issue "how to find out if such request was already processed?" arose. Well, that's quite easy: store it to DB and next time check if it's already there. What becomes a problem is "how to do that fast?", especially when we have almost twenty request parameters (and thus such a number of conjunct conditions in SQL-query), also different types of requests possible, and also some of the parameters can be omitted, etc.

The solution was the following: count 64-bit integer hash for the request and store request to DB and to memcached together with that hash. Stored request is just the binary serialized and gzipped blob. So, in DB we have both BLOB field and BIGINT fields. The latter is very good indexed and is virtually unique, and it's very good key for memcached storage.

For each incoming request we count the hash, fetch the blob from cache or DB, deserialize the blob and compare it with the incoming one in Java-code. In the virtually impossible case of the same hashes for different requests the only problem is that we need to fetch, deserialize and compare more that one request blob from DB or the list of request blobs from memcached. 

Voila, for memcached with a lot of RAM we get the stuff working as fast as possible.

No comments:

Post a Comment