13th June 2008

Memcache Race Condition Fun

Dealing with cache expiration in memcache has some subtle gotchas that many people ignore at first. Let's say you are using memcache to cache an object that you want to refresh at least every 5 minutes. The typical pattern for this would be (in PHP):

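$myobj = $memcache->get('key');
// Memcache::get() returns false on a miss or after the entry has expired.
if($myobj === false)
{
  $myobj = LoadMyObjFromMySql();
  // Let memcache drop the entry after 300 seconds (5 minutes).
  $memcache->set('key', $myobj, 300);
}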

With this code you would expect to see one query against your database every 5 minutes. But deploy it on a high-traffic web site and you will probably see your MySQL database get pegged every 5 minutes by a spike of queries. What is going on?

Notice that there is a time window between the memcache get, the database query and the memcache set. This window isn’t large, but it could easily be 10ms. On a web site that is handling 1000 requests per second, about 10 different requests can land in that window. For each of them the memcache get fails (because the object has expired), and each of them hits the database before the first one updates memcache and everything is back to normal.
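As a rough sanity check on those figures, the size of the pile-up is just the request rate multiplied by the length of the window. The helper name below is made up for illustration, and the sketch assumes requests arrive evenly, which real traffic never quite does.

```php
<?php
// Hypothetical back-of-the-envelope helper: how many requests arrive while
// one request is still between its cache miss and its cache set.
function requestsInWindow(int $requestsPerSecond, int $windowMs): float
{
    return $requestsPerSecond * $windowMs / 1000;
}

// The figures used in this post: 1000 requests per second, 10ms window.
$stampede = requestsInWindow(1000, 10);
echo $stampede, " requests hit the database for a single expired key\n";
```

A slower query makes the window longer, so the spike grows with exactly the queries you most wanted to cache.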

The solution is to not rely on memcache alone to manage the expiration. We can rewrite the code above to look more like this:

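$myobj = $memcache->get('key');
// Refresh on a miss, or at a random moment within 120 seconds after our own expiry.
if($myobj === false || $myobj->expire + rand(0,120) < time())
{
  $myobj = LoadMyObjFromMySql();
  // Our real target: treat the object as stale after 300 seconds.
  $myobj->expire = time() + 300;
  // Memcache keeps it for 600 seconds, longer than the real target.
  $memcache->set('key', $myobj, 600);
}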

This way memcache still has an expiration for the objects, but it's longer than our real target. For the first 300 seconds the object always just gets loaded from the memcache server. Starting at 300 seconds there is a tiny chance that each request will refresh from the database and update memcache, and that chance grows with every second until, about 120 seconds later, every request refreshes. So if there is a large load on the server, the chances are that only 1 or 2 updates will happen, but they will happen right away. If there is a lighter load, the first request after about the 7 minute mark will refresh the object for sure, well before memcache drops it at 10 minutes. The busier the site, the earlier the refresh happens, and the smaller the chance that any single request is the one doing it.
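To make the odds concrete, here is a minimal sketch of the chance that a single request triggers a refresh, as a function of the object's age. The function name and its parameters are hypothetical. It only restates the `$myobj->expire + rand(0,120) < time()` test, assuming whole-second timestamps.

```php
<?php
// Hypothetical helper: probability that one request triggers a refresh,
// given how many whole seconds have passed since the object was loaded.
function refreshProbability(int $ageSeconds, int $softTtl = 300, int $jitter = 120): float
{
    // Seconds past the soft expiry stored in $myobj->expire.
    $overdue = $ageSeconds - $softTtl;
    // rand(0, $jitter) yields $jitter + 1 equally likely integers; the
    // refresh fires for each value strictly smaller than $overdue.
    $hits = max(0, min($overdue, $jitter + 1));
    return $hits / ($jitter + 1);
}

foreach ([300, 301, 360, 420, 421] as $age) {
    printf("age %3ds: %.3f\n", $age, refreshProbability($age));
}
```

At 1000 requests per second even the 1-in-121 chance at 301 seconds is enough for a refresh to happen almost immediately, while a quiet site waits until the odds have climbed.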

The above code of course relies on you storing PHP “objects” in memcache so that it can tack the extra expire property on. If you aren’t storing something that is already an object, you can always create a new object that stores your real thing in a property called “data” and still uses expire. You will pay a small overhead for the object marshaling but it shouldn’t be too painful.
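A wrapper along those lines might look like the sketch below. The CacheEntry, FakeCache and cachedFetch names are invented for this example. FakeCache is an array-backed stand-in that only mimics the get/set calls used in this post and ignores the memcache expiration, so the sketch runs without a memcache server.

```php
<?php
// Hypothetical wrapper for values that are not objects (strings, arrays, ...).
class CacheEntry
{
    public $data;    // the real value being cached
    public $expire;  // our own expiry, as a Unix timestamp

    public function __construct($data, int $softTtl)
    {
        $this->data = $data;
        $this->expire = time() + $softTtl;
    }
}

// Array-backed stand-in for the memcache client; it ignores the TTL.
class FakeCache
{
    private $store = [];

    public function get(string $key)
    {
        // Values are serialized as in a real cache; a miss returns false.
        return isset($this->store[$key]) ? unserialize($this->store[$key]) : false;
    }

    public function set(string $key, $value, int $ttl): bool
    {
        $this->store[$key] = serialize($value);
        return true;
    }
}

// $loader is any callable that fetches the fresh value, e.g. a database query.
function cachedFetch($cache, string $key, callable $loader)
{
    $entry = $cache->get($key);
    if ($entry === false || $entry->expire + rand(0, 120) < time()) {
        $entry = new CacheEntry($loader(), 300);
        $cache->set($key, $entry, 600);
    }
    return $entry->data;
}

$cache = new FakeCache();
$calls = 0;
$loader = function () use (&$calls) {
    $calls++;
    return ['id' => 7, 'name' => 'example'];
};

$first = cachedFetch($cache, 'key', $loader);  // miss: runs the loader
$second = cachedFetch($cache, 'key', $loader); // fresh: served from the cache
```

Callers only ever see the unwrapped value, so the expire bookkeeping stays in one place.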
