MemcacheDB looks like a great place to store denormalized subscriptions.

Imagine Kevin Rose, the founder of Digg, who at the time of this presentation had 40,000 followers. If Kevin diggs just once a day that's 40,000 writes. Because the most active diggers are also the most followed, this becomes a huge performance bottleneck. Two problems appear.

The first problem is that you can't update 40,000 follower accounts at once. Fortunately the queuing system we talked about earlier takes care of that.
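To make the queuing idea concrete, here is a minimal sketch of fanning one digg out to many followers in batches. It is not Digg's actual queuing system: the batch size, the job format, and the function names are all assumptions for illustration, and Python's in-process Queue stands in for a real job queue.

```python
from __future__ import annotations

from queue import Queue
from typing import Iterator

BATCH_SIZE = 1000  # hypothetical batch size, not a figure from the presentation


def batches(follower_ids: list[int], size: int) -> Iterator[list[int]]:
    """Yield consecutive slices of follower_ids, each at most `size` long."""
    for start in range(0, len(follower_ids), size):
        yield follower_ids[start:start + size]


def enqueue_fanout(queue: Queue, digg_id: int, follower_ids: list[int]) -> int:
    """Put one job per batch on the queue and return the number of jobs."""
    jobs = 0
    for batch in batches(follower_ids, BATCH_SIZE):
        queue.put({"digg_id": digg_id, "follower_ids": batch})
        jobs += 1
    return jobs


def drain(queue: Queue, inbox: dict[int, list[int]]) -> None:
    """Worker loop: record each queued digg in every listed follower's inbox."""
    while not queue.empty():
        job = queue.get()
        for follower_id in job["follower_ids"]:
            inbox.setdefault(follower_id, []).append(job["digg_id"])
```

With these assumed numbers, a single digg by someone with 40,000 followers becomes 40 small jobs that workers can process at their own pace, instead of one 40,000-row update in the request path.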

The second problem is the huge number of writes that happen. Digg has a write problem. If the average user has 100 followers that’s 300 million diggs a day. That's 3,000 writes per second, 7GB of storage per day, and 5TB of data spread across 50 to 60 servers.
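The figures above can be sanity-checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes the 300 million figure is the number of fanned-out writes per day, that load is spread evenly over 24 hours, and that "7GB" means decimal gigabytes.

```python
SECONDS_PER_DAY = 24 * 60 * 60

fanned_out_writes_per_day = 300_000_000  # from the text
avg_followers = 100                      # from the text
storage_per_day_bytes = 7 * 10**9        # "7GB", assuming decimal gigabytes

# Diggs per day before each one is copied to every follower (inferred).
diggs_per_day = fanned_out_writes_per_day // avg_followers

# Average write rate if the load were perfectly even across the day.
writes_per_second = fanned_out_writes_per_day / SECONDS_PER_DAY

# Implied storage cost of a single denormalized write.
bytes_per_write = storage_per_day_bytes / fanned_out_writes_per_day

print(f"{diggs_per_day:,} diggs/day before fan-out")
print(f"{writes_per_second:,.0f} writes/second on average")
print(f"{bytes_per_write:.1f} bytes per write")
```

Under those assumptions the average comes out near 3,500 writes per second, in the same range as the quoted 3,000, with each write costing a little over 23 bytes.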

With such a heavy write load MySQL wasn’t going to work for Digg. That’s where MemcacheDB comes in. In initial tests on a laptop MemcacheDB was able to handle 15,000 writes a second. MemcacheDB's own benchmark shows it capable of 23,000 writes/second and 64,000 reads/second. At those write rates it's easy to see why Joe was so excited about MemcacheDB's ability to handle their digg deluge.

Posterous has this exact problem when handling subscriptions. We currently denormalize new posts for subscription lists using after_save hooks that translate new posts into new notification records on the backend. But if we can plug this into MemcacheDB instead, needless DB writes can be avoided.
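As a rough illustration of what that could look like, here is a sketch of keeping one denormalized list of post IDs per subscriber in a key-value store. Everything in it is hypothetical: the FakeStore class is an in-memory stand-in for a store with get/set semantics (MemcacheDB speaks the memcached protocol, so a memcached client would play this role), and the key format, the JSON encoding, and the 100-item cap are assumptions, not Posterous code.

```python
import json


class FakeStore:
    """In-memory stand-in for a key-value store offering get and set."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def feed_key(subscriber_id):
    """Hypothetical key naming scheme: one key per subscriber."""
    return f"feed:{subscriber_id}"


def push_post(store, post_id, subscriber_ids, limit=100):
    """Prepend post_id to each subscriber's list, keeping the newest `limit`."""
    for subscriber_id in subscriber_ids:
        key = feed_key(subscriber_id)
        raw = store.get(key)
        feed = json.loads(raw) if raw else []
        feed.insert(0, post_id)
        store.set(key, json.dumps(feed[:limit]))


def read_feed(store, subscriber_id):
    """Return the subscriber's post IDs, newest first (empty if none)."""
    raw = store.get(feed_key(subscriber_id))
    return json.loads(raw) if raw else []
```

Reading a subscription list then becomes a single key lookup rather than a query over notification records. Note that the read-modify-write in push_post is not atomic, so concurrent writers to the same subscriber would need some form of coordination.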
