Scaling Matters Twitter

Joe Gregorio

Alex Payne:

The problem is that more instances of Rails (running as part of a Mongrel cluster, in our case) means more requests to your database. At this point in time there’s no facility in Rails to talk to more than one database at a time. The solutions to this are caching the hell out of everything and setting up multiple read-only slave databases, neither of which are quick fixes to implement.

I simply don’t buy it. Write your own DB extension to Rails that load-balances requests. Or use something like pgpool for Postgres, which pools connections between a cluster of read/write slaves. Databases are often the last places you should be looking to optimise. This isn’t rocket science, kids.

Posted by Noah Slater on 2007-04-12

Noah, there are some limitations to that technique. IIRC pgpool can only handle two servers, which means you need to limit yourself to two servers or start chaining pgpools together, neither of which is fun. Various other systems punt entirely; standard MySQL (not MySQL Cluster, which requires special setup in advance) does this, and simply tells you to rewrite your application layer's DB access to distribute the queries.

Posted by James Bennett on 2007-04-12
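
To make "distribute the queries" in the application layer concrete, here is a minimal sketch in Python rather than Rails. It assumes one writable primary and any number of read-only replicas; the `ReadWriteRouter` class, its `pick` method, and the server names are hypothetical and come from no library mentioned above. Connections are stood in for by plain strings.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: writes go to the primary, reads rotate over replicas."""

    # Statements treated as writes. Anything else is assumed to be a read.
    WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP"}

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._replicas = itertools.cycle(list(replicas) or [primary])

    def pick(self, sql):
        """Return the server that should run this statement."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in self.WRITE_VERBS:
            return self.primary
        # Round-robin: each read goes to the next replica in turn.
        return next(self._replicas)


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
target = router.pick("SELECT * FROM statuses WHERE user_id = 1")
```

A sketch like this ignores transactions and replication lag: a read sent to a replica immediately after a write may not see that write yet, so real setups usually pin some reads to the primary.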

A MySQL load-balancing proxy might be of use in a situation like this. But I agree that scaling the DB is a task best left outside of the application domain. If you're gonna start rearchitecting your ORM to support specific performance requirements, well, read those "megadata" articles again. Interesting to note that the Twitter guys hadn't even started caching HTML responses at the time of the interview. "Caching the hell out of everything" is easier than it seems in Rails.

Posted by jchris on 2007-04-13
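
As a rough illustration of response caching, the sketch below keeps rendered output in memory for a fixed time so repeated requests skip the database. It is written in Python, not Rails, and is not Rails' caching API; `TTLCache`, `fetch`, and `render_timeline` are hypothetical names chosen for the example.

```python
import time


class TTLCache:
    """Hypothetical in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}     # key -> (expires_at, value)

    def fetch(self, key, compute):
        """Return the cached value for key, calling compute() only on a miss."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._store[key] = (now + self._ttl, value)
        return value


db_hits = 0


def render_timeline():
    """Stand-in for an expensive page render that would query the database."""
    global db_hits
    db_hits += 1
    return "<ul><li>status</li></ul>"


cache = TTLCache(ttl_seconds=30)
first = cache.fetch("/timeline", render_timeline)
second = cache.fetch("/timeline", render_timeline)  # served from the cache
```

The trade-off is staleness: a page can be up to `ttl_seconds` out of date unless the entry is invalidated when the underlying data changes.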
