I’ve heard a lot of horror stories about Rails development and how it has trouble scaling, but the jury is still out for me. I also think code can be written inefficiently in any language (I’m certainly guilty of writing SQL queries that pull in too many records).
I really enjoy developing in Rails. I think Ruby is a very terse language, and Rails is a framework that helps developers achieve peak productivity. So I’m trying an experiment with BizeeBee. The team has been through three iterations, and at the end of each one we’ve focused on cleaning up our code base.
For this iteration our goals are:
- Convert obtrusive AJAX calls (RJS) into plain JavaScript, because we’ve experienced how slow RJS can be.
- Make the app database-agnostic by removing raw SQL calls and relying on the ActiveRecord framework instead, because we don’t want to deal with MySQL vs. Postgres incompatibilities.
I’ve also talked to a couple of people in DevOps to get their thoughts on Rails as a framework. Here are their suggestions:
- Think about scale from the beginning, because you’ll need to scale before you know it. They almost seem to value premature optimization over no optimization at all.
- Limit the number of joins you’re performing in your database. This means restricting foreign key relationships and trying to denormalize your tables early on.
- Think about archiving or aggregating historical data. This will limit full table scans and give users a richer experience when dealing with data that pertains to the present. If you need to retrieve older datasets, you’ll need to design around it by messaging users that you’re retrieving older data.
As BizeeBee moves into the fourth iteration, I’ve started to think about how an open beta will result in more users. David, my back-end developer, and I spend more time thinking about data modeling. We know which tables we anticipate growing quickly and how we need to address that growth rate, and we have started thinking about partitioning schemes to address the growth of data. Currently our app is hosted on Heroku, which means we don’t have control over our partitioning scheme, so if we do want to partition we’ll need to host the app ourselves. But we like the ease of deployment that Heroku offers, and how we can closely mirror the staging and production environments without having to configure them ourselves.
While I don’t anticipate us growing overnight, I think it’s good to start thinking about these problems early on. I also anticipate the need for caching in the short term, and will implement it depending on usage patterns. As for adopting a NoSQL solution like MongoDB, my main reluctance is the need for an ACID database, one that maintains data integrity. I know a lot of startups have openly embraced NoSQL, but my skepticism originates from the need to have a highly accurate system that cannot tolerate data glitches. I’m dealing with transactional data that belongs to small business owners, and can’t afford for the service to be unreliable.
I’m curious to hear about everyone else’s experience and architecture regarding scaling in Rails, and how their stack has morphed over time…


Programmer time is way more expensive than computer time!
Computers are cheap now compared to the cost of development, so I say take all the shortcuts you want that make your developers productive and allow them to build a simple app. Screw the database, it can work overtime 🙂
Seriously though… Rails will scale just fine with the simple stuff, and you can do it piecemeal as it becomes necessary, so there’s nothing to do up front. This is the order I usually look at things as they become necessary (1 and 2 I do on every app from the start):
1. get indexes on every foreign key and field commonly used in a where clause
2. eager loading (the bullet gem can help with this)
3. memcached on big or common queries
4. scale vertically from there
That should get you to many millions of uniques a month and beyond, and you can do it all on Heroku.
If you really blow up after that, you can move to your own solution and do in-application sharding (there are Rails plugins for this), but it’s a rare and high-quality problem to have. Even 37Signals hasn’t reached this point, and they are comfortably doing 9000 requests per second on a single database machine:
http://37signals.com/svn/posts/1819-basecamp-now-with-more-vroom
and
http://37signals.com/svn/posts/1509-mr-moore-gets-to-punt-on-sharding
I think if anything the main bottleneck is MySQL more often than Rails. Heroku essentially allows you to scale Rails instances to infinity, so the real limitation is the database.
So I say don’t spend much time worrying about it at this point; Rails will take you as far as you want to go 🙂
Thanks for the advice, Brian! I didn’t know about bullet; I’ll check it out. I think eventually I’d want to move things in-house from Heroku, just because I don’t have control over the data or the databases. But until I can afford to hire an architect, you’re right that Heroku will scale for me.
Poornima,
When the time comes, consider looking at QActiveResource:
http://blog.directededge.com/2010/05/06/making-activeresource-34x-faster-qactiveresource/
Amit