I’ve heard a lot of horror stories about Rails development and how it has trouble scaling, but my verdict is still out. I also think code can be written inefficiently in any language (I’m certainly guilty of writing SQL queries that pull in too many records).
I really enjoy developing in Rails. I think Ruby is a very terse language and Rails is a framework that helps developers achieve peak productivity. So I’m trying an experiment with BizeeBee. The team has been through 3 iterations, and at the end of each one we’ve focused on cleaning up our code base.
For this iteration our goals are:
- Convert obtrusive AJAX queries (RJS) into straight JS, because we’ve experienced how slow RJS can be.
- Make the app database-agnostic by removing raw SQL calls and relying on ActiveRecord instead, because we don’t want to deal with MySQL vs. Postgres incompatibilities.
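To make the second goal concrete, here is a toy sketch in plain Ruby of why hand-written SQL ties you to one database. It is not how ActiveRecord is implemented; the `ADAPTERS` table, the `select_where_true` helper, and the `customers`/`active` names are all made up for illustration. The point is only that MySQL and Postgres quote identifiers differently and treat booleans differently, so the same query has to be rendered per database.

```ruby
# Toy illustration only: NOT ActiveRecord's real implementation.
# Each hypothetical "adapter" knows how its database quotes identifiers
# and how it spells a true value in a comparison.
ADAPTERS = {
  mysql:    { quote: ->(name) { "`#{name}`" },   true_value: "1" },
  postgres: { quote: ->(name) { "\"#{name}\"" }, true_value: "TRUE" }
}.freeze

# Build "SELECT * FROM <table> WHERE <column> = <true>" for one adapter.
def select_where_true(adapter, table, column)
  dialect = ADAPTERS.fetch(adapter)
  quote = dialect[:quote]
  "SELECT * FROM #{quote.call(table)} WHERE #{quote.call(column)} = #{dialect[:true_value]}"
end

puts select_where_true(:mysql, "customers", "active")
puts select_where_true(:postgres, "customers", "active")
```

In a Rails app the same intent would be written as a hash condition on a model, something like `Customer.where(:active => true)` (with `Customer` being a hypothetical model), which leaves the quoting to whichever database adapter is configured.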
I’ve also talked to a couple of people in DevOps to get their thoughts on Rails as a framework. Here are their suggestions:
- Think about scale from the beginning because you’ll need to scale before you know it. They seem to almost value premature optimization over no optimizations.
- Limit the number of joins you’re performing on your database. This means restricting foreign key relationships and trying to de-normalize your tables early on.
- Think about archiving or aggregating historical data. This will limit full table scans, and give users a richer experience when dealing with data that pertains to the present. If you need to retrieve older datasets, then you’ll need to design around it by telling users that you’re retrieving older data.
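The last suggestion can be sketched in plain Ruby. This is a minimal illustration, not BizeeBee’s code: the `Transaction` struct, its fields, and the `archive_and_aggregate` helper are assumptions. Rows older than a cutoff date are rolled up into one total per month, while recent rows are kept in full detail.

```ruby
require "date"

# Hypothetical record shape: one row per transaction.
Transaction = Struct.new(:occurred_on, :amount_cents)

# Split rows into historical (before the cutoff) and recent (on or after it),
# then roll the historical rows up into one total per calendar month.
def archive_and_aggregate(transactions, cutoff)
  historical, recent = transactions.partition { |t| t.occurred_on < cutoff }

  monthly_totals = Hash.new(0)
  historical.each do |t|
    month_key = t.occurred_on.strftime("%Y-%m")
    monthly_totals[month_key] += t.amount_cents
  end

  { recent: recent, monthly_totals: monthly_totals }
end

rows = [
  Transaction.new(Date.new(2010, 1, 5), 1_500),
  Transaction.new(Date.new(2010, 1, 20), 2_500),
  Transaction.new(Date.new(2010, 2, 3), 1_000),
  Transaction.new(Date.new(2010, 6, 1), 4_000)
]

result = archive_and_aggregate(rows, Date.new(2010, 4, 1))
puts result[:monthly_totals].inspect # one summed total per historical month
puts result[:recent].length          # rows still kept in full detail
```

Queries about the present only touch the recent rows, and a request for older data reads the small summary instead of scanning every historical row.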
As BizeeBee moves into the fourth iteration, I’ve started to think about how an open beta will result in more users. David, my back-end developer, and I have been spending more time thinking about data modeling. We know which tables we anticipate growing quickly and how we need to address the growth rate, and we have started thinking about partitioning schemes to handle the growth of data. Currently our app is hosted on Heroku, which means we don’t have control over our partitioning scheme. So if we do want to partition, we’ll need to host the app ourselves. But we like the ease of deployment that Heroku offers and how we can closely mirror the staging and production environments without having to configure them ourselves.
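For readers who haven’t worked with partitioning, here is a minimal sketch of one possible scheme: range partitioning by month. It is only an assumption for illustration, not a scheme we have settled on, and the `transactions` table name and both helper methods are hypothetical. The idea is that each row’s date decides which partition it lives in, so a date-range query only has to touch a few partitions.

```ruby
require "date"

# Hypothetical naming scheme: one partition per calendar month,
# e.g. "transactions_2010_07".
def partition_for(base_table, date)
  format("%s_%04d_%02d", base_table, date.year, date.month)
end

# List every monthly partition a date-range query would have to touch.
def partitions_between(base_table, from, to)
  names = []
  cursor = Date.new(from.year, from.month, 1)
  while cursor <= to
    names << partition_for(base_table, cursor)
    cursor = cursor >> 1 # Date#>> advances by one month
  end
  names
end

puts partition_for("transactions", Date.new(2010, 7, 15))
puts partitions_between("transactions", Date.new(2010, 11, 20), Date.new(2011, 1, 3)).inspect
```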
While I don’t anticipate us growing overnight, I think it’s good to start thinking about these problems early on. I’ve also anticipated the need for caching in the short term and will implement it depending on usage patterns. As far as using a NoSQL solution like MongoDB, my main reluctance to embrace it is the need for an ACID database, one that maintains data integrity. I know a lot of startups have openly welcomed NoSQL, but my skepticism originates from the need to have a highly accurate system that cannot tolerate data glitches. I’m dealing with transactional data that belongs to small business owners, and can’t afford for the service to be unreliable.
I’m curious to hear about everyone else’s experience and architecture regarding scaling in Rails and how their stack has morphed over time…