
Think About Scale from the Start

If you are thinking about scaling a web application or service, congratulations: you have users who liked your product, or who were curious enough to sign up and stick around! You will of course be acquiring more users shortly. While the trajectory of user growth is unknown, and depends a lot on your usage model (viral social network vs. word-of-mouth individual user service), there are a few things you need to address:

  1. Capacity – your site will need to handle more concurrent users; sign-ups alone can generate a lot of load on the system, even before new users get to using the product.
  2. Reliability – users will want to use the service on their own time. The site needs to be up and running 24/7, with limited maintenance windows.
  3. Scalability – if users can generate data on your site, you will have more data to store and retrieve.

If you’re growing too fast, a common way to solve #1 and #3 is to throw hardware at the problem. A startup focuses on creating the MVP (minimum viable product), which means the prototype has just enough functionality to add significant value to the lives of users and convince them to sign up and use it for a while. Putting the product out there initially means you’re testing the product/market fit, and as a result you’re unsure of how many users will sign up and what their usage patterns will be. Let’s say you are cocky and a cheapskate: you know you’ll have users, but you don’t want to solve problems by buying hardware all the time. If you’re cautious, you will do the following:

  1. Start performance tuning and load testing even before releasing the product!
  2. Create a restricted alpha and beta, which lets you control the growth rate.
  3. Measure the adoption rates and usage patterns for your alpha and beta users.
  4. Use the measured adoption rate to anticipate how many servers you will need and can afford.
  5. Monitor for spikes in adoption due to press releases or other news events, and be ready to re-route traffic (failover) in the event of a server failing.
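Step 1 above doesn’t require fancy tooling to get started. Here is a minimal sketch of a load tester, with a simulated handler standing in for a real signup endpoint (the `fake_signup` function and its 10 ms delay are made-up stand-ins, not measurements from any real system):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def load_test(handler, num_requests=200, concurrency=20):
    """Fire num_requests calls at handler with the given concurrency
    and report rough throughput and average latency."""
    latencies = []

    def timed_call(i):
        start = time.perf_counter()
        handler(i)
        latencies.append(time.perf_counter() - start)

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(timed_call, range(num_requests)))
    elapsed = time.perf_counter() - start

    return {
        "throughput_rps": num_requests / elapsed,
        "avg_latency_s": sum(latencies) / len(latencies),
    }

def fake_signup(i):
    """Stand-in for a real signup endpoint; pretend it takes ~10 ms."""
    time.sleep(0.01)

stats = load_test(fake_signup)
```

In practice you would point `handler` at an HTTP call against a staging environment, ramp `concurrency` up, and watch for the point where latency starts climbing faster than throughput.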

These are the top 5 ways you can initially think about scaling your app without a whole lot of code rewrites. But there will come a time when you will need to redo a lot of the prototype’s code base. We’ll save that for another post…


Performance, Part II: Address Scalability Before It’s Too Late

As your product and user base grow, you want to ensure that your customers, both old and new, have a good user experience. You want their experience to improve, not stagnate or diminish, over time; scalability is another key element to address to ensure the success of your website. Scalability is the capacity to keep pace with changes and growth.

Maintaining a scalable website requires thinking from a business perspective. You want to understand how rapidly your site is growing and what the frequency of usage is. These two factors serve as metrics for predicting how to allocate resources. You can also use historical usage as an indicator of how much activity to expect in the future. Press releases and media events can increase the user base by unexpected amounts over the course of a few days or even a week. Ideally, the number of users increases at a steady, predictable rate week over week. Depending on the type of site you’re running, you can figure out what the peak hours of use are, or whether user activity increases during certain seasons of the year. Knowing the frequency or peak hours of usage also helps you schedule maintenance, new feature releases, and cron jobs so they won’t interfere with the user’s experience.
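The steady week-over-week growth described above is easy to turn into a forward projection. Here is a rough sketch (the 5,000 users and 10% growth rate are made-up example numbers, not data from any real site):

```python
def project_users(current_users, weekly_growth_rate, weeks):
    """Project user count assuming steady week-over-week compound growth."""
    return current_users * (1 + weekly_growth_rate) ** weeks

# Example: 5,000 users growing 10% week over week, projected 12 weeks out.
projected = project_users(5000, 0.10, 12)
# projected is roughly 15,692 users
```

Comparing a projection like this against what actually happens each week is also a cheap way to notice a press-driven spike early.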

Continuing to think in terms of business: your site is operating on a fixed budget, hence the amount of resources you have is directly proportional to your operating costs. Until you receive another round of funding or bring in revenue, you have to make do with what you have. Therefore, you must understand the limits of your resources in terms of response time, throughput, and concurrency in order to allocate resources efficiently while still guaranteeing quality. Load testing* is the best way to predict the limits of each of these.
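Those three limits are tied together by Little’s law: average concurrency ≈ throughput × response time. A rough capacity estimate built on that identity might look like the sketch below (the traffic numbers and per-server limit are hypothetical):

```python
import math

def servers_needed(peak_rps, avg_response_s, max_concurrent_per_server):
    """Estimate server count from Little's law:
    concurrent requests in flight = throughput * response time."""
    concurrent = peak_rps * avg_response_s
    return math.ceil(concurrent / max_concurrent_per_server)

# Example: 500 requests/sec at 0.4 s average response time
# is 200 requests in flight; at 50 concurrent requests per server,
# that calls for 4 servers.
estimate = servers_needed(500, 0.4, 50)
```

The inputs here are exactly what load testing gives you: measured response time under load and the concurrency level at which a single server starts to degrade.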

Now let’s switch back to thinking like a software engineer. From a code base perspective, web applications should be tier-based. Here is a simple tiered approach:

UI -> Business Logic -> Persisted Object -> DB

You might also have another tiered data model that runs in parallel, which could be used to process or retrieve data that is not in immediate use by the user. Usually a messaging protocol such as JMS or RMI is used as the means of communication between these parallel data model tiers. The benefit of having a tier-based architecture is that you can cache data that doesn’t change frequently across tiers, thereby limiting the number of expensive DB calls made. Moreover, as the number of concurrent users increases, securing data across users becomes pivotal. With a tier-based approach, only certain tiers can manipulate and persist data.

I’m sure we’ve all learned from our intro computer architecture class that CPU-bound processes are fast and can be parallelized, whereas I/O-bound processes are the bottleneck. In the case of a website, accessing the DB is the slowest I/O operation. However, you can speed up access to data by sharding the database. Sharding breaks a large database into smaller pieces: each shard may contain redundant reference data, or a parent DB can map data to the separate shard DBs.
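One common way to do the mapping is hash-based sharding, where a stable hash of the key decides which shard holds the row. A minimal sketch, with plain dicts standing in for the separate shard databases (the shard count and key format are arbitrary choices for illustration):

```python
import hashlib

NUM_SHARDS = 4
shards = {i: {} for i in range(NUM_SHARDS)}  # stand-ins for separate DBs

def shard_for(key):
    """Map a key to a shard deterministically via a stable hash.
    (Don't use Python's built-in hash(): it varies between runs.)"""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def put(key, value):
    shards[shard_for(key)][key] = value

def get(key):
    return shards[shard_for(key)].get(key)

put("user:42", {"name": "Ada"})
# get("user:42") reads from the same shard that put() wrote to
```

A production setup would add a routing layer in front of the shards and a plan for rebalancing when `NUM_SHARDS` changes, but the core idea is this deterministic key-to-shard mapping.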

The last and priciest technique is running multiple servers. Configuring a load balancer to receive requests and distribute them across the servers is one way of improving throughput and response times for users.
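In production you would reach for a dedicated load balancer rather than writing your own, but the core distribution logic is simple. A sketch of the round-robin strategy most balancers offer (the server names are placeholders):

```python
import itertools

class RoundRobinBalancer:
    """Distribute incoming requests across servers in rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def route(self, request):
        """Pick the next server in rotation for this request."""
        server = next(self._cycle)
        return server, request

lb = RoundRobinBalancer(["app1", "app2", "app3"])
assignments = [lb.route(f"req-{i}")[0] for i in range(6)]
# assignments cycles: app1, app2, app3, app1, app2, app3
```

Real balancers layer health checks on top of this, which is also what enables the failover mentioned earlier: a server that stops responding is simply dropped from the rotation.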

Improving the scalability of your website is a good problem to have, because it means your site is growing! But you don’t want to wait until a server crashes or a DB thrashes. A little forethought will keep your user base growing and keep them coming back for more!

* Future article on load testing.
