Kung-Fu Chop To the Load Beast – Part I

A goal shared by every website and online service provider is to attract as many visitors, or server hits, as possible. That, of course, excludes spam and denial-of-service attacks. While more traffic means more business, it also creates the need to ensure the scalability and load capacity of the server. Interestingly, most engineers rush to test load capacity: they throw tons of load at the server on the very first day and expect great results. Guess what? Either the server becomes unresponsive (meaning it has probably crashed), or it takes the load but the engineer has unknowingly over-provisioned the hardware and receives a shockingly high server invoice at the end of the month.

Clearly one wants neither too little server muscle nor too big and expensive a server. To look the load creep in the eye, you need a strategy. Here is part I of my kung-fu attempt to chop it down. I would like to keep it high level and strategic. In part II I will discuss specific details of our recent optimization exercise on Rails.

I must disclaim that server tuning and optimization is a big topic with tons of material available; what follows is my experience with load testing and optimizing a Ruby on Rails server for a messaging engine project.

Before You Start Optimizing and Scaling:

a) Size the Beast: Estimate the target load on your server. I prefer a number like transactions per minute rather than transactions per second: when processing is heavy and a normal transaction spans a few seconds, a per-second figure becomes too small to be meaningful. It is also important to define clearly what a transaction means in your system. Does a single DB hit count as a transaction, or does serving a user request end to end count as one?

b) Benchmark Response Times: The goal of optimization and scaling should not be solely to handle more load on the server, but also to serve requests within a decent response time. A server that handles tons of load but keeps users waiting for long periods will soon put the CEO out of business.

c) Choose an appropriate load generation mechanism: This could be a free tool like JMeter or SoapUI, which can generate massive numbers of HTTP hits on the server. These tools are quite flexible: you can configure the exact load to put on the server using multi-threading, and attach a data pool to vary the request data. You can also write your own code to generate load if the request structure is complex. In our case, we used both.
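To make the three points above concrete, here is a minimal sketch of a home-grown load generator in Python. It is not the tool we used; `fake_transaction`, `run_load`, and the numbers passed to them are hypothetical stand-ins. The idea is simply to fire a fixed number of "transactions" from several threads and report throughput in transactions per minute alongside response times.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_transaction():
    """Stand-in for one end-to-end user request (hypothetical).

    Replace the sleep with a real HTTP call against your own server.
    """
    time.sleep(0.01)


def timed_call(transaction):
    """Run one transaction and return its response time in seconds."""
    start = time.perf_counter()
    transaction()
    return time.perf_counter() - start


def run_load(transaction, total_requests, concurrency):
    """Fire total_requests transactions using `concurrency` worker threads."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        response_times = list(pool.map(timed_call, [transaction] * total_requests))
    elapsed = time.perf_counter() - wall_start

    response_times.sort()
    # Approximate 95th percentile using the nearest-rank method.
    p95_index = max(0, int(len(response_times) * 0.95) - 1)
    return {
        "requests": len(response_times),
        "transactions_per_min": len(response_times) / elapsed * 60,
        "avg_response_s": sum(response_times) / len(response_times),
        "p95_response_s": response_times[p95_index],
    }


if __name__ == "__main__":
    print(run_load(fake_transaction, total_requests=200, concurrency=10))
```

Reporting a percentile next to the average is a deliberate choice: an average can look healthy while a slice of users is still kept waiting.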

Places to look for Optimization:

a) Start with code optimizations: Hotspots are DB calls, third-party web service calls, and parsing of large JSON or XML documents. In my experience, doing DB writes and JSON/XML parsing asynchronously (wherever possible) greatly improves system performance and user experience. We improved one of our routines by 800% using asynchronous DB writes.

b) Application server threads: The number of application server request threads should always stay in proportion to the hardware muscle. You don’t want too much or too little parallel request handling on the application server. Too much leads to CPU or memory starvation; too little means you are paying for an oversized server at month end. With our fairly standard request size, we set MaxClients to 50 for Apache on a standard EC2 XLarge instance, which runs at about 50% of CPU capacity.

c) Caching: Caching saves us from disk and network latency by reusing data that has already been fetched. It is available at multiple levels: web server caching, the SQL caching provided by standard RDBMSs, and third-party caches such as Memcached.

d) DB Indexing: This is nothing new or cutting-edge and has been in use for a long time, but there is a catch. Normally we create DB indexes on the tables we query the most. However, if there are also massive CUD (Create, Update and Delete) operations on a table, indexes will really slow them down, because the database has to update the index B-trees on every write.
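The asynchronous DB writing mentioned in a) can be sketched roughly as follows. This is an illustration in Python, not our Rails code: `AsyncWriter`, `slow_db_write`, and the in-memory `stored` list are hypothetical, and the 50 ms delay merely simulates a slow write. The request path only enqueues the record, and a background thread does the actual write.

```python
import queue
import threading
import time


class AsyncWriter:
    """Queue records and persist them on a background thread."""

    _STOP = object()  # sentinel telling the worker to exit

    def __init__(self, write_fn):
        self._write_fn = write_fn
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, record):
        """Return immediately; the actual write happens later."""
        self._queue.put(record)

    def _drain(self):
        # Process records in FIFO order until the stop sentinel arrives.
        while True:
            record = self._queue.get()
            if record is self._STOP:
                break
            self._write_fn(record)

    def close(self):
        """Write everything queued so far, then stop the worker."""
        self._queue.put(self._STOP)
        self._worker.join()


stored = []  # stand-in for a database table (assumption for the demo)


def slow_db_write(record):
    time.sleep(0.05)  # pretend each write takes 50 ms
    stored.append(record)


writer = AsyncWriter(slow_db_write)
start = time.perf_counter()
for i in range(5):
    writer.submit({"message_id": i})
# Time spent on the "request path": only the enqueue calls, not the writes.
request_path_time = time.perf_counter() - start
writer.close()
```

The trade-off is worth stating: records still sitting in the queue are lost if the process dies, and this sketch does no error handling, so a failing write would stop the worker. It suits data where a short delay before persistence is acceptable.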

Guide Yourself in Load Testing:

a) A cyclic approach is what works: Run more than one test and record each of them. I have found it useful to keep a simple spreadsheet that records the details and results of every test run. I suggest recording basic information such as the hardware profile, changes in settings or hardware since the previous test, load put on the server, throughput of the server, exceptions or crashes, and duration of the test.

b) It’s important to introduce one change at a time to the system, be it a DB index, Memcached, or more memory. If we introduce more than one change for a test run, it will be hard to determine the positive or adverse effect of each change independently.

c) Profile your system: We recorded the following information during the tests: CPU, memory and disk usage using Munin, system throughput using New Relic, and system response times using JMeter.

d) Do not forget longevity tests: While we run many short-duration tests, it is also important to run 10-hour or day-long tests to find out whether there are any dormant memory leaks that might crash the system within a few days.
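The test-run spreadsheet from a) can be as simple as a CSV file. The sketch below, in Python, is one possible shape for it; the column names, the `append_run` and `to_csv` helpers, and every value in the sample rows are illustrative placeholders, not our actual measurements.

```python
import csv
import io

# Columns follow the fields suggested above; the names are assumptions.
FIELDS = [
    "run", "hardware_profile", "change_from_previous", "load_tpm",
    "throughput_tpm", "exceptions", "duration_min",
]


def append_run(rows, **details):
    """Add one test run, numbering it and rejecting unknown columns."""
    unknown = set(details) - (set(FIELDS) - {"run"})
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    rows.append({"run": len(rows) + 1, **details})


def to_csv(rows):
    """Render all recorded runs as CSV text that any spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = []
# Placeholder values only, to show one change per run.
append_run(rows, hardware_profile="example-large", change_from_previous="baseline",
           load_tpm=1000, throughput_tpm=950, exceptions=0, duration_min=15)
append_run(rows, hardware_profile="example-large", change_from_previous="added DB index",
           load_tpm=1000, throughput_tpm=990, exceptions=0, duration_min=15)
csv_text = to_csv(rows)
```

Keeping a single `change_from_previous` column nudges you toward the one-change-per-run discipline from b): if the cell needs two entries, the run is already hard to interpret.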

Below is my attempt to picture the optimization process in a simple flow chart:

In part II I will discuss the specifics of our recent load testing exercise on Rails.

