A few days ago, as I was happily chopping away at my duct-taped Rails application, I decided it was a good moment to measure the bottlenecks and optimize a bit. Here is an overview of what I did. It consists of common-sense actions that I have picked up here and there in my ongoing battle with slow applications.
Starting position
The application's main index page was loading in about 4.5 seconds, and the Apache Benchmark tool (just ‘ab’ from now on) measured a throughput of 5 pages/second.
Paging
Paging is a pet peeve of mine. As a user I hate it; as a developer I need it to make the application work. A special annoyance is deciding how many items per page is enough. My rule of thumb is 20-25 items, so that users don’t have to click through pages like crazy (deep down I always wonder whether anybody even uses paging).
One tweak I had implemented before the performance tuning was the jquery.pageless plug-in. It gets rid of paging altogether by loading content as it is needed. In this light I reconsidered the number of items to preload. I opted for 6 elements per page (just one screen’s worth of data). If users want more, it is just a scroll away.
This reduced the load time dramatically, down to 1.5 seconds.
N+1 Query
This got me wondering: why is each loaded item so costly? I fixed some queries that I could see would benefit from eager loading. To make sure I had got them all, I used the neat bullet gem, which snitches on N+1 queries.
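To make the N+1 pattern concrete, here is a minimal sketch in plain Ruby. The FakeDb class and the post/author data are hypothetical stand-ins for a real database, not code from the application. The first method issues one lookup per item; the second fetches all the authors in a single batch, which is roughly what ActiveRecord's `includes` does for you.

```ruby
# A tiny in-memory stand-in for a database that counts the queries it serves.
# Hypothetical, for illustration only.
class FakeDb
  attr_reader :query_count

  def initialize(posts, authors)
    @posts = posts       # array of { id:, author_id: } hashes
    @authors = authors   # hash of author id => author name
    @query_count = 0
  end

  # One query returning all posts.
  def posts
    @query_count += 1
    @posts
  end

  # One query per single author lookup.
  def author(id)
    @query_count += 1
    @authors[id]
  end

  # One query returning many authors at once.
  def authors(ids)
    @query_count += 1
    @authors.select { |id, _name| ids.include?(id) }
  end
end

# N+1: one query for the posts, then one more query for each post.
def names_n_plus_one(db)
  db.posts.map { |post| db.author(post[:author_id]) }
end

# Eager loading: one query for the posts, one for all their authors.
def names_eager(db)
  posts = db.posts
  authors = db.authors(posts.map { |post| post[:author_id] }.uniq)
  posts.map { |post| authors[post[:author_id]] }
end
```

With 6 posts on a page, the first version costs 7 queries and the second costs 2, regardless of how many items are shown.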
One nice trick is that even if you are using Sunspot Solr Search there is a way to eager load. It goes something like this:
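A minimal sketch of the idea, assuming sunspot_rails and a hypothetical Post model with author and comments associations (the model, the associations and the method name are my assumptions, not taken from the application). Passing `:include` to `search` asks Sunspot to eager load those associations when it fetches the matching records from the database.

```ruby
# Hypothetical helper; assumes sunspot_rails is loaded and Post is `searchable`.
def search_posts(query)
  search = Post.search(include: [:author, :comments]) do
    fulltext query                  # full-text match on the indexed fields
    paginate page: 1, per_page: 6   # one screen's worth of results
  end
  search.results                    # records, with associations preloaded
end
```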
This reduced the load time to around 1 second.
The moral so far is that it is almost always coding practices and trade-offs that count the most.
Assets
Of course you are using the asset pipeline, so all of the client-side code is minified and streamlined. I just turned on asset_sync to serve it from a CDN. This is a bit faster and of course scales better (ab confirmed this). I am using Amazon S3 as storage. It is important to serve the files from the CDN and not directly from S3. By now I was at around 800ms. That is not bad considering the starting point. The deploy times went up a bit, especially for the first deploy, where I had to copy all the assets to S3.
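A minimal sketch of an asset_sync initializer, assuming the credentials and the bucket name come from environment variables (the variable names here are my assumption). Note that asset_sync only uploads the compiled assets to the bucket. To actually serve them from the CDN, `config.action_controller.asset_host` in the production environment has to point at the CDN hostname rather than at the S3 bucket.

```ruby
# config/initializers/asset_sync.rb
# The guard keeps environments that do not load the gem working.
if defined?(AssetSync)
  AssetSync.configure do |config|
    config.fog_provider          = "AWS"
    config.aws_access_key_id     = ENV["AWS_ACCESS_KEY_ID"]      # assumed variable name
    config.aws_secret_access_key = ENV["AWS_SECRET_ACCESS_KEY"]  # assumed variable name
    config.fog_directory         = ENV["FOG_DIRECTORY"]          # the S3 bucket name
    config.gzip_compression      = true    # upload gzipped versions of the assets
    config.existing_remote_files = "keep"  # do not delete assets from older deploys
  end
end
```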
User generated content
For all user-generated content I started using S3 storage in combination with the CDN. This reduced the load time to around 400ms. It varies, but it is always under 800ms. A quick ab benchmark showed 24 pages/second with 10 concurrent users.
There is one trap here that I fell into. I was using the aws/s3 gem, since I really like its API. It turns out that you need aws-sdk to make S3 work with paperclip, and having both of them installed breaks aws/s3. I had to rewrite the bit of code where I manage uploads to S3 and the signing of user-generated content.
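For a rough cross-check of numbers like these without ab, here is a minimal sketch in plain Ruby of what "pages/second with N concurrent users" means. The method name and parameters are hypothetical, and this is nowhere near a replacement for a real load-testing tool.

```ruby
require "net/http"
require "uri"

# Runs the given block `total` times, spread over `concurrency` threads,
# and returns the observed throughput in calls per second.
def requests_per_second(total:, concurrency:)
  queue = Queue.new
  total.times { |i| queue << i }

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  workers = Array.new(concurrency) do
    Thread.new do
      loop do
        begin
          queue.pop(true)   # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break             # no work left for this thread
        end
        yield               # perform one request
      end
    end
  end
  workers.each(&:join)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

  total / elapsed
end

# Usage, assuming the application is running locally on port 3000:
#   uri = URI("http://localhost:3000/")
#   puts requests_per_second(total: 100, concurrency: 10) { Net::HTTP.get_response(uri) }
```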
Turbolinks
As a last step I tried Turbolinks. I really like the idea, and it is noticeably faster in action since I have a decent amount of .js code. BUT it doesn’t work 100% with all the libraries I am using. That is unfortunate, so I had to turn it off (like the last three times I tried it).
Benchmarking
One thing I noticed was that when I launch an ab benchmark, the first run is fast. The second run is much slower, and when I look at the server it seems as if only one Passenger instance is serving requests. I assumed this was due to my Rails instance freezing or something similar, so I spent quite some time on it. I even switched the server to Unicorn but, to my dismay, the problem persisted. Then I turned my attention to ab itself, this 300-year-old, war-hardened tool: it looks like it has a bug on OS X, and you need to recompile it to make it go away.
Well, at least I got Unicorn zero-downtime deploys in the process…