tl;dr: I converted my blog to a static site using Jekyll/Octopress and am now hosting it on Amazon CloudFront. It is much faster. There are tons of benchmarks and charts near the bottom of this post.

Before

  • Ruby on Rails framework
  • Hosted on Heroku (single dyno)
  • Unicorn as a server
  • Memcached + Dalli (5MB tier Heroku add-on)
  • Images hosted on CloudFront
  • No image compression or gzipping

Before I made the switch, my blog was a small Ruby on Rails app hosted on Heroku. I was using Unicorn as a server, which meant I could run multiple server processes on a single Heroku dyno. After some tweaking, I determined that 6 running processes was the sweet spot. I was also using Memcached and the Dalli gem for in-memory caching. Because my pockets were (and are) empty, I was on the free 5MB tier add-on from Heroku. I was also hosting images on CloudFront, but nothing else. This cost me only pennies a month.

Before I go any further, I want to emphasize that this setup was performing great. I wasn’t noticing any serious performance issues. I was motivated by curiosity and a drive to try and create something even better, not by any shortcomings in Heroku or the Rails framework.

After

  • Jekyll & Octopress as a framework
  • Everything is hosted on CloudFront
  • Almost everything is gzipped
  • Images are compressed with ImageMagick

I wanted to see if I could make my blog as fast as possible for free, or at least dirt cheap. I decided to give Jekyll a try. Jekyll is a static site generator written in Ruby and specifically designed for blogs. The basic idea is that instead of having a backend framework and a database, you can convert everything in your blog to plain html files (like the olden days). Because Jekyll is so bare bones, I also used Octopress, a blogging framework built on top of Jekyll (the same generator that powers GitHub Pages). (I ended up trimming a lot of fat from it, saving only the pieces I needed.) I expect that this will still only cost pennies a month.

To squeeze every last bit of performance out of my blog, I made some other important changes. First, I gzipped all the html, css, js, and fonts. Next, I compressed all the images automatically using ImageMagick. Finally, I removed all the JavaScript (except for Google Analytics). That’s right. It turns out I didn’t really need it. If that ever changes, I can always add it back.
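As a rough illustration of the gzip step, here is a minimal sketch in Ruby. It is not the actual deploy code: the extension list and the helper names (`gzippable?`, `gzip_in_place`, `gzip_site`) are assumptions made for the example. The idea is to replace each text-like file with a gzipped copy under the same name, so that URLs stay unchanged; the host then has to send a `Content-Encoding: gzip` header with those files.

```ruby
require "zlib"
require "fileutils"

# File types worth gzipping. This list is an assumption; images are left out
# because JPEG and PNG data is already compressed.
GZIP_EXTENSIONS = %w[.html .css .js .xml .svg .ttf .otf .eot].freeze

# Returns true when the file extension is on the list above.
def gzippable?(path)
  GZIP_EXTENSIONS.include?(File.extname(path).downcase)
end

# Replaces the file with a gzipped copy under the same name.
def gzip_in_place(path)
  temp_path = "#{path}.gz"
  Zlib::GzipWriter.open(temp_path, Zlib::BEST_COMPRESSION) do |gz|
    gz.write(File.binread(path))
  end
  FileUtils.mv(temp_path, path)
end

# Gzips every eligible file under site_dir and returns the paths it touched.
def gzip_site(site_dir)
  Dir.glob(File.join(site_dir, "**", "*"))
     .select { |path| File.file?(path) && gzippable?(path) }
     .each { |path| gzip_in_place(path) }
end
```

Calling `gzip_site("public")` on a freshly generated site would compress it in one pass. The sketch is not idempotent: running it twice on the same output directory would gzip the files twice, so it only makes sense right after a clean rebuild.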

In case you’re curious, I made a gist which shows my deploy process. Everything is automatic. When I run bundle exec rake deploy this is what happens:

  1. Re-generate the site using Jekyll.
  2. Minify all html, css, and js using jitify. (optional)
  3. Gzip everything that should be gzipped.
  4. Compress all images using ImageMagick.
  5. Sync with Amazon S3 (only upload the files that have changed).
  6. Invalidate the objects on CloudFront as needed.
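Steps 5 and 6 boil down to figuring out which files differ from what is already deployed. The sketch below shows one way to do that, assuming you can get a hash of remote paths to MD5 digests (for simple, non-multipart uploads the S3 ETag is normally the MD5 of the object). The function names and the shape of the `remote` hash are hypothetical, and no AWS calls are made here.

```ruby
require "digest"
require "pathname"

# Maps each file under site_dir to its MD5 hex digest, keyed by relative path.
def local_digests(site_dir)
  root = Pathname.new(site_dir)
  files = Dir.glob(File.join(site_dir, "**", "*")).select { |path| File.file?(path) }
  files.each_with_object({}) do |path, digests|
    key = Pathname.new(path).relative_path_from(root).to_s
    digests[key] = Digest::MD5.file(path).hexdigest
  end
end

# Compares local digests with the digests of what is already deployed.
# remote is a hypothetical { "path" => "md5" } hash, e.g. built from S3 ETags.
def sync_plan(local, remote)
  uploads = local.reject { |key, md5| remote[key] == md5 }.keys.sort
  deletions = (remote.keys - local.keys).sort
  # Only objects that already existed can be stale in the CDN cache,
  # so brand new files are uploaded but not invalidated.
  stale = uploads.select { |key| remote.key?(key) } + deletions
  { upload: uploads, delete: deletions, invalidate: stale.sort.map { |key| "/#{key}" } }
end
```

The `:invalidate` list uses a leading slash because that is the path format CloudFront expects in an invalidation request. Skipping brand new files there is a design choice in this sketch, not something taken from the original deploy script.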

Benchmarks

I used three different tools to test my new blog setup: Apache Benchmark, Which Loads Faster, and the Pingdom Full Page Tester.

For each tool, I tested three different pages: the homepage, a post with one picture, and a post with no pictures. I always tested the most direct domain name possible, i.e. instead of blog.alexbrowne.info I used d1koatif4i39jr.cloudfront.net. All of the data from these benchmarks is publicly viewable.

I also intended to use Blitz, but a JavaScript error prevented the site from working correctly at the time of this writing. (A shame, because I’ve been pretty happy with them in the past.) I will try again later and post an update if it works.

Apache Benchmark

Apache Benchmark (ab) gives the tester a lot of control and can simulate multiple users loading a web page simultaneously. Each simulated user waits until its request is completed before it starts the next one, and keeps sending requests until the test terminates. Since ab runs on your own computer, it can be limited by your hardware and/or internet connection. Also, it is my understanding that ab does not load images.

For each of the three pages, I ran 9 different tests with 1,000 requests and different levels of concurrency ranging from 1 to 1,000. The tenth test (the “stress test”) was 10,000 requests at a concurrency level of 1,000.
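For anyone who wants to repeat the runs, the test matrix can be written down as a list of ab invocations. This is a sketch rather than the exact commands used: the `http://` scheme and the helper names are assumptions. With ab, `-n` sets the total number of requests and `-c` sets how many are kept in flight at once.

```ruby
# Concurrency levels used for the nine regular tests.
CONCURRENCY_LEVELS = [1, 5, 10, 25, 50, 100, 250, 500, 1000].freeze

# Builds one ab invocation: -n is the total number of requests and -c is the
# number of requests to keep in flight at once.
def ab_command(url, requests:, concurrency:)
  "ab -n #{requests} -c #{concurrency} #{url}"
end

# Nine regular tests of 1,000 requests each, plus the 10,000 request stress test.
def test_plan(url)
  regular = CONCURRENCY_LEVELS.map do |level|
    ab_command(url, requests: 1_000, concurrency: level)
  end
  regular + [ab_command(url, requests: 10_000, concurrency: 1_000)]
end

# The scheme is an assumption; ab needs a URL that includes a path.
puts test_plan("http://d1koatif4i39jr.cloudfront.net/")
```

The same plan would be repeated for each of the three pages by changing the URL.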

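Dynamic Site tested by Apache Benchmark

| Concurrency | Time (ms) | Req/s | Transfer Rate (Kb/s) |
|-------------|-----------|--------|----------------------|
| 1           | 63.67     | 15.89  | 154.76               |
| 5           | 82.67     | 60.87  | 599.67               |
| 10          | 93.33     | 110.35 | 1053.7               |
| 25          | 237.33    | 110.17 | 1034.6               |
| 50          | 433.67    | 118.41 | 1123.9               |
| 100         | 851.67    | 117.64 | 1119.5               |
| 250         | 1880.7    | 119.39 | 1137.3               |
| 500         | 3360.3    | 116.31 | 1107.1               |
| 1000        | 5436.0    | 99.44  | 999.15               |
| Stress Test | 10754     | 101.49 | 919.58               |

Static Site tested by Apache Benchmark

| Concurrency | Time (ms) | Req/s  | Transfer Rate (Kb/s) |
|-------------|-----------|--------|----------------------|
| 1           | 39.33     | 25.65  | 122.57               |
| 5           | 58.33     | 86.64  | 441.86               |
| 10          | 87.00     | 114.35 | 561.97               |
| 25          | 82.67     | 295.09 | 1467.3               |
| 50          | 145.00    | 357.13 | 1727.3               |
| 100         | 161.00    | 575.52 | 2859.4               |
| 250         | 478.00    | 367.74 | 1849.3               |
| 500         | 883.67    | 312.38 | 1562.2               |
| 1000        | 1536.0    | 279.37 | 1370.2               |
| Stress Test | 5271.7    | 170.14 | 829.15               |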

Note the logarithmic scale on the following line chart…

Based on average response time, the static site performed 2.65x (or 62.3%) better than the dynamic one. The static site was also able to sustain a 166% higher request rate and a 39% higher transfer rate.
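To show how the "2.65x" and "62.3%" figures relate, here is the arithmetic as a short Ruby script. The numbers are copied from the two Apache Benchmark tables; the constant and variable names are just for illustration. A speedup of s means each request takes 1/s of the old time, so the time saved is 1 - 1/s.

```ruby
# Average response times (ms) and request rates from the two ab tables.
DYNAMIC_MS = [63.67, 82.67, 93.33, 237.33, 433.67,
              851.67, 1880.7, 3360.3, 5436.0, 10754.0].freeze
STATIC_MS = [39.33, 58.33, 87.00, 82.67, 145.00,
             161.00, 478.00, 883.67, 1536.0, 5271.7].freeze
DYNAMIC_RPS = [15.89, 60.87, 110.35, 110.17, 118.41,
               117.64, 119.39, 116.31, 99.44, 101.49].freeze
STATIC_RPS = [25.65, 86.64, 114.35, 295.09, 357.13,
              575.52, 367.74, 312.38, 279.37, 170.14].freeze

def mean(values)
  values.sum / values.size
end

# How many times faster the static site responds on average.
speedup = mean(DYNAMIC_MS) / mean(STATIC_MS)
# The same result expressed as the share of response time saved.
time_saved_pct = (1 - 1 / speedup) * 100
# How much higher the average request rate is.
rate_gain_pct = (mean(STATIC_RPS) / mean(DYNAMIC_RPS) - 1) * 100

puts format("%.2fx faster", speedup)                      # 2.65x faster
puts format("%.1f%% less time", time_saved_pct)           # 62.3% less time
puts format("%.0f%% higher request rate", rate_gain_pct)  # 166% higher request rate
```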

The sweet spot for the static site seems to be around a concurrency level of 100. At that level, the static site was able to serve an impressive 575 req/s at 2,859 Kb/s, all while maintaining an average response time of only 161 ms! That’s roughly five times better than the dynamic site managed in the same test (118 req/s at 852 ms).
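As a rough sanity check on those numbers: in a test where each simulated user waits for a response before sending the next request, throughput should be approximately the number of users divided by the average response time. The helper below is a hypothetical illustration of that rule of thumb, not something ab reports in this form.

```ruby
# In a closed-loop test each simulated user waits for a response before
# sending the next request, so throughput is roughly users / response time.
def expected_req_per_sec(concurrency, avg_response_ms)
  concurrency / (avg_response_ms / 1000.0)
end

static_estimate = expected_req_per_sec(100, 161.0)
dynamic_estimate = expected_req_per_sec(100, 851.67)

puts format("static:  %.0f req/s estimated, 575.52 measured", static_estimate)
puts format("dynamic: %.0f req/s estimated, 117.64 measured", dynamic_estimate)
# static:  621 req/s estimated, 575.52 measured
# dynamic: 117 req/s estimated, 117.64 measured
```

The estimates land close to the measured rates. The small gap for the static site is expected, since the table values are averages across three different pages.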

Another important note is that the dynamic site had an average error rate of about 19%. The Heroku logs didn’t indicate that anything was amiss, so I don’t know exactly what was causing the errors. On the other hand, in all the trials the static site didn’t return a single error! Already we can see that the new site is performing better and more consistently under heavy load.

Which Loads Faster

Unlike Apache Benchmark, this tool only does one request at a time. I believe it uses your own hardware and internet connection, and it does load everything on the page, including images and JavaScript. Because it loads the whole page, this tool does a good job of showing more realistic results. For each trial, I pitted the dynamic site and the static site directly against each other and recorded the average of 100 requests. I did three trials (100 requests each) for each of the three pages. Below are the averages for each page and the overall averages.

Which Loads Faster? (all times in ms)

| Page                | Dynamic | Static |
|---------------------|---------|--------|
| Homepage            | 276.67  | 121.67 |
| Post w/ 1 Picture   | 262.33  | 119.00 |
| Post w/ no Pictures | 219.67  | 102.33 |
| Avg. of All Pages   | 252.89  | 114.33 |

The results show that the static site is performing 2.21x (or 54.8%) better than the dynamic one. They also show that the entire page is loading on average in a mere 114 ms on my laptop, which is insanely fast! Of course, as we’ll see in the next benchmark tool, this has a lot to do with my location and internet speed, and you can’t necessarily get the same results everywhere.

It’s also worth noting that the server will only respond this fast if you limit it to one request at a time. And the results might have been affected by the fact that I performed these tests in the late hours of the night when Amazon wasn’t seeing much traffic on their U.S. servers.

Pingdom Full Page Tester

Unlike the previous tools, this one uses servers in three locations around the world to run its tests and does not depend on your own hardware or internet connection. The three locations are New York, Dallas, and Amsterdam. It also tests only a single request at a time, so the results won’t reflect what happens when the server is experiencing heavy load. I performed a single page load test five times for each of the three pages (homepage, a post with one picture, and a post with no pictures). Below is the matrix of averages. On one axis is the average for each location, and on the other axis is the average for each page. In the bottom right corner is the overall average for all locations and all pages.

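Dynamic Site tested by Pingdom Tools (ms)

| Location  | Homepage | Post w/ 1 Picture | Post w/ no Pictures | Avg. |
|-----------|----------|-------------------|---------------------|------|
| New York  | 806      | 832               | 462                 | 735  |
| Dallas    | 1298     | 950               | 634                 | 961  |
| Amsterdam | 1192     | 1126              | 1338                | 1219 |
| Avg.      | 1134     | 969               | 811                 | 972  |

Static Site tested by Pingdom Tools (ms)

| Location  | Homepage | Post w/ 1 Picture | Post w/ no Pictures | Avg. |
|-----------|----------|-------------------|---------------------|------|
| New York  | 806      | 508               | 380                 | 565  |
| Dallas    | 801      | 643               | 439                 | 628  |
| Amsterdam | 154      | 245               | 201                 | 200  |
| Avg.      | 587      | 465               | 340                 | 464  |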

Based on the average of all three locations and all three pages, the static site performed 2.09x (or 52.25%) better than the dynamic one. What might be more interesting, though, is that this tool showed something that the others didn’t: the static site has much better global performance. This is particularly noticeable in Amsterdam, where the average load time was only 200 ms, 6 times faster than the dynamic site performed at that location.

The Pingdom full page tester also reports how your website compares to all other tested websites in their database. While the dynamic site was faster than around 85-90% of all other tested websites, the static one was faster than upwards of 95%. In Amsterdam, my blog was faster than 98-100% of all other tested sites!

Conclusions

If you combine the load time averages from each of the three benchmark tools, you’ll find that the changes I made increased my blog performance by around 56.4%. It is now 2.3x faster, and it shows. The exact performance depends on your internet speed, of course, but in my experience I don’t notice any lag as I navigate between pages. None.
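For reference, the combined figures are simple averages of the per-tool results reported above. Here is that calculation as a tiny Ruby script; the hash layout and names are only for illustration.

```ruby
# Time saved (%) and speedup factor reported for each tool in this post.
RESULTS = {
  "Apache Benchmark"   => { saved_pct: 62.3,  speedup: 2.65 },
  "Which Loads Faster" => { saved_pct: 54.8,  speedup: 2.21 },
  "Pingdom"            => { saved_pct: 52.25, speedup: 2.09 }
}.freeze

def mean(values)
  values.sum / values.size
end

avg_saved_pct = mean(RESULTS.values.map { |result| result[:saved_pct] })
avg_speedup = mean(RESULTS.values.map { |result| result[:speedup] })

puts format("%.2f%% less load time on average", avg_saved_pct)  # 56.45% less load time on average
puts format("%.2fx faster on average", avg_speedup)             # 2.32x faster on average
```

Each tool is weighted equally here, even though the three tools measure quite different things, so the combined number is best read as a rough summary.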

I also learned that my blog will now perform much better under a heavy load. Apache Benchmark showed that even at a concurrency level of 500, it can consistently serve up pages in less than one second. Actual performance might even exceed those numbers, since ab could have been limited by my hardware and internet speed and it was only hitting one of CloudFront’s distribution centers.

The performance benefits are particularly noticeable if you’re outside of the States. With the old site, you might have been waiting 1-2 seconds. The new site should be able to consistently serve you in under a second no matter where you’re located.

Don’t forget: if you want to see the entire data set from all three benchmark tools, I have made it publicly viewable. I will also be open-sourcing all the code behind my blog once I get all the kinks worked out and clean it up a bit. I will update this post when that happens.

As for the price? Well, over the course of testing I probably sent more than one million requests to the CloudFront servers and transferred around a gigabyte of data. According to the pricing charts, this will cost me a measly single dollar. Not bad at all, especially considering that’s far more traffic than I expect to get in a typical month.

If you’re running your own blog and have some programming experience, I would strongly recommend you give Jekyll/Octopress a try. You will be pleased with the performance, and writing in markdown with liquid tags can be quite a pleasure. In no particular order, here are some resources that I found helpful:

If you have any other tips on how I can make my blog even faster, or any suggestions for other benchmarks I could try, I’d love to hear them!

Discussion on Hacker News