Essential Programming Books: "Building Scalable Web Sites"

Like many old sk00l developers, I like the security blanket a bookshelf full of technical books gives me. Whenever I have a problem, I first check whether I can figure it out using the reference materials on hand, just by looking at my shelves. Since She Who Must Be Obeyed Around Here frowns on boxes constantly showing up at our house from various online book retailers, I usually only purchase books that I think are going to be good additions to the library.

Taking a quick peek at my bookcase, these are a few of the books that I consider essential reference guides for me:

  • The Pragmatic Programmer
  • Practices of an Agile Developer
  • Mastering Regular Expressions
  • Beautiful Code / Beautiful Architecture
  • JavaScript: The Good Parts

I have more books, but those are the ones closest to my desk.

There is now a new book that has earned a spot on my bookcase, and should be in one of the slots closest to my desk. I am reading a borrowed copy of it, but I plan on rectifying that soon. "Building Scalable Web Sites" by Cal Henderson (he of Flickr and "Why I Hate Django" fame) is that book.

I don't know how I can describe how awesome this book is in this blog. Just check out the topics in the table of contents:

  1. Introduction
  2. Web Application Architecture
  3. Development Environment
  4. i18n, L10n, and Unicode
  5. Data Integrity and Security
  6. Email
  7. Remote Services
  8. Bottlenecks
  9. Scaling Web Applications
  10. Statistics, Monitoring, and Alerting
  11. APIs

Pardon my vulgarity, but that is pretty much every single fucking thing you need to worry about when building a web site that is going to be used by anyone other than you and your close circle of friends.

This book is old in internet time: published in 2006, meaning it was probably written in 2004-2005. A lot of technology has changed, but the principles themselves have not. Also, a lot of the code inside uses PHP for its examples, but I really think this applies to any programming language that is stateless and/or promotes a "shared nothing" architecture. That means (as far as I'm concerned) PHP, Python and Ruby. It's very simple: scaling anything is difficult, but scaling something where there is information that needs to be shared between nodes is extremely difficult. The best way to use this book is to ignore the specific technologies being mentioned and focus on the ideas and practices that are being promoted. I mean, let's be brutally honest: Cal Henderson helped build one of the most massively scaled web sites, using a programming language at its core that the hipster programmer crowd sneers at. Ignore his advice at your peril.
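To make the "shared nothing" idea a bit more concrete, here is a tiny Python sketch of my own. It is not taken from the book, and the names `SharedStore` and `WebNode` are made up for illustration. The assumption is that each web node keeps no per-user state in its own memory and pushes everything into one external store (in real life a database or cache server; here just a dict standing in for one).

```python
class SharedStore:
    """Hypothetical stand-in for an external store (database, cache server)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


class WebNode:
    """A web server node that holds no per-user state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # the only place state lives

    def handle(self, user_id):
        # Read state from the store, update it, and write it back.
        # Nothing about the user is remembered on the node itself.
        visits = self.store.get(user_id, 0) + 1
        self.store.set(user_id, visits)
        return f"{self.name} served {user_id}: visit {visits}"


store = SharedStore()
node_a = WebNode("node-a", store)
node_b = WebNode("node-b", store)

# Either node can take the next request, because neither one owns the state.
print(node_a.handle("cal"))  # node-a served cal: visit 1
print(node_b.handle("cal"))  # node-b served cal: visit 2
```

Because the nodes are interchangeable, adding capacity is mostly a matter of adding more of them. The hard part moves to the store, which is the piece that actually has to share information.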

Go buy this book. Right now. You will not regret it.