go and build amazing applications


So, here’s my advice: go and build amazing applications. Build them with the most boring technology you can find. The stuff that has been in use for years and years. Where every edge case has been covered. Where every library you will ever need has been in production for years. Where every part of the release cycle has been ironed out. Where the best practices on how to do testing are known.

Use PHP. Use Java. Use Python. Use Ruby on Rails. Use .NET. Use jQuery. The language may be more verbose, and the framework may be years old, but at least googling your error message will return you results you can use. You don’t have to invent everything yourself. You’re not going to be the first to hit a certain limit.

Pick your battles.

About a decade ago I was really into forum software. I built forum software in Perl and when I was bored with it, I rebuilt it in PHP and then in Java. I knew what a forum needed to do, that was not a challenge. The challenge was in the technology. If the software you’re building is a snooze, a no-brainer, you can possibly afford some technology risk to spice up your life. If it’s not: don’t take too many chances.

No sequel
Everybody seems to be moving to NoSQL databases, because, you know, “MySQL doesn’t scale! My app will outgrow MySQL!” That’s what we call premature optimisation. First, proof you even have to scale, and if you do, that it’s your database that has scaling issues. Facebook uses MySQL to keep most of its data, are you going to get bigger than them? Accept that you simply do not yet know your technological challenges. At Cloud9, more often than not we predicted our bottlenecks wrong. Dead wrong.

I in no way mean to badmouth NoSQL databases. They have use cases, but you have to make sure you hit them. Redis is an amazing piece of engineering. It’s simple and its performance is unbelievable. There have been cases where a bug in one of our scripts would effectively launch a DoS attack on our Redis server, executing queries like crazy, but Redis wouldn’t break a sweat. Many tens of thousands of requests per second on a single box — no problem.

However, much of the Cloud9 data is very relational: we store users, workspaces, workspace members. A user has many workspaces, a workspace has many workspace members. Everything is encoded in clever ways with keys, hashtables and sets. Sadly, a bug or crash could easily make the data inconsistent. Redis has no way to ensure consistency in any way. In addition, the way data is stored in Redis has to be optimised for the type of queries that need to be performed. If, later on, you find out you need other ways to get your data out, you have a problem. For instance, you want to know which user has most workspaces, or what workspace type is most popular. With SQL that would be a single query. With Redis, because you didn’t plan for this type of query, you have to iterate over every single user and count the number of members in its “workspaces” set. Finding out what workspace type is most popular would require pulling in every single workspace record (hundreds of thousands of them), one query at a time, and then checking the value of the “property” field.

If this is a trade off you make to be able to scale to the million of users you have — great. If you pick this because it’s “cool”: don’t torture yourself.

I always used to be — and in many ways still am — an early adopter. New technology gets me excited, makes me want to play with it and use it for everything. I’ve since learned that this attitude can work, but comes at a cost.

In summation:

Don’t underestimate the value of mature technology. Things will break, and you will have to fix them. It’s bad enough if your own code is a problem, it’s worse when the problem is a poorly understood “feature” of your platform of choice.
Don’t optimise prematurely. Don’t choose C because it’s faster. Don’t use MongoDB because, supposedly, it scales better. Don’t cache until you’re sure you have to.
Be pragmatic. Technology like node.js and Redis have many great uses. If you hit one of their use cases: limit their scope to what makes sense. There’s no need to go all-in at all costs.

This entry was posted in SQL Server and tagged . Bookmark the permalink.