CRUD Ain’t Hard

And now for a little exercise in armchair software architecture — the most despicable coder’s pastime. Dear non-coding readers: despite its name, this blog is still mostly not about programming. Just skip this post or something. Dear coders, many of you will probably disagree with me. I am not a very good or accomplished coder myself, and you probably should not be taking your advice from me. But then again, I could be right, so keep your mind open.

You might be aware of the very popular but uptime-challenged social networking tool called Twitter. They have one of the best problems to have: too many very active users. The site is so popular that it constantly goes down and displays an “over capacity” screen that the users have nicknamed The Fail Whale.

Rapidly writing and displaying short chunks of text with high concurrency on the web is not one of them unsolvable problems in programming. It’s not easy, but with the right people and tools Twitter could be rewritten inside a month. Twitter founders should do some soul searching. Meanwhile the critical mass has already been reached, the niche for bloggers who want to SMS instead of blogging is big, and even horrible uptime can’t kill this service. I use it myself.

There is a lot of speculation in the blogocube about whether the reason behind the Fail Whale is the wrong choice of technology (the highly hyped and sexy Ruby on Rails, and whether it can “scale”). Or is it just simple incompetence?

To me Ruby on Rails falls into a class of technologies that are affected by what I call “the VRML syndrome.” Basically, if I wait long enough the hype will go away, the recruiters will stop posting job listings requiring 4 years of experience in a 4 month old technology, books as fat as my two fists will stop being published, and I will not have to learn it.

What’s the problem with Ruby on Rails? Well, it’s the same problem that slightly affects the content management system that I am currently working with (Drupal), and is the reason why I completely gave up using Microsoft web technologies which are saturated with this shit. See, software craptitechts all of a sudden decided that writing CRUD applications is too difficult for regular developers, and complicated GUI tools and frameworks need to be created to help the poor things. CRUD stands for “Create, Read, Update, Delete” and is just a funny way to say “a browser-based application chock-full’o forms”.

The default way to build these is rather simple. You hand-code the HTML forms, then you write functions or classes to deal with the form input: validators and SQL queries for creating, updating and deleting. Then you write some code that will query the database and display the saved data in various ways: as pages, XML feeds, etc. None of this is difficult or non-trivial. Bad coders don’t do a good job of validation and input sanitizing, resulting in the Little Bobby Tables-type situation, but these things are not very hard to learn and there are great libraries for this.
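To make that concrete, here is a minimal sketch of what the hand-coded approach might look like, in Python with the standard sqlite3 module. The notes table, its columns and every function name are made up for illustration. The one thing that matters is that user input only ever reaches SQL through `?` placeholders, which is what keeps Little Bobby Tables out.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical 'notes' table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE notes ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return db


def validate_note(form):
    """Validate raw form input; return (cleaned_data, errors)."""
    errors = {}
    title = (form.get("title") or "").strip()
    body = (form.get("body") or "").strip()
    if not title:
        errors["title"] = "Title is required."
    elif len(title) > 100:
        errors["title"] = "Title must be 100 characters or fewer."
    if not body:
        errors["body"] = "Body is required."
    return {"title": title, "body": body}, errors


def create_note(db, data):
    """Create: insert a row, passing user input as bound parameters."""
    cur = db.execute(
        "INSERT INTO notes (title, body) VALUES (?, ?)",
        (data["title"], data["body"]),
    )
    db.commit()
    return cur.lastrowid


def read_note(db, note_id):
    """Read: fetch one row as a tuple, or None if it does not exist."""
    return db.execute(
        "SELECT id, title, body FROM notes WHERE id = ?", (note_id,)
    ).fetchone()


def update_note(db, note_id, data):
    """Update: overwrite title and body of an existing row."""
    db.execute(
        "UPDATE notes SET title = ?, body = ? WHERE id = ?",
        (data["title"], data["body"], note_id),
    )
    db.commit()


def delete_note(db, note_id):
    """Delete: remove the row with the given id."""
    db.execute("DELETE FROM notes WHERE id = ?", (note_id,))
    db.commit()
```

A form handler would call `validate_note` first and only touch the database when the returned `errors` dict is empty. Every query is right there in plain sight, which is the whole point.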

Ruby on Rails makes it very easy to create CRUD apps without hand-coding forms or writing SQL. RoR goes to great lengths to abstract out SQL, not trusting the developers to do it right. SQL is more declarative than procedural, and thus a difficult thing for many programmers to grasp, but it’s not that hard. Really. SQL already sits enough levels above the machine that abstracting it out becomes a horrible thing due to the Law of Leaky Abstractions. Even when you have full control of SQL queries, optimizing them is sometimes hard. When they are hidden by another layer, it becomes next to impossible.

In short, RoR makes something that is easy (building CRUD apps) trivial, and something that’s hard (optimizing the database layer) next to impossible.
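Here is a rough sketch of the kind of leak I mean, again in Python with sqlite3. `LazyPost` is a made-up stand-in for an ORM-style object, not the API of Rails or any real library, and the posts and authors tables are invented for the example. The `run` helper just logs each query so the round trips can be counted.

```python
import sqlite3

# Every SQL string sent to the database is appended here so we can count them.
query_log = []


def run(db, sql, params=()):
    """Execute a query, record it in query_log, and return all rows."""
    query_log.append(sql)
    return db.execute(sql, params).fetchall()


def make_db():
    """Build a tiny in-memory database with hypothetical sample data."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
        INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
        INSERT INTO posts VALUES (1, 'first', 1), (2, 'second', 2), (3, 'third', 1);
        """
    )
    return db


class LazyPost:
    """Hypothetical ORM-style object: the author is looked up on access."""

    def __init__(self, db, post_id, title, author_id):
        self.db = db
        self.id = post_id
        self.title = title
        self.author_id = author_id

    @property
    def author_name(self):
        # Looks like a plain attribute, but quietly issues a query each time.
        rows = run(self.db, "SELECT name FROM authors WHERE id = ?", (self.author_id,))
        return rows[0][0]


def titles_via_abstraction(db):
    """One query for the posts, plus one hidden query per post for its author."""
    rows = run(db, "SELECT id, title, author_id FROM posts ORDER BY id")
    posts = [LazyPost(db, *row) for row in rows]
    return [(p.title, p.author_name) for p in posts]


def titles_via_sql(db):
    """The same result from a single hand-written JOIN."""
    return run(
        db,
        "SELECT p.title, a.name FROM posts p "
        "JOIN authors a ON a.id = p.author_id ORDER BY p.id",
    )
```

With three posts, the abstracted version sends four queries and the JOIN sends one. The calling code for the slow version looks perfectly innocent, and that is exactly why it is so hard to find and fix once the queries are buried under a framework.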

In Drupal there are two modules, CCK and Views, that allow you to create CRUD entirely through web interfaces. This is a feature that exists in just about every major CMS; it’s just that in Drupal it’s a little buggier and more complicated than necessary. These are fine for small websites and are really useful to amateurs. The problem arises when these are used for high traffic websites.

I think that a lot of people will agree with me that writing HTML and SQL queries using GUI tools is amateur hour. You just can’t make a good website with Microsoft FrontPage. You can’t, you can’t, you can’t. But in Drupalland it’s all of a sudden fine to use Views to build queries for high traffic sites. Well, it’s not. Dealing with Views and Views Fast Search has been an ongoing nightmare for me. Hell is not even other people’s code in this case. It’s other people’s Views.

RoR, Views, CCK are one level of abstraction higher than you want to be when building a high performance application. The only way they can be an “Enterprise” tool is if your enterprise a) is run by morons who require 100 changes a day AND b) has very few users. In short, if it’s an app for the HR department of a company with 12 employees – knock yourself out. If you are building a public website for millions of people – forget about it.

Yours, Deadprogrammer.

P.S. Yes, I know, you can abstract just about everything and reduce your software application to a single button labeled “GENERATE MONEY”. You have to be a very smart LISP developer for that.

My Return to Blogging

I am finally free of lousy Dreamhost. I also switched from WordPress to Drupal. There probably will be some glitches and missing urls, and the design will stay in the current stock “Garland” theme, but I am back.

I also apologize in advance for rss feed glitches that might happen – I am still tweaking the site.

The Mystery of Obidos

Whoa, caught amazon.com while it was down.
They are showing a page with Rufus, the Amazon dog.

By the way, I have been meaning to write about this for some time now. Did you ever notice the enigmatic word “obidos” in Amazon URLs?

Some theories from usenet:

  • Castle near Lisbon
  • OBI (Wan Kenobi) + DOS (Disk Operating System)
  • ‘OBI’ = Object Broker Interface

    This seems to be the correct answer, though: Obidos is a major port on the Amazon river.

    [update]
    Livejournal user hallerlake had this to add:

    “I worked at Amazon for a couple of years, and can mostly answer that.

    Obidos is the area where the Amazon is “concentrated” – it narrows to a point about a mile wide and a couple hundred feet deep. It’s the chokepoint of the Amazon. A wry sense of humor turned that to the naming scheme.

    The Amazon Marketplace (auctions+zshops+third party) code was called Varzea for similar reasons – it’s the delta point of the Amazon river, where the river fans out.

    Amazon wrote their own web serving environment because the selection of scripting/webcontrol languages when they got started was so lousy. They had to call it something, so obidos it was. :) ”


    Obidos is huge, it might be over a gig by now. I don’t think it’s that bad, though. I haven’t been at Amazon for a few years. For a long time Amazon ran on the Netscape web server environment, then eventually moved to a specially tuned Apache. But yeah, the webservers had a lot of RAM in them so that we could fork a bunch of different processes… and a garbage collector got added to take care of some of the memory leaks. Even still we had a service that killed and restarted processes every hundred accesses or so. It wasn’t pretty.

    I don’t know who came up with the name… I’d bet on Shel Kaphan or possibly Joel Spiegel. Shel set the direction for the company’s software development and architecture, including standardization on C (instead of C++) due to easier debugging. Certainly for the first few years he was The Guy for software architecture; these days I would imagine Al Vermeulen has that task.