On Learning to Code. Or Not.

Alert! Jeff Atwood wrote an excellent post about the “learn to code” movement.

He starts with a tirade full of incredulity about Mayor Bloomberg’s New Year’s resolution to learn to code with Codecademy.

“Fortunately, the odds of this technological flight of fancy happening – even in jest – are zero, and for good reason: the mayor of New York City will hopefully spend his time doing the job taxpayers paid him to do instead.”

Let’s put aside the princely sum of $1 that His Honor collects from the job. Let’s even put aside that Mayor Bloomberg is doing exactly what he’s supposed to be doing – promoting New York’s bustling tech industry. More to put aside: our Mayor happens to be a technology pioneer with a ridiculous IQ.

This all comes down to a very difficult question: should people learn nerdy things when they have little use for them, just for the sake of learning?

I remember a Livejournal discussion that was hashed over and over in the Russian-speaking community. A math teacher was stumped by a question from his student: why was she supposed to learn about trigonometry when she wanted to become a beautician? The teacher did not come up with a good answer, but the livejournalers did dig up some awesome reasons. One well-meaning pro-education-for-the-sake-of-education zealot said something to this effect: well, if you work with nail polish, tangents and cotangents figure prominently in formulas that deal with reflectiveness of thin films. That will lead to a greater understanding of how and why nail polish looks the way it does.

On the surface it may seem that Mayor Bloomberg has about as much need to know how to code as a beautician needs to know about sines and cosines.

There’s more: executives who learned a little bit about writing code at some point tend to say the following phrase “oh, I don’t know much about writing code, just enough to be dangerous”. They say it with this look on their faces:

Jeff takes this further with the plumbing analogy: since almost everyone has a toilet, should everyone take a course at toiletacademy.com and spend several weeks learning plumbing?

Normally I’m against education for the sake of education. I once argued for a whole hour with a co-worker who felt that _any_ education is worth _any_ amount of money. I did not know at the time that he held degrees in Psychology of Human Sexuality, Biology, Sociology and Communications. He must have been on to something: he made an amazing career while mine took a nosedive soon after that discussion.

Here’s where Jeff is wrong (I know, this is shocking, Jeff being all wrong and such): it is better to push people to learn incongruous things than to tell them that this is a bad idea. Steve Jobs learned calligraphy in college and it turned out to be super useful. He might not have become a master calligrapher, but man, did that piece of esoteric knowledge change the world.

When I was in college I badly wanted to take a scientific glass blowing class, but did not. I deeply regret that.

Are there people who learned plumbing from This Old House annoying contractors? Yes. Are self-install refrigerator ice maker lines causing millions in water damage? Yes. Is the world better off because Richard Trethewey taught it some plumbing? Absolutely.

If anything, attempting to learn to code will make people more compassionate towards coders. I do believe that people who are not already drawn to programming are not likely to become programmers. More than that, they are not likely to sit through a whole RoR bootcamp or worse. The learn-to-code movement is not likely to lure in bad programmers, but it might give people some understanding of what coders go through, and maybe make them more hesitant to have loud yelling-on-the-phone sessions near their cubes. Mayor Bloomberg, who enforces open workspace policies everywhere he works, might understand why programmers need offices. Jeff, let His Honor code a bit.

Semi-literate Programming

I recently finished “Coders at Work”, a series of interviews with famous programmers.

On one hand, reading a book like this is a downer: it’s very clear to me that I occupy a place that is very close to the median of the bell curve, and the skill level of programmers is a very steep non-linear curve in itself. I’ll never be as good as JWZ or Brad Fitzpatrick. But I knew that before, and I am ok with it. On the other hand, this book inspired me to read more code.

The programmers in the book disagree on many points, but they mostly agree on the importance of writing readable code and educating yourself by reading other people’s code. I make my living writing in scripting languages, and I haven’t written a line of C or C++ since college. But there’s nothing preventing me from downloading and taking a look at the source of Apache, PHP, MySQL.

It’s important for me to understand “how the sausage is made” in the PHP stack, and as it turns out, what happens between Apache, PHP and MySQL in terms of requests and timeouts is not as simple as one might think. I asked at StackOverflow about this, but all the diagrams that people pointed me at were of the very rudimentary type: “look, here’s a happy cow, it goes to Bovine University, look – it’s all shrink wrapped on the supermarket shelf” instead of “sausage farm/slaughterhouse/truck/factory tour, starting with cow insemination”.

When I downloaded the source code of mod_rewrite, arguably the most useful Apache module in the world, I was amazed to find out that it’s only 5000 lines of C with comments.

The book ends with the interview of Donald Knuth, and two major questions that the interviewer asks everyone are “have you read Knuth’s books” and “have you tried literate programming”. It was interesting to find out that most of the famous programmers use Knuth’s books the same way that I do. The books sit on my bookshelf, I look at them, I sometimes try to read them, I skip most of the math. They serve as a constant reminder to me that I suck at computer science even more than I suck at programming, and luckily there are people out there who know all of this stuff who are not idiots like me.

Here’s a photo of my cubicle at TV Guide circa 2002, Knuth’s books are holding a place of honor next to the mini fridge. By the way, taking pictures of the places where you work and live is something that you should not forget to do: years from now nobody will care about those pictures of flowers, shadows, and sunsets, but

I read the book about Literate Programming at the time, and was rather inspired by it. Ok, maybe I didn’t so much read it as skim it. I don’t think I understood what real literate programming is.

The way I understand it, Literate Programming is a way to write programs as a narrative that is readable to computers and humans. My father, in his former career a site supervisor (a type of contractor), is very fond of giving very detailed instructions to me, the same way he used to give instructions to construction workers. His instructions usually are exhaustive algorithms, with error handling. I think that his instructions, expressed as a flow of consciousness, would work not only on me and construction workers, but on computers as well, and are similar to what Donald Knuth has in mind. All you really have to do is to build a layer of abstraction between these instructions and a computer language. Also, since computers don’t forget things, he would only need to repeat his instructions once.

These days my dad is a COBOL programmer. Everybody dumps on COBOL, but in my mind it’s a language worthy of a lot of respect. It has a syntax that is very English-like, something that makes reading COBOL code easy. Well, maybe it’s like reading some old-timer’s newsgroup post written in all caps, but it’s still much closer to English than most other computer languages.

At the time I was reading “Literate Programming” I was using ASP 3.0, IIS, and SQL Server 97. My task was to write a system that would account for booked and pending business. This is something that has had to be done since the age of Mad Men. You see, the dealings of clients, account executives (like Pete Campbell), their bosses, account coordinators, creative department, etc. are rather convoluted. But in the end, to get paid, you have to have a system that will track who brought in what business, who handled what, and how the commissions need to be split.

This is normally the realm of something called EAS (Enterprise Application Software). Back at the turn of the century, this area was still dominated by a company called SAP, but there were a few smaller players, like Salesforce.com, that tried to package these applications. Any sane IT manager looks to see if an EAS solution can be purchased first. It turned out that TV Guide’s business logic was impossible to shoehorn into any existing solution. SAP folks said – yeah, no problem, we’ll build you what you want, but our prices start at $1M, and then there are consultant fees. The ERM world is a crazy place: you can read about some true craziness in “Cube Farm”, an account of one hapless developer’s adventures at Lawson Software. It’s a truly riveting book, and I feel that every developer out there should read it. It’s literally Lovecraftian in nature, that book.

In any case, it fell to me to develop the application from scratch. Inspired by Knuth, I decided to write some semi-literate code. Brad, a project manager, and I went to the clients and interviewed them at length, documenting their existing process (aka the most complicated set of spreadsheets you’ve ever seen). In the past, before cheap computers, all you needed was a Joan Holloway, but I believe they stopped making them.

Brad went on to go back and forth with a very terse document about 5 pages in length that described how the new system would work. He would sit down with the clients and go through the narrative, step by step, confirming that this is what they wanted. Meanwhile I created an object oriented library that made dealing with the database, creating forms and navigation elements much easier. This is similar to what you might find in a CMS like Drupal, only a little cruder.

When the document shaped up, I created the database schema, and then I took a big chunk of the document and pasted it into one huge comment block. I proceeded to break off chunks of that block and write the code around them. Interestingly enough, as time went on, the project manager started helping me to write the code: enough of the scary database abstraction was hidden by simple classes and methods, and there were tons of self-evident examples all around to copy and paste. I switched to writing reports that involved cubes, rollups and other fancy stuff. Stored procedures that did the reports also received comments from the document that described the reports.
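To make the approach concrete, here is a minimal sketch in Python rather than ASP 3.0. None of the original code or spec text appears in this post, so everything below is hypothetical: the spec sentences, the 60/40 split and the function names are invented for illustration. The only thing taken from the story is the habit of pasting the narrative in as comments and writing the code under each chunk.

```python
from decimal import Decimal

# Hypothetical spec text, pasted in from the narrative document and then
# broken into chunks. Each chunk stays as a comment above its code.

# "Every booked deal has a commission pool equal to the deal amount
#  multiplied by the commission rate."
def commission_pool(deal_amount: Decimal, rate: Decimal) -> Decimal:
    return deal_amount * rate

# "The account executive who brought in the business receives 60% of
#  the pool. The account coordinator who handled it receives the rest."
# (The 60/40 split is an invented number, not a real business rule.)
def split_commission(pool: Decimal) -> dict[str, Decimal]:
    executive_share = (pool * Decimal("0.60")).quantize(Decimal("0.01"))
    coordinator_share = pool - executive_share
    return {"executive": executive_share, "coordinator": coordinator_share}

# "Pending business is tracked but pays no commission until it is booked."
def commission_for(deal_amount: Decimal, rate: Decimal, booked: bool) -> dict[str, Decimal]:
    if not booked:
        return {"executive": Decimal("0.00"), "coordinator": Decimal("0.00")}
    return split_commission(commission_pool(deal_amount, rate))
```

The point of the layout is that a client can read the quoted sentences and a programmer can read the code under them, and both are looking at the same file.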

This wasn’t a monolithic system – I was writing it for 2 years or so, releasing chunk after chunk. In the end it was handed off to another developer, and the whole transfer took only a couple of hours. There weren’t any major bugs or maintenance issues (I believe I received only one phone call about it after several years of continuous use). All in all I was pretty pleased with this approach and can absolutely recommend it.

I believe this is the reason why so many English majors become excellent programmers: if you can write for people, you can write for computers. Sometimes there are reasons why you can’t do both at the same time, but there’s no reason not to find some middle ground.

The Russian Tea Room Syndrome

 

“Man told me,” He said, “that these here elevators was Mayan architecture. I never knew that till today. An I says to him, ‘What’s that make me– mayonnaise?’ Yes, yes! And while he was thinking that over, I hit him with a question that straightened him up and made him think twice as hard! Yes, yes!”

“Could we please go down, Mr. Knowles?” begged Miss Faust.

“I said to him,” said Knowles, ” ‘This here’s a research laboratory. Re-search means look again, don’t it? Means they’re looking for something they found once and it got away somehow, and now they got to re-search for it? How come they got to build a building like this, with mayonnaise elevators and all, and fill it with all these crazy people? What is it they’re trying to find again? Who lost what?’ Yes, yes!”

“That’s very interesting,” sighed Miss Faust. “Now, could we go down?”

Kurt Vonnegut, “Cat’s Cradle”

The Russian Tea Room was once a popular restaurant created by ballerinas and danseurs (aka male ballerinas) of the Russian Imperial Ballet for themselves and their friends. Later it became an expensive restaurant for the Manhattan high society. In 1996 the new owners closed it down for 4 years and $36 million worth of renovations. In 2002 the restaurant closed, and the owners were bankrupt. In the aftermath, one of the chefs, M.D. Rahman, can be found on 6th avenue and 45th street selling some of the tastiest street food in Manhattan. I bet he’s making more than he did back at the Russian Tea Room now with his little cart.

In the parlance of the Internet this is known as a “redesign” or a “relaunch.” If you are making a living out of web development, like I do, chances are that you participated in a vicious cycle of web site redesigns. They usually happen like this: managers decide to do it and get funding, a lot of meetings follow, specifications are written (or not), arbitrary deadlines are set, designers create graphical mock-ups, then coders swarm and engage in what’s referred to as “death-march.” Managers change their minds about the look and feel a few times during the death-march for an extra morale boost. Finally, a redesigned website launches. Managers start planning the next redesign right away.

In the olden times the CEO’s nephew often got the web design job. Well, these days the nephew grew up, and he has a consulting agency. “This is old and busted, let me redesign this mess and you’ll get new hotness” – he says. Pointy-haired bosses everywhere nod and say – “yes, yes, new hotness”, and the cycle keeps on going, redesign after redesign.

There are a few different types of redesigns. First of all, there’s changing the look. In the simplest and best form, this is a very quick deal, especially if the site is properly architected for quick changes. It’s like taking your plain vanilla cellphone, buying a snazzy faceplate, one click – instant new hotness. I have nothing against this sort of redesign.

The only thing you have to look out for here is what I call the “Felicity effect.” A television show Felicity had a famous redesign failure – the actress Keri Russell cut her trademark long hair. One might argue that she is hot no matter what, but the show suffered a huge drop in ratings. You have to keep in mind that a new look rarely attracts new customers, but often upsets the old ones. For instance, I like Keri’s new look, but I would not start watching that show.

The second type of a redesign involves changing the underlying technology of the site. One might change the content management engine, database engine, rewrite the site in a different language, make it run on a different web server, different operating system, etc. These usually turn out to be the most disastrous and costly of redesigns.

Joel Spolsky wrote about “… the single worst strategic mistake that any software company can make: … rewrit[ing] the code from scratch.” In the web publishing world these kinds of rewrites cause a lot of grief and devastation. A huge technology change always requires a lot of debugging and fixing afterwards, and as soon as most of the bugs are fixed, a new redesign comes around, because, see, ASP.NET 2.0 C# is “old and busted” and Vista Cruiser Mega Platform D## is “new hotness.”

I am not talking here about replacing a technology simply because it does not work or is dangerous. But redesigns are rarely aimed at fixing things – they are done in search of hot technologies and hot looks. By the way, amongst pointy-haired web execs fixing things is less glamorous than pursuing new technologies, and that is less glamorous than changing the looks.

A building superintendent I know was in the middle of a huge project – repairing three old and unsafe elevators as well as fixing the crumbling facade of the building. Although the repairs were crucial, they did not earn him the love of the tenants that the old superintendent enjoyed. The old super, instead of fixing broken things, engaged in almost constant painting projects, changing the color of the paint every time just a little bit. And when he wasn’t repainting, he would leave out the paint bucket and a brush on some rugs in the lobby.

The web execs often go for the best of both worlds – equivalent to changing the foundation of the building (and not that the old one was sagging), as well as painting it a new color at the same time. The full Monty web redesign is what the pointy-haired want.

Let’s take a look at the sense that such redesigns make from a capitalist point of view in an area that I know well — web publishing. Web publishing businesses work just like any other. You take some money (aka capital), you spend that money to produce something and you hope that that something makes you even more money one way or another. In economics this is known as Marx’s general formula for capital: Money-Commodity-Money.

Another thing that I faintly remember from my economics class is a rather disturbing concept called “opportunity cost”. See, when you invest money in something you instantly incur this cost. Why? Because you can’t invest your money twice, and there always seems to be something you could have invested in that would give you a better return. Let’s say it’s 1995 and you are an editor in, oh, Random House or HarperCollins. You have a budget to publish some children’s books and there’s a pile of proposals on your table. You pick a few. They make money, win awards, etc. Yet, the opportunity cost on every one of those books is about a kajillion dollars, as in that pile there was a certain book by a woman named Joanne Rowling.

In theory, any web executive’s first objective should be to make, and not lose money. Also they should look to minimize the opportunity cost whenever possible. This is of course not the case for many of them. They are thinking: hey I have this fat budget – I can do a big redesign, or …. hmm, what else can I do with that money so it will make me more money?

So how would one go about increasing profits? In web publishing today content is once again king because of the maturing web advertising, vast improvements in hosting costs and google-inspired web indexing and searching. This was not the case in the earlier days of the web, but now you can directly convert “eyeballs” into profits. The process is rather simple: you create web pages, users visit them, you show users ads (for which you are paid). The relationship is linear – more users = more ad impressions = more money.
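That linear relationship is simple enough to write down. Here is a minimal sketch in Python with made-up numbers: the function name, the ads-per-page count and the CPM rate (the price paid per thousand impressions) are assumptions for illustration, not figures from any real site.

```python
def ad_revenue(pageviews: int, ads_per_page: int, cpm_dollars: float) -> float:
    """Revenue in dollars; CPM is the price paid per 1,000 ad impressions."""
    impressions = pageviews * ads_per_page
    return impressions / 1000 * cpm_dollars


# Hypothetical site: a million pageviews, two ads per page, $1.50 CPM.
base = ad_revenue(pageviews=1_000_000, ads_per_page=2, cpm_dollars=1.50)

# Double the pageviews and, all else being equal, the money doubles too.
doubled = ad_revenue(pageviews=2_000_000, ads_per_page=2, cpm_dollars=1.50)
```

Nothing in the formula mentions how the site looks, which is the whole argument in one line.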

So, first of all, you might produce more pages. With search engines like Google, even pages that are hidden in archives of your website will still produce pageviews. The more pages you add, the more revenue you’ll get. In fact, pages with useful information, once placed online become something very dear to a capitalist’s heart – an income generating asset, the very thing that the author of Rich Dad, Poor Dad is so excited about. They are like the geese that lay golden eggs.

The cost of producing more pages comes from three sources: the cost of content – you need to pay someone to write, take pictures, etc; the cost of placing it online – “web producers”, the people who write html, create hyperlinks and optimize images draw a salary; and the cost of hosting/bandwidth – if you are hosting huge videos your costs might be more than what you can get from advertising, but if it’s just text and pictures you are golden. As you surely don’t expect the Spanish Inquisition, there’s the fourth cost: the opportunity cost of showing this content for free, instead of asking for subscription money. The main thing to remember: once the content/feature is created, the cost of keeping it online and generating money is trivial.

Besides producing more content, there are other ways of making more money. One might improve the relevance of ads on your pages. If you have a third party ad system, you pretty much can’t do that. But if you have your own, you might create mechanisms for serving super-relevant ads. Sometimes you might add e-commerce capability to your content website. For instance, if you have a gadget review site, injecting opportunities to easily and cheaply buy the gadgets that you are writing about will likely bring in more money than machine generated dumb ads.

One might create content that is more valuable to advertisers. For instance, keywords such as “mesothelioma lawyers”, “what is mesothelioma” and “peritoneal mesothelioma” generate ridiculous costs per click on Google’s AdSense. If creating content about “form of cancer that is almost always caused by previous exposure to asbestos” that is so popular with lawyers is not your piece of cake, you can create content about loans, mortgages, registering domain names, etc.

Then we enter the murky waters of web marketing, and especially “SEO” – search engine optimization. In short, if you get other websites to link to your pages, you will get more visits, partially from those links, and even more importantly, because search engines will place your pages higher in their results. The hard, but honest way to do this is to produce unique, interesting and timely content. Nobody’s interested in that. Encouraging the readers to link by providing urls that never change and even “link to us” buttons is not in vogue: most web execs prefer non-linkable flash pages. Another way is to pay for links – in the best case for straight up advertising, in the worst case – to unscrupulous “link farm” owners that sell PageRank. Then comes the deep SEO voodoo – changing the file names, adding meta tags, creating your own link farms and hidden keyword pages. At the worst, there’s straight up link and comment spamming. Unethical methods of promoting your business work: Vardan Kushnir who spammed the entire world to promote his “Center for American English” had enough money for booze and hookers, but not many people shed a tear for him when he was brutally murdered (maybe even for spamming). In the corporate world the equivalent is the PageRank ban from Google.

So, you could spend your money on all of these things that I described, and hopefully make more money. On the other hand, redesigning a website from top to bottom to make it “look good” or “more usable” will not bring in more “eyeballs”. A redesign of a large site takes several months for the entire web staff. The possible positive aspects of the redesign are these:

1) Faster loading pages
2) Easier to read text
3) More straightforward navigation
4) Cleaner look
5) Bug fixes
6) Switching from more expensive software and hardware to cheaper alternatives

Existing users will probably like you better, but will new ones all of a sudden descend onto the redesigned site? Not likely. In fact, some think that the ugliness of MySpace design is an asset rather than a drawback. People want something from websites. Be it news, funny links, videos, naked pictures, savings coupons or product reviews, design does not matter too much to them. If they can click it, read it and (for the valuable geeks with blogs and websites) link to it – users are generally satisfied.

Here’s an example of a well executed major redesign of a high profile website, the New York Times. NYT always had a well designed website, and the new one is pretty nice too. But is there a lot of new traffic? Here’s an Alexa graph.

At the worst redesigns bring:

1) Broken links (sometimes every single url changes and all links from outside break)
2) Heavier graphics, proliferation of Macromedia Flash
3) Slower loading pages
4) Loss of features and content
5) New bugs
6) New software and licensing costs, more expensive servers

Often this is all that they bring. Broken links hurt the search engine positioning. New software costs money. It takes a long while to work out the bugs.

Here’s an Alexa graph of another major redesign on a website, whose name I’d like to omit. Just as the traffic recovered after a big redesign in 2000, a new one hit in 2003. It seems to be recovering again.

The thing is, many businesses are very robust and the disastrous effects of web redesigns do not kill them. Pointy-haired bosses make their buddies rich, while getting kudos for the redesigns. Everyone stays busy, and software companies get to sell a lot of server software.

Side Effects Of Programming

“Nelson: Ah, he’s the greatest showman since that kid who eats worms!
Kid Who Eats Worms: My 15 minutes of fame are over!”
The Simpsons, Episode 3G02

The post about Durian seems to have been the most popular one in the recent history of deadprogrammer.com. This once again proves that eating gross things is entertaining to the masses. To prevent the surging popularity of my blog I absolutely must write a little bit about something that I almost never write about. Programming.

My co-worker who could not understand why he could not increment a variable in XSLT found an amazing piece of technical writing in an O’Reilly book about XSLT. Here it is:

“Although these XSLT variables are called variables, they’re not variables in the traditional sense of procedural programming languages like C++ or Java. Remember that earlier we said one goal behind the design of the stylesheet language is to avoid side effects in execution? Well, one of the most common side effects used in most procedural languages is changing the value of a variable. If we write our stylesheet so that the results depend on the varying values of different variables, the stylesheet engine would be forced to evaluate the templates in a certain order.

XSLT variables are more like variables in the traditional mathematical sense. In mathematics, we can define a function called square(x) that returns the value of a number (represented by x) multiplied by itself. In other words, square(2.5) returns 6.25. In this context, we understand that x can be any number; we also understand that the square function can’t change the value of x.

It takes a while to get used to this concept, but you’ll get there. Trust me on this.”

(full text here)

The quote that I highlighted in bold absolutely gets me. Yeah, that’s one good side effect. I get the feeling that XSLT was committee designed with the specific purpose to make life miserable for programmers. Also that committee must have had some really good stuff to smoke.

Update
Uh, see, now this is what happens when you try to write programs without actually understanding computer science fundamentals. I did not realize that XSLT was functional, not procedural. Like most mediocre programmers out there I was not exposed to much functional programming (I did try to teach myself Lisp, but quickly gave up). Having to do a lot of SQL (which is declarative rather than procedural) over the years improved my understanding of functional programming, but not enough to realize what the XSLT book was talking about (reading it from the beginning would have been helpful too). Now hardware XSLT accelerators, which made me laugh when I first heard about them, make sense too.
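The difference the book is describing can be sketched in Python rather than XSLT. Python is only a stand-in here (it will happily let you reassign anything), and the function names are made up for illustration. The first version counts by changing a variable. The second never changes anything: it gets the count by passing a new value into the next call, which is roughly the style a functional language pushes you toward.

```python
def count_items_procedural(items):
    count = 0
    for _ in items:
        count = count + 1  # side effect: the value of count changes
    return count


def count_items_functional(items, count=0):
    # No variable is ever reassigned; each call receives its own new count.
    if not items:
        return count
    return count_items_functional(items[1:], count + 1)


def square(x):
    # Like the book's square(x): it returns a new value and cannot change x.
    return x * x
```

Both counting functions give the same answer for the same list. Only the first one depends on statements running in a particular order, which is the side effect the book is trying to avoid.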

Lvalue and Rvalue

Seems like there is no stopping political stuff from trying to get into my life today.

Here I am, reading through code. Programming, one might say. And then I come across this. What do you think a function called commify() does to a staunch conservative, all-American array? That’s right. It turns it into a left-wing, pinko communist comma-delimited string.
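The code in question isn’t shown here, so this is only a guess at the shape of such a function, sketched in Python: it takes a list and joins it into one comma-delimited string. The separator and the str() conversion are assumptions.

```python
def commify(items):
    """Join the items of a list into a single comma-delimited string."""
    return ", ".join(str(item) for item in items)


print(commify(["life", "liberty", "pursuit of happiness"]))
# prints: life, liberty, pursuit of happiness
```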

The Mystery of Obidos

Whoa, caught amazon.com while it was down.
They are showing a page with Rufus, the Amazon dog.

By the way, I was meaning to write about that for some time now. Did you ever notice the enigmatic word “obidos” in Amazon URLs?

Some theories from usenet:

  • Castle near Lisbon
  • OBI (Wan Kenobi) + DOS (Disk Operating System)
  • ‘OBI’ = Object Broker Interface

    This seems to be the correct answer though: Obidos is a major port on the Amazon river.

    [update]
    Livejournal user hallerlake had this to add:

    “I worked at Amazon for a couple of years, and can mostly answer that.

    Obidos is the area where the Amazon is “concentrated” – it narrows to a point about a mile wide and a couple hundred feet deep. It’s the chokepoint of the Amazon. A wry sense of humor turned that to the naming scheme.

    The Amazon Marketplace (auctions+zshops+third party) code was called Varzea for similar reasons – it’s the delta point of the amazon river, where the river fans out.

    Amazon wrote their own web serving environment because the selection of scripting/webcontrol languages when they got started was so lousy. They had to call it something, so obidos it was. :) ”


    Obidos is huge, it might be over a gig by now. I don’t think it’s that bad, though. I haven’t been at Amazon for a few years. For a long time Amazon ran on the Netscape web server environment, then eventually moved to a specially tuned Apache. But yeah, the webservers had a lot of RAM in them so that we could fork a bunch of different processes… and a garbage collector got added to take care of some of the memory leaks. Even still we had a service that killed and restarted processes every hundred accesses or so. It wasn’t pretty.

    I don’t know who came up with the name… I’d bet on Shel Kaphan or possibly Joel Spiegel. Shel set the direction for the company’s software development and architecture, including standardization on C (instead of C++) due to easier debugging. Certainly for the first few years he was The Guy for software architecture; these days I would imagine Al Vermeulen has that task.
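The trick mentioned above, a service that kills and restarts processes every hundred accesses or so, can be sketched in a few lines of Python. This is a toy simulation, not Amazon’s code: plain objects stand in for real server processes, and the class names, the 1 KB leak per request and the exact limit of 100 are all assumptions for illustration.

```python
class Worker:
    """Stands in for one server process that slowly leaks memory."""

    def __init__(self):
        self.handled = 0
        self.leaked_bytes = 0

    def handle(self, request):
        self.handled += 1
        self.leaked_bytes += 1024  # pretend every request leaks 1 KB
        return f"served {request}"


class RecyclingPool:
    """Replaces the worker after max_requests, throwing away its leaks."""

    def __init__(self, max_requests=100):
        self.max_requests = max_requests
        self.worker = Worker()
        self.restarts = 0

    def handle(self, request):
        if self.worker.handled >= self.max_requests:
            self.worker = Worker()  # "kill and restart"
            self.restarts += 1
        return self.worker.handle(request)


pool = RecyclingPool(max_requests=100)
for i in range(250):
    pool.handle(f"/page/{i}")
```

The leak never gets fixed. It just never gets the chance to grow past a hundred requests’ worth, which is not pretty, but it keeps the servers up.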