Programming in the Twenty-First Century — Full Archive of the Blog by James Hague


Archived from the original blog by James Hague.

Note from Chadnet: links for images are not to Flickr but are to the local files.

A Deeper Look at Tail Recursion in Erlang

The standard "why tail recursion is good" paragraph talks of reducing stack usage and subroutine calls turning into jumps. An Erlang process must be tail recursive or else the stack will endlessly grow. But there's more to how this works behind the scenes, and it directly affects how much code is generated by similar looking tail recursive function calls.

A long history of looking at disassemblies of C code has made me cringe when I see function calls containing many parameters. Yet it's common to see ferocious functions in Erlang, like this example from erl_parse (to be fair, it's code generated by yecc):

yeccpars2(29, __Cat, __Ss, __Stack, __T, __Ts, __Tzr) ->
    yeccpars2(yeccgoto(expr_700, hd(__Ss)),
              __Cat, __Ss, __Stack, __T, __Ts, __Tzr);

It's tail recursive, yes, but seven parameters? Surely this turns into a lot of pushing and popping behind the scenes, even if stack usage is constant. The good news is that, no, it doesn't. It's more verbose at the language level than what's really going on.

There's a simple rule about tail recursive calls in Erlang: If a parameter is passed to another function, unchanged, in exactly the same position it was passed in, then no virtual machine instructions are generated. Here's an example:

loop(Total, X, Size, Flip) ->
    loop(Total, X - 1, Size, Flip).

Let's number the parameters from left to right, so Total is 1, X is 2, and so on. Total enters the function in parameter position 1 and exits, unchanged, in position 1 of the tail call. The value just rides along in a parameter register. Ditto for Size in position 3 and Flip in position 4. In fact, the only change at all is X, so the virtual machine instructions for this function look more or less like:

parameter[2]--
goto loop

Perhaps less intuitively, the same rule applies if the number of parameters increases in the tail call. This idiom is common in functions with accumulators:

count_pairs(List, Limit) -> count_pairs(List, Limit, 0).

The first two parameters are passed through unchanged. A third parameter--zero--is tacked onto the end of the parameter list, the only one of the three that involves any virtual machine instructions.

In fact, just about the worst thing you can do to violate the "keep parameters in the same positions" rule is to insert a new parameter before the others, or to randomly shuffle parameters. This code results in a whole bunch of "move parameter" instructions:

super_deluxe(A, B, C, D, E, F) ->
    super_deluxe(F, E, D, C, B, A).

while this code turns into a single jump:

super_deluxe(A, B, C, D, E, F) ->
    super_deluxe(A, B, C, D, E, F).

These implementation techniques, used in the Erlang BEAM virtual machine, were part of the Warren Abstract Machine developed for Prolog.

On the Perils of Benchmarking Erlang

2007 brought a lot of new attention to Erlang, and with that attention has come a flurry of impromptu benchmarks. Benchmarks are tricky to write if you're new to a language, because it's easy for the run-time to be dominated by something quirky and unexpected. Consider a naive Python loop that appends data to a string each iteration. Strings are immutable in Python, so each append causes the entire string created thus far to be copied. Here's my short, but by no means complete, guide to pitfalls in benchmarking Erlang code.
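The Python pitfall mentioned above is easy to demonstrate. Here's a rough sketch of the two approaches (the helper names are mine, and exact behavior varies by Python implementation; some versions of CPython special-case += on strings):

```python
# Two ways to build a large string in Python. Repeated += on an
# immutable string can copy the entire partial result each time
# (quadratic in the worst case), while str.join builds it in one pass.
def build_by_append(parts):
    s = ""
    for p in parts:
        s += p          # may copy everything accumulated so far
    return s

def build_by_join(parts):
    return "".join(parts)  # single pass, linear time

parts = [str(n) for n in range(1000)]
assert build_by_append(parts) == build_by_join(parts)
```

The results are identical; only the hidden copying differs, and that's exactly the kind of thing that quietly dominates a naive benchmark.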

Startup time is slow. Erlang's startup time is more significant than with the other languages I use. Remember, Erlang is a whole system, not just a scripting language. A suite of modules that make sense in most applications is loaded by default. If you're going to run small benchmarks, the startup time can easily dwarf your timings.

Garbage collection happens frequently in rapidly growing processes. An Erlang process starts out very small, to keep the overall memory footprint low in a system with potentially tens of thousands of processes. Once a process heap is full, it gets promoted to a larger size. This involves allocating a new block of memory and copying all live data over to it. Eventually process heap size will stabilize, and the system automatically switches a process over to a generational garbage collector at some point too, but during that initial burst of growing from a few hundred words to a few hundred kilowords, garbage collection happens numerous times.

To get around this, you can start a process with a specific heap size using spawn_opt instead of spawn. The min_heap_size option lets you choose an initial heap size in words. Even a value of 32K can significantly improve the timings of some benchmarks. No need to worry about getting the size exactly right, because it will still be automatically expanded as needed.

Line-oriented I/O is slow. Sadly, yes, and Tim Bray found this out pretty early on. Here's to hoping it's better in the future, but in the meantime any line-oriented benchmark will be dominated by I/O. Use file:read_file to load the whole file at once, if you're not dealing with gigabytes of text.

The more functions exported from a module, the less optimization potential. It's common (and perfectly reasonable) to put:

-compile(export_all).

at the top of a module that's in development. There's some nice tech in the Erlang compiler that tracks the types of variables. If a binary is always passed to a function, then that function can be specialized for operating on a binary. Once you open up a function to be called from the outside world, then all bets are off. Assumptions about the type of a parameter cannot be made.

Inlining is off by default. I doubt you'll ever see big speedups from this, but it's worth adding

-compile(inline).

to modules that involve heavy computation.

Large loop indices use bignum math. A "small" integer in Erlang fits into a single word, including the tag bits. I can never remember how many bits are needed for the tag, but I think it's two. (BEAM uses a staged tagging scheme so key types use fewer tag bits.) If a benchmark has an outer loop counting down from ten billion to zero, then bignum math is used for most of that range. "Bignum" means that a value is larger than will fit into a single machine word, so math involves looping and manually handling some things that an add instruction automatically takes care of. Perhaps more significantly, each bignum is heap allocated, so even simple math like X + 1, where X is a bignum, causes the garbage collector to kick in more frequently.

Admitting that Functional Programming Can Be Awkward

My initial interest in functional programming was because it seemed so perverse.

At the time, I was the classic self-taught programmer, having learned BASIC and then 6502 assembly language so I could implement my own game designs. I picked up the August 1985 issue of Byte magazine to read about the then-new Amiga. It also happened to be the issue on declarative languages, featuring a reprint of Backus's famous Turing Award Lecture and a tutorial on Hope, among other articles.

This was all pretty crazy stuff for an Atari 800 game coder to be reading about. I understood some parts, completely missed vast swaths of others, but one key point caught my imagination: programming without modifiable variables. How could that possibly work? I couldn't write even the smallest game without storing values to memory. It appealed to me for its impossibility, much in the way that I had heard machine language was too difficult for most people to approach. But while I had pored over assembly language listings of games in magazines, and learned to write my own as a result, there wasn't such direct applicability for functional programming. It made me wonder, but I didn't use it.

Many years later when I first worked through tutorials for Haskell, Standard ML, and eventually Erlang, it was to figure out how programming without modifying variables could work. In the small, it's pretty easy. Much of what seemed weird back in 1985 had become commonplace: garbage collection, using complex data structures without worrying about memory layout, languages with much less bookkeeping than C or Pascal. But that "no destructive updates" thing was--and still is--tricky.

I suppose it's completely obvious to point out that there have been tens of thousands of video games written using an imperative programming style, and maybe a handful--maybe even just a couple of fingers worth--of games written in a purely functional manner. Sure, there have been games written in Lisp and some games written by language dilettantes fond of Objective Caml, but they never turn out to be programmed in a functional style. You can write imperative code in those languages easily enough. And the reason for going down that road is simple: it's not at all clear how to write many types of complex applications in functional languages.

Usually I can work through the data dependencies, and often I find that there's an underlying simplicity to the functional approach. But for other applications...well, they can turn into puzzles. Where I can typically slog through a messy solution in C, the purely functional solution either eludes me or takes some puzzling to figure out. In those cases I feel like I'm fighting the system, and I realize why it's the road less traveled. Don't believe it? Think that functional purity is always the road to righteousness? Here's an easy example.

I wrote a semi-successful Mac game a while back called Bumbler. At its heart it was your standard sprite-based game: lots of independent objects running some behavioral code and interacting with each other. That kind of code looks easy to write in a purely functional way. An ant, represented as a coordinate, marches across the screen in a straight line and is deleted when it hits the opposite screen edge. That's easy to see as a function. One small clod of data goes in, another comes out.

But the behaviors and interactions can be a lot more tangled than this. You could have an insect that chases other insects, so you've got to pass in a list of existing entities to it. You can have an insect that affects spawn rates of other insects, but of course you can't modify those rates directly, so you've got to return that data somehow. You can have an insect that latches onto eggs and turns them into something else, so now there's a behavioral function that needs to reach into the list of entities and make modifications, but of course you're not allowed to do that. You can have an insect that modifies the physical environment (that is, the background of the game) and spawns other insects. And each of these is messier than it sounds, because there are so many counters and thresholds and limiters being managed, and sounds being played in all kinds of situations, that the data flow isn't clean by any means.

What's interesting is that it would be trivial to write this in C. Some incrementing, some conditions, direct calls to sound playing routines and insect spawning functions, reading and writing from a pool of global counters and state variables. For a purely functional approach, I'm sure the data flow could be puzzled out...assuming that everything was all perfectly planned and all the behaviors were defined ahead of time. It's much more difficult to take a pure movement function and say "okay, what I'd like is for this object to gravitationally influence other objects once it has bounced off of the screen edges three times." Doable, yes. As directly implementable as the C equivalent? No way.

That's one option: to admit that functional programming is the wrong paradigm for some types of problems. Fair enough. I'd put money on that. But it also may be that almost no one has been thinking about problems like this, that functional programming attracts purists and enamored students. In the game example above, some of the issues are solvable, they just need different approaches. Other issues I don't know how to solve, or at least I don't have solutions that are as straightforward as writing sequential C. And there you go...I'm admitting that functional programming is awkward in some cases. It's also extremely useful in others.

(Also see the follow-up.)

Follow-up to "Admitting that Functional Programming Can Be Awkward"

Admitting that functional programming can be awkward drew a much bigger audience than I expected, so here's some insight into why I wrote it, plus some responses to specific comments.

I started learning some functional programming languages in 1999, because I was looking for a more pleasant way to deal with complex programming tasks. I eventually decided to focus on Erlang (the reasons for which are probably worthy of an entire entry), and after a while I found I was not only using Erlang for some tasks I would have previously used Perl for (and truth be told, I still use Perl sometimes); I was also able to approach problems that would have been just plain nasty in C. But I also found that some tasks were surprisingly hard in Erlang, clearly harder than banging out an imperative solution.

Video games are a good manifestation of a difficult problem to approach functionally: lots of tangled interactions between actors. I periodically search for information on video games written in functional languages, and I always get the same type of results. There's much gushing about how wonderful functional programming is, how games are complex, and how the two are a great match. Eight years ago I even did this myself, and I keep running into it as a cited source about the topic. Then there's Functional Reactive Programming, but the demos are always bouncing balls and Pong and Space Invaders--which are trivial to write in any language--and it's not at all clear if it scales up to arbitrary projects. There are also a handful of games written in procedural or object-oriented styles in homebrew Lisp variants, and this is often equated with "game in a functional language."

My conclusion is that there's very little work in existence or being done on how to cleanly approach video game-like problems in functional languages. And that's okay; it's unexplored (or at least undocumented) territory. I wrote "Admitting..." because of my own experiences writing game-like code in Erlang. I don't think the overall point was Erlang-specific, and it would apply to pure subsets of ML, Scheme, etc. It wasn't meant as a cry for help or a way of venting frustration. If you're writing games in functional languages, I'd love to hear from you!

Now some responses to specific comments.

All you have to do is pass the world state to each function and return a new state.

True, yes, but...yuck. It can be clunky in a language with single-assignment. And what is this really gaining you over C?
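For concreteness, here's a minimal Python sketch of that world-passing style; the field names and helper functions are invented for illustration:

```python
# "Pass the world in, return a new world": every helper must accept
# and return the entire state, even to bump a single counter.
# Immutable-style updates mimic single assignment.
def spawn_ant(world):
    return {**world, "ants": world["ants"] + 1}

def play_sound(world, sound):
    return {**world, "sounds": world["sounds"] + [sound]}

world = {"ants": 0, "sounds": []}
world2 = spawn_ant(world)
world3 = play_sound(world2, "chirp")

# The original world is untouched; every step names a new one.
assert world3["ants"] == 1
assert world == {"ants": 0, "sounds": []}
```

Multiply the two fields by a few dozen counters and flags, and the bookkeeping of world2, world3, world4... is exactly the clunkiness in question.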

All you have to do is make entities be functions of time. You pass in a time and the position and state of that entity at that time are returned.

For a bouncing ball or spinning cube, yes, that's easy. But is there a closed-form solution for a game where entities can collide with each other and react to what the player is doing?

All languages suck at something. You should use a multi-paradigm language.

Fair enough.

You used the word "modify" which shows you don't understand how functional programming works.

If a spider does something that causes the spawn rate of ants to change, then of course the old rate isn't modified to contain the new value. But conceptually the system needs to know that the new rate is what matters, so somehow that data needs to propagate up out of a function in such a way that it gets passed to the insect spawning function from that point on. I was using "modify" in that regard.

Erlang as a Target for Imperative DSLs

It's easy to show that any imperative program can be implemented functionally. Or more specifically, that any imperative program can be translated into Erlang. That doesn't mean the functional version is better or easier to write, of course.

Take an imperative program that operates on a set of variables. Put all of those variables into a dictionary called Falafel. Make every function take Falafel as the first parameter and return a tuple of the form {NewFalafel, OtherValues}. This is the classic "pass around the state of the global world" approach, except that the topic is so dry that I amuse myself by saying Falafel instead of World. But I'll go back to the normal terminology now.

What's awkward is that every time a new version of World is created inside the same function, it needs a new name. C code like this:

color = 57;
width = 205;

can be mindlessly translated to Erlang:

World2 = dict:store(color, 57, World),
World3 = dict:store(width, 205, World2),

That's completely straightforward, yes, but manually keeping track of the current name of the world is messy. This could be written as:

dict:store(width, 205, dict:store(color, 57, World))

which has the same potential for programmer confusion when it comes to larger, general cases. I wouldn't want to write code like this by hand. But perhaps worrying about the limitations of a human programmer is misguided. It's easy enough to start with a simple imperative language and generate Erlang code from it. Or wait, is that cheating? Or admitting that functional programming can be awkward?

None of this eliminates the issue that dict:store involves a lot of Erlang code, code that's executed for every faux variable update.

A different angle is to remember that parameters in a tail call are really destructive updates (see A Deeper Look at Tail Recursion in Erlang; and I should have said "Tail Calls" instead of "Tail Recursion" when I wrote the title). Arbitrarily messy imperative code can be mechanically translated to Erlang through a simple scheme:

Keep track of the live variables. If a variable is updated, jump to a new function with all live variables passed as parameters and the updated variable replaced with its new value.

Here's some C:

total++;
count += total;
row = x * 200 + count;

And here's the Erlang version, again mindlessly translated:

code_1(X, Total, Count) ->
    code_2(X, Total + 1, Count).
code_2(X, Total, Count) ->
    code_3(X, Total, Count + Total).
code_3(X, Total, Count) ->
    code_4(X, Total, Count, X * 200 + Count).
code_4(X, Total, Count, Row) ->
    ...

Hard to read? Yes. Bulky at the source code level, too. But this is highly efficient Erlang, much faster than the dictionary version. I'd even call it optimal in terms of the BEAM virtual machine.

Sending Modern Languages Back to 1980s Game Programmers

Take a moment away from the world of Ruby, Python, and JavaScript to consider some of the more audacious archaeological relics of computing: the tens of thousands of commercial products written entirely in assembly language.

That's every game ever written for the Atari 2600. Almost all Apple II games and applications, save early cruft written in BASIC (which was itself written in assembly). VisiCalc. Almost all Atari 800 and Commodore 64 and Sinclair Spectrum games and applications. Every NES cartridge. Almost every arcade coin-op from the 1970s until the early 1990s, including elaborate 16-bit affairs like Smash TV, Total Carnage, and NBA Jam (in case you were only thinking of "tiny sprite on black background" games like Pac-Man). Almost all games for the SNES and SEGA Genesis. And I'm completely ignoring an entire mainframe era that came earlier.

(It's also interesting to look at 8-bit era software that wasn't written in assembly language. A large portion of SunDog: Frozen Legacy for the Apple II was written in interpreted Pascal. The HomePak integrated suite of applications for the Atari 8-bit computers was written in a slick language called Action!. The 1984 coin-op Marble Madness was one of the few games of the time written in C, and that allowed it to easily be ported to the Amiga and later the Genesis. A handful of other arcade games used BLISS.)

Back in 1994, I worked on a SNES game that was 100,000+ lines of 65816 assembly language. Oh yeah, no debugger either. It sounds extreme, almost unthinkable, but there weren't good options at the time. You use what you have to. So many guitar players do what looks completely impossible to me, but there's no shortage of people willing to take the time to play like that. Assembly language is pretty straightforward, provided you practice a lot and don't waste time dwelling on its supposed difficulty.

If you want really extreme, there were people hand-assembling Commodore 64 code, and even people writing Apple II games entirely in the machine language monitor (a friend who clued me into this said you can look at the disassembled code and see how functions are aligned to 256 byte pages, so they can be modified without having to shift around the rest of the program).

It's an interesting exercise to consider what it would have been like to write games for these old, limited systems, but given modern hardware and a knowledge of modern tools: Perl, Erlang, Python, etc. No way would I have tried to write a Commodore 64 game in Haskell or Ruby, but having the languages available and on more powerful hardware would have changed everything. Here's what I'd do.

Write my own assembler. This sounded so difficult back then, but that's because parsing and symbol table management were big efforts in 6502 code. Getting it fast would have taken extra time, too. But now writing a cross assembler in Erlang (or even Perl) is a weekend project at best. A couple hundred lines of code.

Write my own emulator. I don't mean a true, performance-oriented emulator for running old code on a modern PC. I mean a simple interpreter for stepping through code and making sure it works without having real crashes on the target hardware. Again, this would be a quick project. It's what functional languages are designed to do. More than just executing code, I'd want to make queries about which registers a routine changes, get a list of all memory addresses read or modified by a function, count up the cycles in any stretch of code. This is all trivially easy, but it was so out of my realm as a self-taught game author. (And for development tool authors of the time, too. I never saw features like these.)

Write my own optimizers for tricky cases. The whole point of assembly is to have control, but some optimizations are too ugly to do by hand. A good example is realizing that the carry flag is always set when a jump instruction occurs, so the jump (3 bytes) can be replaced with a conditional branch (2 bytes).

Write my own custom language. I used to worship at the altar of low-level optimization, but all of that optimization was more or less pattern recognition or brute force shuffling of code to minimize instructions. I still, even today, cringe at the output I see from most compilers (I have learned that it's best not to look), because generating perfect machine code from a complex language is a tough problem. But given a simple processor like the 6502, 6809, or Z80, I think it would not only be possible to automate all of my old tricks, but to go beyond them into the realm of "too scary to mess with" optimizations. Self-modifying code is a good example. For some types of loops you can't beat stuffing constants into the compare instructions. Doing this by hand...ugh.

Much of the doability of these options comes from the simplicity of 8-bit systems. Have you ever looked into the details of the COFF or Preferred Executable format? Compare the pages and pages of arcana to the six byte header of an Atari executable. Or look at the 6502 instruction set summarized on a single sheet of paper, compared with the two volume set of manuals from Intel for the x86 instructions. But a big part of it also comes from modern programming languages and how pleasant they make approaching problems that would have previously been full-on projects.

Trapped! Inside a Recursive Data Structure

Flat lists are simple. That's what list comprehensions are designed to work with, for example. Code for scanning or transforming a flat list can usually be tail recursive. Once data becomes deep, where elements of a list can contain other lists ad infinitum, something changes. It's trivial to iterate over a deep list; any basic Scheme textbook covers this early on. You recurse down, down, down, counting up values, building lists, and then...trapped. You're way down inside a function, and all you really want to do is exit immediately or record some data that applies to the whole nested data structure and keep going, but you can't.

As an example, here's the standard "is X contained in a list?" function written in Erlang:

member(X, [X|_]) -> true;
member(X, [_|T]) -> member(X, T);
member(_, [])    -> false.

Once a match is found, that's it. A value of true is returned. A function to find X in a deep list takes a bit more work:

member(X, [X|_]) ->
    true;
member(X, [H|T]) when is_list(H) ->
    case member(X, H) of
        true -> true;
        _    -> member(X, T)
    end;
member(X, [_|T]) ->
    member(X, T);
member(X, []) ->
    false.

The ugly part here is that you could be down 50 levels in a deep list when a match is found, but you're trapped. You can't just immediately stop the whole operation and say "Yes! Done!" You've got to climb back up those 50 levels. That's the reason for checking for "true" in the second function clause. Now this example is mild in terms of claustrophobic trappage, but it can be worse, and you'll know it when you run into such a case.

There are a couple of options here. One is to throw an exception. Another is to use continuation passing style. But there's a third approach which I think is cleaner: manage a stack yourself instead of using the function call stack. This keeps the function tail recursive, making it easy to exit or handle counters or accumulators across the whole deep data structure.

Here's member for deep lists written with an explicit stack:

member(X, L) -> member(X, L, []).

member(X, [X|_], _Stack) ->
    true;
member(X, [H|T], Stack) when is_list(H) ->
    member(X, H, [T|Stack]);
member(X, [_|T], Stack) ->
    member(X, T, Stack);
member(_, [], []) ->
    false;
member(X, [], [H|T]) ->
    member(X, H, T).

Whenever the head of the list is a list itself, the tail is pushed onto Stack so it can be continued with later, and the list is processed. When there's no more input, check to see if Stack has any data on it. If so, pop the top item and make it the current list. When a match is found, the exit is immediate, because there aren't any truly recursive calls to back out of.

Would I really write member like this? Probably not. But I've found more complex cases where this style is much less restrictive than writing a truly recursive function. One of the signs that this might be useful is if you're operating across a deep data structure as a whole. For example, counting the number of atoms in a deep list. Or taking a deep data structure and transforming it into one that's flat.
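The explicit-stack technique isn't Erlang-specific. Here's a rough Python translation of the same idea (the function name is mine):

```python
# Deep-list membership with a stack we manage ourselves: instead of
# recursing into sublists, pending tails are pushed onto `stack`,
# so finding a match exits immediately with no unwinding.
def deep_member(x, lst):
    stack = []
    while True:
        if lst:
            head, lst = lst[0], lst[1:]
            if isinstance(head, list):
                stack.append(lst)   # save the tail for later
                lst = head          # descend into the sublist
            elif head == x:
                return True         # immediate exit, 50 levels or not
        elif stack:
            lst = stack.pop()       # resume a saved tail
        else:
            return False            # no input, no saved tails: done

assert deep_member(5, [1, [2, [3, 4], 5]])
assert not deep_member(9, [1, [2, [3, 4], 5]])
```

The single while loop plays the role of the tail call; nothing is ever on the language's call stack, so a counter or accumulator spanning the whole structure is just another local variable.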

Deriving Forth

When most programmers hear a mention of Forth, assuming they're familiar with it at all, a series of memory fragments surface: stack, Reverse Polish Notation, SWAPping and DUPlicating values. While the stack and RPN are certainly important to Forth, they don't describe the essence of how the language actually works.

As an illustration, let's write a program to decode modern coffee shop orders. Things like:

I'd like a grande skinny latte

and

Gimme a tall mocha with an extra shot to go

The catch here is that we're not allowed to write a master parser for this, a program that slurps in the sentence and analyzes it for meaning. Instead, we can only look at a single word at a time, starting from the left, and each word can only be examined once--no rewinding.

To get around this arbitrary-seeming rule, each word (like "grande") will have a small program attached to it. Or more correctly, each word is the name of a program. In the second example above, first the program called gimme is executed, then a, then tall, and so on.

Now what do each of these programs do? Some words are clearly noise: I'd, like, a, an, to, with. The program for each of these words simply returns immediately. "I'd like a," which is three programs, does absolutely nothing.

Now the first example ("I'd like a grande skinny latte"), ignoring the noise words, is "grande skinny latte." Three words. Three programs. grande sets a Size variable to 2, indicating large. Likewise, tall sets this same variable to 1, and short sets it to 0. The second program, skinny, sets a Use_skim_milk flag to true. The third program, latte, records the drink name in a variable we'll call Drink_type.

To use a more concise notation, here's a list of the programs for the second example:

gimme -> return
a     -> return
tall  -> Size = 1
mocha -> Drink_type = 1
with  -> return
extra -> return
shot  -> Extra_shot = true
to    -> return
go    -> To_go = true

When all of these programs have been executed, there's enough data stored in a handful of global variables to indicate the overall drink order, and we managed to dodge writing a real parser. Almost. There still needs to be one more program that looks at Drink_type and Size and so on. If we name that program EOL, then it executes after all the other programs, when end-of-line is reached. We can even handle rephrasings of the same order, like "mocha with an extra shot, tall, to go" with exactly the same code.
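The decoder as described can be sketched in a few lines of Python, with each word mapped to a tiny function; the state field names here are invented:

```python
# A word-at-a-time decoder: each word names a small "program"
# that updates shared order state. No parser, no lookahead.
state = {"size": None, "type": None, "extra_shot": False, "to_go": False}

def noise():
    pass  # noise words return immediately

words = {
    "gimme": noise, "a": noise, "an": noise, "with": noise,
    "to": noise, "extra": noise, "i'd": noise, "like": noise,
    "short":  lambda: state.update(size=0),
    "tall":   lambda: state.update(size=1),
    "grande": lambda: state.update(size=2),
    "latte":  lambda: state.update(type="latte"),
    "mocha":  lambda: state.update(type="mocha"),
    "shot":   lambda: state.update(extra_shot=True),
    "go":     lambda: state.update(to_go=True),
}

# The main loop: read a word, look it up, run its program, repeat.
for word in "gimme a tall mocha with an extra shot to go".split():
    words[word]()

assert state == {"size": 1, "type": "mocha",
                 "extra_shot": True, "to_go": True}
```

Feeding it "mocha with an extra shot tall to go" leaves exactly the same state, which is the rephrasing-for-free property described above.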

The process just described is the underlying architecture of Forth: a dictionary of short programs. In Forth-lingo, each of these named programs is called a word. The main loop of Forth is simply an interpreter: read the next bit of text delimited by spaces, look it up in the dictionary, execute the program associated with it, repeat. In fact, even the Forth compiler works like this. Here's a simple Forth definition:

: odd? 1 and ;

The colon is a word too, and the program attached to it first reads the next word from the input and creates a dictionary entry with that name. Then it does this: read the next word in the input, if the word is a semicolon then generate a return instruction and stop compiling, otherwise look up the word in the dictionary, compile a call to it, repeat.

So where do stacks and RPN come into the picture? Our coffee shop drink parser is simple, but it's a front for a jumble of variables behind the scenes. If you're up for some inelegant code, you could do math with the same approach. "5 + 3" is three words:

5   -> Value_1 = 5
+   -> Operation = add
3   -> Value_2 = 3
EOL -> Operation(Value_1, Value_2)

but this is clunky and breaks down quickly. A stack is a good way to keep information flowing between words, maybe the best way, but you could create a dictionary-based language that didn't use a stack at all. Each function in Backus's FP, for example, creates a value or data structure which gets passed to the next function in sequence. There's no stack.
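To see how a stack cleans this up, here's the same arithmetic sketched in Python with a shared stack between words (reordered to RPN, since operands must arrive before their operator):

```python
# Stack-based word flow: number words push a value, operator words
# pop their arguments and push the result. No named temporaries.
stack = []

def push(n):
    return lambda: stack.append(n)

def add():
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

words = {"5": push(5), "3": push(3), "+": add}

for word in "5 3 +".split():   # RPN: operands first, then the operator
    words[word]()

assert stack == [8]
```

No Value_1, Value_2, or Operation variables: the stack is the only channel between words, which is why it composes where the variable scheme breaks down.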

Finally, just to show that my fictional notation is actually close to real Forth, here's a snippet of code for the drink decoder:

variable Size
variable Type
: short   0 Size ! ;
: tall    1 Size ! ;
: grande  2 Size ! ;
: latte   0 Type ! ;
: mocha   1 Type ! ;

Two Stories of Simplicity

In response to Sending modern languages back to 1980s game programmers, one of the questions I received was "Did any 8-bit coders ever use more powerful computers for development?" Sure! The VAX and PDP-11 and other minicomputers were available at the time, though expensive, and some major developers made good use of them, cross-compiling code for the lowly Atari 800 and Apple II. But there was something surprising about some of these systems:

It was often slower to cross-assemble a program on a significantly higher-specced machine like the VAX than it was to do the assembly on a stock 8-bit home computer.

Part of the reason is that multiple people were sharing a single VAX, working simultaneously, but the Apple II user had the whole CPU available for a single task. There was also the process of transferring the cross-assembled code to the target hardware, and this went away if the code was actually built on the target. And then there were inefficiencies that built up because the VAX was designed for large-scale work: more expensive I/O libraries, more use of general purpose tools and code.

For example, a VAX-hosted assembler might dynamically allocate symbols and other data on the heap, something typically not used on a 64K home computer. Now a heap manager--what malloc sits on top of--isn't a trivial bit of code. More importantly, you usually can't predict how much time a request for a block of memory will take to fulfill. Sometimes it may be almost instantaneous, other times it may take thousands of cycles, depending on the algorithms used and current state of the heap. Meanwhile, on the 8-bit machine, those thousands of cycles are going directly toward productive work, not solving the useful but tangential problem of how to effectively manage a heap.

So in the end there were programmers with these little 8-bit machines outperforming minicomputers costing hundreds of thousands of dollars.

That ends the first story.

When I first started programming the Macintosh, right after the switch to PowerPC processors in the mid-1990s, I was paranoid about system calls. I knew the system memory allocation routines were unpredictable and should be avoided in performance-oriented code. I'd noticeably sped up a commercial application by dodging a system "Is point in an arbitrarily complex region?" function. It was in this mindset that I decided to steer clear of Apple's BlockMove function--the MacOS equivalent of memcpy--and write my own.

The easy way to write a fast memory copier is to move as much data at a time as possible. 32-bit values are better than 8-bit values. The problem with using 32-bit values exclusively is that there are alignment issues. If the source address isn't aligned on a four-byte boundary, it's almost as bad as copying 8-bits at a time. BlockMove contained logic to handle misaligned addresses, breaking things into two steps: individual byte moves until the source address was properly aligned, then 32-bit copies from that point on. My plan was that if I always guaranteed that the source and destination addresses were properly aligned, then I could avoid all the special-case address checks and have a simple loop reading and writing 32-bits at a time.
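The shape of that strategy can be sketched in Python over bytearrays; this is just the control flow, not PowerPC code, with 4-byte slice copies standing in for 32-bit register moves and the index standing in for the actual address:

```python
def aligned_copy(dst, src):
    """BlockMove-style structure: bytes until the source position hits a
    4-byte boundary, then word-sized moves, then a byte-at-a-time tail."""
    i = 0
    # Leading bytes: advance one at a time until i is a multiple of 4.
    while i < len(src) and i % 4 != 0:
        dst[i] = src[i]
        i += 1
    # Main loop: 4 bytes ("one 32-bit register") per step.
    while i + 4 <= len(src):
        dst[i:i+4] = src[i:i+4]
        i += 4
    # Trailing bytes that don't fill a whole word.
    while i < len(src):
        dst[i] = src[i]
        i += 1

src = bytearray(range(11))
dst = bytearray(11)
aligned_copy(dst, src)
print(dst == src)  # True
```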

(It was also possible to read and write 64-bit values, even on the original PowerPC chips, using 64-bit floating point registers. But even though this looked good on paper, floating point loads and stores had a slightly longer latency than integer loads and stores.)

I had written a very short, very concise aligned memory copy function, one that clearly involved less code than Apple's BlockMove.

Except that BlockMove was faster. Not just by a little, but 30% faster for medium to large copies.

I eventually figured out the reason for this by disassembling BlockMove. It was even more convoluted than I expected in terms of handling alignment issues. It also had a check for overlapping source and destination blocks--more bloat from my point of view. But there was a nifty trick in there that I never would have figured out on my own.

Let's say that a one megabyte block of data is being copied from one place to another. During the copy loop the data at the source and destination addresses is constantly getting loaded into the cache, 32 bytes at a time (the size of a cache line on early PowerPC chips), two megabytes of cache loads in all.

If you think about this, there's one flaw: all the data from the destination is loaded into the cache...and then it's immediately overwritten by source data. BlockMove contained code to align addresses to 32 byte cache lines, then in the inner copy loop used a special instruction to avoid loading the destination data, setting an entire cache line to zeros instead. For every 32 bytes of data, my code involved two cache line reads and one cache line write. The clever engineer who wrote BlockMove removed one of these reads, resulting in a 30% improvement over my code. This is even though BlockMove was pages of alignment checks and special cases, instead of my minimalist function.

There you go: one case where simpler was clearly better, and one case where it wasn't.

Finally: Data Structure Constants in Erlang

Here's a simple Erlang function to enclose some text in a styled paragraph, returning a deep list:

para(Text) ->
["<p class=\"normal\">", Text, "</p>"].

Prior to the new R12B release of Erlang, this little function had some less than ideal behavior.

Every time para was called, the two constant lists (a.k.a. strings) were created, which is to say that there weren't true data structure constants in Erlang.

Each string was built-up, letter by letter, via a series of BEAM virtual machine instructions. The larger the constant, the more code there was to generate it, and the more time it took to generate.

Because a new version of each string was created for each para call, there was absolutely no sharing of data. If para was called 200 times, 400 strings were created (200 of each). Remember, too, that each element of a list/string in 32-bit Erlang is 8 bytes. Doing the math on this example is mildly unsettling: 22 characters * 8 bytes per character * 200 instances = 35,200 bytes of "constant" string data.

As a result of more data being created, garbage collection occurred more frequently and took longer (because there was more live data).

Years ago, this problem was solved in HIPE, the standard native code compiler for Erlang, by putting constant data structures into a pool. Rather than using BEAM instructions to build constants, each constant list or tuple or more complex structure is simply a pointer into the pool. For some applications, such as turning a tree into HTML, the HIPE approach is a significant win.

And now, finally, as of the December 2007 R12B release, true data structure constants are supported in BEAM.

Revisiting "Programming as if Performance Mattered"

In 2004 I wrote Programming as if Performance Mattered, which became one of my most widely read articles. (If you haven't read it yet, go ahead; the rest of this entry won't make a lot of sense otherwise. Plus there are spoilers, something that doesn't affect most tech articles.) In addition to all the usual aggregator sites, it made Slashdot which resulted in a flood of email, both complimentary and bitter. Most of those who disagreed with me can be divided into two groups.

The first group flat-out didn't get it. They lectured me about how my results were an anomaly, that interpreted languages are dog slow, and that performance comes from hardcore devotion to low-level optimization. This is even though my entire point was about avoiding knee-jerk definitions of fast and slow. The mention of game programming at the end was a particular sore point for these people. "You obviously know nothing about writing games," they raved, "or else you'd know that every line of every game is carefully crafted for the utmost performance." The amusing part is that I've spent almost my entire professional career--and a fairly unprofessional freelance stint before that--writing games.

The second group was more savvy. These people had experience writing image decoders and knew that my timings, from an absolute point of view, were nowhere near the theoretical limit. I talked of decoding the sample image in under 1/60th of a second, and they claimed significantly better numbers. And they're completely correct. In most cases 1/60th of a second is plenty fast for decoding an image. But if a web page has 30 images on it, we're now up to half a second just for the raw decoding time. Good C code to do the same thing will win by a large margin. So the members of this group, like the first, dismissed my overall point.

What surprised me about the second group was the assumption that my Erlang code is as fast as it could possibly get, when in fact there are easy ways of speeding it up.

First, just to keep the shock value high, I kept my code in pure, interpreted Erlang. But there's a true compiler as part of the standard Erlang distribution, and simply compiling the tga module will halve execution time, if not decrease it by a larger factor.

Second, I completely ignored concurrent solutions, both within the decoding of a single image and potentially spinning each image into its own process. The latter solution wouldn't improve execution time of my test case, but could be a big win if many images are decoded.

Then there's perhaps the most obvious thing to do, the first step when it comes to understanding the performance of real code. Perhaps my detailed optimization account made it appear that I had reached the end of the road, that no more performance could be eked out of the Erlang code. In any case, no one suggested profiling the code to see if there are any obvious bottlenecks. And there is such a bottleneck.

(There's one more issue too: in the end, the image decoder was sped up enough that it was executing below the precision threshold of the wall clock timings of timer:tc/3. I could go in and remove parts of the decoder--obviously giving incorrect results--and still get back the same timings of 15,000 microseconds. The key point is that my reported timings were likely higher than they really were.)

Here's the output of the eprof profiler on tga:test_compressed():

FUNCTION                                       CALLS      TIME

****** Process <0.46.0>    -- 100 % of profiled time *** 
tga:decode_rgb1/1                              54329      78 % 
lists:duplicate/3                              11790      7 % 
tga:reduce_rle_row/3                           2878       3 % 
tga:split/1                                    2878       3 % 
tga:combine/1                                  2874       3 % 
erlang:list_to_binary/1                        1051       2 % 
tga:expand/3                                   1995       1 % 
tga:continue_rle_row/7                         2709       1 % 
lists:reverse/1                                638        0 
...

Sure enough, most of the execution time is spent in decode_rgb1, which is part of decode_rgb. The final version of this function last time around was this:

decode_rgb(Pixels) ->
list_to_binary(decode_rgb1(binary_to_list(Pixels))).
decode_rgb1([255,0,255 | Rest]) ->
[0,0,0,0 | decode_rgb1(Rest)];
decode_rgb1([R,G,B | Rest]) ->
[R,G,B,255 | decode_rgb1(Rest)];
decode_rgb1([]) -> [].

This is short, but contrived. The binary blob of pixels is turned into a list, the new pixels are built up as a new list, and finally that list is turned back into a binary. There are two reasons for the contrivance. At the time, pattern matching was much faster on lists than binaries, so it was quicker to turn the pixels into a list up front (I timed it). Also, repeatedly appending to a binary was a huge no-no, so it was better to create a new list and turn it into a binary at the end.

In Erlang R12B both of these issues have been addressed, so decode_rgb can be written in the straightforward way, operating on binaries the whole time:

decode_rgb(Pixels) -> decode_rgb(Pixels, <<>>).
decode_rgb(<<255,0,255, Rest/binary>>, Result) ->
decode_rgb(Rest, <<Result/binary,0,0,0,0>>);
decode_rgb(<<R,G,B, Rest/binary>>, Result) ->
decode_rgb(Rest, <<Result/binary,R,G,B,255>>);
decode_rgb(<<>>, Result) -> Result.

This eliminates the memory pressure caused by expanding each byte of the binary to eight bytes (the cost of an element in a list).

But we can do better with a small change to the specification. Remember, decode_rgb is a translation from 24-bit to 32-bit pixels. When the initial pixel is a magic number--255,0,255--the alpha channel of the output is set to zero, indicating transparency. All other pixels have the alpha set to 255, which is fully opaque. If you look at the code, you'll see that the 255,0,255 pixels actually get turned into 0,0,0,0 instead of 255,0,255,0. There's no real reason for that. In fact, if we go with the simpler approach of only changing the alpha value, then decode_rgb can be written in an amazingly clean way:

decode_rgb(Pixels) ->
<< <<R,G,B,(alpha(R,G,B))>> || <<R,G,B>> <= Pixels >>.

alpha(255, 0, 255) -> 0;
alpha(_, _, _) -> 255.

This version uses bitstring comprehensions, a new feature added in Erlang R12B. It's hard to imagine writing this with any less code.
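For comparison, here's the same 24-bit to 32-bit transform as a Python sketch (not the article's Erlang; the function and helper names just mirror it), walking the input three bytes at a time and appending an alpha byte per pixel:

```python
def alpha(r, g, b):
    # 255,0,255 is the magic "transparent" color; everything else is opaque.
    return 0 if (r, g, b) == (255, 0, 255) else 255

def decode_rgb(pixels: bytes) -> bytes:
    # Expand each 3-byte RGB pixel to a 4-byte RGBA pixel.
    out = bytearray()
    for i in range(0, len(pixels), 3):
        r, g, b = pixels[i:i + 3]
        out += bytes((r, g, b, alpha(r, g, b)))
    return bytes(out)

# One opaque pixel, one magic transparent pixel:
print(list(decode_rgb(bytes([10, 20, 30, 255, 0, 255]))))
# [10, 20, 30, 255, 255, 0, 255, 0]
```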

(Also see the follow-up.)

Timings and the Punchline

I forgot two things in Revisiting "Programming as if Performance Mattered": exact timings of the different versions of the code and a punchline. I'll do the timings first.

timer:tc falls apart once code gets too fast. A classic sign of this is running consecutive timings and getting back a sequence of numbers like 15000, 31000, 31000, 15000. At this point you should write a loop to execute the test function, say, 100 times, then divide the total execution time by 100. This smooths out interruptions for garbage collection, system processes, and so on.
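The same smoothing trick, sketched in Python (the standard timeit module does this for you, but the mechanics are just a loop and a divide; the helper name is made up):

```python
import time

def average_runtime(f, runs=100):
    # Time many consecutive runs and divide by the count, smoothing out
    # clock granularity, garbage collection, and OS interruptions.
    start = time.perf_counter()
    for _ in range(runs):
        f()
    return (time.perf_counter() - start) / runs

micros = average_runtime(lambda: sum(range(1000))) * 1_000_000
print(f"about {micros:.1f} microseconds per call")
```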

And now the timings (lower is better). The TGA image decoder with the clunky binary / list / binary implementation of decode_rgb, on the same sample image I used in 2004:

16,720 microseconds

(Yes, this is larger than the original 15,000 I reported, because it's an average, not the result of one or two runs.) The recursive version operating directly on binaries:

18,700 microseconds

The ultra-slick version using binary comprehensions:

22,600 microseconds

I think the punchline is obvious at this point.

Were I using this module in production code, I'd do one of three things. If I'm only decoding a handful of images here and there, then this whole discussion is irrelevant. The Erlang code is more than fast enough. If image decoding is a huge bottleneck, I'd move the hotspot, decode_rgb, into a small linked-in driver. Or, and the cries of cheating may be justified here, I'd remove decode_rgb completely.

Remember, transparent pixel runs at the start and end of each row are already detected elsewhere. decode_rgb blows up the runs in the middle from 24-bit to 32-bit. At some point this needs to be done, but it may just be that it doesn't need to happen at the Erlang level at all. If the pixel data is passed off to another non-Erlang process anyway, maybe for rendering or for printing or some other operation, then there's no reason the compressed 24-bit data can't be passed off directly. That fits the style I've been using for this whole module, of operating on compressed data without a separate decompression step.

But now we're getting into useless territory: quibbling over microseconds without any actual context. You can't feel the difference between any of the optimized versions of the code I presented last time, and so it doesn't matter.

Would You Bet $100,000,000 on Your Pet Programming Language?

Here's my proposition: I need an application developed and if you can deliver it on time I'll pay you $100,000,000 (USD). It doesn't involve solving impossible problems, but it does involve difficult and messy ones.

What language can you use to write it? Doesn't matter to me. It's perfectly fine to use multiple languages; I've got no hangups about that. All that matters is that it gets done and works.

As with any big project, the specs will undoubtedly change along the way. I promise not to completely confuse the direction of things with random requests. Could you add an image editor with all the features of Photoshop, plus a couple of enhancements? What about automatic translation between Korean and Polish? 3D fuzzy llamas you can ride around if network transfers take a long while? None of that. But there are some more realistic things I could see happening:

You need to handle data sets five times larger than anticipated.

I also want this to run on some custom ARM-based hardware, so be sure you can port to it.

Intel announced a 20 core chip, so the code needs to scale up to that level of processing power.

And also...hang on, phone call.

Sadly, I just found out that Google is no longer interested in buying my weblog, so I'm rescinding my $100,000,000 offer. Sigh.

But imagine if the offer were true? Would you bet a hundred million dollars on your pet language being up to the task? And how would it change your criteria for judging programming languages? Here's my view:

Libraries are much more important than core language features. Cayenne may have dependent types (cool!), but are there bindings for Flash file creation and a native-look GUI? Is there a Rich Text Format parsing library for D? What about fetching files via ftp from Mercury? Do you really want to write an SVG decoder for Clean?

Reliability and proven tools are even more important than libraries. Has anyone ever attempted a similar problem in Dolphin Smalltalk or Chicken Scheme or Wallaby Haskell...er, I mean Haskell? Has anyone ever attempted a problem of this scope at all in that language? Do you know that the compiler won't get exponentially slower when fed large programs? Can the profiler handle such large programs? Do you know how to track down why small variations in how a function is written result in bizarre spikes in memory usage? Have some of the useful but still experimental features been banged on by people working in a production environment? Are the Windows versions of the tools actually used by some of the core developers or is it viewed as a second rate platform? Will native compilation of a big project result in so much code that there's a global slowdown (something actually true of mid-1990s Erlang to C translators)?

You're more dependent on the decisions made by the language implementers than you think. Sure, toy textbook problems and tutorial examples always seem to work out beautifully. But at some point you'll find yourself dependent on some of the obscure corners of the compiler or run-time system, some odd case that didn't matter at all for the problem domain the language was created for, but has a very large impact on what you're trying to do.

Say you've got a program that operates on a large set of floating point values. Hundreds of megabytes of floating point values. And then one day, your Objective Caml program runs out of memory and dies. You were smart of course, and knew that floating point numbers are boxed most of the time in OCaml, causing them to be larger than necessary. But arrays of floats are always unboxed, so that's what you used for the big data structures. And you're still out of memory. The problem is that "float" in OCaml means "double." In C it would be a snap to switch from the 64-bit double type to single precision 32-bit floats, instantly saving hundreds of megabytes. Unfortunately, this is something that was never considered important by the OCaml implementers, so you've got to go in and mess with the compiler to change it. I'm not picking on OCaml here; the same issue applies to many languages with floating point types.

A similar, but harder to fix, example is if you discover that at a certain data set size, garbage collection crosses the line from "only noticeable if you're paying attention" to "bug report of the program going catatonic for several seconds." The garbage collector has already been carefully optimized, and it uses multiple generations, but there's always that point when the oldest generation needs to be scavenged and you sit helplessly while half a gigabyte of complex structures are traversed and copied. Can you fix this? That a theoretically better garbage collection methodology exists on paper somewhere isn't going to make this problem vanish.

By now fans of all sorts of underdog programming languages are lighting torches and collecting rotten fruit. And, really, I'm not trying to put down specific languages. When I'm on trial I can easily be accused of showing some favor to Forth and having spent time tinkering with J (which honestly does look like line noise in a way that would blow the minds of critics who level that charge against Perl). Yes, I'm a recovering language dilettante.

Real projects with tangible rewards do change my perceptions, however. With a $100,000,000 carrot hanging in front of me, I'd be looking solely at the real issues involved with the problem. Purely academic research projects immediately look ridiculous and scary. I'd become very open to writing key parts of an application in C, because that puts the final say on overall data sizes back in my control, instead finding out much later that the language system designer made choices about tagging and alignment and garbage collection that are at odds with my end goals. Python and Erlang get immediate boosts for having been used in large commercial projects, though each clearly has different strengths and weaknesses; I'd be worried about both of them if I needed to support some odd, non-UNIXy embedded hardware.

What would you do? And if a hundred million dollars changes your approach to getting things done in a quick and reliable fashion, then why isn't it your standard approach?

Functional Programming Archaeology

John Backus's Turing Award Lecture from 1977, Can Programming be Liberated from the Von Neumann Style? (warning: large PDF) was a key event in the history of functional programming. By no means did all of the ideas in the paper originate with Backus, and Dijkstra publicly criticized it as poorly thought through, but it did spur interest in functional programming research which eventually led to languages such as Haskell. And the paper is historically interesting as the crystallization of the beliefs about the benefits of functional programming at the time. There are two which jump out at me.

The first is concurrency as a primary motivation. If a program is just a series of side effect-free expressions, then there's no requirement that programs be executed sequentially. In a function call like this:

f(ExpressionA, ExpressionB, ExpressionC)

the three expressions have no interdependencies and can be executed in parallel. This could, in theory, apply all the way down to pieces of expressions. In this snippet of code:

(a + b) * (c + d)

the two additions could be performed at the same time. This fine-grained concurrency was seen as a key benefit of purely functional programming languages, but it fizzled, both because of the difficulty in determining how to parallelize programs efficiently and because it was a poor match for monolithic CPUs.

The second belief which has dropped off the radar since 1977 is the concept of an algebra of programs. Take this simple C expression:

!x

Assuming x is a truth value--either 0 or 1--then !x gives the same result as these expressions:

1 - x
x ^ 1
(x + 1) & 1

If the last of these appeared in code, then it could be mechanically translated to one of the simpler equivalents. Going further, you could imagine an interactive tool that would allow substitution of equivalent expressions, maybe even pointing out expressions that can be simplified.
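These equivalences are easy to check exhaustively, since a truth value is only ever 0 or 1. A quick Python check, with the expressions transcribed directly from the C above:

```python
# Verify that each expression agrees with C's !x for x in {0, 1}.
for x in (0, 1):
    expected = int(not x)  # what !x yields for a truth value
    assert 1 - x == expected
    assert x ^ 1 == expected
    assert (x + 1) & 1 == expected
print("all equivalences hold")
```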

Now in C this isn't all that useful. And in Erlang or Haskell it's not all that useful either, unless you avoid writing explicitly recursive functions with named values and instead express programs as a series of canned manipulations. This is the so-called point-free style which has a reputation for density to the point of opaqueness.

In Haskell code, point-free style is common, but not aggressively so. Rather than trying to work out a way to express a computation as the application of existing primitives, it's usually easier to write an explicitly recursive function. Haskell programmers aren't taught to lean on core primitive functions wherever possible, and core primitive functions weren't necessarily designed with that goal in mind. Sure, there's the usual map and fold and so on, but not a set of functions that would allow 90% of all programs to be expressed as application of those primitives.

Can Programming be Liberated... introduced fp, a language which didn't catch on and left very little in the way of tutorials or useful programming examples. fp was clearly influenced by Ken Iverson's APL, a language initially defined in 1962 (and unlike fp, you can still hunt down production code written in APL). The APL lineage continued after Backus's paper, eventually leading to APL2 and J (both of which involved Iverson) and a second branch of languages created by a friend of Iverson, Arthur Whitney: A+, K, and Q. Viewed in the right light, J is a melding of APL and fp. And the "build a program using core primitives" technique lives on in J.

Here's a simple problem: given an array (or list, if you prefer), return the indices of values which are greater than 5. For example, this input:

1 2 0 6 8 3 9 

gives this result:

3 4 6

which means that the elements in the original array at positions 3, 4, and 6 (where the first position is zero, not one) are all greater than 5. I'm using the APL/J/K list notation here, instead of the Haskelly [3,4,6]. How can we transform the original array to 3 4 6 without explicit loops, recursion, or named values?

First, we can find out which elements in the input list are greater than 5. This doesn't give us their positions, but it's a start.

1 2 0 6 8 3 9 > 5
0 0 0 1 1 0 1 

The first line is the input, the second the output. Greater than, like most J functions, operates on whole arrays, kind of like all operators in Haskell having map built in. The above example checks if each element of the input array is greater than 5 and returns an array of the results (0 = false, 1 = true).

There's another J primitive that builds a list of values from 0 up to n-1:

i. 5
0 1 2 3 4

Yes, extreme terseness is characteristic of J--just let it go for now. One interesting thing we can do with our original input is to build up a list of integers as long as the array.

i. # 1 2 0 6 8 3 9
0 1 2 3 4 5 6

(# is the length function.) Stare at this for a moment, and you'll see that the result is a list of the valid indices for the input array. So far we've got two different arrays created from the same input: 0 0 0 1 1 0 1 (where a 1 means "greater than 5") and 0 1 2 3 4 5 6 (the list of indices for the array). Now we take a bit of a leap. Pair these two arrays together: (first element of the first array, first element of the second array), etc., like this:

(0,0) (0,1) (0,2) (1,3) (1,4) (0,5) (1,6)

This isn't J notation; it's just a way of showing the pairings. Notice that if you remove all pairs that have a zero in the first position, then only three pairs are left. And the second elements of those pairs make up the answer we're looking for: 3 4 6. It turns out that J has an operator for pairing up arrays like this, where the first element is a count and the second is a value to repeat count times. Sort of a run-length expander. The key is that a count of zero can be viewed as "delete me" and a count of 1 as "copy me as is." Or in actual J code:

0 0 0 1 1 0 1 # 0 1 2 3 4 5 6
3 4 6

And there's our answer--finally! (Note that # in this case, with an operand on each side of it, is the "expand" function.) If you're ever going to teach a beginning programming course, go ahead and learn J first, so you can remember what it's like to be an utterly confused beginner.
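The whole pipeline translates step for step into Python, with list comprehensions standing in for J's whole-array primitives (the variable names are mine, not J's):

```python
xs = [1, 2, 0, 6, 8, 3, 9]

mask = [int(x > 5) for x in xs]  # like  xs > 5   ->  0 0 0 1 1 0 1
indices = list(range(len(xs)))   # like  i. # xs  ->  0 1 2 3 4 5 6

# Dyadic #: repeat each index 'count' times. A count of 0 means
# "delete me"; a count of 1 means "copy me as is."
result = [i for count, i in zip(mask, indices) for _ in range(count)]

print(result)  # [3, 4, 6]
```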

In the APL/J/K worlds, there's a collection of well-known phrases (that is, short sequences of functions) for operations like this, each made up of primitives. It's the community of programmers with the most experience working in a point-free style. Though I doubt those programmers consider themselves to be working with "an algebra of programs," as Backus envisioned, the documentation is sprinkled with snippets of code declared to be equivalent to primitives or other sequences of functions.

Why Garbage Collection Paranoia is Still (sometimes) Justified

"As new code was compiled, older code (and other memory used by the compiler) was orphaned, eventually causing the PC to run low on free memory. A slow garbage collection process would automatically occur when available memory became sufficiently low, and the compiler would be unresponsive until the process had completed, sometimes taking as long as 15 minutes."

—Naughty Dog's Jak and Daxter post-mortem

I know the title will bait people who won't actually read any of this article, so I'll say it right up front to make them feel the error of their reactionary ways: I am pro garbage collection. It has nothing to do with manual memory management supposedly being too hard (good grief no). What I like is that it stops me from thinking about trivial usage of memory at all. If it would be more convenient to briefly have a data structure in a different format, I just create a new version transformed the way I want it. Manually allocating and freeing these insignificant bits of memory is just busywork.

That's hardly a bold opinion in 2008. There are more programming languages in popular use with garbage collection than there are without. Most of the past paranoia about garbage collection slowness and pauses has been set aside in favor of increased productivity. Computers have gotten much faster. Garbage collectors have gotten better. But those old fears are still valid, those hitches and pauses still lurking, and not just in the same vague way that some people like to assume that integer division is dog slow even on a 3GHz processor. In fact, they apply to every garbage collected language implementation in existence. Or more formally:

In any garbage collector, there exists some pathological case where the responsiveness of your program will be compromised.

"Responsiveness" only matters for interactive applications or any program that's vaguely real-time. In a rocket engine monitoring system, responsiveness may mean "on the order of a few microseconds." In a robotic probe used for surgery, it might be "on the order of four milliseconds." For a desktop application, it might be in the realm of one to two seconds; beyond that, users will be shaking the mouse in frustration.

Now about the "pathological case." This is easy to prove. In a garbage collector, performance is always directly proportional to something. It might be total number of memory allocations. It might be the amount of live data. It might be something else. For the sake of discussion let's assume it's the amount of live data. Collection times might be acceptable for 10MB of live data, maybe even 100MB, but you can always come up with larger numbers: 250MB...or 2GB. Or in a couple of years, 20GB. No matter what you do, at some point the garbage collector is going to end up churning through those 250MB or 2GB or 20GB of data, and you're going to feel it.

Ah, but what about generational collectors? They're based on the observation that most objects are short lived, so memory is divided into a nursery for new allocations and a separate larger pool for older data (or even a third pool for grandfatherly data). When the nursery is full, live data is promoted to the larger pool. These fairly cheap nursery collections keep happening, and that big, secondary pool fills up a little more each time. And then, somewhere, sometime, the old generation fills up, all 200MB of it. This scheme has simply delayed the inevitable. The monster, full-memory collection is still there, waiting for when it will strike.
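A toy model makes the "delayed inevitable" concrete. This Python sketch fakes liveness with a flag and uses arbitrary made-up generation sizes; the class and method names are all hypothetical:

```python
class ToyGenGC:
    """Two-generation sketch: cheap nursery collections promote
    survivors; the old generation fills until a full collection
    finally has to walk everything."""
    def __init__(self, nursery_size=4, old_size=8):
        self.nursery, self.old = [], []
        self.nursery_size, self.old_size = nursery_size, old_size
        self.full_collections = 0

    def alloc(self, obj, live):
        if len(self.nursery) >= self.nursery_size:
            self.minor_collect()
        self.nursery.append((obj, live))

    def minor_collect(self):
        # Cheap: scan only the nursery, promote the survivors.
        self.old.extend(x for x in self.nursery if x[1])
        self.nursery = []
        if len(self.old) >= self.old_size:
            self.major_collect()

    def major_collect(self):
        # Expensive: scan the whole old generation. This is the big
        # pause that generational collection only postpones.
        self.full_collections += 1
        self.old = [x for x in self.old if x[1]]

gc = ToyGenGC()
for i in range(40):
    gc.alloc(i, live=(i % 2 == 0))  # half the objects stay live
print(gc.full_collections > 0)  # True: the monster collection still struck
```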

What about real time garbage collection? More and more, I'm starting to see this as a twist on the myth of the Sufficiently Smart Compiler. If you view "real time" as "well engineered and fast," then it applies to most collectors in use, and they each still have some point, somewhere down the road, at which the pretense of being real time falls apart. The other interpretation of real time is some form of incremental collection, where a little bit of GC happens here, a little bit there, and there's never a big, painful pause.

An interesting question is this: What language systems in existence are using a true incremental or concurrent garbage collector? I know of three: Java, Objective-C 2.0 (which just shipped with OS X Leopard), and the .NET runtime. Not Haskell. Not Erlang. Not Objective Caml [EDIT: The OCaml collector for the second generation is incremental]. Not any version of Lisp or Scheme. Not Smalltalk. Not Ruby. That raises a lot of questions. Clearly incremental and concurrent collection aren't magic bullets or they'd be a standard part of language implementations. Is it that the additional overhead of concurrent collection is only worthwhile in imperative languages with lots of frequently modified, cross-linked data? I don't know.

Incremental collection is a trickier problem than it sounds. You can't just look at an individual object and decide to copy or free it. In order to know if a data object is live or not, you've got to scan the rest of the world. The incremental collectors I'm familiar with work that way: they involve a full, non-incremental marking phase, and then copying and compaction are spread out over time. This means that the expense of such a collector is proportional to the amount of data that must be scanned during the marking phase and as such has a lurking pathological case.

Does knowing that garbage collectors break down at some point mean we should live in fear of them and go back to manual heap management? Of course not. But it does mean that some careful thought is still required when it comes to dealing with very large data sets in garbage collected languages.

Next time: A look at how garbage collection works in Erlang. The lurking monster is still there, but there are some interesting ways of delaying his attack.

Garbage Collection in Erlang

Given its "soft real time" label, I expected Erlang to use some fancy incremental garbage collection approach. And indeed, such an approach exists, but it's slower than traditional GC in practice (because it touches the entire heap, not just the live data). In reality, garbage collection in Erlang is fairly vanilla. Processes start out using a straightforward compacting collector. If a process gets large, it is automatically switched over to a generational scheme. The generational collector is simpler than in some languages, because there's no way to have an older generation pointing to data in a younger generation (remember, you can't destructively modify a list or tuple in Erlang).

The key is that garbage collection in Erlang is per process. A system may have tens of thousands of processes, using a gigabyte of memory overall, but if GC occurs in a process with a 20K heap, then the collector only touches that 20K and collection time is imperceptible. With lots of small processes, you can think of this as a truly incremental collector. But there's still a lurking worst case in Erlang: What if all of those processes run out of memory more or less in the same wall-clock moment? And there's nothing preventing an application from using one massive process (such is the case with the Wings 3D modeller).

Per-process GC allows a slick technique that can completely prevent garbage collection in some circumstances. Using spawn_opt instead of the more common spawn, you can specify the initial heap size for a process. If you know, as discovered through profiling, that a process rapidly grows up to 200K and then terminates, you can give that process an initial heap size of 200K. Data keeps getting added to the end of the heap, and then before garbage collection kicks in, the process heap is deleted and its contents are never scanned.
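As a sketch of how that looks in code (the heap size and the worker's job here are made-up stand-ins for values you'd discover through profiling; note that min_heap_size is measured in words, not bytes):

```erlang
-module(prealloc).
-export([start/0]).

%% Spawn a short-lived worker with a pre-sized heap so garbage
%% collection never kicks in before the process terminates.
start() ->
    spawn_opt(fun worker/0, [{min_heap_size, 200 * 1024}]).

worker() ->
    %% Build a pile of transient data and exit; the whole heap is
    %% simply freed, and its contents are never scanned.
    _ = [X * X || X <- lists:seq(1, 50000)],
    ok.
```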

The other pragmatic approach to reducing the cost of garbage collection in Erlang is that lots of data is kept outside of the per-process heaps:

Binaries > 64 bytes. Large binaries are allocated in a separate heap outside the scope of a process. Binaries can't, by definition, contain pointers to other data, so they're reference counted. If there's a 50MB binary loaded, it's guaranteed never to be copied as part of garbage collection.
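A small sketch of the consequence: a large binary can be handed to another process without being copied, because both processes hold references into the shared binary heap (the one-megabyte size is arbitrary):

```erlang
-module(big_bin).
-export([demo/0]).

%% Binaries over 64 bytes are reference counted and stored off-heap,
%% so sending one in a message passes a reference, not a copy.
demo() ->
    Big = binary:copy(<<0>>, 1000000),
    Self = self(),
    spawn(fun() -> Self ! {size, byte_size(Big)} end),
    receive
        {size, N} -> N
    end.
```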

Data stored in ETS tables. When you look up a key in an ETS table, the data associated with that key is copied into the heap for the process the request originated from. For structurally large values (say, a tuple of 500 elements) the copy from ETS table space to the process heap may become expensive, but if there's 100MB of total data in a table, there's no risk of all that data being scanned at once by a garbage collector.
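Here's a minimal sketch of that copying behavior (table and key names are arbitrary):

```erlang
-module(ets_copy).
-export([demo/0]).

%% The 500-element list lives in the table, outside any process heap;
%% ets:lookup copies it into the calling process before returning.
demo() ->
    T = ets:new(big_values, [set]),
    true = ets:insert(T, {key1, lists:duplicate(500, x)}),
    [{key1, Value}] = ets:lookup(T, key1),
    ets:delete(T),
    length(Value).
```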

Data structure constants. This is new in Erlang.

Atom names. Atom name strings are stored in a separate data area and are not garbage collected. In Lisp, it's common for symbol names to be stored on the main heap, which adds to garbage collection time. It also means that dynamically creating symbols is a reasonable approach to some problems in Lisp, but it's not something you want to do in Erlang.

Don't Structure Data All The Way Down

Let's write some functions to operate on circles, where a circle is defined by a 2D center point and a radius. In Erlang we've got some options for how to represent a circle:

{X, Y, R}                     % raw tuple
{circle, X, Y, R}             % tagged tuple
#circle{x = X, y = Y, r = R}  % icky record

Hmmm...why is a circle represented as a structure, but a point is unwrapped, so to speak, into two values? Attempt #2:

{{X,Y}, R}                    % raw tuples
{circle, {point,X,Y}, R}      % tagged tuples
...                           % gonna stop with records

Now let's write a function to compute the area of a circle, using this new representation:

area({circle, {point,_X,_Y}, R}) ->
math:pi() * R * R.

Simple enough. But take a few steps back and look at this. First, we're not actually making use of the structure of the data in area. We're just destructuring it to get the radius. And to do that destructuring, there's a bunch of code generated for this function: to verify the parameter is a tuple of size 3, to verify that the first element is the atom circle, to verify the second element is a tuple of size 3 with the atom point as the first element, and to extract the radius. Then there's a trivial bit of math and we've got an answer.

Now suppose we want to find the area of a circle of radius 17.4. We've got a nice function all set to go...sort of. We need the radius to be part of a circle, so we could try this:

area({circle, {point,0,0}, 17.4})

Kind of messy. What about a function to build a circle for us? Then we could do this:

area(make_circle(0, 0, 17.4))

We could also have a shorter version of make_circle that only takes a radius, defaulting the center point to 0,0. Okay, stop, we're engineering ourselves to death. All we need is a simple function to compute the area of a circle:

area(R) ->
math:pi() * R * R.

Resist the urge to wrap it into an abstract data type or an object. Keep it raw and unstructured and simple. If you want structure, add it one layer up, don't make it part of the foundation. In fact, I'd go so far as to say that if you pass a record-like data structure to a function and any of the elements in that structure aren't being used, then you should be operating on a simpler set of values and not a data structure. Keep the data flow obvious.

Back to the Basics of Functional Programming

I have been accused of taking the long way around to obvious conclusions. Fair enough. But to me it's not the conclusion so much as tracking the path that leads there, so perhaps I need to be more verbose and not go for a minimalist writing style. We shall see.

The modern functional programming world can be a daunting place. All this talk of the lambda calculus. Monads. A peculiar obsession with currying, even though it is really little more than a special case shortcut that saves a bit of finger typing at the expense of being hard to explain. And type systems. I'm going to remain neutral on the static vs. dynamic typing argument, but there's no denying that papers on type systems tend to be hardcore reading.

Functional programming is actually a whole lot simpler than any of this lets on. It's as if the theoreticians figured out functional programming long ago, and needed to come up with new twists to keep themselves amused and to keep the field challenging and mysterious. So where did functional programming come from? I won't even try to give a definitive history, but I can see the path that led to it looking like a good idea.

When I first learned Pascal (the only languages I knew previously were BASIC and 6502 assembly), there was a fixation with parameter passing in the textbooks I read and classes I took. In a procedure heading like this:

function max(a: integer; b: integer): integer;

"a" and "b" are formal parameters. If called with max(1,2), then 1 and 2 are the actual parameters. All very silly, and one of those cases where the trouble of additional terminology takes something mindlessly simple and makes it cumbersome. Half of my high school programming class was hung up on this for a good two weeks.

But then there's more: parameters can be passed by value or by reference. As in C, you can pass a structure by value, even if that structure is 10K in size, and the entire structure will be copied to the stack. And that's usually not a good idea, so by reference is the preferred method in this case... except that data passed by reference might be changed behind the scenes by any function you pass it to. Later languages, such as Ada, got all fancy with multiple types of "by reference" parameters: parameters that were read-only, parameters that were write-only (that is, were assumed to be overwritten by a function), and parameters that could be both read from and written to. All that extra syntax just to reduce the number of cases where a parameter could be stomped all over by a function, causing a global side effect.

One thing Wirth got completely right in Pascal is that "by reference" parameters don't turn into pointers at the language level. They're the same as the references that eventually made it into C++. Introduce full pointers into a language, especially with pointer arithmetic, and now things are really scary. Not only can data structures be modified by any function via reference parameters, but any piece of code can potentially reach out into random data space and tromp other variables in the system. And data structures can contain pointers into other data structures and all bets are off at that point. Any small snippet of code involving pointers can completely change the state of the program, and there's no compile-time analysis that can keep things under control.

There's a simple way out of the situation: Don't allow functions to modify data at all. With that rule in place, it makes no difference if parameters are passed by value or by reference, so the compiler can use whatever is most efficient (usually by value for atomic, primitive types and by reference for structured types). Rather shockingly, this works. It's theoretically possible to write any program without modifying data.
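In practice, "not modifying data" means returning new values instead of updating old ones. A minimal sketch (the point representation is just for illustration):

```erlang
-module(no_mutation).
-export([move/3]).

%% Rather than updating a point in place, build and return a new one.
%% The caller decides which version to keep; nothing is changed behind
%% anyone's back.
move({X, Y}, DX, DY) ->
    {X + DX, Y + DY}.
```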

The problem here is how to program in a purely functional manner, and this has gotten surprisingly short shrift in the functional programming community. Yes, types provide more information about intent and can be used to catch a certain class of errors at compile time. Higher-order functions are convenient. Currying is a neat trick. Monads allow I/O and other real-world nastiness to fit into a functional framework. Oh does mergesort look pretty in Haskell. I shudder to think of how tedious it was operating on binary trees in Pascal, yet the Erlang version is breathtakingly trivial.

But ask someone how to write Pac-Man--to choose a hopelessly dated video game--in a purely functional manner. Pac-Man affects the ghosts and the ghosts affect Pac-Man; can most newcomers to FP puzzle out how to do this without destructive updates? Or take just about any large, complex C++ program for that matter. It's doable, but requires techniques that aren't well documented, and it's not like there are many large functional programs that can be used as examples (especially if you remove compilers for functional programming languages from consideration). Monads, types, currying... they're useful, but, in a way, dodges. The most basic principle of writing code without destructive updates is the tricky part.

Five Memorable Books About Programming

I've read the classics--Structure and Interpretation of Computer Programs, Paradigms of Artificial Intelligence Programming--but I'd like to highlight some of the more esoteric books which affected my thinking.

Zen of Assembly Language
Michael Abrash, 1990

I spent much of the 1980s writing 8-bit computer games (which you can read about if you like). Odd as it may seem in retrospect, considering the relative power of an 8-bit 6502 running at sub 2 MHz, I wasn't obsessed with optimizing for performance. If anything, I wanted the code to be small, a side effect of using a line-based editor and writing programs that someone would have to painstakingly type in from a magazine listing. Pages of DATA 0AFF6CA900004021... ugh.

Right when that period of my life came to a close, along came Zen of Assembly Language, by an author I had never heard of, which dissected, explained, and extended the self-taught tricks from my years as a lone assembly hacker. Even though Abrash was focused on the 8086 and not the 6502, it felt like the book was written personally to me.

This is also one of the most bizarrely delayed technical book releases I can recall. The majority of the book was about detailed optimization for the 8088 and 8086, yet it was published when the 80486 was showing up in high-end desktop PCs.

Scientific Forth
Julian Noble, 1992

Fractals, Visualization, and J
Clifford Reiter, 2000

These two books are about entirely different subjects. One is about pure scientific computation. The other is about generating images. Each uses a different, wildly non-mainstream language for the included code.

And yet these two books follow the same general approach, one I wish were more commonly used. Superficially, the authors have written introductions to particular programming languages (which is why the language name is in the title). But in reality it's more that each author has an area of deep expertise and has found a language that enables experimenting with and writing code to solve problems in that field. As such, there aren't forced examples and toy problems, but serious, non-trivial programs that show a language in actual use. Dr. Noble demonstrates how he uses Forth for hardcore matrix work and, when he realizes that RPN notation isn't ideal in all circumstances, develops a translator from infix expressions to Forth. Clifford Reiter jumps into image processing algorithms, plus veers into lighter weight diversions with titles like "R/S Analysis, the Hurst Exponent, and Sunspots."

Both books are wonderful alternatives to the usual "Learning Programming Language of the Month" texts. Sadly, Julian Noble died in 2007.

Programmers At Work
Susan Lammers, 1986

I used to soak up printed interviews with programmers and game designers (and typically game designers were programmers as well). I was enthralled by Levy's Hackers, more the game development chapter than the rest. Programmers At Work was in the same vein: philosophies, ideas, and experiences directly from an odd mix of famous and quirky programmers. But the book wasn't primarily about tech. It was about creativity. Most of the people interviewed didn't have degrees in computer science. There wasn't an emphasis on math, proving programs correct, lambda calculus--just people coming up with ideas and implementing them. And the game connection was there: Jaron Lanier talking about the psychedelic Moon Dust for the Commodore 64, Bill Budge's bold (and still unfulfilled) plan to build a "Construction Set Construction Set."

Lammers's book was the model I used when I put together Halcyon Days. I also pulled Hackers into the mix by interviewing John Harris about his dissatisfaction with Levy's presentation of him. In an odd twist of fate, Programmers at Work and Halcyon Days were packaged together on a single CD sold through the Dr. Dobb's Journal library. The pairing has been around for ten years and is still available, much to my surprise.

Thinking Forth: A Language and Philosophy for Solving Problems
Leo Brodie, 1984

Yes, another book about Forth.

But this one is worth reading less for the Forth and more because it's one of the few books about how to decompose problems and structure code. You'd think this book was written by Fowler and friends, until you realize it's from the mid-1980s. That Brodie uses "factor" (which originated in the Forth community) instead of "refactor" is also a giveaway. What's impressive here is there's no OOP, no discussion of patterns, no heavy terminology. It's a book about understanding what you're trying to achieve, avoiding redundancy, and writing dead simple code.

It's worth it for the Forth, too, especially the interspersed bits of wisdom from experts, including Forth creator Chuck Moore.

In Praise of Non-Alphanumeric Identifiers

Here's a common definition of what constitutes a valid identifier in many programming languages:

The first character must be any letter (A-Z, a-z) or an underscore. Subsequent characters, if any, must be a letter, digit (0-9), or an underscore.

Simple enough. It applies to C, C++, ML, Python, most BASICs, most custom scripting languages (e.g., Game Maker Language). But of course there's no reason for this convention other than being familiar and expected.

One of my favorite non-alphanumeric characters for function names is "?". Why say is_uppercase (or IsUppercase or isUppercase) when you can use the more straightforward uppercase? instead? That's standard practice in Scheme and Forth, and I'm surprised it hasn't caught on in all new languages.

(As an aside, in Erlang you can use any atom as a function name. You can put non-alphanumeric characters in an atom if you remember to surround the entire name with single quotes. It really does work to have a function named 'uppercase?' though the quotes make it clunky.)
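For instance, this compiles and runs, though calling the function requires the quotes too (the character-range test is just a toy definition):

```erlang
-module(quoted).
-export(['uppercase?'/1]).

%% A function named with a quoted atom, non-alphanumeric character
%% included.
'uppercase?'(C) ->
    C >= $A andalso C =< $Z.
```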

Scheme's "!" is another good example. It's not mnemonic, and it doesn't carry the same meaning as in English punctuation. Instead it was arbitrarily designated a visual tag for "this function destructively updates data": set!, vector-set!. That's more concise than any other notation I can think of ("-m" for "mutates"? Yuck).

Forth goes much further, not only allowing any ASCII character in identifiers, but there's a long history of lexicographic conventions. The fetch and store words--"@" and "!"--are commonly appended to names, so color@ is read as "color fetch." That's a nice alternative to "get" and "set" prefixes. The Forthish #strings beats "numStrings" any day. Another Forth standard is including parentheses in a name, as in (open-file), to indicate that a word is low-level and for internal use only.

And then there are clever uses of characters in Forth that make related words look related, like this:

open{ write-byte write-string etc. }close

The brace is part of both open{ and }close. There's no reason the braces couldn't be dropped completely, but they provide a visual cue about scope.

Slumming with BASIC Programmers

I'm a registered user of BlitzMax, an extended BASIC-variant primarily targeted at people who want to write games. It's also the easiest way I've run across to deal with graphics, sound, and user input in a completely cross platform way, and that's why I use it. Every program I've written works perfectly under both OS X on my MacBook and Windows XP on my desktop; all it takes is a quick recompile. The same thing is possible with packages like SDL, but that involves manually fussing with C compiler set-ups. BlitzMax is so much more pleasant.

But still, it's BASIC. BASIC with garbage collection and OOP and Unicode strings, but BASIC nonetheless. It doesn't take long, reading through the BlitzMax community forums, to see that the average user has a shallower depth of programming experience than programmers who know Lisp or Erlang or Ruby. Or C for that matter. There are misconceptions about how data types work, superstitions involving performance that are right up there with music CDs colored with green marker, paranoia about recursion. The discussions about OOP, and the endearing notion that sealing every little bit of code and data into objects is somehow inherently right, feels like a rewind to 15 or more years ago. In Erlang or Python, to report the collision of two entities, I'd simply use:

{collision, EntityA, EntityB}

forgetting that this can be made more complex by defining a CollisionResult class with half a dozen methods for picking out the details of the obvious.

(For an even better example of this sort of time warp, consider PowerBASIC, which touts a keyword for indicating that a variable should be kept in a CPU register. It's C's register variables all over again, the difference being that this is still considered an important optimization in the PowerBASIC world.)

By this point I'm sure I've offended most of the BlitzMax and PowerBASIC users out there, and to everyone else it looks like I'm gleefully making fun of Blub programmers. I may be looking down my snooty language dilettante nose at them, yes, but from a get-things-done productivity point of view I'm seriously impressed. There's a continuous stream of games written in BlitzMax, games that are more than just retro-remakes of 8-bit relics, from people with very little programming experience. There are games with physics, games with elaborate effects, 3D games written with libraries designed for 2D, games written over the course of spring break. I'm withholding judgement on the raw aesthetics and playability of these games; the impressive part is that they exist and are largely written by people with minimal programming background. (I've downloaded more than one BlitzMax game that came with source code and marveled at how the entire core of the project was contained in a single 1000+ line function.)

Compare this with any hipper, purportedly more expressive language. I've written before about how you can count the number of games written in a purely functional style on one hand. Is it that language tinkerers are less concerned about writing real applications? That they know you can solve any problem with focused grunt work, but it's not interesting to them? That the spark and newness of a different language is its own reward? Either way, the BASIC programmers win when it comes down to getting projects finished.

My Road to Erlang

I had three or four job offers my last semester of college, all of them with telecom companies just north of Dallas. I ended up working for Ericsson in the early 1990s.

Now if you're expecting me to talk about how I hung around with the brilliant folks who developed Erlang...don't. Ericsson's telephone exchanges were programmed in a custom, baroque language called PLEX. Syntactically it was a cross between Fortran and a macro assembler. You couldn't pass parameters to functions, for example; you assigned them to variables instead, much like the limited GOSUB of an 8-bit BASIC. The advantage was that there was a clean one-to-one correspondence between the source code and the generated assembly code, a necessity when it came to writing patches for live systems where taking them down for maintenance was Very Bad Indeed.

The other thing worth mentioning about PLEX and Ericsson's hardware of the time is that they were custom designed for large scale message passing concurrency. That hardware created in the 1970s and 1980s was built to handle tens of thousands of processes certainly makes dual core CPUs seem a bit late to the party.

Ericsson had a habit of periodically sending employees to the mothership in Sweden, and after one such trip my office mate brought back a sheet of paper with a short, completely unintelligible to me, Erlang program on it. In three years at Ericsson, my total exposure to Erlang was about thirty seconds. I left shortly after that, getting back into game development.

In 1998 I started looking at very high level programming languages, because I was in a rut and getting behind the times. Most of my experience was in various assembly languages (6502, 68000, 8086, SH2, PowerPC), C, Pascal, plus some oddities like Forth. The only modern language I was familiar with was Perl, so like everyone else at the time I could write CGI scripts. I wanted to leapfrog ahead, to get myself out of the low-level world, so I looked to functional programming (covered a bit more in the first part of Admitting that Functional Programming Can Be Awkward).

I worked through online tutorials for three languages: Standard ML, OCaml, and Haskell. I had fun with them, especially OCaml, but there were two glaring issues that held me back.

The first was that the tutorials were self-absorbed in the accoutrements of functional programming: type systems, fancy ways of using types for generic programming, lambda calculus tricks like currying. The tutorials for all three languages were surprisingly similar. The examples were either trivial or geared toward writing compilers. At the time I was interested in complex, interactive programs--video games--but I didn't have a clue about how to structure even the simplest of games in Haskell. There were a few trivial games written in OCaml, but they made heavy use of imperative features which made me wonder what the point was.

The second issue was that I was used to working on commercial products, and there was little evidence at the time that Standard ML, OCaml, or Haskell was up to the task. Would they scale up to programs orders of magnitude larger than class assignments? And more critically, would functional programming scale up? Would I hit a point when the garbage collector crossed the line from imperceptible to perceptible? Would there be anything I could possibly do if that happened? Would lazy evaluation become too difficult to reason about? There was also the worry that Windows seemed to be a "barely there" platform in the eyes of all three language maintainers. The OCaml interpreter had a beautiful interactive shell under MacOS, but the Windows version was--or should have been--a great embarrassment to everyone involved.

Somewhere in this period I also took a hard look at Lisp (I was one of the first registered users of Corman Lisp) and Erlang. Erlang wasn't open source yet and required a license for commercial use. The evaluation version still used the old JAM runtime instead of the more modern BEAM and was dog slow. It also had the same dated, cold, industrial feeling of the systems I used when at Ericsson. I put it aside and kept tinkering elsewhere.

But I came back when the move to open source occurred. Here's why:

I found I had an easier time writing programs in Erlang. I was focused entirely on the mysterious concept of writing code that didn't involve destructive updates. I wasn't distracted by type systems or complex module and class systems. The lack of "let" and "where" clauses in Erlang makes code easier to structure, without the creeping indentation caused by numerous levels of scope. Atoms are a beautiful data type to work with.

That the tools had been used to ship large-scale commercial products gave me faith in them. Yes, they are cold and industrial, and I started seeing that as a good thing. The warts and quirks are there for a reason: because they were needed in order to get a project out the door. I'll take pragmatism over idealism any day.

Besides being useful in its own right, concurrency is a good pressure valve for functional programming. Too difficult to puzzle out how to make a large functional program work? Break it into multiple smaller programs that communicate.

Speed was much improved. The BEAM runtime is 3x faster than JAM. When I started looking at different languages, I was obsessed with performance, but I eventually realized I was limiting my options. I could always go back to the madness of writing assembly code if I cared that much. Flexibility mattered more. The 3x improvement pushed Erlang from "kinda slow" to "good enough."

I've been using Erlang since 1999, but I hardly think of myself as a fanatic. I still use Perl, Python, C++, with occasional forays into REBOL, Lua, and Forth, plus some other special purpose languages. They all have strengths and weaknesses. But for most difficult problems I run into, Erlang is my first choice.

Purely Functional Retrogames, Part 1

When I started looking into functional languages in 1998, I had just come off a series of projects writing video games for underpowered hardware: Super Nintendo, SEGA Saturn, early PowerPC-based Macintoshes without any graphics acceleration. My benchmark for usefulness was "Can a programming language be used to write complex, performance intensive video games?"

After working through basic tutorials, and coming to grips with the lack of destructive updates, I started thinking about how to write trivial games, like Pac-Man or Defender, in a purely functional manner. Then I realized that it wasn't performance that was the issue, it was much more fundamental.

I had no idea how to structure the most trivial of games without using destructive updates.

Pac-Man is dead simple in any language that fits the same general model as C. There are a bunch of globals representing the position of Pac-Man, the score, the level, and so on. Ghost information is stored in a short array of structures. Then there's an array representing the maze, where each element is either a piece of the maze or a dot. If Pac-Man eats a dot, the maze array is updated. If Pac-Man hits a blue ghost, that ghost's structure is updated to reflect a new state. There were dozens and dozens of Pac-Man clones in the early 1980s, including tiny versions that you could type in from a magazine.

In a purely functional language, none of this works. If Pac-Man eats a dot, the maze can't be directly updated. If Pac-Man hits a blue ghost, there's no way to directly change the state of the ghost. How could this possibly work?

That was a long time ago, and I've spent enough time with functional languages to have figured out how to implement non-trivial, interactive applications like video games. My plan is to cover this information in a short series of entries. I'm sticking with 8-bit retrogames because they're simple and everyone knows what Pac-Man looks like. I don't want to use abstract examples involving hypothetical game designs. I'm also sticking with purely functional programming language features, because that's the challenge. I know that ML has references and that processes in Erlang can be used to mimic objects, but if you go down that road you might as well be using C.

The one exception to "purely functional" is that I don't care about trying to make I/O fit a functional model. In a game, there are three I/O needs: input from the user, a way to render graphics on the screen, and a real-time clock. Fortunately, these only matter at the very highest level outer loop, one that looks like:

repeat forever {
get user input
process one frame
draw everything on the screen
wait until a frame's worth of time has elapsed
}

"Process one frame" is the interesting part. It takes the current game state and user input as parameters and returns a new game state. Then that game state can be used for the "draw everything" step. "Draw everything" can also be purely functional, returning an abstract list of sprites and coordinates, a list that can be passed directly to a lower level, and inherently impure, function that talks to the graphics hardware.

An open question is "Is being purely functional, even excepting I/O, worthwhile?" Or is it, as was suggested to me via email earlier this year, the equivalent of writing a novel without using the letter 'e'?

Part 2

Purely Functional Retrogames, Part 2

(Read Part 1 if you missed it.)

The difficult, or at least different, part of writing a game in a purely functional style is living without global, destructive updates. But before getting into how to deal with that, anything that can be done to reduce the need for destructive updates is going to make things easier later on.

Back when I actually wrote 8-bit games, much of my code involved updating timers and counters used for animation and special effects and so on. At the time it made a lot of sense, given the limited math capabilities of a 6502. In the modern world you can achieve the same by using a single clock counter that gets incremented each frame.

Ever notice how the power pills in Pac-Man blink on and off? Let's say the game clock is incremented every 1/60th of a second, and the pills flop from visible to invisible--or the other way around--twice per second (or every 30 ticks of the clock). The state of the pills can be computed directly from the clock value:

pills_are_visible(Clock) ->
is_even(Clock div 30).

No special counters, no destructive updates of any kind. Similarly, the current frame of the animation of a Pac-Man ghost can be computed given the same clock:

current_ghost_frame(Clock) ->
Offset = Clock rem TOTAL_GHOST_ANIMATION_LENGTH,
Offset div TIME_PER_ANIMATION_FRAME.

Again, no special counters and no per frame updates. The clock can also be used for general event timers. Let's say the bonus fruit appears 30 seconds after a level starts. All we need is one value: the value of the clock when the level started plus 30*60. Each frame we check to see if the clock matches that value.
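The same trick translates directly to any language. Here's a quick Python sketch (the function names and the 60-tick clock constant are my own, for illustration):

```python
TICKS_PER_SECOND = 60

def fruit_appearance_time(level_start_clock):
    # The bonus fruit appears 30 seconds after the level starts.
    return level_start_clock + 30 * TICKS_PER_SECOND

def fruit_appears_now(clock, level_start_clock):
    # Checked once per frame; no countdown timer to update.
    return clock == fruit_appearance_time(level_start_clock)
```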

None of this is specific to functional programming. It's common in C and other languages. (The reason it was ugly on the 6502 was because of the lack of division and remainder instructions, and managing a single global clock involved verbose 24-bit math.)

There are limits to how much a single clock value can be exploited. You can't make every enemy in Robotron operate entirely as a function of time, because they react to other stimuli in the world, such as the position of the player. If you think about this trick a bit, what's actually going on is that some data is entirely dependent on other data. One value can be used to compute others. This makes a dynamic world a whole lot more static than it may first seem.

Getting away from clocks and timing, there are other hidden dependencies in the typical retro-style game. In a procedural implementation of Pac-Man, when Pac-Man collides with a blue ghost, a global score is incremented. This is exactly the kind of hidden update that gets ugly with a purely functional approach. Sure, you could return some special data indicating that the score should change, but there's no need.

Let's say that each ghost has a state that looks like this: {State_name, Starting_time}. When a ghost has been eaten and is attempting to return to the box in the center of the maze, the state might be {return_to_box, 56700}. (56700 was the value of the master clock when the ghost was eaten.) Or it might be more fine-grained than that, but you get the idea. The important part is that there's enough information here to realize that a ghost was eaten during the current frame: if the state name is "return_to_box" and the starting time is the same as the current game clock. A separate function can scan through the ghost states and look for events that would cause a score increase.
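As a sketch of that separate scanning function, in Python rather than Erlang (the state tuples follow the shape above; the 200-point value is just an example):

```python
def ghosts_eaten_this_frame(ghost_states, clock):
    # A ghost was eaten this frame exactly when it entered the
    # return_to_box state at the current clock tick.
    return [g for g in ghost_states
            if g[0] == "return_to_box" and g[1] == clock]

def score_delta(ghost_states, clock):
    # The ghost logic never touches the score; this function
    # derives the score change from the ghost states alone.
    return 200 * len(ghosts_eaten_this_frame(ghost_states, clock))
```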

The same technique also applies to when sounds are played. It's not something that has to be a side effect of the ghost behavior handling code. There's enough implicit information, given the state of the rest of the world, to make decisions about when sounds should be played. Using the example from the preceding paragraph, the same criteria for indicating a score increase can also be used to trigger the "ghost eaten" sound.

Part 3

Purely Functional Retrogames, Part 3

(Read Part 1 if you missed it.)

Every entity in a game needs some data to define where it is and what it's doing. At first thought, a ghost in Pac-Man might be defined by:

{X, Y, Color}

which looks easy enough, but it's naive. There needs to be a lot more data than that: direction of movement, behavior state, some base clock values for animation, etc. And this is just simplistic Pac-Man. In an imperative or OO language this topic barely deserves thought. Just create a structure or object for each entity type and add fields as the situation arises. If the structure eventually contains 50 fields, who cares? But...

In a functional language, the worst thing you can do is create a large "struct" containing all the data you think you might need for an entity.

First, this doesn't scale well. Each time you want to "change" a field value, a whole new structure is created. For Pac-Man it's irrelevant--there are only a handful of entities. But the key is that if you add a single field, then you're adding overhead across the board to all of the entity processing in your entire program. The second reason this is a bad idea is that it hides the flow of data. You no longer know what values are important to a function. You're just passing in everything, and that makes it harder to experiment with writing simple, obviously correct primitives. Which is less opaque:

step_toward({X,Y}, TargetX, TargetY, Speed) ->
...

step_toward(EntityData, TargetX, TargetY, Speed) ->
...

The advantage of the first one is that you don't need to know what an entity looks like. You might not have thought that far ahead, which is fine. You've got a simple function for operating on coordinate pairs which can be used in a variety of places, not just for entity movement.

If we can't use a big struct, what does an entity look like? There are undoubtedly many ways to approach this, but I came up with the following scheme. Fundamentally, an entity is defined by an ID of some sort ("I am one of those fast moving spinning things in Robotron"), movement data (a position and maybe velocity), and the current behavioral state. At the highest level:

{Id, Position, State}

Each of these has more data behind it, and that data varies based on the entity type, the current behavior, and so on. Position might be one of the following:

{X, Y}
{X, Y, XVelocity, YVelocity}

State might look like:

{Name, StartTime, EndTime}
{Name, StartTime, EndTime, SomeStateSpecificData}

StartTime is so there's a base clock to use for animation or to know how long the current state has been running. EndTime is the time in the future when the state should end; it isn't needed for all states.

In my experiments, this scheme got me pretty far. Everything is very clean at a high level--a three element tuple--and below that there's still the absolute minimum amount of data not only per entity type, but for the exact state that the entity is in. Compare that to the normal "put everything in a struct" approach, where fields needed only for the "return to center of maze" ghost logic are always sitting there, unused in most states.
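Rendered as Python tuples, the same scheme might look like this (all of the names and values are hypothetical):

```python
# {Id, Position, State}: each state carries only the data it needs.
ghost_chasing = ("ghost", (96, 120), ("chase", 56000, 56600))
ghost_returning = ("ghost", (96, 120),
                   ("return_to_box", 56700, None, (112, 104)))

def state_name(entity):
    # The three-element shape is uniform at the top level,
    # so generic helpers stay trivial.
    _id, _position, state = entity
    return state[0]
```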

But wait, what about additional state information, such as indicating that a Pac-Man ghost is invulnerable (which is true when a ghost has been reduced to a pair of eyes returning to the center of the maze)? If you remember Part 2, then the parenthetical note in the previous sentence should give it away. If the ghost is invulnerable when in a specific state, then there's no need for a separate flag. Just check the state.

Part 4

Purely Functional Retrogames, Part 4

(Read Part 1 if you missed it.)

By the definition of functional programming, functions can't access any data that isn't passed in. That means you need to think about what data is needed for a particular function, and "thread" that data through your program so a function can access it. It sounds horrible when written down, but it's easy in practice.

In fact, just working out the data dependencies in a simple game is an eye-opening exercise. It usually turns out that there are far fewer dependencies than you might imagine. In Pac-Man, there's an awful lot of state that makes no difference to how the ghosts move: the player's score, whether the fruit is visible or not, the location of dots in the maze. Similarly, the core movement of Pac-Man, ignoring collision detection, only relies on a handful of factors: the joystick position, the location of walls in the maze (which are constant, because there's only one maze), and the current movement speed (which increases as mazes are completed).

That was the easy part. The tricky bit is how to handle functions that affect the state of the world. Now of course a function doesn't actually change anything, but somehow those effects on the world need to be passed back out so the rest of the game knows about them. The "move Pac-Man" routine returns the new state of Pac-Man (see Part 3 for more about how entity state is represented). If collision detection is part of the "move Pac-Man" function, then there are more possible changes to the world: a dot has been eaten, a power pill has been eaten, fruit has been eaten, Pac-Man is dead (because of collision with a non-blue ghost), a ghost is dead (because of a collision with a powered-up Pac-Man).

When I first mused over writing a game in a purely functional style, this had me stymied. One simple function ends up possibly changing the entire state of the world? Should that function take the whole world as input and return a brand new world as output? Why even use functional programming, then?

A clean alternative is not to return new versions of anything, but to simply return statements about what happened. Using the above example, the movement routine would return a list of any of these side effects:

{new_position, Coordinates}
{ate_ghost, GhostName}
{ate_dot, Coordinates}
ate_fruit
killed_by_ghost

All of a sudden things are a lot simpler. You can pass in the relevant parts of the state of the world, and get back a simple list detailing what happened. Actually handling what happened is a separate step, one that can be done later on in the frame. The advantage here is that changes to core world data don't have to be painstakingly threaded in and out of all functions in the game.
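A Python sketch of that split (the maze representation and point value are invented; only the event names come from the list above):

```python
def move_pacman(position, maze):
    # Returns statements about what happened instead of
    # mutating the world. Actual movement logic is elided.
    events = [("new_position", position)]
    if maze.get(position) == "dot":
        events.append(("ate_dot", position))
    return events

def apply_score(score, events):
    # Handled later in the frame, in one place.
    return score + sum(10 for e in events if e[0] == "ate_dot")
```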

I'd like to write some concluding thoughts on this series, to answer the "Why do this?" and "What about Functional Reactive Programming?" questions--among others--but wow I've already taken just about a month for these four short entries, so I'm not going to jump into that just yet.

(I eventually wrote the follow-up.)

Don't Be Afraid of Special Cases

In the body of work on low-level optimization, there's a heavy emphasis on avoiding branches. Here's a well-known snippet of x86 code which sets eax to the smaller of the two values in eax and ecx:

sub ecx, eax
sbb edx, edx
and ecx, edx
add eax, ecx

At the CPU hardware level, branches are indeed expensive and messy. A mispredicted branch empties the entire instruction pipeline, and it can take a dozen or more cycles to get that pipeline full and ticking along optimally again.

But that's only at the lowest level, and unless you're writing a code generator or a routine that's hyper-sensitive to instruction-level tweaks, like movie compression or software texture mapping, it's doubtful that going out of your way to avoid branches will be significant. Ignoring efficiency completely, there's still the stigma that code with many conditionals in it, to handle special cases, is inherently ugly, even poorly engineered.

That's the programmer's code-centric view. The user of an application isn't thinking like that at all. He or she is thinking purely about ease of use, and ugly is when a program displays "1 files deleted" (or even "1 file(s) deleted"), or puts up a dialog box that crosses between two monitors, making it unreadable.

In 1996-7 I wrote a game called "Bumbler" for the Mac. (Yes, I've brought this up before, but that's because I spent 18 months as a full-time indie game developer, which was more valuable than--and probably just as expensive as--getting another college degree.) Bumbler is an insect-themed shooter that takes place on a honeycomb background. When an insect is killed, the honeycomb behind it fills with honey. Every Nth honeycomb fills with pulsing red "special honey," which you can fly over and something special happens. Think "power-ups."

The logic driving event selection isn't just a simple random choice between the seven available special honey effects. I could have done that, sure, but it would have been a lazy decision on my part, one that would have hurt the game in noticeable ways. Here are some of the special honey events and the special cases involved:

Create hole. This added a hole to the honeycomb that insects could crawl out of, the only negative special honey event. During play testing I found out that if a hole was created near the player start position, the player would often collide with an insect crawling out of it at the start of a life. So "create hole" was disallowed in a rectangular region surrounding the player start. It was also not allowed if there were already a certain number of holes in the honeycomb, to avoid too much unpredictability.

Release flowers. This spawned bonus flowers from each of the holes in the honeycomb. But if there were already many flowers on the screen, then the player could miss this entirely, and it looked like nothing happened. If there were more than X flowers on the screen, this event was removed from the list of possibilities.

Flower magnet. This caused all the flowers on the screen to flash yellow and home in on the player. This was good, because you got more points, but bad because the flowers blocked your shots. To make this a rare event, one that the player would be surprised by, it was special-cased to not occur during the first ten or so levels, plus once it happened it couldn't be triggered again for another five levels. Okay, that's two special cases. Additionally, if there weren't any flowers on the screen, then it looked like nothing happened, and if there were only a few flowers, it was underwhelming. So this event was only allowed if there were a lot of flowers in existence.

All of these cases improved the game, and play testing supported them. Did they make the code longer and arguably uglier? Yes. Much more so because I wasn't using a language that encourages adding special cases in an unobtrusive way. One of the advantages of a language with pattern matching, like Erlang or Haskell or ML, is that there's a programming assistant of sorts, one that takes your haphazard lists of special cases--patterns--and turns them into an optimal sequence of old-fashioned conditionals, a jump table, or even a hash table.

Coding as Performance

I want to talk about performance coding. Not coding for speed, but coding as performance, a la live coding. Okay, I don't really want to talk about that either, as it mostly involves audio programming languages used for on-the-fly music composition, but I like the principle of it: writing programs very quickly, on the timescale of a TV show or movie rather than the years it can take to complete a commercial product. Take any book on agile development or extreme programming and replace "weeks" with "hours" and "days" with "minutes."

Think of it in terms of a co-worker or friend who comes to you with a problem, something that could be done by hand, but would involve much repetitive work ("I've got a big directory tree, and I need a list of the sum total sizes of all files with the same root names, so hello.txt, hello.doc, and hello.whatever would just show in the report as 'hello', followed by the total size of those three files"). If you can write a program to solve the problem in less time than the tedium of slogging through the manual approach, then you win. There's no reason to limit this game to this kind of problem, but it's a starting point.
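That directory-tree problem is a nice benchmark for the game. A Python solution can be banged out in a few minutes (a sketch; the function name is my own):

```python
import os
from collections import defaultdict

def sizes_by_root(top):
    # hello.txt, hello.doc, and hello.whatever all count
    # toward a single "hello" entry in the report.
    totals = defaultdict(int)
    for dirpath, _dirs, files in os.walk(top):
        for name in files:
            root, _ext = os.path.splitext(name)
            totals[root] += os.path.getsize(os.path.join(dirpath, name))
    return dict(totals)
```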

Working at this level, the difference between gut instinct and proper engineering becomes obvious. The latter always seems to involve additional time--architecture, modularity, code formatting, interface specification--which is exactly what's in short supply in coding as performance. Imagine you want to plant a brand new vegetable garden somewhere in your yard, and the first task is to stake out the plot. Odds are good that you'll be perfectly successful by just eyeballing it, hammering a wooden stake at one corner, and using it as a reference. Or you could be more formal and use a tape measure. The ultimate, guaranteed correct solution is to hire a team of surveyors to make sure the distances are exact and the sides perfectly parallel. But really, who would do that?

(And if you're thinking "not me," consider people like myself who've grepped a two-hundred megabyte XML file, because it was easier than remembering how to use the available XML parsing libraries. If your reaction is one of horror because I clearly don't understand the whole purpose of using XML to structure data, then there you go. You'd hire the surveyors.)

You can easily spot the programming languages designed for projects operating on shorter timescales. Common, non-trivial operations are built-in, like regular expressions and matrix math (as an aside, the original BASIC language from the 1960s had matrix operators). Common functions--reading a file, getting the size of a file--don't require importing libraries, or first remembering that getting the size of a file isn't a core operation in the "file" library but instead lives in "os:file:filesize" or wherever the hierarchical-thinking author put it. But really, any language of the Python or Ruby class is going to be fine. The big wins are having an interactive read / evaluate / print loop, zero compilation time, and data structures that don't require thinking about low-level implementation details.

What matter just as much are visualization tools, so you can avoid the classic pitfall of engineering something for weeks or months only to finally realize that you didn't understand the problem and engineered the wrong thing. (Students of Dijkstra are ready with some good examples of math problems where attempting to guess an answer based on a drawing gives hopelessly incorrect answers, but I'll pretend I don't see them, there in the back, frantically waving their arms.)

I once used an 8-bit debugger with an interrupt-driven display. Sixty times per second, the display was updated. This meant that memory dumps were live. If a running program constantly changed a value, that memory location showed as blurred digits on the screen. You could also see numbers occasionally flick from 0 to 255, then back later. Static parts of the screen meant nothing was changing there. This sounds simple, but wow was it useful for accidentally spotting memory overruns and logic errors. I often never suspected a problem, and I wouldn't have even known what to look for, but found an error just by seeing movement or patterns in a memory dump that didn't look right.

A modern visualization tool I can't live without is RegEx Coach. I always try out regular expressions using it before copying them over to my Perl or Python scripts. When I make an error, I see it right away. That prevents situations where the rest of my program is fine, but a botched regular expression isn't pulling in exactly the data I'm expecting.

The J language ships with some great visualization tools. Arguably it's the nicest programming environment I've ever used, even though I go back and forth about whether J itself is brilliant or insane. There's a standard library module which takes a matrix and displays it as a grid of colors. Identical values use the same color. Simplistic? Yes. But this display format makes patterns and anomalies jump out of the screen. If you're thinking that you don't write code that involves matrix math, realize that matrices are native to J and you can easily put all sorts of data into a matrix format (in fact, the preferred term for a matrix in J is the more casual "table").

J also has a similar tool that mimics a spreadsheet display. Pass in data, and up pops what looks like an Excel window, making it easy to view data that is naturally columnar. It's easier than dumping values to an HTML file or the old-fashioned method of debug printing a table using a fixed-width font. There's also an elaborate module for graphing data; no need to export it to a file and use a standalone program.

I'm hardly suggesting that everyone--or anyone--switch over to J. It's not the language semantics that matter so much as tools that are focused on interactivity, on working through problems quickly. And the realization that it is valid to get an answer without always bringing the concerns of software engineering--and the time penalty that comes with them--into the picture.

A Spellchecker Used to Be a Major Feat of Software Engineering

Here's the situation: it's 1984, and you're assigned to write the spellchecker for a new MS-DOS word processor. Some users, but not many, will have 640K of memory in their PCs. You need to support systems with as little as 256K. That's a quarter megabyte to contain the word processor, the document being edited, and the memory needed by the operating system. Oh, and the spellchecker.

For reference, on my MacBook, the standard dictionary in /usr/share/dict/words is 2,486,813 bytes and contains 234,936 words.

An enticing first option is a data format that's more compressed than raw text. The UNIX dictionary contains stop and stopped and stopping, so there's a lot of repetition. A clever trie implementation might do the trick...but we'll need a big decrease to go from 2+ megabytes to a hundred K or so.
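To see why a trie is enticing, here's a minimal Python version (a sketch of the idea only; dictionaries of dictionaries like this would never hit the 1984 memory target):

```python
def trie_insert(trie, word):
    node = trie
    for ch in word:
        node = node.setdefault(ch, {})
    node["$"] = True  # end-of-word marker

def trie_contains(trie, word):
    node = trie
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return "$" in node

# stop, stopped, and stopping share one copy of the prefix "stop".
trie = {}
for w in ["stop", "stopped", "stopping"]:
    trie_insert(trie, w)
```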

In fact, even if we could represent each word in the spellchecker dictionary as a single byte, we'd need almost all of the 256K just for that, and of course the single byte representation isn't going to work. So not only does keeping the whole dictionary in RAM look hopeless, but so does keeping the actual dictionary on disk with only an index in RAM.

Now it gets messy. We could try taking a subset of the dictionary, one containing the most common words, and heavily compressing that so it fits in memory. Then we come up with a slower, disk-based mechanism for looking up the rest of the words. Or maybe we jump directly to a completely disk-based solution using a custom database of sorts (remembering, too, that we can't assume the user has a hard disk, so the dictionary still needs to be crunched onto a 360K floppy disk).

On top of this, we need to handle some other features, such as the user adding new words to the dictionary.

Writing a spellchecker in the mid-1980s was a hard problem. Programmers came up with some impressive data compression methods in response to the spellchecker challenge. Likewise there were some very clever data structures for quickly finding words in a compressed dictionary. This was a problem that could take months of focused effort to work out a solution to. (And, for the record, reducing the size of the dictionary from 200,000+ to 50,000 or even 20,000 words was a reasonable option, but even that doesn't leave the door open for a naive approach.)

Fast forward to today. A program to load /usr/share/dict/words into a hash table is 3-5 lines of Perl or Python, depending on how terse you care to be. Looking up a word in this hash table dictionary is a trivial expression, one built into the language. And that's it. Sure, you could come up with some ways to decrease the load time or reduce the memory footprint, but that's icing and likely won't be needed. The basic implementation is so mindlessly trivial that it could be an exercise for the reader in an early chapter of any Python tutorial.
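For the record, here's the entire modern "spellchecker" in Python (with a tiny inline word list standing in for /usr/share/dict/words so the snippet is self-contained):

```python
# In practice: words = open("/usr/share/dict/words").read().split()
words = ["apple", "banana", "cherry"]
dictionary = set(words)  # the whole spellchecker database

def is_spelled_correctly(word):
    return word.lower() in dictionary
```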

That's progress.

Want to Write a Compiler? Just Read These Two Papers.

Imagine you don't know anything about programming, and you want to learn how to do it. You take a look at Amazon.com, and there's a highly recommended set of books by Knute or something with a promising title, The Art of Computer Programming, so you buy them. Now imagine that it's more than just a poor choice, but that all the books on programming are written at that level.

That's the situation with books about writing compilers.

It's not that they're bad books, they're just too broadly scoped, and the authors present so much information that it's hard to know where to begin. Some books are better than others, but there are still the thick chapters about converting regular expressions into executable state machines and different types of grammars and so on. After slogging through it all you will have undoubtedly expanded your knowledge, but you're no closer to actually writing a working compiler.

Not surprisingly, the opaqueness of these books has led to the myth that compilers are hard to write.

The best source for breaking this myth is Jack Crenshaw's series, Let's Build a Compiler!, which started in 1988. This is one of those gems of technical writing where what's assumed to be a complex topic ends up being suitable for a first year programming class. He focuses on compilers of the Turbo Pascal class: single pass, parsing and code generation are intermingled, and only the most basic of optimizations are applied to the resulting code. The original tutorials used Pascal as the implementation language, but there's a C version out there, too. If you're truly adventurous, Marcel Hendrix has done a Forth translation (and as Forth is an interactive language, it's easier to experiment with and understand than the C or Pascal sources).

As good as it is, Crenshaw's series has one major omission: there's no internal representation of the program at all. That is, no abstract syntax tree. It is indeed possible to bypass this step if you're willing to give up flexibility, but the main reason it's not in the tutorials is because manipulating trees in Pascal is out of sync with the simplicity of the rest of the code he presents. If you're working in a higher level language--Python, Ruby, Erlang, Haskell, Lisp--then this worry goes away. It's trivially easy to create and manipulate tree-like representations of data. Indeed, this is what Lisp, Erlang, and Haskell were designed for.

That brings me to A Nanopass Framework for Compiler Education [PDF] by Sarkar, Waddell, and Dybvig. The details of this paper aren't quite as important as the general concept: a compiler is nothing more than a series of transformations of the internal representation of a program. The authors promote using dozens or hundreds of compiler passes, each being as simple as possible. Don't combine transformations; keep them separate. The framework mentioned in the title is a way of specifying the inputs and outputs for each pass. The code is in Scheme, which is dynamically typed, so data is validated at runtime.
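The concept fits in a few lines. In this Python sketch, each pass is a small tree-to-tree function and the "compiler" is just their composition (the two passes are toy examples of mine, not from the paper):

```python
# Expressions as nested tuples, e.g. ("+", ("*", 2, 3), 4).

def fold_constants(expr):
    # One pass: evaluate operators whose operands are literals.
    if isinstance(expr, tuple):
        op, a, b = expr
        a, b = fold_constants(a), fold_constants(b)
        if isinstance(a, int) and isinstance(b, int):
            return {"+": a + b, "*": a * b}[op]
        return (op, a, b)
    return expr

def strength_reduce(expr):
    # Another pass: rewrite multiplication by 2 as addition.
    if isinstance(expr, tuple):
        op, a, b = expr
        a, b = strength_reduce(a), strength_reduce(b)
        if op == "*" and b == 2:
            return ("+", a, a)
        return (op, a, b)
    return expr

def compile_expr(expr):
    # The whole compiler: run each small pass in order.
    for compiler_pass in (fold_constants, strength_reduce):
        expr = compiler_pass(expr)
    return expr
```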

After writing a compiler or two, then go ahead and plunk down the cash for the infamous Dragon Book or one of the alternatives. Maybe. Or you might not need them at all.

Functional Programming Went Mainstream Years Ago

In school and early in my programming career I must have written linked-list handling code fifty times. Those were the days of Pascal and vanilla C. I didn't have the code memorized either, because there were too many variations: singly-linked list, singly-linked list with dummy head and tail nodes, doubly-linked list, doubly-linked list with dummy head and tail nodes. Insertion and deletion routines for each of those. I worked out the pointer manipulation logic each time I rewrote them. Good thing, too, because the AP Computer Science exam was chock full of linked-list questions.

Early functional languages like Hope and Miranda seemed like magic in comparison. Not only were lists built into those languages, but there was no manual fiddling with pointers or memory at all. Even more so, the entire concept of memory as the most precious of resources, one to be lovingly arranged and conserved, was absent. That's not to say that memory was free and infinite, but it was something fluid and changing. A temporary data structure was created and used transiently, with no permanent cost.

All of this magic is nothing new in currently popular programming languages. Fifteen years ago you could say:

print join ',', @Items

in Perl, taking an arbitrarily long list of arbitrarily long strings, and building an entirely new string consisting of the elements of @Items separated by commas. Once print is finished with that string, it disappears. At a low level this is a serious amount of work, all in the name of temporary convenience. I never would have dared something so cavalier in Turbo Pascal. And yet it opens the door to what's essentially a functional style: creating new values rather than modifying existing ones. You can view a Perl (or Python or Ruby or Lua or Rebol) program as a series of small functional programs connected by a lightweight imperative program.

But there's more to functional programming than a disassociation from the details of memory layout. What about higher order functions and absolute purity and monads and elaborate type systems and type inference? Bits of those already exist in decidedly non-functional languages. Higher Order Perl is a great book. Strings are immutable in Python. Various forms of lambda functions are available in different languages, as are list comprehensions.

Still, the purists proclaim, it's not enough. Python is not a replacement for Haskell. But does it matter? 90% of the impressive magic from early functional languages has been rolled into mainstream languages. That last 10%, well, it's not clear that anyone really wants it or that the benefits are actually there. Purity has some advantages, but it's so convenient and useful to directly modify a dictionary in Python. Fold and map are beautiful, but they work just as well in the guise of a foreach loop.
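Python makes the equivalence concrete; the foreach version below is the same map-and-fold wearing imperative clothes:

```python
from functools import reduce

items = [3, 1, 4, 1, 5]

# Functional style: map the squares, then fold them with addition.
total_functional = reduce(lambda acc, x: acc + x,
                          map(lambda x: x * x, items), 0)

# The same computation in the guise of a foreach loop.
total_loop = 0
for x in items:
    total_loop += x * x
```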

The answer to "When will Haskell finally go mainstream?" is "most of it already has."

Kilobyte Constants, a Simple and Beautiful Idea that Hasn't Caught On

Eric Isaacson's A86 assembler (which I used regularly in the early 1990s) includes a great little feature that I've never seen in another language: the suffix "K" to indicate kilobytes in numeric literals. For example, you can say "16K" instead of "16384". How many times have you seen C code like this:

char Buffer[512 * 1024];

The "* 1024" is so common, and so clunky in comparison with:

char Buffer[512K];

In Forth this is trivial to add, at least outside of compiled definitions. All you need is:

: K 1024 * ;

And then you can write:

512 K allot

Understanding What It's Like to Program in Forth

I write Forth code every day. It is a joy to write a few simple words and solve a problem. As brain exercise it far surpasses cards, crosswords or Sudoku.

Chuck Moore, creator of Forth

I've used and enjoyed Forth quite a bit over the years, though I rarely find myself programming in it these days. Among other projects, I've written several standalone tools in Forth, used it for exploratory programming, written a Forth-like language for handling data assets for a commercial project, and written two standalone 6502 cross assemblers using the same principles as Forth assemblers.

It's easy to show how beautiful Forth can be. The classic example is:

: square dup * ;

There's also Leo Brodie's oft-cited washing machine program. But as pretty as these code snippets are, they're the easy, meaningless examples, much like the two-line quicksort in Haskell. They're trotted out to show the strengths of a language, then reiterated by new converts. The primary reason I wrote the Purely Functional Retrogames series is the disconnect between advocates saying everything is easy without destructive updates, and the utter lack of examples of how to approach many kinds of problems in a purely functional way. The same small set of pretty examples isn't enough to understand what it's like to program in a particular language or style.

Chuck Moore's Sudoku quote above is one of the most accurate characterizations of Forth that I've seen. Once you truly understand it, you'll better see what's fun about the language, and also why it isn't as commonly used. What I'd like to do is to start with a trivially simple problem, one that's completely straightforward, even simpler than the infamous FizzBuzz:

Write a Forth word to add together two integer vectors (a.k.a. arrays) of three elements each.

The C version, without bothering to invent custom data types, requires no thought:

void vadd(int *v1, int *v2, int *v3)
{
v3[0] = v1[0] + v2[0];
v3[1] = v1[1] + v2[1];
v3[2] = v1[2] + v2[2];
}

In Erlang it's:

vadd({A,B,C}, {D,E,F}) -> {A+D, B+E, C+F}.

In APL and J the solution is a single character:

+

first Forth attempt

So now, Forth. We start with a name and stack picture:

: vadd ( v1 v2 v3 -- )

Getting the first value out of v1 is easy enough:

rot dup @

"rot" brings v1 to the top, then we grab the first element of the array (remember that we need to keep v1 around, hence the dup). Hmmm...now we've got four items on the stack:

v2 v3 v1 a

"a" is what I'm calling the first element of v1, using the same letters as in the Erlang function. There's no way to get v2 to the top of the stack, save the deprecated word pick, so we're stuck.

second Forth attempt

Thinking about this a bit more, the problem is we have too many items being dealt with at once, too many items on the stack. v3 sitting there on top is getting in the way, so what if we moved it somewhere else for a while? The return stack is the standard location for a temporary value, so let's try it:

>r over @ over @ + r> !

Now that works. We get v3 out of the way, fetch v1 and v2 (keeping them around for later use), then bring back v3 and store the result. Well, almost, because now v3 is gone and we can't use it for the second and third elements.

third Forth attempt

This isn't as bad as it sounds. We can just keep v3 over on the return stack for the whole function. Here's an attempt at the full version of vadd:

: vadd ( v1 v2 v3 -- )
>r
over @ over @ + r@ !
over cell+ @ over cell+ @ + r@ cell+ !
over 2 cells + @ over 2 cells + @ + r> 2 cells + !
drop drop ;

cell+ is roughly the same as ++ in C. "2 cells +" is equivalent to "cell+ cell+". Notice how v3 stays on the return stack for most of the function, being fetched with r@. The "drop drop" at the end is to get rid of v1 and v2. Some nicer formatting helps show the symmetry of this word:

: vadd ( v1 v2 v3 -- )
>r
over           @  over           @  + r@           !
over cell+     @  over cell+     @  + r@ cell+     !
over 2 cells + @  over 2 cells + @  + r> 2 cells + !
drop drop ;

This can be made more obvious by defining some vector access words:

: 1st ;
: 2nd cell+ ;
: 3rd 2 cells + ;
: vadd ( v1 v2 v3 -- )
>r
over 1st @  over 1st @  + r@ 1st !
over 2nd @  over 2nd @  + r@ 2nd !
over 3rd @  over 3rd @  + r> 3rd !
drop drop ;

A little bit of extra verbosity removes one quirk in the pattern:

: vadd ( v1 v2 v3 -- )
>r
over 1st @  over 1st @  + r@ 1st !
over 2nd @  over 2nd @  + r@ 2nd !
over 3rd @  over 3rd @  + r@ 3rd !
rdrop drop drop ;

And that's it--three element vector addition in Forth. One solution at least; I can think of several completely different approaches, and I don't claim that this is the most concise of them. It has some interesting properties, not the least of which is that there aren't any named variables. On the other hand, all of this puzzling, all this revision...to solve a problem which takes no thought at all in most languages. And while the C version can be switched from integers to floating point values just by changing the parameter types, that change would require completely rewriting the Forth code, because there's a separate floating point stack.

Still, it was enjoyable to work this out. Better than Sudoku? Yes.

Macho Programming

Back before I completely lost interest in debates about programming topics, I remember reading an online discussion that went like this:

Raving Zealot: Garbage collection is FASTER than manual memory management!

Experienced Programmer: You mean that garbage collection is faster than using malloc and free to manage a heap. You can use pools and static allocation, and they'll be faster and more predictable than garbage collection.

Raving Zealot: You need to get over your attitude that programming is a MACHO and RECKLESS endeavor! If you use a garbage collected language, NOTHING can go wrong. You're PROTECTED from error, and not reliant on your MACHONESS.

What struck me about this argument, besides that people actually argue about such things, is how many other respected activities don't have anywhere near the same level of paranoia about protection from mistakes. On the guitar--or any musical instrument--you can play any note at any time, even if it's out of key or, more fundamentally, not played correctly (wrong finger placement or pressure or accidentally muting the string). And people play instruments live, in-concert in front of thousands of people this way, knowing that the solo is improvised in Dorian E, and there's no physical barrier preventing a finger from hitting notes that aren't in that mode. The same goes for sculpting, or painting, or carpentry...almost anything that requires skill.

(And building chickadee houses isn't universally considered a MACHO hobby, even though it involves the use of POWER TOOLS which can LOP OFF FINGERS.)

In these activities, mistakes are usually obvious and immediate: you played the wrong note, you cut a board to the wrong length, there's blood everywhere. In macho programming, a mistake can be silent, only coming to light when there's a crash in another part of the code--even days later--or when the database gets corrupted. Stupidly trivial code can cause this, like:

array[index] = true;

when index is -1. And yet with this incredible potential for error, people still build operating systems and giant applications and massively multiplayer games in C and C++. Clearly there's a lot of machoness out there, or it's simply that time and debugging and testing--and the acceptance that there will be bugs--can overcome what appear to be technical impossibilities. It's hand-rolling matrix multiplication code for a custom digital signal processor vs. "my professor told me that assembly language is impossible for humans to use."

Would I prefer to ditch all high-level improvements, in exchange for programming being the technical equivalent of rock climbing? NO! You can romanticize it all you want, but when I wrote 8-bit games I clearly remember thinking how much more pleasant it was to tinker in BASIC than to spend hours coding up some crazy 6502 code that would lock up the entire computer time after time (the bug would be that changing a loop index from 120 to 130 made it negative as a signed byte, so the loop would end after one iteration, or some other obscurity).

What both this retro example and the C one-liner have in common is that the core difficulty stems less from the language itself than from the fact that code is being turned loose directly on hardware, so crashes are really crashes, and the whole illusion that your source code is actually the program being executed disappears. Problems are debugged at the hardware level, with data breakpoints and trapped CPU exceptions and protected memory pages (this is how debuggers work).

It's a project suitable for a single-semester undergraduate class: write an interpreter for your favorite low-level language. Write it in Scheme or Erlang or Scala. Use symbolic addresses, not a big array of integers, to represent memory. Keep track of address offsets, instead of doing the actual math. Have functions return lists of memory addresses that have been read from or modified. Keep everything super simple and clean. The goal is to be able to enter expressions or functions and see how they behave, which is a whole lot nicer than tripping address exceptions.

All of a sudden, even hardcore machine code isn't nearly so scary. Write a dangerous function, get back a symbolic representation of what it did. Mistakes are now simply wrong notes, provided you keep your functions small. It's still not easy, but macho has become safe.
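As a rough illustration of what such an interpreter tracks, here is a tiny sketch in C (C rather than Scheme or Erlang, purely for brevity): memory is a table of symbolic names, and every read and write is logged instead of touching real addresses. All the names here (mem_read, mem_write, and so on) are invented for this sketch.

```c
#include <string.h>

/* Memory as symbolic names, not addresses. */
#define MAX_CELLS 32

typedef struct { const char *name; int value; } Cell;

static Cell mem[MAX_CELLS];
static int cells = 0;
static int reads = 0, writes = 0;   /* the "what did this code touch" log */

static Cell *find(const char *name) {
    for (int i = 0; i < cells; i++)
        if (strcmp(mem[i].name, name) == 0) return &mem[i];
    mem[cells].name = name;   /* new cells start out as zero */
    mem[cells].value = 0;
    return &mem[cells++];
}

/* Every access goes through these, so the interpreter can report
   exactly which symbolic cells a piece of code read or modified. */
int mem_read(const char *name)          { reads++;  return find(name)->value; }
void mem_write(const char *name, int v) { writes++; find(name)->value = v; }
int read_count(void)  { return reads; }
int write_count(void) { return writes; }
```

A wrong store now shows up as a logged symbolic name, not a trapped CPU exception.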

(If you liked this, you might enjoy Sending Modern Languages Back to 1980s Game Programmers.)

Timidity Does Not Convince

The only arguments that hold water, in terms of programming language suitability, are bold, finished projects. Not a mini-Emacs written in Haskell. Not a Sudoku solver in Prolog. Not a rewrite of some 1970s video game using Functional Reactive Programming. They need to be large and daring projects, where the finished product is impressive in its own right, and then when you discover it was written in language X, there's a wave of disbelief and then a new reverence for a toolset you had previously dismissed.

And now, two of my favorite bold projects:

Wings 3D

Wings started as an attempt to clone Nendo, a 3D modeller designed around simplicity and ease of use. Nendo development and support had dried up, and enthusiasm for Nendo fueled Wings 3D development. So now there's a full-featured, mature 3D modeller, with a great focus on usability, and it's written entirely in Erlang. Without a doubt it's the antithesis of what Erlang was designed for, with the entire program running as a single process and intensive use of floating point math. But it clearly works, and shows that there are benefits to using Erlang even outside of its niche of concurrency-oriented programming.

SunDog: Frozen Legacy

SunDog was an elaborate game for the Apple II. A 1MHz 6502 and 48K of memory look more like a platform for simple arcade games, not the space trading and exploration extravaganza that was SunDog. And though assembly language was the norm for circa-1984 commercial games, the authors--Bruce Webster and Wayne Holder--chose to implement the majority of the game in p-code interpreted Pascal. I found a justification in an old email from Bruce:

Wayne and I had some long discussions about what to use to write SunDog (which actually started out being another game). We considered assembly, FORTH, and Pascal; BASIC was far too slow and clumsy for what we wanted to do. We ended up ruling out FORTH for issues of maintenance (ours and lack of a commercial vendor).

I pushed for--and we decided on--Apple Pascal for a few different reasons, including the language itself; the compactness of the p-code; and the automatic (but configurable) memory management of the p-System, which could swap "units" (read: modules) in and out. Pascal made the large project easier, not harder, though it was a struggle to keep the game within 48KB.

And that's how it should be: choose the language that lets you implement your vision.

Accidentally Introducing Side Effects into Purely Functional Code

It's easy to taint even purely functional languages by reintroducing side-effects. Simply have each function take an additional parameter representing the global state of the world--a tree of key/value pairs, for example--and have each function return a new state of the world. This is not news. It's an intentionally pathological case, not something I'd ever consider implementing.
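A minimal sketch of that pathological pattern, in C for concreteness (the World type and the function names are invented): each function is technically pure, taking the world by value and returning a new one, yet every function gets full read and write access to the entire global state.

```c
/* A toy "state of the world" threaded through otherwise pure functions. */
typedef struct {
    int score;
    int lives;
} World;

/* No destructive updates: w is a copy, and the caller's World is
   untouched. But any function with this signature can change anything. */
World add_score(World w, int points) {
    w.score += points;
    return w;
}

World lose_life(World w) {
    w.lives -= 1;
    return w;
}

/* Threading the state: each call consumes one world, produces the next. */
World demo(void) {
    World w = { 0, 3 };
    w = add_score(w, 100);
    w = lose_life(w);
    return w;
}
```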

What's more surprising is how easy it is to accidentally introduce side-effects.

For the Purely Functional Retrogames series, I wrote code that operated on a list of game entities:

[A, B, C, D,...]

Each element was a self-contained unit of sorts: an ID, x/y position, current state. Using this list of entities to build a new version for the next game frame was a simple map operation. The ID and state for each entity were used to call the correct transformation function for that entity.

Each of these transformations had three possible outcomes: a new entity would be returned with a different position and/or state, an entity could delete itself, or an entity could create some new entities (think of dropping a bomb or firing a shot). All three of these can be handled by having each transformation function return a list.

For example, if the original list was:

[A, B, C, D]

and entity "B" deleted itself, and entity "C" created four new entities in addition to a new version of itself, then the returned values might look like this:

A => [A1]
B => []
C => [C1, New1, New2, New3, New4]
D => [D1]

and the new overall list of entities would be:

[A1, C1, New1, New2, New3, New4, D1]

Well, that's not quite right. It's actually a list of lists:

[[A1], [], [C1, New1, New2, New3, New4], [D1]]

and the individual lists need to be appended together to give the proper result. The append operation creates a brand new list, which means that the time and memory spent creating the individual result lists were wasted. They were just stepping stones to the real result. This almost certainly isn't going to be a significant inefficiency, but there's a pretty way around it: pass an accumulator list in and out of each transformation function. Now the three cases listed above neatly map to three operations:

1. To transform an entity into the next version of itself, simply prepend the new entity to the accumulator list.

2. To delete an entity, do nothing. Simply return the accumulator.

3. To create new entities, prepend each one to the accumulator.

No extra work is involved. We never build up temporary lists only to immediately discard them.
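The three cases map to list operations like these. The original series used Erlang lists; this is a C sketch with a hand-rolled singly linked list and invented names.

```c
#include <stdlib.h>

/* Accumulator as a singly linked list; prepending is O(1). */
typedef struct Node { int id; struct Node *next; } Node;

Node *prepend(int id, Node *acc) {
    Node *n = malloc(sizeof *n);
    n->id = id;
    n->next = acc;
    return n;
}

/* Case 1: entity survives, so prepend its next version. */
Node *keep(int id, Node *acc) { return prepend(id, acc); }

/* Case 2: entity deletes itself, so return the accumulator untouched. */
Node *drop_self(int id, Node *acc) { (void)id; return acc; }

/* Case 3: entity spawns a new entity in addition to itself. */
Node *spawn(int id, int extra, Node *acc) {
    return prepend(extra, prepend(id, acc));
}

int length(Node *acc) {
    int n = 0;
    for (; acc; acc = acc->next) n++;
    return n;
}

/* Walk three entities through keep/drop/spawn: the result list is
   built directly, with no per-entity intermediate lists to append. */
int demo_count(void) {
    Node *acc = NULL;
    acc = keep(1, acc);        /* A -> its next version   */
    acc = drop_self(2, acc);   /* B -> deleted            */
    acc = spawn(3, 30, acc);   /* C -> itself plus one new */
    return length(acc);
}
```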

But this pretty little solution has one unintended flaw. By passing in the accumulator list, we're giving full access to previous computations to each of the entity transformation functions. Even worse, each of these functions can arbitrarily transform this list, not only prepending values but also removing existing values or changing the order of them. (No destructive updates need occur, just the returning of a different list.) In theory we could write code that uses the list to make decisions: if the head of the accumulator is an entity of type "E," then spawn a new entity at the same position as E. Now the entire process is order dependent...ugh.

In theory. The "flaw" here assumes that each function will do more than either leave the accumulator untouched or prepend values to it; that the programmer of a function may intentionally go rogue and look to sabotage the greater good. It could still open the door to bugs: imagine a dozen people all writing these transformation functions independently. Someone will make a mistake at some point.

Either way, the same side effects possible in imperative languages were accidentally introduced into pure functions.

Revisiting "Purely Functional Retrogames"

I wrote the Purely Functional Retrogames series as an experiment. There's been so much empty talk about how functional languages are as good or better than imperative languages--yet very little to back that up. Doubly so in regard to interactive applications. I'd bet there are more people learning to program the Atari 2600 hardware than writing games in Haskell.

For people who only read the beginnings of articles, let me say this up front: Regurgitating the opinion that functional programming (or any technology) is superior does absolutely nothing. That's a road of endless conjecture and advocacy. One day you'll realize you've been advocating something for ten years without having any substantial experience in it. If you think a particular technology shows promise, then get in there and figure it out.

The rest of this entry is about what I learned by writing some retro-style games in Erlang.

The divide between imperative and functional languages, in terms of thinking about and writing code, is much smaller than I once thought it was. It is easy to accidentally introduce side-effects and sequencing problems into purely functional code. It is easy to write spaghetti code, where the entanglement comes from threading data through functions rather than unstructured flow of control. There's mental effort involved in avoiding these problems, just as there is when programming in any language.

Not being able to re-use names for values in functions sometimes led to awkward code. I could have said "not being able to destructively update local variables..." but that would show a lack of understanding that local "variables" are just names for things, not storage locations. For example, the initial position of an object is passed in as "Pos." I use it to create a new position, called "NewPos." Sometimes, because it was the most straightforward approach, I'd end up with another value called "NewPos2" (which I will agree is a horrible name). Then I'd find myself referring to "NewPos" when I meant "NewPos2." If the need arose to shuffle the logic around a bit, then it took care to manually rename these values without introducing errors. It would have been much nicer to create the new position and say "I'm repurposing the name 'Pos'" to refer to this new position.

The lack of a simple dictionary type was the most significant hindrance. This is where I found myself longing for the basic features of Perl, Python, Ruby--pretty much any common language. The ugliness of Erlang records has been well-documented, but syntax alone is not a reason to avoid them. The real problem is that records are brittle, exactly like structs in C. It's difficult to come up with a record that covers the needs of a dozen game entities with varying behaviors. They all have some common data, like an X/Y position, but some have special flags, some use velocities and acceleration values instead of simple motion, some need references to other entities, and so on. I could either make a big generic record that works for everything (and keep adding fields as new cases arise) or switch to another data structure, such as a property list, that doesn't allow pattern matching.

The biggest wins came from symbolic programming and consciously avoiding low-level efficiency. I spent a lot of time focused on efficient representations of entity states and other data. The more thought I put into this, the more the code, as a whole, became awkward and unreadable. Everything started to flow much more nicely when I went down the road of comical inefficiency. Why use a fixed-size record when I can use a tree? Why use bare data types when I can tag them so the code that processes them is easier to read? Why transform one value to another when I can instead return a description of how to transform the value?

Some years ago, John Carmack stated that a current PC was powerful enough to run every arcade game he played when he was growing up--some 300 or so games--at the same time. Yes, there's a large execution time penalty simply for choosing to write in Erlang instead of C, but it's nowhere near a large enough factor to matter for most game logic, so use that excess to write code that's clear and beautiful.

Puzzle Languages

I know I've covered this before. I am repeating myself. But it was woven into various other topics, never stated outright:

Some programming languages, especially among those which haven't gained great popularity, are puzzles.

That's not to be confused with "programming in general is a puzzle." There's always a certain amount of thought that goes into understanding a problem and deciding upon an approach to solving it. But if it takes focused thought to phrase that solution into working code, you go down one path then back up, then give up, then try something completely different--then you're almost certainly using a puzzle language.

These are puzzle languages:

Haskell
Erlang
Forth
J

And these are not:

Python
Ruby
Lua
C

In Forth, the puzzle is how to simplify a problem so that it can be mapped cleanly to the stack. In Haskell and Erlang, the puzzle is how to manage with single assignment and without being able to reach up and out of the current environment. In J the puzzle is how to phrase code so that it operates on large chunks of data at once.

Compare this to, say, Python. I can usually bang out a solution to just about anything in Python. I update locals and add globals and modify arrays and get working code. Then I go back and clean it up and usually end up with something simpler. In Erlang, as much as I want to deny it, I usually pick a direction, then realize I'm digging myself into a hole, so I scrap it and start over, and sometimes when I end up with a working solution it feels too fragile, something that wouldn't survive minor changes in the problem description. (Clearly this doesn't apply to easy algorithms or simple transformations of data.)

A critical element of puzzle languages is providing an escape, a way to admit that the pretty solution is elusive, and it's time to get working code regardless of aesthetics. It's interesting that these escapes tend to have a stigma; they induce a feeling of doing something wrong; they're guaranteed to result in pedantic lecturing if mentioned in a forum.

In Forth, an easy pressure valve when the stack gets too busy is to use local variables. Local variables have been historically deemed unclean by a large segment of the Forth community (although it's amazing how easy some Forth problems are if you use locals). There's a peculiar angst involved in avoiding locals, even if they clearly make code simpler. Locals aside, there's always the escape of using some additional global variables instead of stack juggling, which has a similarly bad reputation (even though everyone still does it).

In Erlang, ETS tables and the process dictionary are two obvious escapes. And as expected, any mention of the process dictionary always includes the standard parental warning about the dangers of playing darts or standing there with the refrigerator door open. It is handy, as shown by the standard library random number generator (which stores a three element tuple under the name random_seed), and Wings3D (which uses the process dictionary to keep track of GUI state).

A more interesting escape in Erlang is the process. A process is commonly thought of as a mechanism of concurrency, but that need not be the case. It's easy to make an infinite loop by having a tail recursive function. Parameters in such a loop can be--if you dig into the implementation a bit--directly modified, providing a safe and interesting blurring of functional and imperative code. Imagine taking such a function and spawning it into its own process. Each process captures a bit of relevant data in a small, endlessly recursive loop. Imagine dozens or hundreds of these processes, each spinning away, holding onto important state data. Erlang string theory, if you will.

I wouldn't want to break a program into hundreds of processes simply to capture state, but usually there are some important bits which are used and updated across a project. Pulling these out of the purely functional world can be enough of a relief from growing complexity that the rest of the code can remain pure.

But there's still that stigma of doing something dirty. Back before the Norton name became associated with anti-virus products, when MS-DOS was ubiquitous, Peter Norton authored the standard book on programming IBM PCs. In a discussion of the MS-DOS interrupts for displaying characters and moving the cursor, he strongly advised that programmers not access video memory directly, but use the provided services instead. (The theory being that the MS-DOS interrupts would remain compatible on future hardware.) Of course almost every application and game would not have been possible had developers taken Peter's advice to heart. Learning to write directly to video memory was practically a cottage industry until Windows 95 finally ended the MS-DOS era.

Sometimes advice is too idealistic to follow.

How My Brain Kept Me from Co-Founding YouTube

Flickr blew my mind when it appeared back in 2004.

I'd read all the articles about building web pages that load quickly: crunching down the HTML, hand-tweaking GIFs, clever reuse of images. I was immersed in the late 1990s culture of website optimization. Then here comes a site that is 100% based around viewing large quantities of memory-hungry photos. And the size of photos was put entirely in the users' hands: images could be over-sharpened (which makes the JPEGs significantly larger) or uploaded with minimal compression settings. Users could click on the "show all sizes" button and view the full glory of a 5MB photo. Just viewing a single 200K mid-sized version would outweigh any attempts to mash down the surrounding HTML many times over.

While still trying to figure out how the bandwidth bar had suddenly jumped to an unfathomable height, along comes this site that does the same thing as Flickr...but with VIDEOS. Now you've got people idly clicking around for an hour, streaming movies the entire time, or people watching full thirty-minute episodes of sitcoms online. Not only was there no paranoia about bandwidth, but the entire premise of the site was to let people request vast and continual amounts of data. Such an audacious idea was so far away from the technical comfort zone I had constructed that I never would have contemplated its potential existence.

I've learned my lesson. And yet I see people continually make the same mistake in far more conservative ways:

"On a 64-bit machine, each element in an Erlang list is sixteen bytes, which is completely unacceptable."

"Smalltalk has a 64MB image file, which is ridiculous. I'm not going to ship that to customers."

"I would never use an IDE that required a 2GB download!"

I see these as written declarations of someone's arbitrary limitations and technical worries. Such statements almost always have no bearing on reality and what will be successful or not.

On Being Sufficiently Smart

I'm proud to have created the wiki page for the phrase sufficiently smart compiler back in 2003 or 2004. Not because it's a particularly good page, mind you; it has been endlessly rewritten in standard wiki fashion. It's one of the few cases where I recognized a meme and documented it. I'd been seeing the term over and over in various discussions, and it started to dawn on me that it was more than just a term, but a myth, a fictional entity used to support arguments.

If you're not familiar, here's a classic context for using "sufficiently smart compiler." Language X is much slower than C, but that's because floating point values are boxed and there's a garbage collection system. But...and here it comes...given a sufficiently smart compiler those values could be kept in registers and memory allocation patterns could be analyzed and reduced to static allocation. Of course that's quite a loaded phrase, right up there with "left as an exercise for the reader."

One of the key problems with having a sufficiently smart compiler is that not only does it have to be sufficiently smart, it also has to be perfect.

Back in the mid-1990s my wife and I started an indie game development company, and we needed some labels printed. At the time, all we had was a middling inkjet printer, so the camera-ready image we gave to the print shop was, to put it mildly, not of the necessary quality. Then we got the printed labels back and they looked fantastic. All the pixellated edges were smooth, and we theorized that it was because of how the ink flowed during the printing process, but it didn't really matter. We had our labels and we were happy.

A few months later we needed to print some different labels, so we went through the same process, and the results were terrible. Every little flaw, every rough edge, every misplaced pixel, all perfectly reproduced on a roll of 1000 product labels. Apparently what had happened with the previous batch was that someone at the print shop took pity upon our low-grade image and did a quick graphics art job, re-laying out the text using the same font, then printing from that. The second time this didn't happen; the inkjet original was used directly. The problem wasn't that someone silently helped out, but that there was no indication of what was going on, and that the help wouldn't be there every time.

Let's say that a compiler can detect O(N^2) algorithms and replace them with O(N) equivalents. This is a classic example of being sufficiently smart. You can write code knowing that the compiler will transform and fix it for you. But what if the compiler isn't perfect (and it clearly won't be, as there aren't O(N) versions of all algorithms)? It will fix some parts of your code and leave others as-is. Now you run your program, and it's slow, but why? You need insight into what's going on behind the scenes to figure that out, and if you find the problem then you'll have to manually recode that section to use a linear approach. Wouldn't it be more transparent to simply use linear algorithms where possible in the first place, rather than having to second guess the system?
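To make the hypothetical concrete, here are both versions of a small duplicate-detection problem in C (function names invented). A sufficiently smart compiler would have to recognize the first function and silently rewrite it into something like the second, and do so reliably for arbitrary code.

```c
#include <stdbool.h>
#include <stddef.h>

/* The quadratic version: for each byte, rescan everything before it. */
bool has_duplicate_quadratic(const unsigned char *a, size_t n) {
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < i; j++)
            if (a[i] == a[j]) return true;
    return false;
}

/* The linear version the compiler would have to derive: one pass,
   with a table of byte values already seen. */
bool has_duplicate_linear(const unsigned char *a, size_t n) {
    bool seen[256] = { false };
    for (size_t i = 0; i < n; i++) {
        if (seen[a[i]]) return true;
        seen[a[i]] = true;
    }
    return false;
}
```

When the rewrite silently fails to fire, the programmer is left staring at the first version wondering why the program is slow.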

There's another option, and that's to have the compiler give concrete information about behind the scenes transformations. I have a good mental picture of how Erlang works, in terms of the compiler and run-time. It's usually straightforward to understand what kind of BEAM code will be generated from particular source. That was true until fancy optimizations on binary operations were introduced in 2008. The documentation uses low-level concepts like "match context" and discusses when segmented binaries are copied and so on. It's all abstract and difficult to grasp, and that's why there's a new compiler switch, "bin_opt_info," to provide a window into what kind of code is being generated. Going back to my early programming days, the manual for Turbo Pascal 4 listed exactly what optimizations were performed by the compiler.

The Glasgow Haskell Compiler (GHC) is the closest I've seen to a sufficiently smart compiler, with the advantages and drawbacks that come with such a designation.

I can write code that looks like it generates all kinds of intermediate lists--and indeed such would be the case with similar code in Erlang--and yet the compiler is sufficiently smart to usually remove all of that. Even in the cases where that isn't possible, it's not a make or break issue. In the worst case the Haskell code works like the Erlang version.

But then there's laziness. Laziness is such an intriguing idea: an operation can "complete" immediately, because the actual result isn't computed until there's specific demand for it, which might be very soon or it might be in some other computation that happens much later. Now suppose you've got two very memory intensive algorithms in your code, and each independently pushes the limits of available RAM. The question is, can you guarantee that the first algorithm won't be lazily delayed until it is forced to run right in the middle of the second algorithm, completely blowing the memory limit?

The GHC developers know that laziness can be expensive (or at least unnecessary in many cases), so strictness analysis is done to try to convert lazy code to non-lazy code. If and when that's successful, wonderful! Maybe some programs that would have previously blown up now won't. But this only works in some cases, so as a Haskell coder you've got to worry about the cases where it doesn't happen. As much as I admire the Haskell language and the GHC implementation, I find it difficult to form a solid mental model of how Haskell code is executed, partially because that model can change drastically depending on what the compiler does. And that's the price of being sufficiently smart.

(Also see Digging Deeper into Sufficiently Smartness.)

Let's Take a Trivial Problem and Make it Hard

Here's a simple problem: Given a block of binary data, count the frequency of the bytes within it. In C, this could be a homework assignment for an introductory class. Just zero out an array of 256 elements, then for each byte increment the appropriate array index. Easy.
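For reference, the homework-grade C version described above might look like this (count_of is an added helper for illustration):

```c
#include <stddef.h>

/* Zero a 256-entry table, then bump one counter per input byte. */
void freq(const unsigned char *data, size_t len, unsigned counts[256]) {
    for (int i = 0; i < 256; i++) counts[i] = 0;
    for (size_t i = 0; i < len; i++) counts[data[i]]++;
}

/* Small helper so a single byte's count can be queried directly. */
unsigned count_of(const unsigned char *data, size_t len, int byte) {
    unsigned counts[256];
    freq(data, len, counts);
    return counts[byte];
}
```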

Now write this in a purely functional way, with an efficiency close to that of the C implementation.

It's easy to do a straightforward translation to Erlang, using tail recursion instead of a for loop, like this:

freq(B) when is_binary(B) ->
    freq(B, erlang:make_tuple(256, 0)).

freq(<<X, Rest/binary>>, Totals) ->
    I = X + 1,
    N = element(I, Totals),
    freq(Rest, setelement(I, Totals, N + 1));
freq(<<>>, Totals) ->
    Totals.

But of course, in the name of purity and simplicity, setelement copies the entire Totals tuple, so if there are fifty million bytes, then the 256-element Totals is copied fifty million times. It's simple, but it's not the right approach.

"Blame the compiler" is another easy option. If it could be determined that the Totals tuple can be destructively updated, then we're good. Note that the garbage collector in the Erlang runtime is based on the assumption that pointers in the heap always point toward older data, an assumption that could break if a tuple was destructively updated with, say, a list value. So not only would the compiler have to deduce that the tuple is only used locally, but it would also have to verify that only non-pointer values (like integers and atoms) were being passed in as the third parameter of setelement. This is all possible, but it doesn't currently work that way, so this line of reasoning is a dead end for now.

Totals could be switched from a tuple to a tree, which might or might not be better than the setelement code, but there's no way it's in the same ballpark as the C version.

What about a different algorithm? Sort the block of bytes, then count runs of identical values. Again, just the suggestion of sorting means we're already off track.

Honestly, I don't know the right answer. In Erlang, I'd go for one of the imperative efficiency hacks, like ets tables, but let's back up a bit. The key issue here is that there are some fundamental assumptions about what "purely functional" means and the expected features in functional languages.

In array languages, like J, this type of problem is less awkward, as it's closer to what they were designed for. If nothing else, reference counted arrays make it easier to tell when destructive updates are safe. And there's usually some kind of classification operator, one that would group the bytes by value for easy counting. That's still not going to be as efficient as C, but it's clearly higher-level than the literal Erlang translation.

A more basic question is this: "Is destructively updating a local array a violation of purely functionalness?" OCaml allows destructive array updates and C-like control structures. If a local array is updated inside an OCaml function, and the result is copied to a non-mutable array at the end, is there really anything wrong with that? It's not the same as randomly sticking your finger inside a global array somewhere, causing a week's worth of debugging. In fact, it looks exactly the same as the purely functional version from the caller's point of view.

Perhaps the sweeping negativity about destructive updates is misplaced.

Digging Deeper into Sufficiently Smartness

(If you haven't read On Being Sufficiently Smart, go ahead and do so, otherwise this short note won't have any context.)

I frequently write Erlang code that builds a list which ends up backward, so I call lists:reverse at the very end to flip it around. This is a common idiom in functional languages.

lists:reverse is a built-in function in Erlang, meaning it's implemented in C, but for the sake of argument let's say that it's written in Erlang instead. This is super easy, so why not?

reverse(L) -> reverse(L, []).
reverse([H|T], Acc) ->
    reverse(T, [H|Acc]);
reverse([], Acc) ->
    Acc.

Now suppose there's another function that uses reverse at the very end, just before returning:

collect_digits(L) -> collect_digits(L, []).
collect_digits([H|T], Acc) when H >= $0, H =< $9 ->
    collect_digits(T, [H|Acc]);
collect_digits(_, Acc) ->
    reverse(Acc).

This function returns a list of ASCII digits that prefix a list, so collect_digits("1234.0") returns "1234". And now one more "suppose": suppose that one time we decide that we really need to process the result of collect_digits backward, so we do this:

reverse(collect_digits(List))

The question is, can the compiler detect that there's a double reverse? In theory, the last reverse could be dropped from collect_digits in the generated code, and each call to collect_digits could be automatically wrapped in a call to reverse. If there ends up being two calls to reverse, then get rid of both of them, because it's just wasted effort to double-reverse a list.

With lists:reverse as a built-in, this is easy enough. But can it be deduced simply from the raw source code that reverse(reverse(List)) can be replaced with List? Is that effort easier than simply special-casing the list reversal function?
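The identity itself is easy to state outside of any compiler. Here's a sketch in C--a destructive, accumulator-style reverse over a hand-rolled linked list, all names invented--demonstrating the property a sufficiently smart compiler would have to prove before it could delete a double reverse:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node { int value; struct node *next; };

/* Reverse a singly linked list in the same accumulator style as the
   Erlang version: walk the input, pushing each head onto an accumulator. */
struct node *reverse(struct node *list)
{
    struct node *acc = NULL;
    while (list) {
        struct node *rest = list->next;
        list->next = acc;    /* push the head onto the accumulator */
        acc = list;
        list = rest;
    }
    return acc;
}

/* Prepend a value to a list. */
struct node *push(struct node *list, int value)
{
    struct node *n = malloc(sizeof *n);
    n->value = value;
    n->next = list;
    return n;
}
```

Reversing twice restores the original order, so reverse(reverse(list)) is list modulo the wasted work--which is exactly the wasted work the optimization would remove.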

How to Crash Erlang

Now that's a loaded title, and I know some people will immediately see it as a personal slam on Erlang or ammunition for berating the language in various forums. I mean neither of these. Crashing a particular language, even so-called safe interpreted implementations, is not particularly challenging. Running out of memory or stack space are two easy options that work for most languages. There are pathological cases for regular expressions that may not truly crash, but result in such an extended period of unresponsiveness on large data sets that the difference is moot. In any language that allows directly linking to arbitrary operating system functions...well, that's just too easy.

Erlang, offering more complex features than many languages, has some particularly interesting edge cases.

Run out of atoms. Atoms in Erlang are analogous to symbols in Lisp--that is, symbolic, non-string identifiers that make code more readable, like green or unknown_value--with one exception. Atoms in Erlang are not garbage collected. Once an atom has been created, it lives as long as the Erlang node is running. An easy way to crash the Erlang virtual machine is to loop from 1 to some large number, calling integer_to_list and then list_to_atom on the current loop index. The atom table will fill up with unused entries, eventually bringing the runtime system to a halt.

Why is this allowed? Because garbage collecting atoms would involve a pass over all data in all processes, something the garbage collector was specifically designed to avoid. And in practice, running out of atoms will only happen if you write code that's generating new atoms on the fly.

Run out of processes. Or similarly, "run out of memory because you've spawned so many processes." While the sequential core of Erlang leans toward being purely functional, the concurrent side is decidedly imperative. If you spawn a non-terminating, unlinked process, and manage to lose the process id for it, then it will just sit there, waiting forever. You've got a process leak.

Flood the mailbox for a process. This is something that most new Erlang programmers do sooner or later. One process sends messages to another process without waiting for a reply, and a missing or incorrect pattern in the receive statement causes the receiver to ignore all messages...so they keep piling up until the mailbox fills all available memory, and that's that. Another reminder that concurrency in Erlang is imperative.

Create too many large binaries in a single process. Large--greater than 64 bytes--binaries are allocated outside of the per-process heap and are reference counted. The catch is that the reference count indicates how many processes have access to the binary, not how many different pointers there are to it within a process. That makes the runtime system simpler, but it's not bulletproof. When garbage collection occurs for a process, unreferenced binaries are deleted, but that's only when garbage collection occurs. It's possible for a process with a slowly growing heap to create so much binary garbage that the system runs out of memory before garbage collection occurs. Unlikely, yes, but possible.

Want People to Use Your Language Under Windows? Do This.

Whenever I hear about a new programming language or new implementation of an existing language, I usually find myself trying it out. There's a steep cost--in terms of time and effort--in deciding to use a new language for more than just tinkering, so I'm not going to suffer through blatant problems, and I admit to being sensitive to interface issues. Nothing gets me disinterested faster than downloading and installing a new language system, double-clicking the main icon...

...and being presented with an ugly little 80x24 character Microsoft command window using some awkward 1970s font.

(Now before the general outcry begins, be aware that I'm a regular command line user. I've used various versions of JPSoft's enhanced command processors for Windows for close to 20 years now, and I usually have multiple terminal windows open when using my MacBook.)

The poor experience of this standard command window is hard to overstate. It always starts at a grid of 80x24 characters, even on the highest resolution of displays. Sometimes even the basic help message of an interpreter is wider than eighty characters, causing the text to wrap in the middle of a word. The default font is almost always in a tiny point size. Cut and paste don't work by default, and even when enabled they don't follow standard Windows shortcuts. And, rather oddly, only a small subset of fonts actually work in this window.

It's possible to do some customization of the command window--change the font, change the font size, change the number of rows and columns of text--and these will take it from completely unacceptable to something that might pass for 1980s nostalgia. But that's a step I have to take manually. That initial double-click on the icon still brings up everything in the raw.

Basic aesthetics aside, the rudimentary features of a monochrome text window limit the opportunities for usability improvements. I'm always surprised at how many Windows ports of languages don't even let me access previously entered commands (e.g., using up-arrow or alt-P). Or how about using colors or fonts to differentiate between input and output, so I can more easily scan through the session history?

If you want me to use your language--and if you care about supporting Windows at all--then provide a way of interacting with the language using a real Windows application. Don't fall back on cmd.exe.

Some languages are brilliant in this regard. Python has the nice little IDLE window. Factor and PLT Scheme have gone all-out with aesthetically-pleasing and usable environments. Erlang and REBOL aren't up to the level of any of these (Erlang doesn't even remember the window size between runs), but they still provide custom Windows applications for user interaction.

A Personal History of Compilation Speed, Part 1

The first compiled language I used was the Assembler Editor cartridge for the Atari 8-bit computers. Really, it had the awful name "Assembler Editor." I expect some pedantic folks want to interject that an assembler is not a compiler. At one time I would have made that argument myself. But there was a very clear divide between editing 6502 code and running it, a divide that took time to cross, when the textual source was converted into machine-runnable form. Contrast that to Atari BASIC, the only language I knew previously, which didn't feature a human-initiated conversion step and the inevitable time it took.

Conceptually, the Assembler Editor was a clever design. Source code was entered line by line, even using line numbers, just like BASIC. The assembler could compile the source straight from memory and create object code in memory, with no disk access to speak of. The debugger was right there, too, resident in memory, setting the stage for what looked like an efficient and tightly integrated development system.

Except for whatever reason, the assembler was impressively slow, and it got disproportionately slower as program size increased. A linear look-up in the symbol table? Some kind of N-squared algorithm buried in there? Who knows, but I remember waiting over seven minutes for a few hundred lines of code to assemble. Atari knew this was a problem, because there was a note in the manual about it only being suitable for small projects. They offered the friendly advice of purchasing a more expensive product, the Atari Macro Assembler (which was a standalone assembler, not an integrated environment).

Instead I upgraded to MAC/65, a third party alternative that followed the formula set by the Assembler Editor: cartridge-based for fast booting, BASIC-like editor and assembler and debugger all loaded into memory at once. MAC/65 was popular among assembly coders primarily because of its reputation for quick assembly times. And quick it was.

Almost certainly the slowness of the Assembler Editor was because of a bad design decision, one not present in MAC/65. But MAC/65 went one step further: source code was parsed and tokenized after each line was entered. For example, take this simple statement:

LDA #19   ; draw all bonus items

It takes a good amount of work, especially on a sub-2MHz processor, to pick that apart. "LDA" needs to be scanned and looked up somewhere. "19" needs to be converted to binary. The MAC/65 approach was to do much of this at edit-time, storing the tokenized representation in memory instead of the raw text.

In the above example, the tokenized version could be reduced to a byte indicating "load accumulator immediate," plus the binary value 19 (stored as a byte, not as two ASCII characters), and then a token indicating the rest of the line was a comment and could be ignored at assembly time. When the user viewed the source code, it had to be converted from the tokenized form back into text. This had the side-effect of enforcing a single standard for indentation style, whether or not there was a space after the comment semicolon, and so on.
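A rough sketch of that edit-time tokenization in C--the token values and encoding here are invented for illustration, not MAC/65's actual format:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical one-byte tokens; MAC/65's real encoding differed. */
enum { TOK_LDA_IMM = 0xA9, TOK_COMMENT = 0xFF };

/* Tokenize a line like "LDA #19 ; comment" into a compact form:
   an opcode token, the operand as a binary byte, then a comment
   marker.  Returns the number of token bytes written, or -1 if
   the line isn't recognized by this tiny sketch. */
int tokenize_line(const char *line, unsigned char *out)
{
    int n = 0;
    if (strncmp(line, "LDA #", 5) != 0)
        return -1;
    out[n++] = TOK_LDA_IMM;               /* "LDA #" -> one token byte */
    line += 5;
    int value = 0;
    while (isdigit((unsigned char)*line))  /* "19" -> binary, not two ASCII chars */
        value = value * 10 + (*line++ - '0');
    out[n++] = (unsigned char)value;
    while (*line == ' ')
        line++;
    if (*line == ';')
        out[n++] = TOK_COMMENT;            /* assembly pass can skip the rest */
    return n;
}
```

At assembly time there's nothing left to scan or convert; the line is already three bytes of pre-digested tokens.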

When my Atari 8-bit days ended, and I moved to newer systems, I noticed two definite paths in assembler design. There were the traditional, lumbering assemblers that ran as standalone applications, which almost always required a final linking step. These were usually slow and awkward, seemingly designed as back-ends to high-level language compilers, not meant to be used directly by programmers. And then there were the lightning-fast assemblers, often integrated with editors and debuggers in the tradition of the Assembler Editor and MAC/65. For dedicated assembly programmers during the Amiga and Atari ST years, those were clearly the way to go.

By that time, except when there was no alternative, I was using compilers for higher-level languages. And I was wondering if the "slow, lumbering" and "lightning fast" split applied to those development systems as well.

Part 2

The Pure Tech Side is the Dark Side

When I was writing 8-bit games, I was thrilled to receive each issue of the home computer magazines I subscribed to (especially this one). I spent my time designing games in my head and learning how to make the hardware turn them into reality. Then each month came these magazines filled with tutorials and ideas and, most importantly, full source code for working games. Sure, most of the games were simple, but I pored over the code line by line--especially the assembly language listings--and that was much of my early programming education. Just seeing games designed by other people was inspiring in a way that's difficult to get across.

Years later, with those 8-bit days behind me, I would regularly pick up Dr. Dobb's Journal at the local B. Dalton bookstore (now part of Barnes and Noble). Reading it was mildly interesting, but I didn't get much from it. Eventually I realized it was because I wasn't immersed in the subject matter. My PC programming projects were spotty at best, so I read the articles but there wasn't any kind of active learning going on. And there was an overall dryness to it. It wasn't about creativity and wonder, it was about programming.

Those two realizations do a good job of summarizing my opinions about most online discussions and forums.

The ideal forum is when a bunch of people who are individually working away on their own personal projects--whether songwriting or photography or any other endeavor--get together to share knowledge. Each participant has a vested interest, because he or she needs to deliver results first, and is discussing it with others only second. It's easy to tell when people in online discussions aren't result-oriented. There's discussion about minute differences between brands and there's an obsession with having the latest and greatest model. Feels like a lot of talking and expounding of personal theories, but not much doing.

And then there's the creative angle. Raw discussions about programming languages or camera models or upcoming CPUs...they don't do anything for me. There's a difference between making a goal of having the newest, most powerful MacBook Pro, and someone who has pushed their existing notebook computer to the limits while mixing 48 tracks of stereo audio and could really use some of the improvements in the latest hardware.

The pure tech side is the dark side, at least for me.

A Personal History of Compilation Speed, Part 2

(Read Part 1 if you missed it.)

My experience with IBM Pascal, on an original model dual-floppy IBM PC, went like this:

I wrote a small "Hello World!" type of program, saved it, and fired up the compiler. It churned away for a bit, writing out some intermediate files, then paused and asked for the disc containing Pass 2. More huffing and puffing, and I swapped back the previous disc and ran the linker. Quite often the compiler halted with "Out of Memory!" at some point during this endeavor.

Now this would have been a smoother process with more memory and a hard drive, but I came to recognize that a compiler was a Very Important Program, and the authors clearly knew it. Did it matter if it took minutes to convert a simple program to a machine language executable? Just that it could be done at all was impressive indeed.

I didn't know it at the time, but there was a standard structure for compilers that had built-up over the years, one that wasn't designed with compilation speed as a priority. Often each pass was a separate program, so they didn't all have to be loaded into memory at the same time. And those seemingly artificial divisions discussed in compiler textbooks really were separate passes: lexical analysis, parsing, manipulation of an abstract intermediate language, conversion to a lower-level intermediate language, peephole optimization, generation of assembly code. Even that last step could be literal, writing out assembly language source code to be converted to machine language by a separate tool. And linking, there's always linking.

This was all before I discovered Turbo Pascal.

On one of those cheap, floppy-only, 8088 PC clones from the late 1980s, the compilation speed of Turbo Pascal was already below the "it hardly matters" threshold. Incremental builds were in the second or two range. Full rebuilds were about as fast as saying the name of each file in the project aloud. And zero link time. Again, this was on an 8MHz 8088. By the mid-1990s, Borland was citing build times of hundreds of thousands of lines of source per minute.

The last time I remember seeing this in an ad, after Turbo Pascal had become part of Delphi, the number was homing in on a million lines per minute. Projects were compiled before your finger was off of the build key. It was often impossible to tell the difference between a full rebuild of the entire project and compiling a single file. Compilation speed was effectively zero.

Borland's other languages with "Turbo" in the name--like Turbo C--weren't even remotely close to the compilation speeds of Turbo Pascal. Even Turbo Assembler was slower, thanks in part to the usual step of having to run a linker. So what made Turbo Pascal so fast?

Real modules. A large percentage of time in C compilers is spent reading and parsing header files. Even a short school assignment may pull in tens of thousands of lines of headers. That's why most C compilers support precompiled headers, though they're often touchy and take effort to set up. Turbo Pascal put all the information about exported functions and variables and constants into a compiled module, so it could be quickly loaded, with no character-by-character parsing needed.

Integrated build system. The standard makefile system goes like this: first the "make" executable loads, then it reads and parses a file of rules, then for each source file that is out of date, the compiler is started up. That's not a trivial effort, firing up a huge multi-megabyte executable just to compile one file. The Turbo Pascal system was much simpler: look at the list of module dependencies for the current module; if they're all up to date, compile and exit; if not, then recursively apply this process to each dependent module. An entire project could be built from scratch without running any external programs.
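That recursive scheme fits in a dozen lines. Here's a sketch in C with an invented module structure, where bumping a counter stands in for actually running the compiler:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 4

struct module {
    const char *name;
    long source_time;               /* when the source was last edited */
    long compiled_time;             /* when the output was built; 0 = never */
    struct module *deps[MAX_DEPS];  /* modules this one uses, NULL-terminated */
    int compiles;                   /* how many times we "compiled", for inspection */
};

/* Bring a module up to date, Turbo Pascal style: first recursively
   ensure every dependency is current, then recompile this module if
   its own source, or any dependency's output, is newer than its output.
   Returns the module's (possibly new) output timestamp. */
long build(struct module *m)
{
    long newest = m->source_time;
    for (int i = 0; i < MAX_DEPS && m->deps[i]; i++) {
        long t = build(m->deps[i]);
        if (t > newest)
            newest = t;
    }
    if (m->compiled_time < newest) {
        m->compiled_time = newest;   /* stand-in for invoking the compiler */
        m->compiles++;
    }
    return m->compiled_time;
}
```

One call to build() on the top-level module rebuilds exactly what's stale and nothing else, with no external programs and no rule file.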

Minimal linker. Have you ever looked at the specs for an object file format? "Complicated" and "bulky" are two terms that come to mind. Turbo Pascal used a custom object file with a minimal design. The "linker" wasn't doing anywhere near the work of standard linkers. The result was that the link step was invisible; you didn't even notice it.

Single pass compiler with combined parsing and code generation. No separate lexer, no separate parser, no abstract syntax tree. All of these were integrated into a single step, made possible by the straightforward syntax of Pascal (and by not having a preprocessor with macros). If you're curious, you can read more about the technique.

Yes, there was a drawback to instantaneous compile times. Fewer optimizations were done, and almost always the resultant code was slower than the C equivalent. But it didn't matter. Removing the gap between the steps of writing and running code was worth more than some amount of additional runtime performance. I used to hit the build key every so often, even while typing, just to check for syntax errors. And zero compilation speed eventually became standard, with the rise of interpreted languages like Perl, Ruby, and Python.

The World's Most Mind-Bending Language Has the Best Development Environment

I highly recommend that all programmers learn J. I doubt most will end up using it for daily work, but the process of learning it will stick with you. J is so completely different from everything else out there, and all your knowledge of C++ and Python and Scheme goes right out the window, leaving you an abject, confused beginner. In short, J will make you cry.

But that's not what I want to talk about. Though it's a bizarre and fringe language (yet not one of those programmer attempts at high humor), J is the most beautiful and useful development system I've come across. I'm not talking about the language itself, but the standard environment and add-ons that come with the language.

The IDE is more akin to Python's IDLE than monstrosities which may come to mind. There's a window for entering commands and seeing the results, and you can open up separate, syntax-colored editor windows, running the contents of each with a keypress. It's nothing groundbreaking, but it's something that most languages don't provide. And in the spirit of IDLE, J's IDE is written in J.

(I'll interject that J is cross-platform for Windows, OS X, and Linux, including 64-bit support, just in case anyone is preparing to deride it as Windows-only.)

Then there are the standard libraries: 3D graphics via OpenGL; full GUI support including an interface builder; memory-mapped files; performance profiling tools; a full interface to arbitrary DLLs; regular expressions; sockets. Again, nothing tremendously unusual, except maybe memory-mapped files and the DLL hooks, but having it all right there and well-documented is a big win. Beginner questions like "What windowing library should I use?" just don't get asked.

The first really interesting improvements over most languages are the visualization tools. It's one line of code to graph arbitrary data. Think about that: no need to use a graphing calculator, no need to export to some separate tool, and most importantly the presence of such easy graphing ability means that you will use it. Once you get started running all kinds of data through visualization tools, you'll find you use them to spot-check for errors or to get a better understanding of what kinds of input you're dealing with. It goes further than just 2D graphs. For example, there's a nifty tool that color codes elements of a table, where identical elements have the same colors. It makes patterns obvious. (Color code a matrix, and you can easily tell if all the elements on a diagonal are the same.)

What makes me happiest is the built-in tutorial system, called "Labs" in J lingo. It's a mix of explanatory text, expressions which are automatically evaluated so you can see the results, and pauses after each small bit of exposition so you can experiment in the live J environment. Labs can be broken into chapters (so you can work through them in parts), and the tool for creating your own labs is part of the standard J download.

While many of the supplied labs are along the lines of "How to use sockets," the best ones aren't about J at all. They're about geometry or statistics or image processing, and you end up learning J while exploring those topics. J co-creator Ken Iverson's labs are the most striking, because they forgo the usual pedantic nature of language tutorials and come across as downright casual. Every "Learn Haskell" tutorial I've read wallows in type systems and currying and all the trappings of the language itself. And after a while it all gets to be too much, and I lose interest. Iverson just goes along talking about some interesting number theory, tosses out some short executable expressions to illustrate his points, and drops in a key bit of J terminology almost as an afterthought.

If you're wondering why I love the J environment so much but don't use it as my primary programming language, that's because, to me, J isn't suited for most projects I'm interested in. But for exploration and learning there's no finer system.

(If you want to see some real J code, try Functional Programming Archaeology.)

Micro-Build Systems and the Death of a Prominent DSL

Normally I don't think about how to rebuild an Erlang project. I just compile a file after editing it--via the c(Filename) shell command--and that's that. With hot code loading there's no need for a linking step. Occasionally, such as after upgrading to a new Erlang version, I do this:

erlc *.erl

which compiles all the .erl files in the current directory.

But wait a minute. What about checking if the corresponding .beam file has a more recent date than the source and skipping the compilation step for that file? Surely that's going to be a performance win? Here's the result of fully compiling a mid-sized project consisting of fifteen files:

$ time erlc *.erl

real    0m1.912s
user    0m0.945s
sys     0m0.108s

That's less than two seconds to rebuild everything. (Immediately rebuilding again takes less than one second, showing that disk I/O is a major factor.)

Performance is clearly not an issue. Not yet anyway. Mid-sized projects have a way of growing into large-sized projects, and those 15 files could one day be 50. Hmmm...linearly interpolating based on the current project size still gives a time of under six and a half seconds, so no need to panic. But projects get more complex in other ways: custom tools written in different languages, dynamically loaded drivers, data files that need to be preprocessed, Erlang modules generated from data, source code in multiple directories.

A good start is to move the basic compilation step into pure Erlang:

erlang_files() -> [
    "util.erl",
    "http.erl",
    "sandwich.erl",
    "optimizer.erl"
].

build() ->
    c:lc(erlang_files()).

where c:lc() is the Erlang shell function for compiling a list of files.

If you stop and think, this first step is actually a huge step. We've now got a symbolic representation of the project in a form that can be manipulated by Erlang code. erlang_files() could be replaced by searching through the current directory for all files with an .erl extension. We could even do things like skip all files with _old preceding the extension, such as util_old.erl. And all of this is trivially, almost mindlessly, easy.

There's a handful of things that traditional build systems do. They call shell commands. They manipulate filenames. They compare dates. The fancy ones go through source files and look for included files. These things are a small subset of what you can do in Perl, Ruby, Python, or Erlang. So why not do them in Perl, Ruby, Python, or Erlang?
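And those primitives really are tiny in any general-purpose language. The date comparison at the heart of every build tool, for instance, is a couple of stat calls--here's a sketch in C, assuming a POSIX system:

```c
#include <assert.h>
#include <sys/stat.h>

/* Return 1 if 'target' is missing or older than 'source'--that is,
   if it needs to be rebuilt.  POSIX stat() assumed. */
int out_of_date(const char *source, const char *target)
{
    struct stat s, t;
    if (stat(source, &s) != 0)
        return 0;                /* no source file: nothing to do */
    if (stat(target, &t) != 0)
        return 1;                /* no target yet: build it */
    return s.st_mtime > t.st_mtime;
}
```

Everything else a build tool does--calling shell commands, rewriting filenames--is similarly a one-liner in a real language.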

I'm pretty sure there's a standard old build system to do this kind of thing, but in a clunky way where you have to be careful whether you use spaces or tabs, remember arcane bits of syntax, remember what rules and macros are built-in, remember tricks involved in building nested projects, remember the differences between the versions that have gone down their own evolutionary paths. I use it rarely enough that I forget all of these details. There are modern variants, too, that trade all of that 1970s-era fiddling for different lists of things to remember. But there's no need.

It's easier and faster to put together custom, micro-build systems in the high-level language of your choice.

Tales of a Former Disassembly Addict

Like many people who learned to program on home computers in the 1980s, I started with interpreted BASIC and moved on to assembly language. I've seen several comments over the years--including one from Alan Kay--less than thrilled with the 8-bit department store computer era, viewing it as a rewind to a more primitive time in programming. That's hard to argue against, as the decade prior to the appearance of the Apple II and Atari 800 had resulted in Modula-2, Smalltalk, Icon, Prolog, Scheme, and some top notch optimizing compilers for Pascal and C. Yet an entire generation happily ignored all of that and became bit-bumming hackers, writing thousands of games and applications directly at the 6502 and Z80 machine level with minimal operating system services.

There wasn't much of a choice.

In Atari BASIC, this statement:

PRINT SIN(1)

was so slow to execute that you could literally say "dum de dum" between pressing the enter key and seeing the result. Assembly language was the only real option if you wanted to get anywhere near what the hardware was capable of. That there was some amazing Prolog system on a giant VAX did nothing to change this. And those folks who had access to that system weren't able to develop fast graphical games that sold like crazy at the software store at the mall.

I came out of that era being very sensitive to what good low-level code looked like, and it was frustrating.

I'd routinely look at the disassembled output of Pascal and C compilers and throw up my hands. It was often as if the code was some contrived example in Zen of Assembly Language, just to show how much opportunity there was for optimization. I'd see pointless memory accesses, places where comparisons could be removed, a half dozen lines of function entry/exit code that wasn't needed.

And it's often still like that, even though the party line is that compilers can out-code most humans. Now I'm not arguing against the overall impressiveness of compiler technology; I remember trying to hand-optimize some SH4 code and I lost to the C compiler every time (my code was shorter, but not faster). But it's still common to see compilers where this results in unnecessarily bulky code:

*p++ = 10;
*p++ = 20;
*p++ = 30;
*p++ = 40;

while this version ends up much cleaner:

p[0] = 10;
p[1] = 20;
p[2] = 30;
p[3] = 40;
p += 4;

I noticed under OS X a few years ago--and this may well still be the case with the Snow Leopard C compiler--that every access to a global variable resulted in two fetches from memory: one to get the address of the variable, one to get the actual value.

Don't even get me started about C++ compilers. Take some simple-looking code involving objects and overloaded operators, and I can guarantee that the generated code will be filled with instructions to copy temporary objects all over the place. It's not at all surprising if a couple of simple lines of source turn into fifty or a hundred lines of assembly language. In fact, generated code can be so ridiculous and verbose that I finally came up with an across-the-board solution which works for all compilers on all systems:

I don't look at the disassembled output.

If you've read just a couple of entries in this blog, you know that I use Erlang for most of my personal programming. As a mostly-interpreted language that doesn't allow data structures to be destructively modified, it's no surprise to see Erlang in the bottom half of any computationally intensive benchmark. Yet I find it keeps me thinking at the right level. The goal isn't to send as much data as possible through a finely optimized function, but to figure out how to have less data and do less processing on it.

In the mid-1990s I wrote a 2D game and an enhanced version of the same game. The original had occasional--and noticeable--dips in frame rate on low-end hardware, even though I had optimized the sprite drawing routines to extreme levels. The enhanced version didn't have the same problem, even though the sprite code was the same. The difference? The original just threw dozens and dozens of simple-minded attackers at the player. The enhanced version had a wider variety of enemy behavior, so the game could be just as challenging with fewer attackers. Or more succinctly: it was drawing fewer sprites.

I still see people obsessed with picking a programming language that's at the top of the benchmarks, and they obsess over the timing results the way I used to obsess over disassembled listings. It's a dodge, a distraction...and it's irrelevant.

How Did Things Ever Get This Good?

It's an oft-repeated saying in photography that the camera doesn't matter. All that fancy equipment is a waste of money, and good shots are from inspired photographers with well-trained eyes.

Of course no one actually believes that.

Clearly some photos are just too good to be taken with some $200 camera from Target, and there must be a reason that pros can buy two-thousand dollar lenses and three-thousand dollar camera bodies. The "camera doesn't matter" folklore is all touchy-feely and inspirational and rolls off the tongue easily enough, and then everyone runs back to their Nikon rumor sites and over-analyzes the differences between various models, and thinks about how much better photos will turn out after the next hardware refresh cycle.

But the original saying is actually correct. It's just hard to accept, because it's fun to compare and lust after all the toys available to the modern photographer. I've finally realized that some of those photos that once made me say "Wow, I wish I had a camera like that!" might look casual, but often involve elaborate lighting set-ups. If you could pull back and see more than just the framed shot, there would be a light box over here, and a flash bounced off of a big white sheet over there, and so on. Yes, there's a lot of work involved, but the camera is incorrectly assumed to be doing more than it really is. In fact it's difficult to find a truly bad camera.

What, if anything, does this have to do with programming?

Life is good if you have applications or tools or games that you want to write. Even a language like Ruby, which tends to hang near the bottom of any performance-oriented benchmark, is thousands of times faster than BASICs that people were learning to program 8-bit home computers with in the 1980s. That's not an exaggeration, I do mean thousands.

The world is brimming with excellent programming languages: Python, Clojure, Scala, Perl, Javascript, OCaml, Haskell, Erlang, Lua. Most slams against individual languages are meaningless in the overall scheme of things. If you like Lisp, go for it. There's no reason you can't use it to do what you want to do. String handling is poor in Erlang? Compared to what? Who cares, it's so much easier to use than anything I was programming with twenty years ago that it's not worth discussing. Perl is ugly? It doesn't matter to me; it's fun to program in.

Far, far, too much time has been spent debating the merits of various programming languages. Until one comes along that truly gives me a full magnitude increase in productivity over everything else, I'm good.

Slow Languages Battle Across Time

In my previous optimistic outburst I asserted that "Even a language like Ruby, which tends to hang near the bottom of any performance-oriented benchmark, is thousands of times faster than BASICs that people were learning to program 8-bit home computers with in the 1980s." That was based on some timings I did five years ago, so I decided to revisit them.

The benchmark I used is the old and not-very-good-as-a-benchmark Sieve of Eratosthenes, because that's the only benchmark that I have numbers for in Atari BASIC on original 8-bit computer hardware. Rather than using Ruby as the modern-day language, I'm using Python, simply because I already have it installed. It's a fair swap, as Python doesn't have a reputation for performance either.

The sieve in Atari BASIC, using timings from an article written in 1984 by Brian Moriarty, clocks in at:

324 seconds (or just under 5 and a half minutes)

The Python version, running on hardware that's a generation back--no i7 processor or anything like that--completes in:

3 seconds

Now that's impressive! A straight-ahead interpreted language, one with garbage collection and dynamic typing and memory allocation all over the place, and it's still two orders of magnitude, 108 times, faster than what hobbyist programmers had to work with twenty-five years ago. But what about the "thousands of times" figure I tossed about in the first paragraph?

Oh, yes, I forgot to mention that the Python code is running the full Sieve algorithm one thousand times.
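The original timing script isn't shown, but the benchmark in question is the classic flag-array sieve from the old BYTE-style listings, which Moriarty's article also used. Here's a reconstruction of that shape in Python--treat the details as an assumption rather than the exact code that was timed:

```python
import time

def sieve(size=8190):
    # Classic flag-array sieve: index i stands for the odd number 2*i + 3,
    # so size 8190 covers the odd numbers up through 16383.
    flags = [True] * (size + 1)
    count = 0
    for i in range(size + 1):
        if flags[i]:
            prime = i + i + 3
            # Strike out the odd multiples of this prime (3p, 5p, 7p, ...).
            for k in range(i + prime, size + 1, prime):
                flags[k] = False
            count += 1
    return count

if __name__ == "__main__":
    start = time.time()
    for _ in range(1000):  # the full algorithm, one thousand times
        primes = sieve()
    print(primes, "primes,", time.time() - start, "seconds")
```

The 1984 BASIC version ran this same loop once; the Python timing runs it a thousand times.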

If the Atari BASIC program ran a thousand times, it would finish after 324,000 seconds or 5400 minutes or almost four days. That means the Python version is--get ready for this--108,000 times faster than the Atari BASIC code.

That's progress.

(If you liked this, you might also like A Spellchecker Used to be a Major Feat of Software Engineering.)

How I Learned to Stop Worrying and Love Erlang's Process Dictionary

The rise of languages based upon hash tables is one of the great surprises in programming over the last twenty years.

Whatever you call them--hash tables, dictionaries, associative arrays, hash maps--they sure are useful. A majority of my college data structures courses are immediately negated. If you've got pervasive hash table support, then you've also got arrays (just hash tables with integer keys), growable arrays (ditto), sparse arrays (again, same thing), and it's rare to have to bother with binary trees, red-black trees, tries, or anything more complex. Okay, sets are still useful, but they're hash tables where only the keys matter. I'd even go so far as to say that with pervasive hash table support it's unusual to spend time thinking about data structures at all. It's key/value pairs for everything. (And if you need convincing, study Peter Norvig's Sudoku solver.)

If you don't believe the theory that functional programming went mainstream years ago, then at least consider that the mismatch between dictionary-based programming and functional programming has dealt a serious blow to the latter.

Now, sure, it's easy to build a purely functional dictionary in Erlang or Haskell. In fact, such facilities are already there in the standard libraries. It's using them that's clunky. Reading a value out of a dictionary is straightforward enough, but the restrictions of single-assignment--and the fact that "modifying" a hash table returns a new version--call for a little puzzle solving.

I can write down any convoluted sequence of dictionary operations:

Add the value of the keys "x" and "y" and store them in key "z". If "z" is greater than 100, then also set a key called "overflow" and add the value of "extra" to "x." If "x" is greater than 326, then set "finished" to true, and clear "y" to zero.

and in Python, Ruby, Lua, or Perl, minimal thought is required to write out a working solution. It's just a sequence of operations that mimic their textual descriptions. Here's the Python version, where the dictionary is called "d":

d['z'] = d['x'] + d['y']
if d['z'] > 100:
    d['overflow'] = True
    d['x'] += d['extra']
if d['x'] > 326:
    d['finished'] = True
    d['y'] = 0

I can certainly write the Erlang version of that (and, hey, atoms, so no quotes necessary!), but I can't do it so mindlessly. (Go ahead and try it, using either the dict or gb_trees modules.) I may be able to come up with Erlang code that's prettier than Python in the end, but it takes more work, and a minor change to the problem definition might necessitate a full restructuring of my solution.

Well, okay, no, the previous paragraph isn't true at all. I can bang out an Erlang version that closely mimics the Python solution. And as a bonus, it comes with the big benefit of Erlang: the hash table is completely isolated inside of a single process. All it takes is getting over the psychological hurdle--and the lecturing from purists--about using the process dictionary. (You can use the ets module to reach the same end, but it takes more effort: you need to create the table first, and you have to create and unpack key/value pairs yourself, among other quirks.)

Let me come right out and say it: it's okay to use the process dictionary.

Clearly the experts agree on this, because every sizable Erlang program I've looked at makes use of the process dictionary. It's time to set aside the rote warnings of how unmaintainable your code will be if you use put and get and instead revel in the usefulness of per-process hash tables. Now what's really being preached against, when the horrors of the process dictionary are spewed forth, is using it as a way to sidestep functional programming, weaving global updates and flags through your code like old-school BASIC. But that's the extreme case. Here's a list of situations where I've found the process dictionary to be useful, starting from the mildest of instances:

Low-level, inherently stateful functions. Random number generation is the perfect example, and not too surprisingly the seed that gets updated with each call lives in the process dictionary.

Storing process IDs. Yes, you can use named processes instead, and both methods keep you from having to pass PIDs around, but named processes are global to the entire Erlang node. Use the process dictionary instead and you can start multiple instances of the same application without conflict.

Write-once process parameters. Think of this as an initial configuration step. Stuff all the settings that will never change within a process in the dictionary. From a programming point of view they're just like constants, so no worries about side effects.

Managing data in a tail recursive server loop. If you've done any Erlang coding you've written a tail-recursive server at some point. It's a big receive statement, where each message handling branch ends with a recursive call. If there are six parameters, then each of those calls involves six parameters, usually with one of them changed. If you add a new parameter to the function, you've got to find and change each of the recursive calls. Eventually it makes more sense to pack everything into a data structure, like a gb_tree or ets table. But there's nothing wrong with just using the simpler process dictionary for key/value pairs. It doesn't always make sense (you might want the ability to quickly roll back to a previous state), but sometimes it does.

Handling tricky data flow in high-level code. Sometimes trying to be pure is messy. Nice, clean code gets muddled by having to pass around some data that only gets used in exceptional circumstances. All of a sudden functions have to return tuples instead of simple values. All of a sudden there's tangential data hitchhiking through functions, not being used directly. And when I start going down this road I find myself getting frustrated and annoyed, jumping through hoops to do something that I wouldn't even care about in most languages. Making flags or key elements of data globally accessible is a huge sigh of relief, and the excess code melts away.

(If you've read this and are horrified that I'm completely misunderstanding functional programming, remember that I've gone further down the functional path than most people for what appear to be state-oriented problems.)

Functional Programming Doesn't Work (and what to do about it)

Read suddenly and in isolation, this may be easy to misinterpret, so I suggest first reading some past articles which have led to this point:

After spending a long time in the functional programming world, and using Erlang as my go-to language for tricky problems, I've finally concluded that purely functional programming isn't worth it. It's not a failure because of soft issues such as marketing, but because the further you go down the purely functional road the more mental overhead is involved in writing complex programs. That sounds like a description of programming in general--problems get much more difficult to solve as they increase in scope--but it's much lower-level and specific than that. The kicker is that what's often a tremendous puzzle in Erlang (or Haskell) turns into straightforward code in Python or Perl or even C.

Imagine you've implemented a large program in a purely functional way. All the data is properly threaded in and out of functions, and there are no truly destructive updates to speak of. Now pick the two lowest-level and most isolated functions in the entire codebase. They're used all over the place, but are never called from the same modules. Now make these dependent on each other: function A behaves differently depending on the number of times function B has been called and vice-versa.

In C, this is easy! It can be done quickly and cleanly by adding some global variables. In purely functional code, this is somewhere between a major rearchitecting of the data flow and hopeless.

A second example: It's a common compilation technique for C and other imperative languages to convert programs to single-assignment form. That is, where variables are initialized and never changed. It's easy to mechanically convert a series of destructive updates into what's essentially pure code. Here's a simple statement:

if (a > 0) {
    a++;
}

In single-assignment form a new variable is introduced to avoid modifying an existing variable, and the result is rather Erlangy:

if (a > 0) {
    a1 = a + 1;
} else {
    a1 = a;
}

The latter is cleaner in that you know variables won't change. They're not variables at all, but names for values. But writing the latter directly can be awkward. Depending on where you are in the code, the current value of whatever "a" represents has different names. Inserting a statement in the middle requires inventing new names for things, and you need to make sure you're referencing the right version. (There's more room for error now: you don't just say "a," but the name of the value you want in the current chain of calculations.)

In both of these examples imperative code is actually an optimization of the functional code. You could pass a global state in and out of every function in your program, but why not make that implicit? You could go through the pain of trying to write in single-assignment form directly, but as there's a mechanical translation from one to the other, why not use the form that's easier to write in?

At this point I should make it clear: functional programming is useful and important. Remember, it was developed as a way to make code easier to reason about and to avoid "spaghetti memory updates." The line between "imperative" and "functional" is blurry. If a Haskell program contains a BASIC-like domain specific language which is also written in Haskell, is the overall program functional or imperative? Does it matter?

For me, what has worked out is to go down the purely functional path as much as possible, but fall back on imperative techniques when too much code pressure has built up. Some cases of this are well-known and accepted, such as random number generation (where the seed is modified behind the scenes), and most any kind of I/O (where the position in the file is managed for you).

Learning how to find similar pressure relief valves in your own code takes practice.

One bit of advice I can offer is that going for the obvious solution of moving core data structures from functional to imperative code may not be the best approach. In the Pac-Man example from Purely Functional Retrogames, it's completely doable to write that old game in a purely functional style. The dependencies can be worked out; the data flow isn't really that bad. It still may be a messy endeavor, with lots of little bits of data to keep track of, and selectively moving parts out of the purely functional world will result in more manageable code. Now the obvious target is either the state of Pac-Man himself or the ghosts, but those are part of the core data flow of the program. Make those globally accessible and modifiable and all of a sudden a large part of the code has shifted from functional to imperative...and that wasn't the goal.

A better approach is to look for small, stateful, bits of data that get used in a variety of places, not just on the main data flow path. A good candidate in this example is the current game time (a.k.a. the number of elapsed frames). There's a clear precedent that time/date functions, such as Erlang's now(), cover up a bit of state, and that's what makes them useful. Another possibility is the score. It's a simple value that gets updated in a variety of situations. Making it a true global counter removes a whole layer of data threading, and it's simple: just have a function to add to the score counter and another function to retrieve the current value. No reason to add extra complexity just to dodge having a single global variable, something that a C / Python / Lua / Ruby programmer wouldn't even blink at.
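That global score counter is only two tiny functions. In Python the shape of the idea looks like this (a hypothetical sketch--the game under discussion was Erlang, where the counter would live in a process or the process dictionary instead of a module-level variable):

```python
_score = 0  # one global counter, hidden behind two small functions

def add_to_score(points):
    # Called from wherever scoring events happen: eating a dot,
    # catching a ghost, and so on.
    global _score
    _score += points

def current_score():
    # Called by the display code once per frame.
    return _score
```

Any part of the game can call add_to_score without the score being threaded through every function between the collision check and the screen.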

(Also see the follow-up.)

Indonesian translation

Follow-up to "Functional Programming Doesn't Work"

Not surprisingly, Functional Programming Doesn't Work (and what to do about it) started some lively discussion. There were two interesting "you're crazy" camps:

The first mistakenly thought that I was proposing fixing problems via a judicious use of randomly updated global variables, so every program turns into potential fodder for the "Daily WTF."

The second, and really, the folks in this camp need to put some effort into being less predictable, was that I'm completely misunderstanding the nature of functional programming, and if I did understand it then I'd realize the true importance of keeping things pure.

My real position is this: 100% pure functional programming doesn't work. Even 98% pure functional programming doesn't work. But if the slider between functional purity and 1980s BASIC-style imperative messiness is kicked down a few notches--say to 85%--then it really does work. You get all the advantages of functional programming, but without the extreme mental effort and unmaintainability that increases as you get closer and closer to perfectly pure.

That 100% purity doesn't work should only be news to a couple of isolated idealists. Of the millions of non-trivial programs ever written--every application, every game, every embedded system--there are, what, maybe six that are written in a purely functional style? Don't push me or I'll disallow compilers for functional languages from that list, and then it's all hopeless.

"Functional Programming Doesn't Work" was intended to be optimistic. It does work, but you have to ease up on hardliner positions in order to get the benefits.

The Recovering Programmer

I wrote the first draft of this in 2007, and I thought the title would be the name of my blog. But I realized I had a backlog of more tech-heavy topics that I wanted to get out of my system. I think I've finally done that, so I'm going back to the original entry I planned to write.

When I was a kid, I thought I'd be a cartoonist--I was always drawing--or a novelist. Something artistic. When I became obsessed with video games in the 1980s, I saw game design as being in the same vein as cartooning and writing: one person creating something entirely on their own. I learned to program 8-bit computers so I could implement games of my own design. Eventually, slowly, the programming overtook the design. I got a degree in computer science. I worked on some stuff that looks almost impossible now, like commercial games of over 100K lines of assembly language (and later I became possibly the only person to ever write a game entirely in PowerPC assembly language).

Somewhere along the line, I realized I was looking at everything backward, from an implementation point of view, not from the perspective of a finished application, not considering the user first. And I realized that the inherent bitterness and negativity of programming arguments and technical defensiveness on the web were making me bitter and negative. I've consciously tried to rewind, to go back to when programming was a tool for implementing my visions, not its own end. I've found that Alan Cooper is right, in that a tech-first view promotes scarcity thinking (that is, making perceived memory and performance issues be the primary concerns) and dismissing good ideas because of obscure boundary cases. And now programming seems less frustrating than it once did.

I still like to implement my own ideas, especially in fun languages like Erlang and Perl. I'm glad I can program, because personal programming in the small is fertile ground and tremendously useful. For starters, this entire site is generated by 269 lines of commented Perl, including the archives and the atom feed (and those 269 lines also include some HTML templates). Why? Because it was pleasant and easy, and I don't have to fight with the formatting and configuration issues of other software. Writing concise to-the-purpose solutions is a primary reason for programming in the twenty-first century.

If blogs had phases, then this would be the second phase of mine. I'm not entirely sure what Phase 2 will consist of, but I'll figure that out. Happy 2010!

No Comment

I received a few emails after last time along the lines of "Oh. Perl. Homebrew CMS. That's why you don't allow people to post comments." Well, no, but it was definitely a conscious decision. The Web 2.0 answer is that I'm outsourcing comments to reddit and Hacker News. The real reason is this:

The negativity of online technical discussions makes me bitter, and even though I'm sometimes drawn to them I need to stay away.

To be fair, this isn't true only of technical discussions. Back when I was on Usenet, I took refuge from geeky bickering in a group about cooking...only to find people arguing the merits of Miracle Whip versus mayonnaise. Put enough people together and there are sure to be complaints and conflicting personal agendas. With smart, technically-oriented people, though, I'd expect more sharing of real experiences, but that's often not the case.

Here's a lesson I learned very early on after I started working full-time as a programmer (and that's a peculiar sentence for me to read, as I no longer program for a living). I'd be looking at some code at my desk, and it made no sense. Why would anyone write it like this? There's an obvious and cleaner way to approach the same problem.

So I'd go down the hall to the person who wrote it in the first place and start asking questions...and find out that I didn't have the whole picture, the problem was messier than it first appeared, and there were perfectly valid reasons for the code being that way. This happened again and again. Sometimes I did find a real flaw, but even then it may have only occurred with data that wasn't actually possible (because, for example, it was filtered by another part of the system). Talking face to face changed everything, because they could draw diagrams, pull out specs, and give concrete examples.

I think that initial knee-jerk "I've been looking at this for ten seconds and now let me explain the critical flaws" reaction is a common one among people with engineering mindsets. And that's not a good thing. I've seen this repeatedly, from people putting down programming languages for silly, superficial reasons (Perl's sigils, Python's enforced indentation), to ridiculous off-the-cuff put downs of new products (such as the predictions of doom in the Slashdot announcement of the original iPod in 2001).

The online community that I've had the most overwhelmingly positive experience with is the photo-sharing site Flickr. I'll keep talking about Flickr, because it played a big part in getting me out of some ruts, and I've seen more great photographs over the last five years than I would have seen in ten lifetimes otherwise. I know that if you dig around you can find tedious rants from equipment collectors, but I do a good job of avoiding those. I don't think I've ever seen real negativity in photo comments other than suggestions for different crops or the occasional technical criticisms. There are so many good photos to see that there's no reason to waste time with ones that don't appeal to me. That's supported by only allowing up-voting of photos (by adding a shot to your favorites); there's no way to formally register dislike.

Flickr gets my time, but most of the programming discussion sites don't.

Flickr as a Business Simulator

Flickr came along exactly when I needed it.

In 2004, I knew I was too immersed in technical subjects, and Flickr motivated me to get back into photography as a change of pace. I loved taking photos when I was in college (mostly of the set-up variety with a couple of friends), but I hardly touched a camera for the next decade. When I first found out about Flickr, not long after it launched, the combination of having a new camera and a potential audience provided me with a rare level of inspiration. I remember walking around downtown Champaign on June 1, 2004, spending two hours entirely focused on taking photos. I didn't have a plan, I didn't have a preferred subject; I just made things up as I went.

This is one of my favorites from that day:

Flickr was pretty raw back then. You could comment on photos, but there wasn't the concept of favoriting a good shot or the automated interestingness ranking. As those systems went into place, it was easier to get feedback about the popularity of photos. Why did people like this photo but not this other one? How does that user manage to get dozens of comments per shot?

It took me a while to recognize some of the thought patterns and feelings that I had once I started paying attention to the feedback enabled by Flickr. They were reminiscent of feelings I had when I was an independent developer. I was rediscovering lessons which I had, at great expense, learned earlier. Now I can, and will, recount some of these lessons, but that in itself isn't very useful or exciting. Anyone can recite pithy business knowledge, and anyone can ignore it too, because it's hard to accept advice without it being grounded in personal experience. The important part is that you can experience these lessons firsthand by using Flickr.

Create an account and give yourself a tough goal, such as getting 50,000 photostream views in six months or getting 500 photos flagged as favorites. And now it's a business simulator. You're creating a product--a pool of photographs--which is released into the wild and judged by people you don't control. The six month restriction simulates how long you can survive on your savings. Just like a real business, the results have a lot to do with the effort you put forth. But it's not a simple translation of effort into success; it's trickier than that.

Now some of the lessons.

You don't get bonus points for being the small guy. It sounds so appealing to be the indie that's getting by on a shoestring. Maybe some customers will be attracted to that and want to stick it to the man by supporting you. On Flickr you're on the same playing field as pros with thousands of dollars worth of equipment and twenty years' experience. You can still stand out, but don't fool yourself into thinking that a lack of resources, used as an excuse for lower quality, will be seen as an endearing advantage.

While quality is important, keep the technical details behind the scenes. Just as no one really cares what language your application is written in, no one really cares what lens you took a photograph with or what filter you used. Be wary of getting too into the tech instead of the end result.

What you think people want might not be what people want. This one is tough. Are you absorbed in things that you think are important but are irrelevant, or even turn-offs, to your potential audience? This is the kind of thing that a good record producer would step in and deal with ("Just stop with the ten minute solos, okay?"), but it can be difficult to come to these realizations on your own, especially if you're seeing the problems as selling points.

Don't fixate on why you think some people are undeservedly successful. All it does is pull you away from improving your own photos/products as you pour energy into being bitter. Your personal idea of taste doesn't apply to everyone else. There may be other factors at work that you don't understand. Just let it go or it will drag you down.

But don't take my word for it. Just spend a few months in the simulator.

Nothing Like a Little Bit of Magic

Like so many other people, I was enthralled by the iPad introduction. I haven't held or even seen an iPad in person yet, but that video hit me on a number of levels. It's a combination of brand new hardware--almost dramatically so--and uses for it that are coming from a completely different line of thinking. I realized it's been a long time since I felt that way about the introduction of a new computer.

I remember the first time I tried a black and white 128K Mac in a retail store. A mouse! Really tiny pixels! Pull-down menus! Graphics and text mixed together! And the only demo program was what made the whole experience click: MacPaint.

I remember when the Atari 520ST was announced. Half a megabyte of memory! Staggering amounts of power for less than $1000! A Mac-like interface but in full color! Some of the demos were simple slideshows of 16-color 320x200 images, done with a program called NeoChrome, but I had never seen anything like them before.

I remember when the Amiga debuted that same year. Real multitasking! Digitized sound! Stereo! Hardware for moving around big bitmaps instead of just tiny sprites! Images showing thousands of colors at once! Just the bouncing ball demo was outside what I expected to ever see on a computer. And there was a flight-sim with filled polygon graphics. Behind the scenes it was the fancy hardware enabling it all, but it was the optimism and feeling of new possibilities that fueled the excitement.

I remember when the Macintosh II came out, with 24-bit color and impossibly high display resolutions for the time. It seemed like a supercomputer on a desk, the kind of thing that only high-end graphics researchers would have previously had access to.

PCs never hit me so unexpectedly and all at once, but there were a few years when 3D hardware started appearing where it felt like the old rules had been thrown out and imagining the future was more important than looking back on the same set of ideas.

Am I going to buy an iPad? I don't know yet. I never bought most of the systems listed above. But I am glad I've been experiencing that old optimism caused by a mix of hardware and software that suddenly invalidates many of the old, comfortable rules and opens up territory that hasn't been endlessly trod upon.

What to do About Erlang's Records?

The second most common complaint about Erlang, right after confusion about commas and semicolons as separators, is about records. Gotta give those complainers some credit, because they've got taste. Statically defined records are out of place in a highly dynamic language. There have been various proposals over the years, including Richard O'Keefe's abstract syntax patterns and Joe Armstrong's structs. Getting one of those implemented needs the solid support of the Erlang system maintainers, and it's understandably difficult to commit to such a sweeping change to the language. So what are the alternatives to records that can be used right now?

To clarify, I'm really talking about smallish, purely functional dictionaries. For large amounts of data there's already the gb_trees module, plus several others with similar purposes.

In Python, a technique I often use is to return a small dictionary with a couple of named values in it. I could use a tuple, but a dictionary removes the need to worry about order. This is straightforward in Erlang, too:

fun(length) -> 46;
(width)  -> 17;
(color)  -> sea_green
end.

Getting the value corresponding to a key is easy enough:

Result(color)

This is handy, but only in certain situations. One shortcoming is that there's no way to iterate through the keys. Well, there's this idea:

fun(keys)   -> [length, width, color];
(length) -> 46;
(width)  -> 17;
(color)  -> sea_green
end.

Now there's a way to get a list of keys, but there's room for error: each key appears twice in the code. The second issue is that there's no simple way to take one dictionary and create a new one with a value added or removed. This road is becoming messy to go down, so here's a more data-driven representation:

[{length, 46}, {width, 17}, {color, sea_green}]

That's just a list of key/value pairs, which is searchable via the fast, written-in-C function lists:keyfind. New values can be appended to the head of the list, and there are other functions in the lists module for deleting and replacing values. Iteration is also easy: it's just a list.

We still haven't bettered records in all ways. A big win for records, and this is something few purely functional data structures handle well, is the ability to create a new version where multiple keys get different values. For example, start with the above list and create this:

[{length, 200}, {width, 1400}, {color, sea_green}]

If we knew that only those three keys were allowed, fine, but that's cheating. The whole point of dictionaries is that we can put all sorts of stuff in there, and it doesn't change how the dictionary is manipulated. The general solution is to delete all the keys that should have new values, then insert the new key/value pairs at the head of the list. Or step through the list and see if the current key is one that has a new value and replace it. These are not linear algorithms, unfortunately. And you've got the same problem if you want to change multiple values in a gb_tree at the same time.

What I've been using, and I admit that this isn't perfect, is the key/value list approach, but forcing the lists to be sorted. This allows the original list and a list of changes to be merged together in linear time. The downside is that I have to remember to keep a literal list in sorted order (or write a parse transform to do this for me).
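The sorted-list merge is easy to sketch. Here it is in Python rather than Erlang (the name merge_updates is made up for illustration; real Erlang code would use pattern matching, but the algorithm is the same): one linear pass over both sorted lists, with the changes winning whenever keys collide.

```python
def merge_updates(original, changes):
    """Merge a sorted key/value list with a sorted list of changes
    in a single linear pass. A change replaces a matching key."""
    result = []
    i, j = 0, 0
    while i < len(original) and j < len(changes):
        key_o, key_c = original[i][0], changes[j][0]
        if key_o < key_c:
            result.append(original[i]); i += 1
        elif key_o > key_c:
            result.append(changes[j]); j += 1
        else:
            # Same key: keep the changed value, drop the old one.
            result.append(changes[j]); i += 1; j += 1
    result.extend(original[i:])   # leftovers from the original list
    result.extend(changes[j:])    # leftover new keys
    return result
```

Merging the earlier example with two changed values touches each pair exactly once, no matter how many keys change at the same time.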

There's still one more feature of records that can't be emulated: extracting / comparing values using Erlang's standard pattern matching capabilities. It's not a terrible omission, but there's no way to dodge this one: it needs compiler and runtime system support.

Optimizing for Fan Noise

The first money I ever earned, outside of getting an allowance, was writing assembly language games for an 8-bit home computer magazine called ANALOG Computing. Those games ended up as pages of printed listings of lines like this:

1050 DATA 4CBC08A6A4BC7D09A20986B7B980
0995E895D4B99E099DC91C9DB51CA90095C0C8
CA10E8A20086A88E7D1D8E7E,608

A typical game could be 75 to 125+ of those lines (and those "three" lines above count as one; it's word-wrapped for a 40-column display). On the printed page they were a wall of hex digits. And people typed them in by hand--I typed them in by hand--in what can only be described as a painstaking process. Just try reading that data to yourself and typing it into a text editor. Go ahead: 4C-BC-08-A6...

Typos were easy to make. That's the purpose of the "608" at the end of the line. It's a checksum verified by a separate "correctness checker" utility.
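The checker's job can be sketched in Python. The actual checksum algorithm ANALOG Computing used isn't described here, so line_checksum below is a purely hypothetical stand-in (sum of character codes, modulo 1000) that just shows the shape of the idea:

```python
def line_checksum(text):
    # Hypothetical scheme: sum of the character codes, modulo 1000.
    # (The real ANALOG checksum algorithm isn't given in the text.)
    return sum(ord(c) for c in text) % 1000

def verify(text, expected):
    # The "correctness checker" recomputes the sum for each typed-in
    # line and compares it against the printed value.
    return line_checksum(text) == expected
```

A typo almost always changes the sum, so a mismatch points the reader at the exact line to retype.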

There was a strong incentive for the authors of these games to optimize their code. Not for speed, but to minimize the number of characters that people who bought the magazine had to type. Warning, 6502 code ahead! This:

   LDA #0
TAY

was two fewer printed digits than this:

   LDA #0
LDY #0

Across a 4K or 6K game, those savings mattered. Two characters here, four characters there, maybe the total line count could be reduced by four lines, six lines, ten lines. This had nothing to do with actual code performance. Even on a sub-2MHz processor those scattered few cycles were noise. But finding your place in the current line, saying "A6," then typing "A" and "6" took time. Measurable time. Time that was worth optimizing.

Most of the discussions I see about optimization are less concrete. It's always "speed" and "memory," but in the way someone with a big house and a good job says "I need more money." Optimization only matters if you're optimizing something where you can feel the difference, and you can't feel even thousands of bytes or nanoseconds. Optimizing for program understandability...I'll buy that, but it's more of an internal thing. There's one concern that really does matter these days, and it's not abstract in the least: power consumption.

It's more than just battery life. If a running program means I get an hour less work done before looking for a place to plug in, that's not horrible. The experience is the same, just shorter. But power consumption equals heat and that's what really matters to me: if the CPU load in my MacBook cranks up then it gets hot, and that causes the fan to spin up like a jet on the runway, which defeats the purpose of having a nice little notebook that I can bring places. I can't edit music tracks with a roaring fan like that, and it's not something I'd want next to me on the plane or one table over at the coffee shop. Of course it doesn't loudly whine like that most of the time, only when doing something that pushes the system hard.

What matters in 2010 is optimizing for fan noise.

If you're not buying this, take a look at Apple's stats about power consumption and thermal output of iMacs (which, remember, are systems where the CPU and fan are right there on your desk in the same enclosure as the monitor). There's a big difference in power consumption, and corresponding heat generated, between a CPU idling and at max load. That means it's the programs you are running which are directly responsible for both length of battery charge and how loudly the fan spins.

Obvious? Perhaps, but this is something that didn't occur with most popular 8-bit and 16-bit processors, because those chips never idled. They always ran flat-out all the time, even if just in a busy loop waiting for interrupts to hit. With the iMacs, there's a trend toward the difference between idle and max load increasing as the clock speed of the processor increases. The worst case is the early 2009 24-inch iMac: 387.3 BTU/h at idle, 710.3 BTU/h at max load, for a difference of 323 BTU/h. (For comparison, that difference is larger than the entire maximum thermal output of the 20-inch iMac CPU: 298.5 BTU/h.)

The utmost in processing speed, which once was the goal, now has a price associated with it. At the same time that manufacturers cite impressive benchmark numbers, there's also the implicit assumption that you don't really want to hit those numbers in the everyday use of a mobile computer. Get all those cores going all the time, including the vector floating point units, and you get rewarded with forty minutes of use on a full battery charge with the fan whooshing the whole time. And if you optimize your code purely for speed, you're getting what you asked for.

Realistically, is there anything you can do? Yes, but it means you have to break free from the mindset that all of a computer's power is there for the taking. Doubling the speed of a program by moving from one to four cores is a win if you're looking at the raw benchmark numbers, but an overall loss in terms of computation per watt. Ideas that sounded good in the days of CPU cycles being a free resource, such as anticipating a time-consuming task that the user might request and starting it in the background, are now questionable features. Ditto for persistent unnecessary animations.

Nanoseconds are abstract. The sound waves generated by poorly designed applications are not.

Dehumidifiers, Gravy, and Coding

For a few months I did freelance humor writing. Greeting cards, cartoon captions, that sort of thing. My sole income was from the following slogan, which ended up on a button:

Once I've gathered enough information for the almighty Zontaar, I'm outta here!

Sitting down and cranking out dozens of funny lines was hard. Harder than I expected. I gave it up because it was too draining (and because I wasn't making any money, but I digress).

Periodically I decide I want to boost my creativity. I carry around a notebook and write down conversations, lists, brainstormed ideas, randomness. I recently found one of these notebooks, so I can give some actual samples of its contents. Below half a page of "Luxury Housing Developments in Central Illinois Farmland" (e.g., Arctic Highlands), there's a long list titled "Ridiculous Things." Here are a few:

salads
spackle
key fobs
wine tastings
mulch
hair scrunchies
asphalt
Fry Daddy™
cinder blocks
relish
Frito Pie
aeration shoes

Okay, okay, I'll stop. But you get the idea.

As with the humor writing, I remember this taking lots of effort, and it took real focus to keep going. Did this improve my creativity? I'd like to think so. It certainly got me thinking in new directions and about different topics. It also made me realize something fundamental: technical creativity, such as optimizing code or thinking up clever engineering solutions, is completely different from the "normal" creativity that goes into writing stories or taking photos.

Years ago, I followed the development of an indie game. This was back when writing 3D games for non-accelerated VGA cards was cutting edge. The author was astounding in his coding brilliance. He kept pulling out trick after trick, and he wasn't shy about posting key routines for others to use. Eventually the game was released...and promptly forgotten. It may have been a technical masterpiece, but it was terrible as a game, completely unplayable.

I still like a good solution to a programming problem. I still like figuring out how to rewrite a function with half the code. But technical creativity is only one form of creativity.

It Made Sense in 1978

Whenever I see this list of memory cell sizes, it strikes me as antiquated:

BYTE = 8 bits
WORD = 16 bits
LONG = 32 bits

Those names were standard for both the Intel x86 and Motorola 68000 families of processors, and it's easy to see where they came from. "Word" isn't synonymous with a 16-bit value; it refers to the fundamental data size that a computer architecture is built to operate upon. On a 16-bit CPU like the 8086, a word is naturally 16 bits.

Now it's 2010, and it's silly to think of a 16-bit value as a basic enough unit of data to earn the designation "word." "Long" is similarly out of place, as 32-bit microprocessors have been around for over 25 years, and yet the standard memory cell size is still labeled in a way that makes it sound abnormally large.

The PowerPC folks got this right back in the early 1990s with this nomenclature:

BYTE = 8 bits
HALFWORD = 16 bits
WORD = 32 bits

That made sense in 1991, and it's still rational today. (64-bit is now common, but the jump isn't nearly as critical as it was the last time memory cell size doubled. The PowerPC name for "64-bits" is "doubleword.")

Occasionally you need to reevaluate your assumptions and not just cling to something because it's always been that way.

Eleven Years of Erlang

I've written about how I started using Erlang. A good question is why, after eleven years, am I still using it?

For the record, I do use other languages. I enjoy writing Python code, and I've taught other people how to use Python. This website is statically generated by a Perl program that I had fun writing. And I dabble in various languages of the month which have cropped up. (Another website I used to maintain was generated by a script that I kept reimplementing. It started out written in Perl, but transitioned through at least REBOL, J, and Erlang before I was through.)

One of the two big reasons I've stuck with Erlang is because of its simplicity. The functional core of Erlang can be, and has been, described in a couple of short chapters. Knowledge of four data types--numbers, atoms, lists, tuples--is enough for most programming problems. Binaries and funs can be tackled later. This simplicity is good, because the difficult part of Erlang and any mostly-functional language is in learning to write code without destructive updates. The language itself shouldn't pour complexity on top of that.

There are many possibilities for extending Erlang with new data types, with an alternative to records being high on the list. Should strings be split off from lists into a distinct entity? What about arrays of floats, so there's no need to box each value? How about a "machine integer" type that's represented without tagging and that doesn't get automatically promoted to an arbitrarily sized "big number" when needed?

All of those additional types are optimizations. Lists work just fine as strings, but even the most naive implementation of strings as Unicode arrays would take half the memory of the equivalent lists, and that's a powerful enticement. When Knuth warned of premature optimization, I like to think he wasn't talking so much about obfuscating code in the process of micro-optimizing for speed, but he was pointing out that code is made faster by specializing it. The process of specialization reduces your options, and you end up with a solution that's more focused and at the same time more brittle. You don't want to do that until you really need to.

It may be an overreaction to my years of optimization-focused programming, but I like the philosophy of making the Erlang system fast without just caving in and providing C-style abilities. I know how to write low-level C. And now I know how to write good high-level functional code. If I had been presented with a menu of optimization-oriented data types in Erlang, that might never have happened. I'd be writing C in the guise of Erlang.

The second reason I'm still using Erlang is because I understand it. I don't mean I know how to code in it, I mean I get it all the way down. I know more or less what transformations are applied by the compiler and the BEAM loader. I know how the BEAM virtual machine works. And unlike most languages, Erlang holds together as a full system. You could decide to ditch all existing C compilers and CPUs and start over completely, and Erlang could serve as a foundation for this new world of computing. The ECOMP project (warning: PowerPoint) proved that an FPGA running the Erlang VM directly gives impressive results.

Let me zoom in on one specific detail of the Erlang runtime. If you take an arbitrary piece of data in a language of the Lua or Python family, at the lowest-level it ends up wrapped inside a C struct. There's a type field, maybe a reference count, and because it's a heap allocated block of memory there's other hidden overhead that comes along with any dynamic allocation (such as the size of the block). Lua is unabashedly reliant on malloc-like heap management for just about everything.

Erlang memory handling is much more basic. There's a block of memory per process, and it grows from bottom to top until full. Most data objects aren't wrapped in structs. A tuple, for example, is one cell holding the size, followed by one cell per element. The system identifies it as a tuple by tagging the pointer to the tuple. The memory used for a tuple of N elements is always 1 + N cells, period. Were I trying to optimize data representation by hand, with the caveat that type info needs to be included, it would be tough to do significantly better.
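That layout is simple enough to model directly. Here's a Python sketch (not real BEAM internals, just the shape of the idea): a per-process heap as one flat array, a tuple stored as a size cell plus one cell per element, and the type carried in a tag on the pointer instead of in the data itself.

```python
heap = []  # one flat block of memory per process, growing upward

def alloc_tuple(elements):
    """Store a tuple as a size cell followed by one cell per element;
    return a tagged pointer, so the type lives outside the data."""
    addr = len(heap)
    heap.append(len(elements))  # header cell: the tuple's size
    heap.extend(elements)       # the elements themselves
    return ("tuple", addr)

tag, addr = alloc_tuple([1, 2, 3])
# Cells used: 1 + N. The heap is now [3, 1, 2, 3] -- no per-object
# malloc header, no reference count, no separate type field.
```

Nothing here is allocated individually; "allocation" is just bumping the end of the block, which is why it's so hard to beat by hand.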

I'm sure some people are correctly pointing out that this is how most Lisp and Scheme systems have worked since those languages were developed. There's nothing preventing an imperative language from using the same methods (and indeed this is sometimes the case).

Erlang takes this further by having a separate block of memory for each process, so when the block gets full only that particular block needs to be garbage collected. If it's a 64K block, it takes microseconds to collect, as compared to potentially traversing a heap containing the hundreds of megabytes of data in the full running system. Disallowing destructive updates allows some nice optimizations in the garbage collector, because pointers are guaranteed to reference older objects (this is sometimes called a "unidirectional heap"). Together these are much simpler than building a real-time garbage collector that can survive under the pressure of giant heaps.

Would I use Erlang for everything? Of course not. Erlang is clearly a bad match for some types of programming. It would be silly to force-fit Erlang into the iPhone, for example, with Apple promoting Objective C as the one true way. But it's the best mix of power and simplicity that I've come across.

A Short Story About Verbosity

In the early 2000s I was writing a book. I don't mean in the vague sense of sitting in a coffeeshop with my laptop and pretending to be a writer; I had a contract with a tech book publisher.

I'm in full agreement with the musician's saying of "never turn down a gig," so when the opportunity arose, I said yes. I did that even though there was one big, crazy caveat:

"In order for a book to sell," said my publisher, "it's got to be thick. 600 pages thick." "In the worst case we could go as low as 500 pages, but 600+ should be your target."

Wow, 600 pages. If I wrote two pages a day, that's almost a full year of writing, and I had less than a year. But still, never turn down a gig, and so I made a serious attempt at it.

I can't prove or refute the claim that a 600 page tech book sells better than thinner ones, but it explains a lot of the bloated tomes out there. Mix sections of code with the text, then reprint the whole program at the end of the chapter. That can eat four or eight pages. Add a large appendix that reiterates a language's standard library, even though all that info is already in the help system and online. Add some fluff survey chapters that everyone is going to skip.

I try not to wax nostalgic about how the olden days of computing were better. While I might have some fond memories of designing games for 8-bit home computers, there has been a lot of incredibly useful progress since then. But I do find myself wishing that the art of the 250 page technical book hadn't gone completely out of style.

Eventually I did give up on the 600 page monster I was writing. It was a combination of me not having enough time and my publisher taking weeks to give feedback about submitted chapters. In the end I think I had written the introduction and maybe eight full chapters. Do I wish I had finished it? Yes. Even with the 600 page requirement, there was still some clout that went along with writing a book at the time. These days it's much less so, and I think those padded-out-to-600-pages volumes had a lot to do with it.

(If you liked this, you might like Two Stories of Simplicity.)

Living Inside Your Own Black Box

Every so often I run across a lament that programmers no longer understand the systems they work on, that programming has turned into searches through massive quantities of documentation, that large applications are built by stacking together loosely defined libraries. Most recently it was Mike Taylor's Whatever happened to programming?, and it's worth the time to read.

To me, it's not that the act of programming has gotten more difficult. I'd even say that programming has gotten much easier. Most of the Apple Pascal assignments I had in high school would be a fraction of the bulk if written in Ruby or Python. Arrays don't have fixed lengths. Strings are easy. Dictionaries have subsumed other data structures. Generic sort routines are painless to use. Functions can be tested interactively. Times are good!

That's not to say all problems can be solved effortlessly. Far from it. But it's a tight feedback loop: think, experiment, write some code, reconsider, repeat. This works as long as you can live inside an isolated world, where the basic facilities of your programming language are the tools you have to work with. But at some point that doesn't work, and you have to deal with outside realities.

Here's the simplest example I can think of: Write a program to draw a line on the screen. Any line, any color, doesn't matter. No ASCII art.

In Python the first question is "What UI toolkit?" There are bindings for SDL, Cocoa, wxWindows, and others. Selecting one of those still doesn't mean that you can simply call a function and see your line. SDL requires some up front effort to learn how to create a window and choose the right resolution and color depth and so on. And then you still can't draw a line unless you use OpenGL or get an add-on package like SDL_gfx. If you decide to take the Cocoa route, then you need to understand its whole messaging / windowing / drawing model, and you also need to understand how Python interfaces with it. Maybe there's a beautifully simple package out there that lets you draw lines, and then the question becomes "Can I access that library from the language I'm using?" An even more basic question: "Is the library written using a paradigm that's a good match for my language?" (Think of a library based on subclassing mutable objects and try to use it from Haskell.)

There's a clear separation between programming languages and the capabilities of modern operating systems. Any popular OS is obviously designed for creating windows and drawing and getting user input, but those are not fundamental features of modern languages. At one time regular expressions weren't standard in programming languages either, but they're part of Perl and Ruby, and they're a library that's part of the official Python distribution.

A handful of language designers have tried to make GUI programming as easy as traditional programming. The Tk library for Tcl, which is still the foundation for Python's out-of-the-box IDE, allows basic UI creation with simple, declarative statements. REBOL is a more recent incarnation of the same idea, that sample code involving windows and user input and graphics should be a handful of lines, not multiple pages of wxWindows fussing. I wish more people were working on such things.

A completely different approach is to go back to the isolationist view of only using the natural capabilities of a programming language, but in a more extreme way. I can draw a line in Python with this tuple:

("line",0,0,639,479)

or I can do the same thing in Erlang with two fewer characters:

{line,0,0,639,479}

I know it works, because I can see it right there. The line starts at coordinates 0,0 and ends at 639,479. It works on any computer with any video card, including systems I haven't used yet, like the iPad. I can use the same technique to play sounds and build elaborate UIs.

That the results are entirely in my head is of no matter.

It may sound like I'm being facetious, but I'm not. In most applications, interactions between code and the outside world can be narrowed down to a couple of critical moments. Even in something as complex as a game, you really just need a few bytes representing user input at the start of a frame, then much later you have a list of things to draw and a list of sounds to start, and those get handed off to a thin, external driver of sorts, the small part of the application that does the messy hardware interfacing.

The rest of the code can live in isolation, doing arbitrarily complex tasks like laying out web pages and mixing guitar tracks. It takes some practice to build applications this way, without scattering calls to external libraries throughout the rest of the code, but there are big wins to be had. Fewer dependencies on platform specifics. Fewer worries about getting overly reliant on library X. And most importantly, it's a way to declutter and get back to basics, to focus on writing the important code, and to delve into those thousands of pages of API documentation as little as possible.
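That structure is easy to sketch in Python (the names scene and render are made up for illustration): the bulk of the program is a pure function producing a display list of tuples, and only a thin driver at the edge ever touches a real library.

```python
def scene(width, height):
    # Pure code: builds a display list, touches no library, no hardware.
    return [("clear", "black"),
            ("line", 0, 0, width - 1, height - 1)]

def render(commands, driver):
    # The thin external driver is the only messy, platform-specific
    # part. Here it's just a function that receives each command.
    for command in commands:
        driver(command)

captured = []
render(scene(640, 480), captured.append)
# captured now holds [("clear", "black"), ("line", 0, 0, 639, 479)]
```

Swapping the driver for one that calls SDL or Cocoa changes nothing in scene, which is exactly the point: the isolated code can be tested by inspecting plain data.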

Rethinking Programming Language Tutorials

Imagine you've never programmed before, and the first language you're learning is Lua. Why not start with the official book about Lua? Not too far in you run across this paragraph:

The table type implements associative arrays. An associative array is an array that can be indexed not only with numbers, but also with strings or any other value of the language, except nil. Moreover, tables have no fixed size; you can add as many elements as you want to a table dynamically. Tables are the main (in fact, the only) data structuring mechanism in Lua, and a powerful one. We use tables to represent ordinary arrays, symbol tables, sets, records, queues, and other data structures, in a simple, uniform, and efficient way. Lua uses tables to represent packages as well. When we write io.read, we mean "the read entry from the io package". For Lua, that means "index the table io using the string "read" as the key".

All right, where to start with this? "Associative arrays"? The topic at hand is tables, and they're defined as being synonymous with an odd term that's almost certainly unfamiliar. Ah, okay, "associative array" is defined in the next sentence, but it goes off track quickly. "Indexed" gets casually used; there's the assumption that the reader understands about arrays and indexing. Then there's the curious addendum of "except nil." All this talk of arrays and association and indexing, and the novice's head is surely swimming, and then the author throws in that little clarification, "except nil," as if that's the question sure to be on the mind of someone who has just learned of the existence of something called a table.

I've only dissected two sentences of that paragraph so far.

Really, I should stop, but I can't resist the declaration "Lua uses tables to represent packages as well." Who is that sentence written for exactly? It has no bearing on what a table is or how to use one; it's a five mile high view showing that a beautifully invisible language feature--packages--is really not so invisible and instead relies on this table idea which hasn't been explained yet.

I don't mean to single out Lua here. I can easily find tutorials for other languages that have the same problems. Every Haskell tutorial trots out laziness and folding and type systems far too early and abstractly. Why? Because those are the concerns of people who write Haskell tutorials.

To really learn to program, you have to go around in circles and absorb a lot of information. You need to get immersed in the terminology. You'll be exposed to the foibles and obsessions of language communities. You'll absorb beliefs that were previously absorbed by people who went on to write programming tutorials. It's hard to come out of the process without being transformed. Not only will you have learned to program, but all that nonsense that you struggled with ("We use tables to represent ordinary arrays...") no longer matters, because you get it. After that point it's difficult to see the madness, but it's still there.

Programming language tutorials shouldn't be about learning languages. They should be about something interesting, and you learn the language in the process.

If you want to learn to play guitar, the wrong approach is to pick up a book about music theory and a chart showing where all the notes are on the fretboard. There's a huge gap between knowing all that stuff and actually playing songs. That's why good music lessons involve playing recognizable songs throughout the process. But what do we get in programming tutorials? Hello World. Fibonacci sequences. Much important manipulation of "foo."

Not all tutorials are this way. Paradigms of Artificial Intelligence Programming is a survey of classic AI programs mixed together with enough details about Lisp to understand them. I've mentioned others in Five Memorable Books About Programming. But I still want to see more. "Image Processing in Lua." "Massively Multiplayer Games in Erlang." "Exploring Music Theory (using Python)."

I'll give a real example of how "Image Processing in Lua" could work. You can convert the RGB values of a pixel to a monochrome intensity value by multiplying Red, Green, and Blue by 0.3, 0.6, and 0.1 respectively, and summing the results. That's an easily understandable Lua function:

function intensity(r, g, b)
return r*0.3 + g*0.6 + b*0.1
end

If each color value ranges from 0 to 255, then a full white pixel should return the maximum intensity:

intensity(255, 255, 255)

and it does: 255. This tiny program opens the door for showing how the R, G, and B values can be grouped together into a single thing...and that turns out to be a table! There's also the opportunity to show that each color element can be named, instead of remembering a fixed order:

{green=255, blue=255, red=255}

Rewriting the "intensity" function first using tables and then using tables with named elements should hammer home what a table is and how it gets used. There was no need to mention any technical tangents, like "tables have no fixed size." (That can go in a terse reference doc.)
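Here's that final rewrite sketched with Python dictionaries standing in for Lua tables (purely illustrative; the tutorial itself would stay in Lua):

```python
def intensity(pixel):
    # Same weights as before, but the three channels now travel
    # together as one named group instead of three positional arguments.
    return pixel["red"] * 0.3 + pixel["green"] * 0.6 + pixel["blue"] * 0.1

white = {"green": 255, "blue": 255, "red": 255}  # order no longer matters
intensity(white)  # full white -> 255.0, matching the earlier result
```

The function's body barely changes, which is what makes the table (here, dict) version feel like a natural next step rather than a new topic.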

After all, the reason to learn a programming language is to do something useful with it, not simply to know the language.

(If you liked this, you might like The World's Most Mind-Bending Language Has the Best Development Environment.)

How Much Processing Power Does it Take to be Fast?

First, watch this.

It's Defender, an arcade game released thirty years ago. I went out of my way to find footage running on the original hardware, not emulated on a modern computer. (There's clearer video from an emulator if you prefer.)

Here's the first point of note: Defender is running on a 1MHz 8-bit processor. That's right, ONE megahertz. This was before the days of pipelined, superscalar architectures, so if an instruction took 5 cycles to execute, it always took 5 cycles.

Here's the second: Unlike a lot of games from the early 1980s, there's no hardware-assisted graphics. No honest-to-goodness sprites where the video processor does all the work. No hardware to move blocks of memory around. The screen is just a big bitmap, and all drawing of the enemies, the score, the scrolling mountains, the special effects, is handled by the same processor that's running the rest of the code.

To be fair, the screen is only 320x256 with four bits per pixel. But remember, this was 1980, and home computers released up until mid-1985 didn't have that combination of resolution and color.

Now it's 2010, and there's much amazement at the responsiveness of the iPad. And why shouldn't it be responsive? There's a 32-bit, gigahertz CPU in there that can run multiple instructions at the same time. Images are moved around by a separate processor dedicated entirely to graphics. When you flick your finger across the screen and some images slide around, there's very little computation involved. The CPU is tracking some input and sending some commands to the GPU. The GPU is happy to render what you want, and a couple of 2D images is way below the tens of thousands of texture-mapped polygons that it was designed to handle.

Okay, JPEG decompression takes some effort. Ditto for drawing curve-based, anti-aliased fonts. And of course there's overhead involved in the application framework where messages get passed around to delegates and so on. None of this justifies the assumption that it takes amazing computing power to provide a responsive user experience. We're so used to interfaces being clunky and static, and programs taking a long time to load, and there being unsettling pauses when highlighting certain menu items, that we expect it.

All the fawning over the speed of the iPad is a good reminder that it doesn't have to be this way.

(If you liked this, you might like Slow Languages Battle Across Time.)

How to Think Like a Pioneer

Here's an experiment to try at home: do a Google image search for "integrated development environment." Take some time to go through the first several pages of pictures.

Even if you had no idea what an IDE was, the patterns are obvious. There's a big area with text in it, and on the left side of the screen is a pane that looks a lot like Windows Explorer: a collection of folders and files in a tree view, each with icons. Some folders are closed and have little plus signs next to them. Others are expanded and you can see the files contained within them. To be perfectly fair, this project area is sometimes on the right side of the window instead of the left. Variety is the spice of life and all that.

Why have IDEs settled into this pattern of having a project view take up twenty percent or more of the entire left or right side of the window?

The answer is shallower than you may expect. Someone who decides to create an IDE uses the existing programs he or she is familiar with as models. In other words, "because that's how it's supposed to be."

There's no solid reason the project view pane has to be the way it usually is. In fact, there are some good arguments against doing it that way. First, you are typically either looking at the project view or you're editing a file, not doing both at the same time. Yet the project view is always there, taking up screen real estate, sometimes a lot of screen real estate if highly nested folders are expanded. That view is made even wider by having icons to the left of each filename, even though most projects consist of one or two file types and they could be differentiated with color or an outline instead of a contrived icon attempting to evoke "file of Java code."

Second, a long, thin pane containing files and folders doesn't give a particularly deep or even interesting view of a large project. Open a folder and it might fill the whole pane, and all the other folders are now offscreen.

Are there better options? Sure! The first problem, of losing screen space to a persistent view of the project hierarchy, can be alleviated by a hot key that brings up a full-screen overlay. When you want to see the project as a whole, hit a key. Press escape to make it go away (or double-click a file to edit it).

The data presented in this overlay doesn't need to be a tree view. A simple option is to group related files into colored boxes, much like folders, except you can see the contents the whole time. With the narrow, vertical format out of the way, there can be multiple columns of boxes filling the space. Now you can see 100+ files at a time instead of a dozen.

It might make more sense to display modules as shapes, each connected by lines to the modules which are imported. Or provide multiple visualizations of the project, each for a different purpose.

Someone, sometime, and probably not all that long ago, came up with the canonical "project view on the left side of the window" design for IDEs. And you may wonder how that person arrived at that solution. After all, there were no prior IDEs to use for guidance. I think the answer is a basic one: because it's a solution that worked and was better than what came before. No magic. No in-depth comparison of a dozen possibilities. Clearly a way to see all the files in a project is better than a raw window in a text editor where there's no concept of "project" whatsoever. That solution wasn't the best solution that could ever exist in the entire future history of IDEs, but it sure fooled people into thinking it was.

If you want to think like a pioneer, focus on the problem you're trying to solve. The actual problem. Don't jump directly to what everyone else is doing and then rephrase your problem in terms of that solution. In the IDE case, the problem is "How can I present a visual overview of a project," not "How can I write a tree viewer like in all the other IDEs I've ever seen?"

A Ramble Through Erlang IO Lists

The IO List is a handy data type in Erlang, but not one that's often discussed in tutorials. It's any binary. Or any list containing integers between 0 and 255. Or any arbitrarily nested list containing either of those two things. Like this:

[10, 20, "hello", <<"hello",65>>, [<<1,2,3>>, 0, 255]]

The key to IO lists is that you never flatten them. They get passed directly into low-level runtime functions (such as file:write_file), and the flattening happens without eating up any space in your Erlang process. Take advantage of that! Instead of appending values to lists, use nesting instead. For example, here's a function to put a string in quotes:

quote(String) -> [$"] ++ String ++ [$"].

If you're working with IO lists, you can avoid the append operations completely (and the second "++" above results in an entirely new version of String being created). This version uses nesting instead:

quote(String) -> [$", String, $"].

This creates three list elements no matter how long the initial string is. The first version creates length(String) + 2 elements. It's also easy to go backward and un-quote the string: just take the second list element. Once you get used to nesting you can avoid most append operations completely.
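Python has no IO lists, but the same discipline, nest now and flatten exactly once at the output boundary, can be sketched with ordinary lists (the quote and flatten helpers below are my own, not a standard API):

```python
def quote(s):
    # Nesting instead of appending: three elements, no matter how long s is.
    return ['"', s, '"']

def flatten(tree):
    # One flattening pass at the boundary, playing the role that Erlang's
    # runtime plays when it walks an IO list before writing bytes out.
    if isinstance(tree, str):
        return tree
    return "".join(flatten(t) for t in tree)

nested = ["[", quote("hello"), ", ", quote("world"), "]"]
print(flatten(nested))  # prints ["hello", "world"]
```

Un-quoting is just as cheap as in the Erlang version: take the second element, `quote("hello")[1]`, with no string surgery required.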

One thing that nested list trick is handy for is manipulating filenames. Want to add a directory name and ".png" extension to a filename? Just do this:

[Directory, $/, Filename, ".png"]

Unfortunately, filenames in the file module are not true IO lists. You can pass in deep lists, but they get flattened by an Erlang function (file:file_name/1), not the runtime system. That means you can still dodge appending lists in your own code, but things aren't as efficient behind the scenes as they could be. And "deep lists" in this case means only lists, not binaries. Strangely, these deep lists can also contain atoms, which get expanded via atom_to_list.

Ideally filenames would be IO lists, but for compatibility reasons there's still the need to support atoms in filenames. That brings up an interesting idea: why not allow atoms as part of the general IO list specification? It makes sense, as the runtime system has access to the atom table, and there's a simple correspondence between an atom and how it gets encoded in a binary; 'atom' is treated the same as "atom". I find I'm often calling atom_to_list before sending data to external ports, and that would no longer be necessary.

Tricky When You Least Expect It

Here's a problem: You've got a satellite dish that can be rotated to any absolute angle from 0 to 360 degrees. If you think of the dish as being attached to a pole sticking out of the ground, that's what the dish rotates around. Given a starting angle and a desired angle, how many degrees do you rotate the dish by?

An example should clarify this. If the initial angle is 0 degrees, and the goal is to be at 10 degrees, that's easy. You rotate by 10 degrees. If you're at 10 degrees and you want to end up at 8 degrees, then rotate -2 degrees. It looks a lot like all you have to do is subtract the starting angle from the ending angle, and that's that.

But if the starting angle is 10 and the ending angle is 350...hmmm. 350 - 10 = 340, but that's the long way around. No one would do that. It makes more sense to rotate by -20 degrees. With this in mind and some experimenting, here's a reasonable looking solution (in Erlang, but it could easily be any language):

angle_diff(Begin, End) ->
    D = End - Begin,
    DA = abs(D),
    case DA > 180 of
        true -> -(360 - DA);
        _ -> D
    end.

It seems to cover some quickie test cases, including those listed above. Now try angle_diff(270, 0). The expected answer is 90. But this function returns -90. Oops.

This is starting to sound like the introduction to a book by Dijkstra. He'd have called this problem solving method "guessing," and it's hard to disagree with that assessment. When I run into problems like this that look so simple, and I feel like I'm randomly poking at them to get the right answers, I'm always surprised. So many messy problems are solved as part of the core implementation or standard library in most modern languages, that it's unusual to run into something this subtle.

In Python or Erlang I never worry about sorting, hash functions, heap management, implementing regular expressions, fancy string comparison algorithms such as Boyer-Moore, and so on. Most of the time I write fairly straightforward code that's just basic logic and manipulation of simple data structures. Behind the scenes, that code is leaning heavily on technically difficult underpinnings, but that doesn't change how pleasant things are most of the time. Every once in a while, though, the illusion of all the hard problems being solved for me is shattered, and I run into something that initially seems trivial, yet it takes real effort to work out a correct solution.

Here's a version of the angle_diff function that handles the cases the previous version didn't:

angle_diff(Begin, End) ->
    D = End - Begin,
    DA = abs(D),
    case {DA > 180, D > 0} of
        {true, true} -> DA - 360;
        {true, _}    -> 360 - DA;
        _ -> D
    end.

Don't be surprised if it takes some thought to determine if this indeed handles all cases.
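One way to take the thought out of it is to check the cases mechanically. Here is my own direct Python port of the corrected function, run against every case mentioned above:

```python
def angle_diff(begin, end):
    # Shortest rotation, in degrees, from angle begin to angle end.
    d = end - begin
    da = abs(d)
    if da > 180:
        # Going the long way around; flip to the short direction.
        return da - 360 if d > 0 else 360 - da
    return d

# The cases from the article, including the one the first attempt botched.
for begin, end, expected in [(0, 10, 10), (10, 8, -2),
                             (10, 350, -20), (270, 0, 90)]:
    assert angle_diff(begin, end) == expected
```

Four passing cases still isn't a proof, of course, but turning each "quickie test" into a permanent assertion at least keeps the guessing honest.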

There's now a follow-up.

(If you liked this, you might like Let's Take a Trivial Problem and Make it Hard.)

What Do People Like?

I wrote Flickr as a Business Simulator in earnest, but I think it was interpreted more as a theoretical piece. When you build something with the eventual goal of releasing it to the world, the key question is "Will people like this?" And, really, you just won't know until you do it. There's nothing like the actions of tens of thousands of independently acting individuals who have no regard for your watertight theories. What Flickr provides is a way to make lots of quick little "product" releases, and see if your expectations line up with reality. Is this my primary use of Flickr? No! But the educational opportunity is there, regardless. Click on a photo to go to the Flickr page.

The rest of this entry is an annotated list of some photos I've posted to Flickr, with both my conjectures of how they'd be received and what actually happened. In each case the photo comes first with the commentary following.

I sat on this photo for a while after shooting it. I thought it was cliche, something everyone had already seen many times. A real photographer wouldn't bother to recreate such an image. Then I posted it...and it got an immediate string of comments and favorites. It's still one of my top ten overall Flickr photos according to the stats system. My invented emphasis on originality didn't matter.

I thought these skid marks in front of a local liquor superstore were photoworthy, but the result didn't grab me. Like the sunset wheat photo, it took on a life of its own on Flickr. Was the hook in the impossibility of those tire tracks? That they look like a signature? Why was I unable to see the appeal before uploading it? It even ended up--with permission--in Scott Berkun's The Myths of Innovation (O'Reilly, 2007).

This one I liked immediately. The red arrow. The odd framing. The blown-out white background that makes the rust pop. The Flickr reaction...well there wasn't one. Is the industrial decay photo niche saturated? Would it have been a hit if I worked at getting hundreds of dedicated followers first? Or maybe I like it because it's better than other photos I've taken recently, but not all that great in absolute terms?

Oh so cleverly titled "I'm Lovin' IT!" I knew this was a novelty. It pulled in some novelty linkage as a ha-ha photo of the day sort of thing. It didn't get anywhere near the exposure of that Yahoo ad next to the 404 distance on a home run fence. The traffic from "I'm Lovin' IT!" was transient, adding points to the view counter, but as they weren't Flickr users they didn't add comments or favorites. In the end it was an empty success.

Explaining Functional Programming to Eight-Year-Olds

"Map" and "fold" are two fundamentals of functional programming. One of them is trivially easy to understand and use. The other is not, but that has more to do with trying to fit it into a particular view of functional programming than with it actually being tricky.

There's not much to say about map. Given a list and a function, you create a new list by applying the same function to each element. There's even special syntax for this in some languages which removes any confusion about whether the calling sequence is map(Function, List) or map(List, Function). Here's Erlang code to increment the values in List:

[X + 1 || X <- List]

Fold, well, it's not nearly so simple. Just the description of it sounds decidedly imperative: accumulate a result by iterating through the elements in a list. It takes three parameters: a base value, a list, and a function. The last of these maps a value and the current accumulator to a new accumulator. In Erlang, here's a fold that sums a list:

lists:foldl(fun(X, Sum) -> X + Sum end, 0, List)

It's short, but it's an awkward conciseness. Now we've two places where the parameter order can be botched. I always find myself having to stop and think about the mechanics of how folding works--and the difference between left and right folding, too (lists:foldl is a left fold). I would hardly call this complicated, but that step of having to pause and run through the details in my head keeps it from being mindlessly intuitive.
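The same fold exists in Python as functools.reduce (a left fold), and the parameter-order hazard carries over, with a twist:

```python
from functools import reduce

# Note the argument order of the function: reduce passes
# (accumulator, element), while Erlang's lists:foldl passes
# (Element, Accumulator). Exactly the kind of detail that gets botched.
total = reduce(lambda acc, x: acc + x, [1, 2, 3, 4, 5, 6], 0)
print(total)  # 21
```

So not only can the base value and list be swapped within one language, the convention for the folding function itself differs between languages.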

Compare this to the analog in array languages like APL and J. The "insert" operation inserts a function between all the elements of a list and evaluates it. Going back to the sum example, it would be "+/" in J, or "insert addition." So this:

1 2 3 4 5 6

turns to this:

1 + 2 + 3 + 4 + 5 + 6

giving a result of 21. The mechanics here are so simple that you could explain them to a class of second graders and not worry about them being confused. There's nothing about iterating or accumulating or a direction of traversal or even parameters. It's just...insertion.

Now there are some edge cases to worry about, such as "What does it mean to insert a function between the elements of a list of length 1?" Or an empty list for that matter. The standard array language solution is to associate a base value with operators, like addition, so summing a list containing the single value 27 is treated as 0 + 27. I'm not going to argue that APL's insert is more general than fold, because it certainly isn't. You can do all sorts of things with the accumulator in a traditional fold (for example, computing the maximum and minimum values of a list at the same time).

But in terms of raw simplicity of understanding, insert flat-out beats fold. That raises the question: Is the difficulty many programmers have in grasping functional programming inherent in the basic concept of non-destructively operating on values, or is it in the popular abstractions that have been built up to describe functional programming?

(If you liked this, you might like Functional Programming Archaeology.)

Free Your Technical Aesthetic from the 1970s

In the early 1990s, I used Unix professionally for a few years. It wasn't the official Unix, nor was it Linux, but Sun's variant called SunOS. By "used" I mean I wrote commercial, embedded software entirely in a Unix environment. I edited 10,000+ line files in vi. Not vim. The original "one file loaded at a time" vi.

At the time, Unix felt clunky and old. I spent a lot of time in a library room down the hall, going through the shelves of manuals. It took me a long time to discover the umask command for changing the default file permissions and to understand the difference between .bashrc and .bash_profile and how to use tar.

By way of comparison, on my home PC I used a third-party command shell called 4DOS (later 4NT, and it's still available for Windows 7 as TCC LE). It had a wonderful command line history mechanism: type part of a command, then press up-arrow. The bash bang-notation felt like some weird mainframe relic. 4DOS had a built-in, full-screen text file viewer. The Unix equivalent was the minimalist less command. 4DOS help was colorful and pretty and hyperlinked. Documentation paged through as man pages was several steps backward.

The Unix system nailed the core tech that consumer-level computers were way behind on: stability and responsiveness in a networked, multitasking environment. It was ugly, but reliable.

In 2006, I got back into using Unix again (aside from some day-job stuff with Linux ten years ago) in the guise of OS X on a MacBook. The umask command is still there. Ditto for .bashrc and .bash_profile and all the odd command line switches for tar and the clunky bang-notation for history. I'm torn between wonderment that all those same quirks and design choices still live on...and shocked incredulity that all those same quirks and design choices live on.

Enough time has passed since the silly days of crazed Linux advocacy that I'm comfortable pointing out the three reasons Unix makes sense:

1. It works.
2. It's reliable.
3. It stays constant.

But don't--do not--ever make the mistake of taking those benefits as a reason to use Unix as the basis for your technical or design aesthetic. Yes, there are some textbook cases where pipelining commands together is impressive, but that's a minor point. Yes, having a small tool for a specific job sometimes works, but it just as often doesn't. ("Those days are dead and gone and the eulogy was delivered by Perl," Rob Pike, 2004.) Use Unix-like systems because of the three benefits above, and simultaneously realize that it's a crusty old system from a bygone era. If you put it up on a pedestal as a thing of beauty, you lose all hope of breaking away from a sadly outdated programmer aesthetic.

(If you liked this, you might like My Road to Erlang.)

One Small Step Toward Reducing Programming Language Complexity

I've taught Python a couple of times. Something that experience made clear to me is just how many concepts and features there are, even in a language designed to be simple. I kept finding myself saying "Oh, and there's one more thing..."

Take something that you'd run into early on, like displaying what's in a dictionary:

for key, value in dictionary.iteritems():
    print key, value

Tuples are a bit odd in Python, so I put off talking about them as long as possible, but that's what iteritems returns, so no more dodging that. There's multiple assignment, too. And what the heck is iteritems anyway? Why not just use the keys method instead? Working out a clean path that avoids constant footnotes takes some effort.
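For the record, the tuple-free path the paragraph alludes to does exist. In modern Python (version 3, where iteritems is spelled items) the two options look like this:

```python
dictionary = {"a": 1, "b": 2}

# The tuple version: items() yields (key, value) pairs, which requires
# explaining tuples and multiple assignment up front.
for key, value in dictionary.items():
    print(key, value)

# The keys-only version: iterating a dict yields its keys, so there are
# no tuples and no unpacking to footnote.
for key in dictionary:
    print(key, dictionary[key])
```

The second form trades a little elegance for one fewer "oh, and there's one more thing."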

This isn't specific to Python. Pick any language and it likely contains a larger interconnected set of features than it first appears. Languages tend to continually grow, too, so this just gets worse over time. Opportunities to reverse that trend--backward compatibility be damned!--would be most welcome. Let me propose one.

The humble string constant has a few gotchas. How to print a string containing quotes, for example. In Python that's easy: just use single quotes around the string that has double quotes in it. It's a little more awkward in Erlang and other languages. Now open the file "c:\my_projects\input.txt" under Windows. You need to type "c:\\my_projects\\input.txt", but first you've got to say "Oh, and there's one more thing" and explain how backslashes work in strings.

Which would be fine...except the backslash notation for string constants is, in the twenty-first century, an anachronism.

Who ever uses "\a" (bell)? Or "\b" (backspace)? Who even knows what "\v" (vertical tab) does? The escape sequence that gets used more than all the others combined is "\n" (newline), but it's simpler to have a print function that puts a "return" at the end and one that doesn't. Then there's "\t" (tab), but it has its own set of quirks, and it's almost always better to use spaces instead. The price for supporting a feature that few people use is core confusion about what a string literal is in the first place. "The length of "\n\n" isn't four? What?"

There's an easy solution to all of this. Strings are literal, with no escapes of any kind. Special characters are either predefined constants (e.g., TAB, CR, LF) or created through a few functions (e.g., char(Value), unicode(Name)). Normal string concatenation pastes them all together. In Python:

"Content-type: text/plain" + NL + NL

In Erlang:

"Content-type: text/plain" ++ NL ++ NL

In both cases, the compiler mashes everything together into one string. There's no actual concatenation taking place at runtime.
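You can approximate the proposal in today's Python with ordinary constants (NL and TAB here are my own names, not anything standard):

```python
# Named constants instead of backslash escapes.
NL = chr(10)   # newline
TAB = chr(9)   # horizontal tab

header = "Content-type: text/plain" + NL + NL
print(repr(header))
```

One honest caveat: unlike the hypothetical language above, CPython only folds adjacent string literals at compile time, so the concatenation with a named constant actually runs, once, when the module loads. The readability benefit is the same either way.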

Note that in Python you can get rid of backslash notation by preceding a string with the "r" character (meaning "raw"), like this:

r"c:\my_projects\input.txt"

But that adds another feature to the language, one to patch up the problems caused by the first.

(If you liked this, you might like In Praise of Non-Alphanumeric Identifiers.)

Stop the Vertical Tab Madness

In One Small Step Toward Reducing Programming Language Complexity I added "Who even knows what "\v" (vertical tab) does?" as an off-the-cuff comment. Re-reading that made me realize something that's blatantly obvious in retrospect, so obvious that I've gone all this time without noticing it:

No one is actually using the vertical tab escape sequence.

And I truly mean "no one." If I could stealthily patch the compiler for any language supporting the "\v" escape so I'd receive mail whenever it occurred in source code, then I could trace actual uses of it. I'm willing to bet that all the mail would come from beginners trying to figure out what the heck "\v" actually does, and then giving up when they realize it doesn't do anything. That's because it doesn't do anything, except with some particular printers and terminal emulators, and in those cases you're better off not relying on it anyway.

And yet this crazy old feature, one that no one understands or uses, one that doesn't even do anything in most cases, not only gets special syntax in modern programming languages, it's consistently given space in the documentation, even in tutorials. It's in the official Lua docs. It's in MIT web course notes about printf. It's in the books Programming in Python 3 and Python Essential Reference. It's in an introduction to Python strings. It's in the standard Erlang documentation, too.

[Insert conspiracy theory involving Illuminati here.]

Here's my simple plea: stop it. Stop mentioning vertical tabs in tutorials and language references. Drop the "\v" sequence in all future programming languages. Retroactively remove it from recent languages, like Python 3. Yes, ASCII character number 11 isn't going away, but there's no reason to draw attention to a relic of computing past.

Surprisingly, the "\v" sequence was removed from one language during the last decade: Perl. And even more surprisingly, there's a 2010 proposal to add escaped character sequences, including vertical tab, to the famously minimal language, Forth.

(If you liked this, you might like Kilobyte Constants, a Simple and Beautiful Idea that Hasn't Caught On.)

Personal Programming

I've mentioned before that this site is generated by a small Perl script. How small? Exactly 6838 bytes, which includes comments and an HTML template. Mentioning Perl may horrify you if you came here to read about Erlang, but it's a good match for the problem. Those 6838 bytes have been so pleasant to work on that I wanted to talk about them a bit.

I've used well-known blogging applications, and each time I've come away with the same bad taste, one that's caused by a combination of quirky rich formatting and having to edit text in a small window inside of a browser. I don't want to worry about presentation details: choosing fonts, line spacing, etc. It's surprising how often there are subtle mismatches between the formatting shown in a WYSIWYG editing window and what the final result looks like. Where did that extra blank line come from? Why do some paragraphs have padding below them but others don't?

I decided to see if I could bypass all of that and have a folder of entries marked-up with basic annotations, then have a way to convert that entire folder into a real site. And that's pretty much what I ended up with. The sidebar and "Previously" list and dated permalink are all automatically generated. Ditto for the atom feed and archives page. The command-line formatter lets me rebuild any page, defaulting to the newest entry. If I want to change the overall layout of the site, I can regenerate all of the pages in a second or so.

There are still legitimate questions about the path I chose. "Why not grab an open source program and modify it to fit your needs?" "You do realize that you decided to write an entirely new system from scratch, just because you didn't like a few things in existing programs; surely that's a serious net loss?"

My response is simple: I did it because it was easy. If I get annoyed with some feature of Microsoft Word, I'm not going to think even for a second about writing my own alternative. But the site generation program just took a bit of tinkering here and there over the course of a weekend. It never felt daunting. It didn't require anything I'd label as "engineering." I spent more time getting the site design and style sheet right, something I would have done even if I used other software.

Since then, I've made small adjustments and additions to the original script. I added the archive page nine months later. Earlier this year I added a fix for some smaller feed aggregation sites that don't properly handle relative links. Just today I added mark-up support for block quotes. That last one took ten minutes and four lines of code.

If this suddenly got complicated, if I needed to support reader comments and ten different feed formats and who knows what else, I'd give it up. I have no interest in turning these 6838 bytes into something that requires a grand architecture to keep from collapsing. But there's some magic in a solution that's direct, reliable, easy to understand, and one that fits my personal vision of how it should work.

(If you liked this, you might like Micro-Build Systems and the Death of a Prominent DSL.)

Common Sense, Part 1

There's a photo of mine in the September 2010 issue of Popular Photography. I'm excited about it; my photo credits are few and far between, and it brings back the feelings I had when I wrote for magazines long ago. Completely ignoring the subject of the image, there are a couple of surprising facts about it.

The first is that it was taken on a circa-2004 Canon PowerShot G5, a camera with a maximum resolution of five megapixels.

The second is that it's a doubly-compressed JPEG. The original photo was a JPEG, then I adjusted the colors and contrast a bit, and saved it out as a new JPEG. Each save lost some of the image quality. I was perfectly willing to submit the adjusted photo as a giant TIFF to avoid that second compression step, but was told not to worry about it; the JPEG would be fine.

Yet there it is: the five megapixel, doubly-compressed photo, printed across almost two pages of the magazine. And those two technical facts are irrelevant. I can't tell the difference; it looks great in print.

Now it is an impressionistic shot, so it could just be that the technical flaws aren't noticeable in this case. Fortunately, I have another anecdote to back it up.

Last year I was in New Mexico and took a lot of photos. After I got back home, I decided to get a few photo books printed. The source images were all twelve megapixel JPEGs, but the book layout software recommended a six megapixel limit. I cut the resolution in half, again twice-compressing them. When I got the finished books back, the full-page photos were sharp and beautiful.

The standard, pedantic advice about printing photos is that resolution is everything. Shoot as high as possible. Better yet, save everything as RAW files, so there's no lossy compression. Any JPEG compression below maximum is unacceptable. Double-compression is an error of the highest order, one only made by rank amateurs. And so it goes. But I know from personal experience that while it sounds authoritative, and while it's most likely given in a well-meaning manner, it's advice that's endlessly repeated in a loose, "how could it possibly be wrong?" sort of way and never actually tested.

Erlang vs. Unintentionally Purely Functional Python

Here's a little Python function that should be easy to figure out, even if you don't know Python:

def make_filename(path):
    return path.lower() + ".jpg"

I want to walk through what's going on behind the scenes when this function executes. There is, of course, a whole layer of interpreting opcodes and pushing and popping parameters, but that's just noise.

The first interesting part is that the lower() method creates an entirely new string. If path contains a hundred characters, then all hundred of those are copied to a new string in the process of being converted to lowercase.

The second point of note is that the append operation--the plus--is doing another copy. The entire lowercased string is moved to a new location, then the four character extension is tacked on to the end. The original path has now been copied twice.

Those previous two paragraphs gloss over some key details. Where is the memory for the new strings coming from? It's returned by a call to the Python memory allocator. As with all generic heap management functions, the execution time of the Python memory allocator is difficult to predict. There are various checks and potential fast paths and manipulations of linked lists. In the worst case, the code falls through into C's malloc and the party continues there. Remember, too, that objects in Python have headers, which include reference counts, so there's more overhead that I've ignored.

Also, the result of lower gets thrown out after the subsequent concatenation, so I could peek into the "release memory" routine and see what's going on down in that neck of the woods, but I'd rather not. Just realize there's a lot of work going on inside the simple make_filename function, even if the end result still manages to be surprisingly fast.
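The copying itself is easy to observe from within Python: each step hands back a distinct string object rather than a view into the original (CPython behavior; the intermediate from lower is the one that gets thrown away):

```python
def make_filename(path):
    return path.lower() + ".jpg"

path = "Holiday/IMG_1024"       # hypothetical example path
lowered = path.lower()          # first copy: a brand-new string
result = lowered + ".jpg"       # second copy: lowered is copied again

assert lowered is not path      # distinct objects, not shared storage
assert result is not lowered
print(result)                   # prints holiday/img_1024.jpg
```

Strings being immutable, there is no in-place alternative here; every transformation pays for a fresh allocation.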

A popular criticism of functional languages is that a lack of mutable variables means that data is copied around, and of course that just has to be slow, right?

A literal Erlang translation of make_filename behaves about the same as the Python version. The string still gets copied twice, though in Erlang it's a linked list which uses eight bytes per character, given a 32-bit build of the language. If the string in Python is UTF-8 (the default in Python 3), then it's somewhere between 1 and 4 bytes per character, depending. The big difference is that memory allocation in Erlang is just a pointer increment and bounds check, and not a heavyweight call to a heap manager.

I'm not definitively stating which language is faster for this specific code, nor does it matter to me. I suspect the Erlang version ends up running slightly longer, because the lowercase function is itself written in Erlang, while Python's is in C. But all that "slow" copying of memory isn't even part of the performance discussion.

(If you liked this, you might like Functional Programming Went Mainstream Years Ago.)

Advice to Aimless, Excited Programmers

I occasionally see messages like this from aimless, excited programmers:

Hey everyone! I just learned Erlang/Haskell/Python, and now I'm looking for a big project to write in it. If you've got ideas, let me know!

or

I love Linux and open source and want to contribute to the community by starting a project. What's an important program that only runs under Windows that you'd love to have a Linux version of?

The wrong-way-aroundness of these requests always puzzles me. The key criterion is a programming language or an operating system or a software license. There's nothing about solving a problem or overall usefulness or any relevant connection between the application and the interests of the original poster. Would you trust a music notation program developed by a non-musician? A Photoshop clone written by someone who has never used Photoshop professionally? But I don't want to dwell on the negative side of this.

Here's my advice to people who make these queries:

Stop and think about all of your personal interests and solve a simple problem related to one of them. For example, I practice guitar by playing along to a drum machine, but I wish I could have human elements added to drum loops, like auto-fills and occasional variations and so on. What would it take to do that? I could start by writing a simple drum sequencing program--one without a GUI--and see how it went. I also take a lot of photographs, and I could use a tagging scheme that isn't tied to a do-everything program like Adobe Lightroom. That's simple enough that I could create a minimal solution in an afternoon.

The two keys: (1) keep it simple, (2) make it something you'd actually use.

Once you've got something working, then build a series of improved versions. Don't create pressure by making a version suitable for public distribution, just take a long look at the existing application, and make it better. Can I build an HTML 5 front end to my photo tagger?

If you keep this up for a couple of iterations, then you'll wind up an expert. An expert in a small, tightly defined problem domain, maybe one relevant only to you, yes, but an expert nonetheless. There's a very interesting side effect to becoming an expert: you can start experimenting with improvements and features that would have previously looked daunting or impossible. And those are the kind of improvements and features that might all of a sudden make your program appealing to a larger audience.

A Concurrent Language for Non-Concurrent Software

Occasionally I get asked why, as someone who uses Erlang extensively, I rarely talk about concurrency.

The answer is because concurrency is not my primary motivation for using Erlang.

Processes themselves are wonderful, and I often use them as a way to improve modularity. Rather than passing the state of the world all over the place, I can spin off processes that capture a bit of it. This works surprisingly well, but it's just a coarser-grained version of creating objects in Python or other languages. Most of the time when I send a message to another process, my code sits and waits for the result to come back, which is hardly "concurrency."

Suppose Erlang didn't have processes at all. Is there still anything interesting about the language? To me, yes, there is. I first tried functional programming to see if I could think at a higher level, so I could avoid a whole class of concerns that I was tired of worrying about. Erlang is further down the purely functional road than most languages, giving the benefits that come with that, but at the same time there's a divergence from the hardcore, theoretical beauty of Haskell. There's no insistence on functions taking a single value, and there isn't a typing-first viewpoint. The result is being able to play fast and loose with a handful of data types--especially atoms--and focus on how to arrange and rearrange them in useful ways.

(Okay, there are some small things I like about Erlang too, such as being able to introduce named values without creating a new scope that causes creeping indentation. It's the only functional language I've used that takes this simple approach.)

The knack for writing code that doesn't involve micromanaging destructive updates takes some time to sink in. Possibly too long--a cost almost always glossed over when presenting a pathologically beautiful one-liner that makes functional programming look casually effortless. There are a number of techniques that aren't obvious, that aren't demonstrated in tutorials. I wrote about one in 2007. And here's another:

Lists in Erlang--and Haskell and Scheme--are singly-linked. Given a list, you can easily get the next element. Getting the previous element looks impossible; there's no back pointer to follow. But that's only true if you're looking at the raw definition of lists. It's easy if you add some auxiliary data. When you step forward, remember the element you just moved away from. When you step back, just grab that element. The data structure looks like this:

{Previous_Items, Current_List}

To move through a list, start out with {[], List}. You can step forward and back with two functions:

forward({Prev, [H|T]}) ->
    {[H|Prev], T}.

back({[H|T], L}) ->
    {T, [H|L]}.

Wait, isn't that cheating? Creating a new list on the fly like that? No, that's the point of being free from thinking about managing memory or even instantiating classes.
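For comparison, here's the same zipper idea sketched in Python, with lists standing in for the linked lists. This is an illustration of mine, not anything from a library:

```python
def forward(zipper):
    # step ahead, remembering the element we just moved past
    prev, rest = zipper
    return ([rest[0]] + prev, rest[1:])

def back(zipper):
    # step back by grabbing the most recently remembered element
    prev, rest = zipper
    return (prev[1:], [prev[0]] + rest)
```

Start with `([], items)` and the two functions move a cursor back and forth without any back pointers.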

This Isn't Another Quick Dismissal of Visual Programming

I stopped following technical forums for three reasons: pervasive negativity, waning interest on my part, and I realized I could predict the responses to most questions. "I bet this devolves into a debate about the validity of the singleton pattern." *click* "Ha! I knew it! Wait...why am I wasting my time on this?"

"Visual programming" is one of those topics that gets predictable responses. I don't mean "visual" in the GUI-design sense of the word, like Visual C++. I mean building programs without creating text files, programming where the visual component is the program.

Not too surprisingly, this is a subject that brings on the defensiveness. It's greeted with diatribes about "real" programming and how drawing lines to connect components is too limiting and links to articles about the glory of raw text files. At one time I would have agreed, but more and more it's starting to smack of nostalgic "that's how it's always been done"-ness.

In Microsoft's old QBasic, programs were still text files, but you could choose to view them as a grid of subroutine names. Click one and view code for that function--and only that function. It was very different than moving around within a single long document. There was the illusion that each function was its own entity, that a program was a collection of smaller parts. Behind the scenes, of course, all of those functions were still part of a single text file, but no matter. That a program was presented in a clear, easily navigable way changed everything. Was this visual programming? No, but it was a step away from thinking about coding as editing text files.

Take that a bit further: what if a visual representation of a program made it easier to understand, given code you've never seen before? To get in the proper frame of mind, go into the Erlang distribution and load up one of the compiler modules, like beam_jump.erl or beam_block.erl. Even with all the comments at the top of the former, just coming to grips with the overall flow of the module takes some effort. I'm not talking about fully understanding the logic, but simply being able to get a picture of how data is moving through the functions. Wouldn't a nice graphical representation make that obvious?

Neither of these examples is visual programming. I'd call them different forms of program visualization. Regardless, I can see the benefits, and I suspect there are larger gains to be had with other non-textual ways of dealing with code. I don't want to block out those possibilities because I'm too busy wallowing in my text-file comfort zone.

Easy to Please

I have favorited over seven thousand photos on Flickr.

"Favoriting" is not a valuable currency. Clicking the "Add to Faves" icon means I like a photo, I'm inspired by it, and I want to let the photographer know this. Doing so doesn't cost me anything, and it doesn't create any kind of tangible reward for who took the photo. As such, I don't feel a need to be tight-fisted about it. If I enjoy a photo, I favorite it. For photos that don't grab me, I don't do anything. I look at them, but "Add to Faves" remains unclicked. There's no negativity, no reason to bash them.

With over seven thousand favorites, I'm either easy to please or have low standards--or both.

I prefer "easy to please," but the "low standards" angle is an interesting one. There's a Macintosh rumors site that lets users rate stories as positive or negative. Here's a story from earlier this year that I can't imagine causing any bitterness: "iPhone 4 Available for Pre-Order from Apple." It has 545 negative votes. Who are these people? I don't mean "people who don't like or own an iPhone," I mean "people who would follow an Apple rumors site and down-vote the announcement of a flagship product that's both exciting and delivers on its promises." Clearly those people need to have lower standards or they'll never be happy.

I'll let you in on a secret: it's okay to not think about technology on an absolute scale. And it's also okay to use two competing technologies without being focused on which is better.

My dad bought an Atari 800 when I was 14, so that's what I used to write my own games. He later got an Apple //e, and there was a stark difference between the two computers. The Atari was full of colors and sounds and sprites. The Apple was full of colors, but it was connected to a monochrome monitor that only showed green so I couldn't experience the others. Sounds? Funny clicks and buzzes. Hardware sprites? No. But it had its own charms, and I wrote a couple of games for it, too.

I like using Erlang to solve problems. I also like using Perl. And C. And Python. It might look like there's some serious internal inconsistency going on, and I guess that's true. I think about functional programming when using Erlang. It's a hybrid of procedural and functional approaches with Perl and Python. In C I think about pointers and how memory is arranged. Is one of these approaches superior? Sometimes. And other times a different one is.

I'm not even sure that many extreme cases of putting down technologies hold water. The anti-PHP folks seem rather rabid, but that's not stopping people from happily using it. Despite the predictable put-downs of BASIC, I'm fond of BlitzMax, which is an oddball BASIC built around easy access to graphics and sound. Everyone makes fun of Windows, but it's an understatement to say that it clearly works and is useful (and I'm saying that even though I'm typing on a MacBook).

(If you liked this, you might like Slumming with BASIC Programmers.)

Good-Bye to the Sprawling Suburbs of Screen Space

Application platforms are going off in two completely different directions. Desktop monitors keep getting bigger--using two or three monitors at once isn't the rarity that it once was--and then there's the ultra-portable end of things: iPhone, iPad, Blackberry, Nintendo DS.

The common reaction to this is that large monitors are good for programmers and musicians and Photoshop users--people doing serious work--and portable devices let you take care of some common tasks when away from your desk.

I don't see it that way at all. The freedom to do creative work when and where I want to, and not be tied to a clunky computer at a desk, has great appeal. Notebooks are a good start, but they're still rooted in the old school computer world: slow booting, too much emphasis on mental noise (menus, moving windows around, pointless manipulation of items on a virtual desktop). And there are the beginnings of people using portable devices for real work. At least two people have made serious attempts at writing novels on their iPhones. Check out what a professional photographer can do with an older model iPhone (see "iPhone as Art" in the "Portfolios" section).

A great divide is opening up between UI design for desktops and for ultra-portables. Desktop UIs need all the surface area they can get, with more stuff shown at once, with more docked or floating palettes. Just look at any music production software. Or any graphic arts package. None of these interfaces translates over to a pocket-sized LCD screen.

There's truth to the human interface guideline that flat is (often) better than nested. That's why the tool ribbons in Microsoft Office take less fumbling than a set of nested menus, and why easy-to-use websites keep most options right there in front of you, ready to click. Flat, not surprisingly, takes more screen real estate. But it has also become too easy to take advantage of all those pixels on a large monitor simply because they exist. I can't help but see some interfaces as the equivalent of the "data dump" style of PowerPoint presentation (see Presenting to Win by Jerry Weissman). The design choices are not choices so much as simply deciding to show everything: as many toolbars and inspectors and options as possible. If it's too much, let the user sort it out. Make things float or dock and provide customization settings and everyone is running at 1600-by-something resolution on a 19+ inch monitor anyway.

Except on an iPhone.

I don't have an all-encompassing solution for how to make giant-interface apps run on a screen that fits in your pocket, but simply having that as a goal changes things. Instead of being enslaved to cycles and bytes--which rarely make or break programs in the way that optimization-obsessed developers long for--there's a much more relevant limited resource to conserve: pixels. How to take the goings-on of a complex app and present them on a screen with only a few square inches of space?

(If you liked this, you might like Optimizing for Fan Noise.)

Learning to Ignore Superficially Ugly Code

Back when I was in school and Pascal was the required language for programming assignments, I ran across a book by Henry Ledgard: Professional Pascal. Think of it as the 1980s version of Code Complete. This was my first exposure to concerns of code layout, commenting style, and making programs look pretty. At the time the advice and examples resonated with me, and for a long time afterward I spent time adjusting and aligning my code so it was aesthetically pleasing, so the formatting accentuated structural details.

To give a concrete example, I'd take this:

win_music_handle = load_music("win.aiff");
bonus_music_handle = load_music("bonus.aiff");
high_score_music_handle = load_music("highscore.aiff");

and align it like this:

win_music_handle        = load_music("win.aiff");
bonus_music_handle      = load_music("bonus.aiff");
high_score_music_handle = load_music("highscore.aiff");

The theory was that this made the repeated elements obvious, that at a quick glance it was easy to see that three music files were being loaded. Five or six years ago I wrote a filter that takes a snippet of Erlang code and vertically aligns the "->" arrows. I still have it mapped to ",a" in vim.
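The filter itself isn't shown, but a minimal version is easy to imagine. Here's a hypothetical Python sketch that pads each line so the first -> in every line starts in the same column:

```python
def align_arrows(lines):
    # column where the right-most "->" currently begins
    col = max((line.find('->') for line in lines if '->' in line), default=0)
    out = []
    for line in lines:
        i = line.find('->')
        if i >= 0:
            # pad everything before the arrow out to that column
            line = line[:i].ljust(col) + line[i:]
        out.append(line)
    return out
```

A real vim filter would read the selected lines from stdin and write the result to stdout; that plumbing is omitted here.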

More and more, though, I'm beginning to see code aesthetics as irrelevant, as a distraction. After all, no one cares what language an application was written in, and certainly no one cares about the way a program was architected. How the code actually looks is far below either of those. Is there a significant development-side win to having pleasingly formatted code? Does it make any difference at all?

In the manual for Eric Isaacson's A86 assembler (which I used in the early 1990s), he advises against the age-old practice of aligning assembly code into columns, like this:

add     eax, 1
sub     eax, ebx
call    squish
jc      error

His view is that the purpose of a column is so you can scan down it quickly, and there's no reason you'd ever want to scan down a list of unrelated operands. The practice of writing columnar code comes from ancient tools that required textual elements to begin in specific column numbers. There's a serious downside to this layout methodology, too: you have to take the time to make sure your code is vertically aligned.

How far to take a lack of concern with layout aesthetics? Here's a quick test. Imagine you're looking for an error in some code, and have narrowed it down to a single character in this function:

void amazing_adjustment(int amount)
{
int temp1 = Compartment[1].range.low;

int temp_2 = Compartment[ 2 ].range.low;
int average = (temp1 + temp_2)/2;
Global_Adjustment =
average;
}

The fix is to change the first index from 1 to 0. Now when you were in there, would you take the time to "correct" the formatting? To adjust the indentation? To remove the inconsistencies? To make the variable names more meaningful? Or would you just let it slide, change the one character, and be done with it? If you do decide to make those additional fixes, who is actually benefiting from them?

(If you liked this, you might like Macho Programming.)

Instant-On

"Mobile" is the popular term used to describe devices like the iPhone and iPad. I prefer "instant-on." Sure, they are mobile, but what makes them useful is that you can just turn them on and start working. All the usual baggage associated with starting-up a computer--multiple boot sequences that add up to a minute or more of time, followed by a general sluggishness while things settle down--are gone.

What's especially interesting to me is that instant-on is not new, not by any means, but it was set aside as a goal, even considered the impossible stuff of fantasy.

Turn on any 1970s-era calculator. It's on and usable immediately.

Turn on any 1970s or 1980s game console. It's on and usable immediately.

Turn on any 8-bit home computer. Give it a second or two, and there's the BASIC prompt. You can start typing code or use it as a fancy calculator (a favorite example of Jef Raskin). To be fair, it wasn't quite so quick as soon as you started loading extensions to the operating system from a floppy disc (such as Atari DOS).

That it got to where it wasn't unusual for a PC to take from ninety seconds to two minutes to fully boot up shows just how far things had strayed from the simple, pleasing goal of instant-on. Yes, operating systems were bigger and did more. Yes, a computer from 2000 was so much more powerful than one from 1985. But those long boot times kept them firmly rooted in the traditional computer world. They reveled in being big iron, with slow self-testing sequences and disjoint flickering between different displays of cryptic boot messages.

And now, thankfully, instant-on is back. Maybe not truly instant; there's still a perceived start-up time on an iPad. But it's short enough that it doesn't get in the way, that by the time you've gotten comfortable and shifted into the mindset for your new task the hardware is ready to use. That small shift from ninety seconds to less than ten makes all the difference.

(If you liked this, you might like How Much Processing Power Does it Take to be Fast?.)

Write Code Like You Just Learned How to Program

I'm reading Do More Faster, which is more than a bit of an advertisement for the TechStars start-up incubator, but it's a good read nonetheless. What struck me is that several of the people who went through the program, successfully enough to at least get initial funding, didn't know how to program. They learned it so they could implement their start-up ideas.

Think about that. It's like having a song idea and learning to play an instrument so you can make it real. I suspect that the learning process in this case would horrify most professional musicians, but that horror doesn't necessarily mean that it's a bad idea, or that the end result won't be successful. After all, look at how many bands find success without the benefit of a degree in music theory.

I already knew how to program when I took an "Intro to BASIC" class in high school. One project was to make a visual demo using the sixteen-color, low-res mode of the Apple II. I quickly put together something algorithmic, looping across the screen coordinates and drawing lines and changing colors. It took me about half an hour to write and tweak, and I was done.

I seriously underestimated what people would create.

One guy presented this amazing demo full of animation and shaded images. I'm talking crazy stuff, like a skull that dripped blood from its eye into a rising pool at the bottom of the screen. And that was just one segment of his project. I was stunned. Clearly I wasn't the hotshot programmer I thought I was.

I eventually saw the BASIC listing for his program. It was hundreds and hundreds of lines of statements to change colors and draw points and lines. There were no loops or variables. To animate the blood he drew a red pixel, waited, then drew another red pixel below it. All the coordinates were hard-coded. How did he keep track of where to draw stuff? He had a piece of graph paper that he updated as he went.

My prior experience hurt me in this case. I was thinking about the program, and how I could write something that was concise and clean. The guy who wrote the skull demo wasn't worried about any of that. He didn't care about what the program looked like or how maintainable it was. He just wanted a way to present his vision.

There's a lesson there that's easy to forget--or ignore. It's extremely difficult to be simultaneously concerned with the end-user experience of whatever it is that you're building and the architecture of the program that delivers that experience. Maybe impossible. I think the only way to pull it off is to simply not care about the latter. Write comically straightforward code, as if you just learned to program, and go out of your way to avoid wearing any kind of software engineering hat--unless what you really want to be is a software engineer, and not the designer of an experience.

(If you liked this, you might like Coding as Performance.)

A Three-Year Retrospective

This is not a comprehensive index, but a categorization of some of the more interesting or well-received entries from November 2007 through December 2010. Feel free to dig through the archives if you want everything. Items within each section are in chronological order.

popular

functional programming

Erlang

personal

progress

J

Forth

other entries that I think were successful

Accidental Innovation, Part 1

In the mid-1980s I was writing 8-bit computer games. Looking back, it was the epitome of indie. I came up with the idea, worked out the design, drew the art, wrote the code, made the sound effects, all without any kind of collaboration or outside opinions. Then I'd send the finished game off to a magazine like ANALOG Computing or Antic, get a check for a few hundred dollars, and show up in print six months or a year later.

Where I got the ideas for those games is a good question. I was clearly influenced by frequent visits to arcades, but I also didn't want to just rip off designs and write my own versions of the games I played there.

I remember seeing screenshots of some public domain games that appeared to have puzzle elements, though I never actually played them so I didn't know for sure. The puzzley aspects may have been entirely in my head. I had a flash of an idea about arranging objects so that no two of the same type could touch. Initially those objects were oranges, lemons, and limes, which made no sense, and there was still no gameplay mechanic. Somewhere in there I hit upon the idea of dropping the fruits in a pile and switched the theme to be about cans of radioactive waste that would hit critical mass if the same types were near each other for more than a moment.

Here's what I ended up with, designed and implemented in five days, start to finish: Uncle Henry's Nuclear Waste Dump.

Missing image: Uncle Henry's Nuclear Waste Dump

At first glance, this may look like any of the dozens of post-Tetris puzzle games. There's a pit that you drop stuff into, and you move back and forth over the top of it. Stuff piles up at the bottom, and the primary gameplay is in deciding where to drop the next randomly-determined item. It's all very familiar, and except that the goal is reversed--to fill the pit instead of preventing it from filling--there's nothing remarkable here.

Except that Tetris wasn't released in the United States until 1987 and Uncle Henry's Nuclear Waste Dump was published in 1986. I didn't even know what Tetris was until a few years later.

Was Uncle Henry's as good as Tetris? No. Not a chance. I missed the obvious idea of guiding each piece as it fell and instead used the heavy-handed mechanic of releasing the piece before a timer expired (that's the number in the upper right corner). And keeping three waste types separated wasn't nearly as addictive as fitting irregular shapes together. Overall the game wasn't what it could have been, and yet it's interesting that it's instantly recognizable as having elements of a genre that didn't exist at the time.

So close.

(While looking for the above screenshot, I found that someone wrote a clone in 2006. The part about the game ending when the pile gets too high sounds like the gameplay isn't exactly the same, but the countdown timer for dropping a can is still there.)

Part 2

Accidental Innovation, Part 2

In 1995 I was writing a book, a collection of interviews with people who wrote video and computer games in the 1980s. I had the inside track on the whereabouts of many of those game designers--this was before they were easy to find via Google--and decided to make use of that knowledge. But at the time technology books that weren't "how to" were a tough sell, so after a number of rejections from publishers I set the project aside.

A year later, my wife and I were running a small game development company, and those interviews resurfaced as a potential product. We were already set up to handle orders and mail games to customers, so the book could be entirely digital. But what format to use? PDF readers were clunky and slow. Not everyone had Microsoft Word. Then it hit me: What about using HTML as a portable document format? I know, I know, it's designed to be a portable document format, but I was thinking of it as an off-line format, not just for viewing websites. I hadn't seen anyone do this yet. It was before HTML became a common format for documentation and help files.

And so for the next couple of years people paid $20 for Halcyon Days: Interviews with Classic Computer and Video Game Programmers. Shipped via U.S. mail. On a 3 1/2 inch floppy disc. Five years later I put it on the web for free.

Even though I made the leap of using HTML for off-line e-books, and web browsers as the readers, I still didn't realize how ubiquitous HTML and browsers would become. I don't remember the details of how it happened, but I asked John Romero to write the introduction, which he enthusiastically did. I mentioned that I was looking for a distribution format for those people who didn't use browsers, and his comment was (paraphrased): "Are you crazy! Don't look backward! This is the future!"

Obviously, he was right.

Part 3

Accidental Innovation, Part 3

I didn't write the previous two installments so I could build up my ego. I wanted to give concrete examples of innovation and the circumstances surrounding it, to show that it's not magic or glamorous, to show that innovation is more than sitting down and saying "Okay, time to innovate!"

It's curious how often the "innovative" stamp is applied to things that don't fit any kind of reasonable definition of the word. How many times have you seen text like this on random company X's "About" page:

We develop innovative solutions which enable enterprise-class cloud computing infrastructure that...something something synergy...something something "outside the box."

I've seen that enough that I've formulated a simple rule: If you have to say that you're innovating, then you're not. Or in a less snarky way: Innovation in itself is an empty goal, so if you're using it in the mission statement for the work you're doing, then odds are the rest of the mission statement is equally vacant.

Really, the only way to innovate is to do so accidentally.

In both the examples I gave, I wasn't thinking about how to do things differently. I was thinking about how to solve a problem and only that problem. The results ended up being interesting because I didn't spend all my time fixated on what other people had done, to the point where that's all I could see. If I started designing a puzzle game in 2011, I'd know all about Tetris and all the knock-offs of Tetris and all the incremental changes and improvements that stemmed from Tetris. It would be difficult to work within the restrictions of the label "puzzle game" and come up with something that transcends the boundaries of those restrictions.

Suppose it's the late 1990s, and your goal is to design a next generation graphical interface for desktop PCs--something better than Windows. Already you're sunk, because you're looking at Windows, you're thinking about Windows, and all of your decisions will be colored by a long exposure to Windows-like interfaces. There are icons, a desktop, resizable windows, some kind of task bar, etc. What you end up with will almost certainly not be completely identical to Windows, but it won't be innovative.

Now there are some interesting problems behind that vague goal of building a next generation GUI. The core question is how to let the user run multiple applications at the same time and switch between them. And that question has some interesting and wide-ranging answers. You can see the results of some of those lines of thinking in current systems, such as doing away with the "app in a movable window" idea and having each application take over the entire screen. Then the question becomes a different one: How to switch between multiple full-screen apps? This is all very different than starting with the desktop metaphor and trying to morph it into something innovative.

(If you liked this, you might like How to Think Like a Pioneer.)

Exploring Audio Files with Erlang

It takes surprisingly little Erlang code to dig into the contents of an uncompressed audio file. And it turns out that three of the most common uncompressed audio file formats--WAV, AIFF, and Apple's CAF--all follow the same general structure. Once you understand the basics of one, it's easy to deal with the others. AIFF is the trickiest of the three, so that's the one I'll use as an example.

First, load the entire file into a binary:

load(Filename) ->
    {ok, B} = file:read_file(Filename),
    B.

There's a small header: four characters spelling out "FORM", a length which doesn't matter, then four more characters spelling out "AIFF". The interesting part is the rest of the file, so let's just validate the header and put the rest of the file into a binary called B:

<<"FORM", _:32, "AIFF", B/binary>> = load(Filename).

The "rest of file" binary is broken into chunks that follow a simple format: a four character chunk name, the length of the data in the chunk (which doesn't include the header), and then the data itself. Here's a little function that breaks a binary into a list of {Chunk_Name, Contents} pairs:

chunkify(Binary) -> chunkify(Binary, []).
chunkify(<<N1,N2,N3,N4, Len:32,
           Data:Len/binary, Rest/binary>>, Chunks) ->
    Name = list_to_atom([N1,N2,N3,N4]),
    chunkify(adjust(Len, Rest), [{Name, Data}|Chunks]);
chunkify(<<>>, Chunks) ->
    Chunks.

Ignore the adjust function for now; I'll get back to that.

Given the results of chunkify, it's easy to find a specific chunk using lists:keyfind/3. Really, though, other than to test the chunkification code, there's rarely a reason to iterate through all the chunks in a file. It's nicer to return a function that makes lookups easy. Replace the last line of chunkify with this:

fun(Name) ->
    element(2, lists:keyfind(Name, 1, Chunks)) end.

The key info about sample rates and number of channels and all that is in a chunk called COMM, and now we've got an easy way to get at and decode that chunk:

Chunks = chunkify(B).
<<Channels:16, Frames:32,
  SampleSize:16,
  Rate:10/binary>> = Chunks('COMM').

The sound samples themselves are in a chunk called SSND. The first eight bytes of that chunk don't matter, so to decode that chunk it's just:

<<_:8/binary, Samples/binary>> = Chunks('SSND').

Okay, now the few weird bits of the AIFF format. First, if the size of a chunk is odd, then there's one extra pad byte following it. That's what the adjust function is for. It checks if a pad byte exists and removes it before decoding the rest of the binary. The second quirk is that the sample rate is encoded as a ten-byte extended floating point value, and most languages don't have support for them--including Erlang. There's an algorithm in the AIFF spec for encoding and decoding extended floats, and I translated it into Erlang.
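For comparison, the same extended-float decoding is easy to sketch in a language with big integers. Here's a hypothetical Python version (the name ext_to_float is mine), assuming a positive, normalized value, which covers all the standard sample rates:

```python
def ext_to_float(b):
    # 10 bytes, big-endian: 1 sign bit + 15-bit biased exponent,
    # then a 64-bit mantissa with an explicit leading 1 bit.
    exponent = ((b[0] & 0x7F) << 8) | b[1]
    mantissa = int.from_bytes(b[2:10], "big")
    # The mantissa represents a value in [1, 2) scaled by 2^63,
    # and the exponent bias is 16383.
    return mantissa * 2.0 ** (exponent - 16383 - 63)
```

For example, a 44100 Hz rate is stored as exponent bytes 0x40, 0x0E followed by a mantissa beginning 0xAC44.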

Here's the complete code for the AIFF decoder:

load_aiff(Filename) ->
    <<"FORM", _:32, "AIFF", B/binary>> = load(Filename),
    Chunks = chunkify(B),
    <<Channels:16, Frames:32, SampleSize:16, Rate:10/binary>> =
        Chunks('COMM'),
    <<_:8/binary, Samples/binary>> = Chunks('SSND'),
    {Channels, Frames, SampleSize, ext_to_int(Rate), Samples}.

chunkify(Binary) -> chunkify(Binary, []).
chunkify(<<N1,N2,N3,N4, Length:32,
           Data:Length/binary, Rest/binary>>, Chunks) ->
    Name = list_to_atom([N1,N2,N3,N4]),
    chunkify(adjust(Length, Rest), [{Name, Data}|Chunks]);
chunkify(<<>>, Chunks) ->
    fun(Name) -> element(2, lists:keyfind(Name, 1, Chunks)) end.

adjust(Length, B) ->
    case Length band 1 of
        1 -> <<_:8, Rest/binary>> = B, Rest;
        _ -> B
    end.

ext_to_int(<<_, Exp, Mantissa:32, _:4/binary>>) ->
    ext_to_int(30 - Exp, Mantissa, 0).
ext_to_int(0, Mantissa, Last) ->
    Mantissa + (Last band 1);
ext_to_int(Exp, Mantissa, _Last) ->
    ext_to_int(Exp - 1, Mantissa bsr 1, Mantissa).

load(Filename) ->
    {ok, B} = file:read_file(Filename),
    B.

WAV and CAF both follow the same general structure of a header followed by chunks. WAV uses little-endian values, while the other two are big-endian. CAF doesn't have chunk alignment requirements, so that removes the need for adjust. And fortunately it's only AIFF that requires that ugly conversion from extended floating point in order to get the sample rate.
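To show how little changes for WAV, here's a hypothetical Python version of the chunkifier (the name chunkify_wav and the dict return type are my own choices); the only substantive differences from the AIFF logic are the "RIFF"/"WAVE" header and the little-endian lengths:

```python
import struct

def chunkify_wav(data):
    # Validate the outer header: "RIFF", a length we don't care about, "WAVE".
    riff, _length, wave = struct.unpack_from("<4sI4s", data, 0)
    assert riff == b"RIFF" and wave == b"WAVE"
    # Walk the chunks: a four-byte name, a little-endian length, the data.
    chunks = {}
    pos = 12
    while pos + 8 <= len(data):
        name, length = struct.unpack_from("<4sI", data, pos)
        chunks[name] = data[pos + 8 : pos + 8 + length]
        pos += 8 + length + (length & 1)  # skip the pad byte after odd chunks
    return chunks
```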

Don't Distract New Programmers with OOP

When I get asked "What's a good first programming language to teach my [son / daughter / other-person-with-no-programming-experience]?" my answer has been the same for the last 5+ years: Python.

That may be unexpected, coming from someone who often talks about non-mainstream languages, but I stand by it.

Python is good for a wide range of simple and interesting problems that would be too much effort in C. (Seriously, a basic spellchecker can be implemented in a few lines of Python.) There are surprisingly few sticking points--places where the solution is easy to see, but there's a tricky mismatch between it and the core language features. Erlang has a couple of biggies. Try implementing any algorithm that's most naturally phrased in terms of in-place array updates, for example. In Python the sailing tends to be smooth. Arrays and dictionaries and sets cover a lot of ground.
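To back up the spellchecker claim, here's a toy version, assuming the word list has already been loaded into a set of lowercase words:

```python
def misspelled(text, dictionary):
    # Lowercase, split on whitespace, strip surrounding punctuation,
    # and keep anything that isn't in the dictionary set.
    words = (w.strip(".,;:!?\"'") for w in text.lower().split())
    return [w for w in words if w and w not in dictionary]
```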

There's one caveat to using Python as an introductory programming language: avoid the object-oriented features. You can't dodge them completely, as fundamental data types have useful methods associated with them, and that's okay. Just make use of what's already provided and resist talking about how to create classes, and especially avoid talking about any notions of object-oriented design where every little bit of data has to be wrapped up in a class.

The shift from procedural to OO brings with it a shift from thinking about problems and solutions to thinking about architecture. That's easy to see just by comparing a procedural Python program with an object-oriented one. The latter is almost always longer, full of extra interface and indentation and annotations. The temptation is to start moving trivial bits of code into classes and adding all these little methods and anticipating methods that aren't needed yet but might be someday.

When you're trying to help someone learn how to go from a problem statement to working code, the last thing you want is to get them sidetracked by faux-engineering busywork. Some people are going to run with those scraps of OO knowledge and build crazy class hierarchies and end up not as focused on what they should be learning. Other people are going to lose interest because there's a layer of extra nonsense that makes programming even more cumbersome.

At some point, yes, you'll need to discuss how to create objects in Python, but resist for as long as you can.

(November 2012 update: There's now a sequel of sorts.)

If You're Not Gonna Use It, Why Are You Building It?

Just about every image editing or photo editing program I've tried has a big collection of visual filters. There's one to make an image look like a mosaic, one to make it look like watercolors, and so on. Except for a few of the most fundamental image adjustments, like saturation and sharpness, I never use any of them.

I have this suspicion that the programmers of these tools got hold of some image processing textbooks and implemented everything in them. If an algorithm had any tweakable parameters, then those were exposed to the user as sliders.

Honestly, that sounds like something I might have done in the past. The process of implementing those filters is purely technical--almost mechanical--yet it makes the feature list longer and more impressive. And they could be fun to code up. But no consideration is given to whether those filters have any practical value.
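To be concrete about how mechanical these filters are, here's a hypothetical mosaic filter over a grayscale image stored as a list of rows; the block parameter is exactly the sort of tweakable value that ends up exposed as a slider:

```python
def mosaic(pixels, block):
    # Replace each block x block tile of a grayscale image (a list of
    # rows of 0-255 values) with the tile's average brightness.
    h, w = len(pixels), len(pixels[0])
    out = [row[:] for row in pixels]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            ys = range(by, min(by + block, h))
            xs = range(bx, min(bx + block, w))
            tile = [pixels[y][x] for y in ys for x in xs]
            avg = sum(tile) // len(tile)
            for y in ys:
                for x in xs:
                    out[y][x] = avg
    return out
```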

Contrast this with apps like Instagram and Hipstamatic. Those programs use your phone's camera to grab images, then apply built-in filters to them. They're fully automatic; you can't make any manual adjustments. And yet unlike all of those filter-laden photo editors I've used in the past, I'm completely hooked on Hipstamatic. It rekindled my interest in photography, and I can't thank the authors enough.

What's the difference between those apps and old-fashioned photo editors?

The Hipstamatic and Instagram filters were designed with clear goals in mind: to emulate certain retro-camera aesthetics, to serve as starting points and inspirations for photographs. Or more succinctly: they were built to be used.

If you find yourself creating something, and you don't understand how it will be used, and you don't plan on using it yourself, then it's time to take a few steps back and reevaluate what you're doing.

(If you liked this, you might like Advice to Aimless, Excited Programmers.)

Caught-Up with 20 Years of UI Criticism

Interaction designers have leveled some harsh criticisms at the GUI status quo over the last 20+ years. The mouse is an inefficient input device. The desktop metaphor is awkward and misguided. Users shouldn't be exposed to low-level details like the raw file-system and having to save their work.

And they were right.

But instead of better human/computer interaction, we got faster processors and hotter processors and multiple processors and entire processors devoted to 3D graphics. None of which are bad, mind you, but it was always odd to see such tremendous advances in hardware while the researchers promoting more pleasant user experiences wrote books that were eagerly read by people who enjoyed smirking at the wrong-headedness of an entire industry--yet who weren't motivated enough to do anything about it. Or so it seemed.

It's miraculous that in 2011, the biggest selling computers are mouse-free, run programs that take over the entire screen without the noise of a faux-desktop, and the entire concept of "saving" has been rendered obsolete.

Clearly, someone listened.

(If you liked this, you might enjoy Free Your Technical Aesthetic from the 1970s.)

Revisiting "Tricky When You Least Expect It"

Since writing Tricky When You Least Expect It in June 2010, I've gotten a number of responses offering better solutions to the angle_diff problem. The final version I presented in the original article was this:

angle_diff(Begin, End) ->
    D = End - Begin,
    DA = abs(D),
    case {DA > 180, D > 0} of
        {true, true} -> DA - 360;
        {true, _}    -> 360 - DA;
        _            -> D
    end.

But, maybe surprisingly, this function can be written in two lines:

angle_diff(Begin, End) ->
    (End - Begin + 540) rem 360 - 180.

The key is to shift the difference into the range -180 to 180 before the modulo operation. The "- 180" at the end adjusts it back. One quirk of Erlang is that the modulo operator (rem) gives a negative result if the first value is negative. That's easily fixed by adding 360 to the difference (180 + 360 = 540) to ensure that it's always positive. (Remember that adding 360 to an angle gives the same angle.)
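The same two lines port directly to Python, where % already returns a non-negative result for a positive divisor, and a brute-force check over every pair of whole-degree angles confirms the behavior:

```python
def angle_diff(begin, end):
    # Shift by 540 so the value handed to % is positive and offset by
    # 180, wrap with mod 360, then shift back down into -180..179.
    return (end - begin + 540) % 360 - 180

# Exhaustive check: the result is always in range, and it differs from
# the raw difference by a whole number of turns.
for begin in range(360):
    for end in range(360):
        d = angle_diff(begin, end)
        assert -180 <= d < 180
        assert (d - (end - begin)) % 360 == 0
```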

So how did I miss this simpler solution? I got off track by thinking I needed an absolute value, and things went downhill from there. I'd like to think if I could rewind and re-attempt the problem from scratch, then I'd see the error of my ways, but I suspect I'd miss it the second time, too. And that's what I was getting at when I wrote "Tricky When You Least Expect It": that you never know when it will take some real thought to solve a seemingly simple problem.

(Thanks to Samuel Tardieu, Benjamin Newman, and Greg Rosenblatt, who all sent almost identical solutions.)

Follow the Vibrancy

Back in 1999 or 2000, I started reading a now-defunct Linux game news site. I thought the combination of enthusiastic people wanting to write video games and the excitement surrounding both Linux and open source would result in a vibrant, creative community.

Instead there were endless emulators and uninspired rewrites of stale old games.

I could theorize about why there was such a lack of spark, a lack of motivation to create anything distinctive and exciting. Perhaps most of the projects were intended to fulfill coding itches, not personal visions. I don't know. But I lost interest, and I stopped following that site.

When I wanted to modernize my programming skills, I took a long look at Lisp. It's a beautiful and powerful language, but I was put off by the community. It was a justifiably smug community, yes, but it was an empty smugness. Where were the people using this amazing technology to build impressive applications? Why was everyone so touchy and defensive? That doesn't directly point at the language being flawed--not by any means--but it seemed an indication that something wasn't right, that maybe there was a reason that people driven to push boundaries and create new experiences weren't drawn to the tremendous purported advantages of Lisp. So I moved on.

Vibrancy is an indicator of worthwhile technology. If people are excited, if there's a community of developers more concerned with building things than advocating or justifying, then that's a good place to be. "Worthwhile" may not mean the best or fastest, but I'll take enthusiasm and creativity over either of those.

(If you liked this, you might enjoy The Pure Tech Side is the Dark Side.)

Impressed by Slow Code

At one time I was interested in--even enthralled by--low-level optimization.

Beautiful and clever tricks abound. Got a function call followed by a return statement? Replace the pair with a single jump instruction. Once you've realized that "load effective address" operations are actually doing math, then they can subsume short sequences of adds and shifts. On processors with fast "count leading zero bits" instructions, entire loops can be replaced with a couple of lines of linear code.

I spent a long time doing that before I realized it was a mechanical process.

I don't necessarily mean mechanical in the "a good compiler can do the same thing" sense, but that it's a raw engineering problem to take a function and make it faster. Take a simple routine that potentially loops through a lot of data, like a case insensitive string comparison. The first step is to get as many instructions out of the loop as possible. See if what remains can be rephrased using fewer or more efficient instructions. Can any of the calculations be replaced with a small table? Is there a way to process multiple elements at the same time using vector instructions?
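The "small table" step translates to higher-level languages, too. Here's a sketch in Python rather than assembly: build a 256-entry lowercase table once, and the per-character branching disappears from the comparison loop (the work becomes a table look-up inside translate):

```python
# Build the 256-entry table once: every byte maps to itself except A-Z.
LOWER = bytes.maketrans(
    bytes(range(ord("A"), ord("Z") + 1)),
    bytes(range(ord("a"), ord("z") + 1)))

def eq_nocase(a, b):
    # Case-insensitive comparison of two byte strings via the table.
    return a.translate(LOWER) == b.translate(LOWER)
```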

The truth is that there's no magic in taking a well-understood, working function, analyzing it, and rewriting it in a way that involves doing slightly or even dramatically less work at run-time. If I ended up with a routine that was a bottleneck, I know I could take the time to make it faster. Or someone else could. Or if it was small enough I could post it to an assembly language programming forum and come back in a couple of days when the dust settled.

What's much more interesting is speeding up something complex, a program where all the time isn't going into a couple of obvious hotspots.

All of a sudden, that view through the low-level magnifying glass is misleading. Yes, that's clearly an N-squared algorithm right there, but it may not matter at all. (It might only get called with low values of N, for example.) This loop here contains many extraneous instructions, but that's hardly a big picture view. None of this helps with understanding the overall data flow, how much computation is really being done, and where the potential for simplification lies.

Working at that level, it makes sense to use a language that keeps you from thinking about exactly how your code maps to the underlying hardware. It can take a bit of faith to set aside deeply ingrained instincts about performance and concerns with low-level benchmarks, but I've seen Python programs that ended up faster than C. I've seen complex programs running under the Erlang virtual machine that are done executing before my finger is off the return key.

And that's what's impressive: code that is so easy to label as slow upon first glance, code containing functions that can--in isolation--be definitively proven to be dozens or hundreds of times slower than what's possible on a given CPU, and yet the overall program is decidedly one of high performance.

(If you liked this, you might enjoy Timidity Does Not Convince.)

Constantly Create

When I wrote Flickr as a Business Simulator, I was thinking purely about making a product--photos--and getting immediate feedback from a real audience. Seeing how much effort it takes to build-up a following. Learning if what you think people will like and what they actually like are the same thing.

It works just as well for learning what it's like to be in any kind of creative profession, such as an author of fiction or a recording artist.

Go look at music reviews on Amazon, and you'll see people puzzling over why a band's latest release doesn't have the spark of their earlier material, pointing out filler songs on albums, complaining about inconsistency between tracks. Sometimes the criticisms are empty, but there's often a ring of truth. There's an underlying question of why. How could a songwriter or band release material that isn't always at the pinnacle of perfection?

After years of posting photos to Flickr, I get it. I'm just going along, taking my odd photographs, when all of a sudden one resonates and breaks through and I watch the view numbers jump way up. Then I've got pressure: How can I follow that up? Sometimes I do, with a couple of winners in a row, but inevitably I can't stay at that level. Sometimes I take a break, not posting shots for a month or more, and then I lose all momentum.

When I'm at a low point, when I devolve into taking pictures of mundane subjects, pictures I know aren't good, I think about how I'm ever going to get out of that rut. Inevitably I do, though it's often a surprise when I go from a forgettable photo one day to something inspired the next.

The key for me is to keep going, to keep taking and posting photos. If I get all perfectionist then there's too much pressure, and I start second-guessing myself. If I give up when my quality drops off, then that's not solving anything. The steady progress of continual output, whether good or bad output, is part of the overall creative process.

Tough Love for Indies

At one time I was the independent software developer's dream customer.

I was a pushover. I bought applications, I bought tools, I bought games. This was back when "shareware" was still legitimate, back before the iPhone App Store made five dollars sound like an outrageous amount of money for a game. I did it to support the little guy, to promote the dream of living in the mountains or a coastal town with the only source of income coming from the sale of homemade code.

Much of the stuff I bought wasn't great. I bought it because it showed promise, because it clearly had some thought and effort behind it. That I knew it was produced by one person working away in his spare hours softened my expectations.

The thing is, most people don't think that way.

These days I still gravitate toward apps that were developed by individuals or small companies, but I don't cut them any slack for lack of quality. I can't justify buying an indie game because it has potential but isn't actually fun. I won't downgrade my expectations of interface design and usability so I can use a program created by two people instead of a large corporation.

That whole term "indie" only means something if you go behind the scenes and find out who wrote a piece of software. And while I think it's fascinating to watch the goings-on of small software developers, it's a quirk shared by a small minority of potential customers. The first rule of being indie is that people don't care if you're indie. You don't get any preferential treatment for not having a real office or a QA department. The only thing that matters is the end result.

(If you liked this, you might enjoy Easy to Please.)

Living in the Era of Infinite Computing Power

Basic math used to be slow. To loop 10K times on an 8-bit processor, it was faster to iterate 256 times in an inner loop, then wrap that in an outer loop executing 40 times. That avoided multi-instruction 16-bit addition and comparison each time through.

Multiplication and division used to be slow. There were no CPU instructions for those operations. If one of the multiplicands was constant, then the multiply could be broken down into a series of adds and bit shifts (to multiply N by 44: N lshift 5 + N lshift 3 + N lshift 2), but the general case was much worse.
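The decomposition in that example is 44 = 32 + 8 + 4, which turns one general multiply into three shifts and two adds. A quick sketch:

```python
def times44(n):
    # 44 = 32 + 8 + 4, so N * 44 becomes three shifts and two adds --
    # the kind of by-hand decomposition used before CPUs had multiply.
    return (n << 5) + (n << 3) + (n << 2)
```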

Floating point used to be slow. Before FPUs, floating point math was done in software at great expense. Early hardware was better, but hardly impressive. On the original 8087 math coprocessor, simple floating point addition took a minimum of 90 cycles, division over 200, and there were instructions that took over a thousand cycles to complete.

Graphics used to be slow. For the longest time, programmers who had trouble getting 320x200 displays to update at any kind of reasonable rate scoffed at the possibility of games running at the astounding resolution of 640x480.

All of these concerns have been solved to comical degrees. A modern CPU can add multiple 64-bit values at the same time in a single cycle. Ditto for floating point operations, including multiplication. All the work of software-rendering sprites and polygons has been offloaded to separate, highly-parallel processors that run at the same time as the multiple cores of the main CPU.

Somewhere in the late 1990s, when the then-popular Pentium II reached clock speeds in the 300-400MHz range, processing power became effectively infinite. Sure there were the notable exceptions, like video compression and high-end 3D games and editing extremely high-resolution images, but I was comfortably developing in interpreted Erlang and running complex Perl scripts without worrying about performance.

Compared to when I was building a graphically intensive game on an early 66MHz Power Macintosh, compared to when I was writing commercial telecommunications software on a 20MHz Sun workstation, compared to developing on a wee 8-bit Atari home computer, that late 1990s Pentium II was a miracle.

Since then, all advances in processing power have been icing. Sure, some of that has been eaten up by cameras spitting out twelve megapixels of image data instead of two, by Windows 7 having more overhead than Windows 98, and by greatly increased monitor resolution. And there are always algorithmically complex problems that never run fast enough; that some hardware review site shows chipset X is 8.17% faster than chipset Y in a particular benchmark isn't going to overcome that.

Are you taking advantage of living in the era of infinite computing power? Have you set aside fixations with low-level performance? Have you put your own productivity ahead of vague concerns with optimization? Are you programming in whatever manner lets you focus on the quality and usefulness of the end product?

To be honest, that sounds a bit Seth Godin-esque, feel-good enough to be labeled as inspirational yet promptly forgotten. But there have been and will be hit iOS / Android / web applications from people without any knowledge of traditional software engineering, from people using toolkits that could easily be labeled as technically inefficient, from people who don't even realize they're reliant on the massive computing power that's now part of almost every available platform.

(If you liked this, you might enjoy How Much Processing Power Does it Take to be Fast?.)

The Nostalgia Trap

I used to maintain a site about 8-bit game programmers and the games they created. To be fair, I still update the "database" now and then, but changes are few and far between, and I stopped posting news blurbs five years ago.

There's a huge amount of information on that site. Clearly I was passionate--or at least obsessive--about it, and for a long time, too. When I learned to program in the 1980s, I saw game design as a new outlet for creativity, a new art form. Here were these people without artistic backgrounds, who weren't professional developers, buying home computers and making these new experiences out of essentially nothing. I wanted to document that period. I wanted to communicate with those people and find out what drove them.

(In 2002, D.B. Weiss wrote a novel called Lucky Wander Boy. It followed the story of someone who attempted to catalog every video game ever made. Amusingly, I received a promotional copy.)

That's why I started the site. A better question is "Why did I stop?"

Partly it was because I answered the questions that I had. I was in contact with a hundred or more designers of 8-bit computer games, and I learned their stories. But mostly I needed to move on, to not be spending so much time looking to the past.

Nostalgia is intensely personal. I was a teenage game designer seeing hundreds of previously unimagined new creations for the Apple II and Atari 800 and Commodore 64 come into existence, and I have fond memories of those years. Other people wax nostalgic about VAX system administration, about summer afternoons with cartridges for the Nintendo Entertainment System, about early mainframe games like Rogue or Hack played on a clunky green terminal, or of the glory days of shareware in the early 1990s. Some people pine for the heyday of MS-DOS development--of cycling the power after every crash--or writing programs in QBASIC.

But don't mistake wistful nostalgia for "how things ought to be."

Just because you used to love the endless string of platformers for a long-dead game system doesn't mean that recreating them for the iPhone is a worthy endeavor. Getting a warm and fuzzy feeling when recalling thirty-year-old UNIX command-line programs is different from putting them on a pedestal as a model for how to design tools. That's not to say you shouldn't learn from the past and avoid repeating expensive mistakes. Just don't get trapped into thinking that older software or technologies are superior because they happened to be entangled with more carefree periods in your life.

The future is much more interesting.

The End is Near for Vertical Tab

Stop the Vertical Tab Madness wasn't based on a long-standing personal peeve. It dawned on me after writing Rethinking Programming Language Tutorials and a follow-up piece that here is this archaic escape sequence ("\v") that no one uses or understands, yet it's mindlessly included in new programming languages and pedantically repeated in tutorials and reference manuals.

One year later, a Google search for vertical tab produces this:

Missing image: Google search for vertical tab

There's the Wikipedia entry about tab in general, and then there's an essay pointing out the utter uselessness of vertical tab in modern programming.

This is progress!

I will take this opportunity to repeat my plea. If you're a programming language maintainer, please follow the lead taken by Perl and drop support for the vertical tab escape sequence. If you're writing a tutorial, don't even hint that the vertical tab character exists.

Thank you.

8-Bit Scheme: A Revisionist History

In The Nostalgia Trap I wrote, "I was in contact with a hundred or more designers of 8-bit computer games, and I learned their stories." Those stories were fantastically interesting, but most of them were only incidentally about programming. The programming side usually went like this:

Early home computers were magic, and upon seeing one there was a strong desire to learn how to control it and create experiences from moving graphics and sound. At power-up there was the prompt from a BASIC interpreter, and there was a BASIC reference manual in the box, so that was the place to start.

Later there was serendipitous exposure to some fast and impressive game that was far beyond the animated character graphics or slow line-drawing of BASIC, and that led to the discovery of assembly language and the freedom to exploit the hardware that came with it. It was suitably clunky to be writing a seven instruction sequence to do 16-bit addition and remembering that "branch on carry set" could be thought of as "branch on unsigned greater or equal to." The people with prior programming experience, the ones who already knew C or Algol or Scheme, they may have been dismayed at the primitive nature of it all, but really it came down to "you do what you have to do." The goal was never to describe algorithms in a concise and expressive manner, but to get something interactive and wonderful up on the screen.

Now imagine if an Atari 800 or Commodore 64 shipped with a CPU designed to natively run a high-level language like Scheme. That's not completely outrageous; Scheme chips were being developed at MIT in the late 1970s.

My suspicion is that Scheme would have been learned by budding game designers without a second thought. Which language it was didn't matter nearly so much as having a language that was clearly the right choice for the system. All the quirks and techniques of Scheme would have been absorbed and worked around as necessary.

It's not so simple with today's abundance of options, none of which is perfect. Haskell is beautiful, but it looks difficult to reason about the memory usage of Haskell code. Erlang has unappealing syntax. Python is inconsistently object-oriented. Lisp is too bulky--Scheme too minimalist. All of these ring more of superficiality than the voice of experience. Yet those criticisms are preventing the deep dive needed to get in there and find out how a language holds up for a real project.

What if you had to use Scheme? Or Haskell? Or Erlang? You might slog it out and gain a new appreciation for the mundane, predictable nature of C. Or you might find out that once you've worked through a handful of tricky bits, there are great advantages in working with a language that's more pleasant and reliable. Either way, you will have learned something.

(If you liked this, you might enjoy Five Memorable Books About Programming.)

Collapsing Communities

At one time the Lisp and Forth communities were exciting places. Books and articles brimmed with optimism. People were creating things with those languages. And then slowly, slowly, there was a loss of vibrancy. Perhaps the extent of the loss went unnoticed by people inside those communities, but the layers of dust and the reek of years of defensiveness jump out at the curious who wander in off the street, not realizing that the welcome sign out front was painted decades earlier by people who've long since moved away.

This is not news. Time moves on. Product and technical communities grow tired and stale. The interesting questions are when do they go stale and how do you realize it?

The transition from MS-DOS to Windows was a difficult one for many developers. Windows 3.1 wasn't a complete replacement for the raw audio/visual capabilities of DOS. Indeed, the heyday of MS-DOS game creation took place in the years between Windows 3.1 and Windows 95. But even in 2002, seven years after every PC booted into Windows by default, it wasn't uncommon to see the authors of software packages, even development environments, still resisting the transition, still targeting MS-DOS. Why did they hang on in the face of clear and overwhelming change? How did they justify that?

Unfortunately, it was easy to justify. "Windows is overly complex. Look at the whole shelf of manuals you need to program for it." "I can put a pixel on the screen with two lines of code under MS-DOS versus 200 for Windows." "I'm not going to take the performance hit from virtual memory, pre-emptive multitasking, and layers of hardware drivers."

Even if some of those one-sided arguments hold a bit of water, they made no difference at all to the end result. It would have been better to focus on learning the new platform rather than tirelessly defending the old one.

I've been a Flickr user since the early days. Oh, the creativity and wonder touched off by that site! But there have been signs that it is growing crusty. The iPhone support is only halfway there and has been for some time, for example. I wouldn't say Flickr is truly collapsing, but the spark is dimmer than it once was. I'm far from shutting down my account, but there's more incentive to start poking around the alternatives.

When I do move on to another photo sharing site, I won't fight it. I won't post long essays about why I won't leave. I'll simply follow the vibrancy.

"Avoid Premature Optimization" Does Not Mean "Write Dumb Code"

First there's a flurry of blog entries citing a snippet of a Knuth quote: "premature optimization is the root of all evil." Then there's the backlash about how performance needs to be considered up front, that optimization isn't something that can be patched in at the end. Around and around it goes.

What's often missed in these discussions is that the advice to "avoid premature optimization" is not the same thing as "write dumb code." You should still try to make programs clear and reliable and factor out common operations and use good names and all the usual stuff. There's this peculiar notion that as soon as you ease up on the hardcore optimization pedal, then you go all mad and regress into a primitive mindset that thinks BASIC on an Apple ][ is the epitome of style and grace.

The warning sign is when you start sacrificing clarity and reliability while chasing some vague notion of performance.

Imagine you're writing an application where you frequently need to specify colors. There's a nice list of standard HTML color names which is a good starting point. A color can be represented as an Erlang atom: tomato, lavenderBlush, blanchedAlmond. Looking up the R,G,B value of a color, given the name, is straightforward:

color(tomato) -> {255,99,71};
color(lavenderBlush) -> {255,240,245};
color(blanchedAlmond) -> {255,235,205};
...

That's beautiful in its textual simplicity, but what's going on behind the scenes? That function gets turned into the virtual machine equivalent of a switch statement. At load time, some additional optimization gets done and that switch statement is transformed into a binary search for the proper atom.

What if, instead of atoms, colors are represented by integers from zero to some maximum value? That's easy with macros:

-define(Tomato, 0).
-define(LavenderBlush, 1).
-define(BlanchedAlmond, 2).
...

This change allows the color function to be further optimized, automatically, at load time. The keys are consecutive integers, so there's no need for a search. At runtime there's a bounds check and a look-up, and that's it. It's hands-down faster than the binary search for an atom.
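To make that load-time transformation concrete, here's a sketch of what the integer-keyed lookup might look like (the module name is made up; the macro definitions repeat the ones above):

```erlang
-module(color_table).
-export([color/1]).

-define(Tomato, 0).
-define(LavenderBlush, 1).
-define(BlanchedAlmond, 2).

%% Because the clause heads are the consecutive integers 0, 1, 2, ...,
%% the BEAM loader can turn this into a bounds check plus a jump table
%% instead of a binary search over atoms.
color(?Tomato)         -> {255,99,71};
color(?LavenderBlush)  -> {255,240,245};
color(?BlanchedAlmond) -> {255,235,205}.
```

The source looks nearly identical to the atom version; only the keys have changed.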

What's the price for this undefined amount of extra speed? For starters, colors get displayed as bare integers instead of symbolic names. An additional function to convert from an integer to a name string fixes that...well, except in post-crash stack traces. The easy-to-read and remember names can't be entered interactively, because macros don't exist in the shell. And the file containing the macros has to be included in every source file where colors are referenced, adding dependencies to the project that otherwise wouldn't be present.

The verdict? This is madness: sacrificing ease of development, going against the grain of Erlang, all in the name of nanoseconds.

(If you liked this, you might enjoy Two Stories of Simplicity.)

It's Like That Because It Has Always Been Like That

At a time when most computers could only display phosphorescent screens of text, the first GUI calculator app was a bold experiment. It looked like an honest-to-goodness pocket calculator. No instruction manual necessary; click on keys with the mouse. And that it could be opened while working within another application was impressive in itself.

Of course now the interaction design mistakes of having a software calculator mimic the real-life plastic device are well-understood. Why click on graphical buttons when there's a computer keyboard? And if keyboard input is accepted, then why waste screen space displaying the buttons at all? Isn't it easier and less error prone to type an expression such as "806 * (556.5 / 26.17)" than to invisibly insert operators within a series of separately entered numbers?

That a literal digitization of a physical calculator isn't a particularly good solution is no longer news. What's surprising is how long the design mistakes of that original implementation have hung on.

If I were teaching a class, and I gave the assignment of "mock-up the interface for a desktop PC calculator app," I'd fully expect to get back a variety of rectangular windows with a numeric display along the top and a grid of buttons below. What a calculator on a computer is supposed to look like is so ingrained that all thought of alternatives is blocked.

This kind of blindness is both easy and difficult to discover. It's easy because all you have to do is stop and give an honest answer to the question "What problem am I trying to solve?" and then actually solve that problem. It's difficult because there are many simple, superficial dodges to that question, such as "because I need to add a calculator to my application."

A better problem statement is along the lines of "a way for users to compute and display the results of basic math operations." The solution is in no way locked into a rectangle containing a numeric display with a grid of buttons below it.

(If you liked this, you might enjoy If You're Not Gonna Use It, Why Are You Building It?)

Building Beautiful Apps from Ugly Code

I wish I could entirely blame my computer science degree for undermining my sense of aesthetics, but I can't. Much of it was self-inflicted from being too immersed in programming and technology for its own sake, and it took me a long time to recover.

There's a tremendous emphasis on elegance and beauty in highbrow coding circles. A sort implemented in three lines of Haskell. A startlingly readable controller for a washing machine in half a dozen lines of Forth. Any example from Structure and Interpretation of Computer Programs. It's difficult to read books, follow blogs, take classes, and not start developing an eye for elegant code.

All those examples of beauty and elegance tend to be small. The smallness--the conciseness--is much of the elegance. If you've ever implemented a sorting algorithm in C, then you likely had three lines of code just to swap values. Three lines for the entire sort is beautifully concise.

Except that beauty rarely scales.

Pick any program outside of the homework range, any program of 200+ lines, and it's not going to meet a standard of elegance. There are special cases and hacks and convoluted spots where the code might be okay except that the spec calls for things which are at odds with writing code that crackles with craftsmanship. There are global variable updates and too many parameters are passed around. Bulky functions because things can't be reduced to one or two clean possibilities. Lines of code wasted translating between types. Odd-looking flag tests needed to fix reported bugs.

Small, elegant building blocks are used to construct imperfect, even ugly, programs. And yet those imperfect, ugly programs may actually be beautiful applications.

The implications of that are worth thinking about. Functional programming zealots insist upon the complete avoidance of destructive updates, yet there's a curious lack of concrete examples of programs which meet that standard of purity. Even if there were, the purely functional style does not in any way translate to the result being a stunningly powerful and easy to use application. There's no cause and effect. And if there is a stunningly powerful and easy to use application, does that mean the code that runs the whole thing is a paragon of beauty? Of course not.

The only sane solution is to focus on the end application first. Get the user to experience the beauty of it and be happy. Don't compromise that because, behind the scenes, the code to draw ovals is more elegant than the code to draw rounded rectangles.

(If you liked this, you might enjoy Write Code Like You Just Learned How to Program.)

Boldness and Restraint

Modern mobile devices are hardly the bastions of minimalism once synonymous with embedded systems. They're driven by bold technical decisions. Full multi-core, 32-bit processors. Accelerated 3D graphics all the way down, including shader support. No whooshing fans or hot-to-the-touch parts. Big, UNIX-like operating systems. This is all the realm of fantasy; pocket-sized computers outperforming what were high-end desktop PCs not all that long ago.

What goes hand-in-hand with that boldness is restraint. It's not the cutting edge, highest clocked, monster of a CPU that ends up in an iPhone, but a cooler, slower, relatively simpler chip. A graphics processor doesn't have to be driven as hard to push the number of pixels in a small display. Storage space is a fraction of a desktop PC, allowing flash memory to replace whirring hard drives and keeping power consumption down.

This is a complete turnaround from the bigger is better at any cost philosophy of the early to mid 2000s. That was when the elite PC hobbyists sported thousand watt power supplies and impressive arrays of fans and heatsinks and happily upgraded to new video cards that could render 18% more triangles at the expense of 40% higher power consumption.

An interesting case where I can't decide if it's impressive boldness or unrestrained excess is in the ultra high-resolution displays expected to be in near-future tablets, such as the iPad 3.

If you haven't been following this, here's the rundown. Prior to mid-2010, the iPhone had a resolution of 480x320 pixels. With the iPhone 4, this was doubled in each dimension to 960x640, and Apple dubbed it a retina display. Now there's a push for the iPad to have its resolution similarly boosted.

The math here is interesting.

The original iPhone, with a resolution of 480x320, has 153,600 pixels.

The iPhone 4's retina display has 614,400 pixels.

The iPad 2 has a resolution of 1024x768, for a total of 786,432 pixels.

A double-in-each-dimension display for the iPad--2048x1536 resolution--has 3,145,728 pixels.

That's an amazing number. It's twenty times the pixel count of the original iPhone. It's over five times the pixels of the iPhone 4 display. It's even 1.7 times the number of pixels on my PC monitor at home. And we're talking about a nine inch screen vs. a desktop display.

I have zero doubt that a display of that resolution will find its way into a next generation iPad. Zero. It's bold, it's gutsy, the precedent is there, the displays already exist, and people want them. But such a tremendous increase in raw numbers, all to make an ultrasharp display be even sharper at close viewing distances? Maybe the days of restraint are over.

(If you liked this, you might enjoy How My Brain Kept Me from Co-Founding YouTube.)

Beyond Empty Coding

There's a culture of cloning and copying that I have a hard time relating to.

I taught myself to program so I could create things of my own design--originally 8-bit video games. There's an engineering side to that, of course, and learning how to better structure code and understand algorithms built my technical knowledge, enabling the creation of things that are more interesting and sophisticated. By itself, that engineering side is pedestrian, even mechanical, much like grammar is an unfortunate necessity for writing essays and short stories. But using that knowledge to create new experiences? That's exciting!

When I see people writing second-rate versions of existing applications simply because they disagree with the licensing terms of the original, or cloning an iPhone app because there isn't an Android version, or rehashing stale old concepts in a rush to make money in the mobile game market...I don't get it.

Oh, I get it from an "I know how to program, and I'm looking for a ready-made idea that I can code up" angle. What I don't understand is the willingness to so quickly narrow the possibility space: to start with a wide-open sea of ways to solve a problem and develop an easy-to-use application, but to choose instead to take an existing, half-baked solution as gospel and recreate it (maybe even with a few minor improvements).

Yes, there are some classic responses to this line of thinking. Everything is a remix. Every story ever written can be boiled down to one of seven fundamental plots.

But is that kind of self-justification enough reason to stop trying altogether? To elevate the empty act of coding above the potential to make progress and explore new territory? To say that all music and movies and games are derivative and that's how they'll always be and bring on the endless parade of covers and remakes?

I can only answer for myself: no, it's not.

(If you liked this, you might enjoy Personal Programming.)

Greetings from the Bottom of the Benchmarks

I can guarantee that if you write a benchmark pitting Erlang's dictionary type against that of any other language, Erlang is going to lose. Horribly. It doesn't matter if you choose the dict module or gb_trees; Erlang will still have an embarrassing time of it, and there will be much snickering and posting of stories on the various programming news aggregation sites.

Is the poor showing because dictionaries in Erlang are purely functional, so the benchmark causes much copying of data? Or perhaps because Erlang is dynamically typed?

Neither. It's because the standard Erlang dictionary modules are written in Erlang.

In that light, the low benchmark numbers are astoundingly impressive. The code for every dictionary insert, look-up, and deletion is run through the same interpreted virtual machine as any other code. And the functions being interpreted aren't simply managing a big, mutable hash table, but a high-concept, purely functional tree structure. The Erlang code is right there to look at and examine. It isn't entangled with the dark magic of the runtime system.

There are still some targets of criticism here. Why are there multiple key/value mappings in the standard library, one with the awkward name of gb_trees, in addition to the hackier, clunkier-to-use "Erlang term storage" tables? Why would I choose one over the others? Why is dict singular and gb_trees plural? Let's face it: The Erlang standard library is not a monument of consistency.

But performance? I've used both dictionary types in programs I've written, and everything is so instantaneous that I've never taken the time to see if a disproportionate amount of time is being spent in those modules. Even if I'm overstating things, over-generalizing based on the particular cases where I've used dictionaries, it's still high-level Erlang code going up against the written-in-C runtime libraries of most languages. And that it comes across as "instantaneous" in any kind of real-world situation is impressive indeed.
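For what it's worth, the two dictionary modules are near-interchangeable in use; here's a minimal sketch of both APIs (the demo module and function names are made up):

```erlang
-module(kv_demo).
-export([demo/0]).

demo() ->
    %% dict: new/store/fetch, singular module name
    D0 = dict:new(),
    D1 = dict:store(tomato, {255,99,71}, D0),
    {255,99,71} = dict:fetch(tomato, D1),

    %% gb_trees: empty/enter/get, plural module name
    T0 = gb_trees:empty(),
    T1 = gb_trees:enter(tomato, {255,99,71}, T0),
    {255,99,71} = gb_trees:get(tomato, T1),
    ok.
```

Both are purely functional: each insert returns a new dictionary value rather than mutating the old one.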

(If you liked this, you might enjoy Tales of a Former Disassembly Addict.)

Optimization on a Galactic Scale

The code to generate this site has gotten bloated. When I first wrote about it, the Perl script was 6838 bytes. Now it's grown to a horrific 7672 bytes. Part of the increase is because the HTML template is right there in the code, so when I tweak or redesign the layout, it directly affects the size of the file.

The rest is because of a personal quirk I've picked up: when I write tools, I don't like to overwrite output files with exactly the same data. That is, if the tool generates data that's byte-for-byte identical to the last time the tool was run, then leave that file alone. This makes it easy to see what files have truly changed, plus it often triggers fewer automatic rebuilds down the line (imagine if one of the output files is a C header that's included throughout a project).

How do you avoid overwriting a file with exactly the same data? In the write_file function, first check if the file exists and if so, is it the same size as the data to be written? If those are true, then load the entire file and compare it with the new data. If they're the same, return immediately, otherwise overwrite the existing file with the new data.
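The script itself is Perl, but the idea fits in a few lines of any language. Here's a sketch in Erlang (hypothetical module and function names, and with the size pre-check folded into the byte-for-byte comparison):

```erlang
-module(site_gen).
-export([write_if_changed/2]).

%% Write Data to Filename only if the file doesn't already contain
%% exactly those bytes. Returns 'unchanged' or 'written'.
write_if_changed(Filename, Data) ->
    Bin = iolist_to_binary(Data),
    case file:read_file(Filename) of
        {ok, Bin} ->
            %% Existing contents are byte-for-byte identical: leave
            %% the file alone so its timestamp doesn't change.
            unchanged;
        _ ->
            %% Missing, unreadable, or different: overwrite it.
            ok = file:write_file(Filename, Bin),
            written
    end
```

The `{ok, Bin}` clause is doing the comparison: it only matches when the file's contents equal the new data.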

At one time I would have thought this was crazy talk, but it's simple to implement, works well, and I've yet to run into any perceptible hit from such a mad scheme. This site currently has 112 pages plus the archive page and the atom feed. In the worst case, where I force a change by modifying the last byte of the template and regenerate the whole site, well, the timings don't matter. The whole thing is over in a tenth of a second on a five year old MacBook.

That's even though the read-before-write method has got to be costing tens or hundreds of millions of cycles. A hundred million cycles is a mind-bogglingly huge number, yet in this case it's irrelevant.

As it turns out, fully half of the execution time is going into one line that has nothing to do with the above code. I have a folder of images that gets copied into another folder if they've changed. To do that I'm passing the buck to the external rsync command using Perl's backticks.

It's oh so innocuous in the Perl source, but behind the scenes it's a study in excess. The shell executable is loaded and decoded, dependent libraries get brought in as needed, external references are fixed-up, then finally the shell itself starts running. The first thing it does is start looking for and parsing configuration files. When the time comes to process the rsync command, then here we go again with all the executable loading and configuration reading and eventually the syncing actually starts.

It must be a great disappointment after all that work to discover that the two files in the image folder are up to date and nothing needs to be done. Yet that whole process is as expensive as the rest of the site generation, much more costly than the frivolous reading of 114 files which are immediately tromped over with new data.

This is all a far cry from Michael Abrash cycle-counting on the 8086, from an Apple II graphics programmer trimming precious instructions from a drawing routine.

(If you liked this, you might enjoy How Did Things Ever Get This Good?)

The Revolution is Personal

If you were going to reinvent the music / film / video game industry, what would you do?

Articles deriding the state of modern music, et al, are staples of the web. They're light and fun to read, and snickering at the antics of a multi-billion dollar industry feels like the tiniest seed of revolution.

There, I've used that word twice now: industry. It's not someone's name, but a faceless scapegoat. Corporations. Wall Street. The Man. It's an empty term. I should have phrased the opening question as "If you were going to make an album / film / video game to buck the current trends which you dislike, what would that album / film / video game be?" Now it's concrete and, perhaps surprisingly, a more difficult problem.

The iOS App Store set the stage for a revolution. You can make anything you want and put it in front of a tremendous audience. Sure, Apple has to give cursory approval to the result, but don't read too much into that. They're only concerned with some blatant edge cases, not with censoring your creativity, and some of the stuff that gets into the App Store emphasizes that.

But the App Store itself is only a revolution in distribution. The ability to implement iOS software and get it out to the world isn't synonymous with having a clear, personal vision about what to implement in the first place. Even just over three years later, some deep ruts in the landscape of independent iOS game development have formed. A cartoony art style. A cute animal as the hero. Mechanics lifted from a small set of past games. If you've ever browsed the iPhone App Store, I'm sure made-up titles like Ninja Cow, Pogo Monkey, Goat Goes Home, and Distraught Penguin all evoke a certain image of what you'd get for your ninety-nine cents.

If you truly want to reinvent even a small part of a creative field, then start developing a personal vision.

Papers from the Lost Culture of Array Languages

2012 is the 50th anniversary of Ken Iverson's A Programming Language, which described the notation that became APL (even though a machine executable version of APL didn't exist yet). Since then there's been APL2, Nial, A+, K, Q, and other array-oriented languages. Iverson (1920-2004) teamed with Roger Hui to create a modern successor to APL, tersely named J, in the late 1980s.

The culture of array languages is a curious one. Though largely functional, array languages represent a separate evolutionary timeline from the lambda calculus languages like Miranda and Haskell. (Trivia: The word monad is an important term in both Haskell and J, but has completely different meanings.) Most strikingly, while Haskell was more of a testbed for functional language theorists that eventually became viable for commercial products, array languages found favor as serious development tools early on. Even today, K is used to analyze large data sets, such as from the stock market. J is used in actuarial work.

Notation as a Tool of Thought, Ken Iverson's 1979 Turing Award Lecture, is the most widely read paper on APL. Donald McIntyre (1923-2009) explored similar ideas in Language as an Intellectual Tool: From Hieroglyphics to APL. When I first learned of McIntyre's paper roughly ten years ago, it wasn't available on the web. I inquired about it via email, and he said he'd see if he or one of his acquaintances had a copy they could send to me. A week later I received an envelope from Ken Iverson (!) containing non-photocopied reprints of Hieroglyphics and his own A Personal View of APL. I still have both papers in the original envelope.

Donald McIntyre also wrote The Role of Composition in Computer Programming, which is mind-melting. (Note that it uses an earlier version of J, so you can't always just cut and paste into the J interpreter.)

There's a touch of melancholy to this huge body--fifty years' worth--of ideas and thought. Fifty years of a culture surrounding a paradigm that's seen as an oddity in the history of computing. Even if you found the other papers I've mentioned to be so many unintelligible squiggles, read Keith Smillie's My Life with Array Languages. It covers a thirty-seven year span of programming in APL, Nial, and J that started in 1968.

(If you liked this, you might enjoy Want to Write a Compiler? Just Read These Two Papers.)

Starting in the Middle

When I start on a personal project, I'm bright-eyed and optimistic. I've got an idea in my head, and all I need to do is implement it. Wait, before I can begin working on the good stuff there are some foundational underpinnings that don't yet exist. I work on those for a while, then I get back to the main task...until I again realize that there are other, lower-level libraries that I need to write first.

Now I'm worried, because I'm building all this stuff, but I'm no closer to seeing results. I charge ahead and write one-off routines and do whatever it takes to get a working version one-point-oh. Then I hear the creaks and groans of impending code collapse. I shift my focus to architecture and move things to separate modules, refactor, sweep out dark corners, and eventually I'm back to thinking about the real problem. It's only a short respite. As the code gets larger and more complex, I find myself having to wear my software engineering hat more and more of the time, and that's no fun.

That's the story sometimes, anyway. When using functional programming languages I take a different approach: I pick an interesting little bit that's right in the middle of the problem and start working on it. I don't build up a foundation needed to support the solution. I don't think about how it integrates into the whole. There's a huge win here that should be the selling point of functional programming: you can build large programs without worrying about architecture.

Okay, sure, that architecture dodge isn't entirely true, but it's dramatic enough that I'm surprised functional programming isn't the obvious choice for anyone writing "So you want to learn to program?" tutorials. Or at least that it isn't the focus of "Why Functional Programming is Great" essays. If nothing else, it's a more compelling hook than currying or type systems.

How can starting a project in the middle possibly work? By writing symbolic code that lets me put off as many design decisions as possible. If I'm writing the "move entity" function for a video game, the standard approach is to directly modify a structure or object representing that entity. It's much easier to return a description of the change, like {new_ypos, 76} or {new_color, red}. (Those are both Erlang tuples.) That avoids the whole issue of how to rebuild what may be a complex, nested data structure with a couple of new values.

If I want to multiply the matrices M and N, the result is {'*', M, N}. (This is another Erlang tuple. The single quotes around the asterisk mean that it's an atom--a symbol in Lisp or Scheme. Those quotes are only necessary if the atom isn't alphanumeric.) The function to transpose a matrix returns {transpose, M}.

It looks like the essential work is being dodged, but it depends what you're after. I can write code and see at a glance that it gives the right result. I can use those functions to create more interesting situations and learn about the problem. If I find my understanding of the problem is wrong, and I need to back up, that's okay. It's more than okay: it's great! Maybe it turns out that I don't need to multiply matrices after all, so I didn't waste time writing a multiply routine. Maybe the transpose function is always called with a parameter of {transpose, Something}, so the two transpositions cancel out and there's no need to do anything.
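As a sketch, the symbolic style fits in a few lines of Erlang (the module and helper names are hypothetical); note how the transpose cancellation falls out as a single pattern match:

```erlang
-module(symbolic).
-export([multiply/2, transpose/1]).

%% Operations return descriptions of the work instead of doing it.
multiply(M, N) -> {'*', M, N}.

%% Two transpositions cancel without ever touching the data.
transpose({transpose, M}) -> M;
transpose(M) -> {transpose, M}.
```

A later pass can interpret these descriptions, or, as in the examples above, it may turn out they never need to be evaluated at all.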

At some point I have to stop living this fantasy and do something useful with these abstract descriptions. Hopefully by that time my experiments in symbolic programming have better defined both the problem and the solution, and I won't need to spend as much time thinking about boring things like architecture.

(If you liked this, you might enjoy Living Inside Your Own Black Box.)

Things That Turbo Pascal is Smaller Than

Turbo Pascal 3 for MS-DOS was released in September 1986. Being version 3, there were lesser releases prior to it and flashier ones after, but 3 was a solid representation of the Turbo Pascal experience: a full Pascal compiler, including extensions that made it practical for commercial use, tightly integrated with an editor. And the whole thing was lightning fast, orders of magnitude faster at building projects than Microsoft's compilers.

The entire Turbo Pascal 3.02 executable--the compiler and IDE--was 39,731 bytes. How does that stack up in 2011 terms? Here are some things that Turbo Pascal is smaller than, as of October 30, 2011:

The minified version of jquery 1.6 (90,518 bytes).

The yahoo.com home page (219,583 bytes).

The image of the white iPhone 4S at apple.com (190,157 bytes).

zlib.h in the Mac OS X Lion SDK (80,504 bytes).

The touch command under OS X Lion (44,016 bytes).

Various vim quick reference cards as PDFs. (This one is 47,508 bytes.)

The compiled code for the Erlang R14B02 parser (erl_parse.beam, 286,324 bytes).

The Wikipedia page for C++ (214,251 bytes).

(If you liked this, you might like A Personal History of Compilation Speed.)

Adventures in Unfiltered Global Publishing

I remember sitting in my parents' backyard in Texas, in the mid 1980s, reading a computer magazine that contained a game and accompanying article I had written. I don't know what the circulation of the magazine--Antic--was, but it was popular enough that I could walk into any mall bookstore and flip through a copy.

The amazing part, of course, was that my game was in there. Not that it was a great game, but it had gone from initial design to final implementation in under two weeks. I didn't talk to anyone about the concept. I didn't have any help with the development. I don't think I even asked anyone to playtest it. Yet there it was in print, the name of the game right on the cover, and available in dozens of bookstores in the Dallas area alone.

In early 1998, Gordon Cameron asked if I'd be the guest editor for SIGGRAPH Computer Graphics Quarterly. The issue was focused on gaming and graphics, and the invitation was largely based on Halcyon Days which I had put together the previous year. I wasn't even a SIGGRAPH member.

I talked to some people I had been in contact with, like Steven Collins (who co-founded Havok that same year) and Owen Rubin (who wrote games for those old "glowing vector" arcade machines). I still like this bit from Noah Falstein's "Portrait of the Artists in a Young Industry":

Incidentally, Sinistar was probably the first videogame to employ motion capture for graphics--of a sort. Jack provided us with three mouth positions, closed, half-open and open. It was up to Sam Dicker, the lead programmer, and myself to figure out which positions to use for which phrases. After a few unsuccessful attempts to synchronize it by hand we hit on a scheme. We wrote each of the short phrases Sinistar spoke on a whiteboard. Then Sam held a marker to his chin with its tip touching the board and moved his head along the phrase, reading it aloud. This gave us a sort of graph showing how his chin dropped as he spoke. Then we "digitized" it, eyeballing the curve, reducing it to three different states and noting duration.

I think one of the seven contributors was recommended by Gordon; the other six were my choice. I suggested topics, edited the articles (and over-edited at least one), wrote the "From the Guest Editor" column, and the completed issue was mailed out to SIGGRAPH members in May.

In both of these cases, I failed to realize how unusual it is to go from idea to print without any interference whatsoever. Somehow my own words and thoughts were getting put into professionally produced, respectable periodicals, without going through any committees, without anyone stopping to ask "Hey, does this guy even know what he's talking about?"

On October 30th of this year, I sat down on a couch in my basement to write a short article I had in my head. The total time from first word to finished piece was one hour, and most of that was spent researching some numbers. I've had unintentionally popular blog entries before, most notably Advice to Aimless, Excited Programmers and Write Code Like You Just Learned How to Program, but that start to finish in one hour entry, Things That Turbo Pascal is Smaller Than, took off faster than anything I've written. It was all over the place that same evening and inexplicably ended up on Slashdot within forty-eight hours.

If you read or linked to that article, thank you.

(If you just started reading this site, you might enjoy A Three-Year Retrospective.)

Photography as a Non-Technical Hobby

When I got into photography in 2004, I approached it differently from the more technical endeavors I've been involved in. It was a conscious decision, not an accident.

I'd been overexposed to years of bickering about computer hardware, programming languages, you name it. All the numbers (this CPU is 17% faster in some particular benchmark), all the personal opinions stated as fact (open source is superior to closed), all the comparisons and put downs (Ruby sucks!). I'd had enough.

Now photographers can be similarly cranky and opinionated. All the different makes and models of cameras, lenses, filters, flashes. Constant dissection of every rumored product. Debates about technique, about whether something is real art or cheating.

I didn't want any of that. I wanted to enjoy creating good pictures without getting into the photography community, without thinking about technical issues at all. No reading tutorials or photography magazines (even though I've had a photo published in a tutorial in one of those magazines). No hanging out in forums. And it has been refreshing.

I've even gone so far as to leave my fancy-pants Nikon in a cupboard most of the time, because it's so much more fun to use my iPhone 4 with the Hipstamatic app. The iPhone completely and utterly loses to the Nikon in terms of absolute image quality, but that's more than balanced out by guaranteeing that I have an unobtrusive camera with me at all times, one that can directly upload photos to my Flickr account.

Here are a few photos I've taken this year. Each one is a link to the Flickr original.



(If you liked this, you might enjoy Constantly Create.)

User Experience Intrusions in iOS 5

The iPhone has obsoleted a number of physical gadgets. A little four-track recorder that I use as a notebook for song ideas. A stopwatch. A graphing calculator. Those ten dollar LCD games from Toys 'R Us. And it works because an iPhone app takes over the device, giving the impression that it's a custom piece of hardware designed for that specific purpose.

But it's only an illusion. I can be in the middle of recording a track, and I get a call. That puts the recorder to sleep and switches over to the phone interface. Or I can be playing a game and the "Battery is below 20%" alert pops up at an inopportune moment. These are interesting edge cases, where the reality that the iPhone is a more complex system--and not a dedicated game player or recorder--bleeds into the user experience. These intrusions are driven by things outside of my control. I didn't ask to be called at that moment; it just happened. I understand that. I get it.

What if there was something I could do within an app that broke the illusion? Suppose that tapping the upper-left corner of the screen ten times in a row caused an app to quit (it doesn't; this is just an example). Now the rule that an app can do whatever it wants, interface-wise, has been violated. You could argue that tapping the corner of the screen ten times is so unlikely that it doesn't matter, but that's a blind assumption. Think of a game based around tapping, for example. Or a drum machine.

As it turns out, two such violations were introduced in iOS 5.

On the iPad, there are a number of system-wide gestures, such as swiping left or right with four fingers to switch between apps. Four-finger swipes? That's convoluted, but imagine a virtual mixing console with horizontal sliders. Quickly move four of them at once...and you switch apps. Application designers have to work around these, making sure that legitimate input methods don't mimic the system-level gestures.

The worst offender is this: swipe down from the top of the screen to reveal the Notification Center (a window containing calendar appointments, the weather, etc.). A single-finger vertical motion is hardly unusual, and many apps expect such input. The games Flight Control and Fruit Ninja are two prime examples. Unintentionally pulling down the Notification Center during normal gameplay is common. A centered vertical swipe is natural in any paint program, too. Do app designers need to work around allowing such controls? Apparently, yes.

There's an easy operating system-level solution to the Notification Center problem. Require the gesture to start on the system bar at the top of the screen, where the network status and battery indicator are displayed. Allowing the system bar in an app is already an intrusion, but one opted into by the developer. Some apps turn off the system bar, including many games, and that's fine. It's an indication that the Notification Center isn't available.

(If you liked this, you might enjoy Caught-Up with 20 Years of UI Criticism.)

2011 Retrospective

I was going to end this blog one year ago.

Prog21 was entirely a personal outlet for the more technical ideas kicking around in my head, and it had run its course. Just before Christmas 2010, I sat down and wrote a final "thanks for reading" essay. I've still got it on my MacBook. But instead of posting it, I dashed off Write Code Like You Just Learned How to Program, and the response made me realize my initial plan may have been too hasty.

In 2011 I posted more articles than in any previous year--32, including this one [EDIT: well, actually it was the second most; there were 33 in 2010]. I finally gave the site a much needed visual makeover. And I'm still wrestling with how to balance the more hardcore software engineering topics that I initially wrote about with the softer, less techy issues that I've gotten more interested in.

Have a great 2012, everyone!

popular articles from 2011

others from 2011 that I personally like

(There's also a retrospective covering 2007-2010.)

A Programming Idiom You've Never Heard Of

Here are some sequences of events:

Take the rake out of the shed, use it to pile up the leaves in the backyard, then put the rake back in the shed.

Fly to Seattle, see the sights, then fly home.

Put the key in the door, open it, then take the key out of the door.

Wake up your phone, check the time, then put it back to sleep.

See the pattern? You do something, then do something else, then you undo the first thing. Or more accurately, the last step is the inverse of the first. Once you're aware of this pattern, you'll see it everywhere. Pick up the cup, take a sip of coffee, put the cup down. And it's all over the place in code, too:

Open a file, read the contents, close the file.

Allocate a block of memory, use it for something, free it.

Load the contents of a memory address into a register, modify it, store it back in memory.

While this is easy to explain and illustrate with examples, it's not simple to implement. All we want is an operation that looks like idiom(Function1, Function2), so we could write the "open a file..." example above as idiom(Open, Read). The catch is that there needs to be a programmatic way to determine that the inverse of "open" is "close." Is there a programming language where functions have inverses?

Surprisingly, yes: J. And this idiom I keep talking about is even a built-in function in J, called under. In English, and not J's terse syntax, the open file example is stated as "read under open."

One non-obvious use of "under" in J is to compute the magnitude of a vector. Magnitude is an easy algorithm: square each component, sum them up, then take the square root of the result. Hmmm...the third step is the inverse of the first. Sum under square. Or in actual J code:

mag =: +/ &.: *:

+/ is "sum." The ampersand, period, colon sequence is "under." And *: is "square."
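The same idiom can be sketched in a more familiar language. Here's a rough Python version (the `under` name and the explicit inverse argument are my own; the whole point of J is that the interpreter derives the inverse for you):

```python
import math

def under(transform, wrapper, unwrapper):
    """Apply wrapper, then transform, then the inverse of wrapper."""
    return lambda x: unwrapper(transform(wrapper(x)))

# Magnitude as "sum under square": square each component,
# sum them, then undo the squaring with a square root.
square = lambda v: [c * c for c in v]
mag = under(sum, square, math.sqrt)

print(mag([3, 4]))  # 5.0
```

Note that Python has no way to recover `math.sqrt` from `square`, which is why it must be passed in explicitly; in J the `&.:` conjunction looks it up automatically.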

(Also see the follow-up.)

Follow-up to "A Programming Idiom You've Never Heard Of"

Lots of mail, lots of online discussion about A Programming Idiom You've Never Heard Of, so I wanted to clarify a few things.

What I was trying to do was get across the unexpected strangeness of function inverses in a programming language. In that short definition of vector magnitude, there wasn't a visible square root function. There was only an operator for squaring a value, and another operator that involved inverting a function.

How does the J interpreter manage to determine a function inverse at runtime? For many primitives, there's an associated inverse. The inverse of add is subtract. The inverse of increment is decrement. For some primitives there isn't a true, mathematical inverse, but a counterpart that's often useful. That's why the preferred term in J isn't inverse, but obverse.

For user-defined functions, there's an attempt at inversion (er, obversion) that works much of the time. A function that reverses a list then adds five to each element turns into a function that subtracts five from each element then reverses the list. For cases where the automated obverse doesn't work, or where you want the obverse to have different behavior, you can associate a user-defined obverse with any verb (J lingo for function). You could define an open_file verb which opens a file and has an obverse that closes a file. Or in actual J:

open_file =: open :. close

Well, really, that should be:

open_file =: (1!:21) :. (1!:22)

But the former, without the explicit foreign function calls, gets the point across clearer, I think.

One common use of obverses and the "under" operator is for boxing and unboxing values. In J, a list contains values of the same type. There's no mixing of integers and strings like Lisp or Python. Instead you can "box" a value, then have a list containing only boxed values. But there's nothing you can do with a boxed value except unbox it, so it's common to say "[some operation] under open box," like "increment under open box." That means unbox the value, increment it, then put it back in a box. Or in real, eyeball-melting J:

inc_box =: >: &. >

The >: is increment. The right > means open box. That's the "under" operation in the middle.

Now it sounds like this "open box, do something, close box" sequence would translate beautifully to the "open file, read the contents, close the file" example I gave last time, but it doesn't. The catch is that the open / read / close verbs aren't manipulating a single input the way inc_box is. Opening a file returns a handle, which gets passed to read. But reading a file returns the contents of the file, which is not something that can be operated on by close. So this definition won't work:

read_file =: read &. open

If a structured data type like a dictionary was being passed around, then okay, but that's not a pretty example like I hoped it would be.

Still, I encourage learning J, if only to make every other language seem easy.

Recovering From a Computer Science Education

I was originally going to call this "Undoing the Damage of a Computer Science Education," but that was too link-baity and too extreme. There's real value in a computer science degree. For starters, you can easily get a good paying job. More importantly, you've gained the ability to make amazing and useful things. But there's a downside, too, in that you can get so immersed in the technical and theoretical that you forget how wonderful it is to make amazing and useful things. At least that's what happened to me, and it took a long time to recover.

This is a short list of things that helped me and might help you too.

Stay out of technical forums unless it's directly relevant to something you're working on. It's far too easy to get wrapped up in discussions of the validity of functional programming or whether or not Scheme can be used to make commercial applications or how awful PHP is. The deeper you get into this, the more you lose touch.

Keep working on real projects related to your area of interest. If you like designing games, write games. If you like photography, write a photo organizer or camera app. Don't approach things wrong-way-around, thinking that "a photo organizer in Haskell" is more important than "a photo organizer which solves a particular problem with photo organizers."

If you find yourself repeatedly putting down a technology, then take some time to actually learn and use it. All the jokes and snide remarks aside, Perl is tremendously useful. Ditto for PHP and Java and C++. Who wins, the person who has been slamming Java online for ten years or the author of Minecraft who just used the language and made tens of millions of dollars?

Don't become an advocate. This is the flipside of the previous item. If Linux or Android or Scala are helpful with what you're building, then great! That you're relying on it is a demonstration of its usefulness. No need to insist that everyone else use it, too.

Have a hobby where you focus on the end results and not the "how." Woodworkers can become tool collectors. Photographers can become spec comparison addicts. Forget all of that and concern yourself with what you're making.

Do something artistic. Write songs or short stories, sketch, learn to do pixel art. Most of these also have the benefit of much shorter turnaround times than any kind of software project.

Be widely read. There are endless books about architecture, books by naturalists, both classic and popular modern novels, and most of them have absolutely nothing to do with computers or programming or science fiction.

Virtual Joysticks and Other Comfortably Poor Solutions

Considering that every video game system ever made shipped with a physical joystick or joypad, the smooth, featureless glass of mobile touchscreens was unnerving. How to design a control scheme when there is no controller?

One option was to completely dodge the issue, and that led to an interesting crop of games. Tip the entire device left and right and read the accelerometer. Base the design around single-finger touches or drawing lines or dragging objects. But the fallback solution for games that need more traditional four or eight way input is to display a faux controller for the player to manipulate.

The virtual joystick option is obvious and easy, but it needs pixels, filling the bottom of the screen with a bitmap representation of an input device. Sometimes it isn't too obtrusive. Other times it's impressively ugly. Aesthetics aside, there's a fundamental flaw: you can't feel the image. There's no feedback indicating that your hand is in the right place or if it slides out of the control area.

There may have been earlier attempts, but Jeff Minter's Minotaur Rescue, released just over a year ago, was the first good alternative to a virtual joystick that I ran across. Minter's insight was that directional movement anywhere on the screen contains useful information. Touch, then slide to the right: that's the same as moving a virtual controller to the right. Without lifting your finger, slide up: that's upward motion. There's no need to restrict input to a particular part of the screen; anywhere is fine.

He even extended this to work for twin-stick shooter controls. The first touch is for movement, the second for shooting, then track each independently. Again, it's not where you touch the screen, it's when and how.

It's all clean and obvious in retrospect, but it took getting past the insistence that putting pictures of joysticks and buttons on the screen was the only approach.

Pretend This Optimization Doesn't Exist

In any modern discussion of algorithms, there's mention of being cache-friendly, of organizing data in a way that's a good match for the memory architectures of CPUs. There's an inevitable attempt at making the concepts concrete with a benchmark manipulating huge--1000x1000--matrices. When rows are organized sequentially in memory, no worries, but switch to column-major order, and there's a very real slowdown. This is used to drive home the impressive gains to be had if you keep cache-friendliness in mind.

Now forget all about that and get on with your projects.

It's difficult to design code for non-trivial problems. Beautiful code quickly falls apart, and it takes effort to keep things both organized and correct. Now add in another constraint: that the solution needs to access memory in linear patterns and avoid chasing pointers to parts unknown.

You'll go mad trying to write code that way. It's like writing a short story without using the letter "t."

If you fixate on the inner workings of caches, fundamental and useful techniques suddenly turn horrible. Reading a single global byte loads an entire cache line. Think objects are better? Querying a byte-sized field is just as bad. Spreading the state of a program across objects scattered throughout memory is guaranteed to set off alarms when you run a hardware-level performance analyzer.

Linked lists are a worst case, potentially jumping to a new cache line for each element. That's damning evidence against languages like Haskell, Clojure, and Erlang. Yet some naive developers insist on using Haskell, Clojure, and Erlang, and they cavalierly disregard the warnings of the hardware engineers and use lists as their primary data structure...

...and they manage to write code where performance is not an issue.

(If you liked this, you might enjoy Impressed by Slow Code.)

Four Levels of Idea Theft

Imagine you've just seen a tremendously exciting piece of software--a mobile app, a web app, a game--and your immediate reaction is "Why didn't I think of that?!" With your mind full of new possibilities, you start on a project, a project enabled by exposure to the exciting software. What happens next is up to you. How far do you let your newfound motivation take you?

Borrowing specific features. You like the way the controls work. The sign-in process. Something specific.

Sliminess Factor: None. This is how progress happens.

General inspiration. If web-based photo sharing had never occurred to you, and then you saw Flickr, that opens the door for thinking about the entire problem space. Some of those options may be Flickr-ish, some aren't.

Sliminess Factor: Low. It's a common reaction to be excited and inspired by something new, and it inevitably affects your thinking.

Using the existing product as a template. Now you're not simply thinking about photo sharing, but having groups and contacts and favorites and tags and daily rankings. You're still not writing a full-on Flickr clone--there are lots of things to be changed for the better--but it's pretty clear what your model is.

Sliminess Factor: Medium. While there's nothing illegal going on, you won't be able to dodge the comparisons, and you'll look silly if you get defensive. Any claims of innovation or original thinking will be dismissed as marketing-speak.

Wholesale borrowing of the design. All pretense of anything other than recreating an existing product has gone out the window. Your photo sharing site is called "Phlickr" and uses the same page layouts as the original.

Sliminess Factor: High. This is the only level that legitimately deserves to be called theft.

(If you liked this, you might enjoy Accidental Innovation.)

A Peek Inside the Erlang Compiler

Erlang is a complex system, and I can't do its inner workings justice in a short article, but I wanted to give some insight into what goes on when a module is compiled and loaded. As with most compilers, the first step is to convert the textual source to an abstract syntax tree, but that's unremarkable. What is interesting is that the code goes through three major representations, and you can look at each of them.

Erlang is unique among functional languages in its casual scope rules. You introduce variables as you go, without fanfare, and there's no creeping indentation caused by explicit scopes. Behind the scenes that's too quirky, so the syntax tree is converted into Core Erlang. Core Erlang looks a lot like Haskell or ML with all variables carefully referenced in "let" statements. You can see the Core Erlang representation of a module with this command from the shell:

c(example, to_core).

The human-readable Core Erlang for the example module is written to example.core.

The next big transformation is from Core Erlang to code for the register-based BEAM virtual machine. BEAM is poorly documented, but it's a lot like the Warren Abstract Machine developed for Prolog (but without the need for backtracking). BEAM isn't terribly hard to figure out if you write short modules and examine them with:

c(example, 'S').

The disassembled BEAM code for the example module is written to example.S. The key to understanding BEAM is that there are two sets of registers: one for passing parameters ("x" registers) and one for use as locals within functions ("y" registers).

Virtual BEAM code is the final output of the compiler, but it's still not what gets executed by the system. If you look at the source for the Erlang runtime, you'll see that beam_load.c is over six thousand lines of code. Six thousand lines to load a module? That's because the beam loader is doing more than its name lets on.

There's an optimization pass on the virtual machine instructions, specializing some for certain situations and combining others into superinstructions. To check if a value is a tuple of three elements is accomplished with a pair of BEAM operations: is_tuple and is_arity. The BEAM loader turns these into one superinstruction: is_tuple_of_arity. You can see this condensed representation of BEAM code with:

erts_debug:df(example).

The disassembled code is written to example.dis. (Note that the module must be loaded, so compile it before giving the above command.)

The loader also turns the BEAM bytecode into threaded code: a list of addresses that get jumped to in sequence. There's no "Now what do I do with this opcode?" step, just fetch and jump, fetch and jump. If you want to know more about threaded code, look to the Forth world.

Threaded code takes advantage of the labels as values extension of gcc. If you build the BEAM emulator with another compiler like Visual C++, it falls back on using a giant switch statement for instruction dispatch and there's a significant performance hit.
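The difference between opcode dispatch and threaded code can be sketched with a toy stack machine (this is a Python illustration of my own, not actual BEAM machinery):

```python
# Switch-style dispatch: every step asks "what opcode is this?"
def run_switch(bytecode):
    stack = []
    for op, arg in bytecode:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
    return stack[-1]

# Threaded code: a loader resolves each opcode to its handler
# once, up front. Execution is just fetch and call, fetch and call.
def do_push(stack, arg): stack.append(arg)
def do_add(stack, arg): stack.append(stack.pop() + stack.pop())

HANDLERS = {"push": do_push, "add": do_add}

def load(bytecode):
    return [(HANDLERS[op], arg) for op, arg in bytecode]

def run_threaded(threaded):
    stack = []
    for handler, arg in threaded:
        handler(stack, arg)
    return stack[-1]

prog = [("push", 2), ("push", 3), ("add", None)]
print(run_switch(prog), run_threaded(load(prog)))  # 5 5
```

With gcc's labels-as-values extension, the "handler" can be a raw code address and the call becomes a direct jump, which is where the performance win over a giant switch comes from.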

(If you liked this, you might enjoy A Ramble Through Erlang IO Lists.)

Don't Fall in Love With Your Technology

In the 1990s I followed the Usenet group comp.lang.forth. Forth has great personal appeal. It's minimalist to the point of being subversive, and Forth literature once crackled with rightness.

Slowly, not in a grand epiphany, I realized that there was something missing from the discussions in that group. There was talk of tiny Forths, of redesigning control structures, of ways of getting by with fewer features, and of course endless philosophical debates, but no one was actually doing anything with the language, at least nothing that was in line with all the excitement about the language itself. There were no revolutions waiting to happen.

I realized comp.lang.forth wasn't for me.

A decade later, I stuck my head back in and started reading. It was the same. The same tinkering with the language, the same debates, and the same peculiar lack of interest in using Forth to build incredible things.

Free Your Technical Aesthetic from the 1970s is one of the more misunderstood pieces I've written. Some people think I was bashing on Linux/Unix as useless, but that was never my intent. What I was trying to get across is that if you romanticize Unix, if you view it as a thing of perfection, then you lose your ability to imagine better alternatives and become blind to potentially dramatic shifts in thinking.

It's bizarre to realize that in 2007 there were still people fervently arguing Emacs versus vi and defending the quirks of makefiles. That's the same year that multi-touch interfaces exploded, low power consumption became key, and the tired, old trappings of faux-desktops were finally set aside for something completely new.

Don't fall in love with your technology the way some Forth and Linux advocates have. If it gives you an edge, if it lets you get things done faster, then by all means use it. Use it to build what you've always wanted to build, then fall in love with that.

(If you liked this, you might enjoy Follow the Vibrancy.)

A Complete Understanding is No Longer Possible

Let's say you've just bought a MacBook Air, and your goal is to become master of the machine, to understand how it works on every level.

Amit Singh's Mac OS X Internals: A Systems Approach is a good place to start. It's not about programming so much as an in-depth discussion of how all the parts of the operating system fit together: what the firmware does, the sequence of events during boot-up, what device drivers do, and so on. At 1680 pages, it's not light summer reading.

To truly understand the hardware, Intel has kindly provided a free seven volume set of documentation. I'll keep things simple by recommending Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture (550 pages) and the two volumes describing the instruction set (684 pages and 704 pages respectively).

Objective-C is the language of OS X. We'll go with Apple's thankfully concise The Objective-C Programming Language (137 pages).

Of course Objective-C is a superset of C, so also work through the second edition of The C Programming Language (274 pages).

Now we're getting to the core APIs of OS X. Cocoa Fundamentals Guide is 239 pages. Application Kit Framework Reference is a monster at 5069 pages. That's a help-file-like description of every API call. To be fair I'll stop there with the Cocoa documentation, even though there are also more usable guides for drawing and Core Audio and Core Animation and a dozen other things.

Ah, wait, OpenGL isn't part of Cocoa, so throw in the 784 page OpenGL Reference Manual. And another 800 pages for OpenGL Shading Language, Second Edition.

The total of all of this is 79 pages shy of eleven thousand. I neglected to include man pages for hundreds of system utilities and the Xcode documentation. And I didn't even touch upon the graphics knowhow needed to do anything interesting with OpenGL, or how to write good C and Objective-C or anything about object-oriented design, and...

(If you liked this, you might enjoy Things That Turbo Pascal is Smaller Than.)

Solving the Wrong Problem

Occasionally, against my better judgement, I peek into discussion threads about things I've written to see what the general vibe is, to see if I've made some ridiculous mistake that no one bothered to tell me directly about. The most unexpected comments have been about how quickly this site loads, that most pages involve only two requests--the HTML file and style sheet--for less than ten kilobytes in total, and that this is considered impressive.

Some of that speed is luck. I use shared hosting, and I have no control over what other sites on the same server are doing.

But I've also got a clear picture of how people interact with a blog: they read it. With the sole exception of myself, all people do with the prog21 site is grab files and read them. There's no magic to serving simple, static pages. What's surprising is that most implementers of blogging software are solving the wrong problems.

An SQL database of entries that can be on-the-fly mapped to themed templates? That's a solution designed to address issues of blog maintainers, not readers, yet all readers pay the price of slower page loads or not being able to see a page at all after a mention on a high-profile site.

(At one time, the Sieve of Eratosthenes--an algorithm for finding all prime numbers up to a given limit--was a popular benchmark for programming language performance. As an example of raw computation, the Sieve was fine, but suppose you needed a list of the primes less than 8,000 in a performance-sensitive application. Would you bother computing them at run time? Of course not. You already know them. You'd run the program once during development, and that's that.)
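The precompute-once idea, sketched in Python (the 8,000 threshold comes from the example above): run the sieve during development, paste its output into the program as a constant, and never compute it again.

```python
def sieve(limit):
    """Sieve of Eratosthenes: all primes below limit."""
    is_prime = [True] * limit
    is_prime[0:2] = [False, False]
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Mark every multiple of n as composite.
            for m in range(n * n, limit, n):
                is_prime[m] = False
    return [n for n in range(limit) if is_prime[n]]

# Run once at development time, then bake the result into the
# shipping code as a literal:
# PRIMES_UNDER_8000 = [2, 3, 5, 7, 11, ...]
print(sieve(30))
```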

Tracking cookies and Google Analytics scripts? Those solve a problem specific to the site owner: "How can I know exactly how much traffic I'm getting?" Readers don't care.

Widgets for Google+ and Twitter and Facebook? These don't solve anyone's problems. You can tweet easily enough without a special button. Aggregation sites, Google Reader, and even Google search results have the equivalent of "like" buttons, so why duplicate the functionality? More importantly, only a small minority of users bother with these buttons, but all the associated scripting and image fetching slows down page loads for everyone.

(If you liked this, you might enjoy It's Like That Because It Has Always Been Like That.)

Turning Your Code Inside Out

If I gave this assignment to a class of novice programmers:

Write a C program to sum the elements of an array and display the results. Include five test cases.

I'd expect multiple people would come up with a function like this:

void sum_array(int array[], int size)
{
    int sum = 0;
    for (int i = 0; i < size; i++) {
        sum += array[i];
    }
    printf("the sum of the array is %d\n", sum);
}

There's a cute new programmer-ism in that solution: displaying the output is built into sum_array instead of returning the sum and printing it elsewhere. Really, it's hard to see why that one extra line is a bad idea, at least up front. It prevents duplication of the printf call, which is good, right? But tack on an extra requirement such as "Use your array summing function to compute the total sum of three arrays," and there will be an "Ohhhhh, I don't want that print in there" moment.

The design error was obvious in this example, but it crops up in other cases where it isn't so immediately clear.

Let's say we're writing a video game and there's a function to spawn an attacker. To alert the player of the incoming danger, there's a sound accompanying the appearance of each new attacker, so it makes sense to play that sound inside new_attacker.

Now suppose we want to spawn five attackers at the same time by calling new_attacker in a loop. Five attackers are created as expected, but now five identical sounds are starting during the same frame. Those five sounds will be perfectly overlaid on top of each other, at best sounding five times louder than normal and at worst breaking up because the audio levels are too high. As a bonus, we're taking up five audio channels so the player can hear this mess.

The solution is conceptually the same as the sum_array example: take the sound playing out of new_attacker and let the caller handle it. Now there's a single function call to start a sound followed by a loop that creates the attackers.
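The inside-out version of the summing function looks something like this (a Python sketch of the same refactoring rather than the C above): the function stays pure and the caller owns all the printing.

```python
def sum_array(values):
    # Pure: compute and return, no I/O inside.
    total = 0
    for v in values:
        total += v
    return total

arrays = [[1, 2, 3], [10, 20], [5]]

# The caller decides what to do with each result...
for a in arrays:
    print("the sum of the array is", sum_array(a))

# ...including the later requirement: a total across all three,
# with no unwanted print statements along the way.
print("grand total:", sum(sum_array(a) for a in arrays))
```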

Why am I bothering to talk about this?

This method of turning your code inside out is the secret to solving what appear to be hopelessly state-oriented problems in a purely functional style. Push the statefulness to a higher level and let the caller worry about it. Keep doing that as much as you can, and you'll end up with the bulk of the code being purely functional.

(If you liked this, you might enjoy Purely Functional Retrogames.)

This is Why You Spent All that Time Learning to Program

There's a standard format for local TV news broadcasts that's easy to criticize. (By "local," I mean any American town large enough to have its own television station.)

There's an initial shock-value teaser to keep you watching. News stories are read in a dramatic, sensationalist fashion by attractive people who fill most of the screen. There's an inset image over the shoulder of the reader. Periodically there's a cutaway to a reporter in the field; it's often followed up with side-by-side images of the newscaster and reporter while the former asks a few token questions of the latter. There's pretend banter between newscasters after a feel-good story.

You get the idea. Now what if I wanted to change this entrenched structure?

I could get a degree in journalism and try to get a job at the local TV station. I'd be the new guy with no experience, so it's not likely I could just step in and make sweeping reforms. All the other people there have been doing this for years or decades, and they've got established routines. I can't make dozens of people change their schedules and habits because I think I'm so smart. To be perfectly fair, a drastic reworking of the news would result in people who had no issues with the old presentation getting annoyed and switching to one of the other channels that does things the old way.

When I sit down to work on a personal project at home, it's much simpler.

I don't have to follow the familiar standards of whatever kind of app I'm building. I don't have to use an existing application as a model. I can disregard history. I can develop solutions without people saying "That's not how it's supposed to work!"

That freedom is huge. There are so many issues in the world that people complain about, and there's little chance of fixing the system in a significant way. Even something as simple as reworking the local news is out of reach. But if you're writing an iOS game, an HTML 5 web app, a utility that automates work so you can focus on the creative fun stuff, then you don't have to fall back on the existing, comfortable solutions that developers before you chose simply because they too were trapped by the patterns of the solutions that came before them.

You can fix things. You can make new and amazing things. Don't take that ability lightly.

(If you liked this, you might enjoy Building Beautiful Apps from Ugly Code.)

100,000 Lines of Assembly Language

I occasionally get asked about writing Super Nintendo games. How did anyone manage to work on projects consisting of hundreds of thousands of lines of 16-bit assembly language?

The answer is that it's not nearly as Herculean as it sounds.

The SNES hardware manual is a couple of hundred pages. I don't remember the exact number, so I'll shoot high: 400 pages. Add in a verbose 65816 assembly language book and combined we're talking 800 or 900 pages tops. That's eight percent of the total I came up with for having a complete understanding of an OS X based computer: nearly 11,000 pages.

Sure, there are whole classes of errors that you can make in assembly language that are invisible in C. For example, here's some old-school x86 code:

mov ax, 20
mov bx, -1
int XX

This sets up a couple of parameters and calls an interrupt. It looks right, it works, it may even ship in a commercial product, but then there's a new MS-DOS version and it crashes. Why? Because the second parameter should be passed in the dx register, not bx. It only worked because a previous interrupt happened to return -1 in dx, so the second line above isn't actually doing anything useful. But those kinds of errors are rare.

The secrets of working entirely in assembly language are being organized, thinking before implementing, and keeping things clean and understandable. That sounds a lot like how to write good Javascript or C++. Steve McConnell's Code Complete is actually a guidebook for the Super Nintendo game programmer.

But all this talk of programming languages and hardware is backward. Jordan Mechner designed and built the original Prince of Persia on an Apple II. The game and the editor for laying out the levels are written in assembly code for the 8-bit 6502. He kept a journal while writing the game.

You might expect the journal to be filled with coding philosophies and 6502 tricks, but there's little of that. Sure, he's doing difficult tech work behind the scenes, but that's not what he's writing about. They're the journals of a designer and a director, of someone living far away from home after graduating from college, with long detours into his screenwriting aspirations (and don't let that scare you off; they're fascinating).

He may have had a second set of coding journals, but I like to think he didn't. Even if he did, he was clearly thinking about more than the techie side of things, in the way that a novelist's personal journal is unlikely to be filled with ramblings about grammar and sentence structure.

(If you liked this, you might enjoy The Pure Tech Side is the Dark Side.)

Use and Abuse of Garbage Collected Languages

The garbage collection vs. manual memory management debates ended years ago. As with the high-level vs. assembly language debates which came before them, it's hard to argue in favor of tedious bookkeeping when there's an automatic solution. Now we use Python, Ruby, Java, Javascript, Erlang, and C#, and enjoy the productivity benefits of not having to formally request and release blocks of bytes.

But there's a slight, gentle nagging--not even a true worry--about this automatic memory handling layer: what if when my toy project grows to tens or hundreds of megabytes of data, it's no longer invisible? What if, despite the real-time-ness and concurrent-ness of the garbage collector, there's a 100 millisecond pause in the middle of my real-time application? What if there's a hitch in my sixty frames per second video game? What if that hitch lasts two full seconds? The real question here is "If this happens, then what can I possibly do about it?"

These concerns aren't theoretical. There are periodic reports from people for whom the garbage collector has switched from being a friendly convenience to the enemy. Maybe it's a super-sized heap? Maybe they've accidentally triggered worst-case behavior in the GC? Or maybe they're using the GC in an environment where pauses didn't matter until recently?

Writing a concurrent garbage collector to handle gigabytes is a difficult engineering feat, but any student project GC will tear through a 100K heap fast enough to be worthy of a "soft real-time" label. While it should be obvious that keeping data sizes down is the first step in reducing garbage collection issues, it's something I haven't seen much focus on. In image processing code written in Erlang, I've used the atom transparent to represent pixels where the alpha value is zero (instead of a full tuple: {0,0,0,0}). Even better is to work with runs of transparent pixels (such as {transparent, Length}). Data-size optimization in dynamic languages is the new cycle counting.
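The transparent-pixel trick is run-length encoding applied to data-size reduction. Here's a rough Python sketch of the idea; the tuple layout and the "transparent" marker mirror the Erlang description above, but the function and names are illustrative, not the author's actual code:

```python
# Collapse runs of fully transparent pixels into ("transparent", run_length)
# markers, so a mostly-empty row takes a handful of list items instead of
# one four-tuple per pixel.

def encode_row(pixels):
    """pixels: list of (r, g, b, a) tuples. Returns a mixed list of
    opaque pixel tuples and ("transparent", count) run markers."""
    out = []
    run = 0
    for p in pixels:
        if p[3] == 0:               # alpha channel is zero: transparent
            run += 1
        else:
            if run:
                out.append(("transparent", run))
                run = 0
            out.append(p)
    if run:                         # flush a trailing run
        out.append(("transparent", run))
    return out

row = [(0, 0, 0, 0)] * 300 + [(255, 10, 10, 255)] + [(0, 0, 0, 0)] * 200
encoded = encode_row(row)
# 501 four-tuples collapse into 3 list items.
```

The garbage collector now traverses three items per row instead of five hundred, which is the whole point: less live data means less GC work.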

There's a more often recommended approach to solving garbage collection pauses, and while I don't want to flat-out say it's wrong, it should at least be viewed with suspicion. The theory is that more memory allocations means the garbage collector runs more frequently, therefore the goal is to reduce the number of allocations. So far, so good. The key technique is to preallocate pools of objects and reuse them instead of continually requesting memory from and returning it to the system.
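In sketch form, the pooling technique looks something like this (the Pool class and the bullet objects are illustrative, not taken from any particular engine):

```python
# A minimal object pool: preallocate a fixed number of objects up front,
# then check them out and return them instead of allocating new ones.

class Pool:
    def __init__(self, factory, size):
        # All allocation happens here, once.
        self._free = [factory() for _ in range(size)]

    def acquire(self):
        # Hand out a preallocated object; None if the pool is exhausted.
        return self._free.pop() if self._free else None

    def release(self, obj):
        # Return the object for reuse rather than letting it become garbage.
        self._free.append(obj)

bullets = Pool(lambda: {"x": 0, "y": 0, "live": False}, size=64)
b = bullets.acquire()
b["live"] = True
# ... use it for a frame or two ...
bullets.release(b)   # back to the pool; no new garbage generated
```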

Think about that for a minute. Manual memory management is too error prone, garbage collection abstracts that away, and now the solution to problems with garbage collection is to manually manage memory? This is like writing your own file buffering layer that sits on top of buffered file I/O routines. The whole point of GC is that you can say "Hey, I'd like a new [list/array/object]," and it's quick, and it goes away when no longer referenced. Memory is a lightweight entity. Need to build up an intermediate list and then discard it? Easy! No worries!

If this isn't the case, if memory allocations in a garbage collected language are still something to be calorie-counted, then maybe the memory management debates aren't over.

(If you liked this, you might enjoy Why Garbage Collection Paranoia is Still (sometimes) Justified.)

Can You Be Your Own Producer?

I've worked on personal projects where I went badly off track and didn't realize it until much later. What I needed was someone to nudge me in the right direction, someone to objectively point out the bad decisions I was making.

What I needed was a producer.

When a band is recording an album, the producer isn't there to write songs or play instruments, but to provide focus and give outside perspective. A good producer should say "Guys, you sound great live, but it's not coming through in the recording; can we try getting everyone in here at the same time and see how that goes?" or "We've got three songs with similar wandering breakdowns; if you could replace one, which would it be?"

(Here I should point out that a producer in the music production sense is different than a producer in film or in video games. Same term, different meanings.)

If you're a lone wolf or part of a small group building something in your basement, there's tremendous value in being able to step back and get into the producer mindset. Maybe you can't be both a producer and developer at the same time, but recognizing that you need to switch hats periodically is key.

What are the kinds of questions you should be asking when in your producer role?

Are you letting personal technology preferences cloud your vision? Haskell is a great language, but what if you're writing a lot of code that you'd get for free with Python? Yes, Android is open to some extent and iOS isn't, but should you be dismissing the entire iPhone / iPad market for that reason?

Are you avoiding doing the right thing because it's hard? If everyone you show your project to has the same confusion about how a feature works, disregarding it because it would take a month of rearchitecting usually isn't a valid response.

Are you simply copying existing ideas without offering anything new? It's so easy to see a finished application and jump into writing your own version. But think about the problem that it was designed to solve instead of copying the same solution. A painting program doesn't have to use the blueprint laid out by MacPaint in 1984 [EDIT: that blueprint appeared earlier in Draw for the Xerox Alto and SuperDraw for the PC]. An IDE doesn't have to follow the project-tree-on-the-left schematic.

Are you spending too much time building your own solutions instead of using what's already out there? Should you really be writing another webserver or markdown formatter or JSON decoder?

Do you understand your audience? If you're building an application for graphic artists, are you familiar with how graphic artists work? Are you throwing out features that would be met with horrified stares?

A Forgotten Principle of Compiler Design

That a clean system for separately compiled modules appeared in Modula-2, a programming language designed by Niklaus Wirth in 1978, but not in the 2011 C++ standard...hmmm, no further comment needed. But the successor to Modula-2, Oberon, is even more interesting.

With Oberon, Wirth removed features from Modula-2 while making a few careful additions. It was a smaller language overall. Excepting the extreme minimalism of Forth, this is the first language I'm aware of where simplicity of the implementation was a concern. For example, nested modules were rarely used in Modula-2, but they were disproportionately complex to compile, so they were taken out of Oberon.

That simplicity carried over to optimizations performed by the compiler. Here's Michael Franz:

Optimizing compilers tend to be much larger and much slower than their straightforward counterparts. Their designers usually do not follow Oberon's maxim of making things "as simple as possible", but are inclined to completely disregard cost (in terms of compiler size, compilation speed, and maintainability) in favor of code-quality benefits that often turn out to be relatively marginal. Trying to make an optimizing compiler as simple as possible and yet as powerful as necessary requires, before all else, a measurement standard, by which both simplicity and power can be judged.

For a compiler that is written in the language it compiles, two such standards are easily found by considering first the time required for self-compilation, and then the size of the resulting object program. With the help of these benchmarks, one may pit simplicity against power, requiring that every new capability added to the compiler "pays its own way" by creating more benefit than cost on account of at least one of the measures.

The principle is "compiler optimizations should pay for themselves."

Clearly it's not perfect (the Oberon compiler doesn't make heavy use of floating point math, for example, so floating point optimizations may not speed it up or make it smaller), but I like the spirit of it.

(If you liked this, you might enjoy Papers from the Lost Culture of Array Languages.)

The Most Important Decisions are Non-Technical

I occasionally get puzzled questions about a parenthetical remark I made in 2010: that I no longer program for a living. It's true. I haven't been a full-time programmer since 2003. The short version of these questions is "Why?" The longer version is "Wait, you've got a super technical programming blog and you seem to know all this stuff, but you don't want to work as a programmer?"

The answer to both of these is that I realized that the most important decisions are non-technical. That's a bare and bold statement, so let me explain.

In the summer of 1993, I answered a newspaper ad looking for a "6502 hacker" (which I thought was amusing; the Pentium was released that same year) and got a job with a small company near Seattle writing Super Nintendo games. I had done game development prior to that, but it was me working by myself in my parents' living room.

The first SNES game I worked on was a Tarzan-themed platformer authorized by the estate of Edgar Rice Burroughs (it had no connection to the Disney movie, which was still six years in the future). I had fun working out ways of getting tropical fish to move in schools and creating behaviors for jungle animals like monkeys and birds. It was a great place to work, with the ten or so console programmers all sharing one big space.

The only problem was that the game was clearly going to be awful. It was a jumble of platformer clichés, and it wasn't fun. All the code tweaking and optimization and monkey behavior improvements weren't going to change that. To truly fix it required a project-level rethink of why we were building it in the first place. As a "6502 hacker" I wasn't in a position to make those decisions.

While it's fun to discuss whether an application should be implemented in Ruby or Clojure, to write beautiful and succinct code, to see how far purely functional programming can be taken, these are all secondary to defining the user experience, to designing a comfortable interface, to keeping things simple and understandable, to making sure you're building something that's actually usable by the people you're designing it for. Those are more important decisions.

Whatever happened to that Tarzan game? Even though it had been previewed in Nintendo Power, the publisher wisely chose to pay for the year of development and shelve the finished project.

You, Too, Can Be on the Cutting Edge of Functional Programming Research

In 1999 I earned $200 writing an essay titled Toward Programmer Interactivity: Writing Games in Modern Programming Languages. It was an early, optimistic exploration of writing commercial games in Haskell, ML, and Lisp.

It was not a good article.

It's empty in the way that so many other "Hey everyone! Functional programming! Yeah!" essays are. I demonstrated the beauty of Haskell in the small, but I didn't offer any solutions for how to write state-heavy games in it. There are a few silly errors in the sample code, too.

Occasionally during the following years, I searched for information about writing games in functional languages, and that article kept coming up. Other interesting references turned up too, like papers on Functional Reactive Programming, but apparently I had accidentally become an authority. An authority who knew almost nothing about the subject.

I still didn't know if a mostly-functional style would scale-up past the usual toy examples. I didn't know how to build even a simple game without destructive assignment. I wasn't sure if there were even any legitimate benefits. Was this a better way of implementing game ideas than the usual tangled web of imperative code--or madness?

As an experiment, I decided to port an action game that I wrote in 1997 to mostly-pure Erlang. It wasn't a toy, but a full-featured game chock full of detail and special cases. I never finished the port, but I had the bulk of the game playable and running smoothly, and except for a list of special cases that I could write on a "Hello! My Name is" label, it was purely functional. I wrote about what I learned in Purely Functional Retrogames.

Now when I search for info about writing games in a functional style, that's what I find.

Sure, there are some other sources out there. Several times a year a new, exuberant "Haskell / ML / Erlang is a perfect match for games!" blog entry appears. Functional Reactive Programming keeps evolving. A couple of people have slogged through similar territory and managed to bang out real games in Haskell (Raincat is a good example).

If you want to be on the cutting edge of functional programming research, it's easy. Pick something that looks like a poor match for state-free code, like a video game, and start working on it. Try to avoid going for the imperative pressure release valves too quickly. At some point you're going to need them, but make sure you're not simply falling back on old habits. Keep at it, and it won't be long before you're inventing solutions to problems no one else has had to deal with.

If you come up with something interesting, I'd like to hear about it.

(If you liked this, you might enjoy Back to the Basics of Functional Programming.)

We Who Value Simplicity Have Built Incomprehensible Machines

The 8086 "AAA" instruction seemed like a good idea at the time. In the 1970s there was still a case to be made for operating on binary-coded decimal values, with two digits per byte. What's the advantage of BCD? Large values can be easily displayed without multi-byte division or multiplication. "ASCII Adjust After Addition," or AAA, was committed to the x86 hardware and 30+ years later it's still there, emulated in microcode, in every i7 processor.
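The display advantage is easy to see in a sketch: each BCD nibble is already a decimal digit, so turning a value into text needs only shifts and masks, no multi-byte division. This Python model of packed BCD is illustrative; the function name is made up:

```python
# Packed BCD: two decimal digits per byte, high nibble first.
# Converting to a printable string is pure bit twiddling.

def bcd_to_string(bcd_bytes):
    """bcd_bytes: list of packed-BCD bytes, most significant first."""
    return "".join(f"{b >> 4}{b & 0x0F}" for b in bcd_bytes)

bcd_to_string([0x12, 0x34, 0x56])   # → "123456"
```

Contrast that with a plain binary integer, where extracting decimal digits means repeated division by ten across multiple bytes.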

The C library function memcpy seemed like a good idea at the time. memmove was fast and robust, properly handling the case where the source and destination overlapped. That handling came at the expense of a few extra instructions that were enough of a concern to justify a second, "optimized" memory copying routine (a.k.a. memcpy). Since then we've had to live with both functions, though there has yet to be an example of an application whose impressive performance can be credited to the absence of overlap-detection code in memcpy.
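To see what that overlap-detection code buys you, here's a Python model of the two copying strategies. This is a sketch of the semantics, not libc's implementation; the function names are made up:

```python
# memcpy-style: always copy forward. Fast, but wrong when the
# destination overlaps the source region it hasn't read yet.
def naive_copy(buf, dst, src, n):
    for i in range(n):
        buf[dst + i] = buf[src + i]

# memmove-style: pick the copy direction so source bytes are read
# before they're overwritten.
def overlap_safe_copy(buf, dst, src, n):
    if dst > src:
        for i in reversed(range(n)):    # copy backward
            buf[dst + i] = buf[src + i]
    else:
        naive_copy(buf, dst, src, n)

a = bytearray(b"abcdef")
naive_copy(a, 2, 0, 4)          # overlapping regions: a is now b"ababab"

b = bytearray(b"abcdef")
overlap_safe_copy(b, 2, 0, 4)   # b is b"ababcd", as intended
```

The safe version costs one comparison up front, which was the entire justification for a second function.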

libpng seemed like a good idea at the time. The theory was to have an easy, platform-independent way of reading and writing PNG files. The result does work, and it is platform independent, but it's possibly the only image decoding library where I can read through the documentation and still not know how to load an image. I always Google "simple libpng example" and cut and paste the 20+ line function that turns up.

The UNIX ls utility seemed like a good idea at the time. It's the poster child for the UNIX way: a small tool that does exactly one thing well. Here that thing is to display a list of filenames. But deciding exactly what filenames to display and in what format led to the addition of over 35 command-line switches. Now the man page for the BSD version of ls bears the shame of this footnote: "To maintain backward compatibility, the relationships between the many options are quite complex."

None of these examples are what caused modern computers to be incomprehensible. None of them are what caused SDKs to ship with 200 page overview documents to give some clue where to start with the other thousands of pages of API description.

But all the little bits of complexity, all those cases where indecision caused one option that probably wasn't even needed in the first place to be replaced by two options, all those bad choices that were never remedied for fear of someone somewhere having to change a line of code...they slowly accreted until it all got out of control, and we got comfortable with systems that were impossible to understand.

We did this. We who claim to value simplicity are the guilty party. See, all those little design decisions actually matter, and there were places where we could have stopped and said "no, don't do this." And even if we were lazy and didn't do the right thing when changes were easy, before there were thousands of users, we still could have gone back and fixed things later. But we didn't.

(If you liked this, you might enjoy Living in the Era of Infinite Computing Power.)

The Pace of Technology is Slower than You Think

"That post is OLD! It's from 2006!" The implication is that articles on technology have a shelf-life, that writings on programming and design and human factors quickly lose relevance. Here's a reminder that the pace of technological advancement isn't as out of control as it may seem.

The first book on Objective-C, the language of modern iOS development, was published in 1986.

Perl came on the scene in 1987, Python in 1991, Ruby in 1995.

You can still buy brand new 6502 and Z80 microprocessors (a Z80 is $2.49 from Jameco Electronics). A Z80 programming guide written in 1979 is still relevant.

Knowledge of the C standard library would have served you equally well developing for MS-DOS, early SUN workstations, the Atari ST, Microsoft Windows, and iOS.

The Quicksort algorithm, taught in all computer science curricula, was developed by Tony Hoare in 1960.

Bill Joy wrote vi in 1976. The span of time between it and the initial release of Bram Moolenaar's vim in 1991 (15 years) is shorter than the time between the release of vim and this blog entry (21 years).

The instruction set of the 80386 CPU, announced in 1985 and available the following year, is still a common target for 32-bit software development.

The tar command appeared in Seventh Edition UNIX in 1979, the same year the vector-based Asteroids arcade game was released. Pick up any 2012 MacBook Air or Pro and tar is there.

Another Programming Idiom You've Never Heard Of

New programmers quickly pick up how array indexing works. You fetch an element like this: array[3]. (More experienced folks can amuse themselves with the equally valid 3[array] in C.) Now here's a thought: what if you could fetch multiple values at the same time and the result was a new array?

Let's say the initial array is this:

10 5 9 6 20 17 1

Fetching the values at indices 0, 1, 3, and 6, gives:

10 5 6 1

In the J language you can actually do this, though the syntax likely isn't familiar:

0 1 3 6 { 10 5 9 6 20 17 1

The list of indices is on the left, and the original array on the right. That awkwardly unmatched brace is the index operator. (You can also achieve the same end in the R language, if you prefer.)
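If the J syntax is opaque, the same operation transcribes directly into Python: index with a list of positions and get a new list back (the gather function is my name for it, not a J or Python builtin):

```python
# Fetch multiple elements at once: the result is a new array built
# from the values at the given indices.

data = [10, 5, 9, 6, 20, 17, 1]

def gather(indices, xs):
    return [xs[i] for i in indices]

gather([0, 1, 3, 6], data)   # → [10, 5, 6, 1]
```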

This may seem like a frivolous extension to something you already knew how to do, but all of a sudden things have gotten interesting. Now indexing can be used for more than just indexing. For example, you can delete elements by omitting indices. This drops the first two elements:

2 3 4 5 6 { 10 5 9 6 20 17 1

Or how about reversing an array without needing a special primitive:

6 5 4 3 2 1 0 { 10 5 9 6 20 17 1

This last case is particularly significant, because the indices specify a permutation of the original array. Arrange the indices however you want, and you can transform an array to that order.

In J, there's an operator that's like a sort, except instead of returning the sorted values it returns a permutation: for each position in the sorted result, the index of the element that belongs there. Using the same "10 5 9..." array, the smallest value, 1, is at index 6, so 6 comes first; the next smallest, 5, is at index 1; and so on. Here's the whole array of permuted indices:

6 1 3 2 0 5 4

What good is that? If you use that list of indices on the left side of the "{" operator with the original array on the right, you sort the array:

6 1 3 2 0 5 4 { 10 5 9 6 20 17 1

Now imagine you've got two other parallel arrays that you want to keep in sync with the sorted one. All you do is use that same "sorted permutation" array to index into each of the other arrays, and you're done.
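The whole trick fits in a few lines of Python. The grade-style permutation is what NumPy calls argsort; here it's built with plain sorted, and the parallel names array is a made-up example:

```python
# Sort one array, then reuse the permutation to reorder a parallel
# array in lockstep.

data  = [10, 5, 9, 6, 20, 17, 1]
names = ["j", "b", "f", "c", "k", "h", "a"]   # parallel to data

# Like J's grade operator: the indices that would sort `data`.
perm = sorted(range(len(data)), key=lambda i: data[i])
# perm == [6, 1, 3, 2, 0, 5, 4]

sorted_data  = [data[i] for i in perm]    # [1, 5, 6, 9, 10, 17, 20]
sorted_names = [names[i] for i in perm]   # reordered to match sorted_data
```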

(If you liked this, you might enjoy the original A Programming Idiom You've Never Heard Of.)

Your Coding Philosophies are Irrelevant

I'll assume you've got a set of strongly-held beliefs about software development. This is a safe bet; anyone who writes code has some personal mantras and peeves.

Maybe you think that PHP is a broken mess or that Perl is unmaintainable? Maybe you're quick to respond in forums with essays about the pointlessness of the singleton pattern? You should always check the result code after calling malloc. Or wait, no, result codes are evil and exceptions are The Way. Always write the test cases before any code. Static typing is...well, I could keep going and hit dozens of other classic points of contention and link to arguments for each side.

Now imagine there are two finished apps that solve roughly identical problems. One is enjoyable to use and popular and making a lot of money. The other just doesn't feel right in a difficult to define way. One of these apps follows all of your development ideals, but which app is it? What if the successful product is riddled with singletons, doesn't check result codes after allocating memory (but the sizes of these allocations are such that failures only occur in pathological cases), and the authors don't know about test-driven development? Oh, and the website of the popular app makes extensive use of PHP.

Or even simpler: pick a single tool or game or application you admire, one you don't have any inside information about. Now just by staring hard at the screen, determine if the author favored composition over inheritance or if he's making rampant use of global variables. Maybe there are five-hundred-line functions and gotos all over the place, but you can't tell. Even if you pick a program that constantly crashes, how do you know the author doesn't have exactly the same opinions about development as you?

It's not the behind-the-scenes, pseudo-engineering theories that matter. An app needs to work and be relatively stable and bug free, but there are many ways to reach that point. There isn't a direct connection between some techie feel-good rule and success. For most arbitrary rules espoused in forums and blogs, you'll find other people vehemently arguing the opposite opinion. And it might just be that too much of this kind of thinking is turning you into an obsessive architect of abstract code, not the builder of things people want.

(If you liked this, you might enjoy Don't Fall in Love with Your Technology.)

The Silent Majority of Experts

When I still followed the Usenet group comp.lang.forth, I wasn't the only person frustrated by the lack of people doing interesting things with the language. Elizabeth Rather, co-founder of Forth, Inc., offered the following explanation: there are people solving real problems with Forth, but they don't hang out in the newsgroup. She would know; her company exists to support the construction of commercial Forth projects.

In 1996 I worked on a port of The Need for Speed to the SEGA Saturn. (If you think that's an odd system to be involved with, I also did 3DO development, went to a Jaguar conference at Atari headquarters, and had an official set of Virtual Boy documentation.) There were a number of game developers with public faces in the 1990s, but the key people responsible for the original version of The Need for Speed, released in 1994, remained unknown and behind the scenes. That's even though they had written a game based around rigid-body physics before most developers had any idea that term was relevant to 3D video games. And they did it without an FPU: the whole engine used fixed-point math.

Yes, there are many people who blog and otherwise publicly discuss development methodologies and what they're working on, but there are even more people who don't. Blogging takes time, for example, and not everyone enjoys it. Other people are working on commercial products and can't divulge the inner workings of their code.

That we're unable to learn from the silent majority of experts casts an unusual light upon online discussions. Just because looking down your nose at C++ or Perl is the popular opinion doesn't mean that those languages aren't being used by very smart folks to build amazing, finely crafted software. An appealing theory that gets frantically upvoted may have well-understood but non-obvious drawbacks. All we're seeing is an intersection of the people working on interesting things and who like to write about it--and that's not the whole story.

Your time may be better spent getting in there and trying things rather than reading about what other people think.

(If you liked this, you might enjoy Photography as a Non-Technical Hobby.)

I Am Not a Corporation

In 2009, when I exclusively used a fancy Nikon DSLR, my photographic work flow was this: take pictures during the day, transfer them to a PC in the evening, fiddle with the raw version of each shot in an image editor, save out a full-res copy, make a smaller version and upload it to Flickr.

Once I started using an iPhone and the Hipstamatic app, my work flow was this: take a picture, immediately upload it to Flickr.

Pick any criteria for comparing the absolute quality of the Nikon vs the iPhone, and the Nikon wins every time: sharpness, resolution, low-light ability, you name it. It's not even close. And yet I'm willing to trade that for the simplicity and fun of using the iPhone.

That's because I'm not a professional photographer who gets paid to put up with the inconveniences that come with the higher-end equipment. If I can avoid daily image transfers, that's a win for me. If I don't have to tweak around with contrast settings and color curves, that's huge.

I also work on projects in my spare time that involve writing code, but I don't have the luxury of a corporate IT department that keeps everything up to date and running smoothly. I don't want to be maintaining my own Linux installation. I would prefer not to wait forty-five minutes to build the latest bug-fix release of some tool. I don't think most developers want to either; that kind of self-justified technical noise feels very 1990s.

When I'm trying out an idea at home, I'm not getting paid to deal with what a professional software engineer would. If I've got thirty minutes to make progress, I don't want to spend that puzzling out why precompiled headers aren't working. I don't want to spend it debugging a makefile quirk. I don't want to decipher an opaque error message because I got something wrong in a C++ template. I don't want to wait for a project to compile at all. I'm willing to make significant trades to avoid these things. If I can get compile times of zero or close to zero, that's worth a 100x performance hit in the resulting code. Seriously.

If I were a full-time programmer paid to eke out every last bit of performance, then there's no way I'd consider making such a trade. But I'm not, and if I pretended otherwise and insisted on using the same tools and techniques as the full-time pros, I'd end up frustrated and go all Clifford Stoll and move to an internet-free commune in Tennessee.

Fun and simplicity are only optional if you're paid to ignore them.

(If you liked this, you might enjoy Recovering From a Computer Science Education.)

Things to Optimize Besides Speed and Memory

Whittling down a function to accomplish the same result with fewer instructions is, unfortunately, fun. It's a mind teaser in the same way that crossword puzzles and Sudoku are. Yet it's a waste of time to finely hone a C++ routine that would be more than fast enough if implemented in interpreted Python. Fortunately, there are plenty of other targets for that optimization instinct, and it's worth retraining your habits to give these aspects of your projects more attention:

Power consumption, battery life, heat, and fan noise.

Number of disk sector writes (especially for solid-state drives). Are you rewriting files that haven't changed?

Overall documentation size and complexity.

How much time it takes to read a tutorial--and the engagement level of that tutorial.

Number of bytes of network traffic. The multiplayer game folks have been concerned with this from the start, but now almost every application has some level of network traffic that might go over non-free phone networks or through slow public Wi-Fi.

#include file size. This is more about the number of entities exposed than the byte count.

Number of taps/clicks it takes to accomplish a task.

App startup time.

How long it takes to do a full rebuild of your project. Or how long it takes to make usability tweaks and verify that they work.

The number of special cases that must be documented, either to the user or in your code.

Blog entry length.

(If you liked this, you might enjoy "Avoid Premature Optimization" Does Not Mean "Write Dumb Code".)

App Store Failure and Personal Responsibility

"I wrote an iPhone app, and it didn't make any money" is a growing literary genre, and I sympathize with the authors. I really do. Building any kind of non-trivial, commercial application takes an immense amount of work that combines coding, writing, interaction design, and graphic arts. To spend a thousand hours on a project that sells 103 copies at 99 cents apiece...well, it's disheartening to say the least.

Dismissing that failure as losing the "app store lottery" (meaning that success or failure is out of your control) dodges important questions. When I was writing and selling indie games in the mid 1990s, I went through the experience of releasing a game to the world--euphoria!--followed by despair, confusion, and endless theorizing about why it wasn't the smash hit I knew it deserved to be. Most of the failed iPhone app articles sound like something I would have written in 1997. Of course the iPhone and Apple's App Store didn't even exist then, but my feelings and reactions were exactly same.

What I learned from that experience may sound obvious, and that's precisely why it's a difficult lesson to learn: just because you slogged through the massive effort it takes to design and release a product doesn't have any bearing at all on whether or not anyone actually wants what you made.

See? I told you it sounds obvious, but that doesn't make it any easier to deal with. Getting something out the door is the price of entry, not a guarantee of success. If it doesn't go as planned, then you have to accept that there's some reason your beautiful creation isn't striking a chord with people, and that involves coming face to face with issues that aren't fun to think about for most bedroom coders.

Have you ever watched complete strangers use your app? Are they interpreting the tutorials correctly? Are they working in the way you expected them to? If it's a game, is the difficulty non-frustrating? Maybe you designed and polished twenty levels, not realizing that only a handful of players get past level one.

It's harder to judge if the overall quality is there. Your cousin might draw icons for free, but do they give the impression of high-end polish? Are there graphics on the help screen or just a wall of text? Are you using readable fonts? Are you avoiding improvements because they'd be too much work? Developer Mike Swanson wrote an Adobe Illustrator to Objective-C exporter just so images would stay sharp when scaled.

It's also worth taking a step back and looking at the overall marketplace. Maybe you love developing 16-bit retro platformers, but what's the overall level of interest in 16-bit retro platformers? Is there enough enthusiasm to support dozens of such games or is market saturation capping your sales? If you've written a snazzy to-do list app, what makes it better than all the other to-do lists out there? Can folks browsing the app store pick up on that quickly?

It would be wonderful to be in a position of developing software, blindly sending it out into the world, and making a fortune. It does happen. But when it doesn't, it's better to take responsibility for the failure and dig deeper into what to do about it rather than throwing up your hands and blaming the system.

One Small, Arbitrary Change and It's a Whole New World

I want to take one item from Things to Optimize Besides Speed and Memory and run with it: optimizing the number of disk sector writes.

This isn't based on performance issues or the limited number of write cycles for solid-state drives. It's an arbitrary thought experiment. Imagine we've got a system where disk writes (but not reads!) are on par with early 1980s floppy disks, one where writing four kilobytes of data takes a full second. How does looking through the artificial lens of disk writes being horribly slow change the design perceptions of a modern computer (an OS X-based laptop in this case, because that's what I'm using)?

Poking around a bit, there's a lot more behind-the-scenes writing of preferences and so on than expected. Even old-school tools like bash and vim save histories by default. Perhaps surprisingly, the innocuous less text-file viewer writes out a history file every time it's run.

There are system-wide log files for recording errors and exceptional occurrences, but they're used for more than that. The Safari web browser logs a message when the address/search bar is used and every time a web page is loaded. Some parts of the OS are downright chatty, recording copyright messages and every step of the initialization process for posterity. There's an entire megabyte of daemon shutdown details logged every time the system is turned off. Given the "4K = 1 second" rule, that's over four minutes right there.

The basic philosophy of writing data files needs a rethink. If a file is identical to what's already on disk, don't write it. Yes, that implies doing a read and compare, but those aren't on our performance radar. Here's an interesting case: what if you change the last line of a 100K text file? Most of the file is the same, so we can get by with writing a single 4K sector instead of the 25 second penalty for blindly re-saving the whole thing.
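That sector-level thinking can be sketched in a few lines of Python. This is an illustrative sketch only, with an assumed 4K sector size and hypothetical helper names; a real implementation would worry about partial trailing sectors, atomicity, and crash safety:

```python
SECTOR = 4096  # assumed sector size, from the "4K = 1 second" rule

def sectors(data):
    """Split a byte string into fixed-size sectors."""
    return [data[i:i + SECTOR] for i in range(0, len(data), SECTOR)]

def changed_sectors(old, new):
    """Return indices of sectors that differ between the old and new
    file contents. Reads are free in this thought experiment, so the
    compare costs nothing; only the dirty sectors get written back."""
    old_s, new_s = sectors(old), sectors(new)
    return [i for i in range(len(new_s))
            if i >= len(old_s) or old_s[i] != new_s[i]]
```

For an identical file the list comes back empty and nothing is written at all; for the 100K file with a changed last line, only the final sector index shows up.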

All of this is minor compared to what goes on in a typical development environment. Compiling a single file results in a potentially bulky object file. Some compilers write out listings that get fed to an assembler. The final executable gets written to disk as well. Can we avoid all intermediate files completely? Can the executable go straight to memory instead of being saved to disk, run once for testing, then deleted in the next build?

Wait, hold on. We started with the simple idea of avoiding disk writes and now we're rearchitecting development environments?

Even though it was an off-the-cuff restriction, it helped uncover some inefficiencies and unnecessary complexity. Exactly why does less need to maintain a history file? It's not a performance issue, but it took code to implement, words to document, and it raises a privacy concern as well. Not writing over a file with identical data is a good thing. It makes it easy to sort by date and see what files are actually different.

Even the brief wondering about development systems brings up some good questions about whether the design status quo for compilers forty years ago is the best match for the unimaginable capabilities of the average portable computer in 2012.

All because we made one small, arbitrary change to our thinking.

All that Stand Between You and a Successful Project are 500 Experiments

Suppose there was a profession called "maker." What does a maker do? A maker makes things! Dinner. Birdhouses. Pants. Shopping malls. Camera lenses. Jet engines. Hydroelectric power stations. Pianos. Mars landers.

Being a maker is a rough business. It's such a wide-ranging field, and just because you've made hundreds of flowerpots doesn't give you any kind of edge if you need to make a catalytic converter for a 1995 Ford truck.

Now think about a profession called "programmer." What does a programmer do? A programmer programs things! Autonomous vehicles. Flight simulators. Engine control systems. Solid state drive firmware. Compilers. Video games. Airline schedulers. Digital cameras.

If you focus in on one area it expands like a fractal. Video games? That covers everything from chess to 3D open world extravaganzas to text adventures to retro platformers. Pick retro platformers and there's a wide range of styles and implementation techniques. Even if you select a very specific one of these, slight changes to the design may shift the problem from comfortable to brain teaser.

The bottom line is that it's rare to do software development where you have a solid and complete understanding of the entire problem space you're dealing with. Or looking at it another way, everything you build involves forays into unfamiliar territory. Everything you build is to a great extent a research project.

How do you come to grips with something you have no concrete experience with? By running experiments. Lots of little throwaway coding and interface experiments that answer questions and settle your mind.

Writing a PNG decoder, for example, is a collection of dozens of smaller problems, most of which can be fiddled around with in isolation with real code. Any significant app has user interactions that need prototyping, unclear and conflicting design options to explore, tricky bits of logic, API calls you've never used--hundreds of things. Maybe five hundred. And until you run those experiments, you won't have a solid understanding of what you're making.

(If you liked this, you might enjoy Tricky When You Least Expect It.)

Hopefully More Controversial Programming Opinions

I read 20 Controversial Programming Opinions, and I found myself nodding "yes, yes get to the good stuff." And then, after "less code is better than more," it was over. It was like reading a list of controversial health tips that included "eat your veggies" and "don't be sedentary." In an effort to restore a bit of spark to the once revolutionary software development world, I present some opinions that are hopefully more legitimately controversial.

Computer science should only be offered as a minor. You can major in biology, minor in computer science. Major in art, minor in computer science. But you can't get a degree in CS.

It's a mistake to introduce new programmers to OOP before they understand the basics of breaking down problems and turning the solutions into code.

Complex compiler optimizations are almost never worth it, even if they result in faster code. They can disproportionately slow down the compiler. They're risky, in that a mishandled edge case in the optimizer may result in obscure, latent bugs in the application. They make reasoning about performance much more difficult.

You shouldn't be allowed to write a library for use by other people until you have ten years of programming under your belt. If you think you know better and ignore this rule, then one day you will come to realize the mental suffering that you have inflicted upon others, and you will have to live with that knowledge for the rest of your life.

Superficially ugly code is irrelevant. Pretty formatting--or lack thereof--has no bearing on whether the code works and is reliable, and that kind of mechanical fiddling is better left to an automated tool.

Purely functional programming doesn't work, but if you mix in a small amount of imperative code then it does.

A software engineering mindset can prevent you from making great things.

The Goal is to be Like a Bad Hacker Movie

The typical Hollywood hacking scene is an amalgamation of familiar elements: screens full of rapidly changing hex digits, database searches that show each fingerprint or image as it's encountered, password prompts in a 72 point font, dozens of windows containing graphs and random data...oh, and 3D flights through what presumably are the innards of a computer. Somehow the protagonist uses these ridiculous tools to solve a difficult problem in a matter of minutes, all while narrating his exploits with nonsensical techno jargon.

Admittedly, it's a lot more entertaining than the reality of staring at code for hours, sitting through ten minute compile and link cycles, and crying over two page error messages from a C++ template gone bad.

But the part about solving a problem in a matter of minutes? There's some strong appeal in that. Who wouldn't want to explore and come to grips with a tricky issue in real-time?

It's not as outlandish as it sounds. Traditional software development methodologies are based more around encapsulation and architecture than immediate results. To mimic the spirit--not the aesthetics or technical details--of a scene from a bad hacker movie, you need different priorities:

Visualization tools. At the one end of the spectrum are Bret Victor-esque methods for interactively exploring a problem space. On a more basic level, it's tremendously useful to have graphing facilities available at all times. How often are zeros occurring in this data? What are the typical lengths of strings going through this function? Does displaying a matrix as a grid of rectangles, with each unique element mapped to a separate color, show any hidden patterns?
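The spirit of "graphing facilities available at all times" is a throwaway one-liner, not a charting library. A crude sketch in Python (the function names are invented for illustration):

```python
from collections import Counter

def zero_fraction(data):
    """How often are zeros occurring in this data?"""
    return data.count(0) / len(data) if data else 0.0

def quick_hist(values, width=40):
    """Print a rough horizontal bar chart of value frequencies --
    good enough to eyeball typical string lengths or spot outliers."""
    counts = Counter(values)
    top = max(counts.values())
    for value, n in sorted(counts.items()):
        bar = '#' * max(1, n * width // top)
        print(f'{value!r:>8} {bar} {n}')
```

The point isn't the output quality; it's that the answer arrives in seconds, without leaving the interactive session.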

Terseness. It's easier to just say print or cos than remembering output display functions are in system.output.text and cosine is in math.transcendentals. It's easier to have built-in support for lists than remembering what constructors are available for the list class. It may initially seem obtuse that Forth's memory fetch and store operations are named "@" and "!", but that reaction quickly fades and the agility of single-character words sticks with you.

A set of flexible, combinable operations. The humble "+" in array languages does more than a plus operator in C. It not only adds two numbers, but it can add a value to each element of an arbitrarily long array and add two arrays together. Follow it with a slash ("+/") and the addition operator gets "inserted" between the elements of an array, returning the sum.

It gets interesting when you've got a collection of operations like this that can be combined with each other. Here's a simple example: How do you transform a list of numbers into a list of pairs, where the first element of each pair is the index and the second the original number? Create a list of increasing values as long as the original, then zip the two together. That's impossibly terse in a language like J, but maybe more readily understandable in Haskell or Erlang:

lists:zip(lists:seq(1,length(L)),L).

The trick here is to forget about loops and think entirely in terms of stringing together whole array or list transformations. That lets you try a series of single-line experiments while avoiding opening up a text editor and switching your mindset to "formal coding" mode.
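For comparison, the same two idioms restated in Python (the Erlang line above is from the original; this sketch just shows the idea is portable):

```python
def index_pairs(xs):
    """Pair each element with its 1-based index -- the same idea as
    lists:zip(lists:seq(1, length(L)), L) in Erlang."""
    return list(enumerate(xs, start=1))

# And "+/" -- inserting "+" between the elements of an array --
# is just a fold, which Python spells as a built-in:
total = sum([3, 1, 4, 1])  # 9
```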

(If you liked this, you might enjoy This Isn't Another Quick Dismissal of Visual Programming.)

Minimalism in an Age of Tremendous Hardware

You don't know minimalism until you've explored the history of the Forth programming language.

During Forth's heyday, it was unremarkable for a full development environment--the entire language with extensions, assembler, and integrated editor--to be less than 16K of object code. That's not 16K of data loaded by another program, but a full, standalone, 16K system capable of meta-compiling its own source code to create a new standalone system.

With work, and depending on how much you wanted to throw out, you could get that 16K number much lower. 8K was reasonable. 1K for an ultralight system with none of the blanks filled in. Half that if you were an extremist that didn't mind bending the definition of Forth into unrecognizable shapes.

Some Forths booted directly from a floppy disk and took over the machine. No operating system necessary. Or perhaps more correctly, Forth was the operating system.

In the early 1990s, Forth creator Chuck Moore had an epiphany and decided that interactively hex-editing 32-bit x86 machine code was the way to go. (After writing an entire chip layout package this way he realized there were some drawbacks to the scheme and reverted to designing languages that used source code.)

Looking back, there are many questions. Was there ever a time when reducing a development environment from 16K to 8K actually mattered? Why did the Forth community expend so much effort gazing inward, constantly rethinking and rewriting the language instead of building applications that incidentally happened to be written in Forth? Why was there such emphasis on machine-level efficiency instead of developer productivity?

In 2012 it all seems like so much madness, considering that I could write a Forth interpreter in Lua that when running on an iPhone from a couple generations back would be 10,000 times faster than the most finely crafted commercial Forth of 30 years ago. I'm not even considering the tremendous hardware in any consumer-level desktop.

Still, there's something to the minimalism that drove that madness. The mental burden involved in working with a 50K file of Python code is substantially higher than one of 10K. It doesn't matter that these modest numbers are dwarfed by the multi-megabyte system needed to execute that code. Those hundreds of extra lines of Python mean there's more that can go wrong, more that you don't understand, more time sitting in front of a computer fixing and puzzling when you should be out hiking or playing guitar.

Usually--almost always--there's a much simpler solution waiting to be discovered, one that doesn't involve all the architectural noise, convolutions of the straightforward, and misguided emphasis on hooks and options for all kinds of tangents which might be useful someday. Discovering that solution may not be easy, but it is time well spent.

That's what I learned from Forth.

(If you liked this, you might enjoy Deriving Forth.)

What's Your Hidden Agenda?

In July 1997, the Issaquah Press printed an article with the headline "Man Shoots Computer in Frustration." Now realize that Issaquah is just south of Redmond, so it's not surprising that this story was picked up nationally. It rapidly became a fun-to-cite piece of odd news, the fodder of morning radio shows. Google News has a scan of one version of the story, so you can read it for yourself.

A week later, the Issaquah Press ran a correction to the original story. It turns out that not only was the PC powered down at the time of the shooting, but the man wasn't even in the same room with it when he fired his gun. The bullets went through a wall and hit the computer more or less by accident. I'm not denying that this guy had issues, but one of them wasn't anger stemming from computer trouble.

So how did the original story manage to get into print?

Somehow the few facts were lined up, and from an objective point of view there were gaps between them. A distraught man. A discharged gun. Bullet holes in a no longer functioning PC. I have no way of knowing who mentally pieced together the sequence of events, but to someone the conclusion was blindingly obvious: computers are frustrating, wouldn't we all like to shoot one? Perhaps the unknown detective recently lost hours of work when a word processor crashed? Perhaps it was the influence of all the overheard and repeated comments about Windows 95 stability?

When I read forum postings and news articles, I'm wary of behind the scenes agendas. Sometimes they're obvious, sometimes not. Sometimes it takes a while to realize that this is a guy with a beef about Apple, this other person will only say good things about free-as-in-freedom software, this kid endlessly defends the honor of the PlayStation 3 because that's what his parents got him for Christmas, and he can't afford to also have an Xbox. And then I realize these people are unable to present me with a clear vision of what happened in that house in Issaquah in 1997.

Digging Out from Years of Homogeneous Computing

When I first started looking into functional programming languages, one phrase that I kept seeing in discussions was As Fast as C. A popular theory was that functional programming was failing to catch on primarily because of performance issues. If only implementations of Haskell, ML, and Erlang could be made As Fast As C, then programmers would flock to these languages.

Since then, all functional languages have gotten impressively fast. The top-end PC in 1998 was a 350MHz Pentium II. The passage of time has solved all non-algorithmic speed issues. But at the same time, there was a push for native code generation, for better compilers, for more optimization. That focus was a mistake, and it would take a decade for the full effect of that decision to come to light.

In the early 2000s, PCs were the computing world. I'm not using "personal computer" in the generic sense; I'm talking about x86 architecture boxes with window-oriented GUIs and roughly the same peripherals. People in the demo-coding scene would shorten the term "x86 assembly language" to "asm" as if no other processor families existed. The Linux and Windows folks with nothing better to do argued back and forth, but they were largely talking about different shades of the same thing. One of the biggest points of contention in the Linux community was how to get a standard, "better than Windows but roughly the same" GUI for Linux, and several were in development.

Then in 2007 the iPhone arrived and everything changed.

This has nothing to do with Apple fanboyism. It's that a new computer design which disregarded all the familiar tenets of personal computing unexpectedly became a major platform. The mouse was replaced with a touchscreen. The decades old metaphor of overlapping windows shuffled around like papers on a table was replaced by apps that owned all the pixels of the device. All those years of learning the intricacies of the Win32 API no longer mattered; this was something else entirely. And most significantly for our purposes: the CPU was no longer an x86.

Compiler writers had been working hard, and showing great progress, in getting Haskell and Objective Caml compiled into fast x86 machine code. Then, through no fault of their own, they had to deal with the ARM CPU and a new operating system to interface with, not to mention that Objective-C was clearly the path of least resistance for hardware developed and rapidly iterated by a company that promoted Objective-C.

That a functional language compiler on a desktop PC was getting within a reasonable factor of the execution time of C no longer mattered if you were a mobile developer. The entire emphasis put on native code compilation seemed questionable. With the benefit of hindsight, it would have been better to focus on ease of use and beautiful coding environments, on smallness and embeddability. I think that would have been a tough sell fifteen years ago, blinded by the holy grail of becoming As Fast as C.

To be fair, ARM did become a target for the Glasgow Haskell compiler, though it's still not a reasonable option for iOS developers, and I doubt that's the intent. But there is one little language that was around fifteen years ago, one based around a vanilla interpreter, one that's dozens of times slower than Haskell in the general case. That language is Lua, and it gets a lot of use on iPhone, because it was designed from the start to be embeddable in C programs.

(If you liked this, you might enjoy Caught-Up with 20 Years of UI Criticism.)

Do You Really Want to be Doing This When You're 50?

When I was still a professional programmer, my office-mate once asked out of the blue, "Do you really want to be doing this kind of work when you're fifty?"

I have to say that made me stop and think.

To me, there's an innate frustration in programming. It doesn't stem from having to work out the solutions to difficult problems. That takes careful thought, but it's the same kind of thought a novelist uses to organize a story or to write dialog that rings true. That kind of problem-solving is satisfying, even fun.

But that, unfortunately, is not what most programming is about. It's about trying to come up with a working solution in a problem domain that you don't fully understand and don't have time to understand.

It's about skimming great oceans of APIs that you could spend years studying and learning, but the market will have moved on by then and that's no fun anyway, so you cut and paste from examples and manage to get by without a full picture of the architecture supporting your app.

It's about reading between the lines of documentation and guessing at how edge cases are handled and whether or not your assumptions will still hold true two months or two years from now.

It's about the constant evolutionary changes that occur in the language definition, the compiler, the libraries, the application framework, and the underlying operating system, that all snowball together and keep you in maintenance mode instead of making real improvements.

It's about getting derailed by hairline fractures in otherwise reliable tools, and apparently being the first person to discover that a PNG image with four bits-per-pixel and an alpha channel crashes the decoder, then having to work around that.

One approach is to dig in and power through all the obstacles. If you're fresh out of school, there are free Starbucks lattes down the hall, and all your friends are still at the office at 2 AM, too...well, that works. But then you have to do it again. And again. It's always a last second skid at 120 miles per hour with brakes smoking and tires shredding that makes all the difference between success and failure, but you pulled off another miracle and survived to do it again.

I still like to build things, and if there's no one else to do it, then I'll do it myself. I keep improving the tiny Perl script that puts together this site, because that tiny Perl script is unobtrusive and reliable and lets me focus on writing. I have a handy little image compositing tool that's less than 28 kilobytes of C and Erlang source. I know how it works inside and out, and I can make changes to it in less time than it takes to coax what I want out of ImageMagick.

But large scale, high stress coding? I may have to admit that's a young man's game.

The Background Noise Was Louder than I Realized

A few years ago I started cutting back on the number of technology and programming sites I read. It was never a great number, and now it's only a handful. This had nothing to do with being burned out on technology and programming; it was about being burned out on reading about technology and programming. Perhaps surprisingly, becoming less immersed in the online tech world has made me more motivated to build things.

Here's some of what I no longer bother with:

Tired old points of contention that make no difference no matter who says what (e.g., static vs. dynamic typing).

Analyses of why this new product is going to be the end of a multi-billion dollar corporation.

Why some programming language sucks.

Overly long, detailed reviews of incrementally improved hardware and operating system releases. (I like iOS 6 just fine, but from a user's point of view it's iOS 5 with a few tweaks and small additions that will be discovered through normal use.)

Performance comparisons of just about anything: systems, GPUs, CPUs, SSDs. The quick summary is that they're all 5-15% faster than last year's infinitely fast stuff.

All of these things are noise. They're below the threshold of what matters. Imagine you started hanging out with people who were all, legitimately, writing books. They each have their own work styles and organization methods and issues with finding time to write efficiently. As a software designer, you might see some ways to help them overcome small frustrations with their tools or maybe even find an opportunity for a new kind of writing app. But I can guarantee that GPU numbers and programming language missteps and the horrors of dynamic typing will have no relevance to any of what you observe.

I do still read some tech (and non-tech) blogs, even ones that sometimes violate the above rules. If the author is sharing his or her direct, non-obvious experience or has an unusual way of seeing the world, then I'll happily subscribe. Being much more selective has kept me excited and optimistic and aware of possibilities instead of living down below in a world of endless detail and indecision and craning my neck to see what's going on above the surface.

(As a footnote, a great way to avoid the usual aggregation sites is to subscribe to the PDF or real paper edition of Hacker News Monthly. Read it cover to cover one Saturday morning with a good coffee instead of desperately refreshing your browser every day of the week. Disclosure: I've gotten free copies of the PDF version for a while now, because I've had a few articles reprinted in it.)

OOP Isn't a Fundamental Particle of Computing

The biggest change in programming over the last twenty-five years is that today you manipulate a set of useful, flexible data types, and twenty-five years ago you spent a disproportionately high amount of time building those data types yourself.

C and Pascal--the standard languages of the time--provided a handful of machine-oriented types: numbers, pointers, arrays, the illusion of strings, and a way of tying multiple values together into a record or structure. The emphasis was on using these rudiments as stepping stones to engineer more interesting types, such as stacks, trees, linked lists, hash tables, and resizable arrays.

In Perl or Python or Erlang, I don't think about this stuff. I use lists and strings and arrays with no concern about how many elements they contain or where the memory comes from. For almost everything else I use dictionaries, again no time spent worrying about size or details such as how hash collisions are handled.

I still need new data types, but it's more a repurposing of what's already there than crafting a custom solution. A vector of arbitrary dimension is an array. An RGB color is a three-element tuple. A polynomial is either a tuple (where each value is the coefficient and the index is the degree) or a list of {Coefficient, Degree} tuples. It's surprising how arrays, tuples, lists, and dictionaries have eliminated much of the heavy lifting from the data structure courses I took in college. The focus when implementing a balanced binary tree is on how balanced binary trees work and not about suffering through a tangled web of pointer manipulation.

Thinking about how to arrange ready-made building blocks into something new is a more radical change than it may first appear. How those building blocks themselves come into existence is no longer the primary concern. In many programming courses and tutorials, everything is going along just fine when there's a sudden speed bump of vocabulary: objects and constructors and abstract base classes and private methods. Then in the next assignment the simple three-element tuple representing an RGB color is replaced by a class with getters and setters and multiple constructors and--most critically--a lot more code.

This is where someone desperately needs to step in and explain why this is a bad idea and the death of fun, but it rarely happens.

It's not that OOP is bad or even flawed. It's that object-oriented programming isn't the fundamental particle of computing that some people want it to be. When blindly applied to problems below an arbitrary complexity threshold, OOP can be verbose and contrived, yet there's often an aesthetic insistence on objects for everything all the way down. That's too bad, because it makes it harder to identify the cases where an object-oriented style truly results in an overall simplicity and ease of understanding.

(Consider this Part 2 of Don't Distract New Programmers with OOP. There's also a Part 3.)

An Outrageous Port

In You, Too, Can Be on the Cutting Edge of Functional Programming Research I wrote:

As an experiment, I decided to port an action game that I wrote in [1996 and] 1997 to mostly-pure Erlang. It wasn't a toy, but a full-featured game chock full of detail and special cases. I never finished the port, but I had the bulk of the game playable and running smoothly, and except for a list of special cases that I could write on a "Hello! My Name is" label, it was purely functional. I wrote about what I learned in Purely Functional Retrogames.

I didn't mention the most unusual part: this may be the world's only port of a game from RISC (PowerPC) assembly language to a functional language (Erlang).

Exactly why I chose to write an entire video game in PowerPC assembly language in the first place is hard to justify. I could point out that this was when the 300,000+ pixels of the lowest resolution on a Macintosh was a heavy burden compared to the VGA standard of 320x200. Mostly, though, it was a bad call.

Still, it's a fascinating bit of code archaeology to look at something developed by my apparently mad younger self. By-the-book function entry/exit overhead can be 30+ instructions on the PowerPC--lots of registers to save and restore--but this code is structured in a way that registers hardly ever need to be saved. In the sprite drawing routines, option flags are loaded into one of the alternate sets of condition registers so branches don't need to be predicted. The branch processing unit already knows which way the flow will go.

In Erlang, the pain of low-level graphics disappeared. Instead of using a complicated linkage between Erlang and OpenGL, I moved all of the graphics code to a small, separate program and communicated with it via local socket. (This is such a clean and easy approach that I'm surprised it's not the go-to technique for interfacing with the OS from non-native languages.)
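The shape of that split can be sketched in a few lines. Everything here is invented for illustration--the original was Erlang talking to a small native renderer, and its actual wire format isn't described--but a newline-delimited JSON protocol over a local socket captures the spirit:

```python
import json
import socket

def encode_frame(cmds):
    """Serialize one frame's draw commands as a newline-terminated
    JSON line (a hypothetical wire format; trivial to parse on the
    renderer side)."""
    return (json.dumps(cmds) + '\n').encode()

def send_frame(cmds, port=7777):
    """Ship one frame of sprite-draw commands to a separate renderer
    process listening on a local socket (port number is arbitrary)."""
    with socket.create_connection(('127.0.0.1', port)) as s:
        s.sendall(encode_frame(cmds))

# One frame: the game logic decides *what* to draw, the renderer how.
frame = [{'op': 'sprite', 'id': 12, 'x': 40, 'y': 96},
         {'op': 'sprite', 'id': 3, 'x': 200, 'y': 64}]
```

The appeal is that the game-logic side never links against OpenGL or any graphics library at all; the renderer is a small, dumb program that can be rewritten independently.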

With sprite rendering out of the way, what's left for the Erlang code? Everything! Unique behaviors for sixteen enemy types, a scripting system, collision detection and resolution, player control, level transitions, and all the detail work that makes a game playable and game design fun. (The angle_diff problem came from this project, too, as part of a module for handling tracking and acceleration.)

All of this was recast in interpreted Erlang. Yes, the purely functional style resulted in constant regeneration of lists and tuples. Stop-the-world garbage collection kicked in whenever needed. Zoom in on any line of code and low-level inefficiencies abound. Between all the opcode dispatching in the VM and dynamic typing checks for most operations I'm sure the end result is seen by hardware engineers as some kind of pathological exercise in poor branch prediction.

All of this, all of this outrageous overhead, driving a game that's processing fifty or more onscreen entities at sixty frames per second, sixteen milliseconds per frame...

...and yet on the 2006 non-pro MacBook I was using at the time, it took 3% of the CPU.

Three percent!

If anything, my port was too timid. There were places where I avoided some abstractions and tried to cut down on the churning through intermediate data structures, but I needn't have bothered. I could have more aggressively increased the code density and made it even easier to work with.

(If you liked this, you might enjoy Slow Languages Battle Across Time.)

"Not Invented Here" Versus Developer Sanity

Developers, working independently, are all pushing to advance the state of the art, to try to make things better. The combined effect is to make things more chaotic and frustrating in the short term.

Early PC video cards rapidly advanced from bizarrely hued CGA modes to the four-bit pastels of EGA to the 256 color glory of VGA, all in six years. Supporting the full range required three sets of drawing routines and three full sets of art.

iPhone resolution went from 320x480 to 640x960 to 640x1136 in less time. Application design issues aside, the number of icons and launch screen images required for a submitted app exploded.

Windows 95 offered huge benefits over odd little MS-DOS, but many companies selling development tools were unwilling or unmotivated to make the transition and those tools slowly withered.

Starting with iOS 5, Apple required applications to have a "root view controller," a simple change that resulted in a disproportionate amount of confusion (witness the variety of fixes in this Stack Overflow thread).

GraphicsMagick is smaller, cleaner, and faster than its predecessor ImageMagick, but that's of no consolation if you're reliant on one of the ImageMagick features it dropped in the name of simplicity and cleanliness.

Keeping up with all of these small, evolutionary changes gets tiring, but there's no point in complaining. Who doesn't want double iPhone resolution, the boost that DirectX 9 gives over previous versions, or the aesthetics of curve-based, anti-aliased font rendering instead of 8x8 pixel grids?

Sometimes, occasionally, you can hide from the never-ending chaos. You can take refuge in your own custom crafted tools--maybe even a single tool--that does exactly what you need it to do. A tool that solves a problem core to your area of focus. A tool that's as independent as realistically possible from the details of a specific operating system and from libraries written by people with different agendas.

This isn't a blanket license for Not Invented Here syndrome or reinventing the wheel. If you have a small (small!) library or tool that does exactly what you need, that you understand inside and out, then you know if your needs change slightly you can get in there and make adjustments without pacing back and forth for a new release. If you have an epiphany about how to further automate things or how to solve difficult cases you didn't think were possible, then you can code up those improvements. Maybe they don't always work out, but you can at least try the experiments.

Mostly it's about developer sanity and having something well-understood that you can cling to amidst the swirling noise of people whose needs and visions of the right solutions never quite line up with your own.

The UNIX Philosophy and a Fear of Pixels

I've finally crossed the line from mild discomfort with people who espouse the UNIX way as the pinnacle of computing to total befuddlement that there's anyone who still wants to argue such a position. One key plank in the UNIX party platform is that small tools can be combined together providing great expressiveness. Here's a simple task: print a list of all the files with a txt extension in the current directory except for ignore_me.txt.

Getting a list of text files is easy: ls *.txt. Now how to remove ignore_me.txt from that list? Hmmm...well, you might know that grep can be inverted via a switch so it returns lines that don't match:

ls *.txt | grep -v ^ignore_me\\.txt$

There's also the find utility which can do the whole thing in one step, but it takes more fiddling around to get the parameters right:

find *.txt -type f ! -name ignore_me.txt

This all works, and we've all figured this stuff out by reading man pages and googling around, but take a moment to consider how utterly anachronistic both of the above solutions come across to non-believers in 2012. It's like promoting punch cards or IBM's job control language from the 1960s. You've got to get that space between the ! and -name or you'll get back "!-name: event not found." But this isn't what I wanted to talk about so I'll stop there.
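
For contrast, the same task in a general-purpose language takes more keystrokes than the shell pipeline but holds no traps. A sketch (using a fixed list of names so it's self-contained; `glob.glob("*.txt")` would supply a real directory listing):

```python
# All *.txt files except ignore_me.txt -- the same task as the pipeline.
import fnmatch

def text_files(names, ignore="ignore_me.txt"):
    return [n for n in names if fnmatch.fnmatch(n, "*.txt") and n != ignore]

print(text_files(["a.txt", "ignore_me.txt", "b.md", "notes.txt"]))
```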

What I really wanted to talk about are text files and visual programming.

I keep seeing the put-downs of any mention of programming that involves a visual component. I wrote an entire entry two years ago on the subject, This Isn't Another Quick Dismissal of Visual Programming, and now I don't think it was strong enough. Maybe the problem is that "visual programming" is a bad term, and it should be "ways to make programming be more visual." At one time all coding was done on monochrome monitors, but inexpensive color displays and more CPU power led to syntax highlighting, which most developers will agree is a win.

Now go further and stop thinking of code as a long scroll of text, but rather as discrete functions that you can view and edit independently. That's starting to get interesting. Or consider the discussion of trees in any algorithms book, where nodes and leaves are rendered inside of boxes, and arrows show the connections between them. It's striking that $500 consumer hardware has over three million pixels and massively parallel GPUs to render those pixels, yet there's old-school developer resistance to anything fancier than dumping out characters in a monospaced font. Why is that?

It's because tools to operate on text files are easy to write, and anything involving graphics is several orders of magnitude harder.

Think about all the simple, interview-style coding problems you've seen. "Find all the phone numbers in this text file." FizzBuzz. Do any of them involve colors or windows or UI? For example, "On this system, how many pixels wide is a given string in 18 point Helvetica Bold?" "List all the filenames in the current directory in alphabetical order, with the size of the font relative to the size of the file (the names of the largest and smallest files should be displayed in the largest and smallest font, respectively)."

There have been some tantalizing attempts at making graphical UI development as easy as working with text. I don't mean tools like Delphi or the iOS UIKit framework, where you write a bunch of classes that inherit from a core set of classes, then use visual layout packages to design the front-end. I mean tools that let you quickly write a description of what you want UI-wise, and then there it is on the screen. No OOP. No code generators. If you've ever used the Tk toolkit for Tcl, then you've got a small taste of what's possible.

The best attempt I've seen is the UI description sub-language of REBOL. Creating a basic window with labeled buttons is a one-liner. Clearly all wasn't perfect in REBOL-ville, as a burst of excitement in the late 1990s was tempered with a long period of inactivity, and some features of the language never quite lived up to their initial promises.
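
To give a rough flavor of the idea — this is a hypothetical sketch, not REBOL's actual dialect — imagine describing a UI as plain data and handing it to an interpreter. A real version would create native widgets; this one just builds records so the shape of the approach is visible:

```python
# Hypothetical REBOL-ish layout dialect: the UI description is data,
# and a tiny interpreter walks it. No classes, no code generation.
def layout(spec):
    widgets = []
    for item in spec:
        kind, *args = item
        widgets.append({"kind": kind, "args": args})
    return widgets

ui = layout([
    ("label",  "Name:"),
    ("field",  ""),
    ("button", "OK"),
    ("button", "Cancel"),
])
print([w["kind"] for w in ui])  # ['label', 'field', 'button', 'button']
```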

These days HTML is the most reasonable approach to anything involving fonts and images and interaction. It's not as beautifully direct as REBOL, and being trapped in a browser is somewhere between limiting and annoying, but the visual toolkit is there, and it's ubiquitous. (For the record, I would have solved the "list all the filenames..." problem by generating HTML, but firing up a browser to display the result is a heavy-handed solution.)

Code may still be text behind the scenes, but that doesn't mean that coding has to always be about working directly in a text editor or monospaced terminal window.

Dangling by a Trivial Feature

I'm looking for a good vector-illustration app and download a likely candidate. I rely on knowing the position of the cursor on the page--strangely, some programs won't show this--so it's a good first sign to see coordinates displayed at the top of the screen. Now I drag the selection rectangle around a shape to measure it.

Uh-oh, it's still showing only the cursor coordinates as I drag. What I want to see are two sets of values: the cursor position and the current size of the rectangle. Now I could do the subtraction myself, but I'm using a computer that can do billions of calculations each second. I don't want to get slowed down because of mistakes in my mental computation or because I mistyped a number into the Erlang or Python interpreter I use as a calculator.

I set this app aside and start evaluating another. And that sentence should be utterly horrifying to developers everywhere.

A team of people spent thousands of hours building that app. There are tens or hundreds of thousands of lines of code split across dozens of files. There's low-level manipulation of cubic splines, a system for creating layers and optimizing redraw when there are dozens of them, a complex UI, importing and exporting of SVG and Adobe Illustrator and PostScript files, tricky algorithms for detecting which shape you're clicking on, gradient and drop shadow rendering, text handling...and I'm only hitting some of the highlights.

Yet here I am dismissing it in a casual, offhand way because of how the coordinates of the selection rectangle are displayed. The fix involves two subtractions, a change to a format string, and a bit of testing. It's trivial, especially in comparison to all the difficult, under-the-hood work to make the selection of objects possible in the first place, but it makes no difference, because I've moved on.
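
The entire fix, sketched with hypothetical names, is about this much code:

```python
# Two subtractions and a format string: show the cursor position
# plus the current size of the selection rectangle.
def selection_status(anchor, cursor):
    ax, ay = anchor
    cx, cy = cursor
    return f"({cx}, {cy})  {abs(cx - ax)} x {abs(cy - ay)}"

print(selection_status((10, 20), (110, 95)))  # (110, 95)  100 x 75
```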

These are the kind of front-facing features people use to decide if they like an app or not. Seemingly superficial fit and finish issues are everything, and the giant foundation that enables those bits of polish is simply assumed to exist.

(If you liked this, you might enjoy It's Like That Because It Has Always Been Like That.)

Documenting the Undocumentable

Not too long ago, any substantial commercial software came in a substantial box filled with hundreds or thousands of printed pages of introductory and reference material, often in multiple volumes. Over time the paper manuals became less comprehensive, leaving only key pieces of documentation in printed form, the reference material relegated to online help systems. In many cases the concept of a manual was dropped completely. If you can't figure something out you can always Google for it or watch YouTube videos or buy a book.

If you're expecting a lament for good, old-fashioned paper manuals, then this isn't it. I'm torn between the demise of the manual being a good thing, because almost no one read them in the first place, and the move to digital formats hiding how undocumentable many modern software packages have become.

Look at Photoshop CS6. The "Help and Tutorials" PDF is 750 pages, with much of that being links to external videos, documents, and tutorials. Clearly that's still not enough information, because there's a huge market for Photoshop books and classes. The first one I found at Amazon, Adobe Photoshop CS6 Bible, is 1100 pages.

The most fascinating part of all of this is what's become the tip of the documentation iceberg: the Quick Start guide.

This may be the only non-clinical documentation that ships with an application. It's likely the only thing a user will read before clicking around and learning through discovery or Google. So what do you put in the Quick Start guide? Simple tutorials? Different tutorials for different audiences? Explanations of the most common options?

Here's what I'd like to see: What the developers of the software were thinking when they designed it.

I don't mean coding methodologies; I mean the assumptions that were made about how the program should be used. For example, some image editors add a new layer each time you create a vector-based element like a rectangle. That means lots of layers, and that's okay. The philosophy is that bitmaps and editable vector graphics are kept completely separate. Other apps put everything into the same layer unless you explicitly create a new one. The philosophy is that layers are an organizational tool for the user.

Every application has philosophies like this that provide a deeper understanding once you know about them, but seem random otherwise. Why does the iMovie project size remain the same after removing twenty seconds of video? Because the philosophy is that video edits are non-destructive, so you never lose the source footage. Why is it so much work to change the fonts in a paper written in Word? Because you shouldn't be setting fonts directly; you should be using paragraph styles to signify your intent and then make visual adjustments later.

I want to see these philosophies documented right up front, so I don't have to guess and extrapolate about what I perceive as weird behavior. I'm thinking "What? Where are all these layers coming from?" but the developers wouldn't even blink, because that's normal to them.

And I'd know that, if they had taken the time to tell me.

(If you liked this, you might enjoy A Short Story About Verbosity.)

2012 Retrospective

A short summary of 2012: more entries than any previous year by far (41 vs. 33 in 2010), and a site design that finally doesn't look so homemade.

And a tremendous increase in traffic.

It's not the number of network packets flying around that matters. To you, reading this, this may look like a blog that's ostensibly about building things with technology while more often than not dancing around any kind of actual coding, but in (hopefully) interesting ways. To me, this is an outlet for ideas and for writing. That I'm able to fulfill my own desire to write and there's a large audience that finds it useful...I am stunned that such an arrangement exists.

You have my sincere gratitude for taking the time to read what I've written.


(Here's last year's retrospective.)

An Irrational Fear of Files on the Desktop

A sign of the clueless computer user has long been saving all files directly to the desktop. You can spot this from across the room, the background image peeking through a grid of icons. Well-intentioned advice of "Here, let me show you how to make a folder somewhere else," is ignored.

The thing is, it's not only okay to use the desktop as a repository for all your work, it's beautiful from an interaction design perspective.

The desktop is the file system, and it's a visual one too. Everything is right there in front of you as a sort of permanent file browser. There's no need for a "My Computer" icon, having to open an application for browsing files (i.e., Windows Key + E), or dealing with the conceptual difference between the desktop and, say, "My Documents" (something surprisingly difficult to explain to new users). It's only too bad so much time has been spent disparaging the desktop as a document storage location.

What about the mess caused by a screen full of icons? That's the best part: you can see your mess. You can be disorganized regardless of where you store documents, but if you just dump everything into "My Documents" you don't have the constant in-your-face reminder to clean things up. The lesson shouldn't be not to put things on the desktop, but how to create folders for projects or for things you're no longer working on.

To be fair, there were once good arguments against storing everything on the desktop. Back when the Windows Start menu required navigating nested menus, it was easier to have desktop shortcuts for everything--most applications still create one by default. That muddled the metaphor. Was the desktop for documents or programs? Once you were able to run an app by clicking Start and typing a few letters of the name (or the OS X equivalent: Spotlight), the desktop was no longer needed as an application launcher. (And now there are more iOS-like mechanisms for this purpose in both OS X Mountain Lion and Windows 8.)

The puzzling part of all this is how a solid, easy to understand model of storing things on a computer became exactly what the knowledgeable folks--myself included--were warning against.

(If you liked this, you might enjoy User Experience Intrusions in iOS 5.)

Trapped by Exposure to Pre-Existing Ideas

Let's go back to the early days of video games. I don't mean warm and fuzzy memories of the Nintendo Entertainment System on a summer evening, but all the way back to the early 1970s when video games first started to exist as a consumer product. We have to go back that far, because that's when game design was an utterly black void, with no genres or preconceptions whatsoever. Each game that came into existence was a creation no one had previously imagined.

While wandering through this black void, someone had--for the very first time--the thought to build a video game with a maze in it.

The design possibilities of this maze game idea were unconstrained. Was it an open space divided up by a few lines or a collection of tight passageways? Was the goal to get from the start to the finish? Was it okay to touch the walls or did they hurt you? Two people could shoot at each other in a spacious maze using the walls for cover. You could be in a maze with robots firing at you. Maybe you could break through some of the walls? Or what if you built the walls yourself? "Maze" was only a limitation in the way that "detective story" was for writers.

And then in 1980, when only a relative handful of maze game concepts had been explored, Toru Iwatani designed Pac-Man.

It featured a maze of tight passageways full of dots, and the goal was to eat all of those dots by moving over them. You were chased by four ghosts that killed you on contact, but there were special dots that made the ghosts edible for a brief period, so you could hunt them down.

After the release of Pac-Man, when someone had the thought to create a game with a maze in it, more often than not that game had tight passageways full of dots, something--often four of them--chasing you, and a way to turn the tables on those somethings so you could eliminate them.

Because by that time, there were no other options.

(If you liked this, you might enjoy Accidental Innovation.)

Sympathy for Students in Beginning Programming Classes

Here's a template for a first programming class: Use a book with a language name in the title. Start with the very basics like formatted output and simple math. Track through more language features with each chapter and assignment, until at the end of the semester everyone is working with overloaded operators and templates and writing their own iterators and knows all the keywords related to exception handling.

If you're a student in a class like this, you have my sympathy, because it's a terrible way to be introduced to programming.

Once you've learned a small subset of a language like Python--variables, functions, control flow, arrays, and dictionaries--then features are no longer the issue. Sure, you won't know all the software engineery stuff like exceptions and micromanagement of variable and function scopes, but it's more important to learn how to turn thoughts into code before there's any mention of engineering.

I'd even go so far as to say that most OOP is irrelevant at this point, too.

My real template for a first programming class is this: Teach the bare minimum of language features required to do interesting things. Stop. Spend the rest of the semester working on short assignments that introduce students to problem solving and an appreciation for the usefulness of knowing how to write code.

The Highest-Level Feature of C

At first blush this is going to sound ridiculous, but bear with me: the highest-level feature of C is the switch statement.

As any good low-level language should be, C is designed for transparent compilation. If you take a bit of C source, the corresponding object code emitted by the compiler--even a heavily optimizing compiler--roughly mimics the structure of the original text.

The switch statement is the only part of the language where you specify an intent, and the choice of how to make that a reality is not only out of your hands, but the resulting code can vary in algorithmic complexity.

Sure, there are other situations where the compiler can step in and reinterpret things. A for loop known to execute three times can be replaced by three instances of the loop body. In some circumstances, if you're careful not to trip over all the caveats, a loop can be vectorized so multiple elements can be processed in each iteration. None of these are fundamental changes. Your loop is still conceptually a loop, one way or another.

The possibilities when compiling a switch are much more varied. It can result in a trivial series of if..else statements. It can result in a binary search. Or, if the values are consecutive, a jump table. Or for a complex sequence, some combination of these techniques. If each case simply assigns a different value to the same variable, then it can be implemented as a range check and array lookup. The overall sweep of the solutions, from hundreds of sequential, mispredicted comparisons to a single memory read, is substantial.
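
Here's a Python sketch of the two extremes — what the compiler's output looks like conceptually, not actual generated code, and the opcode names are made up for illustration:

```python
# Two strategies a compiler can pick for a dense switch:
# sequential comparisons versus a range check plus table lookup.
def switch_as_if_chain(op):
    if op == 0:
        return "halt"
    elif op == 1:
        return "load"
    elif op == 2:
        return "store"
    elif op == 3:
        return "add"
    return "illegal"

TABLE = ["halt", "load", "store", "add"]

def switch_as_table(op):
    # One bounds check and one indexed read, no per-case comparisons.
    return TABLE[op] if 0 <= op < len(TABLE) else "illegal"

print(switch_as_table(2), switch_as_if_chain(2))  # store store
```

Both behave identically from the outside; only the cost profile differs, and with a real switch statement that choice belongs to the compiler.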

The same principle is what makes pattern matching so useful in Erlang and Haskell. You provide this great, messy bunch of patterns containing a mix of numbers and lists and tuples and "don't care" values. At compile time the commonalities, exceptional cases, and opportunities for table lookups are sorted out, and fairly optimally, too.

In the compiled code for this bit of Erlang, the tuple size is used for dispatching to the correct line:

case Position of
{X, Y, Dir}       -> ...
{X, Y, Dir, _, _} -> ...
{X, Y, _, _}      -> ...
{X, Y}            -> ...
end
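
A rough Python analogue of that dispatch — hypothetical code, but it shows the shape of it: the tuple's size is examined once, and that alone selects the clause:

```python
# Select a handler by tuple size first, the way the compiled Erlang
# case dispatches on tuple arity before looking at the elements.
def dispatch(position):
    clauses = {
        2: lambda x, y: f"pair {x},{y}",
        3: lambda x, y, d: f"triple {x},{y} dir {d}",
        4: lambda x, y, _a, _b: "quad",
        5: lambda x, y, d, _a, _b: f"quint dir {d}",
    }
    try:
        clause = clauses[len(position)]
    except KeyError:
        raise ValueError("no matching clause") from None
    return clause(*position)

print(dispatch((3, 4, "north")))  # triple 3,4 dir north
```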

The switch statement in C is a signal that even though you could do it yourself, you'd prefer to have the compiler act as a robotic assistant who'll take your spec--a list of values and actions--and write the code for you.

(If you liked this, you might enjoy On Being Sufficiently Smart.)

Simplicity is Wonderful, But Not a Requirement

Whenever I write about the overwhelming behind-the-scenes complexity of modern systems, and the developer frustration that comes with it, I get mail from computer science students asking "Am I studying the right field? Should I switch to something else?"

It seems somewhere between daunting and impossible to build anything with modern technology if it's that much of a mess. But despite endless claims by knowledgeable insiders as to the extraordinary difficulty and deeply flawed nature of software development, there's no end of impressive achievements that are at odds with that pessimism.

How could anyone manage to build an operating system out of over ten million lines of error prone C and C++? Yet I can cite three easy examples: Windows, OS X, and Linux.

How could anyone craft an extravagant 3D video game with pointers and manual memory management woven throughout a program with three times as many lines as the one in the space shuttle's main computer? Yet dozens of such games are shipped each year.

How could anyone write an executable specification of a superscalar, quad-core CPU with 730,000,000 transistors? One that includes support for oddball instructions that almost made sense in the 1970s, multiple floating point systems (stack-based and vector-based), and in addition to 32-bit and 64-bit environments also includes a full 16-bit programming model that hasn't been useful since the mid 1990s? Yet the Intel i7 powers millions of laptops and desktops.

If you wallow in the supposed failure of software engineering, then you can convince yourself that none of these examples should actually exist. While there's much to be said for smallness and simpleness, it's clearly not a requirement when it comes to changing the world. And perhaps there's comfort in knowing that if those crazy people working on their overwhelmingly massive systems are getting them to work, then life is surely much easier for basement experimenters looking to change the world in smaller ways.

(If you liked this, you might enjoy Building Beautiful Apps from Ugly Code.)

Don't Be Distracted by Superior Technology

Not long after I first learned C, I stumbled across a lesser-used language called Modula-2. It was designed by Niklaus Wirth who previously created Pascal. While Pascal was routinely put down for being awkwardly restrictive, Wirth nudged and reshaped the language into Modula-2, arguably the finest systems-level programming language of its day.

Consider that Modula-2 had the equivalent of C++ references from the start (and for the record, so did Pascal and ALGOL). Most notably, if you couldn't guess from the name, Modula-2 had a true module system that has managed to elude the C and C++ standards committees for decades.

My problem became that I had been exposed to the Right Approach to separately compiled modules, and going back to C felt backward--even broken.

When I started exploring functional programming, I used to follow the Usenet group comp.lang.functional. A common occurrence was that someone struggling with how to write programs free of destructive updates would ask a question. As the thread devolved into bickering about type systems and whatnot, someone would inevitably point out a language that gracefully handled that particular issue in a much nicer way than Haskell, ML, or Erlang.

Except that the suggested language was an in-progress research project.

The technology world is filled with cases where smart and superior alternatives exist, but their existence makes no difference because you can't use them. 1980s UNIX was incredibly stable compared to MS-DOS, but it was irrelevant if you intended to use MS-DOS software. Clojure and Factor are wonderful languages, but if you want to write iOS games then you're better off pretending you've never heard of them. Not only are they not good options for iOS, at least not at the moment, but going so against the grain brings extra work and headaches with it.

Words like better, superior, and right are misleading. Yes, Modula-2 has a beautiful module system, but that's negated by being a fringe language that isn't likely to be available from the start when exciting new hardware is released. Erlang isn't as theoretically beautiful as those cutting-edge research languages, but it's been through the forge of shipping large-scale systems. What may look like warts upon first glance may be the result of pragmatic choices.

There's much more fun to be had building things than constantly being distracted by better technology.

(If you liked this, you might enjoy The Pace of Technology is Slower than You Think.)

Expertise, the Death of Fun, and What to Do About It

I've started writing this twice before. The first time it turned into Simplicity is Wonderful, But Not a Requirement. The second time it ended up as Don't Be Distracted by Superior Technology. If you re-read those you might see bits and pieces of what I've been wanting to say, which goes like this:

There is danger in becoming an expert. Long-term exposure to programming, coding, software development--whatever you want to call it--changes you. You start to recognize the extreme complexity in situations where there doesn't need to be any, and it eats at you. You realize how broken the tools are. You discover bygone flashes of amazing beauty in old systems that have been set aside in favor of the way things have always been done.

This is a bad line of thinking.

It's why you run into twenty-year veteran coders who can no longer write FizzBuzz. It's why people right out of school often create impressive and impossible-seeming things, because they haven't yet developed an aesthetic that labels all of that successful hackery as ugly and wrong. It's why some programmers migrate to more and more obscure languages, trading productivity for poetic tinkering.

Maybe a better title for this piece is "So You've Become Jaded and Dissatisfied. Now What?"

Cultivate a "try it first" attitude. Yes, it's amusing to read about those silly developers who can't write FizzBuzz. But your first reaction should be to set aside the article and try implementing it yourself.

Active learning or bust. Don't bother with tutorials or how-to books unless you're going to use the information immediately. Fire up your favorite interpreter and play along as you read. Don't take the author's word for anything; prove it to yourself. Do the exercises and invent your own.

Be realistic about the limitations of your favorite programming language. I enjoy Erlang, but it's a puzzle language, meaning that some truly trivial problems don't have a straightforward mapping to the strengths of the language (such as most algorithms based around destructive array updates). When I don't have a clear picture of what features I'm going to need, I reach for something with few across-the-board sticking points, like Python. Sometimes the cleanest approach involves straightforward loops and counters and return statements right there in the middle of it all.

Let ease of implementation trump perfection. Yes, yes, grepping an XML file is fundamentally wrong...but it often works and is easier than dealing with an XML parsing library. Yes, you're supposed to properly handle exceptions that get thrown and always check for memory errors and division by zero and all that. If you're writing code for a manned Mars mission, then please take the time to do it right. But for a personal tool or a prototype, it's okay if you don't. Really. It's better to focus on the fun parts and generating useful results quickly.

Exploring the Lower Depths of Terseness

There's a 100+ year old system for recording everything that happens in a baseball game. It uses a sheet of paper with a small box for each batter. Whether that batter gets to base or is out--and why--gets coded into that box. It's a scorekeeping method that's still in use at the professional and amateur level, and at major league games you can buy a program which includes a scorecard.

What's surprising is how cryptic the commonly used system is. For starters, each position is identified by a number. The pitcher is 1. The center fielder 8. If the ball is hit to the shortstop who throws it to the first baseman, the sequence is 6-3. See, there isn't even the obvious mnemonic of the first, second, and third basemen being numbered 1 through 3 (they're 3, 4, and 5).

In programming, no one would stand for this. It breaks the rule of not having magic numbers. I expect the center fielder would be represented by something like:

visitingTeam.outfield.center
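
Fleshed out with hypothetical names, the long-winded programming version of the scorekeeper's shorthand looks something like this:

```python
# The standard scorekeeping position numbers, spelled out the way a
# "no magic numbers" code review would demand.
POSITIONS = {
    1: "pitcher", 2: "catcher", 3: "first base", 4: "second base",
    5: "third base", 6: "shortstop", 7: "left field",
    8: "center field", 9: "right field",
}

def decode_play(play):
    # "6-3": fielded by the shortstop, thrown to the first baseman.
    return " to ".join(POSITIONS[int(n)] for n in play.split("-"))

print(decode_play("6-3"))  # shortstop to first base
```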

The difference, though, is that programming isn't done in real-time like scorekeeping. After the initial learning curve, 8 is much more concise, and the terseness is a virtue when recording plays with the ball moving between multiple players. Are we too quick to dismiss extremely terse syntax and justify long-winded notations because they're easier for the uninitiated to read?

Suppose you have a file where each line starts with a number in parentheses, like "(124)", and you want to replace that number with an asterisk. In the vim editor the keystrokes for this are "^cib*" followed by the escape key. "^" moves to the start of the line. The "c" means you're going to change something, but what? The following "ib" means "inner block" or roughly "whatever is inside parentheses." The asterisk fills in the new character.

Once you get over the dense notation, you may notice a significant win: this manipulation of text in vim can be described and shared with others using only five characters. There's no "now press control+home" narrative.

The ultimate in terse programming languages is J. The boring old "*" symbol not only multiplies two numbers, but it pairwise multiplies two lists together (as if a map operation were built in) and also multiplies a scalar value with each element of a list, depending on the types of its operands.

That's what happens with two operands anyway. Each verb (the J terminology for "operator") also works in a unary fashion, much like the minus sign in C represents both subtraction and negation. When applied to a lone value "*" is the sign function, returning either -1, 0, or 1 if the operand is negative, zero, or positive.

So now each single-character verb has two meanings, but it goes further than that. To increase the number of symbolic verbs, each can have either a period or a colon as a second character, and then each of these has both one- and two-operand versions. "*:" squares a single parameter or returns the nand ("not and") of two parameters. Then there's the two-operand version of "*." which computes the least common multiple, and I'll give it up now before everyone stops reading.

Here's the reason for this madness: it allows a wide range of built-in verbs that never conflict with user-defined, alphanumeric identifiers. Without referencing a single library you've got access to prime number generation ("p:"), factorial ("!"), random numbers ("?"), and matrix inverse ("%.").

Am I recommending that you switch to vim for text editing and J for coding? No. But when you see an expert working with those tools, getting results with fewer keystrokes than it would take to import a Python module, let alone the equivalent scripting, then...well, there's something to the terseness that's worth remembering. It's too impressive to ignore simply because it doesn't line up with the prevailing aesthetic for readable code.

(If you liked this, you might enjoy Papers from the Lost Culture of Array Languages.)

Remembering a Revolution That Never Happened

Twenty-three years ago, a book by Edward Cohen called Programming in the 1990s: An Introduction to the Calculation of Programs was published. It was a glimpse into the sparkling software development world of the future, a time when ad hoc coding would be supplanted by Dijkstra-inspired manipulation of proofs. Heck, no need to even run the resulting programs, because they're right by design.

Clearly Mr. Cohen's vision did not come to pass, but I co-opted the title for this blog.

That book is a difficult read. It starts out as bright-eyed and enthusiastic as you could expect a computer science text to be, then rapidly turns into chapter-long slogs to prove the equivalent of a simple linear search correct. It wasn't the difficulty that made the program derivation approach unworkable. Reading and writing music looks extraordinarily complex and clunky to the uninitiated, but that's not stopping vast numbers of people from doing so. The problem is that for almost any non-trivial program, it's not clear what "correct" means.

Here's a simple bit of code to write: display a sorted list of the filenames in a folder. That should take a couple of minutes, including googling around for how to get the contents of a directory.

Except that on some systems you're getting weird filenames like "." and ".." that you don't want to display.

Except that there are also hidden files, either based on an attribute or a naming convention, and you should ignore those too.

Except that you need the sort to be case insensitive or else the results won't make sense to most users.

Except that some people are using spaces between words and some are using underscores, so they should be treated the same when sorting.

Except that a naive sort is going to put "File 10" before "File 9", and while that's logical in the cold innards of the CPU, it's no excuse to present the data that way.

And this is a well-understood, weird old relic of a problem that's nothing compared to all the special cases and exceptions needed to implement a solid user experience in a modern app. Making beautiful code ugly--and maybe impossible to prove correct--by making things easier for the user is a good thing.

(If you liked this, you might enjoy Write Code Like You Just Learned How to Program.)

A Short Quiz About Language Design

Suppose you're designing a programming language. What syntax would you use for a string constant? This isn't a trick; it's as simple as that. If you want to print Hello World then how do you specify a basic string like that in your language?

I'll give you a moment to think about it.

The obvious solution is to use quotes: "Hello World". After all, that's how it works in English, so it's easy to explain to new students of your language. But then someone is going to ask "What if I want to put a quotation mark inside a string?" That's a legitimate question, because it's easy to imagine displaying a string like:

"April 2013" could not be found.

There are a couple of options to fix this. Some form of escape character is one, so an embedded quote is preceded by, say, a backslash. That works, but now you've got to explain a second concept in order to explain strings. Another option is to allow both single and double quotes. If your string contains single quotes, enclose it in double quotes, and vice-versa. A hand goes up, and someone asks about how to enter this string:

"April 2013" can't be found.

Ugh. Now you have two kinds of string delimiters, and you still need escapes. You need to explain these special cases up front, because they're so easy to hit.

What if, instead of falling back on the unwritten rule of using single and double quotes, strings were demarcated by something less traditional? Something that's not common in Latin-derived languages? I'll suggest a vertical bar:

|"April 2013" can't be found.|

That may be uncomfortable at first glance, but give it a moment. Sure, a vertical bar will end up in a string at some point--regular expressions with alternation come to mind--but the exceptional cases are no longer blatant and nagging, and you could get through a beginning class without even mentioning them.

(If you liked this, you might enjoy Explaining Functional Programming to Eight-Year-Olds.)

Stumbling Into the Cold Expanse of Real Programming

This is going to look like I'm wallowing in nostalgia, but that's not my intent. Or maybe it is. I started writing this without a final destination in mind. It begins with a question:

How did fast action games exist at all on 8-bit systems?

Those were the days of processors living below the 2 MHz threshold, with each instruction run to completion before even considering the next. No floating point math. Barely any integer math, come to think of it: no multiplication or division and sums of more than 255 required two additions.

But that kind of lively statistic slinging doesn't tell the whole story or else there wouldn't have been so many animated games running--usually at sixty frames per second--on what appears to be incapable hardware. I can't speak to all the systems that were available, but I can talk about the Atari 800 I learned to program on.

Most games didn't use memory-intensive bitmaps, but a gridded character mode. The graphics processor converted each byte to a character glyph as the display was scanned out. By default these glyphs looked like ASCII characters, but you could change them to whatever you wanted, so the display could be mazes or platforms or a landscape, and with multiple colors per character, too. Modify one of the character definitions and all the references to it would be drawn differently next frame, no CPU work involved.

Each row of characters could be pixel-shifted horizontally or vertically via two memory-mapped hardware registers, so you could smoothly scroll through levels without moving any data.

Sprites, which were admittedly only a single color each, were merged with the tiled background as the video chip scanned out the frame. Nothing was ever drawn to a buffer, so nothing needed to be erased. The compositing happened as the image was sent to the monitor. A sprite could be moved by poking values in position registers.

The on-the-fly compositing also checked for overlap between sprites and background pixels, setting bits to indicate collisions. There was no need for even simple rectangle intersection tests in code, given pixel-perfect collision detection at the video processing level.

What I never realized when working with all of these wonderful capabilities, was that to a large extent I was merely scripting the hardware. The one sound and two video processors were doing the heavy lifting: flashing colors, drawing characters, positioning sprites, and reporting collisions. It was more than visuals and audio; I didn't even think about where random numbers came from. Well, that's not true: I know they came from reading memory location 53770 (it was a pseudo-random number generator that updated every cycle).

When I moved to newer systems I found I wasn't nearly the hotshot game coder I thought I was. I had taken for granted all the work that the dedicated hardware handled, allowing me to experiment with game design ideas.

On a pre-Windows PC of the early 1990s, I had to write my own sprite-drawing routines. Real ones, involving actual drawing and erasing. Clipping at the screen edges? There's something I never thought about. The Atari hardware silently took care of that. But before I could draw anything, I had to figure out what data format to use and how to preprocess source images into that format. I couldn't start a tone playing with two register settings; I had to write arcane sound mixing routines.

I had wandered out of the comfortable realm where I could design games in my head and make them play out on a TV at my parents' house and stumbled into the cold expanse of real programming.

(If you liked this, you might enjoy A Personal History of Compilation Speed.)

Flickr's Redesign is a Series of Evolutionary Changes

After years of teetering on the brink of relevance, Flickr is back in the limelight thanks in part to a more modern appearance. But here's something that may not be so obvious: it wasn't a sudden reworking of Flickr. It's been evolving through a series of smaller improvements over the course of fifteen months.

In February 2012, photo thumbnails presented as a grid of small squares floating in a sea of whitespace were replaced with the justified view: images cropped to varying widths and packed into aesthetically pleasing rows in the browser window. Initially this was only for the favorites page, but a few months later it was applied to the amalgamation of recent photos from your contacts, then to the photos in topic-oriented groups.

In December 2012, the iOS Flickr app was replaced with a completely rewritten, better designed version. It sounds drastic, rewriting an app, but it's only a client for interacting with the Flickr database. The core of Flickr remained the same.

Around the same time, the justified view spread to the Explore (top recent photos) page.

When the May 2013 redesign hit, most of the pieces were already in place. Sure, there was some visual design work involved, but if you look closely one of the most striking changes is that the justified view is now used for individual photostreams.

I love stories like this, because it's my favorite way to develop: given an existing, working application, pick one thing to improve. Not a full rewrite. Not a Perl 6 level of manic redesign. Not a laundry list of changes. One thing. The lessons learned from that one improvement may lead to further ideas to try which will lead to still further ideas. Meanwhile you're dealing with an exponentially simpler problem than architecting an entire lineage of such theoretical improvements all at once.

(If you liked this, you might enjoy What Do People Like?)

Getting Comfortable with the Softer Side of Development

When I was in college, I took an upper-level course called "Operating Systems." It was decidedly hardcore: preemptive multitasking and synchronization, task scheduling, resource management, deadlock avoidance, and so on. These were the dark, difficult secrets that few people had experience with. Writing one's own operating system was the pinnacle of geeky computer science aspirations.

The most interesting thing about that course, in retrospect, is what wasn't taught: anything about how someone would actually use an operating system. I don't mean flighty topics like flat vs. skeuomorphic design, but instead drop way down to something as fundamental as how to start an application or even how you'd know which applications are available to choose from. Those were below the radar of the computer science definition of "operating system." And not just for the course, either. Soft, user experience topics were nowhere to be found in the entire curriculum.

At this point, I expect there are some reactions to the previous two paragraphs brewing:

"You're confusing computer science and human-computer interaction! They're two different subjects!"

"Of course you wouldn't talk about those things in an operating systems course! It's about the lowest-level building blocks of an OS, not about user interfaces."

"I don't care about that non-technical stuff! Some designer-type can do that. I'm doing the engineering work."

There's some truth in each of these--and the third is simply a personal choice--but all it takes is reading a review of OS X or Windows where hundreds of words are devoted to incremental adjustments to the Start menu and dock to realize those fluffy details aren't so fluffy after all. They matter. If you want to build great software, you have to accept that people will dismiss your application because of an awkward UI or font readability issues, possibly switching to a more pleasing alternative that was put together by someone with much less coding skill than you.

So how do you nudge yourself in that direction without having to earn a second degree in a softer, designery field?

Learn basic graphic design. Not so much how to draw things or create your own artistic images (I'm hopeless in that regard), but how to use whitespace, how fonts work together, what a good color scheme looks like. Find web pages and book covers that you like and deconstruct them. Take the scary step of starting with a blank page and arranging colors and fonts and text boxes on it. Hands-on experimentation is the only way to get better at this.

Read up on data visualization. Anything by Edward Tufte is a good place to start.

Foster a minimalist aesthetic. If you're striving for minimalism, then you're giving just as much thought to what to leave out as what to include, and you need to make hard choices. That level of thought and focus is only going to make your application better. You can go too far with minimalism, but a quick glance around the modern software world shows that this isn't a major worry.

Don't build something a certain way simply because "that's how it's always been done." There's a strong programmer impulse to clone, to implement what you've already seen. That can result in long eras of misguidedly stagnant IDEs or calculator apps, because developers have lost sight of the original problem and are simply rehashing what they're familiar with.

Optimize for things that directly affect users. Speed and memory are abstract in most cases. Would you even notice if an iPhone app used 10 megabytes instead of 20? Documentation size and tutorial length are more concrete, as are the number of steps it takes to complete common tasks.

Tips for Writing Functional Programming Tutorials

With the growing interest in a functional programming style, there are more tutorials and blog entries on the subject, and that's wonderful. For anyone so inclined to write their own, let me pass along a few quick tips.

Decide if you're writing a tutorial about functional programming or a specific language. If you're covering the feature set of Haskell, from the type system to laziness to monads, then you're writing about Haskell. If you show how to explore interesting problems and the executable parts of your tutorial happen to be written in Haskell, then you're writing about functional programming. See the difference?

Let types explain themselves. The whole point of type inference is that it's behind the scenes and automatic, helping you write more correct code with less bookkeeping. Don't negate that benefit by talking about the type system explicitly. Let it be silently assimilated while working through interesting examples and exercises that have nothing to do with types.

Don't talk about currying. There's a fascinating theoretical journey from a small set of expressions--the lambda calculus--to a more useful language. With just the barest of concepts you can do seemingly crazy things like recursion without named functions and using single-argument functions to mimic functions that take multiple arguments (a.k.a. currying). Don't get so swept up in that theory that you forget the obvious: in any programming language ever invented, there's already a way to easily define functions of multiple arguments. That you can build this up from more primitive features is not useful or impressive to non-theoreticians.

Make sure you've got meaningful examples. If you have functions named foo or bar, then that's a warning sign right there. If you're demonstrating factorials or the Fibonacci sequence without a reason for calculating them (and there are reasons, such as permutations), then choose something else. There are curious and approachable problems everywhere. It's easy to write a dog_years function based on the incorrect assumption that one human year equals seven dog years. There's a more accurate computation where the first two years of a dog's life are 10.5 human years each, then each year after that maps to four human years. That's a perfect beginner-level problem.

(If you liked this, you might enjoy You, Too, Can Be on the Cutting Edge of Functional Programming Research.)

Organizational Skills Beat Algorithmic Wizardry

I've seen a number of blog entries about technical interviews at high-end companies that make me glad I'm not looking for work as a programmer. The ability to implement oddball variants of heaps and trees on the spot. Puzzles with difficult constraints. Numeric problems that would take ten billion years to complete unless you can cleverly analyze and rephrase the math. My first reaction is wow, how do they manage to hire anyone?

My second reaction is that the vast majority of programming doesn't involve this kind of algorithmic wizardry.

When it comes to writing code, the number one most important skill is how to keep a tangle of features from collapsing under the weight of its own complexity. I've worked on large telecommunications systems, console games, blogging software, a bunch of personal tools, and very rarely is there some tricky data structure or algorithm that casts a looming shadow over everything else. But there's always lots of state to keep track of, rearranging of values, handling special cases, and carefully working out how all the pieces of a system interact. To a great extent the act of coding is one of organization. Refactoring. Simplifying. Figuring out how to remove extraneous manipulations here and there.

This is the reason there are so many accidental programmers. You don't see people casually become neurosurgeons in their spare time--the necessary training is specific and intense--but lots of people pick up enough coding skills to build things on their own. When I learned to program on an 8-bit home computer, I didn't even know what an algorithm was. I had no idea how to sort data, and fortunately for the little games I was designing I didn't need to. The code I wrote was all about timers and counters and state management. I was an organizer, not a genius.

I built a custom tool a few years ago that combines images into rectangular textures. It's not a big program--maybe 1500 lines of Erlang and C. There's one little twenty line snippet that does the rectangle packing, and while it wasn't hard to write, I doubt I could have made it up in an interview. The rest of the code is for loading files, generating output, dealing with image properties (such as origins), and handling the data flow between different parts of the program. This is also the code I tweak whenever I need a new feature, better error handling, or improved usability.

That's representative of most software development.

(If you liked this, you might enjoy Hopefully More Controversial Programming Opinions.)

Getting Past the Cloning Instinct

When I write about creativity or similarly non-technical subjects, I often get mail from pure coders asking how they can become better at design. There's an easy response: take an idea from your head all the way to completion.

Almost certainly that idea isn't as well thought-out as you'd hoped, and you'll have to run experiments and backtrack and work out alternate solutions. But if you stick with it and build everything from the behind-the-scenes processing to the UI, make it simple enough for new users to learn without you being in the room to guide them, and agonize over all the choices and details that define an app, then congratulations! You've just gone through the design process.

Obvious advice? It would be, except there's a conflicting tendency that needs to be overcome first: the immediate reaction upon seeing an existing, finished product of "I want to make my own version of that."

It's a natural reaction, and the results are everywhere. If a fun little game gets popular, there's an inevitable flood of very similar games. If an app is useful, there will be me-too versions, both commercial and open source. Choosing to make something that already exists shifts the problem from one of design to one that's entirely engineering driven. There's a working model to use for reference. Structure and usability problems are already solved. Even if you desperately want to change one thing you think the designer botched, you're simply making a tweak to his or her work.

Originality is not the issue; what matters is the process that you go through. If your work ends up having similarities to something that came before it, that's completely different than if you intentionally set out to duplicate that other app in the first place.

The cloning instinct makes sense when you're first learning development. Looking at an existing application and figuring out how to implement it is much easier than creating something new at the same time. But at some point, unless you're happy being just the programmer, you need to get beyond it.

(If you liked this, you might enjoy The Pure Tech Side is the Dark Side.)

How much memory does malloc(0) allocate?

On most systems, this little C program will soak up all available memory:

#include <stdlib.h>
int main(void) {
    while (1) {
        malloc(0);
    }
}

so the answer is not the obvious "zero." But before getting into malloc(0), let's look at the simpler case of malloc(1).

There's an interesting question that new C programmers ask about malloc: "Given a pointer to dynamically allocated memory, how can I determine how many bytes it points to?" The answer, rather frustratingly, is "you can't." But when you call free on that same pointer, the memory allocator knows how big the block is, so it must be stored somewhere. That somewhere is commonly adjacent to the allocated memory, along with any other implementation-specific data needed for the allocator.

In the popular dlmalloc implementation, between 4 and 16 bytes of this overhead are added to a request, depending on how the library is configured and whether pointers are 32 or 64 bits. 8 bytes is a reasonable guess for a 64-bit system.

To complicate matters, there's a minimum block size that can be returned by malloc. Alignment is one reason. If there's an integer size secretly prepended to each block, then it doesn't make sense to allocate a block smaller than an integer. But there's another reason: when a block is freed, it gets tracked somehow. Maybe it goes into a linked list, maybe a tree, maybe something fancier. Regardless, the pointers or other data to make that work have to go somewhere, and inside the just-freed block is a natural choice.

In dlmalloc, the smallest allowed allocation is 32 bytes on a 64-bit system. Going back to the malloc(1) question, 8 bytes of overhead are added to our need for a single byte, and the total is smaller than the minimum of 32, so that's our answer: malloc(1) allocates 32 bytes.

Now we can approach the case of allocating zero bytes. It turns out there's a silly debate about the right thing to do, and it hasn't been resolved, so technically allocating zero bytes is implementation-specific behavior. One side thinks that malloc(0) should return a null pointer and be done with it. It works, if you don't mind a null return value serving double duty. It can either mean "out of memory" or "you didn't request any memory."

The more common scheme is that malloc(0) returns a unique pointer. You shouldn't dereference that pointer because it's conceptually pointing to zero bytes, but we know from our adventures above that at least dlmalloc is always going to allocate a 32 byte block on a 64-bit system, so that's the final answer: it takes 32 bytes to fulfill your request for no memory.

[EDIT: I modified the last two paragraphs to correct errors pointed out in email and a discussion thread on reddit. Thank you for all the feedback!]

(If you liked this, you might enjoy Another Programming Idiom You've Never Heard Of.)

Purely Functional Photoshop

One of the first things you learn about Photoshop--or any similarly styled image editor--is to use layers for everything. Don't modify existing images if you can help it. If you have a photo of a house and want to do some virtual landscaping, put each tree in its own layer. Want to add some text labels? More layers.

The reason is straightforward: you're keeping your options open. You can change the image without overwriting pixels in a destructive way. If you need to save out a version of the image without labels, just hide that layer first. Maybe it's better if the labels are slightly translucent? Don't change the text; set the opacity of the layer.

This stuff about non-destructive operations sounds like something from a functional programming tutorial. It's easy to imagine how all this layer manipulation could look behind the scenes. Here's a list of layers using Erlang notation:

[House, MapleTree, AshTree, Labels]

If you want to get rid of the label layer, return a new list:

[House, MapleTree, AshTree]

Or to reverse the order of the trees, make another new list:

[House, AshTree, MapleTree, Labels]

Again, nothing is being modified. Each of these simple manipulations returns a brand new list. Performance-wise there are no worries no matter how much data House represents. Each version of the list is referencing the same data, so nothing is being copied. In Erlang, each of these alternate list transformations creates three or four conses (six or eight memory cells total), which is completely irrelevant.

Now what about changing the opacity of the labels layer? Realistically, a layer should be a dictionary of some sort, maybe a property list:

[{name,"labels"},...]

If one of the possible properties is opacity, then the goal is to return a new list where the layer looks like this:

[{name,"labels"},{opacity,0.8}]

Is this all overly obvious and simplistic? Maybe, especially if you have a functional programming background, but it's an interesting example for a couple of reasons. Non-destructive manipulations are the natural approach; there's no need to keep saying "I know, I know, this may seem awkward, but bear with me, okay?" It also shows the most practical reason for using a language like Erlang, Haskell, or Lisp: so you can easily work with symbolic descriptions of data instead of the raw data itself.

Why Do Dedicated Game Consoles Exist?

The announcement of the Nintendo 2DS has reopened an old question: "Should Nintendo give up on designing their own hardware and write games for existing platforms like the iPhone?" A more fundamental question is "Why do dedicated game consoles exist in the first place?"

Rewind to the release of the first major game system with interchangeable cartridges, the Atari VCS (a.k.a. Atari 2600) in 1977. Now instead of buying that game system, imagine you wanted a general purpose PC that could create displays of the same color and resolution as the Atari. What would the capabilities of that mythical 1977 PC need to be?

For starters, you'd need a 160x192 pixel display with a byte per pixel. Well, technically you'd need 7 bits, as the 2600 can only display 128 colors, but a byte per pixel is simpler to deal with. That works out to 30,720 bytes for the display. Sounds simple enough, but there's a major roadblock: 4K of RAM in 1977 cost roughly $125. To get enough memory for our 2600-equivalent display, ignoring everything else, would have been over $900.

For comparison, the retail price of the Atari 2600 was $200.

How did Atari's engineers do it? By cheating. Well, cheating is too strong of a word. Instead of building a financially unrealistic 30K frame buffer, they created an elaborate, specialized illusion. They built a video system--a monochrome background and two single-color sprites--that was only large enough for a single horizontal line. To get more complex displays, game code wrote and rewrote that data for each line on the TV screen. That let the Atari 2600 ship with 128 bytes of RAM instead of the 30K of our fantasy system.

Fast-forward fourteen years to 1991 and the introduction of the Super Nintendo Entertainment System. Getting an early 90s PC to equal the color and resolution of the SNES is easy. The 320x200 256-color VGA mode is a good match for most games. The problem is no longer display quality. It's motion.

The VGA card's memory was sitting on the other side of a strained 8-bit bus. Updating 64,000 pixels at the common Super Nintendo frame rate of 60fps wasn't possible, yet the SNES was throwing around multiple parallaxing backgrounds and large animated objects.

Again, it was a clever focus that made the console so impressive. The display didn't exist as a perfect grid of pixels, but was diced into tiles and tilemaps and sprites and palettes which were all composited together with no involvement from the underpowered 16-bit CPU. There was no argument that a $2000 PC was a more powerful general-purpose machine, likely by several orders of magnitude, but that didn't stop the little $200 game system from providing an experience that the PC couldn't.

The core of both of these examples--graphics on a 2D screen--is a solved problem. Even free-with-contract iPhones have beautiful LCDs overflowing with resolution and triangle-drawing ability, so it's hard to justify a hand-held system that largely hinges on having a similar or worse display. There are other potential points of differentiation, of course. Tactile screens. Head-mounted displays. 3D holographic projection. But eventually it all comes down to this: Is the custom hardware so fundamentally critical to the experience that you couldn't provide it otherwise? Or is the real goal to design great games and have people play them, regardless of which popular system they run on?

(If you liked this, you might enjoy Nothing Like a Little Bit of Magic.)

Dynamic Everything Else

Static vs. dynamic typing is one of those recurring squabbles that you should immediately run away from. None of the arguments matter, because there are easy-to-cite examples of big, famous applications written using each methodology. Then there are confusing cases like large C++ apps that use dynamically typed Lua for scripting. And right about now, without fail, some know-it-all always points out that dynamic typing is really a subset of static typing, which is a lot like defining a liberal as a conservative who holds liberal views, and nothing worthwhile comes from this line of reasoning.

I have no interest in the static vs. dynamic typing dispute. What I want is dynamic everything else.

Sitting in front of me is a modern, ultra-fast MacBook Pro. I know it can create windows full of buttons and checkboxes and beautifully rendered text, because I see those things in every app I use. I should be able to start tapping keys and, a command or two later, up pops a live OS X window that's draggable and receives events. I should be able to add controls to that window in a playful sort of way. Instead I have to create an XCode project (the first obstacle to creative fiddling), compile and run to see what I'm doing (the second), then quit and re-run it for each subsequent change (the third).

There's an impressive system for rendering shapes and curves and fonts under OS X. Like the window example above, I can't interactively experiment with these capabilities either. I end up using a vector-based image editor, but the dynamism goes away when I save what I've created and load it into a different app. Why must the abilities to grab curves and change font sizes be lost when I export? Why can't the editing features be called forth for any image made of vectors?

I know how to solve these problems. They involve writing custom tools, editors, and languages. Switching to a browser and HTML is another option, with the caveat that the curves and glyphs being manipulated are virtual entities existing only inside the fantasy world of the browser.

That aside, it is worth taking a moment to think about the expectations which have been built up over the decades about how static and inflexible most computing environments are.

Code is compiled and linked and sealed in self-contained executables. There's no concept of live-editing, of changing a running system, or at least that's relegated to certain interpreted languages or the distant memories of Smalltalk developers. Reaching for an open source JPEG library is often easier than using the native operating system--even though the OS is clearly capable of loading and displaying JPEGs--especially if you're not using the language it was designed to interface with.

We've gotten used to all of this, but there's no fundamental law dictating systems must be designed this way.

(If you liked this, you might enjoy The UNIX Philosophy and a Fear of Pixels.)

What Are You The World's Foremost Authority Of?

It started with schools of fish, then butterflies.

I was given a series of animals, with some art for each, and my job was to make them move around in ways that looked believable. Well, as believable as you could get on a 16-bit game system. It was more about giving the impression of how fish would swim in a river than being truly realistic. My simple schooling algorithm and obsessive fiddling with acceleration values must have worked, because I ended up demoing those fish half a dozen times to people who stopped by to see them.

Later for an indie game I designed, I implemented a variety of insects: swooping hornets, milling bees, scuttling centipedes, marching ants--sixteen in all. (My wife, Jessica, gets half the credit here; she did all the art and animation.)

I certainly never had it as my goal, but I've gotten pretty good at implementing naturalistic behaviors.

Not long ago I took a shot at getting loose flower petals to fly in the breeze. I didn't have a plan and didn't know if I could come up with something workable, but it only took a few hours. It's also the only time I've used calculus in any code I've written, ever. (Don't be overly impressed; it's about the simplest calculus imaginable.) In one of Bret Victor's wonderful talks, he proposes that mimicking leaf motion by tracing it on a touchscreen is easier than building it programmatically. That's the only time I've disagreed with him. I had already started thinking about how to model a falling leaf.

What kind of company am I in with this odd talent? Is there a society of insect movement simulation designers that I'm not familiar with? Or have I accidentally built up a base of arcane knowledge and experience that's unique to me alone?

Let me ask another question, one that I don't mean to sound condescending or sarcastic in the least: What are you the world's foremost authority of?

Surely it's something. If you have your own peculiar interests and you work on projects related to them, then you've likely implemented various solutions and know the advantages and drawbacks of each. Or maybe you've taken one solution and iterated it, learning more with each version. The further you go, the more likely that you're doing things no one else has done in the same way. You're becoming an authority.

And if you're stumped and can't think of anything you've worked on that isn't in pretty much the same exact territory as what hundreds of other people have done, then it's time to fix that. Go off in a slightly odd direction. Re-evaluate one of your base assumptions. Do something completely random to shake things up. You may end up an authority in one small, quirky area, but you'll still be an authority.

(If you liked this, you might enjoy Constantly Create.)

Three Years in an Alternate Universe

My first post-college programming job was with Ericsson Network Systems in the early 1990s. I had similar offers from three other hard to differentiate telecom companies. The main reason I went with Ericsson was because the word "engineer" was in the title, which I thought sounded impressive. I stayed until I got the three year itch and moved on without looking back, but during those three years I never realized how out of the ordinary that job was.

I was paid overtime. Yes, overtime as a salaried software engineer. There was an unpaid five-hour gap between forty and forty-five hours, but everything after that was paid in full. When I worked 65 hour crunch weeks, I earned 50% more pay.

One in six software engineers were women. Well, okay, on an absolute scale that's not a big number. But in comparison to workplaces I've been in since then, where it's been one in twenty or even a flat-out zero, it's a towering statistic. Note that I'm only including people who designed and wrote code for massively concurrent telephone exchanges as their primary jobs, not non-technical managerial or support roles.

Since then, I know that unpaid crunch time is how things work, and blog complaints about this being free labor are perennial fountains of karma. Likewise, there's much lamenting the abysmally low numbers of women in software development positions. But for three years, when I didn't have enough life experience to know otherwise, I worked in an alternate universe where these problems didn't exist to the degree that I've seen since.

[EDIT: I remember the number being closer to one in three, and I thought I still had my old department directory to prove it, but I didn't. Instead I downplayed the numbers, and "one in six" is close to the overall average for engineers. That watered down the whole piece.]

C is Lower Level Than You Think

Here's a bit of code that many new C programmers have written:

for (int i = 0; i < strlen(s); i++) {
    ...
}

The catch is that strlen is executed in each iteration, and as it involves looking at every character in search of a null, it's an unintentional n-squared loop. The right solution is to assign the length of the string to a local variable before the loop and check that.

"That's just busywork," says our novice coder, "modern compilers are smart enough to do that kind of trivial optimization."

As it turns out, this is much trickier to automate than it may first appear. It's only safe if it can be guaranteed that the body of the loop doesn't modify the string, and that guarantee in C is hard to come by. All bets are off after a single external function call, because memory used by the string could be referenced somewhere else and modified by that call. Most bets are off after a single store through a pointer inside the loop, because it could be pointing to the string passed to strlen. Actually, it's even worse than that: any time you write a value to memory you could be changing the value of any variable in memory. Determining that a[i] can be cached in a register across even a single memory write is unsolvable in the general case.

(To control the chaos, the C99 standard includes the restrict keyword, a way to assert that a pointer is used in a restricted manner. It's only an affirmation on the part of the programmer, and is not checked by the compiler. If you get this wrong the results are undefined.)

The GCC C compiler, as it turns out, will move the strlen call out of the loop in some cases. Don't get too excited, because now you've got an algorithm that's either n-squared or linear depending on the compiler. You could also say the hell with all of this and write a naive optimizer that always lifts the strlen out of a for-loop expression. Great! It works in the majority of real-life cases. But now if you go and write an algorithm, even a contrived one, that's dependent on the string length changing inside the loop...uh oh, now the compiler is transforming your valid intent into code that doesn't work. Do you want this kind of nonsense going on behind the scenes?

The clunky "manually assign the length to a constant" solution is a better one across the board. You're clearly stating that it doesn't matter what external functions do or that there are other writes to memory. You've already grabbed the value you want and that's that.

(If you liked this, you might enjoy How much memory does malloc(0) allocate?)

Self-Imposed Complexity

Bad data visualizations are often more computationally expensive--and harder to implement--than clear versions. A 3D line graph is harder to read than the standard 2D variety, yet the code to create one involves the additional concepts of filled polygons, shading, viewing angle, and line depth. An exploded 3D pie chart brings nothing over an unexploded version, and both still miss out on the simplicity of a flat pie chart (and there's a strong case to be made for using the even simpler bar chart instead).

Even with a basic bar chart there are often embellishments that detract from the purpose of the chart, but increase the amount of interface for creating them and code to draw them: bars with gradients or images, drop shadows, unnecessary borders. Edward Tufte has deemed these chartjunk. A bar chart with all of the useless fluff removed looks like something that, resolution aside, could have been drawn on a computer from thirty years ago, and that's a curious thing.

But what I really wanted to talk about is vector graphics.

Hopeful graphic designers have been saying for years that vector images should replace bitmaps for UI elements. No more redrawing and re-exporting to support a new screen size. No more smooth curves breaking into jagged pixels when zoomed-in. It's an enticing proposition, and if it had been adopted years ago, then the shift to ultra-high resolution displays would have been seamless--no developer interaction required.

Except for one thing: realistic, vector icons are more complicated than they appear. If you look at an Illustrator tutorial for creating a translucent, faux 3D globe, something that might represent "the network" or "the internet," it's not just a couple of Bezier curves and filled regions. There are drop shadows with soft edges and blur filters and glows and reflections and tricky gradients. That's the problem with scalable vectors for everything. It takes a huge amount of processing to draw and composite all of these layers of detailed description, and meanwhile the 64x64 bitmap version was already drawn by the GPU, and there's enough frame time left to draw thousands more.

That was the view three or more years ago, when user-interface accoutrements were thick with gloss and chrome and textures that you wanted to run your finger over to feel the bumps. But now looking at the comparatively primitive, yet aesthetically pleasing icons of iOS 7 and Windows 8, the idea that they could be live vector descriptions isn't so outlandish. And maybe what's kept us from getting there sooner is that it was hard to have self-imposed restraint amid a whirlwind of so much new technology. It was hard to say, look, we're going to have a clean visual language that, resolution aside, could have worked on a computer from thirty years ago.

Optimization in the Twenty-First Century

I know, I know, don't optimize. Reduce algorithmic complexity and don't waste time on low-level noise. Or embrace the low-level and take advantage of magical machine instructions rarely emitted by compilers. Most of the literature on optimization focuses on these three recommendations, but in many cases they're no longer the best place to start. Gone are the days when you could look like a superstar by replacing long, linear lookups with a hash table. Everyone is already using the hash table from the get-go, because it's so easy.

And yet developers are still having performance problems, even on systems that are hundreds, thousands, or even over a hundred-thousand times faster than those which came before. Here's a short guide to speeding up applications in the modern world.

Get rid of the code you didn't need to write in the first place. Early programming courses emphasize writing lots of code, not avoiding it, and it's a hard habit to break. The first program you ever wrote was something like "Hello World!" It should have looked like this:

Hello world!

There's no code. I just typed "Hello world!" Why would anyone write a program for that when it's longer than typing the answer? Similarly, why would anyone compute a list of prime numbers at runtime--using some kind of sieve algorithm, for example--when you can copy a list of pre-generated primes? There are lots of applications out there with, um, factory manager caching classes in them that sounded great on paper, but interfacing with the extra layer of abstraction is more complex than what life was like before writing those classes. Don't write that stuff until you've tried to live without it and fully understand why you need it.

Fix that one big, dumb thing. There are some performance traps that look like everyday code, but can absorb hundreds of millions--or more--cycles. Maybe the most common is a function that manipulates long strings, adding new stuff to the end inside a loop. But, uh-oh, strings are immutable, so each of these append operations causes the entire multi-megabyte string to be copied.

It's also surprisingly easy to unintentionally cause the CPU and GPU to become synchronized, where one is waiting for the other. This is why reducing the number of times you hand off vertex data to OpenGL or DirectX is a big deal. Sending a lone triangle to the GPU can be as expensive as rendering a thousand triangles. A more obscure gotcha is that writing to an OpenGL vertex buffer you've already sent off for rendering will stall the CPU until the drawing is complete.

Shrink your data. Smallness equals performance on modern hardware. You'll almost always win if you take steps to reduce the size of your data. More fits into cache. The garbage collector has less to trace through and copy around. Can you represent a color as an RGB tuple instead of a dictionary with the named elements "red", "green", and "blue"? Can you replace a bulky structure containing dozens of fields with a simpler, symbolic representation? Are you duplicating data that you could trivially compute from other values?

As an aside, the best across-the-board compilation option for most C/C++ compilers is "compile for size." That gets rid of optimizations that look good in toy benchmarks, but have a disproportionately high memory cost. If this saves you 20K in a medium-sized program, that's way more valuable for performance than any of those high-end optimizations would be.

Concurrency often gives better results than speeding up sequential code. Imagine you've written a photo editing app, and there's an export option where all the filters and lighting adjustments get baked into a JPEG. It takes about three seconds, which isn't bad in an absolute sense, but it's a long time for an interactive program to be unresponsive. With concerted effort you can knock a few tenths of a second off that, but the big win comes from realizing that you don't need to wait for the export to complete before continuing. It can be handled in a separate thread that's likely running on a different CPU core. To the user, exporting is now instantaneous.

(If you liked this, you might enjoy Use and Abuse of Garbage Collected Languages.)

Success Beyond the Barrier of Full Understanding

The most memorable computer science course I took in college was a two part sequence: Software Development I and II. In the first semester you built a complete application based on a fictional customer's specification. To date myself, it was written in Turbo Pascal for MS-DOS. In the second semester, you were given someone's completed project from a previous year of Software Development--a different project than the one you just worked through--and were asked to make a variety of modifications to it.

The checkbook tracking application I inherited was written by a madman. A madman who was clearly trying to write as many lines of Pascal as possible. Anything that should have been encapsulated in a handy helper function, wasn't. Code to append an extension to a filename was written in-line every time it was needed. Error checking was duplicated when a file was opened. There weren't any abstract data types, just repetitive manipulation of global data structures. For example, if there was a special case that needed fixing up, then it was handled with separate code in twenty places. The average function was over two hundred lines long. There were functions with fifteen levels of indentation.

And yet, it worked. The author of this mess didn't realize you aren't supposed to write code like this, that all alarms were loudly reporting that the initial plan and set of abstractions were failing, and it was time to stop and re-evaluate. But he or she didn't know what all the sirens and buzzers meant, hit the afterburners, and kept going and going past all reason. And the end result worked. Not in a "you'd rely on it in a life and death situation" way, but good enough for how most non-critical apps get used.

That is why big companies hire young, enthusiastic coders. Not because they're as clueless as my madman, but because they can self-motivate beyond the barrier of full understanding and into imperfect and brute force solutions. I'd want to stop and rework the abstractions I'm using and break things into lots of smaller, reliable, understandable pieces. My code might be more bullet-proof in the end, but I still have a level of admiration for people who can bang out complex apps before they become jaded enough to realize it's not that easy.

(If you liked this, you might enjoy Do You Really Want to be Doing This When You're 50?)

A Worst Case for Functional Programming?

Several times now I've seen the following opinion:

For anything that's algorithm-oriented or with lots of math, I use functional programming. But if it's any kind of simulation, an object-oriented solution is much easier.

I'm assuming "simulation" means something with lots of moving, nested actors, like a battlefield where there are vehicles containing soldiers who are carrying weapons, and even the vehicle itself has wheels and different parts that can be damaged independently and so on. The functional approach looks to be a brain-teaser. If I'm deep down inside the code for a tank, and I need to change a value in another object, how do I do that? Does the state of the world have to get passed in and out of every function? Who would do this?

In comparison, the object-oriented version is obvious and straightforward: just go ahead and modify objects as needed (by calling the proper methods, of course). Objects contain references to other objects and all updates happen destructively and in-place. Or is it that simple?

Let's say the simulation advances in fixed-sized time steps and during one of those steps a tank fires a shell. That's easy; you just add a shell object into the data structures for the simulation. But there's a catch. The tanks processed earlier in the frame don't know about this shell, and they won't until next frame. Tanks processed later, though, have access to information from the future. When they run a "Were any shells recently fired?" check, one turns up, and they can take immediate action.

The fix is to never pollute the simulation by adding new objects mid-frame. Queue up the new objects and insert them at the end of the frame after all other processing is complete.

Now suppose each tank decides what to do based on other entities in the vicinity. Tank One scans for nearby objects, then moves forward. Tank Two scans for objects and decides Tank One is too close. Now it isn't actually too close yet; this is based on an incorrect picture of the field caused by Tank One updating itself. And it may never be too close, if Tank Two is accelerating away from Tank One.

There are a couple of fixes for this. The first is to process situational awareness for every actor on the field as a separate step, then pass that information to the decision/movement phase. The second is to avoid any kind of intra-frame pollution of object data by keeping a list of all changes (e.g., that a tank moved to a new position), then applying all of those changes atomically as a final step.

If I were writing such a simulation in a functional style, then the fixes listed above would be there from the start. It's a more natural way to work when there aren't mutable data structures. Would it be simpler than the OOP version? Probably not. Even though entity updates are put off until later, there's the question of how to manage all of the change information getting passed around. But at one time I would have thought the functional version a complete impossibility, and now it feels like the obvious way to approach these kinds of problems.

(Some of these ideas were previously explored in Purely Functional Retrogames, Part 4 and Turning Your Code Inside Out.)

You Don't Want to Think Like a Programmer

It's an oft-stated goal in introductory coding books and courses: to get you to think like a programmer. That's better than something overly specific and low-level like "to learn Java." It's also not meant to be taken literally. A clearer, more accurate phrasing would be "to get you to break down problems in an analytical way." But let that initial, quirky sequence of five words--"to think like a programmer"--serve as a warning and a reminder.

Because you really don't want to think like a programmer.

It starts slowly, as you first learn good coding practices from the bad. Never use global variables; wrap all data into objects. Write getter and setter methods to hide internal representations. Use const wherever possible. Only one class definition per file, please. Format your source code to encourage reading and understanding by others. Take time to line up your equal signs so things are in nice, neat columns.

Eventually this escalates to thinking in terms of design patterns and citing rules from Code Complete. All these clueless people want you to add features that are difficult and at odds with your beautiful architecture; don't they realize that complexity is the enemy? You come to understand why every time a useful program is written in Perl or PHP it's an embarrassment to computer science. Lisp is the way, and it's worth using even if you don't have access to most of the libraries that make Python such a vital tool. Then one day you find yourself arguing static versus dynamic typing and passionately advocating test-driven development and all hope is lost.

It's not that any of these things are truly bad on their own, but together they occupy your mind. You should be obsessing about the problem domain you're working in--how to make a game without pedantic tutorials, what's the most intuitive set of artistic controls in a photography app--and not endless software engineering concerns.

Every so often I see someone attempting to learn a skill (e.g., web design, game development, songwriting), by finishing a project every day/week/month. I love these! They're exciting and inspirational and immediate. What a great way to learn! The first projects are all about getting something--anything--working. That's followed by re-engineering familiar designs. How to implement Snake, for example. Or Tetris.

If you've embarked on such a journey, the big step is to start exploring your own ideas. Don't copy what people who came before you were copying from other people. Experiment. Do crazy things. If you stick to the path of building what has already been made, then you're setting yourself up as implementor, as the engineer of other people's ideas, as the programmer. Take the opportunity to build a reputation as the creator of new experiences.

And, incidentally, you know how to write code.

(If you liked this, you might enjoy Learning to Ignore Superficially Ugly Code.)

Popular iOS Games That Could Have Been Designed for 8-Bit Systems

Amid all the Flappy Bird hoopla, it struck me that I could have brought that game to life thirty years ago on an 8-bit home computer--if only I had thought of the idea. And then of course someone confirmed this hypothesis by writing a Commodore 64 version. That made me wonder what other popular iOS titles meet the same criteria.

"Implementable on an 8-bit computer" can't simply be equated with pixelated graphics. When I designed 8-bit games, I spent a lot of time up front making sure my idea was a good match for the hardware. Rendering a dozen (or even half that) sprites in arbitrary positions wasn't possible. Ditto for any kind of scaling, rotation, or translucency. Flappy Bird undershoots the hardware of the Atari 800 I learned to program on, with a sparse, scrolling background and one tiny sprite. What other iOS games would work?

Jetpack Joyride. If you've never seen it, take Flappy Bird and change the controls so that a touch-and-hold moves you upward and you drop when released. Make the scrolling world more interesting than pipes. Add floating coins to collect. That's the gist of it, anyway. Other niceties would translate, too, like semi-procedural environments and mission-like objectives ("fly 1000m without collecting any coins"). Most of the special vehicles you can commandeer would need to be dropped, but they're icing and not core gameplay.

Ridiculous Fishing. At first glance there's a lot going on visually, as you drop a line through many layers of fish. In a design move that looks like a concession for 8-bit hardware, the fish swim in horizontal bands. On most systems with hardware sprites, there's a limit to how many can be displayed on the same scan line. But those sprites can be modified and repositioned as the screen draws from top to bottom, so four sprites can be repurposed to eight or twelve or twenty, as long as they're in separate strips. That's a good match for the fishing portion of the game, but less so for the bonus shooting segment (which would need a rethink).

Super Hexagon. This one looks impossible, being based around hardware-accelerated polygons, but it could have been designed by a bold 8-bit coder. The key is that the polygons are flat in the graphic design sense: no textures, no gradients. How do you move a huge, flat triangle across the screen on a retro machine? Draw a line on one side of the triangle in the background color, then draw a new line on the other side. Repeat. Writing a line clipper will take some work, but it's doable. The "don't collide with the shapes" part of the design is easy. Math-heavy polygon collision routines can be replaced by checking a single "does sprite overlap a non-background pixel" register.

Threes! Here's a straightforward one, with no technical trickery or major omissions necessary. A retro four-way joystick is the perfect input device.

All of these designs could have been discovered thirty years ago, but they weren't. Think about that; someone could have come up with Jetpack Joyride's objective system in 1984, but they didn't. Ten years later, they still hadn't. It's a pleasant reminder that good design isn't all about the technology, and that there's a thoughtful, human side to development which doesn't need to move at the breakneck pace we've come to associate with the computing world.

(If you liked this you might enjoy Trapped by Exposure to Pre-Existing Ideas.)

Range-Checks and Recklessness

Here's an odd technical debate from the 1980s: Should compiler-generated checks for "array index out of range" errors be left in production code?

Before C took over completely, with its loose accessing of memory as an offset from any pointer, there was a string of systems-level languages with deeper treatment of arrays, including the ALGOL family, PL/1, Pascal, Modula-2, and Ada. Because array bounds were known, every indexing operation, such as:

frequency[i] = 0

could be checked at runtime to see if it fell within the extents of the array, exiting the program with an error message otherwise.

This was such a common operation that hardware support was introduced with the 80286 processor in the form of the bound instruction. It encapsulated the two checks to verify an index was between the upper and lower bounds of an array. Wait, wasn't the lower bound always zero? Often not. In Pascal, you could have declarations like this:

type Nineties = array[1990..1999] of integer;

Now back to the original question of whether the range checks should live on in shipping software. That error checking is great during development was not controversial, but opinions after that were divided. One side believed it wasteful to keep all that byte-eating, cycle-eating checking around when you knew it wasn't needed. The other group claimed you could never guarantee an absence of bugs, and wouldn't it be better to get some kind of error message than to silently corrupt the state of the application?

There's also a third option, one that wasn't applicable to simpler compilers like Turbo Pascal: have the compiler determine an index is guaranteed to be valid and don't generate range checking code.

This starts out easy. Clearly the constant in Snowfall[1996] is allowed for a variable of type Nineties. Replace "1996" with a variable, and it's going to take more work. If it's the iteration variable in a for loop, and we can ensure that the bounds of the loop are between 1990 and 1999 inclusive, then the range checks in the loop body can be omitted.

Hmmm...what if the for loop bounds aren't constants? What if they're computed by a function in another module? What if there's math done on the indices? What if it's a less structured while loop? Is this another case of needing a sufficiently smart compiler? At what point do diminishing returns kick in, and the complexity of implementation makes it hard to have faith that the solution is working correctly?

I set out to write this not for the technical details and trivia, but more about how my thinking has changed. When I first ran across the range-check compiler option, I was fresh out of the school of assembly language programming, and my obsessive, instruction-counting brain was much happier with this setting turned off. These days I can't see that as anything but reckless. Not only would I happily leave it enabled, but were I writing the compiler myself I'd only remove the checks in the most obvious and trivial of cases. It's not a problem worth solving.

Get Good at Idea Generation

I get more mail about The Recovering Programmer than anything else I've written. Questions like "How can I be more than just the programmer who implements other peoples' master plans?" are tough to respond to. Seeing that feeling of "I can make anything!" slide into "Why am I doing this?" makes me wish there was some easy advice to give, or at least that I could buy each of these askers a beer, so I'd like to offer at least one recommendation:

Get good at idea generation.

Ideas have a bad reputation. They're a dime a dozen. They're worthless unless implemented. Success is 90% perspiration. We've all seen the calls for help from a self-proclaimed designer and his business partner who have a brilliant company logo and a sure-fire concept for an app. All they need is a programmer or two to make it happen, and we all know why it won't work out.

Now get past ideas needing to be on a grand scale--the vision for an entire project--and think smaller. You've got a UI screen that's confusing. You have something non-trivial to teach users and there's no manual. The number of tweakable options is getting out of hand. Any of the problems that come up dozens of times while building anything.

The two easy approaches are to ignore the problem ("What's one more item on the preferences panel?") or do an immediate free association with all the software you've ever been exposed to and pick the closest match.

Here's what I do: I start writing a list of random solutions on a piece of paper. Some won't work, some are simple, some are ridiculous. What I'm trying to do is work through my initial batch of middling thoughts to get to the interesting stuff. If you've ever tried one of those "write a caption for the image" contests, it's the same thing. The first few captions you come up with seem like they're funny, but keep going. Eventually you'll hit comedy gold and those early attempts will look dumb in comparison.

Keep the ideas all over the place instead of circling around what you've already decided is the right direction. What if you had to remove the feature entirely? Could you negate the problem through a change in terminology? What's the most over-engineered solution you can think of? What if this was a video game instead of a serious app? What would different audiences want: a Linux advocate, the person you respect most on twitter, an avant garde artist, someone who can't speak your native language?

The point of this touchy-feeliness isn't just to solve your current problem, but to change your thinking over time. To get your mind working in a unique way, not just restating what you've seen around the web. Every so often you'll have a small breakthrough of an idea that will become a frame for future solutions. Later you'll have another small breakthrough that builds on it. Eventually you'll be out in a world of thought of your own making, where you're seriously considering ideas that aren't even in someone else's realm of possibility.

(If you liked this you might enjoy Advice to Aimless, Excited Programmers.)

You Don't Read Code, You Explore It

(I wrote this in 2012 and rediscovered it in January of this year. I didn't feel comfortable posting it so close to Peter Seibel's excellent Code is Not Literature, so I held off for a few months.)

I used to study the program listings in magazines like Dr. Dobb's, back when they printed the source code to substantial programs. While I learned a few isolated tricks and techniques, I never felt like I was able to comprehend the entirety of how the code worked, even after putting in significant effort.

It wasn't anything like sitting down and reading a book for enjoyment; it took work. I marked up the listings and kept notes as I went. I re-read sections multiple times, uncovering missed details. But it was easy to build up incorrect assumptions in my head, and without any way of proving them right or wrong I'd keep seeing what I wanted to see instead of the true purpose of one particular section. Even if the code was readable in the software engineering sense, boundary cases and implicit knowledge lived between the lines. I'd understand 90% of this function and 90% of that function, and all those extra ten percents would keep accumulating until I was fooling myself if I thought I had the true meaning in my grasp.

That experience made me realize that read isn't a good verb to apply to a program.

It's fine for hunting down particular details ("let's see how many buffers are allocated when a file is loaded"), but not for understanding the architecture and flow of a non-trivial code base.

I've worked through tutorials in the J language--called "labs" in the J world--where the material would have been opaque and frustrating had it not been interactive. The presentation style was unnervingly minimal: here's a concept with some sentences of high-level explanation, and here are some lines of code that demonstrate it. Through experimentation and trial and error, and simply because I typed new statements myself, I learned about the topic at hand.

Of particular note are Ken Iverson's interactive texts on what sound like dry, mathematical subjects, but they take on new life when presented in exploratory snippets. That's even though they rely on J, the most mind-melting and nothing-at-all-like-C language in existence.

I think that's the only way to truly understand arbitrary source code. To load it up, to experiment, to interactively see how weird cases are handled, then keep expanding that knowledge until it encompasses the entire program. I know, that's harder to do with C++ than with Erlang and Haskell (and more specifically, it's harder to do with languages where functions can have wide-ranging side effects that can change the state of the system in hidden ways), and that's part of why interactive, mostly-functional languages can be more pleasant than C++ or Java.

(If you liked this, you might enjoy Don't Be Distracted by Superior Technology.)

Programming Without Being Obsessed With Programming

I don't get asked this very often, and that's surprising. I ask myself it all the time:

If you're really this recovering programmer and all, then why do you frequently write about super technical topics like functional programming?

Sometimes it's just for fun. How much memory does malloc(0) allocate? was a good exercise in explaining something obscure in a hopefully clear way. Those pieces also make me the most nervous, because there are so many experts with all kinds of specialized knowledge, and if I make any mistakes...let's just say that they don't get quietly ignored. (I am grateful for the corrections, in any case.)

But that's not the whole story.

If I don't code, I don't get to make things for all the wonderful devices out there in the world. Some people get all bent out of shape about that requirement and say "see, you're conflating programmer and product designer; you can do all the design work and leave the programming to someone else." That may be true if you're on the right team, but for personal projects it's like saying that writer should be split into two positions: the designer of the plot and characters, and the person who forms sentences on the page. It doesn't work like that.

The catch is, as we all know, developing for today's massive, deeply-layered systems is difficult, and that difficulty can be all-consuming: unreadable quantities of documentation, complex languages, software engineering rules and methodologies to keep everything from spontaneously going up in a spectacular fireball, too much technical choice. There's enough to keep you busy without ever thinking an original thought or crafting a vision of what you'd like to build.

For me the question is not whether to write code, but how to keep the coding side of my mind in check, how to keep it from growing and thinking about too many details and all the wrong things. Once that happens I've lost: I've become a software engineer, and I really don't want to be a software engineer.

That's my angle right there. Programming without being overwhelmed by and obsessed with programming. Simplicity of languages, simplicity of tools, and simplicity in ways of writing code are all part of that.

(If you liked this, then consider being an early follower on twitter.)

Unexpectedly Simple

This is a story of the pursuit of user experience simplicity, confounded by delusions and over-engineering. It's also about text formatters.

The first computer text formatter, RUNOFF, was written in 1964 in assembly language for the CTSS operating system. If you've never used RUNOFF or one of its descendants like the UNIX utility troff, imagine HTML where each tag appears on a line by itself and is identified with a leading period. Want to embolden a word in the middle of a sentence? That's one line to turn bold on, one line for the word, then a third line to turn bold off. This led to elongated documents where the formatter commands gave little visual indication of what the final output would look like.

The RUNOFF command style, of the first character on a line indicating a formatting instruction, carried over to early word processors like WordStar (first released in 1978). But in WordStar that scheme was only used for general settings like double-spacing. You could use control codes mid-line for bold, italics, and underline. This word is in bold: ^Bword^B. (And to be fair you could do this in later versions of troff, too, but it was even more awkward.)

WordPerfect 4.2 for MS-DOS (1986) hid the formatting instructions so the text looked clean, but they could be displayed with a "reveal codes" toggle. I clearly remember thinking this was a terrible system, having to lift the curtain and manually fiddle with command codes. After all, MacWrite and the preceding Bravo for the Xerox Star had already shown that WYSIWYG editing was possible, and clearly it was the simplest possible solution for the user. But I was wrong.

WYSIWYG had drawbacks that weren't apparent until you dove in and worked with it, rather than writing a sentence about unmotivated canines on MacWrite at the local computer shop and trying to justify a $2495 purchase. If you position the cursor at the end of an italicized word and start typing, will the new characters be italicized or not? It depends. They might not even be in the same font. If you paste a paragraph, and all of a sudden there's excess trailing space below it even though there isn't a carriage return in the text, how do you remove it?

More fundamentally, low-level presentation issues--font families, font sizes, boldness, italics, colors--were now intermingled with the text itself. You don't want to manually change the formatting of all the text serving as section headers; you want them each to be formatted the way a section header should be formatted. That's fixable by adding another layer, one of user-defined paragraph styles. Now there's more to learn, and some of the simplicity of WYSIWYG is lost.

Let's back up a bit to the initial problem of RUNOFF: the marked-up text bears little resemblance to the structure of the formatted output. What if, instead of drawing attention to a rigid command structure, the goal is to make it invisible? Instead of .PP on its own line to indicate a paragraph, assume all text is in paragraphs separated by blank lines. An asterisk as the first character means a line is an element of an unordered list.
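
The whole idea fits in a few lines. Here's a toy translator in Python, handling only the two rules just described (blank-line paragraphs and asterisk list items), not any particular real system:

```python
# Toy "invisible markup" translator: blank lines separate paragraphs,
# a leading asterisk marks an unordered-list item.

def to_html(text):
    html = []
    for block in text.strip().split("\n\n"):
        lines = block.splitlines()
        if all(line.startswith("*") for line in lines):
            items = "".join("<li>%s</li>" % l[1:].strip() for l in lines)
            html.append("<ul>%s</ul>" % items)
        else:
            # Paragraph text may wrap across source lines.
            html.append("<p>%s</p>" % " ".join(lines))
    return "\n".join(html)

doc = """First paragraph,
wrapped across lines.

* one
* two"""

print(to_html(doc))
# <p>First paragraph, wrapped across lines.</p>
# <ul><li>one</li><li>two</li></ul>
```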

The MediaWiki markup language takes some steps in this direction. The REBOL MakeDoc tool goes further. John Gruber's Markdown is perhaps the cleanest and most complete system for translating visually formatted text to HTML. (Had I known about Markdown, I wouldn't have developed the minimal mark-up notation I use for the articles on this site.)

That's my incomplete and non-chronological history of text formatters, through ugly and over-engineered to a previously overlooked simplicity. You might say "What about SGML and HTML? What about TeX?" which I'll pretend I didn't hear, and say that the real question is "What other application types have grown convoluted and are there unexpectedly simple solutions that are being ignored?"

(If you liked this, you might enjoy Documenting the Undocumentable.)

You Can't Sit on the Sidelines and Become a Philosopher

At some point every competent developer has that flash of insight when he or she realizes everything is fundamentally broken: the tools, the languages, the methodologies. The brokenness--and who could argue with it--is not the important part. What matters is what happens next after this moment of clarity, after this exposure to the ugly realities of software.

You could ignore it, because that's how it is. You still get paid regardless of what you're forced to use.

You could go on a quest for perfection and try all the exotic languages and development environments, even taking long archaeological expeditions into once promising but now lost ideas of the 1970s and 80s. Beware, for you may never return.

You could try to recreate computing in your own image, starting from a new language, no wait, a new operating system--wait, wait, wait--a new processor architecture. This may take a while, and eventually you will be visited by people on archaeological expeditions.

The right answer is a blend of all of these. You have to ignore some things, because while they're driving you mad, not everyone sees them that way; you've built up a sensitivity. You can try new tools and languages, though you may have to carry some of their concepts into future projects and not the languages themselves. You can fix things, especially specific problems you have a solid understanding of, and probably not the world of technology as a whole.

As long as you eventually get going again you'll be fine.

There's another option, too: you could give up. You can stop making things and become a commentator, letting everyone know how messed-up software development is. You can become a philosopher and talk about abstract, big picture views of perfection without ever shipping a product based on those ideals. You can become an advocate for the good and a harsh critic of the bad. But though you might think you're providing a beacon of sanity and hope, you're slowly losing touch with concrete thought processes and skills you need to be a developer.

Meanwhile, other people in their pre-epiphany states are using those exact same technologies that you know are broken, and despite everything you do to convince them that this can't possibly work...they're successful.

I decided to take my own advice by writing an iPhone game. It's not written in an exotic functional language, just a lot of C++, some Objective-C, and a tiny interpreter for scripting. There are also parts of the code written in a purely functional style, and some offline tools use Erlang. It wasn't ever intended as a get-rich project, but more of get-back-in-touch project. As such, it has been wildly successful. (Still, if you have fun with it, an App Store rating would be appreciated.)

(If you liked this, you might enjoy The Background Noise Was Louder than I Realized.)

Lost Lessons from 8-Bit BASIC

Unstructured programming with GOTO is the stuff of legend, as are calling subroutines by line number--GOSUB 1000--and setting global variables as a mechanism for passing parameters.

The little language that fueled the home computer revolution has been long buried beneath an avalanche of derision, or at least disregarded as a relic from primitive times. That's too bad, because while the language itself has serious shortcomings, the overall 8-bit BASIC experience has high points that are worth remembering.

It's hard to separate the language and the computers it ran on; flipping the power switch, even without a disk drive attached, resulted in a BASIC prompt. If nothing else, it could be treated as a calculator:

PRINT "seconds in a week: ",60*60*24*7

or

PRINT COS(2)/2

Notice how the cosine function is always available for use. No importing a library. No qualifying it with MATH.TRIG.

Or take advantage of this being a full programming language:

T = 0
FOR I=1 TO 10:T=T+I*I:NEXT I
PRINT T

It wasn't just math. I remember seeing the Atari 800 on display in Sears, the distinctive blue background and READY prompt visible across the department. I'd switch to a bitmapped graphics mode with a command window at the bottom and dash off a program that looped across screen coordinates displaying a multicolored pattern. It would run as an in-store demo for the rest of the day or until some other know-it-all pressed the BREAK key.

There's a small detail that I skipped over: entering a multi-line program on a computer in a department store. Without starting an external editor. Without creating a file to be later loaded into the BASIC interpreter (which wasn't possible without a floppy drive).

Here's the secret. Take any line of statements that would normally get executed after pressing return:

PLOT 0,0:DRAWTO 39,0

and prefix it with a number:

10 PLOT 0,0:DRAWTO 39,0

The same commands, the same editing keys, and yet it's entirely different. It adds the line to the current program as line number 10. Or if line 10 already exists, it replaces it.
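
The mechanism is simple enough to model in a few lines of Python. This is a toy, obviously; a real BASIC tokenized the stored line rather than keeping the raw text:

```python
# Toy model of 8-bit BASIC line entry: a line starting with a number
# is stored as (or replaces) that program line; anything else runs
# immediately.

program = {}  # line number -> statement text

def enter(line):
    head, _, rest = line.partition(" ")
    if head.isdigit():
        program[int(head)] = rest      # store, or replace an existing line
    else:
        print("executing:", line)      # immediate mode

enter("PLOT 0,0:DRAWTO 39,0")      # runs right away
enter("10 PLOT 0,0:DRAWTO 39,0")   # becomes line 10
enter("10 PLOT 5,5")               # replaces line 10
```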

Lines are syntax checked as entered. Well, each line is parsed and tokenized so that previous example turns into this:

Line #: 10
Bytes in line: 6
PLOT command
X: 0
Y: 0
DRAWTO command
X: 39
Y: 0

That's how the line is stored in memory, provided there aren't any errors. The displayed version is an interpretation of those bytes. Code formatting is entirely handled by the system and not something you think about.

All of this, from the always-available functions, to being able to develop programs without external tools, to code stored as pre-parsed tokens, made BASIC not just a language but a development system. Compare that to most of today's compilers which feed on self-contained files of code. Sometimes there's a run-eval-print loop so there's interactivity, but editing real programs happens elsewhere. And then there are what have come to be known as Integrated Development Environments which tie together file-oriented compilers with text editors and sometimes interactive command lines, but now they get derided in ways BASIC never was: for being bulky and cumbersome.

Did I mention that Atari BASIC was contained in an eight kilobyte ROM cartridge?

How did IDEs go so wrong?

(If you liked this you might enjoy Stumbling Into the Cold Expanse of Real Programming.)

Design is Expensive

The result may at first glance seem a trifle, but I have a notebook filled with the genesis and evolution of the iPhone game DaisyPop. All those small, painstaking choices that now get "of course it should be like that" reactions...that's where the bulk of the development time went. I wanted to go over two specific design details, neither of which I knew the existence of at the project's outset.

The core mechanic is tapping flowers and insects that drift and scurry about the screen. Tapping a daisy freezes this wandering and it rapidly expands outward before bursting. Any flowers contacted by the first also expand and so on recursively. Insects, as everyone knows, don't expand when touched; they race forward, possibly setting off other daisies and insects.

I put off implementing audio for this chaining process until late. I wasn't worried. I'd just plug in the sounds when they were ready, and I did. And it sounded terrible.

All the sound effects in a ten-length chain played in an overlapping jumble, sometimes four or five sounds starting the same frame. I spent a while fiddling with the chain code, trying to slow things down, to limit how much activity could occur at once, but it didn't help. I might have prevented two sounds from triggering on the same frame, but they were separated by a mere sixtieth of a second which didn't make a discernible difference. Messing with the chain system itself was also breaking the already polished and proven feel of the game.

The eventual solution was to not play sounds immediately, but queue them up. Every eight frames--a number found by trial and error--take a sound from the queue and start it. And it worked beautifully, stretching out the audio experience for big chains over several seconds, a regular rhythm of pentatonic tones. Almost.

Now the sounds weren't in sync with the visuals, and surprisingly it didn't matter. There's no real-life reference for when a purple daisy expanding to touch a white flower makes a sound, so the softness introduced by the audio queueing scheme wasn't a problem. But it was immediately noticeable when the quick run of notes played by a racing insect wasn't lined up with the animation, even by a little bit. The fix was to play insect sounds immediately instead of queuing them like daisy audio. It's inconsistent, yes, but it worked.
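
Here's roughly that scheme as described, sketched in Python with invented names and a list standing in for the actual mixer: daisy sounds are queued and released one every eight frames; insect sounds bypass the queue so they stay locked to the animation.

```python
from collections import deque

RELEASE_INTERVAL = 8  # frames between queued sounds, found by trial and error

class SoundScheduler:
    def __init__(self):
        self.queue = deque()
        self.frame = 0
        self.played = []  # (frame, sound) pairs; stand-in for the mixer

    def trigger(self, sound, immediate=False):
        if immediate:              # insects: must sync with the visuals
            self.played.append((self.frame, sound))
        else:                      # daisies: stretch the chain out in time
            self.queue.append(sound)

    def tick(self):                # call once per frame
        if self.frame % RELEASE_INTERVAL == 0 and self.queue:
            self.played.append((self.frame, self.queue.popleft()))
        self.frame += 1

s = SoundScheduler()
for tone in ["C", "D", "E", "G", "A"]:   # a five-flower chain, all at once
    s.trigger(tone)
for _ in range(40):
    s.tick()
print(s.played)  # tones spaced eight frames apart instead of a jumble
```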

In any game there are goals that drive the player forward, such as finishing a level or trying to get a high score. In DaisyPop it's the latter. Typically you're only alerted of an exceptional score after you finish playing (think of when you enter your initials in an old school arcade game). I was thinking about how to give feedback mid-game. If there's a top ten list of scores, wouldn't it be motivational to know you've broken into that list and are now climbing?

A plain message is one option, but it's not dynamic; it doesn't draw your eye amidst the chaos. Eventually I settled on a triangle that appears and drifts upward--signifying rising up the score charts--which acts as a backdrop for overlaid text that moves in parallax: "#5" in a larger font, with "best score" beneath it. (Trivia: the triangle is one of two pieces of art I did myself.) After working out the right motion and how to handle edge cases I'm happy with the result. More games should do this.

I could have dodged both of these issues by wholesale borrowing an existing game concept and not trying something experimental. I would have had a working model to serve as a reference. It would have been so much easier in terms of time and mental effort if I had said sure, I just want to make a version of that game right there that people already know how to play and like.

Thinking about the details, wandering around an unknown space trying to invent the right solution, is expensive.

(If you liked this, you might enjoy All that Stand Between You and a Successful Project are 500 Experiments.)

Extreme Formatting

Not long after I wrote Solving the Wrong Problem, it occurred to me that this site is small because of what I decided to leave out, and that I never tried to optimize what remained. To that end I used a PNG reducer on the two images (one accompanies Accidental Innovation, the other is in The End is Near for Vertical Tab). And I ran the CSS through a web-based minifier.

A Cascading Style Sheet written in the common way looks like this:

blockquote {
font-style: italic;
margin-left: 1.25em
}
#top {
background-color: #090974;
color: #FFF;
margin-bottom: .67em;
border-color: #7373D9;
border-style: none none solid;
border-width: 12px;
padding: 2em 0 0
}

and so on, usually for a couple of screenfuls, depending on the complexity. The minified version of the above has no extraneous spaces or tabs and is only two lines: one for each selector. But now I had introduced a workflow problem. I needed to keep around the nicely formatted original for easy editing, then re-minify it before uploading to the site. The first time I wanted to change the CSS it was a simple tweak, so rather than automate the conversion I went in and edited the minified version directly.

And I found that I preferred working with the crunched-down CSS.

Unless you're reading the newsfeed, the CSS file is already on your computer or phone, so take a moment to look at it (Editor's Note: I've formatted and inlined it in the <head>). If you're sighing and shaking your head at this point, you could put each selector on a line by itself, but that adds another 23 lines. 23 more if each closing brace gets its own line. And another 40 or so to make sure there's a newline after each property. Somehow the raw 23-line version puts the overall simplicity of the stylesheet into clear perspective. Free of superficial structure, it takes less than half of a vertical window in my text editor. Is inflating that to over 100 lines--enough that I need to scroll to see them all--buying anything concrete, or is it that verticality is such an ingrained formatting convention?

Okay, right, there's also readability. Surely those run together properties are harder to scan visually? Syntax highlighting makes a big difference, and any editor with a "highlight all occurrences of a string" feature makes this layout amazing. I can see everywhere the border-style property is used all at once. No jumping to different parts of the document.

Here's a good question: Does this apply to real code and not just HTML stylesheets?

There's an infamous, single printed page of C code, written by Arthur Whitney one afternoon in 1989, which became the inspiration for the J language interpreter. It occasionally gets rediscovered and held up as an example of what would happen if a programmer went rogue and disregarded all rules and aesthetics of code formatting, and most who see it are horrified. All those macros? Short identifiers? Many statements on the same line? Entire functions on a single line, including those triggers of international debate, the curly braces?

Despite being misaligned with popular layout standards, is it really such a mess? It's small, so you can study the whole thing at once without scrolling. The heavy use of macros prevents noisy repetition and allows thinking in terms of higher-level chunks. That level of density makes the horizontal layout easier to follow than it would be with preprocessor-free C. (To be fair, the big downside is that this is not the kind of code debuggers work well with.)

I suspect I'm not seeing these two examples of extreme formatting the same way that someone who has programmed exclusively with languages of the C / Java / Javascript class does. I happily fused BASIC statements together with a colon, though I admit a moment of hesitation before attempting the same daring feat in C with a semicolon. J and Forth naturally have tight, horizontal layouts, and that's part of why those language cultures sometimes use quantity of code to specify problem difficulty.

"How hard do you think that is?"

"Maybe a dozen lines or so."

Programming Modern Systems Like It Was 1984

Imagine you were a professional programmer in 1984, then you went to sleep and woke up 30 years later. How would your development habits be changed by the ubiquitous, consumer-level supercomputers of 2014?

Before getting to the answer, realize that 1984 wasn't all about the cycle-counting micro-optimizations that you might expect. Well, actually it was, at least in home hobbyist circles, but more of necessity and not because it was the most pleasant option. BASIC's interactivity was more fun than sequencing seven instructions to perform the astounding task of summing two 16-bit numbers. Scheme, ML, and Prolog were all developed in the previous decade. Hughes's Why Functional Programming Matters was written in 1984, and it's easy to forget how far, far from practical reality those recommendations must have seemed at the time. It was another year of two parallel universes, one of towering computer science ideas and the other of popular hardware incapable of implementing them.

The possible answers to the original question--and mind you, they're only possible answers--are colored by the sudden shift of a developer jumping from 1984 to 2014 all in one go, without experiencing thirty years of evolution.

It's time to be using all of those high-level languages that are so fun and expressive, but were set aside because they pushed the limits of expensive minicomputers with four megabytes of memory and weren't even on the table for the gold rush of 8-bit computer games.

Highly optimizing compilers aren't worth the risk. Everything is thousands, tens of thousands, of times faster than it used to be. Chasing some additional 2-4x through complex and sensitive manipulations isn't worth it. You'll regret your decision when for no clear reason your app starts breaking up at high optimization settings, maybe only on some platforms. How can anyone have confidence in a tool like that?

Something is wrong if most programs don't run instantaneously. Why does this little command line program take two seconds to load and print the version number? It would take serious effort to make it that slow. Why does a minor update to a simple app require re-downloading all 50MB of it? Why are there 20,000 lines of code in this small utility? Why is no one questioning any of this?

Design applications as small executables that communicate. Everything is set-up for this style of development: multi-core processors, lots of memory, native support for pipes and sockets. This gives you multi-core support without dealing with threads. It's also the most bulletproof way of isolating components, instead of the false confidence of marking class members "private."
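
As a minimal illustration of that style, here's a Python sketch where a parent program talks to a small worker process over an ordinary pipe. The worker is inlined with -c only to keep the example self-contained; in practice it would be its own executable:

```python
import subprocess
import sys

# A tiny "worker" program: reads numbers on stdin, writes their squares.
worker_src = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(int(line) ** 2)\n"
)

# The parent feeds the worker input and collects its output; the two
# are isolated processes, communicating only through the pipe.
proc = subprocess.run(
    [sys.executable, "-c", worker_src],
    input="1\n2\n3\n",
    capture_output=True,
    text=True,
)
print(proc.stdout.split())  # ['1', '4', '9']
```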

Don't write temporary files to disk, ever. There's so much RAM you can have nightmares about getting lost in it. On the most fundamental level, why isn't it possible to create and execute a script without saving to a file first? Why does every tweak to a learning-the-language test program result in a two megabyte executable that shortly gets overwritten?

Everything is so complex that you need to isolate yourself from as many libraries and APIs as possible. With thousands of pages of documentation for any system, it's all too easy to become entangled in endless specific details. Build applications to be self-contained and have well-defined paths for interfacing with the operating system, even if those paths involve communicating with an external, system-specific server program of sorts.

C still doesn't have a module system? Seriously? And people are still using it, despite all the alternatives?

(If you liked this, you might enjoy Remembering a Revolution That Never Happened.)

The Software Developer's Sketchbook

It takes ten-thousand hours to master your field, so the story goes, except that programming is too broad of a field.

Solving arbitrary problems with code isn't what ultimately matters, but problems within a specific domain. Maybe it's strategy games. Maybe vector drawing tools. Music composition software. Satellite control systems. Viewed this way, who is truly an expert? Once I was asked how many high-end 3D games I work on a year, and I wrote "3" on the whiteboard, paused a moment to listen to the "that's fewer than I expected" comment, then finished writing: "1 / 3". How many satellite control systems do you work on in a decade? One? Maybe two?

Compare this to carpenters or painters or cartoonists who can look back on a huge body of work in a relatively short time, assuming they're dedicated. Someone who does roof work on fifty houses a year looks a lot more the expert than someone who needs two years to ship a single software project. Have you mastered building user interfaces for paint programs when you've only created one, then maintained it for five years, and you've never tried alternate approaches?

To become the expert, you need more projects. They can be smaller, experimental, and even be isolated parts of a non-existent larger app. You can start in the middle without building frameworks and modules first. This is the time to try different languages, so you can see if there's any benefit to Go or Clojure or Rust before committing to them. Or to attempt a Chuck Moore-esque exercise in extreme minimalism: is it possible to get the core of your idea working in under a hundred lines of code? But mostly it's to burn through a lot of possibilities.

This sketchbook of implemented ideas isn't a paper book, but a collection of small programs. It could be as simple as a folder full of Python scripts or Erlang modules. It's not about being right or wrong; many ideas won't work out, and you'll learn from them. It's about exploring your interests on a smaller scale. It's about playing with code. It's about having fun. And you might just become an expert in the process.

(If you liked this, you might enjoy The Silent Majority of Experts.)

Retiring Python as a Teaching Language

For the last ten years, my standard advice to someone looking for a programming language to teach beginners has been to start with Python. And now I'm changing that recommendation.

Python is still a fine language. It lets you focus on problem solving and not the architectural stuff that experienced developers, who've forgotten what it's like to be an absolute beginner, think is important. The language itself melts into the background, so lessons aren't explanations of features and philosophies, but are about generating musical scales in any key, computing distances around a running track based on the lane you're in, or writing an automated player for poker or Yahtzee.
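
To make that concrete, here's a sketch of one such beginner exercise: generating a major scale in any key. The note table and whole/half-step pattern are standard music theory; the function name and structure are just for illustration, not from any particular lesson plan.

```python
# Generate the major scale starting from any root note.
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # whole and half steps of a major scale

def major_scale(root):
    i = NOTES.index(root)
    scale = [root]
    for step in MAJOR_STEPS:
        i = (i + step) % len(NOTES)  # wrap around past B back to C
        scale.append(NOTES[i])
    return scale

print(major_scale("G"))  # G major: G A B C D E F# G
```

The appeal as a teaching problem is that the whole thing is a loop, a list, and a little modular arithmetic, with an immediately checkable result.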

Then one day a student will innocently ask "Instead of running the poker simulator from the command line, how can I put it in a window with a button to deal the next hand?"

This is a tough question in a difficult-to-explain way. It leads to looking at the various GUI toolkits for Python. Turns out that Guido does the same thing every few years, re-evaluating if TkInter is the right choice for IDLE, the supplied IDE. For now, TkInter it is.

A week later, another question: "How can I write a simple game, one with graphics?"

Again, time to do some exploration into what's out there. Pyglet looks promising, but it hasn't been updated since July 2012. There are some focused libraries that don't try to do everything, like SplatGL, but it's pretty new and there aren't many examples. PyGame appears popular, and there's even a book, so okay let's start teaching how to use PyGame.

A month later, more questions: "How can I give this game I made to my friend? Even better, is there a way I can put this on my phone so I can show it to kids at school without them having to install it?"

Um.

All of these questions have put me off of Python as a teaching language. While there's rigor in learning how to code in an old-school way--files of algorithmic scripts that generate monochromatic textual output in a terminal window--you have to recognize the isolation that comes with it and how far away this is from what people want to make. Yes, you can find add-on packages for just about anything, but which ones have been through the sweat and swearing of serious projects, and which are well-intentioned today but unsupported tomorrow?

The rise of non-desktop platforms complicates matters, and I can sympathize. My goal in learning Erlang was to get away from C and C++ and shift my thinking to a higher level. I proved that I could use Erlang and a purely functional style to work in the domain that everyone is most scared of: games. Then the iPhone came out and that was that. Erlang wasn't an option.

It's with all of this in mind that my recommended language for teaching beginners is now Javascript. I know, I know, it's quirky and sometimes outright weird, but overall it's decent and modern enough. More importantly it's sitting on top of an unprecedentedly ubiquitous cross-platform toolkit for layout, typography, and rendering. Want to display UI elements, images, or text? Use HTML directly. Want to do graphics or animation? Use canvas.

I expect some horrified reactions to this change of thinking, at least to the slight degree that one can apply horrified to a choice of programming language. Those reactions should have nothing to do with the shortcomings of Javascript. They should be because I dismissed so many other languages without considering their features, type systems, or syntaxes, simply because they aren't natively supported by modern web browsers.

Life is More Than a Series of Cache Misses

I don't know what to make of the continual stream of people in 2015 with fixations on low-level performance and control. I mean the people who deride the cache-obliviousness of linked lists, write off languages that aren't near the top of the benchmark table, and rant about the hopelessness of garbage collection. They're right in some ways. And they're wrong at the same time.

Yes, you can do a detailed analysis of linked list traversal and realize "Hey! Looping over an array is much faster!" It is not news to anyone that different languages have different performance characteristics. Garbage collection is a little trickier, because unfortunately there are still issues depending on the situation, and not all that rarely either.
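
The linked list analysis is easy enough to sketch. Here's a toy version in Python, assuming nothing beyond the standard library: the same sum computed by chasing pointers through a linked list versus iterating a flat list. Both produce identical answers; the entire argument is about memory access patterns (and, in CPython, per-object overhead), not about correctness.

```python
# Sum the same values two ways: pointer-chasing vs. a flat list.
class Node:
    __slots__ = ("value", "next")
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def sum_linked(head):
    total = 0
    while head is not None:   # one pointer dereference per element
        total += head.value
        head = head.next
    return total

values = list(range(1000))
head = None
for v in reversed(values):    # prepend in reverse so the list reads in order
    head = Node(v, head)

assert sum_linked(head) == sum(values)  # same answer either way
```

Benchmark it and the flat list wins, which is the whole cache-locality argument in miniature.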

I could take a little Erlang or Scheme program and put on a show, publicly tearing it to pieces, analyzing the inefficiencies from dynamic typing and immutability and virtual machines. There would be foot stomping and cheering and everyone would leave convinced that we've been fooling ourselves and that the only way to write code is to frame problems in terms of cache architectures.

And then I'd reveal that the massively inefficient Erlang program takes only a couple of milliseconds to run.

Back in the 1990s I decided to modernize my skills, because my experience was heavily skewed toward low-level C and assembly work. I went through tutorials for a number of modern languages before settling on Erlang. While learning, I wrote programs for problems I did in college. Things like tree manipulation, binary search--the classics. And while I remember these being messy in C and Pascal, writing them in Erlang was fun. I'm not giving up that fun if I can help it. Fun is more productive. Fun leads to a better understanding of the problem domain. And that leads to fast code, even if it might be orders of magnitude away from optimal when viewed through a microscope.

There is an exception to all of this. Imagine you're an expert in building a very specific type of application. You've shipped five of them so you've got a map of all the sinkholes and poorly lit places. There's a chance, maybe, depending on your background, that your knowledge transcends the capabilities provided by higher level programming languages, and you can easily crystallize a simple, static architecture in C.

But until I'm that expert, I'll take the fun.

Are You Sure?

It's an old, familiar prompt. You delete a file, discard a work in progress, or hit Cancel mid-install:

Are you sure?

It's not always so quaintly phrased these days, but the same cautious attitude lives on in the modern confirmation box. In iOS 8, tapping the trash can while viewing a photo brings up a "Delete Photo?" button. Even so, that only moves it to the Recently Deleted album. Permanently removing it requires another delete with confirmation.

It may seem that the motivation behind "Are you sure?" is to prevent rash decisions and changes of heart. The official White House photographer isn't allowed to delete any shots, so that solves that problem. But for everyone else the little prompt quickly becomes part of a two-button sequence that finds its way into your muscle memory.

More commonly this second layer of confirmation averts legitimate mistakes. If I'm in a UNIX shell wanting to delete a file and it turns out to be write protected, then I thank whoever decided that a little "C'mon, really?" check was a good idea. Or I might unintentionally delete a video when making a clumsy attempt to grab my falling phone, were it not for those three familiar words, waiting, visible through the cracked screen.

But now there are better options, especially given the prevalence of touchscreens. The ideal is something easy to remember, easy to do, but that's naturally outside the realm of normal input. Here are a few.

Imagine tapping an image thumbnail four times. The first selects it. The subsequent taps expand the image, as if it's being inflated, until the fourth pops it and deletes it.

If that's too much fun, and you find your nephew has popped your entire photo library, touch each quadrant of an image in sequence. It doesn't matter which you start with, as long as you get all four. As you tap, that quadrant disappears, then with the fourth it's gone.

Long-holds are little used on touchscreens, so there's another possibility. Don't display a quit button in a game; hold your finger in the same place for three seconds. After a second, a circle starts shrinking toward your fingertip, to give feedback. Don't want to quit? Lift your finger.

These are only examples, and I know there are other approaches. There are some basic usability issues as well, such as how does an uninitiated person know about the four-quadrant tapping? But it's worth trying different ideas rather than, once again and without thought, following the "Are you sure?" model, the same one that prevented unintended MS-DOS disk formatting in the pre-Macintosh days.

(If you liked this, you might enjoy Virtual Joysticks and Other Comfortably Poor Solutions.)

The Wrong Kind of Paranoia

Have you ever considered how many programming language features exist only to prevent developers from doing something? And it's not only to keep you from doing something in other people's code. Often the person you're preventing from doing this thing is yourself.

For example, modules let you prevent people from calling functions that haven't been explicitly exported. In C there's static which hides a function from other separately compiled files.

const prevents modifying a variable. For pointers there's a second level of const-ness, making the pointed-to data read-only. C++ goes even further, as C++ tends to, allowing a class method to be marked const, meaning that it doesn't change any instance variables.

Many object-oriented languages let you group methods into private and public sections, so you can't access private methods externally. At least Java, C++, and Object Pascal add protected, which muddies the water. In C# you can seal classes so they can't be inherited. I'm trying real hard not to bring up friend classes, so I won't.

Here's the question: how much does all this pedantic hiding, annotating, and making sure you don't double-cross yourself by using a "for internal use only" method actually improve your software? I realize I'm treading in dangerous territory here, so take a few deep breaths first.

I like const, and I automatically precede local variables with it, but the compiler doesn't need me to do that. It can tell that a local integer is only assigned to once, and the generated code will be exactly the same. You could argue that the qualifier prevents accidental changes, but if I've ever had that happen in real code it's rare enough that I can't recall.

Internal class methods are similar. If they're not in the tutorial, examples, or reference, you don't even know they exist. If you use the header file for documentation, and internal methods are grouped together beneath the terse comment "internal methods," then why are you calling them? Even if they're secured with the private incantation, nothing is stopping you from editing the file, deleting that word, and going for it. And if this is your own code that you're doing this with, then this scenario is teetering on the brink of madness.

What all of these fine-grained controls have done is to put the focus on software engineering in the small. The satisfaction of building so many tiny, faux-secure fortresses by getting publics and protecteds in the right places and adding immutability keywords before every parameter and local variable. But you've still got a sea of modules and classes and is anything actually simpler or more reliable because some methods are behind the private firewall?

I'm going to give a couple of examples of building for isolation and reliability at the system level, but don't overgeneralize these.

Suppose you're building code to control an X-ray machine. You don't want the UI and all of that mixed together with the scary code that irradiates the patient. You want the control code on the device itself, and a small channel of communication for sending commands and getting back the results. The UI system only knows about that channel, and can't accidentally compromise the state of the hardware.

There's an architecture used in video games for a long time now where rendering and other engine-level functions are decoupled from the game logic, and the two communicate via a local socket. This is especially nice if the engine is in C++ and you're using a different language for the game proper. I've done this with Erlang, which worked out well, at least under OS X.
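
A minimal sketch of that decoupling, assuming a textual command protocol (the command format here is made up). For brevity the "engine" and "game logic" ends share one process via a socket pair; in the real architecture they'd be separate processes, possibly in different languages, talking over a local socket.

```python
import socket

# Two connected endpoints standing in for engine and game logic processes.
engine_end, logic_end = socket.socketpair()

# Game logic side: emit a drawing command over the channel. It knows
# nothing about the renderer except this protocol.
logic_end.sendall(b"draw sprite 12 at 100,200\n")

# Engine side: receive and act on the command, with no shared state
# that the logic side could accidentally compromise.
command = engine_end.recv(1024).decode().strip()
print(command)
```

The narrow channel is the point: the only way the two halves can affect each other is through messages you can log, replay, and test.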

Both of these have a boldness to them, where an entire part of the system is isolated from the rest, and the resulting design is easier to understand and simpler overall. That's more important than trying to protect each tiny piece from yourself.

Reconsidering Functional Programming

Key bits and pieces from the functional programming world have, perhaps surprisingly, been assimilated into the everyday whole of development: single-assignment "variables," closures, maps and folds, immutable strings, the cavalier creation and immediate disregarding of complex structures.

The next step, and you really need this one if you want to stick to the gospel of referential transparency that dominated the early push toward FP, is immutable everything. You can create objects and data structures, but you can't modify them--ever. And this is a hard problem, or at least one that requires thinking about programs in a different way. Working that out is what drove many of my early blog entries, like Functional Programming Doesn't Work (and what to do about it) from 2009.

Across the board immutability may seem a ridiculous restriction, but on-the-fly modifying of data is a dangerous thing. If you have a time-sliced simulation or a game, and you change the internals of an object mid-frame, then you have to ask "At what point in the frame was this change made?" Some parts of the game may have looked at the original version, while others looked at it after the change. Now magnify the potential for crossed-wires by all of the destructive rewriting going on during a typical frame, and it's a difficult problem. Wouldn't it be better to have the core data be invariant during the frame, so you know that no matter when you look at it, it's the same?

I wrote a number of smaller games in Erlang, plus one big one, exploring this, and I finally had a realization: keeping the whole of a game or simulation frame purely functional is entirely unrelated to the functional programming in the small that gets so much attention. In fact, it has nothing to do with functional programming languages at all. That means it isn't about maps or folds or lambdas or even avoiding destructive-operation languages like C or C++.

The overall approach to non-destructive simulations is to keep track of changes from one frame to the next, without making any changes mid-frame. At the very end of the frame when the deltas from the current to the next have been collected, then and only then apply those changes to the core state. You can make this work in, say, purely functional Erlang, but it's tedious, and a bit of a house of cards, with changes threaded throughout the code and continually passed back up the call-chain.

Here's an alternate way of thinking about this. Instead of modifying or returning data, print the modification you want to make. I mean really print it with printf or its equivalent. If you move a sprite in the X direction, print this:

sprite 156: inc x,1.5

Of course you likely wouldn't use text, but it's easier to visualize than a binary format or a list of values appended to an internal buffer. How is this different than passing changes back up the line? It's direct, and there's now one place for collecting all changes to the frame. Run the frame logic, look at the list of changes, apply those changes back to the game, repeat. Never change the core data mid-frame, ever.
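
Here's a sketch of that frame loop in Python, with a list of tuples standing in for the "printout" and a toy one-sprite state (the names are invented for illustration). During the frame, logic only reads state and appends change records; the one and only place that rewrites state is the apply step at the frame boundary.

```python
# Core state: invariant for the duration of a frame.
sprites = {156: {"x": 10.0, "y": 5.0}}

def frame_logic(state, changes):
    # Read freely from the frozen state; emit changes instead of mutating.
    # This is the "print the modification" idea with a list as the printout.
    changes.append((156, "x", 1.5))   # sprite 156: inc x,1.5

def apply_changes(state, changes):
    # The only code in the program allowed to rewrite the core data.
    for sprite_id, field, delta in changes:
        state[sprite_id][field] += delta
    changes.clear()

changes = []
frame_logic(sprites, changes)
assert sprites[156]["x"] == 10.0      # untouched mid-frame
apply_changes(sprites, changes)
assert sprites[156]["x"] == 11.5      # applied at the frame boundary
```

Note that nothing here is functional in the language sense; the discipline is entirely in where mutation is allowed to happen.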

As with most realizations, I can see some people calling this out as obvious, but it's something I never considered until I stopped thinking about purely functional languages as a requirement for writing purely functional software. And in most other languages it's just so easy to overwrite a field in a data structure; there's a built-in operator and everything. Not going for that as the first solution for everything is the hard part.

(If you liked this you might enjoy Turning Your Code Inside-Out.)

Why Doesn't Creativity Matter in Tech Recruiting?

A lot of buzz last week over the author of the excellent Homebrew package manager being asked to invert a binary tree in a Google interview. I've said it before, that organizational skills beat algorithmic wizardry in most cases, probably even at Google. But maybe I'm wrong here. Maybe these jobs really are hardcore and no day goes by without implementing tricky graph searches and finding eigenvectors, and that scares me.

A recruiter from Google called me up a few years ago, and while I wasn't interested at the time, it made me wonder: could I make it through that kind of technical interview, where I'm building heaps and balancing trees on a whiteboard? And I think I could, with one caveat. I'd spend a month or two immersing myself in the technical, in the algorithms, in the memorization, and in the process push aside my creative and experimental tendencies.

I hope that doesn't sound pretentious, because it's a process I've experienced repeatedly. If I focus on the programming and tech, then that snowballs into more interest in technical topics, and then I'm reading programming forums and formulating tech-centric opinions. If I get too much into the creative side and don't program, then everything about coding seems much harder, and I talk myself out of projects. It's difficult to stay in the middle; I usually swing back and forth.

Is that the intent of the hardcore interview process? To find people who are pure programming athletes, who amaze passersby with non-recursive quicksorts written on a subway platform whiteboard, and aren't distracted by non-coding thoughts? It's kinda cool in a way--that level of training is impressive--but I'm unconvinced that such a technically homogeneous team is the way to go.

I've always found myself impressed by a blend between technical ability and creativity. The person who came up with and implemented a clever game design. The person doing eye-opening visualizations with D3.js or Processing. The artist using a custom-made Python tool for animation. It's not creating code so much as coding as a path to creation.

So were I running my ideal interview, I'd want to hear about side projects that aren't pure code. Have you written a tutorial with an unusual approach for an existing project? Is there a pet feature that you dissect and compare in apps you come across (e.g., color pickers)? And yes, Homebrew has a number of interesting usability decisions that are worth asking about.

(If you liked this, you might enjoy Get Good at Idea Generation.)

If You Haven't Done It Before, All Bets Are Off

I've been on one side or the other of most approaches to managing software development: from hyper-detailed use of specs and flowcharts to variants of agile to not having any kind of planning or scheduling at all. And I've distilled all of that down into one simple rule: If you haven't done it before--if you haven't built something close to this before--then all bets are off.

It's one of the fundamental principles of programming, that it's extremely difficult to gauge how much work is hidden behind the statement of a task, even to where the trivial and impossible look the same when silhouetted in the morning haze. Yet even the best intentioned software development methodologies still ride atop this disorientation. That little, easy feature hiding in the schedule, the one that gets passed over in discussions because everyone knows it's little and easy, turns out to be poorly understood and cascades into another six months for the project.

This doesn't mean you shouldn't keep track of what work you think you have left, or that you shouldn't break down vague tasks into concrete ones, or that you shouldn't be making drastic simplifications to what you're making (if nothing else, do this last one).

What it does mean is that there's value in having built the same sort of thing a couple of times.

If you've previously created a messaging service and you want to build a new messaging service, then you have infinitely more valuable insight than someone who has only worked on satellite power management systems and decides to get into messaging. You know some of the dead ends. You know some of the design decisions to be made. But even if it happens that you've never done any of this before, then nothing is stopping you from diving in and finding your way, and in the end you might even be tremendously successful.

Except when it comes to figuring out how much work it's going to take. In that case, without having done it before, all bets are off.

(If you liked this, you might enjoy Simplicity is Wonderful, But Not a Requirement.)

Computer Science Courses that Don't Exist, But Should

CSCI 2100: Unlearning Object-Oriented Programming
Discover how to create and use variables that aren't inside of an object hierarchy. Learn about "functions," which are like methods but more generally useful. Prerequisite: Any course that used the term "abstract base class."

CSCI 3300: Classical Software Studies
Discuss and dissect historically significant products, including VisiCalc, AppleWorks, Robot Odyssey, Zork, and MacPaint. Emphases are on user interface and creativity fostered by hardware limitations.

CSCI 4020: Writing Fast Code in Slow Languages
Analyze performance at a high level, writing interpreted Python that matches or beats typical C++ code while being less fragile and more fun to work with.

CSCI 2170: User Experience of Command Line Tools
An introduction to UX principles as applied to command line programs designed as class projects. Core focus is on output relevance, readability, and minimization. The UNIX "ls" tool serves as a case study in excessive command line switches.

PSYC 4410: Obsessions of the Programmer Mind
Identify and understand tangential topics that software developers frequently fixate on: code formatting, taxonomy, type systems, splitting projects into too many files. Includes detailed study of knee-jerk criticism when exposed to unfamiliar systems.

The Right Thing?

A Perl program I was working on last year had fifteen lines of code for loading and saving files wholesale (as opposed to going line by line). It could have been shorter, but I was using some system routines that were supposedly the fastest option for block reads and writes.

The advice I had been seeing in forums for years was that I shouldn't be doing any of this, but instead use the nifty File::Slurp module. It seemed silly to replace fifteen lines of reliable code with a module, but eventually I thought I'd do the right thing and switch.

I never should have looked, but File::Slurp turned out to be 800+ lines of code, including comments, and not counting the documentation block at the bottom. One of those comments stood out, as is typical when prefaced with "DEEP DARK MAGIC":

DEEP DARK MAGIC. this checks the UNTAINT IO flag of a glob/handle. only the DATA handle is untainted (since it is from trusted data in the source file). this allows us to test if this is the DATA handle and then to do a sysseek to make sure it gets slurped correctly. on some systems, the buffered i/o pointer is not left at the same place as the fd pointer. this sysseek makes them the same so slurping with sysread will work.

Still, I kept using it--the right thing and all--until one day I read about a Unicode security flaw with File::Slurp. Now I was using an 800 line module containing deep dark magic and security issues. Oh, and also no one was maintaining it. This was no longer the recommended solution, and there were people actively pointing out why it should be avoided.

I dug up my original fifteen lines, took out the optimizations, and now I'm back to having no module dependencies. Also the code is faster, likely because it doesn't have to load another module at runtime. As a footnote, the new right thing is the Path::Tiny module, which in addition to providing a host of operations on file paths, also includes ways to read and write entire files at once. For the moment, anyway.
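
For comparison, the same "slurp a whole file" operation in Python's standard library is a built-in pair of methods on pathlib paths, with no third-party module at all. (The file name here is hypothetical.)

```python
from pathlib import Path

p = Path("example.txt")            # hypothetical file for the demo
p.write_text("line one\nline two\n")
contents = p.read_text()           # the whole file at once
print(len(contents.splitlines()))  # prints 2
```

Which is roughly what Path::Tiny offers Perl, except baked into the language's standard distribution, so the "which module is the right thing this year?" question never comes up.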

(If you liked this, you might enjoy Tricky When You Least Expect It.)

What Can You Put in a Refrigerator?

This may sound ridiculous, but I'm serious. The goal is to write a spec for what's allowed to be put into a refrigerator. I intentionally picked something that everyone has lots of experience with. Here's a first attempt:

Anything that (1) fits into a refrigerator and (2) is edible.

#1 is hard to argue with, and the broad stroke of #2 is sensible. Motorcycles and bags of cement are off the list. Hmmm...what about liquids? Can I pour a gallon of orange juice into the refrigerator? All right, time for version 2.0:

Anything that's edible and fits into a refrigerator. Liquids must be in containers.

Hey, what about salt? It fits, is edible, and isn't a liquid, so you're free to pour a container of salt into this fridge. You could say that salt is more of a seasoning than a food, in an attempt to disallow it, but I'll counter with uncooked rice. This could start a long discussion about what kinds of food actually need refrigeration--uncooked rice doesn't, but cooked rice does. Could we save energy in the long haul by blocking things that don't need to be kept cool? That word need complicates things, so let's drop this line of thinking for now.

Anything that's edible and fits into a refrigerator. Items normally stored in containers must be in containers.

How about a penguin? Probably need some kind of clause restricting living creatures. Maybe the edibility requirement covers this, except leopard seals and sea lions eat penguins. No living things across the board is the safest way to plug this hole. Wait, do the bacteria in yogurt count as living? This entire edibility issue is troublesome. What about medicine that needs to be kept cool?

Oh no, we've only been thinking about residential uses! A laboratory refrigerator changes everything. Now we've got to consider organs and cultures and chemicals and is it okay to keep iced coffee in there with them. It also never occurred to me until right now that we can't even talk about any of this until we define exactly what the allowed temperature range of a refrigeration appliance is.

In the interest of time, I'll offer this for-experts-only spec for "What can you put in a refrigerator?":

Anything that fits into a refrigerator.

Alternate Retrocomputing Histories

There's a computer science course that goes like this: First you build an emulator for a fictional CPU. Then you write an assembler for it. Then you close the loop by defining an executable format that the emulator can load and the assembler can generate, and you have a complete, if entirely virtual, development system.
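
The emulator half of that project can be startlingly small. Here's a sketch of a fictional CPU at its most minimal: a few opcodes, an accumulator, and a loop that decodes and dispatches. The opcode names and tuple encoding are invented for illustration; a course version would add registers, memory, and a binary instruction format.

```python
def run(program):
    # program: list of (opcode, argument) pairs for an accumulator machine.
    acc, pc = 0, 0
    while pc < len(program):
        op, arg = program[pc]
        if op == "LOAD":
            acc = arg
        elif op == "ADD":
            acc += arg
        elif op == "JNZ":          # jump to arg if accumulator is non-zero
            if acc != 0:
                pc = arg
                continue
        elif op == "HALT":
            break
        pc += 1
    return acc

# Count down from 3 to 0: ADD -1, loop back via JNZ until acc hits zero.
program = [("LOAD", 3), ("ADD", -1), ("JNZ", 1), ("HALT", 0)]
print(run(program))  # prints 0
```

An assembler for this machine is little more than a text-to-tuples translator, which is why the full emulator/assembler/loader loop fits in a semester.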

Of course this project is intended as an educational tool, to gain exposure to hardware and operating systems concepts. When I took that course, the little homemade CPU felt especially hopeless, making an expensive minicomputer slower than an Apple II. Use it to develop games? Not a chance.

And now, there's MAME.

The significance of this deserves some thought. All those processors that were once so fast, from the 6809 to the 68000 to the lesser known TMS34010 CPU/GPU combo that powers Mortal Kombat and NBA Jam, being completely duplicated by mere programs. This pretend hardware can, in real-time, reanimate applications always cited as requiring the ultimate performance: high frame rate games. When you look at the results under MAME, the screen full of enticing pixels, that instructions are being decoded and dispatched by a layer of C code isn't something that makes its way through the system and into your mind.

Maybe that virtual CPU from that college class isn't so crazy any more?

Now, sure, you could design your own processor and emulate it on a modern desktop or phone. You could even ship commercial software with it. This little foray into alternate retrocomputing histories will result in systems that are orders of magnitude simpler than what we've currently got. Your hundred virtual opcodes are a footnote to the epic volumes of Intel's x86 instruction set manuals. No matter what object code file structure you come up with, it pales in comparison to the Portable Executable Format that's best explained by large posters.

I wouldn't do that. It's still assembly language, and I don't want to go back down that road.

The most fascinating part of this thought experiment is that it's possible at all. You can set aside decades of cruft, start anew in a straightforward way, and the result is immediately usable. There's not much personal appeal to a Z80 emulator, but many applications I've written have small, custom-built interpreters in them, and maybe I didn't take them far enough. Is all the complaining about C++ misguided, in that the entire reason for the existence of C++ is so you can write systems that prevent having to use that language?

(If you liked this, you might enjoy Success Beyond the Barrier of Full Understanding.)

The Same User Interface Mistakes Over and Over

It has been 42 years since the not-very-wide release of the Xerox Alto and almost 32 since the mainstream Macintosh. You might expect we've moved beyond the era of egregious newbie mistakes when building graphical UIs, but clearly we have not. Drop-down lists containing hundreds of elements are not rare sights. Neither are modal preference dialogs, meaningless alerts where the information is not actionable, checkboxes that allow mutually exclusive options to be selected at the same time, and icons that don't clearly represent anything. I could go on, but we've all experienced this firsthand.

Wait, I need to call out one of the biggest offenses: applications stealing the user's focus--jumping into the foreground--so that clicks intended for the previously front-most app are now applied to the other, possibly with drastic results.

That there are endless examples of bad UIs to cite and laugh at and ignore is not news. The real question is why, after all this time, do developers still make these mistakes? There are plenty of UI experts teaching simplicity and railing against poor design. Human-computer interaction and UX are recognized fields. So what happened?

We've gotten used to it. Look at the preferences panel in most applications, and there are guaranteed to be settings that you can't preview, but instead have to select, apply, close the window, and then can't be undone if you don't like them. You have to manually re-establish the previous settings. This is so common that it wouldn't even be mentioned in a review.

(At one time the concern was raised that the ubiquitous "About..." menu option was mislabeled, because it didn't give information about what a program was or how it worked, but instead displayed a version number and copyright information. It's a valid point, but it doesn't get a second thought now. We accept the GUI definition of "About.")

There's no standard resource. How do you know that certain uses of checkboxes or radio buttons are bad? From experience using apps, mostly, and some designers may never notice. If you're starting out building an interface, there's no must-have, coffee-stained reference--or a web equivalent--that should be sitting on your desk. Apple and others have their own guidelines, but these are huge and full of platform-specific details; the fundamentals are easy to overlook.

There aren't always perfect alternatives. There's so much wrong with the focus-stealing, jump-to-the-front application, but what's the solution? Standard practice is a notification system which flashes or otherwise vies for attention, then you choose when you want to interact with the beckoning program. What this notification system is depends on the platform. There isn't a definitive approach for getting the user's attention. It's also not clear that the model of background apps requesting the user's attention works. How many iPhones have you seen with a red circle containing "23" on the home screen, indicating that 23 apps need updating...and it's been there for months?

Implementing non-trivial GUIs is still messy. Windows, OS X, and iOS are more or less the same when it comes to building interfaces. Use a tool to lay out a control-filled window, setting properties and defaults. Write classes which are hooked to events fired by controls. There's more architecture here than there should be, with half of the design in code, half in a tool, and trying to force-fit everything into an OOP model. It's also easy to build interfaces that are too static. REBOL and Tk showed how much nicer this could be, but they never became significant. It's better in HTML, where layout and code are blurred, but this doesn't help native apps.

(If you liked this, then you might enjoy If You're Not Gonna Use It, Why Are You Building It?)

What's Your Secondary Language?

Most of the "Which programming language is best?" discussion is irrelevant. If you work at a company which uses C++ or Python or Java, then you use that; there's no argument to be had. In other cases your options are limited by what's available and well-supported. If you want to write an iPhone game, Common Lisp is not on the menu of reasonable options. You could figure out a way to make it work, but you're fighting the system and C or Swift would almost certainly be less stressful in the end.

At some point in the development of your mandated-to-use-Java project, you'll need to do some quick calculations on the side, ones that won't involve Java. I never use a faux plastic-button GUI calculator for that; I bring up an interpreter with an interactive command prompt. Going beyond math, algorithms are easier to prototype in a language that isn't batch-compiled and architecture-oriented. When I was working on a PlayStation 2 launch title, I had never implemented a texture cache before, so I experimented with some possibilities in Erlang.

When there's debate in a project meeting about some topic, and everything being said is an unproven opinion, the person in the back who quietly starts building a small prototype to provide concrete data is the only person I trust.

The important question is not "Which programming language is best?" but "What's your secondary language?" The language you reach for to solve problems, to prove that ideas work before implementing them for real, and to do interesting things with files and data.

The criteria for judging a development language and secondary language are completely different. The latter is all about expressiveness, breadth of readily available capabilities, and absence of roadblocks. Languages without interactive command lines are non-starters. Ditto for languages that are geared toward building infrastructure. You want floating point vectors now, and not ways to build overloaded floating point vector classes with private methods.

My secondary language has jumped around quite a bit and at the moment I have two. It was J for a while, because J is an amazing calculator with hooks to a variety of visualizations. At some point J shifted to being more cross-platform and lost much of its usefulness (this may or may not still be true). Erlang is my go-to for algorithms and math and small tools, but it's not something I'd use to build a GUI. Recently I've used JavaScript for anything interactive or visual. I know, I know, those scope rules! But I can sit down and make stuff to show people fairly quickly, and that outweighs quibbles I have with the language itself.

Messy Structs/Classes in a Functional Style

There are two major difficulties that get ignored in introductions to functional programming. One is how to build interactive programs like games (see Purely Functional Retrogames). The other is how to deal with arbitrary conglomerations of data, such as C structs or C++ classes.

Yes, yes, you can create composite data types in functional languages, no problem. What happens in C, though, is that it's easy to define a struct, then keep putting things in there as needed. One day you realize that this structure contains dozens of flags and counters and other necessary things, which sounds bad--and technically is bad--except that it sure was nice to just add them and not worry about it. You can do the same thing in a functional language, but it's a poor match for immutability. "Change" one of those hundred fields and they all get copied. When interactively testing it's hard to see what's different. There are just these big 50-field data types that get dumped out.

I have a couple of guidelines for dealing with messy struct/class-like data in Erlang, and I expect they will apply to other languages. I've never seen these mentioned anywhere, so I want to pass them along. Again, the key word is messy. I'm not talking about how to represent an RGB tuple or other trivial cases. Set perfection aside for the moment and pretend you're working from a C++ game where the "entity" type is filled with all kinds of data.

The first step is to separate the frequently changed fields from the rest. In a game, the position of an entity is something that's different every frame, but other per-entity data, like the name of the current animation, changes only occasionally. One is in the 16-33 millisecond time frame, the other in seconds or tens of seconds. Using Erlang notation, it would be something like this:

{Position, Everything_Else}

The technical benefit is that in the majority of frames only Position and the outer tuple are created, instead of copying the potentially dozens of fields that make up Everything_Else. This factoring based on frequency of change provides additional information for thinking about the problem at hand. Everything_Else can be a slower-to-rebuild data structure, for example.
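The same factoring can be sketched outside Erlang. Here's a hypothetical Python version (the entity fields and function names are invented for illustration, not taken from the original game): the per-frame update rebuilds only the small position and outer tuple, while the bulky remainder is shared, not copied.

```python
# Hypothetical entity: the frequently changed position is kept separate
# from the rarely changed bulk of the data.
def make_entity(position, everything_else):
    return (position, everything_else)

def move(entity, dx, dy):
    # Each frame, only a new position and a new outer tuple are built;
    # the (possibly large) everything_else is shared unchanged.
    (x, y), rest = entity
    return ((x + dx, y + dy), rest)

e1 = make_entity((0, 0), {"animation": "idle", "hit_points": 100})
e2 = move(e1, 1, 2)
```

Note that `e2` holds the very same `everything_else` object as `e1`; immutability makes that sharing safe.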

The other rule of thumb I've found helpful is to determine which fields are only used in certain cases. That is, which are optional most of the time. In this oversized entity, there might be data that only applies if the character is swimming. If the character is on-foot most of the time, don't add the water-specific data to the core structure. Now we've got something like:

{Position, Most_Everything_Else, Optional_Stuff}

In my code, the optional stuff is an Erlang property list, and values come and go as needed (were I to do it today I might use a map instead). In a real game, I found that almost everything was optional, so I ended up with simply:

{Position, Optional_Stuff}
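In Python terms, the property list maps naturally onto a dict whose keys come and go as needed. This is a sketch of the idea with invented field names, not the original code:

```python
def start_swimming(entity):
    # add water-specific data only while it applies
    position, optional = entity
    return (position, dict(optional, swim_depth=0.0))

def stop_swimming(entity):
    # remove the water-specific data when back on foot
    position, optional = entity
    return (position, {k: v for k, v in optional.items() if k != "swim_depth"})

e = ((10, 20), {})
e = start_swimming(e)   # optional dict now has swim_depth
e = stop_swimming(e)    # and now it's gone again
```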

(If you liked this, you might enjoy A Worst Case for Functional Programming?)

On the Madness of Optimizing Compilers

There's the misconception that the purpose of a compiler is to generate the fastest possible code. Really, it's to generate working code--going from a source language to something that actually runs and gives results. That's not a trivial task, mapping any JavaScript or Ruby or C++ program to machine code, and in a reliable manner.

That word any cannot be emphasized enough. If you take an existing program and disassemble the generated code, then it's easy to think "It could have been optimized like this and like that," but it's not a compiler designed for your program only. It has to work for all programs written by all these different people working on entirely different problems.

For the compiler author, the pressure to make the resultant programs run faster is easy to succumb to. There are moments, looking at the compiled output of test programs, where if only some assumptions could be made, then some of these instructions could be removed. Those assumptions, as assumptions tend to be, may look correct in a specific case, but don't generalize.

To give a concrete example, it may be obvious that an object could be allocated on the stack instead of the heap. To make that work in the general case, though, you need to verify that the pointer to the object isn't saved anywhere--like inside another object--so it outlives the data on the stack. You can trace through the current routine looking for pointer stores. You can trace down into local functions called from the current routine. There may be cases where the store happens in one branch of a conditional, but not the other. As soon as that pointer is passed into a function outside of the current module, then all bets are off. You can't tell what's happening, and have to assume the pointer is saved somewhere. If you get any of this wrong, even in an edge case, the user is presented with non-working code for a valid program, and the compiler writer has failed at his or her one task.

So it goes: there are continual, tantalizing cases for optimization (like the escape analysis example above), many reliant on a handful of hard to prove, or tempting to overlook, restrictions. And the only right thing to do is ignore most of them.

The straightforward "every program all the time" compiler is likely within 2-3x of the fully optimized version (for most things), and that's not a bad place to be. A few easy improvements close the gap. A few slightly tricky but still safe methods make up a little more. But the remainder, even if there's the potential for 50% faster performance, flat out isn't worth it. Anything that ventures into "well, maybe not 100% reliable..." territory is madness.

I've seen arguments that some people desperately need every last bit of performance, and even a few cycles inside a loop is the difference between a viable product and failure. Assuming that's true, then they should be crafting assembly code by hand, or they should be writing a custom code generator with domain-specific knowledge built-in. Trying to have a compiler that's stable and reliable and also meets the needs of these few people with extreme, possibly misguided, performance needs is a mistake.

(If you liked this, you might enjoy A Forgotten Principle of Compiler Design.)

Moving Beyond the OOP Obsession

Articles pointing out the foibles of an object-oriented programming style appear regularly, and I'm as guilty as anyone. But all this anecdotal evidence against OOP doesn't have much effect. It's still the standard taught in universities and the go-to technique for most problems.

The first major OO language for PCs was Borland Turbo Pascal 5.5, introduced in 1989. Oh, sure, there were a few C++ compilers before that, but Turbo Pascal was the language for MS-DOS in the 1980s, so it was the first exposure to OOP for many people. In Borland's magazine ads, inheritance was touted as the big feature, with an example of how different variants of a sports car could be derived from a base model. What the ads didn't mention at all was encapsulation or modularity, because Turbo Pascal programmers already knew how to do that in earlier, pre-object versions of the language.

In the years since then, the situation has reversed. Inheritance is now the iffiest part of the object-oriented canon, while modularity is everything. In OOP-first curricula, objects are taught as the method of achieving modularity, to the point where the two have become synonymous. It occurred to me that there are now a good many coders who've never stopped to think about how modularity and encapsulation could work in C++ without using classes, so that's what I want to do now.

Here's a classic C++ method call for an instance of a class called mixer:

m->set_volume(0.8);

m is a pointer to an instance of the mixer class. set_volume is the method being called. Now here's what this would look like in C++ without using objects:

mixer_set_volume(m, 0.8);

This is in a file called mixer.cpp, where all the functions have the mixer_ prefix and take a pointer (or reference) to a variable of the mixer type. Instead of new mixer you call mixer_new. It might take a moment to convince yourself, but barring some small details, these two examples are the same thing. You don't need OOP to do basic encapsulation.

(If you're curious, the pre-object Turbo Pascal version is almost the same:

mixer.set_volume(m, 0.8);

mixer is not an object, but the name of the module, and the dot means that the following identifier is inside that module.)
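The module-style pattern generalizes beyond C++ and Pascal. Here's a rough Python sketch of the same idea (the mixer state and function names are invented for illustration): all the operations live together, take the mixer state as the first argument, and no class is involved.

```python
# mixer "module": every function takes the mixer state first.
def mixer_new():
    return {"volume": 1.0, "muted": False}

def mixer_set_volume(m, v):
    m["volume"] = v

def mixer_get_volume(m):
    return 0.0 if m["muted"] else m["volume"]

m = mixer_new()
mixer_set_volume(m, 0.8)
```

The prefix plus explicit first parameter gives the same encapsulation, and every call site is findable with a plain text search.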

Now the C++ mixer_set_volume example above is slightly longer than the class version, which I expect will bother some people. The mild verbosity is not a bad thing, because you can do simple text searches to find everywhere that mixer_set_volume is used. There is no confusion from multiple classes having the same methods. But if you insist, this is easy to remedy by making it an overloaded function where the first parameter is always of type mixer. Now you can simply say:

set_volume(m, 0.8);

I expect there are some people waiting to tell me I'm oversimplifying things, and I know perfectly well that I'm avoiding virtual functions and abstract base classes. That's more or less my point: that while simple, this covers the majority of use cases for objects, so teach it first without having the speed bump of terminology that comes with introducing OOP proper.

To extend my example a bit more, what if "mixer" is something that there can only be one of, because it's closely tied to the audio hardware? Just remove the first parameter to all the function calls, and you end up with:

mixer_set_volume(0.8);

You can teach this without ever using the word "singleton."

(You might enjoy Part 1 and Part 2 of this unintentional trilogy.)

Death of a Language Dilettante

I used to try every language I came across. That includes the usual alternatives like Scheme, Haskell, Lua, Forth, OCaml, and Prolog; the more esoteric J, K, REBOL, Standard ML, and Factor; and some real obscurities: FL, Turing, Hope, Pure, Fifth. What I always hoped was that there was something better than what I was using. If it reduced the pain of programming at all, then that was a win.

Quests for better programming languages are nothing new. Around the same time I started tinkering with Erlang in the late 1990s, I ran across a site by Keith Waclena, who was having a self-described "programming language crisis." He assigned point values to a list of features and computed a score for each language he tried. Points were given for static typing, local function definition, "the ability to define new control structures" and others.

There's a certain set of languages often chosen by people who are outside of computer science circles: PHP, JavaScript, Flash's ActionScript, Ruby, and some more esoteric app-specific scripting languages like GameMaker's GML. If I can go further back, I'll also include line-numbered BASIC. These also happen to be some of the most criticized languages by people who have the time for that sort of thing. JavaScript for its weird scope rules (fixed in ES6, by the way) and the strange outcomes from comparing different types. Ruby for its loose typing and sigils. PHP for having dozens of reserved keywords. BASIC for its lack of structure.

This criticism is troubling, because there are clear reasons for choosing these languages. Want to write client-side web code? JavaScript. Using GameMaker? GML. Flash? ActionScript. Picked up an Atari 130XE from the thrift shop? BASIC. There's little thought process needed here. Each language is the obvious answer to a question. They're all based around getting real work done, yet there's consistent agreement that these are the wrong languages to be using.

If you veer off into discussions of programming language theory (PLT), it quickly becomes muddy why one language is better than another, but more importantly, as with Keith's crisis, the wrong criteria are being used. Even something as blatantly broken as the pre-ES6 scoping rules in JavaScript isn't the fundamental problem it's made out to be. It hasn't been stopping people from making great things with the language. Can PLT even be trusted as a field? And what criteria do you use for choosing a programming language?

Does this language run on the target system that I need it to? If the answer is no, end of discussion. Set aside your prejudices and move on.

Will I be swimming against the current, not being able to cut and paste from SDK documentation and get answers via Google searches, if I choose this language? You might be able to write a PlayStation 4 game in Haskell, but should you?

Are the compiler and other tools pleasant to use, quick, and reliable? Once I discovered that Modula-2 was cleaner than C and Pascal, I wanted to use it. Unfortunately, there were fewer choices for Modula-2 compilers, and none of them were as fast and frustration-free as Turbo Pascal.

Am I going to hit cases where I am at the mercy of the implementors, such as the performance of the garbage collector or compile times for large projects? You don't want to get in a situation where you need certain improvements to the system, but the maintainers don't see that as important, or even see it as against the spirit of the language. You're not going to run into that problem with the most heavily used toolsets.

Do I know that this is a language that will survive the research phase and still be around in ten years? Counterpoint: BitC.

Here's an experiment I'd like to see: give a language with a poor reputation (JavaScript, Perl) to someone who knows it passably well and--this is the key--has a strong work ethic. The kind of person who'd jump in and start writing a book rather than dreaming about being a famous novelist. Then let the language dilettante use whatever he or she wants, something with the best type system, hygienic macros, you name it. Give them both a real-world task to accomplish.

As someone who appreciates what modern languages have to offer, I really don't want this to be the case, but my money is on the first person by a wide margin.

Evolution of an Erlang Style

I first learned Erlang in 1999, and it's still my go-to language for personal projects and tools. The popular criticisms--semicolons, commas, and dynamic typing--have been irrelevant, but the techniques and features I use have changed over the years. Here's a look at how and why my Erlang programming style has evolved.

I came to Erlang after five years of low-level coding for video games, so I was concerned about the language being interpreted and the overhead of functional programming. One of the reasons I went with Erlang is that there's an easy correspondence between source code and the BEAM virtual machine. Even more than that, there's a subset of Erlang that results in optimal code. If a function makes only tail calls and calls to functions written in C, then parameters stay in fixed registers even between functions. What looks like a lot of parameter pushing and popping turns into destructive register updates. This is one of the first things I wrote about here, back in 2007.

It's curious in retrospect, writing in that sort of functional assembly language. I stopped thinking about it once BEAM performance, for real problems, turned out to be much better than I expected. That decision was cemented by several rounds of hardware upgrades.

The tail-recursive list building pattern, with an accumulator and a lists:reverse at the end, worked well with that primitive style, and it's a common functional idiom. Now I tend to use a more straightforward recursive call on the right-hand side of the list constructor. The whole "build it backward then reverse" idea feels clunky.
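The two styles look something like this in a rough Python transliteration (Erlang's cons and lists:reverse become list prepends and a final reverse here; the function names are mine):

```python
def doubles_acc(xs, acc=None):
    # accumulator style: build the result backward, reverse at the end
    if acc is None:
        acc = []
    if not xs:
        return list(reversed(acc))
    return doubles_acc(xs[1:], [xs[0] * 2] + acc)

def doubles_direct(xs):
    # direct style: recurse on the right-hand side of the constructor
    if not xs:
        return []
    return [xs[0] * 2] + doubles_direct(xs[1:])
```

Both compute the same list; the second reads top-to-bottom with no bookkeeping, at the cost of not being tail recursive.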

For a small project I tried composing programs from higher-level functions (map, filter, foldl, zip) as much as possible, but it ended up being more code and harder to follow than writing out the "loops" in straight Erlang. Some of that is awkward syntax (including remembering parameter order), but there are enough cases where foldl isn't exactly right--such as accumulating a list and counting something at the same time--that a raw Erlang function is easier.
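Here's the kind of case meant above, sketched in Python with invented example data: a fold works if you thread a compound accumulator through it, but a plain loop says the same thing with less ceremony.

```python
from functools import reduce

def evens_and_count_fold(xs):
    # foldl-style: keep the filtered items and a count in one compound state
    def step(state, x):
        kept, count = state
        return (kept + [x], count + 1) if x % 2 == 0 else (kept, count)
    return reduce(step, xs, ([], 0))

def evens_and_count_loop(xs):
    # the plain-loop version of the same computation
    kept, count = [], 0
    for x in xs:
        if x % 2 == 0:
            kept.append(x)
            count += 1
    return kept, count
```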

List comprehensions, though, I use all the time. Here the syntax makes all the difference, and there's no order of parameters to remember. I even do clearly inefficient things like:

lists:sum([X || {_,_,X} <- List]).

because it's simpler than foldl.

I use funs--lambdas--often, but not to pass to functions like map. They're to simplify code by reducing the number of parameters that need to be passed around. They're also handy for returning a more structured type, a sort of simple object, again to hide unnecessary details.

Early on I was also concerned about the cost of communicating with external programs. The obvious method was to use ports (essentially bidirectional pipes), but the benchmarks under late-1990s Windows were not good. Instead I used linked-in drivers, which were harder to get right and could easily crash the emulator. Now I don't even think about it: it's ports for everything. I rewrote a 2D action game for OS X with the graphics and user input in an external program and the main game logic in Erlang. The Erlang code spawns the game "driver," and they communicate via a binary protocol. Even at 60fps, performance is not an issue.
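The ports idea--an external program spoken to over a pipe using a simple binary protocol--can be sketched in any language. This Python version shows only the message framing, not the actual game code; it assumes the length-prefixed packet style that Erlang ports use with the {packet, 4} option:

```python
import struct

def frame(payload: bytes) -> bytes:
    # 4-byte big-endian length prefix, then the payload
    return struct.pack(">I", len(payload)) + payload

def unframe(data: bytes):
    # returns (payload, remaining bytes), or None if a full message
    # hasn't arrived yet
    if len(data) < 4:
        return None
    (n,) = struct.unpack(">I", data[:4])
    if len(data) < 4 + n:
        return None
    return data[4:4 + n], data[4 + n:]
```

Each side reads the length, then exactly that many bytes, so messages survive being split or coalesced by the pipe.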

Fun vs. Computer Science

I've spent most of my career working on games, either programming or designing them or both. Games are weird, because everything comes down to this nebulous thing called fun, and there's a complete disconnect between fun and most technical decisions:

Does choosing C++14 over C++11 mean the resulting game is more fun?

Does using a stricter type system mean the game is more fun?

Does using a more modern programming language mean the game is more fun?

Does favoring composition over inheritance mean the game is more fun?

Now you could claim that some of this tech would be more fun for the developer. That's a reasonable, maybe even important point, but there's still a hazy at best connection between this kind of "developer fun" and "player fun."

A better argument is that some technologies may result in the game being more stable and reliable. Those two terms should be a prerequisite to fun, and even though people struggle along--and have fun with--buggy games (e.g., Pokemon Go), I'm not going to argue against the importance of reliability. Think about all the glitchiness and clunkiness you experience every day, from spinning cursors, to Java tricking you into installing the Ask toolbar, to an app jumping into the foreground so you click on the wrong thing. Now re-watch The Martian and pretend all the computers in the movie work like your desktop PC. RIP Matt Damon.

The one thing that does directly make a game more fun is decreased iteration time. Interactive tweaking beats a batch compile and re-launch every time, and great ideas can come from on the fly experimentation. The productivity win, given the right tools, is 10x or more, and I can't emphasize this enough.

And yet this more rapid iteration, which is so important to me, does not seem to be of generally great importance. It's not something that comes up in computer sciencey discussions of development technologies. There's much focus on sophisticated, and slow, code optimization, but turnaround time is much more important in my work. A certain circle of programmers puts type systems above all else, yet in Bret Victor's inspirational Inventing on Principle talk from 2012, he never mentioned type systems, not once, but oh that interactivity.

I realize that we're heading toward the ultimate software engineer dream of making a type-checked change that's run through a proven-correct compiler that does machine-learning driven, whole program optimization...but it's going the exact opposite of the direction I want. It's not helping me in my quest for creating fun.

For the record, I just picked those buzzwords out of my mind. I'm not criticizing static type checking or any of those things, or even saying that they preclude interactive iteration (see Swift's playgrounds, for example). They might make things harder though, if they necessitate building a new executable of the entire game for every little change.

Interactivity, I may have to grudgingly accept, is not trendy in computer science circles.

(If you liked this, you might enjoy You Don't Read Code, You Explore It.)

Optimizing for Human Understanding

Long ago, I worked on a commercial game that loaded a lot of data from text files. Eventually some of these grew to over a megabyte. That doesn't sound like a lot now, but they were larger than the available buffer for decoding them, so I looked at reducing the size of the files.

The majority of the data was for placement of 3D objects. The position of each object was a three-element floating point vector delineated with square brackets like this:

[ 659.000000 -148.250000 894.100000 ]

An orientation was a 3x3 matrix, where each row was a vector:

[ [ 1.000000 0.000000 0.000000 ]
[ 0.000000 1.000000 0.000000 ]
[ 0.000000 0.000000 1.000000 ] ]

Now this format looks clunky here, but imagine a text file filled with hundreds of these. The six digits after the decimal point were to keep some level of precision, but in practice many values ended up being integers. Drop the decimal point and everything after it, and the orientation matrix becomes:

[ [ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ] ]

which is a big improvement. In the vector example, there's "-148.250000" which isn't integral, but those last four zeros don't buy anything. It can be reduced to "-148.25".

The orientation still isn't as simple as it could be. It's clearly an identity matrix, yet all nine values are still specified. I ended up using this notation:

[ I ]

I also found that many orientations were simply rotations around the up vector (as you would expect in a game with a mostly flat ground plane), so I could reduce these to a single value representing an angle, then convert it back to a matrix at load time:

[ -4.036 ]

I don't remember the exact numbers, but the savings were substantial, reducing the file size by close to half. At the time the memory mattered, but half a megabyte is trivial to find on any modern system. This also didn't result in simpler code, because the save functions were now doing more than just fprintf-ing values.

What ended up being the true win, and the reason I'd do this again, is because it makes the data easier to visually interpret. Identity matrices are easy to pick out, instead of missing that one of the other values is "0.010000" instead of "0.000000". Common rotations are clearly such, instead of having to mentally decode a matrix. And there's less noise in "0.25" than "0.250000" (and come to think of it, I could have simplified it to ".25"). It's optimized for humans.
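A formatter along those lines is easy to sketch. This is a hypothetical Python version of the idea, not the original C code: trim useless trailing zeros, and collapse an exact identity matrix to the special notation.

```python
def fmt_num(x: float) -> str:
    # print with six decimals, then drop trailing zeros
    # and a bare trailing decimal point
    return f"{x:.6f}".rstrip("0").rstrip(".")

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def fmt_matrix(m) -> str:
    # identity matrices get the compact human-readable form
    if m == IDENTITY:
        return "[ I ]"
    rows = ("[ " + " ".join(fmt_num(v) for v in row) + " ]" for row in m)
    return "[ " + " ".join(rows) + " ]"
```

So 659.000000 prints as "659", -148.250000 as "-148.25", and the identity orientation as "[ I ]".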

(If you liked this, you might enjoy Optimization in the Twenty-First Century.)

The New Minimalism

You don't know minimalism until you've spent time in the Forth community. There are recurring debates about whether local variables should be part of the language. There are heated discussions about how scaled integer arithmetic is an alternative to the complexity of floating point math. I don't mean there were those debates back in the day; I mean they still crop up now and again. My history with Forth and stack machines explains the Forth mindset better than I can, but beware: it's a warning as much as a chronology.

Though my fascination with Forth is long behind me, I still tend toward minimalist programming, but not in the same, extreme, way. I've adopted a more modern approach to minimalism:

Use the highest-level language that's a viable option.

Lean on the built-in features that do the most work.

Write as little code as possible.

The "highest-level language" decision means you get as much as possible already done for you: arbitrary length integers, unicode, well-integrated data structures, etc. Even better are graphics and visualization capabilities, such as in R or JavaScript.

"Lean on built-in features" means that when there's a choice, prefer the parts of the system that are both fast--written in C--and do the most work. In Perl, for example, you can split a multi-megabyte string into many pieces with one function call, and it's part of the C regular expression library. Ditto for doing substitutions in a large string. In Perl/Python/Ruby, lean on dictionaries, which are both flexible and heavily optimized. I've seen Python significantly outrun C, because the C program used an off-the-cuff hash table implementation.

I've been mostly talking about interpreted languages, and there are two ways to write fast interpreters. The first is to micro-optimize the instruction fetch/dispatch loop. There are a couple of usual steps for this, but there's only so far you can go. The second is to have each instruction do more, so there are fewer to fetch and dispatch. Rule #2 above is taking advantage of the latter.
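A toy illustration of that second approach, with an entirely made-up instruction set: give the interpreter coarse instructions that each do a whole pass of work, so the fetch/dispatch loop runs a handful of times instead of once per element.

```python
def run(program):
    # each instruction is (opcode, argument); the "big" opcodes
    # process an entire list in a single dispatch
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        elif op == "map_add":   # coarse: add arg to every element at once
            stack.append([x + arg for x in stack.pop()])
        elif op == "sum":       # coarse: reduce a whole list at once
            stack.append(sum(stack.pop()))
    return stack.pop()

result = run([("push", [1, 2, 3]), ("map_add", 10), ("sum", None)])
```

Three dispatches total, no matter how long the list is; a fine-grained instruction set would dispatch per element.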

Finally, "write as little code as possible." Usual mistakes here are building a wrapper object around an array or dictionary and representing simple types like a three-element vector as a dictionary with x, y, and z keys, or worse, as a class. You don't need a queue class; you've already got arrays with ways to add and remove elements. Keep things light and readable at a glance, where you don't have to trace into layers of functions to understand what's going on. Remember, you have lots of core language capabilities to lean on. Don't insist upon everything being part of an architecture or framework.
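Concretely, in Python terms (a sketch with invented names): the built-in types already cover these cases, so there's nothing to wrap.

```python
from collections import deque

# a three-element vector: just a tuple, not a dict of x/y/z keys or a class
v = (1.0, 2.0, 3.0)

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

# a queue: the standard deque, no wrapper class needed
q = deque()
q.append("first")
q.append("second")
front = q.popleft()
```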

This last item, write less code, is the one that the other two are building toward. If you want people to be able to understand and modify your programs--which is the key to open source--then have less to figure out. That doesn't mean fewer characters or lines at all costs. If you need a thousand lines, then you need a thousand lines, but make those thousand lines matter. Make them be about the problem at hand and not filler. Don't take a thousand lines to write a 500 line program.

(If you liked this, you might enjoy The Software Developer's Sketchbook.)

Being More Than "Just the Programmer"

There's a strange dichotomy that college doesn't prepare computer science majors for: knowing how to program is a huge benefit if you want to create something new and useful, but as a programmer you're often viewed as the implementer of someone else's vision--as just the programmer--and have limited say in crafting the application as a whole.

(Note that here I'm using "application as a whole" to mean the feature set and experience of using the app, not the underlying architecture.)

In my first game development job writing 16-bit console games, I naively expected there to be a blend of coding and design, like there was when I was writing my own games for home computer magazines, but the two were in different departments in different locations in the rented office park space. It had never occurred to me that a game could be failing because of poor design, yet I wouldn't be able to do anything about it, not having a title with "design" in it. I came out of that experience realizing that I needed to be more than an implementer.

I wanted to write up some tips for people in similar situations, people who want to be more than just the programmer.

Go through some formalities to prove that you have domain knowledge. You might think you know how to design good user interfaces, but why should anyone listen to you? Buy and read the top books in the field, have them at your desk, and use them to cite guidelines. Or take a class, which might be less efficient than reading on your own, but it's concrete and carries more weight than vague, self-directed learning.

Don't get into technical details when it doesn't matter. "Why is that going to take three weeks to finish?" "Well, there's a new version of the library that doesn't fully work with the C++11 codebase that we're using so I'm going to have to refactor a few classes, and also there are issues with move semantics in the new compiler, so..." No matter how you say this, it sounds like complaining, and you get a reputation as the programmer who spouts technical mumbo jumbo. Sure, talk tech with the right people, but phrase things in terms of the project--not the code--otherwise.

Don't get into programming or technology arguments, ever. Just don't. Again, this is usually thinking on the wrong level, and you don't want to advertise that. There's also this Tony Hoare quote that I love:

You know, you shouldn't trust us intelligent programmers. We can think up such good arguments for convincing ourselves and each other of the utterly absurd.

Get to know people in departments whose work interests you. Continuing the user interface example from above, go talk to the UX folks. Learn what they like and don't like and why they've made certain decisions. They'll be glad that someone is taking an interest, and you'll be learning from people doing the work professionally.

Build prototypes to demonstrate ideas. If you jump in and do work that someone else is supposed to do, like changing the game design, then that's not going to turn out well. A better approach is to build a small prototype of a way you think something should work and get feedback. Take the feedback to heart and make changes based on it (also good, because you're showing people you value their opinions). Sometimes these prototypes will fall flat, but other times you'll have a stream of people stopping by your desk to see what they've heard about.

Picturing WebSocket Protocol Packets

(I'm using JavaScript in this article. If you're reading this via the news feed, go to the original version to see the missing parts.)

I recently wrote a WebSocket server in Erlang. I've grown fond of splitting even desktop apps into two programs: one handling the graphics and interface, the other the core logic, with the two communicating over a local socket. These days it makes sense to use a browser for the first of these, with a WebSocket connecting it to an external program. The only WebSocket code I could find for Erlang required an existing web server package, which is why I wrote my own.

The WebSocket spec contains this diagram to describe the messages between the client and server:

0               1               2               3
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
4               5               6               7              
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|     Extended payload length continued, if payload len == 127  |
+ - - - - - - - - - - - - - - - +-------------------------------+
8               9               10              11             
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               |Masking-key, if MASK set to 1  |
+-------------------------------+-------------------------------+
12              13              14              15
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data continued ...                |
+---------------------------------------------------------------+

This is a confusing diagram for a number of reasons. The ASCII art, for example, makes it hard to see which lines contain data and which are for byte numbers. When I first looked at it, it made me think there was more overhead than there actually is. That's unfortunate, because there's a simplicity to WebSocket protocol packets that's hard to extract from the above image, and that's what I want to demonstrate.

Here's the fixed part of the header, the 16 bits that are always present. This is followed by additional info, if needed, then the data itself. The number of bits is shown below each field. You should keep coming back to this for reference.

[See the original or enable JavaScript.]

F = 1 means this is a complete, self-contained packet. Assume it's always 1 for now. The main use of the opcode (Op) is to specify if the data is UTF-8 text or binary. M = 1 signals the data needs to be exclusive or-ed with a 32-bit mask. The length (Len) has three different encodings depending on how much data there is.
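In JavaScript (the language this article already uses), the fixed header decodes with a couple of bit masks, and the masking step is just a byte-wise XOR with the key, repeating every 4 bytes. A sketch, with made-up function names:

```javascript
// Decode the fixed two-byte header of a WebSocket frame.
// Layout: FIN(1) RSV(3) opcode(4) | MASK(1) payload-len(7).
function parseFixedHeader(byte0, byte1) {
  return {
    fin:    (byte0 & 0x80) !== 0,  // F: complete, self-contained packet?
    opcode:  byte0 & 0x0f,         // Op: 0x1 = UTF-8 text, 0x2 = binary
    masked: (byte1 & 0x80) !== 0,  // M: is the data masked?
    len:     byte1 & 0x7f          // Len: 7 bits; 126 and 127 are escapes
  };
}

// XOR is its own inverse, so the same function masks and unmasks.
function unmask(bytes, key) {
  return bytes.map((b, i) => b ^ key[i % 4]);
}

// 0x81 0x05: a final, unmasked text frame with a 5-byte payload.
const header = parseFixedHeader(0x81, 0x05);
// header.fin === true, header.opcode === 1,
// header.masked === false, header.len === 5
```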

Messages to the server are required to have a mask, so here's what packets look like for each of the three length encodings.

[See the original or enable JavaScript.]

The first has a length of 60 bytes, the second 14,075, and the third 18,000,000. Special escape values in the 7-bit Len field indicate the presence of an additional 16- or 64-bit length field.
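The escape scheme fits in a few lines. A sketch (the function name is mine, not from the spec), using the same three lengths:

```javascript
// The three length encodings, keyed off the 7-bit Len field:
//   n < 126  -> Len holds the length itself, no extra bytes
//   n <= 65535 -> Len = 126, followed by a 16-bit length
//   larger   -> Len = 127, followed by a 64-bit length
function encodeLength(n) {
  if (n < 126)     return { len: n,   extra: 0 };
  if (n <= 0xffff) return { len: 126, extra: 2 };
  return { len: 127, extra: 8 };
}

encodeLength(60);        // { len: 60,  extra: 0 }
encodeLength(14075);     // { len: 126, extra: 2 }
encodeLength(18000000);  // { len: 127, extra: 8 }
```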

Packets from the server to the client don't use the mask, so the headers are shorter. Again, for the same three data lengths:

[See the original or enable JavaScript.]

The remaining part is what fragmented messages look like. The F bit is 1 only for the Final packet. The initial packet contains the opcode; the others have 0 in the opcode field.

[See the original or enable JavaScript.]

This message is 8256 bytes in total: two of 4096 bytes and one of 64. Notice how different length encodings are used, just like in the earlier examples.
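The header bits for those three fragments can be tabulated directly (values reconstructed from the description above, not from a capture):

```javascript
// A 3-fragment text message (4096 + 4096 + 64 bytes): only the first
// fragment carries the opcode; only the last has the F (FIN) bit set.
const fragments = [
  { fin: 0, opcode: 0x1, len: 4096 },  // initial: opcode = text, Len escape 126
  { fin: 0, opcode: 0x0, len: 4096 },  // continuation: opcode 0
  { fin: 1, opcode: 0x0, len: 64 }     // final: F = 1, plain 7-bit length
];
const total = fragments.reduce((n, f) => n + f.len, 0);  // 8256
```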

(If you liked this, you might enjoy Exploring Audio Files with Erlang.)

Learning to Program Without Writing the Usual Sort of Code

There's much anecdotal evidence, from teachers of beginning programming classes, that many people can't come to grips with how to program. Sticking points can be as fundamental as not being able to break a problem down into a series of statements executed one after another, or struggling with how variables are updated and have different values at different points in the program.

I don't think it's quite as straightforward as that, because there are real life analogs for both of these sticking points. Clearly you have to go into the restaurant before you can sit at the table, then you order, eat, pay the bill, and leave. Everyone gets that (and knows why you don't sit at the table before going to the restaurant). When you pay for the meal, the money you have is decreased, and it stays that way afterward. The difference with code is that it's much more fine-grained, much more particular, and it's not nearly so easy to think about.

If you think that I'm oversimplifying, here's a little problem for you to code up: Write a program that, given an unordered array, finds the second largest value.

(I'll wait for you to finish.)

You could loop through the array and see if each element is greater than Largest. If that's true, then set NextLargest to Largest, and Largest to the current element. Easy. Except that this doesn't work if you find a value smaller than Largest but greater than NextLargest. You need another check for that (did you get that right?). I'm not saying this is a hard problem, just a little tricky, and a hard one for beginners to think about. Even in the first, simple case you have to get the two assignments in the right order, or both variables end up with the same value.
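For concreteness, here's that loop in JavaScript, including the second check that's easy to miss:

```javascript
// Second largest via a single pass, tracking the two biggest seen so far.
function secondLargest(a) {
  let largest = -Infinity;
  let nextLargest = -Infinity;
  for (const x of a) {
    if (x > largest) {
      nextLargest = largest;  // assignments in this order, or both
      largest = x;            // variables end up with the same value
    } else if (x > nextLargest) {
      nextLargest = x;        // smaller than largest, greater than nextLargest
    }
  }
  return nextLargest;
}

secondLargest([3, 9, 4, 7, 1]);  // 7
```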

Set aside that kind of programming for a bit, and let's look at other ways of solving the "second largest" problem. Remember, nowhere in the description does it say anything about performance, so that's not a concern.

Here's the easiest solution: Sort the array from largest to smallest and take the second element.

There's a little extra housekeeping for this to be completely correct (what if the length of the array is 1?), but the solution is still trivial to think about. It's two steps. No looping. No variables. Of course you don't write the sort function; that's assumed to exist.
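In JavaScript, with the housekeeping included, the sort-based version is two steps:

```javascript
// Sort largest-to-smallest and take the second element.
function secondLargest(a) {
  if (a.length < 2) return undefined;  // the housekeeping case
  return [...a].sort((x, y) => y - x)[1];
}

secondLargest([3, 9, 4, 7, 1]);  // 7
```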

If you're not buying the sort, here's another: Find the largest value in the array, remove it, then find the largest value in the updated array. This sounds like looping and comparisons (even though finding the largest element is an easier problem than the second largest), but think about it in terms of primitive operations that should already exist: (1) finding the largest value in an array, and (2) deleting a value from an array. You could adjust those so you're getting the index of the largest value and deleting the element at an index, but the naive version is perfectly fine.
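Given those two primitives (sketched here; the names are mine), the solution is a one-liner:

```javascript
// Assumed primitives: largest value in an array, and deleting one
// occurrence of a value from an array.
const largest = a => Math.max(...a);
const remove = (a, v) => {
  const i = a.indexOf(v);
  return [...a.slice(0, i), ...a.slice(i + 1)];
};

// "Find the largest, remove it, find the largest of what's left."
const secondLargest = a => largest(remove(a, largest(a)));

secondLargest([3, 9, 4, 7, 1]);  // 7
```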

What I'm getting at is that thinking about problems given a robust set of primitives to work with is significantly easier than the procedural coding needed to write those primitives in the first place. Yet introductions to programming are focused almost exclusively on the latter.

As much as I like functional programming, I don't think it gets this right. The primitives in most functional languages are based around maps and folds and zip-withs, all of which require writing small, anonymous functions as parameters. Now if a fold that adds up all the integers in an array is named "sum" (as in at least Haskell and Erlang), then that's a solid, non-abstract primitive to think about and work with.

Long-time readers will expect me to talk about array languages like J at this point, and I won't disappoint. The entire history of array languages has been about finding a robust and flexible set of primitives for manipulating data, especially lists of numbers. To be completely fair, J falls down in cases that don't fit that model, but it's a beautiful system for not writing code. Instead you interactively experiment with a large set of pre-written verbs (to use Ken Iverson's term).

J or not, this kind of working exclusively with primitives may be a good precursor to traditional programming.

(If you liked this, you might enjoy Explaining Functional Programming to Eight-Year-Olds.)

Progress Bars are Surprisingly Difficult

We've all seen progress bars that move slowly for twenty minutes, then rapidly fill up in the last 30 seconds. Or the reverse, where a once speedy bar takes 50% of the time covering the last few pixels. And bars that occasionally jump backward in time are not the rarity you'd expect them to be.

Even this past month, when I installed the macOS Sierra update, the process completed when the progress bar was only two-thirds full. DOOM 2016 has a circular progress meter for level loads, with the percent-complete in the center. It often sits for a while at 0%, gets stuck at 74% and 99%, and sometimes finishes in the 90s before reaching 100%.

Clearly this is not a trivial problem, or these quirks would be behind us.

Conceptually, a perfect progress bar is easy to build. All you need to know is exactly how long the total computation will take, then update the bar in its own thread so it animates smoothly. Simple! Why do developers have trouble with this? Again, all you need to know is exactly how long...

Oh.

You could time it with a stopwatch and use that value, but that assumes your system is the standard, and that other people won't have faster or slower processors, drives, or internet connections. You could run a little benchmark and adjust the timing based on that, but there are too many factors. You could refine the estimate mid-flight, but this is exactly the road that leads to the bar making sudden jumps into the past. It's all dancing around that you can't know ahead of time exactly how long it should take for the progress bar to go from empty to full.

There's a similar problem in process scheduling, where there are a number of programs to run sequentially in batch mode. One program at a time is selected to run to completion, then the next. If the goal is to have the lowest average time for programs being completed, then the best criterion for choosing the next program is the shortest execution time (see shortest job next). But this requires knowing how long each program will take before running it, and that's not possible in the general case.
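A quick illustration with made-up job durations shows why running the shortest job next lowers the average:

```javascript
// Average completion time when jobs run back to back in the given order.
function averageCompletion(durations) {
  let clock = 0, sum = 0;
  for (const d of durations) {
    clock += d;    // this job completes at the running total
    sum += clock;
  }
  return sum / durations.length;
}

const jobs = [8, 1, 3];
averageCompletion(jobs);                            // (8 + 9 + 12) / 3
averageCompletion([...jobs].sort((a, b) => a - b)); // (1 + 4 + 12) / 3
```

Shortest-first, the quick jobs finish early instead of waiting behind the long one, so the average drops from about 9.7 to about 5.7.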

And so the perfect progress bar is forever out of reach, but progress bars are still useful, as established by Brad A. Myers in his 1985 paper ("The importance of percent-done progress indicators for computer-human interfaces"). But "percent-done" of what? It's easy to map the loading of a dozen similarly sized files to an overall percentage complete. Not so much when all kinds of downloading and local processing are combined into a single progress number. At that point the progress bar loses all meaning except as an indication that there's some sort of movement toward a goal, and that most likely the application hasn't locked up.

(If you liked this, you might enjoy An Irrational Fear of Files on the Desktop.)

Writing Video Games in a Functional Style

When I started this blog in 2007, a running theme was "Can interactive experiences like video games be written in a functional style?" These are programs heavily based around mutable state. They evolve, often drastically, during development, so there isn't a perfect up-front design to architect around. These were issues curiously avoided by the functional programming proponents of the 1980s and 1990s.

It's still not given much attention in 2016 either. I regularly see excited tutorials about mapping and folding and closures and immutable variables, and even JavaScript has these things now, but there's a next step that's rarely discussed and much more difficult: how to keep the benefits of immutability in large and messy programs that could gain the most from functional solutions--like video games.

Before getting to that, here are the more skeptical functional programming articles I wrote, so it doesn't look like I'm a raving advocate:

I took a straightforward, arguably naive, approach to interactive functional programs: no monads (because I didn't understand them), no functional-reactive programming (ditto, plus all implementations had severe performance problems), and instead worked with the basic toolkit of function calls and immutable data structures. It's completely possible to write a video game (mostly) in that style, but it's not a commonly taught methodology. "Purely Functional Retrogames" has most of the key lessons, but I added some additional techniques later:

The bulk of my experience came from rewriting a 60fps 2D shooter in mostly-pure Erlang. I wrote about it in An Outrageous Port, but there's not much detail. It really needed to be a multi-part series with actual code.

For completeness, here are the other articles that directly discuss FP:

If I find any I missed, I'll add them.

So Long, Prog21

I always intended "Programming in the 21st Century" to have a limited run. I'd known since the Recovering Programmer entry from January 1, 2010, that I needed to end it. It just took a while.

And now, an explanation.

I started this blog to talk about issues tangentially related to programming, about soft topics like creativity and inspiration and how code is a medium for implementing creative visions. Instead I worked through more technical topics that I'd been kicking around over the years. That was fun! Purely Functional Retrogames is something I would have loved to read in 1998. More than once I've googled around and ended up back at one of my essays.

As I started shifting gears and getting back toward what I originally wanted to do, there was one thing that kept bothering me: the word programming in the title.

I don't think of myself as a programmer. I write code, and I often enjoy it when I do, but the term programmer is both limiting and distracting. I don't want to program for its own sake, with no interest in the overall experience of what I'm creating. If I start thinking too much about programming as a distinct entity then I lose sight of that. Now that I've exhausted what I wanted to write about, I can clear those topics out of my head and focus more on using technology to make fun things.

Thanks for reading!

It's hard to sum up 200+ articles, but here's a start. This is not even close to a full index. See the archives if you want everything. (There are some odd bits in there.)

widely linked

popular

on creativity

others that I like

Erlang

retro

Also see the previous entry for all of the functional programming articles.

Programming as if Performance Mattered is something I wrote in 2004 which used to be linked from every prog21 entry.