Programming in the Twenty-First Century — Full Archive of the Blog by James Hague
A Deeper Look at Tail Recursion in Erlang
The standard "why tail recursion is good" paragraph
talks of reducing stack usage and subroutine calls turning
into jumps. An Erlang process must be tail recursive or
else the stack will endlessly grow. But there's more to how
this works behind the scenes, and it directly affects how
much code is generated by similar looking tail recursive
function calls.
A long history of looking at disassemblies of C code has
made me cringe when I see function calls containing many
parameters. Yet it's common to see ferocious functions in
Erlang, like this example from erl_parse (to be fair, it's
code generated by yecc):
yeccpars2(29, __Cat, __Ss, __Stack, __T, __Ts, __Tzr) ->
    yeccpars2(yeccgoto(expr_700, hd(__Ss)),
              __Cat, __Ss, __Stack, __T, __Ts, __Tzr);
It's tail recursive, yes, but seven parameters? Surely
this turns into a lot of pushing and popping behind the
scenes, even if stack usage is constant. The good news is
that, no, it doesn't. It's more verbose at the language
level than what's really going on.
There's a simple rule about tail recursive calls in
Erlang: If a parameter is passed to another function,
unchanged, in exactly the same position it was passed in,
then no virtual machine instructions are generated. Here's
an example:
loop(Total, X, Size, Flip) ->
    loop(Total, X - 1, Size, Flip).
Let's number the parameters from left to right, so Total
is 1, X is 2, and so on. Total enters the function in
parameter position 1 and exits, unchanged, in position 1 of
the tail call. The value just rides along in a parameter
register. Ditto for Size in position 3 and Flip in position
4. In fact, the only change at all is X, so the virtual
machine instructions for this function look more or less
like:
parameter[2]--
goto loop
Perhaps less intuitively, the same rule applies if the
number of parameters increases in the tail call. This idiom
is common in functions with accumulators:
count_pairs(List, Limit) -> count_pairs(List, Limit, 0).
The first two parameters are passed through unchanged. A
third parameter--zero--is tacked onto the end of the
parameter list, the only one of the three that involves any
virtual machine instructions.
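To make the rule concrete, here's a hypothetical helper (count_over is my name, not from the article) written in the same shape: the list advances, Limit rides along unchanged in position 2, and the count accumulates in the appended third position.

count_over(List, Limit) -> count_over(List, Limit, 0).

%% Limit passes through untouched in the same position on every call,
%% so it costs nothing; only the list and the count involve any work.
%% count_over([1,5,9,12], 7) returns 2.
count_over([H|T], Limit, N) when H > Limit -> count_over(T, Limit, N + 1);
count_over([_|T], Limit, N) -> count_over(T, Limit, N);
count_over([], _Limit, N) -> N.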
In fact, just about the worst thing you can do to
violate the "keep parameters in the same positions" rule is
to insert a new parameter before the others, or to
randomly shuffle parameters. This code results in a whole
bunch of "move parameter" instructions:
super_deluxe(A, B, C, D, E, F) ->
    super_deluxe(F, E, D, C, B, A).
while this code turns into a single jump:
super_deluxe(A, B, C, D, E, F) ->
    super_deluxe(A, B, C, D, E, F).
These implementation techniques, used in the Erlang BEAM
virtual machine, were part of the Warren
Abstract Machine developed for Prolog.
On the Perils of Benchmarking Erlang
2007 brought a lot of new attention to Erlang, and with
that attention has come a flurry of impromptu benchmarks.
Benchmarks are tricky to write if you're new to a language,
because it's easy for the run-time to be dominated by
something quirky and unexpected. Consider a naive Python
loop that appends data to a string each iteration. Strings
are immutable in Python, so each append causes the entire
string created thus far to be copied. Here's my short, but
by no means complete, guide to pitfalls in benchmarking
Erlang code.
Startup time is slow. Erlang's startup time is
more significant than with the other languages I use.
Remember, Erlang is a whole system, not just a scripting
language. A suite of modules is loaded by default--modules
that make sense in most applications. If you're going to
run small benchmarks, the startup time can easily dwarf
your timings.
Garbage collection happens frequently in rapidly
growing processes. An Erlang process starts out very
small, to keep the overall memory footprint low in a system
with potentially tens of thousands of processes. Once a
process heap is full, it gets promoted to a larger size.
This involves allocating a new block of memory and copying
all live data over to it. Eventually process heap size will
stabilize, and the system automatically switches a process
over to a generational garbage collector at some point too,
but during that initial burst of growing from a few hundred
words to a few hundred kilowords, garbage collection
happens numerous times.
To get around this, you can start a process with a
specific heap size using spawn_opt instead of spawn.
The min_heap_size option lets you choose an initial
heap size in words. Even a value
of 32K can significantly improve the timings of some
benchmarks. No need to worry about getting the size exactly
right, because it will still be automatically expanded as
needed.
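As a minimal sketch, spawning the process under test with a 32K-word initial heap looks something like this (the fun you pass in is whatever you're benchmarking):

start_benchmark(BenchFun) ->
    %% BenchFun is a zero-arity fun containing the code to measure;
    %% 32768 is the initial heap size in words.
    spawn_opt(BenchFun, [{min_heap_size, 32768}]).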
Line-oriented I/O is slow. Sadly, yes, and
Tim
Bray found this out pretty early on. Here's to hoping
it's better in the future, but in the meantime any
line-oriented benchmark will be dominated by I/O. Use
file:read_file
to load the whole file at once,
if you're not dealing with gigabytes of text.
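A sketch of that approach, reading the whole file in one call and splitting it into lines afterward (the filename is made up):

read_lines(Filename) ->
    {ok, Bin} = file:read_file(Filename),
    string:tokens(binary_to_list(Bin), "\n").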
The more functions exported from a module, the less
optimization potential. It's common (and perfectly
reasonable) to put:
-compile(export_all).
at the top of a module that's in development. There's some
nice tech in the Erlang compiler that tracks the types of
variables. If a binary is always passed to a function, then
that function can be specialized for operating on a binary.
Once you open up a function to be called from the outside
world, then all bets are off. Assumptions about the type of
a parameter cannot be made.
Inlining is off by default. I doubt you'll ever
see big speedups from this, but it's worth adding
-compile(inline).
to modules that involve heavy computation.
Large loop indices use bignum math. A "small"
integer in Erlang fits into a single word, including the
tag bits. I can never remember how many bits are needed for
the tag, but I think it's two. (BEAM uses a staged tagging
scheme so key types use fewer tag bits.) If a benchmark has
an outer loop counting down from ten billion to zero, then
bignum math is used for most of that range. "Bignum" means
that a value is larger than will fit into a single machine
word, so math involves looping and manually handling some
things that an add
instruction automatically
takes care of. Perhaps more significantly, each bignum is
heap allocated, so even simple math like X + 1, where X
is a bignum, causes the garbage collector to kick in more
frequently.
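For instance, a countdown like this one spends most of its range in bignum territory on a 32-bit system, since ten billion is far too large for a small integer:

countdown(0) -> done;
countdown(N) -> countdown(N - 1).

%% countdown(10000000000) does bignum arithmetic until N drops below
%% the small-integer limit.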
Admitting that Functional Programming Can Be
Awkward
My initial interest in functional programming was
because it seemed so perverse.
At the time, I was the classic self-taught programmer,
having learned BASIC and then 6502 assembly language so I
could implement my own game designs. I picked up the August
1985 issue of Byte magazine to read about the then-new
Amiga. It also happened to be the issue on declarative
languages, featuring a reprint of Backus's famous Turing
Award Lecture and a tutorial on Hope,
among other articles.
This was all pretty crazy stuff for an Atari 800 game
coder to be reading about. I understood some parts,
completely missed vast swaths of others, but one key point
caught my imagination: programming without modifiable
variables. How could that possibly work? I couldn't write
even the smallest game without storing values to memory. It
appealed to me for its impossibility, much in the way that
I had heard machine language was too difficult for most
people to approach. But while I had pored over assembly
language listings of games in magazines, and learned to
write my own as a result, there wasn't such direct
applicability for functional programming. It made me
wonder, but I didn't use it.
Many years later when I first worked through tutorials
for Haskell, Standard ML, and eventually Erlang, it was to
figure out how programming without modifying variables
could work. In the small, it's pretty easy. Much of what
seemed weird back in 1985 had become commonplace: garbage
collection, using complex data structures without worrying
about memory layout, languages with much less bookkeeping
than C or Pascal. But that "no destructive updates" thing
was--and still is--tricky.
I suppose it's completely obvious to point out that
there have been tens of thousands of video games written
using an imperative programming style, and maybe a
handful--maybe even just a couple of fingers worth--of
games written in a purely functional manner. Sure, there
have been games written in Lisp and some games written by
language dilettantes fond of Objective Caml, but they never
turn out to be programmed in a functional style. You can
write imperative code in those languages easily enough. And
the reason for going down that road is simple: it's not at
all clear how to write many types of complex applications
in functional languages.
Usually I can work through the data dependencies, and
often I find that there's an underlying simplicity to the
functional approach. But for other applications...well,
they can turn into puzzles. Where I can typically slog
through a messy solution in C, the purely functional
solution either eludes me or takes some puzzling to figure
out. In those cases I feel like I'm fighting the system,
and I realize why it's the road less traveled. Don't
believe it? Think that functional purity is always the road
to righteousness? Here's an easy example.
I wrote a semi-successful Mac game a while back called
Bumbler. At its heart it was your standard sprite-based
game: lots of independent objects running some behavioral
code and interacting with each other. That kind of code
looks easy to write in a purely functional way. An ant,
represented as a coordinate, marches across the screen in a
straight line and is deleted when it hits the opposite
screen edge. That's easy to see as a function. One small
clod of data goes in, another comes out.
But the behaviors and interactions can be a lot more
tangled than this. You could have an insect that chases
other insects, so you've got to pass in a list of existing
entities to it. You can have an insect that affects spawn
rates of other insects, but of course you can't
modify those rates directly so you've got to return that
data somehow. You can have an insect that latches onto eggs
and turns them into something else, so now there's a
behavioral function that needs to reach into the list of
entities and make modifications, but of course you're not allowed
to do that. You can have an insect that modifies the
physical environment (that is, the background of the game)
and spawns other insects. And each of these is messier than
it sounds, because there are so many counters and
thresholds and limiters being managed and sounds being
played in all kinds of situations, that the data flow isn't
clean by any means.
What's interesting is that it would be trivial to write
this in C. Some incrementing, some conditions, direct calls
to sound playing routines and insect spawning functions,
reading and writing from a pool of global counters and
state variables. For a purely functional approach, I'm sure
the data flow could be puzzled out...assuming that
everything was all perfectly planned and all the behaviors
were defined ahead of time. It's much more difficult to
take a pure movement function and say "okay, what I'd like
is for this object to gravitationally influence other
objects once it has bounced off of the screen edges three
times." Doable, yes. As directly implementable as the C
equivalent? No way.
That's one option: to admit that functional programming
is the wrong paradigm for some types of problems. Fair
enough. I'd put money on that. But it also may be that
almost no one has been thinking about problems like this,
that functional programming attracts purists and enamored
students. In the game example above, some of the issues are
solvable, they just need different approaches. Other issues
I don't know how to solve, or at least I don't have
solutions that are as straightforward as writing sequential
C. And there you go...I'm admitting that functional
programming is awkward in some cases. It's also extremely
useful in others.
(Also see the follow-up.)
Follow-up to "Admitting that Functional Programming Can
Be Awkward"
Admitting that functional programming
can be awkward drew a much bigger audience than I
expected, so here's some insight into why I wrote it, plus
some responses to specific comments.
I started learning some functional programming languages
in 1999, because I was looking for a more pleasant way to
deal with complex programming tasks. I eventually decided
to focus on Erlang (the reasons for which are probably
worthy of an entire entry), and after a while I found I was
not only using Erlang for some tasks I would have
previously used Perl for (and truth be told, I still use
Perl sometimes); I was able to approach problems that would
have been just plain nasty in C. But I also found that some
tasks were surprisingly hard in Erlang, clearly harder than
banging out an imperative solution.
Video games are a good manifestation of a difficult
problem to approach functionally: lots of tangled
interactions between actors. I periodically search for
information on video games written in functional languages,
and I always get the same type of results. There's much
gushing about how wonderful functional programming is, how
games are complex, and how the two are a great match. Eight
years ago I even
did this myself, and I keep running into it as a cited
source about the topic. Then there's
Functional Reactive Programming, but the demos are
always bouncing balls and Pong and Space Invaders--which
are trivial to write in any language--and it's not at all
clear if it scales up to arbitrary projects. There are also
a handful of games written in procedural or object-oriented
styles in homebrew Lisp variants, and this is often equated
with "game in a functional language."
My conclusion is that there's very little work in
existence or being done on how to cleanly approach video
game-like problems in functional languages. And that's
okay; it's unexplored (or at least undocumented) territory.
I wrote "Admitting..." because of my own experiences
writing game-like code in Erlang. I don't think the overall
point was Erlang-specific, and it would apply to pure
subsets of ML, Scheme, etc. It wasn't meant as a cry for
help or a way of venting frustration. If you're writing
games in functional languages, I'd love to hear from
you!
Now some responses to specific comments.
All you have to do is pass the world state to each
function and return a new state.
True, yes, but...yuck. It can be clunky in a language
with single-assignment. And what is this really gaining you
over C?
All you have to do is make entities be functions of
time. You pass in a time and the position and state of that
entity at that time are returned.
For a bouncing ball or spinning cube, yes, that's easy.
But is there a closed-form solution for a game where
entities can collide with each other and react to what the
player is doing?
All languages suck at something. You should use a
multi-paradigm language.
Fair enough.
You used the word "modify" which shows you don't
understand how functional programming works.
If a spider does something that causes the spawn rate of
ants to change, then of course the old rate isn't modified
to contain the new value. But conceptually the
system needs to know that the new rate is what matters, so
somehow that data needs to propagate up out of a function
in such a way that it gets passed to the insect spawning
function from that point on. I was using "modify" in that
regard.
Erlang as a Target for Imperative DSLs
It's easy to show that any imperative program can be
implemented functionally. Or more specifically, that any
imperative program can be translated into Erlang. That
doesn't mean the functional version is better or easier to
write, of course.
Take an imperative program that operates on a set of
variables. Put all of those variables into a dictionary
called Falafel. Make every function take Falafel as the
first parameter and return a tuple of the form
{NewFalafel, OtherValues}.
This is the classic "pass around the state of the global
world" approach, except that the topic is so dry that I
amuse myself by saying Falafel instead of World. But I'll
go back to the normal terminology now.
What's awkward is that every time a new version of
World
is created inside the same function, it
needs a new name. C code like this:
color = 57;
width = 205;
can be mindlessly translated to Erlang:
World2 = dict:store(color, 57, World),
World3 = dict:store(width, 205, World2),
That's completely straightforward, yes, but manually
keeping track of the current name of the world is messy.
This could be written as:
dict:store(width, 205, dict:store(color, 57, World))
which has the same potential for programmer confusion
when it comes to larger, general cases. I wouldn't want to
write code like this by hand. But perhaps worrying about
the limitations of a human programmer is misguided. It's
easy enough to start with a simple imperative language and
generate Erlang code from it. Or wait, is that cheating? Or
admitting that functional programming can
be awkward?
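One hedged way to cut down on the manual renaming, short of generating code: fold a list of update functions over the world (with_updates is my name, not a library function):

with_updates(World, Updates) ->
    lists:foldl(fun(F, W) -> F(W) end, World, Updates).

%% Usage:
%% NewWorld = with_updates(World,
%%     [fun(W) -> dict:store(color, 57, W) end,
%%      fun(W) -> dict:store(width, 205, W) end]).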
None of this eliminates the issue that
dict:store
involves a lot of Erlang code, code
that's executed for every faux variable update.
A different angle is to remember that parameters in a
tail call are really destructive updates (see A Deeper Look at Tail Recursion in Erlang; and
I should have said "Tail Calls" instead of "Tail Recursion"
when I wrote the title). Arbitrarily messy imperative code
can be mechanically translated to Erlang through a simple
scheme:
Keep track of the live variables. If a variable is
updated, jump to a new function with all live variables
passed as parameters and the updated variable replaced
with its new value.
Here's some C:
total++;
count += total;
row = x * 200 + count;
And here's the Erlang version, again mindlessly
translated:
code_1(X, Total, Count) ->
    code_2(X, Total + 1, Count).
code_2(X, Total, Count) ->
    code_3(X, Total, Count + Total).
code_3(X, Total, Count) ->
    code_4(X, Total, Count, X * 200 + Count).
code_4(X, Total, Count, Row) ->
    ...
Hard to read? Yes. Bulky at the source code level, too.
But this is highly efficient Erlang, much faster than the
dictionary version. I'd even call it optimal in terms of
the BEAM virtual machine.
Sending Modern Languages Back to 1980s Game
Programmers
Take a moment away from the world of Ruby, Python, and
JavaScript to consider some of the more audacious
archaeological relics of computing: the tens of thousands
of commercial products written entirely in assembly
language.
That's every game ever written for the Atari 2600.
Almost all Apple II games and applications, save early
cruft written in BASIC (which was itself written in
assembly). VisiCalc. Almost all Atari 800 and Commodore 64
and Sinclair Spectrum games and applications. Every NES
cartridge. Almost every arcade coin-op from the 1970s until
the early 1990s, including elaborate 16-bit affairs like
Smash TV, Total Carnage, and NBA Jam (in case you were only
thinking of "tiny sprite on black background" games like
Pac-Man). Almost all games for the SNES and SEGA Genesis.
And I'm completely ignoring an entire mainframe era that
came earlier.
(It's also interesting to look at 8-bit era software
that wasn't written in assembly language. A large
portion of SunDog:
Frozen Legacy for the Apple II was written in
interpreted Pascal. The HomePak
integrated suite of applications for the Atari 8-bit
computers was written in a slick language called Action!. The
1984 coin-op Marble
Madness was one of the few games of the time written in
C, and that allowed it to easily be ported to the Amiga and
later the Genesis. A handful of other arcade games used
BLISS.)
Back in 1994, I worked on a SNES game that was 100,000+
lines of 65816 assembly language. Oh yeah, no debugger
either. It sounds extreme, almost unthinkable, but there
weren't good options at the time. You use what you have to.
So many guitar players do what looks completely impossible
to me, but there's no shortage of people willing to take
the time to play like that. Assembly language is pretty
straightforward, provided you practice a lot and don't
waste time dwelling on its supposed difficulty.
If you want really extreme there were people
hand assembling Commodore 64 code and even people
writing Apple II games entirely in the machine language
monitor (a friend who clued me into this said you can
look at the disassembled code and see how functions are
aligned to 256 byte pages, so they can be modified without
having to shift around the rest of the program).
It's an interesting exercise to consider what it would
have been like to write games for these old, limited
systems, but given modern hardware and a knowledge of
modern tools: Perl, Erlang, Python, etc. No way would I
have tried to write a Commodore 64 game in Haskell or Ruby,
but having the languages available and on more
powerful hardware would have changed everything. Here's
what I'd do.
Write my own assembler. This sounded so difficult
back then, but that's because parsing and symbol table
management were big efforts in 6502 code. Getting it fast
would have taken extra time, too. But now writing a cross
assembler in Erlang (or even Perl) is a weekend project at
best. A couple hundred lines of code.
Write my own emulator. I don't mean a true,
performance-oriented emulator for running old code on a
modern PC. I mean a simple interpreter for stepping through
code and making sure it works without having real crashes
on the target hardware. Again, this would be a quick
project. It's what functional languages are designed to do.
More than just executing code, I'd want to make queries
about which registers a routine changes, get a list of all
memory addresses read or modified by a function, count up
the cycles in any stretch of code. This is all trivially
easy, but it was so out of my realm as a self-taught game
author. (And for development tool authors of the time, too.
I never saw features like these.)
Write my own optimizers for tricky cases. The
whole point of assembly is to have control, but some
optimizations are too ugly to do by hand. A good example is
realizing that the carry flag is always set when a jump
instruction occurs, so the jump (3 bytes) can be replaced
with a conditional branch (2 bytes).
Write my own custom language. I used to worship
at the altar of low-level optimization, but all of that
optimization was more or less pattern recognition or brute
force shuffling of code to minimize instructions. I still,
even today, cringe at the output I see from most compilers
(I have learned that it's best not to look), because
generating perfect machine code from a complex language is
a tough problem. But given a simple processor like the
6502, 6809, or Z80, I think it would not only be possible
to automate all of my old tricks, but to go beyond them
into the realm of "too scary to mess with" optimizations.
Self-modifying code is a good example. For some types of
loops you can't beat stuffing constants into the compare
instructions. Doing this by hand...ugh.
Much of the doability of these options comes from the
simplicity of 8-bit systems. Have you ever looked into the
details of the COFF or Preferred Executable format? Compare
the pages and pages of arcana to the six byte header of an
Atari executable. Or look at the 6502 instruction set
summarized on a single sheet of paper, compared with the
two volume set of manuals from Intel for the x86
instructions. But a big part of it also comes from modern
programming languages and how pleasant they make
approaching problems that would have previously been
full-on projects.
Trapped! Inside a Recursive Data Structure
Flat lists are simple. That's what list comprehensions
are designed to work with, for example. Code for scanning
or transforming a flat list can usually be tail recursive.
Once data becomes deep, where elements of a list can
contain other lists ad infinitum, something changes.
It's trivial to iterate over a deep list; any basic Scheme
textbook covers this early on. You recurse down, down,
down, counting up values, building lists, and
then...trapped. You're way down inside a function, and all
you really want to do is exit immediately or record some
data that applies to the whole nested data structure and
keep going, but you can't.
As an example, here's the standard "is X contained in a
list?" function written in Erlang:
member(X, [X|_]) -> true;
member(X, [_|T]) -> member(X, T);
member(_, []) -> false.
Once a match is found, that's it. A value of true is
returned. A function to find X in a deep list takes a bit
more work:
member(X, [X|_]) ->
    true;
member(X, [H|T]) when is_list(H) ->
    case member(X, H) of
        true -> true;
        _ -> member(X, T)
    end;
member(X, [_|T]) ->
    member(X, T);
member(X, []) ->
    false.
The ugly part here is that you could be down 50 levels
in a deep list when a match is found, but you're trapped.
You can't just immediately stop the whole operation and say
"Yes! Done!" You've got to climb back up those 50 levels.
That's the reason for checking for "true" in the second
function clause. Now this example is mild in terms of
claustrophobic trappage, but it can be worse, and you'll
know it when you run into such a case.
There are a couple of options here. One is to throw an
exception. Another is to use continuation passing style.
But there's a third approach which I think is cleaner:
manage a stack yourself instead of using the function call
stack. This keeps the function tail recursive, making it
easy to exit or handle counters or accumulators across the
whole deep data structure.
Here's member
for deep lists written with
an explicit stack:
member(X, L) -> member(X, L, []).

member(X, [X|_], _Stack) ->
    true;
member(X, [H|T], Stack) when is_list(H) ->
    member(X, H, [T|Stack]);
member(X, [_|T], Stack) ->
    member(X, T, Stack);
member(_, [], []) ->
    false;
member(X, [], [H|T]) ->
    member(X, H, T).
Whenever the head of the list is a list itself, the tail
is pushed onto Stack so it can be continued with later, and
the list is processed. When there's no more input, check to
see if Stack has any data on it. If so, pop the top item
and make it the current list. When a match is found, the
exit is immediate, because there aren't any truly recursive
calls to back out of.
Would I really write member
like this?
Probably not. But I've found more complex cases where this
style is much less restrictive than writing a truly
recursive function. One of the signs that this might be
useful is if you're operating across a deep data structure
as a whole. For example, counting the number of
atoms in a deep list. Or taking a deep data structure and
transforming it into one that's flat.
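As a sketch of that first case, here's counting the atoms in a deep list with the same explicit-stack approach, tail recursive the whole way through:

count_atoms(L) -> count_atoms(L, [], 0).

count_atoms([H|T], Stack, N) when is_list(H) ->
    count_atoms(H, [T|Stack], N);
count_atoms([H|T], Stack, N) when is_atom(H) ->
    count_atoms(T, Stack, N + 1);
count_atoms([_|T], Stack, N) ->
    count_atoms(T, Stack, N);
count_atoms([], [H|T], N) ->
    count_atoms(H, T, N);
count_atoms([], [], N) ->
    N.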
Deriving Forth
When most programmers hear a mention of
Forth, assuming they're familiar with it at all, a
series of memory fragments surface: stack, Reverse Polish
Notation, SWAPping and DUPlicating values. While the stack
and RPN are certainly important to Forth, they don't
describe the essence of how the language actually works.
As an illustration, let's write a program to decode
modern coffee shop orders. Things like:
I'd like a grande skinny latte
and
Gimme a tall mocha with an extra shot to go
The catch here is that we're not allowed to write a
master parser for this, a program that slurps in the
sentence and analyzes it for meaning. Instead, we can only
look at a single word at a time, starting from the
left, and each word can only be examined once--no
rewinding.
To get around this arbitrary-seeming rule, each word
(like "grande") will have a small program attached to it.
Or more correctly, each word is the name of a
program. In the second example above, first the program
called gimme is executed, then a, then tall, and so on.
Now what do each of these programs do? Some words are
clearly noise: I'd, like, a, an, to, with. The program for
each of these words simply returns immediately. "I'd like
a," which is three programs, does absolutely nothing.
Now the first example ("i'd like a grande skinny
latte"), ignoring the noise words, is "grande skinny
latte." Three words. Three programs. grande
sets a Size
variable to 2, indicating large.
Likewise, tall
sets this same variable to 1,
and short
sets it to 0. The second program,
skinny, sets a Use_skim_milk flag to true. The third
program, latte, records the drink name in a variable we'll
call Drink_type.
To use a more concise notation, here's a list of the
programs for the second example:
gimme -> return
a -> return
tall -> Size = 1
mocha -> Drink_type = 1
with -> return
extra -> return
shot -> Extra_shot = true
to -> return
go -> To_go = true
When all of these programs have been executed, there's
enough data stored in a handful of global variables to
indicate the overall drink order, and we managed to dodge
writing a real parser. Almost. There still needs to be one
more program that looks at Drink_type and Size and so on.
If we name that program EOL, then it executes after all the
other
programs, when end-of-line is reached. We can even handle
rephrasings of the same order, like "mocha with an extra
shot, tall, to go" with exactly the same code.
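To make that concrete outside of Forth, here's a rough Erlang sketch of the same idea (the function names and property names are my own): each word is a little program that updates the order, noise words change nothing, and a fold plays the role of the interpreter loop.

decode(Words) -> lists:foldl(fun word/2, [], Words).

word("short",  Order) -> [{size, 0} | Order];
word("tall",   Order) -> [{size, 1} | Order];
word("grande", Order) -> [{size, 2} | Order];
word("skinny", Order) -> [{skim, true} | Order];
word("latte",  Order) -> [{drink, latte} | Order];
word("mocha",  Order) -> [{drink, mocha} | Order];
word("shot",   Order) -> [{extra_shot, true} | Order];
word("go",     Order) -> [{to_go, true} | Order];
word(_Noise,   Order) -> Order.

%% decode(string:tokens("gimme a tall mocha with an extra shot to go", " "))
%% gives [{to_go,true},{extra_shot,true},{drink,mocha},{size,1}].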
The process just described is the underlying
architecture of Forth: a dictionary of short programs. In
Forth-lingo, each of these named programs is called a
word. The main loop of Forth is simply an
interpreter: read the next bit of text delimited by spaces,
look it up in the dictionary, execute the program
associated with it, repeat. In fact, even the Forth
compiler works like this. Here's a simple Forth
definition:
: odd? 1 and ;
The colon is a word too, and the program attached to it
first reads the next word from the input and creates a
dictionary entry with that name. Then it does this: read
the next word in the input, if the word is a semicolon then
generate a return instruction and stop compiling, otherwise
look up the word in the dictionary, compile a call to it,
repeat.
So where do stacks and RPN come into the picture? Our
coffee shop drink parser is simple, but it's a front for a
jumble of variables behind the scenes. If you're up for
some inelegant code, you could do math with the same
approach. "5 + 3" is three words:
5 -> Value_1 = 5
+ -> Operation = add
3 -> Value_2 = 3
EOL -> Operation(Value_1, Value_2)
but this is clunky and breaks down quickly. A stack is a
good way to keep information flowing between words, maybe
the best way, but you could create a dictionary-based
language that didn't use a stack at all. Each function in
Backus's FP, for example, creates a value or data structure
which gets passed to the next function in sequence. There's
no stack.
Finally, just to show that my fictional notation is
actually close to real Forth, here's a snippet of code for
the drink decoder:
variable Size
variable Type
: short 0 Size ! ;
: tall 1 Size ! ;
: grande 2 Size ! ;
: latte 0 Type ! ;
: mocha 1 Type ! ;
Two Stories of Simplicity
In response to Sending modern languages
back to 1980s game programmers, one of the questions I
received was "Did any 8-bit coders ever use more powerful
computers for development?" Sure! The VAX and PDP-11 and other
minicomputers were available at the time, though expensive,
and some major developers made good use of them,
cross-compiling code for the lowly Atari 800 and Apple II.
But there was something surprising about some of these
systems:
It was often slower to cross-assemble a program on a
significantly higher-specced machine like the VAX than it
was to do the assembly on a stock 8-bit home computer.
Part of the reason is that multiple people were sharing
a single VAX, working simultaneously, but the Apple II user
had the whole CPU available for a single task. There was
also the process of transferring the cross-assembled code
to the target hardware, and this went away if the code was
actually built on the target. And then there were
inefficiencies that built up because the VAX was designed
for large-scale work: more expensive I/O libraries, more
use of general purpose tools and code.
For example, a VAX-hosted assembler might dynamically
allocate symbols and other data on the heap, something
typically not used on a 64K home computer. Now a heap
manager--what malloc sits on top of--isn't a trivial bit of
code. More importantly, you usually can't predict how much
time a request for a block of memory will take to fulfill.
Sometimes it may be almost instantaneous, other times it
may take thousands of cycles, depending on the algorithms
used and current state of the heap. Meanwhile, on the 8-bit
machine, those thousands of cycles are going directly
toward productive work, not solving the useful but
tangential problem of how to effectively manage a heap.
So in the end there were programmers with these little
8-bit machines outperforming minicomputers costing hundreds
of thousands of dollars.
That ends the first story.
When I first started programming the Macintosh, right
after the switch to PowerPC processors in the mid-1990s, I
was paranoid about system calls. I knew the system memory
allocation routines were unpredictable and should be
avoided in performance-oriented code. I'd noticeably sped
up a commercial application by dodging a system "Is point
in an arbitrarily complex region?" function. It was in this
mindset that I decided to steer clear of Apple's BlockMove
function--the MacOS equivalent of memcpy--and write my
own.
The easy way to write a fast memory copier is to move as
much data at a time as possible. 32-bit values are better
than 8-bit values. The problem with using 32-bit values
exclusively is that there are alignment issues. If the
source address isn't aligned on a four-byte boundary, it's
almost as bad as copying 8-bits at a time. BlockMove
contained logic to handle misaligned addresses, breaking
things into two steps: individual byte moves until the
source address was properly aligned, then 32-bit copies
from that point on. My plan was that if I always guaranteed
that the source and destination addresses were properly
aligned, then I could avoid all the special-case address
checks and have a simple loop reading and writing 32-bits
at a time.
(It was also possible to read and write 64-bit values,
even on the original PowerPC chips, using 64-bit floating
point registers. But even though this looked good on paper,
floating point loads and stores had a slightly longer
latency than integer loads and stores.)
I had written a very short, very concise aligned memory
copy function, one that clearly involved less code than
Apple's BlockMove.
Except that BlockMove was faster. Not just by a little,
but 30% faster for medium to large copies.
I eventually figured out the reason for this by
disassembling BlockMove. It was even more convoluted than I
expected in terms of handling alignment issues. It also had
a check for overlapping source and destination blocks--more
bloat from my point of view. But there was a nifty trick in
there that I never would have figured out on my own.
Let's say that a one megabyte block of data is being
copied from one place to another. During the copy loop the
data at the source and destination addresses is constantly
getting loaded into the cache, 32 bytes at a time (the size
of a cache line on early PowerPC chips), two megabytes of
cache loads in all.
If you think about this, there's one flaw: all the data
from the destination is loaded into the cache...and then
it's immediately overwritten by source data. BlockMove
contained code to align addresses to 32 byte cache lines,
then in the inner copy loop used a special instruction to
avoid loading the destination data, setting an entire cache
line to zeros instead. For every 32 bytes of data, my code
involved two cache line reads and one cache line write. The
clever engineer who wrote BlockMove removed one of these
reads, resulting in a 30% improvement over my code. This is
even though BlockMove was pages of alignment checks and
special cases, instead of my minimalist function.
There you go: one case where simpler was clearly better,
and one case where it wasn't.
Finally: Data Structure Constants in Erlang
Here's a simple Erlang function to enclose some text in
a styled paragraph, returning a deep list:
para(Text) ->
    ["<p class=\"normal\">", Text, "</p>"].
Prior to the new R12B release of Erlang, this little
function had some less than ideal behavior.
Every time para
was called, the two
constant lists (a.k.a. strings) were created, which
is to say that there weren't true data structure constants
in Erlang.
Each string was built-up, letter by letter, via a series
of BEAM virtual machine instructions. The larger the
constant, the more code there was to generate it, and the
more time it took to generate.
Because a new version of each string was created for
each para
call, there was absolutely no
sharing of data. If para
was called 200 times,
400 strings were created (200 of each). Remember, too, that
each element of a list/string in 32-bit Erlang is 8 bytes.
Doing the math on this example is mildly unsettling: 22
characters * 8 bytes per character * 200 instances = 35,200
bytes of "constant" string data.
As a result of more data being created, garbage
collection occurred more frequently and took longer
(because there was more live data).
Years ago, this problem was solved in HIPE, the standard
native code compiler for Erlang, by putting constant data
structures into a pool. Rather than using BEAM instructions
to build constants, each constant list or tuple or more
complex structure is simply a pointer into the pool. For
some applications, such as turning a tree into HTML, the
HIPE approach is a significant win.
And now, finally, as of the December 2007 R12B release,
true data structure constants are supported in BEAM.
Revisiting "Programming as if Performance
Mattered"
In 2004 I wrote Programming
as if Performance Mattered, which became one of my most
widely read articles. (If you haven't read it yet, go
ahead; the rest of this entry won't make a lot of sense
otherwise. Plus there are spoilers, something that
doesn't affect most tech articles.) In addition to all the
usual aggregator sites, it made Slashdot which resulted in a
flood of email, both complimentary and bitter. Most of
those who disagreed with me can be divided into two
groups.
The first group flat-out didn't get it. They lectured me
about how my results were an anomaly, that interpreted
languages are dog slow, and that performance comes from
hardcore devotion to low-level optimization. This is even
though my entire point was about avoiding knee-jerk
definitions of fast and slow. The mention of
game programming at the end was a particular sore point for
these people. "You obviously know nothing about writing
games," they raved, "or else you'd know that every line of
every game is carefully crafted for the utmost
performance." The amusing part is that I've spent almost my
entire professional career--and a fairly unprofessional
freelance stint before that--writing games.
The second group was more savvy. These people had
experience writing image decoders and knew that my timings,
from an absolute point of view, were nowhere near the
theoretical limit. I talked of decoding the sample image in
under 1/60th of a second, and they claimed significantly
better numbers. And they're completely correct. In most
cases 1/60th of a second is plenty fast for decoding an
image. But if a web page has 30 images on it, we're now up
to half a second just for the raw decoding time. Good C
code to do the same thing will win by a large margin. So
the members of this group, like the first, dismissed my
overall point.
What surprised me about the second group was the
assumption that my Erlang code is as fast as it could
possibly get, when in fact there are easy ways of speeding
it up.
First, just to keep the shock value high, I kept my code
in pure, interpreted Erlang. But there's a true compiler as
part of the standard Erlang distribution, and simply
compiling the tga
module will halve execution
time, if not decrease it by a larger factor.
Second, I completely ignored concurrent solutions, both
within the decoding of a single image and potentially
spinning each image into its own process. The latter
solution wouldn't improve execution time of my test case,
but could be a big win if many images are decoded.
Then there's perhaps the most obvious thing to do, the
first step when it comes to understanding the performance
of real code. Perhaps my detailed optimization account made
it appear that I had reached the end of the road, that no
more performance could be eked out of the Erlang code. In
any case, no one suggested profiling the code to see
if there are any obvious bottlenecks. And there is such a
bottleneck.
(There's one more issue too: in the end, the image
decoder was sped-up enough that it was executing below the
precision threshold of the wall clock timings of
timer:tc/3. I could go in and remove parts of
the decoder--obviously giving incorrect results--and still
get back the same timings of 15,000 microseconds. The key
point is that my reported timings were likely higher
than they really were.)
Here's the output of the eprof profiler on
tga:test_compressed()
:
FUNCTION                  CALLS   TIME
****** Process <0.46.0> -- 100 % of profiled time ***
tga:decode_rgb1/1         54329   78 %
lists:duplicate/3         11790    7 %
tga:reduce_rle_row/3       2878    3 %
tga:split/1                2878    3 %
tga:combine/1              2874    3 %
erlang:list_to_binary/1    1051    2 %
tga:expand/3               1995    1 %
tga:continue_rle_row/7     2709    1 %
lists:reverse/1             638    0
...
Sure enough, most of the execution time is spent in
decode_rgb1, which is part of decode_rgb. The final
version of this function last time around was this:
decode_rgb(Pixels) ->
    list_to_binary(decode_rgb1(binary_to_list(Pixels))).

decode_rgb1([255,0,255 | Rest]) ->
    [0,0,0,0 | decode_rgb1(Rest)];
decode_rgb1([R,G,B | Rest]) ->
    [R,G,B,255 | decode_rgb1(Rest)];
decode_rgb1([]) -> [].
This is short, but contrived. The binary blob of pixels
is turned into a list, then the new pixels are built-up in
reverse order as a list, and finally that list is reversed
and turned back into a binary. There are two reasons for
the contrivance. At the time, pattern matching was much
faster on lists than binaries, so it was quicker to turn
the pixels into a list up front (I timed it). Also,
repeatedly appending to a binary was a huge no-no, so it
was better to create a new list and turn it into a binary
at the end.
In Erlang R12B both of these issues have been addressed,
so decode_rgb
can be written in the
straightforward way, operating on binaries the whole
time:
decode_rgb(Pixels) -> decode_rgb(Pixels, <<>>).

decode_rgb(<<255,0,255, Rest/binary>>, Result) ->
    decode_rgb(Rest, <<Result/binary,0,0,0,0>>);
decode_rgb(<<R,G,B, Rest/binary>>, Result) ->
    decode_rgb(Rest, <<Result/binary,R,G,B,255>>);
decode_rgb(<<>>, Result) -> Result.
This eliminates the memory pressure caused by expanding
each byte of the binary to eight bytes (the cost of an
element in a list).
But we can do better with a small change to the
specification. Remember, decode_rgb
is a
translation from 24-bit to 32-bit pixels. When the initial
pixel is a magic number--255,0,255--the alpha channel of
the output is set to zero, indicating transparency. All
other pixels have the alpha set to 255, which is fully
opaque. If you look at the code, you'll see that the
255,0,255 pixels actually get turned into 0,0,0,0 instead
of 255,0,255,0. There's no real reason for that. In fact,
if we go with the simpler approach of only changing the
alpha value, then decode_rgb
can be written
in an amazingly clean way:
decode_rgb(Pixels) ->
    [<<R,G,B,(alpha(R,G,B))>> || <<R,G,B>> <= Pixels].

alpha(255, 0, 255) -> 0;
alpha(_, _, _) -> 255.
This version uses bitstring comprehensions, a new
feature added in Erlang R12B. It's hard to imagine writing
this with any less code.
(Also see the follow-up.)
Timings and the Punchline
I forgot two things in Revisiting
"Programming as if Performance Mattered": exact timings
of the different versions of the code and a punchline. I'll
do the timings first.
timer:tc
falls apart once code gets too
fast. A classic sign of this is running consecutive timings
and getting back a sequence of numbers like 15000, 31000,
31000, 15000. At this point you should write a loop to
execute the test function, say, 100 times, then divide the
total execution time by 100. This smooths out interruptions
for garbage collection, system processes, and so on.
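A minimal sketch of that kind of loop (average_tc and run_n are names I made up), timing the whole batch with timer:tc/3 and dividing at the end:

average_tc(M, F, A, N) ->
    {Micros, _} = timer:tc(?MODULE, run_n, [M, F, A, N]),
    Micros / N.

%% run_n/4 must be exported so timer:tc/3 can reach it.
run_n(_M, _F, _A, 0) -> ok;
run_n(M, F, A, N) ->
    apply(M, F, A),
    run_n(M, F, A, N - 1).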
And now the timings (lower is better). The TGA image
decoder with the clunky binary / list / binary
implementation of decode_rgb, on the same
sample image I used in 2004:
16,720 microseconds
(Yes, this is larger than the original 15,000 I
reported, because it's an average, not the result of one or
two runs.) The recursive version operating directly on
binaries:
18,700 microseconds
The ultra-slick version using binary comprehensions:
22,600 microseconds
I think the punchline is obvious at this point.
Were I using this module in production code, I'd do one
of three things. If I'm only decoding a handful of images
here and there, then this whole discussion is irrelevant.
The Erlang code is more than fast enough. If image decoding
is a huge bottleneck, I'd move the hotspot,
decode_rgb, into a small linked-in driver. Or,
and the cries of cheating may be justified here, I'd remove
decode_rgb
completely.
Remember, transparent pixel runs at the start and end
of each row are already detected elsewhere.
decode_rgb
blows up the runs in the middle
from 24-bit to 32-bit. At some point this needs to be done,
but it may just be that it doesn't need to happen at the
Erlang level at all. If the pixel data is passed off to
another non-Erlang process anyway, maybe for rendering or
for printing or some other operation, then there's no
reason the compressed 24-bit data can't be passed off
directly. That fits the style I've been using for this
whole module, of operating on compressed data without a
separate decompression step.
But now we're getting into useless territory: quibbling
over microseconds without any actual context. You can't
feel the difference between any of the optimized
versions of the code I presented last time, and so it
doesn't matter.
Would You Bet $100,000,000 on Your Pet Programming
Language?
Here's my proposition: I need an application developed
and if you can deliver it on time I'll pay you $100,000,000
(USD). It doesn't involve solving impossible problems, but
difficult and messy problems: yes.
What language can you use to write it? Doesn't matter to
me. It's perfectly fine to use multiple languages; I've got
no hangups about that. All that matters is that it gets
done and works.
As with any big project, the specs will undoubtedly
change along the way. I promise not to completely confuse
the direction of things with random requests. Could you add
an image editor with all the features of Photoshop, plus a
couple of enhancements? What about automatic translation
between Korean and Polish? 3D fuzzy llamas you can ride
around if network transfers take a long while? None of
that. But there are some more realistic things I could see
happening:
You need to handle data sets five times larger than
anticipated.
I also want this to run on some custom ARM-based
hardware, so be sure you can port to it.
Intel announced a 20 core chip, so the code needs to
scale up to that level of processing power.
And also...hang on, phone call.
Sadly, I just found out that Google is no longer
interested in buying my weblog, so I'm rescinding my
$100,000,000 offer. Sigh.
But imagine if the offer were true? Would you bet
a hundred million dollars on your pet language being up to
the task? And how would it change your criteria for judging
programming languages? Here's my view:
Libraries are much more important than core language
features. Cayenne may have dependent types (cool!), but
are there bindings for Flash file creation and a
native-look GUI? Is there a Rich Text Format parsing
library for D? What about fetching files via ftp from
Mercury? Do you really want to write an SVG decoder for
Clean?
Reliability and proven tools are even more important
than libraries. Has anyone ever attempted a similar
problem in Dolphin Smalltalk or Chicken Scheme or Wallaby
Haskell...er, I mean Haskell? Has anyone ever attempted a
problem of this scope at all in that language? Do
you know that the compiler won't get exponentially slower
when fed large programs? Can the profiler handle such large
programs? Do you know how to track down why small
variations in how a function is written result in bizarre
spikes in memory usage? Have some of the useful but still
experimental features been banged on by people working in a
production environment? Are the Windows versions of the
tools actually used by some of the core developers or is it
viewed as a second rate platform? Will native compilation
of a big project result in so much code that there's a
global slowdown (something actually true of mid-1990s
Erlang to C translators)?
You're more dependent on the decisions made by the
language implementers than you think. Sure, toy
textbook problems and tutorial examples always seem to work
out beautifully. But at some point you'll find yourself
dependent on some of the obscure corners of the compiler or
run-time system, some odd case that didn't matter at all
for the problem domain the language was created for, but
has a very large impact on what you're trying to do.
Say you've got a program that operates on a large set of
floating point values. Hundreds of megabytes of floating
point values. And then one day, your Objective Caml program
runs out of memory and dies. You were smart of course, and
knew that floating point numbers are boxed most of the time
in OCaml, causing them to be larger than necessary. But
arrays of floats are always unboxed, so that's what you
used for the big data structures. And you're still out of
memory. The problem is that "float" in OCaml means
"double." In C it would be a snap to switch from the 64-bit
double type to single precision 32-bit floats, instantly
saving hundreds of megabytes. Unfortunately, this is
something that was never considered important by the OCaml
implementers, so you've got to go in and mess with the
compiler to change it. I'm not picking on OCaml here; the
same issue applies to many languages with floating point
types.
A similar, but harder to fix, example is if you discover
that at a certain data set size, garbage collection crosses
the line from "only noticeable if you're paying attention"
to "bug report of the program going catatonic for a several
seconds." The garbage collector has already been carefully
optimized, and it uses multiple generations, but there's
always that point when the oldest generation needs to be
scavenged and you sit helplessly while half a gigabyte of
complex structures are traversed and copied. Can you fix
this? That a theoretically better garbage collection
methodology exists on paper somewhere isn't going to make
this problem vanish.
By now fans of all sorts of underdog programming
languages are lighting torches and collecting rotten fruit.
And, really, I'm not trying to put down specific languages.
When I'm on trial I can easily be accused of showing some
favor to Forth and having spent time tinkering with
J (which honestly
does look like line noise in a way that would blow
the minds of critics who level that charge against Perl).
Yes, I'm a recovering language dilettante.
Real projects with tangible rewards do change my
perceptions, however. With a $100,000,000 carrot hanging in
front of me, I'd be looking solely at the real issues
involved with the problem. Purely academic research
projects immediately look ridiculous and scary. I'd become
very open to writing key parts of an application in C,
because that puts the final say on overall data sizes back
in my control, instead of finding out much later that the
language system designer made choices about tagging and
alignment and garbage collection that are at odds with my
end goals. Python and Erlang get immediate boosts for
having been used in large commercial projects, though each
clearly has different strengths and weaknesses; I'd be
worried about both of them if I needed to support some odd,
non-UNIXy embedded hardware.
What would you do? And if a hundred million dollars
changes your approach to getting things done in a quick and
reliable fashion, then why isn't it your standard
approach?
Functional Programming Archaeology
John
Backus's Turing Award Lecture from 1977, Can
Programming be Liberated from the Von Neumann Style?
(warning: large PDF) was a key event in the history of
functional programming. All of the ideas in the paper by no
means originated with Backus, and Dijkstra publicly
criticized it for being poorly thought through, but it
did spur interest in functional programming research which
eventually led to languages such as Haskell. And the paper
is historically interesting as the crystallization of the
beliefs about the benefits of functional programming at the
time. There are two which jump out at me.
The first is concurrency as a primary motivation. If a
program is just a series of side effect-free expressions,
then there's no requirement that programs be executed
sequentially. In a function call like this:
f(ExpressionA, ExpressionB, ExpressionC)
the three expressions have no interdependencies and can
be executed in parallel. This could, in theory, apply all
the way down to pieces of expressions. In this snippet of
code:
(a + b) * (c + d)
the two additions could be performed at the same time.
This fine-grained concurrency was seen as a key benefit of
purely functional programming languages, but it fizzled,
both because of the difficulty in determining how to
parallelize programs efficiently and because it was a poor
match for monolithic CPUs.
The second belief which has dropped off the radar since
1977 is the concept of an algebra of programs. Take this
simple C expression:
!x
Assuming x
is a truth value--either 0 or
1--then !x
gives the same result as these
expressions:
1 - x
x ^ 1
(x + 1) & 1
If the last of these appeared in code, then it could be
mechanically translated to one of the simpler equivalents.
Going further, you could imagine an interactive tool that
would allow substitution of equivalent expressions, maybe
even pointing out expressions that can be simplified.
Now in C this isn't all that useful. And in Erlang or
Haskell it's not all that useful either, unless you avoid
writing explicitly recursive functions with named values
and instead express programs as a series of canned
manipulations. This is the so-called point-free
style which has a reputation for density to the point
of opaqueness.
In Haskell code, point-free style is common, but not
aggressively so. Rather than trying to work out a way to
express a computation as the application of existing
primitives, it's usually easier to write an explicitly
recursive function. Haskell programmers aren't taught to
lean on core primitive functions wherever possible, and
core primitive functions weren't necessarily designed with
that goal in mind. Sure, there's the usual map
and fold
and so on, but not a set of functions
that would allow 90% of all programs to be expressed as
application of those primitives.
Can Programming be Liberated... introduced
fp, a language which didn't catch on and left very
little in the way of tutorials or useful programming
examples. fp was clearly influenced by Ken Iverson's APL, a
language initially defined in 1962 (and unlike fp, you can
still hunt down production code written in APL). The APL
lineage continued after Backus's paper, eventually leading
to APL2 and J (both of which involved Iverson) and a second
branch of languages created by a friend of Iverson, Arthur
Whitney: A+, K, and Q. Viewed in the right light, J is a
melding of APL and fp. And the "build a program using core
primitives" technique lives on in J.
Here's a simple problem: given an array (or list, if you
prefer), return the indices of values which are greater
than 5. For example, this input:
1 2 0 6 8 3 9
gives this result:
3 4 6
which means that the elements in the original array at
positions 3, 4, and 6 (where the first position is zero,
not one) are all greater than 5. I'm using the APL/J/K list
notation here, instead of the Haskelly
[3,4,6]
. How can we transform the original
array to 3 4 6
without explicit loops,
recursion, or named values?
First, we can find out which elements in the input list
are greater than 5. This doesn't give us their positions,
but it's a start.
1 2 0 6 8 3 9 > 5
0 0 0 1 1 0 1
The first line is the input, the second the output.
Greater than, like most J functions, operates on whole
arrays, kind of like all operators in Haskell having
map
built in. The above example checks if each
element of the input array is greater than 5 and returns an
array of the results (0 = false, 1 = true).
There's another J primitive that builds a list of values
from 0 up to n-1:
i. 5
0 1 2 3 4
Yes, extreme terseness is characteristic of J--just let
it go for now. One interesting thing we can do with our
original input is to build up a list of integers as long as
the array.
i. # 1 2 0 6 8 3 9
0 1 2 3 4 5 6
(#
is the length function.) Stare at this
for a moment, and you'll see that the result is a list of
the valid indices for the input array. So far we've got two
different arrays created from the same input: 0 0 0 1
1 0 1
(where a 1 means "greater than 5") and 0
1 2 3 4 5 6
(the list of indices for the array). Now
we take a bit of a leap. Pair these two arrays together:
(first element of the first array, first element of the
second array), etc., like this:
(0,0) (0,1) (0,2) (1,3) (1,4) (0,5) (1,6)
This isn't J notation; it's just a way of showing the
pairings. Notice that if you remove all pairs that have a
zero in the first position, then only three pairs are left.
And the second elements of those pairs make up the answer
we're looking for: 3 4 6
. It turns out that J
has an operator for pairing up arrays like this, where the
first element is a count and the second is a value to
repeat count times. Sort of a run-length expander.
The key is that a count of zero can be viewed as "delete
me" and a count of 1 as "copy me as is." Or in actual J
code:
0 0 0 1 1 0 1 # 0 1 2 3 4 5 6
3 4 6
And there's our answer--finally! (Note that
#
in this case, with an operand on each side
of it, is the "expand" function.) If you're ever going to
teach a beginning programming course, go ahead and learn J
first, so you can remember what it's like to be an utterly
confused beginner.
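For comparison, here's a rough Erlang translation of the same pipeline (my sketch, not from the article): pair each value with its index, keep the pairs whose value exceeds 5, and return the indices.
indices_greater_than_5(List) ->
    Indexed = lists:zip(lists:seq(0, length(List) - 1), List),   % [{Index, Value}, ...]
    [Index || {Index, Value} <- Indexed, Value > 5].
%% indices_greater_than_5([1,2,0,6,8,3,9]) returns [3,4,6]
It's the same shape as the J version, just with the pairing and filtering spelled out.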
In the APL/J/K worlds, there's a collection of
well-known phrases (that is, short sequences of
functions) for operations like this, each made up of
primitives. It's the community of programmers with the most
experience working in a point-free style. Though I doubt
those programmers consider themselves to be working with
"an algebra of programs," as Backus envisioned, the
documentation is sprinkled with snippets of code declared
to be equivalent to primitives or other sequences of
functions.
Why Garbage Collection Paranoia is Still (sometimes)
Justified
"As new code was compiled, older code (and other
memory used by the compiler) was orphaned, eventually
causing the PC to run low on free memory. A slow garbage
collection process would automatically occur when available
memory became sufficiently low, and the compiler would be
unresponsive until the process had completed, sometimes
taking as long as 15 minutes."
—Naughty Dog's Jak and Daxter
post-mortem
I know the title will bait people who won't actually
read any of this article, so I'll say it right up front to
make them feel the error of their reactionary ways: I am
pro garbage collection. It has nothing to do with manual
memory management supposedly being too hard (good grief
no). What I like is that it stops me from thinking about
trivial usage of memory at all. If it would be more
convenient to briefly have a data structure in a different
format, I just create a new version transformed the way I
want it. Manually allocating and freeing these
insignificant bits of memory is just busywork.
That's hardly a bold opinion in 2008. There are more
programming languages in popular use with garbage
collection than there are without. Most of the past
paranoia about garbage collection slowness and pauses has
been set aside in favor of increased productivity.
Computers have gotten much faster. Garbage collectors have
gotten better. But those old fears are still valid, those
hitches and pauses still lurking, and not just in the same
vague way that some people like to assume that integer
division is dog slow even on a 3GHz processor. In fact,
they apply to every garbage collected language
implementation in existence. Or more formally:
In any garbage collector, there exists some
pathological case where the responsiveness of your program
will be compromised.
"Responsiveness" only matters for interactive
applications or any program that's vaguely real-time. In a
rocket engine monitoring system, responsiveness may mean
"on the order of a few microseconds." In a robotic probe
used for surgery, it might be "on the order of four
milliseconds." For a desktop application, it might be in
the realm of one to two seconds; beyond that, users will be
shaking the mouse in frustration.
Now about the "pathological case." This is easy to
prove. In a garbage collector, performance is always
directly proportional to something. It might be
the total number of memory allocations. It might be the amount
of live data. It might be something else. For the sake of
discussion let's assume it's the amount of live data.
Collection times might be acceptable for 10MB of live data,
maybe even 100MB, but you can always come up with larger
numbers: 250MB...or 2GB. Or in a couple of years, 20GB. No
matter what you do, at some point the garbage collector is
going to end up churning through those 250MB or 2GB or 20GB
of data, and you're going to feel it.
Ah, but what about generational collectors? They're
based on the observation that most objects are short lived,
so memory is divided into a nursery for new allocations and
a separate larger pool for older data (or even a third pool
for grandfatherly data). When the nursery is full, live
data is promoted to the larger pool. These fairly cheap
nursery collections keep happening, and that big, secondary
pool fills up a little more each time. And then, somewhere,
sometime, the old generation fills up, all 200MB of it.
This scheme has simply delayed the inevitable. The monster,
full-memory collection is still there, waiting for when it
will strike.
What about real time garbage collection? More and more,
I'm starting to see this as a twist on the myth of the
Sufficiently
Smart Compiler. If you view "real time" as "well
engineered and fast," then it applies to most collectors in
use, and they each still have some point, somewhere down
the road, at which the pretense of being real time falls
apart. The other interpretation of real time is some form
of incremental collection, where a little bit of GC happens
here, a little bit there, and there's never a big, painful
pause.
An interesting question is this: What language systems
in existence are using a true incremental or concurrent
garbage collector? I know of three: Java, Objective C 2.0
(which just shipped with OS X Leopard), and the .net
runtime. Not Haskell. Not Erlang. Not Objective Caml [EDIT:
The OCaml collector for the second generation is incremental].
Not any version of Lisp or Scheme. Not Smalltalk. Not Ruby.
That raises a lot of questions. Clearly incremental and
concurrent collection aren't magic bullets or they'd be a
standard part of language implementations. Is it that the
additional overhead of concurrent collection is only
worthwhile in imperative languages with lots of frequently
modified, cross-linked data? I don't know.
Incremental collection is a trickier problem than it
sounds. You can't just look at an individual object and
decide to copy or free it. In order to know if a data
object is live or not, you've got to scan the rest of the
world. The incremental collectors I'm familiar with work
that way: they involve a full, non-incremental marking
phase, and then copying and compaction are spread out over
time. This means that the expense of such a collector is
proportional to the amount of data that must be scanned
during the marking phase and as such has a lurking
pathological case.
Does knowing that garbage collectors break down at some
point mean we should live in fear of them and go back to
manual heap management? Of course not. But it does mean
that some careful thought is still required when it comes
to dealing with very large data sets in garbage collected
languages.
Next time: A look at how garbage collection works in Erlang. The
lurking monster is still there, but there are some
interesting ways of delaying his attack.
Garbage Collection in Erlang
Given its "soft real time" label, I expected Erlang to
use some fancy incremental garbage collection approach. And
indeed, such an approach exists,
but it's slower than traditional GC in practice (because it
touches the entire heap, not just the live data). In
reality, garbage collection in Erlang is fairly vanilla.
Processes start out using a straightforward compacting
collector. If a process gets large, it is automatically
switched over to a generational scheme. The generational
collector is simpler than in some languages, because
there's no way to have an older generation pointing to data
in a younger generation (remember, you can't destructively
modify a list or tuple in Erlang).
The key is that garbage collection in Erlang is per
process. A system may have tens of thousands of
processes, using a gigabyte of memory overall, but if GC
occurs in a process with a 20K heap, then the collector
only touches that 20K and collection time is imperceptible.
With lots of small processes, you can think of this as a
truly incremental collector. But there's still a lurking
worst case in Erlang: What if all of those processes run
out of memory more or less in the same wall-clock moment?
And there's nothing preventing an application from using
one massive process (such is the case with the Wings 3D modeller).
Per-process GC allows a slick technique that can
completely prevent garbage collection in some
circumstances. Using spawn_opt
instead of the
more common spawn
, you can specify the initial
heap size for a process. If you know, as discovered through
profiling, that a process rapidly grows up to 200K and then
terminates, you can give that process an initial heap size
of 200K. Data keeps getting added to the end of the heap,
and then before garbage collection kicks in, the process
heap is deleted and its contents are never scanned.
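A minimal sketch of that technique (the heap figure below is invented; in practice it comes from the profiling just described, and note that min_heap_size is measured in words, not bytes):
start_burst_worker(Fun) ->
    spawn_opt(Fun, [{min_heap_size, 200 * 1024}]).   % pre-size the heap so GC never runs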
The other pragmatic approach to reducing the cost of
garbage collection in Erlang is that lots of data is kept
outside of the per-process heaps:
Binaries > 64 bytes. Large binaries are
allocated in a separate heap outside the scope of a
process. Binaries can't, by definition, contain pointers to
other data, so they're reference counted. If there's a 50MB
binary loaded, it's guaranteed never to be copied as part
of garbage collection.
Data stored in ETS tables. When you look up a key
in an ETS table, the data associated with that key is
copied into the heap of the process the request originated
from. For structurally large values (say, a tuple of 500
elements) the copy from ETS table space to the process heap
may become expensive, but if there's 100MB of total data in
a table, there's no risk of all that data being scanned at
once by a garbage collector (see the sketch after this list).
Data structure constants. This is new in Erlang: constant terms (literals) in compiled code are kept in a shared pool rather than built on each process's heap.
Atom names. Atom name strings are stored in a
separate data area and are not garbage collected. In Lisp,
it's common for symbol names to be stored on the main heap,
which adds to garbage collection time. But that also means
that dynamically creating symbols in Lisp is a reasonable
approach to some problems, but it's not something you want
to do in Erlang.
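Here's a small sketch of the ETS copy-on-lookup behavior mentioned above (table name and contents are invented). Only the value stored under the requested key is copied onto the caller's heap; the rest of the table never enters any process's garbage collection.
demo_ets_copy() ->
    Table = ets:new(demo_table, [set]),
    true = ets:insert(Table, {big_value, lists:seq(1, 500)}),
    [{big_value, Copy}] = ets:lookup(Table, big_value),   % Copy now lives on this process's heap
    length(Copy).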
Don't Structure Data All The Way Down
Let's write some functions to operate on circles, where
a circle is a defined by a 2D center point and a radius. In
Erlang we've got some options for how to represent a
circle:
{X, Y, R} % raw tuple
{circle, X, Y, R} % tagged tuple
#circle{x = X, y = Y, r = R} % icky record
Hmmm...why is a circle represented as a structure, but a
point is unwrapped, so to speak, into two values? Attempt
#2:
{{X,Y}, R} % raw tuples
{circle, {point,X,Y}, R} % tagged tuples
... % gonna stop with records
Now let's write a function to compute the area of a
circle, using this new representation:
area({circle, {point,_X,_Y}, R}) ->
math:pi() * R * R.
Simple enough. But take a few steps back and look at this.
First, we're not actually making use of the structure of
the data in area
. We're just
destructuring it to get the radius. And to do that
destructuring, there's a bunch of code generated for this
function: to verify the parameter is a tuple of size 3, to
verify that the first element is the atom
circle
, to verify the second element is a
tuple of size 3 with the atom point
as the
first element, and to extract the radius. Then there's a
trivial bit of math and we've got an answer.
Now suppose we want to find the area of a circle of
radius 17.4. We've got a nice function all set to go...sort
of. We need the radius to be part of a circle, so we could
try this:
area({circle, {point,0,0}, 17.4})
Kind of messy. What about a function to build a circle
for us? Then we could do this:
area(make_circle(0, 0, 17.4))
We could also have a shorter version of
make_circle
that only takes a radius,
defaulting the center point to 0,0. Okay, stop, we're
engineering ourselves to death. All we need is a simple
function to compute the area of a circle:
area(R) ->
math:pi() * R * R.
Resist the urge to wrap it into an abstract data type or
an object. Keep it raw and unstructured and simple. If you
want structure, add it one layer up, don't make it part of
the foundation. In fact, I'd go so far as to say that if
you pass a record-like data structure to a function and
any of the elements in that structure aren't being
used, then you should be operating on a simpler set of
values and not a data structure. Keep the data flow
obvious.
Back to the Basics of Functional Programming
I have been accused of taking the long way around to
obvious conclusions. Fair enough. But
to me it's not the conclusion so much as tracking the path
that leads there, so perhaps I need to be more verbose and
not go for a minimalist writing style. We shall see.
The modern functional programming world can be a
daunting place. All this talk of the lambda calculus.
Monads. A peculiar obsession with currying, even though it
is really little more than a special case shortcut that
saves a bit of finger typing at the expense of being hard
to explain. And type systems. I'm going to remain neutral
on the static vs. dynamic typing argument, but there's no
denying that papers on type systems tend to be hardcore
reading.
Functional programming is actually a whole lot simpler
than any of this lets on. It's as if the theoreticians
figured out functional programming long ago, and needed to
come up with new twists to keep themselves amused and to
keep the field challenging and mysterious. So where did
functional programming come from? I won't even try to give
a definitive history, but I can see the path that led to it
looking like a good idea.
When I first learned Pascal (the only languages I knew
previously were BASIC and 6502 assembly), there was a
fixation with parameter passing in the textbooks I read and
classes I took. In a procedure heading like this:
function max(a: integer; b: integer): integer;
"a" and "b" are formal parameters. If called with
max(1,2)
, then 1 and 2 are the actual
parameters. All very silly, and one of those cases where
the trouble of additional terminology takes something
mindlessly simple and makes it cumbersome. Half of my high
school programming class was hung up on this for a good two
weeks.
But then there's more: parameters can be passed by value
or by reference. As in C, you can pass a structure by
value, even if that structure is 10K in size, and the
entire structure will be copied to the stack. And that's
usually not a good idea, so by reference is the preferred
method in this case... except that data passed by reference
might be changed behind the scenes by any function you pass
it to. Later languages, such as Ada, got all fancy with
multiple types of "by reference" parameters: parameters
that were read-only, parameters that were write-only (that
is, were assumed to be overwritten by a function), and
parameters that could be both read from and written to. All
that extra syntax just to reduce the number of cases where
a parameter could be stomped all over by a function,
causing a global side effect.
One thing Wirth got completely right in Pascal is that
"by reference" parameters don't turn into pointers at the
language level. They're the same as the references that
eventually made it into C++. Introduce full pointers into a
language, especially with pointer arithmetic, and now
things are really scary. Not only can data structures be
modified by any function via reference parameters, but any
piece of code can potentially reach out into random data
space and tromp other variables in the system. And
data structures can contain pointers into other data
structures and all bets are off at that point. Any small
snippet of code involving pointers can completely change
the state of the program, and there's no compile-time
analysis that can keep things under control.
There's a simple way out of the situation: Don't allow
functions to modify data at all. With that rule in
place, it makes no difference if parameters are passed by
value or by reference, so the compiler can use whatever is
most efficient (usually by value for atomic, primitive
types and by reference for structured types). Rather
shockingly, this works. It's theoretically possible to
write any program without modifying data.
The problem here is how to program in a purely
functional manner, and this has gotten surprisingly short
shrift in the functional programming community. Yes, types
provide more information about intent and can be used to
catch a certain class of errors at compile time.
Higher-order functions are convenient. Currying is a neat
trick. Monads allow I/O and other real-world nastiness to
fit into a functional framework. Oh does mergesort look
pretty in Haskell. I shudder to think of how tedious it was
operating on binary trees in Pascal, yet the Erlang version
is breathtakingly trivial.
But ask someone how to write Pac-Man--to choose a
hopelessly dated video game--in a purely functional manner.
Pac-Man affects the ghosts and the ghosts affect Pac-Man;
can most newcomers to FP puzzle out how to do this without
destructive updates? Or take just about any large, complex
C++ program for that matter. It's doable, but requires
techniques that aren't well documented, and it's not like
there are many large functional programs that can be used
as examples (especially if you remove compilers for
functional programming languages from consideration).
Monads, types, currying... they're useful, but, in a way,
dodges. The most basic principle of writing code without
destructive updates is the tricky part.
Five Memorable Books About Programming
I've read the classics--Structure and Interpretation
of Computer Programs, Paradigms of Artificial
Intelligence Programming--but I'd like to highlight
some of the more esoteric books which affected my
thinking.
Zen of Assembly Language
Michael Abrash, 1990
I spent much of the 1980s writing 8-bit computer games
(which you can read
about if you like). Odd as it may seem in retrospect,
considering the relative power of an 8-bit 6502 running at
sub 2 MHz, I wasn't obsessed with optimizing for
performance. If anything, I wanted the code to be
small, a side effect of using a line-based editor
and writing programs that someone would have to
painstakingly type in from a magazine listing. Pages of
DATA 0AFF6CA900004021... ugh.
Right when that period of my life came to a close, along
came Zen of Assembly Language, by an author I had
never heard of, which dissected, explained, and extended
the self-taught tricks from my years as a lone assembly
hacker. Even though Abrash was focused on the 8086 and not
the 6502, it felt like the book was written personally to
me.
This is also one of the most bizarrely delayed technical
book releases I can recall. The majority of the book was
about detailed optimization for the 8088 and 8086, yet it
was published when the 80486 was showing up in high-end
desktop PCs.
Scientific Forth
Julian Noble, 1992
Fractals, Visualization, and J
Clifford Reiter, 2000
These two books are about entirely different subjects.
One is about pure scientific computation. The other is
about generating images. Each uses a different, wildly
non-mainstream language for the included code.
And yet these two books follow the same general
approach, one I wish were more commonly used.
Superficially, the authors have written introductions to
particular programming languages (which is why the language
name is in the title). But in reality it's more that each
author has an area of deep expertise and has found a
language that enables experimenting with and writing code
to solve problems in that field. As such, there aren't
forced examples and toy problems, but serious, non-trivial
programs that show a language in actual use. Dr. Noble
demonstrates how he uses Forth for hardcore matrix work
and, when he realizes that RPN notation isn't ideal in all
circumstances, develops a translator from infix expressions
to Forth. Clifford Reiter jumps into image processing
algorithms, plus veers into lighter weight diversions with
titles like "R/S Analysis, the Hurst Exponent, and
Sunspots."
Both books are wonderful alternatives to the usual
"Learning Programing Language of the Month" texts. Sadly,
Julian Noble died in 2007.
Programmers At Work
Susan Lammers, 1986
I used to soak up printed interviews with programmers
and game designers (and typically game designers were
programmers as well). I was enthralled by Levy's
Hackers, more the game development chapter than the
rest. Programmers At Work was in the same vein:
philosophies, ideas, and experiences directly from an odd
mix of famous and quirky programmers. But the book wasn't
primarily about tech. It was about creativity. Most of the
people interviewed didn't have degrees in computer science.
There wasn't an emphasis on math, proving programs correct,
lambda calculus--just people coming up with ideas and
implementing them. And the game connection was there: Jaron
Lanier talking about the psychedelic Moon Dust for
the Commodore 64, Bill Budge's bold (and still unfulfilled)
plan to build a "Construction Set Construction Set."
Lammers's book was the model I used when I put together
Halcyon Days. I
also pulled Hackers into the mix by interviewing
John Harris about his dissatisfaction with Levy's
presentation of him. In an odd twist of fate,
Programmers at Work and Halcyon Days were
packaged together on a single CD sold through the Dr.
Dobb's Journal library. The pairing has been around for ten
years and is still available, much to my surprise.
Thinking Forth:
A Language and Philosophy for Solving Problems
Leo Brodie, 1984
Yes, another book about Forth.
But this one is worth reading less for the Forth and
more because it's one of the few books about how to
decompose problems and structure code. You'd think this
book was written by Fowler and friends, until you realize
it's from the mid-1980s. That Brodie uses "factor" (which
originated in the Forth community) instead of "refactor" is
also a giveaway. What's impressive here is there's no OOP,
no discussion of patterns, no heavy terminology. It's a
book about understanding what you're trying to achieve,
avoiding redundancy, and writing dead simple code.
It's worth it for the Forth, too, especially the
interspersed bits of wisdom from experts, including Forth
creator Chuck Moore.
In Praise of Non-Alphanumeric Identifiers
Here's a common definition of what constitutes a valid
identifier in many programming languages:
The first character must be any letter (A-Z, a-z) or
an underscore. Subsequent characters, if any, must be a
letter, digit (0-9), or an underscore.
Simple enough. It applies to C, C++, ML, Python, most
BASICs, most custom scripting languages (e.g., Game Maker
Language). But of course there's no reason for this
convention other than being familiar and expected.
One of my favorite non-alphanumeric characters for
function names is "?". Why say is_uppercase
(or IsUppercase
or isUppercase
)
when you can use the more straightforward
Uppercase?
instead? That's standard practice
in Scheme and Forth, and I'm surprised it hasn't caught on
in all new languages.
(As an aside, in Erlang you can use any atom as a
function name. You can put non-alphanumeric characters in
an atom if you remember to surround the entire name with
single quotes. It really does work to have a function named
'uppercase?'
though the quotes make it
clunky.)
Scheme's "!" is another good example. It's not mnemonic,
and it doesn't carry the same meaning as in English
punctuation. Instead it was arbitrarily designated a visual
tag for "this function destructively updates data":
set!
, vector-set!
. That's more
concise than any other notation I can think of ("-m" for
"mutates"? Yuck).
Forth goes much further, not only allowing any ASCII
character in identifiers, but there's a long history of
lexicographic conventions. The fetch and store words--"@"
and "!"--are commonly appended to names, so
color@
is read as "color fetch." That's a nice
alternative to "get" and "set" prefixes. The Forthish
#strings
beats "numStrings" any day. Another
Forth standard is including parentheses in a name, as in
(open-file)
, to indicate that a word is
low-level and for internal use only.
And then there are clever uses of characters in Forth
that make related words look related, like this:
open{ write-byte write-string etc. }close
The brace is part of both open{
and
}close
. There's no reason the braces couldn't be
dropped completely, but they provide a visual cue about
scope.
Slumming with BASIC Programmers
I'm a registered user of BlitzMax, an extended
BASIC-variant primarily targeted at people who want to
write games. It's also the easiest way I've run across to
deal with graphics, sound, and user input in a completely
cross platform way, and that's why I use it. Every program
I've written works perfectly under both OS X on my MacBook
and Windows XP on my desktop; all it takes is a quick
recompile. The same thing is possible with packages like
SDL, but that involves
manually fussing with C compiler set-ups. BlitzMax is so
much more pleasant.
But still, it's BASIC. BASIC with garbage collection and
OOP and Unicode strings, but BASIC nonetheless. It doesn't
take long, reading through the BlitzMax community forums,
to see that the average user has a shallower depth of
programming experience than programmers who know Lisp or
Erlang or Ruby. Or C for that matter. There are
misconceptions about how data types work, superstitions
involving performance that are right up there with music
CDs colored with green marker, paranoia about recursion.
The discussions about OOP, and the endearing notion that
sealing every little bit of code and data into objects is
somehow inherently right, feel like a rewind to 15
or more years ago. In Erlang or Python, to report the
collision of two entities, I'd simply use:
{collision, EntityA, EntityB}
forgetting that this can be made more complex by
defining a CollisionResult class with half a dozen methods
for picking out the details of the obvious.
(For an even better example of this sort of time warp,
consider PowerBASIC,
which touts a keyword for indicating that a variable should
be kept in a CPU register. It's C's register
variables all over again, the difference being that this is
still considered an important optimization in the
PowerBASIC world.)
By this point I'm sure I've offended most of the
BlitzMax and PowerBASIC users out there, and to everyone
else it looks like I'm gleefully making fun of Blub
programmers. I may be looking down my snooty language
dilettante nose at them, yes, but from a get-things-done
productivity point of view I'm seriously impressed. There's
a continuous stream of games written in BlitzMax, games
that are more than just retro-remakes of 8-bit relics, from
people with very little programming experience. There are
games with physics, games with elaborate effects, 3D games
written with libraries designed for 2D, games written over
the course of spring break. I'm withholding judgement on
the raw aesthetics and playability of these games; the
impressive part is that they exist and are largely
written by people with minimal programming background.
(I've downloaded more than one BlitzMax game that came with
source code and marveled at how the entire core of the
project was contained in a single 1000+ line function.)
Compare this with any hipper, purportedly more
expressive language. I've written before about how you can count the number of
games written in a purely functional style on one hand. Is
it that language tinkerers are less concerned about writing
real applications? That they know you can solve any problem
with focused grunt work, but it's not interesting to them?
That the spark and newness of a different language is its
own reward? Whatever the reason, the BASIC programmers win when it
comes down to getting projects finished.
My Road to Erlang
I had three or four job offers my last semester of
college, all of them with telecom companies just north of
Dallas. I ended up working for Ericsson in the early
1990s.
Now if you're expecting me to talk about how I hung
around with the brilliant folks who developed
Erlang...don't. Ericsson's telephone exchanges were
programmed in a custom, baroque language called PLEX.
Syntactically it was a cross between Fortran and a macro
assembler. You couldn't pass parameters to functions, for
example; you assigned them to variables instead, much like
the limited GOSUB of an 8-bit BASIC. The advantage was that
there was a clean one-to-one correspondence between the
source code and the generated assembly code, a necessity
when it came to writing patches for live systems where
taking them down for maintenance was Very Bad Indeed.
The other thing worth mentioning about PLEX and
Ericsson's hardware of the time is that they were custom
designed for large scale message passing concurrency. The
fact that hardware created in the 1970s and 1980s was built
to handle tens of thousands of processes certainly makes
dual core CPUs seem a bit late to the party.
Ericsson had a habit of periodically sending employees
to the mothership in Sweden, and after one such trip my
office mate brought back a sheet of paper with a short,
completely unintelligible to me, Erlang program on it. In
three years at Ericsson, my total exposure to Erlang was
about thirty seconds. I left shortly after that, getting
back into game development.
In 1998 I started looking at very high level programming
languages, because I was in a rut and getting behind the
times. Most of my experience was in various assembly
languages (6502, 68000, 8086, SH2, PowerPC), C, Pascal,
plus some oddities like Forth. The only modern language I
was familiar with was Perl, so like everyone else at the
time I could write CGI scripts. I wanted to leapfrog ahead,
to get myself out of the low-level world, so I looked to
functional programming (covered a bit more in the first
part of Admitting that Functional
Programming Can Be Awkward).
I worked through online tutorials for three languages:
Standard ML, OCaml, and Haskell. I had fun with them,
especially OCaml, but there were two glaring issues that
held me back.
The first was that the tutorials were self-absorbed in
the accoutrements of functional programming: type systems,
fancy ways of using types for generic programming, lambda
calculus tricks like currying. The tutorials for all three
languages were surprisingly similar. The examples were
either trivial or geared toward writing compilers. At the
time I was interested in complex, interactive
programs--video games--but I didn't have a clue about how
to structure even the simplest of games in Haskell. There
were a few trivial games written in OCaml, but they made
heavy use of imperative features which made me wonder what
the point was.
The second issue was that I was used to working on
commercial products, and there was little evidence at the
time that Standard ML, OCaml, or Haskell was up to the
task. Would they scale up to programs orders of magnitude
larger than class assignments? And more critically, would
functional programming scale up? Would I hit a point when
the garbage collector crossed the line from imperceptible
to perceptible? Would there be anything I could possibly do
if that happened? Would lazy evaluation become too
difficult to reason about? There was also the worry that
Windows seemed to be a "barely there" platform in the eyes
of all three language maintainers. The OCaml interpreter
had a beautiful interactive shell under MacOS, but the
Windows version was--or should have been--a great
embarrassment to everyone involved.
Somewhere in this period I also took a hard look at Lisp
(I was one of the first registered users of Corman Lisp) and Erlang.
Erlang wasn't open source yet and required a license for
commercial use. The evaluation version still used the old
JAM runtime instead of the more modern BEAM and was dog
slow. It also had the same dated, cold, industrial feeling
of the systems I used when at Ericsson. I put it aside and
kept tinkering elsewhere.
But I came back when the move to open source occurred.
Here's why:
I found I had an easier time writing programs in
Erlang. I was focused entirely on the mysterious
concept of writing code that didn't involve destructive
updates. I wasn't distracted by type systems or complex
module and class systems. The lack of "let" and "where"
clauses in Erlang makes code easier to structure, without
the creeping indentation caused by numerous levels of
scope. Atoms are a beautiful data type to work with.
That the tools had been used to ship large-scale
commercial products gave me faith in them. Yes, they
are cold and industrial, and I started seeing that
as a good thing. The warts and quirks are there for a
reason: because they were needed in order to get a project
out the door. I'll take pragmatism over idealism any
day.
Besides being useful in its own right, concurrency is
a good pressure valve for functional programming. Too
difficult to puzzle out how to make a large functional
program work? Break it into multiple smaller programs that
communicate.
Speed was much improved. The BEAM runtime is 3x
faster than JAM. When I started looking at different
languages, I was obsessed with performance, but I
eventually realized I was limiting my options. I could
always go back to the madness of writing assembly code if I
cared that much. Flexibility mattered more. The 3x
improvement pushed Erlang from "kinda slow" to "good
enough."
I've been using Erlang since 1999, but I hardly think of
myself as a fanatic. I still use Perl, Python, C++, with
occasional forays into REBOL, Lua, and Forth, plus some
other special purpose languages. They all have strengths
and weaknesses. But for most difficult problems I run into,
Erlang is my first choice.
Purely Functional Retrogames, Part 1
When I started looking into functional languages in
1998, I had just come off a series of projects writing
video games for underpowered hardware: Super Nintendo, SEGA
Saturn, early PowerPC-based Macintoshes without any
graphics acceleration. My benchmark for usefulness was "Can
a programming language be used to write complex,
performance intensive video games?"
After working through basic tutorials, and coming to
grips with the lack of destructive updates, I started
thinking about how to write trivial games, like Pac-Man or
Defender, in a purely functional manner. Then I realized
that it wasn't performance that was the issue, it was much
more fundamental.
I had no idea how to structure the most trivial of
games without using destructive updates.
Pac-Man is dead simple in any language that fits the
same general model as C. There are a bunch of globals
representing the position of Pac-Man, the score, the level,
and so on. Ghost information is stored in a short array of
structures. Then there's an array representing the maze,
where each element is either a piece of the maze or a dot.
If Pac-Man eats a dot, the maze array is updated. If
Pac-Man hits a blue ghost, that ghost's structure is
updated to reflect a new state. There were dozens and
dozens of Pac-Man clones in the early 1980s, including tiny
versions that you could type in from a magazine.
In a purely functional language, none of this works. If
Pac-Man eats a dot, the maze can't be directly updated. If
Pac-Man hits a blue ghost, there's no way to directly
change the state of the ghost. How could this possibly
work?
That was a long time ago, and I've spent enough time
with functional languages to have figured out how to
implement non-trivial, interactive applications like video
games. My plan is to cover this information in a short
series of entries. I'm sticking with 8-bit retrogames
because they're simple and everyone knows what Pac-Man
looks like. I don't want to use abstract examples involving
hypothetical game designs. I'm also sticking with purely
functional programming language features, because that's
the challenge. I know that ML has references and that
processes in Erlang can be used to mimic objects, but if
you go down that road you might as well be using C.
The one exception to "purely functional" is that I don't
care about trying to make I/O fit a functional model. In a
game, there are three I/O needs: input from the user, a way
to render graphics on the screen, and a real-time clock.
Fortunately, these only matter at the very highest level
outer loop, one that looks like:
repeat forever {
get user input
process one frame
draw everything on the screen
wait until a frame's worth of time has elapsed
}
"Process one frame" is the interesting part. It takes
the current game state and user input as parameters and
returns a new game state. Then that game state can be used
for the "draw everything" step. "Draw everything" can also
be purely functional, returning an abstract list of sprites
and coordinates, a list that can be passed directly to a
lower level, and inherently impure, function that talks to
the graphics hardware.
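Here's what that outer loop might look like in Erlang; get_input/0, render/1, and wait_for_next_frame/0 are hypothetical stand-ins for whatever impure layer the game sits on, while process_frame/2 and draw_everything/1 are the two pure functions described above.
game_loop(GameState) ->
    Input = get_input(),
    NewGameState = process_frame(GameState, Input),   % purely functional
    render(draw_everything(NewGameState)),            % draw_everything is pure, render is not
    wait_for_next_frame(),
    game_loop(NewGameState).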
An open question is "Is being purely functional, even
excepting I/O, worthwhile?" Or is it, as was suggested to
me via email earlier this year, the equivalent of writing a
novel without using the letter 'e'?
Part 2
Purely Functional Retrogames, Part 2
(Read Part 1 if you missed
it.)
The difficult, or at least different, part of writing a
game in a purely functional style is living without global,
destructive updates. But before getting into how to deal
with that, anything that can be done to reduce the need for
destructive updates is going to make things easier later
on.
Back when I actually wrote 8-bit games, much of my code
involved updating timers and counters used for animation
and special effects and so on. At the time it made a lot of
sense, given the limited math capabilities of a 6502. In
the modern world you can achieve the same by using a single
clock counter that gets incremented each frame.
Ever notice how the power pills in Pac-Man blink on and
off? Let's say the game clock is incremented every 1/60th
of a second, and the pills flop from visible to
invisible--or the other way around--twice per second (or
every 30 ticks of the clock). The state of the pills can be
computed directly from the clock value:
pills_are_visible(Clock) ->
    is_even(Clock div 30).   % is_even/1 is a trivial helper: N rem 2 =:= 0
No special counters, no destructive updates of any kind.
Similarly, the current frame of the animation of a Pac-Man
ghost can be computed given the same clock:
%% assuming -define'd constants; bare capitalized names would be unbound variables in Erlang
current_ghost_frame(Clock) ->
    Offset = Clock rem ?TOTAL_GHOST_ANIMATION_LENGTH,
    Offset div ?TIME_PER_ANIMATION_FRAME.
Again, no special counters and no per frame updates. The
clock can also be used for general event timers. Let's say
the bonus fruit appears 30 seconds after a level starts.
All we need is one value: the value of the clock when the
level started plus 30*60. Each frame we check to see if the
clock matches that value.
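In code, the fruit timer is just as small (a sketch in the same style, assuming the 60-ticks-per-second clock from above):
fruit_time(LevelStartClock) ->
    LevelStartClock + 30 * 60.          % the clock tick at which the fruit appears

fruit_appears_now(Clock, FruitTime) ->
    Clock =:= FruitTime.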
None of this is specific to functional programming. It's
common in C and other languages. (The reason it was ugly on
the 6502 was because of the lack of division and remainder
instructions, and managing a single global clock involved
verbose 24-bit math.)
There are limits to how much a single clock value can be
exploited. You can't make every enemy in Robotron operate
entirely as a function of time, because they react to other
stimuli in the world, such as the position of the player.
If you think about this trick a bit, what's actually going
on is that some data is entirely dependent on other data.
One value can be used to compute others. This makes a
dynamic world a whole lot more static than it may first
seem.
Getting away from clocks and timing, there are other
hidden dependencies in the typical retro-style game. In a
procedural implementation of Pac-Man, when Pac-Man collides
with a blue ghost, a global score is incremented. This is
exactly the kind of hidden update that gets ugly with a
purely functional approach. Sure, you could return some
special data indicating that the score should change, but
there's no need.
Let's say that each ghost has a state that looks like
this: {State_name, Starting_time}. When a ghost has been
eaten and is attempting to return to the box in the center
of the maze, the state might be {return_to_box, 56700}.
(56700 was the value of the master clock when the ghost was
eaten.) Or it might be more fine-grained than that, but you
get the idea. The important part is that there's enough
information here to realize that a ghost was eaten during
the current frame: if the state name is "return_to_box" and
the starting time is the same as the current game clock. A
separate function can scan through the ghost states and
look for events that would cause a score increase.
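A sketch of that scan, using the {State_name, Starting_time} convention from above; the 200-point value is only a placeholder (the real game doubles the score for consecutive ghosts):
score_from_eaten_ghosts(GhostStates, Clock) ->
    Eaten = [ok || {return_to_box, StartTime} <- GhostStates,
                   StartTime =:= Clock],               % eaten during this frame
    length(Eaten) * 200.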
The same technique also applies to when sounds are
played. It's not something that has to be a side effect of
the ghost behavior handling code. There's enough implicit
information, given the state of the rest of the world, to
make decisions about when sounds should be played. Using
the example from the preceding paragraph, the same criteria
for indicating a score increase can also be used to trigger
the "ghost eaten" sound.
Part 3
Purely Functional Retrogames, Part 3
(Read Part 1 if you missed
it.)
Every entity in a game needs some data to define where
it is and what it's doing. At first thought, a ghost in
Pac-Man might be defined by:
{X, Y, Color}
which looks easy enough, but it's naive. There needs to
be a lot more data than that: direction of movement,
behavior state, some base clock values for animation, etc.
And this is just simplistic Pac-Man. In an imperative or OO
language this topic barely deserves thought. Just create a
structure or object for each entity type and add fields as
the situation arises. If the structure eventually contains
50 fields, who cares? But...
In a functional language, the worst thing you can do
is create a large "struct" containing all the data you
think you might need for an entity.
First, this doesn't scale well. Each time you want to
"change" a field value, a whole new structure is created.
For Pac-Man it's irrelevant--there are only a handful of
entities. But the key is that if you add a single field,
then you're adding overhead across the board to all of
the entity processing in your entire program. The
second reason this is a bad idea is that it hides the flow
of data. You no longer know what values are important to a
function. You're just passing in everything, and that makes
it harder to experiment with writing simple, obviously
correct primitives. Which is less opaque:
step_toward({X,Y}, TargetX, TargetY, Speed) ->
...
step_toward(EntityData, TargetX, TargetY, Speed) ->
...
The advantage of the first one is that you don't need to
know what an entity looks like. You might not have thought
that far ahead, which is fine. You've got a simple function
for operating on coordinate pairs which can be used in a
variety of places, not just for entity movement.
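For example, a possible body for the first version (my sketch, not from the article), stepping Speed units from {X,Y} toward the target:
step_toward({X, Y}, TargetX, TargetY, Speed) ->
    DX = TargetX - X,
    DY = TargetY - Y,
    Distance = math:sqrt(DX * DX + DY * DY),
    case Distance =< Speed of
        true  -> {TargetX, TargetY};                   % close enough: land on the target
        false -> {X + Speed * DX / Distance,
                  Y + Speed * DY / Distance}
    end.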
If we can't use a big struct, what does an entity look
like? There are undoubtedly many ways to approach this, but
I came up with the following scheme. Fundamentally, an
entity is defined by an ID of some sort ("I am one of those
fast moving spinning things in Robotron"), movement data (a
position and maybe velocity), and the current behavioral
state. At the highest level:
{Id, Position, State}
Each of these has more data behind it, and that data
varies based on the entity type, the current behavior, and
so on. Position might be one of the following:
{X, Y}
{X, Y, XVelocity, YVelocity}
State might look like:
{Name, StartTime, EndTime}
{Name, StartTime, EndTime, SomeStateSpecificData}
StartTime
is so there's a base clock to use
for animation or to know how long the current state has
been running. EndTime
is the time in the
future when the state should end; it isn't needed for all
states.
In my experiments, this scheme got me pretty far.
Everything is very clean at a high level--a three element
tuple--and below that there's still the absolute minimum
amount of data not only per entity type, but for the exact
state that the entity is in. Compare that to the normal
"put everything in a struct" approach, where fields needed
only for the "return to center of maze" ghost logic are
always sitting there, unused in most states.
But wait, what about additional state information, such
as indicating that a Pac-Man ghost is invulnerable (which
is true when a ghost has been reduced to a pair of eyes
returning to the center of the maze)? If you remember
Part 2, then the parenthetical note
in the previous sentence should give it away. If the ghost
is invulnerable when in a specific state, then there's no
need for a separate flag. Just check the state.
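In code, that check is nothing more than a pattern match on the state tuple (a sketch assuming the {Id, Position, State} layout above):
is_invulnerable({_Id, _Position, State}) ->
    element(1, State) =:= return_to_box.   % invulnerable exactly when in this state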
Part 4
Purely Functional Retrogames, Part 4
(Read Part 1 if you missed
it.)
By the definition of functional programming, functions
can't access any data that isn't passed in. That means you
need to think about what data is needed for a particular
function, and "thread" that data through your program so a
function can access it. It sounds horrible when written
down, but it's easy in practice.
In fact, just working out the data dependencies in a
simple game is an eye-opening exercise. It usually turns
out that there are far fewer dependencies than you might
imagine. In Pac-Man, there's an awful lot of state that
makes no difference to how the ghosts move: the player's
score, whether the fruit is visible or not, the location of
dots in the maze. Similarly, the core movement of Pac-Man,
ignoring collision detection, only relies on a handful of
factors: the joystick position, the location of walls in
the maze (which are constant, because there's only one
maze), and the current movement speed (which increases as
mazes are completed).
That was the easy part. The tricky bit is how to handle
functions that affect the state of the world. Now of course
a function doesn't actually change anything, but somehow
those effects on the world need to be passed back out so
the rest of the game knows about them. The "move Pac-Man"
routine returns the new state of Pac-Man (see Part 3 for more about how entity state is
represented). If collision detection is part of the "move
Pac-Man" function, then there are more possible changes to
the world: a dot has been eaten, a power pill has been
eaten, fruit has been eaten, Pac-Man is dead (because of
collision with a non-blue ghost), a ghost is dead (because
of a collision with a powered-up Pac-Man).
When I first mused over writing a game in a purely
functional style, this had me stymied. One simple function
ends up possibly changing the entire state of the world?
Should that function take the whole world as input and
return a brand new world as output? Why even use functional
programming, then?
A clean alternative is not to return new versions of
anything, but to simply return statements about what
happened. Using the above example, the movement routine
would return a list of any of these side effects:
{new_position, Coordinates}
{ate_ghost, GhostName}
{ate_dot, Coordinates}
ate_fruit
killed_by_ghost
All of a sudden things are a lot simpler. You can pass
in the relevant parts of the state of the world, and get
back a simple list detailing what happened. Actually
handling what happened is a separate step, one that
can be done later on in the frame. The advantage here is
that changes to core world data don't have to be
painstakingly threaded in and out of all functions in the
game.
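Here's a sketch of that separate handling step. The event tuples are the ones listed above; the point values and the {Score, DotsLeft} accumulator are invented for illustration.
handle_events(Events, Score, DotsLeft) ->
    lists:foldl(fun handle_event/2, {Score, DotsLeft}, Events).

handle_event({ate_dot, _Coordinates}, {Score, DotsLeft}) -> {Score + 10, DotsLeft - 1};
handle_event({ate_ghost, _GhostName}, {Score, DotsLeft}) -> {Score + 200, DotsLeft};
handle_event(ate_fruit,               {Score, DotsLeft}) -> {Score + 100, DotsLeft};
handle_event(_Other,                  Acc)               -> Acc.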
I'd like to write some concluding thoughts on this
series, to answer the "Why do this?" and "What about
Functional Reactive Programming?" questions--among
others--but wow I've already taken just about a month for
these four short entries, so I'm not going to jump into
that just yet.
(I eventually wrote the follow-up.)
Don't Be Afraid of Special Cases
In the body of work on low-level optimization, there's a
heavy emphasis on avoiding branches. Here's a well-known
snippet of x86 code which sets eax to the smaller of the
two values in eax and ecx:
sub ecx, eax    ; ecx = ecx - eax, borrow set if ecx < eax
sbb edx, edx    ; edx = -1 if there was a borrow, else 0
and ecx, edx    ; keep the negative difference only when ecx was smaller
add eax, ecx    ; eax = min(original eax, ecx)
At the CPU hardware level, branches are indeed expensive
and messy. A mispredicted branch empties the entire
instruction pipeline, and it can take a dozen or more
cycles to get that pipeline full and ticking along
optimally again.
But that's only at the lowest level, and unless you're
writing a code generator or a routine that's
hyper-sensitive to instruction-level tweaks, like movie
compression or software texture mapping, it's doubtful that
going out of your way to avoid branches will be
significant. Ignoring efficiency completely, there's still
the stigma that code with many conditionals in it, to
handle special cases, is inherently ugly, even poorly
engineered.
That's the programmer's code-centric view. The user of
an application isn't thinking like that at all. He or she
is thinking purely about ease of use, and ugly is when a
program displays "1 files deleted" (or even "1 file(s)
deleted"), or puts up a dialog box that crosses between two
monitors, making it unreadable.
In 1996-7 I wrote a game called "Bumbler" for the Mac.
(Yes, I've brought this up before, but that's because I
spent 18 months as a full-time indie game developer, which
was more valuable than--and probably just as expensive as--
getting another college degree.) Bumbler is an
insect-themed shooter that takes place on a honeycomb
background. When an insect is killed, the honeycomb behind
it fills with honey. Every Nth honeycomb fills with pulsing
red "special honey," which you can fly over and something
special happens. Think "power-ups."
The logic driving event selection isn't just a simple
random choice between the seven available special honey
effects. I could have done that, sure, but it would
have been a lazy decision on my part, one that would have
hurt the game in noticeable ways. Here are some of the
special honey events and the special cases involved:
Create hole. This added a hole to the honeycomb
that insects could crawl out of, the only negative special
honey event. During play testing I found out that if a hole
was created near the player start position, the player
would often collide with an insect crawling out of it at
the start of a life. So "create hole" was disallowed in a
rectangular region surrounding the player start. It was
also not allowed if there were already a certain number of
holes in the honeycomb, to avoid too much
unpredictability.
Release flowers. This spawned bonus flowers from
each of the holes in the honeycomb. But if there were
already many flowers on the screen, then the player could
miss this entirely, and it looked like nothing happened. If
there were more than X flowers on the screen, this event
was removed from the list of possibilities.
Flower magnet. This caused all the flowers on the
screen to flash yellow and home in on the player. This was
good, because you got more points, but bad because the
flowers blocked your shots. To make this a rare event, one
that the player would be surprised by, it was special-cased
to not occur during the first ten or so levels, plus once
it happened it couldn't be triggered again for another five
levels. Okay, that's two special cases. Additionally, if
there weren't any flowers on the screen, then it looked
like nothing happened, and if there were only a few
flowers, it was underwhelming. So this event was only
allowed if there were a lot of flowers in existence.
All of these cases improved the game, and play testing
supported them. Did they make the code longer and arguably
uglier? Yes. Much more so because I wasn't using a language
that encourages adding special cases in an unobtrusive way.
One of the advantages to a language with pattern matching,
like Erlang or Haskell or ML, is that there's a programming
assistant of sorts, one that takes your haphazard lists of
special cases--patterns--and turns them into an optimal
sequence of old-fashioned conditionals, a jump table, or
even a hash table.
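As a hedged sketch of what that looks like, here's the kind of clause-per-special-case function such a language encourages; the event names, thresholds, and helpers (near_player_start/1, hole_count/1, flower_count/1, level/1) are invented for illustration, not Bumbler's actual rules.
event_allowed(create_hole, World) ->
    not near_player_start(World) andalso hole_count(World) < 4;
event_allowed(release_flowers, World) ->
    flower_count(World) < 20;
event_allowed(flower_magnet, World) ->
    level(World) > 10 andalso flower_count(World) >= 12;
event_allowed(_Event, _World) ->
    true.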
Coding as Performance
I want to talk about performance coding. Not coding for
speed, but coding as performance, a la
live
coding. Okay, I don't really want to talk about that
either, as it mostly involves audio programming languages
used for on-the-fly music composition, but I like the
principle of it: writing programs very quickly, in the
timescale of a TV show or movie rather than the years it can
take to complete a commercial product. Take any book on
agile development or extreme programming and replace
"weeks" with "hours" and "days" with "minutes."
Think of it in terms of a co-worker or friend who comes
to you with a problem, something that could be done by
hand, but would involve much repetitive work ("I've got a
big directory tree, and I need a list of the sum total
sizes of all files with the same root names, so hello.txt,
hello.doc, and hello.whatever would just show in the report
as 'hello', followed by the total size of those three
files"). If you can write a program to solve the problem in
less time than the tedium of slogging through the manual
approach, then you win. There's no reason to limit this
game to this kind of problem, but it's a starting
point.
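For the record, here's one way the directory problem above might fall out in Erlang (a sketch; the path in the comment is made up):
root_sizes(Dir) ->
    Totals = filelib:fold_files(Dir, ".*", true,
        fun(File, Acc) ->
            Root = filename:rootname(filename:basename(File)),
            Size = filelib:file_size(File),
            maps:update_with(Root, fun(S) -> S + Size end, Size, Acc)
        end, #{}),
    lists:sort(maps:to_list(Totals)).
%% root_sizes("/some/big/tree") -> [{"hello", 123456}, ...]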
Working at this level, the difference between gut
instinct and proper engineering becomes obvious. The latter
always seems to involve additional time--architecture,
modularity, code formatting, interface specification--which
is exactly what's in short supply in coding as performance.
Imagine you want to plant a brand new vegetable garden
somewhere in your yard, and the first task is to stake out
the plot. Odds are good that you'll be perfectly successful
by just eyeballing it, hammering a wooden stake at one
corner, and using it as a reference. Or you could be more
formal and use a tape measure. The ultimate, guaranteed
correct solution is to hire a team of surveyors to make
sure the distances are exact and the sides perfectly
parallel. But really, who would do that?
(And if you're thinking "not me," consider people like
myself who've grepped a two-hundred megabyte XML file,
because it was easier than remembering how to use the
available XML parsing libraries. If your reaction is one of
horror because I clearly don't understand the whole purpose
of using XML to structure data, then there you go. You'd
hire the surveyors.)
You can easily spot the programming languages designed
for projects operating on shorter timescales. Common,
non-trivial operations are built-in, like regular
expressions and matrix math (as an aside, the original
BASIC language from the 1960s had matrix operators). Common
functions--reading a file, getting the size of a
file--don't require importing libraries after you've
managed to remember that getting the size of a file isn't a
core operation that's in the "file" library and is instead
in "os:file:filesize" or wherever the hierarchical-thinking
author put it. But really, any language of the Python or
Ruby class is going to be fine. The big wins are having an
interactive read / evaluate / print loop, zero compilation
time, and data structures that don't require thinking about
low-level implementation details.
What matters just as much are visualization tools,
so you can avoid the classic pitfall of engineering
something for weeks or months only to finally realize that
you didn't understand the problem and engineered the wrong
thing. (Students of Dijkstra are
ready with some good examples of math problems where
attempting to guess an answer based on a drawing gives
hopelessly incorrect answers, but I'll pretend I don't see
them, there in the back, frantically waving their
arms.)
I once used an 8-bit debugger with an interrupt-driven
display. Sixty times per second, the display was updated.
This meant that memory dumps were live. If a running
program constantly changed a value, that memory location
showed as blurred digits on the screen. You could also see
numbers occasionally flick from 0 to 255, then back later.
Static parts of the screen meant nothing was changing
there. This sounds simple, but wow was it useful for
accidentally spotting memory overruns and logic errors. Often
I didn't suspect a problem at all, and wouldn't even have
known what to look for, but I found errors just by seeing
movement or patterns in a memory dump that didn't look
right.
A modern visualization tool I can't live without is
RegEx Coach.
I always try out regular expressions using it before
copying them over to my Perl or Python scripts. When I make
an error, I see it right away. That prevents
situations where the rest of my program is fine, but a
botched regular expression isn't pulling in exactly the
data I'm expecting.
The J language ships
with some great visualization tools. Arguably it's the
nicest programming environment I've ever used, even though
I go back and forth about whether J itself is brilliant or
insane. There's a standard library module which takes a
matrix and displays it as a grid of colors. Identical
values use the same color. Simplistic? Yes. But this
display format makes patterns and anomalies jump out of the
screen. If you're thinking that you don't write code that
involves matrix math, realize that matrices are native to J
and you can easily put all sorts of data into a matrix
format (in fact, the preferred term for a matrix in J is
the more casual "table").
J also has a similar tool that mimics a spreadsheet
display. Pass in data, and up pops what looks like an Excel
window, making it easy to view data that is naturally
columnar. It's easier than dumping values to an HTML file
or the old-fashioned method of debug printing a table using
a fixed-width font. There's also an elaborate module for
graphing data; no need to export it to a file and use a
standalone program.
I'm hardly suggesting that everyone--or anyone--switch
over to J. It's not the language semantics that matter so
much as tools that are focused on interactivity, on working
through problems quickly. And the realization that it is
valid to get an answer without always bringing the concerns
of software engineering--and the time penalty that comes
with them--into the picture.
A Spellchecker Used to Be a Major Feat of Software
Engineering
Here's the situation: it's 1984, and you're assigned to
write the spellchecker for a new MS-DOS word processor.
Some users, but not many, will have 640K of memory in their
PCs. You need to support systems with as little as 256K.
That's a quarter megabyte to contain the word processor,
the document being edited, and the memory needed by the
operating system. Oh, and the spellchecker.
For reference, on my MacBook, the standard dictionary in
/usr/share/dict/words
is 2,486,813 bytes and
contains 234,936 words.
An enticing first option is a data format that's more
compressed than raw text. The UNIX dictionary contains
stop and stopped and stopping, so
there's a lot of repetition. A clever trie implementation
might do the trick...but we'll need a big decrease to go
from 2+ megabytes to a hundred K or so.
In fact, even if we could represent each word in the
spellchecker dictionary as a single byte, we'd need almost
all of the full 256K just for that, and of course a
single-byte representation isn't going to work. So not only does
keeping the whole dictionary in RAM look hopeless, but so
does keeping the actual dictionary on disk with only an
index in RAM.
Now it gets messy. We could try taking a subset of the
dictionary, one containing the most common words, and
heavily compressing that so it fits in memory. Then we come
up with a slower, disk-based mechanism for looking up the
rest of the words. Or maybe we jump directly to a
completely disk-based solution using a custom database of
sorts (remembering, too, that we can't assume the user has
a hard disk, so the dictionary still needs to be crunched
onto a 360K floppy disk).
On top of this, we need to handle some other features,
such as the user adding new words to the dictionary.
Writing a spellchecker in the mid-1980s was a hard
problem. Programmers came up with some impressive data
compression methods in response to the spellchecker
challenge. Likewise there were some very clever data
structures for quickly finding words in a compressed
dictionary. This was a problem that could take months of
focused effort to solve. (And, for the
record, reducing the size of the dictionary from 200,000+
to 50,000 or even 20,000 words was a reasonable option, but
even that doesn't leave the door open for a naive
approach.)
Fast forward to today. A program to load
/usr/share/dict/words
into a hash table is 3-5
lines of Perl or Python, depending on how terse you don't
mind being. Looking up a word in this hash table dictionary is a
trivial expression, one built into the language. And
that's it. Sure, you could come up with some ways to
decrease the load time or reduce the memory footprint, but
that's icing and likely won't be needed. The basic
implementation is so mindlessly trivial that it could be an
exercise for the reader in an early chapter of any Python
tutorial.
That's progress.
Want to Write a Compiler? Just Read These Two
Papers.
Imagine you don't know anything about
programming, and you want to learn how to do it. You take a
look at Amazon.com, and there's a highly recommended set of
books by Knute or something with a promising title, The
Art of Computer Programming, so you buy them. Now
imagine that it's more than just a poor choice, but that
all the books on programming are written at that
level.
That's the situation with books about writing
compilers.
It's not that they're bad books; they're just too
broadly scoped, and the authors present so much information
that it's hard to know where to begin. Some books are
better than others, but there are still the thick chapters
about converting regular expressions into executable state
machines and different types of grammars and so on. After
slogging through it all you will have undoubtedly expanded
your knowledge, but you're no closer to actually writing a
working compiler.
Not surprisingly, the opaqueness of these books has led
to the myth that compilers are hard to write.
The best source for breaking this myth is Jack
Crenshaw's series, Let's Build a
Compiler!, which started in 1988. This is one of those
gems of technical writing where what's assumed to be a
complex topic ends up being suitable for a first year
programming class. He focuses on compilers of the Turbo
Pascal class: single pass, parsing and code generation are
intermingled, and only the most basic of optimizations are
applied to the resulting code. The original tutorials used
Pascal as the implementation language, but there's a C
version out there, too. If you're truly adventurous, Marcel
Hendrix has done a Forth
translation (and as Forth is an interactive language,
it's easier to experiment with and understand than the C or
Pascal sources).
As good as it is, Crenshaw's series has one major
omission: there's no internal representation of the program
at all. That is, no abstract syntax tree. It is indeed
possible to bypass this step if you're willing to give up
flexibility, but the main reason it's not in the tutorials
is because manipulating trees in Pascal is out of sync with
the simplicity of the rest of the code he presents. If
you're working in a higher level language--Python, Ruby,
Erlang, Haskell, Lisp--then this worry goes away. It's
trivially easy to create and manipulate tree-like
representations of data. Indeed, this is what Lisp, Erlang,
and Haskell were designed for.
That brings me to A
Nanopass Framework for Compiler Education [PDF] by
Sarkar, Waddell, and Dybvig. The details of this paper
aren't quite as important as the general concept: a
compiler is nothing more than a series of transformations
of the internal representation of a program. The authors
promote using dozens or hundreds of compiler passes,
each being as simple as possible. Don't combine
transformations; keep them separate. The framework
mentioned in the title is a way of specifying the inputs
and outputs for each pass. The code is in Scheme, which is
dynamically typed, so data is validated at runtime.
After writing a compiler or two, go ahead and plunk
down the cash for the infamous
Dragon Book or one of the alternatives. Maybe. Or you
might not need them at all.
Functional Programming Went Mainstream Years Ago
In school and early in my programming career I must have
written linked-list handling code fifty times. Those were
the days of Pascal and vanilla C. I didn't have the code
memorized either, because there were too many variations:
singly-linked list, singly-linked list with dummy head and
tail nodes, doubly-linked list, doubly-linked list with
dummy head and tail nodes. Insertion and deletion routines
for each of those. I worked out the pointer manipulation
logic each time I rewrote them. Good thing, too, because
the AP Computer Science exam was chock full of linked-list
questions.
Early functional languages like Hope
and
Miranda seemed like magic in comparison. Not only were
lists built into those languages, but there was no manual
fiddling with pointers or memory at all. More than
that, the entire concept of memory as the most precious of
resources, one to be lovingly arranged and conserved, was
absent. That's not to say that memory was free and
infinite, but it was something fluid and changing. A
temporary data structure was created and used transiently,
with no permanent cost.
All of this magic is nothing new in currently popular
programming languages. Fifteen years ago you could say:
print join ',', @Items
in Perl, taking an arbitrarily long list of arbitrarily
long strings, and building an entirely new string
consisting of the elements of @Items
separated
by commas. Once print
is finished with that
string, it disappears. At a low level this is a serious
amount of work, all in the name of temporary convenience. I
never would have dared something so cavalier in Turbo
Pascal. And yet it opens the door to what's essentially a
functional style: creating new values rather than modifying
existing ones. You can view a Perl (or Python or Ruby or
Lua or Rebol) program as a series of small functional
programs connected by a lightweight imperative program.
But there's more to functional programming than a
disassociation from the details of memory layout. What
about higher order functions and absolute purity and monads
and elaborate type systems and type inference? Bits of
those already exist in decidedly non-functional languages.
Higher Order Perl
is a great book. Strings are immutable in Python. Various
forms of lambda functions are available in different
languages, as are list comprehensions.
Still, the purists proclaim, it's not enough. Python is
not a replacement for Haskell. But does it matter? 90% of
the impressive magic from early functional languages has
been rolled into mainstream languages. That last 10%, well,
it's not clear that anyone really wants it or that the
benefits are actually there. Purity has some advantages,
but it's so convenient and useful to directly modify a
dictionary in Python. Fold and map are beautiful, but they
work just as well in the guise of a foreach loop.
The answer to "When will Haskell finally go mainstream?"
is "most of it already has."
Kilobyte Constants, a Simple and Beautiful Idea that
Hasn't Caught On
Eric Isaacson's A86
assembler (which I used regularly in the early 1990s)
includes a great little feature that I've never seen in
another language: the suffix "K" to indicate kilobytes in
numeric literals. For example, you can say "16K" instead of
"16384". How many times have you seen C code like this:
char Buffer[512 * 1024];
The "* 1024" is so common, and so clunky in comparison
with:
char Buffer[512K];
In Forth this is trivial to add, at least outside of
compiled definitions. All you need is:
: K 1024 * ;
And then you can write:
512 K allot
Understanding What It's Like to Program in Forth
I write Forth code every day. It is a joy to write a
few simple words and solve a problem. As brain exercise it
far surpasses cards, crosswords or Sudoku
—Chuck
Moore, creator of Forth
I've used and enjoyed Forth quite a bit over the years,
though I rarely find myself programming in it these days.
Among other projects, I've written several standalone tools
in Forth, used it for exploratory programming, written a
Forth-like language for handling data assets for a
commercial project, and written two standalone 6502 cross
assemblers using the same principles as Forth
assemblers.
It's easy to show how beautiful Forth can be. The
classic example is:
: square dup * ;
There's also Leo Brodie's oft-cited washing
machine program. But as pretty as these code snippets
are, they're the easy, meaningless examples, much like the
two-line quicksort in Haskell. They're trotted out to
show the strengths of a language, then reiterated by
new converts. The primary reason I wrote the Purely Functional
Retrogames series is the disconnect between
advocates saying everything is easy without destructive
updates, and the utter lack of examples of how to approach
many kinds of problems in a purely functional way. The same
small set of pretty examples isn't enough to understand
what it's like to program in a particular language or
style.
Chuck Moore's Sudoku quote above is one of the most
accurate characterizations of Forth that I've seen. Once
you truly understand it, you'll better see what's fun about
the language, and also why it isn't as commonly used. What
I'd like to do is to start with a trivially simple problem,
one that's completely straightforward, even simpler than
the infamous
FizzBuzz:
Write a Forth word to add together two integer
vectors (a.k.a. arrays) of three elements each.
The C version, without bothering to invent custom data
types, requires no thought:
void vadd(int *v1, int *v2, int *v3)
{
v3[0] = v1[0] + v2[0];
v3[1] = v1[1] + v2[1];
v3[2] = v1[2] + v2[2];
}
In Erlang it's:
vadd({A,B,C}, {D,E,F}) -> {A+D, B+E, C+F}.
In APL and J the solution is a single character:
+
first Forth attempt
So now, Forth. We start with a name and stack
picture:
: vadd ( v1 v2 v3 -- )
Getting the first value out of v1 is easy enough:
rot dup @
"rot
" brings v1 to the top, then we grab
the first element of the array (remember that we need to
keep v1 around, hence the dup
). Hmmm...now
we've got four items on the stack:
v2 v3 v1 a
"a" is what I'm calling the first element of v1, using
the same letters as in the Erlang function. There's no way
to get v2 to the top of the stack, save the deprecated word
pick
, so we're stuck.
second Forth attempt
Thinking about this a bit more, the problem is we have
too many items being dealt with at once, too many items on
the stack. v3 sitting there on top is getting in the way,
so what if we moved it somewhere else for a while? The
return stack is the standard location for a temporary
value, so let's try it:
>r over @ over @ + r> !
Now that works. We get v3 out of the way, fetch v1 and
v2 (keeping them around for later use), then bring back v3
and store the result. Well, almost, because now v3 is gone
and we can't use it for the second and third elements.
third Forth attempt
This isn't as bad as it sounds. We can just keep v3 over
on the return stack for the whole function. Here's an
attempt at the full version of vadd
:
: vadd ( v1 v2 v3 -- )
>r
over @ over @ + r@ !
over cell+ @ over cell+ @ + r@ cell+ !
over 2 cells + @ over 2 cells + @ + r> 2 cells + !
drop drop ;
cell+
is roughly the same as
++
in C. "2 cells +
" is
equivalent to "cell+ cell+
". Notice how v3
stays on the return stack for most of the function, being
fetched with r@
. The "drop drop
"
at the end is to get rid of v1 and v2. Some nicer
formatting helps show the symmetry of this word:
: vadd ( v1 v2 v3 -- )
    >r
    over @            over @            +  r@ !
    over cell+ @      over cell+ @      +  r@ cell+ !
    over 2 cells + @  over 2 cells + @  +  r> 2 cells + !
    drop drop ;
This can be made more obvious by defining some vector
access words:
: 1st ;
: 2nd cell+ ;
: 3rd 2 cells + ;
: vadd ( v1 v2 v3 -- )
>r
over 1st @ over 1st @ + r@ 1st !
over 2nd @ over 2nd @ + r@ 2nd !
over 3rd @ over 3rd @ + r> 3rd !
drop drop ;
A little bit of extra verbosity removes one quirk in the
pattern:
: vadd ( v1 v2 v3 -- )
>r
over 1st @ over 1st @ + r@ 1st !
over 2nd @ over 2nd @ + r@ 2nd !
over 3rd @ over 3rd @ + r@ 3rd !
rdrop drop drop ;
And that's it--three element vector addition in Forth.
One solution at least; I can think of several completely
different approaches, and I don't claim that this is the
most concise of them. It has some interesting properties,
not the least of which is that there aren't any named
variables. On the other hand, all of this puzzling, all
this revision...to solve a problem which takes no thought
at all in most languages. And while the C version can be
switched from integers to floating point values just by
changing the parameter types, that change would require
completely rewriting the Forth code, because there's
a separate floating point stack.
Still, it was enjoyable to work this out. Better than
Sudoku? Yes.
Macho Programming
Back before I completely lost interest in debates about
programming topics, I remember reading an online discussion
that went like this:
Raving Zealot: Garbage collection is FASTER than
manual memory management!
Experienced Programmer: You mean that garbage
collection is faster than using malloc
and
free
to manage a heap. You can use pools and
static allocation, and they'll be faster and more
predictable than garbage collection.
Raving Zealot: You need to get over your attitude
that programming is a MACHO and RECKLESS endeavor! If you
use a garbage collected language, NOTHING can go wrong.
You're PROTECTED from error, and not reliant on your
MACHONESS.
What struck me about this argument, besides that people
actually argue about such things, is how many other
respected activities don't have anywhere near the same
level of paranoia about protection from mistakes. On the
guitar--or any musical instrument--you can play any note at
any time, even if it's out of key or, more fundamentally,
not played correctly (wrong finger placement or pressure or
accidentally muting the string). And people play
instruments live, in-concert in front of thousands of
people this way, knowing that the solo is improvised in
Dorian E, and there's no physical barrier preventing a
finger from hitting notes that aren't in that mode. The
same goes for sculpting, or painting, or carpentry...almost
anything that requires skill.
(And building chickadee houses isn't universally
considered a MACHO hobby, even though it involves the use
of POWER TOOLS which can LOP OFF FINGERS.)
In these activities, mistakes are usually obvious and
immediate: you played the wrong note, you cut a board to
the wrong length, there's blood everywhere. In macho
programming, a mistake can be silent, only coming to light
when there's a crash in another part of the code--even days
later--or when the database gets corrupted. Stupidly
trivial code can cause this, like:
array[index] = true;
when index
is -1. And yet with this
incredible potential for error, people still build
operating systems and giant applications and massively
multiplayer games in C and C++. Clearly there's a lot of
machoness out there, or it's simply that time and debugging
and testing--and the acceptance that there will be
bugs--can overcome what appear to be technical
impossibilities. It's hand-rolling matrix multiplication
code for a custom digital signal processor vs. "my
professor told me that assembly language is impossible for
humans to use."
Would I prefer to ditch all high-level improvements, in
exchange for programming being the technical equivalent of
rock climbing? NO! You can romanticize it all you want, but
when I wrote 8-bit games I clearly remember thinking how
much more pleasant it was to tinker in BASIC than to spend
hours coding up some crazy 6502 code that would lock up the
entire computer time after time (the bug would be that
changing a loop index from 120 to 130 made it start out
negative as a signed byte, so the loop would end after one iteration, or
some other obscurity).
What both this retro example and the C one-liner have in
common is that the core difficulty stems less from the
language itself than from code being turned loose
directly on hardware, so crashes are really crashes, and
the whole illusion that your source code is actually the
program being executed disappears. Problems are debugged at
the hardware level, with data breakpoints and trapped CPU
exceptions and protected memory pages (this is how
debuggers work).
It's a project suitable as part of a single semester
undergraduate class to write an interpreter for your
favorite low-level language. Write it in Scheme or Erlang
or Scala. Use symbolic addresses, not a big array of
integers, to represent memory. Keep track of address
offsets, instead of doing the actual math. Have functions
return lists of memory addresses that have been read from
or modified. Keep everything super simple and clean. The
goal is to be able to enter expressions or functions and
see how they behave, which is a whole lot nicer than
tripping address exceptions.
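Here's a minimal sketch of the kind of thing I mean, in Erlang, for an invented two-instruction machine (run, load, and store are all made-up names): memory is never represented at all, and the result is simply a record of which symbolic addresses the code would have read and written.
run(Instructions) -> run(Instructions, [], []).

run([], Reads, Writes) ->
    {lists:reverse(Reads), lists:reverse(Writes)};
run([{load, Addr} | Rest], Reads, Writes) ->
    %% Addresses stay symbolic; {sprite_table, 4} is base plus offset, no math done.
    run(Rest, [Addr | Reads], Writes);
run([{store, Addr} | Rest], Reads, Writes) ->
    run(Rest, Reads, [Addr | Writes]).
Evaluating run([{load, player_x}, {store, {sprite_table, 4}}]) returns {[player_x], [{sprite_table, 4}]}: a symbolic trace of what the code did instead of a trapped address exception.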
All of a sudden, even hardcore machine code isn't nearly
so scary. Write a dangerous function, get back a symbolic
representation of what it did. Mistakes are now simply
wrong notes, provided you keep your functions small. It's
still not easy, but macho has become safe.
(If you liked this, you might enjoy Sending Modern Languages Back to 1980s Game
Programmers.)
Timidity Does Not Convince
The only arguments that hold water, in terms of
programming language suitability, are bold, finished
projects. Not a mini-Emacs written in Haskell. Not a Sudoku
solver in Prolog. Not a rewrite of some 1970s video game
using Functional Reactive Programming. They need to be
large and daring projects, where the finished product is
impressive in its own right, and then when you discover it
was written in language X, there's a wave of disbelief and
then a new reverence for a toolset you had previously
dismissed.
And now, two of my favorite bold projects:
Wings 3D
Wings started as an attempt to clone Nendo, a 3D
modeller designed around simplicity and ease of use. Nendo
development and support had dried up, and enthusiasm for
Nendo fueled Wings 3D development. So now there's a
full-featured, mature 3D modeller, with a great focus on
usability, and it's written entirely in Erlang. Without a
doubt it's the antithesis of what Erlang was designed for,
with the entire program running as a single process and
intensive use of floating point math. But it clearly works,
and shows that there are benefits to using Erlang even
outside of its niche of concurrency-oriented
programming.
SunDog: Frozen Legacy
SunDog was an elaborate game for the Apple II. A 1MHz
6502 and 48K of memory look more like a platform for simple
arcade games, not the space trading and exploration
extravaganza that was SunDog. And though assembly language
was the norm for circa-1984 commercial games, the
authors--Bruce Webster and Wayne Holder--chose to implement
the majority of the game in p-code interpreted Pascal. I
found a justification in an old email from Bruce:
Wayne and I had some long discussions about what to
use to write SunDog (which actually started out being
another game). We considered assembly, FORTH, and
Pascal; BASIC was far too slow and clumsy for what we
wanted to do. We ended up ruling out FORTH for issues
of maintenance (ours and lack of a commercial
vendor).
I pushed for--and we decided on--Apple Pascal for a
few different reasons, including the language itself;
the compactness of the p-code; and the automatic (but
configurable) memory management of the p-System, which
could swap "units" (read: modules) in and out. Pascal
made the large project easier, not harder, though it
was a struggle to keep the game within 48KB.
And that's how it should be: choose the language that
lets you implement your vision.
Accidentally Introducing Side Effects into Purely
Functional Code
It's easy to taint even purely functional languages by
reintroducing side-effects. Simply have each function take
an additional parameter representing the global state of
the world--a tree of key/value pairs, for example--and have
each function return a new state of the world. This is not
news. It's an intentionally pathological case, not
something I'd ever consider implementing.
What's more surprising is how easy it is to
accidentally introduce side-effects.
For the Purely Functional
Retrogames series, I wrote code that operated on a list
of game entities:
[A, B, C, D,...]
Each element was a self-contained unit of sorts: an ID,
x/y position, current state. Using this list of entities to
build a new version for the next game frame was a simple
map operation. The ID and state for each entity were used
to call the correct transformation function for that
entity.
Each of these transformations had three possible
outcomes: a new entity would be returned with a different
position and/or state, an entity could delete itself, or an
entity could create some new entities (think of dropping a
bomb or firing a shot). All three of these can be handled
by having each transformation function return a list.
For example, if the original list was:
[A, B, C, D]
and entity "B" deleted itself, and entity "C" created
four new entities in addition to a new version of itself,
then the returned values might look like this:
A => [A1]
B => []
C => [C1, New1, New2, New3, New4]
D => [D1]
and the new overall list of entities would be:
[A1, C1, New1, New2, New3, New4, D1]
Well, that's not quite right. It's actually a list of
lists:
[A1, [], [C1, New1, New2, New3, New4], [D1]]
and the individual lists need to be appended together to
give the proper result. The append operation creates a
brand new list, which means that the time and memory spent
creating the individual result lists were wasted. They were
just stepping stones to the real result. This almost
certainly isn't going to be a significant inefficiency, but
there's a pretty way around it: pass an accumulator list in
and out of each transformation function. Now the three
cases listed above neatly map to three operations:
1. To transform an entity into the next version of
itself, simply prepend the new entity to the accumulator
list.
2. To delete an entity, do nothing. Simply return the
accumulator.
3. To create new entities, prepend each one to the
accumulator.
No extra work is involved. We never build up temporary
lists and discard them immediately.
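In Erlang, the accumulator version is just a fold. A hedged sketch, where the entity tuples are invented and move/1 stands in for whatever the real position update is:
next_frame(Entities) ->
    lists:foldl(fun update_entity/2, [], Entities).

%% Delete: return the accumulator untouched.
update_entity({shot, _Pos, expired}, Acc) ->
    Acc;
%% Create: prepend the new entities along with this one's next version.
update_entity({ship, Pos, firing}, Acc) ->
    [{ship, Pos, idle}, {shot, Pos, moving} | Acc];
%% Transform: prepend the next version of the entity.
update_entity({Id, Pos, State}, Acc) ->
    [{Id, move(Pos), State} | Acc].

%% Placeholder for the real position update.
move({X, Y}) -> {X, Y + 1}.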
But this pretty little solution has one unintended flaw.
By passing in the accumulator list, we're giving full
access to previous computations to each of the entity
transformation functions. Even worse, each of these
functions can arbitrarily transform this list, not
only prepending values but also removing existing values or
changing the order of them. (No destructive updates need
occur, just the returning of a different list.) In theory
we could write code that uses the list to make decisions:
if the head of the accumulator is an entity of type "E,"
then spawn a new entity at the same position as E. Now the
entire process is order dependent...ugh.
In theory. The "flaw" here assumes that each function is
going to do more than either leave the accumulator
untouched or prepend values to it--that the programmer of a
function might intentionally go rogue and sabotage
the greater good. It could still open the door to
bugs: imagine if a dozen people were all writing these
transformation functions independently. Someone will make a
mistake at some point.
Either way, the same side effects possible in imperative
languages were accidentally introduced into pure
functions.
Revisiting "Purely Functional Retrogames"
I wrote the Purely Functional
Retrogames series as an experiment. There's been so
much empty talk about how functional languages are as good
as or better than imperative languages--yet very little to
back that up. Doubly so in regard to interactive
applications. I'd bet there are more people learning to
program the Atari 2600 hardware than writing games in
Haskell.
For people who only read the beginnings of articles, let
me say this up front: Regurgitating the opinion that
functional programming (or any technology) is superior does
absolutely nothing. That's a road of endless conjecture and
advocacy. One day you'll realize you've been advocating
something for ten years without having any substantial
experience in it. If you think a particular technology
shows promise, then get in there and figure it out.
The rest of this entry is about what I learned by
writing some retro-style games in Erlang.
The divide between imperative and functional
languages, in terms of thinking about and writing code, is
much smaller than I once thought it was. It is easy to
accidentally introduce side-effects
and sequencing problems into purely functional code. It is
easy to write spaghetti code, where the entanglement comes
from threading data through functions rather than
unstructured flow of control. There's mental effort
involved in avoiding these problems, just as there is when
programming in any language.
Not being able to re-use names for values in
functions sometimes led to awkward code. I could have
said "not being able to destructively update local
variables..." but that would show a lack of understanding
that local "variables" are just names for things, not
storage locations. For example, the initial position of an
object is passed in as "Pos." I use it to create a new
position, called "NewPos." Sometimes, because it was the
most straightforward approach, I'd end up with another
value called "NewPos2" (which I will agree is a horrible
name). Then I'd find myself referring to "NewPos" when I
meant "NewPos2." If the need arose to shuffle the logic
around a bit, then it took care to manually rename these
values without introducing errors. It would have been much
nicer to create the new position and say "I'm repurposing
the name 'Pos'" to refer to this new position.
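A contrived illustration of the treadmill, with add/2 and clip_to_screen/1 as hypothetical stand-ins for real logic:
add({X, Y}, {DX, DY}) -> {X + DX, Y + DY}.
clip_to_screen({X, Y}) -> {max(0, X), max(0, Y)}.

move(Pos, Velocity) ->
    NewPos = add(Pos, Velocity),
    NewPos2 = clip_to_screen(NewPos),
    NewPos2.
Every later edit has to keep NewPos and NewPos2 straight, and inserting a step in the middle means renaming everything below it.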
The lack of a simple dictionary type was the most
significant hindrance. This is where I found myself
longing for the basic features of Perl, Python,
Ruby--pretty much any common language. The ugliness of
Erlang records has been well-documented, but syntax alone
is not a reason to avoid them. The real problem is that
records are brittle, exactly like structs in C. It's
difficult to come up with a record that covers the needs of
a dozen game entities with varying behaviors. They all have
some common data, like an X/Y position, but some have
special flags, some use velocities and acceleration values
instead of simple motion, some need references to other
entities, and so on. I could either make a big generic
record that works for everything (and keep adding fields as
new cases arise) or switch to another data structure, such
as a property list, that doesn't allow pattern
matching.
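The "big generic record" route looks something like this (every field name here is invented), and it keeps growing as new entity types show up:
-record(entity, {id,
                 pos,
                 state,
                 flags = [],
                 velocity = {0, 0},
                 acceleration = {0, 0},
                 target}).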
The biggest wins came from symbolic programming and
consciously avoiding low-level efficiency. I spent a
lot of time focused on efficient representations of entity
states and other data. The more thought I put into this,
the more the code, as a whole, became awkward and
unreadable. Everything started to flow much more nicely
when I went down the road of comical inefficiency. Why use
a fixed-size record when I can use a tree? Why use bare
data types when I can tag them so the code that processes
them is easier to read? Why transform one value to another
when I can instead return a description of how to transform
the value?
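For instance, rather than computing a new position in place, an update function can return a tagged description of the change and let one central piece of code apply it later. A sketch, assuming entities are hypothetical {Id, {X, Y}} pairs:
%% Return a description of the change...
update_position(Entity) ->
    {move_by, Entity, {0, -1}}.

%% ...and apply it in one place, later.
apply_change({move_by, {Id, {X, Y}}, {DX, DY}}) ->
    {Id, {X + DX, Y + DY}}.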
Some years ago, John Carmack stated that a current PC
was powerful enough to run every arcade game he played when
he was growing up--some 300 or so games--at the same time.
Yes, there's a large execution time penalty simply for
choosing to write in Erlang instead of C, but it's nowhere
near a large enough factor to matter for most game logic, so use
that excess to write code that's clear and beautiful.
Puzzle Languages
I know I've covered this before. I am repeating myself.
But it was woven into various other topics, never stated
outright:
Some programming languages, especially among those
which haven't gained great popularity, are puzzles.
That's not to be confused with "programming in general
is a puzzle." There's always a certain amount of thought
that goes into understanding a problem and deciding upon an
approach to solving it. But if it takes focused thought to
phrase that solution as working code--if you go down one
path then back up, give up, then try something
completely different--then you're almost certainly using a
puzzle language.
These are puzzle languages:
Haskell
Erlang
Forth
J
And these are not:
Python
Ruby
Lua
C
In Forth, the puzzle is how to
simplify a problem so that it can be mapped cleanly to the
stack. In Haskell and Erlang, the puzzle is how to manage
with single assignment and without being able to reach up
and out of the current environment. In J the puzzle is how
to phrase code so that it operates on large chunks of data
at once.
Compare this to, say, Python. I can usually bang out a
solution to just about anything in Python. I update locals
and add globals and modify arrays and get working code.
Then I go back and clean it up and usually end up with
something simpler. In Erlang, as much as I want to deny it,
I usually pick a direction, then realize I'm digging myself
into a hole, so I scrap it and start over, and sometimes
when I end up with a working solution it feels too fragile,
something that wouldn't survive minor changes in the
problem description. (Clearly this doesn't apply to easy
algorithms or simple transformations of data.)
A critical element of puzzle languages is providing an
escape, a way to admit that the pretty solution is elusive,
and it's time to get working code regardless of aesthetics.
It's interesting that these escapes tend to have a stigma;
they induce a feeling of doing something wrong; they're
guaranteed to result in pedantic lecturing if mentioned in
a forum.
In Forth, an easy pressure valve when the stack gets too
busy is to use local variables. Local variables have been
historically deemed unclean by a large segment of the Forth
community (although it's amazing how easy some Forth
problems are if you use locals). There's a peculiar angst
involved in avoiding locals, even if they clearly make code
simpler. Locals aside, there's always the escape of using
some additional global variables instead of stack juggling,
which has a similarly bad reputation (even though everyone
still does it).
In Erlang, ETS tables and the process dictionary are two
obvious escapes. And as expected, any mention of the
process dictionary always includes the standard parental
warning about the dangers of playing darts or standing
there with the refrigerator door open. It is handy,
as shown by the standard library random number generator
(which stores a three element tuple under the name
random_seed
), and Wings3D (which uses the
process dictionary to keep track of GUI state).
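The escape itself is nothing more exotic than put and get in the current process (the key and value here are made up):
put(gui_state, {selected_objects, [1, 2, 3]}).
get(gui_state).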
A more interesting escape in Erlang is the process. A
process is commonly thought of as a mechanism of
concurrency, but that need not be the case. It's easy to
make an infinite loop by having a tail recursive function.
Parameters in such a loop can be--if you dig into the implementation a bit--directly
modified, providing a safe and interesting blurring of
functional and imperative code. Imagine taking such a
function and spawning it into its own process. Each process
captures a bit of relevant data in a small, endlessly
recursive loop. Imagine dozens or hundreds of these
processes, each spinning away, holding onto important state
data. Erlang string theory, if you will.
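A minimal sketch of such a process, with made-up names--its only job is to hold a running score:
score_loop(Score) ->
    receive
        {add, N} ->
            score_loop(Score + N);
        {report, From} ->
            From ! {score, Score},
            score_loop(Score)
    end.
Spawn it with spawn(fun() -> score_loop(0) end), and the current total lives in the loop parameter, nudged along by each tail call.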
I wouldn't want to break a program into hundreds of
processes simply to capture state, but usually there are
some important bits which are used and updated across a
project. Pulling these out of the purely functional world
can be enough of a relief from growing complexity that the
rest of the code can remain pure.
But there's still that stigma of doing something dirty.
Back before the Norton name became associated with
anti-virus products, when MS-DOS was ubiquitous, Peter
Norton authored the standard book on programming IBM PCs.
In a discussion of the MS-DOS interrupts for displaying
characters and moving the cursor, he strongly advised that
programmers not access video memory directly, but use the
provided services instead. (The theory being that the
MS-DOS interrupts would remain compatible on future
hardware.) Of course, almost none of the era's applications
and games would have been possible had developers taken Peter's
advice to heart. Learning to write directly to video memory
was practically a cottage industry until Windows 95 finally
ended the MS-DOS era.
Sometimes advice is too idealistic to follow.
How My Brain Kept Me from Co-Founding YouTube
Flickr blew my mind
when it appeared back in 2004.
I'd read all the articles about building web pages that
load quickly: crunching down the HTML, hand-tweaking GIFs,
clever reuse of images. I was immersed in the late 1990s
culture of website optimization. Then here comes a site
that is 100% based around viewing large quantities of
memory-hungry photos. And the size of photos was put
entirely in the users' hands: images could be
over-sharpened (which makes the JPEGs significantly larger)
or uploaded with minimal compression settings. Users could
click on the "show all sizes" button and view the full
glory of a 5MB photo. Just viewing a single 200K mid-sized
version would outweigh any attempts to mash down the
surrounding HTML many times over.
While still trying to figure out how the bandwidth bar
had suddenly jumped to an unfathomable height, along comes
this site that does
the same thing as Flickr...but with VIDEOS. Now you've got
people idly clicking around for an hour, streaming movies
the entire time, or people watching full thirty minute
episodes of sitcoms online. Not only was there no paranoia
about bandwidth, but the entire premise of the site was to
let people request vast and continual amounts of data. Such
an audacious idea was so far away from the technical
comfort zone I had constructed that I never would
have contemplated its potential existence.
I've learned my lesson. And yet I see people continually
make the same mistake in far more conservative ways:
"On a 64-bit machine, each element in an Erlang list
is sixteen bytes, which is completely
unacceptable."
"Smalltalk has a 64MB image file, which is
ridiculous. I'm not going to ship that to
customers."
"I would never use an IDE that required a 2GB
download!"
I see these as written declarations of someone's
arbitrary limitations and technical worries. Such
statements almost always have no bearing on reality and
what will be successful or not.
On Being Sufficiently Smart
I'm proud to have created the wiki page for the phrase
sufficiently
smart compiler back in 2003 or 2004. Not because it's a
particularly good page, mind you; it has been endlessly
rewritten in standard wiki fashion. It's one of the few
cases where I recognized a meme and documented it. I'd been
seeing the term over and over in various discussions, and
it started to dawn on me that it was more than just a term,
but a myth, a fictional entity used to support
arguments.
If you're not familiar, here's a classic context for
using "sufficiently smart compiler." Language X is much
slower than C, but that's because floating point values are
boxed and there's a garbage collection system. But...and
here it comes...given a sufficiently smart compiler
those values could be kept in registers and memory
allocation patterns could be analyzed and reduced to static
allocation. Of course that's quite a loaded phrase, right
up there with "left as an exercise for the reader."
One of the key problems with having a sufficiently smart
compiler is that not only does it have to be sufficiently
smart, it also has to be perfect.
Back in the mid-1990s my wife and I started an indie
game development company, and we needed some labels
printed. At the time, all we had was a middling inkjet
printer, so the camera ready image we gave to the print
shop was, to put it mildly, not of the necessary quality.
Then we got the printed labels back and they looked
fantastic. All the pixellated edges were smooth, and
we theorized that was because of how the ink flowed during
the printing process, but it didn't really matter. We had
our labels and we were happy.
A few months later we needed to print some different
labels, so we went through the same process, and the
results were terrible. Every little flaw, every
rough edge, every misplaced pixel, all perfectly reproduced
on a roll of 1000 product labels. Apparently what had
happened with the previous batch was that someone at the
print shop took pity upon our low-grade image and did a
quick graphic arts job, re-laying out the text using the
same font, then printing from that. The second time this
didn't happen; the inkjet original was used directly. The
problem wasn't that someone silently helped out, but that
there was no indication of what was going on, and that the
help wouldn't be there every time.
Let's say that a compiler can detect O(N^2) algorithms
and replace them with O(N) equivalents. This is a classic
example of being sufficiently smart. You can write code
knowing that the compiler will transform and fix it for
you. But what if the compiler isn't perfect (and it clearly
won't be, as there aren't O(N) versions of all algorithms)? It
will fix some parts of your code and leave others as-is.
Now you run your program, and it's slow, but why? You need
insight into what's going on behind the scenes to figure
that out, and if you find the problem then you'll have to
manually recode that section to use a linear approach.
Wouldn't it be more transparent to simply use linear
algorithms where possible in the first place, rather than
having to second guess the system?
There's another option, and that's to have the compiler
give concrete information about behind the scenes
transformations. I have a good mental picture of how Erlang
works, in terms of the compiler and run-time. It's usually
straightforward to understand what kind of BEAM code will
be generated from particular source. That was true until
fancy optimizations on binary operations were introduced in
2008. The documentation uses low-level concepts like "match
context" and discusses when segmented binaries are copied
and so on. It's all abstract and difficult to grasp, and
that's why there's a new compiler switch, "bin_opt_info,"
to provide a window into what kind of code is being
generated. Going back to my early programming days, the
manual for Turbo Pascal 4 listed exactly what optimizations
were performed by the compiler.
The Glasgow Haskell Compiler (GHC) is the closest I've
seen to a sufficiently smart compiler, with the advantages
and drawbacks that come with such a designation.
I can write code that looks like it generates all kinds
of intermediate lists--and indeed such would be the case
with similar code in Erlang--and yet the compiler is
sufficiently smart to usually remove all of that. Even in
the cases where that isn't possible, it's not a make or
break issue. In the worst case the Haskell code works like
the Erlang version.
But then there's laziness. Laziness is such an
intriguing idea: an operation can "complete" immediately,
because the actual result isn't computed until there's
specific demand for it, which might be very soon or it
might be in some other computation that happens much later.
Now suppose you've got two very memory intensive algorithms
in your code, and each independently pushes the limits of
available RAM. The question is, can you guarantee that
the first algorithm won't be lazily delayed until it is forced
to run right in the middle of the second algorithm,
completely blowing the memory limit?
The GHC developers know that laziness can be expensive
(or at least unnecessary in many cases), so strictness
analysis is done to try to convert lazy code to non-lazy
code. If and when that's successful, wonderful! Maybe some
programs that would have previously blown up now won't. But
this only works in some cases, so as a Haskell coder you've
got to worry about the cases where it doesn't happen. As
much as I admire the Haskell language and the GHC
implementation, I find it difficult to form a solid mental
model of how Haskell code is executed, partially because
that model can change drastically depending on what the
compiler does. And that's the price of being sufficiently
smart.
(Also see Digging Deeper into
Sufficiently Smartness.)
Let's Take a Trivial Problem and Make it Hard
Here's a simple problem: Given a block of binary data,
count the frequency of the bytes within it. In C, this
could be a homework assignment for an introductory class.
Just zero out an array of 256 elements, then for each byte
increment the appropriate array index. Easy.
Now write this in a purely functional way, with an
efficiency close to that of the C implementation.
It's easy to do a straightforward translation to Erlang,
using tail recursion instead of a for
loop,
like this:
freq(B) when is_binary(B) ->
freq(B, erlang:make_tuple(256, 0)).
freq(<<X, Rest/binary>>, Totals) ->
I = X + 1,
N = element(I, Totals),
freq(Rest, setelement(I, Totals, N + 1));
freq(<<>>, Totals) ->
Totals.
But of course in the name of purity and simplicity,
setelement
copies the entire
Totals
tuple, so if there are fifty million
bytes, then the 256 element Totals
is copied
50 million times. It's simple, but it's not the right
approach.
"Blame the complier" is another easy option. If it could
be determined that the Totals
tuple can be
destructively updated, then we're good. Note that the
garbage collector in the Erlang runtime is based on the
assumption that pointers in the heap always point toward
older data, an assumption that could break if a tuple was
destructively updated with, say, a list value. So not only
would the compiler have to deduce that the tuple is
only used locally, but it would also have to verify that
only non-pointer values (like integers and atoms) were
being passed in as the third parameter of
setelement
. This is all possible, but it
doesn't currently work that way, so this line of reasoning
is a dead end for now.
Totals
could be switched from a tuple to a
tree, which might or might not be better than the
setelement
code, but there's no way it's in
the same ballpark as the C version.
What about a different algorithm? Sort the block of
bytes, then count runs of identical values. Again, just the
suggestion of sorting means we're already off track.
Honestly, I don't know the right answer. In Erlang, I'd
go for one of the imperative efficiency hacks, like ets
tables, but let's back up a bit. The key issue here is that
there are some fundamental assumptions about what "purely
functional" means and the expected features in functional
languages.
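For what it's worth, one of those efficiency hacks might look like this--an ets table of 256 counters, destructively bumped with ets:update_counter (freq_ets and count are names I just made up, and I'm not claiming this is the best approach):
freq_ets(Bin) when is_binary(Bin) ->
    T = ets:new(freq, []),
    %% One counter per possible byte value.
    lists:foreach(fun(Byte) -> ets:insert(T, {Byte, 0}) end, lists:seq(0, 255)),
    count(Bin, T),
    Totals = lists:sort(ets:tab2list(T)),
    ets:delete(T),
    Totals.

count(<<X, Rest/binary>>, T) ->
    ets:update_counter(T, X, 1),
    count(Rest, T);
count(<<>>, _T) ->
    ok.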
In array languages, like J, this type of problem is
less awkward, as it's closer to what they were designed
for. If nothing else, reference counted arrays make it
easier to tell when destructive updates are safe. And
there's usually some kind of classification operator, one
that would group the bytes by value for easy counting.
That's still not going to be as efficient as C, but it's
clearly higher-level than the literal Erlang
translation.
A more basic question is this: "Is destructively
updating a local array a violation of purely
functionalness?" OCaml allows destructive array updates and
C-like control structures. If a local array is updated
inside of an OCaml function, and the result is copied to an
immutable array at the end, is there really anything
wrong with that? It's not the same as randomly sticking
your finger inside a global array somewhere, causing a
week's worth of debugging. In fact, it looks exactly the
same as the purely functional version from the caller's
point of view.
Perhaps the sweeping negativity about destructive
updates is misplaced.
Digging Deeper into Sufficiently Smartness
(If you haven't read On Being
Sufficiently Smart, go ahead and do so, otherwise this
short note won't have any context.)
I frequently write Erlang code that builds a list which
ends up backward, so I call lists:reverse
at
the very end to flip it around. This is a common idiom in
functional languages.
lists:reverse
is a built-in function in
Erlang, meaning it's implemented in C, but for the sake of
argument let's say that it's written in Erlang instead.
This is super easy, so why not?
reverse(L) -> reverse(L, []).
reverse([H|T], Acc) ->
reverse(T, [H|Acc]);
reverse([], Acc) ->
Acc.
Now suppose there's another function that uses
reverse
at the very end, just before
returning:
collect_digits(L) -> collect_digits(L, []).
collect_digits([H|T], Acc) when H >= $0, H =< $9 ->
collect_digits(T, [H|Acc]);
collect_digits(_, Acc) ->
reverse(Acc).
This function returns a list of ASCII digits that prefix
a list, so collect_digits("1234.0")
returns
"1234"
. And now one more "suppose": suppose
that one time we decide that we really need to process the
result of collect_digits
backward, so we do
this:
reverse(collect_digits(List))
The question is, can the compiler detect that there's a
double reverse? In theory, the last reverse
could be dropped from collect_digits
in the
generated code, and each call to
collect_digits
could be automatically wrapped
in a call to reverse
. If there ends up being
two calls to reverse
, then get rid of both of
them, because it's just wasted effort to double-reverse a
list.
With lists:reverse
as a built-in, this is
easy enough. But can it be deduced simply from the raw
source code that reverse(reverse(List))
can be
replaced with List
? Is that effort easier than
simply special-casing the list reversal function?
How to Crash Erlang
Now that's a loaded title, and I know some people will
immediately see it as a personal slam on Erlang or
ammunition for berating the language in various forums. I
mean neither of these. Crashing a particular language, even
so-called safe interpreted implementations, is not
particularly challenging. Running out of memory or stack
space are two easy options that work for most languages.
There are pathological cases for regular expressions that
may not truly crash, but result in such an extended period
of unresponsiveness on large data sets that the difference
is moot. In any language that allows directly linking to
arbitrary operating system functions...well, that's just
too easy.
Erlang, offering more complex features than many
languages, has some particularly interesting edge
cases.
Run out of atoms. Atoms in Erlang are analogous
to symbols in Lisp--that is, symbolic, non-string
identifiers that make code more readable, like
green
or unknown_value
--with one
exception. Atoms in Erlang are not garbage collected. Once
an atom has been created, it lives as long as the Erlang
node is running. An easy way to crash the Erlang virtual
machine is to loop from 1 to some large number, calling
integer_to_list
and then
list_to_atom
on the current loop index. The
atom table will fill up with unused entries, eventually
bringing the runtime system to a halt.
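The loop itself is a one-liner (please don't run it on a node you care about); the atom table holds roughly a million entries by default, so it doesn't take long:
[list_to_atom(integer_to_list(N)) || N <- lists:seq(1, 100000000)].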
Why is this allowed? Because garbage collecting atoms
would involve a pass over all data in all processes,
something the garbage collector was
specifically designed to avoid. And in practice,
running out of atoms will only happen if you write code
that's generating new atoms on the fly.
Run out of processes. Or similarly, "run out of
memory because you've spawned so many processes." While the
sequential core of Erlang leans toward being purely
functional, the concurrent side is decidedly imperative. If
you spawn a non-terminating, unlinked process, and manage
to lose the process id for it, then it will just sit there,
waiting forever. You've got a process leak.
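In its simplest form (never_sent is just an atom that no one will ever send):
spawn(fun() -> receive never_sent -> ok end end).
The pid is thrown away, nothing will ever match in the receive, and the process sits there until the node shuts down.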
Flood the mailbox for a process. This is
something that most new Erlang programmers do sooner or
later. One process sends messages to another process
without waiting for a reply, and a missing or incorrect
pattern in the receive
statement causes the
receiver to ignore all messages...so they keep piling up
until the mailbox fills all available memory, and that's
that. Another reminder that concurrency in Erlang is
imperative.
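A sketch of the classic mistake, with handle/1 standing in for real work: the receive only matches {data, ...}, so any other message sent to this process piles up in its mailbox forever.
loop() ->
    receive
        {data, Value} ->
            handle(Value),
            loop()
    end.

handle(Value) ->
    io:format("got ~p~n", [Value]).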
Create too many large binaries in a single
process. Large binaries--greater than 64 bytes--are
allocated outside of the per-process heap and are reference
counted. The catch is that the reference count indicates
how many processes have access to the binary, not
how many different pointers there are to it within a
process. That makes the runtime system simpler, but it's
not bulletproof. When garbage collection occurs for a
process, unreferenced binaries are deleted, but that's only
when garbage collection occurs. It's possible to create a
large process with a slowly growing heap, and create so
much binary garbage that the system runs out of memory
before garbage collection occurs. Unlikely, yes, but
possible.
Want People to Use Your Language Under Windows? Do
This.
Whenever I hear about a new programming language or new
implementation of an existing language, I usually find
myself trying it out. There's a steep cost--in terms of
time and effort--in deciding to use a new language for more
than just tinkering, so I'm not going to suffer through
blatant problems, and I admit to being sensitive to
interface issues. Nothing makes me lose interest faster than
downloading and installing a new language system,
double-clicking the main icon...
...and being presented with an ugly little 80x24
character Microsoft command window using some awkward 1970s
font.
(Now before the general outcry begins, be aware that I'm
a regular command line user. I've used various versions of
JPSoft's enhanced command
processors for Windows for close to 20 years now, and I
usually have multiple terminal windows open when using my
MacBook.)
The poor experience of this standard command window is
hard to overstate. It always starts at a grid of 80x24
characters, even on the highest-resolution displays.
Sometimes even the basic help message of an interpreter is
wider than eighty characters, causing the text to wrap in
the middle of a word. The default font is almost always in
a tiny point size. Cut and paste don't work by default, and
even when enabled they don't follow standard Windows
shortcuts. And, rather oddly, only a small subset of fonts
actually work in this window.
It's possible to do some customization of the command
window--change the font, change the font size, change the
number of rows and columns of text--and these will take it
from completely unacceptable to something that might pass
for 1980s nostalgia. But that's a step I have to take
manually. That initial double-click on the icon still
brings up everything in the raw.
Basic aesthetics aside, the rudimentary features of a
monochrome text window limit the opportunities for
usability improvements. I'm always surprised at how many
Windows ports of languages don't even let me access
previously entered commands (e.g., using up-arrow or
alt-P). Or how about using colors or fonts to differentiate
between input and output, so I can more easily scan through
the session history?
If you want me to use your language--and if you care
about supporting Windows at all--then provide a way of
interacting with the language using a real Windows
application. Don't fall back on cmd.exe.
Some languages are brilliant in this regard. Python has
the nice little IDLE window. Factor and PLT Scheme have
gone all-out with aesthetically-pleasing and usable
environments. Erlang and REBOL aren't up to the level of
any of these (Erlang doesn't even remember the window size
between runs), but they still provide custom Windows
applications for user interaction.
A Personal History of Compilation Speed, Part 1
The first compiled language I used was the Assembler
Editor cartridge for the Atari 8-bit computers. Really,
it had the awful name "Assembler Editor." I expect some
pedantic folks want to interject that an assembler is not a
compiler. At one time I would have made that argument
myself. But there was a very clear divide between editing
6502 code and running it, a divide that took time to cross,
when the textual source was converted into machine-runnable
form. Contrast that to Atari BASIC, the only language I
knew previously, which didn't feature a human-initiated
conversion step and the inevitable time it took.
Conceptually, the Assembler Editor was a clever design.
Source code was entered line by line, even using line
numbers, just like BASIC. The assembler could compile the
source straight from memory and create object code in
memory, with no disk access to speak of. The debugger was
right there, too, resident in memory, setting the stage for
what looked like an efficient and tightly integrated
development system.
Except for whatever reason, the assembler was
impressively slow, and it got disproportionately slower as
program size increased. A linear look-up in the symbol
table? Some kind of N-squared algorithm buried in there?
Who knows, but I remember waiting over seven minutes for a
few hundred lines of code to assemble. Atari knew this was
a problem, because there was a note in the manual about it
only being suitable for small projects. They offered the
friendly advice of purchasing a more expensive product, the
Atari Macro Assembler (which was a standalone assembler,
not an integrated environment).
Instead I upgraded to MAC/65, a third
party alternative that followed the formula set by the
Assembler Editor: cartridge-based for fast booting,
BASIC-like editor and assembler and debugger all loaded
into memory at once. MAC/65 was popular among assembly coders primarily because of its reputation for quick assembly times. And quick it was.
Almost certainly the slowness of the Assembler Editor
was because of a bad design decision, one not present in
MAC/65. But MAC/65 went one step further: source code was
parsed and tokenized after each line was entered. For
example, take this simple statement:
LDA #19 ; draw all bonus items
It takes a good amount of work, especially on a sub-2MHz
processor, to pick that apart. "LDA" needs to be scanned
and looked-up somewhere. "19" needs to be converted to
binary. The MAC/65 approach was to do much of this at
edit-time, storing the tokenized representation in memory
instead of the raw text.
In the above example, the tokenized version could be
reduced to a byte indicating "load accumulator immediate,"
plus the binary value 19 (stored as a byte, not as two
ASCII characters), and then a token indicating the rest of
the line was a comment and could be ignored at assembly
time. When the user viewed the source code, it had to be
converted from the tokenized form back into text. This had
the side-effect of enforcing a single standard for
indentation style, whether or not there was a space after
the comment semicolon, and so on.
When my Atari 8-bit days ended, and I moved to newer
systems, I noticed two definite paths in assembler design.
There were the traditional, lumbering assemblers that ran
as standalone applications, which almost always required a
final linking step. These were usually slow and awkward,
seemingly designed as back-ends to high-level language
compilers, not meant to be used directly by programmers.
And then there were the lightning-fast assemblers, often
integrated with editors and debuggers in the tradition of
the Assembler Editor and MAC/65. For dedicated assembly
programmers during the Amiga and Atari ST years, those were
clearly the way to go.
By that time, except when there was no alternative, I
was using compilers for higher-level languages. And I was
wondering if the "slow, lumbering" and "lightning fast"
split applied to those development systems as well.
Part 2
The Pure Tech Side is the Dark Side
When I was writing 8-bit games, I was thrilled to
receive each issue of the home computer magazines I
subscribed to (especially this
one). I spent my time designing games in my head and
learning how to make the hardware turn them into reality.
Then each month these magazines arrived, filled with
tutorials and ideas and, most importantly, full source code
for working games. Sure, most of the games were simple, but
I pored over the code line by line--especially the assembly
language listings--and that was much of my early
programming education. Just seeing games designed by other
people was inspiring in a way that's difficult to get
across.
Years later, with those 8-bit days behind me, I would
regularly pick up Dr.
Dobb's Journal at the local B. Dalton bookstore (now
part of Barnes and Noble). Reading it was mildly
interesting, but I didn't get much from it. Eventually I
realized it was because I wasn't immersed in the subject
matter. My PC programming projects were spotty at best, so
I read the articles but there wasn't any kind of active
learning going on. And there was an overall dryness to it.
It wasn't about creativity and wonder, it was about
programming.
Those two realizations do a good job of summarizing my
opinions about most online discussions and forums.
The ideal forum is when a bunch of people who are
individually working away on their own personal
projects--whether songwriting or photography or any other
endeavor--get together to share knowledge. Each participant
has a vested interest, because he or she needs to deliver
results first, and is discussing it with others only
second. It's easy to tell when people in online discussions
aren't results-oriented. There's discussion about minute
differences between brands and there's an obsession with
having the latest and greatest model. Feels like a lot of
talking and expounding of personal theories, but not much
doing.
And then there's the creative angle. Raw discussions
about programming languages or camera models or upcoming
CPUs...they don't do anything for me. There's a difference
between making a goal of having the newest, most powerful
MacBook Pro, and someone who has pushed their existing
notebook computer to the limits while mixing 48 tracks of
stereo audio and could really use some of the improvements
in the latest hardware.
The pure tech side is the dark side, at least for
me.
A Personal History of Compilation Speed, Part 2
(Read Part 1 if you missed
it.)
My experience with IBM Pascal, on an original model
dual-floppy IBM PC, went like this:
I wrote a small "Hello World!" type of program, saved
it, and fired up the compiler. It churned away for a bit,
writing out some intermediate files, then paused and asked
for the disc containing Pass 2. More huffing and puffing,
and I swapped back the previous disc and ran the linker.
Quite often the compiler halted with "Out of Memory!" at
some point during this endeavor.
Now this would have been a smoother process with more
memory and a hard drive, but I came to recognize that a
compiler was a Very Important Program, and the authors
clearly knew it. Did it matter if it took minutes to
convert a simple program to a machine language executable?
Just that it could be done at all was impressive
indeed.
I didn't know it at the time, but there was a standard
structure for compilers that had built-up over the years,
one that wasn't designed with compilation speed as a
priority. Often each pass was a separate program, so they
didn't all have to be loaded into memory at the same time.
And those seemingly artificial divisions discussed in
compiler textbooks really were separate passes: lexical
analysis, parsing, manipulation of an abstract intermediate
language, conversion to a lower-level level intermediate
language, peephole optimization, generation of assembly
code. Even that last step could be literal, writing out
assembly language source code to be converted to machine
language by a separate tool. And linking, there's always
linking.
This was all before I discovered Turbo
Pascal.
On one of those cheap, floppy-only, 8088 PC clones from
the late 1980s, the compilation speed of Turbo Pascal was
already below the "it hardly matters" threshold.
Incremental builds were in the second or two range. Full
rebuilds were about as fast as saying the name of each file
in the project aloud. And zero link time. Again, this was
on an 8MHz 8088. By the mid-1990s, Borland was citing build
times of hundreds of thousands of lines of source per
minute.
The last time I remember seeing this in an ad, after
Turbo Pascal had become part of Delphi, the number was
homing in on a million lines per minute. Projects were
compiled before your finger was off of the build key. It
was often impossible to tell the difference between a full
rebuild of the entire project and compiling a single file.
Compilation speed was effectively zero.
Borland's other languages with "Turbo" in the name--like
Turbo C--weren't even remotely close to the compilation
speeds of Turbo Pascal. Even Turbo Assembler was slower,
thanks in part to the usual step of having to run a linker.
So what made Turbo Pascal so fast?
Real modules. A large percentage of time in C
compilers is spent reading and parsing header files. Even a
short school assignment may pull in tens of thousands of
lines of headers. That's why most C compilers support
precompiled headers, though they're often touchy and take
effort to set up. Turbo Pascal put all the information
about exported functions and variables and constants into a
compiled module, so it could be quickly loaded, with no
character-by-character parsing needed.
Integrated build system. The standard makefile
system goes like this: first the "make" executable loads,
then it reads and parses a file of rules, then for each
source file that is out of date, the compiler is started
up. That's not a trivial effort, firing up a huge
multi-megabyte executable just to compile one file. The
Turbo Pascal system was much simpler: look at the list of
module dependencies for the current module; if they're all
up to date, compile and exit; if not, then recursively apply
this process to each dependent module. An entire project
could be built from scratch without running any external
programs.
Minimal linker. Have you ever looked at the specs
for an object file format? "Complicated" and "bulky" are
two terms that come to mind. Turbo Pascal used a custom
object file with a minimal design. The "linker" wasn't
doing anywhere near the work of standard linkers. The
result was that the link step was invisible; you didn't
even notice it.
Single pass compiler with combined parsing and code
generation. No separate lexer, no separate parser, no
abstract syntax tree. All of these were integrated into a
single step, made possible by the straightforward syntax of
Pascal (and by not having a preprocessor with macros). If
you're curious, you can read more about the technique.
Yes, there was a drawback to instantaneous compile
times. Fewer optimizations were done, and almost always the
resultant code was slower than the C equivalent. But it
didn't matter. Removing the gap between the steps of
writing and running code was worth more than some amount of
additional runtime performance. I used to hit the build key
every so often, even while typing, just to check for syntax
errors. And zero compilation speed eventually became
standard, with the rise of interpreted languages like Perl,
Ruby, and Python.
The World's Most Mind-Bending Language Has the Best
Development Environment
I highly recommend that all programmers learn J. I doubt most will end up
using it for daily work, but the process of learning it
will stick with you. J is so completely different from
everything else out there, and all your knowledge of C++
and Python and Scheme goes right out the window, leaving
you an abject, confused beginner. In short, J will make you
cry.
But that's not what I want to talk about. Though it's a
bizarre and fringe language (yet not one of those
programmer attempts
at high humor), J is the most beautiful and useful
development system I've come across. I'm not talking about
the language itself, but the standard environment and
add-ons that come with the language.
The IDE is more akin to Python's IDLE than
monstrosities which
may come to mind. There's a window for entering commands
and seeing the results, and you can open up separate,
syntax-colored editor windows, running the contents of each
with a keypress. It's nothing groundbreaking, but it's
something that most languages don't
provide. And in the spirit of IDLE, J's IDE is written in
J.
(I'll interject that J is cross-platform for Windows, OS
X, and Linux, including 64-bit support, just in case anyone
is preparing to deride it as Windows-only.)
Then there are the standard libraries: 3D graphics via
OpenGL; full GUI support including an interface builder;
memory-mapped files; performance profiling tools; a full
interface to arbitrary DLLs; regular-expressions; sockets.
Again, nothing tremendously unusual, except maybe
memory-mapped files and the DLL hooks, but having it all
right there and well-documented is a big win. Beginner
questions like "What windowing library should I use?" just
don't get asked.
The first really interesting improvements over most
languages are the visualization tools. It's one line of
code to graph arbitrary data. Think about that: no need to
use a graphing calculator, no need to export to some
separate tool, and most importantly the presence of such
easy graphing ability means that you will use it.
Once you get started running all kinds of data through
visualization tools, you'll find you use them to spot-check
for errors or to get a better understanding of what kinds
of input you're dealing with. It goes further than just 2D
graphs. For example, there's a nifty tool that color codes
elements of a table, where identical elements have the same
colors. It makes patterns obvious. (Color code a matrix,
and you can easily tell if all the elements on a diagonal
are the same.)
What makes me happiest is the built-in tutorial system,
called "Labs" in J lingo. It's a mix of explanatory text,
expressions which are automatically evaluated so you can
see the results, and pauses after each small bit of
exposition so you can experiment in the live J environment.
Labs can be broken into chapters (so you can work through
them in parts), and the tool for creating your own labs is
part of the standard J download.
While many of the supplied labs are along the lines of
"How to use sockets," the best ones aren't about J at all.
They're about geometry or statistics or image processing,
and you end up learning J while exploring those topics. J
co-creator Ken
Iverson's labs are the most striking, because they
forgo the usual pedantic nature of language tutorials and
come across as downright casual. Every "Learn Haskell"
tutorial I've read wallows in type systems and currying and
all the trappings of the language itself. And after a while
it all gets to be too much, and I lose interest. Iverson
just goes along talking about some interesting number
theory, tosses out some short executable expressions to
illustrate his points, and drops in a key bit of J
terminology almost as an afterthought.
If you're wondering why I love the J environment so much
but don't use it as my primary programming language, that's
because, to me, J isn't suited for most projects I'm
interested in. But for exploration and learning there's no
finer system.
(If you want to see some real J code, try Functional Programming Archaeology.)
Micro-Build Systems and the Death of a Prominent
DSL
Normally I don't think about how to rebuild an Erlang
project. I just compile a file after editing it--via the
c(Filename)
shell command--and that's that.
With hot code loading there's no need for a linking step.
Occasionally, such as after upgrading to a new Erlang
version, I do this:
erlc *.erl
which compiles all the .erl
files in the
current directory.
But wait a minute. What about checking if the
corresponding .beam
file has a more recent
date than the source and skipping the compilation step for
that file? Surely that's going to be a performance win?
Here's the result of fully compiling a mid-sized project
consisting of fifteen files:
$ time erlc *.erl
real 0m1.912s
user 0m0.945s
sys 0m0.108s
That's less than two seconds to rebuild
everything. (Immediately rebuilding again takes less than
one second, showing that disk I/O is a major factor.)
Performance is clearly not an issue. Not yet anyway.
Mid-sized projects have a way of growing into large-sized
projects, and those 15 files could one day be 50.
Hmmm...linearly interpolating based on the current project
size still gives a time of under six-and-a-half seconds, so
no need to panic. But projects get more complex in other
ways: custom tools written in different languages,
dynamically loaded drivers, data files that need to be
preprocessed, Erlang modules generated from data, source
code in multiple directories.
A good start is to move the basic compilation step into
pure Erlang:
erlang_files() -> [
"util.erl",
"http.erl",
"sandwich.erl",
"optimizer.erl"
].
build() ->
c:lc(erlang_files()).
where c:lc()
is the Erlang shell function
for compiling a list of files.
If you stop and think, this first step is actually a
huge step. We've now got a symbolic representation of the
project in a form that can be manipulated by Erlang code.
erlang_files()
could be replaced by searching
through the current directory for all files with an
.erl
extension. We could even do things like
skip all files with _old
preceding the
extension, such as util_old.erl. And all of this is trivially, almost mindlessly, easy.
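Something like that might look like this; it's only a sketch, leaning on filelib:wildcard and lists:suffix from the standard library:

%% Compile every .erl file in the current directory, skipping
%% anything named like util_old.erl.
erlang_files() ->
    [F || F <- filelib:wildcard("*.erl"),
          not lists:suffix("_old.erl", F)].

build() ->
    c:lc(erlang_files()).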
There's a handful of things that traditional build
systems do. They call shell commands. They manipulate
filenames. They compare dates. The fancy ones go through
source files and look for included files. These things are
a small subset of what you can do in Perl, Ruby, Python, or
Erlang. So why not do them in Perl, Ruby, Python, or
Erlang?
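Comparing dates, for instance, is only a couple of lines. Here's a sketch; filelib:last_modified returns 0 for a file that doesn't exist, which conveniently means a missing .beam always counts as out of date:

%% True if the source is newer than its .beam, or the .beam is
%% missing entirely.
needs_rebuild(Src) ->
    Beam = filename:rootname(Src, ".erl") ++ ".beam",
    filelib:last_modified(Src) > filelib:last_modified(Beam).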
I'm pretty sure there's a standard old build system to
do this kind of thing, but in a clunky way where you have
to be careful whether you use spaces or tabs, remember
arcane bits of syntax, remember what rules and macros are
built-in, remember tricks involved in building nested
projects, remember the differences between the versions
that have gone down their own evolutionary paths. I use it
rarely enough that I forget all of these details. There are
modern variants, too, that trade all of that 1970s-era
fiddling for different lists of things to remember. But
there's no need.
It's easier and faster to put together custom,
micro-build systems in the high-level language of your
choice.
Tales of a Former Disassembly Addict
Like many people who learned to program on home
computers in the 1980s, I started with interpreted BASIC
and moved on to assembly language. I've seen several
comments over the years--including one from Alan Kay--less
than thrilled with the 8-bit department store computer era,
viewing it as a rewind to a more primitive time in
programming. That's hard to argue against, as the decade
prior to the appearance of the Apple II and Atari 800 had
resulted in Modula-2, Smalltalk, Icon, Prolog, Scheme, and
some top notch optimizing compilers for Pascal and C. Yet
an entire generation happily ignored all of that and became
bit-bumming hackers, writing thousands of games and
applications directly at the 6502 and Z80 machine level
with minimal operating system services.
There wasn't much of a choice.
In Atari BASIC, this statement:
PRINT SIN(1)
was so slow to execute that you could literally say "dum
de dum" between pressing the enter key and seeing the
result. Assembly language was the only real option if you
wanted to get anywhere near what the hardware was capable
of. That there was some amazing Prolog system on a giant
VAX did nothing to change this. And those folks who had
access to that system weren't able to develop fast
graphical games that sold like crazy at the software store
at the mall.
I came out of that era being very sensitive to what good
low-level code looked like, and it was frustrating.
I'd routinely look at the disassembled output of Pascal
and C compilers and throw up my hands. It was often as if
the code was some contrived example in Zen of Assembly Language, just to show how
much opportunity there was for optimization. I'd see
pointless memory accesses, places where comparisons could
be removed, a half dozen lines of function entry/exit code
that wasn't needed.
And it's often still like that, even though the party
line is that compilers can out-code most humans. Now I'm
not arguing against the overall impressiveness of compiler
technology; I remember trying to hand-optimize some SH4
code and I lost to the C compiler every time (my code was
shorter, but not faster). But it's still common to see
compilers where this results in unnecessarily bulky
code:
*p++ = 10;
*p++ = 20;
*p++ = 30;
*p++ = 40;
while this version ends up much cleaner:
p[0] = 10;
p[1] = 20;
p[2] = 30;
p[3] = 40;
p += 4;
I noticed under OS X a few years ago--and this may still be the case with the Snow Leopard C compiler--that every access to a global variable resulted in two fetches from memory: one to get the address of the variable, one to get the actual value.
Don't even get me started about C++ compilers. Take some
simple-looking code involving objects and overloaded
operators, and I can guarantee that the generated code will
be filled with instructions to copy temporary objects all
over the place. It's not at all surprising if a couple of
simple lines of source turn into fifty or a hundred of
assembly language. In fact, generated code can be so
ridiculous and verbose that I finally came up with an
across-the-board solution which works for all compilers on
all systems:
I don't look at the disassembled output.
If you've read just a couple of entries in this blog,
you know that I use Erlang for most of my personal
programming. As a mostly-interpreted language that doesn't
allow data structures to be destructively modified, it's no
surprise to see Erlang in the bottom half of any
computationally intensive benchmark. Yet I find it keeps me
thinking at the right level. The goal isn't to send as much
data as possible through a finely optimized function, but
to figure out how to have less data and do less processing
on it.
In the mid-1990s I wrote a 2D game and an enhanced
version of the same game. The original had occasional--and
noticeable--dips in frame rate on low-end hardware, even
though I had optimized the sprite drawing routines to
extreme levels. The enhanced version didn't have the same
problem, even though the sprite code was the same. The
difference? The original just threw dozens and dozens of
simple-minded attackers at the player. The enhanced version
had a wider variety of enemy behavior, so the game could be
just as challenging with fewer attackers. Or more
succinctly: it was drawing fewer sprites.
I still see people obsessed with picking a programming
language that's at the top of the benchmarks, and they
obsess over the timing results the way I used to obsess
over disassembled listings. It's a dodge, a
distraction...and it's irrelevant.
How Did Things Ever Get This Good?
It's an oft-repeated saying in photography that the
camera doesn't matter. All that fancy equipment is a waste
of money, and good shots are from inspired photographers
with well-trained eyes.
Of course no one actually believes that.
Clearly some photos are just too good to be taken with
some $200 camera from Target, and there must be a reason
that pros can buy two-thousand dollar lenses and
three-thousand dollar camera bodies. The "camera doesn't
matter" folklore is all touchy-feely and inspirational and
rolls off the tongue easily enough, and then everyone runs
back to their Nikon rumor sites and over-analyzes the
differences between various models, and thinks about how
much better photos will turn out after the next hardware
refresh cycle.
But the original saying is actually correct. It's just
hard to accept, because it's fun to compare and lust after
all the toys available to the modern photographer. I've
finally realized that some of those photos that once made
me say "Wow, I wish I had a camera like that!" might look
casual, but often involve elaborate lighting set-ups. If
you could pull back and see more than just the framed shot,
there would be a light box over here, and a flash bounced
off of a big white sheet over there, and so on. Yes,
there's a lot of work involved, but the camera is
incorrectly assumed to be doing more than it really is. In
fact it's difficult to find a truly bad camera.
What, if anything, does this have to do with
programming?
Life is good if you have applications or tools or games
that you want to write. Even a language like Ruby, which
tends to hang near the bottom of any performance-oriented
benchmark, is thousands of times faster than BASICs that
people were learning to program 8-bit home computers with
in the 1980s. That's not an exaggeration, I do mean
thousands.
The world is brimming with excellent programming
languages: Python, Clojure, Scala, Perl, Javascript, OCaml,
Haskell, Erlang, Lua. Most slams against individual
languages are meaningless in the overall scheme of things.
If you like Lisp, go for it. There's no reason you can't
use it to do what you want to do. String handling is poor
in Erlang? Compared to what? Who cares, it's so much easier
to use than anything I was programming with twenty years
ago that it's not worth discussing. Perl is ugly? It
doesn't matter to me; it's fun to program in.
Far, far, too much time has been spent debating the
merits of various programming languages. Until one comes
along that truly gives me a full magnitude increase in
productivity over everything else, I'm good.
Slow Languages Battle Across Time
In my previous optimistic outburst
I asserted that "Even a language like Ruby, which tends to
hang near the bottom of any performance-oriented benchmark,
is thousands of times faster than BASICs that people were
learning to program 8-bit home computers with in the
1980s." That was based on some timings I did five years
ago, so I decided to revisit them.
The benchmark I used is the old and
not-very-good-as-a-benchmark Sieve of Eratosthenes, because
that's the only benchmark that I have numbers for in
Atari
BASIC on original 8-bit computer hardware. Rather than
using Ruby as the modern-day language, I'm using Python,
simply because I already have it installed. It's a fair
swap, as Python doesn't have a reputation for performance
either.
The sieve in Atari BASIC, using timings from an article
written in 1984 by Brian
Moriarty, clocks in at:
324 seconds (or just under 5 and a half minutes)
The Python version, running on hardware that's a
generation back--no i7 processor or anything like
that--completes in:
3 seconds
Now that's impressive! A straight-ahead interpreted
language, one with garbage collection and dynamic typing
and memory allocation all over the place, and it's still
two orders of magnitude, 108 times, faster than what
hobbyist programmers had to work with twenty-five years
ago. But what about the "thousands of times" figure I
tossed about in the first paragraph?
Oh, yes, I forgot to mention that the Python code is
running the full Sieve algorithm one thousand
times.
If the Atari BASIC program ran a thousand times, it
would finish after 324,000 seconds or 5400 minutes or
almost four days. That means the Python version is--get
ready for this--108,000 times faster than the Atari BASIC
code.
That's progress.
(If you liked this, you might also like A Spellchecker Used to be a Major Feat of
Software Engineering.)
How I Learned to Stop Worrying and Love Erlang's
Process Dictionary
The rise of languages based upon hash tables is one of
the great surprises in programming over the last twenty
years.
Whatever you call them--hash tables, dictionaries,
associative arrays, hash maps--they sure are useful. A
majority of my college data structures courses are
immediately negated. If you've got pervasive hash table
support, then you've also got arrays (just hash tables with
integer keys), growable arrays (ditto), sparse arrays
(again, same thing), and it's rare to have to bother with
binary trees, red-black trees, tries, or anything more
complex. Okay, sets are still useful, but they're hash
tables where only the keys matter. I'd even go so far as to
say that with pervasive hash table support it's unusual to
spend time thinking about data structures at all. It's
key/value pairs for everything. (And if you need
convincing, study Peter Norvig's Sudoku solver.)
If you don't believe the theory that functional programming
went mainstream years ago, then at least consider that
the mismatch between dictionary-based programming and
functional programming has dealt a serious blow to the
latter.
Now, sure, it's easy to build a purely functional
dictionary in Erlang or Haskell. In fact, such facilities
are already there in the standard libraries. It's
using them that's clunky. Reading a value out of a
dictionary is straightforward enough, but the restrictions
of single-assignment, and that "modifying" a hash table
returns a new version, call for a little puzzle solving.
I can write down any convoluted sequence of dictionary
operations:
Add the value of the keys "x" and "y" and store them
in key "z". If "z" is greater than 100, then also set a
key called "overflow" and add the value of "extra" to
"x." If "x" is greater than 326, then set "finished" to
true, and clear "y" to zero.
and in Python, Ruby, Lua, or Perl, minimal thought is
required to write out a working solution. It's just a
sequence of operations that mimic their textual
descriptions. Here's the Python version, where the
dictionary is called "d":
d['z'] = d['x'] + d['y']
if d['z'] > 100:
    d['overflow'] = True
    d['x'] += d['extra']
if d['x'] > 326:
    d['finished'] = True
    d['y'] = 0
I can certainly write the Erlang version of that (and,
hey, atoms, so no quotes necessary!), but I can't do it so
mindlessly. (Go ahead and try it, using either the
dict
or gb_trees
modules.) I may
be able to come up with Erlang code that's prettier than
Python in the end, but it takes more work, and a minor
change to the problem definition might necessitate a full
restructuring of my solution.
Well, okay, no, the previous paragraph isn't true at
all. I can bang-out an Erlang version that closely mimics
the Python solution. And as a bonus, it comes with the big
benefit of Erlang: the hash table is completely isolated
inside of a single process. All it takes is getting over
the psychological hurdle--and the lecturing from
purists--about using the process dictionary. (You can use
the ets
module to reach the same end, but it takes more effort:
you need to create the table first, and you have to create
and unpack key/value pairs yourself, among other
quirks.)
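Here's one way such a sketch might look, assuming the values have already been stored under atom keys with put/2:

%% The Python sequence, redone with the process dictionary.
update() ->
    put(z, get(x) + get(y)),
    case get(z) > 100 of
        true ->
            put(overflow, true),
            put(x, get(x) + get(extra));
        false ->
            ok
    end,
    case get(x) > 326 of
        true ->
            put(finished, true),
            put(y, 0);
        false ->
            ok
    end.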
Let me come right out and say it: it's okay to use the
process dictionary.
Clearly the experts agree on this, because every sizable
Erlang program I've looked at makes use of the process
dictionary. It's time to set aside the rote warnings of how
unmaintainable your code will be if you use
put
and get
and instead revel in
the usefulness of per-process hash tables. Now what's
really being preached against, when the horrors of
the process dictionary are spewed forth, is using it as a
way to sidestep functional programming, weaving global
updates and flags through your code like old-school BASIC.
But that's the extreme case. Here's a list of situations
where I've found the process dictionary to be useful,
starting from the mildest of instances:
Low-level, inherently stateful functions. Random
number generation is the perfect example, and not too
surprisingly the seed that gets updated with each call
lives in the process dictionary.
Storing process IDs. Yes, you can use named
processes instead, and both methods keep you from having to
pass PIDs around, but named processes are global to the
entire Erlang node. Use the process dictionary instead and
you can start multiple instances of the same application
without conflict.
Write-once process parameters. Think of this as
an initial configuration step. Stuff all the settings that
will never change within a process in the dictionary. From
a programming point of view they're just like constants, so
no worries about side effects.
Managing data in a tail recursive server loop. If
you've done any Erlang coding, you've written a tail-recursive
server at some point. It's a big receive
statement, where each message handling branch ends with a
recursive call. If there are six parameters, then each of
those calls involves six parameters, usually with one of
them changed. If you add a new parameter to the function,
you've got to find and change each of the recursive calls.
Eventually it makes more sense to pack everything into a
data structure, like a gb_tree
or
ets
table. But there's nothing wrong with just
using the simpler process dictionary for key/value pairs.
It doesn't always make sense (you might want the ability to
quickly roll back to a previous state), but sometimes it does; there's a sketch of the idea just after this list.
Handling tricky data flow in high-level code.
Sometimes trying to be pure is messy. Nice, clean code gets
muddled by having to pass around some data that only gets
used in exceptional circumstances. All of a sudden
functions have to return tuples instead of simple values.
All of a sudden there's tangential data hitchhiking through
functions, not being used directly. And when I start going
down this road I find myself getting frustrated and
annoyed, jumping through hoops to do something that I
wouldn't even care about in most languages. Making flags or key elements of data globally accessible is a huge relief, and the excess code melts away.
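Here's that promised sketch for the server loop case: a loop that keeps its key/value pairs in the process dictionary instead of threading them through as parameters. The message formats are made up for illustration:

loop() ->
    receive
        {set, Key, Value} ->
            put(Key, Value),
            loop();
        {get, Key, From} ->
            From ! {value, Key, get(Key)},
            loop();
        stop ->
            ok
    end.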
(If you've read this and are horrified that I'm
completely misunderstanding functional programming,
remember that I've gone further down
the functional path than most people for what appear to be
state-oriented problems.)
Functional Programming Doesn't Work (and what to do
about it)
Read suddenly and in isolation, this may be easy to
misinterpret, so I suggest first reading some past articles
which have led to this point.
After spending a long time in the functional programming
world, and using Erlang as my go-to language for tricky
problems, I've finally concluded that purely functional
programming isn't worth it. It's not a failure because of
soft issues such as marketing, but because the further you
go down the purely functional road the more mental overhead
is involved in writing complex programs. That sounds like a
description of programming in general--problems get much
more difficult to solve as they increase in scope--but it's
much lower-level and specific than that. The kicker is that
what's often a tremendous puzzle in Erlang (or Haskell)
turns into straightforward code in Python or Perl or even
C.
Imagine you've implemented a large program in a purely
functional way. All the data is properly threaded in and
out of functions, and there are no truly destructive
updates to speak of. Now pick the two lowest-level and most
isolated functions in the entire codebase. They're used all
over the place, but are never called from the same modules.
Now make these dependent on each other: function A behaves
differently depending on the number of times function B has
been called and vice-versa.
In C, this is easy! It can be done quickly and cleanly
by adding some global variables. In purely functional code,
this is somewhere between a major rearchitecting of the
data flow and hopeless.
A second example: It's a common compilation technique
for C and other imperative languages to convert programs to
single-assignment form. That is, where variables are
initialized and never changed. It's easy to mechanically
convert a series of destructive updates into what's
essentially pure code. Here's a simple statement:
if (a > 0) {
a++;
}
In single-assignment form a new variable is introduced
to avoid modifying an existing variable, and the result is
rather Erlangy:
if (a > 0) {
a1 = a + 1;
} else {
a1 = a;
}
The latter is cleaner in that you know
variables won't change. They're not variables at all, but
names for values. But writing the latter directly
can be awkward. Depending on where you are in the code, the
current value of whatever "a" represents has different
names. Inserting a statement in the middle requires
inventing new names for things, and you need to make sure
you're referencing the right version. (There's more room
for error now: you don't just say "a," but the name of the
value you want in the current chain of calculations.)
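For comparison, here's roughly how that statement reads when written directly in Erlang, as a fragment inside some larger function, with A1 as the invented name for the new value:

%% A sketch: the single-assignment version written by hand.
A1 = if
         A > 0 -> A + 1;
         true  -> A
     end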
In both of these examples imperative code is actually an
optimization of the functional code. You could pass a
global state in and out of every function in your program,
but why not make that implicit? You could go through the
pain of trying to write in single-assignment form directly,
but as there's a mechanical translation from one to the
other, why not use the form that's easier to write in?
At this point I should make it clear: functional
programming is useful and important. Remember, it was
developed as a way to make code easier to reason about and
to avoid "spaghetti memory updates." The line between
"imperative" and "functional" is blurry. If a Haskell
program contains a BASIC-like domain specific language
which is also written in Haskell, is the overall program
functional or imperative? Does it matter?
For me, what has worked out is to go down the purely
functional path as much as possible, but fall back on
imperative techniques when too much code pressure has built
up. Some cases of this are well-known and accepted, such as
random number generation (where the seed is modified behind
the scenes), and most any kind of I/O (where the position
in the file is managed for you).
Learning how to find similar pressure relief valves in
your own code takes practice.
One bit of advice I can offer is that going for the
obvious solution of moving core data structures from
functional to imperative code may not be the best approach.
In the Pac-Man example from Purely
Functional Retrogames, it's completely doable to write
that old game in a purely functional style. The
dependencies can be worked out; the data flow isn't really
that bad. It still may be a messy endeavor, with lots of
little bits of data to keep track of, and selectively
moving parts out of the purely functional world will result
in more manageable code. Now the obvious target is either
the state of Pac-Man himself or the ghosts, but those are
part of the core data flow of the program. Make those
globally accessible and modifiable and all of a sudden a
large part of the code has shifted from functional to
imperative...and that wasn't the goal.
A better approach is to look for small, stateful, bits
of data that get used in a variety of places, not just on
the main data flow path. A good candidate in this example
is the current game time (a.k.a. the number of elapsed
frames). There's a clear precedent that time/date
functions, such as Erlang's now(), cover up a
bit of state, and that's what makes them useful. Another
possibility is the score. It's a simple value that gets
updated in a variety of situations. Making it a true global
counter removes a whole layer of data threading, and it's
simple: just have a function to add to the score counter
and another function to retrieve the current value. No
reason to add extra complexity just to dodge having a
single global variable, something that a C / Python / Lua /
Ruby programmer wouldn't even blink at.
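As a sketch of what that could look like in Erlang (the names and message formats here are invented, not from any real game):

%% A global score counter: one registered process, one function
%% to add points, one to read the total.
start_score() ->
    register(score, spawn(fun() -> score_loop(0) end)).

add_score(Points) ->
    score ! {add, Points},
    ok.

get_score() ->
    score ! {get, self()},
    receive
        {score, Total} -> Total
    end.

score_loop(Total) ->
    receive
        {add, Points} -> score_loop(Total + Points);
        {get, From} -> From ! {score, Total}, score_loop(Total)
    end.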
(Also see the follow-up.)
Indonesian
translation
Follow-up to "Functional Programming Doesn't Work"
Not surprisingly, Functional
Programming Doesn't Work (and what to do about it)
started some lively discussion. There were two interesting
"you're crazy" camps:
The first mistakenly thought that I was proposing fixing
problems via a judicious use of randomly updated global
variables, so every program turns into potential fodder for
the "Daily WTF."
The second, and really, the folks in this camp need to
put some effort into being less predictable, was that I'm
completely misunderstanding the nature of functional
programming, and if I did understand it then I'd realize
the true importance of keeping things pure.
My real position is this: 100% pure functional
programing doesn't work. Even 98% pure functional
programming doesn't work. But if the slider between
functional purity and 1980s BASIC-style imperative
messiness is kicked down a few notches--say to 85%--then it
really does work. You get all the advantages of functional
programming, but without the extreme mental effort and
unmaintainability that increases as you get closer and
closer to perfectly pure.
That 100% purity doesn't work should only be news to a
couple of isolated idealists. Of the millions of
non-trivial programs ever written--every application, every
game, every embedded system--there are, what, maybe six
that are written in a purely functional style? Don't push
me or I'll disallow compilers for functional languages from
that list, and then it's all hopeless.
"Functional Programming Doesn't Work" was intended to be
optimistic. It does work, but you have to ease up
on hardliner positions in order to get the benefits.
The Recovering Programmer
I wrote the first draft of this in 2007, and I
thought the title would be the name of my blog. But I
realized I had a backlog of more tech-heavy topics that I
wanted to get out of my system. I think I've finally done
that, so I'm going back to the original entry I planned to
write.
When I was a kid, I thought I'd be a cartoonist--I was
always drawing--or a novelist. Something artistic. When I
became obsessed with video games in the 1980s, I saw game
design as being in the same vein as cartooning and writing:
one person creating something entirely on their own. I
learned to program 8-bit computers so I could implement
games of my own design. Eventually, slowly, the programming
overtook the design. I got a degree in computer science. I
worked on some stuff that looks almost impossible now, like
commercial games of over 100K lines of assembly language
(and later I became possibly the only person to ever write
a game entirely in PowerPC assembly language).
Somewhere along the lines, I realized I was looking at
everything backward, from an implementation point of view,
not from the perspective of a finished application, not
considering the user first. And I realized that the
inherent bitterness and negativity of programming arguments
and technical defensiveness on the web were making
me bitter and negative. I've consciously tried to
rewind, to go back to when programming was a tool for
implementing my visions, not its own end. I've found that
Alan Cooper is right, in that a tech-first view promotes
scarcity thinking (that is, making perceived memory and
performance issues be the primary concerns) and dismissing
good ideas because of obscure boundary cases. And now
programming seems less frustrating than it once did.
I still like to implement my own ideas, especially in
fun languages like Erlang and Perl. I'm glad I can
program, because personal programming in the small is
fertile ground and tremendously useful. For starters, this
entire site is generated by 269 lines of commented Perl,
including the archives and the
atom feed (and those 269 lines also include some HTML
templates). Why? Because it was pleasant and easy, and I
don't have to fight with the formatting and configuration
issues of other software. Writing concise to-the-purpose
solutions is a primary reason for programming in the
twenty-first century.
If blogs had phases, then this would be the second phase
of mine. I'm not entirely sure what Phase 2 will consist
of, but I'll figure that out. Happy 2010!
No Comment
I received a few emails after last
time along the lines of "Oh. Perl. Homebrew CMS.
That's why you don't allow people to post
comments." Well, no, but it was definitely a conscious
decision. The Web 2.0 answer is that I'm outsourcing
comments to reddit and
Hacker News. The
real reason is this:
The negativity of online technical discussions makes
me bitter, and even though I'm sometimes drawn to them
I need to stay away.
To be fair, this isn't true only of technical
discussions. Back when I was on Usenet, I took
refuge from geeky bickering in a group about cooking...only
to find people arguing the merits of Miracle
Whip versus mayonnaise. Put enough people together and
there are sure to be complaints and conflicting personal
agendas. With smart, technically-oriented people, though, I'd expect more sharing of real experiences, but that's often not the case.
Here's a lesson I learned very early on after I started
working full-time as a programmer (and that's a peculiar
sentence for me to read, as I no longer program for a
living). I'd be looking at some code at my desk, and it
made no sense. Why would anyone write it like this? There's
an obvious and cleaner way to approach the same
problem.
So I'd go down the hall to the person who wrote it in
the first place and start asking questions...and find out
that I didn't have the whole picture, the problem was
messier than it first appeared, and there were perfectly
valid reasons for the code being that way. This happened
again and again. Sometimes I did find a real flaw, but even
then it may have only occurred with data that wasn't
actually possible (because, for example, it was filtered by
another part of the system). Talking face to face changed
everything, because they could draw diagrams, pull out
specs, and give concrete examples.
I think that initial knee-jerk "I've been looking at
this for ten seconds and now let me explain the critical
flaws" reaction is a common one among people with
engineering mindsets. And that's not a good thing. I've
seen this repeatedly, from people putting down programming
languages for silly, superficial reasons (Perl's sigils,
Python's enforced indentation), to ridiculous off-the-cuff
put downs of new products (such as the predictions of doom
in the
Slashdot announcement of the original iPod in
2001).
The online community that I've had the most
overwhelmingly positive experience with is the
photo-sharing site Flickr.
I'll keep talking about Flickr, because it played a big
part in getting me out of some ruts, and I've seen more
great photographs over the last five years than I would
have seen in ten lifetimes otherwise. I know that if you
dig around you can find tedious rants from equipment
collectors, but I do a good job of avoiding those. I don't
think I've ever seen real negativity in photo comments
other than suggestions for different crops or the
occasional technical criticisms. There are so many good
photos to see that there's no reason to waste time with
ones that don't appeal to me. That's supported by only
allowing up-voting of photos (by adding a shot to your
favorites); there's no way to formally register
dislike.
Flickr gets my time, but most of the programming
discussion sites don't.
Flickr as a Business Simulator
Flickr came along
exactly when I needed it.
In 2004, I knew I was too immersed in technical
subjects, and Flickr motivated me to get back into
photography as a change of pace. I loved taking photos when
I was in college (mostly of the set-up variety with a
couple of friends), but I hardly touched a camera for the
next decade. When I first found out about Flickr, not long
after it launched, the combination of having a new camera
and a potential audience provided me with a rare level of
inspiration. I remember walking around downtown Champaign
on June 1, 2004, spending two hours entirely focused on
taking photos. I didn't have a plan, I didn't have a
preferred subject; I just made things up as I went.
This is one of my favorites from that day:
Flickr was pretty raw back then. You could comment on
photos, but there wasn't the concept of favoriting a good
shot or the automated interestingness
ranking. As those systems went into place, it was easier
to get feedback about the popularity of photos. Why did
people like this photo but not this other one? How does
that user manage to get dozens of comments per shot?
It took me a while to recognize some of the thought
patterns and feelings that I had once I started paying
attention to the feedback enabled by Flickr. They were
reminiscent of feelings I had when I was an independent
developer. I was rediscovering lessons which I had, at
great expense, learned earlier. Now I can, and will,
recount some of these lessons, but that in itself isn't
very useful or exciting. Anyone can recite pithy business
knowledge, and anyone can ignore it too, because it's hard
to accept advice without it being grounded in personal
experience. The important part is that you can experience
these lessons firsthand by using Flickr.
Create an account and give yourself a tough goal, such
as getting 50,000 photostream views in six months or
getting 500 photos flagged as favorites. And now it's a
business simulator. You're creating a product--a pool of
photographs--which is released into the wild and judged by
people you don't control. The six month restriction
simulates how long you can survive on your savings. Just
like a real business, the results have a lot to do with the
effort you put forth. But it's not a simple translation of
effort into success; it's trickier than that.
Now some of the lessons.
You don't get bonus points for being the small
guy. It sounds so appealing to be the indie that's
getting by on a shoestring. Maybe some customers will be
attracted to that and want to stick it to the man by
supporting you. On Flickr you're on the same playing field
as pros with thousands of dollars worth of equipment and
twenty years' experience. You can still stand out,
but don't fool yourself into thinking that your lack of resources, used as an excuse for lower quality, is going to be seen as an endearing advantage.
While quality is important, keep the technical
details behind the scenes. Just as no one really cares
what language your application is written in, no one really
cares what lens you took a photograph with or what filter
you used. Be wary of getting too into the tech instead of
the end result.
What you think people want might not be what people
want. This one is tough. Are you absorbed in things
that you think are important but are irrelevant, or even
turn-offs, to your potential audience? This is the kind of
thing that a good record producer would step in and deal
with ("Just stop with the ten minute solos, okay?"), but it
can be difficult to come to these realizations on your own,
especially if you're seeing the problems as selling
points.
Don't fixate on why you think some people are
undeservedly successful. All it does is pull you away
from improving your own photos/products as you pour energy
into being bitter. Your personal idea of taste doesn't
apply to everyone else. There may be other factors at work
that you don't understand. Just let it go or it will drag
you down.
But don't take my word for it. Just spend a few months
in the simulator.
Nothing Like a Little Bit of Magic
Like so many other people, I was enthralled by the
iPad
introduction. I haven't held or even seen an iPad in
person yet, but that video hit me on a number of levels.
It's a combination of brand new hardware--almost
dramatically so--and uses for it that are coming from a
completely different line of thinking. I realized it's been
a long time since I felt that way about the introduction of
a new computer.
I remember the first time I tried a black and white 128K
Mac in a retail store. A mouse! Really tiny pixels!
Pull-down menus! Graphics and text mixed together! And the
only demo program was what made the whole experience click:
MacPaint.
I remember when the Atari 520ST was announced. Half a
megabyte of memory! Staggering amounts of power for less
than $1000! A Mac-like interface but in full color! Some of
the demos were simple slideshows of 16-color 320x200
images, done with a program called NeoChrome, but
I had never seen anything like them before.
I remember when the Amiga debuted that same year. Real
multitasking! Digitized sound! Stereo! Hardware for moving
around big bitmaps instead of just tiny sprites! Images
showing thousands of colors at once! Just the bouncing
ball demo was outside what I expected to ever see on a
computer. And there was a flight-sim with filled polygon
graphics. Behind the scenes it was the fancy hardware
enabling it all, but it was the optimism and feeling of new
possibilities that fueled the excitement.
I remember when the Macintosh II came out, with 24-bit
color and impossibly high display resolutions for the time.
It seemed like a supercomputer on a desk, the kind of thing
that only high-end graphics researchers would have
previously had access to.
PCs never hit me so unexpectedly and all at once, but
there were a few years when 3D hardware started appearing
where it felt like the old rules had been thrown out and
imagining the future was more important than looking back
on the same set of ideas.
Am I going to buy an iPad? I don't know yet. I never
bought most of the systems listed above. But I am glad I've
been experiencing that old optimism caused by a mix of
hardware and software that suddenly invalidates many of the
old, comfortable rules and opens up territory that hasn't
been endlessly trod upon.
What to do About Erlang's Records?
The second most common complaint about Erlang, right
after confusion about commas and semicolons as separators,
is about records. Gotta give those complainers some credit,
because they've got taste. Statically defined records are
out of place in a highly dynamic language. There have been
various proposals over the years, including Richard
O'Keefe's abstract syntax patterns and Joe Armstrong's
structs. Getting one of those implemented needs the solid
support of the Erlang system maintainers, and it's
understandably difficult to commit to such a sweeping
change to the language. So what are the alternatives to
records that can be used right now?
To clarify, I'm really talking about smallish, purely
functional dictionaries. For large
amounts of data there's already the gb_trees
module, plus several others with similar purposes.
In Python, a technique I often use is to return a small
dictionary with a couple of named values in it. I could use
a tuple, but a dictionary removes the need to worry about
order. This is straightforward in Erlang, too:
fun(length) -> 46;
(width) -> 17;
(color) -> sea_green
end.
Getting the value corresponding to a key is easy
enough:
Result(color)
This is handy, but only in certain situations. One
shortcoming is that there's no way to iterate through the
keys. Well, there's this idea:
fun(keys) -> [length, width, color];
(length) -> 46;
(width) -> 17;
(color) -> sea_green
end.
Now there's a way to get a list of keys, but there's
room for error: each key appears twice in the code. The
second issue is there's no simple way to take one
dictionary and create a new one with a value added or
removed. This road is becoming messy to go down, so here's
a more data-driven representation:
[{length, 46}, {width, 17}, {color, sea_green}]
That's just a list of key/value pairs, which is
searchable via the fast, written-in-C function
lists:keyfind. New values can be appended to
the head of the list, and there are other functions in the
lists
module for deleting and replacing
values. Iteration is also easy: it's just a list.
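In the shell it looks something like this (a sketch; lists:keyfind returns the whole tuple, or false if the key isn't there):

%% The list-of-pairs style: look up a value, add a pair, drop one.
Props = [{length, 46}, {width, 17}, {color, sea_green}],
{color, Color} = lists:keyfind(color, 1, Props),
Props2 = [{height, 9} | Props],
Props3 = lists:keydelete(width, 1, Props2).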
We still haven't bettered records in all ways. A big win
for records, and this is something few purely functional
data structures handle well, is the ability to create a new
version where multiple keys get different values.
For example, start with the above list and create this:
[{length, 200}, {width, 1400}, {color, sea_green}]
If we knew that only those three keys were allowed,
fine, but that's cheating. The whole point of dictionaries
is that we can put all sorts of stuff in there, and it
doesn't change how the dictionary is manipulated. The
general solution is to delete all the keys that should have
new values, then insert the new key/value pairs at the head
of the list. Or step through the list and see if the
current key is one that has a new value and replace it.
These are not linear algorithms, unfortunately. And you've
got the same problem if you want to change multiple values
in a gb_tree
at the same time.
What I've been using, and I admit that this isn't
perfect, is the key/value list approach, but forcing the
lists to be sorted. This allows the original list and a list
of changes to be merged together in linear time. The
downside is that I have to remember to keep a literal list
in sorted order (or write a parse transform to do this for
me).
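The merge itself can lean on the standard lists module. This is only a sketch, but lists:ukeymerge keeps the tuple from its first argument when keys collide, which is exactly the "changes win" behavior wanted here:

%% Both lists must be key-sorted with unique keys.
update(Changes, Props) ->
    lists:ukeymerge(1, lists:ukeysort(1, Changes), Props).

%% For example:
%% update([{length, 200}, {width, 1400}],
%%        [{color, sea_green}, {length, 46}, {width, 17}])
%% gives [{color, sea_green}, {length, 200}, {width, 1400}].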
There's still one more feature of records that
can't be emulated: extracting / comparing values using
Erlang's standard pattern matching capabilities. It's not a
terrible omission, but there's no way to dodge this one: it
needs compiler and runtime system support.
Optimizing for Fan Noise
The first money I ever earned, outside of getting an
allowance, was writing assembly language games for an 8-bit
home computer magazine called ANALOG Computing.
Those games ended up as pages of printed listings of lines
like this:
1050 DATA 4CBC08A6A4BC7D09A20986B7B980
0995E895D4B99E099DC91C9DB51CA90095C0C8
CA10E8A20086A88E7D1D8E7E,608
A typical game could be 75 to 125+ of those lines (and
those "three" lines above count as one; it's word-wrapped
for a 40-column display). On the printed page they were a
wall of hex digits. And people typed them in by hand--I
typed them in by hand--in what can only be described as a
painstaking process. Just try reading that data to yourself
and typing it into a text editor. Go ahead:
4C-BC-08-A6...
Typos were easy to make. That's the purpose of the "608"
at the end of the line. It's a checksum verified by a
separate "correctness checker" utility.
There was a strong incentive for the authors of these
games to optimize their code. Not for speed, but to
minimize the number of characters that people who bought
the magazine had to type. Warning, 6502 code ahead!
This:
LDA #0
TAY
was two fewer printed digits than this:
LDA #0
LDY #0
Across a 4K or 6K game, those savings mattered.
Two characters here, four characters there, maybe the total
line count could be reduced by four lines, six lines, ten
lines. This had nothing to do with actual code performance.
Even on a sub-2MHz processor those scattered few cycles
were noise. But finding your place in the current line,
saying "A6," then typing "A" and "6" took time. Measurable
time. Time that was worth optimizing.
Most of the discussions I see about optimization are
less concrete. It's always "speed" and "memory," but in the
way someone with a big house and a good job says "I need
more money." Optimization only matters if you're optimizing
something where you can feel the difference, and
you can't feel even thousands of bytes or nanoseconds.
Optimizing for program understandability...I'll buy that,
but it's more of an internal thing. There's one concern
that really does matter these days, and it's not abstract
in the least: power consumption.
It's more than just battery life. If a running program
means I get an hour less work done before looking for a
place to plug in, that's not horrible. The experience is
the same, just shorter. But power consumption equals heat
and that's what really matters to me: if the CPU load in my
MacBook cranks up then it gets hot, and that causes the fan
to spin up like a jet on the runway, which defeats the
purpose of having a nice little notebook that I can bring
places. I can't edit music tracks with a roaring fan like
that, and it's not something I'd want next to me on the
plane or one table over at the coffee shop. Of course it
doesn't loudly whine like that most of the time, only when
doing something that pushes the system hard.
What matters in 2010 is optimizing for fan noise.
If you're not buying this, take a look at Apple's
stats
about power consumption and thermal output of iMacs (which,
remember, are systems where the CPU and fan are right there
on your desk in the same enclosure as the monitor). There's
a big difference in power consumption, and corresponding
heat generated, between a CPU idling and at max load. That
means it's the programs you are running which are
directly responsible for both length of battery charge and
how loudly the fan spins.
Obvious? Perhaps, but this is something that didn't
occur with most popular 8-bit and 16-bit processors,
because those chips never idled. They always ran
flat-out all the time, even if just in a busy loop waiting
for interrupts to hit. With the iMacs, there's a trend
toward the difference between idle and max load increasing
as the clock speed of the processor increases. The worst
case is the early 2009 24-inch iMac: 387.3 BTU/h
at idle, 710.3 BTU/h at max load, for a difference of 323
BTU/h. (For comparison, that difference is larger than the
entire maximum thermal output of the 20-inch iMac CPU:
298.5 BTU/h.)
The utmost in processing speed, which once was the goal,
now has a price associated with it. At the same time that
manufacturers cite impressive benchmark numbers, there's
also the implicit assumption that you don't really want to
hit those numbers in the everyday use of a mobile computer.
Get all those cores going all the time, including the
vector floating point units, and you get rewarded with
forty minutes of use on a full battery charge with the fan
whooshing the whole time. And if you optimize your code
purely for speed, you're getting what you asked for.
Realistically, is there anything you can do? Yes, but it
means you have to break free from the mindset that all of a
computer's power is there for the taking. Doubling the
speed of a program by moving from one to four cores is a
win if you're looking at the raw benchmark numbers, but an
overall loss in terms of computation per watt. Ideas that
sounded good in the days of CPU cycles being a free
resource, such as anticipating a time-consuming task that
the user might request and starting it in the background,
are now questionable features. Ditto for persistent
unnecessary animations.
Nanoseconds are abstract. The sound waves generated by
poorly designed applications are not.
Dehumidifiers, Gravy, and Coding
For a few months I did freelance humor writing. Greeting
cards, cartoon captions, that sort of thing. My sole income
was from the following slogan, which ended up on a
button:
Once I've gathered enough information for the
almighty Zontaar, I'm outta here!
Sitting down and cranking out dozens of funny lines was
hard. Harder than I expected. I gave it up because it was
too draining (and because I wasn't making any money, but I
digress).
Periodically I decide I want to boost my creativity. I
carry around a notebook and write down conversations,
lists, brainstormed ideas, randomness. I recently found one
of these notebooks, so I can give some actual samples of
its contents. Below half a page of "Luxury Housing
Developments in Central Illinois Farmland" (e.g., Arctic
Highlands), there's a long list titled "Ridiculous
Things." Here are a few:
salads
spackle
key fobs
wine tastings
mulch
hair scrunchies
asphalt
Fry Daddy™
cinder blocks
relish
Frito Pie
aeration shoes
Okay, okay, I'll stop. But you get the idea.
As with the humor writing, I remember this taking lots
of effort, and it took real focus to keep going. Did this
improve my creativity? I'd like to think so. It certainly
got me thinking in new directions and about different
topics. It also made me realize something fundamental:
technical creativity, such as optimizing code or thinking
up clever engineering solutions, is completely different
from the "normal" creativity that goes into writing stories
or taking photos.
Years ago, I followed the development of an indie game.
This was back when writing 3D games for non-accelerated VGA
cards was cutting edge. The author was astounding in his
coding brilliance. He kept pulling out trick after trick,
and he wasn't shy about posting key routines for others to
use. Eventually the game was released...and promptly
forgotten. It may have been a technical masterpiece, but it
was terrible as a game, completely unplayable.
I still like a good solution to a programming problem. I
still like figuring out how to rewrite a function with half
the code. But technical creativity is only one form of
creativity.
It Made Sense in 1978
Whenever I see this list of memory cell sizes, it
strikes me as antiquated:
BYTE = 8 bits
WORD = 16 bits
LONG = 32 bits
Those names were standard for both the Intel x86 and
Motorola 68000 families of processors, and it's easy to see
where they came from. "Word" isn't synonymous with a 16-bit
value; it refers to the fundamental data size that a
computer architecture is built to operate upon. On a 16-bit
CPU like the 8086, a word is naturally 16 bits.
Now it's 2010, and it's silly to think of a 16-bit value
as a basic enough unit of data to get to the designation
"word." "Long" is similarly out of place, as 32-bit
microprocessors have been around for over 25 years, and yet
the standard memory cell size is still labeled in a way
that makes it sound abnormally large.
The PowerPC folks got this right back in the early 1990s
with this nomenclature:
BYTE = 8 bits
HALFWORD = 16 bits
WORD = 32 bits
That made sense in 1991, and it's still rational today.
(64-bit is now common, but the jump isn't nearly as
critical as it was the last time memory cell size doubled.
The PowerPC name for "64-bits" is "doubleword.")
Occasionally you need to reevaluate your assumptions and
not just cling to something because it's always been that
way.
Eleven Years of Erlang
I've written about how I started using
Erlang. A good question is why, after eleven years, am
I still using it?
For the record, I do use other languages. I enjoy
writing Python code, and I've taught other people how to
use Python. This website is statically generated by a Perl
program that I had fun writing. And I dabble in various
languages of the month which have cropped up. (Another
website I used to maintain was generated by a script that I
kept reimplementing. It started out written in Perl, but
transitioned through at least REBOL, J, and Erlang before I
was through.)
One of the two big reasons I've stuck with Erlang is
because of its simplicity. The functional core of Erlang
can and has been described in a couple of short chapters.
Knowledge of four data types--numbers, atoms, lists,
tuples--is enough for most programming problems. Binaries
and funs can be tackled later. This simplicity is good,
because the difficult part of Erlang and any
mostly-functional language is in learning to write code
without destructive updates. The language itself shouldn't
pour complexity on top of that.
There are many possibilities for extending Erlang with
new data types, with an alternative to records being high on the list. Should
strings be split off from lists into a distinct entity?
What about arrays of floats, so there's no need to box each
value? How about a "machine integer" type that's
represented without tagging and that doesn't get
automatically promoted to an arbitrarily sized "big number"
when needed?
All of those additional types are optimizations. Lists
work just fine as strings, but even the most naive
implementation of strings as unicode arrays would take half
the memory of the equivalent lists, and that's powerful
enticement. When Knuth warned of premature optimization, I
like to think he wasn't talking so much about obfuscating
code in the process of micro-optimizing for speed, but he
was pointing out that code is made faster by specializing
it. The process of specialization reduces your options, and
you end up with a solution that's more focused and at the
same time more brittle. You don't want to do that until you
really need to.
It may be an overreaction to my years of
optimization-focused programming, but I like the philosophy
of making the Erlang system fast without just caving in and
providing C-style abilities. I know how to write
low-level C. And now I know how to write good high-level
functional code. If I had been presented with a menu of
optimization-oriented data types in Erlang, that might
never have happened. I'd be writing C in the guise of
Erlang.
The second reason I'm still using Erlang is because I
understand it. I don't mean I know how to code in it, I
mean I get it all the way down. I know more or less what
transformations are applied by the compiler and the BEAM
loader. I know how the BEAM virtual machine works. And
unlike most languages, Erlang holds together as a full
system. You could decide to ditch all existing C compilers
and CPUs and start over completely, and Erlang could serve
as a foundation for this new world of computing. The
ECOMP
project (warning: PowerPoint) proved that an FPGA
running the Erlang VM directly gives impressive
results.
Let me zoom in on one specific detail of the Erlang
runtime. If you take an arbitrary piece of data in a
language of the Lua or Python family, at the lowest-level
it ends up wrapped inside a C struct. There's a type field,
maybe a reference count, and because it's a heap allocated
block of memory there's other hidden overhead that comes
along with any dynamic allocation (such as the size of the
block). Lua is unabashedly reliant on malloc-like heap
management for just about everything.
Erlang memory handling is much more basic. There's a
block of memory per process, and it grows from bottom to
top until full. Most data objects aren't wrapped in
structs. A tuple, for example, is one cell of data holding the
length followed by one cell for each element. The
system identifies it as a tuple by tagging the
pointer to the tuple. You know the memory used for
a tuple is always 1 + N, period. Were I trying to optimize
data representation by hand, with the caveat that type info
needs to be included, it would be tough to do significantly
better.
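You can even poke at this from the shell via the semi-internal
erts_debug module (subject to change, so treat this purely as
an illustration); flat_size/1 reports a term's size in heap
words, and the 1 + N layout shows through:
1> erts_debug:flat_size({a, b, c}).
4
2> erts_debug:flat_size({a, b, c, d}).
5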
I'm sure some people are correctly pointing out that
this is how most Lisp and Scheme systems have worked since
those languages were developed. There's nothing preventing
an imperative language from using the same methods (and
indeed this is sometimes the case).
Erlang takes this further by
having a separate block of memory for each process, so when
the block gets full only that particular block needs to be
garbage collected. If it's a 64K block, it takes
microseconds to collect, as compared to potentially
traversing a heap containing the hundreds of megabytes of
data in the full running system. Disallowing destructive
updates allows some nice optimizations in the garbage
collector, because pointers are guaranteed to reference
older objects (this is sometimes called a "unidirectional
heap"). Together these are much simpler than building a
real-time garbage collector that can survive under the
pressure of giant heaps.
Would I use Erlang for everything? Of course not. Erlang
is clearly a bad match for some types of programming. It
would be silly to force-fit Erlang into the iPhone, for
example, with Apple promoting Objective C as the one true
way. But it's the best mix of power and simplicity that
I've come across.
A Short Story About Verbosity
In the early 2000s I was writing a book. I don't mean in
the vague sense of sitting in a coffeeshop with my laptop
and pretending to be a writer; I had a contract with a tech
book publisher.
I'm in full agreement with the musician's saying of
"never turn down a gig," so when the opportunity arose, I
said yes. I did that even though there was one big, crazy
caveat:
"In order for a book to sell," said my publisher, "it's
got to be thick. 600 pages thick. In the worst case we
could go as low as 500 pages, but 600+ should be your
target."
Wow, 600 pages. If I wrote two pages a day, that's
almost a full year of writing, and I had less than a year.
But still, never turn down a gig, and so I took a serious
attempt at it.
I can't prove or refute the claim that a 600 page tech
book sells better than thinner ones, but it explains a lot
of the bloated tomes out there. Mix sections of code with
the text, then reprint the whole program at the end of the
chapter. That can eat four or eight pages. Add a large
appendix that reiterates a language's standard library,
even though all that info is already in the help system and
online. Add some fluff survey chapters that everyone is
going to skip.
I try not to wax nostalgic about how the olden days of
computing were better. While I might have some fond
memories of designing games for 8-bit home computers, there
has been a lot of incredibly useful progress since then. But I do find myself
wishing that the art of the 250 page technical book hadn't
gone completely out of style.
Eventually I did give up on the 600 page monster I was
writing. It was a combination of me not having enough time
and my publisher taking weeks to give feedback about
submitted chapters. In the end I think I had written the
introduction and maybe eight full chapters. Do I wish I had
finished it? Yes. Even with the 600 page requirement, there
was still some clout that went along with writing a book at
the time. These days it's much less so, and I think those
volumes padded out to 600 pages had a lot to do with
it.
(If you liked this, you might like Two
Stories of Simplicity.)
Living Inside Your Own Black Box
Every so often I run across a lament that programmers no
longer understand the systems they work on, that
programming has turned into searches through massive
quantities of documentation, that large applications are
built by stacking together loosely defined libraries. Most
recently it was Mike Taylor's
Whatever happened to programming?, and it's worth the
time to read.
To me, it's not that the act of programming has
gotten more difficult. I'd even say that programming has
gotten much easier. Most of the Apple
Pascal assignments I had in high school would be a fraction
of the bulk if written in Ruby or Python. Arrays don't have
fixed lengths. Strings are easy. Dictionaries have subsumed other data
structures. Generic sort routines are painless to use.
Functions can be tested interactively. Times are good!
That's not to say all problems can be solved
effortlessly. Far from it. But it's a tight feedback loop:
think, experiment, write some code, reconsider, repeat.
This works as long as you can live inside an isolated
world, where the basic facilities of your programming
language are the tools you have to work with. But at some
point that doesn't work, and you have to deal with outside
realities.
Here's the simplest example I can think of: Write a
program to draw a line on the screen. Any line, any color,
doesn't matter. No ASCII art.
In Python the first question is "What UI toolkit?" There
are bindings for SDL, Cocoa, wxWindows, and others.
Selecting one of those still doesn't mean that you can
simply call a function and see your line. SDL requires some
up front effort to learn how to create a window and choose
the right resolution and color depth and so on. And then
you still can't draw a line unless you use OpenGL
or get an add-on package like SDL_gfx. If you decide to
take the Cocoa route, then you need to understand its whole
messaging / windowing / drawing model, and you also need to
understand how Python interfaces with it. Maybe there's a
beautifully simple package out there that lets you draw
lines, and then the question becomes "Can I access that
library from the language I'm using?" An even more basic
question: "Is the library written using a paradigm that's a
good match for my language?" (Think of a library based on
subclassing mutable objects and try to use it from
Haskell.)
There's a clear separation between programming languages
and the capabilities of modern operating systems. Any
popular OS is obviously designed for creating windows and
drawing and getting user input, but those are not
fundamental features of modern languages. At one time
regular expressions weren't standard in programming
languages either, but they're part of Perl and Ruby, and
they're a library that's part of the official Python
distribution.
A handful of language designers have tried to make GUI
programming as easy as traditional programming. The Tk
library for TCL, which is still the foundation for Python's
out-of-the-box IDE, allows basic UI creation with simple,
declarative statements. REBOL is a more recent incarnation
of the same idea, that sample code involving windows and
user input and graphics should be a handful of lines, not
multiple pages of wxWindows fussing. I wish more people
were working on such things.
A completely different approach is to go back to the
isolationist view of only using the natural capabilities of
a programming language, but in a more extreme way. I can
draw a line in Python with this tuple:
("line",0,0,639,479)
or I can do the same thing in Erlang with two fewer
characters:
{line,0,0,639,479}
I know it works, because I can see it right
there. The line starts at coordinates 0,0 and ends at
639,479. It works on any computer with any video card,
including systems I haven't used yet, like the iPad. I can use the same technique to play
sounds and build elaborate UIs.
That the results are entirely in my head is of no
matter.
It may sound like I'm being facetious, but I'm not. In
most applications, interactions between code and the
outside world can be narrowed down to a couple of critical
moments. Even in something as complex as a game, you really
just need a few bytes representing user input at the start
of a frame, then much later you have a list of things to
draw and a list of sounds to start, and those get handed
off to a thin, external driver of sorts, the small part of
the application that does the messy hardware
interfacing.
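A bare-bones sketch of that shape in Erlang (all names here
are mine, and the port is assumed to be opened elsewhere by
the driver side):
%% Pure part: map this frame's input to lists of commands.
%% No side effects, testable entirely in the shell.
frame(Input) ->
    Draw = [{clear, black},
            {line, 0, 0, 639, 479},
            {text, 10, 10, io_lib:format("input: ~p", [Input])}],
    Sounds = [],
    {Draw, Sounds}.
%% Thin driver: the only code that touches the outside world.
run_frame(Port, Input) ->
    {Draw, Sounds} = frame(Input),
    port_command(Port, term_to_binary({Draw, Sounds})).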
The rest of the code can live in isolation, doing
arbitrarily complex tasks like laying out web pages and
mixing guitar tracks. It takes some practice to build
applications this way, without scattering calls to external
libraries throughout the rest of the code, but there are
big wins to be had. Fewer dependencies on platform
specifics. Fewer worries about getting overly reliant on
library X. And most importantly, it's a way to declutter
and get back to basics, to focus on writing the important
code, and to delve into those thousands of pages of API
documentation as little as possible.
Rethinking Programming Language Tutorials
Imagine you've never programmed before, and the first
language you're learning is Lua. Why not start with the
official book about
Lua? Not too far in you run across this paragraph:
The table type implements associative arrays. An
associative array is an array that can be indexed not
only with numbers, but also with strings or any other
value of the language, except nil. Moreover, tables
have no fixed size; you can add as many elements as you
want to a table dynamically. Tables are the main (in
fact, the only) data structuring mechanism in Lua, and
a powerful one. We use tables to represent ordinary
arrays, symbol tables, sets, records, queues, and other
data structures, in a simple, uniform, and efficient
way. Lua uses tables to represent packages as well.
When we write io.read, we mean "the read entry from the
io package". For Lua, that means "index the table io
using the string "read" as the key".
All right, where to start with this? "Associative
arrays"? The topic at hand is tables, and they're defined
as being synonymous with an odd term that's almost
certainly unfamiliar. Ah, okay, "associative array" is
defined in the next sentence, but it goes off track
quickly. "Indexed" gets casually used; there's the
assumption that the reader already understands arrays and
indexing. Then there's the curious addendum of "except
nil." All this talk of arrays and association and indexing,
and the novice's head is surely swimming, and then the
author throws in that little clarification, "except nil,"
as if that's the question sure to be on the mind of someone
who has just learned of the existence of something called a
table.
I've only dissected two sentences of that paragraph so
far.
Really, I should stop, but I can't resist the
declaration "Lua uses tables to represent packages as
well." Who is that sentence written for exactly? It has no
bearing on what a table is or how to use one; it's a five
mile high view showing that a beautifully invisible
language feature--packages--is really not so invisible and
instead relies on this table idea which hasn't been
explained yet.
I don't mean to single out Lua here. I can easily find
tutorials for other languages that have the same problems.
Every Haskell tutorial trots out laziness and folding and
type systems far too early and abstractly. Why? Because
those are the concerns of people who write Haskell
tutorials.
To really learn to program, you have to go around in
circles and absorb a lot of information. You need to get
immersed in the terminology. You'll be exposed to the
foibles and obsessions of language communities. You'll
absorb beliefs that were previously absorbed by people who
went on to write programming tutorials. It's hard to come
out of the process without being transformed. Not only will
you have learned to program, but all that nonsense that you
struggled with ("We use tables to represent ordinary
arrays...") no longer matters, because you get it.
After that point it's difficult to see the madness, but
it's still there.
Programming language tutorials shouldn't be about
learning languages. They should be about something
interesting, and you learn the language in the process.
If you want to learn to play guitar, the wrong approach
is to pick up a book about music theory and a chart showing
where all the notes are on the fretboard. There's a huge
gap between knowing all that stuff and actually playing
songs. That's why good music lessons involve playing
recognizable songs throughout the process. But what do we
get in programming tutorials? Hello World. Fibonacci
sequences. Much important manipulation of "foo."
Not all tutorials are this way. Paradigms of Artificial
Intelligence Programming is a survey of classic AI
programs mixed together with enough details about Lisp to
understand them. I've mentioned others in Five Memorable Books About Programming. But I
still want to see more. "Image Processing in Lua."
"Massively Multiplayer Games in Erlang." "Exploring Music
Theory (using Python)."
I'll give a real example of how "Image Processing in
Lua" could work. You can convert the RGB values of a pixel
to a monochrome intensity value by multiplying Red, Green,
and Blue by 0.3, 0.6, and 0.1 respectively, and summing the
results. That's an easily understandable Lua function:
function intensity(r, g, b)
  return r*0.3 + g*0.6 + b*0.1
end
If each color value ranges from 0 to 255, then a full
white pixel should return the maximum intensity:
intensity(255, 255, 255)
and it does: 255. This tiny program opens the door for
showing how the R, G, and B values can be grouped together
into a single thing...and that turns out to be a table!
There's also the opportunity to show that each color
element can be named, instead of remembering a fixed
order:
{green=255, blue=255, red=255}
Rewriting the "intensity" function first using tables
and then using tables with named elements should hammer
home what a table is and how it gets used. There was no
need to mention any techy tangentials, like "tables have no
fixed size." (That can go in a terse reference doc.)
After all, the reason to learn a programming language is
to do something useful with it, not simply to know the
language.
(If you liked this, you might like The
World's Most Mind-Bending Language Has the Best Development
Environment.)
How Much Processing Power Does it Take to be Fast?
First, watch this.
It's Defender, an arcade game released thirty years ago.
I went out of my way to find footage running on the
original hardware, not emulated on a modern computer.
(There's clearer
video from an emulator if you prefer.)
Here's the first point of note: Defender is running on a
1MHz 8-bit processor. That's right: ONE megahertz. This was
before the days of pipelined, superscalar architectures, so
if an instruction took 5 cycles to execute, it always took
5 cycles.
Here's the second: Unlike a lot of games from the early
1980s, there's no hardware-assisted graphics. No
honest-to-goodness sprites where the video processor does
all the work. No hardware to move blocks of memory around.
The screen is just a big bitmap, and all drawing of the
enemies, the score, the scrolling mountains, the special
effects, is handled by the same processor that's running
the rest of the code.
To be fair, the screen is only 320x256 with four bits
per pixel. But remember, this was 1980, and home computers
released up until mid-1985 didn't have that combination of
resolution and color.
Now it's 2010, and there's much amazement at the
responsiveness of the iPad. And why
shouldn't it be responsive? There's a 32-bit, gigahertz CPU
in there that can run multiple instructions at the same
time. Images are moved around by a separate processor
dedicated entirely to graphics. When you flick your finger
across the screen and some images slide around, there's
very little computation involved. The CPU is tracking
some input and sending some commands to the GPU. The GPU is
happy to render what you want, and a couple of 2D images is
way below the tens of thousands of texture-mapped polygons
that it was designed to handle.
Okay, JPEG decompression takes some effort. Ditto for
drawing curve-based, anti-aliased fonts. And of course
there's overhead involved in the application framework
where messages get passed around to delegates and so on.
None of this justifies the assumption that it takes amazing
computing power to provide a responsive user experience.
We're so used to interfaces being clunky and static, and
programs taking long to load, and there being unsettling
pauses when highlighting certain menu items, that we
expect it.
All the fawning over the speed of the iPad is a good reminder
that it doesn't have to be this way.
(If you liked this, you might like Slow Languages Battle Across Time.)
How to Think Like a Pioneer
Here's an experiment to try at home: do a Google image
search for "integrated development environment." Take some
time to go through the first several pages of pictures.
Even if you had no idea what an IDE was, the patterns
are obvious. There's a big area with text in it, and on the
left side of the screen is a pane that looks a lot like
Windows Explorer: a collection of folders and files in a
tree view, each with icons. Some folders are closed and
have little plus signs next to them. Others are expanded
and you can see the files contained within them. To be
perfectly fair, this project area is sometimes on the right
side of the window instead of the left. Variety is the
spice of life and all that.
Why have IDEs settled into this pattern of having a
project view take up twenty percent or more of the entire
left or right side of the window?
The answer is shallower than you may expect. Someone who
decides to create an IDE uses the existing programs he or
she is familiar with as models. In other words, "because
that's how it's supposed to be."
There's no solid reason the project view pane has to be
the way it usually is. In fact, there are some good
arguments against doing it that way. First, you are
typically either looking at the project view or you're
editing a file, not doing both at the same time. Yet the
project view is always there, taking up screen real estate,
sometimes a lot of screen real estate if highly nested
folders are expanded. That view is made even wider by
having icons to the left of each filename, even though most
projects consist of one or two file types and they could be
differentiated with color or an outline instead of a
contrived icon attempting to evoke "file of Java code."
Second, a long, thin pane containing files and folders
doesn't give a particularly deep or even interesting view
of a large project. Open a folder and it might fill the
whole pane, and all the other folders are now
offscreen.
Are there better options? Sure! The first problem, of
losing screen space to a persistent view of the project
hierarchy, can be alleviated by a hot key that brings up a
full-screen overlay. When you want to see the project as a
whole, hit a key. Press escape to make it go away (or
double-click a file to edit it).
The data presented in this overlay doesn't need to be a
tree view. A simple option is to group related files into
colored boxes, much like folders, except you can see the
contents the whole time. With the narrow, vertical format
out of the way, there can be multiple columns of boxes
filling the space. Now you can see 100+ files at a time
instead of a dozen.
It might make more sense to display modules as shapes,
each connected by lines to the modules which are imported.
Or provide multiple visualizations of the project, each for
a different purpose.
Someone, sometime, and probably not all that long ago,
came up with the canonical "project view on the left side
of the window" design for IDEs. And you may wonder how that
person arrived at that solution. After all, there were no
prior IDEs to use for guidance. I think the answer is a
basic one: because it's a solution that worked and was
better than what came before. No magic. No in-depth
comparison of a dozen possibilities. Clearly a way to see
all the files in a project is better than a raw window in a
text editor where there's no concept of "project"
whatsoever. That solution wasn't the best solution that
could ever exist in the entire future history of IDEs, but
it sure fooled people into thinking it was.
If you want to think like a pioneer, focus on the
problem you're trying to solve. The actual problem. Don't
jump directly to what everyone else is doing and then
rephrase your problem in terms of that solution. In the IDE
case, the problem is "How can I present a visual overview
of a project," not "How can I write a tree viewer like in
all the other IDEs I've ever seen?"
A Ramble Through Erlang IO Lists
The IO List is a handy data type in Erlang, but not one
that's often discussed in tutorials. It's any binary. Or
any list containing integers between 0 and 255. Or any
arbitrarily nested list containing either of those two
things. Like this:
[10, 20, "hello", <<"hello",65>>, [<<1,2,3>>, 0, 255]]
The key to IO lists is that you never flatten them. They
get passed directly into low-level runtime functions (such
as file:write_file), and the flattening
happens without eating up any space in your Erlang process.
Take advantage of that! Instead of appending values to
lists, use nesting instead. For example, here's a function
to put a string in quotes:
quote(String) -> [$"] ++ String ++ [$"].
If you're working with IO lists, you can avoid the
append operations completely (and the second "++" above
results in an entirely new version of String
being created). This version uses nesting instead:
quote(String) -> [$", String, $"].
This creates three list elements no matter how long the
initial string is. The first version creates
length(String) + 2
elements. It's also easy to
go backward and un-quote the string: just take the second
list element. Once you get used to nesting you can avoid
most append operations completely.
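For instance, here's a small sketch (my own example) that
builds a file's contents as a nested IO list and hands it
straight to file:write_file, with no flattening anywhere in
the Erlang code:
Rows = [[Name, ",", integer_to_list(Count), "\n"]
        || {Name, Count} <- [{"apples", 3}, {"pears", 7}]],
ok = file:write_file("counts.csv", ["name,count\n", Rows]).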
One thing that nested list trick is handy for is
manipulating filenames. Want to add a directory name and
".png" extension to a filename? Just do this:
[Directory, $/, Filename, ".png"]
Unfortunately, filenames in the file
module
are not true IO lists. You can pass in deep lists, but they
get flattened by an Erlang function
(file:file_name/1), not the runtime system.
That means you can still dodge appending lists in your own
code, but things aren't as efficient behind the scenes as
they could be. And "deep lists" in this case means
only lists, not binaries. Strangely, these deep
lists can also contain atoms, which get expanded via
atom_to_list.
Ideally filenames would be IO lists, but for
compatibility reasons there's still the need to support
atoms in filenames. That brings up an interesting idea: why
not allow atoms as part of the general IO list
specification? It makes sense, as the runtime system has
access to the atom table, and there's a simple
correspondence between an atom and how it gets encoded in a
binary; 'atom' is treated the same as "atom". I find I'm
often calling atom_to_list
before sending data
to external ports, and that would no longer be
necessary.
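That is, I end up writing little wrappers like this one (a
sketch, with my own names) just to get atoms into the
output:
send_tagged(Port, Tag, Value) when is_atom(Tag) ->
    port_command(Port, [atom_to_list(Tag), ": ", Value, "\n"]).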
Tricky When You Least Expect It
Here's a problem: You've got a satellite dish that can
be rotated to any absolute angle from 0 to 360 degrees. If
you think of the dish as being attached to a pole sticking
out of the ground, that's what the dish rotates around.
Given a starting angle and a desired angle, how many
degrees do you rotate the dish by?
An example should clarify this. If the initial angle is
0 degrees, and the goal is to be at 10 degrees, that's
easy. You rotate by 10 degrees. If you're at 10 degrees and
you want to end up at 8 degrees, then rotate -2 degrees. It
looks a lot like all you have to do is subtract the
starting angle from the ending angle, and that's that.
But if the starting angle is 10 and the ending angle is
350...hmmm. 350 - 10 = 340, but that's the long way around.
No one would do that. It makes more sense to rotate by -20
degrees. With this in mind and some experimenting, here's a
reasonable looking solution (in Erlang, but it could easily
be any language):
angle_diff(Begin, End) ->
    D = End - Begin,
    DA = abs(D),
    case DA > 180 of
        true -> -(360 - DA);
        _ -> D
    end.
It seems to cover some quickie test cases, including
those listed above. Now try angle_diff(270, 0). The
expected answer is 90. But this function
returns -90. Oops.
This is starting to sound like the introduction to a
book by Dijkstra. He'd
have called this problem solving method "guessing," and
it's hard to disagree with that assessment. When I run into
problems like this that look so simple, and I feel like I'm
randomly poking at them to get the right answers, I'm
always surprised. So many messy problems are solved as part
of the core implementation or standard library in most
modern languages, that it's unusual to run into something
this subtle.
In Python or Erlang I never worry about sorting, hash
functions, heap management, implementing regular
expressions, fancy string comparison algorithms such as
Boyer-Moore, and so on. Most of the time I write fairly
straightforward code that's just basic logic and
manipulation of simple data structures. Behind the scenes,
that code is leaning heavily on technically difficult
underpinnings, but that doesn't change how pleasant things
are most of the time. Every once in a while, though, the
illusion of all the hard problems being solved for me is
shattered, and I run into something that initially seems
trivial, yet it takes real effort to work out a correct
solution.
Here's a version of the angle_diff function that handles
the cases the previous version didn't:
angle_diff(Begin, End) ->
    D = End - Begin,
    DA = abs(D),
    case {DA > 180, D > 0} of
        {true, true} -> DA - 360;
        {true, _} -> 360 - DA;
        _ -> D
    end.
Don't be surprised if it takes some thought to determine
if this indeed handles all cases.
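One way to build confidence is to pin down the cases from this
post as checks (my own test values, dropped into the same
module as angle_diff/2):
angle_diff_test() ->
    10  = angle_diff(0, 10),
    -2  = angle_diff(10, 8),
    -20 = angle_diff(10, 350),
    20  = angle_diff(350, 10),
    90  = angle_diff(270, 0),
    ok.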
There's now a follow-up.
(If you liked this, you might like Let's Take a Trivial Problem and Make it
Hard.)
What Do People Like?
I wrote Flickr as a Business
Simulator in earnest, but I think it was interpreted
more as a theoretical piece. When you build something with
the eventual goal of releasing it to the world, the key
question is "Will people like this?" And, really, you just
won't know until you do it. There's nothing like the
actions of tens of thousands of independently acting
individuals who have no regard for your watertight
theories. What Flickr provides is a way to make lots of
quick little "product" releases, and see if your
expectations line up with reality. Is this my primary use
of Flickr? No! But the educational opportunity is there,
regardless. Click on a photo to go to the Flickr page.
The rest of this entry is an annotated list of some
photos I've posted to Flickr, with both my conjectures of
how they'd be received and what actually happened. In each
case the photo comes first with the commentary
following.
I sat on this photo for a while after shooting it. I
thought it was cliche, something everyone had already seen
many times. A real photographer wouldn't bother to recreate
such an image. Then I posted it...and it got an immediate
string of comments and favorites. It's still one my top ten
overall Flickr photos according to the stats system. My
invented emphasis on originality didn't matter.
I thought these skid marks in front of a local liquor
superstore were photoworthy, but the result didn't grab me.
Like the sunset wheat photo, it took on a life of its own
on Flickr. Was the hook in the impossibility of those tire
tracks? That they look like a signature? Why was I unable
to see the appeal before uploading it? It even ended
up--with permission--in Scott Berkun's The Myths of
Innovation (O'Reilly, 2007).
This one I liked immediately. The red arrow. The odd
framing. The blown-out white background that makes the rust
pop. The Flickr reaction...well there wasn't one. Is the
industrial decay photo niche saturated? Would it have been
a hit if I worked at getting hundreds of dedicated
followers first? Or maybe I like it because it's better
than other photos I've taken recently, but not all that
great in absolute terms?
Oh so cleverly titled "I'm Lovin' IT!" I knew this was a
novelty. It pulled in some novelty linkage as a ha-ha photo
of the day sort of thing. It didn't get anywhere near the
exposure of that Yahoo ad next to the 404 distance on a
home run fence. The traffic from "I'm Lovin' IT!" was
transient, adding points to the view counter, but as they
weren't Flickr users they didn't add comments or favorites.
In the end it was an empty success.
Explaining Functional Programming to
Eight-Year-Olds
"Map" and "fold" are two fundamentals of functional
programming. One of them is trivially easy to understand
and use. The other is not, but that has more to do with
trying to fit it into a particular view of functional
programming than with it actually being tricky.
There's not much to say about map. Given a list and a
function, you create a new list by applying the same
function to each element. There's even special syntax for
this in some languages which removes any confusion about
whether the calling sequence is map(Function, List) or
map(List, Function). Here's Erlang code to increment the
values in List:
[X + 1 || X <- List]
Fold, well, it's not nearly so simple. Just the
description of it sounds decidedly imperative: accumulate a
result by iterating through the elements in a list. It
takes three parameters: a base value, a list, and a
function. The last of these maps a value and the current
accumulator to a new accumulator. In Erlang, here's a fold
that sums a list:
lists:foldl(fun(X, Sum) -> X + Sum end, 0, List)
It's short, but it's an awkward conciseness. Now we've
two places where the parameter order can be botched. I
always find myself having to stop and think about the
mechanics of how folding works--and the difference between
left and right folding, too (lists:foldl
is a
left fold). I would hardly call this complicated, but that
step of having to pause and run through the details in my
head keeps it from being mindlessly intuitive.
Compare this to the analog in array languages like APL
and J. The "insert" operation inserts a function between
all the elements of a list and evaluates it. Going back to
the sum example, it would be "+/" in J, or "insert
addition." So this:
1 2 3 4 5 6
turns to this:
1 + 2 + 3 + 4 + 5 + 6
giving a result of 21. The mechanics here are so simple
that you could explain them to a class of second graders
and not worry about them being confused. There's nothing
about iterating or accumulating or a direction of traversal
or even parameters. It's just...insertion.
Now there are some edge cases to worry about, such as
"What does it mean to insert a function between the
elements of a list of length 1"? Or an empty list for that
matter. The standard array language solution is to
associate a base value with operators, like addition, so
summing a list containing the single value 27 is treated as
0 + 27. I'm not going to argue that APL's
insert is more general than fold, because it certainly
isn't. You can do all sorts of things with the accumulator
in a traditional fold (for example, computing the maximum
and minimum values of a list at the same time).
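Here's what that min-and-max example looks like, as a sketch;
the accumulator is a {Min, Max} pair seeded from the first
element of the list:
min_max([First | Rest]) ->
    lists:foldl(fun(X, {Min, Max}) ->
                    {erlang:min(X, Min), erlang:max(X, Max)}
                end,
                {First, First},
                Rest).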
But in terms of raw simplicity of understanding, insert
flat-out beats fold. That raises the question: Is the
difficulty many programmers have in grasping functional
programming inherent in the basic concept of
non-destructively operating on values, or is it in the
popular abstractions that have been built-up to describe
functional programming?
(If you liked this, you might like Functional Programming Archaeology.)
Free Your Technical Aesthetic from the 1970s
In the early 1990s, I used Unix professionally for a few
years. It wasn't the official Unix, nor was it Linux, but
Sun's variant called SunOS. By "used" I mean I wrote
commercial, embedded software entirely in a Unix
environment. I edited 10,000+ line files in vi. Not vim.
The original "one file loaded at a time" vi.
At the time, Unix felt clunky and old. I spent a
lot of time in a library room down the hall, going through
the shelves of manuals. It took me a long time to discover
the umask
command for changing the default
file permissions and to understand the difference between
.bashrc
and .bash_profile
and how
to use tar.
By way of comparison, on my home PC I used a third-party
command shell called 4DOS (later 4NT, and it's still
available for Windows 7 as TCC LE). It had a
wonderful command line history mechanism: type part of a
command, then press up-arrow. The bash bang-notation felt
like some weird mainframe relic. 4DOS had a built-in,
full-screen text file viewer. The Unix equivalent was the
minimalist less
command. 4DOS help was
colorful and pretty and hyperlinked. Documentation paged
through as man
pages was several steps
backward.
The Unix system nailed the core tech that consumer-level
computers were way behind on: stability and responsiveness
in a networked, multitasking environment. It was ugly, but
reliable.
In 2006, I got back into using Unix again (aside from
some day-job stuff with Linux ten years ago) in the guise
of OS X on a MacBook. The umask
command is
still there. Ditto for .bashrc
and
.bash_profile
and all the odd command line
switches for tar
and the clunky bang-notation
for history. I'm torn between wonderment that all those
same quirks and design choices still live on...and shocked
incredulity that all those same quirks and design choices
live on.
Enough time has passed since the silly days of crazed
Linux advocacy that I'm comfortable pointing out the three
reasons Unix makes sense:
1. It works.
2. It's reliable.
3. It stays constant.
But don't--do not--ever make the mistake of treating those
benefits as a reason to use Unix as a basis for your
technical or design aesthetic. Yes, there are some textbook
cases where pipelining commands together is impressive, but
that's a minor point. Yes, having a small tool for a
specific job sometimes works, but it just as often doesn't.
("Those days are dead and gone and the eulogy was delivered
by Perl,"
Rob Pike, 2004.) Use Unix-like systems because of the
three benefits above, and simultaneously realize that it's
a crusty old system from a bygone era. If you put it up on
a pedestal as a thing of beauty, you lose all hope of
breaking away from a sadly outdated programmer
aesthetic.
(If you liked this, you might like My
Road to Erlang.)
One Small Step Toward Reducing Programming Language
Complexity
I've taught Python a couple of times. Something that
experience made clear to me is just how many concepts and
features there are, even in a language designed to be
simple. I kept finding myself saying "Oh, and there's one
more thing..."
Take something that you'd run into early on, like
displaying what's in a dictionary:
for key, value in dictionary.iteritems():
    print key, value
Tuples are a bit odd in Python, so I put off talking
about them as long as possible, but that's what
iteritems
returns, so no more dodging that.
There's multiple assignment, too. And what the heck is
iteritems
anyway? Why not just use the
keys
method instead? Working out a clean path
that avoids constant footnotes takes some effort.
This isn't specific to Python. Pick any language and it
likely contains a larger interconnected set of features
than it first appears. Languages tend to continually grow,
too, so this just gets worse over time. Opportunities to
reverse that trend--backward compatibility be
damned!--would be most welcome. Let me propose one.
The humble string constant has a few gotchas. How to
print a string containing quotes, for example. In Python
that's easy, just use single quotes around the string that
has double quotes in it. It's a little more awkward in
Erlang and other languages. Now open the file
"c:\my_project\input.txt" under windows. You need to type
"c:\\my_projects\\input.txt", but first you've got to say
"Oh, and there's one more thing" and explain about how
backslashes work in strings.
Which would be fine...except the backslash notation for
string constants is, in the twenty-first century, an
anachronism.
Who ever uses "\a" (bell)? Or "\b" (backspace)? Who even
knows what "\v" (vertical tab) does? The escape sequence
that gets used more than all the others combined is "\n"
(newline), but it's simpler to have a print function that
puts a "return" at the end and one that doesn't. Then
there's "\t" (tab), but it has it's own set of quirks, and
it's almost always better to use spaces instead. The price
for supporting a feature that few people use is core
confusion about what a string literal is in the first
place. "The length of "\n\n" isn't four? What?"
There's an easy solution to all of this. Strings are
literal, with no escapes of any kind. Special characters
are either predefined constants (e.g., TAB, CR, LF) or
created through a few functions (e.g., char(Value),
unicode(Name)). Normal string concatenation
pastes them all together. In Python:
"Content-type: text/plain" + NL + NL
In Erlang:
"Content-type: text/plain" ++ NL ++ NL
In both cases, the compiler mashes everything together
into one string. There's no actual concatenation taking
place at runtime.
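You can approximate the flavor of this in today's Erlang with
a macro or two (my own sketch; whether the trailing ++ on
literals gets folded at compile time is up to the compiler, so
don't read this as a claim about the generated code):
-define(NL, [10]).   %% the newline character, no escape needed
header() ->
    "Content-type: text/plain" ++ ?NL ++ ?NL.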
Note that in Python you can get rid of backslash
notation by preceding a string with the "r" character
(meaning "raw"), like this:
r"c:\my_projects\input.txt"
But that adds another feature to the language, one to
patch up the problems caused by the first.
(If you liked this, you might like In
Praise of Non-Alphanumeric Identifiers.)
Stop the Vertical Tab Madness
In One Small Step Toward Reducing
Programming Language Complexity I added "Who even knows
what "\v" (vertical tab) does?" as an off the cuff comment.
Re-reading that made me realize something that's blatantly
obvious in retrospect, so obvious that I've gone all this
time without noticing it:
No one is actually using the vertical tab escape
sequence.
And I truly mean "no one." If I could stealthily patch
the compiler for any language supporting the "\v" escape so
I'd receive mail whenever it occurred in source code, then
I could trace actual uses of it. I'm willing to bet that
all the mail would come from beginners trying to figure out
what the heck "\v" actually does, and then giving up when
they realize it doesn't do anything. That's because it
doesn't do anything, except with some particular
printers and terminal emulators, and in those cases you're
better off not relying on it anyway.
And yet this crazy old feature, one that no one
understands or uses, one that doesn't even do anything in
most cases, not only gets special syntax in modern
programming languages, it's consistently given space in the
documentation, even in tutorials. It's in the official
Lua docs.
It's in MIT
web course notes about printf. It's in the books
Programming in Python 3 and
Python Essential Reference. It's in an introduction
to Python strings. It's in the standard
Erlang documentation, too.
[Insert conspiracy theory involving Illuminati
here.]
Here's my simple plea: stop it. Stop mentioning vertical
tabs in tutorials and language references. Drop the "\v"
sequence in all future programming languages. Retroactively
remove it from recent languages, like Python 3. Yes, ASCII
character number 11 isn't going away, but there's no reason
to draw attention to a relic of computing past.
Surprisingly, the "\v" sequence was removed from one
language during the last decade: Perl. And even more
surprisingly, there's a 2010
proposal to add escaped character sequences, including
vertical tab, to the famously minimal language, Forth.
(If you liked this, you might like Kilobyte Constants, a Simple and Beautiful Idea
that Hasn't Caught On.)
Personal Programming
I've mentioned before that this
site is generated by a small Perl script. How small?
Exactly 6838 bytes, which includes comments and an HTML
template. Mentioning Perl may horrify you if you came here
to read about Erlang, but it's a good match for the
problem. Those 6838 bytes have been so pleasant to work on
that I wanted to talk about them a bit.
I've used well-known blogging applications, and each
time I've come away with the same bad taste, one that's
caused by a combination of quirky rich formatting and
having to edit text in a small window inside of a browser.
I don't want to worry about presentation details: choosing
fonts, line spacing, etc. It's surprising how often there
are subtle mismatches between the formatting shown in a
WYSIWYG editing window and what the final result looks
like. Where did that extra blank line come from? Why do
some paragraphs have padding below them but others
don't?
I decided to see if I could bypass all of that and have
a folder of entries marked-up with basic annotations, then
have a way to convert that entire folder into a real site.
And that's pretty much what I ended up with. The sidebar
and "Previously" list and dated permalink are all
automatically generated. Ditto for the atom feed and
archives page. The command-line
formatter lets me rebuild any page, defaulting to the
newest entry. If I want to change the overall layout of the
site, I can regenerate all of the pages in a second or
so.
There are still legitimate questions about the path I
chose. "Why not grab an open source program and modify it
to fit your needs?" "You do realize that you decided to
write an entirely new system from scratch, just because you
didn't like a few things in existing programs; surely
that's a serious net loss?"
My response is simple: I did it because it was
easy. If I get annoyed with some feature of
Microsoft Word, I'm not going to think even for a second
about writing my own alternative. But the site generation
program just took a bit of tinkering here and there over
the course of a weekend. It never felt daunting. It didn't
require anything I'd label as "engineering." I spent more
time getting the site design and style sheet right,
something I would have done even if I used other
software.
Since then, I've made small adjustments and additions to
the original script. I added the archive page nine months
later. Earlier this year I added a fix for some smaller
feed aggregation sites that don't properly handle relative
links. Just today I added mark-up support for block quotes.
That last one took ten minutes and four lines of code.
If this suddenly got complicated, if I needed to support
reader comments and ten different
feed formats and who knows what else, I'd give it up. I
have no interest in turning these 6838 bytes into something
that requires a grand architecture to keep from collapsing.
But there's some magic in a solution that's direct,
reliable, easy to understand, and one that fits my personal
vision of how it should work.
(If you liked this, you might like Micro-Build Systems and the Death of a Prominent
DSL.)
Common Sense, Part 1
There's a photo of mine in the September 2010 issue of
Popular Photography. I'm excited about it; my photo
credits are few and far between, and it brings back the
feelings I had when I wrote for
magazines long ago. Completely ignoring the subject of the
image, there are couple of surprising facts about it.
The first is that it was a taken on a circa-2004 Canon
PowerShot G5, a camera with a maximum resolution of five
megapixels.
The second is that it's a doubly-compressed JPEG. The
original photo was a JPEG, then I adjusted the colors and
contrast a bit, and saved it out as a new JPEG. Each save
lost some of the image quality. I was perfectly willing to
submit the adjusted photo as a giant TIFF to avoid that
second compression step, but was told not to worry about
it; the JPEG would be fine.
Yet there it is: the five megapixel, doubly-compressed
photo, printed across almost two pages of the magazine. And
those two technical facts are irrelevant. I can't tell the
difference; it looks great in print.
Now it is an impressionistic shot, so it could
just be that the technical flaws aren't noticeable in this
case. Fortunately, I have another anecdote to back it
up.
Last year I was in New Mexico and took a lot of photos.
After I got back home, I decided to get a few photo books
printed. The source images were all twelve megapixel JPEGs,
but the book layout software recommended a six megapixel
limit. I cut the resolution in half, again
twice-compressing them. When I got the finished books back,
the full-page photos were sharp and beautiful.
The standard, pedantic advice about printing photos is
that resolution is everything. Shoot as high as possible.
Better yet, save everything as RAW files, so there's no
lossy compression. Any JPEG compression below maximum is
unacceptable. Double-compression is an error of the highest
order, one only made by rank amateurs. And so it goes. But
I know from personal experience that while it sounds
authoritative, and while it's most likely given in a
well-meaning manner, it's advice that's endlessly repeated
in a loose, "how could it possibly be wrong?" sort of way
and never actually tested.
Erlang vs. Unintentionally Purely Functional
Python
Here's a little Python function that should be easy to
figure out, even if you don't know Python:
def make_filename(path):
    return path.lower() + ".jpg"
I want to walk through what's going on behind the scenes
when this function executes. There is, of course, a whole
layer of interpreting opcodes and pushing and popping
parameters, but that's just noise.
The first interesting part is that the
lower()
method creates an entirely new string.
If path
contains a hundred characters, then
all hundred of those are copied to a new string in the
process of being converted to lowercase.
The second point of note is that the append
operation--the plus--is doing another copy. The entire
lowercased string is moved to a new location, then the four
character extension is tacked on to the end. The original
path
has now been copied twice.
Those previous two paragraphs gloss over some key
details. Where is the memory for the new strings coming
from? It's returned by a call to the Python memory
allocator. As with all generic heap management functions,
the execution time of the Python memory allocator is
difficult to predict. There are various checks and
potential fast paths and manipulations of linked lists. In
the worst case, the code falls through into C's
malloc
and the party continues there.
Remember, too, that objects in Python have headers, which
include reference counts, so there's more overhead that
I've ignored.
Also, the result of lower
gets thrown out
after the subsequent concatenation, so I could peek into
"release memory" routine and see what's going on down in
that neck of the woods, but I'd rather not. Just realize
there's a lot of work going on inside the simple
make_filename
function, even if the end result
still manages to be surprisingly
fast.
A popular criticism of functional languages is that a
lack of mutable variables means that data is copied around,
and of course that just has to be slow, right?
A literal Erlang translation of
make_filename
behaves about the same as the
Python version. The string still gets copied twice, though
in Erlang it's a linked list, which uses eight bytes per
character given a 32-bit build of the language. If the
string in Python is UTF-8 (the default in Python 3), then
it's somewhere between 1 and 4 bytes per character,
depending. The big difference is that memory allocation in
Erlang is just a pointer increment and bounds check, and
not a heavyweight call to a heap manager.
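For reference, here's roughly what that literal translation
might look like--a sketch of mine, not code from the original
entry, assuming the string module's to_lower:
string_to_filename(Path) ->
    %% sketch only: to_lower copies Path into a new lowercased list,
    %% and the append copies that list again before adding the extension
    string:to_lower(Path) ++ ".jpg".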
I'm not definitively stating which language is faster
for this specific code, nor does it matter to me. I suspect
the Erlang version ends up running slightly longer, because
the lowercase function is itself written in Erlang, while
Python's is in C. But all that "slow" copying of memory
isn't even part of the performance discussion.
(If you liked this, you might like Functional Programming Went Mainstream Years
Ago.)
Advice to Aimless, Excited Programmers
I occasionally see messages like this from aimless,
excited programmers:
Hey everyone! I just learned Erlang/Haskell/Python,
and now I'm looking for a big project to write in it.
If you've got ideas, let me know!
or
I love Linux and open source and want to contribute
to the community by starting a project. What's an
important program that only runs under Windows that
you'd love to have a Linux version of?
The wrong-way-aroundness of these requests always
puzzles me. The key criterion is a programming language or an
operating system or a software license. There's nothing
about solving a problem or overall usefulness or any
relevant connection between the application and the
interests of the original poster. Would you trust a music
notation program developed by a non-musician? A Photoshop
clone written by someone who has never used Photoshop
professionally? But I don't want to dwell on the negative
side of this.
Here's my advice to people who make these queries:
Stop and think about all of your personal interests and
solve a simple problem related to one of them. For example,
I practice guitar by playing along to a drum machine, but I
wish I could have human elements added to drum loops, like
auto-fills and occasional variations and so on. What would
it take to do that? I could start by writing a simple drum
sequencing program--one without a GUI--and see how it went.
I also take a lot of photographs, and I could use a tagging
scheme that isn't tied to a do-everything program like
Adobe Lightroom. That's simple enough that I could create a
minimal solution in an afternoon.
The two keys: (1) keep it simple, (2) make it something
you'd actually use.
Once you've got something working, then build a series
of improved versions. Don't create pressure by making a
version suitable for public distribution, just take a long
look at the existing application, and make it better. Can I
build an HTML 5 front end to my photo tagger?
If you keep this up for a couple of iterations, then
you'll wind up an expert. An expert in a small,
tightly defined problem domain, maybe one relevant only to you,
yes, but an expert nonetheless. There's a very interesting
side effect to becoming an expert: you can start
experimenting with improvements and features that would
have previously looked daunting or impossible. And those
are the kind of improvements and features that might all of
a sudden make your program appealing to a larger
audience.
A Concurrent Language for Non-Concurrent Software
Occasionally I get asked why, as someone who uses Erlang
extensively, do I rarely talk about concurrency?
The answer is because concurrency is not my primary
motivation for using Erlang.
Processes themselves are wonderful, and I often use them
as a way to improve modularity. Rather than passing the
state of the world all over the place, I can spin off
processes that capture a bit of it. This works surprisingly
well, but it's just a coarser-grained version of creating
objects in Python or other languages. Most of the time when
I send a message to another process, my code sits and waits
for the result to come back, which is hardly
"concurrency."
Suppose Erlang didn't have processes at all. Is there
still anything interesting about the language? To me, yes,
there is. I first tried functional
programming to see if I could think at a higher level, so I
could avoid a whole class of concerns that I was tired of
worrying about. Erlang is further down the purely
functional road than most languages, giving the benefits
that come with that, but at the same time there's a
divergence from the hardcore, theoretical beauty of
Haskell. There's no insistence on functions taking a single
value, and there isn't a typing-first viewpoint. The result is
being able to play fast and loose with a handful of data
types--especially atoms--and focus on how to arrange and
rearrange them in useful ways.
(Okay, there are some small things I like about Erlang
too, such as being able to introduce named values without
creating a new scope that causes creeping indentation. It's
the only functional language I've used that takes this
simple approach.)
The angle of writing code that doesn't involve
micromanaging destructive updates takes some time to sink
in. Possibly too long, and that's something almost always ignored when
presenting a pathologically beautiful one-liner that makes
functional programming look casually effortless. There are
a number of techniques that aren't obvious, that aren't
demonstrated in tutorials. I wrote about
one in 2007. And here's another:
Lists in Erlang--and Haskell and Scheme--are
singly-linked. Given a list, you can easily get the next
element. Getting the previous element looks
impossible; there's no back pointer to follow. But that's
only true if you're looking at the raw definition of lists.
It's easy if you add some auxiliary data. When you
step forward, remember the element you just moved away
from. When you step back, just grab that element. The data
structure looks like this:
{Previous_Items, Current_List}
To move through a list, start out with {[],
List}
. You can step forward and back with two
functions:
forward({Prev, [H|T]}) ->
    {[H|Prev], T}.

back({[H|T], L}) ->
    {T, [H|L]}.
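To see the mechanics, here's a quick walk through a
three-element list (the values are mine, just for
illustration):
forward({[], [a,b,c]})    gives {[a], [b,c]}
forward({[a], [b,c]})     gives {[b,a], [c]}
back({[b,a], [c]})        gives {[a], [b,c]} again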
Wait, isn't that cheating? Creating a new list on the
fly like that? No, that's the point of being free
from thinking about managing memory or even instantiating
classes.
This Isn't Another Quick Dismissal of Visual
Programming
I stopped following technical forums for three reasons:
pervasive negativity, waning interest on my part, and the
realization that I could predict the responses to most questions. "I bet this
devolves into a debate about the validity of the singleton
pattern." *click* "Ha! I knew it! Wait...why am I wasting
my time on this?"
"Visual programming" is one of those topics that gets
predictable responses. I don't mean "visual" in the
GUI-design sense of the word, like Visual C++. I mean
building programs without creating text files, programming
where the visual component is the program.
Not too surprisingly, this is a subject that brings on
the defensiveness. It's greeted with diatribes about "real"
programming and how drawing lines to connect components is
too limiting and links to articles about the glory of raw
text files. At one time I would have agreed, but more and
more it's starting to smack of nostalgic "that's how it's
always been done"-ness.
In Microsoft's old QBasic, programs
were still text files, but you could choose to view them as
a grid of subroutine names. Click one and view code for
that function--and only that function. It was very
different than moving around within a single long document.
There was the illusion that each function was its own
entity, that a program was a collection of smaller parts.
Behind the scenes, of course, all of those functions were
still part of a single text file, but no matter. That a
program was presented in a clear, easily navigable way
changed everything. Was this visual programming? No, but it
was a step away from thinking about coding as editing text
files.
Take that a bit further: what if a visual representation
of a program made it easier to understand, given code
you've never seen before? To get in the proper frame of
mind, go into the Erlang distribution and load up one of
the compiler modules, like beam_jump.erl
or
beam_block.erl
. Even with all the comments at
the top of the former, just coming to grips with the
overall flow of the module takes some effort. I'm not
talking about fully understanding the logic, but simply
being able to get a picture of how data is moving through
the functions. Wouldn't a nice graphical representation
make that obvious?
Neither of these examples is visual programming. I'd
call them different forms of program visualization.
Regardless, I can see the benefits, and I suspect there are
larger gains to be had with other non-textual ways of
dealing with code. I don't want to block out those
possibilities because I'm too busy wallowing in my
text-file comfort zone.
Easy to Please
I have favorited over seven thousand photos on Flickr.
"Favoriting" is not a valuable currency. Clicking the
"Add to Faves" icon means I like a photo, I'm inspired by
it, and I want to let the photographer know this. Doing so
doesn't cost me anything, and it doesn't create any kind of
tangible reward for whoever took the photo. As such, I don't
feel a need to be tight-fisted about it. If I enjoy a
photo, I favorite it. For photos that don't grab me, I
don't do anything. I look at them, but "Add to Faves"
remains unclicked. There's no negativity, no reason to bash
them.
With over seven thousand favorites, I'm either easy to
please or have low standards--or both.
I prefer "easy to please," but the "low standards" angle
is an interesting one. There's a Macintosh rumors site that
lets users rate stories as positive or negative. Here's a
story from earlier this year that I can't imagine causing
any bitterness: "iPhone 4 Available for Pre-Order from
Apple." It has 545 negative votes. Who are these people? I
don't mean "people who don't like or own an iPhone," I mean
"people who would follow an Apple rumors site and down-vote
the announcement of a flagship product that's both exciting
and delivers on its promises." Clearly those people need to
have lower standards or they'll never be happy.
I'll let you in on a secret: it's okay to not think
about technology on an absolute scale. And it's also okay
to use two competing technologies without being focused on
which is better.
My dad bought an Atari 800 when I was 14, so that's what
I used to write my own games. He later got an Apple //e,
and there was a stark difference between the two computers.
The Atari was full of colors and sounds and sprites. The
Apple was full of colors, but it was connected to a
monochrome monitor that only showed green, so I couldn't
experience the others. Sounds? Funny clicks and buzzes.
Hardware sprites? No. But it had its own charms, and I
wrote a couple of games for it, too.
I like using Erlang to solve problems. I also like using
Perl. And C. And Python. It might look like there's some
serious internal inconsistency going on, and I guess that's
true. I think about functional programming when using
Erlang. It's a hybrid of procedural and functional
approaches with Perl and Python. In C I think about
pointers and how memory is arranged. Is one of these
approaches superior? Sometimes. And other times a different
one is.
I'm not even sure that many extreme cases of putting
down technologies hold water. The anti-PHP folks seem
rather rabid, but that's not stopping people from happily
using it. Despite the predictable put-downs of BASIC, I'm
fond of BlitzMax, which is an oddball BASIC built around
easy access to graphics and sound. Everyone makes fun of
Windows, but it's an understatement to say that it clearly
works and is useful (and I'm saying that even though I'm
typing on a MacBook).
(If you liked this, you might like Slumming with BASIC Programmers.)
Good-Bye to the Sprawling Suburbs of Screen Space
Application platforms are going off in two completely
different directions. Desktop monitors keep getting
bigger--using two or three monitors at once isn't the
rarity that it once was--and then there's the
ultra-portable end of things: iPhone, iPad, Blackberry,
Nintendo DS.
The common reaction to this is that large monitors are
good for programmers and musicians and Photoshop
users--people doing serious work--and portable devices let
you take care of some common tasks when away from your
desk.
I don't see it that way at all. The freedom to do
creative work when and where I want to, and not be tied to
a clunky computer at a desk, has great appeal. Notebooks
are a good start, but they're still rooted in the old
school computer world: slow booting, too much emphasis on
mental noise (menus, moving windows around, pointless
manipulation of items on a virtual desktop). And there are
the beginnings of people using portable devices for real
work. At
least
two people have made serious attempts at writing novels
on their iPhones. Check out what a professional photographer
can do with an older model iPhone (see "iPhone as Art" in
the "Portfolios" section).
There's becoming a great divide in terms of UI design
for desktops and ultra-portables. Desktop UIs need all the
surface area they can get, with more stuff shown at once,
with more docked or floating palettes. Just look at any
music
production
software. Or any graphic
arts package. None of these interfaces translates over
to a pocket-sized LCD screen.
There's truth to the human interface guideline that flat
is (often) better than nested. That's why the tool ribbons
in Microsoft Office take less fumbling than a set of nested
menus, and why easy-to-use websites keep most options right
there in front of you, ready to click. Flat, not
surprisingly, takes more screen real estate. But it has
also become too easy to take advantage of all those pixels
on a large monitor simply because they exist. I can't help
but see some interfaces as the equivalent of the "data
dump" style of PowerPoint presentation (see Presenting to
Win by Jerry Weissman). The design choices are not
choices so much as simply deciding to show
everything: as many toolbars and inspectors and
options as possible. If it's too much, let the user sort it
out. Make things float or dock and provide customization
settings and everyone is running at 1600-by-something
resolution on a 19+ inch monitor anyway.
Except on an iPhone.
I don't have an all-encompassing solution for how to
make giant-interface apps run on a screen that fits in your
pocket, but simply having that as a goal changes things.
Instead of being enslaved to cycles and bytes--which rarely
make or break programs in the way that optimization
obsessed developers long for--there's a much more relevant
limited resource to be concerned with the conservation of:
pixels. How to take the goings-on of a complex app and
present them on a screen with some small number of square
inches of screen space?
(If you liked this, you might like Optimizing for Fan Noise.)
Learning to Ignore Superficially Ugly Code
Back when I was in school and Pascal was the required
language for programming assignments, I ran across a book
by Henry Ledgard: Professional Pascal. Think of it
as the 1980s version of Code Complete. This was my
first exposure to concerns of code layout, commenting
style, and making programs look pretty. At the time the
advice and examples resonated with me, and for a long time
afterward I spent time adjusting and aligning my code so it
was aesthetically pleasing, so the formatting accentuated
structural details.
To give a concrete example, I'd take this:
win_music_handle = load_music("win.aiff");
bonus_music_handle = load_music("bonus.aiff");
high_score_music_handle = load_music("highscore.aiff");
and align it like this:
win_music_handle        = load_music("win.aiff");
bonus_music_handle      = load_music("bonus.aiff");
high_score_music_handle = load_music("highscore.aiff");
The theory was that this made the repeated elements
obvious, that at a quick glance it was easy to see that
three music files were being loaded. Five or six years ago
I wrote a filter that takes a snippet of Erlang code and
vertically aligns the "->" arrows. I still have it
mapped to ",a" in vim.
More and more, though, I'm beginning to see code
aesthetics as irrelevant, as a distraction. After all, no
one cares what language an application was written in, and
certainly no one cares about the way a program was
architected. How the program actually looks is far
below either of those. Is there a significant
development-side win to having pleasingly formatted code?
Does it make any difference at all?
In the manual for Eric Isaacson's A86 assembler (which I used
in the early 1990s), he advises against the age-old
practice of aligning assembly code into columns, like
this:
add     eax, 1
sub     eax, ebx
call    squish
jc      error
His view is that the purpose of a column is so you
can scan down it quickly, and there's no reason you'd ever
want to scan down a list of unrelated operands. The
practice of writing columnar code comes from ancient tools
that required textual elements to begin in specific
column numbers. There's a serious downside to this layout
methodology, too: you have to take the time to make sure
your code is vertically aligned.
How far to take a lack of concern with layout
aesthetics? Here's a quick test. Imagine you're looking for
an error in some code, and narrowed it down to a single
character in this function:
void amazing_adjustment(int amount)
{
int temp1 = Compartment[1].range.low;
int temp_2 = Compartment[ 2 ].range.low;
int average = (temp1 + temp_2)/2;
Global_Adjustment =
average;
}
The fix is to change the first index from 1 to 0. Now
when you were in there, would you take the time to
"correct" the formatting? To adjust the indentation? To
remove the inconsistencies? To make the variable names more
meaningful? Or would you just let it slide, change the one
character, and be done with it? If you do decide to make
those additional fixes, who is actually benefiting from
them?
(If you liked this, you might like Macho Programming.)
Instant-On
"Mobile" is the popular term used to describe devices
like the iPhone and iPad. I prefer "instant-on." Sure, they
are mobile, but what makes them useful is that you
can just turn them on and start working. All the usual
baggage associated with starting up a computer--multiple
boot sequences that add up to a minute or more of time,
followed by a general sluggishness while things settle
down--is gone.
What's especially interesting to me is that instant-on
is not new, not by any means, but it was set aside as a
goal, even considered impossible, the stuff of fantasy.
Turn on any 1970s-era calculator. It's on and usable
immediately.
Turn on any 1970s or 1980s game console. It's on and
usable immediately.
Turn on any 8-bit home computer. Give it a second or
two, and there's the BASIC prompt. You can start typing
code or use it as a fancy calculator (a favorite example of
Jef Raskin). To be fair, it wasn't quite so quick as soon
as you started loading extensions to the operating system
from a floppy disc (such as Atari DOS).
That it got to where it wasn't unusual for a PC to take
from ninety seconds to two minutes to fully boot up shows
just how far things had strayed from the simple, pleasing
goal of instant-on. Yes, operating systems were bigger and
did more. Yes, a computer from 2000 was so much more
powerful than one from 1985. But those long boot times kept
them firmly rooted in the traditional computer world. They
reveled in being big iron, with slow self-testing sequences
and disjoint flickering between different displays of
cryptic boot messages.
And now, thankfully, instant-on is back. Maybe not truly
instant; there's still a perceived start-up time on an
iPad. But it's short enough that it doesn't get in the way,
that by the time you've gotten comfortable and shifted into
the mindset for your new task the hardware is ready to use.
That small shift from ninety seconds to less than ten makes
all the difference.
(If you liked this, you might like How
Much Processing Power Does it Take to be Fast?.)
Write Code Like You Just Learned How to Program
I'm reading
Do More Faster, which is more than a bit of an
advertisement for the TechStars start-up incubator, but
it's a good read nonetheless. What struck me is that
several of the people who went through the program,
successfully enough to at least get initial funding, didn't
know how to program. They learned it so they could
implement their start-up ideas.
Think about that. It's like having a song idea and
learning to play an instrument so you can make it real. I
suspect that the learning process in this case would
horrify most professional musicians, but that horror
doesn't necessarily mean that it's a bad idea, or that the
end result won't be successful. After all, look at how many
bands find success without the benefit of a degree in music
theory.
I already knew how to program when I took an "Intro to
BASIC" class in high school. One project was to make a
visual demo using the sixteen-color, low-res mode of the
Apple II. I quickly put together something algorithmic,
looping across the screen coordinates and drawing lines and
changing colors. It took me about half an hour to write and
tweak, and I was done.
I seriously underestimated what people would create.
One guy presented this amazing demo full of animation
and shaded images. I'm talking crazy stuff, like a skull
that dripped blood from its eye into a rising pool at the
bottom of the screen. And that was just one segment of his
project. I was stunned. Clearly I wasn't the hotshot
programmer I thought I was.
I eventually saw the BASIC listing for his program. It
was hundreds and hundreds of lines of statements to change
colors and draw points and lines. There were no loops or
variables. To animate the blood he drew a red pixel,
waited, then drew another red pixel below it. All the
coordinates were hard-coded. How did he keep track of where
to draw stuff? He had a piece of graph paper that he
updated as he went.
My prior experience hurt me in this case. I was thinking
about the program, and how I could write something
that was concise and clean. The guy who wrote the skull
demo wasn't worried about any of that. He didn't care about
what the program looked like or how maintainable it was. He
just wanted a way to present his vision.
There's a lesson there that's easy to forget--or ignore.
It's extremely difficult to be simultaneously concerned
with the end-user experience of whatever it is that you're
building and the architecture of the program that delivers
that experience. Maybe impossible. I think the only way to
pull it off is to simply not care about the latter. Write
comically straightforward code, as if you just learned to
program, and go out of your way to avoid wearing any kind of
software engineering hat--unless what you really want to be
is a software engineer, and not the designer of an
experience.
(If you liked this, you might like Coding as Performance.)
A Three-Year Retrospective
This is not a comprehensive index, but a categorization
of some of the more interesting or well-received entries
from November 2007 through December 2010. Feel free to dig
through the archives if
you want everything. Items within each section are in
chronological order.
popular
functional programming
Erlang
personal
progress
J
Forth
other entries that I think were successful
Accidental Innovation, Part 1
In the mid-1980s I was writing 8-bit computer games.
Looking back, it was the epitome of indie. I came up with
the idea, worked out the design, drew the art, wrote the
code, made the sound effects, all without any kind of
collaboration or outside opinions. Then I'd send the
finished game off to a magazine like ANALOG
Computing or Antic,
get a check for a few hundred dollars, and show up in print
six months or a year later.
Where I got the ideas for those games is a good
question. I was clearly influenced by frequent visits to
arcades, but I also didn't just want to rip off designs and
write my own versions of the games I played there.
I remember seeing screenshots of some public domain
games that appeared to have puzzle elements, though I never
actually played them so I didn't know for sure. The puzzley
aspects may have been entirely in my head. I had a flash of
an idea about arranging objects so that no two of the same
type could touch. Initially those objects were oranges,
lemons, and limes, which made no sense, and there was still
no gameplay mechanic. Somewhere in there I hit upon the
idea of dropping the fruits in a pile and switched the
theme to be about cans of radioactive waste that would hit
critical mass if the same types were near each other for
more than a moment.
Here's what I ended up with, designed and implemented in
five days, start to finish: Uncle Henry's Nuclear Waste
Dump.
Missing image: Uncle Henry's Nuclear Waste Dump
At first glance, this may look like any of the dozens of
post-Tetris puzzle games. There's a pit that you drop stuff
into, and you move back and forth over the top of it. Stuff
piles up at the bottom, and the primary gameplay is in
deciding where to drop the next randomly-determined item.
It's all very familiar, and except that the goal is
reversed--to fill the pit instead of preventing it from
filling--there's nothing remarkable here.
Except that Tetris wasn't released in the United States
until 1987 and Uncle Henry's Nuclear Waste Dump was
published in 1986. I didn't even know what Tetris was until
a few years later.
Was Uncle Henry's as good as Tetris? No. Not a chance. I
missed the obvious idea of guiding each piece as it fell
and instead used the heavy-handed mechanic of releasing the
piece before a timer expired (that's the number in the
upper right corner). And keeping three waste types
separated wasn't nearly as addictive as fitting irregular
shapes together. Overall the game wasn't what it could have
been, and yet it's interesting that it's instantly
recognizable as having elements of a genre that didn't
exist at the time.
So close.
(While looking for the above screenshot, I found that
someone wrote a
clone in 2006. The part about the game ending when the
pile gets too high sounds like the gameplay isn't exactly
the same, but the countdown timer for dropping a can is
still there.)
Part 2
Accidental Innovation, Part 2
In 1995 I was writing a book, a collection of interviews
with people who wrote video and computer games in the
1980s. I had the inside track on the whereabouts of many of
those game designers--this was before they were easy to
find via Google--and decided to make use of that knowledge.
But at the time technology books that weren't "how to" were
a tough sell, so after a number of rejections from
publishers I set the project aside.
A year later, my wife and I were running a small game
development company, and those interviews resurfaced as a
potential product. We were already set up to handle orders
and mail games to customers, so the book could be entirely
digital. But what format to use? PDF readers were clunky
and slow. Not everyone had Microsoft Word. Then it hit me:
What about using HTML as a portable document format? I
know, I know, it's designed to be a portable
document format, but I was thinking of it as an off-line
format, not just for viewing websites. I hadn't seen anyone
do this yet. It was before HTML became a common format for
documentation and help files.
And so for the next couple of years people paid $20 for
Halcyon
Days: Interviews with Classic Computer and Video Game
Programmers. Shipped via U.S. mail. On a 3 1/2 inch
floppy disc. Five years later I put it on the web for
free.
Even though I made the leap of using HTML for off-line
e-books, and web browsers as the readers, I still didn't
realize how ubiquitous HTML and browsers would become. I
don't remember the details of how it happened, but I asked
John
Romero to write the introduction, which he
enthusiastically did. I mentioned that I was looking for a
distribution format for those people who didn't use
browsers, and his comment was (paraphrased): "Are you
crazy! Don't look backward! This is the future!"
Obviously, he was right.
Part 3
Accidental Innovation, Part 3
I didn't write the previous
two installments so I could build up
my ego. I wanted to give concrete examples of innovation
and the circumstances surrounding it, to show that it's not
magic or glamorous, to show that innovation is more than
sitting down and saying "Okay, time to innovate!"
It's curious how often the "innovative" stamp is applied
to things that don't fit any kind of reasonable definition
of the word. How many times have you seen text like this on
random company X's "About" page:
We develop innovative solutions which enable
enterprise-class cloud computing infrastructure
that...something something synergy...something
something "outside the box."
I've seen that enough that I've formulated a simple
rule: If you have to say that you're innovating, then
you're not. Or in a less snarky way: Innovation in itself
is an empty goal, so if you're using it in the mission
statement for the work you're doing, then odds are the rest
of the mission statement is equally vacant.
Really, the only way to innovate is to do so
accidentally.
In both the examples I gave, I wasn't thinking about how
to do things differently. I was thinking about how to solve
a problem and only that problem. The results ended up being
interesting because I didn't spend all my time fixated on
what other people had done, to the point where that's all I
could see. If I started designing a puzzle game in 2011,
I'd know all about Tetris and all the knock-offs of Tetris
and all the incremental changes and improvements that
stemmed from Tetris. It would be difficult to work within
the restrictions of the label "puzzle game" and come up
with something that transcends the boundaries of those
restrictions.
Suppose it's the late 1990s, and your goal is to design
a next generation graphical interface for desktop
PCs--something better than Windows. Already you're sunk,
because you're looking at Windows, you're thinking about
Windows, and all of your decisions will be colored by a
long exposure to Windows-like interfaces. There are icons,
a desktop, resizable windows, some kind of task bar, etc.
What you end up with will almost certainly not be
completely identical to Windows, but it won't be
innovative.
Now there are some interesting problems behind that
vague goal of building a next generation GUI. The core
question is how to let the user run multiple applications
at the same time and switch between them. And that question
has some interesting and wide-ranging answers. You can see
the results of some of those lines of thinking in current
systems, such as doing away with the "app in a movable
window" idea and having each application take over the
entire screen. Then the question becomes a different one:
How to switch between multiple full-screen apps? This is
all very different than starting with the desktop metaphor
and trying to morph it into something innovative.
(If you liked this, you might like How
to Think Like a Pioneer.)
Exploring Audio Files with Erlang
It takes surprisingly little Erlang code to dig into the
contents of an uncompressed audio file. And it turns out
that three of the most common uncompressed audio file
formats--WAV, AIFF, and Apple's CAF--all follow the same
general structure. Once you understand the basics of one,
it's easy to deal with the others. AIFF is the trickiest of
the three, so that's the one I'll use as an example.
First, load the entire file into a binary:
load(Filename) ->
    {ok, B} = file:read_file(Filename),
    B.
There's a small header: four characters spelling out
"FORM", a length which doesn't matter, then four more
characters spelling out "AIFF". The interesting part is the
rest of the file, so let's just validate the header and put
the rest of the file into a binary called B:
<<"FORM", _:32, "AIFF", B/binary>> = load(Filename).
The "rest of file" binary is broken into chunks that
follow a simple format: a four character chunk name, the
length of the data in the chunk (which doesn't include the
header), and then the data itself. Here's a little function
that breaks a binary into a list of {Chunk_Name,
Contents}
pairs:
chunkify(Binary) -> chunkify(Binary, []).

chunkify(<<N1,N2,N3,N4, Len:32,
           Data:Len/binary, Rest/binary>>, Chunks) ->
    Name = list_to_atom([N1,N2,N3,N4]),
    chunkify(adjust(Len, Rest), [{Name, Data}|Chunks]);
chunkify(<<>>, Chunks) ->
    Chunks.
Ignore the adjust
function for now; I'll
get back to that.
Given the results of chunkify
, it's easy to
find a specific chunk using lists:keyfind/3
.
Really, though, other than to test the chunkification code,
there's rarely a reason to iterate through all the chunks
in a file. It's nicer to return a function that makes
lookups easy. Replace the last line of
chunkify
with this:
fun(Name) ->
element(2, lists:keyfind(Name, 1, Chunks)) end.
The key info about sample rates and number of channels
and all that is in a chunk called COMM
and now
we've got an easy way to get at and decode that chunk:
Chunks = chunkify(B).
<<Channels:16, Frames:32,
SampleSize:16,
Rate:10/binary>> = Chunks('COMM').
The sound samples themselves are in a chunk called
SSND
. The first eight bytes of that chunk
don't matter, so to decode that chunk it's just:
<<_:8/binary, Samples/binary>> = Chunks('SSND').
Okay, now the few weird bits of the AIFF format. First,
if the size of a chunk is odd, then there's one extra pad
byte following it. That's what the adjust
function is for. It checks if a pad byte exists and removes
it before decoding the rest of the binary. The second quirk
is that the sample rate is encoded as a ten-byte extended
floating point value, and most languages don't have support
for them--including Erlang. There's an algorithm in the
AIFF spec for encoding and decoding extended floats, and I
translated it into Erlang.
Here's the complete code for the AIFF decoder:
load_aiff(Filename) ->
    <<"FORM", _:32, "AIFF", B/binary>> = load(Filename),
    Chunks = chunkify(B),
    <<Channels:16, Frames:32, SampleSize:16, Rate:10/binary>> =
        Chunks('COMM'),
    <<_:8/binary, Samples/binary>> = Chunks('SSND'),
    {Channels, Frames, SampleSize, ext_to_int(Rate), Samples}.

chunkify(Binary) -> chunkify(Binary, []).

chunkify(<<N1,N2,N3,N4, Length:32,
           Data:Length/binary, Rest/binary>>, Chunks) ->
    Name = list_to_atom([N1,N2,N3,N4]),
    chunkify(adjust(Length, Rest), [{Name, Data}|Chunks]);
chunkify(<<>>, Chunks) ->
    fun(Name) -> element(2, lists:keyfind(Name, 1, Chunks)) end.

adjust(Length, B) ->
    case Length band 1 of
        1 -> <<_:8, Rest/binary>> = B, Rest;
        _ -> B
    end.

ext_to_int(<<_, Exp, Mantissa:32, _:4/binary>>) ->
    ext_to_int(30 - Exp, Mantissa, 0).

ext_to_int(0, Mantissa, Last) ->
    Mantissa + (Last band 1);
ext_to_int(Exp, Mantissa, _Last) ->
    ext_to_int(Exp - 1, Mantissa bsr 1, Mantissa).

load(Filename) ->
    {ok, B} = file:read_file(Filename),
    B.
WAV and CAF both follow the same general structure of a
header followed by chunks. WAV uses little-endian values,
while the other two are big-endian. CAF doesn't have chunk
alignment requirements, so that removes the need for
adjust
. And fortunately it's only AIFF that
requires that ugly conversion from extended floating point
in order to get the sample rate.
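As a rough sketch of the difference--my code, not from the
original entry, with the chunkify_le name invented here--the
same chunk walk for WAV just reads the sizes as little-endian
(I've kept the same adjust call in case a chunk has an odd
length):
load_wav(Filename) ->
    %% sketch: "RIFF" header, ignored length, then "WAVE" and the chunks
    <<"RIFF", _:32/little, "WAVE", B/binary>> = load(Filename),
    chunkify_le(B, []).

chunkify_le(<<N1,N2,N3,N4, Len:32/little,
              Data:Len/binary, Rest/binary>>, Chunks) ->
    Name = list_to_atom([N1,N2,N3,N4]),
    chunkify_le(adjust(Len, Rest), [{Name, Data}|Chunks]);
chunkify_le(<<>>, Chunks) ->
    Chunks.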
Don't Distract New Programmers with OOP
When I get asked "What's a good first programming
language to teach my [son / daughter /
other-person-with-no-programming-experience]?" my answer
has been the same for the last 5+ years: Python.
That may be unexpected, coming from someone who often
talks about non-mainstream languages, but I stand by
it.
Python is good for a wide range of simple and
interesting problems that would be too much effort in C.
(Seriously, a basic spellchecker can
be implemented in a few lines of Python.) There are
surprisingly few sticking points where the solution is easy
to see, but there's a tricky mismatch between it and the
core language features. Erlang has a couple of biggies. Try
implementing any algorithm that's most naturally phrased in
terms of in-place array updates, for example. In Python the
sailing tends to be smooth. Arrays and dictionaries and
sets cover a lot of ground.
There's one caveat to using Python as an introductory
programming language: avoid the object-oriented features.
You can't dodge them completely, as fundamental data types
have useful methods associated with them, and that's okay.
Just make use of what's already provided and resist talking
about how to create classes, and especially avoid talking
about any notions of object-oriented design where every
little bit of data has to be wrapped up in a class.
The shift from procedural to OO brings with it a shift
from thinking about problems and solutions to thinking
about architecture. That's easy to see just by
comparing a procedural Python program with an
object-oriented one. The latter is almost always longer,
full of extra interface and indentation and annotations.
The temptation is to start moving trivial bits of code into
classes and adding all these little methods and
anticipating methods that aren't needed yet but might be
someday.
When you're trying to help someone learn how to go from
a problem statement to working code, the last thing you
want is to get them sidetracked by faux-engineering
busywork. Some people are going to run with those scraps of
OO knowledge and build crazy class hierarchies and end up
not as focused on what they should be learning. Other
people are going to lose interest because there's a layer
of extra nonsense that makes programming even more
cumbersome.
At some point, yes, you'll need to discuss how to create
objects in Python, but resist for as long as you can.
(November 2012 update: There's now a sequel of sorts.)
If You're Not Gonna Use It, Why Are You Building
It?
Just about every image editing or photo editing program
I've tried has a big collection of visual filters. There's
one to make an image look like a mosaic, one to make it
look like watercolors, and so on. Except for a few of the
most fundamental image adjustments, like saturation and
sharpness, I never use any of them.
I have this suspicion that the programmers of these
tools got hold of some image processing textbooks and
implemented everything in them. If an algorithm had any
tweakable parameters, then those were exposed to the user
as sliders.
Honestly, that sounds like something I might have done
in the past. The process of implementing those filters is
purely technical--almost mechanical--yet it makes the
feature list longer and more impressive. And they could be
fun to code up. But no consideration is given to whether those
filters have any practical value.
Contrast this with apps like Instagram and Hipstamatic. Those
programs use your phone's camera to grab images, then apply
built-in filters to them. They're fully automatic; you
can't make any manual adjustments. And yet unlike all of
those filter-laden photo editors I've used in the past, I'm
completely hooked on Hipstamatic. It rekindled my interest
in photography, and I can't thank the authors enough.
What's the difference between those apps and
old-fashioned photo editors?
The Hipstamatic and Instagram filters were designed with
clear goals in mind: to emulate certain retro-camera
aesthetics, to serve as starting points and inspirations
for photographs. Or more succinctly: they were built to be
used.
If you find yourself creating something, and you don't
understand how it will be used, and you don't plan on using
it yourself, then it's time to take a few steps back and
reevaluate what you're doing.
(If you liked this, you might like Advice to Aimless, Excited Programmers.)
Caught-Up with 20 Years of UI Criticism
Interaction designers have leveled some harsh criticisms
at the GUI status-quo over the last 20+ years. The mouse is
an inefficient input device. The desktop metaphor is
awkward and misguided. Users shouldn't be exposed to
low-level details like the raw file-system and having to
save their work.
And they were right.
But instead of better human/computer interaction, we got
faster processors and hotter processors and multiple
processors and entire processors devoted to 3D graphics.
None of which are bad, mind you, but it was always odd to
see such tremendous advances in hardware while the
researchers promoting more pleasant user experiences wrote
books that were eagerly read by people who enjoyed smirking
at the wrong-headedness of an entire industry--yet who
weren't motivated enough to do anything about it. Or so it
seemed.
It's miraculous that in 2011, the biggest selling
computers are mouse-free, run programs that take over the
entire screen without the noise of a faux-desktop, and the
entire concept of "saving" has been rendered obsolete.
Clearly, someone listened.
(If you liked this, you might enjoy Free Your Technical Aesthetic from the
1970s.)
Revisiting "Tricky When You Least Expect It"
Since writing Tricky When You Least
Expect It in June 2010, I've gotten a number of
responses offering better solutions to the
angle_diff
problem. The final version I
presented in the original article was this:
angle_diff(Begin, End) ->
    D = End - Begin,
    DA = abs(D),
    case {DA > 180, D > 0} of
        {true, true} -> DA - 360;
        {true, _} -> 360 - DA;
        _ -> D
    end.
But, maybe surprisingly, this function can be written in
two lines:
angle_diff(Begin, End) ->
    (End - Begin + 540) rem 360 - 180.
The key is to add 180 to the difference so that the modulo
lands in the range 0 to 359; the "- 180" at the end shifts
it back to -180 through 179. One quirk of Erlang is that the modulo
operator (rem) gives a negative result if the first value
is negative. That's easily fixed by adding 360 to the
difference (180 + 360 = 540) to ensure that it's always
positive. (Remember that adding 360 to an angle gives the
same angle.)
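A quick check with numbers of my own: angle_diff(350, 10) is
(10 - 350 + 540) rem 360 - 180, which is 200 rem 360 - 180, or
20--the short way around from 350 degrees to 10 degrees.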
So how did I miss this simpler solution? I got off track
by thinking I needed an absolute value, and things went
downhill from there. I'd like to think if I could rewind
and re-attempt the problem from scratch, then I'd see the
error of my ways, but I suspect I'd miss it the second
time, too. And that's what I was getting at when I wrote
"Tricky When You Least Expect It": that you never know when
it will take some real thought to solve a seemingly simple
problem.
(Thanks to Samuel Tardieu, Benjamin Newman, and Greg
Rosenblatt, who all sent almost identical solutions.)
Follow the Vibrancy
Back in 1999 or 2000, I started reading a now-defunct
Linux game news site. I thought the combination of
enthusiastic people wanting to write video games and the
excitement surrounding both Linux and open source would
result in a vibrant, creative community.
Instead there were endless emulators and uninspired
rewrites of stale old games.
I could theorize about why there was such a lack of
spark, a lack of motivation to create anything distinctive
and exciting. Perhaps most of the projects were intended to
fulfill coding itches, not personal visions. I don't know.
But I lost interest, and I stopped following that site.
When I wanted to modernize my
programming skills, I took a long look at Lisp. It's a
beautiful and powerful language, but I was put off by the
community. It was a justifiably smug community, yes, but it
was an empty smugness. Where were the people using this
amazing technology to build impressive applications? Why
was everyone so touchy and defensive? That doesn't directly
point at the language being flawed--not by any means--but
it seemed an indication that something wasn't right,
that maybe there was a reason that people driven to push
boundaries and create new experiences weren't drawn to the
tremendous purported advantages of Lisp. So I moved on.
Vibrancy is an indicator of worthwhile technology. If
people are excited, if there's a community of developers
more concerned with building things than advocating or
justifying, then that's a good place to be. "Worthwhile"
may not mean the best or fastest, but I'll take enthusiasm
and creativity over either of those.
(If you liked this, you might enjoy The Pure Tech Side is the Dark Side.)
Impressed by Slow Code
At one time I was interested in--even enthralled
by--low-level optimization.
Beautiful and clever tricks abound. Got a function call
followed by a return statement? Replace the pair with a
single jump instruction. Once you've realized that "load
effective address" operations are actually doing math, then
they can subsume short sequences of adds and shifts. On
processors with fast "count leading zero bits"
instructions, entire loops can be replaced with a couple of
lines of linear code.
I spent a long time doing that before I realized it was
a mechanical process.
I don't necessarily mean mechanical in the "a good
compiler can do the same thing" sense, but that it's a raw
engineering problem to take a function and make it faster.
Take a simple routine that potentially loops through a lot
of data, like a case insensitive string comparison. The
first step is to get as many instructions out of the loop
as possible. See if what remains can be rephrased using
fewer or more efficient instructions. Can any of the
calculations be replaced with a small table? Is there a way
to process multiple elements at the same time using vector
instructions?
The truth is that there's no magic in taking a
well-understood, working function, analyzing it, and
rewriting it in a way that involves doing slightly or even
dramatically less work at run-time. If I ended up with a
routine that was a bottleneck, I know I could take the time
to make it faster. Or someone else could. Or if it was
small enough I could post it to an assembly language
programming forum and come back in a couple of days when
the dust settled.
What's much more interesting is speeding up something
complex, a program where all the time isn't going into a
couple of obvious hotspots.
All of a sudden, that view through the low-level
magnifying glass is misleading. Yes, that's clearly an
N-squared algorithm right there, but it may not matter at
all. (It might only get called with low values of N,
for example.) This loop here contains many extraneous
instructions, but that's hardly a big picture view. None of
this helps with understanding the overall data flow, how
much computation is really being done, and where the
potential for simplification lies.
Working at that level, it makes sense to use a language
that keeps you from thinking about exactly how your code
maps to the underlying hardware. It can take a bit of faith
to set aside deeply ingrained instincts about performance
and concerns with low-level benchmarks, but I've seen
Python programs that ended up faster than C. I've seen
complex programs running under the Erlang virtual machine
that are done executing before my finger is off the return
key.
And that's what's impressive: code that is so easy to
label as slow upon first glance, code containing functions
that can--in isolation--be definitively proven to be dozens
or hundreds of times slower than what's possible on a given
CPU, and yet the overall program is decidedly one of high
performance.
(If you liked this, you might enjoy Timidity Does Not Convince.)
Constantly Create
When I wrote Flickr as a Business
Simulator, I was thinking purely about making a
product--photos--and getting immediate feedback from a real
audience. Seeing how much effort it takes to build up a
following. Learning if what you think people will like and
what they actually like are the same thing.
It works just as well for learning what it's like to be
in any kind of creative profession, such as an author of
fiction or a recording artist.
Go look at music reviews on Amazon, and you'll see
people puzzling over why a band's latest release doesn't
have the spark of their earlier material, pointing out
filler songs on albums, complaining about inconsistency
between tracks. Sometimes the criticisms are empty, but
there's often a ring of truth. There's an underlying
question of why. How could a songwriter or band
release material that isn't always at the pinnacle of
perfection?
After years of posting photos to Flickr, I get it. I'm
just going along, taking my odd photographs, when all of a
sudden one resonates and breaks through and I watch the
view numbers jump way up. Then I've got pressure: How can I
follow that up? Sometimes I do, with a couple of winners in
a row, but inevitably I can't stay at that level. Sometimes
I take a break, not posting shots for a month or more, and
then I lose all momentum.
When I'm at a low point, when I devolve into taking
pictures of mundane subjects, pictures I know aren't good,
I think about how I'm ever going to get out of that rut.
Inevitably I do, though it's often a surprise when I go
from a forgettable photo one day to something inspired the
next.
The key for me is to keep going, to keep taking and
posting photos. If I get all perfectionist then there's too
much pressure, and I start second-guessing myself. If I
give up when my quality drops off, then that's not solving
anything. The steady progress of continual output, whether
good or bad output, is part of the overall creative
process.
Tough Love for Indies
At one time I was the independent software developer's
dream customer.
I was a pushover. I bought applications, I bought tools,
I bought games. This was back when "shareware" was still
legitimate, back before the iPhone App Store made five
dollars sound like an outrageous amount of money for a
game. I did it to support the little guy, to promote the
dream of living in the mountains or a coastal town with the
only source of income coming from the sale of homemade
code.
Much of the stuff I bought wasn't great. I bought it
because it showed promise, because it clearly had some
thought and effort behind it. That I knew it was produced
by one person working away in his spare hours softened my
expectations.
The thing is, most people don't think that way.
These days I still gravitate toward apps that
were developed by individuals or small companies, but I
don't cut them any slack for lack of quality. I can't
justify buying an indie game because it has potential but
isn't actually fun. I won't downgrade my expectations of
interface design and usability so I can use a program
created by two people instead of a large corporation.
That whole term "indie" only means something if you go
behind the scenes and find out who wrote a piece of
software. And while I think it's fascinating to watch the
goings-on of small software developers, it's a quirk shared
by a small minority of potential customers. The first rule
of being indie is that people don't care if you're indie.
You don't get any preferential treatment for not having a
real office or a QA department. The only thing that matters
is the end result.
(If you liked this, you might enjoy Easy to Please.)
Living in the Era of Infinite Computing Power
Basic math used to be slow. To loop 10K times on an
8-bit processor, it was faster to iterate 256 times in an
inner loop, then wrap that in an outer loop executing 40
times. That avoided multi-instruction 16-bit addition and
comparison each time through.
Multiplication and division used to be slow. There were
no CPU instructions for those operations. If one of the
multiplicands was constant, then the multiply could be
broken down into a series of adds and bit shifts (to
multiply N by 44: N lshift 5 + N lshift 3 + N lshift 2),
but the general case was much worse.
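Spelled out in Erlang--my sketch, purely to show the
decomposition, with the times_44 name invented here--that N
times 44 example is just shifts and adds:
times_44(N) ->
    %% 44 = 32 + 8 + 4, so N*44 = (N bsl 5) + (N bsl 3) + (N bsl 2)
    (N bsl 5) + (N bsl 3) + (N bsl 2).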
Floating point used to be slow. Before FPUs, floating
point math was done in software at great expense. Early
hardware was better, but hardly impressive. On the original
8087 math coprocessor, simple floating point addition took
a minimum of 90 cycles, division over 200, and there were
instructions that took over a thousand cycles to
complete.
Graphics used to be slow. For the longest time,
programmers who had trouble getting 320x200 displays to
update at any kind of reasonable rate, scoffed at the
possibility of games running at the astounding resolution
of 640x480.
All of these concerns have been solved to comical
degrees. A modern CPU can add multiple 64-bit values at the
same time in a single cycle. Ditto for floating point
operations, including multiplication. All the work of
software-rendering sprites and polygons has been offloaded
to separate, highly-parallel processors that run at the
same time as the multiple cores of the main CPU.
Somewhere in the late 1990s, when the then-popular
Pentium II reached clock speeds in the 300-400MHz range,
processing power became effectively infinite. Sure, there
were notable exceptions, like video compression and
high-end 3D games and editing extremely high-resolution
images, but I was comfortably developing in interpreted
Erlang and running complex Perl scripts without worrying
about performance.
Compared to when I was building a graphically intensive
game on an early 66MHz Power Macintosh, compared to when I
was writing commercial telecommunications software on a
20MHz Sun workstation, compared to developing on a wee
8-bit Atari home computer, that late 1990s Pentium II was a
miracle.
Since then, all advances in processing power have been
icing. Sure, some of that has been eaten up by cameras
spitting out twelve megapixels of image data instead of
two, by Windows 7 having more overhead than Windows 98, and
by greatly increased monitor resolution. And there are
always algorithmically complex problems that never run fast
enough; that some hardware review site shows chipset X is
8.17% faster than chipset Y in a particular benchmark isn't
going to overcome that.
Are you taking advantage of living in the era of
infinite computing power? Have you set aside fixations with
low-level performance? Have you put your own productivity
ahead of vague concerns with optimization? Are you
programming in whatever manner lets you focus on the
quality and usefulness of the end product?
To be honest, that sounds a bit Seth Godin-esque,
feel-good enough to be labeled as inspirational yet
promptly forgotten. But there have been and will be hit iOS
/ Android / web applications from people without any
knowledge of traditional software engineering, from people
using toolkits that could easily be labeled as technically
inefficient, from people who don't even realize they're
reliant on the massive computing power that's now part of
almost every available platform.
(If you liked this, you might enjoy How Much Processing Power Does it Take to be
Fast?.)
The Nostalgia Trap
I used to maintain a site about 8-bit
game programmers and the games they created. To be fair, I
still update the "database" now and then, but changes are
few and far between, and I stopped posting news blurbs five
years ago.
There's a huge amount of information on that site.
Clearly I was passionate--or at least obsessive--about it,
and for a long time, too. When I learned to program in the
1980s, I saw game design as a new outlet for creativity, a
new art form. Here were these people without artistic
backgrounds, who weren't professional developers, buying
home computers and making these new experiences out of
essentially nothing. I wanted to document that period. I
wanted to communicate with those people and find out what
drove them.
(In 2002, D.B. Weiss wrote a novel called Lucky Wander Boy. It
followed the story of someone who attempted to catalog
every video game ever made. Amusingly, I received a
promotional copy.)
That's why I started the site. A better question is "Why
did I stop?"
Partly it was because I answered the questions that I
had. I was in contact with a hundred or more designers of
8-bit computer games, and I learned their stories. But
mostly I needed to move on, to not be spending so much time
looking to the past.
Nostalgia is intensely personal. I was a teenage game
designer seeing hundreds of previously unimagined new
creations for the Apple II and Atari 800 and Commodore 64
come into existence, and I have fond memories of those
years. Other people wax nostalgic about VAX system
administration, about summer afternoons with cartridges for
the Nintendo Entertainment System, about early mainframe
games like Rogue or Hack played on a clunky green terminal,
or of the glory days of shareware in the early 1990s. Some
people pine for the heyday of MS-DOS development--of
cycling the power after every crash--or writing programs in
QBASIC.
But don't mistake wistful nostalgia for "how things
ought to be."
Just because you used to love the endless string of
platformers
for a long-dead game system doesn't mean that recreating
them for the iPhone is a worthy endeavor. Getting a warm
and fuzzy feeling when recalling thirty-year-old UNIX
command-line programs is different from putting them on a
pedestal as a model for how to design tools. That
doesn't mean you shouldn't learn from the past and avoid
repeating expensive mistakes. Just don't get trapped by
thinking that older software or technologies are superior
because they happened to be entangled with more carefree
periods in your life.
The future is much more interesting.
The End is Near for Vertical Tab
Stop the Vertical Tab Madness
wasn't based on a long-standing personal peeve. It dawned
on me after writing Rethinking
Programming Language Tutorials and a follow-up piece that here is this archaic
escape sequence ("\v") that no one uses or understands, yet
it's mindlessly included in new programming languages and
pedantically repeated in tutorials and reference
manuals.
One year later, a Google search for vertical tab
produces this:
Missing image: Google search for vertical tab
There's the Wikipedia entry about tab in general, and
then there's an essay pointing out the utter uselessness of
vertical tab in modern programming.
This is progress!
I will take this opportunity to repeat my plea. If
you're a programming language maintainer, please follow the
lead taken by Perl and drop support for the vertical tab
escape sequence. If you're writing a tutorial, don't even
hint that the vertical tab character exists.
Thank you.
8-Bit Scheme: A Revisionist History
In The Nostalgia Trap I wrote, "I
was in contact with a hundred or more designers of 8-bit
computer games, and I learned their stories." Those stories
were fantastically interesting, but most of them were only
incidentally about programming. The programming side
usually went like this:
Early home computers were magic,
and upon seeing one there was a strong desire to learn how
to control it and create experiences from moving graphics
and sound. At power-up there was the prompt from a BASIC
interpreter, and there was a BASIC reference manual in the
box, so that was the place to start.
Later there was serendipitous exposure to some fast and
impressive game that was far beyond the animated character
graphics or slow line-drawing of BASIC, and that led to the
discovery of assembly language and the freedom to exploit
the hardware that came with it. It was suitably clunky to
be writing a seven instruction sequence to do 16-bit
addition and remembering that "branch on carry set" could
be thought of as "branch on unsigned greater or equal to."
The people with prior programming experience, the ones who
already knew C or Algol or Scheme, they may have been
dismayed at the primitive nature of it all, but really it
came down to "you do what you have to do." The goal was
never to describe algorithms in a concise and expressive
manner, but to get something interactive and wonderful up
on the screen.
Now imagine if an Atari 800 or Commodore 64 shipped with
a CPU designed to natively run a high-level language like
Scheme. That's not completely outrageous; Scheme chips were
being developed at MIT in the late 1970s.
My suspicion is that Scheme would have been learned by
budding game designers without a second thought. Which
language it was didn't matter nearly so much as having
a language that was clearly the right choice for the
system. All the quirks and techniques of Scheme would have
been absorbed and worked around as necessary.
It's not so simple with today's abundance of options,
none of which is perfect. Haskell is beautiful, but it
looks difficult to reason about the memory usage of Haskell
code. Erlang has unappealing syntax. Python is
inconsistently object-oriented. Lisp is too bulky--Scheme
too minimalist. All of these ring more of superficiality
than the voice of experience. Yet those criticisms are
preventing the deep dive needed to get in there and find
out how a language holds up for a real project.
What if you had to use Scheme? Or Haskell? Or
Erlang? You might slog it out and gain a new appreciation
for the mundane, predictable nature of C. Or you might find
out that once you've worked through a handful of tricky
bits, there are great advantages in working with a language
that's more pleasant and reliable. Either way, you will
have learned something.
(If you liked this, you might enjoy Five Memorable Books About Programming.)
Collapsing Communities
At one time the Lisp and Forth communities were exciting
places. Books and articles brimmed with optimism. People
were creating things with those languages. And then slowly,
slowly, there was a loss of vibrancy.
Perhaps the extent of the loss went unnoticed by people
inside those communities, but the layers of dust and the
reek of years of defensiveness jump out at the curious who
wander in off the street, not realizing that the welcome
sign out front was painted decades earlier by people who've
long since moved away.
This is not news. Time moves on. Product and technical
communities grow tired and stale. The interesting questions
are when do they go stale and how do you realize it?
The transition from MS-DOS to Windows was a difficult
one for many developers. Windows 3.1 wasn't a complete
replacement for the raw audio/visual capabilities of DOS.
Indeed, the heyday of MS-DOS game creation took place in
the years between Windows 3.1 and Windows 95. But even in
2002, seven years after every PC booted into Windows by
default, it wasn't uncommon to see the authors of software
packages, even development environments, still resisting
the transition, still targeting MS-DOS. Why did they hang
on in the face of clear and overwhelming change? How did
they justify that?
Unfortunately, it was easy to justify. "Windows is
overly complex. Look at the whole shelf of manuals you need
to program for it." "I can put a pixel on the screen with
two lines of code under MS-DOS versus 200 for Windows."
"I'm not going to take the performance hit from virtual
memory, pre-emptive multitasking, and layers of hardware
drivers."
Even if some of those one-sided arguments hold a bit of
water, they made no difference at all to the end result. It
would have been better to focus on learning the new
platform rather than tirelessly defending the old
one.
I've been a Flickr user since the
early days. Oh, the creativity and wonder touched off by
that site! But there have been signs that it is growing
crusty. The iPhone support is only halfway there and has
been for some time, for example. I wouldn't say Flickr is
truly collapsing, but the spark is dimmer than it once was.
I'm far from shutting down my account, but there's more
incentive to start poking around the alternatives.
When I do move on to another photo sharing site, I won't
fight it. I won't post long essays about why I won't leave.
I'll simply follow the vibrancy.
"Avoid Premature Optimization" Does Not Mean "Write
Dumb Code"
First there's a flurry of blog entries citing a snippet
of a Knuth quote: "premature optimization is the root of
all evil." Then there's the backlash about how performance
needs to be considered up front, that optimization isn't
something that can be patched in at the end. Around and
around it goes.
What's often missed in these discussions is that the
advice to "avoid premature optimization" is not the same
thing as "write dumb code." You should still try to make
programs clear and reliable and factor out common
operations and use good names and all the usual stuff.
There's this peculiar notion that as soon as you ease up on
the hardcore optimization pedal, then you go all mad and
regress into a primitive mindset that thinks BASIC on an
Apple ][ is the epitome of style and grace.
The warning sign is when you start sacrificing clarity
and reliability while chasing some vague notion of
performance.
Imagine you're writing an application where you
frequently need to specify colors. There's a nice list of
standard
HTML color names which is a good starting point. A
color can be represented as an Erlang atom: tomato,
lavenderBlush, blanchedAlmond. Looking up the R,G,B value
of a color, given the name, is straightforward:
color(tomato) -> {255,99,71};
color(lavenderBlush) -> {255,240,245};
color(blanchedAlmond) -> {255,235,205};
...
That's beautiful in its textual simplicity, but what's
going on behind the scenes? That function gets turned into
the virtual machine equivalent of a switch
statement. At load time, some additional optimization gets
done and that switch
statement is transformed
into a binary search for the proper atom.
What if, instead of atoms, colors are represented by
integers from zero to some maximum value? That's easy with
macros:
-define(Tomato, 0).
-define(LavenderBlush, 1).
-define(BlanchedAlmond, 2).
...
This change allows the color
function to be
further, automatically, optimized at load time. The keys
are consecutive integers, so there's no need for a search.
At runtime there's a bounds check and a look-up, and that's
it. It's hands-down faster than the binary search for an
atom.
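For concreteness, the macro-keyed lookup might look like
this (a hypothetical sketch; the clauses simply mirror the
atom version above):
color(?Tomato) -> {255,99,71};
color(?LavenderBlush) -> {255,240,245};
color(?BlanchedAlmond) -> {255,235,205};
...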
What's the price for this undefined amount of extra
speed? For starters, colors get displayed as bare integers
instead of symbolic names. An additional function to
convert from an integer to a name string fixes that...well,
except in post-crash stack traces. The easy-to-read and
remember names can't be entered interactively, because
macros don't exist in the shell. And the file containing
the macros has to be included in every source file where
colors are referenced, adding dependencies to the project
that otherwise wouldn't be present.
The verdict? This is madness: sacrificing ease of
development, going against the grain of Erlang, all in the
name of nanoseconds.
(If you liked this, you might enjoy Two
Stories of Simplicity.)
It's Like That Because It Has Always Been Like
That
At a time when most computers could only display
phosphorescent screens of text, the first GUI calculator
app was a bold experiment. It looked like an
honest-to-goodness pocket calculator. No instruction manual
necessary; click on keys with the mouse. And that it could
be opened while working within another application was
impressive in itself.
Of course now the interaction design mistakes of having
a software calculator mimic the real-life plastic device
are well-understood. Why click on graphical buttons when
there's a computer keyboard? And if keyboard input is
accepted, then why waste screen space displaying the
buttons at all? Isn't it easier and less error prone to
type an expression such as "806 * (556.5 / 26.17)" than to
invisibly insert operators within a series of separately
entered numbers?
That a literal digitization of a physical calculator
isn't a particularly good solution is no longer news.
What's surprising is how long the design mistakes of that
original implementation have hung on.
If I were teaching a class, and I gave the assignment of
"mock-up the interface for a desktop PC calculator app,"
I'd fully expect to get back a variety of rectangular
windows with a numeric display along the top and a grid of
buttons below. What a calculator on a computer is supposed
to look like is so ingrained that all thought of
alternatives is blocked.
This kind of blindness is both easy and difficult to
discover. It's easy because all you have to do is stop and
give an honest answer to the question "What problem am I
trying to solve?" and then actually solve that problem.
It's difficult because there are many simple, superficial
dodges to that question, such as "because I need to add a
calculator to my application."
A better problem statement is along the lines of "a way
for users to compute and display the results of basic math
operations." The solution is in no way locked into a
rectangle containing a numeric display with a grid of
buttons below it.
(If you liked this, you might enjoy If
You're Not Gonna Use It, Why Are You Building It?)
Building Beautiful Apps from Ugly Code
I wish I could entirely blame my computer science degree
for undermining my sense of aesthetics, but I can't. Much
of it was self-inflicted from being too immersed in
programming and technology for its own sake, and it took me
a long time to recover.
There's a tremendous emphasis on elegance and beauty in
highbrow coding circles. A sort implemented in three lines
of Haskell. A startlingly readable controller for a washing
machine in half a dozen lines of Forth. Any example from
Structure and Interpretation of Computer Programs.
It's difficult to read books, follow blogs, take classes,
and not start developing an eye for elegant code.
All those examples of beauty and elegance tend to be
small. The smallness--the conciseness--is much of
the elegance. If you've ever implemented a sorting
algorithm in C, then you likely had three lines of code
just to swap values. Three lines for the entire sort is
beautifully concise.
Except that beauty rarely scales.
Pick any program outside of the homework range, any
program of 200+ lines, and it's not going to meet a
standard of elegance. There are special cases and hacks and
convoluted spots where the code might be okay except that
the spec calls for things which are at odds with writing
code that crackles with craftsmanship. There are global
variable updates and too many parameters are passed around.
Bulky functions because things can't be reduced to one or
two clean possibilities. Lines of code wasted translating
between types. Odd-looking flag tests needed to fix
reported bugs.
Small, elegant building blocks are used to construct
imperfect, even ugly, programs. And yet those imperfect,
ugly programs may actually be beautiful applications.
The implications of that are worth thinking about.
Functional programming zealots insist
upon the complete avoidance of destructive updates, yet
there's a curious lack of concrete examples of programs
which meet that standard of purity. Even if there were, the
purely functional style does not in any way translate to
the result being a stunningly powerful and easy to use
application. There's no cause and effect. And if there is a
stunningly powerful and easy to use application, does that
mean the code that runs the whole thing is a paragon of
beauty? Of course not.
The only sane solution is to focus on the end
application first. Get the user to experience the beauty of
it and be happy. Don't compromise that because, behind the
scenes,
the code to draw ovals is more elegant than the code to
draw rounded rectangles.
(If you liked this, you might enjoy Write Code Like You Just Learned How to
Program.)
Boldness and Restraint
Modern mobile devices are hardly the bastions of
minimalism once synonymous with embedded systems. They're
driven by bold technical decisions. Full multi-core, 32-bit
processors. Accelerated 3D graphics all the way down,
including shader support. No whooshing fans or hot to the
touch parts. Big, UNIX-like operating systems. This is all
the realm of fantasy; pocket-sized computers outperforming
what were high-end desktop PCs not all that long ago.
What goes hand-in-hand with that boldness is restraint.
It's not the cutting edge, highest clocked, monster of a
CPU that ends up in an iPhone, but a cooler, slower,
relatively simpler chip. A graphics processor doesn't have
to be driven as hard to push the number of pixels in a
small display. Storage space is a fraction of a desktop PC,
allowing flash memory to replace whirring hard drives and
keeping power consumption down.
This is a complete turnaround from the bigger is better
at any cost philosophy of the early to mid 2000s. That was
when the elite PC hobbyists sported thousand watt power
supplies and impressive arrays of fans and heatsinks and
happily upgraded to new video cards that could render 18%
more triangles at the expense of 40% higher power
consumption.
An interesting case where I can't decide if it's
impressive boldness or unrestrained excess is in the ultra
high-resolution displays expected to be in near-future
tablets, such as the iPad 3.
If you haven't been following this, here's the rundown.
Prior to mid-2010, the iPhone had a resolution of 480x320
pixels. With the iPhone 4, this was doubled in each
dimension to 960x640, and Apple dubbed it a retina display.
Now there's a push for the iPad to have its resolution
similarly boosted.
The math here is interesting.
The original iPhone, with a resolution of 480x320, has
153,600 pixels.
The iPhone 4's retina display has 614,400 pixels.
The iPad 2 has a resolution of 1024x768, for a total of
786,432 pixels.
A double-in-each-dimension display for the
iPad--2048x1536 resolution--has 3,145,728 pixels.
That's an amazing number. It's twenty times the
pixel count of the original iPhone. It's over five
times the pixels of the iPhone 4 display. It's even 1.7
times the number of pixels on my PC monitor at home. And
we're talking about a nine inch screen vs. a desktop
display.
I have zero doubt that a display of that resolution will
find its way into a next generation iPad. Zero. It's bold,
it's gutsy, the precedent is there, the displays already
exist, and people want them. But such a tremendous increase
in raw numbers, all to make an ultrasharp display be even
sharper at close viewing distances? Maybe the days of
restraint are over.
(If you liked this, you might enjoy How My Brain Kept Me from Co-Founding
YouTube.)
Beyond Empty Coding
There's a culture of cloning and copying that I have a
hard time relating to.
I taught myself to program so I could create things of
my own design--originally 8-bit video games. There's an
engineering side to that, of course, and learning how to
better structure code and understand algorithms built my
technical knowledge, enabling the creation of things that
are more interesting and sophisticated. By itself, that
engineering side is pedestrian, even mechanical, much like
grammar is an unfortunate necessity for writing essays and
short stories. But using that knowledge to create new
experiences? That's exciting!
When I see people writing second-rate versions of
existing applications simply because they disagree with the
licensing terms of the original, or cloning an iPhone app
because there isn't an Android version, or rehashing stale
old concepts in a rush to make money in the mobile game
market...I don't get it.
Oh, I get it from an "I know how to program, and I'm
looking for a ready-made idea that I can code-up" angle.
What I don't understand is the willingness to so quickly
narrow the possibility space, to start with a wide-open sea
of ways to solve a problem and develop an easy to use
application, but to instead take an existing, half-baked
solution as gospel and recreate it (maybe even
with a few minor improvements).
Yes, there are some classic responses to this line of
thinking. Everything is a
remix. Every story ever written can be boiled down to
one of seven fundamental plots.
But is that kind of self-justification enough reason to
stop trying altogether? To elevate the empty act of coding
above the potential to make progress and explore new
territory? To say that all music and movies and games are
derivative and that's how they'll always be and bring on
the endless parade of covers and remakes?
I can only answer for myself: no, it's not.
(If you liked this, you might enjoy Personal Programming.)
Greetings from the Bottom of the Benchmarks
I can guarantee that if you write a benchmark pitting
Erlang's dictionary type against that of any other
language, Erlang is going to lose. Horribly. It doesn't
matter if you choose the dict module or gb_trees; Erlang
will still have an
embarrassing time of it, and there will be much snickering
and posting of stories on the various programming news
aggregation sites.
Is the poor showing because dictionaries in Erlang are
purely functional, so the benchmark causes much copying of
data? Or perhaps because Erlang is dynamically typed?
Neither. It's because the standard Erlang dictionary
modules are written in Erlang.
In that light, the low benchmark numbers are
astoundingly impressive. The code for every dictionary
insert, look-up, and deletion is run through the same
interpreted virtual machine as any other code. And the
functions being interpreted aren't simply managing a big,
mutable hash table, but a high-concept, purely functional
tree structure. The Erlang code is right there to look at
and examine. It isn't entangled with the dark magic of the
runtime system.
There are still some targets of criticism here. Why are
there multiple key/value mappings in the standard library,
one with the awkward name of gb_trees, in
addition to the hackier, clunkier-to-use "Erlang term
storage" tables? Why would I choose one over the others?
Why is dict
singular and gb_trees
plural? Let's face it: The Erlang standard library is not a
monument of consistency.
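As a small illustration of that inconsistency, here's
roughly what storing and then fetching a value looks like
in each module (a sketch, nothing more):
%% dict: new/0, store/3, and find/2 returning {ok, Value} or error.
D0 = dict:new(),
D1 = dict:store(tomato, {255,99,71}, D0),
{ok, _} = dict:find(tomato, D1),
%% gb_trees: empty/0, insert/3, and lookup/2 returning {value, V} or none.
T0 = gb_trees:empty(),
T1 = gb_trees:insert(tomato, {255,99,71}, T0),
{value, _} = gb_trees:lookup(tomato, T1).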
But performance? I've used both dictionary types in
programs I've written, and everything is so instantaneous
that I've never taken the time to see if a disproportionate
amount of time is being spent in those modules. Even if I'm
overstating things, over-generalizing based on the
particular cases where I've used dictionaries, it's still
high-level Erlang code going up against the written-in-C
runtime libraries of most languages. And that it comes
across as "instantaneous" in any kind of real-world
situation is impressive indeed.
(If you liked this, you might enjoy Tales of a Former Disassembly Addict.)
Optimization on a Galactic Scale
The code to generate this site has gotten bloated. When
I first wrote about it, the Perl
script was 6838 bytes. Now it's grown to a horrific 7672
bytes. Part of the increase is because the HTML template is
right there in the code, so when I tweak or redesign the
layout, it directly affects the size of the file.
The rest is because of a personal quirk I've picked up:
when I write tools, I don't like to overwrite output files
with exactly the same data. That is, if the tool generates
data that's byte-for-byte identical to the last time the
tool was run, then leave that file alone. This makes it
easy to see what files have truly changed, plus it often
triggers fewer automatic rebuilds down the line (imagine if
one of the output files is a C header that's included
throughout a project).
How do you avoid overwriting a file with exactly the
same data? In the write_file
function, first
check if the file exists and if so, is it the same size as
the data to be written? If those are true, then load the
entire file and compare it with the new data. If
they're the same, return immediately, otherwise overwrite
the existing file with the new data.
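The site generator is a Perl script, but the idea is easy
to sketch in Erlang, skipping the size check and simply
comparing contents (illustrative only, not the actual
code):
write_if_changed(Filename, Data) ->
    New = iolist_to_binary(Data),
    case file:read_file(Filename) of
        {ok, New} -> ok;  % byte-for-byte identical; leave the file alone
        _ -> file:write_file(Filename, New)
    end.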
At one time I would have thought this was crazy talk,
but it's simple to implement, works well, and I've yet to
run into any perceptible hit from such a mad scheme. This
site currently has 112 pages plus the archive page and the
atom feed. In the worst case, where I force a change by
modifying the last byte of the template and regenerate the
whole site, well, the timings don't matter. The whole thing
is over in a tenth of a second on a five year old
MacBook.
That's even though the read-before-write method has
got to be costing tens or hundreds of millions of
cycles. A hundred million cycles is a mind-bogglingly huge
number, yet in this case it's irrelevant.
As it turns out, fully half of the execution time is
going into one line that has nothing to do with the above
code. I have a folder of images that gets copied into
another folder if they've changed. To do that I'm passing
the buck to the external rsync
command using
Perl's backticks.
It's oh so innocuous in the Perl source, but behind the
scenes it's a study in excess. The shell executable is
loaded and decoded, dependent libraries get brought in as
needed, external references are fixed-up, then finally the
shell itself starts running. The first thing it does is
start looking for and parsing configuration files. When the
time comes to process the rsync
command, then
here we go again with all the executable loading and
configuration reading and eventually the syncing actually
starts.
It must be a great disappointment after all that work to
discover that the two files in the image folder are up to
date and nothing needs to be done. Yet that whole process
is as expensive as the rest of the site generation, much
more costly than the frivolous reading of 114 files which
are immediately tromped over with new data.
This is all a far cry from Michael
Abrash cycle-counting on the 8086, from an Apple II
graphics programmer trimming precious instructions from a
drawing routine.
(If you liked this, you might enjoy How Did Things Ever Get This Good?)
The Revolution is Personal
If you were going to reinvent the music / film / video
game industry, what would you do?
Articles deriding the state of modern music, et al, are
staples of the web. They're light and fun to read, and
snickering at the antics of a multi-billion dollar industry
feels like the tiniest seed of revolution.
There, I've used that word twice now: industry.
It's not someone's name, but a faceless scapegoat.
Corporations. Wall Street. The Man. It's an empty term. I
should have phrased the opening question as "If you were
going to make an album / film / video game to buck the
current trends which you dislike, what would that album /
film / video game be?" Now it's concrete and, perhaps
surprisingly, a more difficult problem.
The iOS App Store set the stage for a revolution. You
can make anything you want and put it in front of a
tremendous audience. Sure, Apple has to give cursory
approval to the result, but don't read too much into that.
They're only concerned with some blatant edge cases, not
with censoring your creativity, and some of the stuff that
gets into the App Store emphasizes that.
But the App Store itself is only a revolution in
distribution. The ability to implement iOS software and get
it out to the world isn't synonymous with having a clear,
personal vision about what to implement in the first place.
Even just over three years later, some deep ruts in the
landscape of independent iOS game development have formed.
A cartoony art style. A cute animal as the hero. Mechanics
lifted from a small set of past games. If you've ever
browsed the iPhone App Store, I'm sure made-up titles like
Ninja Cow, Pogo Monkey, Goat Goes Home, and Distraught
Penguin all evoke a certain image of what you'd get for
your ninety-nine cents.
If you truly want to reinvent even a small part of a
creative field, then start developing a personal
vision.
Papers from the Lost Culture of Array Languages
2012 is the 50th anniversary of Ken Iverson's A
Programming Language, which described the notation that
became APL (even though a machine executable version of APL
didn't exist yet). Since then there's been APL2, Nial, A+,
K, Q, and other array-oriented languages. Iverson
(1920-2004) teamed with Roger Hui to create a modern
successor to APL, tersely named J, in the late 1980s.
The culture of array languages is a curious one. Though
largely functional, array languages represent a separate
evolutionary timeline from the lambda calculus languages
like Miranda and Haskell. (Trivia: The word monad is
an important term in both Haskell and J, but has completely
different meanings.) Most strikingly, while Haskell was
more of a testbed for functional language theorists that
eventually became viable for commercial products, array
languages found favor as serious development tools early
on. Even today, K is used to analyze large data sets, such
as from the stock market. J is used in actuarial work.
Notation as a
Tool of Thought, Ken Iverson's 1979 Turing Award
Lecture, is the most widely read paper on APL. Donald
McIntyre (1923-2009) explored similar ideas in
Language as an Intellectual Tool: From Hieroglyphics to
APL. When I first learned of McIntyre's paper roughly
ten years ago, it wasn't available on the web. I inquired
about it via email, and he said he'd see if he or one of
his acquaintances had a copy they could send to me. A week
later I received an envelope from Ken Iverson (!)
containing non-photocopied reprints of Hieroglyphics and
his own A
Personal View of APL. I still have both papers in the
original envelope.
Donald McIntyre also wrote
The Role of Composition in Computer Programming, which
is mind-melting. (Note that it uses an earlier version of
J, so you can't always just cut and paste into the J
interpreter.)
There's a touch of melancholy to this huge body--fifty
years' worth--of ideas and thought. Fifty years of a
culture surrounding a paradigm that's seen as an oddity in
the history of computing. Even if you found the other
papers I've mentioned to be so many unintelligible
squiggles, read Keith Smillie's My
Life with Array Languages. It covers a thirty-seven
year span of programming in APL, Nial, and J that started
in 1968.
(If you liked this, you might enjoy Want to Write a Compiler? Just Read These Two
Papers.)
Starting in the Middle
When I start on a personal project, I'm bright-eyed and
optimistic. I've got an idea in my head, and all I need to
do is implement it. Wait, before I can begin working on the
good stuff there are some foundational underpinnings that
don't yet exist. I work on those for a while, then I get
back to the main task...until I again realize that there are
other, lower-level libraries that I need to write
first.
Now I'm worried, because I'm building all this stuff,
but I'm no closer to seeing results. I charge ahead and
write one-off routines and do whatever it takes to get a
working version one-point-oh. Then I hear the creaks and
groans of impending code collapse. I shift my focus to
architecture and move things to separate modules, refactor,
sweep out dark corners, and eventually I'm back to thinking
about the real problem. It's only a short respite. As the
code gets larger and more complex, I find myself having to
wear my software engineering hat more and more of the time,
and that's no fun.
That's the story sometimes, anyway. When using
functional programming languages I take a different
approach: I pick an interesting little bit that's right in
the middle of the problem and start working on it. I don't
build up a foundation needed to support the solution. I
don't think about how it integrates into the whole. There's
a huge win here that should be the selling point of
functional programming: you can build large programs
without worrying about architecture.
Okay, sure, that architecture dodge isn't
entirely true, but it's dramatic enough that I'm
surprised functional programming isn't the obvious choice
for anyone writing "So you want to learn to program?"
tutorials. Or at least that it isn't the focus of "Why
Functional Programming is Great" essays. If nothing else,
it's a more compelling hook than currying or type
systems.
How can starting a project in the middle possibly work?
By writing symbolic code that lets me put off as many
design decisions as possible. If I'm writing the "move
entity" function for a video game,
the standard approach is to directly modify a structure or
object representing that entity. It's much easier to return
a description of the change, like {new_ypos, 76} or
{new_color, red}. (Those are both Erlang tuples.) That
avoids the whole issue of how to
rebuild what may be a complex, nested data structure with a
couple of new values.
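As a made-up example (the function and field names are
mine, purely for illustration), a "move entity" function
that only describes the change can be this simple:
move_entity(Xpos, Ypos, Dx, Dy) ->
    [{new_xpos, Xpos + Dx}, {new_ypos, Ypos + Dy}].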
If I want to multiply the matrices M and N, the result
is {'*', M, N}. (This is another Erlang tuple.
The single quotes around the asterisk mean that it's an
atom--a symbol in Lisp or Scheme. Those quotes are only
necessary if the atom isn't alphanumeric.) The function to
transpose a matrix returns {transpose, M}.
It looks like the essential work is being dodged, but it
depends what you're after. I can write code and see at a
glance that it gives the right result. I can use those
functions to create more interesting situations and learn
about the problem. If I find my understanding of the
problem is wrong, and I need to back up, that's okay. It's
more than okay: it's great! Maybe it turns out that I don't
need to multiply matrices after all, so I didn't waste time
writing a multiply routine. Maybe the transpose function is
always called with a parameter of {transpose, Something},
so the two transpositions cancel out and
there's no need to do anything.
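That transpose function, cancellation included, is only a
couple of lines of symbolic Erlang (a sketch under the same
made-up representation):
transpose({transpose, M}) -> M;  % the two transpositions cancel out
transpose(M) -> {transpose, M}.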
At some point I have to stop living this fantasy and do
something useful with these abstract descriptions.
Hopefully by that time my experiments in symbolic
programming have better defined both the problem and the
solution, and I won't need to spend as much time thinking
about boring things like architecture.
(If you liked this, you might enjoy Living Inside Your Own Black Box.)
Things That Turbo Pascal is Smaller Than
Turbo Pascal 3 for MS-DOS was released in September
1986. Being version 3, there were lesser releases prior to
it and flashier ones after, but 3 was a solid
representation of the Turbo Pascal experience: a full
Pascal compiler, including extensions that made it
practical for commercial use, tightly integrated with an
editor. And the whole thing was lightning fast, orders of
magnitude faster at building projects than Microsoft's
compilers.
The entire Turbo Pascal 3.02 executable--the compiler
and IDE--was 39,731 bytes. How does that stack up in 2011
terms? Here are some things that Turbo Pascal is smaller
than, as of October 30, 2011:
The minified version of jquery 1.6 (90,518 bytes).
The yahoo.com home page (219,583 bytes).
The image of the white iPhone 4S at apple.com (190,157
bytes).
zlib.h
in the Mac OS X Lion SDK (80,504
bytes).
The touch
command under OS X Lion (44,016
bytes).
Various vim quick reference cards as PDFs. (This
one is 47,508 bytes.)
The compiled code for the Erlang R14B02 parser
(erl_parse.beam, 286,324 bytes).
The Wikipedia page for C++ (214,251 bytes).
(If you liked this, you might like A
Personal History of Compilation Speed.)
Adventures in Unfiltered Global Publishing
I remember sitting in my parents' backyard in Texas, in
the mid 1980s, reading a computer magazine that contained a
game and accompanying article I had written. I don't know
what the circulation of the magazine--Antic--was,
but it was popular enough that I could walk into any mall
bookstore and flip through a copy.
The amazing part, of course, was that my game was in
there. Not that it was a great game, but it had gone from
initial design to final implementation in under two weeks.
I didn't talk to anyone about the concept. I didn't have
any help with the development. I don't think I even asked
anyone to playtest it. Yet there it was in print, the name
of the game right on the cover, and available in dozens of
bookstores in the Dallas area alone.
In early 1998, Gordon Cameron asked if I'd be the guest
editor for SIGGRAPH Computer Graphics Quarterly. The issue
was focused on gaming and graphics, and the invitation was
largely based on Halcyon Days which I
had put together the previous year. I wasn't even a
SIGGRAPH member.
I talked to some people I had been in contact with, like
Steven Collins (who co-founded Havok that same year) and
Owen Rubin (who wrote games for those old "glowing vector"
arcade machines). I still like this bit from Noah
Falstein's "Portrait of the Artists in a Young
Industry":
Incidentally, Sinistar was probably the first
videogame to employ motion capture for graphics--of a
sort. Jack provided us with three mouth positions,
closed, half-open and open. It was up to Sam Dicker,
the lead programmer, and myself to figure out which
positions to use for which phrases. After a few
unsuccessful attempts to synchronize it by hand we hit
on a scheme. We wrote each of the short phrases
Sinistar spoke on a whiteboard. Then Sam held a marker
to his chin with its tip touching the board and moved
his head along the phrase, reading it aloud. This gave
us a sort of graph showing how his chin dropped as he
spoke. Then we "digitized" it, eyeballing the curve,
reducing it to three different states and noting
duration.
I think one of the seven contributors was recommended by
Gordon; the other six were my choice. I suggested topics,
edited the articles (and over-edited at least one), wrote
the "From the Guest Editor" column, and the
completed issue was mailed out to SIGGRAPH members in
May.
In both of these cases, I failed to realize how unusual
it is to go from idea to print without any interference
whatsoever. Somehow my own words and thoughts were getting
put into professionally produced, respectable periodicals,
without going through any committees, without anyone
stopping to ask "Hey, does this guy even know what he's
talking about?"
On October 30th of this year, I sat down on a couch in
my basement to write a short article I had in my head. The
total time from first word to finished piece was one hour,
and most of that was spent researching some numbers. I've
had unintentionally popular blog entries before, most
notably Advice to Aimless, Excited
Programmers and Write Code Like You
Just Learned How to Program, but that start to finish
in one hour entry, Things That Turbo
Pascal is Smaller Than, took off faster than anything
I've written. It was all over the place that same evening
and inexplicably ended up on Slashdot within forty-eight
hours.
If you read or linked to that article, thank you.
(If you just started reading this site, you might enjoy
A Three-Year Retrospective.)
Photography as a Non-Technical Hobby
When I got into photography in
2004, I approached it differently from the more
technical endeavors I've been involved in. It was a
conscious decision, not an accident.
I'd been overexposed to years of bickering about
computer hardware, programming languages, you name it. All
the numbers (this CPU is 17% faster in some particular
benchmark), all the personal opinions stated as fact (open
source is superior to closed), all the comparisons and put
downs (Ruby sucks!). I'd had enough.
Now photographers can be similarly cranky and
opinionated. All the different makes and models of cameras,
lenses, filters, flashes. Constant dissection of every
rumored product. Debates about technique, about whether
something is real art or cheating.
I didn't want any of that. I wanted to enjoy creating
good pictures without getting into the photography
community, without thinking about technical issues at all.
No reading tutorials or photography magazines (even though
I've had a photo published in a
tutorial in one of those magazines). No hanging out in
forums. And it has been refreshing.
I've even gone so far as to leave my fancy-pants Nikon
in a cupboard most of the time, because it's so much more
fun to use my iPhone 4 with the Hipstamatic app. The iPhone completely and
utterly loses to the Nikon in terms of absolute image
quality, but that's more than balanced out by guaranteeing
that I have an unobtrusive camera with me at all times, one
that can directly upload photos to my Flickr account.
Here are a few photos I've taken this year. Each one is
a link to the Flickr original.
(If you liked this, you might enjoy Constantly Create.)
User Experience Intrusions in iOS 5
The iPhone has obsoleted a number of physical gadgets. A
little four-track recorder that I use as a notebook for
song ideas. A stopwatch. A graphing calculator. Those ten
dollar LCD games from Toys 'R Us. And it works because an
iPhone app takes over the device, giving the impression
that it's a custom piece of hardware designed for that
specific purpose.
But it's only an illusion. I can be in the middle of
recording a track, and I get a call. That puts the recorder
to sleep and switches over to the phone interface. Or I can
be playing a game and the "Battery is below 20%" alert pops
up at an inopportune moment. These are interesting edge
cases, where the reality that the iPhone is a more complex
system--and not a dedicated game player or recorder--bleeds
into the user experience. These intrusions are driven by
things outside of my control. I didn't ask to be
called at that moment; it just happened. I understand that.
I get it.
What if there was something I could do within an
app that broke the illusion? Suppose that tapping the
upper-left corner of the screen ten times in a row caused an
app to quit (it doesn't; this is just an example). Now the
rule that an app can do whatever it wants, interface-wise,
has been violated. You could argue that tapping the corner
of the screen ten times is so unlikely that it doesn't
matter, but that's a blind assumption. Think of a game
based around tapping, for example. Or a drum machine.
As it turns out, two such violations were introduced in
iOS 5.
On the iPad, there are a number of system-wide gestures,
such as swiping left or right with four fingers to switch
between apps. Four-finger swipes? That's convoluted, but
imagine a virtual mixing console with horizontal sliders.
Quickly move four of them at once...and you switch apps.
Application designers have to work around these, making
sure that legitimate input methods don't mimic the
system-level gestures.
The worst offender is this: swipe down from the top of
the screen to reveal the Notification Center (a window
containing calendar appointments, the weather, etc.). A
single-finger vertical motion is hardly unusual, and many
apps expect such input. The games Flight Control and Fruit
Ninja are two prime examples. Unintentionally pulling down
the Notification Center during normal gameplay is common. A
centered vertical swipe is natural in any paint program,
too. Do app designers need to design around such
conflicts? Apparently, yes.
There's an easy operating system-level solution to the
Notification Center problem. Require the gesture to start
on the system bar at the top of the screen, where the
network status and battery indicator are displayed.
Allowing the system bar in an app is already an intrusion,
but one opted into by the developer. Some apps turn off the
system bar, including many games, and that's fine. It's an
indication that the Notification Center isn't
available.
(If you liked this, you might enjoy Caught-Up with 20 Years of UI Criticism.)
2011 Retrospective
I was going to end this blog one year ago.
Prog21 was entirely a personal outlet for the more
technical ideas kicking around in my head, and it had run
its course. Just before Christmas 2010, I sat down and
wrote a final "thanks for reading," essay. I've still got
it on my MacBook. But instead of posting it, I dashed off
Write Code Like You Just Learned How to
Program, and the response made me realize my initial
plan may have been too hasty.
In 2011 I posted more articles than in any previous
year--32, including this one [EDIT: well, actually it was
the second most; there were 33 in 2010]. I finally gave the
site a much needed visual makeover. And I'm still wrestling
with how to balance the more hardcore software engineering
topics that I initially wrote about with the softer, less
techy issues that I've gotten more interested in.
Have a great 2012, everyone!
popular articles from 2011
others from 2011 that I personally like
(There's also a retrospective
covering 2007-2010.)
A Programming Idiom You've Never Heard Of
Here are some sequences of events:
Take the rake out of the shed, use it to pile up the
leaves in the backyard, then put the rake back in the
shed.
Fly to Seattle, see the sights, then fly home.
Put the key in the door, open it, then take the key
out of the door.
Wake-up your phone, check the time, then put it back
to sleep.
See the pattern? You do something, then do something
else, then you undo the first thing. Or more accurately,
the last step is the inverse of the first. Once you're
aware of this pattern, you'll see it everywhere. Pick up
the cup, take a sip of coffee, put the cup down. And it's
all over the place in code, too:
Open a file, read the contents, close the file.
Allocate a block of memory, use it for something,
free it.
Load the contents of a memory address into a
register, modify it, store it back in memory.
While this is easy to explain and give examples of, it's
not simple to implement. All we want is an operation that
looks like idiom(Function1, Function2), so we could write
the "open a file..." example above as idiom(Open, Read).
The catch is that there
needs to be a programmatic way to determine that the
inverse of "open" is "close." Is there a programming
language where functions have inverses?
Surprisingly, yes: J.
And this idiom I keep talking about is even a built-in
function in J, called under. In English, and not J's
terse syntax, the open file example is stated as "read
under open."
One non-obvious use of "under" in J is to compute the
magnitude of a vector. Magnitude is an easy algorithm:
square each component, sum them up, then take the square
root of the result. Hmmm...the third step is the inverse of
the first. Sum under square. Or in actual J code:
mag =: +/ &.: *:
+/ is "sum." The ampersand, period, colon sequence is
"under." And *: is "square."
(Also see the follow-up.)
Follow-up to "A Programming Idiom You've Never Heard
Of"
Lots of mail, lots of online discussion about A Programming Idiom You've Never Heard Of,
so I wanted to clarify a few things.
What I was trying to do was get across the unexpected
strangeness of function inverses in a programming language.
In that short definition of vector magnitude, there wasn't
a visible square root function. There was only an operator
for squaring a value, and another operator that involved
inverting a function.
How does the J interpreter manage to determine a
function inverse at runtime? For many primitives, there's
an associated inverse. The inverse of add is subtract. The
inverse of increment is decrement. For some primitives
there isn't a true, mathematical inverse, but a counterpart
that's often useful. That's why the preferred term in J
isn't inverse, but obverse.
For user-defined functions, there's an attempt at
inversion (er, obversion) that works much of the time. A
function that reverses a list then adds five to each
element turns into a function that subtracts five from each
element then reverses the list. For cases where the
automated obverse doesn't work, or where you want the
obverse to have different behavior, you can associate a
user-defined obverse with any verb (J lingo for function).
You could define an open_file
verb which opens
a file and has an obverse that closes a file. Or in actual
J:
open_file =: open :. close
Well, really, that should be:
open_file =: (1!:21) :. (1!:22)
But the former, without the explicit foreign function
calls, gets the point across clearer, I think.
One common use of obverses and the "under" operator is
for boxing and unboxing values. In J, a list contains
values of the same type. There's no mixing of integers and
strings like Lisp or Python. Instead you can "box" a value,
then have a list containing only boxed values. But there's
nothing you can do with a boxed value except unbox it, so
it's common to say "[some operation] under open box," like
"increment under open box." That means unbox the value,
increment it, then put it back in a box. Or in real,
eyeball-melting J:
inc_box =: >: &. >
The >: is increment. The right > means open box. That's
the "under" operation in the middle.
Now it sounds like this "open box, do something, close
box" sequence would translate beautifully to the "open
file, read the contents, close the file" example I gave
last time, but it doesn't. The catch is that the open /
read / close verbs aren't manipulating a single input the
way inc_box is. Opening a file returns a handle, which
gets passed to read. But reading a file returns the
contents of the file, which is not something that can be
operated on by close. So this definition won't work:
read_file =: read &. open
If a structured data type like a dictionary was being
passed around, then okay, but that's not a pretty example
like I hoped it would be.
Still, I encourage learning J, if only to make every other
language seem easy.
Recovering From a Computer Science Education
I was originally going to call this "Undoing the Damage
of a Computer Science Education," but that was too
link-baity and too extreme. There's real value in a
computer science degree. For starters, you can easily get a
good paying job. More importantly, you've gained the
ability to make amazing and useful things. But there's a
downside, too, in that you can get so immersed in the
technical and theoretical that you forget how wonderful it
is to make amazing and useful things. At least that's what
happened to me, and it took a long time to recover.
This is a short list of things that helped me and might
help you too.
Stay out of technical forums unless it's directly
relevant to something you're working on. It's far too
easy to get wrapped up in discussions of the validity of
functional programming or whether or not Scheme can be used
to make commercial applications or how awful PHP is. The
deeper you get into this, the more you lose touch.
Keep working on real projects related to your area of
interest. If you like designing games, write games. If
you like photography, write a photo organizer or camera
app. Don't approach things wrong-way-around, thinking that
"a photo organizer in Haskell" is more important than "a
photo organizer which solves a particular problem with
photo organizers."
If you find yourself repeatedly putting down a
technology, then take some time to actually learn and use
it. All the jokes and snide remarks aside, Perl is
tremendously useful. Ditto for PHP and Java and C++. Who
wins, the person who has been slamming Java online for ten
years or the author of Minecraft who just used the language
and made tens of millions of dollars?
Don't become an advocate. This is the flipside of
the previous item. If Linux or Android or Scala are helpful
with what you're building, then great! That you're relying
on it is a demonstration of its usefulness. No need to
insist that everyone else use it, too.
Have a hobby where you focus on the end results and not
the "how." Woodworkers can become tool collectors.
Photographers can become spec
comparison addicts. Forget all of that and concern yourself
with what you're making.
Do something artistic. Write songs or short
stories, sketch, learn to do pixel art. Most of these also
have the benefit of much shorter turnaround times than any
kind of software project.
Be widely read. There are endless books about
architecture, books by naturalists, both classic and
popular modern novels, and most of them have absolutely
nothing to do with computers or programming or science
fiction.
Virtual Joysticks and Other Comfortably Poor
Solutions
Considering that every video game system ever made
shipped with a physical joystick or joypad, the smooth,
featureless glass of mobile touchscreens was unnerving. How
to design a control scheme when there is no controller?
One option was to completely dodge the issue, and that
led to an interesting crop of games. Tip the entire device
left and right and read the accelerometer. Base the design
around single-finger touches or drawing lines or dragging
objects. But the fallback solution for games that need more
traditional four or eight way input is to display a faux
controller for the player to manipulate.
The virtual joystick option is obvious and easy, but it
needs pixels, filling the bottom of the screen with a
bitmap representation of an input device. Sometimes
it isn't too obtrusive. Other
times it's impressively ugly. Aesthetics aside, there's
a fundamental flaw: you can't feel the image. There's no
feedback indicating that your hand is in the right place or
if it slides out of the control area.
There may have been earlier attempts, but Jeff
Minter's
Minotaur Rescue, released just over a year ago, was the
first good alternative to a virtual joystick that I ran
across. Minter's insight was that directional movement
anywhere on the screen contains useful information. Touch,
then slide to the right: that's the same as moving a
virtual controller to the right. Without lifting your
finger, slide up: that's upward motion. There's no need to
restrict input to a particular part of the screen; anywhere
is fine.
He even extended this to work for twin-stick shooter
controls. The first touch is for movement, the second for
shooting, then track each independently. Again, it's not
where you touch the screen, it's when and
how.
It's all clean and obvious in retrospect, but it took
getting past the insistence that putting pictures of
joysticks and buttons on the screen was the only
approach.
Pretend This Optimization Doesn't Exist
In any modern discussion of algorithms, there's mention
of being cache-friendly, of organizing data in a way that's
a good match for the memory architectures of CPUs. There's
an inevitable attempt at making the concepts concrete with
a benchmark manipulating huge--1000x1000--matrices. When
rows are organized sequentially in memory, no worries, but
switch to column-major order, and there's a very real
slowdown. This is used to drive home the impressive gains
to be had if you keep cache-friendliness in mind.
Now forget all about that and get on with your
projects.
It's difficult to design code for non-trivial problems.
Beautiful code quickly falls apart,
and it takes effort to keep things both organized and
correct. Now add in another constraint: that the solution
needs to access memory in linear patterns and avoid chasing
pointers to parts unknown.
You'll go mad trying to write code that way. It's like
writing a short story without using the letter "t."
If you fixate on the inner workings of caches,
fundamental and useful techniques suddenly turn horrible.
Reading a single global byte loads an entire cache line.
Think objects are better? Querying a byte-sized field is
just as bad. Spreading the state of a program across
objects scattered throughout memory is guaranteed to set
off alarms when you run a hardware-level performance
analyzer.
Linked lists are a worst case, potentially jumping to a
new cache line for each element. That's damning evidence
against languages like Haskell, Clojure, and Erlang. Yet
some naive developers insist on using Haskell, Clojure, and
Erlang, and they cavalierly disregard the warnings of the
hardware engineers and use lists as their primary data
structure...
...and they manage to write code where performance is
not an issue.
(If you liked this, you might enjoy Impressed by Slow Code.)
Four Levels of Idea Theft
Imagine you've just seen a tremendously exciting piece
of software--a mobile app, a web app, a game--and your
immediate reaction is "Why didn't I think of that?!" With
your mind full of new possibilities, you start on a
project, a project enabled by exposure to the exciting
software. What happens next is up to you. How far do you
let your newfound motivation take you?
Borrowing specific features. You like the way the
controls work. The sign-in process. Something specific.
Sliminess Factor: None. This is how progress
happens.
General inspiration. If web-based photo sharing
had never occurred to you, and then you saw Flickr, that
opens the door for thinking about the entire problem space.
Some of those options may be Flickr-ish, some aren't.
Sliminess Factor: Low. It's a common reaction to be
excited and inspired by something new, and it inevitably
affects your thinking.
Using the existing product as a template. Now
you're not simply thinking about photo sharing, but having
groups and contacts and favorites and tags and daily
rankings. You're still not writing a full-on Flickr
clone--there are lots of things to be changed for the
better--but it's pretty clear what your model is.
Sliminess Factor: Medium. While there's nothing illegal
going on, you won't be able to dodge the comparisons, and
you'll look silly if you get defensive. Any claims of
innovation or original thinking will be dismissed as
marketing-speak.
Wholesale borrowing of the design. All pretense
of anything other than recreating an existing product has
gone out the window. Your photo sharing site is called
"Phlickr" and uses the same page layouts as the
original.
Sliminess Factor: High. This is the only level that
legitimately deserves to be called theft.
(If you liked this, you might enjoy Accidental Innovation.)
A Peek Inside the Erlang Compiler
Erlang is a complex system, and I can't do its inner
workings justice in a short article, but I wanted to give
some insight into what goes on when a module is compiled
and loaded. As with most compilers, the first step is to
convert the textual source to an abstract syntax tree, but
that's unremarkable. What is interesting is that the code
goes through three major representations, and you can look
at each of them.
Erlang is unique among functional languages in its
casual scope rules. You introduce variables as you go,
without fanfare, and there's no creeping indentation caused
by explicit scopes. Behind the scenes that's too quirky, so
the syntax tree is converted into Core Erlang. Core Erlang
looks a lot like Haskell or ML with all variables carefully
referenced in "let" statements. You can see the Core Erlang
representation of a module with this command from the
shell:
c(example, to_core).
The human-readable Core Erlang for the example module is
written to example.core.
The next big transformation is from Core Erlang to code
for the register-based BEAM virtual machine. BEAM is poorly
documented, but it's a lot like the Warren
Abstract Machine developed for Prolog (but without the
need for backtracking). BEAM isn't terribly hard to figure
out if you write short modules and examine them with:
c(example, 'S').
The disassembled BEAM code for the example module is
written to example.S. The key to
understanding BEAM is that there are two sets of registers:
one for passing parameters ("x" registers) and one for use
as locals within functions ("y" registers).
Virtual BEAM code is the final output of the compiler,
but it's still not what gets executed by the system. If you
look at the source for the Erlang runtime, you'll see that
beam_load.c is over six thousand lines of code. Six
thousand lines to load a module? That's because the BEAM
loader is doing more than its name lets on.
There's an optimization pass on the virtual machine
instructions, specializing some for certain situations and
combining others into superinstructions. Checking whether a
value is a tuple of three elements, for example, takes a
pair of BEAM operations: is_tuple and is_arity. The BEAM
loader turns these into one superinstruction:
is_tuple_of_arity. You can
see this condensed representation of BEAM code with:
erts_debug:df(example).
The disassembled code is written to example.dis. (Note that
the module must be loaded, so compile it before giving the
above command.)
The loader also turns the BEAM bytecode into threaded
code: a list of addresses that get jumped to in
sequence. There's no "Now what do I do with this opcode?"
step, just fetch and jump, fetch and jump. If you want to
know more about threaded code, look to the Forth world.
Threaded code takes advantage of the labels
as values extension of gcc. If you build the BEAM
emulator with another compiler like Visual C++, it falls
back on using a giant switch
statement for
instruction dispatch and there's a significant performance
hit.
(If you liked this, you might enjoy A
Ramble Through Erlang IO Lists.)
Don't Fall in Love With Your Technology
In the 1990s I followed the Usenet group
comp.lang.forth. Forth has great personal appeal. It's
minimalist to the point of being subversive, and Forth
literature once crackled with rightness.
Slowly, not in a grand epiphany, I realized that there
was something missing from the discussions in that group.
There was talk of tiny Forths, of redesigning control
structures, of ways of getting by with fewer features, and
of course endless philosophical debates, but no one was
actually doing anything with the language, at least
nothing that was in line with all the excitement about the
language itself. There were no revolutions waiting to
happen.
I realized comp.lang.forth wasn't for me.
A decade later, I stuck my head back in and started
reading. It was the same. The same tinkering with
the language, the same debates, and the same peculiar lack
of interest in using Forth to build incredible things.
Free Your Technical Aesthetic from the
1970s is one of the more misunderstood pieces I've
written. Some people think I was bashing on Linux/Unix as
useless, but that was never my intent. What I was trying to
get across is that if you romanticize Unix, if you view it
as a thing of perfection, then you lose your ability to
imagine better alternatives and become blind to potentially
dramatic shifts in thinking.
It's bizarre to realize that in 2007 there were still
people fervently arguing Emacs versus vi and defending the
quirks of makefiles. That's the same
year that multi-touch interfaces exploded, low power
consumption became key, and the tired, old trappings of
faux-desktops were finally set aside for something
completely new.
Don't fall in love with your technology the way some
Forth and Linux advocates have. If it gives you an edge, if
it lets you get things done faster, then by all means use
it. Use it to build what you've always wanted to build,
then fall in love with that.
(If you liked this, you might enjoy Follow the Vibrancy.)
A Complete Understanding is No Longer Possible
Let's say you've just bought a MacBook Air, and your
goal is to become master of the machine, to understand how
it works on every level.
Amit Singh's Mac OS X Internals: A Systems
Approach is a good place to start. It's not about
programming so much as an in-depth discussion of how all
the parts of the operating system fit together: what the
firmware does, the sequence of events during boot-up, what
device drivers do, and so on. At 1680 pages, it's not light
summer reading.
To truly understand the hardware, Intel has kindly
provided a free seven volume set of documentation. I'll
keep things simple by recommending Intel 64 and IA-32
Architectures Software Developer's Manual Volume 1: Basic
Architecture (550 pages) and the two volumes describing
the instruction set (684 pages and 704 pages
respectively).
Objective-C is the language of OS X. We'll go with
Apple's thankfully concise The Objective-C Programming
Language (137 pages).
Of course Objective-C is a superset of C, so also work
through the second edition of The C Programming
Language (274 pages).
Now we're getting to the core APIs of OS X. Cocoa
Fundamentals Guide is 239 pages. Application Kit
Framework Reference is a monster at 5069 pages. That's
a help-file-like description of every API call. To be fair
I'll stop there with the Cocoa documentation, even though
there are also more usable guides for drawing and Core
Audio and Core Animation and a dozen other things.
Ah, wait, OpenGL isn't part of Cocoa, so throw in the
784 page OpenGL Reference Manual. And another 800
pages for OpenGL Shading Language, Second
Edition.
The total of all of this is 79 pages shy of eleven
thousand. I neglected to include man pages for hundreds of
system utilities and the Xcode documentation. And I didn't
even touch upon the graphics knowhow needed to do anything
interesting with OpenGL, or how to write good C and
Objective-C or anything about object-oriented design,
and...
(If you liked this, you might enjoy Things That Turbo Pascal is Smaller
Than.)
Solving the Wrong Problem
Occasionally, against my better judgement, I peek into
discussion threads about things I've written to see what
the general vibe is, to see if I've made some ridiculous
mistake that no one bothered to tell me directly about. The
most unexpected comments have been about how quickly this
site loads, that most pages involve only two requests--the
HTML file and style sheet--for less than ten kilobytes in
total, and that this is considered impressive.
Some of that speed is luck. I use shared hosting, and I
have no control over what other sites on the same server
are doing.
But I've also got a clear picture of how people interact
with a blog: they read it. With the sole exception of
myself, all people do with the prog21 site is grab files
and read them. There's no magic to serving simple, static
pages. What's surprising is that most implementers of
blogging software are solving the wrong problems.
An SQL database of entries that can be on-the-fly mapped
to themed templates? That's a solution designed to address
issues of blog maintainers, not readers, yet all readers
pay the price of slower page loads or not being able to see
a page at all after a mention on a high-profile site.
(At one time, the Sieve
of Eratosthenes--an algorithm for finding all prime
numbers up to a given limit--was a popular benchmark for
programming language performance. As an example of raw
computation, the Sieve was fine, but suppose you needed a
list of the primes less than 8,000 in a
performance-sensitive application. Would you bother
computing them at run time? Of course not. You already
know them. You'd run the program once during
development, and that's that.)
Tracking cookies and Google Analytics scripts? Those
solve a problem specific to the site owner: "How can I know
exactly how much traffic I'm getting?" Readers don't
care.
Widgets for Google+ and Twitter and Facebook? These
don't solve anyone's problems. You can tweet easily enough
without a special button. Aggregation sites, Google Reader,
and even Google search results have the equivalent of
"like" buttons, so why duplicate the functionality? More
importantly, only a small minority of users bother with
these buttons, but all the associated scripting and image
fetching slows down page loads for everyone.
(If you liked this, you might enjoy It's Like That Because It Has Always Been Like
That.)
Turning Your Code Inside Out
If I gave this assignment to a class of novice
programmers:
Write a C program to sum the elements of an array
and display the results. Include five test cases.
I'd expect multiple people would come up with a function
like this:
void sum_array(int array[], int size)
{
int sum = 0;
for (int i = 0; i < size; i++) {
sum += array[i];
}
printf("the sum of the array is %d\n", sum);
}
There's a cute new programmer-ism in that solution:
displaying the output is built into sum_array
instead of returning the sum and printing it elsewhere.
Really, it's hard to see why that one extra line is a bad
idea, at least up front. It prevents duplication of the
printf call, which is good, right? But tack on
an extra requirement such as "Use your array summing
function to compute the total sum of three arrays," and
there will be an "Ohhhhh, I don't want that print in there"
moment.
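One way the print-free version might look:
int sum_array(int array[], int size)
{
    int sum = 0;
    for (int i = 0; i < size; i++) {
        sum += array[i];
    }
    return sum;   /* no printf here; output is the caller's job */
}
Summing three arrays is now just three calls added
together, with a single printf wherever the output actually
belongs.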
The design error was obvious in this example, but it
crops up in other cases where it isn't so immediately
clear.
Let's say we're writing a video game and there's a
function to spawn an attacker. To alert the player of the
incoming danger, there's a sound accompanying the
appearance of each new attacker, so it makes sense to play
that sound inside new_attacker.
Now suppose we want to spawn five attackers at the same
time by calling new_attacker in a loop. Five
attackers are created as expected, but now five identical
sounds are starting during the same frame. Those five
sounds will be perfectly overlaid on top of each other, at
best sounding five times louder than normal and at worst
breaking-up because the audio levels are too high. As a
bonus, we're taking up five audio channels so the player
can hear this mess.
The solution is conceptually the same as the sum_array
example: take the sound playing out of new_attacker and let
the caller handle it.
Now there's a single function call to start a sound
followed by a loop that creates the attackers.
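A sketch of that restructuring, where play_alert_sound and
the spawn-position helpers are hypothetical stand-ins for
whatever the engine provides:
/* Hypothetical engine functions -- names are placeholders. */
void play_alert_sound(void);
void new_attacker(int x, int y);
int random_spawn_x(void);
int random_spawn_y(void);

/* The caller handles the sound; the loop only spawns. */
void spawn_attack_wave(int count)
{
    play_alert_sound();                       /* one alert per wave, not per attacker */
    for (int i = 0; i < count; i++) {
        new_attacker(random_spawn_x(), random_spawn_y());
    }
}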
Why am I bothering to talk about this?
This method of turning your code inside out is the
secret to solving what appear to be hopelessly
state-oriented problems in a purely functional style. Push
the statefulness to a higher level and let the caller worry
about it. Keep doing that as much as you can, and you'll
end up with the bulk of the code being purely
functional.
(If you liked this, you might enjoy Purely Functional Retrogames.)
This is Why You Spent All that Time Learning to
Program
There's a standard format for local TV news broadcasts
that's easy to criticize. (By "local," I mean any American
town large enough to have its own television station.)
There's an initial shock-value teaser to keep you
watching. News stories are read in a dramatic,
sensationalist fashion by attractive people who fill most
of the screen. There's an inset image over the shoulder of
the reader. Periodically there's a cutaway to a reporter in
the field; it's often followed up with side-by-side images
of the newscaster and reporter while the former asks a few
token questions to the latter. There's pretend banter between
newscasters after a feel-good story.
You get the idea. Now what if I wanted to change this
entrenched structure?
I could get a degree in journalism and try to get a job
at the local TV station. I'd be the new guy with no
experience, so it's not likely I could just step in and
make sweeping reforms. All the other people there have been
doing this for years or decades, and they've got
established routines. I can't make dozens of people change
their schedules and habits because I think I'm so smart. To
be perfectly fair, a drastic reworking of the news would
result in people who had no issues with the old presentation
getting annoyed and switching to one of the other channels
that does things the old way.
When I sit down to work on a personal project at home,
it's much simpler.
I don't have to follow the familiar standards of
whatever kind of app I'm building. I don't have to use an
existing application as a model. I can disregard history. I
can develop solutions without people saying "That's not how
it's supposed to work!"
That freedom is huge. There are so many issues in
the world that people complain about, and there's little
chance of fixing the system in a significant way. Even
something as simple as reworking the local news is out of
reach. But if you're writing an iOS game, an HTML 5 web
app, a utility that automates work so you can focus on the
creative fun stuff, then you don't have to fall back on the
existing, comfortable solutions that developers before you
chose simply because they too were trapped by the patterns
of the solutions that came before them.
You can fix things. You can make new and amazing things.
Don't take that ability lightly.
(If you liked this, you might enjoy Building Beautiful Apps from Ugly Code.)
100,000 Lines of Assembly Language
I occasionally get asked about writing Super Nintendo
games. How did anyone manage to work on projects consisting
of hundreds of thousands of lines of 16-bit assembly
language?
The answer is that it's not nearly as Herculean as it
sounds.
The SNES hardware manual is a couple of hundred pages. I
don't remember the exact number, so I'll shoot high: 400
pages. Add in a verbose 65816 assembly language book and
combined we're talking 800 or 900 pages tops. That's eight
percent of the total I came up with
for having a complete understanding of an OS X based
computer: nearly 11,000 pages.
Sure, there are whole classes of errors that you can
make in assembly language that are invisible in C. For
example, here's some old-school x86 code:
mov ax, 20
mov bx, -1
int XX
This sets up a couple of parameters and calls an
interrupt. It looks right, it works, it may even ship in a
commercial product, but then there's a new MS-DOS version
and it crashes. Why? Because the second parameter should be
passed in the dx register, not bx. It only worked because a
previous interrupt happened to return -1 in dx, so the
second line above isn't actually doing anything useful. But
those kinds of errors are rare.
The secrets of working entirely in assembly language are
being organized, thinking before implementing, and keeping
things clean and understandable. That sounds a lot like how
to write good Javascript or C++. Steve McConnell's Code
Complete is actually a guidebook for the Super Nintendo
game programmer.
But all this talk of programming languages and hardware
is backward. Jordan Mechner designed and built the original
Prince of Persia on an Apple II. The game and the editor
for laying out the levels are written in assembly code for
the 8-bit 6502. He kept a journal while writing
the game.
You might expect the journal to be filled with coding
philosophies and 6502 tricks, but there's little of that.
Sure, he's doing difficult tech work behind the scenes, but
that's not what he's writing about. They're the journals of
a designer and a director, of someone living far away from
home after graduating from college, with long detours into
his screenwriting aspirations (and don't let that scare you
off; they're fascinating).
He may have had a second set of coding journals, but I
like to think he didn't. Even if he did, he was clearly
thinking about more than the techie side of things, in the
way that a novelist's personal journal is unlikely to be
filled with ramblings about grammar and sentence
structure.
(If you liked this, you might enjoy The Pure Tech Side is the Dark Side.)
Use and Abuse of Garbage Collected Languages
The garbage collection vs. manual memory management
debates ended years ago. As with the high-level vs.
assembly language debates which came before them, it's hard
to argue in favor of tedious bookkeeping when there's an
automatic solution. Now we use Python, Ruby, Java,
Javascript, Erlang, and C#, and enjoy the productivity
benefits of not having to formally request and release
blocks of bytes.
But there's a slight, gentle nagging--not even a true
worry--about this automatic memory handling layer: what if,
when my toy project grows to tens or hundreds of megabytes
of data, it's no longer invisible? What if, despite the
real-time-ness and concurrent-ness of the garbage
collector, there's a 100 millisecond pause in the middle of
my real-time application? What if there's a hitch in my
sixty frames per second video game? What if that hitch
lasts two full seconds? The real question here is "If this
happens, then what can I possibly do about it?"
These concerns aren't theoretical. There are periodic
reports from people for whom the garbage collector has
switched from being a friendly convenience to the enemy.
Maybe it's because of a super-sized
heap? Or maybe
accidentally triggering worst-case behavior in the GC?
Or maybe it's simply using an environment where
GC pauses didn't matter until recently?
Writing a concurrent garbage collector to handle
gigabytes is a difficult engineering feat, but any student
project GC will tear through a 100K heap fast enough to be
worthy of a "soft real-time" label. While it should be
obvious that keeping data sizes down is the first step in
reducing garbage collection issues, it's something I
haven't seen much focus on. In image processing code
written in Erlang, I've used the atom transparent to
represent pixels where the alpha value is zero (instead of
a full tuple: {0,0,0,0}). Even better is to work with runs
of transparent pixels (such as {transparent, Length}).
Data-size optimization in dynamic languages is the new
cycle counting.
There's a more often recommended approach to solving
garbage collection pauses, and while I don't want to
flat-out say it's wrong, it should at least be viewed with
suspicion. The theory is that more memory allocations means
the garbage collector runs more frequently, therefore the
goal is to reduce the number of allocations. So far, so
good. The key technique is to preallocate pools of objects
and reuse them instead of continually requesting memory
from and returning it to the system.
Think about that for a minute. Manual memory management
is too error prone, garbage collection abstracts that away,
and now the solution to problems with garbage collection is
to manually manage memory? This is like writing your own
file buffering layer that sits on top of buffered file I/O
routines. The whole point of GC is that you can say "Hey,
I'd like a new [list/array/object]," and it's quick, and it
goes away when no longer referenced. Memory is a
lightweight entity. Need to build up an intermediate list
and then discard it? Easy! No worries!
If this isn't the case, if memory allocations in a
garbage collected language are still something to be
calorie-counted, then maybe the memory management debates
aren't over.
(If you liked this, you might enjoy Why Garbage Collection Paranoia is Still
(sometimes) Justified.)
Can You Be Your Own Producer?
I've worked on personal projects where I went badly off
track and didn't realize it until much later. What I needed
was someone to nudge me in the right direction, someone to
objectively point out the bad decisions I was making.
What I needed was a producer.
When a band is recording an album, the producer isn't
there to write songs or play instruments, but to provide
focus and give outside perspective. A good producer should
say "Guys, you sound great live, but it's not coming
through in the recording; can we try getting everyone in
here at same time and see how that goes?" or "We've got
three songs with similar wandering breakdowns; if you could
replace one, which would it be?"
(Here I should point out that a producer in the music
production sense is different than a producer in film or in
video games. Same term, different meanings.)
If you're a lone wolf or part of a small group building
something in your basement, there's tremendous value in
being able to step back and get into the producer mindset.
Maybe you can't be both a producer and developer at the
same time, but recognizing that you need to switch hats
periodically is key.
What are the kinds of questions you should be asking when
in your producer role?
Are you letting personal technology preferences cloud
your vision? Haskell is a great language, but what if
you're writing a lot of code that you'd get for free with
Python? Yes, Android is open to some extent and iOS isn't,
but should you be dismissing the entire iPhone / iPad
market for that reason?
Are you avoiding doing the right thing because it's
hard? If everyone you show your project to has the same
confusion about how a feature works, disregarding it
because it would take a month of rearchitecting usually
isn't a valid response.
Are you simply copying existing ideas without
offering anything new? It's so easy to see a finished
application and jump into writing your own version. But
think about the problem that it was designed to solve
instead of copying the same solution. A painting program
doesn't have to use the blueprint laid out by MacPaint in
1984 [EDIT: that blueprint appeared earlier in Draw for the
Xerox Alto and SuperDraw for the PC]. An IDE doesn't have
to follow the project tree view on the
left schematic.
Are you spending too much time building your own
solutions instead of using what's already out there?
Should you really be writing another webserver or markdown
formatter or JSON decoder?
Do you understand your audience? If you're
building an application for graphic artists, are you
familiar with how graphic artists work? Are you throwing
out features that would be met with horrified stares?
A Forgotten Principle of Compiler Design
That a clean system for separately compiled modules
appeared in Modula-2, a programming language designed by
Niklaus Wirth in 1978, but not in the 2011 C++
standard...hmmm, no further comment needed. But the
successor to Modula-2, Oberon, is even more
interesting.
With Oberon, Wirth removed features from Modula-2
while making a few careful additions. It was a smaller
language overall. Excepting the extreme minimalism of
Forth, this is the first language I'm aware of where
simplicity of the implementation was a concern. For
example, nested modules were rarely used in Modula-2, but
they were disproportionately complex to compile, so they
were taken out of Oberon.
That simplicity carried over to optimizations performed
by the compiler. Here's
Michael Franz:
Optimizing compilers tend to be much larger and much
slower than their straightforward counterparts. Their
designers usually do not follow Oberon's maxim of
making things "as simple as possible", but are inclined
to completely disregard cost (in terms of compiler
size, compilation speed, and maintainability) in favor
of code-quality benefits that often turn out to be
relatively marginal. Trying to make an optimizing
compiler as simple as possible and yet as powerful as
necessary requires, before all else, a measurement
standard, by which both simplicity and power can be
judged.
For a compiler that is written in the language it
compiles, two such standards are easily found by
considering first the time required for
self-compilation, and then the size of the resulting
object program. With the help of these benchmarks, one
may pit simplicity against power, requiring that every
new capability added to the compiler "pays its own way"
by creating more benefit than cost on account of at
least one of the measures.
The principle is "compiler optimizations should pay for
themselves."
Clearly it's not perfect (the Oberon compiler doesn't
make heavy use of floating point math, for example, so
floating point optimizations may not speed it up or make it
smaller), but I like the spirit of it.
(If you liked this, you might enjoy Papers from the Lost Culture of Array
Languages.)
The Most Important Decisions are Non-Technical
I occasionally get puzzled questions about a parenthetical remark I made in 2010: that I
no longer program for a living. It's true. I haven't been a
full-time programmer since 2003. The short version of these
questions is "Why?" The longer version is "Wait, you've got
a super technical programming blog and you seem to know all
this stuff, but you don't want to work as a
programmer?"
The answer to both of these is that I realized that the
most important decisions are non-technical. That's a bare
and bold statement, so let me explain.
In the summer of 1993, I answered a newspaper ad looking
for a "6502 hacker" (which I thought was amusing; the
Pentium was released that same year) and got a job with a
small company near Seattle writing Super Nintendo games. I
had done game development prior to that, but it was me
working by myself in my parents' living room.
The first SNES game I worked on was a Tarzan-themed
platformer authorized by the estate of Edgar
Rice Burroughs (it had no connection to the Disney
movie, which was still six years in the future). I had fun
working out ways of getting tropical fish to move in
schools and creating behaviors for jungle animals like
monkeys and birds. It was a great place to work, with the
ten or so console programmers all sharing one big
space.
The only problem was that the game was clearly going to
be awful. It was a jumble of platformer clichés, and it
wasn't fun. All the code tweaking and optimization and
monkey behavior improvements weren't going to change that.
To truly fix it required a project-level rethink of why
we were building it in the first place. As a "6502 hacker" I
wasn't in a position to make those decisions.
While it's fun to discuss whether an application should
be implemented in Ruby or Clojure, to write beautiful and
succinct code, to see how far purely functional programming
can be taken, these are all secondary to defining the user
experience, to designing a comfortable interface, to
keeping things simple and understandable, to making sure
you're building something that's actually usable by the
people you're designing it for. Those are more important
decisions.
Whatever happened to that Tarzan game? Even though it
had been previewed in Nintendo Power, the publisher
wisely chose to pay for the year of development and shelve
the finished project.
You, Too, Can Be on the Cutting Edge of Functional
Programming Research
In 1999 I earned $200 writing an essay titled
Toward Programmer Interactivity: Writing Games in Modern
Programming Languages. It was an early, optimistic
exploration of writing commercial games in Haskell, ML, and
Lisp.
It was not a good article.
It's empty in the way that so many other "Hey everyone!
Functional programming! Yeah!" essays are. I demonstrated
the beauty of Haskell in the small, but I didn't offer any
solutions for how to write state-heavy games in it. There
are a few silly errors in the sample code, too.
Occasionally during the following years, I searched for
information about writing games in functional languages,
and that article kept coming up. Other interesting
references turned up too, like papers on Functional
Reactive Programming, but apparently I had accidentally
become an authority. An authority who knew almost nothing
about the subject.
I still didn't know if a mostly-functional style would
scale up past the usual toy examples. I didn't know how to
build even a simple game without destructive assignment. I
wasn't sure if there were even any legitimate benefits. Was
this a better way of implementing game ideas than the usual
tangled web of imperative code--or madness?
As an experiment, I decided to port an action game that
I wrote in 1997 to mostly-pure Erlang. It wasn't a toy, but
a full-featured game chock full of detail and special
cases. I never finished the port, but I had the bulk of the
game playable and running smoothly, and except for a list
of special cases that I could write on a "Hello! My Name
is" label, it was purely functional. I wrote about what I
learned in Purely Functional
Retrogames.
Now when I search for info about writing games in a
functional style, that's what I find.
Sure, there are some other sources out there. Several
times a year a new, exuberant "Haskell / ML / Erlang is a
perfect match for games!" blog entry appears. Functional
Reactive Programming keeps evolving. A couple of people
have slogged through similar territory and managed to
bang out real games in Haskell (Raincat is a good
example).
If you want to be on the cutting edge of functional
programming research, it's easy. Pick something that looks
like a poor match for state-free code, like a video game,
and start working on it. Try to avoid going for the
imperative pressure release valves too quickly. At some
point you're going to need them, but make sure you're not
simply falling back on old habits. Keep at it, and it won't
be long before you're inventing solutions to problems no
one else has had to deal with.
If you come up with something interesting, I'd like to
hear about it.
(If you liked this, you might enjoy Back to the Basics of Functional
Programming.)
We Who Value Simplicity Have Built Incomprehensible
Machines
The 8086 "AAA" instruction seemed like a good idea at
the time. In the 1970s there was still a case to be made
for operating on binary-coded decimal values, with two
digits per byte. What's the advantage of BCD? Large values
can be easily displayed without multi-byte division or
multiplication. "ASCII Adjust After Addition," or AAA, was
committed to the x86 hardware and 30+ years later it's
still there, emulated in microcode, in every i7
processor.
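As a quick illustration of that advantage, pulling
printable digits out of a packed BCD byte takes nothing
more than shifts and masks, no division anywhere:
/* Two decimal digits live in one packed BCD byte, one per nibble. */
void bcd_byte_to_chars(unsigned char bcd, char out[2])
{
    out[0] = '0' + (bcd >> 4);     /* high decimal digit */
    out[1] = '0' + (bcd & 0x0F);   /* low decimal digit  */
}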
The C library function memcpy seemed like a good idea at
the time. memmove was fast and robust, properly handling
the case where the source and destination overlapped. That
handling came at the expense of a few extra instructions
that were enough of a concern to justify a second,
"optimized" memory copying routine (a.k.a. memcpy). Since
then we've had to live with both functions, though there
has yet to be an example of an application whose impressive
performance can be credited to the absence of
overlap-detection code in memcpy.
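For reference, the overlap case those extra instructions
pay for looks like this; memmove is defined to handle it,
while memcpy makes no such promise:
#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[] = "abcdef";
    /* Shift the string one position left: source and destination
       overlap. memmove handles this; memcpy on overlapping regions
       is undefined behavior. */
    memmove(buf, buf + 1, strlen(buf + 1) + 1);
    printf("%s\n", buf);   /* prints "bcdef" */
    return 0;
}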
libpng seemed like a good idea at the time.
The theory was to have an easy, platform-independent way of
reading and writing PNG files. The result does work,
and it is platform independent, but it's possibly
the only image decoding library where I can read through
the documentation and still not know how to load an image.
I always Google "simple libpng example" and cut and paste
the 20+ line function that turns up.
The UNIX ls utility seemed like a good idea
at the time. It's the poster child for the UNIX way: a
small tool that does exactly one thing well. Here that
thing is to display a list of filenames. But deciding
exactly what filenames to display and in what format led to
the addition of over 35 command-line switches. Now the man
page for the BSD version of ls bears the shame
of this footnote: "To maintain backward compatibility, the
relationships between the many options are quite
complex."
None of these examples are what caused modern computers
to be incomprehensible. None of them are what caused SDKs
to ship with 200 page overview documents to give some clue
where to start with the other thousands of pages of API
description.
But all the little bits of complexity, all those cases
where indecision caused one option that probably wasn't
even needed in the first place to be replaced by two
options, all those bad choices that were never remedied for
fear of someone somewhere having to change a line of
code...they slowly accreted until it all got out of
control, and we got comfortable with systems that were
impossible to understand.
We did this. We who claim to value simplicity are the
guilty party. See, all those little design decisions
actually matter, and there were places where we could have
stopped and said "no, don't do this." And even if we were
lazy and didn't do the right thing when changes were easy,
before there were thousands of users, we still could have
gone back and fixed things later. But we didn't.
(If you liked this, you might enjoy Living in the Era of Infinite Computing
Power.)
The Pace of Technology is Slower than You Think
"That post is OLD! It's from 2006!" The implication is
that articles on technology have a shelf-life, that
writings on programming and design and human factors
quickly lose relevance. Here's a reminder that the pace of
technological advancement isn't as out of control as it may
seem.
The first book on Objective-C, the language of modern
iOS development, was published in 1986.
Perl came on the scene in 1987, Python in 1991, Ruby in
1995.
You can still buy brand new 6502 and Z80 microprocessors
(a Z80 is $2.49 from Jameco
Electronics). A Z80
programming guide written in 1979 is still
relevant.
Knowledge of the C standard library would have served
you equally well developing for MS-DOS, early SUN
workstations, the Atari ST, Microsoft Windows, and iOS.
The Quicksort algorithm, taught in all computer science
curricula, was developed by Tony Hoare in 1960.
Bill Joy wrote vi in 1976. The span of time between it
and the initial release of Bram Moolenaar's vim in 1991 (15
years) is shorter than the time between the release of vim
and this blog entry (21 years).
The instruction set of the 80386 CPU, announced in 1985
and available the following year, is still a common target
for 32-bit software development.
The tar command appeared in Seventh Edition UNIX in 1979,
the same year the vector-based Asteroids arcade game was
released. Pick up any 2012 MacBook Air or Pro and tar is
there.
Another Programming Idiom You've Never Heard Of
New programmers quickly pick up how array indexing works.
You fetch an element like this: array[3]. (More experienced
folks can amuse themselves with the equally valid 3[array]
in C.) Now here's a thought: what if you could fetch multiple
values at the same time and the result was a new array?
Let's say the initial array is this:
10 5 9 6 20 17 1
Fetching the values at indices 0, 1, 3, and 6,
gives:
10 5 6 1
In the J language you
can actually do this, though the syntax likely isn't
familiar:
0 1 3 6 { 10 5 9 6 20 17 1
The list of indices is on the left, and the original
array on the right. That awkwardly unmatched brace is the
index operator. (You can also achieve the same end in the R
language, if you prefer.)
This may seem like a frivolous extension to something
you already knew how to do, but all of a sudden things have
gotten interesting. Now indexing can be used for more than
just indexing. For example, you can delete elements by
omitting indices. This drops the first two elements:
2 3 4 5 6 { 10 5 9 6 20 17 1
Or how about reversing an array without needing a
special primitive:
6 5 4 3 2 1 0 { 10 5 9 6 20 17 1
This last case is particularly significant, because the
indices specify a permutation of the original array.
Arrange the indices however you want, and you can transform
an array to that order.
In J, there's an operator that's like a sort, except the
result specifies a permutation: a list of where each
element should go. Using the same "10 5 9..." array, that
first element should be in position 4, the value 5 should
be in position 1, and so on. Here's the whole array of
permuted indices.
6 1 3 2 0 5 4
What good is that? If you use that list of indices on
the left side of the "{" operator with the original array
on the right, you sort the array:
6 1 3 2 0 5 4 { 10 5 9 6 20 17 1
Now imagine you've got two other parallel arrays that
you want to keep in sync with the sorted one. All you do is
use that same "sorted permutation" array to index into each
of the other arrays, and you're done.
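The same trick translates outside of J. A rough C version:
sort a list of indices instead of the data itself, then use
that permutation to pull any number of parallel arrays into
sorted order:
#include <stdio.h>
#include <stdlib.h>

static const int values[] = { 10, 5, 9, 6, 20, 17, 1 };
#define COUNT (sizeof(values) / sizeof(values[0]))

/* Compare two indices by the values they point at. */
static int compare_indices(const void *a, const void *b)
{
    return values[*(const int *)a] - values[*(const int *)b];
}

int main(void)
{
    const char *names[COUNT] = { "a", "b", "c", "d", "e", "f", "g" };
    int perm[COUNT];

    for (int i = 0; i < (int)COUNT; i++) perm[i] = i;
    qsort(perm, COUNT, sizeof(int), compare_indices);   /* perm is now 6 1 3 2 0 5 4 */

    /* The one permutation sorts values and keeps names in sync. */
    for (int i = 0; i < (int)COUNT; i++)
        printf("%d %s\n", values[perm[i]], names[perm[i]]);
    return 0;
}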
(If you liked this, you might enjoy the original
A Programming Idiom You've Never Heard
Of.)
Your Coding Philosophies are Irrelevant
I'll assume you've got a set of strongly-held beliefs
about software development. This is a safe bet; anyone who
writes code has some personal mantras and peeves.
Maybe you think that PHP is a broken mess or that Perl
is unmaintainable? Maybe you're quick to respond in forums
with essays about the pointlessness of the singleton
pattern? You should always check the result code after
calling malloc
. Or wait, no, result codes are
evil and exceptions are The Way. Always write the test
cases before any code. Static typing is...well, I could
keep going and hit dozens of other classic points of
contention and link to arguments for each side.
Now imagine there are two finished apps that solve
roughly identical problems. One is enjoyable to use and
popular and making a lot of money. The other just doesn't
feel right in a difficult to define way. One of these apps
follows all of your development ideals, but which app is
it? What if the successful product is riddled with
singletons, doesn't check result codes after allocating
memory (but the sizes of these allocations are such that
failures only occur in pathological cases), and the authors
don't know about test-driven development? Oh, and the
website of the popular app makes extensive use of PHP.
Or even simpler: pick a single tool or game or
application you admire, one you don't have any inside
information about. Now just by staring hard at the screen,
determine if the author favored composition over
inheritance or if he's making rampant use of global
variables. Maybe there are five-hundred-line functions
and gotos all over the place, but you can't tell. Even if
you pick a program that constantly crashes, how do you know
the author doesn't have exactly the same opinions about
development as you?
It's not the behind-the-scenes, pseudo-engineering
theories that matter. An app needs to work and be
relatively stable and bug free, but there are many ways to
reach that point. There isn't a direct connection between
some techie feel-good rule and success. For most arbitrary
rules espoused in forums and blogs, you'll find other
people vehemently arguing the opposite opinion. And it
might just be that too much of this kind of thinking is
turning you into an obsessive architect of abstract code,
not the builder of things people want.
(If you liked this, you might enjoy Don't Fall in Love with Your
Technology.)
The Silent Majority of Experts
When I still followed the Usenet group comp.lang.forth, I wasn't the only person
frustrated by the lack of people doing interesting things
with the language. Elizabeth Rather, co-founder of Forth,
Inc., offered the following explanation: there are
people solving real problems with Forth, but they don't
hang out in the newsgroup. She would know; her company
exists to support the construction of commercial Forth
projects.
In 1996 I worked on a port of The Need for Speed
to the SEGA Saturn. (If you think that's an odd system to
be involved with, I also did 3DO development, went to a
Jaguar conference at Atari headquarters, and had an
official set of Virtual Boy documentation.) There were a
number of game developers with public faces in the 1990s,
but the key people responsible for the original version of
The Need for Speed, released in 1994, remained
unknown and behind the scenes. That's even though they had
written a game based around rigid-body physics before most
developers had any idea that term was relevant to 3D video
games. And they did it without an FPU: the whole engine
used fixed-point math.
Yes, there are many people who blog and otherwise
publicly discuss development methodologies and what they're
working on, but there are even more people who don't.
Blogging takes time, for example, and not everyone enjoys
it. Other people are working on commercial products and
can't divulge the inner workings of their code.
That we're unable to learn from the silent majority of
experts casts an unusual light upon online discussions.
Just because looking down your nose at C++ or Perl is the
popular opinion doesn't mean that those languages aren't
being used by very smart folks to build amazing, finely
crafted software. An appealing theory that gets frantically
upvoted may have well-understood but non-obvious drawbacks.
All we're seeing is an intersection of the people working
on interesting things and who like to write about
it--and that's not the whole story.
Your time may be better spent getting in
there and trying things rather than reading about what
other people think.
(If you liked this, you might enjoy Photography as a Non-Technical Hobby.)
I Am Not a Corporation
In 2009, when I exclusively used a fancy Nikon DSLR, my
photographic work flow was this: take pictures during the
day, transfer them to a PC in the evening, fiddle with the
raw version of each shot in an image editor, save out a
full-res copy, make a smaller version and upload it to
Flickr.
Once I started using an iPhone and the Hipstamatic app,
my work flow was this: take a picture, immediately upload
it to Flickr.
Pick any criteria for comparing the absolute quality of
the Nikon vs the iPhone, and the Nikon wins every time:
sharpness, resolution, low-light ability, you name it. It's
not even close. And yet I'm willing to trade that for the
simplicity and fun of using the iPhone.
That's because I'm not a professional photographer who
gets paid to put up with the inconveniences that come with
the higher-end equipment. If I can avoid daily image
transfers, that's a win for me. If I don't have to tweak
around with contrast settings and color curves, that's
huge.
I also work on projects in my spare time that involve
writing code, but I don't have the luxury of a corporate IT
department that keeps everything up to date and running
smoothly. I don't want to be maintaining my own Linux
installation. I would prefer not to wait forty-five minutes
to build the latest bug-fix release of some tool. I don't
think most developers want to either; that kind of
self-justified technical noise feels very 1990s.
When I'm trying out an idea at home, I'm not getting
paid to deal with what a professional software engineer
would. If I've got thirty minutes to make progress, I don't
want to spend that puzzling out why precompiled headers
aren't working. I don't want to spend it debugging a
makefile quirk. I don't want to
decipher an opaque error message because I got something
wrong in a C++ template. I don't want to wait for a project
to compile at all. I'm willing to make significant
trades to avoid these things. If I can get zero or close to
zero compilation speed, that's worth
a 100x performance hit in the resulting code.
Seriously.
If I were a full-time programmer paid to eke out every
last bit of performance, then there's no way I'd consider
making such a trade. But I'm not, and if I pretended
otherwise and insisted on using the same tools and
techniques as the full-time pros, I'd end up frustrated and
go all Clifford
Stoll and move to an internet-free commune in
Tennessee.
Fun and simplicity are only optional if you're paid to
ignore them.
(If you liked this, you might enjoy Recovering From a Computer Science
Education.)
Things to Optimize Besides Speed and Memory
Whittling down a function to accomplish the same result
with fewer instructions is, unfortunately, fun. It's a mind
teaser in the same way that crossword puzzles and Sudoku
are. Yet it's a waste of time to finely hone a C++ routine
that would be more than fast enough if implemented in
interpreted Python. Fortunately, there are plenty of other
targets for that optimization instinct, and it's worth
retraining your habits to give these aspects of your
projects more attention:
Power consumption, battery life, heat, and fan noise.
Number of disk sector writes (especially for solid-state
drives). Are you rewriting files that haven't changed?
Overall documentation size and complexity.
How much time it takes to read a tutorial--and the
engagement level of that tutorial.
Number of bytes of network traffic. The multiplayer game
folks have been concerned with this from the start, but now
almost every application has some level of network traffic
that might go over non-free phone networks or through slow
public Wi-Fi.
#include file size. This is more about the
number of entities exposed than the byte count.
Number of taps/clicks it takes to accomplish a task.
App startup time.
How long it takes to do a full rebuild of your project.
Or how long it takes to make usability tweaks and verify
that they work.
The number of special cases that must be documented,
either to the user or in your code.
Blog entry length.
(If you liked this, you might enjoy "Avoid Premature Optimization" Does Not Mean
"Write Dumb Code".)
App Store Failure and Personal Responsibility
"I wrote an iPhone app, and it didn't make any money" is
a growing literary genre, and I sympathize with the
authors. I really do. Building any kind of non-trivial,
commercial application takes an immense amount of work that
combines coding, writing, interaction design, and graphic
arts. To spend a thousand hours on a project that sells 103
copies at 99 cents apiece...well, it's disheartening to say
the least.
Dismissing that failure as losing the "app store
lottery" (meaning that success or failure is out of your
control) dodges important questions. When I was writing and
selling indie games in the mid 1990s, I went through the
experience of releasing a game to the
world--euphoria!--followed by despair, confusion, and
endless theorizing about why it wasn't the smash hit I knew
it deserved to be. Most of the failed iPhone app articles
sound like something I would have written in 1997. Of
course the iPhone and Apple's App Store didn't even exist
then, but my feelings and reactions were exactly the same.
What I learned from that experience may sound obvious,
and that's precisely why it's a difficult lesson to learn:
just because you slogged through the massive effort it
takes to design and release a product doesn't have any
bearing at all on whether or not anyone actually wants what
you made.
See? I told you it sounds obvious, but that doesn't make
it any easier to deal with. Getting something out the door
is the price of entry, not a guarantee of success. If it
doesn't go as planned, then you have to accept that there's
some reason your beautiful creation isn't striking a chord
with people, and that involves coming face to face with
issues that aren't fun to think about for most bedroom
coders.
Have you ever watched complete strangers use your app?
Are they interpreting the tutorials correctly? Are they
working in the way you expected them to? If it's a game, is
the difficulty non-frustrating? Maybe you designed and
polished twenty levels, not realizing that only a handful
of players get past level one.
It's harder to judge if the overall quality is there.
Your cousin might draw icons for free, but do they give the
impression of high-end polish? Are there graphics on the
help screen or just a wall of text? Are you using readable
fonts? Are you avoiding improvements because they'd be too
much work? Developer Mike Swanson
wrote an Adobe Illustrator to Objective-C exporter just
so images would stay sharp when scaled.
It's also worth taking a step back and looking at the
overall marketplace. Maybe you love developing 16-bit retro
platformers, but what's the overall level of interest in
16-bit retro platformers? Is there enough enthusiasm to
support dozens of such games or is market saturation
capping your sales? If you've written a snazzy to-do list
app, what makes it better than all the other to-do lists
out there? Can folks browsing the app store pick up on that
quickly?
It would be wonderful to be in a position of developing
software, blindly sending it out into the world, and making
a fortune. It does happen. But when it doesn't, it's better
to take responsibility for the failure and dig deeper into
what to do about it rather than throwing up your hands and
blaming the system.
One Small, Arbitrary Change and It's a Whole New
World
I want to take one item from Things
to Optimize Besides Speed and Memory and run with it:
optimizing the number of disk sector writes.
This isn't based on performance issues or the limited
number of write cycles for solid-state drives. It's an
arbitrary thought experiment. Imagine we've got a system
where disk writes (but not reads!) are on par with early
1980s floppy disks, one where writing four kilobytes of
data takes a full second. How does looking through the
artificial lens of disk writes being horribly slow change
the design perceptions of a modern computer (an OS X-based
laptop in this case, because that's what I'm using)?
Poking around a bit, there's a lot more
behind-the-scenes writing of preferences and so on than
expected. Even old-school tools like bash and vim save
histories by default. Perhaps surprisingly, the innocuous
less text-file viewer writes out a history file every time
it's run.
There are system-wide log files for recording errors and
exceptional occurrences, but they're used for more than
that. The Safari web browser logs a message when the
address/search bar is used and every time a web page is
loaded. Some parts of the OS are downright chatty,
recording copyright messages and every step of the
initialization process for posterity. There's an entire
megabyte of daemon shutdown details logged every time the
system is turned off. Given the "4K = 1 second" rule,
that's over four minutes right there.
The basic philosophy of writing data files needs a
rethink. If a file is identical to what's already on disk,
don't write it. Yes, that implies doing a read and compare,
but those aren't on our performance radar. Here's an
interesting case: what if you change the last line of a
100K text file? Most of the file is the same, so we can get
by with writing a single 4K sector instead of the 25 second
penalty for blindly re-saving the whole thing.
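A minimal sketch of the whole-file version of that idea,
assuming a read and a compare are cheap enough not to
matter:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write data to path only if the contents would actually change.
   Returns 0 if the file was already identical, 1 if written, -1 on error. */
int write_if_changed(const char *path, const char *data, size_t len)
{
    int same = 0;
    FILE *f = fopen(path, "rb");
    if (f) {
        char *old = malloc(len + 1);
        if (old) {
            size_t n = fread(old, 1, len + 1, f);   /* read one extra byte to catch longer files */
            same = (n == len) && memcmp(old, data, len) == 0;
            free(old);
        }
        fclose(f);
    }
    if (same)
        return 0;            /* identical contents: skip the write entirely */
    f = fopen(path, "wb");
    if (!f) return -1;
    fwrite(data, 1, len, f);
    fclose(f);
    return 1;
}
The single-sector trick in the text is the same idea
applied per 4K block.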
All of this is minor compared to what goes on in a
typical development environment. Compiling a single file
results in a potentially bulky object file. Some compilers
write out listings that get fed to an assembler. The final
executable gets written to disk as well. Can we avoid all
intermediate files completely? Can the executable go
straight to memory instead of being saved to disk, run once
for testing, then deleted in the next build?
Wait, hold on. We started with the simple idea of
avoiding disk writes and now we're rearchitecting
development environments?
Even though it was an off-the-cuff restriction, it
helped uncover some inefficiencies and unnecessary
complexity. Exactly why does less need to maintain a
history file? It's not a performance issue, but
it took code to implement, words to document, and it raises
a privacy concern as well. Not writing over a file with
identical data is a good thing. It makes it easy to sort by
date and see what files are actually different.
Even the brief wondering about development systems
brings up some good questions about whether the design
status quo for compilers forty years ago is the best match
for the unimaginable capabilities of the average portable
computer in 2012.
All because we made one small, arbitrary change to our
thinking.
(If you'd like to subscribe, here's the news feed.)
All that Stand Between You and a Successful Project are
500 Experiments
Suppose there was a profession called "maker." What does
a maker do? A maker makes things! Dinner. Birdhouses.
Pants. Shopping malls. Camera lenses. Jet engines.
Hydroelectric power stations. Pianos. Mars landers.
Being a maker is a rough business. It's such a
wide-ranging field, and just because you've made hundreds
of flowerpots doesn't give you any kind of edge if you need
to make a catalytic converter for a 1995 Ford truck.
Now think about a profession called "programmer." What
does a programmer do? A programmer programs things!
Autonomous vehicles. Flight simulators. Engine control
systems. Solid state drive firmware. Compilers. Video
games. Airline schedulers. Digital cameras.
If you focus in on one area it expands like a fractal.
Video games? That covers everything from chess to 3D open
world extravaganzas to text adventures to retro
platformers. Pick retro platformers and there's a wide
range of styles and implementation techniques. Even if you
select a very specific one of these, slight changes to the
design may shift the problem from comfortable to brain
teaser.
The bottom line is that it's rare to do software
development where you have a solid and complete
understanding of the entire problem space you're dealing
with. Or looking at it another way, everything you build
involves forays into unfamiliar territory. Everything you
build is to a great extent a research project.
How do you come to grips with something you have no
concrete experience with? By running experiments. Lots of
little throwaway coding and interface experiments that
answer questions and settle your mind.
Writing a PNG decoder, for example, is a collection of
dozens of smaller problems, most of which can be fiddled
around with in isolation with real code. Any significant
app has user interactions that need prototyping, unclear
and conflicting design options to explore, tricky bits of
logic, API calls you've never used--hundreds of things.
Maybe five hundred. And until you run those experiments,
you won't have a solid understanding of what you're
making.
(If you liked this, you might enjoy Tricky When You Least Expect It.)
Hopefully More Controversial Programming Opinions
I read
20 Controversial Programming Opinions, and I found
myself nodding "yes, yes get to the good stuff." And then,
after "less code is better than more," it was over. It was
like reading a list of controversial health tips that
included "eat your veggies" and "don't be sedentary." In an
effort to restore a bit of spark to the once revolutionary
software development world, I present some opinions that
are hopefully more legitimately controversial.
Computer science should only be offered as a minor. You
can major in biology, minor in computer science. Major in
art, minor in computer science. But you can't get a degree
in CS.
It's a mistake to introduce new
programmers to OOP before they understand the basics of
breaking down problems and turning the solutions into
code.
Complex compiler optimizations are almost never worth
it, even if they result in faster code. They can disproportionately slow down the compiler.
They're risky, in that a mishandled edge case in the
optimizer may result in obscure, latent bugs in the
application. They make reasoning about
performance much more difficult.
You shouldn't be allowed to write a library for use by
other people until you have ten years of programming under
your belt. If you think you know better and ignore this
rule, then one day you will come to realize the mental
suffering that you have inflicted upon others, and you will
have to live with that knowledge for the rest of your
life.
Superficially ugly code is
irrelevant. Pretty formatting--or lack thereof--has no
bearing on whether the code works and is reliable, and that
kind of mechanical fiddling is better left to an automated
tool.
Purely functional programming doesn't
work, but if you mix in a small amount of imperative
code then it does.
A software engineering mindset can
prevent you from making great things.
The Goal is to be Like a Bad Hacker Movie
The typical Hollywood hacking scene is an amalgamation
of familiar elements: screens full of rapidly changing hex
digits, database searches that show each fingerprint or
image as it's encountered, password prompts in a 72 point
font, dozens of windows containing graphs and random
data...oh, and 3D flights through what presumably are the
innards of a computer. Somehow the protagonist uses these
ridiculous tools to solve a difficult problem in a matter
of minutes, all while narrating his exploits with
nonsensical techno jargon.
Admittedly, it's a lot more entertaining than the
reality of staring at code for hours, sitting through ten
minute compile and link cycles, and crying over two page
error messages from a C++ template gone bad.
But the part about solving a problem in a matter of
minutes? There's some strong appeal in that. Who wouldn't
want to explore and come to grips with a tricky issue in
real-time?
It's not as outlandish as it sounds. Traditional
software development methodologies are based more around
encapsulation and architecture than immediate results. To
mimic the spirit--not the aesthetics or technical
details--of a scene from a bad hacker movie, you need
different priorities:
Visualization tools. At one end of the
spectrum are Bret
Victor-esque methods for interactively exploring a
problem space. On a more basic level, it's tremendously
useful to have graphing facilities available at all times.
How often are zeros occurring in this data? What are the
typical lengths of strings going through this function?
Does displaying a matrix as a grid of rectangles, with each
unique element mapped to a separate color, show any hidden
patterns?
Terseness. It's easier to just say print
or cos than remembering output display functions
are in system.output.text and cosine is in
math.transcendentals. It's easier to have built-in
support for lists than remembering what constructors are
available for the list class. It may initially seem obtuse
that Forth's memory fetch and store operations are named
"@" and "!", but that reaction quickly fades and the
agility of single-character words sticks with you.
A set of flexible, combinable operations. The
humble "+" in array languages does
more than a plus operator in C. It not only adds two
numbers, but also adds a value to each element of an
arbitrarily long array and adds two arrays together. Follow
it with a slash ("+/") and the addition operator gets
"inserted" between the elements of an array, returning the
sum.
It gets interesting when you've got a collection of
operations like this that can be combined with each other.
Here's a simple example: How do you transform a list of
numbers into a list of pairs, where the first element of
each pair is the index and the second the original number?
Create a list of increasing values as long as the original,
then zip the two together. That's impossibly terse in a
language like J, but
maybe more readily understandable in Haskell or Erlang:
lists:zip(lists:seq(1,length(L)),L).
The trick here is to forget about loops and think
entirely in terms of stringing together whole array or list
transformations. That lets you try a series of single-line
experiments while avoiding opening up a text editor and
switching your mindset to "formal coding" mode.
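For instance, assuming a list of numbers L is already bound
in the shell, these throwaway one-liners (my examples, not
from any particular project) stay entirely in the world of
whole-list transformations:
lists:sum([X * X || X <- L, X rem 2 =:= 0]).
[X || {I, X} <- lists:zip(lists:seq(1, length(L)), L), I rem 2 =:= 0].
The first sums the squares of the even values; the second
keeps only the values at even positions, reusing the same
index-pairing trick.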
(If you liked this, you might enjoy This Isn't Another Quick Dismissal of Visual
Programming.)
Minimalism in an Age of Tremendous Hardware
You don't know minimalism until you've explored the
history of the Forth programming language.
During Forth's heyday, it was unremarkable for a full
development environment--the entire language with
extensions, assembler, and integrated editor--to be less
than 16K of object code. That's not 16K of data loaded by
another program, but a full, standalone, 16K system capable
of meta-compiling its own source code to create a new
standalone system.
With work, and depending on how much you wanted to throw
out, you could get that 16K number much lower. 8K was
reasonable. 1K for an ultralight system with none of the
blanks filled in. Half that if you were an extremist that
didn't mind bending the definition of Forth into
unrecognizable shapes.
Some Forths booted directly from a floppy disk and took
over the machine. No operating system necessary. Or perhaps
more correctly, Forth was the operating system.
In the early 1990s, Forth creator Chuck Moore had an
epiphany and decided that interactively hex-editing 32-bit
x86 machine code was the way to go. (After writing an
entire chip layout package this way he realized there were
some drawbacks to the scheme and reverted to designing
languages that used source code.)
Looking back, there are many questions. Was there ever a
time when reducing a development environment from 16K to 8K
actually mattered? Why did the Forth community expend so
much effort gazing inward, constantly rethinking and
rewriting the language instead of building applications
that incidentally happened to be written in Forth? Why was
there such emphasis on machine-level efficiency instead of
developer productivity?
In 2012 it all seems like so much madness, considering
that I could write a Forth interpreter in Lua that when
running on an iPhone from a couple generations back would
be 10,000 times faster than the most finely crafted
commercial Forth of 30 years ago. I'm not even considering
the tremendous hardware in any consumer-level desktop.
Still, there's something to the minimalism that drove
that madness. The mental burden involved in working with a
50K file of Python code is substantially higher than one of
10K. It doesn't matter that these modest numbers are
dwarfed by the multi-megabyte system needed to execute that
code. Those hundreds of extra lines of Python mean there's
more that can go wrong, more that you don't understand,
more time sitting in front of a computer fixing and
puzzling when you should be out hiking or playing
guitar.
Usually--almost always--there's a much simpler solution
waiting to be discovered, one that doesn't involve all the
architectural noise, convolutions of the straightforward,
and misguided emphasis on hooks and options for all kinds
of tangents which might be useful someday. Discovering that
solution may not be easy, but it is time well spent.
That's what I learned from Forth.
(If you liked this, you might enjoy Deriving Forth.)
What's Your Hidden Agenda?
In July 1997, the Issaquah Press printed an article with
the headline "Man Shoots Computer in Frustration." Now
realize that Issaquah is just south of Redmond, so it's not
surprising that this story was picked up nationally. It
rapidly became a fun-to-cite piece of odd news, the fodder
of morning radio shows. Google News has
a scan of one version of the story, so you can read it
for yourself.
A week later, the Issaquah Press ran a correction to the
original story. It turns out that not only was the PC
powered down at the time of the shooting, but the man
wasn't even in the same room with it when he fired his gun.
The bullets went through a wall and hit the computer more
or less by accident. I'm not denying that this guy had
issues, but one of them wasn't anger stemming from computer
trouble.
So how did the original story manage to get into
print?
Somehow the few facts were lined up, and from an
objective point of view there were gaps between them. A
distraught man. A discharged gun. Bullet holes in a no
longer functioning PC. I have no way of knowing who
mentally pieced together the sequence of events, but to
someone the conclusion was blindingly obvious: computers
are frustrating, wouldn't we all like to shoot one? Perhaps
the unknown detective recently lost hours of work when a
word processor crashed? Perhaps it was the influence of all
the overheard and repeated comments about Windows 95
stability?
When I read forum postings and news articles, I'm wary
of behind the scenes agendas. Sometimes they're obvious,
sometimes not. Sometimes it takes a while to realize that
this is a guy with a beef about Apple, this other person
will only say good things about free-as-in-freedom
software, this kid endlessly defends the honor of the
PlayStation 3 because that's what his parents got him for
Christmas, and he can't afford to also have an Xbox. And
then I realize these people are unable to present me with a
clear vision of what happened in that house in Issaquah in
1997.
Digging Out from Years of Homogeneous Computing
When I first started looking into
functional programming languages, one phrase that I kept
seeing in discussions was As Fast as C. A
popular theory was that functional programming was failing
to catch on primarily because of performance issues. If
only implementations of Haskell, ML, and Erlang could be
made As Fast As C, then programmers would flock to these
languages.
Since then, all functional languages have gotten
impressively fast. The top-end PC in 1998 was a 350MHz
Pentium II. The passage of time has solved all
non-algorithmic speed issues. But at the same time, there
was a push for native code generation, for better
compilers, for more optimization. That focus was a mistake,
and it would take a decade for the full effect of that
decision to come to light.
In the early 2000s, PCs were the computing world.
I'm not using "personal computer" in the generic sense; I'm
talking about x86 architecture boxes with window-oriented
GUIs and roughly the same peripherals. People in the
demo-coding scene would shorten the term "x86 assembly
language" to "asm" as if no other processor families
existed. The Linux and Windows folks with nothing better to
do argued back and forth, but they were largely talking
about different shades of the same thing. One of the
biggest points of contention in the Linux community was how
to get a standard, "better than Windows but roughly the
same" GUI for Linux, and several were in development.
Then in 2007 the iPhone arrived and everything
changed.
This has nothing to do with Apple fanboyism. It's that a
new computer design which disregarded all the familiar
tenets of personal computing unexpectedly became a major
platform. The mouse was replaced with a touchscreen. The
decades old metaphor of overlapping windows shuffled around
like papers on a table was replaced by apps that owned all
the pixels of the device. All those years of learning the
intricacies of the Win32 API no longer mattered; this was
something else entirely. And most significantly for our
purposes: the CPU was no longer an x86.
Compiler writers had been working hard, and showing
great progress, in getting Haskell and Objective Caml
compiled into fast x86 machine code. Then, through no fault
of their own, they had to deal with the ARM CPU and a new
operating system to interface with, not to mention that
Objective-C was clearly the path of least resistance for
hardware developed and being rapidly iterated by a company
that promoted Objective-C.
That a functional language compiler on a desktop PC was
getting within a reasonable factor of the execution time of
C no longer mattered if you were a mobile developer. The
entire emphasis put on native code compilation seemed
questionable. With the benefit of hindsight, it would have
been better to focus on ease of use and beautiful coding
environments, on smallness and embeddability. I think that
would have been a tough sell fifteen years ago, blinded by
the holy grail of becoming As Fast as C.
To be fair, ARM did become a target for the Glasgow
Haskell compiler, though it's still not a reasonable option
for iOS developers, and I doubt that's the intent. But
there is one little language that was around fifteen years
ago, one based around a vanilla interpreter, one that's
dozens of times slower than Haskell in the general case.
That language is Lua, and it gets a lot of use on iPhone,
because it was designed from the start to be embeddable in
C programs.
(If you liked this, you might enjoy Caught-Up with 20 Years of UI Criticism.)
Do You Really Want to be Doing This When You're
50?
When I was still a professional programmer, my
office-mate once asked out of the blue, "Do you really want
to be doing this kind of work when you're fifty?"
I have to say that made me stop and think.
To me, there's an innate frustration in programming. It
doesn't stem from having to work out the solutions to
difficult problems. That takes careful thought, but it's
the same kind of thought a novelist uses to organize a
story or to write dialog that rings true. That kind of
problem-solving is satisfying, even fun.
But that, unfortunately, is not what most programming is
about. It's about trying to come up with a working solution
in a problem domain that you don't fully understand and
don't have time to understand.
It's about skimming great oceans of
APIs that you could spend years studying and learning,
but the market will have moved on by then and that's no fun
anyway, so you cut and paste from examples and manage to
get by without a full picture of the architecture
supporting your app.
It's about reading between the lines of documentation
and guessing at how edge cases are handled and whether or
not your assumptions will still hold true two months or two
years from now.
It's about the constant evolutionary changes that occur
in the language definition, the compiler, the libraries,
the application framework, and the underlying operating
system, that all snowball together and keep you in
maintenance mode instead of making real improvements.
It's about getting derailed by hairline fractures in
otherwise reliable tools, and apparently being the first
person to discover that a PNG image with four
bits-per-pixel and an alpha channel crashes the decoder,
then having to work around that.
One approach is to dig in and power through all the
obstacles. If you're fresh out of school, there are free
Starbucks lattes down the hall, and all your friends are
still at the office at 2 AM, too...well, that works. But
then you have to do it again. And again. It's always a last
second skid at 120 miles per hour with brakes smoking and
tires shredding that makes all the difference between
success and failure, but you pulled off another miracle and
survived to do it again.
I still like to build things, and if there's no one else
to do it, then I'll do it myself. I keep improving the
tiny Perl script that puts
together this site, because that tiny Perl script is
unobtrusive and reliable and lets me focus on writing. I
have a handy little image compositing tool that's less than
28 kilobytes of C and Erlang source. I know how it works
inside and out, and I can make changes to it in less time
than it takes to coax what I want out of
ImageMagick.
But large scale, high stress coding? I may have to admit
that's a young man's game.
The Background Noise Was Louder than I Realized
A few years ago I started cutting back on the number of
technology and programming sites I read. It was never a
great number, and now it's only a handful. This had nothing
to do with being burned out on technology and programming; it
was about being burned out on reading about
technology and programming. Perhaps surprisingly, becoming
less immersed in the online tech world has made me more
motivated to build things.
Here's some of what I no longer bother with:
Tired old points of contention that make no difference
no matter who says what (e.g., static vs. dynamic
typing).
Analyses of why this new product is going to be the end
of a multi-billion dollar corporation.
Why some programming language sucks.
Overly long, detailed reviews of incrementally improved
hardware and operating system releases. (I like iOS 6 just
fine, but from a user's point of view it's iOS 5 with a few
tweaks and small additions that will be discovered through
normal use.)
Performance comparisons of just about anything: systems,
GPUs, CPUs, SSDs. The quick summary is that they're all
5-15% faster than last year's infinitely fast stuff.
All of these things are noise. They're below the
threshold of what matters. Imagine you started hanging out
with people who were all, legitimately, writing books. They
each have their own work styles and organization methods
and issues with finding time to write efficiently. As a
software designer, you might see some ways to help them
overcome small frustrations with their tools or maybe even
find an opportunity for a new kind of writing app. But I
can guarantee that GPU numbers and programming language
missteps and the horrors of dynamic typing will have no
relevance to any of what you observe.
I do still read some tech (and non-tech) blogs, even
ones that sometimes violate the above rules. If the author
is sharing his or her direct, non-obvious experience or has
an unusual way of seeing the world, then I'll happily
subscribe. Being much more selective has kept me excited
and optimistic and aware of possibilities instead of living
down below in a world of endless detail and indecision and
craning my neck to see what's going on above the
surface.
(As a footnote, a great way to avoid the usual
aggregation sites is to subscribe to the PDF or real paper
edition of Hacker News
Monthly. Read it cover to cover one Saturday morning
with a good coffee instead of desperately refreshing your
browser every day of the week. Disclosure: I've gotten free
copies of the PDF version for a while now, because I've had
a few articles
reprinted in it.)
OOP Isn't a Fundamental Particle of Computing
The biggest change in programming over the last
twenty-five years is that today you manipulate a set of
useful, flexible data types, and twenty-five years ago you
spent a disproportionately high amount of time building
those data types yourself.
C and Pascal--the standard languages of the
time--provided a handful of machine-oriented types:
numbers, pointers, arrays, the illusion of strings, and a
way of tying multiple values together into a record or
structure. The emphasis was on using these rudiments as
stepping stones to engineer more interesting types, such as
stacks, trees, linked lists, hash tables, and resizable
arrays.
In Perl or Python or Erlang, I don't think about this
stuff. I use lists and strings and arrays with no concern
about how many elements they contain or where the memory
comes from. For almost everything else I use dictionaries,
again no time spent worrying about size or details such as
how hash collisions are handled.
I still need new data types, but it's more a repurposing
of what's already there than crafting a custom solution. A
vector of arbitrary dimension is an array. An RGB color is
a three-element tuple. A polynomial is either a tuple
(where each value is the coefficient and the index is the
degree) or a list of {Coefficient, Degree}
tuples. It's surprising how arrays, tuples, lists, and
dictionaries have eliminated much of the heavy lifting from
the data structure courses I took in college. The focus
when implementing a balanced binary tree is on how balanced
binary trees work and not about suffering through a tangled
web of pointer manipulation.
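As a small sketch of what I mean (the function name and
details are mine, not from any course), evaluating a
polynomial stored that way is one comprehension:
%% 3x^2 + 5 is represented as [{3, 2}, {5, 0}].
eval_poly(Terms, X) ->
    lists:sum([C * math:pow(X, D) || {C, D} <- Terms]).
No custom data structure, no pointer juggling; the list of
tuples is the polynomial.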
Thinking about how to arrange ready-made building blocks
into something new is a more radical change than it may
first appear. How those building blocks themselves
come into existence is no longer the primary concern. In
many programming courses and tutorials, everything is going
along just fine when there's a sudden speed bump of
vocabulary: objects and constructors and abstract base
classes and private methods. Then in the next assignment
the simple three-element tuple representing an RGB color is
replaced by a class with getters and setters and multiple
constructors and--most critically--a lot more code.
This is where someone desperately needs to step in and
explain why this is a bad idea and the death of fun, but it
rarely happens.
It's not that OOP is bad or even flawed. It's that
object-oriented programming isn't the fundamental particle
of computing that some people want it to be. When blindly
applied to problems below an arbitrary complexity
threshold, OOP can be verbose and contrived, yet there's
often an aesthetic insistence on objects for everything all
the way down. That's too bad, because it makes it harder to
identify the cases where an object-oriented style truly
results in an overall simplicity and ease of
understanding.
(Consider this Part 2 of Don't
Distract New Programmers with OOP. There's also a
Part 3.)
An Outrageous Port
In You, Too, Can Be on the Cutting
Edge of Functional Programming Research I wrote:
As an experiment, I decided to port an action game
that I wrote in [1996 and] 1997 to mostly-pure Erlang.
It wasn't a toy, but a full-featured game chock full of
detail and special cases. I never finished the port,
but I had the bulk of the game playable and running
smoothly, and except for a list of special cases that I
could write on a "Hello! My Name is" label, it was
purely functional. I wrote about what I learned in
Purely Functional Retrogames.
I didn't mention the most unusual part: this may be the
world's only port of a game from RISC (PowerPC) assembly
language to a functional language (Erlang).
Exactly why I chose to write an entire video game in
PowerPC assembly language in the first place is hard to
justify. I could point out that this was when the 300,000+
pixels of the lowest resolution on a Macintosh was a heavy
burden compared to the VGA standard of 320x200. Mostly,
though, it was a bad call.
Still, it's a fascinating bit of code archaeology to
look at something developed by my apparently mad younger
self. By-the-book function entry/exit overhead can be 30+
instructions on the PowerPC--lots of registers to save and
restore--but this code is structured in a way that
registers hardly ever need to be saved. In the sprite
drawing routines, option flags are loaded into one of the
alternate sets of condition registers so branches don't
need to be predicted. The branch processing unit already
knows which way the flow will go.
In Erlang, the pain of low-level graphics disappeared.
Instead of using a complicated linkage between Erlang and
OpenGL, I moved all of the graphics code to a small,
separate program and communicated with it via local socket.
(This is such a clean and easy approach that I'm surprised
it's not the go-to technique for interfacing with the OS
from non-native languages.)
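A minimal sketch of the idea, with an invented port number
and message format rather than the actual protocol I used,
looks something like this:
connect_renderer() ->
    {ok, Socket} = gen_tcp:connect("localhost", 5555,
                                   [binary, {packet, 4}, {active, false}]),
    Socket.
draw_sprite(Socket, SpriteId, X, Y) ->
    %% One draw command: three 16-bit integers in a
    %% length-prefixed frame.
    ok = gen_tcp:send(Socket, <<SpriteId:16, X:16, Y:16>>).
The graphics program just reads length-prefixed frames,
decodes them, and draws; everything else stays in Erlang.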
With sprite rendering out of the way, what's left for
the Erlang code? Everything! Unique behaviors for sixteen
enemy types, a scripting system, collision detection and
resolution, player control, level transitions, and all the
detail work that makes a game playable and game design fun.
(The angle_diff problem came from this project, too,
as part of a module for
handling tracking and acceleration.)
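One way to solve it, for the curious (a sketch, not
necessarily the code from that module):
%% Shortest signed difference between two angles in degrees,
%% normalized to the range [-180, 180].
angle_diff(A, B) ->
    D = B - A,
    D - 360 * round(D / 360).
A chasing enemy can then clamp that difference to its
maximum turn rate each frame.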
All of this was recast in interpreted Erlang. Yes, the
purely functional style resulted in constant regeneration
of lists and tuples. Stop-the-world garbage collection kicked in whenever needed.
Zoom in on any line of code and low-level inefficiencies
abound. Between all the opcode dispatching in the VM and
dynamic typing checks for most operations I'm sure the end
result is seen by hardware engineers as some kind of
pathological exercise in poor branch prediction.
All of this, all of this outrageous overhead, driving a
game that's processing fifty or more onscreen entities at
sixty frames per second, sixteen milliseconds per
frame...
...and yet on the 2006 non-pro MacBook I was using at
the time, it took 3% of the CPU.
Three percent!
If anything, my port was too timid. There were places
where I avoided some abstractions and tried to cut down on
the churning through intermediate data structures, but I
needn't have bothered. I could have more aggressively
increased the code density and made it even easier to work
with.
(If you liked this, you might enjoy Slow Languages Battle Across Time.)
"Not Invented Here" Versus Developer Sanity
Developers, working independently, are all pushing to
advance the state of the art, to try to make things better.
The combined effect is to make things more chaotic and
frustrating in the short term.
Early PC video cards rapidly advanced from bizarrely
hued CGA modes to the four-bit pastels of EGA to the 256
color glory of VGA, all in six years. Supporting the full
range required three sets of drawing routines and three
full sets of art.
iPhone resolution went from 320x480 to 640x960 to
640x1136 in less time. Application design issues aside, the
number of icons and launch screen images required for a
submitted app exploded.
Windows 95 offered huge benefits over odd little MS-DOS,
but many companies selling development tools were unwilling
or unmotivated to make the transition and those tools
slowly withered.
Starting with iOS 5, Apple required applications to have
a "root view controller," a simple change that resulted in
a disproportionate amount of confusion (witness the variety
of fixes in this
Stack Overflow thread).
GraphicsMagick is smaller, cleaner, and faster than its
predecessor ImageMagick, but that's of no consolation if
you're reliant on one of ImageMagick's features that was
dropped in the name of simplicity and cleanliness.
Keeping up with all of these small, evolutionary changes
gets tiring, but there's no point in complaining. Who
doesn't want double iPhone resolution, the boost that
DirectX 9 gives over previous versions, or the aesthetics
of curve-based, anti-aliased font rendering instead of 8x8
pixel grids?
Sometimes, occasionally, you can hide from the
never-ending chaos. You can take refuge in your own custom
crafted tools--maybe even a single tool--that does exactly
what you need it to do. A tool that solves a problem core
to your area of focus. A tool that's as independent as
realistically possible from the details of a specific
operating system and from libraries written by people with
different agendas.
This isn't a blanket license for Not
Invented Here syndrome or reinventing the wheel. If you
have a small (small!) library or tool that does exactly
what you need, that you understand inside and out, then you
know if your needs change slightly you can get in there and
make adjustments without pacing back and forth for a new
release. If you have an epiphany about how to further
automate things or how to solve difficult cases you didn't
think were possible, then you can code up those
improvements. Maybe they don't always work out, but you can
at least try the experiments.
Mostly it's about developer sanity and having something
well-understood that you can cling to amidst the swirling
noise of people whose needs and visions of the right
solutions never quite line up with your own.
The UNIX Philosophy and a Fear of Pixels
I've finally crossed the line from mild discomfort with
people who espouse the UNIX way as the pinnacle of
computing to total befuddlement that there's anyone who
still wants to argue such a position. One key plank in the
UNIX party platform is that small tools can be combined
together providing great expressiveness. Here's a simple
task: print a list of all the files with a txt
extension in the current directory except for
ignore_me.txt.
Getting a list of text files is easy: ls *.txt. Now how
to remove ignore_me.txt from that list? Hmmm...well, you
might know that grep can be inverted via a switch so it
returns lines that don't match:
ls *.txt | grep -v ^ignore_me\\.txt$
There's also the find utility which can do the whole
thing in one step, but it takes more fiddling around to
get the parameters right:
find *.txt -type f ! -name ignore_me.txt
This all works, and we've all figured this stuff out by
reading man pages and googling around, but take a moment to
consider how utterly anachronistic both of the above
solutions come across to non-believers in 2012. It's like
promoting punch cards or IBM's job control language from
the 1960s. You've got to get that space between the !
and -name or you'll get back "!-name: event not found."
But this isn't what
I wanted to talk about so I'll stop there.
What I really wanted to talk about are text files and
visual programming.
I keep seeing the put-downs of any mention of
programming that involves a visual component. I wrote an
entire entry two years ago on the subject, This Isn't Another Quick Dismissal of Visual
Programming, and now I don't think it was strong
enough. Maybe the problem is that "visual programming" is a
bad term, and it should be "ways to make programming be
more visual." At one time all coding was done on monochrome
monitors, but inexpensive color displays and more CPU power
led to syntax highlighting, which most developers will
agree is a win.
Now go further and stop thinking of code as a long
scroll of text, but rather as discrete functions that you
can view and edit independently. That's starting to get
interesting. Or consider the discussion of trees in any
algorithms book, where nodes and leaves are rendered inside
of boxes, and arrows show the connections between them.
It's striking that $500 consumer hardware has over three
million pixels and massively parallel GPUs to render those
pixels, yet there's old school developer resistance to
anything fancier than dumping out characters in a
monospaced font. Why is that?
It's because tools to operate on text files are easy to
write, and anything involving graphics is
several orders of magnitude harder.
Think about all the simple, interview-style coding
problems you've seen. "Find all the phone numbers in this
text file."
FizzBuzz. Do any of them involve colors or windows or
UI? For example, "On this system, how many pixels wide is a
given string in 18 point Helvetica Bold?" "List all the
filenames in the current directory in alphabetical order,
with the size of the font relative to the size of the file
(the names of the largest and smallest files should be
displayed in the largest and smallest font,
respectively)."
There have been some tantalizing attempts at making
graphical UI development as easy as working with text. I
don't mean tools like Delphi or the iOS UIKit framework,
where you write a bunch of classes that inherit from a core
set of classes, then use visual layout packages to design
the front-end. I mean tools that let you quickly write a
description of what you want UI-wise, and then there it is
on the screen. No OOP. No code generators. If you've ever
used the Tk toolkit for Tcl, then you've got a small taste
of what's possible.
The best attempt I've seen is the UI description
sub-language of REBOL. Creating a
basic window with labeled buttons is a one-liner. Clearly
all wasn't perfect in REBOL-ville, as a burst of excitement
in the late 1990s was tempered with a long period of
inactivity, and some features of the language never quite
lived up to their initial promises.
These days HTML is the most reasonable approach to
anything involving fonts and images and interaction. It's
not as beautifully direct as REBOL, and being trapped in a
browser is somewhere between limiting and annoying, but the
visual toolkit is there, and it's ubiquitous. (For the
record, I would have solved the "list all the filenames..."
problem by generating HTML, but firing up a browser to
display the result is a heavy-handed solution.)
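If you're curious, a rough Erlang sketch of that
HTML-generating approach, with arbitrary choices for the
font scaling, would be something like:
%% List the files in Dir as HTML, font size scaled from 10 to
%% 40 pixels according to file size.
listing_to_html(Dir) ->
    {ok, Names} = file:list_dir(Dir),
    Sized = [{N, filelib:file_size(filename:join(Dir, N))}
             || N <- lists:sort(Names)],
    Largest = lists:max([1 | [S || {_, S} <- Sized]]),
    Items = [io_lib:format("<div style=\"font-size:~wpx\">~s</div>~n",
                           [10 + (30 * S) div Largest, N])
             || {N, S} <- Sized],
    ["<html><body>", Items, "</body></html>"].
Write the result out with file:write_file and point a
browser at it.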
Code may still be text behind the scenes, but that
doesn't mean that coding has to always be about
working directly in a text editor or monospaced terminal
window.
Dangling by a Trivial Feature
I'm looking for a good vector-illustration app and
download a likely candidate. I rely on knowing the position
of the cursor on the page--strangely, some programs won't
show this--so it's a good first sign to see coordinates
displayed at the top of the screen. Now I drag the
selection rectangle around a shape to measure it.
Uh-oh, it's still showing only the cursor coordinates as
I drag. What I want to see are two sets of values:
the cursor position and the current size of the rectangle.
Now I could do the subtraction myself, but I'm using a
computer that can do billions of calculations each second.
I don't want to get slowed down because of mistakes in my
mental computation or because I mistyped a number into the
Erlang or Python interpreter I use as a calculator.
I set this app aside and start evaluating another. And
that sentence should be utterly horrifying to developers
everywhere.
A team of people spent thousands of hours building that
app. There are tens or hundreds of thousands of lines of
code split across dozens of files. There's low-level
manipulation of cubic splines, a system for creating layers
and optimizing redraw when there are dozens of them, a
complex UI, importing and exporting of SVG and Adobe
Illustrator and Postscript files, tricky algorithms for
detecting which shape you're clicking on, gradient and drop
shadow rendering, text handling...and I'm only hitting some
of the highlights.
Yet here I am dismissing it in a casual, offhand way
because of how the coordinates of the selection rectangle
are displayed. The fix involves two subtractions, a change
to a format string, and a bit of testing. It's trivial,
especially in comparison to all the difficult,
under-the-hood work to make the selection of objects
possible in the first place, but it makes no difference,
because I've moved on.
These are the kind of front-facing features people use
to decide if they like an app or not. Seemingly superficial
fit and finish issues are everything, and the giant
foundation that enables those bits of polish is simply
assumed to exist.
(If you liked this, you might enjoy It's Like That Because It Has Always Been Like
That.)
Documenting the Undocumentable
Not too long ago, any substantial commercial software
came in a substantial box filled with hundreds or thousands
of printed pages of introductory and reference material,
often in multiple volumes. Over time the paper manuals
became less comprehensive, leaving only key pieces of
documentation in printed form, the reference material
relegated to online help systems. In many cases the concept
of a manual was dropped completely. If you can't figure
something out you can always Google for it or watch YouTube
videos or buy a book.
If you're expecting a lament for good, old-fashioned
paper manuals, then this isn't it. I'm torn between the
demise of the manual being a good thing, because almost no
one read them in the first place, and the move to digital
formats hiding how undocumentable many modern software
packages have become.
Look at Photoshop CS6. The "Help and Tutorials" PDF is
750 pages, with much of that being links to external
videos, documents, and tutorials. Clearly that's still not
enough information, because there's a huge market for
Photoshop books and classes. The first one I found at
Amazon, Adobe Photoshop CS6 Bible, is 1100
pages.
The most fascinating part of all of this is what's
become the tip of the documentation iceberg: the Quick
Start guide.
This may be the only non-clinical documentation that
ships with an application. It's likely the only thing a
user will read before clicking around and learning through
discovery or Google. So what do you put in the Quick Start
guide? Simple tutorials? Different tutorials for different
audiences? Explanations of the most common options?
Here's what I'd like to see: What the developers of the
software were thinking when they designed it.
I don't mean coding
methodologies; I mean the assumptions that were made
about how the program should be used. For example, some
image editors add a new layer each time you create a
vector-based element like a rectangle. That means lots of
layers, and that's okay. The philosophy is that bitmaps and
editable vector graphics are kept completely separate.
Other apps put everything into the same layer unless you
explicitly create a new one. The philosophy is that layers
are an organizational tool for the user.
Every application has philosophies like this that
provide a deeper understanding once you know about them,
but seem random otherwise. Why does the iMovie project size
remain the same after removing twenty seconds of video?
Because the philosophy is that video edits are
non-destructive, so you never lose the source footage. Why
is it so much work to change the fonts in a paper written
in Word? Because you shouldn't be setting fonts directly;
you should be using paragraph styles to signify your intent
and then make visual adjustments later.
I want to see these philosophies documented right up
front, so I don't have to guess and extrapolate about what
I perceive as weird behavior. I'm thinking "What? Where are
all these layers coming from?" but the developers wouldn't
even blink, because that's normal to them.
And I'd know that, if they had taken the time to tell
me.
(If you liked this, you might enjoy A
Short Story About Verbosity.)
2012 Retrospective
A short summary of 2012: more entries than any previous
year by far (41 vs. 33 in 2010), and a site design that
finally doesn't look so homemade.
And a tremendous increase in traffic.
It's not the number of network packets flying around
that matters. To you, reading this, this may look like a
blog that's ostensibly about building things with
technology while more often than not dancing around any
kind of actual coding, but in (hopefully) interesting ways.
To me, this is an outlet for ideas and for writing. That
I'm able to fulfill my own desire to write and
there's a large audience that finds it useful...I am
stunned that such an arrangement exists.
You have my sincere gratitude for taking the time to
read what I've written.
popular articles from 2012
others from 2012 that I personally like
(Here's last year's retrospective.)
An Irrational Fear of Files on the Desktop
A sign of the clueless computer user has long been
saving all files directly to the desktop. You can spot this
from across the room, the background image peeking through
a grid of icons. Well-intentioned advice of "Here, let me
show you how to make a folder somewhere else," is
ignored.
The thing is, it's not only okay to use the desktop as a
repository for all your work, it's beautiful from an
interaction design perspective.
The desktop is the file system, and it's a visual
one too. Everything is right there in front of you as a
sort of permanent file browser. There's no need for a "My
Computer" icon, having to open an application for browsing
files (i.e., Windows Key + E), or dealing with the
conceptual difference between the desktop and, say, "My
Documents" (something surprisingly difficult to explain to
new users). It's only too bad so much time has been spent
disparaging the desktop as a document storage location.
What about the mess caused by a screen full of icons?
That's the best part: you can see your mess. You can be
disorganized regardless of where you store documents, but
if you just dump everything into "My Documents" you don't
have the constant in-your-face reminder to clean things up.
The lesson shouldn't be not to put things on the desktop,
but how to create folders for projects or for things you're
no longer working on.
To be fair, there were once good arguments against
storing everything on the desktop. Back when the Windows
Start menu required navigating nested menus, it was easier
to have desktop shortcuts for everything--most applications
still create one by default. That muddled the metaphor. Was
the desktop for documents or programs? Once you were able
to run an app by clicking Start and typing a few letters of
the name (or the OS X equivalent: Spotlight), the desktop
was no longer needed as an application launcher. (And now
there are more iOS-like mechanisms for this purpose in both
OS X Mountain Lion and Windows 8.)
The puzzling part of all this is how a solid, easy to
understand model of storing things on a computer became
exactly what the knowledgeable folks--myself included--were
warning against.
(If you liked this, you might enjoy User Experience Intrusions in iOS 5.)
Trapped by Exposure to Pre-Existing Ideas
Let's go back to the early days of video games. I don't
mean warm and fuzzy memories of the Nintendo Entertainment
System on a summer evening, but all the way back to the
early 1970s when video games first started to exist as a
consumer product. We have to go back that far, because
that's when game design was an utterly black void, with no
genres or preconceptions whatsoever. Each game that came
into existence was a creation no one had previously
imagined.
While wandering through this black void, someone
had--for the very first time--the thought to build a video
game with a maze in it.
The design possibilities of this maze game idea were
unconstrained. Was it an open space divided up by a few
lines or a collection of tight passageways? Was the goal to
get from the start to the finish? Was it okay to touch the
walls or did they hurt you? Two people could shoot at each
other in a spacious maze using the walls for cover. You
could be in a maze with robots firing at you. Maybe you could
break through some of the walls? Or what if you built the
walls yourself? "Maze" was only a limitation in the way
that "detective story" was for writers.
And then in 1980, when only a relative handful of maze
game concepts had been explored, Toru Iwatani designed
Pac-Man.
It featured a maze of tight passageways full of dots,
and the goal was to eat all of those dots by moving over
them. You were chased by four ghosts that killed you on
contact, but there were special dots that made the ghosts
edible for a brief period, so you could hunt them down.
After the release of Pac-Man, when someone had the
thought to create a game with a maze in it, more often than
not that game had tight passageways full of dots,
something--often four of them--chasing you, and a way to
turn the tables on those somethings so you could eliminate
them.
Because by that time, there were no other options.
(If you liked this, you might enjoy Accidental Innovation.)
Sympathy for Students in Beginning Programming
Classes
Here's a template for a first programming class: Use a
book with a language name in the title. Start with the very
basics like formatted output and simple math. Track through
more language features with each chapter and assignment,
until at the end of the semester everyone is working with
overloaded operators and templates and writing their own
iterators and knows all the keywords related to exception
handling.
If you're a student in a class like this, you have my
sympathy, because it's a terrible way to be introduced to
programming.
Once you've learned a small subset of a language like
Python--variables, functions, control flow, arrays, and
dictionaries--then features are no longer the issue. Sure,
you won't know all the software engineery stuff like
exceptions and micromanagement of variable and function
scopes, but it's more important to learn how to turn
thoughts into code before there's any mention of
engineering.
I'd even go so far as to say that most
OOP is irrelevant at this point, too.
My real template for a first programming class is this:
Teach the bare minimum of language features required to do
interesting things. Stop. Spend the rest of the semester
working on short assignments that introduce students to
problem solving and an appreciation for the usefulness of
knowing how to write code.
The Highest-Level Feature of C
At first blush this is going to sound ridiculous, but
bear with me: the highest-level feature of C is the
switch statement.
As any good low-level language should be, C is designed
for transparent compilation. If you take a bit of C source,
the corresponding object code emitted by the compiler--even
a heavily optimizing compiler--roughly mimics the structure
of the original text.
The switch
statement is the only part of
the language where you specify an intent, and the
choice of how to make that a reality is not only out of
your hands, but the resulting code can vary in algorithmic
complexity.
Sure, there are other situations where the compiler can
step in and reinterpret things. A for loop known to
execute three times can be replaced by three
instances of the loop body. In some circumstances, if
you're careful not to trip over all the caveats, a loop can
be vectorized so multiple elements can be processed in each
iteration. None of these are fundamental changes. Your loop
is still conceptually a loop, one way or another.
The possibilities when compiling a switch are much more
varied. It can result in a trivial series of if..else
statements. It can result in a binary
search. Or, if the values are consecutive, a jump table. Or
for a complex sequence, some combination of these
techniques. If each case
simply assigns a
different value to the same variable, then it can be
implemented as a range check and array lookup. The overall
sweep of the solutions, from hundreds of sequential,
mispredicted comparisons to a single memory read, is
substantial.
The same principle is what makes pattern matching so
useful in Erlang and Haskell. You provide this great, messy
bunch of patterns containing a mix of numbers and lists and
tuples and "don't care" values. At compile time the
commonalities, exceptional cases, and opportunities for
table lookups are sorted out, and fairly optimally,
too.
In the compiled code for this bit of Erlang, the tuple
size is used for dispatching to the correct line:
case Position of
    {X, Y, Dir} -> ...;
    {X, Y, Dir, _, _} -> ...;
    {X, Y, _, _} -> ...;
    {X, Y} -> ...
end
The switch statement in C is a signal that
even though you could do it yourself, you'd prefer to have
the compiler act as a robotic assistant who'll take your
spec--a list of values and actions--and write the code for
you.
(If you liked this, you might enjoy On
Being Sufficiently Smart.)
Simplicity is Wonderful, But Not a Requirement
Whenever I write about the overwhelming behind-the-scenes complexity of
modern systems, and the developer frustration that comes
with it, I get mail from computer science students asking
"Am I studying the right field? Should I switch to
something else?"
It seems somewhere between daunting and impossible to
build anything with modern technology if it's that much of
a mess. But despite endless claims by knowledgeable
insiders as to the extraordinary difficulty and deeply
flawed nature of software development, there's no end of
impressive achievements that are at odds with that
pessimism.
How could anyone manage to build an operating system out
of over ten million lines of error prone C and C++? Yet I
can cite three easy examples: Windows, OS X, and Linux.
How could anyone craft an extravagant 3D video game with
pointers and manual memory management woven throughout a
program that has triple the number of lines as the one in
the space shuttle's main computer? Yet dozens of such games
are shipped each year.
How could anyone write an executable specification of a
superscalar, quad-core CPU with 730,000,000 transistors?
One that includes support for oddball instructions that
almost made sense in the 1970s, multiple floating point
systems (stack-based and vector-based), and in addition to
32-bit and 64-bit environments also includes a full 16-bit
programming model that hasn't been useful since the mid
1990s? Yet the Intel i7 powers millions of laptops and
desktops.
If you wallow in the supposed failure of software
engineering, then you can convince yourself that none of
these examples should actually exist. While there's much to
be said for smallness and simpleness, it's clearly not a
requirement when it comes to changing the world. And
perhaps there's comfort in knowing that if those crazy
people working on their overwhelmingly massive systems are
getting them to work, then life is surely much easier for
basement experimenters looking to change the world in
smaller ways.
(If you liked this, you might enjoy Building Beautiful Apps from Ugly Code.)
Don't Be Distracted by Superior Technology
Not long after I first learned C, I stumbled across a
lesser-used language called Modula-2. It was designed by
Niklaus Wirth who previously created Pascal. While Pascal
was routinely put down for being awkwardly restrictive,
Wirth nudged and reshaped the language into Modula-2,
arguably the finest systems-level programming language of
its day.
Consider that Modula-2 had the equivalent of C++
references from the start (and for the record, so did
Pascal and ALGOL). Most notably, if you couldn't guess from
the name, Modula-2 had a true module system that has
managed to elude the C and C++ standards committees for
decades.
My problem became that I had been exposed to the Right
Approach to separately compiled modules, and going back to
C felt backward--even broken.
When I started exploring functional programming, I used
to follow the Usenet group comp.lang.functional. A common
occurrence was that someone struggling with how to write
programs free of destructive updates would ask a question.
As the thread devolved into bickering about type systems
and whatnot, someone would inevitably point out a language
that gracefully handled that particular issue in a much
nicer way than Haskell, ML, or Erlang.
Except that the suggested language was an in-progress
research project.
The technology world is filled with cases where smart
and superior alternatives exist, but their existence makes
no difference because you can't use them. 1980s UNIX was
incredibly stable compared to MS-DOS, but it was irrelevant
if you intended to use MS-DOS software. Clojure and Factor
are wonderful languages, but if you want to write iOS games
then you're better off pretending you've never heard of
them. Not only are they not good options for iOS, at least
not at the moment, but going so against the grain brings
extra work and headaches with it.
Words like better, superior, and
right are misleading. Yes, Modula-2 has a beautiful
module system, but that's negated by being a fringe
language that isn't likely to be available from the start
when exciting new hardware is released. Erlang isn't as
theoretically beautiful as those cutting-edge research
languages, but it's been through the forge of shipping
large-scale systems. What may look like warts upon first
glance may be the result of pragmatic choices.
There's much more fun to be had building things than
constantly being distracted by better technology.
(If you liked this, you might enjoy The Pace of Technology is Slower than You
Think.)
Expertise, the Death of Fun, and What to Do About
It
I've started writing this twice before. The first time
it turned into Simplicity is Wonderful,
But Not a Requirement. The second time it ended up as
Don't Be Distracted by Superior
Technology. If you re-read those you might see bits and
pieces of what I've been wanting to say, which goes like
this:
There is danger in becoming an expert. Long-term
exposure to programming, coding, software
development--whatever you want to call it--changes you. You
start to recognize the extreme complexity in situations
where there doesn't need to be any, and it eats at you. You
realize how broken the tools are. You discover bygone
flashes of amazing beauty in old systems that have been set
aside in favor of the way things have always been done.
This is a bad line of thinking.
It's why you run into twenty-year veteran coders who can
no longer write
FizzBuzz. It's why people right out of school often
create impressive and impossible-seeming things, because
they haven't yet developed an aesthetic that labels all of
that successful hackery as ugly and wrong. It's why some
programmers migrate to more and more obscure languages,
trading productivity for poetic tinkering.
Maybe a better title for this piece is "So You've Become
Jaded and Dissatisfied. Now What?"
Cultivate a "try it first" attitude. Yes, it's
amusing to read about those silly developers who can't
write FizzBuzz. But your first reaction should be to set
aside the article and try implementing it
yourself.
Active learning or bust. Don't bother with
tutorials or how-to books unless you're going to use the
information immediately. Fire up your favorite interpreter
and play along as you read. Don't take the author's word
for anything; prove it to yourself. Do the exercises and
invent your own.
Be realistic about the limitations of your favorite
programming language. I enjoy Erlang, but it's a puzzle language, meaning that some truly
trivial problems don't have a straightforward mapping to
the strengths of the language (such as most algorithms
based around destructive array updates). When I don't have
a clear picture of what features I'm going to need, I reach
for something with few across-the-board sticking points,
like Python. Sometimes the cleanest approach involves
straightforward loops and counters and return
statements right there in the middle of it all.
Let ease of implementation trump perfection. Yes,
yes, grepping an XML file is fundamentally wrong...but it
often works and is easier than dealing with an XML parsing
library. Yes, you're supposed to properly handle exceptions
that get thrown and always check for memory errors and
division by zero and all that. If you're writing code for a
manned Mars mission, then please take the time to do
it right. But for a personal tool or a prototype, it's okay
if you don't. Really. It's better to focus on the fun parts
and generating useful results quickly.
Exploring the Lower Depths of Terseness
There's a 100+ year old system for recording everything
that happens in a baseball game. It uses a sheet of paper
with a small box for each batter. Whether that batter gets
to base or is out--and why--gets coded into that box. It's
a scorekeeping method that's still in use at the
professional and amateur level, and at major league games
you can buy a program which includes a scorecard.
What's surprising is how cryptic the commonly used
system is. For starters, each position is identified by a
number. The pitcher is 1. The center fielder 8. If the ball
is hit to the shortstop who throws it to the first baseman,
the sequence is 6-3. See, there isn't even the obvious
mnemonic of the first, second, and third basemen being
numbered 1 through 3 (they're 3, 4, and 5).
In programming, no one would stand for this. It breaks
the rule of not having magic numbers. I expect the center
fielder would be represented by something like:
visitingTeam.outfield.center
The difference, though, is that programming isn't done
in real-time like scorekeeping. After the initial learning
curve, 8 is much more concise, and the terseness is a
virtue when recording plays with the ball moving between
multiple players. Are we too quick to dismiss extremely
terse syntax and justify long-winded notations because
they're easier for the uninitiated to read?
Suppose you have a file where each line starts with a
number in parentheses, like "(124)", and you want to
replace that number with an asterisk. In the vim editor the
keystrokes for this are "^cib*" followed by the escape
key. "^" moves to the start of the line. The "c"
means you're going to change something, but what? The
following "ib" means "inner block" or roughly "whatever is
inside parentheses." The asterisk fills in the new
character.
Once you get over the dense notation, you may notice a
significant win: this manipulation of text in vim can be
described and shared with others using only five
characters. There's no "now press control+home"
narrative.
The ultimate in terse programming languages is J. The
boring old "*" symbol not only multiplies two numbers, but
it pairwise multiplies two lists together (as if a map
operation were built in) and also multiplies a scalar value
with each element of a list, depending on the types of its
operands.
That's what happens with two operands anyway. Each verb
(the J terminology for "operator") also works in a unary
fashion, much like the minus sign in C represents both
subtraction and negation. When applied to a lone value "*"
is the sign function, returning either -1, 0, or 1 if the
operand is negative, zero, or positive.
So now each single-character verb has two meanings, but
it goes further than that. To increase the number of
symbolic verbs, each can have either a period or a colon as
a second character, and then each of these has both one
and two operand versions. "*:" squares a single parameter
or returns the nand ("not and") of two parameters. Then
there's the two operand version of "*." which computes the
least common multiple, and I'll give it up now before
everyone stops reading.
Here's the reason for this madness: it allows a wide
range of built-in verbs that never conflict with
user-defined, alphanumeric identifiers. Without referencing
a single library you've got access to prime number
generation ("p:"), factorial ("!"), random numbers ("?"),
and matrix inverse ("%.").
Am I recommending that you switch to vim for text
editing and J for coding? No. But when you see an expert
working with those tools, getting results with fewer
keystrokes than it would take to import a Python module,
let alone the equivalent scripting, then...well, there's
something to the terseness that's worth remembering. It's
too impressive to ignore simply because it doesn't line up
with the prevailing aesthetic for readable code.
(If you liked this, you might enjoy Papers from the Lost Culture of Array
Languages.)
Remembering a Revolution That Never Happened
Twenty-three years ago, a book by Edward Cohen called
Programming in the 1990s: An Introduction to the
Calculation of Programs was published. It was a glimpse
into the sparkling software development world of the
future, a time when ad hoc coding would be supplanted by
Dijkstra-inspired manipulation of proofs. Heck, no need to
even run the resulting programs, because they're right by
design.
Clearly Mr. Cohen's vision did not come to pass, but I
co-opted the title for this blog.
That book is a difficult read. It starts out as
bright-eyed and enthusiastic as you could expect a computer
science text to be, then rapidly turns into chapter-long
slogs to prove the equivalent of a simple linear search
correct. It wasn't the difficulty that made the program
derivation approach unworkable. Reading and writing music
looks extraordinarily complex and clunky to the
uninitiated, but that's not stopping vast numbers of people
from doing so. The problem is that for almost any
non-trivial program, it's not clear what "correct"
means.
Here's a simple bit of code to write: display a sorted
list of the filenames in a folder. That should take a
couple of minutes, including googling around for how to get
the contents of a directory.
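For the sake of argument, here's roughly what that first
pass might look like in Erlang (list_folder is a name
invented here; file:list_dir and lists:sort do the real
work):
%% Print the folder's filenames in sorted order.
list_folder(Dir) ->
    {ok, Names} = file:list_dir(Dir),
    lists:foreach(fun(Name) -> io:format("~s~n", [Name]) end,
                  lists:sort(Names)).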
Except that on some systems you're getting weird
filenames like "." and ".." that you don't want to
display.
Except that there are also hidden files, either based on
an attribute or a naming convention, and you should ignore
those too.
Except that you need the sort to be case insensitive or
else the results won't make sense to most users.
Except that some people are using spaces between words
and some are using underscores, so they should be treated
the same when sorting.
Except that a naive sort is going to put "File 10"
before "File 9", and while that's logical in the cold
innards of the CPU, it's no excuse to present the data that
way.
And this is a well-understood, weird old relic of a
problem that's nothing compared to all the special
cases and exceptions needed to implement a solid user
experience in a modern app. Making beautiful code ugly--and
maybe impossible to prove correct--by making things easier
for the user is a good thing.
(If you liked this, you might enjoy Write Code Like You Just Learned How to
Program.)
A Short Quiz About Language Design
Suppose you're designing a programming language. What
syntax would you use for a string constant? This isn't a
trick; it's as simple as that. If you want to print
Hello World
then how do you specify a basic
string like that in your language?
I'll give you a moment to think about it.
The obvious solution is to use quotes: "Hello World". After
all, that's how it works in English,
so it's easy to explain to new students of your language.
But then someone is going to ask "What if I want to put a
quotation mark inside a string?" That's a legitimate
question, because it's easy to imagine displaying a string
like:
"April 2013" could not be found.
There are a couple of options to fix this. Some form of
escape character is one, so an embedded quote is preceded
by, say, a backslash. That works, but now you've got to
explain a second concept in order to explain strings.
Another option is to allow both single and double quotes.
If your string contains single quotes, enclose it in double
quotes, and vice-versa. A hand goes up, and someone asks
about how to enter this string:
"April 2013" can't be found.
Ugh. Now you have two kinds of string delimiters, and
you still need escapes. You need to explain these special
cases up front, because they're so easy to hit.
What if, instead of falling back on the unwritten rule of
using single and double quotes, strings were demarcated by
something less traditional? Something that's not common in
Latin-derived languages? I'll suggest a vertical bar:
|"April 2013" can't be found.|
That may be uncomfortable at first glance, but give it a
moment. Sure, a vertical bar will end up in a string at
some point--regular expressions with alternation come to
mind--but the exceptional cases are no longer blatant and
nagging, and you could get through a beginning class
without even mentioning them.
(If you liked this, you might enjoy Explaining Functional Programming to
Eight-Year-Olds.)
Stumbling Into the Cold Expanse of Real
Programming
This is going to look like I'm wallowing in nostalgia, but that's not my intent. Or
maybe it is. I started writing this without a final
destination in mind. It begins with a question:
How did fast action games exist at all on 8-bit
systems?
Those were the days of processors living below the 2 MHz
threshold, with each instruction run to completion before
even considering the next. No floating point math. Barely
any integer math, come to think of it: no multiplication or
division and sums of more than 255 required two
additions.
But that kind of lively statistic slinging doesn't tell
the whole story or else there wouldn't have been so many
animated games running--usually at sixty
frames-per-second--on what appears to be incapable
hardware. I can't speak to all the systems that were
available, but I can talk about the Atari 800 I learned to
program on.
Most games didn't use memory-intensive bitmaps, but a
gridded character mode. The graphics processor converted
each byte to a character glyph as the display was scanned
out. By default these glyphs looked like ASCII characters,
but you could change them to whatever you wanted, so the
display could be mazes or platforms or a landscape, and
with multiple colors per character, too. Modify one of the
character definitions and all the references to it would be
drawn differently next frame, no CPU work involved.
Each row of characters could be pixel-shifted
horizontally or vertically via two memory-mapped hardware
registers, so you could smoothly scroll through levels
without moving any data.
Sprites, which were admittedly only a single color each,
were merged with the tiled background as the video chip
scanned out the frame. Nothing was ever drawn to a buffer,
so nothing needed to be erased. The compositing happened as
the image was sent to the monitor. A sprite could be moved
by poking values in position registers.
The on-the-fly compositing also checked for overlap
between sprites and background pixels, setting bits to
indicate collisions. There was no need for even simple
rectangle intersection tests in code, given pixel-perfect
collision detection at the video processing level.
What I never realized when working with all of these
wonderful capabilities was that to a large extent I was
merely scripting the hardware. The one sound and two video
processors were doing the heavy lifting: flashing colors,
drawing characters, positioning sprites, and reporting
collisions. It was more than visuals and audio; I didn't
even think about where random numbers came from. Well,
that's not true: I know they came from reading memory
location 53770 (it was a pseudo-random number generator
that updated every cycle).
When I moved to newer systems I found I wasn't nearly
the hotshot game coder I thought I was. I had taken for
granted all the work that the dedicated hardware handled,
allowing me to experiment with game design ideas.
On a pre-Windows PC of the early 1990s, I had to write
my own sprite-drawing routines. Real ones, involving actual
drawing and erasing. Clipping at the screen edges? There's
something I never thought about. The Atari hardware
silently took care of that. But before I could draw
anything, I had to figure out what data format to use and
how to preprocess source images into that format. I
couldn't start a tone playing with two register settings; I
had to write arcane sound mixing routines.
I had wandered out of the comfortable realm where I
could design games in my head and make them play out on a
TV at my parents' house and stumbled into the cold expanse
of real programming.
(If you liked this, you might enjoy A
Personal History of Compilation Speed.)
Flickr's Redesign is a Series of Evolutionary
Changes
After years of teetering on the brink of relevance,
Flickr is back in the
limelight thanks in part to a more modern appearance. But
here's something that may not be so obvious: it wasn't a
sudden reworking of Flickr. It's been evolving through a
series of smaller improvements over the course of fifteen
months.
In February 2012, photo thumbnails presented as a grid of
small squares floating in a sea of whitespace were replaced
with the
justified view: images cropped to varying widths and
packed into aesthetically pleasing rows in the browser
window. Initially this was only for the favorites page, but
a few months later it was applied to the amalgamation of
recent photos from your contacts, then to the photos in
topic-oriented groups.
In December 2012, the iOS Flickr app was replaced with a
completely rewritten, better designed version. It sounds
drastic, rewriting an app, but it's only a client for
interacting with the Flickr database. The core of Flickr
remained the same.
Around the same time, the justified view spread to the
Explore (top recent photos) page.
When the May 2013 redesign hit, most of the pieces were
already in place. Sure, there was some visual design work
involved, but if you look closely one of the most striking
changes is that the justified view is now used for
individual photostreams.
I love stories like this, because it's my favorite way
to develop: given an existing, working application, pick
one thing to improve. Not a full rewrite. Not a Perl 6
level of manic redesign. Not a laundry list of changes. One
thing. The lessons learned from that one improvement may
lead to further ideas to try which will lead to still
further ideas. Meanwhile you're dealing with an
exponentially simpler problem than architecting an entire
lineage of such theoretical improvements all at once.
(If you liked this, you might enjoy What Do People Like?)
Getting Comfortable with the Softer Side of
Development
When I was in college, I took an upper-level course
called "Operating Systems." It was decidedly hardcore:
preemptive multitasking and synchronization, task
scheduling, resource management, deadlock avoidance, and so
on. These were the dark, difficult secrets that few people
had experience with. Writing one's own operating system was
the pinnacle of geeky computer science aspirations.
The most interesting thing about that course, in
retrospect, is what wasn't taught: anything about how
someone would actually use an operating system. I
don't mean flighty topics like flat vs. skeuomorphic
design, but instead drop way down to something as
fundamental as how to start an application or even how
you'd know which applications are available to choose from.
Those were below the radar of the computer science
definition of "operating system." And not just for the
course, either. Soft, user experience topics were nowhere
to be found in the entire curriculum.
At this point, I expect there are some reactions to the
previous two paragraphs brewing:
"You're confusing computer science and
human-computer interaction! They're two different
subjects!"
"Of course you wouldn't talk about those
things in an operating systems course! It's about the
lowest-level building blocks of an OS, not about user
interfaces."
"I don't care about that non-technical stuff!
Some designer-type can do that. I'm doing the
engineering work."
There's some truth in each of these--and the third is
simply a personal choice--but all it takes is reading a
review of OS X or Windows where hundreds of words are
devoted to incremental adjustments to the Start menu and
dock to realize those fluffy details aren't so fluffy after
all. They matter. If you want to build great software, you
have to accept that people will dismiss your application
because of an awkward UI or font readability issues,
possibly switching to a more pleasing alternative that was
put together by someone with much less coding skill than
you.
So how do you nudge yourself in that direction without
having to earn a second degree in a softer, designery
field?
Learn basic graphic design. Not so much how to
draw things or create your own artistic images (I'm
hopeless in that regard), but how to use whitespace, how
fonts work together, what a good color scheme looks like.
Find web pages and book covers that you like and
deconstruct them. Take the scary step of starting with a
blank page and arranging colors and fonts and text boxes on
it. Hands-on experimentation is the only way to get better
at this.
Read up on data visualization. Anything by Edward
Tufte is a good place to start.
Foster a minimalist aesthetic. If you're striving
for minimalism, then you're giving just as much thought to
what to leave out as what to include, and you need to make
hard choices. That level of thought and focus is only going
to make your application better. You can go too far with
minimalism, but a quick glance around the modern software
world shows that this isn't a major worry.
Don't build something a certain way simply because
"that's how it's always been done." There's strong
programmer impulse to clone, to implement what you've
already seen. That can result in long eras of misguidedly
stagnant IDEs or calculator apps, because developers have
lost sight of the original problem and are simply rehashing
what they're familiar with.
Optimize for things that directly affect users.
Speed and memory are abstract in most cases. Would you even
notice if an iPhone app used 10 megabytes instead of 20?
Documentation size and tutorial length are more concrete,
as are the number of steps it takes to complete common
tasks.
Tips for Writing Functional Programming Tutorials
With the growing interest in a functional programming
style, there are more tutorials and blog entries on the
subject, and that's wonderful. For anyone so inclined to
write their own, let me pass along a few quick tips.
Decide if you're writing a tutorial about functional
programming or a specific language. If you're covering
the feature set of Haskell, from the type system to
laziness to monads, then you're writing about Haskell. If
you show how to explore interesting problems and the
executable parts of your tutorial happen to be written in
Haskell, then you're writing about functional programming.
See the difference?
Let types explain themselves. The whole point of
type inference is that it's behind the scenes and
automatic, helping you write more correct code with less
bookkeeping. Don't negate that benefit by talking about the
type system explicitly. Let it be silently assimilated
while working through interesting examples and exercises
that have nothing to do with types.
Don't talk about currying. There's a fascinating
theoretical journey from a small set of expressions--the
lambda calculus--to a more useful language. With just the
barest of concepts you can do seemingly crazy things like
recursion without named functions and using single-argument
functions to mimic functions that take multiple arguments
(a.k.a. currying). Don't get so swept up in that theory
that you forget the obvious: in any programming language
ever invented, there's already a way to easily
define functions of multiple arguments. That you can build
this up from more primitive features is not useful or
impressive to non-theoreticians.
Make sure you've got meaningful examples. If you
have functions named foo or bar, then that's a warning sign
right there. If you're
demonstrating factorials or the Fibonacci sequence without
a reason for calculating them (and there are reasons, such
as permutations), then choose something else. There are
curious and approachable problems everywhere. It's
easy to write a dog_years
function based on
the incorrect assumption that one human year equals seven
dog years. There's a more accurate computation where the
first two years of a dog's life are 10.5 human years each,
then each year after that maps to four human years. That's
a perfect beginner-level problem.
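A sketch of the more accurate rule in Erlang, using the
function name already suggested (the guard split is just
one reasonable way to write it):
%% Convert a dog's age in years to the equivalent human age:
%% the first two years count as 10.5 human years each, and
%% every year after that counts as four.
dog_years(DogAge) when DogAge =< 2 ->
    DogAge * 10.5;
dog_years(DogAge) ->
    2 * 10.5 + (DogAge - 2) * 4.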
(If you liked this, you might enjoy You, Too, Can Be on the Cutting Edge of
Functional Programming Research.)
Organizational Skills Beat Algorithmic Wizardry
I've seen a number of blog entries about technical
interviews at high-end companies that make me glad I'm not
looking for work as a programmer. The ability to implement
oddball variants of heaps and trees on the spot. Puzzles
with difficult constraints. Numeric problems that would
take ten billion years to complete unless you can cleverly
analyze and rephrase the math. My first reaction is wow,
how do they manage to hire anyone?
My second reaction is that the vast majority of
programming doesn't involve this kind of algorithmic
wizardry.
When it comes to writing code, the number one most
important skill is how to keep a tangle of features from
collapsing under the weight of its own complexity. I've
worked on large telecommunications systems, console games,
blogging software, a bunch of personal tools, and very
rarely is there some tricky data structure or algorithm
that casts a looming shadow over everything else. But
there's always lots of state to keep track of, rearranging
of values, handling special cases, and carefully working
out how all the pieces of a system interact. To a great
extent the act of coding is one of organization.
Refactoring. Simplifying. Figuring out how to remove
extraneous manipulations here and there.
This is the reason there are so many accidental
programmers. You don't see people casually become
neurosurgeons in their spare time--the necessary training
is specific and intense--but lots of people pick up enough
coding skills to build things on their own. When I learned
to program on an 8-bit home computer, I didn't even know
what an algorithm was. I had no idea how to sort data, and
fortunately for the little games I was designing I didn't
need to. The code I wrote was all about timers and counters
and state management. I was an organizer, not a genius.
I built a custom tool a few years ago that combines
images into rectangular textures. It's not a big
program--maybe 1500 lines of Erlang and C. There's one
little twenty line snippet that does the rectangle packing,
and while it wasn't hard to write, I doubt I could have
made it up in an interview. The rest of the code is for
loading files, generating output, dealing with image
properties (such as origins), and handling the data flow
between different parts of the program. This is also the
code I tweak whenever I need a new feature, better error
handling, or improved usability.
That's representative of most software development.
(If you liked this, you might enjoy Hopefully More Controversial Programming
Opinions.)
Getting Past the Cloning Instinct
When I write about creativity or similarly non-technical
subjects, I often get mail from pure coders asking how they
can become better at design. There's an easy response: take
an idea from your head all the way to completion.
Almost certainly that idea isn't as well thought-out as
you'd hoped, and you'll have to run experiments and backtrack and work out
alternate solutions. But if you stick with it and build
everything from the behind-the-scenes processing to the UI,
make it simple enough for new users to learn without you
being in the room to guide them, and agonize over all the
choices and details that define an app, then
congratulations! You've just gone through the design
process.
Obvious advice? It would be, except there's a
conflicting tendency that needs to be overcome first: the
immediate reaction upon seeing an existing, finished
product of "I want to make my own version of that."
It's a natural reaction, and the results are everywhere.
If a fun little game gets popular, there's an inevitable
flood of very similar games. If an app is useful, there
will be me-too versions, both commercial and open source.
Choosing to make something that already exists shifts the
problem from one of design to one that's entirely
engineering driven. There's a working model to use for
reference. Structure and usability problems are already
solved. Even if you desperately want to change one thing
you think the designer botched, you're simply making a
tweak to his or her work.
Originality is not the issue; what matters is the
process that you go through. If your work ends up having
similarities to something that came before it, that's
completely different than if you intentionally set out to
duplicate that other app in the first place.
The cloning instinct makes sense when you're first
learning development. Looking at an existing application
and figuring out how to implement it is much easier than
creating something new at the same time. But at some point,
unless you're happy being just the programmer, you need to
get beyond it.
(If you liked this, you might enjoy The Pure Tech Side is the Dark Side.)
How much memory does malloc(0) allocate?
On most systems, this little C program will soak up all
available memory:
#include <stdlib.h>
int main(void) {
    while (1) {
        malloc(0);
    }
}
so the answer is not the obvious "zero." But before
getting into malloc(0), let's look at the simpler case of
malloc(1).
There's an interesting question new C programmers ask about
malloc: "Given a pointer to dynamically
allocated memory, how can I determine how many bytes it
points to?" The answer, rather frustratingly, is "you
can't." But when you call free
on that same
pointer, the memory allocator knows how big the block is,
so it's stored somewhere. That somewhere is commonly
adjacent to the allocated memory, along with any other
implementation-specific data needed for the allocator.
In the popular dlmalloc
implementation,
between 4 and 16 bytes of this overhead are added to a
request, depending on how the library is configured and
whether pointers are 32 or 64 bits. 8 bytes is a reasonable
guess for a 64-bit system.
To complicate matters, there's a minimum block size that
can be returned by malloc. Alignment is one
reason. If there's an integer size secretly prepended to
each block, then it doesn't make sense to allocate a block
smaller than an integer. But there's another reason: when a
block is freed, it gets tracked somehow. Maybe it goes into
a linked list, maybe a tree, maybe something fancier.
Regardless, the pointers or other data to make that work
have to go somewhere, and inside the just-freed block is a
natural choice.
In dlmalloc, the smallest allowed
allocation is 32 bytes on a 64-bit system. Going back to
the malloc(1)
question, 8 bytes of overhead
are added to our need for a single byte, and the total is
smaller than the minimum of 32, so that's our answer:
malloc(1)
allocates 32 bytes.
Now we can approach the case of allocating zero bytes.
It turns out there's a silly debate about the right thing
to do, and it hasn't been resolved, so technically
allocating zero bytes is implementation-defined behavior.
One side thinks that malloc(0)
should return a
null pointer and be done with it. It works, if you don't
mind a null return value serving double duty. It can either
mean "out of memory" or "you didn't request any
memory."
The more common scheme is that malloc(0)
returns a unique pointer. You shouldn't dereference that
pointer because it's conceptually pointing to zero bytes,
but we know from our adventures above that at least
dlmalloc
is always going to allocate a 32 byte
block on a 64-bit system, so that's the final answer: it
takes 32 bytes to fulfill your request for no memory.
[EDIT: I modified the last two paragraphs to correct
errors pointed out in email and a discussion thread on
reddit. Thank you for all the feedback!]
(If you liked this, you might enjoy Another Programming Idiom You've Never Heard
Of.)
Purely Functional Photoshop
One of the first things you learn about Photoshop--or
any similarly styled image editor--is to use layers for
everything. Don't modify existing images if you can help
it. If you have a photo of a house and want to do some
virtual landscaping, put each tree in its own layer. Want
to add some text labels? More layers.
The reason is straightforward: you're keeping your
options open. You can change the image without overwriting
pixels in a destructive way. If you need to save out a
version of the image without labels, just hide that layer
first. Maybe it's better if the labels are slightly
translucent? Don't change the text; set the opacity of the
layer.
This stuff about non-destructive operations sounds like
something from a functional programming tutorial. It's easy
to imagine how all this layer manipulation could look
behind the scenes. Here's a list of layers using Erlang
notation:
[House, MapleTree, AshTree, Labels]
If you want to get rid of the label layer, return a new
list:
[House, MapleTree, AshTree]
Or to reverse the order of the trees, make another new
list:
[House, AshTree, MapleTree, Labels]
Again, nothing is being modified. Each of these simple
manipulations returns a brand new list. Performance-wise
there are no worries no matter how much data
House
represents. Each version of the list is
referencing the same data, so nothing is being copied. In
Erlang, each of these alternate list transformations
creates three or four conses (six or eight memory cells
total), which is completely irrelevant.
Now what about changing the opacity of the labels layer?
Realistically, a layer should be a dictionary of some sort,
maybe a property list:
[{name,"labels"},...]
If one of the possible properties is opacity, then the
goal is to return a new list where the layer looks like
this:
[{name,"labels"},{opacity,0.8}]
Is this all overly obvious and simplistic? Maybe,
especially if you have a functional programming background,
but it's an interesting example for a couple of reasons.
Non-destructive manipulations are the natural approach;
there's no need to keep saying "I know, I know, this may
seem awkward, but bear with me, okay?" It also shows the
most practical reason for using a language like Erlang,
Haskell, or Lisp: so you can easily work with symbolic
descriptions of data instead of the raw data itself.
Why Do Dedicated Game Consoles Exist?
The announcement of the Nintendo 2DS has reopened an old
question: "Should Nintendo give up on designing their own
hardware and write games for existing platforms like the
iPhone?" A more fundamental question is "Why do dedicated
game consoles exist in the first place?"
Rewind to the release of the first major game system
with interchangeable cartridges, the Atari VCS (a.k.a.
Atari 2600) in 1977. Now instead of buying that game
system, imagine you wanted a general purpose PC that could
create displays of the same color and resolution as the
Atari. What would the capabilities of that mythical 1977 PC
need to be?
For starters, you'd need a 160x192 pixel display with a
byte per pixel. Well, technically you'd need 7-bits, as the
2600 can only display 128 colors, but a byte per pixel is
simpler to deal with. That works out to 30,720 bytes for
the display. Sounds simple enough, but there's a major
roadblock: 4K of RAM in 1977 cost roughly $125. To get
enough memory for our 2600-equivalent display, ignoring
everything else, would have been over $900.
For comparison, the retail price of the Atari 2600 was
$200.
How did Atari's engineers do it? By cheating. Well,
cheating is too strong of a word. Instead of building a
financially unrealistic 30K frame buffer, they created an
elaborate, specialized illusion. They built a video
system--a monochrome background and two single-color
sprites--that was only large enough for a single horizontal
line. To get more complex displays, game code wrote and
rewrote that data for each line on the TV screen. That let
the Atari 2600 ship with 128 bytes of RAM instead of
the 30K of our fantasy system.
Fast-forward fourteen years to 1991 and the introduction
of the Super Nintendo Entertainment System. Getting an
early 90s PC to equal the color and resolution of the SNES
is easy. The 320x200 256-color VGA mode is a good match for
most games. The problem is no longer display quality. It's
motion.
The VGA card's memory was sitting on the other side of a
strained 8-bit bus. Updating 64,000 pixels at the common
Super Nintendo frame rate of 60fps wasn't possible, yet the
SNES was throwing around multiple parallaxing backgrounds
and large animated objects.
Again, it was a clever focus that made the console so
impressive. The display didn't exist as a perfect grid of
pixels, but was diced into tiles and tilemaps and sprites
and palettes which were all composited together with no
involvement from the underpowered 16-bit CPU. There was no
argument that a $2000 PC was a more powerful
general-purpose machine, likely by several orders of
magnitude, but that didn't stop the little $200 game system
from providing an experience that the PC couldn't.
The core of both of these examples--graphics on a 2D
screen--is a solved problem. Even free-with-contract
iPhones have beautiful LCDs overflowing with resolution and
triangle-drawing ability, so it's hard to justify a
hand-held system that largely hinges on having a similar or
worse display. There are other potential points of
differentiation, of course. Tactile screens. Head-mounted
displays. 3D holographic projection. But eventually it all
comes down to this: Is the custom hardware so fundamentally
critical to the experience that you couldn't provide it
otherwise? Or is the real goal to design great games and
have people play them, regardless of which popular system
they run on?
(If you liked this, you might enjoy Nothing Like a Little Bit of Magic.)
Dynamic Everything Else
Static vs. dynamic typing is one of those recurring
squabbles that you should immediately run away from. None
of the arguments matter, because there are easy to cite
examples of big, famous applications written using each
methodology. Then there are confusing cases like large C++
apps that use dynamically typed Lua for scripting. And
right about now, without fail, some know-it-all always
points out that dynamic typing is really a subset of static
typing, which is a lot like defining a liberal as a
conservative who holds liberal views, and nothing
worthwhile comes from this line of reasoning.
I have no interest in the static vs. dynamic typing
dispute. What I want is dynamic everything else.
Sitting in front of me is a modern, ultra-fast MacBook
Pro. I know it can create windows full of buttons and
checkboxes and beautifully rendered text, because I see
those things in every app I use. I should be able to start
tapping keys and, a command or two later, up pops a live OS
X window that's draggable and receives events. I should be
able to add controls to that window in a playful sort of
way. Instead I have to create an XCode project (the first
obstacle to creative fiddling), compile and run to see what
I'm doing (the second), then quit and re-run it for each
subsequent change (the third).
There's an impressive system for rendering shapes and
curves and fonts under OS X. Like the window example above,
I can't interactively experiment with these capabilities
either. I end up using a vector-based image editor, but the
dynamism goes away when I save what I've created and load
it into a different app. Why must the abilities to grab
curves and change font sizes be lost when I export? Why
can't the editing features be called forth for any image
made of vectors?
I know how to solve these problems. They involve writing
custom tools, editors, and languages. Switching to a
browser and HTML is another option, with the caveat that
the curves and glyphs being manipulated are virtual
entities existing only inside the fantasy world of the
browser.
That aside, it is worth taking a moment to think about
the expectations which have been built up over the decades
about how static and inflexible most computing environments
are.
Code is compiled and linked and sealed in self-contained
executables. There's no concept of live-editing, of
changing a running system, or at least that's relegated to
certain interpreted languages or the distant memories of
Smalltalk developers. Reaching for an open source JPEG
library is often easier than using the native operating
system--even though the OS is clearly capable of loading
and displaying JPEGs--especially if you're not using the
language it was designed to interface with.
We've gotten used to all of this, but there's no
fundamental law dictating systems must be designed this
way.
(If you liked this, you might enjoy The UNIX Philosophy and a Fear of
Pixels.)
What Are You The World's Foremost Authority Of?
It started with schools of fish, then butterflies.
I was given a series of animals, with some art for each,
and my job was to make them move around in ways that looked
believable. Well, as believable as you could get on a
16-bit game system. It was more about giving the impression
of how fish would swim in a river than being truly
realistic. My simple schooling algorithm and obsessive
fiddling with acceleration values must have worked, because
I ended up demoing those fish half a dozen times to people
who stopped by to see them.
Later for an indie game I designed, I implemented a
variety of insects: swooping hornets, milling bees,
scuttling centipedes, marching ants--sixteen in all. (My
wife, Jessica, gets half the credit here; she did all the
art and animation.)
I certainly never had it as my goal, but I've gotten
pretty good at implementing naturalistic behaviors.
Not long ago I took a shot at getting loose flower
petals to fly in the breeze. I didn't have a plan and
didn't know if I could come up with something workable, but
it only took a few hours. It's also the only time I've used
calculus in any code I've written, ever. (Don't be overly
impressed; it's about the simplest calculus imaginable.) In
one of Bret Victor's wonderful talks, he proposes that
mimicking leaf motion by tracing it on a touchscreen is
easier than building it programmatically. That's the only
time I've disagreed with him. I had already started
thinking about how to model a falling leaf.
What kind of company am I in with this odd talent? Is
there a society of insect movement simulation designers
that I'm not familiar with? Or have I accidentally built up
a base of arcane knowledge and experience that's unique to
me alone?
Let me ask another question, one that I don't mean to
sound condescending or sarcastic in the least: What are
you the world's foremost authority of?
Surely it's something. If you have your own peculiar
interests and you work on projects related to them, then
you've likely implemented various solutions and know the
advantages and drawbacks of each. Or maybe you've taken one
solution and iterated it, learning more with each version.
The further you go, the more likely that you're doing
things no one else has done in the same way. You're
becoming an authority.
And if you're stumped and can't think of anything you've
worked on that isn't in pretty much the same exact
territory as what hundreds of other people have done, then
it's time to fix that. Go off in a slightly odd direction.
Re-evaluate one of your base assumptions. Do something
completely random to shake things up. You may end up an
authority in one small, quirky area, but you'll still be an
authority.
(If you liked this, you might enjoy Constantly Create.)
Three Years in an Alternate Universe
My first post-college programming job was with Ericsson
Network Systems in the early 1990s. I had similar offers
from three other hard to differentiate telecom companies.
The main reason I went with Ericsson was because the word
"engineer" was in the title, which I thought sounded
impressive. I stayed until I got the three year itch and
moved on without looking back, but during those three years
I never realized how out of the ordinary that job was.
I was paid overtime. Yes, overtime as a salaried
software engineer. There was an unpaid five-hour gap
between forty and forty-five hours, but everything after
that was paid in full. When I worked 65 hour crunch weeks,
I earned 50% more pay.
One in six software engineers were women. Well,
okay, on an absolute scale that's not a big number. But in
comparison to workplaces I've been in since then, where
it's been one in twenty or even a flat-out zero, it's a
towering statistic. Note that I'm only including people who
designed and wrote code for massively concurrent telephone
exchanges as their primary jobs, not non-technical
managerial or support roles.
Since then, I know that unpaid crunch time is how things
work, and blog complaints about this being free labor are
perennial fountains of karma. Likewise, there's much
lamenting the abysmally low numbers of women in software
development positions. But for three years, when I didn't
have enough life experience to know otherwise, I worked in
an alternate universe where these problems didn't exist to
the degree that I've seen since.
[EDIT: I remember the number being closer to one in
three, and I thought I still had my old department
directory to prove it, but I didn't. Instead I downplayed
the numbers, and "one in six" is close to the overall
average for engineers. That watered down the whole
piece.]
C is Lower Level Than You Think
Here's a bit of code that many new C programmers have
written:
for (int i = 0; i < strlen(s); i++) {
...
}
The catch is that strlen
is executed in
each iteration, and as it involves looking at every
character in search of a null, it's an unintentional
n-squared loop. The right solution is to assign the length
of the string to a local variable before the loop and check
that.
"That's just busywork," says our novice coder, "modern
compilers are smart enough to do that kind of trivial
optimization."
As it turns out, this is much trickier to automate than
may first appear. It's only safe if it can be guaranteed
that the body of the loop doesn't modify the string, and
that guarantee in C is hard to come by. All bets are off
after a single external function call, because memory used
by the string could be referenced somewhere else and
modified by that call. Most bets are off after a single
store through a pointer inside the loop, because it could
be pointing to the string passed to strlen.
Actually, it's even worse than that: any time you write a
value to memory you could be changing the value of any
variable in memory. Determining that a[i]
can
be cached in a register across even a single memory write
is unsolvable in the general case.
(To control the chaos, the C99 standard includes the
restrict qualifier, a way to assert that a pointer is used in a restricted
manner. It's only an affirmation on the part of the
programmer, and is not checked by the compiler. If you get
this wrong the results are undefined.)
The GCC C compiler, as it turns out, will move the
strlen
call out of the loop in some cases.
Don't get too excited, because now you've got an algorithm
that's either n-squared or linear
depending on the compiler. You could also say the hell with
all of this and write a naive optimizer that always lifts
the strlen
out of a for-loop expression.
Great! It works in the majority of real-life cases. But now
if you go and write an algorithm, even a contrived one,
that's dependent on the string length changing inside the
loop...uh oh, now the compiler is transforming your valid
intent into code that doesn't work. Do you want this kind
of nonsense going on behind the scenes?
The clunky "manually assign the length to a constant"
solution is a better one across the board. You're clearly
stating that it doesn't matter what external functions do
or that there are other writes to memory. You've already
grabbed the value you want and that's that.
(If you liked this, you might enjoy How much memory does malloc(0)
allocate?)
Self-Imposed Complexity
Bad data visualizations are often more computationally
expensive--and harder to implement--than clear versions. A
3D line graph is harder to read than the standard 2D
variety, yet the code to create one involves the additional
concepts of filled polygons, shading, viewing angle, and
line depth. An exploded 3D pie chart brings nothing over an
unexploded version, and both still miss out on the
simplicity of a flat pie chart (and there's a strong case
to be made for using the even simpler bar chart
instead).
Even with a basic bar chart there are often
embellishments that detract from the purpose of the chart,
but increase the amount of interface for creating them and
code to draw them: bars with gradients or images, drop
shadows, unnecessary borders. Edward Tufte has deemed these
chartjunk. A bar chart with all of the useless fluff
removed looks like something that, resolution aside, could
have been drawn on a computer from thirty years ago, and
that's a curious thing.
But what I really wanted to talk about is vector
graphics.
Hopeful graphic designers have been saying for years
that vector images should replace bitmaps for UI elements.
No more redrawing and re-exporting to support a new screen
size. No more smooth curves breaking into jagged pixels
when zoomed-in. It's an enticing proposition, and if it had
been adopted years ago, then the shift to ultra-high
resolution displays would have been seamless--no developer
interaction required.
Except for one thing: realistic, vector icons are more
complicated than they appear. If you look at an Illustrator
tutorial for creating a translucent, faux 3D globe,
something that might represent "the network" or "the
internet," it's not just a couple of Bezier curves and
filled regions. There are drop shadows with soft edges and
blur filters and glows and reflections and tricky
gradients. That's the problem with scalable vectors for
everything. It takes a huge amount of processing to draw
and composite all of these layers of detailed description,
and meanwhile the 64x64 bitmap version was already drawn by
the GPU, and there's enough frame time left to draw
thousands more.
That was the view three or more years ago, when
user-interface accoutrements were thick with gloss and
chrome and textures that you wanted to run your finger over
to feel the bumps. But now looking at the comparatively
primitive, yet aesthetically pleasing icons of iOS 7 and
Windows 8, the idea that they could be live vector
descriptions isn't so outlandish. And maybe what's kept us
from getting there sooner is that it was hard to have
self-imposed restraint amid a whirlwind of so much new
technology. It was hard to say, look, we're going to have a
clean visual language that, resolution aside, could have
worked on a computer from thirty years ago.
Optimization in the Twenty-First Century
I know, I know, don't optimize. Reduce algorithmic
complexity and don't waste time on low-level noise. Or
embrace the low-level and take advantage of magical machine
instructions rarely emitted by compilers. Most of the
literature on optimization focuses on these three
recommendations, but in many cases they're no longer the
best place to start. Gone are the days when you could look
like a superstar by replacing long, linear lookups with a
hash table. Everyone is already using the hash table from
the get-go, because it's so easy.
And yet developers are still having performance
problems, even on systems that are hundreds, thousands, or
even over a hundred-thousand times faster than those which
came before. Here's a short guide
to speeding up applications in the modern world.
Get rid of the code you didn't need to write in the
first place. Early programming courses emphasize
writing lots of code, not avoiding it, and it's a hard
habit to break. The first program you ever wrote was
something like "Hello World!" It should have looked like
this:
Hello world!
There's no code. I just typed "Hello world!" Why would
anyone write a program for that when it's longer
than typing the answer? Similarly, why would anyone compute
a list of prime numbers at runtime--using some kind of
sieve algorithm, for example--when you can copy a list of
pre-generated primes? There are lots of applications out
there with, um, factory manager caching classes in them
that sounded great on paper, but interfacing with the extra
layer of abstraction is more complex than what life was
like before writing those classes. Don't write that stuff
until you've tried to live without it and fully understand
why you need it.
Fix that one big, dumb thing. There are some
performance traps that look like everyday code, but can
absorb hundreds of millions--or more--cycles. Maybe the
most common is a function that manipulates long strings,
adding new stuff to the end inside a loop. But, uh-oh,
strings are immutable, so each of these append operations
causes the entire multi-megabyte string to be copied.
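In Erlang terms the trap and the fix look something like
this (both function names are invented for illustration):
%% Quadratic: Acc ++ Line copies the whole accumulator on
%% every iteration.
slow_join(Lines) ->
    lists:foldl(fun(Line, Acc) -> Acc ++ Line end, [], Lines).
%% Linear: gather the pieces and concatenate them once.
fast_join(Lines) ->
    lists:append(Lines).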
It's also surprisingly easy to unintentionally cause the
CPU and GPU to become synchronized, where one is waiting
for the other. This is why reducing the number of times you
hand off vertex data to OpenGL or DirectX is a big deal.
Sending a lone triangle to the GPU can be as expensive as
rendering a thousand triangles. A more obscure gotcha is
that writing to an OpenGL vertex buffer you've already sent
off for rendering will stall the CPU until the drawing is
complete.
Shrink your data. Smallness equals performance on
modern hardware. You'll almost always win if you take steps
to reduce the size of your data. More fits into cache. The
garbage collector has less to trace through and copy
around. Can you represent a color as an RGB tuple instead
of a dictionary with the named elements "red", "green", and
"blue"? Can you replace a bulky structure containing dozens
of fields with a simpler, symbolic representation? Are you
duplicating data that you could trivially compute from
other values?
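To make the color example concrete, in Erlang notation the
two representations might look like this:
[{red,255}, {green,128}, {blue,0}]   %% keyed and bulky
{255,128,0}                          %% the same color as a tuple
The second form carries the same information in a fraction
of the memory and gives the garbage collector less to
trace.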
As an aside, the best across-the-board compilation
option for most C/C++ compilers is "compile for size." That
gets rid of optimizations that look good in toy benchmarks,
but have a disproportionately high memory cost. If this
saves you 20K in a medium-sized program, that's way more
valuable for performance than any of those high-end
optimizations would be.
Concurrency often gives better results than speeding
up sequential code. Imagine you've written a photo
editing app, and there's an export option where all the
filters and lighting adjustments get baked into a JPEG. It
takes about three seconds, which isn't bad in an absolute
sense, but it's a long time for an interactive program to
be unresponsive. With concerted effort you can knock a few
tenths of a second off that, but the big win comes from
realizing that you don't need to wait for the export to
complete before continuing. It can be handled in a separate
thread that's likely running on a different CPU core. To
the user, exporting is now instantaneous.
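In Erlang the hand-off is nearly a one-liner. A minimal
sketch, where export_jpeg/2 stands in for the real
three-second bake (here it only sleeps):
%% Stand-in for the slow export; the real version would
%% write the JPEG.
export_jpeg(_Image, Path) ->
    timer:sleep(3000),
    {ok, Path}.
%% Called from the UI: the work runs in its own process,
%% likely on another core, and the caller returns at once.
export_async(Image, Path) ->
    spawn(fun() -> export_jpeg(Image, Path) end),
    ok.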
(If you liked this, you might enjoy Use and Abuse of Garbage Collected
Languages.)
Success Beyond the Barrier of Full Understanding
The most memorable computer science course I took in
college was a two part sequence: Software Development I and
II. In the first semester you built a complete application
based on a fictional customer's specification. To date
myself, it was written in Turbo
Pascal for MS-DOS. In the second semester, you were
given someone's completed project from a previous year of
Software Development--a different project than the one you
just worked through--and were asked to make a variety of
modifications to it.
The checkbook tracking application I inherited was
written by a madman. A madman who was clearly trying to
write as many lines of Pascal as possible. Anything that
should have been encapsulated in a handy helper function,
wasn't. Code to append an extension to a filename was
written in-line every time it was needed. Error checking
was duplicated when a file was opened. There weren't any
abstract data types, just repetitive manipulation of global
data structures. For example, if there was a special case
that needed fixing up, then it was handled with separate
code in twenty places. The average function was over two
hundred lines long. There were functions with fifteen
levels of indentation.
And yet, it worked. The author of this mess
didn't realize you aren't supposed to write code like this,
that all alarms were loudly reporting that the initial plan
and set of abstractions were failing, and it was time to
stop and re-evaluate. But he or she didn't know what all
the sirens and buzzers meant and hit the afterburners and
kept going and going past all point of reason. And the end
result worked. Not in a "you'd rely on it in a life and
death situation" way, but good enough for how most
non-critical apps get used.
That is why big companies hire young,
enthusiastic coders. Not because they're as clueless as my
madman, but because they can self-motivate beyond the
barrier of full understanding and into imperfect and brute
force solutions. I'd want to stop and rework the
abstractions I'm using and break things into lots of
smaller, reliable, understandable pieces. My code might be
more bullet-proof in the end, but I still have a level of
admiration for people who can bang out complex apps before
they become jaded enough to realize it's not that easy.
(If you liked this, you might enjoy Do You Really Want to be Doing This When You're
50?)
A Worst Case for Functional Programming?
Several times now I've seen the following opinion:
For anything that's algorithm-oriented or with lots
of math, I use functional programming. But if it's any
kind of simulation, an object-oriented solution is much
easier.
I'm assuming "simulation" means something with lots of
moving, nested actors, like a battlefield where there are
vehicles containing soldiers who are carrying weapons, and
even the vehicle itself has wheels and different parts that
can be damaged independently and so on. The functional
approach looks to be a brain-teaser. If I'm deep down
inside the code for a tank, and I need to change a value in
another object, how do I do that? Does the state of the
world have to get passed in and out of every function? Who
would do this?
In comparison, the object-oriented version is obvious
and straightforward: just go ahead and modify objects as
needed (by calling the proper methods, of course). Objects
contain references to other objects and all updates happen
destructively and in-place. Or is it that simple?
Let's say the simulation advances in fixed-sized time
steps and during one of those steps a tank fires a shell.
That's easy; you just add a shell object into the data
structures for the simulation. But there's a catch. The
tanks processed earlier in the frame don't know about this
shell, and they won't until next frame. Tanks processed
later, though, have access to information from the future.
When they run a "Were any shells recently fired?" check,
one turns up, and they can take immediate action.
The fix is to never pollute the simulation by adding new
objects mid-frame. Queue up the new objects and insert them
at the end of the frame after all other processing is
complete.
Now suppose each tank decides what to do based on other
entities in the vicinity. Tank One scans for nearby
objects, then moves forward. Tank Two scans for objects and
decides Tank One is too close. Now it isn't actually
too close yet; this is based on an incorrect picture of the
field caused by Tank One updating itself. And it may never
be too close, if Tank Two is accelerating away from Tank
One.
There are a couple of fixes for this. The first is to
process situational awareness for every actor on the field
as a separate step, then pass that information to the
decision/movement phase. The second is to avoid any kind of
intra-frame pollution of object data by keeping a list of
all changes (e.g., that a tank moved to a new position),
then applying all of those changes atomically as a final
step.
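A deliberately tiny Erlang sketch of that queue-and-apply
approach (the tank and shell tuples, and the update rule,
are invented purely for illustration):
%% Every tank is updated against the same starting snapshot;
%% shells it fires are queued rather than added to the world
%% mid-frame.
update_tank({tank, Pos}, _World) ->
    {{tank, Pos + 1}, [{shell, Pos}]}.
step(Tanks) ->
    {NewTanks, Fired} = lists:unzip([update_tank(T, Tanks) || T <- Tanks]),
    NewTanks ++ lists:append(Fired).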
If I were writing such a simulation in a functional
style, then the fixes listed above would be there from the
start. It's a more natural way to work when there aren't
mutable data structures. Would it be simpler than the OOP
version? Probably not. Even though entity updates are put
off until later, there's the question of how to manage all
of the change information getting passed around. But at one
time I would have thought the functional version a complete
impossibility, and now it feels like the obvious way to
approach these kinds of problems.
(Some of these ideas were previously explored in
Purely Functional Retrogames, Part 4
and Turning Your Code Inside
Out.)
You Don't Want to Think Like a Programmer
It's an oft-stated goal in introductory coding books and
courses: to get you to think like a programmer. That's
better than something overly specific and low-level like
"to learn Java." It's also not meant to be taken literally.
A clearer, more accurate phrasing would be "to get you to
break down problems in an analytical way." But let that
initial, quirky sequence of five words--"to think like a
programmer"--serve as a warning and a reminder.
Because you really don't want to think like a
programmer.
It starts slowly, as you first learn good coding
practices from the bad. Never use global variables; wrap
all data into objects. Write getter and setter methods to
hide internal representations. Use const
wherever possible. Only one class definition per file,
please. Format your source code to encourage reading and
understanding by others. Take time to line up your equal
signs so things are in nice, neat columns.
Eventually this escalates to thinking in terms of design
patterns and citing rules from Code Complete. All
these clueless people want you to add features that are
difficult and at odds with your beautiful architecture;
don't they realize that complexity is the enemy? You come
to understand why every time a useful program is written in
Perl or PHP it's an embarrassment to computer science. Lisp
is the way, and it's worth using even if you don't have
access to most of the libraries that make Python such a
vital tool. Then one day you find yourself arguing static
versus dynamic typing and passionately advocating
test-driven development and all hope is lost.
It's not that any of these things are truly bad on their
own, but together they occupy your mind. You should be
obsessing about the problem domain you're working in--how
to make a game without pedantic tutorials, what's the most
intuitive set of artistic controls in a photography
app--and not endless software engineering concerns.
Every so often I see someone attempting to learn a skill
(e.g., web design, game development, songwriting), by
finishing a project every day/week/month. I love these!
They're exciting and inspirational and immediate. What a
great way to learn! The first projects are all about
getting something--anything--working. That's followed by
re-engineering familiar designs. How to implement Snake,
for example. Or Tetris.
If you've embarked on such a journey, the big step is to
start exploring your own ideas. Don't copy what people who
came before you were copying from other people. Experiment.
Do crazy things. If you stick to the path of building what
has already been made, then you're setting yourself up as
implementor, as the engineer of other people's ideas, as
the programmer. Take the opportunity to build a reputation
as the creator of new experiences.
And, incidentally, you know how to write code.
(If you liked this, you might enjoy Learning to Ignore Superficially Ugly
Code.)
Popular iOS Games That Could Have Been Designed for
8-Bit Systems
Amid all the Flappy Bird hoopla, it struck me that I
could have brought that game to life thirty years ago on an
8-bit home computer--if only I had thought of the idea. And
then of course someone confirmed this hypothesis by writing
a Commodore 64
version. That made me wonder what other popular iOS
titles meet the same criteria.
"Implementable on an 8-bit computer" can't simply be
equated with pixelated graphics. When I designed 8-bit
games, I spent a lot of time up front making sure my idea
was a good match for the hardware. Rendering a dozen (or
even half that) sprites in arbitrary positions wasn't
possible. Ditto for any kind of scaling, rotation, or
translucency. Flappy Bird undershoots the hardware of the
Atari 800 I learned to program on, with a sparse, scrolling
background and one tiny sprite. What other iOS games would
work?
Jetpack
Joyride. If you've never seen it, take Flappy Bird
and change the controls so that a touch-and-hold moves you
upward and you drop when released. Make the scrolling world
more interesting than pipes. Add floating coins to collect.
That's the gist of it, anyway. Other niceties would
translate, too, like semi-procedural environments and
mission-like objectives ("fly 1000m without collecting any
coins"). Most of the special vehicles you can commandeer
would need to be dropped, but they're icing and not core
gameplay.
Ridiculous
Fishing. At first glance there's a lot going on
visually, as you drop a line through many layers of fish.
In a design move that looks like a concession to 8-bit
hardware, the fish swim in horizontal bands. On most
systems with hardware sprites, there's a limit to how many
can be displayed on the same scan line. But those sprites
can be modified and repositioned as the screen draws from
top to bottom, so four sprites can be repurposed to eight
or twelve or twenty, as long as they're in separate strips.
That's a good match for the fishing portion of the game,
but less so for the bonus shooting segment (which would
need a rethink).
Super
Hexagon. This one looks impossible, being based
around hardware-accelerated polygons, but it could have
been designed by a bold 8-bit coder. The key is that the
polygons are flat in the graphic design sense: no
textures, no gradients. How do you move a huge, flat
triangle across the screen on a retro machine? Draw a line
on one side of the triangle in the background color, then
draw a new line on the other side. Repeat. Writing a line
clipper will take some work, but it's doable. The "don't
collide with the shapes" part of the design is easy.
Math-heavy polygon collision routines can be replaced by
checking a single "does sprite overlap a non-background
pixel" register.
Threes!
Here's a straightforward one, with no technical trickery or
major omissions necessary. A retro four-way joystick is the
perfect input device.
All of these designs could have been discovered thirty
years ago, but they weren't. Think about that; someone
could have come up with Jetpack Joyride's objective system
in 1984, but they didn't. Ten years later, they still
hadn't. It's a pleasant reminder that good design isn't all
about the technology, and that there's a thoughtful, human
side to development which doesn't need to move at the
breakneck pace we've come to associate with the computing
world.
(If you liked this you might enjoy Trapped by Exposure to Pre-Existing
Ideas.)
Range-Checks and Recklessness
Here's an odd technical debate from the 1980s: Should
compiler-generated checks for "array index out of range"
errors be left in production code?
Before C took over completely, with its loose accessing
of memory as an offset from any pointer, there was a string
of systems-level languages with deeper treatment of arrays,
including the ALGOL family, PL/1, Pascal, Modula-2, and
Ada. Because array bounds were known, every indexing
operation, such as:
frequency[i] = 0
could be checked at runtime to see if it fell within the
extents of the array, exiting the program with an error
message otherwise.
This was such a common operation that hardware support
was introduced with the 80286 processor in the form of the
bound
instruction. It encapsulated the two
checks to verify an index was between the upper and lower
bounds of an array. Wait, wasn't the lower bound always
zero? Often not. In Pascal, you could have declarations
like this:
type Nineties = array[1990..1999] of integer;
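To make the check concrete, here's roughly what it amounts to
for that declaration, sketched in Python rather than Pascal (the
store helper is made up for illustration; a real compiler would
emit the comparison inline around every indexing operation):

LOW, HIGH = 1990, 1999                 # bounds from array[1990..1999]
snowfall = [0] * (HIGH - LOW + 1)      # storage for a Nineties variable

def store(index, value):
    # The range check: verify the index is inside the declared
    # bounds before touching memory, and halt with an error otherwise.
    if index < LOW or index > HIGH:
        raise IndexError("range check error: %d not in %d..%d"
                         % (index, LOW, HIGH))
    snowfall[index - LOW] = value

store(1996, 12)    # fine
store(2001, 7)     # stops the program with a range-check error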
Now back to the original question of whether the range
checks should live on in shipping software. That error
checking is great during development was not controversial,
but opinions after that were divided. One side believed it
wasteful to keep all that byte and cycle eating around when
you knew it wasn't needed. The other group claimed you
could never guarantee an absence of bugs, and wouldn't it
be better to get some kind of error message than to
silently corrupt the state of the application?
There's also a third option, one that wasn't applicable
to simpler compilers like Turbo
Pascal: have the compiler determine an index is
guaranteed to be valid and don't generate range checking
code.
This starts out easy. Clearly the constant in
Snowfall[1996] is allowed for a variable of type Nineties.
Replace "1996" with a variable,
and it's going to take more work. If it's the iteration
variable in a for
loop, and we can ensure that
the bounds of the loop are between 1990 and 1999 inclusive,
then the range checks in the loop body can be omitted.
Hmmm...what if the for
loop bounds aren't
constants? What if they're computed by a function in
another module? What if there's math done on the indices?
What if it's a less structured while
loop? Is
this another case of needing a sufficiently smart compiler? At what point do
diminishing returns kick in, and the complexity of
implementation makes it hard to have faith that the
solution is working correctly?
I set out to write this not for the technical details
and trivia, but more about how my thinking has changed.
When I first ran across the range-check compiler option, I
was fresh out of the school of assembly language
programming, and my obsessive,
instruction-counting brain was much happier with this
setting turned off. These days I can't see that as anything
but reckless. Not only would I happily leave it enabled,
but were I writing the compiler myself I'd only remove the
checks in the most obvious and trivial of cases. It's not a
problem worth solving.
Get Good at Idea Generation
I get more mail about The Recovering
Programmer than anything else I've written. Questions
like "How can I be more than just the programmer who
implements other people's master plans?" are tough to
respond to. Seeing that feeling of "I can make anything!"
slide into "Why am I doing this?" makes me wish there was
some easy advice to give, or at least that I could buy each
of these askers a beer, so I'd like to offer at least one
recommendation:
Get good at idea generation.
Ideas have a bad reputation. They're a dime a dozen.
They're worthless unless implemented. Success is 90%
perspiration. We've all seen the calls for help from a
self-proclaimed designer and his business partner who have
a brilliant company logo and a sure-fire concept for an
app. All they need is a programmer or two to make it
happen, and we all know why it won't work out.
Now get past ideas needing to be on a grand scale--the
vision for an entire project--and think smaller. You've got
a UI screen that's confusing. You have something
non-trivial to teach users and there's
no manual. The number of tweakable options is getting
out of hand. Any of the problems that come up dozens of
times while building anything.
The two easy approaches are to ignore the problem
("What's one more item on the preferences panel?") or do an
immediate free association with all the software you've
ever been exposed to and pick the closest match.
Here's what I do: I start writing a list of random
solutions on a piece of paper. Some won't work, some are
simple, some are ridiculous. What I'm trying to do is work
through my initial batch of middling thoughts to get to the
interesting stuff. If you've ever tried one of those "write
a caption for the image" contests, it's the same thing. The
first few captions you come up with seem like they're
funny, but keep going. Eventually you'll hit comedy gold
and those early attempts will look dumb in comparison.
Keep the ideas all over the place instead of circling
around what you've already decided is the right direction.
What if you had to remove the feature entirely? Could
you negate the problem through a change in terminology?
What's the most over-engineered solution you can think of?
What if this was a video game instead of a serious app?
What would different audiences want: a Linux advocate, the
person you respect most on twitter, an avant garde artist,
someone who can't speak your native language?
The point of this touchy-feeliness isn't just to solve
your current problem, but to change your thinking over
time. To get your mind working in a unique way, not just
restating what you've seen around the web. Every so often
you'll have a small breakthrough of an idea that will
become a frame for future solutions. Later you'll have
another small breakthrough that builds on it. Eventually
you'll be out in a world of thought of your own making,
where you're seriously considering ideas that aren't even
in someone else's realm of possibility.
(If you liked this you might enjoy Advice to Aimless, Excited Programmers.)
You Don't Read Code, You Explore It
(I wrote this in 2012 and rediscovered it in January of
this year. I didn't feel comfortable posting it so close to
Peter Seibel's excellent Code is Not
Literature, so I held off for a few months.)
I used to study the program listings in magazines like
Dr. Dobb's, back when they printed the source code to
substantial programs. While I learned a few isolated tricks
and techniques, I never felt like I was able to comprehend
the entirety of how the code worked, even after putting in
significant effort.
It wasn't anything like sitting down and reading a book
for enjoyment; it took work. I marked up the listings and
kept notes as I went. I re-read sections multiple times,
uncovering missed details. But it was easy to build up
incorrect assumptions in my head, and without any way of
proving them right or wrong I'd keep seeing what I wanted
to instead of the true purpose of one particular section.
Even if the code was readable in the software engineering
sense, boundary cases and implicit knowledge lived between
the lines. I'd understand 90% of this function and 90% of
that function, and all those missing ten percents would
keep accumulating until I was fooling myself into thinking
I had the true meaning in my grasp.
That experience made me realize that read isn't a
good verb to apply to a program.
It's fine for hunting down particular details ("let's
see how many buffers are allocated when a file is loaded"),
but not for understanding the architecture and flow of a
non-trivial code base.
I've worked through tutorials in the J language--called "labs" in the
J world--where the material would have been opaque and
frustrating had it not been interactive. The presentation
style was unnervingly minimal: here's a concept with some
sentences of high-level explanation, and here are some
lines of code that demonstrate it. Through experimentation
and trial and error, and simply because I typed new
statements myself, I learned about the topic at hand.
Of particular note are Ken Iverson's interactive texts
on what sound like dry, mathematical subjects, but they
take on new life when presented in exploratory snippets.
That's even though they are reliant on J, the most
mind-melting and nothing-at-all-like-C language in
existence.
I think that's the only way to truly understand
arbitrary source code. To load it up, to experiment, to
interactively see how weird cases are handled, then keep
expanding that knowledge until it encompasses the entire
program. I know, that's harder to do with C++ than with
Erlang and Haskell (and more specifically, it's harder to
do with languages where functions can have wide-ranging
side effects that can change the state of the system in
hidden ways), and that's part of why interactive,
mostly-functional languages can be more pleasant than C++
or Java.
(If you liked this, you might enjoy Don't Be Distracted by Superior
Technology.)
Programming Without Being Obsessed With
Programming
I don't get asked this very often, and that's
surprising. I ask myself it all the time:
If you're really this recovering
programmer and all, then why do you frequently
write about super technical topics like functional
programming?
Sometimes it's just for fun. How much
memory does malloc(0) allocate? was a good exercise in
explaining something obscure in a hopefully clear way.
Those pieces also make me the most nervous, because there
are so many experts with all kinds of specialized
knowledge, and if I make any mistakes...let's just say that
they don't get quietly ignored. (I am grateful for the
corrections, in any case.)
But that's not the whole story.
If I don't code, I don't get to make things for all the
wonderful devices out there in the world. Some people get
all bent out of shape about that requirement and say "see,
you're conflating programmer and product
designer; you can do all the design work and leave the
programming to someone else." That may be true if you're on
the right team, but for personal projects it's like saying
that writer should be split into two positions: the
designer of the plot and characters, and the person who
forms sentences on the page. It doesn't work like that.
The catch is, as we all know, developing for today's
massive, deeply-layered systems is difficult, and that
difficulty can be all-consuming: unreadable quantities of documentation,
complex languages, software engineering rules and
methodologies to keep everything from spontaneously going
up in a spectacular fireball, too much technical choice.
There's enough to keep you busy without ever thinking an
original thought or crafting a vision of what you'd like to
build.
For me the question is not whether to write code, but
how to keep the coding side of my mind in check, how to
keep it from growing and thinking about too many details
and all the wrong things, and then at that point I've lost.
I've become a software engineer, and I really don't want to
be a software engineer.
That's my angle right there. Programming without being
overwhelmed by and obsessed with programming. Simplicity of
languages, simplicity of tools, and simplicity in ways of
writing code are all part of that.
(If you liked this, then consider being an early
follower on twitter.)
Unexpectedly Simple
This is a story of the pursuit of user experience
simplicity, confounded by delusions and over-engineering.
It's also about text formatters.
The first computer text formatter, RUNOFF, was written in
1964 in assembly language for the CTSS operating system. If
you've never used RUNOFF or one of its descendants like the
UNIX utility troff, imagine HTML where each tag appears on
a line by itself and is identified with a leading period.
Want to embolden a word in the middle of a sentence? That's
one line to turn bold on, one line for the word, then a
third line to turn bold off. This led to elongated
documents where the formatter commands gave little visual
indication of what the final output would look like.
The RUNOFF command style, where the first character on a
line indicates a formatting instruction,
carried over to early word processors like WordStar (first
released in 1978). But in WordStar that scheme was only
used for general settings like double-spacing. You could
use control codes mid-line for bold, italics, and
underline. This word is in bold: ^Bword^B. (And to be fair
you could do this in later versions of troff, too, but it
was even more awkward.)
WordPerfect 4.2 for MS-DOS (1986) hid the formatting
instructions so the text looked clean, but they
could be displayed with a "reveal codes" toggle. I clearly
remember thinking this was a terrible system, having to
lift the curtain and manually fiddle with command codes.
After all, MacWrite and the preceding Bravo
for the Xerox Alto had already shown that WYSIWYG editing
was possible, and clearly it was the simplest possible
solution for the user. But I was wrong.
WYSIWYG had drawbacks that weren't apparent until you
dove in and worked with it, rather than writing a sentence
about unmotivated canines on MacWrite at the local computer
shop and trying to justify a $2495 purchase. If you
position the cursor at the end of an italicized word and
start typing, will the new characters be italicized or
not? It depends. They might not even be in the same font.
If you paste a paragraph, and all of a sudden there's
excess trailing space below it even though there isn't a
carriage return in the text, how do you remove it?
More fundamentally, low-level presentation issues--font
families, font sizes, boldness, italics, colors--were now
intermingled with the text itself. You don't want to
manually change the formatting of all the text serving as
section headers; you want them each to be formatted the way
a section header should be formatted. That's fixable by
adding another layer, one of user-defined paragraph styles.
Now there's more to learn, and some of the simplicity of
WYSIWYG is lost.
Let's back up a bit to the initial problem of RUNOFF: the
marked-up text bears little
resemblance to the structure of the formatted output. What
if instead of drawing attention to a rigid command
structure, the goal is to make it invisible? Instead of
.PP
on its own line to indicate a paragraph,
assume all text is in paragraphs separated by blank lines.
An asterisk as the first character means a line is an
element of an unordered list.
The MediaWiki
markup language takes some steps in this direction. The
REBOL
MakeDoc tool goes further. John Gruber's Markdown
is perhaps the cleanest and most complete system for
translating visually formatted text to HTML. (Had I known
about Markdown, I wouldn't have developed the minimal
mark-up notation I use for the articles on this site.)
That's my incomplete and non-chronological history of
text formatters, through ugly and over-engineered to a
previously overlooked simplicity. You might say "What about
SGML and HTML? What about TeX?" which I'll pretend I didn't
hear, and say that the real question is "What other
application types have grown convoluted and are there
unexpectedly simple solutions that are being ignored?"
(If you liked this, you might enjoy Documenting the Undocumentable.)
You Can't Sit on the Sidelines and Become a
Philosopher
At some point every competent developer has that flash
of insight when he or she realizes everything is
fundamentally broken: the tools, the languages, the
methodologies. The brokenness--and who could argue with
it--is not the important part. What matters is what happens
next after this moment of clarity, after this exposure to
the ugly realities of software.
You could ignore it, because that's how it is. You still
get paid regardless of what you're forced to use.
You could go on a quest for perfection and try all the
exotic languages and development environments, even taking
long archaeological expeditions into once promising but now
lost ideas of the 1970s and 80s. Beware, for you may never
return.
You could try to recreate computing in your own image,
starting from a new language, no wait, a new operating
system--wait, wait, wait--a new processor architecture.
This may take a while, and eventually you will be visited
by people on archaeological expeditions.
The right answer is a blend of all of these. You have to
ignore some things, because while they're driving you mad,
not everyone sees them that way; you've built up a
sensitivity. You can try new tools and languages, though
you may have to carry some of their concepts into future
projects and not the languages themselves. You can fix
things, especially specific problems you have a solid
understanding of, and probably not the world of technology
as a whole.
As long as you eventually get going again you'll be
fine.
There's another option, too: you could give up. You can
stop making things and become a commentator, letting
everyone know how messed-up software development is. You
can become a philosopher and talk about abstract, big
picture views of perfection without ever shipping a product
based on those ideals. You can become an advocate for the
good and a harsh critic of the bad. But though you might
think you're providing a beacon of sanity and hope, you're
slowly losing touch with concrete thought processes and
skills you need to be a developer.
Meanwhile, other people in their pre-epiphany states are
using those exact same technologies that you know are
broken, and despite everything you do to convince them that
this can't possibly work...they're successful.
I decided to take my own advice by writing an iPhone game. It's not
written in an exotic functional language, just a lot of
C++, some Objective-C, and a tiny interpreter for
scripting. There are also parts of the code written in a
purely functional style, and some offline tools use Erlang.
It wasn't ever intended as a get-rich project, but more of
get-back-in-touch project. As such, it has been wildly
successful. (Still, if you have fun with it, an App Store
rating would be appreciated.)
(If you liked this, you might enjoy The Background Noise Was Louder than I
Realized.)
Lost Lessons from 8-Bit BASIC
Unstructured programming with GOTO
is the
stuff of legend, as are calling subroutines by line
number--GOSUB 1000
--and setting global
variables as a mechanism for passing parameters.
The little language that fueled the home computer
revolution has been long buried beneath an avalanche of
derision, or at least disregarded as a relic from primitive
times. That's too bad, because while the language itself
has serious shortcomings, the overall 8-bit BASIC
experience has high points that are worth remembering.
It's hard to separate the language and the computers it
ran on; flipping the power switch, even without a disk
drive attached, resulted in a BASIC prompt. If nothing
else, it could be treated as a calculator:
PRINT "seconds in a week: ",60*60*24*7
or
PRINT COS(2)/2
Notice how the cosine function is always available for
use. No importing a library. No qualifying it with
MATH.TRIG.
Or take advantage of this being a full programming
language:
T = 0
FOR I=1 TO 10:T=T+I*I:NEXT I
PRINT T
It wasn't just math. I remember seeing the Atari 800 on
display in Sears, the distinctive blue background and READY
prompt visible across the department. I'd switch to a
bitmapped graphics mode with a command window at the bottom
and dash off a program that looped across screen
coordinates displaying a multicolored pattern. It would run
as an in-store demo for the rest of the day or until some
other know-it-all pressed the BREAK key.
There's a small detail that I skipped over: entering a
multi-line program on a computer in a department store.
Without starting an external editor. Without creating a
file to be later loaded into the BASIC interpreter (which
wasn't possible without a floppy drive).
Here's the secret. Take any line of statements that
would normally get executed after pressing return:
PLOT 0,0:DRAWTO 39,0
and prefix it with a number:
10 PLOT 0,0:DRAWTO 39,0
The same commands, the same editing keys, and yet it's
entirely different. It adds the line to the current program
as line number 10. Or if line 10 already exists, it
replaces it.
Lines are syntax checked as entered. Well, each line is
parsed and tokenized so that the previous example turns into
this:
Line #: 10
Bytes in line: 6
PLOT command
X: 0
Y: 0
DRAWTO command
X: 39
Y: 0
That's how the line is stored in memory, provided there
aren't any errors. The displayed version is an
interpretation of those bytes. Code formatting is entirely
handled by the system and not something you think
about.
All of this, from the always-available functions, to
being able to develop programs without external tools, to
code stored as pre-parsed tokens, made BASIC not just a
language but a development system. Compare that to most of
today's compilers which feed on self-contained files of
code. Sometimes there's a run-eval-print loop so there's
interactivity, but editing real programs happens elsewhere.
And then there are what have come to be known as Integrated
Development Environments which tie together file-oriented
compilers with text editors and sometimes interactive
command lines, but now they get derided in ways that BASIC
never was: for being bulky and cumbersome.
Did I mention that Atari BASIC was contained in an eight
kilobyte ROM cartridge?
How did IDEs go so wrong?
(If you liked this you might enjoy Stumbling Into the Cold Expanse of Real
Programming.)
Design is Expensive
The result may at first glance seem a trifle, but I have
a notebook filled with the genesis and evolution of the
iPhone game DaisyPop. All those
small, painstaking choices that now get "of course
it should be like that" reactions...that's where the bulk
of the development time went. I wanted to go over two
specific design details, neither of which I knew would
exist at the project's outset.
The core mechanic is tapping flowers and insects that
drift and scurry about the screen. Tapping a daisy freezes
this wandering and it rapidly expands outward before
bursting. Any flowers contacted by the first also expand
and so on recursively. Insects, as everyone knows, don't
expand when touched; they race forward, possibly setting
off other daisies and insects.
I put off implementing audio for this chaining process
until late. I wasn't worried. I'd just plug in the sounds
when they were ready, and I did. And it sounded
terrible.
All the sound effects in a ten-length chain played in an
overlapping jumble, sometimes four or five sounds starting
the same frame. I spent a while fiddling with the chain
code, trying to slow things down, to limit how much
activity could occur at once, but it didn't help. I might
have prevented two sounds from triggering on the same
frame, but they were separated by a mere sixtieth of a
second which didn't make a discernible difference. Messing
with the chain system itself was also breaking the already
polished and proven feel of the game.
The eventual solution was to not play sounds
immediately, but queue them up. Every eight frames--a
number found by trial and error--take a sound from the
queue and start it. And it worked beautifully, stretching
out the audio experience for big chains over several
seconds, a regular rhythm of pentatonic tones. Almost.
Now the sounds weren't in sync with the visuals, and
surprisingly it didn't matter. There's no real-life
reference for when a purple daisy expanding to touch a
white flower makes a sound, so the softness introduced by
the audio queueing scheme wasn't a problem. But it was
immediately noticeable when the quick run of notes played
by a racing insect wasn't lined up with the animation, even
by a little bit. The fix was to play insect sounds
immediately instead of queuing them like daisy audio. It's
inconsistent, yes, but it worked.
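A minimal sketch of that queuing scheme, in Python rather than
the game's actual C++ (queue_sound, play_sound, and
SOUND_INTERVAL are stand-in names, not the real engine calls):

from collections import deque

SOUND_INTERVAL = 8     # frames between dequeued sounds, found by trial and error
sound_queue = deque()
frame_count = 0

def play_sound(sound_id):
    print("playing", sound_id)      # stand-in for the real audio call

def queue_sound(sound_id):
    sound_queue.append(sound_id)    # daisy sounds get spread out over time

def update_audio():
    # Called once per frame: every eighth frame, start at most one
    # queued sound, turning a big chain into a regular rhythm of tones.
    global frame_count
    frame_count += 1
    if frame_count % SOUND_INTERVAL == 0 and sound_queue:
        play_sound(sound_queue.popleft())

Insect sounds would bypass the queue and call play_sound
directly, which is the inconsistency described above.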
In any game there are goals that drive the player
forward, such as finishing a level or trying to get a high
score. In DaisyPop it's the latter. Typically you're only
alerted of an exceptional score after you finish playing
(think of when you enter your initials in an old school
arcade game). I was thinking about how to give feedback
mid-game. If there's a top ten list of scores, wouldn't it
be motivational to know you've broken into that list and
are now climbing?
A plain message is one option, but it's not dynamic; it
doesn't draw your eye amidst the chaos. Eventually I
settled on a triangle that appears and drifts
upward--signifying rising up the score charts--which acts
as a backdrop for overlaid text that moves in parallax:
"#5" in a larger font, with "best score" beneath it."
(Trivia: the triangle is one of two pieces of art I did
myself.) After working out the right motion and how to
handle edge cases I'm happy with the result. More games
should do this.
I could have dodged both of these issues by wholesale
borrowing an existing game concept and not trying something
experimental. I would have had a working model to serve as
a reference. It would have been so much easier in
terms of time and mental effort if I had said sure, I just
want to make a version of that game right there that people
already know how to play and like.
Thinking about the details, wandering around an unknown
space trying to invent the right solution, is
expensive.
(If you liked this, you might enjoy All that Stand Between You and a Successful
Project are 500 Experiments.)
Extreme Formatting
Not long after I wrote Solving the
Wrong Problem, it occurred to me that this site is
small because of what I decided to leave out, and that I
never tried to optimize what remained. To that end I used a
PNG reducer on the two images (one accompanies Accidental Innovation, the other is in
The End is Near for Vertical Tab).
And I ran the CSS through a web-based minifier.
A Cascading Style Sheet written in the common way looks
like this:
blockquote {
font-style: italic;
margin-left: 1.25em
}
#top {
background-color: #090974;
color: #FFF;
margin-bottom: .67em;
border-color: #7373D9;
border-style: none none solid;
border-width: 12px;
padding: 2em 0 0
}
and so on, usually for a couple of screenfuls, depending
on the complexity. The minified version of the above has no
extraneous spaces or tabs and is only two lines: one for
each selector. But now I had introduced a workflow problem.
I needed to keep around the nicely formatted original for
easy editing, then re-minify it before uploading to the
site. The first time I wanted to change the CSS it was a
simple tweak, so rather than automate the conversion I went
in and edited the minified version directly.
And I found that I preferred working with the
crunched-down CSS.
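For a concrete sense of it, the formatted example above
collapses to something like this (one line per selector,
whitespace stripped):

blockquote{font-style:italic;margin-left:1.25em}
#top{background-color:#090974;color:#FFF;margin-bottom:.67em;border-color:#7373D9;border-style:none none solid;border-width:12px;padding:2em 0 0}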
Unless you're reading the newsfeed, the CSS file is
already on your computer or phone, so take a moment to
look at it (Editor's Note: I've formatted and inlined it in the <head>). If you're sighing and
shaking your head at this point, you could put each
selector on a line by itself, but that adds another 23
lines. 23 more if each closing brace gets its own line. And
another 40 or so to make sure there's a newline after each
property. Somehow the raw 23-line version puts the overall
simplicity of the stylesheet into clear perspective. Free
of superficial structure, it takes less than half of a
vertical window in my text editor. Is inflating that to
over 100 lines--enough that I need to scroll to see them
all--buying anything concrete, or is it that verticality is
such an ingrained formatting convention?
Okay, right, there's also readability. Surely those run
together properties are harder to scan visually? Syntax
highlighting makes a big difference, and any editor with a
"highlight all occurrences of a string" feature makes this
layout amazing. I can see everywhere the
border-style
property is used all at once. No
jumping to different parts of the document.
Here's a good question: Does this apply to real code and
not just HTML stylesheets?
There's an infamous,
single printed page of C code, written by Arthur
Whitney one afternoon in 1989, which became the inspiration
for the J language interpreter. It occasionally gets
rediscovered and held up as an example of what would happen
if a programmer went rogue and disregarded all rules and
aesthetics of code formatting, and most who see it are
horrified. All those macros? Short identifiers? Many
statements on the same line? Entire functions on a
single line, including those triggers of international
debate, the curly braces?
Despite being misaligned with popular layout standards,
is it really such a mess? It's small, so you can study the
whole thing at once without scrolling. The heavy use of
macros prevents noisy repetition and allows thinking in
terms of higher-level chunks. That level of density makes
the horizontal layout easier to follow than it would be
with preprocessor-free C. (To be fair, the big downside is
that this is not the kind of code debuggers work well
with.)
I suspect I'm not seeing these two examples of extreme
formatting the same way that someone who has programmed
exclusively with languages of the C / Java / Javascript
class does. I happily fused BASIC
statements together with a colon, though I admit a moment
of hesitation before attempting the same daring feat in C
with a semicolon. J and Forth naturally have tight, horizontal
layouts, and that's part of why those language cultures
sometimes use quantity of code to specify problem
difficulty.
"How hard do you think that is?"
"Maybe a dozen lines or so."
Programming Modern Systems Like It Was 1984
Imagine you were a professional programmer in 1984, then
you went to sleep and woke up 30 years later. How would
your development habits be changed by the ubiquitous,
consumer-level supercomputers of 2014?
Before getting to the answer, realize that 1984 wasn't
all about the cycle counting micro-optimizations that you
might expect. Well, actually it was, at least in
home hobbyist circles, but more of necessity and not
because it was the most pleasant option. BASIC's
interactivity was more fun than sequencing seven
instructions to perform the astounding task of summing two
16-bit numbers. Scheme, ML, and Prolog were all developed
in the previous decade. Hughes's Why Functional
Programming Matters was written in 1984, and it's easy
to forget how far, far from practical reality those
recommendations must have seemed at the time. It was
another year of two parallel universes, one of towering
computer science ideas and the other of popular hardware
incapable of implementing them.
The possible answers to the original question--and mind
you, they're only possible answers--are colored by the
sudden shift of a developer jumping from 1984 to 2014 all
in one go, without experiencing thirty years of
evolution.
It's time to be using all of those high-level
languages that are so fun and expressive, but were set
aside because they pushed the limits of expensive
minicomputers with four megabytes of memory and weren't
even on the table for the gold rush of 8-bit computer
games.
Highly optimizing compilers aren't worth the
risk. Everything is thousands, tens of thousands, of
times faster than it used to be. Chasing some additional
2-4x through complex and sensitive manipulations isn't
worth it. You'll regret your decision when for no clear
reason your app starts breaking up at high optimization
settings, maybe only on some platforms. How can anyone have
confidence in a tool like that?
Something is wrong if most programs don't run
instantaneously. Why does this little command line
program take two seconds to load and print the version
number? It would take serious effort to make it that slow.
Why does a minor update to a simple app require
re-downloading all 50MB of it? Why are there 20,000 lines
of code in this small utility? Why is no one questioning
any of this?
Design applications as small executables that
communicate. Everything is set up for this style of
development: multi-core processors, lots of memory, native
support for pipes and sockets. This gives you multi-core
support without dealing with threads. It's also the most
bulletproof way of isolating components, instead of the
false confidence of marking class members "private."
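As one minimal, hypothetical illustration of that style (my
sketch, not anything from the original article): a single Python
file that pushes the actual computation into a separate worker
process and talks to it over pipes.

import subprocess, sys

if len(sys.argv) > 1 and sys.argv[1] == "worker":
    # Worker executable: read lines from stdin, write results to stdout.
    for line in sys.stdin:
        print(sum(int(n) for n in line.split()), flush=True)
else:
    # Parent: spawn the worker and communicate over pipes, instead of
    # linking the code in as a library behind a "private" wall.
    worker = subprocess.Popen([sys.executable, __file__, "worker"],
                              stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                              text=True)
    out, _ = worker.communicate("1 2 3\n10 20 30\n")
    print(out.strip())   # 6 and 60, computed in another process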
Don't write temporary files to disk, ever.
There's so much RAM you can have nightmares about getting
lost in it. On the most fundamental level, why isn't it
possible to create and execute a script without saving to a
file first? Why does every tweak to a learning-the-language
test program result in a two megabyte executable that
shortly gets overwritten?
Everything is so complex that you need to isolate
yourself from as many libraries and APIs as possible.
With thousands of pages of documentation for any system,
it's all too easy to become entangled in endless specific
details. Build applications to be self-contained and have
well-defined paths for interfacing with the operating
system, even if those paths involve communicating with an
external, system-specific server program of sorts.
C still doesn't have a module system? Seriously?
And people are still using it, despite all the
alternatives?
(If you liked this, you might enjoy Remembering a Revolution That Never
Happened.)
The Software Developer's Sketchbook
It takes ten-thousand hours to master your field, so the
story goes, except that programming is
too broad a field.
Solving arbitrary problems with code isn't what
ultimately matters, but problems within a specific domain.
Maybe it's strategy games. Maybe vector drawing tools.
Music composition software. Satellite control systems.
Viewed this way, who is truly an expert? Once I was asked
how many high-end 3D games I work on a year, and I wrote
"3" on the whiteboard, paused a moment to listen to the
"that's fewer than I expected" comment, then finished
writing: "1 / 3". How many satellite control systems do you
work on in a decade? One? Maybe two?
Compare this to carpenters or painters or cartoonists
who can look back on a huge body of work in a relatively
short time, assuming they're dedicated. Someone who does
roof work on fifty houses a year looks a lot more the
expert than someone who needs two years to ship a single
software project. Have you mastered building user
interfaces for paint programs when you've only created one,
then maintained it for five years, and you've never tried
alternate approaches?
To become the expert, you need more projects. They can
be smaller, experimental, and even be isolated parts of a
non-existent larger app. You can start
in the middle without building frameworks and modules
first. This is the time to try different languages, so you
can see if there's any benefit to Go or Clojure or Rust
before committing to them. Or to attempt a Chuck
Moore-esque exercise in extreme minimalism: is it possible
to get the core of your idea working in under a hundred
lines of code? But mostly it's to burn through a lot of
possibilities.
This sketchbook of implemented ideas isn't a paper book,
but a collection of small programs. It could be as simple
as a folder full of Python scripts or Erlang modules. It's
not about being right or wrong; many ideas won't work out,
and you'll learn from them. It's about exploring your
interests on a smaller scale. It's about playing with code.
It's about having fun. And you might just become an expert
in the process.
(If you liked this, you might enjoy The Silent Majority of Experts.)
Retiring Python as a Teaching Language
For the last ten years, my standard advice to someone
looking for a programming language to teach beginners has
been start with Python. And now I'm
changing that recommendation.
Python is still a fine language. It lets you focus on
problem solving and not the architectural stuff that
experienced developers, who've forgotten what it's like to
be an absolute beginner, think is important. The language
itself melts into the background, so lessons aren't
explanations of features and philosophies, but about
generating musical scales in any key, computing distances
around a running track based on the lane you're in, or
writing an automated player for poker or Yahtzee.
Then one day a student will innocently ask "Instead of
running the poker simulator from the command line, how can
I put it in a window with a button to deal the next
hand?"
This is a tough question in a difficult-to-explain way.
It leads to looking at the various GUI toolkits for Python.
Turns out that Guido does the same thing every few years,
re-evaluating if TkInter is the right choice for IDLE, the
supplied IDE. For now, TkInter it is.
A week later, another question: "How can I write a
simple game, one with graphics?"
Again, time to do some exploration into what's out
there. Pyglet looks promising, but it hasn't been updated
since July 2012. There are some focused libraries that
don't try to do everything, like SplatGL, but it's pretty
new and there aren't many examples. PyGame appears popular,
and there's even a book, so okay let's start teaching how
to use PyGame.
A month later, more questions: "How can I give this game
I made to my friend? Even better, is there a way I can put
this on my phone so I can show it to kids at school without
them having to install it?"
Um.
All of these questions have put me off of Python as a
teaching language. While there's rigor in learning how to
code in an old-school way--files of algorithmic scripts
that generate monochromatic textual output in a terminal
window--you have to recognize the isolation that
comes with it and how far away this is from what people
want to make. Yes, you can find add-on packages for just
about anything, but which ones have been through the sweat
and swearing of serious projects, and which are
well-intentioned today but unsupported tomorrow?
The rise of non-desktop platforms complicates matters,
and I can sympathize. My goal in learning
Erlang was to get away from C and C++ and shift my
thinking to a higher level. I proved that I could use
Erlang and a purely functional style to work in the domain
that everyone is most scared of: games. Then the iPhone
came out and that was that. Erlang wasn't an option.
It's with all of this in mind that my recommended
language for teaching beginners is now Javascript. I know,
I know, it's quirky and sometimes outright weird, but
overall it's decent and modern enough. More importantly
it's sitting on top of an unprecedentedly ubiquitous
cross-platform toolkit for layout, typography, and
rendering. Want to display UI elements, images, or text?
Use HTML directly. Want to do graphics or animation? Use
canvas.
I expect some horrified reactions to this change of
thinking, at least to the slight degree that one can apply
horrified to a choice of programming language. Those
reactions should have nothing to do with the shortcomings
of Javascript. They should be because I dismissed so many
other languages without considering their features, type
systems, or syntaxes, simply because they aren't natively
supported by modern web browsers.
Life is More Than a Series of Cache Misses
I don't know what to make of the continual stream of
people in 2015 with fixations on low-level performance and
control. I mean the people who deride the
cache-obliviousness of linked lists, write off languages
that aren't near the top of the benchmark table, and who
rant about the hopelessness of garbage collection. They're
right in some ways. And they're wrong at the same time.
Yes, you can do a detailed analysis of linked list
traversal and realize "Hey! Looping over an array is much
faster!" It is not news to anyone that different languages
have different performance characteristics. Garbage
collection is a little trickier, because unfortunately
there are still issues depending on
the situation, and not all that rarely either.
I could take a little Erlang or Scheme program and put
on a show, publicly tearing it to pieces, analyzing the
inefficiencies from dynamic typing and immutability and
virtual machines. There would be foot stomping and cheering
and everyone would leave convinced that we've been fooling
ourselves and that the only way to write code is to frame
problems in terms of cache architectures.
And then I'd reveal that the massively inefficient
Erlang program takes only a couple of milliseconds to
run.
Back in the 1990s I decided to modernize my skills,
because my experience was heavily skewed toward low-level C
and assembly work. I went through tutorials for a number of
modern languages before settling on Erlang. While learning,
I wrote programs for problems I did in college. Things like
tree manipulation, binary search--the classics. And while I
remember these being messy in C and Pascal, writing them in
Erlang was fun. I'm not giving up that fun if I can
help it. Fun is more productive. Fun leads to a better
understanding of the problem domain. And that leads to fast
code, even if it might be orders of magnitude away from
optimal when viewed through a microscope.
There is an exception to all of this. Imagine you're an
expert in building a very specific type of application.
You've shipped five of them so you've got a map of all the
sinkholes and poorly lit places. There's a chance, maybe,
depending on your background, that your knowledge
transcends the capabilities provided by higher level
programming languages, and you can easily crystallize a
simple, static architecture in C.
But until I'm that expert, I'll take the fun.
Are You Sure?
It's an old, familiar prompt. You delete a file, discard
a work in progress, or hit Cancel mid-install:
Are you sure?
It's not always so quaintly phrased these days, but the
same cautious attitude lives on in the modern confirmation
box. In iOS 8, tapping the trash can while viewing a photo
brings up a "Delete Photo?" button. Even so, that only
moves it to the Recently Deleted album. Permanently
removing it requires another delete with confirmation.
It may seem that the motivation behind "Are you sure?"
is to prevent rash decisions and changes of heart. The
official White House photographer isn't allowed to delete
any shots, so that solves that problem. But for
everyone else the little prompt quickly becomes part of a
two-button sequence that finds its way into your muscle
memory.
More commonly this second layer of confirmation averts
legitimate mistakes. If I'm in a UNIX shell wanting to
delete a file and it turns out to be write protected, then
I thank whoever decided that a little "C'mon, really?"
check was a good idea. Or I might unintentionally delete a
video when making a clumsy attempt to grab my falling
phone, were it not for those three familiar words, waiting,
visible through the cracked screen.
But now there are better options, especially given the
prevalence of touchscreens. The ideal is something easy to
remember, easy to do, but that's naturally outside the
realm of normal input. Here are a few.
Imagine tapping an image thumbnail four times. The first
selects it. The subsequent taps expand the image, as if
it's being inflated, until the fourth pops it and deletes
it.
If that's too much fun, and you find your nephew has
popped your entire photo library, touch each quadrant of an
image in sequence. It doesn't matter which you start with,
as long as you get all four. As you tap, that quadrant
disappears, then with the fourth it's gone.
Long-holds are little used on touchscreens, so there's
another possibility. Don't display a quit button in a game;
hold your finger in the same place for three seconds. After
a second, a circle starts shrinking toward your fingertip,
to give feedback. Don't want to quit? Lift your finger.
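A small sketch of how that long-hold might be tracked, with
made-up names and thresholds (nothing here comes from a real
toolkit, and it ignores checking that the finger stays put):

HOLD_TO_QUIT = 3.0      # seconds of continuous touch required to quit
FEEDBACK_DELAY = 1.0    # seconds before the shrinking circle appears

class QuitHold:
    def __init__(self):
        self.started = None
    def touch_down(self, now):
        self.started = now
    def touch_up(self):
        self.started = None          # lifting the finger cancels the quit
    def update(self, now):
        # Returns "quit", a circle size from 1.0 down to 0.0 for
        # feedback, or None if nothing should be shown yet.
        if self.started is None:
            return None
        held = now - self.started
        if held >= HOLD_TO_QUIT:
            return "quit"
        if held >= FEEDBACK_DELAY:
            return 1.0 - (held - FEEDBACK_DELAY) / (HOLD_TO_QUIT - FEEDBACK_DELAY)
        return None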
These are only examples, and I know there are other
approaches. There are some basic usability issues as well,
such as how does an uninitiated person know about the
four-quadrant tapping? But it's worth trying different
ideas rather than, once again and without thought,
following the "Are you sure?" model, the same one that
prevented unintended MS-DOS disk formatting in the
pre-Macintosh days.
(If you liked this, you might enjoy Virtual Joysticks and Other Comfortably Poor
Solutions.)
The Wrong Kind of Paranoia
Have you ever considered how many programming language
features exist only to prevent developers from doing
something? And it's not only to keep you from doing
something in other people's code. Often the person you're
preventing from doing this thing is yourself.
For example, modules let you prevent people from calling
functions that haven't been explicitly exported. In C
there's static
which hides a function from
other separately compiled files.
const
prevents modifying a variable. For
pointers there's a second level of const-ness, making the
pointed-to data read-only. C++ goes even further, as C++
tends to, allowing a class method to be marked const,
meaning that it doesn't change any instance variables.
Many object-oriented languages let you group methods into
private and public sections, so you can't access private
methods externally. At least Java, C++, and Object Pascal
add protected, which muddies the water. In C# you can seal
classes so they
can't be inherited. I'm trying real hard not to bring up
friend classes, so I won't.
Here's the question: how much does all this pedantic
hiding, annotating, and making sure you don't double-cross
yourself by using a "for internal use only" method actually
improve your software? I realize I'm treading in dangerous
territory here, so take a few deep breaths first.
I like const, and I automatically precede local variables
with it, but the compiler doesn't need me
to do that. It can tell that a local integer is only
assigned to once, and the generated code will be exactly
the same. You could argue that the qualifier prevents
accidental changes, but if I've ever had that happen in
real code it's rare enough that I can't recall.
Internal class methods are similar. If they're not in
the tutorial, examples, or reference, you don't even know
they exist. If you use the header file for documentation,
and internal methods are grouped together beneath the terse
comment "internal methods," then why are you calling them?
Even if they're secured with the private
incantation, nothing is stopping you from editing the file,
deleting that word, and going for it. And if this is your
own code that you're doing this with, then this scenario is
teetering on the brink of madness.
What all of these fine-grained controls have done is to
put the focus on software engineering in the small. The
satisfaction of building so many tiny, faux-secure
fortresses by getting publics and protecteds in the right
places and adding immutability keywords before every
parameter and local variable. But you've still got a sea of
modules and classes and is anything actually simpler or
more reliable because some methods are behind the
private
firewall?
I'm going to give a couple of examples of building for
isolation and reliability at the system level, but don't
overgeneralize these.
Suppose you're building code to control an X-ray
machine. You don't want the UI and all of that mixed
together with the scary code that irradiates the patient.
You want the control code on the device itself, and a small
channel of communication for sending commands and getting
back the results. The UI system only knows about that
channel, and can't accidentally compromise the state of the
hardware.
There's an architecture used in video games for a long
time now where rendering and other engine-level functions
are decoupled from the game logic, and the two communicate
via a local socket. This is especially nice if the engine
is in C++ and you're using a different language for the
game proper. I've done this with Erlang, which worked out
well, at least under OS X.
Both of these have a boldness to them, where an entire
part of the system is isolated from the rest, and the
resulting design is easier to understand and simpler
overall. That's more important than trying to protect each
tiny piece from yourself.
Reconsidering Functional Programming
Key bits and pieces from the functional programming
world have, perhaps surprisingly, been assimilated into the
everyday whole of development: single-assignment
"variables," closures, maps and folds, immutable strings,
the cavalier creation and immediate disregarding of complex
structures.
The next step, and you really need this one if you want
to stick to the gospel of referential transparency that
dominated the early push toward FP, is immutable
everything. You can create objects and data
structures, but you can't modify them--ever. And this is a
hard problem, or at least one that requires thinking about
programs in a different way. Working that out is what drove
many of my early blog entries, like Functional Programming Doesn't Work (and what to
do about it) from 2009.
Across-the-board immutability may seem a ridiculous
restriction, but on-the-fly modifying of data is a
dangerous thing. If you have a time-sliced simulation or a game,
and you change the internals of an object mid-frame, then
you have to ask "At what point in the frame was this change
made?" Some parts of the game may have looked at the
original version, while others looked at it after the
change. Now magnify the potential for crossed-wires by all
of the destructive rewriting going on during a typical
frame, and it's a difficult problem. Wouldn't it be better
to have the core data be invariant during the frame, so you
know that no matter when you look at it, it's the same?
I wrote a number of smaller games in Erlang, plus one
big one, exploring this, and I finally had a realization:
keeping the whole of a game or simulation frame purely
functional is entirely unrelated to the functional
programming in the small that gets so much attention. In
fact, it has nothing to do with functional programming
languages at all. That means it isn't about maps or folds
or lambdas or even avoiding destructive-operation languages
like C or C++.
The overall approach to non-destructive simulations is
to keep track of changes from one frame to the next,
without making any changes mid-frame. At the very end of
the frame when the deltas from the current to the next have
been collected, then and only then apply those changes to
the core state. You can make this work in, say, purely
functional Erlang, but it's tedious, and a bit of a house
of cards, with changes threaded throughout the code and
continually passed back up the call-chain.
Here's an alternate way of thinking about this. Instead
of modifying or returning data, print the modification you
want to make. I mean really print it with
printf
or its equivalent. If you move a sprite
in the X direction, print this:
sprite 156: inc x,1.5
Of course you likely wouldn't use text, but it's easier
to visualize than a binary format or a list of values
appended to an internal buffer. How is this different than
passing changes back up the line? It's direct, and there's
now one place for collecting all changes to the frame. Run
the frame logic, look at the list of changes, apply those
changes back to the game, repeat. Never change the core
data mid-frame, ever.
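Here's a minimal Python sketch of that structure (not the
author's Erlang or C++; sprite 156 and the record format follow
the example above):

sprites = {156: {"x": 10.0, "y": 4.0}}

def frame_logic(state):
    # Frame logic never touches the core data; it only records what
    # it wants changed, the in-memory equivalent of printing
    # "sprite 156: inc x,1.5".
    changes = []
    for sprite_id in state:
        changes.append((sprite_id, "inc", "x", 1.5))
    return changes

def apply_changes(state, changes):
    # The one and only place where the core state is modified, run
    # once at the very end of the frame.
    for sprite_id, op, field, amount in changes:
        if op == "inc":
            state[sprite_id][field] += amount

apply_changes(sprites, frame_logic(sprites))
print(sprites)    # {156: {'x': 11.5, 'y': 4.0}}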
As with most realizations, I can see some people calling
this out as obvious, but it's something I never considered
until I stopped thinking about purely functional languages
as a requirement for writing purely functional software.
And in most other languages it's just so easy to
overwrite a field in a data structure; there's a built-in
operator and everything. Not going for that as the first
solution for everything is the hard part.
(If you liked this you might enjoy Turning Your Code Inside-Out.)
Why Doesn't Creativity Matter in Tech Recruiting?
A lot of buzz last week over the author of the excellent
Homebrew package manager being asked to invert a binary
tree in a Google interview. I've said it before, that
organizational skills beat algorithmic
wizardry in most cases, probably even at Google. But
maybe I'm wrong here. Maybe these jobs really are hardcore
and no day goes by without implementing tricky graph
searches and finding eigenvectors, and that scares me.
A recruiter from Google called me up a few years ago,
and while I wasn't interested at the time, it made me
wonder: could I make it through that kind of technical
interview, where I'm building heaps and balancing trees on
a whiteboard? And I think I could, with one caveat. I'd
spend a month or two immersing myself in the technical, in
the algorithms, in the memorization, and in the process
push aside my creative and experimental tendencies.
I hope that doesn't sound pretentious, because it's a
process I've experienced repeatedly. If I focus on the
programming and tech, then that snowballs into more
interest in technical topics, and then I'm reading
programming forums and formulating tech-centric opinions.
If I get too much into the creative side and don't program,
then everything about coding seems much harder, and I talk
myself out of projects. It's difficult to stay in the
middle; I usually swing back and forth.
Is that the intent of the hardcore interview process? To
find people who are pure programming athletes, who amaze
passersby with non-recursive quicksorts written on a subway
platform whiteboard, and aren't distracted by non-coding
thoughts? It's kinda cool in a way--that level of training
is impressive--but I'm unconvinced that such a technically
homogeneous team is the way to go.
I've always found myself impressed by a blend between
technical ability and creativity. The person who came up
with and implemented a clever game design. The person doing
eye-opening visualizations with D3.js or Processing. The
artist using a custom-made Python tool for animation. It's
not creating code so much as coding as a path to
creation.
So were I running my ideal interview, I'd want to hear
about side projects that aren't pure code. Have you written
a tutorial with an unusual approach for an existing
project? Is there a pet feature that you dissect and
compare in apps you come across (e.g., color pickers)? And
yes, Homebrew has a number of interesting usability
decisions that are worth asking about.
(If you liked this, you might enjoy Get Good at Idea Generation.)
If You Haven't Done It Before, All Bets Are Off
I've been on one side or the other of most approaches to
managing software development: from hyper-detailed use of
specs and flowcharts to variants of agile to not having any
kind of planning or scheduling at all. And I've distilled
all of that down into one simple rule: If you haven't done
it before--if you haven't built something close to this
before--then all bets are off.
It's one of the fundamental principles of programming,
that it's extremely difficult to gauge how much work is
hidden behind the statement of a task, even to where the
trivial and impossible look the same when silhouetted in
the morning haze. Yet even the best intentioned software
development methodologies still ride atop this
disorientation. That little, easy feature hiding in the
schedule, the one that gets passed over in discussions
because everyone knows it's little and easy, turns out to
be poorly understood and cascades into another six months
for the project.
This doesn't mean you shouldn't keep track of what work
you think you have left, or that you shouldn't break down
vague tasks into concrete ones, or that you shouldn't be
making drastic simplifications to what you're making (if
nothing else, do this last one).
What it does mean is that there's value in having built
the same sort of thing a couple of times.
If you've previously created a messaging service and you
want to build a new messaging service, then you have
infinitely more valuable insight than someone who has only
worked on satellite power management systems and decides to
get into messaging. You know some of the dead ends. You
know some of the design decisions to be made. But even if
it happens that you've never done any of this before, then
nothing is stopping you from diving in and finding your
way, and in the end you might even be tremendously
successful.
Except when it comes to figuring out how much work it's
going to take. In that case, without having done it before,
all bets are off.
(If you liked this, you might enjoy Simplicity is Wonderful, But Not a
Requirement.)
Computer Science Courses that Don't Exist, But
Should
CSCI 2100: Unlearning Object-Oriented
Programming
Discover how to create and use variables that aren't inside
of an object hierarchy. Learn about "functions," which are
like methods but more generally useful. Prerequisite: Any
course that used the term "abstract base class."
CSCI 3300: Classical Software Studies
Discuss and dissect historically significant products,
including VisiCalc, AppleWorks, Robot Odyssey, Zork, and
MacPaint. Emphases are on user interface and creativity
fostered by hardware limitations.
CSCI 4020: Writing Fast Code in Slow
Languages
Analyze performance at a high level, writing interpreted
Python that matches or beats typical C++ code while being
less fragile and more fun to work with.
CSCI 2170: User Experience of Command Line
Tools
An introduction to UX principles as applied to command line
programs designed as class projects. Core focus is on
output relevance, readability, and minimization. UNIX "ls"
tool is a case study in excessive command line
switches.
PSYC 4410: Obsessions of the Programmer Mind
Identify and understand tangential topics that software
developers frequently fixate on: code formatting, taxonomy,
type systems, splitting projects into too many files.
Includes detailed study of knee-jerk criticism when exposed
to unfamiliar systems.
The Right Thing?
A Perl program I was working on last year had fifteen
lines of code for loading and saving files wholesale (as
opposed to going line by line). It could have been shorter,
but I was using some system routines that were supposedly
the fastest option for block reads and writes.
The advice I had been seeing in forums for years was
that I shouldn't be doing any of this, but instead use the
nifty File::Slurp
module. It seemed silly to
replace fifteen lines of reliable code with a module, but
eventually I thought I'd do the right thing and switch.
I never should have looked, but File::Slurp
turned out to be 800+ lines of code, including comments,
and not counting the documentation block at the bottom. One
of those comments stood out, as is typical when prefaced
with "DEEP DARK MAGIC":
DEEP DARK MAGIC. this checks the UNTAINT IO flag of
a glob/handle. only the DATA handle is untainted (since
it is from trusted data in the source file). this
allows us to test if this is the DATA handle and then
to do a sysseek to make sure it gets slurped correctly.
on some systems, the buffered i/o pointer is not left
at the same place as the fd pointer. this sysseek makes
them the same so slurping with sysread will work.
Still, I kept using it--the right thing and all--until
one day I read about a Unicode security flaw with
File::Slurp. Now I was using an 800 line
module containing deep dark magic and security issues. Oh,
and also no one was maintaining it. This was no longer the
recommended solution, and there were people
actively pointing out why it should be avoided.
I dug up my original fifteen lines, took out the
optimizations, and now I'm back to having no module
dependencies. Also the code is faster, likely because it
doesn't have to load another module at runtime. As a
footnote, the new right thing is the
Path::Tiny
module, which in addition to
providing a host of operations on file paths, also includes
ways to read and write entire files at once. For the
moment, anyway.
(If you liked this, you might enjoy Tricky When You Least Expect It.)
What Can You Put in a Refrigerator?
This may sound ridiculous, but I'm serious. The goal is
to write a spec for what's allowed to be put into a
refrigerator. I intentionally picked something that
everyone has lots of experience with. Here's a first
attempt:
Anything that (1) fits into a refrigerator and (2)
is edible.
#1 is hard to argue with, and the broad stroke of #2 is
sensible. Motorcycles and bags of cement are off the list.
Hmmm...what about liquids? Can I pour a gallon of orange
juice into the refrigerator? All right, time for version
2.0:
Anything that's edible and fits into a refrigerator.
Liquids must be in containers.
Hey, what about salt? It fits, is edible, and isn't a
liquid, so you're free to pour a container of salt into
this fridge. You could say that salt is more of a seasoning
than a food, in an attempt to disallow it, but I'll counter
with uncooked rice. This could start a long discussion
about what kinds of food actually need
refrigeration--uncooked rice doesn't, but cooked rice does.
Could we save energy in the long haul by blocking things
that don't need to be kept cool? That word need
complicates things, so let's drop this line of thinking for
now.
Anything that's edible and fits into a refrigerator.
Items normally stored in containers must be in
containers.
How about a penguin? Probably need some kind of clause
restricting living creatures. Maybe the edibility
requirement covers this, except leopard seals and sea lions
eat penguins. No living things across the board is the safest
way to plug this hole. Wait, do the bacteria in yogurt
count as living? This entire edibility issue is
troublesome. What about medicine that needs to be kept
cool?
Oh no, we've only been thinking about residential uses!
A laboratory refrigerator changes everything. Now we've got
to consider organs and cultures and chemicals and is it
okay to keep iced coffee in there with them. It also never
occurred to me until right now that we can't even talk
about any of this until we define exactly what the allowed
temperature range of a refrigeration appliance is.
In the interest of time, I'll offer this
for-experts-only spec for "What can you put in a
refrigerator?":
Anything that fits into a refrigerator.
Alternate Retrocomputing Histories
There's a computer science course that goes like this:
First you build an emulator for a fictional CPU. Then you
write an assembler for it. Then you close the loop by
defining an executable format that the emulator can load
and the assembler can generate, and you have a complete, if
entirely virtual, development system.
Of course this project is intended as an educational
tool, to gain exposure to hardware and operating systems
concepts. When I took that course, the little homemade CPU
felt especially hopeless, making an expensive minicomputer
slower than an Apple II. Use it to develop games? Not a
chance.
And now, there's MAME.
The significance of this deserves some thought. All
those processors that were once so fast, from the 6809 to
the 68000 to the lesser known TMS34010 CPU/GPU combo that
powers Mortal Kombat and NBA Jam, being completely
duplicated by mere programs. This pretend hardware can, in
real-time, reanimate applications always cited as requiring
the ultimate performance: high frame rate games. When you
look at the results under MAME, the screen full of enticing
pixels, the fact that instructions are being decoded and
dispatched by a layer of C code doesn't even cross your
mind.
Maybe that virtual CPU from that college class isn't so
crazy any more?
Now, sure, you could design your own processor and
emulate it on a modern desktop or phone. You could even
ship commercial software with it. This little foray into
alternate retrocomputing histories will result in systems
that are orders of magnitude simpler than what we've
currently got. Your hundred virtual opcodes are a footnote
to the epic volumes of Intel's x86 instruction set manuals.
No matter what object code file structure you come up with,
it pales in comparison to the Portable Executable Format
that's best explained by
large posters.
I wouldn't do that. It's still assembly language, and I
don't want to go back down that road.
The most fascinating part of this thought experiment is
that it's possible at all. You can set aside decades of
cruft, start anew in a straightforward way, and the result
is immediately usable. There's not much personal appeal to
a Z80 emulator, but many applications I've written have
small, custom-built interpreters in them, and maybe I
didn't take them far enough. Is all the complaining about
C++ misguided, in that the entire reason for the existence
of C++ is so you can write systems that keep you from
having to use that language?
(If you liked this, you might enjoy Success Beyond the Barrier of Full
Understanding.)
The Same User Interface Mistakes Over and Over
It has been 42 years since the not-very-wide release of
the Xerox Alto and almost 32 since the mainstream
Macintosh. You might expect we've moved beyond the era of
egregious newbie mistakes when building graphical UIs, but
clearly we have not. Drop-down lists containing hundreds of
elements are not rare sights. Neither are modal preference
dialogs, meaningless alerts where the information is not
actionable, checkboxes that allow mutually exclusive
options to be selected at the same time, and icons that
don't clearly represent anything. I could go on, but we've
all experienced this firsthand.
Wait, I need to call out one of the biggest offenses:
applications stealing the user's focus--jumping into the
foreground--so that clicks intended for the previously
front-most app are now applied to the other, possibly with
drastic results.
That there are endless examples of bad UIs to cite and
laugh at and ignore is not news. The real question is
why, after all this time, do developers still make these
mistakes? There are plenty of UI experts teaching
simplicity and railing against poor design. Human-computer
interaction and UX are recognized fields. So what
happened?
We've gotten used to it. Look at the preferences
panel in most applications, and there are guaranteed to be
settings that you can't preview, but instead have to
select, apply, and close the window, and that can't be
undone if you don't like them. You have to manually
re-establish
the previous settings. This is so common that it wouldn't
even be mentioned in a review.
(At one time the concern was raised that the ubiquitous
"About..." menu option was mislabeled, because it didn't
give information about what a program was or how it worked,
but instead displayed a version number and copyright
information. It's a valid point, but it doesn't get a
second thought now. We accept the GUI definition of
"About.")
There's no standard resource. How do you know
that certain uses of checkboxes or radio buttons are bad?
From experience using apps, mostly, and some designers may
never notice. If you're starting out building an interface,
there's no must-have, coffee-stained reference--or a web
equivalent--that should be sitting on your desk. Apple and
others have their own guidelines, but these are huge and
full of platform-specific details; the fundamentals are
easy to overlook.
There aren't always perfect alternatives. There's
so much wrong with the focus-stealing, jump-to-the-front
application, but what's the solution? Standard practice is
a notification system which flashes or otherwise vies for
attention, then you choose when you want to interact with
the beckoning program. What this notification system is
depends on the platform. There isn't a definitive approach
for getting the user's attention. It's also not clear that
the model of background apps requesting the user's
attention works. How many iPhones have you seen with a red
circle containing "23" on the home screen, indicating that
23 apps need updating...and it's been there for months?
Implementing non-trivial GUIs is still messy.
Windows, OS X, and iOS are more or less the same when it
comes to building interfaces. Use a tool to lay out a
control-filled window, setting properties and defaults.
Write classes which are hooked to events fired by controls.
There's more architecture here than there should be, with
half of the design in code, half in a tool, and trying to
force-fit everything into an OOP model. It's also easy to
build interfaces that are too static. REBOL and Tk showed
how much nicer this could be, but they never became
significant. It's better in HTML, where layout and code are
blurred, but this doesn't help native apps.
(If you liked this, then you might enjoy If You're Not Gonna Use It, Why Are You Building
It?)
What's Your Secondary Language?
Most of the "Which programming language is best?"
discussion is irrelevant. If you work at a company which
uses C++ or Python or Java, then you use that; there's no
argument to be had. In other cases your options are limited
by what's available and well-supported. If you want to
write an iPhone game, Common Lisp is not on the menu of
reasonable options. You could figure out a way to
make it work, but you're fighting the system and C or Swift
would almost certainly be less stressful in the end.
At some point in the development of your
mandated-to-use-Java project, you'll need to do some quick
calculations on the side, ones that won't involve Java. I
never use a faux plastic-button GUI calculator for that; I
bring up an interpreter with an interactive command prompt.
Going beyond math, algorithms are easier to prototype in a
language that isn't batch-compiled and
architecture-oriented. When I was working on a PlayStation
2 launch title, I had never implemented a texture cache
before, so I experimented with some possibilities in
Erlang.
When there's debate in a project meeting about some
topic and everything being said is an unproven opinion, the
person in the back who immediately starts in on a small
prototype to provide concrete data is the only person I
trust.
The important question is not "Which programming
language is best?" but "What's your secondary language?"
The language you reach for to solve problems, prove that
ideas work before implementing them for real, and to do
interesting things with files and data.
The criteria for judging a development language and
secondary language are completely different. The latter is
all about expressiveness, breadth of readily available
capabilities, and absence of roadblocks. Languages without
interactive command lines are non-starters. Ditto for
languages that are geared toward building infrastructure.
You want floating point vectors now, and not ways to
build overloaded floating point vector classes with private
methods.
My secondary language has jumped around quite a bit and
at the moment I have two. It was J for a while, because J is an
amazing calculator with hooks to a variety of
visualizations. At some point J shifted to being more
cross-platform and lost much of its usefulness (this may or
may not still be true). Erlang is my go-to for algorithms
and math and small tools, but it's not something I'd use to
build a GUI. Recently I've used JavaScript for anything
interactive or visual. I know, I know, those scope rules!
But I can sit down and make stuff to show people fairly
quickly, and that outweighs quibbles I have with the
language itself.
Messy Structs/Classes in a Functional Style
There are two major difficulties that get ignored in
introductions to functional programming. One is how to
build interactive programs like games (see Purely Functional Retrogames). The other is
how to deal with arbitrary conglomerations of data, such as
C structs or C++ classes.
Yes, yes, you can create composite data types in
functional languages, no problem. What happens in C,
though, is that it's easy to define a struct, then keep
putting things in there as needed. One day you realize that
this structure contains dozens of flags and counters and
other necessary things, which sounds bad--and technically
is bad--except that it sure was nice to just add them and
not worry about it. You can do the same thing in a
functional language, but it's a poor match for
immutability. "Change" one of those hundred fields and they
all get copied. When interactively testing it's hard to see
what's different. There are just these big 50-field data
types that get dumped out.
I have a couple of guidelines for dealing with messy
struct/class-like data in Erlang, and I expect they will
apply to other languages. I've never seen these mentioned
anywhere, so I want to pass them along. Again, the key word
is messy. I'm not talking about how to represent an
RGB tuple or other trivial cases. Set perfection aside for
the moment and pretend you're working from a C++ game where
the "entity" type is filled with all kinds of data.
The first step is to separate the frequently changed
fields from the rest. In a game, the position of an entity
is something that's different every frame, but other
per-entity data, like the name of the current animation,
changes only occasionally. One is in the 16-33 millisecond
time frame, the other in seconds or tens of seconds. Using
Erlang notation, it would be something like this:
{Position, Everything_Else}
The technical benefit is that in the majority of frames
only Position and the outer tuple are created, instead of
copying the potentially dozens of fields that make up
Everything_Else. This factoring based on frequency of
change provides additional information for thinking about
the problem at hand. Everything_Else can be a
slower-to-rebuild data structure, for example.
The other rule of thumb I've found helpful is to
determine which fields are only used in certain cases. That
is, which are optional most of the time. In this oversized
entity, there might be data that only applies if the
character is swimming. If the character is on-foot most of
the time, don't add the water-specific data to the core
structure. Now we've got something like:
{Position, Most_Everything_Else, Optional_Stuff}
In my code, the optional stuff is an Erlang property
list, and values come and go as needed (were I to do it
today I might use a map instead). In a real game, I found
that almost everything was optional, so I ended up with
simply:
{Position, Optional_Stuff}
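A sketch of how that can look, with made-up names. The
point is that moving an entity rebuilds only the small
outer tuple, while rarely used fields sit in a property
list:

new(Pos) -> {Pos, []}.

%% Hot path: only the new position and the outer 2-tuple are built.
move({_OldPos, Opts}, NewPos) -> {NewPos, Opts}.

%% Cold path: optional fields come and go as needed.
set_opt({Pos, Opts}, Key, Value) ->
    {Pos, lists:keystore(Key, 1, Opts, {Key, Value})}.

del_opt({Pos, Opts}, Key) ->
    {Pos, lists:keydelete(Key, 1, Opts)}.

get_opt({_Pos, Opts}, Key, Default) ->
    proplists:get_value(Key, Opts, Default).

A hypothetical swim_depth entry would be added with set_opt
when the character enters the water and removed with
del_opt when it climbs out.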
(If you liked this, you might enjoy A
Worst Case for Functional Programming?)
On the Madness of Optimizing Compilers
There's the misconception that the purpose of a compiler
is to generate the fastest possible code. Really, it's to
generate working code--going from a source language to
something that actually runs and gives results. That's not
a trivial task, mapping any JavaScript or Ruby or
C++ program to machine code, and in a reliable manner.
That word any cannot be emphasized enough. If you
take an existing program and disassemble the generated
code, then it's easy to think "It could have been optimized
like this and like that," but it's not a compiler designed
for your program only. It has to work for all programs
written by all these different people working on entirely
different problems.
For the compiler author, the pressure to make the
resultant programs run faster is easy to succumb to. There
are moments, looking at the compiled output of test
programs, where if only some assumptions could be made,
then some of these instructions could be removed. Those
assumptions, as assumptions tend to be, may look correct in
a specific case, but don't generalize.
To give a concrete example, it may be obvious that an
object could be allocated on the stack instead of the heap.
To make that work in the general case, though, you need to
verify that the pointer to the object isn't saved
anywhere--like inside another object--so it outlives the
data on the stack. You can trace through the current
routine looking for pointer stores. You can trace down into
local functions called from the current routine. There may
be cases where the store happens in one branch of a
conditional, but not the other. As soon as that pointer is
passed into a function outside of the current module, then
all bets are off. You can't tell what's happening, and have
to assume the pointer is saved somewhere. If you get any of
this wrong, even in an edge case, the user is presented
with non-working code for a valid program, and the compiler
writer has failed at his or her one task.
So it goes: there are continual, tantalizing cases for
optimization (like the escape
analysis example above), many reliant on a handful of
hard to prove, or tempting to overlook, restrictions. And
the only right thing to do is ignore most of them.
The straightforward "every program all the time"
compiler is likely within 2-3x of the fully optimized
version (for most things), and that's not a bad place to
be. A few easy improvements close the gap. A few slightly
tricky but still safe methods make up a little more. But
the remainder, even if there's the potential for 50% faster
performance, flat out isn't worth it. Anything that
ventures into "well, maybe not 100% reliable..." territory
is madness.
I've seen arguments that some people desperately need
every last bit of performance, and even a few cycles inside
a loop is the difference between a viable product and
failure. Assuming that's true, then they should be crafting
assembly code by hand, or they should be writing a custom
code generator with domain-specific knowledge built-in.
Trying to have a compiler that's stable and reliable and
also meets the needs of these few people with extreme,
possibly misguided, performance needs is a mistake.
(If you liked this, you might enjoy A
Forgotten Principle of Compiler Design.)
Moving Beyond the OOP Obsession
Articles pointing out the foibles of an object-oriented
programming style appear regularly, and I'm as guilty as anyone. But all this anecdotal
evidence against OOP doesn't have much effect. It's still
the standard taught in universities and the go-to technique
for most problems.
The first major OO language for PCs was Borland Turbo
Pascal 5.5, introduced in 1989. Oh, sure, there were a few
C++ compilers before that, but Turbo Pascal was the
language for MS-DOS in the 1980s, so it was the first
exposure to OOP for many people. In Borland's magazine ads,
inheritance was touted as the big feature, with an example
of how different variants of a sports car could be derived
from a base model. What the ads didn't mention at all was
encapsulation or modularity, because Turbo Pascal
programmers already knew how to do that in earlier,
pre-object versions of the language.
In the years since then, the situation has reversed.
Inheritance is now the iffiest part of the object-oriented
canon, while modularity is everything. In OOP-first
curricula, objects are taught as the method of
achieving modularity, to the point where the two have
become synonymous. It occurred to me that there are now a
good many coders who've never stopped to think about how
modularity and encapsulation could work in C++ without
using classes, so that's what I want to do now.
Here's a classic C++ method call for an instance of a
class called mixer:
m->set_volume(0.8);
m is a pointer to an instance of the mixer class.
set_volume is the method being called. Now here's what this
would look like in C++ without using objects:
mixer_set_volume(m, 0.8);
This is in a file called mixer.cpp, where all the functions
have the mixer_ prefix and take a pointer (or reference) to
a variable of the mixer type. Instead of new mixer you call
mixer_new. It might take a moment to
convince yourself, but barring some small details, these
two examples are the same thing. You don't need OOP to do
basic encapsulation.
(If you're curious, the pre-object Turbo Pascal version
is almost the same:
mixer.set_volume(m, 0.8);
mixer is not an object, but the name of the module, and the
dot means that the following identifier is
inside that module.)
Now the C++ mixer_set_volume
example above
is slightly longer than the class version, which I expect
will bother some people. The mild verbosity is not a bad
thing, because you can do simple text searches to find
everywhere that mixer_set_volume
is used.
There is no confusion from multiple classes having the same
methods. But if you insist, this is easy to remedy by
making it an overloaded function where the first parameter
is always of type mixer. Now you can simply say:
set_volume(m, 0.8);
I expect there are some people waiting to tell me I'm
oversimplifying things, and I know perfectly well that I'm
avoiding virtual functions and abstract base classes.
That's more or less my point: that while simple, this
covers the majority of use cases for objects, so teach
it first without having the speed bump of terminology
that comes with introducing OOP proper.
To extend my example a bit more, what if "mixer" is
something that there can only be one of, because it's
closely tied to the audio hardware? Just remove the first
parameter to all the function calls, and you end up
with:
mixer_set_volume(0.8);
You can teach this without ever using the word
"singleton."
(You might enjoy Part 1 and
Part 2 of this unintentional
trilogy.)
Death of a Language Dilettante
I used to try every language I came across. That
includes the usual alternatives like Scheme, Haskell, Lua,
Forth, OCaml, and Prolog; the more esoteric J, K, REBOL,
Standard ML, and Factor; and some real obscurities: FL,
Turing, Hope, Pure, Fifth. What I hoped was always that
there was something better than what I was using. If it
reduced the pain of programming at all, then that was a
win.
Quests for better programming languages are nothing new.
Around the same time I started tinkering with Erlang in the
late 1990s, I ran across a site by Keith Waclena, who was
having a self-described "programming language crisis." He
assigned point values to a
list of features and computed a score for each language
he tried. Points were given for static typing, local
function definition, "the ability to define new control
structures" and others.
There's a certain set of languages often chosen by
people who are outside of computer science circles: PHP,
JavaScript, Flash's ActionScript, Ruby, and some more
esoteric app-specific scripting languages like GameMaker's
GML. If I can go further back, I'll also include
line-numbered BASIC. These also happen to be some of the
most criticized languages by people who have the time for
that sort of thing. JavaScript for its weird scope rules
(fixed in ES6, by the way) and the strange outcomes from
comparing different types. Ruby for its loose typing and
sigils. PHP for having dozens of reserved keywords. BASIC
for its lack of structure.
This criticism is troubling, because there are clear
reasons for choosing these languages. Want to write
client-side web code? JavaScript. Using GameMaker? GML.
Flash? ActionScript. Picked up an Atari 130XE from the
thrift shop? BASIC. There's little thought process needed
here. Each language is the obvious answer to a question.
They're all based around getting real work done, yet
there's consistent agreement that these are the wrong
languages to be using.
If you veer off into discussions of programming language
theory (PLT), it quickly becomes muddy why one language is
better than another, but more importantly, as with Keith's
crisis, the wrong criteria are being used. Even
something as blatantly broken as the pre-ES6 scoping rules
in JavaScript isn't the fundamental problem it's made out
to be. It hasn't been stopping people from making great
things with the language. Can PLT even be trusted as a
field? And what criteria do you use for choosing a
programming language?
Does this language run on the target system that I
need it to? If the answer is no, end of discussion. Set
aside your prejudices and move on.
Will I be swimming against the current, not being
able to cut and paste from SDK documentation and get
answers via Google searches, if I choose this language? You
might be able to write a PlayStation 4 game in Haskell, but
should you?
Are the compiler and other tools pleasant to use,
quick, and reliable? Once I discovered that Modula-2 was cleaner than C and Pascal, I
wanted to use it. Unfortunately, there were fewer choices
for Modula-2 compilers, and none of them were as fast and
frustration-free as Turbo Pascal.
Am I going to hit cases where I am at the mercy of
the implementors, such as the performance of the
garbage collector or compile times for large projects? You
don't want to get in a situation where you need certain
improvements to the system, but the maintainers don't see
that as important, or even see it as against the spirit of
the language. You're not going to run into that problem
with the most heavily used toolsets.
Do I know that this is a language that will survive
the research phase and still be around in ten years?
Counterpoint: BitC.
Here's an experiment I'd like to see: give a language
with a poor reputation (JavaScript, Perl) to someone who
knows it passably well and--this is the key--has a strong
work ethic. The kind of person who'd jump in and start
writing writing a book rather than dreaming about being
famous novelist. Then let the language dilettante use
whatever he or she wants, something with the best type
system, hygenic macros, you name it. Give them both a
real-world task to accomplish.
As someone who appreciates what modern languages have to
offer, I really don't want this to be the case, but my
money is on the first person by a wide margin.
Evolution of an Erlang Style
I first learned Erlang in 1999, and it's still my go-to
language for personal projects and tools. The popular
criticisms--semicolons, commas, and dynamic typing--have
been irrelevant, but the techniques and features I use have
changed over the years. Here's a look at how and why my
Erlang programming style has evolved.
I came to Erlang after five years
of low-level coding for video games, so I was concerned
about the language being interpreted and the overhead of
functional programming. One of the reasons I went with
Erlang is that there's an easy correspondence between
source code and the BEAM virtual machine. Even more than
that, there's a subset of Erlang that results in optimal
code. If a function makes only tail calls and calls to
functions written in C, then parameters stay in fixed
registers even between functions. What looks like a lot of
parameter pushing and popping turns into destructive
register updates. This is one of the first things I wrote about here, back in
2007.
It's curious in retrospect, writing in that sort of
functional assembly language. I stopped thinking about it
once BEAM performance, for real problems, turned out to be
much better than I expected. That decision was cemented by
several rounds of hardware upgrades.
The tail-recursive list building pattern, with an
accumulator and a lists:reverse
at the end,
worked well with that primitive style, and it's a common
functional idiom. Now I tend to use a more straightforward
recursive call in the right hand side of the list
constructor. The whole "build it backward then reverse"
idea feels clunky.
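As a toy illustration (doubling every element, not code
from a real project), here are the two styles side by
side:

%% Classic accumulate-then-reverse idiom.
doubles_acc(List) -> doubles_acc(List, []).
doubles_acc([H|T], Acc) -> doubles_acc(T, [H*2 | Acc]);
doubles_acc([], Acc) -> lists:reverse(Acc).

%% Building the list directly in the right-hand side of the cons.
doubles([H|T]) -> [H*2 | doubles(T)];
doubles([]) -> [].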
For a small project I tried composing programs from
higher-level functions (map, filter, foldl, zip) as much as
possible, but it ended up being more code and
harder to follow than writing out the "loops" in straight
Erlang. Some of that is awkward syntax (including
remembering parameter order), but there are enough cases
where foldl
isn't exactly right--such as
accumulating a list and counting something at the same
time--that a raw Erlang function is easier.
List comprehensions, though, I use all the time. Here
the syntax makes all the difference, and there's no order
of parameters to remember. I even do clearly inefficient
things like:
lists:sum([X || {_,_,X} <- List]).
because it's simpler than foldl.
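For comparison, the foldl version of that sum would be
something like:

lists:foldl(fun({_, _, X}, Sum) -> X + Sum end, 0, List).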
I use funs--lambdas--often, but not to pass to functions
like map. They're to simplify code by reducing the number
of parameters that need to be passed around. They're also
handy for returning a more structured type, a sort of
simple object, again to hide unnecessary details.
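A small, hypothetical illustration of the
parameter-reducing fun: Lookup closes over a config list,
so one fun gets passed around instead of several values:

make_lookup(Config) ->
    fun(Key) -> proplists:get_value(Key, Config) end.

%% Lookup = make_lookup([{width, 800}, {height, 600}]),
%% Lookup(width)   %% => 800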
Early on I was also concerned about the cost of
communicating with external programs. The obvious method
was to use ports (essentially bidirectional pipes), but the
benchmarks under late-1990s Windows were not good. Instead
I used linked-in drivers, which were harder to get right
and could easily crash the emulator. Now I don't even think
about it: it's ports for everything. I rewrote a 2D action
game for OS X with the graphics and user input in an
external program and the main game logic in Erlang. The
Erlang code spawns the game "driver," and they communicate
via a binary protocol. Even at 60fps, performance is not an
issue.
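The setup looks roughly like this sketch; the executable
name and the 4-byte length-prefixed framing are
assumptions, not the actual game's protocol:

%% Spawn the external "driver" and talk to it over a port.
start_driver() ->
    open_port({spawn_executable, "./game_driver"},
              [binary, {packet, 4}, exit_status]).

send_to_driver(Port, Bin) when is_binary(Bin) ->
    port_command(Port, Bin).

wait_for_input(Port) ->
    receive
        {Port, {data, Bin}} -> {ok, Bin};
        {Port, {exit_status, _Status}} -> closed
    end.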
Fun vs. Computer Science
I've spent most of my career working on games, either
programming or designing them or both. Games are weird,
because everything comes down to this nebulous thing called
fun, and there's a complete disconnect between fun
and most technical decisions:
Does choosing C++14 over C++11 mean the resulting game
is more fun?
Does using a stricter type system mean the game is more
fun?
Does using a more modern programming language mean the
game is more fun?
Does favoring composition over inheritance mean the game
is more fun?
Now you could claim that some of this tech would be more
fun for the developer. That's a reasonable, maybe
even important point, but there's still a hazy at best
connection between this kind of "developer fun" and "player
fun."
A better argument is that some technologies may result
in the game being more stable and reliable. Those two terms
should be a prerequisite to fun, and even though people
struggle along--and have fun with--buggy games (e.g.,
Pokemon Go), I'm not going to argue against the importance
of reliability. Think about all the glitchiness and
clunkiness you experience every day, from spinning cursors,
to Java tricking you into installing the Ask toolbar, to an
app jumping into the foreground so you click on the wrong
thing. Now re-watch The Martian and pretend all the
computers in the movie work like your desktop PC. RIP Matt
Damon.
The one thing that does directly make a game more
fun is decreased iteration time. Interactive tweaking beats
a batch compile and re-launch every time, and great ideas
can come from on the fly experimentation. The productivity
win, given the right tools, is 10x or more, and I can't
emphasize this enough.
And yet this more rapid iteration, which is so important
to me, does not seem to be of generally great importance.
It's not something that comes up in computer sciencey
discussions of development technologies. There's much focus
on sophisticated, and slow, code optimization, but
turnaround time is much more important in my work. A
certain circle of programmers puts type systems above all
else, yet in Bret Victor's inspirational Inventing on Principle
talk from 2012, he never mentioned type systems, not once,
but oh that interactivity.
I realize that we're heading toward the ultimate
software engineer dream of making a type-checked change
that's run through a proven-correct compiler that does
machine-learning driven, whole program optimization...but
it's going the exact opposite of the direction I want. It's
not helping me in my quest for creating fun.
For the record, I just picked those buzzwords out of my
mind. I'm not criticizing static type checking or any of
those things, or even saying that they preclude interactive
iteration (see Swift's playgrounds, for example). They
might make things harder though, if they necessitate
building a new executable of the entire game for every
little change.
Interactivity, I may have to grudgingly accept, is not
trendy in computer science circles.
(If you liked this, you might enjoy You Don't Read Code, You Explore It.)
Optimizing for Human Understanding
Long ago, I worked on a commercial game that loaded a
lot of data from text files. Eventually some of these grew
to over a megabyte. That doesn't sound like a lot now, but
they were larger than the available buffer for decoding
them, so I looked at reducing the size of the files.
The majority of the data was for placement of 3D
objects. The position of each object was a three-element
floating point vector delimited by square brackets like
this:
[ 659.000000 -148.250000 894.100000 ]
An orientation was a 3x3 matrix, where each row was a
vector:
[ [ 1.000000 0.000000 0.000000 ]
[ 0.000000 1.000000 0.000000 ]
[ 0.000000 0.000000 1.000000 ] ]
Now this format looks clunky here, but imagine a text
file filled with hundreds of these. The six digits after
the decimal point were there to keep some level of
precision, but
in practice many values ended up being integers. Drop the
decimal point and everything after it, and the orientation
matrix becomes:
[ [ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ] ]
which is a big improvement. In the vector example,
there's "-148.250000" which isn't integral, but those last
four zeros don't buy anything. It can be reduced to
"-148.25".
The orientation still isn't as simple as it could be.
It's clearly an identity matrix, yet all nine values are
still specified. I ended up using this notation:
[ I ]
I also found that many orientations were simply
rotations around the up vector (as you would expect in a
game with a mostly flat ground plane), so I could reduce
these to a single value representing an angle, then convert
it back to a matrix at load time:
[ -4.036 ]
I don't remember the exact numbers, but the savings were
substantial, reducing the file size by close to half. At
the time the memory mattered, but half a megabyte is
trivial to find on any modern system. This also didn't
result in simpler code, because the save functions were now
doing more than just fprintf-ing values.
What ended up being the true win, and the reason I'd do
this again, is because it makes the data easier to visually
interpret. Identity matrices are easy to pick out, instead
of missing that one of the other values is "0.010000"
instead of "0.000000". Common rotations are clearly such,
instead of having to mentally decode a matrix. And there's
less noise in "0.25" than "0.250000" (and come to think of
it, I could have simplified it to ".25"). It's optimized
for humans.
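The original tool wasn't Erlang, but a sketch of the
save-side decision might look like the following. The
flat-list matrix representation is an assumption, and the
rotation-angle case is skipped:

-define(IDENTITY, [1.0,0.0,0.0, 0.0,1.0,0.0, 0.0,0.0,1.0]).

%% Identity matrices get the shorthand; everything else is written
%% out with only the digits that matter.
orientation_to_text(?IDENTITY) ->
    "[ I ]";
orientation_to_text([A,B,C, D,E,F, G,H,I]) ->
    io_lib:format("[ [ ~s ~s ~s ] [ ~s ~s ~s ] [ ~s ~s ~s ] ]",
                  [num(X) || X <- [A,B,C,D,E,F,G,H,I]]).

%% Integral values lose the decimal point entirely; recent Erlang
%% prints other floats in shortest form with ~p.
num(X) when X == trunc(X) -> integer_to_list(trunc(X));
num(X) -> io_lib:format("~p", [X]).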
(If you liked this, you might enjoy Optimization in the Twenty-First
Century.)
The New Minimalism
You don't know minimalism until you've spent time in the
Forth community. There are recurring
debates about whether local variables should be part of the
language. There are heated discussions about how scaled
integer arithmetic is an alternative to the complexity of
floating point math. I don't mean there were those
debates back in the day; I mean they still crop up now and
again.
My history with Forth and stack machines explains the
Forth mindset better than I can, but beware: it's a warning
as much as a chronology.
Though my fascination with Forth is long behind me, I
still tend toward minimalist programming, but not in the
same, extreme, way. I've adopted a more modern approach to
minimalism:
Use the highest-level language that's a viable
option.
Lean on the built-in features that do the most
work.
Write as little code as possible.
The "highest-level language" decision means you get as
much as possible already done for you: arbitrary length
integers, unicode, well-integrated data structures, etc.
Even better are graphics and visualization capabilities,
such as in R or JavaScript.
"Lean on built-in features," means that when there's a
choice, prefer the parts of the system that are both
fast--written in C--and do the most work. In Perl, for
example, you can split a multi-megabyte string into many
pieces with one function call, and it's part of the C
regular expression library. Ditto for doing substitutions
in a large string. In Perl/Python/Ruby, lean on
dictionaries, which are both flexible and heavily
optimized. I've seen Python significantly outrun C, because
the C program used an off-the-cuff hash table
implementation.
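The same idea carries over to Erlang: binary:split and maps
do their work down in the runtime. A sketch that counts
duplicate lines in a large binary:

count_lines(Bin) ->
    Lines = binary:split(Bin, <<"\n">>, [global]),
    lists:foldl(fun(Line, Counts) ->
                    maps:update_with(Line, fun(N) -> N + 1 end, 1, Counts)
                end, #{}, Lines).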
I've been mostly talking about interpreted languages,
and there are two ways to write fast interpreters. The
first is to micro-optimize the instruction fetch/dispatch
loop. There are a couple of usual steps for this, but
there's only so far you can go. The second is to have each
instruction do more, so there are fewer to fetch and
dispatch. Rule #2 above is taking advantage of the
latter.
Finally, "write as little code as possible." Usual
mistakes here are building a wrapper object around an array
or dictionary and representing simple types like a
three-element vector as a dictionary with x, y, and z keys,
or worse, as a class. You don't need a queue class; you've
already got arrays with ways to add and remove elements.
Keep things light and readable at a glance, where you don't
have to trace into layers of functions to understand what's
going on. Remember, you have lots of core language
capabilities to lean on. Don't insist upon everything being
part of an architecture or framework.
This last item, write less code, is the one that the
other two are building toward. If you want people to be
able to understand and modify your programs--which is the
key to open source--then have less to figure out. That
doesn't mean fewer characters or lines at all costs. If you
need a thousand lines, then you need a thousand lines, but
make those thousand lines matter. Make them be about the
problem at hand and not filler. Don't take a thousand lines
to write a 500 line program.
(If you liked this, you might enjoy The Software Developer's Sketchbook.)
Being More Than "Just the Programmer"
There's a strange dichotomy that college doesn't prepare
computer science majors for: knowing how to program
is a huge benefit if you want to
create something new and useful, but as a programmer
you're often viewed as the implementer of someone else's
vision--as just the programmer--and have limited say in
crafting the application as a whole.
(Note that here I'm using "application as a whole" to
mean the feature set and experience of using the app, not
the underlying architecture.)
In my first game development job
writing 16-bit console games, I naively expected there to
be a blend of coding and design, like there was when I was
writing my own games for home computer magazines, but the
two were in different departments in different locations in
the rented office park space. It had never occurred to me
that a game could be failing because of poor design, yet I
wouldn't be able to do anything about it, not having a
title with "design" in it. I came out of that experience
realizing that I needed to be more than an implementer.
I wanted to write up some tips for people in similar
situations, people who want to be more than just the
programmer.
Go through some formalities to prove that you have
domain knowledge. You might think you know how
to design good user interfaces, but why should anyone
listen to you? Buy and read the top books in the field,
have them at your desk, and use them to cite guidelines. Or
take a class, which might be less efficient than reading on
your own, but it's concrete and carries more weight than
vague, self-directed learning.
Don't get into technical details when it doesn't
matter. "Why is that going to take three weeks to
finish?" "Well, there's a new version of the library that
doesn't fully work with the C++11 codebase that we're using
so I'm going to have to refactor a few classes, and also
there are issues with move semantics in the new compiler,
so..." No matter how you say this, it sounds like
complaining, and you get a reputation as the programmer who
spouts technical mumbo jumbo. Sure, talk tech with the
right people, but phrase things in terms of the
project--not the code--otherwise.
Don't get into programming or technology arguments,
ever. Just don't. Again, this is usually thinking on
the wrong level, and you don't want to advertise that.
There's also this Tony Hoare
quote that I love:
You know, you shouldn't trust us intelligent
programmers. We can think up such good arguments for
convincing ourselves and each other of the utterly
absurd.
Get to know people in departments whose work
interests you. Continuing the user interface example
from above, go talk to the UX folks. Learn what they like
and don't like and why they've made certain decisions.
They'll be glad that someone is taking an interest, and
you'll be learning from people doing the work
professionally.
Build prototypes to demonstrate ideas. If you
jump in and do work that someone else is supposed to do,
like changing the game design, then that's not going to
turn out well. A better approach is to build a small
prototype of a way you think something should work and get
feedback. Take the feedback to heart and make changes based
on it (also good, because you're showing people you value
their opinions). Sometimes these prototypes will fall flat,
but other times you'll have a stream of people stopping by
your desk to see what they've heard about.
Picturing WebSocket Protocol Packets
(I'm using JavaScript in this article. If you're
reading this via the news feed, go to the original version to see the missing
parts.)
I recently wrote a WebSocket server in Erlang. I've
gotten fond of separating even desktop apps into two
programs: one to handle the graphics and interface, and one
for the core logic, and they communicate over a local
socket. These days it makes sense to use a browser for the
first of these, with a WebSocket connecting it to an
external program. The only WebSocket code I could find for
Erlang needed existing web server packages, which is why I
wrote my own.
The WebSocket
spec contains this diagram to describe the messages
between the client and server:
0 1 2 3
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len | Extended payload length |
|I|S|S|S| (4) |A| (7) | (16/64) |
|N|V|V|V| |S| | (if payload len==126/127) |
| |1|2|3| |K| | |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
4 5 6 7
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
| Extended payload length continued, if payload len == 127 |
+ - - - - - - - - - - - - - - - +-------------------------------+
8 9 10 11
+ - - - - - - - - - - - - - - - +-------------------------------+
| |Masking-key, if MASK set to 1 |
+-------------------------------+-------------------------------+
12 13 14 15
+-------------------------------+-------------------------------+
| Masking-key (continued) | Payload Data |
+-------------------------------- - - - - - - - - - - - - - - - +
: Payload Data continued ... :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
| Payload Data continued ... |
+---------------------------------------------------------------+
This is a confusing diagram for a number of reasons.
The ASCII art, for example, makes it hard to see which
lines contain data and which are for byte numbers. When I
first looked at it, it made me think there was more
overhead than there actually is. That's unfortunate,
because there's a simplicity to WebSocket protocol packets
that's hard to extract from the above image, and that's
what I want to demonstrate.
Here's the fixed part of the header, the 16-bits that
are always present. This is followed by additional info, if
needed, then the data itself. The number of bits is shown
below each field. You should keep coming back to this for
reference.
[See the original or enable JavaScript.]
F = 1 means this is a complete, self-contained packet.
Assume it's always 1 for now. The main use of the opcode
(Op) is to specify if the data is UTF-8 text or binary. M =
1 signals the data needs to be exclusive or-ed with a
32-bit mask. The length (Len) has three different encodings
depending on how much data there is.
Messages to the server are required to have a mask, so
here's what packets look like for each of the three length
encodings.
[See the original or enable JavaScript.]
The first has a length of 60 bytes, the second 14,075,
and the third 18,000,000. Special escape values for the 7
bit Len field indicate the presence of additional 16 or 64
bit length fields.
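Erlang's bit syntax makes the fixed header and the three
length encodings nearly a direct transcription of the
diagram. This is a simplified sketch (masking and
fragmentation ignored), not the server's actual code:

%% 126 in the 7-bit Len field means a 16-bit extended length follows;
%% 127 means a 64-bit extended length; anything smaller is the length.
decode_header(<<Fin:1, _Rsv:3, Opcode:4, Mask:1, 126:7, Len:16, Rest/binary>>) ->
    {Fin, Opcode, Mask, Len, Rest};
decode_header(<<Fin:1, _Rsv:3, Opcode:4, Mask:1, 127:7, Len:64, Rest/binary>>) ->
    {Fin, Opcode, Mask, Len, Rest};
decode_header(<<Fin:1, _Rsv:3, Opcode:4, Mask:1, Len:7, Rest/binary>>)
  when Len < 126 ->
    {Fin, Opcode, Mask, Len, Rest}.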
Packets from the server to the client don't use the
mask, so the headers are shorter. Again, for the same three
data lengths:
[See the original or enable JavaScript.]
The remaining part is what fragmented messages look
like. The F bit is 1 only for the Final packet. The initial
packet contains the opcode; the others have 0 in the opcode
field.
[See the original or enable JavaScript.]
This message is 8256 bytes in total: two of 4096 bytes
and one of 64. Notice how different length encodings are
used, just like in the earlier examples.
(If you liked this, you might enjoy Exploring Audio Files with Erlang.)
Learning to Program Without Writing the Usual Sort of
Code
There's much anecdotal evidence, from teachers of
beginning programming classes, that many people can't come
to grips with how to program. Sticking points can be as
fundamental as not being able to break a problem down into
a series of statements executed one after another, or
struggling with how variables are updated and have
different values at different points in the program.
I don't think it's quite as straightforward as that,
because there are real life analogs for both of these
sticking points. Clearly you have to go into the restaurant
before you can sit at the table, then you order, eat, pay
the bill, and leave. Everyone gets that (and knows why you
don't sit at the table before going to the restaurant).
When you pay for the meal, the money you have is decreased,
and it stays that way afterward. The difference with code
is that it's much more fine-grained, much more particular,
and it's not nearly so easy to think about.
If you think that I'm oversimplifying, here's a little
problem for you to code up: Write a program that, given an
unordered array, finds the second largest value.
(I'll wait for you to finish.)
You could loop through the array and see if each element
is greater than Largest. If that's true, then set
NextLargest to Largest, and Largest to the current element.
Easy. Except that this doesn't work if you find a value
smaller than Largest but greater than NextLargest. You need
another check for that
(did you get that right?). I'm not saying this is a hard
problem, just a little tricky, and a hard one for beginners
to think about. Even in the first, simple case you have to
get the two assignments in the right order, or both
variables end up with the same value.
Set aside that kind of programming for a bit, and let's
look at other ways of solving the "second largest" problem.
Remember, nowhere in the description does it say anything
about performance, so that's not a concern.
Here's the easiest solution: Sort the array from largest
to smallest and take the second element.
There's a little extra housekeeping for this to be
completely correct (what if the length of the array is 1?),
but the solution is still trivial to think about. It's two
steps. No looping. No variables. Of course you don't write
the sort function; that's assumed to exist.
If you're not buying the sort, here's another: Find the
largest value in the array, remove it, then find the
largest value in the updated array. This sounds like
looping and comparisons (even though finding the largest
element is an easier problem than the second largest), but
think about it in terms of primitive operations that should
already exist: (1) finding the largest value in an array,
and (2) deleting a value from an array. You could adjust
those so you're getting the index of the largest value and
deleting the element at an index, but the naive version is
perfectly fine.
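In Erlang, leaning on primitives, both solutions fit in a
line each; the guards are the bit of housekeeping mentioned
above:

%% Sort descending (sort then reverse) and take the second element.
second_largest_by_sort(L) when length(L) >= 2 ->
    lists:nth(2, lists:reverse(lists:sort(L))).

%% Find the largest, remove one occurrence of it, find the largest again.
second_largest_by_delete(L) when length(L) >= 2 ->
    lists:max(lists:delete(lists:max(L), L)).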
What I'm getting at is that thinking about problems
given a robust set of primitives to work with is
significantly easier than the procedural coding needed to
write those primitives in the first place. Yet
introductions to programming are focused almost exclusively
on the latter.
As much as I like functional programming, I don't think
it gets this right. The primitives in most functional
languages are based around maps and folds and zip-withs,
all of which require writing small, anonymous functions as
parameters. Now if a fold that adds up all the integers in
an array is named "sum" (as in at least Haskell and
Erlang), then that's a solid, non-abstract primitive to
think about and work with.
Long-time readers will expect me to talk about array
languages like J at this point, and I won't disappoint. The
entire history of array languages has been about finding a
robust and flexible set of primitives for manipulating
data, especially lists of numbers. To be completely fair, J
falls down in cases that don't fit that model, but it's a
beautiful system for not writing code. Instead you
interactively experiment with a large set of pre-written
verbs (to use Ken Iverson's term).
J or not, this kind of working exclusively with
primitives may be a good precursor to traditional
programming.
(If you liked this, you might enjoy Explaining Functional Programming to
Eight-Year-Olds.)
Progress Bars are Surprisingly Difficult
We've all seen progress bars that move slowly for twenty
minutes, then rapidly fill up in the last 30 seconds. Or
the reverse, where a once speedy bar takes 50% of the time
covering the last few pixels. And bars that occasionally
jump backward in time are not the rarity you'd expect them
to be.
Even this past month, when I installed the macOS Sierra
update, the process completed when the progress bar was
only two-thirds full. DOOM 2016 has a circular progress
meter for level loads, with the percent-complete in the
center. It often sits for a while at 0%, gets stuck at 74%
and 99%, and sometimes finishes in the 90s before reaching
100%.
Clearly this is not a trivial problem, or these quirks
would be behind us.
Conceptually, a perfect progress bar is easy to build.
All you need to know is exactly how long the total
computation will take, then update the bar in its own
thread so it animates smoothly. Simple! Why do developers
have trouble with this? Again, all you need to know is
exactly how long...
Oh.
You could time it with a stopwatch and use that value,
but that assumes your system is the standard, and that
other people won't have faster or slower processors,
drives, or internet connections. You could run a little
benchmark and adjust the timing based on that, but there
are too many factors. You could refine the estimate
mid-flight, but this is exactly the road that leads to the
bar making sudden jumps into the past. It's all dancing
around that you can't know ahead of time exactly how long
it should take for the progress bar to go from empty to
full.
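To see why, consider a percentage computed from a running estimate of the total time (a made-up example, not any particular progress bar):
percent_done(ElapsedMs, EstimatedTotalMs) ->
    min(100, 100 * ElapsedMs div EstimatedTotalMs).
%% percent_done(30000, 60000) is 50, but revise the estimate upward
%% mid-flight and percent_done(30000, 120000) drops back to 25.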
There's a similar problem in process scheduling, where
there are a number of programs to run sequentially in batch
mode. One program at a time is selected to run to
completion, then the next. If the goal is to have the
lowest average time for programs being completed, then the
best choice for the next program to run is the one with the
shortest execution time (see shortest
job next). But this requires knowing how long each
program will take before running it, and that's not
possible in the general case.
And so the perfect progress bar is forever out of reach,
but they're still useful, as established by Brad Allan
Myers in his 1985 paper ("The importance of percent-done
progress indicators for computer-human interfaces"). But
"percent-done" of what? It's easy to map the loading of a
dozen similarly sized files to an overall percentage
complete. Not so much when all kinds of downloading and
local processing are combined into a single progress
number. At that point the progress bar loses all meaning
except as an indication that there's some sort of movement
toward a goal, and that most likely the application hasn't
locked up.
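For what it's worth, that easy case--a handful of similarly sized files--looks something like this sketch (my example, which assumes each file is an equal share of the work, exactly the assumption that falls apart once downloads and local processing share one bar):
load_with_progress(Files, Report) ->
    Total = length(Files),
    lists:foldl(fun(File, Done) ->
        {ok, _Bin} = file:read_file(File),
        Report(100 * (Done + 1) div Total),  %% report integer percent complete
        Done + 1
    end, 0, Files).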
(If you liked this, you might enjoy An Irrational Fear of Files on the
Desktop.)
Writing Video Games in a Functional Style
When I started this blog in 2007, a running theme was
"Can interactive experiences like video games be written in
a functional style?" These are programs heavily based
around mutable state. They evolve, often drastically,
during development, so there isn't a perfect up-front
design to architect around. These were issues curiously
avoided by the functional programming proponents of the
1980s and 1990s.
It's still not given much attention in 2016 either. I
regularly see excited tutorials about mapping and folding
and closures and immutable variables, and even JavaScript
has these things now, but there's a next step that's rarely
discussed and much more difficult: how to keep the benefits
of immutability in large and messy programs that could gain
the most from functional solutions--like video games.
Before getting to that, here are the more skeptical
functional programming articles I wrote, so it doesn't look
like I'm a raving advocate:
I took a straightforward, arguably naive, approach to
interactive functional programs: no monads (because I
didn't understand them), no functional-reactive programming
(ditto, plus all implementations had severe performance
problems), and instead worked with the basic toolkit of
function calls and immutable data structures. It's
completely possible to write a video game (mostly) in that
style, but it's not a commonly taught methodology. "Purely
Functional Retrogames" has most of the key lessons, but I
added some additional techniques later:
The bulk of my experience came from rewriting a 60fps 2D
shooter in mostly-pure Erlang. I wrote about it in An Outrageous Port, but there's not much
detail. It really needed to be a multi-part series with
actual code.
For completeness, here are the other articles that
directly discuss FP:
If I find any I missed, I'll add them.
So Long, Prog21
I always intended "Programming in the 21st Century" to
have a limited run. I'd known since the Recovering Programmer entry from January 1,
2010, that I needed to end it. It just took a while.
And now, an explanation.
I started this blog to talk about issues tangentially
related to programming, about soft topics like creativity
and inspiration and how code is a medium for implementing
creative visions. Instead I worked through more technical
topics that I'd been kicking around over the years. That
was fun! Purely Functional Retrogames
is something I would have loved to read in 1998. More than
once I've googled around and ended up back at one of my
essays.
As I started shifting gears and getting back toward what
I originally wanted to do, there was one thing that kept
bothering me: the word programming in the title.
I don't think of myself as a programmer. I write code,
and I often enjoy it when I do, but that term
programmer is both limiting and distracting. I don't
want to program for its own sake, without caring about
the overall experience of what I'm creating. If I start
thinking too much about programming as a distinct entity
then I lose sight of that. Now that I've exhausted what I
wanted to write about, I can clear those topics out of my
head and focus more on using technology to make fun
things.
Thanks for reading!
It's hard to sum up 200+ articles, but here's a start.
This is not even close to a full index. See the archives if you want everything. (There
are some odd bits in there.)
widely linked
popular
on creativity
others that I like
Erlang
retro
Also see the previous entry for
all of the functional programming articles.
Programming as
if Performance Mattered is something I wrote in 2004
which used to be linked from every prog21 entry.