26 October 2006

Ignoring tail call recursion...

I'm feeling bitchy right now, so if you don't want to hear my venting on tail call optimization, move along.

This bitch is specifically directed at the code in Bruce Tate's article on IBM's developerWorks site, Concurrent programming with Erlang, but the exact same code shows up all over the place.

The problem is, when folks are trying to explain functional programming they love to use the Fibonacci and factorial algorithms to demonstrate how cleanly the mathematical definition translates into functional code. Invariably, you see something like this:

factorial(0) -> 1;
factorial(N) -> N * factorial(N-1).

And we all say, "wow, that's really cool how concisely you've written that function". But then we also say, "wait a minute, that recursive call is going to cause your stack to grow and grow, that ain't cool". And if you are thinking that, you're totally justified in doing so, because you're right. That really isn't cool. That's because the author, in an effort to introduce as few new concepts as possible, is leaving out the standard use of an accumulator to allow for tail call optimization.
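Just to make the problem concrete, here's how factorial(4) expands with the definition above. Every one of those pending multiplications has to sit in a stack frame, waiting, until the innermost call returns:

factorial(4)
= 4 * factorial(3)
= 4 * (3 * factorial(2))
= 4 * (3 * (2 * factorial(1)))
= 4 * (3 * (2 * (1 * factorial(0))))
= 4 * (3 * (2 * (1 * 1)))
= 24

Call factorial(100000) and you've got a hundred thousand frames piled up before anything unwinds.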

Okay, if you don't already know what tail call optimization is, I'm sure you're dying to know now. It's quite a simple concept, but it's really important if you want to get the most out of your recursive functions. When the last thing a function does is call another function, and nothing in the current stack frame is needed after that call, the runtime can release (or reuse) that frame before making the call. That frees the frame's resources for the garbage collector, and it means you can recurse as deeply as you like without overflowing the stack. Let me show you what I mean with some code.

factorial(0) -> 1;
factorial(N) -> factorial(N, N-1).

factorial(Accum, 0) -> Accum;
factorial(Accum, N) -> factorial(Accum * N, N-1).

The function that clients call is still the single-argument version of factorial, but now we've introduced a new two-argument function (btw, in Erlang, the number of arguments a function takes is called its "arity"). Looking at the two versions of the function, you'll see that instead of the recursive call "N * factorial(N-1)", we take that N that's hanging out outside of the call and put it within the function's arguments. This one small change allows the compiler/virtual machine/whoever to look at the last function called, see that nothing left in the current stack frame will be used in the future, and forget (gc) that frame. With the hanging reference to N in the first version, the stack frame could not be eliminated, because down the line, when the recursive call returned, it would still need that value of N.

You can test this out by executing the two versions of factorial. On my machine, trying to evaluate factorial(100000) crashed the non-optimized version, while the optimized version ran just fine (Erlang also handles numbers of arbitrary size, so factorial(100000) spits out a crazy big integer value).
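Contrast the earlier expansion of factorial(4) with how the optimized version expands. Everything it needs rides along in the arguments, so the expansion stays flat:

factorial(4)
factorial(4, 3)
factorial(12, 2)
factorial(24, 1)
factorial(24, 0)
24

If you want to run the comparison yourself, the functions need to live in a module. Here's a minimal sketch; the module name fact and the name factorial_naive for the non-optimized version are my own choices:

-module(fact).
-export([factorial/1, factorial_naive/1]).

% non-optimized version: the pending "N *" keeps each frame alive
factorial_naive(0) -> 1;
factorial_naive(N) -> N * factorial_naive(N-1).

% optimized version: the running product rides along as the accumulator
factorial(0) -> 1;
factorial(N) -> factorial(N, N-1).
factorial(Accum, 0) -> Accum;
factorial(Accum, N) -> factorial(Accum * N, N-1).

Compile with c(fact). in the Erlang shell, then try fact:factorial(100000). and fact:factorial_naive(100000). for yourself.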

Once you have tail call optimization you can start to see how functional languages get away with not having for loops (a pure functional language can't have a for loop, because a for loop works by incrementing a variable, and variables can only be assigned once in FP). So, instead of saying:

for(int i = 0; i < 50; i++){ do_something(i); }

we would say:

for_loop(I, Max) when I < Max -> do_something(I), for_loop(I + 1, Max);
for_loop(_I, _Max) -> ok.    % I >= Max, so the loop is done

Here, for_loop is just a function that uses an accumulator (I) and a max value. Since that last call to for_loop doesn't require any information besides what's in its arguments, the current stack frame can be removed. Without tail call optimization the iterative version of the for loop is clearly superior to the recursive version, but with optimization the two are basically equivalent.
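Here's that loop as a runnable sketch. The module name loops and the io:format body of do_something are my inventions, just to give the loop something to do:

-module(loops).
-export([for_loop/2]).

% stand-in for real work: just print the counter
do_something(I) -> io:format("~p~n", [I]).

for_loop(I, Max) when I < Max ->
    do_something(I),
    for_loop(I + 1, Max);    % tail call: nothing in this frame is needed afterward
for_loop(_I, _Max) -> ok.

Calling loops:for_loop(0, 50). prints 0 through 49 and returns ok, just like the C-style loop above.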
