
Tail-call elimination applies to any function which calls another function as the last action.

Turning this into a loop is only possible when the function calls itself, not when it calls another function, and not (naively) when two functions call each other in the tail position.




Understood, but in practice how often does that really happen? (I'm aware of the contrived 'odd'/'even' example always given, but in the real world, I never had anything like that). Even when I wrote more functional code I rarely (if ever?) did that. 98-99% of the time it was the same function calling itself. Is your experience different?


Iteratees are immensely useful in practice and they involve doing this all the time (basically you do stream processing with a source and a sink that are mutually recursive - the source emits an element by calling the sink to process it, the sink processes it and then recurses into the source for the next element. It feels a bit forced if you describe it like that, but it gives you a model with lots of natural structure - it's the only approach to stream processing I've found where you can have a value that represents a stage in the middle of processing a stream (not just a 1:1 function but something that does grouping or suspension or the like) in a natural way, that you can then test standalone etc.).
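A minimal sketch of that shape (in Python, with hypothetical names, and a tiny input since Python lacks tail-call elimination): the source's last action is calling the sink, and the sink's last action is calling back into the source.

```python
def source(items, total):
    # Emit the next element by tail-calling the sink, or finish.
    if not items:
        return total
    return sink(items[0], items[1:], total)

def sink(item, rest, total):
    # Process one element (here: accumulate it), then tail-call
    # back into the source for the next element.
    return source(rest, total + item)

print(source([1, 2, 3, 4], 0))  # sums the stream: 10
```

With tail-call elimination the two functions run in constant stack; without it, every element costs two stack frames.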


> the source emits an element by calling the sink to process it, the sink processes it and then recurses into the source for the next element

I think I understood that concept but it still seems a bit strange to me. Doesn't that imply a potentially infinite stack of function calls?

I've seen this idea in other functional program examples as well. Instead of having a sequence of instructions, they have a sequence of function calls. Instead of a function returning to the caller, it calls the next function in the program which is analogous to the virtual machine moving on to the next line of code in the sequence. It implies the program's structure and state is actually expressed within the function call stack and its frames. I admit I'm not sure what the purpose of this is.


> I think I understood that concept but it still seems a bit strange to me. Doesn't that imply a potentially infinite stack of function calls?

Yes, which is exactly why tail calls are so important. Otherwise you'd stack overflow after processing the first few thousand elements of your stream.

> I've seen this idea in other functional program examples as well. Instead of having a sequence of instructions, they have a sequence of function calls. Instead of a function returning to the caller, it calls the next function in the program which is analogous to the virtual machine moving on to the next line of code in the sequence. It implies the program's structure and state is actually expressed within the function call stack and its frames. I admit I'm not sure what the purpose of this is.

Well, if you want to express your function as an actual function, in the mathematical sense, recursion is often the natural way to do it. Things like the call stack are an implementation detail; often a natural way to describe a function is "do this, do that, then do the same thing for all the rest of it", and usually the easiest way to translate that into a program is recursion.

It also tends to be more compositional, because you don't have to worry about control flow (or at least, control flow isn't a special case: all your control flow is just function calls). For example if you are calling several nested functions in an imperative-style loop, there's no way to break out of that loop from the inner function. But if you write the same thing as several mutually recursive functions, you can "break out of the loop" by just not recursing under particular conditions.
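A sketch of that idea (Python, hypothetical names): the "loop" below stops as soon as a negative element is seen, simply by returning instead of recursing.

```python
def step(xs, acc):
    # "Break out of the loop" by not recursing.
    if not xs or xs[0] < 0:
        return acc
    return accumulate(xs, acc)

def accumulate(xs, acc):
    # Do one iteration's work, then tail-call back into step.
    return step(xs[1:], acc + xs[0])

print(step([1, 2, -1, 5], 0))  # stops at -1, prints 3
```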


Thanks, I think I have a better grasp now.


> Understood, but in practice how often does that really happen?

In functional code written where general tail call elimination (or special tail call operations) is available, non-recursive or mutually recursive tail calls are not uncommon.

Heck, non-recursive (potentially fairly nested) tail calls are common in languages without any optimization for it. Anytime you see something like “return f(x)” (or, because operators are implemented via special methods, even “return x+y”) in Python, for instance, you've got a tail call.

If it isn't direct or mutual recursion, it has to be a pretty degenerate case to blow the stack even without elimination, so elimination is less necessary for those cases. But even those cases benefit from not having the extra memory usage and from reduced pushing stuff onto and popping it off of the stack.


Happens a fair bit of time when parsing deeply nested structures (e.g. parsing deeply nested JSON structures) using mutually recursive parsers. I've generally resorted to either trampolining or explicit stacks.
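For instance, a trampoline (sketched here in Python with hypothetical names, not the commenter's actual parser): each would-be tail call returns a thunk, and a flat loop runs the thunks, so the mutually recursive "calls" never grow the native stack.

```python
def trampoline(thunk):
    # Keep invoking thunks until a non-callable result appears.
    while callable(thunk):
        thunk = thunk()
    return thunk

def count_open(s, i, depth):
    # Mutually "recursive" with count_close, but via returned thunks.
    if i < len(s) and s[i] == "[":
        return lambda: count_open(s, i + 1, depth + 1)
    return lambda: count_close(s, i, depth)

def count_close(s, i, depth):
    if i < len(s) and s[i] == "]":
        return lambda: count_close(s, i + 1, depth)
    return depth

deep = "[" * 50_000 + "]" * 50_000
print(trampoline(lambda: count_open(deep, 0, 0)))  # 50000, no overflow
```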


This is usually a case where TCE doesn't help, in my experience, because most of the calls to sub-parsers aren't in tail position. This is especially unpleasant when you get bug reports of mysterious segfaults that turn out to be stack overflows...


Yep, a basic security check for any endpoint that accepts JSON: what happens when you hand it 4KiB of '[' characters?


> parsing deeply nested JSON structures

How likely is it that the depth of the JSON structure would exceed the maximum depth of the stack? This doesn't appear to be a compelling example to me.

It probably is a shame that it's called tail call optimisation since nobody wants to mandate low level details for an optimisation.


> How likely is it that the depth of the JSON structure would exceed the maximum depth of the stack?

Not that unlikely I'd say, at least if the stack size is unreasonably small.

For example, on nodejs, JSON.stringify() chokes with a nested array of 10K depth:

  > N=10_000; JSON.stringify(JSON.parse("[".repeat(N) + "]".repeat(N)))
  Uncaught RangeError: Maximum call stack size exceeded
      at JSON.stringify (<anonymous>)
Curiously, JSON.parse() is OK with arrays of 10M depth (and probably more). A difference of over 3 orders of magnitude between functions that seem pretty symmetrical at first glance.

Imagine that you build some software that lets users nest things —e.g., a circuit designer, a visual programming language, a UI builder, etc.— and you want to save these nested structures as JSON, or process them recursively. Pretty natural fit, I'd say. Now, as a user, I'd be pretty upset if my magnum opus of a very complex circuit, or giant visual program, or very busy beast of UI art, starts crashing the program when exceeding a mere 10K elements. My computer has gigabytes of memory! And, presumably, there's a lot still available. But nope, a meager stack can ruin all the fun.

I personally think this is a pretty silly limitation to have on software, especially given today's memory sizes.


TCO likely wouldn’t help here, as the parser is probably using non-tail recursion to build the nested arrays. (I haven’t looked at the code, but that’s the natural way to do it with a recursive descent algorithm.)


It’s a security/DOS vulnerability if the object depth can be used to trigger stack overflow.


For example pretty much any higher-order function that wants to do some decision-making and then hand over execution to one of several functions provided will want to hand over execution by means of a tail call, so that it doesn't unnecessarily change stack complexity for whatever algorithm is being run around that piece of code.
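A tiny sketch of that shape (Python, hypothetical names): the dispatcher's last action is invoking whichever handler was chosen, so in a language with tail-call elimination it contributes no lasting frame of its own.

```python
def dispatch(value, on_even, on_odd):
    # Decide, then hand over execution with a tail call.
    if value % 2 == 0:
        return on_even(value)
    return on_odd(value)

print(dispatch(4, lambda v: v // 2, lambda v: 3 * v + 1))  # 2
```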


One example would be in a parser, as various parsing functions would be invoked recursively until end of input, or until some non-matching input is encountered. In pretty much any scenario where you would use recursive functions, it is quite common to have mutual recursion between multiple functions. Not always of course, which is where the tail call elimination optimization is typically applied, but not being able to freely use mutual recursion to arbitrary depth turns out to be quite an annoying limitation.


State machines can often be implemented with a set of functions that each handle a single state and then tail-call into the function for the next state.
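For example (a Python sketch with made-up states; Python lacks TCE, so stack depth grows with input length here, whereas with elimination it would run in constant stack): each state is a function that consumes one input symbol and tail-calls the function for the next state.

```python
def state_a(chars, count):
    # Outside a "b...a" pair.
    if not chars:
        return count
    if chars[0] == "b":
        return state_b(chars[1:], count)   # transition: a tail call
    return state_a(chars[1:], count)       # stay: also a tail call

def state_b(chars, count):
    # Saw a "b"; waiting for the next "a".
    if not chars:
        return count
    if chars[0] == "a":
        return state_a(chars[1:], count + 1)
    return state_b(chars[1:], count)

print(state_a(list("abbaaba"), 0))  # counts b-to-a transitions: 2
```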


This is just a fancy form of spaghetti code. Don't do it unless you actually need the performance.


State machines are known for a predilection to use `goto` extensively, but that does not make them 'spaghetti code'.

A state machine is, in fact, a well-understood method of applying structure to spaghetti.


The point is that tail calls obscure the transitions between states by spreading them everywhere.


I'm not too clear on how one would implement a state machine without "spreading the transitions everywhere". Can you share a link to some example code showing the format you prefer?

Typically, we'd see the code that implements the logic for each state gathered together. Since this is where the outgoing transitions get decided, the transitions end up being distributed across their originating states.


erlang/otp gen_fsm maybe?

you define a function with a clause per state (and some other args including the incoming message), and each clause returns a tuple with the new state and some other stuff. the loop is part of the framework, and handles generic otp stuff like updating code in a running system.


That doesn't compute for me. You'd need to give an example for me to understand you.


> 98-99% of the time it was the same function calling itself

Well, if the same function is in the "tail", tail call elimination would still apply.

But a lot of the time tail call elimination is only needed in some extreme cases; it can eliminate stack overflow exceptions by a big amount. C#'s new System.Text.Json would greatly benefit from tail call ops, because it often has a call chain that is recursive but does not call itself (it's mostly like ReadObject, ReadObjectAsync, ReadObject, ReadBla, ReadObject, etc.), and if the object has a ref to itself, System.Text.Json often blows up.

In Java and Scala this can, by the way, also happen (and I've already seen it); however, at least you can configure the stack size there and thus most often mitigate the problem somewhat.

It's an edge case, but one that would be cool to fix, since it often makes code more reasonable, especially parsers.


In languages like Scheme, Haskell, OCaml, and Erlang it happens all the time, often when the user isn't aware of it; even when these languages have special looping constructs they are often implemented in terms of tail recursion, sometimes mutual tail recursion between several anonymous subroutines.

In languages like Scala, Java, C#, and Swift, it basically never happens.


> In languages like Scala, Java, C#, and Swift, it basically never happens.

Scalaz and cats use tail calls extensively and wouldn't be possible without them.


How does that work on the JVM?


Scala only does single-method tail recursion and rewrites it into a while loop. Cats and other libraries use a technique called trampolining, which basically moves the frames to the heap (or a mixed technique where they recurse to a certain depth on the stack before switching to trampolining).
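The single-method rewrite amounts to this (sketched in Python rather than Scala): the self tail call becomes a parameter update plus a jump back to the top of a loop.

```python
def last_rec(xs):
    # Self tail call...
    if len(xs) == 1:
        return xs[0]
    return last_rec(xs[1:])

def last_loop(xs):
    # ...rewritten as the loop the compiler emits.
    while len(xs) > 1:
        xs = xs[1:]
    return xs[0]

print(last_loop([1, 2, 3]))  # 3
```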


That sounds like it means that Scala doesn't support general tail-call elimination, so people don't write mutually tail-recursive code, instead transforming things that would naturally be expressed that way into a different structure that Scala does support.


Yes. It can be a major PITA, in certain problem spaces and I think most people don't pretend it's a great solution but ultimately accept it as a limitation of being hosted on the JVM.


In functional languages, recursive data structures are often most naturally walked with recursive functions. Languages like Haskell and ML take this to its natural conclusion and use it systematically. For example, a linked list. Here's how the Haskell "last" function, to get the last item in a linked list, is implemented in the standard library:

    last [x]                =  x
    last (_:xs)             =  last xs
    last []                 =  errorEmptyList "last"
If given a list with a single item, return that item. If given a list with two or more items, call itself with the list minus its head. If given an empty list, throw an error.


This is the scenario I am used to, where "last" is called in tail position calling itself, but I see from the other answers that there are definitely valid use cases for non-same-function recursion as well.


It's useful anytime you can benefit from the performance advantages of replacing a full fledged function call with little more than a jump. One classic example is implementing an efficient VM.


If you’re implementing a compiler for a language like Haskell, then everything is compiled into mutually-recursive tail calls.


If f1 calls f2, which calls f1, why can't the compiler inline f2 into the body of f1? It's not so complicated if the compiler has annotations or other hints to guide it.


I once wrote an interpreter in continuation-passing style, meaning every final line of a function called the next function in the interpreter, passing a continuation and error continuation on.

Such a program never returns until it finishes interpreting, and can make an unlimited number of such calls.

This is a bad idea without TCE.
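A toy illustration of the style (Python, hypothetical names, not the commenter's interpreter): a continuation-passing evaluator for nested additions, where every function's final act is calling either the evaluator or a continuation.

```python
def eval_cps(expr, k):
    # Literals: deliver the value to the continuation (a tail call).
    if isinstance(expr, int):
        return k(expr)
    # ("+", a, b): evaluate a, then b, then combine -- all via tail calls.
    _, a, b = expr
    return eval_cps(a, lambda va: eval_cps(b, lambda vb: k(va + vb)))

print(eval_cps(("+", 1, ("+", 2, 3)), lambda v: v))  # 6
```

Nothing "returns" in the usual sense until the final continuation receives the answer, which is why an unbounded interpretation needs TCE.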


Why couldn't the compiler inline the whole interpreter?


Because in continuation-passing style the functions-calling-functions are circular in many places; inlining would explode rather than halt.


You can turn it into a loop with a case statement inside if you collect the connected component of tail called functions and make those the cases. But it’s gonna be a big function.

You can split it up a bit if you use a trampoline, but then you lose some efficiency.
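Roughly like this (a Python sketch of the transformation, using the classic even/odd pair as the connected component): each mutually tail-calling function becomes a case, and a tail call becomes updating the state label and continuing the loop.

```python
def run_even(n):
    # even/odd merged into one loop with a case per original function.
    state, arg = "even", n
    while True:
        if state == "even":
            if arg == 0:
                return True
            state, arg = "odd", arg - 1
        else:  # "odd"
            if arg == 0:
                return False
            state, arg = "even", arg - 1

print(run_even(1_000_000))  # True, in constant stack space
```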


Though at that point you're basically just one step removed from making your own stack frame.

Which would definitely work, no doubts about that, but it might be nicer to just get it handled at a more fundamental level.



