What is Better, Recursion or Iteration?
Learn the difference between both of these functionalities at the hardware level to understand what to prefer.
Introduction
Instead of answering the question directly in this article, we will learn how these features work at the hardware level. To do so, we will translate an example of iteration and recursion to a pseudo-assembly language.
Assembly doesn't know about iterations, not even about functions or conditionals. At that level, the only thing that the machine can do is jump to another address or continue with the consecutive address. Therefore, we need to rewrite iteration and functions using only goto
statements.
In this article, I will use a pseudo-assembly where I will take a lot of liberties. I use this pseudo-assembly to make a point of recursion versus iteration. I don't go into the nitty-gritty details of assembly programming.
Factorial
The example we will use is a function that calculates the factorial of a number. For example, the factorial of five is the product of all positive integers less than or equal to five: factorial(5) == 5 * 4 * 3 * 2 * 1
.
Factorial: Iteration
An iterative solution to the factorial problem is the following:
How does this iteration look in assembly, where we don't have loops, only jumps (or goto
statements)?
There are two main locations (or labels): loop and test. The code jumps between them when iterating. The first thing we do is jump to test from the goto test
; from there, we do a conditional jump: if current < n
is true; we jump back to loop. The loop label has the statements inside the while
, and it finishes where test starts. There, test checks the condition again and jumps back to loop if true.
In an iteration, the hardware is just jumping between instructions.
Factorial: Recursion
Take a look at a recursive solution for the factorial function.
There is a line prev = fact(n - 1)
where we call the same function. Yet, in assembly, the idea of calling a function does not exist. We need to create this functionality with goto
. How does that work?
Function calls in assembly: simple example
Before we write recur_fact
in assembly, let's look at a simpler example: a function that returns the double of a number.
How does that look in pseudo-assembly?
There are three functionalities that high-level languages have and assembly lacks:
Function definition:
def double(n):
Function execution:
double(10)
Return from a function:
return result
We have to rewrite each of these functionalities in assembly.
"Function definition" is the simplest; it's just a label where assembly can jump to.
"Function execution" is easy to understand yet harder to implement. Executing a function means jumping to the label of that function. That is the easy part. Yet, for each execution, we might have different parameters; such as "n" in our example of double(5)
or double(10)
.
The "n" parameter is stored in an environment, and then the environment is added to a part of the computer's memory called Stack
.
A new environment is created because each execution is different. This environment has different data used by the function, for example, the argument "n."
Note: "Stack overflow" means that we pushed too many new environments to the "Stack memory."
The idea of "Return from a function" is simple but full of little details when implementing. The most important things are that the code needs to jump back to where it was executed and give the returned value.
That means that to come back, we need to know in which line the function was called. We can use the environment to set the value of where to jump back: return_address
. In the same way that each function execution has different parameters (e.g., "n"), it also has a different return_address
.
Another important detail in the "Return from a function" part is that it must delete the environment created when executing the function. That means that after returning, the Stack
memory is decreased in one environment. We exemplify this with Stack.pop()
in our pseud-assembly code.
If we put the three parts together, we have the following pseudo-assembly code of defining and executing the function double
.
I have taken the liberty of using a few functionalities which are not available in assembly: Stack.push
and Stack.pop
.
Let me better illustrate the evolution of the stack with drawings:
As we can see, the memory increases on every "Function execution" and decreases after every "Return from the function."
Function calls in assembly
In summary, executing a function in assembly means:
Create a new environment
Push it to the Stack
Jump to the function address
Wait for the function to jump back
Pop the return value from the stack
Use return value
Function calls in assembly: recursion
The previous description of how to implement functions in assembly also works for recursive functions.
Translated to pseudo-assembly:
The tricky part is that the assembly is creating a new environment inside the recur_fact
function. Then, the code jumps back to the function's beginning before removing the stack's old environment. As we know, if it weren't for the base case, this code would continue creating new environments and jumping back to the beginning of the function indefinitely.
Let's follow the steps with the specific example of calling fact(5)
.
At this moment, the Stack memory has five different environments:
The Stack memory grows with each new environment and decreases as each function returns.
On the other hand, the instructions jump from one place to another on each call, similar to the iteration.
Iteration vs. recursion at the machine level
The only difference between iteration and recursion is the amount of memory used. Recursion uses more memory because it needs to create a new environment every time a function is called.
Yet, we can fix this in some cases.
Tail-recursive functions
Tail recursive functions don't use extra memory because they do a little trick to avoid creating a new environment.
A tail-recursive function is a function whose recursive call is done in the return value. Not in the middle of the function, like it was in our previous example.
Note: tail-recursive functions are a special case of the tail call.
Let's refactor recur_fact
to make it tail-recursive:
The biggest difference is that now we need to pass an extra parameter when calling the function. This extra parameter could have been avoided by adding a default parameter or an extra helper function. But, I wanted to keep things as similar as possible to the previous examples.
Translated to pseudo-assembly, this function is:
The main difference between recur_fact
and tail_fact
is in these lines:
The first recursive function uses "n," which is part of the old environment after creating the new environment. In the tail-recursive example, this does not happen. When we make the recursive call, the old environment is not used anymore.
Because the old environment is not used anymore, tail-recursive functions can reuse the environment instead of creating a new one.
Something like the following:
Therefore the stack memory does not grow with each call. Thus, making tail-recursive functions as efficient as iteration.
Tail Recursive Function in programming languages
Not all programming languages support the efficiency improvement in tail-recursive functions.
For example, even though ECMAScript specifies tail call implementation in its standards, only Safari's engine has implemented it. Check out this stack overflow answer to learn the story behind it; both Chrome V8 and Firefox's engine decided not to push this feature to production.
Most functional programming languages like Haskell, Scheme, Scala, etc., implement the tail-recursive improvement.
I found enlightening the blog post that Python's creator (or Benevolent Dictator For Life) Guido van Rossum shared about why Python does not support this functionality.
If you want a short answer, it's simply unpythonic.
For a full list of languages, you can see the Wikipedia entry on Tail Call.
So, iteration or recursion?
I hope that I have given you the tools to answer the question yourself. You learned that iteration is more memory efficient. Unless you can write a tail-recursive function and your programming language supports the improvement.
Yet, if you know that memory won't be a problem, recursion is, in many cases, the most elegant and readable solution.
In my opinion, if there are no performance problems, you should always write your code with readability in mind.