I used to teach Haskell to first-year university students, and many of them struggled to write their first recursive functions. It really isn’t obvious why you can solve a problem using the function you’re in the process of defining, and many students have difficulty making that leap. There is no shame in this. I remember taking a long time to grok proof-by-induction, which has a similar conceptual hurdle: how can you use a statement to prove itself?
Writing recursive functions requires a lot of tacit knowledge in selecting the recursion pattern to use, which variables to recurse over, etc. Recursion was not immediately obvious to industry professionals, either: I remember an errata card that came with TI Extended Basic for the Texas Instruments TI 99/4A which mentioned that later versions of the cartridge removed the ability for subprograms to call themselves, because they thought it was not useful and mostly done by accident.
I want to share a recipe that helped my students write their first recursive functions. There are three steps in this recipe:
Worked examples and some teaching advice after the jump.
product :: [Int] -> Int
Suppose we are asked to write a function
product :: [Int] -> Int
that multiplies a list of
numbers together. Begin by writing out several examples of what the
function should actually do:
product [2, 3, 4, 5] = 2 * 3 * 4 * 5 = 120
product [3, 4, 5] = 3 * 4 * 5 = 60
product [4, 5] = 4 * 5 = 20
product [5] = 5
There is a bit of an art to selecting the initial examples, so here are a few tips:
The shape of the data
definition heavily influences
the shape of the recursion. Because this function must recurse over cons
lists, we choose example inputs with similar tails.
It’s not usually necessary to choose big examples: three or four elements are usually enough.
Choose distinct values in each part of the data structure, so it’s clear which sub-parts need to align.
Avoid elements that behave strangely with respect to the function
you’re writing. It’s tempting to use the list [1, 2, 3, 4]
,
but the fact that 1 * x == x
means that we could confuse
ourselves a bit. I chose the list [2, 3, 4, 5]
instead.
Next, rewrite the examples in terms of each other:
product (2:3:4:5:[]) = 2 * 3 * 4 * 5 = 2 * product (3:4:5:[])
product (3: 4:5:[]) = 3 * 4 * 5 = 3 * product ( 4:5:[])
product (4: 5:[]) = 4 * 5 = 4 * product ( 5:[])
product (5: []) = 5 = 5
Notes:
If lists are involved, desugar them to make the correspondence more obvious.
Aligning similar elements vertically often helps, but it’s more helpful to align them by their position in the data structure instead of by their value. In this example, I’ve put all the list heads into the same column. This makes it easier to see where to introduce variables.
Finally, generalise over the first and last columns by introducing variables:
x ,--xs--. x ,--xs--.
| | | | | |
product (2:3:4:5:[]) = 2 * 3 * 4 * 5 = 2 * product (3:4:5:[])
product (3: 4:5:[]) = 3 * 4 * 5 = 3 * product ( 4:5:[])
product (4: 5:[]) = 4 * 5 = 4 * product ( 5:[])
product (5: []) = 5 = 5 * product []
product (x: xs) = = x * product xs
Notes:
Rewriting the 5:[]
example into
5 * product []
makes it fit the pattern of all of our other
examples, which allows us to generalise over all our examples with a
single equation.
Knowing that 5 * product [] = 5
tells us that
product []
must be 1
.
We now have enough information to write out a function definition:
product :: [Int] -> Int
product [] = 1
product (x:xs) = x * product xs
treeSum :: Tree Int -> Int
Suppose we are asked to write
treeSum :: Tree Int -> Int
, given the following
definition of binary trees:
data Tree a = Nil | Node a (Tree a) (Tree a)
As before, write out an example, and generate equations from its sub-parts:
-- For the tree:
--
-- 4
-- / \
-- 3 x
-- / \
-- / \
-- 1 2
-- / \ / \
-- x x x x
--
Node 4 (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil)) Nil) = 4 + 3 + 1 + 2
treeSum (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil)) = 3 + 1 + 2
treeSum (Node 1 Nil Nil) = 1
treeSum (Node 2 Nil Nil) = 2 treeSum (
Then, line up the examples and rewrite them in terms of each other:
Node 4 (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil)) Nil) = 4 + 3 + 1 + 2
treeSum (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil)) = 3 + 1 + 2
treeSum (Node 1 Nil Nil) = 1
treeSum (Node 2 Nil Nil) = 2
treeSum (
-- Rewritten:
Node 4 (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil)) Nil) = 4 + treeSum (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil))
treeSum (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil)) = 3 + treeSum (Node 1 Nil Nil) + treeSum (Node 2 Nil Nil)
treeSum (Node 1 Nil Nil) = 1
treeSum (Node 2 Nil Nil) = 2 treeSum (
Nothing seems to line up! The problem is that the example isn’t
complicated enough to give us a complete picture, so we could try
drawing another tree with more nodes and working through that. But the
equation for
treeSum (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil))
contains
a big hint, as it’s adding three things together: the node value, the
treeSum
of the left subtree, and the treeSum
of the right subtree. We can force the other equations for
Node
s into the right shape by adding + 0
a few
times, and that gives a pretty big hint that treeSum Nil
should be equal to 0
:
Node 4 (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil)) Nil) = 4 + treeSum (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil)) + 0
treeSum (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil)) = 3 + treeSum (Node 1 Nil Nil) + treeSum (Node 2 Nil Nil)
treeSum (Node 1 Nil Nil) = 1 + 0 + 0
treeSum (Node 2 Nil Nil) = 2 + 0 + 0
treeSum (Nil = 0
treeSum
-- Rewritten:
Node 4 (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil)) Nil) = 4 + treeSum (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil)) + treeSum Nil
treeSum (Node 3 (Node 1 Nil Nil) (Node 2 Nil Nil)) = 3 + treeSum (Node 1 Nil Nil) + treeSum (Node 2 Nil Nil)
treeSum (Node 1 Nil Nil) = 1 + treeSum Nil + treeSum Nil
treeSum (Node 2 Nil Nil) = 2 + treeSum Nil + treeSum Nil
treeSum (Nil = 0 treeSum
Complete the process by generalising over all of the examples with variables:
treeSum :: Tree Int -> Int
Node n left right) = n + treeSum left + treeSum right
treeSum (Nil = 0 treeSum
map :: (a -> b) -> [a] -> [b]
Suppose we are asked to write the classic map
function
over lists. Since the input function and the element type are not known,
use placeholders when generating examples:
map f [a, b, c] = [f a, f b, f c]
map f [b, c] = [f b, f c]
map f [c] = [f c]
map f [] = []
Desugar, align, and rewrite the equations in terms of each other, and finish by introducing variables:
map f (a:b:c:[]) = f a : f b : f c : [] = f a : map f (b:c:[])
map f (b: c:[]) = f b : f c : [] = f b : map f ( c:[])
map f (c: []) = f c : [] = f c : map f ( [])
| | | | | |
`-xs-' x `-xs-'
x
map f (x : xs) = f x : map f xs
map _ [] = []
This technique is useful but limited; larger data structures quickly become too unwieldy for it to work. But it seems to really help new Haskell programmers “get” recursion and bootstrap their skills and confidence. While it’s fine to show an example or two for students to crib from (at first), something about asking students to physically handle a pen and write it all out seems to make it sink in a lot better.