Marty Stumpf

24 Jan 2019

â€¢

4 min read

By bananas, I mean banana brackets as in the famous paper "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire" by Meijer et al. In this post I only focus on bananas, also called **catamorphisms**.

Recursion is core to functional programming. Meijer et al. show that we can treat recursions as separate higher order functions known as **recursion schemes**. Catamorphisms is one type of recursion schemes.

**Because knowing how to use recursion schemes improves your coding ability!** Consider a simple exercise: finding the minimum element of an array. In OCaml, without using ** fold** (catamorphisms), we would compare the elements of the array a one by one, keeping the minimum of the element, until we reach the end of the array. E.g.:

```
let rec find_min_inner a i j =
match a.(i) > a.(j) , (Array.length a -1 < i || Array.length a - 1 < j + 1) with
|true, true -> a.(j)
|false, true -> a.(i)
|true, false -> find_min_inner a j (j + 1)
|false, false -> find_min_inner a i (j + 1) ;;
let find_min a =
match a with
|[||] -> [||]
|_ -> find_min_inner a 0 1 ;;
```

Using catamorphisms or *fold_left* (or *fold_right*) in OCaml, we can do the equivalent for *int arrays* in one line:

```
let find_min a =
Array.fold_left (fun x y -> if x < y then x else y) max_int a;;
```

Or, using the function min in the Pervasives to replace the anonymous function and eta-reducting *a*:

```
let find_min = Array.fold_left min max_int;;
```

**Catamorphisms dramatically reduce the amount of code** because *fold_left* does exactly what we want to do for us: recurse down each element of the array. Of course, *fold_left* or *fold_right* work on other data structures too.

Going back to the above example, what's going on in that one line? We invoke the iterator in the array module *fold_left*. *fold_left* has the following type signature:

```
('a -> 'b -> 'a) -> 'a -> 'b array -> 'a
```

That is, fold_left takes 3 inputs:

- A function that takes two arguments, one of type 'a and one of type that is the same as the array, and returns a result of type 'a.
- An input of type 'a, think of it as the
**base case**of the recursion. - An array (of the same or different type to 'a).

The output of *fold_left* is of type *'a.*

The first input is a function that does something with the first input (of type 'a) and an element of the array. In this example, the function is min and it compares *max_int* with an element of the array.

*fold_left* **recurses the function with the earlier result as an input.** In this example, *fold_left* calls the following:

```
(* When the array is empty, it returns the max_int. *)
fold_left min max_int [||] => max_int
(* When the array has one element, it returns the min of max_int and the element. *)
fold_left min max_int [|e1|] => min max_int e1
(* When the array has two elements, min takes the result of
(min max_int e1) as an input, and compare it with e2. *)
fold_left min max_int [|e1;e2|] => min (min max_int e1) e2
(* When the array has three elements, min takes the result of
fold_left max_int [|e1;e2|] as an input, and compare it with e3. *)
fold_left min max_int [|e1;e2;e3|] => min (min (min max_int e1) e2) e3
```

We can generalize it to any function *f*, with *x* **as the base case** and *a* **as the array input**, the result is as in the Pervasives:

```
fold_left f x a => f a (... (f (f x a.(0)) a.(1)) ...) a.(n-1)
```

*fold_right* works similarly, except that the array elements are the first input to the function, and the brackets are **to the right**:

```
(* When the array has one element, it returns the min of max_int and the element. *)
fold_right min [|e1|] max_int => min e1 max_int
(* When the array has two elements, min takes the result of
(min e2 max_int) as an input, and compare it with e1. *)
fold_right min [|e1;e2|] max_int => min e1 (min e2 max_int)
(* When the array has three elements, min takes the result of
(min e3 max_int) as an input to compare with e2,
```

the result of which is then compared to e1. *) fold_left min max_int [|e1;e2;e3|] => min e1 (min e2 (min e3 max_in)) ...

You can see from the above example that to find the minimum element of an array, we can use *fold_left* or *fold_right* in OCaml, naturally with the *min* function as the function input. But how do we choose the base case?

The base case is there to safe guard an empty array input. **When an empty array is input to fold_left or fold_right, OCaml returns the base case**. Otherwise, the base case must have the property that when the function takes the base case and another input, the function returns the other input as a result. That is,

```
let sum = Array.fold_left (+) 0;;
# sum [||];;
- : int = 0 (* Passing an empty array to sum returns the base case 0. *)
let product = Array.fold_left (fun x y -> x * y) 1;;
# product [||];;
- : int = 1 (* Passing an empty array to product returns the base case 1. *)
```

Because the base case is just to safe guard an empty input, when you can be sure that the input is not empty, the base case would not be necessary! OCaml does not provide such option but Haskell does. We'll see this in the next post!

Did you like this article?

Marty Stumpf

Software engineer. Loves FP Haskell Coq Agda PLT. Always learning. Prior: Economist. Vegan, WOC in solidarity with POC.

See other articles by Marty

hello@works-hub.com

Ground Floor, Verse Building, 18 Brunswick Place, London, N1 6DZ

108 E 16th Street, New York, NY 10003

Join over 111,000 others and get access to exclusive content, job opportunities and more!