An Introduction to Reducers and Transducers
This is an introduction to reducers and transducers in Clojure. There is an abundance of resources on this topic; take the Clojure for the Brave and True material on reducers, for example. By the end of this article I hope you are curious enough to explore such resources and write your own! Now let's get started.
As clojure.org puts it, "A reducer is the combination of a reducible collection with a reducing function". A reducible collection, like a vector or a map, is one which can be broken down, reassembled and transformed. Notice that a reducer is the combination of a collection and a function: it is not the collection itself. Moreover, a reducer does not return the transformed collection for us to work with. To illustrate this, let's use the reducers provided for us by clojure.core.reducers and see the difference:
(nth (map inc [1 2 3]) 2) ; => 4
Here we are using the sequence implementation of map, which is the built-in default in Clojure. map here is lazily evaluated, which means that when we call nth to grab the value at index 2, only the values up to that index are actually computed. map returns a lazy sequence, a collection, for us to work with, which is why this code evaluates.
(require '[clojure.core.reducers :as r])
(nth (r/map inc [1 2 3]) 2) ; => UnsupportedOperationException nth not supported on this type: reducers$folder$reify
But when we use the reducer implementation the return type is not a collection, so we get an error when we call nth. This behaviour will be explained later in the article, but right now I want to really illustrate the difference between reducers and sequences.
Reducers are meant to perform substantially better than their seq counterparts in some cases. This is because reducers perform eager computation and produce no intermediate collections.
Eager computation - each member of the collection is computed before any of them is referenced.
No intermediate collections - reducers write into a collection you supply, or hand their work down to the next reducing function, instead of creating collections of their own, which is more efficient.
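To make "no intermediate collections" concrete, here is a small sketch (assuming clojure.core.reducers is required as r, as it is later in the article): the sequence pipeline realises a lazy seq at every step, while the reducer pipeline fuses everything into a single pass.

```clojure
(require '[clojure.core.reducers :as r])

;; Sequence version: each step realises an intermediate lazy seq.
(->> [1 2 3 4 5]
     (map inc)        ; intermediate seq (2 3 4 5 6)
     (filter even?)   ; intermediate seq (2 4 6)
     (reduce +))      ; => 12

;; Reducer version: the steps fuse into a single reduction,
;; with no intermediate collections in between.
(->> [1 2 3 4 5]
     (r/map inc)
     (r/filter even?)
     (reduce +))      ; => 12
```

Both produce the same answer; the difference is purely in how much intermediate work is allocated along the way.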
This is the direct opposite of lazy sequences, which evaluate their members as you need them. While this sounds great, if you know that you need the entire collection to be realised then laziness produces overheads. To see why, look at the signature of map:
user=> (doc map)
clojure.core/map
([f] [f coll] [f c1 c2] [f c1 c2 c3] [f c1 c2 c3 & colls])
The function gets applied again and again as you evaluate more items, which can get memory intensive. If this is the case then it is better to stick to reducers when you want the entire collection to work with.
Sequences also produce intermediate collections (hence why nth worked in our first example). With reducers, the execution of operations is deferred until the final reduction is performed. This also means that there is no need for intermediate variables or collections: once the final reduction runs, the collection has been transformed in a single pass and we are finished.
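A quick sketch of this deferral (again assuming clojure.core.reducers is required as r): building the reducer does no work at all; the work happens only when we finally reduce.

```clojure
(require '[clojure.core.reducers :as r])

;; Building the reducer is cheap: it is only a recipe.
(def recipe (r/map inc [1 2 3])) ; nothing computed yet

;; The increments run only now, inside this single reduction.
(reduce + 0 recipe) ; => 9
```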
Let’s look at an example where we can actually work with a collection and use our reducers.
(->> [1 2 3 4 5] (r/map #(inc %)) (into []))
; [2 3 4 5 6]
Remember that reducers do not create their own collections, so we must provide one. With into we provide a vector for the reducer to store the results in.
Without into the reducer would return what's called a reducible, which is only the instructions for producing the collection. Other functions can be chained onto one reducer without confusion, as the instructions are simply handed down. This makes reducers very composable; take this one for example:
(->> [1 2 3 4 5] (r/map #(inc %)) (r/map #(* % %)) (r/filter even?) (into []))
; [4 16 36]
Each reducer is completely independent of the collection and of the transformations we make to the collection, which enables such elegant composability.
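One consequence of that independence, sketched below: because the reducer never commits to an output collection, the same pipeline can be poured into different collections or reduced directly.

```clojure
(require '[clojure.core.reducers :as r])

;; One pipeline, three different results.
(def pipeline (->> [1 2 3 4 5] (r/map inc) (r/filter even?)))

(into [] pipeline)  ; => [2 4 6]
(into #{} pipeline) ; => #{2 4 6}
(reduce + pipeline) ; => 12
```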
To conclude this section on reducers, below is a table highlighting the differences between using reducers and sequences in different scenarios. I know this article is meant to demystify what reducers and transducers actually are, but I think it is worth including sequences as something familiar to help anchor some of the concepts here.
| Reducers | Sequences |
| --- | --- |
| Perform operations on reducible collections given a reducing function. | Perform operations on sequence collections given a sequence function. |
| Evaluate each member of the collection before any operations are computed (eager evaluation). | Evaluate each member as an operation is to be conducted on that member (lazy evaluation). |
| Do not produce collections during computation; an output collection must be specified. | Create intermediate collections to store the transformed data. |
| Use cases: when you need the entire collection to be known; when performance matters (learn more about folding). | Use cases: when you only reference one element, or a group of elements, out of the collection at a time; when you can afford the overheads of searching large amounts of the collection if needed. |
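As a taste of the folding mentioned above, here is a sketch: r/fold can split the reduction of a large vector across threads, provided the combining function (here +) is associative. For small or non-foldable collections it simply falls back to an ordinary reduce.

```clojure
(require '[clojure.core.reducers :as r])

;; fold reduces the vector in parallel chunks and combines the
;; partial sums with +; the answer is the same as a serial reduce.
(r/fold + (r/map inc (vec (range 1000)))) ; => 500500
```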
Let’s start our walk-through of transducers with another definition from clojure.org: “A transducer (sometimes referred to as xform or xf) is a transformation from one reducing function to another.” A reducing function, formally, is “a function that takes an accumulated result and a new input and returns a new accumulated result.” A transducer is a level up from the reducers we were just getting used to: we are now transforming reducing functions!
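Functions we already know fit that definition of a reducing function: conj and + both take an accumulated result plus a new input and return a new accumulated result.

```clojure
;; accumulated result + new input -> new accumulated result
(conj [1 2] 3) ; => [1 2 3]
(+ 6 4)        ; => 10
```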
The benefit of using transducers is that they are truly polymorphic: we transform reducing functions so they can be applied in multiple scenarios, from collections to asynchronous channels and the like. This was the reason they were included in the language: it was found that to use the familiar map, filter and reduce on streams, observables and primitives, each context would need its own implementation. A modular approach to transforming collections was needed. The great thing about a transducer is that it is not concerned with where its inputs come from or where its results go.
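A short sketch of that polymorphism: the same transducer (here clojure.core's single-argument map, which returns a transducer) works in completely different contexts without modification.

```clojure
(def xf (map inc)) ; clojure.core/map with one argument returns a transducer

(transduce xf + 0 [1 2 3]) ; => 9, reduce-style accumulation
(into [] xf [1 2 3])       ; => [2 3 4], building a collection
(sequence xf [1 2 3])      ; => (2 3 4), a lazy sequence
```

The same xf could also be attached to a core.async channel; the transducer itself never changes.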
A common way to understand why transducers are used is to rewrite the implementation of map in terms of reduce; then, with a few changes, we can see the flexibility this new implementation gives us.
(defn map [f coll] (reduce (fn [x y] (conj x (f y))) [] coll))
and then you would call
(map inc [1 2 3 4 5])
; [2 3 4 5 6]
In our home-made implementation of map, the function that we pass to reduce is
(fn [x y] (conj x (f y)))
Where f is the function that we would like to apply to every element. So we can write a function that produces such a function for us, passing the function that we would like to map.
(defn mapping-with-conj [f] (fn [x y] (conj x (f y))))
But we still see the presence of conj in the above function, which assumes we want to add elements to a collection. We can get even more flexibility by factoring that out:
(defn mapping [f] (fn [step] (fn [x y] (step x (f y)))))
Then we can use it like this:
(def increase-by-1 (mapping inc))
(reduce (increase-by-1 conj) [] [1 2 3])
; [2 3 4]
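The same factoring works for filtering. Here filtering is my own illustrative name, not a core function: it wraps a step function and only calls it when the predicate passes.

```clojure
(defn filtering [pred]
  (fn [step]
    (fn [x y]
      (if (pred y)
        (step x y) ; keep: hand the element to the next step
        x))))      ; drop: return the accumulator unchanged

(reduce ((filtering odd?) conj) [] [1 2 3 4 5]) ; => [1 3 5]
```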
Why would you want to do things this way? The answer is that it gives us a lot of flexibility to build things. For instance, instead of building up a collection, we can do
(reduce ((mapping inc) +) 0 [1 2 3 4 5])
; 20
Which will give us the sum of the mapped collection [2 3 4 5 6]. Or we can add functionality by function composition.
(reduce ((comp (filter odd?) (map inc)) conj) [] [1 2 3 4 5])
; [2 4 6]
Which will first remove even elements from the collection before we map. (Note that filter and map here are clojure.core's single-argument versions, which return transducers, not the two-argument map we defined above.) The transduce function does essentially what the above line does, but takes care of a few extra details. So you would actually write:
(transduce (comp (filter odd?) (map inc)) conj [1 2 3 4 5])
; [2 4 6]
In conclusion, I hope this has made the concept of reducers and transducers a little easier to understand. For more information about reducers and transducers in Clojure, I would recommend you visit:
Edit: I forgot to credit the author of the example I used for implementing a modular reduce: the answer was originally posted here.
P.S. Wondering why there was a burrito for this blog post? I think it's best explained in the presentation given by Rich Hickey himself on transducers. You can watch the talk here: Rich Hickey on Transducers.