Monads are a formal theory of composition where programmers get to invoke some very abstract mathematics (category theory) to argue the minutiae of annotating, scheduling, sequencing operations, and side effects. On the positive side, the monad axioms are a guarantee that related ways of writing code are in fact substitutable and equivalent; so you want your supplied libraries to obey such axioms to make your life easy. On the negative side, the theory is complicated.

In this article we will consider the latest entry of our mad “programming theory in R series” (see Some programming language theory in R, You don’t need to understand pointers to program using R, Using closures as objects in R, and How and why to return functions in R): category theory!

In practice the programming language Haskell improved greatly when non-monadic I/O libraries were replaced by better monad-inspired I/O libraries. But this also created the unfortunate false impression that you had to understand monads to use Haskell (when in fact you only have to understand them to implement Haskell).

The fun side of Monads is flexibility in using them and occasionally saying (either formally or informally) “hey, x turns out to be a monad!”

This can vary from meaning:

• “x is nice (and not in fact a monad).”
• “x is cryptomorphic to a monad (a term favored by mathematician Gian-Carlo Rota).”
• “I went to graduate school!”
• “Or x really obeys the monad laws.”

Saying “x is a monad” is like singing in the shower: it is always more fun to say than to hear. I know I am certainly guilty of this in writing this article.

It turns out the `magrittr` pipe package in R obeys the monad axioms. So if you are a data scientist who has tried `magrittr` you have already benefited from monadic design.

Obviously I am nowhere near the first to notice this, but it is something I wish to comment on here. It doesn’t matter if this is core intent or a side-effect of good design, but it does give yet another reason to trust the package.

Let’s load the `magrittr` package and spot-check the monad laws.

``````library('magrittr')

# Identity function,
ret <- function(x) { x }

# Note the example functions included here are not fully "standard non-standard eval"
# production hardened.``````

First we check that `magrittr`’s `%>%` operator obeys the monad laws when using `%>%` as “bind” and `ret` as “return”. For simplicity we would like to think of `magrittr` as a category over single-argument functions (though obviously `magrittr` works over more values than these, and a big part of the `magrittr` service is Currying code fragments into single-argument functions).

``````a <- 1:5 # our values
m <- 1:5 # more values
f <- sin # our first function
g <- cos # our other function``````
• Axiom 1 Left identity: `ret(a) %>% f == f(a)`:
``ret(a) %>% f``
``##   0.8414710  0.9092974  0.1411200 -0.7568025 -0.9589243``
``f(a)``
``##   0.8414710  0.9092974  0.1411200 -0.7568025 -0.9589243``
• Axiom 2 Right identity: `m %>% ret == m`:
``m %>% ret``
``##  1 2 3 4 5``
``m``
``##  1 2 3 4 5``
• Axiom 3 Left Associativity: `(m %>% f) %>% g == m %>% (function(x) {f(x) %>% g})`:
``(m %>% f) %>% g``
``##  0.6663667 0.6143003 0.9900591 0.7270351 0.5744009``
``m %>% (function(x) {f(x) %>% g})``
``##  0.6663667 0.6143003 0.9900591 0.7270351 0.5744009``
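The printed vectors above are compared by eye. As a minimal sketch (not part of the original checks), the same three spot-checks can be asserted with base R's `all.equal()`, which compares numeric vectors up to a small tolerance. The variable names mirror the ones defined earlier and are redefined here only so the block runs on its own; `law1`, `law2`, and `law3` are names invented for this illustration.

```r
library(magrittr)

ret <- function(x) { x }   # identity, playing the role of "return"
a <- 1:5
m <- 1:5
f <- sin
g <- cos

# Axiom 1, left identity: ret(a) %>% f == f(a)
law1 <- isTRUE(all.equal(ret(a) %>% f, f(a)))

# Axiom 2, right identity: m %>% ret == m
law2 <- isTRUE(all.equal(m %>% ret, m))

# Axiom 3, associativity
law3 <- isTRUE(all.equal((m %>% f) %>% g,
                         m %>% (function(x) { f(x) %>% g })))

c(law1, law2, law3)
```

Passing on one vector and two functions is still only a spot-check, not a proof that the laws hold for every value and function.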

Let’s go through those rather abstract axioms again and specialize them using our knowledge that “`ret`” is the identity.

• Axiom 1’ Left identity: `a %>% f == f(a)`. Says: “piping a value into a function is the same as applying the function to the value.” In general the original says `ret` has to be faithful in the sense that `%>%` can recover enough information to compute `f(a)`.
• Axiom 2 Right identity: `m %>% ret == m`. Now implied by Axiom 1’. But in cases where `ret` is not the identity, this would tell us that `%>%` is faithful in the sense that it retains enough information about `m` for `ret` to rebuild `m`.
• Axiom 3 Associativity: `(m %>% f) %>% g == m %>% (function(x) {f(x) %>% g})`. There is a notational convention hidden in this statement: we assume `m %>% f %>% g` is to be read as `(m %>% f) %>% g`. Mathematicians like left-association as Curried functions are left-associative (and not necessarily right associative, consider `Curry(f) ○ g ○ h` where `f` is originally a two argument function that ignores its first argument). The axiom plus the convention tell us we can consider piping as moving the value `m` through `f` and then through `g`. This is where we are really checking `%>%` is behaving like a pipe.
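To see what Axioms 1 and 2 ask for when `ret` is not the identity, here is a minimal base-R sketch of a hypothetical “maybe”-style wrapper. Everything in it (`ret_maybe`, `bind_maybe`, `nothing`, `safe_sqrt`, `safe_recip`) is invented for illustration and is not part of `magrittr`; it only shows the shape of the laws for a wrapper that carries extra information.

```r
# Hypothetical "maybe"-style wrapper (illustrative names, not a magrittr API).
ret_maybe <- function(x) { list(ok = TRUE, value = x) }        # "return": wrap a value
nothing <- list(ok = FALSE, value = NULL)                      # a failed computation
bind_maybe <- function(m, f) { if (m$ok) f(m$value) else m }   # "bind"

# Two functions that take a plain value and return a wrapped one.
safe_sqrt  <- function(x) { if (x >= 0) ret_maybe(sqrt(x)) else nothing }
safe_recip <- function(x) { if (x != 0) ret_maybe(1 / x) else nothing }

m <- ret_maybe(4)

# Axiom 1: wrapping and then binding is the same as applying directly.
left_identity  <- identical(bind_maybe(ret_maybe(4), safe_sqrt), safe_sqrt(4))

# Axiom 2: binding with the wrapper itself rebuilds the original wrapped value.
right_identity <- identical(bind_maybe(m, ret_maybe), m)

# Axiom 3: grouping of the binds does not matter.
associativity  <- identical(
  bind_maybe(bind_maybe(m, safe_sqrt), safe_recip),
  bind_maybe(m, function(x) { bind_maybe(safe_sqrt(x), safe_recip) })
)

# A failure short-circuits the rest of the chain.
failed <- bind_maybe(bind_maybe(ret_maybe(-1), safe_sqrt), safe_recip)
```

Here the right identity check is no longer trivial: it only passes because `bind_maybe` hands `ret_maybe` everything it needs to rebuild `m`.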

As a user we would like to be able to write “`a %>% (f %>% g)`” or “`h <- f %>% g; a %>% h`”. That is: we would like to be able to save complex magrittr pipe sequences for re-use (without introducing more notation, such as “. %>%”). We would like to have right-associativity (reified composition of operators) in addition to the left association we expect from `magrittr` being described as a “forward-pipe operator” (from magrittr’s description).

It turns out that isn’t one of the monad axioms, and we can’t immediately do that:

``h <- f %>% g``
``## Error in g(.): non-numeric argument to mathematical function``

We can fix this in one of two ways: by introducing a function or by using a built-in “dot notation” (thank you to Professor Jenny Bryan for pointing this out to me).

``````h <- function(x) { x %>% f %>% g }
a %>% h``````
``##  0.6663667 0.6143003 0.9900591 0.7270351 0.5744009``
``````h <- . %>% f %>% g
a %>% h``````
``##  0.6663667 0.6143003 0.9900591 0.7270351 0.5744009``
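As a small sketch of the re-use this buys, the saved pipeline can be applied to any number of inputs, and the two fixes can be checked against each other. The names `h_fun`, `h_dot`, `inputs`, and `same` are made up for this example, and the test inputs are arbitrary.

```r
library(magrittr)

f <- sin
g <- cos

h_fun <- function(x) { x %>% f %>% g }  # explicit function wrapper
h_dot <- . %>% f %>% g                  # magrittr "dot" notation

# Arbitrary example inputs; each saved pipeline is reused on all of them.
inputs <- list(1:5, c(0, pi / 2), seq(-1, 1, by = 0.5))

same <- vapply(inputs,
               function(v) isTRUE(all.equal(h_fun(v), h_dot(v))),
               logical(1))
same
```

Both forms produce an ordinary R function, so either can be stored, passed around, or piped into with `%>%`.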

It turns out category theorists anticipated this problem and fix. The function wrapping trick is essentially building the Kleisli category derived from our monad (see “Monads Made Difficult”).

In principle we could implement a Kleisli arrow `%>=>%` operator in addition to the `magrittr` bind operator (`%>%`) which would allow code like the following (all four statements below would be equivalent):

```````%>=>%` <- function(f,g) { function(x) {g(f(x))} }
a %>% (sin %>=>% cos %>=>% abs)``````
``##  0.6663667 0.6143003 0.9900591 0.7270351 0.5744009``
``a %>% ((sin %>=>% cos) %>=>% abs)``
``##  0.6663667 0.6143003 0.9900591 0.7270351 0.5744009``
``a %>% (sin %>=>% (cos %>=>% abs))``
``##  0.6663667 0.6143003 0.9900591 0.7270351 0.5744009``
``abs(cos(sin(a)))``
``##  0.6663667 0.6143003 0.9900591 0.7270351 0.5744009``

Note that the above `%>=>%` operator is not in fact the desired general Kleisli operator, as we haven’t implemented the critical Currying services that the `magrittr` `%>%` operator supplies (these services would be what category theorists call the “endofunctor”, which would map R functions to specialized savable R functions; definitions such as `function(f,g) { . %>% f %>% g }` won’t work either, as we would need to capture unevaluated arguments inside the `%>=>%` function and cannot delegate that to interior `%>%` calls).
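A short sketch of that limitation, under the assumption of a made-up two-argument helper `scale_by` (not from any package): `%>%` will Curry the fragment `scale_by(k = 10)` into a single-argument function, while the simple `%>=>%` above treats it as an ordinary call, which fails once it is evaluated. Wrapping the fragment in the dot notation is one possible workaround.

```r
library(magrittr)

`%>=>%` <- function(f, g) { function(x) { g(f(x)) } }

# Hypothetical two-argument function, used only for this illustration.
scale_by <- function(x, k) { x * k }

a <- 1:5

# magrittr Curries the fragment: this is scale_by(sin(a), k = 10).
piped <- a %>% sin %>% scale_by(k = 10)

# The simple %>=>% does no Currying. R's lazy arguments let h_bad be built,
# but calling it evaluates scale_by(k = 10) as a plain call with x missing.
h_bad <- sin %>=>% scale_by(k = 10)
attempt <- try(h_bad(a), silent = TRUE)
inherits(attempt, "try-error")

# Workaround: let the dot notation do the Currying before composing.
h_ok <- sin %>=>% (. %>% scale_by(k = 10))
isTRUE(all.equal(h_ok(a), piped))
```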

In the Kleisli category we would no longer need special monad axioms; they are replaced by the more common associativity and identity axioms of a category. One still has to prove you have a category, but it is a more standard task.

You can say standard imperative styles of analysis that see operations as sequenced transient steps that mutate data are “strict left to right associative” ways of thinking (the “`((a %>% sin) %>% cos) %>% abs`” form). Databases, and standard data analyses in `R` are usually so organized.

The “general associativity” way of thinking (the “`h <- sin %>=>% cos %>=>% abs; a %>% h`” form) emphasizes the processing pipeline as a reusable entity and data as transient quantities that flow through the pipeline. Systems like Weka, LingPipe, and graphical data science tools such as Alpine Data workflow notebooks essentially represented processes in this manner.

Analysis steps as a durable graph: Alpine Data workflow notebooks.

The guarantee of full associativity is just a mathematical way of saying: you can mix these styles (data oriented or operator oriented) and you are guaranteed the same result in either case.
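As a rough base-R illustration of mixing the two styles, the sketch below stores the pipeline as a plain list of single-argument steps and runs it with `Reduce()`. The helper `run_pipeline` and the variable names are hypothetical, chosen only for this example.

```r
# Hypothetical helper: push x through a list of single-argument steps, left to right.
run_pipeline <- function(steps, x) {
  Reduce(function(acc, step) step(acc), steps, x)
}

steps <- list(sin, cos, abs)   # the pipeline as a durable, reusable object

a <- 1:5

# Data-oriented style: transform the value one step at a time.
data_style <- abs(cos(sin(a)))

# Operator-oriented style: build the pipeline once, then feed it data.
pipeline_style <- run_pipeline(steps, a)

# Mixed style: apply the first step by hand, hand the rest to the pipeline.
mixed_style <- run_pipeline(steps[-1], sin(a))

c(isTRUE(all.equal(data_style, pipeline_style)),
  isTRUE(all.equal(data_style, mixed_style)))
```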

And there you have it, more category theory than a data scientist should need to worry about.

### jmount

Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.

### 2 replies

1. John Mount says:

The above really has to be taken informally.

For example here is a note on how hard it is to really make a true mathematical category where the arrows are functions implemented in a general computer language http://math.andrej.com/2016/08/06/hask-is-not-a-category/ . The issues include checking identity between partial functions without the help of an axiom like extensionality.

You would also have to accept the error return demonstrated as a legitimate object in the category (so our category is definitely more than just functions, despite wishes). The main point I wanted to make is that the non-right-associativity of the magrittr pipe isn’t a foundational problem; the familiar axioms don’t seem to insist on that property.

2. John Mount says:

Roughly, monads are all about composition (though there are a lot of technical details). So saying you have a monad means you have a nice system with associative composition. This is something we expect from our intuition of working with mathematical functions, and something that is hard to guarantee for programs trying to implement functions. Hence the work is to document when the intuition carries over from mathematical functions to the execution of computer programs.

After a lot of study, the observation that “a monad is just a monoid in the category of endofunctors” does make some sense, if you look up all the words (see here http://stackoverflow.com/questions/3870088/a-monad-is-just-a-monoid-in-the-category-of-endofunctors-whats-the-issue ). Really we are hoping that composition keeps us in the same world (semigroup properties) and that we have an identity (monoids are semigroups with an identity). “Endofunctors” just means functors from a category to itself (a functor itself meaning a function from categories to categories that respects arrows in addition to mapping objects).

I find the notations all fairly difficult relative to their payoff, so I don’t actually use the concepts that often.

Also of interest is the `magrittr` `%,%` operator, which is for function composition, or like a version of `%>%` that associates in the other direction. Great article on the history and theory here: https://www.r-statistics.com/2014/08/simpler-r-coding-with-pipes-the-present-and-future-of-the-magrittr-package/ .

And finally there is Bizarro Pipe which actually uses “;” as its implementation (taking the whole monad thing full circle).