Monads are a formal theory of composition where programmers get to invoke some very abstract mathematics (category theory) to argue the minutiae of annotating, scheduling, sequencing operations, and side effects. On the positive side the monad axioms are a guarantee that related ways of writing code are in fact substitutable and equivalent; so you want your supplied libraries to obey such axioms to make your life easy. On the negative side, the theory is complicated.
In this article we will consider the latest entry of our mad “programming theory in R” series (see Some programming language theory in R, You don’t need to understand pointers to program using R, Using closures as objects in R, and How and why to return functions in R): category theory!
In practice the programming language Haskell improved greatly when non-monadic I/O libraries were replaced by better monad-inspired I/O libraries. But this also created the unfortunate false impression that you had to understand monads to use Haskell (when in fact you only have to understand them to implement Haskell).
The fun side of monads is flexibility in using them and occasionally saying (either formally or informally) “hey, x turns out to be a monad!”
This can mean anything from:
- “x is nice (and not in fact a monad).”
- “x is cryptomorphic to a monad (a term favored by mathematician Gian-Carlo Rota).”
- “I went to graduate school!”
- “Or x really obeys the monad laws.”
Saying “x is a monad” is like singing in the shower: it is always more fun to say than to hear. I know I am certainly guilty of this in writing this article.
It turns out the magrittr pipe package in R obeys the monad axioms. So if you are a data scientist who has tried magrittr you have already benefited from monadic design.
Obviously I am nowhere near the first to notice this, but it is something I wish to comment on here. It doesn’t matter if this is core intent or a side-effect of good design, but it does give yet another reason to trust the package.
Let’s load the magrittr package and spot-check the monad laws.
library('magrittr')
# Identity function, playing the role of the monadic "return" below
ret <- function(x) { x }
# Note the example functions included here are not fully "standard non-standard eval"
# production hardened.
First we check that magrittr’s %>% operator obeys the monad laws when using %>% as “bind” and ret as “return”. For simplicity we would like to think of magrittr as a category over single argument functions (though obviously magrittr works over more values than these, and a big part of the magrittr service is Currying code fragments into single argument functions).
a <- 1:5 # our values
m <- 1:5 # more values
f <- sin # our first function
g <- cos # our other function
- Axiom 1 Left identity: ret(a) %>% f == f(a):
ret(a) %>% f
## [1] 0.8414710 0.9092974 0.1411200 -0.7568025 -0.9589243
f(a)
## [1] 0.8414710 0.9092974 0.1411200 -0.7568025 -0.9589243
- Axiom 2 Right identity: m %>% ret == m:
m %>% ret
## [1] 1 2 3 4 5
m
## [1] 1 2 3 4 5
- Axiom 3 Left Associativity: (m %>% f) %>% g == m %>% (function(x) {f(x) %>% g}):
(m %>% f) %>% g
## [1] 0.6663667 0.6143003 0.9900591 0.7270351 0.5744009
m %>% (function(x) {f(x) %>% g})
## [1] 0.6663667 0.6143003 0.9900591 0.7270351 0.5744009
Let’s go through those rather abstract axioms again and specialize them using our knowledge that “ret” is the identity.
- Axiom 1’ Left identity: a %>% f == f(a). Says: “piping a value into a function is the same as applying the function to the value.” In general the original says ret has to be faithful in the sense that %>% can recover enough information to compute f(a).
- Axiom 2 Right identity: m %>% ret == m. Now implied by Axiom 1’. But in the cases where ret is not the identity this would tell us that %>% is faithful in the sense that it retains enough information about m that ret can re-build m.
- Axiom 3 Associativity: (m %>% f) %>% g == m %>% (function(x) {f(x) %>% g}). There is a notational convention hidden in this statement: we assume m %>% f %>% g is to be read as (m %>% f) %>% g. Mathematicians like left-association as Curried functions are left-associative (and not necessarily right-associative; consider Curry(f) ○ g ○ h where f is originally a two argument function that ignores its first argument). The axiom plus the convention tell us we can consider piping as moving the value m through f and then through g. This is where we are really checking %>% is behaving like a pipe.
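To see what Axioms 1 and 2 ask for when ret is not the identity, here is a small sketch in base R. The names box() and bind_box() are hypothetical and are not part of magrittr: box() wraps a value in a list, and bind_box() unwraps it before applying a function that itself returns a boxed value.

```r
# Hypothetical illustration only: box() and bind_box() are not magrittr functions.
box <- function(x) { list(value = x) }        # a "ret" that is not the identity
bind_box <- function(m, f) { f(m$value) }     # "bind": unwrap, then apply f

# The functions we bind must return boxed values
f <- function(x) { box(sin(x)) }
g <- function(x) { box(cos(x)) }

a <- 1:5
m <- box(a)

# Left identity: bind can recover enough from box(a) to compute f(a)
left_identity <- identical(bind_box(box(a), f), f(a))
# Right identity: box can re-build m from what bind hands it
right_identity <- identical(bind_box(m, box), m)
# Associativity: grouping of the binds does not matter
associativity <- identical(
  bind_box(bind_box(m, f), g),
  bind_box(m, function(x) { bind_box(f(x), g) })
)

print(c(left_identity, right_identity, associativity))
```

Here the wrapper carries no extra information, so the laws hold almost trivially; the point is only that "faithful" means nothing is lost when wrapping and unwrapping.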
As a user we would like to be able to write “a %>% (f %>% g)” or “h <- f %>% g; a %>% h”. That is: we would like to be able to save complex magrittr pipe sequences for re-use (without introducing more notation, such as “. %>%”). We would like to have right-associativity (reified composition of operators) in addition to the left association we expect from magrittr being described as a “forward-pipe operator” (from magrittr’s description).
It turns out that isn’t one of the monad axioms, and we can’t immediately do that:
h <- f %>% g
## Error in g(.): non-numeric argument to mathematical function
We can fix this in one of two ways: by introducing a function or by using a built-in “dot notation” (thank you to Professor Jenny Bryan for pointing this out to me).
h <- function(x) { x %>% f %>% g }
a %>% h
## [1] 0.6663667 0.6143003 0.9900591 0.7270351 0.5744009
h <- . %>% f %>% g
a %>% h
## [1] 0.6663667 0.6143003 0.9900591 0.7270351 0.5744009
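The error and the dot-notation fix can be reproduced side by side. This is a minimal sketch, assuming magrittr is installed (the variable name msg is illustrative): f %>% g is read as g(f), so cos() is handed the function sin itself rather than a number, while . %>% f %>% g builds an ordinary R function that can be saved and re-used.

```r
library(magrittr)

f <- sin
g <- cos
a <- 1:5

# f %>% g means g(f): cos() applied to a function, which is an error.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(f %>% g, error = function(e) { conditionMessage(e) })
print(msg)

# A pipeline that starts with the dot placeholder is a function of one argument
h <- . %>% f %>% g
print(is.function(h))

# Calling it directly or piping into it gives the same answer as g(f(a))
print(all.equal(h(a), g(f(a))))
print(all.equal(a %>% h, g(f(a))))
```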
It turns out category theorists anticipated both this problem and its fix. The function wrapping trick is essentially building the Kleisli category derived from our monad (see “Monads Made Difficult”).
In principle we could implement a Kleisli arrow %>=>% operator in addition to the magrittr bind operator (%>%) which would allow code like the following (all four statements below would be equivalent):
`%>=>%` <- function(f,g) { function(x) {g(f(x))} }
a %>% (sin %>=>% cos %>=>% abs)
## [1] 0.6663667 0.6143003 0.9900591 0.7270351 0.5744009
a %>% ((sin %>=>% cos) %>=>% abs)
## [1] 0.6663667 0.6143003 0.9900591 0.7270351 0.5744009
a %>% (sin %>=>% (cos %>=>% abs))
## [1] 0.6663667 0.6143003 0.9900591 0.7270351 0.5744009
abs(cos(sin(a)))
## [1] 0.6663667 0.6143003 0.9900591 0.7270351 0.5744009
Note that the above %>=>% operator is not in fact the desired general Kleisli operator as we haven’t implemented the critical Currying services that the magrittr %>% operator supplies (these services would be what category theorists call the “endofunctor” which would map R functions to specialized savable R functions; definitions such as function(f,g) { . %>% f %>% g } won’t work either as we would need to capture unevaluated arguments inside the %>=>% function and cannot delegate that to interior %>% calls).
In the Kleisli category we would no longer need special monad axioms; they are replaced by the more common associativity and identity axioms of a category. One still has to prove you have a category, but it is a more standard task.
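As a sketch of that more standard task, the category axioms (identity and associativity of composition) can be spot-checked for the simplified %>=>% operator defined earlier. The helper same() is hypothetical and only compares results on the sample values a, so this is evidence rather than a proof.

```r
# Simplified Kleisli-style composition, as defined earlier in the article
`%>=>%` <- function(f, g) { function(x) { g(f(x)) } }
ret <- function(x) { x }   # the identity arrow

a <- 1:5

# Hypothetical helper: two arrows agree if they agree on the sample values
same <- function(p, q) { isTRUE(all.equal(p(a), q(a))) }

# Associativity: (f >=> g) >=> h == f >=> (g >=> h)
assoc <- same((sin %>=>% cos) %>=>% abs, sin %>=>% (cos %>=>% abs))
# Left identity: ret >=> f == f
left_id <- same(ret %>=>% sin, sin)
# Right identity: f >=> ret == f
right_id <- same(sin %>=>% ret, sin)

print(c(associativity = assoc, left_identity = left_id, right_identity = right_id))
```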
You can say standard imperative styles of analysis that see operations as sequenced transient steps that mutate data are “strict left to right associative” ways of thinking (the “((a %>% sin) %>% cos) %>% abs” form). Databases, and standard data analyses in R are usually so organized.
The “general associativity” way of thinking (the “h <- sin %>=>% cos %>=>% abs; a %>% h” form) emphasizes the processing pipeline as a reusable entity and data as transient quantities that flow through the pipeline. Systems like Weka, LingPipe, and graphical data science tools such as Alpine Data workflow notebooks essentially represented processes in this manner.

Figure: Analysis steps as a durable graph (Alpine Data workflow notebooks).
The guarantee of full associativity is just a mathematical way of saying: you can mix these styles (data oriented or operator oriented) and you are guaranteed the same result in either case.
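The two styles, and a mix of them, can be compared directly. This is a minimal sketch, assuming magrittr is installed and re-using the simplified %>=>% operator from earlier in the article; the variable names data_style, operator_style, and mixed are illustrative.

```r
library(magrittr)

# Simplified Kleisli-style composition, as defined earlier in the article
`%>=>%` <- function(f, g) { function(x) { g(f(x)) } }

a <- 1:5

# Data-oriented: the value is pushed through one step at a time
data_style <- ((a %>% sin) %>% cos) %>% abs

# Operator-oriented: build the pipeline first, then send data through it
h <- sin %>=>% cos %>=>% abs
operator_style <- a %>% h

# Mixed: one step applied to the data, the remaining steps pre-composed
mixed <- (a %>% sin) %>% (cos %>=>% abs)

print(all.equal(data_style, operator_style))
print(all.equal(data_style, mixed))

# The saved pipeline is reusable on other data
print(all.equal(h(6:10), abs(cos(sin(6:10)))))
```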
And there you have it, more category theory than a data scientist should need to worry about.
jmount
Data Scientist and trainer at Win Vector LLC. One of the authors of Practical Data Science with R.
The above really has to be taken informally.
For example here is a note on how hard it is to really make a true mathematical category where the arrows are functions implemented in a general computer language http://math.andrej.com/2016/08/06/hask-is-not-a-category/ . The issues include checking identity between partial functions without the help of an axiom like extensionality.
You would also have to accept the error return demonstrated as a legitimate object in the category (so our category is definitely more than just functions, despite wishes). The main point I wanted to make is that the non-right associativity of the magrittr pipe isn’t a foundational problem: the familiar axioms don’t seem to insist on that property.
Roughly, monads are all about composition (though there are a lot of technical details). So saying you have a monad means you have a nice system with associative composition. This is something we expect from our intuition of working with mathematical functions, and something that is hard to guarantee for programs trying to implement functions. Hence the work is to document when the intuition carries from mathematical functions to execution of computer programs.
After a lot of study the observation that “a monad is just a monoid in the category of endofunctors” does make some sense, if you look up all the words (see here http://stackoverflow.com/questions/3870088/a-monad-is-just-a-monoid-in-the-category-of-endofunctors-whats-the-issue ). Really we are hoping that composition keeps us in the same world (semigroup properties) and we have an identity (monoids are semigroups with an identity). “Endofunctors” just means functors from a category to itself (functor itself meaning functions from categories to categories that respect arrows in addition to mapping objects).
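The “semigroup plus identity” reading can be made concrete with plain function composition in base R. This is only a sketch of the monoid idea on ordinary functions, not of endofunctors; the names compose2() and compose_all() are hypothetical.

```r
# Hypothetical helpers, for illustration only.
# compose2 is the associative operation: run f, then g.
compose2 <- function(f, g) {
  force(f); force(g)   # evaluate the arguments now rather than lazily
  function(x) { g(f(x)) }
}

# Fold any number of functions together, using identity as the unit element
compose_all <- function(...) { Reduce(compose2, list(...), identity) }

a <- 1:5

# Composition keeps us in the same world: the result is again a function
pipeline <- compose_all(sin, cos, abs)
print(all.equal(pipeline(a), abs(cos(sin(a)))))

# The identity is the unit: composing nothing at all leaves values unchanged
empty <- compose_all()
print(identical(empty(a), a))

# Grouping does not matter (associativity), checked on the sample values
left  <- compose2(compose2(sin, cos), abs)
right <- compose2(sin, compose2(cos, abs))
print(all.equal(left(a), right(a)))
```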
I find the notations all fairly difficult relative to their payoff, so I don’t actually use the concepts that often.
Also of interest is the magrittr %,% operator, which is for function composition, or like a version of %>% that associates the other direction. Great article on the history and theory here: https://www.r-statistics.com/2014/08/simpler-r-coding-with-pipes-the-present-and-future-of-the-magrittr-package/ . And finally there is Bizarro Pipe which actually uses “;” as its implementation (taking the whole monad thing full circle).
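For readers who have not seen it, the Bizarro Pipe as it is usually written is the token sequence “->.;”: a right-assignment into a variable named “.” followed by a statement separator. The sketch below uses only base R; the variable name result is illustrative.

```r
a <- 1:5

# Each "->.;" stores the value so far in a variable named "." and ends the
# statement; the next expression then reads "." as its input.
a ->.; sin(.) ->.; cos(.) ->.; abs(.) -> result

print(all.equal(result, abs(cos(sin(a)))))
```

Unlike %>%, this leaves a variable named “.” behind in the calling environment, which is part of the trade-off of using plain assignment and “;” as the pipe.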