Posted on January 5, 2022

Lenses to the left of me, Prisms to the right

There’s an interesting way to think about lenses and prisms. Lenses model processes that perform internal computation and interact with the environment. Prisms model processes that perform internal computation or interact with the environment. Let me explain what I mean.

A lens \((A, A') \to (B, B')\) is usually thought of as a pair of maps \(get : A \to B\) and \(put : A \times B' \to A'\). This is the classic presentation of a lens. Here we’ll instead use the coend representation, which is more symmetric. It’s defined in terms of two maps \(f : A \to A \times B\) and \(b : A \times B' \to A'\), standing for forward and backward, respectively. For any lens defined in terms of \(get\) and \(put\) maps, we can recast it to the coend form: define \(f\) as the composite \(A \xrightarrow{\Delta_A} A \times A \xrightarrow{A \times get} A \times B\), while \(b = put\). This is the representation of a lens as an optic, and it makes explicit the fact that the forward part of the lens saves an internal state of type \(A\). In the image below, the forward map \(f\) is the gray box and the internal state is drawn as the vertical wire (this is also called the “residual”). The dot is the copy map \(\Delta_A\).

We can imagine a particle flowing in this lens, as an input of type \(A\). A lens takes in this input and does two things in parallel in the forward pass: it copies it down on the vertical wire (residual/internal state) and it also uses the map \(get\) to turn it into an output \(B\). Then, this output \(B\) is turned into a “response” \(B'\) by the environment (not drawn), and on the backward pass via the \(put\) map it’s turned back into \(A'\). In other words, in this lens we can trace out “two paths” the data flows through starting from \(A\) and ending at \(A'\): it goes through the residual (which is in this case also of type \(A\)) and it goes through both \(B\) and \(B'\).
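To make the two paths concrete, here is a minimal Python sketch of the recasting from \(get\)/\(put\) to the forward/backward form, together with one round trip through an environment. The helper names `lens_to_optic` and `run_lens`, and the example lens on the first component of a pair, are my own illustrative assumptions rather than anything from a library.

```python
def lens_to_optic(get, put):
    """Recast a (get, put) lens into a (forward, backward) pair."""
    def forward(a):
        # Copy the input: one copy is saved as the residual,
        # the other is sent through get to produce the output B.
        return (a, get(a))

    def backward(residual, b_response):
        # backward : A x B' -> A' is just put, applied to the
        # saved residual and the environment's response.
        return put(residual, b_response)

    return forward, backward


def run_lens(forward, backward, environment, a):
    """Trace one particle: forward pass, environment, backward pass."""
    residual, b = forward(a)           # A -> A x B
    b_response = environment(b)        # B -> B', done by the environment
    return backward(residual, b_response)  # A x B' -> A'


# Hypothetical example: a lens focusing on the first component of a pair.
get_first = lambda pair: pair[0]
put_first = lambda pair, new_first: (new_first, pair[1])

forward, backward = lens_to_optic(get_first, put_first)
result = run_lens(forward, backward, lambda x: x + 1, (10, "tag"))
print(result)  # (11, 'tag')
```

Note that `run_lens` always calls `environment`: both the residual and the focus are used on every run.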

We can think of this as a “conjunctive” data flow: both paths are traversed. The lens computes some internal state and interacts with the environment. This is useful in machine learning (where we want to remember which point we’re differentiating with respect to on the backward pass) and in game theory (where we want to remember the state for which we’re going to receive a payoff).

A prism, on the other hand, is characterized by a disjunctive data flow. Starting from \(A\), a prism routes the data to go one of two ways: either inside through the residual or outside into the environment through \(B\) and \(B'\). We see this animated below.

Prism trace

More precisely, a prism \((A, A') \to (B, B')\) is a pair of maps \(match : A \to A' + B\) and \(build : B' \to A'\). This is the classic presentation of a prism. As we’ll now see, it too has a more symmetric coend representation. It’s defined in the analogous way, in terms of two maps \(f : A \to A' + B\) and \(b : A' + B' \to A'\) standing for forward and backward. For any prism defined in terms of \(match\) and \(build\) maps, we can recast it to the coend form by setting \(f = match\) and setting \(b\) to the composite \(A' + B' \xrightarrow{A' + build} A' + A' \xrightarrow{\nabla_{A'}} A'\), where \(\nabla_{A'}\) is the codiagonal (fold) map. In the image below we can see how the prism is an upside-down, mirror image of a lens.

We can also imagine a particle flowing through a prism. A prism takes in an input \(A\), checks for some condition, and allows the environment to execute some computation only if that condition is true. This computation is of type \(B \to B'\). That is, the environment takes in a \(B\) that the prism produced and returns a \(B'\) which the prism receives. The prism then has to translate that \(B'\) into an \(A'\), i.e. it has to know how to translate the answer from the environment into the type that it would’ve ended up with if the condition had been false.

This shows that, unlike a lens, a prism routes data one of two ways: either through the residual into \(A'\), or through the environment. These two branches of computation are shown as two animations above. For each input of type \(A\), only one of them will get executed. Prisms are quite often used in parsing. In game theory, we would use prisms if we want to allow agents to stop playing a game.
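Here is a minimal Python sketch of the same idea, assuming sums \(X + Y\) are encoded as tagged tuples. The names `prism_to_optic`, `run_prism`, `match_int` and `build_int` are hypothetical, and the parsing example is only an illustration of the "prisms are used in parsing" remark.

```python
RESIDUAL, FOCUS = "residual", "focus"  # tags for the two summands


def prism_to_optic(match, build):
    """Recast a (match, build) prism into a (forward, backward) pair."""
    forward = match  # f = match : A -> A' + B

    def backward(tagged):
        # A' + B' -> A' + A' -> A': apply build on the focus branch,
        # then forget which branch we came from (the codiagonal).
        tag, value = tagged
        return value if tag == RESIDUAL else build(value)

    return forward, backward


def run_prism(forward, backward, environment, a):
    """Trace one particle: the environment is queried only on the focus branch."""
    tag, value = forward(a)
    if tag == FOCUS:
        return backward((FOCUS, environment(value)))
    return backward((RESIDUAL, value))


# Hypothetical example: a prism that matches strings made of digits.
match_int = lambda s: (FOCUS, int(s)) if s.isdigit() else (RESIDUAL, s)
build_int = str

calls = []  # records every value the environment was asked about


def environment(n):
    calls.append(n)
    return n * 2


forward, backward = prism_to_optic(match_int, build_int)
doubled = run_prism(forward, backward, environment, "21")
untouched = run_prism(forward, backward, environment, "abc")
print(doubled, untouched, calls)  # 42 abc [21]
```

The environment is called once across the two runs: the input `"abc"` never leaves the prism.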

I find this operational perspective of lenses and prisms enlightening. If we think of lenses and prisms as little machines, then lenses are machines that always query the environment as they operate. Prisms, on the other hand, only sometimes query the environment, depending on the values of certain variables in their internals.

Here I am, stuck in an Affine Traversal with you

Can we combine lenses and prisms? Can we model a process that has both disjunctive and conjunctive data flows? The answer is yes, and we get something that’s called an affine traversal. To understand it, let’s go back to the formal definition.

Both lenses and prisms are optics, and optics have a neat coend representation. Given a symmetric monoidal category \(\mathcal{C}\), the hom-object of optics is defined as follows:

\[\mathbf{Optic}(\mathcal{C})((A, A'), (B, B')) = \int^M \mathcal{C}(A, M \otimes B) \times \mathcal{C}(M \otimes B', A')\]

Lenses fall out when \(\mathcal{C}\) is a cartesian monoidal category, and prisms fall out when \(\mathcal{C}\) is a cocartesian monoidal category. So how do we compose lenses and prisms?
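Before answering, it may help to see what “fall out” means concretely. Assuming the standard Yoneda-style reduction of the coend (which I won’t derive here), the two special cases collapse to the classic presentations from the first section:

\[\int^M \mathcal{C}(A, M \times B) \times \mathcal{C}(M \times B', A') \cong \mathcal{C}(A, B) \times \mathcal{C}(A \times B', A')\]

\[\int^M \mathcal{C}(A, M + B) \times \mathcal{C}(M + B', A') \cong \mathcal{C}(A, A' + B) \times \mathcal{C}(B', A')\]

The right-hand side of the first line is a pair \((get, put)\), and the right-hand side of the second is a pair \((match, build)\).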

Since both are given by some monoidal categories, we’d expect that affine traversals are given by a monoidal category too. But it’s not clear how to compose two monoidal categories. What would we even expect the result to be? As always, making sure we understand what it is that we want to do will lead us to the answer. And if we think about it, we realise that we don’t want to treat these as just monoidal categories.

In the above definition, looking at the operator \(\otimes\) (which appears twice), its left operand \(M\) has different semantics than its right operand. The object \(M\) is private data of an optic hidden from the outside world, while objects \(B\) and \(B'\) are “ports” available to the outside world. Even more, if we think about the underlying functor of the monoidal category \(\otimes : \mathcal{C} \times \mathcal{C} \to \mathcal{C}\), there’s nothing in the definition of the optics that requires the first argument to be the same as the second. The private information \(M\) can be of a different type than \(B\) and \(B'\), which are available to the outside world.

This brings us to the world of actegories. They have a slick definition. Let \(\mathcal{M}\) be a monoidal category, and \(\mathcal{C}\) just a category. An \(\mathcal{M}\)-actegory \(\mathcal{C}\) is a strong monoidal functor \(\bullet : \mathcal{M} \to [\mathcal{C}, \mathcal{C}]\). Uncurrying \(\bullet\) we can represent actegories in a more familiar form, as a functor of type \(\mathcal{M} \times \mathcal{C} \to \mathcal{C}\) plus some equations. This looks similar to a monoidal category, and the extra level of generality will be needed to represent affine traversals.
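As a rough sketch of what the extra structure looks like in uncurried form (leaving out the coherence conditions it has to satisfy), strong monoidality of \(\bullet\) gives natural isomorphisms, for objects \(M, N\) of \(\mathcal{M}\) and \(X\) of \(\mathcal{C}\):

\[I \bullet X \cong X, \qquad (M \otimes N) \bullet X \cong M \bullet (N \bullet X)\]

Assuming the same action is used on both the forward and backward side, the definition of optics then reads the same as before, with \(\bullet\) in place of \(\otimes\) and the residual \(M\) now living in \(\mathcal{M}\):

\[\int^{M : \mathcal{M}} \mathcal{C}(A, M \bullet B) \times \mathcal{C}(M \bullet B', A')\]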

An actegory is like a parameterised map. An \(\mathcal{M}\)-actegory \(\mathcal{C}\) is a map \(\mathcal{C} \to \mathcal{C}\) parameterised by \(\mathcal{M}\). And parameterised maps have a well-defined composition rule. A \(P\)-parameterised map \(A \to B\) and a \(Q\)-parameterised map \(B \to C\) can be plugged together to obtain a \(P \otimes Q\)-parameterised map \(A \to C\), as animated below.
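At the level of ordinary functions, the composition rule can be sketched in a few lines of Python. This is only an illustration under the assumption that a \(P\)-parameterised map \(A \to B\) is a function taking a parameter and an input; `compose_para`, `scale` and `shift` are hypothetical names, and the order of the pair \((p, q)\) is a convention I picked.

```python
def compose_para(f, g):
    """Compose f : P x A -> B with g : Q x B -> C into (P x Q) x A -> C."""
    def composite(params, a):
        p, q = params          # the composite is parameterised by a pair
        return g(q, f(p, a))   # run f with p, then g with q
    return composite


# Hypothetical example: scaling followed by shifting.
scale = lambda p, a: p * a    # P-parameterised map
shift = lambda q, b: q + b    # Q-parameterised map

scale_then_shift = compose_para(scale, shift)
value = scale_then_shift((3, 1), 5)
print(value)  # 16, i.e. 3 * 5 + 1
```

The two parameters stay separate inside the pair, which is exactly what later lets an affine traversal carry two distinct kinds of private state.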

Every symmetric monoidal category \(\mathcal{C}\) is a \(\mathcal{C}\)-actegory \(\mathcal{C}\), which means we have a clear way of composing lenses and prisms – as actegories! Let me spell it out.

A cartesian monoidal category is a \(\mathcal{C}\)-parameterised map \(\mathcal{C} \to \mathcal{C}\). A cocartesian monoidal category is a \(\mathcal{C}\)-parameterised map \(\mathcal{C} \to \mathcal{C}\). Composing them as parameterised maps, we obtain a \(\mathcal{C} \times \mathcal{C}\)-parameterised map \(\mathcal{C} \to \mathcal{C}\). This is a \(\mathcal{C} \times \mathcal{C}\)-actegory \(\mathcal{C}\), i.e. a strong monoidal functor \(AF : \mathcal{C} \times \mathcal{C} \to [\mathcal{C}, \mathcal{C}]\).¹ Fixing some parameters \((C, D) : \mathcal{C} \times \mathcal{C}\), what does the resulting functor \(AF(C, D) : \mathcal{C} \to \mathcal{C}\) do? It takes some object \(B\) to \(C + D \times B\). If we unpack the definition of optics for this actegory, we recover the affine traversal defined in Profunctor Optics: A Categorical Update (Sec. 3.2.1).
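Spelled out, and assuming the cocartesian action is applied after the cartesian one, the composite is:

\[AF(C, D)(B) = \big((C + -) \circ (D \times -)\big)(B) = C + (D \times B)\]

As a quick check on the units, writing \(0\) for the initial object and \(1\) for the terminal object, the pair \((0, 1)\) acts trivially:

\[AF(0, 1)(B) = 0 + (1 \times B) \cong B\]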

This seems pretty neat, but what does an affine traversal actually do? Lenses compute an internal state and interact with the environment. Prisms compute an internal state or interact with the environment. Can we interpret an affine traversal in a similar way? The answer is yes. The first thing we have to note is that unlike lenses and prisms, the affine traversal is parameterised by a pair of private objects \((C, D) : \mathcal{C} \times \mathcal{C}\). This means that we’ll have a prism-like internal state \(C\) and a lens-like internal state \(D\). Looking at the coend below

\(\int^{C, D} \mathcal{C}(A, C + D \times B) \times \mathcal{C}(C + D \times B', A')\)

we can see that an affine traversal can do either one, but not both of these things:

Compute an internal state \(C\)
Compute an internal state \(D\) and interact with the environment

While technical derivations are available in the aforementioned paper, the high-level intuitions I presented here don’t seem to be. I find them useful, since they give an operational view of the affine traversal. We can also easily see how to recover prisms (by setting \(D = I\), where \(I\) is the unit of \(\times\), i.e. the terminal object) and lenses (by setting \(C = I\), where \(I\) is the unit of \(+\), i.e. the initial object).
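The two options can be sketched in Python as follows, again assuming sums are encoded as tagged tuples. The names `run_affine`, `forward_head` and `backward_head` are hypothetical; the example is an affine traversal onto the head of a list, which has a focus only when the list is non-empty.

```python
STOP, GO = "stop", "go"  # tags for the two summands of C + D x B


def run_affine(forward, backward, environment, a):
    """Trace one particle through an affine traversal."""
    tag, payload = forward(a)              # A -> C + D x B
    if tag == STOP:
        # Prism-like branch: only the internal state C, no environment.
        return backward((STOP, payload))
    # Lens-like branch: keep the internal state D, send B to the environment.
    d, b = payload
    return backward((GO, (d, environment(b))))  # C + D x B' -> A'


# Hypothetical example: focus on the head of a list, if there is one.
def forward_head(xs):
    if not xs:
        return (STOP, xs)            # C: the (empty) list itself
    return (GO, (xs[1:], xs[0]))     # D: the tail, B: the head


def backward_head(tagged):
    tag, payload = tagged
    if tag == STOP:
        return payload
    rest, new_head = payload
    return [new_head] + rest


times_ten = lambda x: x * 10
updated = run_affine(forward_head, backward_head, times_ten, [1, 2, 3])
empty = run_affine(forward_head, backward_head, times_ten, [])
print(updated, empty)  # [10, 2, 3] []
```

In this encoding, a forward map that never returns `STOP` behaves like a lens, and one whose `GO` branch carries no information in `d` behaves like a prism.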

Conclusions

It’s been known for a while that affine traversals arise as compositions of lenses and prisms. What I couldn’t find in the literature was the definition of this composition operator, as well as a more detailed operational view. The perspective of lenses and prisms as processes that either make us query the environment, or allow us to query the environment was invaluable, and my key to understanding affine traversals.

This unlocks more questions. All optics in Profunctor Optics: A Categorical Update are defined using actegories, and actegories can be composed. Can we use this method to come up with new types of optics? In this blog post we talked about the affine traversal, but can we recover the traversal as a limit of some infinite sequence of compositions? Are there some even more exotic types of optics, other than kaleidoscopes, glasses and grates?

And lastly, it’s well known that lenses can be used to help us model all sorts of interaction protocols (1, 2, 3). But prisms, affine traversals, and the rest of the optical gadgets do not appear as often in theories of interaction protocols, nor in machine learning or game theory. And it seems they ought to, because they allow us to describe important kinds of systems: those with the ability to choose whether to execute some code or not. Such systems, especially in machine learning, are becoming more and more important.

Thanks to Ieva Čepaitė and David Orion Girardo for a read-through of this post.

¹ While we can easily obtain the functor \(AF : \mathcal{C} \times \mathcal{C} \to [\mathcal{C}, \mathcal{C}]\), showing it is strong monoidal will require a distributive law between the two underlying actegories.

Bruno Gavranović
