Cale

Inversions in permutations and words

Posted in Cale on January 9th, 2010 by Cale – Be the first to comment

An inversion in a permutation is a pair of indices such that
but . Let denote the total number of inversions of . One can uniquely record a permutation by recording, for each the number of inversions , which will be a number from to . One can recover the entry of the permutation (and hence the whole permutation) from such a recording by the following algorithm. Let be the set of the first entries of the permutation. If the number of inversions of the form is , then there must be exactly elements after the position which are smaller than it. Hence, the entry of the permutation is the smallest element of .

Hence,


is a weight preserving bijection, where is the usual additive weight.

By simply reversing the sequences (using the index substitution ), we get a further bijection:

So these two sets will have the same generating series. Using the product and sum lemma, we obtain:

We now extend the definition of inversions to words on the alphabet . If is the symbol at the position of the word , then an inversion in is a pair of positions such that , but .

We wish to know the generating series with respect to inversions for the set of words of length on with occurrences of .

To get it, we will use an indirect decomposition, decomposing an element as a word , together with permutations in a bijection which additively preserves inversions, where and is a set with elements.

That is, our weight-preserving bijection will look like:

We split the domain of a permutation into blocks with sizes in the obvious way, with the block consisting of the elements where . Note that acts on in a canonical way, with a permutation sending to .

The word is then defined by if .

We define the permutation for each as the subword of consisting of the elements of . Note that if and are both elements of , then is an inversion of if and only if it is also is an inversion of .

It will help to illustrate this bijection with an example. Take , and , so that , and .

Let be the permutation (in word form):

Then we construct the word by taking to be the block in which occurs:

We then compute:

To invert the above process, we simply replace the occurrence of in the word with . Hence, this is a bijection.

To see that it is additively weight-preserving with respect to inversions, we do a bit of a case analysis. We classify the inversions of according to whether and are in the same block or in different blocks.

If they are in different blocks, say and , then since , we must have , and so . Hence is also an inversion of . Conversely, every inversion of will clearly be an inversion of of this type, since if but , we have that and , but since , any element of is greater than any element of , and in particular .

If they are in the same block , then by a previous argument, is an inversion of if and only if it is an inversion of .

Hence, we have that .

By subtracting from each of the elements of , we obtain a standard permutation , and this is obviously inversion-preserving.

So, we have a weight-preserving bijection:

Hence, we have a corresponding equation of generating series. Let be the generating series for
. Then:


and hence,

[pdf version]

Permutation indices

Posted in Cale on January 9th, 2010 by Cale – Be the first to comment

Here’s just a quick post of a plot I did a quite a while back:

invmaj

This animation shows for through to the number of permutations with a given number of inversions and major index. I love the way that discrete structure slowly falls away into something nearly continuous. I would have included more frames in the animation, but as there are permutations in it gets expensive to compute these by brute force counting. I would need to come up with a more effective method.

An inversion of a permutation on is a pair of elements with such that That is, the permutation reverses their order. The inversion number, is the number of such pairs that a given permutation has. For example, the permutation (regarded as a word, not a cycle) has inversions and so

A descent in a permutation is a position with where a larger number occurs immediately before a smaller one. That is, The major index, is defined as the sum of the descents. For example, in the permutation we have descents at the first position (3 falls to 1), and at the third (4 falls to 2). So the major index of this permutation is

The plot shows at position the number of permutations with exactly inversions, and major index as a normalised shade of grey, darker meaning larger. Indices increase left to right and top to bottom.

You can easily see from the symmetry of the plots that permutations are equidistributed with respect to these two indices, so that we can expect the polynomial where the coefficient of is the number of permutations with inversion number and major index will be a symmetric polynomial in and

Categorical Logic I

Posted in Cale on December 8th, 2009 by Cale – 2 Comments

Suppose we have almost any deduction system whatsoever. (I’m going to refine things and ask for more structure as we go along, but for now it will do.)

Our deduction system consists of a bunch of statements, and some rules for determining when some statements follow from others. It’s not much of a stretch to demand that given any statement , we will be able to prove from it, that is, that we have some trivial deduction . Also, in order to keep things interesting, if we have a deduction , and a deduction , we ought to have some way to join them into a deduction . This will be some sort of “concatenation of proofs”, but we’re not going to care how it’s implemented, just so long as it is associative, and the “identity deduction” above is an identity for it.

So, anyone who knows me well enough should know what’s coming now. We form a category whose objects are statements, and where the set of arrows are exactly the deductions . That is, the proofs of given . The identity arrows are given by trivial deductions, and the composition is our concatenation of proofs.

Let’s rediscover basic logical operations in terms of category theoretic constructions.

Starting simple, what would an initial object correspond to? Well, the property of an initial object is that for any we have a (unique) arrow This seems to suggest that ought to be a statement which is trivially false, as anything can be deduced from it.

How about a terminal object ? There we have for any , a unique arrow So this is behaving like something trivially true.

You might recall that in an arbitrary category with a terminal object, we often define an element of an object to be an arrow What would that be here? Well, it’s a proof of starting from something which is trivially true, or just a proof of Dually, a coelement of which is an arrow will be a proof of falsity from or a refutation of

How about the product of statements and ? Well, should be a statement which, according to the projection maps, implies both and and moreover, if any other statement implies both and then there must be a (unique) proof that implies making the following diagram commute:

These are the usual properties attributed to logical AND, so the existence of (finite) products is, modulo perhaps some squabbling about uniqueness of proofs, equivalent to the notion that we have statements representing the conjunctions of other statements.

In a similar fashion, the coproduct has essentially all the properties we’d expect from logical disjunction (OR) of statements.

How about exponential objects? Recall that if and are objects in some category , then an exponential object is an object of together with an arrow

so that whenever there is an arrow

we have that there is a unique arrow

such that the following diagram commutes:

In order to better follow this, we can try setting to be , in which case becomes simply an arrow and becomes an arrow which we can think of as an element of the exponential.

That is, every proof of should correspond to a proof that

So exponential objects correspond directly with implication.

What about logical negation? Well, we said earlier that an arrow would act like a refutation of so a natural thing to do would be to define to be

Okay, so if our deduction system is in fact a Cartesian closed category with coproducts, it means we have encodings of the usual logical operations. What about the usual properties of logical operations with respect to each other?

Firstly, can we prove the distributive laws? In classical logic, we have that

and that

Secondly, do we get De Morgan’s laws for free, or do we need to add something more to this formulation?

I’ll leave those questions to be answered next time.

[pdf version]

A note about conjugation of permutations

Posted in Cale on November 28th, 2009 by Cale – Be the first to comment

I had a feeling that while the use of linear algebra in Robin’s notes about the action of conjugation in permutation groups is interesting, I thought it would be nice to see why the conjugacy classes are determined by cycle type by a purely combinatorial argument. After a chat about it, Robin encouraged me to write this note.

Let’s start by looking just at what happens when we conjugate a single cycle by an arbitrary permutation , both in , giving:

First of all, which elements of will be affected?

Diagrammatically, we have something which looks like this:

Here, the vertices are elements of , and the arrows show where various permutations send them, according to which permutation they are labelled with.

There are two sets of elements to consider: those which are sent to some by , and so are equal to for some and those which are not.

If is not sent to something affected by the cycle by , so that

then we have:

so that is unaffected by the conjugated cycle.

On the other hand, if

we have that:

(You can follow this in the diagram above, by starting at the and taking the arrows successively.)

So the upshot of this is that:

As each is sent by to , and everything else is left untouched.

To rearrange the diagram a little to reflect this new insight, we have something like this:

The trivial but important corollary here is that the image of a cycle under conjugation is always another cycle of the same length.

Another important result is that given two cycles of the same length, say and we can easily find a permutation conjugation by which will send one to the other. Start with a function

This will be well-defined, since each element of the first cycle occurs only once, and injective since any element will only occur once in the second cycle. So it will extend to a permutation in (in a number of different ways in general). Any such permutation will do.

Since there are different ways to line up the elements of the two cycles, and ways to rearrange the remainder of the elements, there will be permutations that conjugate any single cycle of length in to any other cycle of length

So individual cycles are conjugate if and only if they have the same length.

The next thing to note is that since every permutation can be written as a product of disjoint cycles, and conjugation is a group homomorphism, we have that conjugation preserves cycle type.

But not only does it preserve cycle type, as with individual cycles, given two permutations having the same cycle type, we can easily construct a permutation that sends one to the other, in much the same way as we did with the individual cycles. Just line up all the cycles of the same length in each permutation (there’ll be some choice in the matter, but any choice will do), and define a permutation sending each element of each cycle to the corresponding element of the corresponding cycle. Of course, unlike the case with just a single cycle, after choosing how to line things up this time, there will be no remaining choices to make (provided that we include cycles of length 1 in the process, if any).

For example, if we have:

then we can simply read off the columns, choosing

to get

and then

Rather than checking that we did this correctly by hand, we can confirm it using gap, as follows:


gap> a := (1,2)(3,6,7)(5,9,12)(4,8,11,10);
(1,2)(3,6,7)(4,8,11,10)(5,9,12)
gap> b := (5,8)(1,3,6)(2,10,4)(7,12,11,9);
(1,3,6)(2,10,4)(5,8)(7,12,11,9)
gap> sigma := (1,5,2,8,12,4,7,6,3)(9,10)(11);
(1,5,2,8,12,4,7,6,3)(9,10)
gap> a^sigma = b;
true

And there you have it.

To summarise, conjugacy of permutations is entirely determined by their cycle type, and in general, there will be one permutation that conjugation by which sends one to the other for each way to align corresponding cycles of the same length.

So, if the cycle type is some partition:

where the are distinct, and the indicate the number of times that cycle length occurs (and so are not an exponent in the usual sense), then since we have ways to pair up cycles of length , and then for each of these cycles, we can choose from cyclic shifts to pair up members of that cycle, we have that the total number of permutations sending one member of this conjugacy class to another under conjugation are:

For example, my permutations above both had cycle type and so the number of ways we could have constructed by rearranging the cycles in is:

Because of this computation, we can tell how many elements are in each conjugacy class. If for every permutation that sends to , we have exactly of them which do the same thing, we have that starting from any permutation there are distinct things we can get by conjugating with each of the permutations in which is then the size of the conjugacy class

[pdf version]

Natural transformations 5

Posted in Cale on November 18th, 2009 by Cale – Be the first to comment

Suppose that we have categories, functors, and natural transformations as suggested by the following diagram:

It’s not yet clear at this point, even though we have horizontal and vertical composition, that this diagram defines a well-defined composite, since there are at least two ways we could put this a diagram together. We could either take the vertical composites and , and then horizontally compose them, or we could take the horizontal composites and and then vertically compose them. However:

So indeed, we have the law that:

which is known as the interchange law.

The interchange law and the associativity of vertical and horizontal composition together give us the somewhat-awkward-to-formalise but intuitive result that if we have any diagram of natural transformations that we could compose, it doesn’t matter what order we choose to do it in. So the pasting diagrams that we’ve been using, being somewhat ambiguous about it, are really an appropriate notation to use.

It’s also possibly worth noting here that the functor distributivity laws we used to prove the interchange law and the associativity of horizontal composition are also special cases of the interchange law. For example:

There’s a valuable analogy to be had here with the multiplication and tensor product of linear maps.

If we have vector spaces , and linear maps and then we can define the tensor product of linear maps:

With this definition, we have that:

so we have that:

The situation for natural transformations, vertical, and horizontal composition looks so similar to the one for linear maps, composition, and tensor product that it seems there must be something going on here.

In fact there is, and in order to explain it, we’ll need the notion of a weak 2-category, which I’ll save for another time.

[pdf format]

Natural transformations 4

Posted in Cale on November 16th, 2009 by Cale – Be the first to comment

Okay, so we’ve seen at least a handful of examples of these natural transformation things. They are actually all over the place if you know where to look for them. As we move along through other categorical structures, we’ll see a fair number of them.

So, if natural transformations are supposed to be the arrows between functors, and we’re intending to do some category theory, we should be unsatisfied with just an unstructured set of them. We really ought to be able to compose them! Owing to their two-dimensional nature, there will actually be two ways.

Let’s start with the most obvious one, the one which turns the collection of functors between any two categories into a category itself.

Suppose we have categories, functors and natural transformations as depicted in this diagram:

So, these line up nicely, and for any object we have components:

and if we have any arrow in :

then we have the following diagram in :

Since each naturality square commutes, the outer rectangle does as well:

which means that we have a new natural transformation, called the vertical composite of and defined by:

(The above equation is in a box, because it is so very important to remember when working with these things!)

Moreover, we have identity and associativity fall out essentially immediately from the identity and associativity of arrows in (If this isn’t clear, then check it!)

That is, for any natural transformation

where is the identity natural transformation:

and if we have natural transformations like this:

then,

Now let’s turn to the second type of composition. This time, we have categories and functors like so:

So, for any we have the following components (examine this from right to left, as I’ve aligned the bits here with which category they lie in, in the diagram above):

Because is natural, we have that the square in actually commutes:

So, we will define the horizontal composite of this and as the common value:

This might seem complicated and hard to remember, but we can simplify it a bit if we define (for any appropriate and ):


so that if

then

And similarly,


so that if

then

Then the definition above becomes

or more abstractly,

The notation is starting to better reflect the notion that we’re just doing on the part, and on the part, and it doesn’t matter which one we do first, so long as we do them both.

The notation is also getting simpler. As we develop the properties of vertical and horizontal composition, this will all get conceptually much easier to keep track of.

One thing which is nice to notice is that these new notations we’ve defined for sticking a functor onto a natural transformation are the same as horizontal composition with the identity natural transformation on that functor:

The proof that is similar, and a good exercise in checking that you’re comfortable with the definitions and notation.

We also have that is associative. I’m going to start working entirely at the level of functors and natural transformations here, since writing in so many places is becoming tedious, and starts to make the presentation less clear. You’re invited to check anything which is not immediately apparent (in particular, the dubious steps that I’ve marked with (!?)).

We have functors and natural transformations as follows:

That’s obviously important so let’s put it in a box:

It’s slowly becoming more apparent that perhaps it doesn’t matter at all about which order we choose to paste some complicated diagram of natural transformations together, and that everything is perfectly associative in every possible way that it could be.

We almost have it, there is but one remaining basic property we’ll need, and I’ll save it for the next post.

[pdf version]

Natural transformations 3.b

Posted in Cale on November 15th, 2009 by Cale – Be the first to comment

The other natural transformation from basic linear algebra which I’d like to say something about is actually between two functors

Which might seem surprising, until I tell you what they are.

First of all, if is any field, we can form the group of invertible matrices with entries in , commonly known as the general linear group, or Moreover, if is any field homomorphism, then we get an induced group homomorphism

which simply applies the field homomorphism to each of the entries of the matrix. It’s not hard to check that this is actually a functor

from the category of fields to the category of groups.

Secondly, if is any field, we can take its group of units (invertible elements) under multiplication , which, since it’s a field, is everything except for So long as you’re familiar with the definitions of these things, it’s easy to see that any field homomorphism induces a group homomorphism directly, and that the functor laws are obeyed. So we have:

In addition to these, we have for any field , a group homomorphism

the determinant! It’s a group homomorphism now because of the basic results from linear algebra that:

Now, the naturality square,

commutes because the determinant is a polynomial in the entries of the matrix, and field homomorphisms commute with any polynomial map.

So the determinant too numbers among natural transformations.

[pdf version]

Natural transformations 3.a

Posted in Cale on November 15th, 2009 by Cale – Be the first to comment

Let’s have a look at the details of just a couple more examples of natural transformations, this time from linear algebra.

One example is just so standard that I can’t help but give it. If we have some finite dimensional vector space over the field , then the dual space is the space of all linear transformations , which in this special case are referred to as linear functionals. It’s not too hard to show that has the same dimension as , and therefore must be isomorphic to it. One way is simply to note that the dimension of the space of linear maps is equal to the product of the dimensions of and , and that the dimension of over itself is of course .

However, there’s no obvious isomorphism to pick. If you fix a basis, there’s a sort of obvious one to choose from there, but each possible choice of basis for leads to a different isomorphism.

However, consider the double dual: which is the space of linear maps The situation suddenly seems much different. Of course, again we have that is isomorphic to , but this time there is a natural choice of isomorphism , in particular,

That is, we send a vector to the linear map which takes a linear functional and sends it to

So, there’s a sense here that this choice of isomorphism seems somehow “natural”, and it’s certainly quite special, but how do we formalise that?

Well, we’d like to be able to say that this double dual thing was a functor, and then we could apply our categorical definition of what natural means.

So let’s look at the dual first, and consider a linear map Can we construct from it a linear map Well, it seems not in any obvious way aside from choosing particular isomorphisms.

How about then? We could go for making into a functor:

Given a linear functional and a linear map we can just compose them to get a linear functional

This process of converting linear maps into linear maps is usually called the transpose (as it corresponds to taking the transpose of matrix representations in a particular way), and we define

We also have that


and of course this will send the identity transformation to the identity transformation. So this really is an honest functor.

So, this means that the map which sends each vector space to its double dual and each linear map to the transpose of its transpose is now a plain functor

Our from above is now primed to become the component of a natural transformation:

All we have to check is that this diagram commutes:

Easy enough, there’s lots of flipping things around and notation, but it’s all mechanical. For any and , we have:

So it works out, and this really is a natural transformation.

[pdf version]

Natural transformations 2

Posted in Cale on November 15th, 2009 by Cale – Be the first to comment

Now that we have the definition of a natural transformation fleshed out, let’s take a look at least a couple of examples before we go off talking about how we can glue them together in various ways.

I’ve been meaning to get some stuff into this blog about functional programming, so let’s borrow our first real example from Haskell. I think most of our readers probably know already, but for anyone who isn’t aware, Haskell is a nice functional programming language with lots of cool features, and I highly recommend checking it out if you’re a mathematician.

So, in Haskell, the various datatypes, together with the Haskell-definable functions between them form a category. Composition of functions f :: B -> C and g :: A -> B is given by the composition function defined in the Prelude:

(.) :: (b -> c) -> (a -> b) -> (a -> c)
(f . g) x = f (g x)

And of course, the identity maps are also in the Prelude:

id :: a -> a
id x = x

So we’ll refer to this category of Haskell types and functions as In addition to being a category, we have a convenient way within Haskell to talk about functors

Our functor needs to have a part which is a function on types, and parametric types allow us to define some. For instance, for each type a, we have the type [a] of lists of values of type a, and we can define other datastructures which are parametrised on a type, such as binary trees with values of type a at each of the branches:

data Tree a where
   Leaf   :: Tree a
   Branch :: a -> Tree a -> Tree a -> Tree a

This defines Leaf as a value of type Tree a for any type a, and Branch as a function which takes a value of type a and two trees, each of type Tree a, and formally constructs a value of type Tree a.

I am using GADT syntax here, which is a syntax extension supported by GHC. You can turn it on with the flag -XGADTs, or by writing

{-# LANGUAGE GADTs #-}

at the top of your source file.

For example, we might represent the tree of integers:

using the value:

t :: Tree Integer
t = Branch 5 (Branch 2 (Branch 1 Leaf
                                 Leaf)
                       (Branch 4 (Branch 3 Leaf
                                           Leaf)
                                 Leaf))
             (Branch 8 (Branch 6 Leaf
                                 (Branch 7 Leaf
                                           Leaf))
                       (Branch 9 Leaf
                                 Leaf))

Of course, being a function on types is not enough. If the type constructor [] for lists or Tree for trees is to be a functor, we need a way to send functions (a -> b) to functions [a] -> [b] or Tree a -> Tree b.

In order to provide a uniform interface to such mappings, the Haskell Prelude also defines the following class of type constructors:

class Functor f where
   fmap :: (a -> b) -> (f a -> f b)

Of course, these are really just endofunctors, but for purposes of programming, those are the ones we’re usually most concerned with.

To define [] and Tree as functors then, we write instances:

instance Functor [] where
   fmap f [] = []
   fmap f (x:xs) = f x : fmap f xs

For those unfamiliar with Haskell, the first equation says that applying fmap f to the empty list gives the empty list. The second says that applying it to a nonempty list whose first element is x and where the remainder of the list is called xs (think plural), is the list whose first element is f applied to x, and where the rest of the list is fmap f xs. The end result is that it just applies the function f to each of the values in the list, leaving the list structure otherwise unchanged.

Very similarly, we can define a functor instance for trees:

instance Functor Tree where
   fmap f Leaf = Leaf
   fmap f (Branch x left right) = Branch (f x) (fmap f left)
                                               (fmap f right)

This again just applies the function f to each of the values on the branches of the tree, leaving the structure alone.

Now, these are both functors with common domain and codomain, so perhaps we can describe a natural transformation between them?

A natural transformation would give, for each type a, a function whose type would be Tree a -> [a]. So, we would expect that perhaps some polymorphic function will serve as a natural transformation.

However, this gets better: Since a polymorphic function of that type is disallowed by the type system to actually inspect values of type a as it does its work, it can only shuffle them around in some way which is independent of their values, and the only way it can get values of type a to fill the list is by plucking values from the tree it receives.

So, we would expect that if we applied a function to the values on the branches of the tree beforehand, and then the polymorphic function from trees to lists, the result would be the same as if we applied the polymorphic function first to get a list, and then applied the function to the values of the resulting list. The naturality square commutes!

To write the naturality equation using Haskell syntax, if

eta :: Tree a -> [a]

is our natural transformation, then

fmap f . eta = eta . fmap f

This equation cannot be used directly as part of a program, but will be a law that our code satisfies.

To rephrase all that, every polymorphic function of that type is a natural transformation, and Haskell’s type system will actually ensure that we don’t write code which violates this.

So, what’s a good example? How about something like an inorder traversal?

inorder :: Tree a -> [a]
inorder Leaf = []
inorder (Branch x l r) = inorder l ++ [x] ++ inorder r

Here, the operator (++) is concatenation of lists.

We can test this function in our Haskell interpreter:

ghci> inorder t
[1,2,3,4,5,6,7,8,9]

Aside: This is not the best algorithm to choose from a complexity standpoint because concatenation of lists takes time on the order of the length of the first list, making the overall traversal quadratic. We can actually get a much faster implementation by factoring this map through the type of functions:

[a] -> [a]

by using functions which add appropriate elements to the beginning of a list in lieu of actual lists. Concatenation of lists becomes composition of functions then, which is constant time. We can then apply our function to the empty list once it is constructed.

inorder :: Tree a -> [a]
inorder t = inorder' t []
  where
    inorder' :: Tree a -> ([a] -> [a])
    inorder' Leaf = id
    inorder' (Branch x l r) = inorder' l . (x:) . inorder' r
     -- NB. (x:) is the function which adds x
     --                   to the beginning of a list.

So to reiterate, from this very discrete programming standpoint, we have that one sort of natural transformations are these polymorphic functions which take one sort of data structure to another, leaving the elements of the structure untouched, and merely permuting, duplicating, or deleting them somehow in a way which is independent of their individual values.

One can imagine from this standpoint many sorts of natural transformations on sorts of combinatorial structures, expressed as functors (or variations thereof: the category of finite sets whose arrows are just the bijections is a good choice for those who are interested primarily in enumeration). I will leave the details of this approach to enumerative combinatorics for another time, but the keyword if you’re interested is “combinatorial species”.

[pdf version]

Natural transformations 1

Posted in Cale on November 15th, 2009 by Cale – Be the first to comment

I’d like to let a larger group of people in on my series of posts about adjunctions and monads, which itself is designed to let a larger group of people in on this stuff I’d like to say about vector spaces, functional programming, and other areas.

To that end, I’ll push the stuff I’m going to write about adjunctions onto the stack, and do a few posts right now about natural transformations. You should hopefully be able to follow from here if you already are comfortable with the basic definitions of what categories and functors are, but I will be throwing some other topics into the examples for enrichment.

So, to start things off, let’s have a look at what natural transformations are and have a look at a couple examples before we get into what things we can do with them.

Suppose that we have categories and and two parallel functors. Our natural transformation will go between the functors:

(The reason our arrows in diagrams of natural transformations will go from right to left and and bottom to top is that it will make composition formulas easier to read off later.)

The natural transformation consists of, for each object , an arrow in . Furthermore, for every arrow in we have the following commutative square in :

That is,

A good interpretation of this naturality square in many settings is that if we’re thinking of each as some sort of structure built on in some way, and as acting as does on the bits which came from and leaving the additional structure alone, then the component of the natural transformation acts on the other half: it rearranges the structure into a structure, leaving the bits which come from alone.

There is an alternate definition from which we might be able to extract some more intuition that I would like to take a quick look at here:

We will define the categorical interval as the extremely simple category with two objects , and a unique nonidentity arrow between them. (This category is also sometimes known as )

Suppose that and are functors as before. Then, a homotopy of functors is defined to be a functor such that:

If we have a functor like this, then for any in , we have the following diagram of arrows in :

Because the composite of arrows in the product category is defined by we have that each of those two triangles commutes, and so the square does as well.

But now if we can define for each , the arrow we can replace things in the above diagram according to the equations we have to get:

So these components give rise to a natural transformation

Moreover, if we start with a natural transformation, , we can define a homotopy of functors using , and construct the remainder of the definition of according to the laws it must satisfy. So the two definitions are equivalent.

Those familiar with topology at this point are probably quite happy with this, but for everyone else who is in the dark, the analogy here is to the concept of a homotopy of continuous maps.

In topology, we might have topological spaces and (if you don’t know about topological spaces, you can think of subsets of perhaps), and a pair of continuous functions The notion we’re trying to define is when it’s possible to continuously deform the function into the function . We define a homotopy of continuous maps as a continuous map , where this time, our is the interval of real numbers:

subject to the conditions that for each point :

The way we typically think of this is to consider the second parameter to be like time. If we pick some point , then the map will trace out a continuous curve in which goes from at to at . This curve is analogous to our component .

It seems no accident that the founders of category theory were topologists.

[pdf version]