Littlewood–Richardson Numbers

Posted in Robin on October 1st, 2010 by Robin – Be the first to comment

The Littlewood–Richardson rule describes how to take the product of two Schur functions:


The Schur functions are an important basis for the ring of symmetric functions which arise in many different branches of mathematics. We don’t have time to discuss them in detail here.

The coefficients are known as the Littlewood–Richardson numbers. They admit many nice combinatorial descriptions. We shan’t give any proof today, but we will give some examples.

Let , and .

The binary string associated to is , the binary string associated to is , and the binary string associated to is . A one represents a horizontal step and a zero represents a vertical step. We are thinking of all three partitions as sitting inside a by box.

In this case we have that and The corresponding Littlewood–Richardson puzzles are:

The red dots should be thought of as “right moving particles” and the blue dots as “left moving particles”, with time flowing up the page. The black dots are “holes” in the underlying triangular lattice.

When free to do so, a red particle takes one step the right, and a blue particle takes one step to the left.

The partition, is on the bottom of the triangle read left to right, with ones represented as red dots and zeros by blue dots. The partition is on the right of the triangle read top to bottom, with ones represented as red dots and zeros by black dots. The partition is on the left of the triangle read bottom to top with ones represented by black dots and zeros represented by blue dots.

When both a left mover and a right mover “want” to occupy the same location, either the red particle gets “scattered” by the blue particle, or the blue particle gets “scattered” by the red.

It is possible for a single blue particle to get scattered by multiple red particles, and simmilarly for a single red particle to get scattered by multiple red particles.

It is not however possible for both a blue and a red particle to scatter though each other. The following configuration is forbidden:

For completeness, here are the corresponding Littlewood–Richardson tableaux:

Define the LR-word of a skew tableau to be the sequence of entries read from right to left, top to bottom. The LR-word of the first tableau is . The LR-word of the second tableau is .

In addition to the usual rule for skew-tableau, that is weakly increasing along rows, and strictly increasing along the columns, in order to qualify a Littlewood–Richardson tableau it must satisfy the following additional property: For every initial subword of the LR-word of the tableau, and for each , there must be at least as many occurences of
the label as the label .

Infinite Dimensions II

Posted in Robin on September 24th, 2010 by Robin – Be the first to comment

Today we’re going to diagonalize an infinite dimensional Linear operator. Recall that we are working over “some funny function space”, with inner product defined by:


The operator represents “multiplication by

The operator represents times “differentiation by

The operator we are going to diagonalize today is:

This is the Hamiltonian for the Quantum Harmonic Oscillator but you don’t need to know that.

The first thing that you might instinctively try to do is use the formula for the difference of perfect square to write:


The only problem with this is that the operators and do not commute. In actual fact, what we have is:

To see this, take a test function and compute:





Nevertheless, it is still useful to define the operators:


A quick calculation reveals that:




So its not so bad. The eigenfunctions of are exactly the same as the eigenfunctions of , only the eigenvalues are shifted by a half. For notational convenience, we shall define:

Now, lets see how and commute:



Interesting… This “almost commuting” property of and has many useful consequences. For example:




As a consquence of this, if is an eigenfunction of with eigenvalue then we have:



In other words is an eigenfunction of with eigenvalue . For this reason is called a raising operator. Similarly we have:



This implies that if eigenvalue of with eigenvalue then:



For this reson is called a lowering operator.

Now, you may be surprised to learn that we already have an eigenfunction for , namely our good friend the Gaussian from last week. Lets see:





Pretty cool, huh?

We can now construct infinitely many eigenvectors for by repeatedly applying the raising operator:


That is, for any , the following is an eigenfunction of (with eigenvalue :

One can check that the eigenfunctions are always of the form:

The polynomials are known as the Hermite Polynomials and we shall have more to say about them soon.

Infinite Dimensions

Posted in Robin on September 16th, 2010 by Robin – 1 Comment

We have spent a lot of time studying linear transformations on finite dimensional vector spaces. This week, lets have a look at some infinite dimensional vector spaces for a change.

One of the most common infinite dimensional vector spaces you are likely to encounter is known as . The “L” stands for Lebesgue, and upto some subtleties about sets of measure zero, may be thought of as the space of square integrable functions.

A function:


is said to be square integrable if:

Remember that since we are working with complex valued functions:

A function is square integrable if when you take the square of the absolute value of the function then the area under the curve is finite. This is important if we want to normalize the function in order to think of it as some kind of probability distribution.

Unfortunately, most of your “everyday” functions, such as polynomials and exponentials are not square integrable, which can make it difficult to develop intuition about this space of functions.

One example of a square integrable function which you may be familiar with from probability theory is the Gaussian:

There is a little trick for showing that the Gaussian is square integrable, which involves first taking the square, and then switching to polar co-ordinates and integrating by parts. Let us define:


Then, recalling that in polar co-ordinates, the formula for the area element is given by:

We have:







It follows that:

and the Gaussian is indeed square integrable. Note that since a probability distribution must have total area equal to one, probabilists usually like to normalize Gaussian as:

This is also called the normal distribution.

We shall see later that if is any polynomial in , then the function is still square integrable. In fact, if we allow for infinite sums, there is a sense in which we can extract a basis from this, but it would require too much functional analysis to make this statement precise just now.

Next, the space of square integrable functions can be given an inner product as follows:

All we need now is some linear operators. Lets start with the Fourier transform:

Even though the Fourier transform is a map from back to itself, we use the variable to denote functions the domain of and the variable to denote functions in the image.

Actually, the Fourier transform can be defined on a much larger class of functions than just the square integrable ones. One way to think about it is to imagine that the exponential functions form some kind of “pseudo-basis” indexed by the real parameter . Now is the “component” of in the “direction” of using the inner product defined above.

As another example of a linear transformation on an infinite dimensional vector space, let us define to be the operator “multiplication by “. In otherwords:


Similarly, let us define to be times the operator “differentiation by “. That is:

Unfortunately these operators are not well-defined in the sense that they always send square-integrable functions to square-integrable functions. They are examples of what functional analysts call “unbounded operators” but we’re not going to worry about this too much.

Working with infinite dimensional vector spaces requires a certain suspension of disbelief. So, while we’re entertaining somewhat sketchy possibilities, let us note that the operator is diagonalizable, with eigenvectors given by the exponential functions:

Of course, the exponential functions are most certainly not square integrable, but don’t worry, it gets worse. The operator is diagonalizable with eigenvectors given by the Dirac delta function.

The Dirac delta function is a curious creature which is defined to be zero everywhere except for at the point , whilst somehow managing to have total integral equal to one. Its defining propery is that for any function we have:

If you like, you can imagine it as an infinitely tall, infinitely thin spike centered at the point , arising as the limit of a sequences of Gaussians getting taller and thinner, taller and thinner, as the standard deviation goes to zero. All this can be made rigorous using the notion of Schwarz distribution but that would be another story…

Despite its non-existance, the direct delta function is a very uesful beast. The important point to take away is that, almost by definition, the Fourier transform of the Dirac delta function is the exponential function:


In other words, the Fourier transform is an intertwinner between the operators and

If you didn’t understand that, then don’t worry. All you really need to know is that the two functions and are very important in Quantum mechanics and are lots of fun to play around with.

To be continued….

Infinitesimal Rotations

Posted in Uncategorized on September 9th, 2010 by Robin – Be the first to comment

It is well known that the exponential has power series expansion:



It is slightly less well-known that the exponential operator can also be applied to matrix. Consider the following anti-symmetric matrix:

Since , and we have:





And what do you know? Its our favourite rotation matrix!

But wait, there’s more… Let and consider we have:



But we also have:

And indeed:

This is, of course a special case of something much more general. The set of by anti-symmetric matrices form a Lie algebra with the commutator as bracket.

The exponential of an anti-symmetric matrix is always orthogonal. That is, the associated Lie group is the group of orthogonal matrices.

A Lie group is also a smooth manifold, and as such has a tangent space. The tangent space at the identity is precicely the Lie algebra.

Expanding and to first order, we get:

In other words, the matrix:


lives in the Lie algebra of the orthogonal group and can be thought of as an infintesimal generator for the two by two rotations.

Now consider the action of on the following matrix representing a hyperbola in projective space:

As we saw earlier:

What happens if we differentiate this action? Of course we have:


But we also have:


Now, keeping in mind that , and that commutes with , this gives us:


Indeed:




Its interesting to see that when acts by conjugation, the derivative of this action is the commutator with the infintessimal generator .

Matrices and Determinants

Posted in Robin on September 2nd, 2010 by Robin – Be the first to comment

Let denote an invertible matrix, and let denote its inverse.
Suppose that we are given a column vector and that we wish to solve the equation for the unknown vector .

Since is invertible with inverse , this is equivalent to:


In co-ordinates:

This would be great if we had a nice expression for the inverse of a matrix.

Now, let denote the columns of .
That is:


If denotes the standard basis then:

That is, the columns of the matrix are the images of the standard basis vectors.

Another way of looking at the problem is as follows. By definition, we have:


To solve the equation for we must find unknowns such that:

In other words, we are trying to find the co-ordinates of the vector in terms of a new basis.

Recalling now that the determinant is multi-linear and anti-symmetric, consider the expression:



Let us now rewrite the equation as the following “formal” determinant , which has entries in the first row the standard basis vectors, and additional “projective symbol” :

Expanding along the top row, whilst making use of the previous remark (and being careful with the signs), we get:




which after “deprojectivizing” gives us exactly the vector which we were searching for.

Next, by expanding along the -th column, we obtain:


where denotes the determinant of the minor obtained from by removing the -th row and the -th column. From which we recover the well-known formula for the inverse of a matrix:

Special Relativity and Hyperbolae

Posted in Robin on August 27th, 2010 by Robin – Be the first to comment

The matrix for the Lorenz Transformation, in one space dimension and one time dimension (with ) is given by:

I shan’t derive this matrix from the “principle of relativity” which states that the laws of physics are identical in all inertial frames of reference, for this has been done very well elsewhere, and is not the purpose of this post.

Let us return instead to our good friend, the two by two rotation matrix, in order to make a few observations:

Consider a right angled triangle with sides lengths , and . If you remember the definition of your basic trigonometric functions, as well as Pythagorus’ theorem, then you will “see” that we have:


This gives us the following alternative parameterization for the rotation matrices, where :


It looks rather like the Lorenz transformation, doesn’t it? Though not quite.

If you recall now the definition of the hyperbolic trigonometric functions, you will “see” that and , and as a consequence, we have:


This in turn gives us an alternative parameterization for the Lorenz Transformation, where :

The upshot of this is that if you are willing to work over rather than over , and to think of your space dimension as purely real, and your time dimension as purely imaginary, then applying a “Lorenz boost” to your frame of reference is equivalent to rotating through an imaginary angle.


In the same way that a regular rotation maps circles to circles in two space dimensions, a “hyperbolic rotation” sends hyperbolae to hyperbolae in one space dimension and one time dimension. The following mathematica code will let you play around with this idea.


points = {{1, 0}, {2, 0}, {3, 0}, {4, 0}};

A[theta_] := {{Cos[theta], -Sin[theta]}, {Sin[theta], Cos[theta]}};
B[v_] := {{1/Sqrt[1 - v^2], v/Sqrt[1 - v^2]}, {v/Sqrt[1 - v^2],
    1/Sqrt[1 - v^2]}};

Plot[y /. Table[Solve[x^2 + y^2 == k^2, y], {k, 5}], {x, -5, 5},
  PlotRange -> {-5, 5}, AspectRatio -> 1];
Manipulate[
 Show[%, ListPlot[points . A[theta], PlotStyle -> PointSize[0.025],
   PlotRange -> {{-5, 5}, {-5, 5}}, AxesOrigin -> {0, 0},
   AspectRatio -> 1]], {{theta, 0}, -Pi, Pi}]

Plot[y /. Table[Solve[x^2 - y^2 == k^2, y], {k, 0, 8}], {x, -10, 10},
  PlotRange -> {-10, 10}, AspectRatio -> 1];
Manipulate[
 Show[%, ListPlot[points . B[theta], PlotStyle -> PointSize[0.025],
   PlotRange -> {{-5, 5}, {-5, 5}}, AxesOrigin -> {0, 0},
   AspectRatio -> 1]], {{theta, 0}, -0.9, 0.9}]

Since a particle moving at the speed of light is represented in this picture by the degenerate hyperbola which form the asymptotes of the rest of the family, it is clear now why the speed of light is constant in all inertial frames of reference.

But what’s really going on here? Let be the matrix representing a Minkowski bilinear form with one space dimension and one time dimension:


Then any matrix of the form:

will have the property that:

Compare this to the Euclidean case, in which the bilinear form is represented by the identity matrix. Now we have that any matrix of the form:


has the property that:

In co-ordinates, if and represent two different particles in two dimensional space, then the Euclidean inner product is usually represented as:

In the Minkowski picture, where time is imaginary, a “point” represents a particle located at a given position in one dimensional space, at a given time. The “inner product” of two “points” and is given by:

Conic Sections and Projective Space

Posted in Robin on August 20th, 2010 by Robin – Be the first to comment

I’m sure you’ve all seen the equation of a circle before:


You’ve probably also seen the equation of a hyperbola:

Or perhaps you’re more familliar with seeing the asymptotes at 45 degree angles to the co-ordinate axes, as in:

All these are special cases of the most general quadratic equation, which can be written in the following projective form:

Note that the by matrix in the middle is symmetric. As an example, the Family of circles centered at the origin, with radius are associated to the family of matrices of the form:

The first form of the hyperbola, with asymptotes parallel to the co-ordinate axes, corresponds to the matrix:


while the second hyperbola, with asymptotes at 45 degrees, corresponds to the matrix:

More generally, the matrix associated to a hyperbola whose axes have been tilded by an angle of clockwise from the diagonal is:



The following mathematica code will let you play around with these rotations.

   v = Transpose[{{x, y, 1}}];
   similtude[A_, M_] := Transpose[M].A.M
   equation[A_] := Solve[similtude[A, v][[1]][[1]] == 0, y]

   R[theta_] := {{Cos[theta], -Sin[theta], 0},
                 {Sin[theta], Cos[theta], 0},
                 {0, 0, 1}};
   H = {{1,0,0},{0,-1,0},{0,0,-1}}
   W[theta_] := similtude[H, R[theta]]

   Manipulate[
     Plot[Evaluate[y /. equation[W[theta]]], {x, -4, 4},
       PlotRange -> {{-4, 4}, {-4, 4}},
       AspectRatio -> 1],
   {{theta, 0}, -Pi, Pi}]

A few remarks about the code. In Mathematica square brackets are reserved for passing arguments into functions, curly brackets for defining lists, and normal round brackets for grouping terms in algebraic expressions.

Double square backets are used for indexing elements of a list. Matrices are represented as lists of list. The first list is the first row, the second list is the second row, etc…

The matrix multiplication operator is simply a dot. When defining your own functions, the variables which can be passed into a function must be followed by an underscore.

The built in functions, Transpose and Solve do more or less what you would expect. The only subtlety is that the output of Solve is a transformation rule. The /. notation replaces the left hand side with the right hand side in the transformation rule.

The built in function Plot is responnsible for drawing static plots. Leaving a parameter free (in this case the angle theta) and wrapping it inside the Manipulate command allows you to vary the parameter with the mouse and view the resulting animation.

Recall that translation can be thought of an operation in projective space represented by the matrix:

Now, starting from the matrix for a circle of radius one, centered at the origin:


and applying the following similarity transform:



We obtain the matrix for a circle of radius one, centered at the point . The following code will let you play around with this:

  T[a_, b_] = {{1, 0, -a}, {0, 1, -b}, {0, 0, 1}};
  CC = {{1,0,0},{0,1,0},{0,0,-1}}
  M[a_, b_] := similtude[C, T[a, b]]
  Manipulate[
    Plot[Evaluate[y /. equation[CC]], {x, -2, 2},
      PlotRange -> {{-2, 2}, {-2, 2}}, AspectRatio -> 1],
  {{a, 0}, -1, 1}, {{b, 0}, -1, 1}]

Now for the fun part. Let us define the following “projective rotation”

Using this, we can smoothly deform a circle into a hyperbola. Check it out:

   P[theta_] := {{Cos[theta], 0, -Sin[theta]}, {0, 1, 0},
                 {Sin[theta], 0, Cos[theta]}};
   S[theta_] := similtude[CC, P[theta]]

   Manipulate[
     Plot[Evaluate[y /. equation[S[theta]]], {x, -10, 10},
       PlotRange -> {{-10, 10}, {-10, 10}},
       AspectRatio -> 1],
   {{theta, 0}, -Pi, Pi}]

But what’s really going on here geometrically? The following three dimensional quadratic equation describes a mathematical cone (which is more like a pair of icecream cones, touching at their tips):

By setting we are effectively taking the intersection with a plane. As you may know, all quadrics can be obtained by intersecting a cone with some given plane (hence the term conic section). The plane gives a circle, while the plane gives a hyperbola. Our “projective rotation” smoothly rotates the axis to the axis.

Translations as Projective Transformations

Posted in Robin on August 11th, 2010 by Robin – Be the first to comment

You probably already know that in two dimensional space, the geometric act of rotating a vector through a given angle can be represented algebraically by the matrix:

You might have also asked yourself “What about translation?” That’s a nice geometric operation, how can we represent it algebraically in terms of matrices?

The problem with translation is that it doesn’t preserve the origin, and is thus not a linear transformation. It is however a projective transformation, and so by working in projective space it is still possible to think of translations in terms of matrix operations. This is a common trick used in computer graphics packages such as openGL.

There are many different ways to think about two dimensional projective space. Geometrically it is the set of lines in three dimensional space which pass through the origin. Topologically its the two dimensional sphere with antipodal points identified. Think of the fact that a line through the origin in intersects the unit sphere in exactly two points.

Algebraically, it is the set of triples under the equivalence relation for all .

Another useful way to think about projective two-space is as regular two dimensional space with an additional “circle* at infinity”, that is an additional point at infinity for every possible direction which you might head out in from the origin. Imagine the hyperplane sitting inside and notice how any line passing through the origin, which does not lie in the two dimensional subspace spanned by the and directions, intersects this plane in exactly one point.

Working with the algebraic definition, a two dimensional projective transformation is just a by matrix acting on projective co-ordinates. The equivalence relation on the underlying projective space implies that two matrices which differ only by a scalar multiple represent the same projective transformation.

Suppose that we want to send the arbitrary point to the new point which is shifted across two steps and down three steps.

The first step is to embed the two dimensional plane, which we are really interested in, into the somewhat more abstract and difficult to understand two dimensional projective space, where we are going to perform our transformation. This is accomplished by the map:

The “inverse” of this embedding is the map:


I say “inverse” in inverted commas, because this map is not defined everywhere. In particular its not defined when .

Alternatively, if you have trouble visualizing projective space, just imagine embedding into as the hyperplane given by the equation . The only thing we’re really missing in this picture is the “circle at infinity” which will be important later when we consider quadrics, but does not really matter just now while we are only concerned with translation operators (which send infinity to infinity anyhow).

Consider now the following projective transformation:


(or if you prefer, just think of it as a regular three dimensional transformation, which happens to map the hyperplane back to itself)

If we apply this to the image of our point under our funny embedding, then we get:

Et Voila! After taking the inverse of the embedding, this is exactly the image of the point that we were looking for.

* As correctly pointed out by Cale Gibbard, its not actually a circle at infinity, but a projective line. If two people head out in opposite directions along the plane they will reach the same point at infinity.

Back from the Dead

Posted in Robin on August 5th, 2010 by Robin – Be the first to comment

This blog has been dead for some time, but since I am now officially on the job market, I have decided to re-open it. I plan to post something new at least once a week.

Data Mining

Posted in Links, Robin on January 28th, 2010 by Robin – Be the first to comment

For anybody who is interested, here are two online video lecture series from Stanford University:

Statistical Aspects of Data Mining

Machine Learning

Also, the data set for the netflix prize.