Geodesics of left invariant metrics on matrix Lie groups – Part 2 Conservation laws

In the last post, Geodesics of left invariant metrics on matrix Lie groups – Part 1, we derived Arnold’s equation, which is one half of the problem of finding geodesics on a Lie group endowed with a left-invariant metric.

Suppose G is a Lie group, and g(\xi,\eta) is a scalar product (i.e. a nondegenerate symmetric bilinear form) on its Lie algebra Lie(G). Then, using left translations, g defines a left invariant (Riemannian or pseudo-Riemannian) metric on the whole group G. If t\mapsto a(t)\in G is a path in G, we use left translations to define the image \omega(t) of the tangent vector \dot{a}(t) in Lie(G):

(1)   \begin{equation*}\omega(t)=a(t)^{-1}\dot{a}(t).\end{equation*}

Usually it is written more carefully, using L_{a^{-1}} instead of a^{-1}, but I am using a simplified notation, well adapted to dealing with problems for matrix groups.

If a(t) is a geodesic for the metric g, then \omega(t) satisfies Arnold’s equation

(2)   \begin{equation*}\dot{\omega}=B(\omega,\omega),\end{equation*}

where

    \[B:Lie(G)\times Lie(G)\rightarrow Lie(G)\]

is defined as

(3)   \begin{equation*}B^k_{ij}=g^{kl}B_{ij,l}=g^{kl}C_{jl,i}=g^{kl}g_{im}C_{jl}^m,\end{equation*}

and C_{jl}^m are the structure constants of G, that is

(4)   \begin{equation*}[\xi_i,\xi_j]=C_{ij}^k\, \xi_k\end{equation*}

where \xi_i form a basis for Lie(G) and

(5)   \begin{equation*}B_{ij,l}=g_{lk}B_{ij}^k,\,C_{ij,l}=g_{lk}C_{ij}^k.\end{equation*}
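As a sanity check of Eq. (3), here is a minimal numerical sketch. It assumes the Lie algebra so(3) with structure constants C_{ij}^k=\epsilon_{ijk} and a diagonal metric g=\mbox{diag}(I_1,I_2,I_3); the function names and the numerical values are illustrative choices, not taken from the post. Under these assumptions the right-hand side of Eq. (2) should reduce to Euler's equations for the free rigid body.

```python
import numpy as np

def levi_civita():
    """Return the Levi-Civita symbol as an array eps[i, j, k]."""
    eps = np.zeros((3, 3, 3))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k] = 1.0
        eps[j, i, k] = -1.0
    return eps

def arnold_B(C, g):
    """B[k, i, j] = g^{kl} g_{im} C_{jl}^m, with C[j, l, m] = C_{jl}^m (Eq. (3))."""
    g_inv = np.linalg.inv(g)
    return np.einsum('kl,im,jlm->kij', g_inv, g, C)

def arnold_rhs(B, omega):
    """Right-hand side of Eq. (2): B^k_{ij} omega^i omega^j."""
    return np.einsum('kij,i,j->k', B, omega, omega)

# Assumed example data: so(3) and a diagonal "inertia" metric.
I1, I2, I3 = 1.0, 2.0, 3.0
C = levi_civita()
g = np.diag([I1, I2, I3])
B = arnold_B(C, g)

omega = np.array([0.3, -1.1, 0.7])
w1, w2, w3 = omega
# Euler's equations for the free rigid body, written out by hand.
euler = np.array([(I2 - I3) * w2 * w3 / I1,
                  (I3 - I1) * w3 * w1 / I2,
                  (I1 - I2) * w1 * w2 / I3])
print(arnold_rhs(B, omega), euler)
```

The tensor contraction is written with `np.einsum` so that the index pattern can be compared letter by letter with Eq. (3).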

Equation (2) is a system of nonlinear ordinary differential equations with constant coefficients. In this form it can be found in Arnold’s 1966 paper “Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses applications à l’hydrodynamique des fluides parfaits”, Ann. Inst. Fourier (Grenoble) 16 (1966), 319-361.

In his blog post “The Euler-Arnold equation” Terence Tao mentions that some people call this equation the “Euler-Arnold” equation, while others prefer to skip Arnold’s name completely and call it the “Euler-Poincaré equation”. Go figure!

Solving equations (2) is just one half of the whole problem of finding a geodesic. Once \omega^{i}(t) are known, we need to solve the linear differential equation with variable coefficients that results from the definition (1) of \omega:

(6)   \begin{equation*} \dot{a}(t)=a(t)\omega(t).\end{equation*}

For the case of the rotation group and free rigid body we did it in Taming the T-handle continued.
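The reconstruction step (6) can be sketched numerically as follows. This is only an illustration for the rotation group SO(3): it assumes that \omega(t) is already known as a function of time, and it advances a(t) by right multiplication with small rotations, a_{n+1}=a_n\exp(h\,\omega(t_n+h/2)), so that a stays in the group. The helper names `hat`, `exp_so3` and `reconstruct`, and the sample \omega(t), are hypothetical.

```python
import numpy as np

def hat(w):
    """Map w in R^3 to the skew-symmetric matrix with hat(w) @ v = w x v."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Matrix exponential of hat(w), computed with the Rodrigues formula."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-12:
        return np.eye(3) + W  # first-order approximation near zero
    return (np.eye(3) + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * (W @ W))

def reconstruct(omega_of_t, t_end, n_steps, a0=None):
    """Integrate da/dt = a hat(omega(t)) by group-valued midpoint steps."""
    a = np.eye(3) if a0 is None else a0.copy()
    h = t_end / n_steps
    for n in range(n_steps):
        a = a @ exp_so3(h * omega_of_t((n + 0.5) * h))
    return a

# Hypothetical time-dependent angular velocity, for illustration only.
omega_demo = lambda t: np.array([np.cos(t), np.sin(t), 0.5])
a_end = reconstruct(omega_demo, t_end=2.0, n_steps=200)
print(np.abs(a_end.T @ a_end - np.eye(3)).max())
```

Because each step multiplies by an exact rotation, the result stays orthogonal up to rounding errors, which a plain Euler step a_{n+1}=a_n+h\,a_n\omega_n would not guarantee.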

B. Kolev in his paper “Lie groups and mechanics. An introduction” shows that for the rotation group O(n) the equations of motion for the free rigid body are completely integrable. We will not need this result, but it is useful to know that in the general case there are always two quadratic constants of motion corresponding to “kinetic energy” and “square of the angular momentum”. We discussed these constants of motion for the group O(3) in “Asymmetric Spinning Top – The Hardest Concept To Grasp In Physics”, where they are used in the so-called Poinsot construction. We will discuss a version of it for the group O(2,1) in the future.

The first observation is that the “kinetic energy” T(t)=g_{ij}\omega(t)^i\omega(t)^j is, in fact, a constant, independent of t. To see that this is the case, we differentiate:

(7)   \begin{equation*}\frac{dT(t)}{dt}=g_{kl}\dot{\omega}^{k}\omega^l+g_{kl}\omega^{k}\dot{\omega}^l=2g_{kl}\omega^k\dot{\omega}^l.\end{equation*}

In the previous post we wrote Eq. (2) as

(8)   \begin{equation*}g_{kl}\,\dot{\omega}^l=C_{jk,i}\,\omega^{i}\omega^{j}.\end{equation*}

Substituting into Eq. (7) we get

(9)   \begin{equation*}\frac{dT(t)}{dt}=2C_{jk,i}\omega^{i}\omega^j \omega^{k}=0.\end{equation*}

The result is zero because C_{jk,i} is antisymmetric in j,k while \omega^j\omega^k is symmetric. That is an often used property: if A^{ij}=-A^{ji} and B_{ij}=B_{ji}, then the contraction A^{ij}B_{ij}=0. Indeed A^{ij}B_{ij}=-A^{ji}B_{ij}=-A^{ji}B_{ji}=-A^{ij}B_{ij}, where in the last equality we have exchanged dummy indices names i\leftrightarrow j. If a number is equal to its negative, it must be zero.
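The contraction rule is easy to test numerically. The following small sketch uses random matrices; the symmetric factor is called `S` here (instead of B) to avoid confusion with the operator B of Eq. (2), and the size 4 and the random seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
N = rng.normal(size=(4, 4))

A = M - M.T   # antisymmetric: A^{ij} = -A^{ji}
S = N + N.T   # symmetric:     S_{ij} =  S_{ji}

# Full contraction A^{ij} S_{ij}; it vanishes up to rounding errors.
contraction = np.einsum('ij,ij->', A, S)
print(contraction)
```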

To get the formula for the second quadratic invariant we need to return to the Ad-invariant scalar product that we have denoted \mathring{g} in Killing vectors, geodesics, and “Noether’s theorem”:

(10)   \begin{equation*}\mathring{g}(\eta_1,\eta_2)=\mbox{const}\, \frac{1}{2}\mbox{Re}(\mbox{Tr}(\eta_1\eta_2)).\end{equation*}

The fact that \mathring{g} is Ad-invariant implies an important relation between the matrix \mathring{g}_{ij} and the structure constants C_{ij}^k that we are going to use. Ad-invariance means that

(11)   \begin{equation*}\mathring{g}(e^{t\eta}\xi_1 e^{-t\eta},e^{t\eta}\xi_2 e^{-t\eta})=\mathring{g}(\xi_1,\xi_2)\end{equation*}

for all t\in\mathbf{R} and \xi_1,\xi_2,\eta\in Lie(G).
Differentiating at t=0 we get

(12)   \begin{equation*}\mathring{g}([\eta,\xi_1],\xi_2)+\mathring{g}(\xi_1,[\eta,\xi_2])=0.\end{equation*}

Setting \xi_1=\xi_i,\xi_2=\xi_j,\eta=\xi_k we get

(13)   \begin{equation*}C_{ki}^l\mathring{g}_{lj}+C_{kj}^l\mathring{g}_{li}=0.\end{equation*}

Multiplying both sides by \mathring{g}^{im}\mathring{g}^{jn} and summing over i,j we obtain

(14)   \begin{equation*} C_{ki}^n\,\mathring{g}^{im}+C_{kj}^m\,\mathring{g}^{jn}=0.\end{equation*}
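Relations (13) and (14) can be checked numerically for a concrete algebra. The sketch below assumes the Lie algebra sl(2,R) with a particular basis of traceless real 2\times 2 matrices and the trace form with \mbox{const}=2, so that \mathring{g}_{ij}=\mbox{Tr}(X_iX_j); both the basis and the constant are illustrative choices. The structure constants are extracted from the commutators via \mbox{Tr}([X_i,X_j]X_l)=C_{ij}^k\,\mathring{g}_{kl}.

```python
import numpy as np

# Assumed basis of sl(2,R): traceless real 2x2 matrices.
X = np.array([[[0.0, 1.0], [1.0, 0.0]],
              [[1.0, 0.0], [0.0, -1.0]],
              [[0.0, 1.0], [-1.0, 0.0]]]) / 2.0

# Trace form g0_{ij} = Tr(X_i X_j) and its inverse.
g0 = np.einsum('iab,jba->ij', X, X)
g0_inv = np.linalg.inv(g0)

# comm[i, j] = [X_i, X_j] as a 2x2 matrix.
comm = np.einsum('iab,jbc->ijac', X, X) - np.einsum('jab,ibc->ijac', X, X)

# C[i, j, k] = C_{ij}^k from Tr([X_i, X_j] X_l) g0^{lk}.
C = np.einsum('ijab,lba,lk->ijk', comm, X, g0_inv)

# Left-hand sides of Eq. (13) and Eq. (14); both should vanish.
eq13 = np.einsum('kil,lj->kij', C, g0) + np.einsum('kjl,li->kij', C, g0)
eq14 = np.einsum('kin,im->kmn', C, g0_inv) + np.einsum('kjm,jn->kmn', C, g0_inv)
print(np.abs(eq13).max(), np.abs(eq14).max())
```

In this basis the trace form comes out as \mbox{diag}(1/2,1/2,-1/2), so the check also covers the indefinite case, where upper and lower indices must be kept apart.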

We can now derive the second conservation law. The angular momentum m_i is defined as

(15)   \begin{equation*}m_i=g_{ij}\omega^{j}.\end{equation*}

Notice that m is a covector, a one-form on Lie(G); it belongs to the dual Lie(G)^* of Lie(G). While vectors in Lie(G) play an active role (they generate transformations), the elements of the dual, one-forms from Lie(G)^*, are “passive”: they evaluate vectors to numbers. The metric is the third element here, the one that connects the space to its dual, the active principle to the passive one. The metric depends on the mass distribution: in applications to rigid bodies the inertia tensor is encoded in the metric on the rotation group.

The second conservation law states that the square of the angular momentum evaluated with the Ad-invariant metric \mathring{g} is constant:

(16)   \begin{equation*} m_0^2(t)=\mathring{g}^{ij}m_i(t)m_j(t)=\mbox{const}.\end{equation*}

To verify this we first rewrite Eq. (8), using m_k=g_{kl}\,\omega^l and C_{jk,i}\,\omega^i=C_{jk}^{i}\,m_i, as

(17)   \begin{equation*}\dot{m}_k=C_{jk}^{i}\,m_{i}\omega^{j}.\end{equation*}

Differentiating Eq. (16) and substituting Eq. (17) we get

(18)   \begin{equation*}\frac{dm_0^2(t)}{dt}=2\mathring{g}^{kl}\dot{m}_k m_l=2\mathring{g}^{kl}C_{jk}^{i}m_im_l\omega^j.\end{equation*}

Now, according to Eq. (14) \mathring{g}^{kl}C_{jk}^{i} is antisymmetric in (i,l), while the product m_im_l is symmetric, therefore we get zero:

(19)   \begin{equation*}\frac{dm_0^2(t)}{dt}=0.\end{equation*}
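Both conservation laws can be observed in a short numerical experiment. The sketch below assumes so(3) with C_{ij}^k=\epsilon_{ijk}, the Ad-invariant metric \mathring{g}=\mbox{diag}(1,1,1) and g=\mbox{diag}(I_1,I_2,I_3), in which case Eq. (2) becomes Euler's equations; the moments of inertia, the initial condition and the classical Runge-Kutta integrator are illustrative choices.

```python
import numpy as np

INERTIA = np.array([1.0, 2.0, 3.0])   # assumed metric g = diag(INERTIA)

def euler_rhs(omega):
    """Eq. (2) for so(3) with C_{ij}^k = eps_{ijk} and g = diag(INERTIA)."""
    I1, I2, I3 = INERTIA
    w1, w2, w3 = omega
    return np.array([(I2 - I3) * w2 * w3 / I1,
                     (I3 - I1) * w3 * w1 / I2,
                     (I1 - I2) * w1 * w2 / I3])

def rk4_step(f, y, h):
    """One classical fourth-order Runge-Kutta step for dy/dt = f(y)."""
    k1 = f(y)
    k2 = f(y + 0.5 * h * k1)
    k3 = f(y + 0.5 * h * k2)
    k4 = f(y + h * k3)
    return y + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def invariants(omega):
    """Return (T, m_0^2): g_{ij} w^i w^j and delta^{ij} m_i m_j with m = g w."""
    m = INERTIA * omega
    return np.array([omega @ m, m @ m])

omega = np.array([0.1, 1.0, 0.1])     # start near the intermediate axis
start = invariants(omega)
for _ in range(5000):
    omega = rk4_step(euler_rhs, omega, 1e-3)
drift = np.abs(invariants(omega) - start)
print(omega, drift)
```

The components of \omega change considerably along the trajectory, while T and m_0^2 stay constant up to the integration error.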

In the following posts we will first return to the case of the rotation group in three dimensions and the rigid body, and then try to apply a similar reasoning to the case of the Lorentz group O(2,1) in 2+1 dimensions.

Geodesics of left invariant metrics on matrix Lie groups – Part 1

An elegant derivation of geodesic equations for left invariant metrics has been given by B. Kolev in his paper “Lie groups and mechanics. An introduction”.

Here we will derive these equations using simple tools of matrix algebra and differential geometry, so that at the end we will have formulas ready for applications. We will use the conservation laws derived in the last post, Killing vectors, geodesics, and Noether’s theorem. We will also use the same notation. We consider a matrix Lie group G with the Lie algebra Lie(G). The tangent space at a\in G is denoted T_aG. Thus Lie(G)=T_eG. On Lie(G) we assume a nondegenerate scalar product denoted g(\xi,\eta),\, \xi,\eta\in Lie(G). We propagate it to the whole group using left translations as in Eqs. (8,9) of Killing vectors, geodesics, and Noether’s theorem:

(1)   \begin{equation*}g_a(\xi,\eta)=g_e(a^{-1}\xi,a^{-1}\eta),\end{equation*}

which implies for \xi,\eta\in T_bG

(2)   \begin{equation*}g_{ab}(a\xi,a\eta)=g_b(\xi,\eta),\,a,b\in G.\end{equation*}

The metric so constructed is automatically left-invariant, therefore for each \xi\in Lie(G) the vector field \xi(a)=\xi a is a Killing field.

Let a(t) be a geodesic for this metric. We denote by \omega(t)\in Lie(G) the tangent vector left translated to the identity:

(3)   \begin{equation*}\omega(t)=a(t)^{-1}\dot{a}(t).\end{equation*}

Then, from the conservation laws derived in the last post, we know that the scalar product of \dot{a}(t) with \xi a(t) is constant. That is

(4)   \begin{equation*}g_{a(t)}(\xi a(t),\dot{a}(t))=\mbox{const}.\end{equation*}

The metric is left-invariant, therefore g_e(a(t)^{-1}\xi a(t),a(t)^{-1}\dot{a}(t))=\mbox{const}, or

(5)   \begin{equation*}g_e(a(t)^{-1}\xi a(t),\omega(t))=\mbox{const}.\end{equation*}

We will differentiate the last equation with respect to t, but first let us notice that by differentiating the identity a(t) a(t)^{-1}=e we obtain

(6)   \begin{equation*}\frac{d}{dt}a(t)^{-1}=-a(t)^{-1}\dot{a}(t)a(t)^{-1}=-\omega(t)a(t)^{-1}.\end{equation*}

Now, differentiating Eq. (5), and using also \frac{da(t)}{dt}=a(t)\omega(t) we obtain

(7)   \begin{equation*}g_e([a^{-1}\xi a,\omega],\omega)+g_e(a^{-1}\xi a,\dot{\omega})=0.\end{equation*}

We now need a certain bilinear operator on Lie(G) that is defined using the commutator and the scalar product. The commutator [\xi_1,\xi_2] itself is such an operator, from Lie(G)\times Lie(G) to Lie(G). But using the scalar product we can define another operator B(\xi_1,\xi_2) by the formula:

(8)   \begin{equation*}g_e(B(\xi_1,\xi_2),\eta)=g_e([\xi_2,\eta],\xi_1),\quad \xi_1,\xi_2,\eta\in Lie(G).  \end{equation*}

The right hand side is linear in \eta, and owing to the nondegeneracy of the scalar product every linear functional is represented by a scalar product with a unique vector. Therefore B(\xi_1,\xi_2) is well defined, and evidently is linear in both arguments.

Let \xi_i be a basis in Lie(G), so that the structure constants are C_{ij}^k

(9)   \begin{equation*}[\xi_i,\xi_j]=C_{ij}^k\,\xi_k.\end{equation*}

We can also write B as

(10)   \begin{equation*}B(\xi_i,\xi_j)=B_{ij}^k\,\xi_k.\end{equation*}

Then Eq. (8) gives

(11)   \begin{equation*} g_e(B(\xi_i,\xi_j),\xi_k)=g_e([\xi_j,\xi_k],\xi_i)\end{equation*}

or

    \[ B_{ij}^l g_{lk}=C_{jk}^lg_{li},\]

which can be solved for B using the inverse metric:

(12)   \begin{equation*}B_{ij}^m=g^{mk}C_{jk}^lg_{li}.\end{equation*}

On the other hand, if we agree to lower the upper index of B and C with the metric, we can write Eq. (11) as

(13)   \begin{equation*}B_{ij,k}=C_{jk,i},\end{equation*}

which is easy to remember.
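A quick numerical check of Eqs. (8), (12) and (13) is sketched below. It assumes so(3) with C_{ij}^k=\epsilon_{ijk} and a randomly generated positive definite metric g; the test vectors, the random seed and the helper `bracket` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed structure constants of so(3): C[i, j, k] = C_{ij}^k = eps_{ijk}.
C = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    C[i, j, k] = 1.0
    C[j, i, k] = -1.0

# Random symmetric positive definite metric (hence nondegenerate).
M = rng.normal(size=(3, 3))
g = M @ M.T + np.eye(3)
g_inv = np.linalg.inv(g)

# Eq. (12): B[i, j, m] = B_{ij}^m = g^{mk} C_{jk}^l g_{li}.
B = np.einsum('mk,jkl,li->ijm', g_inv, C, g)

def bracket(u, v):
    """Components of [u, v] in the basis: C_{ij}^k u^i v^j."""
    return np.einsum('ijk,i,j->k', C, u, v)

xi1, xi2, eta = rng.normal(size=(3, 3))

# Eq. (8): g(B(xi1, xi2), eta) = g([xi2, eta], xi1).
lhs = np.einsum('ijm,i,j->m', B, xi1, xi2) @ g @ eta
rhs = bracket(xi2, eta) @ g @ xi1

# Eq. (13): B_{ij,k} = C_{jk,i}, both with the upper index lowered by g.
B_low = np.einsum('ijl,lk->ijk', B, g)
C_low = np.einsum('jkl,li->ijk', C, g)
print(lhs, rhs)
```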

We can now return to Eq. (7) and rewrite it as

    \[g_e(a^{-1}\xi a,\dot{\omega})=g_e([\omega,a^{-1}\xi a],\omega)=g_e(a^{-1}\xi a,B(\omega,\omega)).\]

Since \xi, and therefore also a^{-1}\xi a, is arbitrary, we obtain

(14)   \begin{equation*}\dot{\omega}=B(\omega,\omega),\end{equation*}

or, using a basis and Eq. (13)

(15)   \begin{equation*}g_{kl}\,\dot{\omega}^l=C_{jk,i}\,\omega^{i}\omega^{j}.\end{equation*}

Killing vectors, geodesics, and Noether’s theorem

Consider Lie groups of matrices: SO(3) or SO(2,1). Their double covering groups are SU(2) and SU(1,1) (or, after Cayley transform, SL(2,R)). We prefer to use these covering groups as they have simpler topologies. SU(2) is topologically a three-sphere, SL(2,R) is an open solid torus. Our discussion will be quite general, and applicable to other Lie groups as well.

We denote by Lie(G) the Lie algebra of G. It is a vector space, the set of all tangent vectors at the identity e of the group. It is also an algebra with respect to the commutator.

G acts on its Lie algebra by the adjoint representation. If X\in Lie(G) and a\in G, then

(1)   \begin{equation*} Ad_a: X\mapsto aXa^{-1}.\end{equation*}

We define the scalar product (X,Y) on Lie(G) using the trace

(2)   \begin{equation*}(X,Y)=\mbox{const}\frac{1}{2}\mbox{Re}(\mbox{Tr}(XY)).\end{equation*}


In each particular case we will choose the constant so that the formulas are simple.

Due to trace properties this scalar product is invariant with respect to the adjoint representation:

(3)   \begin{equation*}(aXa^{-1},aYa^{-1})=(X,Y).\end{equation*}

We will assume that this bilinear form is indeed a scalar product, that is, we assume that it is non-degenerate. For SO(3) and SO(2,1) it certainly is. Lie groups for which the closely related Killing form is non-degenerate are called semisimple.

Let X_i be a basis in Lie(G). The structure constants C_{ij}^k are then defined through

(4)   \begin{equation*}[X_i,X_j]=C_{ij}^k\,X_k.\end{equation*}

We denote by \mathring{g}_{ij} the matrix of the metric tensor in the basis X_i

(5)   \begin{equation*}\mathring{g}_{ij}=(X_i,X_j).\end{equation*}

The inverse matrix is denoted \mathring{g}^{ij} so that \mathring{g}_{ij}\mathring{g}^{jk}=\delta^k_i.

For SU(2) the Lie algebra consists of anti-Hermitian 2\times 2 matrices of zero trace. For the basis we can take

(6)   \begin{equation*}X_1=\frac{1}{2}\begin{bmatrix}0&i\\i&0\end{bmatrix},\,X_2=\frac{1}{2}\begin{bmatrix}0&1\\-1&0\end{bmatrix}, \, X_3=\frac{1}{2}\begin{bmatrix}i&0\\0&-i\end{bmatrix}.\end{equation*}

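For the constant \mbox{const} we choose \mbox{const}=-4, since \mbox{Tr}(X_iX_j)=-\frac{1}{2}\delta_{ij} for the basis (6). Then \mathring{g}_{ij}=\mathring{g}^{ij}=\mbox{diag}(1,1,1).

The structure constants are

(7)   \begin{equation*}C_{ij}^k=-\mathring{g}^{kl}\epsilon_{ijl},\end{equation*}

as one checks directly; for instance, [X_1,X_2]=-X_3.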

In this case, since \mathring{g}_{ij} is the identity matrix, there is no point in distinguishing between lower and upper indices. But in the case of SU(1,1) it will be important.
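The normalization and the sign convention for the basis (6) are easy to check by machine. The sketch below prints the raw trace form \mbox{Re}\,\mbox{Tr}(X_iX_j), before it is multiplied by \mbox{const}/2, and the structure constants extracted from the commutators; the array names are illustrative.

```python
import numpy as np

# The basis (6) of su(2): anti-Hermitian traceless 2x2 matrices.
X = 0.5 * np.array([[[0, 1j], [1j, 0]],
                    [[0, 1], [-1, 0]],
                    [[1j, 0], [0, -1j]]], dtype=complex)

# Raw trace form Tr(X_i X_j); for this basis it is real.
trace_form = np.einsum('iab,jba->ij', X, X)
raw = trace_form.real

# comm[i, j] = [X_i, X_j] as a 2x2 matrix.
comm = np.einsum('iab,jbc->ijac', X, X) - np.einsum('jab,ibc->ijac', X, X)

# C[i, j, k] = C_{ij}^k, solved from Tr([X_i, X_j] X_l) = C_{ij}^k Tr(X_k X_l).
C = np.einsum('ijab,lba,lk->ijk', comm, X, np.linalg.inv(trace_form)).real

print(raw)
print(C[0, 1], C[1, 2], C[2, 0])   # components of [X_1,X_2], [X_2,X_3], [X_3,X_1]
```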

We will now consider a general left-invariant metric on the group G. The discussion below is a continuation of the discussion in Riemannian metrics – left, right and bi-invariant.

That is, we now have two scalar products on Lie(G): the Ad-invariant scalar product with metric \mathring{g}, and another one, with metric g. We propagate the scalar products from the identity e to other points in the group using left translations (see Eq. (1) in Riemannian metrics – left, right and bi-invariant). We have a small notational problem here, because the letter g often denotes a group element, but here it also denotes the metric. Moreover, we have two scalar products and we need to distinguish between them. We will write g_a(\xi,\eta) for the scalar product with respect to the metric g of two vectors tangent at a\in G. Then left invariance means

(8)   \begin{equation*}g_a(\xi,\eta)=g_e(a^{-1}\xi,a^{-1}\eta),\end{equation*}

which implies for \xi,\eta tangent at b

(9)   \begin{equation*}g_{ab}(a\xi,a\eta)=g_b(\xi,\eta),\,a,b\in G\end{equation*}

The infinitesimal formulation of left invariance is that the vector fields \xi(a)=\xi a are “Killing vector fields for the metric”: the Lie derivatives of the metric (cf. SL(2,R) Killing vector fields in coordinates, Eq.(13)) with respect to these vector fields vanish. What we need is a very important result from differential geometry: scalar products of Killing vector fields with vectors tangent to geodesics are constant along each geodesic. For the convenience of the reader we provide the definitions and a proof of the above mentioned result (a version of Noether’s theorem). Here we will assume that there are coordinates x^1,...,x^n on G. Later on we will get rid of these coordinates, but right now we will follow the standard routine of differential geometry with coordinates.

We define the Christoffel symbols of the Levi-Civita connection

(10)   \begin{equation*}\Gamma_{kl,m}=\frac{1}{2}\left(\frac{\partial g_{mk}}{\partial x^{l}}+\frac{\partial g_{ml}}{\partial x^{k}}-\frac{\partial g_{kl}}{\partial x^{m}}\right).\end{equation*}

(11)   \begin{equation*}\Gamma^{i}_{kl}=g^{im}\Gamma_{kl,m}=\frac{1}{2}g^{im}\left(\frac{\partial g_{mk}}{\partial x^{l}}+\frac{\partial g_{ml}}{\partial x^{k}}-\frac{\partial g_{kl}}{\partial x^{m}}\right).\end{equation*}

The geodesic equations are then (in Geodesics on upper half-plane factory direct we have already touched this subject)

(12)   \begin{equation*}\frac{d^2 x^i}{ds^2}=  -\Gamma^{i}_{jk}\frac{dx^j}{ds}  \frac{dx^k}{ds}.\end{equation*}
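Formulas (10) and (11) translate directly into code. The sketch below approximates the partial derivatives of the metric by central differences and, as a test case, uses the Poincaré upper half-plane metric g=(dx^2+dy^2)/y^2, for which \Gamma^x_{xy}=-1/y, \Gamma^y_{xx}=1/y and \Gamma^y_{yy}=-1/y. The choice of example metric, the step size and the function names are assumptions made for illustration.

```python
import numpy as np

def metric(x):
    """Assumed example: upper half-plane metric g = (dx^2 + dy^2) / y^2."""
    return np.eye(2) / x[1]**2

def christoffel(metric, x, h=1e-5):
    """Gamma[i, k, l] = Gamma^i_{kl} from Eqs. (10)-(11), by central differences."""
    n = len(x)
    dg = np.zeros((n, n, n))              # dg[m, k, l] = d g_{kl} / d x^m
    for m in range(n):
        e = np.zeros(n)
        e[m] = h
        dg[m] = (metric(x + e) - metric(x - e)) / (2.0 * h)
    # Gamma_{kl,m} = (d_l g_{mk} + d_k g_{ml} - d_m g_{kl}) / 2, stored as lower[k, l, m]
    lower = 0.5 * (np.einsum('lmk->klm', dg) + np.einsum('kml->klm', dg)
                   - np.einsum('mkl->klm', dg))
    return np.einsum('im,klm->ikl', np.linalg.inv(metric(x)), lower)

point = np.array([0.3, 2.0])
Gamma = christoffel(metric, point)
print(Gamma[0, 0, 1], Gamma[1, 0, 0], Gamma[1, 1, 1])
```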

A vector field \xi is a Killing vector field for g_{ij} if the Lie derivative of g_{ij} with respect to \xi vanishes, i.e.

(13)   \begin{equation*}0=(L_\xi g)_{ij}=\xi^k\partial_k g_{ij}+g_{ik}\partial_j \xi^k+g_{jk}\partial_i\xi^k.\end{equation*}

The scalar product of a Killing vector field with the tangent vector to a geodesic is constant. That is the “conservation law”. A short proof can be found in Sean Carroll’s online book “Lecture notes in General Relativity”. A discussion of the proof can be found on physics forums. But the result is a simple consequence of the definitions. All one needs is to differentiate composite functions and rename indices. Just for the fun of it, let us do the direct, non-elegant, brute force proof.

Suppose x^{i}(t) is a geodesic, and \xi is a Killing field. The statement is that along the geodesic the scalar product is constant. That means we have to show that

    \[ g_{ij}(x(t))\,\dot{x}^{i}(t)\,\xi^{j}(x(t))=\mbox{const}.\]

We differentiate with respect to t, and we are supposed to get zero. So, let’s do it. We have the derivative of a product of three factors, so we will get three terms t_1,t_2,t_3:

    \[t_1=\frac{d}{dt}(g_{ij}(x(t)))\,\dot{x}^{i}(t)\,\xi^{j}(x(t)),\]

    \[t_2=g_{ij}(x(t))\,\frac{d}{dt}(\dot{x}^{i}(t))\,\xi^j(x(t)),\]

    \[t_3=g_{ij}(x(t))\,\dot{x}^{i}(t)\,\frac{d}{dt}(\xi^j(x(t))).\]

Let us calculate the derivatives. After we are done, in order to simplify the notation, we will skip the arguments.

    \[\frac{d}{dt}(g_{ij}(x(t)))=\partial_k\,g_{ij}\dot{x}^k.\]

Thus

    \[t_1=\partial_k\,g_{ij}\,\dot{x}^{i}\dot{x}^{k}\,\xi^{j}.\]

Then, from Eq. (12)

    \[\frac{d}{dt}(\dot{x}^{i}(t))=-\Gamma^{i}_{kl}\dot{x}^k\dot{x}^l,\]

therefore

    \[t_2=-\Gamma_{kl,j}\dot{x}^k\dot{x}^l\xi^j=-\frac{1}{2}\partial_k g_{lj}\dot{x}^k\dot{x}^l\xi^j-\frac{1}{2}\partial_l g_{kj}\dot{x}^k\dot{x}^l\xi^j+\frac{1}{2}\partial_jg_{kl}\dot{x}^k\dot{x}^l\xi^j.\]

Renaming the dummy summation indices k,l we see that the two first terms of t_2 are identical, therefore

    \[t_2=-\partial_k g_{lj}\dot{x}^k\dot{x}^l\xi^j+\frac{1}{2}\partial_jg_{kl}\dot{x}^k\dot{x}^l\xi^j.\]

Again, renaming the dummy summation indices we see that the first term of t_2 cancels out with t_1, therefore

    \[t_1+t_2=\frac{1}{2}\partial_jg_{kl}\,\dot{x}^l\dot{x}^k\xi^j.\]

For t_3 we have

    \[t_3=g_{ij}\,\dot{x}^{i}\,\partial_k\xi^j\dot{x}^k.\]

Owing to the symmetry of \dot{x}^{i}\dot{x}^k=\dot{x}^{k}\dot{x}^i, we can write it as

    \[t_3=\frac{1}{2}g_{ij}\,\partial_k\xi^j\,\dot{x}^{i}\,\dot{x}^k+\frac{1}{2}g_{kj}\,\partial_i\xi^j\,\dot{x}^{i}\,\dot{x}^k.\]

Therefore

    \[t_1+t_2+t_3=\frac{1}{2}\left(\partial_jg_{kl}\,\dot{x}^l\dot{x}^k\xi^j+g_{ij}\,\partial_k\xi^j\,\dot{x}^{i}\,\dot{x}^k+g_{kj}\,\partial_i\xi^j\,\dot{x}^{i}\,\dot{x}^k\right)\]

We rename the indices to get

    \[t_1+t_2+t_3=\frac{1}{2}\left(\xi^j\partial_jg_{ik}+g_{ij}\,\partial_k\xi^j+g_{kj}\,\partial_i\xi^j\right)\dot{x}^{i}\,\dot{x}^k\]

But the expression in parentheses vanishes owing to Eq. (13), so the scalar product is indeed constant along the geodesic.
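The conservation law can also be watched at work numerically. The sketch below is an illustration on an example that is not a group manifold: it assumes the upper half-plane metric g=(dx^2+dy^2)/y^2, whose geodesic equations (12) read \ddot{x}=2\dot{x}\dot{y}/y and \ddot{y}=(\dot{y}^2-\dot{x}^2)/y, together with its two Killing fields \xi=(1,0) (translations in x) and \xi=(x,y) (dilations). The initial data, the step size and the Runge-Kutta integrator are arbitrary choices.

```python
import numpy as np

def geodesic_rhs(state):
    """state = (x, y, vx, vy); Eq. (12) for g = (dx^2 + dy^2) / y^2."""
    x, y, vx, vy = state
    return np.array([vx, vy, 2.0 * vx * vy / y, (vy**2 - vx**2) / y])

def rk4_step(f, s, h):
    """One classical fourth-order Runge-Kutta step for ds/dt = f(s)."""
    k1 = f(s)
    k2 = f(s + 0.5 * h * k1)
    k3 = f(s + 0.5 * h * k2)
    k4 = f(s + h * k3)
    return s + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def killing_products(state):
    """g_{ij} xdot^i xi^j for the Killing fields xi = (1, 0) and xi = (x, y)."""
    x, y, vx, vy = state
    return np.array([vx / y**2, (x * vx + y * vy) / y**2])

state = np.array([0.0, 1.0, 1.0, 0.5])   # assumed initial point and velocity
start = killing_products(state)
for _ in range(2000):
    state = rk4_step(geodesic_rhs, state, 1e-3)
drift = np.abs(killing_products(state) - start)
print(state, drift)
```

The point moves a finite distance along a semicircle centered on the x axis, while both scalar products stay constant up to the integration error.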