23. Motion of a Rigid Body:  the Inertia Tensor

Michael Fowler

Definition of Rigid

We’re thinking here of an idealized solid, in which the distance between any two internal points stays the same as the body moves around.  That is, we ignore vibrations, or strains in the material resulting from inside or outside stresses.  In fact, this is almost always an excellent approximation for ordinary solids subject to typical stresses, obvious exceptions being rubber, flesh, etc.  Following Landau, we’ll usually begin by representing the body as a collection of particles of different masses $m_i$ held in their places $\vec{r}_i$ by massless bonds. This approach has the merit that the dynamics can be expressed cleanly in terms of sums over the particles, but for an ordinary solid we’ll finally take a continuum limit, replacing the finite sums over the constituent particles by integrals over a continuous mass distribution.

Rotation of a Body about a Fixed Axis

As a preliminary, let’s look at a body firmly attached to a rod fixed in space, and rotating with angular velocity $\Omega$ radians/sec about that axis.  You’ll recall from freshman physics that the angular momentum and rotational energy are $L_z = I\Omega$, $E_{\rm rot} = \tfrac{1}{2}I\Omega^2$, where

$$I = \sum_i m_i r_{\perp i}^2 = \int dx\,dy\,dz\,\rho(x,y,z)\,r_\perp^2$$

(here $r_\perp = \sqrt{x^2+y^2}$ is the distance from the axis).

But you also know that both angular velocity and angular momentum are vectors. Obviously, for this example, the angular velocity is a vector pointing along the axis of rotation, $\vec{\Omega} = (0,0,\Omega_z)$.  One might be tempted to conclude that the angular momentum also points along the axis, but this is not always the case. An instructive example is provided by two masses $m$ at the ends of a rod of length $2a$ held at a fixed angle $\theta$ to the $z$ axis, which is the axis of rotation.

Evidently,

$$L_z = 2ma^2\sin^2\theta\,\Omega.$$

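But notice that, assuming the rod is momentarily in the $xz$ plane, as shown, then

$$L_x = -2ma^2\sin\theta\cos\theta\,\Omega$$

(taking the upper mass to be on the positive $x$ side; the sign flips for the opposite tilt).

The total angular momentum is not parallel to the total angular velocity!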

In fact, as should be evident, the total angular momentum is rotating around the constant angular velocity vector, so the axis must be providing a torque. This is why unbalanced car wheels stress the axle.  
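As a rough numerical check of the two-mass example, the sketch below sums $m\,\vec{r}\times(\vec{\Omega}\times\vec{r})$ over both masses. The numerical values and the helper names (`cross`, `dumbbell_angular_momentum`) are illustrative assumptions, and the upper mass is assumed to sit on the positive $x$ side.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def dumbbell_angular_momentum(m, a, theta, omega):
    """Sum m r x (Omega x r) over masses at +/- a (sin theta, 0, cos theta)."""
    Omega = (0.0, 0.0, omega)          # rotation about the z axis
    L = [0.0, 0.0, 0.0]
    for sign in (+1, -1):
        r = (sign*a*math.sin(theta), 0.0, sign*a*math.cos(theta))
        v = cross(Omega, r)            # velocity of this mass
        l = cross(r, v)                # r x v for this mass
        for i in range(3):
            L[i] += m*l[i]
    return tuple(L)

# Illustrative values only (not from the text)
m, a, theta, omega = 1.0, 0.5, math.radians(30), 2.0
Lx, Ly, Lz = dumbbell_angular_momentum(m, a, theta, omega)
print(Lx, Ly, Lz)
```

The nonzero $L_x$ is the component that sweeps around the axis as the rod turns, which is why the axle has to supply a torque.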

General Motion of a Rotating Rigid Body

We’ll follow the Landau notation (which itself tends to be bilingual between coordinates $(x,y,z)$ and $(x_1,x_2,x_3)$).  Notice that we’ll label the components by $x_1, x_2, x_3$, not $r_1, r_2, r_3$, even though we call the vector $\vec{r}$.  Again, we’re following Landau.

We take a fixed, inertial (or lab) coordinate system labeled $(X,Y,Z)$, and in this system the rigid body’s center of mass, labeled $O$, is at $\vec{R}$.  We have a Cartesian set of axes fixed in the body, with origin at the center of mass.  Coordinates in this system, the components of a vector $\vec{r}$ from $O$ to a point in the body, are labeled $(x,y,z)$ or $(x_1,x_2,x_3)$.

A vector from the external inertial fixed origin to a point in the body is then $\vec{R} + \vec{r} = \vec{\rho}$, say, as shown in the figure.

Suppose now that in infinitesimal time $dt$, the center of mass of the body moves $d\vec{R}$ and the body rotates through $d\vec{\phi}$.  Then a particle at $\vec{r}$ as measured from the center of mass will move through $d\vec{\rho} = d\vec{R} + d\vec{\phi}\times\vec{r}$ relative to the external inertial frame.

Therefore, the velocity of that particle in the fixed frame, writing the center of mass velocity and the angular velocity as $d\vec{R}/dt = \vec{V}$, $d\vec{\phi}/dt = \vec{\Omega}$, is

$$\vec{v} = \vec{V} + \vec{\Omega}\times\vec{r}.$$

Now, in deriving the above equation, we have not used the fact that the origin $O$ fixed in the body is at the center of mass. (That turns out to be useful shortly.)  What if instead we had taken some other origin $O'$ fixed in the body? Would we find the angular velocity $\vec{\Omega}'$ about $O'$ to be the same as $\vec{\Omega}$? The answer turns out to be yes, but we need to prove it! Here’s the proof:

If the position of $O'$ relative to $O$ is $\vec{a}$ (a vector fixed in the body and so moving with it) then the velocity $\vec{V}'$ of $O'$ is $\vec{V}' = \vec{V} + \vec{\Omega}\times\vec{a}$.

A particle at $\vec{r}$ relative to $O$ is at $\vec{r}' = \vec{r} - \vec{a}$ relative to $O'$.

Its velocity relative to the fixed external axes is $\vec{v} = \vec{V}' + \vec{\Omega}'\times\vec{r}'$; this must of course equal $\vec{V} + \vec{\Omega}\times\vec{r} = \vec{V} + \vec{\Omega}\times\vec{r}' + \vec{\Omega}\times\vec{a} = \vec{V}' + \vec{\Omega}\times\vec{r}'$.  Since this holds for every point $\vec{r}'$ in the body, it follows that $\vec{\Omega}' = \vec{\Omega}$.

This means that if we describe the motion of any particle in the body in terms of some origin fixed in the body, plus rotation about that origin, the angular velocity vector describing the body’s motion is the same irrespective of the origin we choose.  So we can, without ambiguity, talk about the angular velocity of the body.  

From now on, we’ll assume that the origin fixed in the body is at the center of mass.

The Inertia Tensor

 Regarding a rigid body as a system of individual particles, we find the kinetic energy

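$$\begin{aligned}
T &= \sum_n \tfrac{1}{2}m_n v_n^2 = \sum_n \tfrac{1}{2}m_n\left(\vec{V} + \vec{\Omega}\times\vec{r}_n\right)^2 \\
&= \sum_n \tfrac{1}{2}m_n V^2 + \sum_n m_n\vec{V}\cdot\left(\vec{\Omega}\times\vec{r}_n\right) + \sum_n \tfrac{1}{2}m_n\left(\vec{\Omega}\times\vec{r}_n\right)^2.
\end{aligned}$$

The first term in the last line is $\sum_n \tfrac{1}{2}m_n V^2 = \tfrac{1}{2}MV^2$, where $M$ is the total mass of the body.

The second term is $\sum_n m_n\vec{V}\cdot\left(\vec{\Omega}\times\vec{r}_n\right) = \vec{V}\cdot\left(\vec{\Omega}\times\sum_n m_n\vec{r}_n\right) = 0$, from the definition of the center of mass (our origin here), $\sum_n m_n\vec{r}_n = 0$.

The third term can be rewritten: $\sum_n \tfrac{1}{2}m_n\left(\vec{\Omega}\times\vec{r}_n\right)^2 = \sum_n \tfrac{1}{2}m_n\left[\Omega^2 r_n^2 - \left(\vec{\Omega}\cdot\vec{r}_n\right)^2\right]$.  Here we have used $|\vec{\Omega}\times\vec{r}| = \Omega r\sin\theta$, $\vec{\Omega}\cdot\vec{r} = \Omega r\cos\theta$.

Alternatively, you could use the vector product identity $(\vec{a}\times\vec{b})\times\vec{c} = -\vec{a}\,(\vec{b}\cdot\vec{c}) + \vec{b}\,(\vec{a}\cdot\vec{c})$ together with $(\vec{a}\times\vec{b})\cdot(\vec{c}\times\vec{d}) = \left((\vec{a}\times\vec{b})\times\vec{c}\right)\cdot\vec{d}$ to find $(\vec{a}\times\vec{b})\cdot(\vec{c}\times\vec{d}) = (\vec{a}\cdot\vec{c})(\vec{b}\cdot\vec{d}) - (\vec{a}\cdot\vec{d})(\vec{b}\cdot\vec{c})$.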

The bottom line is that the kinetic energy

$$T = \tfrac{1}{2}MV^2 + \sum_n \tfrac{1}{2}m_n\left[\Omega^2 r_n^2 - \left(\vec{\Omega}\cdot\vec{r}_n\right)^2\right] = T_{\rm tr} + T_{\rm rot},$$

a translational kinetic energy plus a rotational kinetic energy.
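The split into translational and rotational parts can be checked numerically. The following is a minimal sketch, assuming an arbitrary handful of particles with made-up masses, positions, $\vec{V}$ and $\vec{\Omega}$; the positions are first shifted so that the origin is the center of mass, which is what makes the cross term vanish.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    """Inner product of two 3-vectors."""
    return sum(a*b for a, b in zip(u, v))

# Illustrative data (not from the text)
masses = [1.0, 2.0, 3.0, 4.0]
raw = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 3.0), (2.0, -1.0, 0.5)]
M = sum(masses)

# Shift to the center of mass so that sum_n m_n r_n = 0
com = tuple(sum(m*p[i] for m, p in zip(masses, raw))/M for i in range(3))
positions = [tuple(p[i] - com[i] for i in range(3)) for p in raw]

V = (0.3, -0.2, 0.5)        # center of mass velocity
Omega = (0.1, 0.7, -0.4)    # angular velocity

# Direct sum of (1/2) m v^2 with v = V + Omega x r
T_direct = 0.0
for m, r in zip(masses, positions):
    w = cross(Omega, r)
    v = tuple(V[i] + w[i] for i in range(3))
    T_direct += 0.5*m*dot(v, v)

# Translational plus rotational pieces
T_tr = 0.5*M*dot(V, V)
T_rot = sum(0.5*m*(dot(Omega, Omega)*dot(r, r) - dot(Omega, r)**2)
            for m, r in zip(masses, positions))
print(T_direct, T_tr + T_rot)
```

If the positions were not measured from the center of mass, the two printed numbers would differ by the cross term $\vec{V}\cdot(\vec{\Omega}\times\sum_n m_n\vec{r}_n)$.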

Warning about notation:  at this point, things get a bit messy.  The reason is that to make further progress in dealing with the rotational kinetic energy, we need to write it in terms of the individual components of the particle position vectors $\vec{r}_n$.  Following Landau and others, we’ll write these components in two different ways:

$$\vec{r}_n = (x_n, y_n, z_n) \equiv (x_{n1}, x_{n2}, x_{n3}).$$

The $x,y,z$ notation is helpful in giving a clearer picture of rotational energy, but the $x_{ni}$ notation is essential in handling the math, as will become evident.

Landau’s solution to the problem of too many suffixes for clarity is to omit the suffix $n$ labeling the individual particles.  I prefer to keep it in.

Double Suffix Summation Notation:  to cut down on the number of $\sum$’s in expressions, we’ll follow Landau and others in using Einstein’s rule that if a suffix like $i,j,k$ appears twice in a product, it is to be summed over the values 1,2,3.  It’s called a “dummy suffix” because it doesn’t matter what you label it, as long as it appears twice.  For example, the inner product of two vectors $\vec{A}\cdot\vec{B} = \sum_{i=1}^{3}A_iB_i$ can be written as $A_iB_i$ or equally as $A_kB_k$. Furthermore, $\Omega_i^2$ means $\Omega_1^2 + \Omega_2^2 + \Omega_3^2 = \Omega^2$.

 But do not use Greek letters for dummy suffixes in this context: the standard is that they are used in relativistic equations to signify sums over the four dimensions of space time, Latin letters for sums over the three spatial dimensions, as we are doing here.

The rotational kinetic energy is then

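$$\begin{aligned}
T_{\rm rot} &= \tfrac{1}{2}\sum_n m_n\left(\Omega_i^2 x_{ni}^2 - \Omega_i x_{ni}\,\Omega_k x_{nk}\right) \\
&= \tfrac{1}{2}\sum_n m_n\left(\Omega_i\Omega_k\delta_{ik}x_{nl}^2 - \Omega_i\Omega_k x_{ni}x_{nk}\right) \\
&= \tfrac{1}{2}\Omega_i\Omega_k\sum_n m_n\left(\delta_{ik}x_{nl}^2 - x_{ni}x_{nk}\right).
\end{aligned}$$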

Warning:  That first line is a bit confusing.  Copying Landau, I’ve written $\Omega_i^2 x_{ni}^2$; you might think that’s $\Omega_1^2x_{n1}^2 + \Omega_2^2x_{n2}^2 + \Omega_3^2x_{n3}^2$, but a glance at the previous equation (and the second line of this equation) makes clear it’s actually $\Omega^2 r_n^2$. Landau should have written $\Omega_i^2 x_{nl}^2$.  Actually I’m not even keen on $\Omega_i^2$ implying a summation.  Standard use in relativity, for example, is that both of the two suffixes be explicit for summation to be implied.  In GR one would write $\Omega_i\Omega_i$.  (Well, actually $\Omega_i\Omega^i$, but that’s another story.)

Anyway, moving on, we introduce the inertia tensor

$$I_{ik} = \sum_n m_n\left(x_{nl}^2\,\delta_{ik} - x_{ni}x_{nk}\right)$$

in terms of which the kinetic energy of the moving, rotating rigid body is

$$T = \tfrac{1}{2}MV^2 + \tfrac{1}{2}I_{ik}\Omega_i\Omega_k.$$

As usual, the Lagrangian $L = T - V$, where the potential energy $V$ is a function of six variables in general: the center of mass location and the orientation of the body relative to the center of mass.

Landau writes the inertia tensor explicitly as:

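$$I_{ik} = \begin{pmatrix}
\sum m\left(y^2+z^2\right) & -\sum mxy & -\sum mxz \\
-\sum mxy & \sum m\left(z^2+x^2\right) & -\sum myz \\
-\sum mxz & -\sum myz & \sum m\left(x^2+y^2\right)
\end{pmatrix}$$

but you should bear in mind that $-\sum mxz$ means $-\sum_n m_n x_n z_n$.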
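To make the definition concrete, here is a small sketch that builds $I_{ik} = \sum_n m_n(x_{nl}^2\delta_{ik} - x_{ni}x_{nk})$ for a list of point masses and checks that $\tfrac{1}{2}I_{ik}\Omega_i\Omega_k$ agrees with $\sum_n\tfrac{1}{2}m_n(\vec{\Omega}\times\vec{r}_n)^2$. The function name `inertia_tensor` and the sample data are assumptions made for illustration.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def inertia_tensor(masses, positions):
    """I_ik = sum_n m_n (x_nl^2 delta_ik - x_ni x_nk), as a 3x3 nested list."""
    I = [[0.0]*3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c*c for c in x)                      # x_nl^2, summed over l
        for i in range(3):
            for k in range(3):
                I[i][k] += m*((r2 if i == k else 0.0) - x[i]*x[k])
    return I

# Illustrative two-particle "body" (not from the text)
masses = [1.0, 2.0]
positions = [(1.0, 2.0, 3.0), (-1.0, 0.0, 2.0)]
I = inertia_tensor(masses, positions)

Omega = (0.5, -1.0, 2.0)
# Rotational energy from the tensor: (1/2) I_ik Omega_i Omega_k
T_tensor = 0.5*sum(I[i][k]*Omega[i]*Omega[k] for i in range(3) for k in range(3))
# Rotational energy summed particle by particle: (1/2) m |Omega x r|^2
T_direct = sum(0.5*m*sum(c*c for c in cross(Omega, r))
               for m, r in zip(masses, positions))
print(I, T_tensor, T_direct)
```

Note the tensor comes out symmetric, with the off-diagonal entries carrying the minus signs shown in Landau’s explicit form.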

Tensors 101

We see that the “inertia tensor” defined above as $I_{ik} = \sum_n m_n\left(x_{nl}^2\,\delta_{ik} - x_{ni}x_{nk}\right)$ is a $3\times 3$ two-dimensional array of terms, called components, each of which is made up (for this particular tensor) of products of vector components.

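Obviously, if we had chosen a different set of Cartesian axes from the same origin $O$ the vector components would be different:  we know how a vector transforms under such a change of axes, $(x,y,z)\to(x',y',z')$, where (taking as an example a rotation of the axes through $\theta$ about the $z$ axis)

$$\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix}.$$

This can be written more succinctly as

$$x'_i = R_{ij}x_j, \quad\text{or}\quad \mathbf{x}' = \mathbf{R}\mathbf{x},$$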

the bold font indicating a vector or matrix.

In fact, a transformation from any set of Cartesian axes to any other set having the same origin is a rotation about some axis.  This can easily be seen by first rotating so that the $x'$ axis coincides with the $x$ axis, then rotating about that axis.  (Of course, both sets of axes must have the same handedness.)  We’ll discuss these rotation transformations in more detail later; for now we’ll just mention that the inverse of a rotation is given by the transpose matrix (check for the example above),

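$$\mathbf{R}^T = \mathbf{R}^{-1}, \quad\text{or}\quad R_{ji} = R^{-1}_{ij},$$

so if the column vector $x'_i = R_{ij}x_j$, or $\mathbf{x}' = \mathbf{R}\mathbf{x}$, the row vector $\mathbf{x}'^T = \mathbf{x}^T\mathbf{R}^T = \mathbf{x}^T\mathbf{R}^{-1}$, a.k.a. $x'_i = R_{ij}x_j = x_jR^T_{ji} = x_jR^{-1}_{ji}$, and the length of the vector doesn’t change:

$$x'_ix'_i = \mathbf{x}'^T\mathbf{x}' = \mathbf{x}^T\mathbf{R}^T\mathbf{R}\mathbf{x} = \mathbf{x}^T\mathbf{R}^{-1}\mathbf{R}\mathbf{x} = \mathbf{x}^T\mathbf{x} = x_ix_i.$$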
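The suggested check that the transpose of the example rotation is its inverse can be done by hand, or with a few lines of code. This is a minimal sketch using plain nested lists; the helper names (`rotation_z`, `matmul`, `transpose`, `apply`) and the angle are illustrative choices.

```python
import math

def rotation_z(theta):
    """Matrix taking components (x, y, z) to axes rotated by theta about z."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    """(AB)_ij = A_ik B_kj for 3x3 nested lists."""
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]

def apply(R, x):
    """x'_i = R_ij x_j."""
    return [sum(R[i][j]*x[j] for j in range(3)) for i in range(3)]

R = rotation_z(0.7)                   # arbitrary angle in radians
RtR = matmul(transpose(R), R)         # should be the identity matrix

x = [1.0, -2.0, 0.5]
x_prime = apply(R, x)
length_sq = sum(c*c for c in x)
length_sq_prime = sum(c*c for c in x_prime)
print(RtR, length_sq, length_sq_prime)
```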

It might be worth spelling out explicitly here that the transpose of a square matrix (and almost all our matrices are square) is found by just swapping the rows and columns, or equivalently swapping elements which are the reflections of each other in the main diagonal.  The transpose of a vector written as a column has the same elements, written as a row, and the product of vectors follows the standard rules for matrix multiplication: $(AB)_{ij} = A_{ik}B_{kj}$, with the dummy suffix $k$ summed over.

Thus,

$$\begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}^T = \begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix}, \quad\text{and}\quad \mathbf{a}^T\mathbf{a} = \begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix}\begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = a_1^2 + a_2^2 + a_3^2,$$

but

$$\mathbf{a}\mathbf{a}^T = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}\begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix} = \begin{pmatrix} a_1^2 & a_1a_2 & a_1a_3 \\ a_1a_2 & a_2^2 & a_2a_3 \\ a_1a_3 & a_2a_3 & a_3^2 \end{pmatrix}.$$

This will perhaps remind you of the Hilbert space vectors in quantum mechanics: the transposed vector above is analogous to the bra, the initial column vector being the ket. One difference from quantum mechanics is that all our vectors here are real.  If that were not the case it would be natural to add complex conjugation to the transposition, to give $\mathbf{a}^{*T}\mathbf{a} = |a_1|^2 + |a_2|^2 + |a_3|^2$, the length squared of the vector.

The difference shown above between $\mathbf{a}^T\mathbf{a}$ and $\mathbf{a}\mathbf{a}^T$ is exactly parallel to the difference between $\langle a|a\rangle$ and $|a\rangle\langle a|$ in quantum mechanics: the first is a number, the norm of the vector, the second is an operator, a projection into the state $|a\rangle$.

Definition of a Tensor

We have a definite rule for how vector components transform under a change of basis: $x'_i = R_{ij}x_j$.  What about the components of the inertia tensor $I_{ik} = \sum_n m_n\left(x_{nl}^2\,\delta_{ik} - x_{ni}x_{nk}\right)$?

We’ll do it in two parts, and one particle at a time.  First, take that second term for one particle: it has the form $-m\,x_ix_k$.  But we already know how vector components transform, so this must go to

$$m\,x'_ix'_k = R_{il}R_{km}\,m\,x_lx_m.$$

The same rotation matrix $R_{ij}$ is applied to all the particles, so we can add over $n$.

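In fact, the inertia tensor is made up of elements exactly of this form in all nine places, plus diagonal terms $\sum_n m_n r_n^2$, obviously invariant under rotation. To make this clear, we write the inertia tensor:

$$\begin{pmatrix}
\sum m\left(y^2+z^2\right) & -\sum mxy & -\sum mxz \\
-\sum mxy & \sum m\left(z^2+x^2\right) & -\sum myz \\
-\sum mxz & -\sum myz & \sum m\left(x^2+y^2\right)
\end{pmatrix}
= \sum m\left(x^2+y^2+z^2\right)\mathbf{1} -
\begin{pmatrix}
\sum mx^2 & \sum mxy & \sum mxz \\
\sum mxy & \sum my^2 & \sum myz \\
\sum mxz & \sum myz & \sum mz^2
\end{pmatrix}$$

where $\mathbf{1}$ is the $3\times 3$ identity matrix.  (Not to be confused with $\mathbf{I}$!)

Exercise: convince yourself that this is the same as $\mathbf{I} = \sum m\left[\left(\mathbf{x}^T\mathbf{x}\right)\mathbf{1} - \mathbf{x}\mathbf{x}^T\right]$.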

This transformation property is the definition of a two-suffix Cartesian three-dimensional tensor:  just as a vector in this space can be defined as an array of three components that are transformed under a change of basis by applying the rotation matrix, $x'_i = R_{ij}x_j$, a tensor with two suffixes in the same space is a two-dimensional array of nine numbers that transform as

$$T'_{ij} = R_{il}R_{jm}T_{lm}.$$

Writing this in matrix notation, and keeping an eye on the indices, we see that with the standard definition of a matrix product, $(AB)_{ij} = A_{ik}B_{kj}$,

$$\mathbf{T}' = \mathbf{R}\mathbf{T}\mathbf{R}^T = \mathbf{R}\mathbf{T}\mathbf{R}^{-1}.$$

(The transformation property for our tensor followed immediately from that for a vector, since our tensor is constructed from vectors, but by definition the same rule applies to all Cartesian tensors, which are not always expressible in terms of vector components.)
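The transformation rule can be tested on the inertia tensor itself: compute the tensor from rotated particle coordinates, and compare with $\mathbf{R}\mathbf{I}\mathbf{R}^T$ applied to the tensor in the original coordinates. The sketch below assumes a rotation about the $z$ axis and made-up particle data; all helper names are illustrative.

```python
import math

def inertia_tensor(masses, positions):
    """I_ik = sum_n m_n (x_nl^2 delta_ik - x_ni x_nk)."""
    I = [[0.0]*3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c*c for c in x)
        for i in range(3):
            for k in range(3):
                I[i][k] += m*((r2 if i == k else 0.0) - x[i]*x[k])
    return I

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def apply(R, x):
    return tuple(sum(R[i][j]*x[j] for j in range(3)) for i in range(3))

# Illustrative data (not from the text)
masses = [1.0, 2.0, 1.5]
positions = [(1.0, 2.0, 3.0), (-1.0, 0.0, 2.0), (0.5, -1.0, 1.0)]
R = rotation_z(math.radians(40))

# Route 1: transform the tensor, I' = R I R^T
I_from_rule = matmul(matmul(R, inertia_tensor(masses, positions)), transpose(R))
# Route 2: transform the coordinates, then build the tensor
I_from_coords = inertia_tensor(masses, [apply(R, x) for x in positions])

max_diff = max(abs(I_from_rule[i][k] - I_from_coords[i][k])
               for i in range(3) for k in range(3))
print(max_diff)
```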

Diagonalizing the Inertia Tensor

The inertia tensor has the form of a real symmetric matrix.  By an appropriate choice of axes $(x_1, x_2, x_3)$ any such tensor can be put in diagonal form, so that

$$T_{\rm rot} = \tfrac{1}{2}\left(I_1\Omega_1^2 + I_2\Omega_2^2 + I_3\Omega_3^2\right).$$

These axes, with respect to which the inertia tensor is diagonal, are called the principal axes of inertia, and the moments about them, $I_1, I_2, I_3$, the principal moments of inertia.

If you’re already familiar with the routine for diagonalizing a real symmetric matrix, you can skip this review.

The diagonalization of the tensor/matrix proceeds as follows.

First, find the eigenvalues $\lambda_i$ and corresponding eigenvectors $\mathbf{e}_i$ of the inertia tensor $\mathbf{I}$:

$$\mathbf{I}\mathbf{e}_i = \lambda_i\mathbf{e}_i \quad (i = 1,2,3,\ \text{not summed}).$$

(The $\lambda_i$ turn out to be the principal moments $I_i$, but we’ll leave them as $\lambda_i$ for now; we need first to establish that they’re real.)

Now since $\mathbf{I}$ is real and symmetric, $\mathbf{I}^T = \mathbf{I}$, the eigenvalues are real. To prove this, take the equation for $\mathbf{e}_1$ above and premultiply by the row vector $\mathbf{e}_1^{*T}$, the complex conjugate transpose:

$$\mathbf{e}_1^{*T}\mathbf{I}\mathbf{e}_1 = \lambda_1\mathbf{e}_1^{*T}\mathbf{e}_1.$$

The left hand side is a real number:  this can be established by taking its complex conjugate. The fact that the tensor is real and symmetric is crucial!

$$\left(e^*_{1i}I_{ij}e_{1j}\right)^* = e_{1i}I^*_{ij}e^*_{1j} = e_{1i}I_{ji}e^*_{1j} = e^*_{1j}I_{ji}e_{1i},$$

and since these are dummy suffixes, we can swap the $i$’s and $j$’s to establish that this number is identical to its complex conjugate, hence it’s real.  Clearly, $\mathbf{e}_1^{*T}\mathbf{e}_1$ is real and positive, so the eigenvalues are real.

(Note: a real symmetric matrix does not necessarily have positive roots: for example $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, which has eigenvalues $\pm 1$.)

Taking the eigenvalues to be distinct (the degenerate case is easy to deal with), the eigenvectors are orthogonal, by the standard proof: for this matrix, left eigenvectors (rows) have the same eigenvalues as their transposes, so

$$\mathbf{e}_2^T\mathbf{I}\mathbf{e}_1 = \lambda_2\mathbf{e}_2^T\mathbf{e}_1 = \lambda_1\mathbf{e}_2^T\mathbf{e}_1,$$

and, since $\lambda_1\neq\lambda_2$, $\mathbf{e}_2^T\mathbf{e}_1 = 0$.

The diagonalizing matrix is made up of these eigenvectors (assumed normalized):

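$$\mathbf{R} = \begin{pmatrix} \mathbf{e}_1^T \\ \mathbf{e}_2^T \\ \mathbf{e}_3^T \end{pmatrix},$$

a column of row vectors.

To check that this is indeed a rotation matrix, from one orthogonal set of axes to another, notice first that its transpose $\mathbf{R}^T = \begin{pmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \end{pmatrix}$ is its inverse (as required for a rotation), since the eigenvectors form an orthonormal set.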

Now apply this R  to an arbitrary vector:

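$$\mathbf{x}' = \mathbf{R}\mathbf{x} = \begin{pmatrix} \mathbf{e}_1^T \\ \mathbf{e}_2^T \\ \mathbf{e}_3^T \end{pmatrix}\mathbf{x} = \begin{pmatrix} \mathbf{e}_1^T\mathbf{x} \\ \mathbf{e}_2^T\mathbf{x} \\ \mathbf{e}_3^T\mathbf{x} \end{pmatrix}.$$

In vector language, these elements are just $\vec{e}_1\cdot\vec{x}$, etc., so $x'_1 = \vec{e}_1\cdot\vec{x}$: the primed components are just the components of $\vec{x}$ along the eigenvector axes.  The operator $\mathbf{R}$ therefore gives the vector components relative to these axes, meaning it has rotated the coordinate system to one in which the principal axes of the body are the $x_1, x_2, x_3$ axes.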

We can confirm this by applying the rotation to the inertia tensor itself:

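$$\mathbf{I}' = \mathbf{R}\mathbf{I}\mathbf{R}^T = \begin{pmatrix} \mathbf{e}_1^T \\ \mathbf{e}_2^T \\ \mathbf{e}_3^T \end{pmatrix}\mathbf{I}\begin{pmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \end{pmatrix} = \begin{pmatrix} \mathbf{e}_1^T \\ \mathbf{e}_2^T \\ \mathbf{e}_3^T \end{pmatrix}\begin{pmatrix} \lambda_1\mathbf{e}_1 & \lambda_2\mathbf{e}_2 & \lambda_3\mathbf{e}_3 \end{pmatrix} = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}.$$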
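As a worked illustration of the diagonalization, consider again the two masses $m$ at $\pm a(\sin\theta, 0, \cos\theta)$ from the fixed-axis example (with the same assumed orientation, upper mass on the positive $x$ side). The eigenvectors can be written down by inspection: one along the rod, with eigenvalue zero, and two perpendicular to it, each with eigenvalue $2ma^2$. The sketch below, with made-up numerical values and illustrative helper names, builds $\mathbf{R}$ from these eigenvectors and confirms that $\mathbf{R}\mathbf{I}\mathbf{R}^T$ is diagonal. This is a degenerate case (two equal eigenvalues), so the two perpendicular eigenvectors are simply chosen to be orthogonal.

```python
import math

def inertia_tensor(masses, positions):
    """I_ik = sum_n m_n (x_nl^2 delta_ik - x_ni x_nk)."""
    I = [[0.0]*3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c*c for c in x)
        for i in range(3):
            for k in range(3):
                I[i][k] += m*((r2 if i == k else 0.0) - x[i]*x[k])
    return I

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Illustrative values (not from the text)
m, a, theta = 1.0, 0.5, math.radians(30)
s, c = math.sin(theta), math.cos(theta)
positions = [(a*s, 0.0, a*c), (-a*s, 0.0, -a*c)]
I = inertia_tensor([m, m], positions)

# Orthonormal eigenvectors, written down by inspection
e1 = (c, 0.0, -s)     # perpendicular to the rod, in the xz plane
e2 = (0.0, 1.0, 0.0)  # perpendicular to the rod, along y
e3 = (s, 0.0, c)      # along the rod

R = [list(e1), list(e2), list(e3)]    # a column of row vectors
I_prime = matmul(matmul(R, I), transpose(R))
print(I_prime)
```

The zero principal moment reflects the idealization of point masses on a massless rod: rotation about the rod itself carries no kinetic energy.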

Let’s examine the contribution of one particle to the inertia tensor:

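$$\mathbf{I}_1 = m\left[\left(\mathbf{x}^T\mathbf{x}\right)\mathbf{1} - \mathbf{x}\mathbf{x}^T\right]$$

Note that $\mathbf{x}$ here represents the column vector of the particle coordinates; in other words, it’s just $\vec{r}$! And watch out for the difference between the inertia tensor $\mathbf{I}$ and the unit tensor $\mathbf{1}$.

The coordinates transform as $\mathbf{x}' = \mathbf{R}\mathbf{x}$; note that this agrees with $\mathbf{I}' = \mathbf{R}\mathbf{I}\mathbf{R}^T$.  Since under rotation the length of a vector is invariant, $\mathbf{x}'^T\mathbf{x}' = \mathbf{x}^T\mathbf{x}$, and $\mathbf{R}\mathbf{x}\mathbf{x}^T\mathbf{R}^T = \mathbf{x}'\mathbf{x}'^T$, it is evident that in the rotated frame (the eigenvector frame) the single particle contributes to the diagonal elements $m\left(x_2^2+x_3^2\right)$, $m\left(x_3^2+x_1^2\right)$, $m\left(x_1^2+x_2^2\right)$. We’ve dropped the primes, since we’ll be working in this natural frame from now on.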

Principal Axes Form of Moment of Inertia Tensor

We already know that the transformed matrix is diagonal, so its form has to be

$$\sum_n m_n\begin{pmatrix} x_{n2}^2+x_{n3}^2 & 0 & 0 \\ 0 & x_{n3}^2+x_{n1}^2 & 0 \\ 0 & 0 & x_{n1}^2+x_{n2}^2 \end{pmatrix} = \begin{pmatrix} I_1 & 0 & 0 \\ 0 & I_2 & 0 \\ 0 & 0 & I_3 \end{pmatrix}.$$

The moments of inertia, the diagonal elements, are of course all positive.  Note that no one of them can exceed the sum of the other two, although it can be equal in the (idealized) case of a two-dimensional object.  For that case, taking it to lie in the $(x,y)$ plane, $I_z = \sum_n m_n\left(x_n^2+y_n^2\right) = I_x + I_y$.

Relating Angular Momentum to Angular Velocity

It’s easy to check that the angular momentum vector is

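$L_i = I_{ij}\Omega_j$, since (taking the angular momentum about the center of mass, so that the $\vec{V}$ term drops out because $\sum_n m_n\vec{r}_n = 0$)

$$\vec{L} = \sum_n \vec{r}_n\times m_n\vec{v}_n = \sum_n m_n\,\vec{r}_n\times\left(\vec{\Omega}\times\vec{r}_n\right) = \vec{\Omega}\sum_n m_n r_n^2 - \sum_n m_n\vec{r}_n\left(\vec{\Omega}\cdot\vec{r}_n\right) = \mathbf{I}\,\vec{\Omega}.$$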

Exercise:  verify this by putting in all the suffixes.
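A numerical version of the same check may be a useful companion to the index exercise. The sketch below, with made-up particle data and illustrative helper names, compares $\sum_n m_n\vec{r}_n\times(\vec{\Omega}\times\vec{r}_n)$ with $I_{ij}\Omega_j$.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def inertia_tensor(masses, positions):
    """I_ik = sum_n m_n (x_nl^2 delta_ik - x_ni x_nk)."""
    I = [[0.0]*3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c*c for c in x)
        for i in range(3):
            for k in range(3):
                I[i][k] += m*((r2 if i == k else 0.0) - x[i]*x[k])
    return I

# Illustrative data (not from the text)
masses = [1.0, 2.0, 1.5]
positions = [(1.0, 2.0, 3.0), (-1.0, 0.0, 2.0), (0.5, -1.0, 1.0)]
Omega = (0.3, -0.5, 1.2)

# Particle-by-particle sum of m r x (Omega x r)
L_direct = [0.0, 0.0, 0.0]
for m, r in zip(masses, positions):
    l = cross(r, cross(Omega, r))
    for i in range(3):
        L_direct[i] += m*l[i]

# Tensor form: L_i = I_ij Omega_j
I = inertia_tensor(masses, positions)
L_tensor = [sum(I[i][j]*Omega[j] for j in range(3)) for i in range(3)]
print(L_direct, L_tensor)
```

In general the resulting vector is not parallel to `Omega`, in line with the two-mass example earlier.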

Symmetries, Other Axes, the Parallel Axis Theorem

If a body has an axis of symmetry, the center of mass must be on that axis, and it is a principal axis of inertia. To prove the center of mass statement, note that the body is made up of pairs of equal mass particles on opposite sides of the axis, each pair having its center of mass on the axis, and the body’s center of mass is that of all these pairs’ centers of mass, all of which are on the axis.

Taking this axis to be the $x$ axis, symmetry means that for each particle at $(x,y,z)$ there is one of equal mass at $(x,-y,-z)$, so the off-diagonal terms in the $x$ row and column, $-\sum mxy$ and $-\sum mxz$, all add up to zero, meaning this is indeed a principal axis.

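The moment of inertia about an arbitrary axis through the center of mass, in the direction of the unit vector $\hat{\mathbf{b}}$, is

$$\sum m\left[r^2 - \left(\vec{r}\cdot\hat{\mathbf{b}}\right)^2\right] = \hat{\mathbf{b}}^T\mathbf{I}\,\hat{\mathbf{b}} = b_x^2I_x + b_y^2I_y + b_z^2I_z,$$

the last form holding when the components of $\hat{\mathbf{b}}$ are taken along the principal axes.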

The inertia tensor about some origin $O'$ located at position $\vec{a}$ relative to the center of mass is easily found to be

$$I'_{ik} = I_{ik} + M\left(a^2\delta_{ik} - a_ia_k\right).$$

In particular, we have the parallel axis theorem: the moment of inertia about any axis through some point $O'$ equals that about the parallel axis through the center of mass $O$ plus $Ma_\perp^2$, where $a_\perp$ is the perpendicular distance between the axes.

Exercise: check this!
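One way to check the displaced-origin formula numerically is sketched below, under the assumption of a few made-up point masses: the tensor is built once about the center of mass and once directly about a displaced origin $O'$, and the difference is compared with $M(a^2\delta_{ik} - a_ia_k)$. Names and values are illustrative only.

```python
def inertia_tensor(masses, positions):
    """I_ik = sum_n m_n (x_nl^2 delta_ik - x_ni x_nk)."""
    I = [[0.0]*3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c*c for c in x)
        for i in range(3):
            for k in range(3):
                I[i][k] += m*((r2 if i == k else 0.0) - x[i]*x[k])
    return I

# Illustrative data (not from the text)
masses = [1.0, 2.0, 3.0]
raw = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 3.0)]
M = sum(masses)

# Positions measured from the center of mass O
com = tuple(sum(m*p[i] for m, p in zip(masses, raw))/M for i in range(3))
r_cm = [tuple(p[i] - com[i] for i in range(3)) for p in raw]

# New origin O' at position a relative to the center of mass
a = (0.5, -1.0, 2.0)
r_new = [tuple(r[i] - a[i] for i in range(3)) for r in r_cm]

I_cm = inertia_tensor(masses, r_cm)
I_direct = inertia_tensor(masses, r_new)       # built directly about O'

a2 = sum(c*c for c in a)
I_theorem = [[I_cm[i][k] + M*((a2 if i == k else 0.0) - a[i]*a[k])
              for k in range(3)] for i in range(3)]

max_diff = max(abs(I_direct[i][k] - I_theorem[i][k])
               for i in range(3) for k in range(3))
print(max_diff)
```

The agreement relies on the first tensor being taken about the center of mass; between two arbitrary origins there would be extra terms linear in $\sum_n m_n\vec{r}_n$.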