23. Motion of a Rigid Body: the Inertia Tensor
Michael Fowler
Definition of Rigid
We’re thinking here of an idealized solid, in which the distance between any two internal points stays the same as the body moves around. That is, we ignore vibrations, or strains in the material resulting from inside or outside stresses. In fact, this is almost always an excellent approximation for ordinary solids subject to typical stresses—obvious exceptions being rubber, flesh, etc. Following Landau, we’ll usually begin by representing the body as a collection of particles of different masses held in their places by massless bonds. This approach has the merit that the dynamics can be expressed cleanly in terms of sums over the particles, but for an ordinary solid we’ll finally take a continuum limit, replacing the finite sums over the constituent particles by integrals over a continuous mass distribution.
Rotation of a Body about a Fixed Axis
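As a preliminary, let’s look at a body firmly attached to a rod fixed in space, and rotating with angular velocity $\Omega$ radians/sec. about that axis. You’ll recall from freshman physics that the angular momentum and rotational energy are
$$L_z = I\Omega, \qquad E_{\text{rot}} = \tfrac{1}{2} I \Omega^2,$$
where
$$I = \sum_i m_i r_{\perp i}^2 = \int dx\,dy\,dz\,\rho(x,y,z)\,r_\perp^2$$
(here $r_\perp = \sqrt{x^2+y^2}$ is the distance from the axis).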
But you also know that both angular velocity and angular momentum are vectors. Obviously, for this example, the angular velocity is a vector pointing along the axis of rotation, $\boldsymbol{\Omega} = (0, 0, \Omega)$. One might be tempted to conclude that the angular momentum also points along the axis, but this is not always the case. An instructive example is provided by two masses $m$ at the ends of a rod of length $2a$ held at a fixed angle $\theta$ to the $z$ axis, which is the axis of rotation.
Evidently,
$$L_z = 2ma^2\sin^2\theta\cdot\Omega.$$
But notice that, assuming the rod is momentarily in the $xz$ plane, as shown (so that the masses are at $\pm a(\sin\theta, 0, \cos\theta)$), then
$$L_x = -2ma^2\sin\theta\cos\theta\cdot\Omega.$$
The total angular momentum is not parallel to the total angular velocity!
In fact, as should be evident, the total angular momentum is rotating around the constant angular velocity vector, so the axis must be providing a torque. This is why unbalanced car wheels stress the axle.
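The tilted-rod example can be checked numerically. The sketch below assumes two equal masses $m$ at $\pm a(\sin\theta, 0, \cos\theta)$ and an angular velocity $\Omega$ along $z$. The function name and the numerical values are illustrative choices, not taken from the text. It sums $m\,\mathbf{r}\times\mathbf{v}$ with $\mathbf{v} = \boldsymbol{\Omega}\times\mathbf{r}$ and shows that the result has an $x$ component as well as a $z$ component.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def rod_angular_momentum(m, a, theta, omega_z):
    """Total L of two masses m at +/- a(sin theta, 0, cos theta), spun about z."""
    omega = (0.0, 0.0, omega_z)
    total = [0.0, 0.0, 0.0]
    for sign in (+1, -1):
        r = (sign*a*math.sin(theta), 0.0, sign*a*math.cos(theta))
        v = cross(omega, r)          # velocity of a point of the rotating body
        r_cross_v = cross(r, v)
        for i in range(3):
            total[i] += m*r_cross_v[i]
    return tuple(total)

# Illustrative values (assumptions): m = 1, a = 1, theta = 30 degrees, Omega = 2
L = rod_angular_momentum(1.0, 1.0, math.radians(30), 2.0)
print("L =", L)   # L_x is nonzero, so L is not along the z axis
```

Because $L_x \neq 0$ unless $\theta$ is $0$ or $\pi/2$, the angular momentum is tilted away from the rotation axis, as the text argues.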
General Motion of a Rotating Rigid Body
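We’ll follow the Landau notation (which itself tends to be bilingual between coordinates $(x, y, z)$ and $(x_1, x_2, x_3)$). Notice that we’ll label the components by $(x_1, x_2, x_3)$, not $(r_1, r_2, r_3)$, even though we call the vector $\mathbf{r}$. Again, we’re following Landau.
We take a fixed, inertial (or lab) coordinate system labeled $(X, Y, Z)$, and in this system the rigid body’s center of mass, labeled $O$, is at $\mathbf{R}$. We have a Cartesian set of axes fixed in the body, with origin at the center of mass. Vectors from $O$ to a point in the body are denoted by $\mathbf{r}$, and their coordinates in this system are labeled $(x, y, z)$ or $(x_1, x_2, x_3)$.
A vector from the external inertial fixed origin to a point in the body is then $\mathbf{R} + \mathbf{r} = \boldsymbol{\rho}$, say, as shown in the figure.
Suppose now that in infinitesimal time $dt$, the center of mass of the body moves $d\mathbf{R}$ and the body rotates through $d\boldsymbol{\varphi}$. Then a particle at $\mathbf{r}$ as measured from the center of mass will move through
$$d\boldsymbol{\rho} = d\mathbf{R} + d\boldsymbol{\varphi}\times\mathbf{r}$$
relative to the external inertial frame.
Therefore, the velocity of that particle in the fixed frame, writing the center of mass velocity and the angular velocity as
$$\frac{d\mathbf{R}}{dt} = \mathbf{V}, \qquad \frac{d\boldsymbol{\varphi}}{dt} = \boldsymbol{\Omega},$$
is
$$\mathbf{v} = \mathbf{V} + \boldsymbol{\Omega}\times\mathbf{r}.$$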
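Now, in deriving the above equation, we have not used the fact that the origin $O$ fixed in the body is at the center of mass. (That turns out to be useful shortly.) What if instead we had taken some other origin $O'$ fixed in the body? Would we find the angular velocity $\boldsymbol{\Omega}'$ about $O'$ to be the same as $\boldsymbol{\Omega}$? The answer turns out to be yes, but we need to prove it! Here's the proof:
If the position of $O'$ relative to $O$ is $\mathbf{a}$ (a vector fixed in the body and so moving with it), then the velocity $\mathbf{V}'$ of $O'$ is $\mathbf{V}' = \mathbf{V} + \boldsymbol{\Omega}\times\mathbf{a}$.
A particle at $\mathbf{r}$ relative to $O$ is at $\mathbf{r}' = \mathbf{r} - \mathbf{a}$ relative to $O'$.
Its velocity relative to the fixed external axes is
$$\mathbf{v} = \mathbf{V}' + \boldsymbol{\Omega}'\times\mathbf{r}',$$
and this must of course equal
$$\mathbf{V} + \boldsymbol{\Omega}\times\mathbf{r} = \mathbf{V} + \boldsymbol{\Omega}\times\mathbf{r}' + \boldsymbol{\Omega}\times\mathbf{a} = \mathbf{V}' + \boldsymbol{\Omega}\times\mathbf{r}'.$$
It follows that $\boldsymbol{\Omega}'\times\mathbf{r}' = \boldsymbol{\Omega}\times\mathbf{r}'$ for every particle position $\mathbf{r}'$, and hence $\boldsymbol{\Omega}' = \boldsymbol{\Omega}$.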
This means that if we describe the motion of any particle in the body in terms of some origin fixed in the body, plus rotation about that origin, the angular velocity vector describing the body’s motion is the same irrespective of the origin we choose. So we can, without ambiguity, talk about the angular velocity of the body.
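A quick numerical illustration of this origin independence, as a sketch with made-up numbers: a particle's lab-frame velocity is computed once from an origin $O$ moving with velocity $\mathbf{V}$, and once from a second body-fixed origin $O'$ displaced by $\mathbf{a}$, using the same angular velocity in both cases. The helper names and values are assumptions for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def point_velocity(V_origin, omega, r):
    """Lab-frame velocity of a body point at r from a body-fixed origin."""
    return add(V_origin, cross(omega, r))

# Illustrative numbers only (assumptions, not from the text)
V = (1.0, -2.0, 0.5)        # velocity of origin O
omega = (0.3, 0.1, -0.7)    # angular velocity of the body
a = (0.4, 0.0, 1.2)         # position of O' relative to O
r = (2.0, -1.0, 0.5)        # particle position relative to O

V_prime = point_velocity(V, omega, a)   # O' moves as a point of the body
r_prime = sub(r, a)                     # particle position relative to O'

v_from_O = point_velocity(V, omega, r)
v_from_O_prime = point_velocity(V_prime, omega, r_prime)  # same omega
print(v_from_O)
print(v_from_O_prime)
```

The two printed velocities agree to rounding error, so the same angular velocity vector describes the motion about either origin.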
From now on, we’ll assume that the origin fixed in the body is at the center of mass.
The Inertia Tensor
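Regarding a rigid body as a system of individual particles, we find the kinetic energy
$$\begin{aligned} T &= \sum_n \tfrac{1}{2} m_n v_n^2 = \sum_n \tfrac{1}{2} m_n \left(\mathbf{V} + \boldsymbol{\Omega}\times\mathbf{r}_n\right)^2 \\ &= \sum_n \tfrac{1}{2} m_n V^2 + \sum_n m_n \mathbf{V}\cdot\boldsymbol{\Omega}\times\mathbf{r}_n + \sum_n \tfrac{1}{2} m_n \left(\boldsymbol{\Omega}\times\mathbf{r}_n\right)^2. \end{aligned}$$
The first term in the last line is
$$\sum_n \tfrac{1}{2} m_n V^2 = \tfrac{1}{2} M V^2,$$
where $M$ is the total mass of the body.
The second term is
$$\sum_n m_n \mathbf{V}\cdot\boldsymbol{\Omega}\times\mathbf{r}_n = \mathbf{V}\cdot\boldsymbol{\Omega}\times\sum_n m_n\mathbf{r}_n = 0,$$
from the definition of the center of mass (our origin here), $\sum_n m_n\mathbf{r}_n = 0$.
The third term can be rewritten:
$$\sum_n \tfrac{1}{2} m_n \left(\boldsymbol{\Omega}\times\mathbf{r}_n\right)^2 = \sum_n \tfrac{1}{2} m_n \left[\Omega^2 r_n^2 - \left(\boldsymbol{\Omega}\cdot\mathbf{r}_n\right)^2\right].$$
Here we have used $|\boldsymbol{\Omega}\times\mathbf{r}| = \Omega r\sin\theta$ and $|\boldsymbol{\Omega}\cdot\mathbf{r}| = \Omega r\cos\theta$, where $\theta$ is the angle between $\boldsymbol{\Omega}$ and $\mathbf{r}$.
Alternatively, you could use the vector product identity $(\mathbf{a}\times\mathbf{b})\times\mathbf{c} = -\mathbf{a}(\mathbf{b}\cdot\mathbf{c}) + \mathbf{b}(\mathbf{a}\cdot\mathbf{c})$ together with $(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{c}\times\mathbf{d}) = \left((\mathbf{a}\times\mathbf{b})\times\mathbf{c}\right)\cdot\mathbf{d}$ to find
$$(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{c}\times\mathbf{d}) = (\mathbf{a}\cdot\mathbf{c})(\mathbf{b}\cdot\mathbf{d}) - (\mathbf{a}\cdot\mathbf{d})(\mathbf{b}\cdot\mathbf{c}).$$
The bottom line is that the kinetic energy is
$$T = \tfrac{1}{2} M V^2 + \tfrac{1}{2}\sum_n m_n\left[\Omega^2 r_n^2 - \left(\boldsymbol{\Omega}\cdot\mathbf{r}_n\right)^2\right] = T_{\text{tr}} + T_{\text{rot}},$$
a translational kinetic energy plus a rotational kinetic energy.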
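The split into translational and rotational parts can be tested on a small set of point masses. This sketch (function names and numbers are illustrative assumptions) first shifts the positions so that they are measured from the center of mass. It then compares the direct sum of $\tfrac12 m_n v_n^2$, with $\mathbf{v}_n = \mathbf{V} + \boldsymbol{\Omega}\times\mathbf{r}_n$, against $\tfrac12 MV^2$ plus $\tfrac12\sum_n m_n[\Omega^2 r_n^2 - (\boldsymbol{\Omega}\cdot\mathbf{r}_n)^2]$.

```python
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def shift_to_center_of_mass(masses, positions):
    """Return the positions measured from the center of mass."""
    M = sum(masses)
    com = [sum(m*r[i] for m, r in zip(masses, positions))/M for i in range(3)]
    return [tuple(r[i] - com[i] for i in range(3)) for r in positions]

def kinetic_energy_direct(masses, positions, V, omega):
    """Sum of (1/2) m v^2 with v = V + omega x r for each particle."""
    total = 0.0
    for m, r in zip(masses, positions):
        w = cross(omega, r)
        v = tuple(V[i] + w[i] for i in range(3))
        total += 0.5*m*dot(v, v)
    return total

def kinetic_energy_split(masses, positions, V, omega):
    """Return (T_tr, T_rot); valid when positions are measured from the CM."""
    t_tr = 0.5*sum(masses)*dot(V, V)
    t_rot = sum(0.5*m*(dot(omega, omega)*dot(r, r) - dot(omega, r)**2)
                for m, r in zip(masses, positions))
    return t_tr, t_rot

# Illustrative data (assumptions)
masses = [1.0, 2.0, 3.0, 0.5]
raw = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 0.5), (0.3, -0.4, 2.0)]
positions = shift_to_center_of_mass(masses, raw)
V = (0.2, -1.0, 0.7)
omega = (0.5, 1.5, -0.3)

direct = kinetic_energy_direct(masses, positions, V, omega)
t_tr, t_rot = kinetic_energy_split(masses, positions, V, omega)
print(direct, t_tr + t_rot)
```

The cross term only vanishes because the origin is the center of mass. Skipping the shift step would generally make the two numbers disagree.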
Warning about notation: at this point, things get a bit messy. The reason is that to make further progress in dealing with the rotational kinetic energy, we need to write it in terms of the individual components of the particle position vectors $\mathbf{r}_n$. Following Landau and others, we’ll write these components in two different ways:
$$\mathbf{r}_n \equiv (x_n, y_n, z_n) \equiv (x_{n1}, x_{n2}, x_{n3}).$$
The $x, y, z$ notation is helpful in giving a clearer picture of rotational energy, but the $x_{ni}$ notation is essential in handling the math, as will become evident.
Landau’s solution to the too-many-suffixes-for-clarity problem is to omit the suffix $n$ labeling the individual particles; I prefer to keep it in.
Double Suffix Summation Notation: to cut down on the number of $\sum$’s in expressions, we’ll follow Landau and others in using Einstein’s rule that if a suffix like $i, j, k$ appears twice in a product, it is to be summed over the values 1, 2, 3. It’s called a “dummy suffix” because it doesn’t matter what you label it, as long as it appears twice. For example, the inner product of two vectors can be written as $\mathbf{A}\cdot\mathbf{B} = A_i B_i$ or equally as $A_k B_k$. Furthermore, $\Omega_i^2$ means $\Omega_1^2 + \Omega_2^2 + \Omega_3^2 = \Omega^2$.
But do not use Greek letters for dummy suffixes in this context: the standard convention is that Greek letters are used in relativistic equations to signify sums over the four dimensions of spacetime, and Latin letters for sums over the three spatial dimensions, as we are doing here.
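The rotational kinetic energy is then
$$\begin{aligned} T_{\text{rot}} &= \tfrac{1}{2}\sum_n m_n\left(\Omega_i^2 x_{ni}^2 - \Omega_i x_{ni}\,\Omega_k x_{nk}\right) \\ &= \tfrac{1}{2}\sum_n m_n\left(\Omega_i\Omega_k\delta_{ik}\,x_{nl}^2 - \Omega_i\Omega_k\,x_{ni}x_{nk}\right) \\ &= \tfrac{1}{2}\,\Omega_i\Omega_k\sum_n m_n\left(x_{nl}^2\,\delta_{ik} - x_{ni}x_{nk}\right). \end{aligned}$$
Warning: That first line is a bit confusing. Copying Landau, I’ve written $\Omega_i^2 x_{ni}^2$; you might think that’s $\Omega_1^2 x_{n1}^2 + \Omega_2^2 x_{n2}^2 + \Omega_3^2 x_{n3}^2$, but a glance at the previous equation (and the second line of this equation) makes clear it’s actually $\Omega^2 r_n^2$. Landau should have written $\Omega_i^2 x_{nl}^2$. Actually I’m not even keen on $\Omega_i^2$ implying a summation. Standard use in relativity, for example, is that both of the two suffixes be explicit for summation to be implied. In GR one would write $\Omega_i\Omega_i$. (Well, actually $\Omega_i\Omega^i$, but that’s another story.)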
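Anyway, moving on, we introduce the inertia tensor
$$I_{ik} = \sum_n m_n\left(x_{nl}^2\,\delta_{ik} - x_{ni}x_{nk}\right),$$
in terms of which the kinetic energy of the moving, rotating rigid body is
$$T = \tfrac{1}{2} M V^2 + \tfrac{1}{2} I_{ik}\Omega_i\Omega_k.$$
As usual, the Lagrangian is $L = T - V$, where the potential energy $V$ (not to be confused with the center of mass velocity $\mathbf{V}$) is a function of six variables in general: the center of mass location and the orientation of the body relative to the center of mass.
Landau writes the inertia tensor explicitly as:
$$I_{ik} = \begin{pmatrix} \sum m(y^2+z^2) & -\sum mxy & -\sum mxz \\ -\sum mxy & \sum m(z^2+x^2) & -\sum myz \\ -\sum mxz & -\sum myz & \sum m(x^2+y^2) \end{pmatrix},$$
but you should bear in mind that $-\sum mxz$ means $-\sum_n m_n x_n z_n$.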
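As a concrete sketch of this definition, the function below assembles $I_{ik} = \sum_n m_n(x_{nl}^2\delta_{ik} - x_{ni}x_{nk})$ for a list of point masses and evaluates $T_{\text{rot}} = \tfrac12 I_{ik}\Omega_i\Omega_k$. The function names and the two-particle example are assumptions chosen for illustration, with the summed suffixes written out as explicit loops.

```python
def inertia_tensor(masses, positions):
    """I_ik = sum_n m_n (x_nl^2 delta_ik - x_ni x_nk) for point masses."""
    I = [[0.0]*3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c*c for c in x)                  # x_nl^2, summed over l
        for i in range(3):
            for k in range(3):
                delta = 1.0 if i == k else 0.0
                I[i][k] += m*(r2*delta - x[i]*x[k])
    return I

def rotational_energy(I, omega):
    """T_rot = (1/2) I_ik Omega_i Omega_k, with the double sum explicit."""
    return 0.5*sum(omega[i]*I[i][k]*omega[k]
                   for i in range(3) for k in range(3))

# Two unit masses at (1, 1, 0) and (-1, -1, 0): center of mass at the origin
masses = [1.0, 1.0]
positions = [(1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]

I = inertia_tensor(masses, positions)
for row in I:
    print(row)
print(rotational_energy(I, (1.0, 0.0, 0.0)))
```

For this pair the off-diagonal element $I_{xy} = -\sum mxy = -2$ is nonzero, which signals that the $x$ and $y$ axes are not principal axes for this configuration.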
Tensors 101
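We see that the “inertia tensor” defined above as $I_{ik} = \sum_n m_n\left(x_{nl}^2\,\delta_{ik} - x_{ni}x_{nk}\right)$ is a two-dimensional array of $3\times 3 = 9$ terms, called components, each of which is made up (for this particular tensor) of products of vector components.
Obviously, if we had chosen a different set of Cartesian axes from the same origin $O$ the vector components would be different: we know how a vector transforms under such a change of axes, $(x, y, z)\to(x', y', z')$. For example, for a rotation of the axes through an angle $\theta$ about the $z$ axis,
$$\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix}.$$
This can be written more succinctly as
$$x'_i = R_{ij}x_j, \quad\text{or}\quad \mathbf{x}' = \mathbf{R}\mathbf{x},$$
the bold font indicating a vector or matrix.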
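In fact, a transformation from any set of Cartesian axes to any other set having the same origin is a rotation about some axis. This can easily be seen by first rotating so that the $x'$ axis coincides with the $x$ axis, then rotating about that axis. (Of course, both sets of axes must have the same handedness.) We’ll discuss these rotation transformations in more detail later; for now we’ll just mention that the inverse of a rotation is given by the transpose matrix (check for the example above),
$$\mathbf{R}^T = \mathbf{R}^{-1},$$
so if the column vector $\mathbf{x}' = \mathbf{R}\mathbf{x}$, the row vector $\mathbf{x}'^T = \mathbf{x}^T\mathbf{R}^T = \mathbf{x}^T\mathbf{R}^{-1}$, a.k.a. $x'_i = R_{ij}x_j = x_j R^T_{ji}$, and the length of the vector doesn’t change:
$$x'_i x'_i = \mathbf{x}'^T\mathbf{x}' = \mathbf{x}^T\mathbf{R}^T\mathbf{R}\mathbf{x} = \mathbf{x}^T\mathbf{x} = x_i x_i.$$
It might be worth spelling out explicitly here that the transpose of a square matrix (and almost all our matrices are square) is found by just swapping the rows and columns, or equivalently swapping elements which are the reflections of each other in the main diagonal. The transpose of a vector written as a column has the same elements, written as a row. The product of vectors follows the standard rules for matrix multiplication, $(AB)_{ij} = A_{ik}B_{kj}$, with the dummy suffix $k$ summed over.
Thus,
$$\begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}^T = \begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix} \quad\text{and}\quad \mathbf{a}^T\mathbf{a} = a_1^2 + a_2^2 + a_3^2, \quad\text{but}\quad \mathbf{a}\mathbf{a}^T = \begin{pmatrix} a_1^2 & a_1a_2 & a_1a_3 \\ a_2a_1 & a_2^2 & a_2a_3 \\ a_3a_1 & a_3a_2 & a_3^2 \end{pmatrix}.$$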
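These matrix facts are easy to check with a few lines of code. The sketch below uses plain nested lists and hypothetical helper names; the 40 degree angle and the vector are arbitrary. It builds a rotation of the axes about $z$, confirms that the rotation times its transpose is the identity and that lengths are preserved, and contrasts the inner product $\mathbf{a}^T\mathbf{a}$ (a number) with the outer product $\mathbf{a}\mathbf{a}^T$ (a $3\times3$ matrix).

```python
import math

def rotation_about_z(theta):
    """Matrix giving components in axes rotated by theta about z."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """(AB)_ij = A_ik B_kj, with the dummy suffix k summed over."""
    return [[sum(A[i][k]*B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, x):
    return [sum(A[i][j]*x[j] for j in range(len(x))) for i in range(len(A))]

def inner(a, b):
    """a^T b: a single number."""
    return sum(x*y for x, y in zip(a, b))

def outer(a, b):
    """a b^T: a matrix with elements a_i b_j."""
    return [[x*y for y in b] for x in a]

R = rotation_about_z(math.radians(40))
x = [1.0, 2.0, 3.0]
x_rot = matvec(R, x)

print(matmul(R, transpose(R)))            # identity, up to rounding
print(inner(x, x), inner(x_rot, x_rot))   # squared length is unchanged
print(outer(x, x))                        # 3x3 matrix, not a number
```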
This will perhaps remind you of the Hilbert space vectors in quantum mechanics: the transposed vector above is analogous to the bra, the initial column vector being the ket. One difference from quantum mechanics is that all our vectors here are real; if that were not the case, it would be natural to add complex conjugation to the transposition, to give the length squared of the vector.
The difference shown above between $\mathbf{a}^T\mathbf{a}$ and $\mathbf{a}\mathbf{a}^T$ is exactly parallel to the difference between $\langle a|a\rangle$ and $|a\rangle\langle a|$ in quantum mechanics: the first is a number, the norm of the vector; the second is an operator, a projection into the state $|a\rangle$.
Definition of a Tensor
We have a definite rule for how vector components transform under a change of basis: $x'_i = R_{ij}x_j$. What about the components of the inertia tensor $I_{ik} = \sum_n m_n\left(x_{nl}^2\,\delta_{ik} - x_{ni}x_{nk}\right)$?
We’ll do it in two parts, and one particle at a time. First, take that second term for one particle: it has the form $-m x_i x_k$. But we already know how vector components transform, so this must go to
$$-m x'_i x'_k = R_{il}R_{km}\left(-m x_l x_m\right).$$
The same rotation matrix $R_{ij}$ is applied to all the particles, so we can add over $n$.
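In fact, the inertia tensor is made up of elements exactly of this form in all nine places, plus diagonal terms $m_n r_n^2$, obviously invariant under rotation. To make this clear, we write the inertia tensor:
$$\mathbf{I} = \sum_n m_n\left[\left(x_n^2 + y_n^2 + z_n^2\right)\mathbf{1} - \begin{pmatrix} x_n^2 & x_ny_n & x_nz_n \\ y_nx_n & y_n^2 & y_nz_n \\ z_nx_n & z_ny_n & z_n^2 \end{pmatrix}\right],$$
where $\mathbf{1}$ is the identity matrix. (Not to be confused with $\mathbf{I}$!)
Exercise: convince yourself that this is the same as
$$\mathbf{I} = \sum_n m_n\left[\left(\mathbf{x}_n^T\mathbf{x}_n\right)\mathbf{1} - \mathbf{x}_n\mathbf{x}_n^T\right].$$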
This transformation property is the definition of a two-suffix Cartesian three-dimensional tensor: just as a vector in this space can be defined as an array of three components that are transformed under a change of basis by applying the rotation matrix, $x'_i = R_{ij}x_j$, a tensor with two suffixes in the same space is a two-dimensional array of nine numbers that transform as
$$T'_{ij} = R_{il}R_{jm}T_{lm}.$$
Writing this in matrix notation, and keeping an eye on the indices, we see that with the standard definition of a matrix product, $(AB)_{ij} = A_{ik}B_{kj}$,
$$\mathbf{T}' = \mathbf{R}\mathbf{T}\mathbf{R}^T = \mathbf{R}\mathbf{T}\mathbf{R}^{-1}.$$
(The transformation property for our tensor followed immediately from that for a vector, since our tensor is constructed from vectors, but by definition the same rule applies to all Cartesian tensors, which are not always expressible in terms of vector components.)
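The two-suffix transformation rule can be verified directly for the inertia tensor. In this sketch (helper names, masses, positions and angles are all illustrative assumptions) the tensor is computed in two ways: by applying $\mathbf{R}\,\mathbf{I}\,\mathbf{R}^T$ to the tensor in the original axes, and by first transforming every particle's coordinates with $\mathbf{R}$ and then rebuilding the tensor from scratch.

```python
import math

def inertia_tensor(masses, positions):
    """I_ik = sum_n m_n (x_nl^2 delta_ik - x_ni x_nk)."""
    I = [[0.0]*3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c*c for c in x)
        for i in range(3):
            for k in range(3):
                I[i][k] += m*(r2*(1.0 if i == k else 0.0) - x[i]*x[k])
    return I

def rotation_about_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_about_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, x):
    return tuple(sum(A[i][j]*x[j] for j in range(3)) for i in range(3))

# Illustrative data (assumptions)
masses = [1.0, 2.0, 1.5]
positions = [(1.0, 0.5, -0.2), (-0.3, 1.0, 0.8), (0.4, -1.2, 0.1)]
R = matmul(rotation_about_x(math.radians(35)), rotation_about_z(math.radians(25)))

I = inertia_tensor(masses, positions)
I_from_rule = matmul(R, matmul(I, transpose(R)))           # R I R^T
I_from_coords = inertia_tensor(masses, [matvec(R, x) for x in positions])

max_diff = max(abs(I_from_rule[i][k] - I_from_coords[i][k])
               for i in range(3) for k in range(3))
print(max_diff)   # rounding-level difference
```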
Diagonalizing the Inertia Tensor
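The inertia tensor has the form of a real symmetric matrix. By an appropriate choice of axes $(x_1, x_2, x_3)$ any such tensor can be put in diagonal form, so that
$$T_{\text{rot}} = \tfrac{1}{2}\left(I_1\Omega_1^2 + I_2\Omega_2^2 + I_3\Omega_3^2\right).$$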
These axes, with respect to which the inertia tensor is diagonal, are called the principal axes of inertia, the moments about them the principal moments of inertia.
If you’re already familiar with the routine for diagonalizing a real symmetric matrix, you can skip this review.
The diagonalization of the tensor/matrix proceeds as follows.
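First, find the eigenvalues $\lambda_i$ and corresponding eigenvectors $\mathbf{e}_i$ of the inertia tensor $\mathbf{I}$:
$$\mathbf{I}\mathbf{e}_i = \lambda_i\mathbf{e}_i \quad (i = 1, 2, 3,\ \text{not summed}).$$
(The $\lambda_i$ turn out to be the principal moments $I_i$, but we’ll leave them as $\lambda$ for now; we need first to establish that they’re real.)
Now since $\mathbf{I}$ is real and symmetric, $\mathbf{I}^T = \mathbf{I}$, the eigenvalues are real. To prove this, take the equation for $\mathbf{e}_1$ above and premultiply by the row vector $\mathbf{e}_1^{*T}$, the complex conjugate transpose:
$$\mathbf{e}_1^{*T}\mathbf{I}\mathbf{e}_1 = \lambda_1\mathbf{e}_1^{*T}\mathbf{e}_1.$$
The left hand side is a real number: this can be established by taking its complex conjugate. The fact that the tensor is real and symmetric is crucial!
$$\left(\mathbf{e}_1^{*T}\mathbf{I}\mathbf{e}_1\right)^* = \left(e^*_{1i}I_{ij}e_{1j}\right)^* = e_{1i}I^*_{ij}e^*_{1j} = e_{1i}I_{ij}e^*_{1j} = e^*_{1j}I_{ji}e_{1i}.$$
And since these are dummy suffixes, we can swap the $i$’s and $j$’s to establish that this number is identical to its complex conjugate, hence it’s real. Clearly, $\mathbf{e}_1^{*T}\mathbf{e}_1$ is real and positive, so the eigenvalues are real.
(Note: a real symmetric matrix does not necessarily have positive roots: for example $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, which has eigenvalues $\pm 1$.)
Taking the eigenvalues to be distinct (the degenerate case is easy to deal with), the eigenvectors are orthogonal, by the standard proof: for this matrix, left eigenvectors (rows) have the same eigenvalues as their transposes, so
$$\mathbf{e}_1^T\mathbf{I}\mathbf{e}_2 = \lambda_1\mathbf{e}_1^T\mathbf{e}_2 = \lambda_2\mathbf{e}_1^T\mathbf{e}_2,$$
and, since $\lambda_1\neq\lambda_2$, $\mathbf{e}_1^T\mathbf{e}_2 = 0$.
The diagonalizing matrix is made up of these eigenvectors (assumed normalized):
$$\mathbf{R} = \begin{pmatrix} \mathbf{e}_1^T \\ \mathbf{e}_2^T \\ \mathbf{e}_3^T \end{pmatrix},$$
a column of row vectors.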
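To check that this is indeed a rotation matrix, from one orthogonal set of axes to another, notice first that its transpose is its inverse (as required for a rotation), since the eigenvectors form an orthonormal set. (The signs of the eigenvectors can always be chosen so that the new axes have the same handedness as the old.)
Now apply this $\mathbf{R}$ to an arbitrary vector:
$$\mathbf{x}' = \mathbf{R}\mathbf{x} = \begin{pmatrix} \mathbf{e}_1^T \\ \mathbf{e}_2^T \\ \mathbf{e}_3^T \end{pmatrix}\mathbf{x} = \begin{pmatrix} \mathbf{e}_1^T\mathbf{x} \\ \mathbf{e}_2^T\mathbf{x} \\ \mathbf{e}_3^T\mathbf{x} \end{pmatrix}.$$
In vector language, these elements are just $\mathbf{e}_1\cdot\mathbf{x}$, etc., so $x'_1 = \mathbf{e}_1\cdot\mathbf{x}$: the primed components are just the components of $\mathbf{x}$ along the eigenvector axes. The operator $\mathbf{R}$ therefore gives the vector components relative to these axes, meaning it has rotated the coordinate system to one in which the principal axes of the body are the $(x_1, x_2, x_3)$ axes.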
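We can confirm this by applying the rotation to the inertia tensor itself:
$$\mathbf{I}' = \mathbf{R}\mathbf{I}\mathbf{R}^T = \begin{pmatrix} \mathbf{e}_1^T \\ \mathbf{e}_2^T \\ \mathbf{e}_3^T \end{pmatrix}\mathbf{I}\begin{pmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \end{pmatrix} = \begin{pmatrix} \mathbf{e}_1^T \\ \mathbf{e}_2^T \\ \mathbf{e}_3^T \end{pmatrix}\begin{pmatrix} \lambda_1\mathbf{e}_1 & \lambda_2\mathbf{e}_2 & \lambda_3\mathbf{e}_3 \end{pmatrix} = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}.$$
Let’s examine the contribution of one particle to the inertia tensor:
$$\mathbf{I}_1 = m\left[\left(\mathbf{x}^T\mathbf{x}\right)\mathbf{1} - \mathbf{x}\mathbf{x}^T\right].$$
Note that $\mathbf{x}$ here represents the column vector of the particle coordinates; in other words, it’s just $\mathbf{r}$! And watch out for the distinction between the inertia tensor $\mathbf{I}$ and the unit tensor $\mathbf{1}$.
The vectors transform as $\mathbf{x}' = \mathbf{R}\mathbf{x}$, $\mathbf{x}'^T = \mathbf{x}^T\mathbf{R}^T$, so $\mathbf{x}'\mathbf{x}'^T = \mathbf{R}\,\mathbf{x}\mathbf{x}^T\mathbf{R}^T$; note that this agrees with $\mathbf{I}' = \mathbf{R}\mathbf{I}\mathbf{R}^T$. Since under rotation the length of a vector is invariant, $\mathbf{x}'^T\mathbf{x}' = \mathbf{x}^T\mathbf{x}$, and it is evident that in the rotated frame (the eigenvector frame) the single particle contributes to the diagonal elements $m\left(x_2^2 + x_3^2\right)$, $m\left(x_3^2 + x_1^2\right)$, $m\left(x_1^2 + x_2^2\right)$. We’ve dropped the primes, since we’ll be working in this natural frame from now on.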
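To see a diagonalization carried out numerically, here is a minimal sketch using the cyclic Jacobi rotation method. This is a standard numerical technique swapped in for illustration; it is not the eigenvector construction described in the text, although it arrives at the same rotation matrix $\mathbf{R}$ with the eigenvectors as rows. The function name `principal_axes` and the example tensor are assumptions. The tensor is that of two unit masses at $\pm(\sin 30^\circ, 0, \cos 30^\circ)$, which works out to $2(\mathbf{1} - \mathbf{n}\mathbf{n}^T)$ with $\mathbf{n}$ the unit vector along the line joining the masses.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def principal_axes(I, max_sweeps=50, tol=1e-24):
    """Diagonalize a real symmetric 3x3 matrix by cyclic Jacobi rotations.

    Returns (moments, R), where the rows of R are normalized eigenvectors,
    so that R I R^T is diagonal with the moments on the diagonal.
    """
    A = [row[:] for row in I]
    V = identity()                     # columns accumulate the eigenvectors
    for _ in range(max_sweeps):
        off = sum(A[p][q]**2 for p in range(3) for q in range(p + 1, 3))
        if off < tol:
            break
        for p in range(3):
            for q in range(p + 1, 3):
                if A[p][q] == 0.0:
                    continue
                diff = A[q][q] - A[p][p]
                # Angle that zeroes A[p][q]: tan(2 theta) = 2 A_pq/(A_qq - A_pp)
                theta = (math.pi/4 if diff == 0.0
                         else 0.5*math.atan(2.0*A[p][q]/diff))
                c, s = math.cos(theta), math.sin(theta)
                J = identity()
                J[p][p], J[q][q], J[p][q], J[q][p] = c, c, s, -s
                A = matmul(transpose(J), matmul(A, J))   # A -> J^T A J
                V = matmul(V, J)
    return [A[i][i] for i in range(3)], transpose(V)

# Example tensor (assumption): two unit masses at +/-(sin 30deg, 0, cos 30deg)
th = math.radians(30)
sn, cs = math.sin(th), math.cos(th)
I = [[2*cs*cs, 0.0, -2*sn*cs], [0.0, 2.0, 0.0], [-2*sn*cs, 0.0, 2*sn*sn]]

moments, R = principal_axes(I)
I_diag = matmul(R, matmul(I, transpose(R)))
print(moments)
print(I_diag)
```

For this example the principal moments are $0$ (about the line joining the masses) and $2$, twice (about any axis perpendicular to that line), so it also illustrates the degenerate case.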
Principal Axes Form of Moment of Inertia Tensor
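We already know that the transformed matrix is diagonal, so its form has to be
$$\mathbf{I} = \sum_n m_n\begin{pmatrix} x_{n2}^2 + x_{n3}^2 & 0 & 0 \\ 0 & x_{n3}^2 + x_{n1}^2 & 0 \\ 0 & 0 & x_{n1}^2 + x_{n2}^2 \end{pmatrix} = \begin{pmatrix} I_1 & 0 & 0 \\ 0 & I_2 & 0 \\ 0 & 0 & I_3 \end{pmatrix}.$$
The moments of inertia, the diagonal elements, are of course all positive (or zero, in the idealized case of masses all lying on one line through the origin). Note that no one of them can exceed the sum of the other two, although it can be equal in the (idealized) case of a two-dimensional object. For that case, taking it to lie in the $(x, y)$ plane,
$$I_z = \sum_n m_n\left(x_n^2 + y_n^2\right) = I_x + I_y.$$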
Relating Angular Momentum to Angular Velocity
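It’s easy to check that the angular momentum vector is
$$L_i = I_{ij}\Omega_j,$$
since
$$\mathbf{L} = \sum_n \mathbf{r}_n\times m_n\mathbf{v}_n = \sum_n m_n\,\mathbf{r}_n\times\left(\boldsymbol{\Omega}\times\mathbf{r}_n\right) = \boldsymbol{\Omega}\sum_n m_n r_n^2 - \sum_n m_n\mathbf{r}_n\left(\boldsymbol{\Omega}\cdot\mathbf{r}_n\right) = \mathbf{I}\boldsymbol{\Omega}.$$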
Exercise: verify this by putting in all the suffixes.
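One way to carry out this check is sketched below. It assumes the angular momentum is taken about the center of mass, with particle masses $m_n$ at positions $\mathbf{r}_n = (x_{n1}, x_{n2}, x_{n3})$ measured from it and velocities $\boldsymbol{\Omega}\times\mathbf{r}_n$ relative to it; the center of mass velocity drops out because $\sum_n m_n\mathbf{r}_n = 0$. It uses the summation convention and the identity $\varepsilon_{kij}\varepsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}$.

```latex
\begin{align*}
L_i &= \sum_n m_n\,\varepsilon_{ijk}\,x_{nj}\,(\boldsymbol{\Omega}\times\mathbf{r}_n)_k
     = \sum_n m_n\,\varepsilon_{ijk}\,\varepsilon_{klm}\,x_{nj}\,\Omega_l\,x_{nm} \\
    &= \sum_n m_n\left(\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}\right)x_{nj}\,\Omega_l\,x_{nm} \\
    &= \sum_n m_n\left(x_{nj}x_{nj}\,\Omega_i - x_{ni}\,x_{nj}\Omega_j\right) \\
    &= \sum_n m_n\left(x_{nl}^2\,\delta_{ij} - x_{ni}x_{nj}\right)\Omega_j
     = I_{ij}\Omega_j .
\end{align*}
```

In the principal axes frame this reduces to $L_1 = I_1\Omega_1$, $L_2 = I_2\Omega_2$, $L_3 = I_3\Omega_3$, so $\mathbf{L}$ is parallel to $\boldsymbol{\Omega}$ only when $\boldsymbol{\Omega}$ lies along a principal axis or the relevant principal moments are equal.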
Symmetries, Other Axes, the Parallel Axis Theorem
If a body has an axis of symmetry, the center of mass must be on that axis, and it is a principal axis of inertia. To prove the center of mass statement, note that the body is made up of pairs of equal mass particles on opposite sides of the axis, each pair having its center of mass on the axis, and the body’s center of mass is that of all these pairs’ centers of mass, all of which are on the axis.
Taking this axis to be the $z$ axis, symmetry means that for each particle at $(x, y, z)$ there is one of equal mass at $(-x, -y, z)$, so the off-diagonal terms in the $z$ row and column, $-\sum mxz$ and $-\sum myz$, all add up to zero, meaning this is indeed a principal axis.
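The moment of inertia about an arbitrary axis through the center of mass, in the direction of the unit vector $\mathbf{b}$, is
$$\sum_n m_n\left[r_n^2 - \left(\mathbf{r}_n\cdot\mathbf{b}\right)^2\right] = \mathbf{b}^T\mathbf{I}\mathbf{b} = b_x^2 I_x + b_y^2 I_y + b_z^2 I_z,$$
where the last form holds in the principal axes frame.
The inertia tensor about some origin $O'$ located at position $\mathbf{a}$ relative to the center of mass is easily found to be
$$I'_{ik} = I_{ik} + M\left(a^2\delta_{ik} - a_i a_k\right).$$
In particular, we have the parallel axis theorem: the moment of inertia about any axis through some point $O'$ equals that about the parallel axis through the center of mass $O$ plus $Ma_\perp^2$, where $a_\perp$ is the perpendicular distance between the axes.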
Exercise: check this!
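A numerical version of this check, as a sketch with made-up point masses (all names and values are illustrative assumptions): the inertia tensor is computed about the center of mass, then recomputed directly about a point $O'$ displaced by $\mathbf{a}$, and compared with $I_{ik} + M(a^2\delta_{ik} - a_ia_k)$.

```python
def inertia_tensor(masses, positions):
    """I_ik = sum_n m_n (x_nl^2 delta_ik - x_ni x_nk)."""
    I = [[0.0]*3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c*c for c in x)
        for i in range(3):
            for k in range(3):
                I[i][k] += m*(r2*(1.0 if i == k else 0.0) - x[i]*x[k])
    return I

def shift_to_center_of_mass(masses, positions):
    """Return the positions measured from the center of mass."""
    M = sum(masses)
    com = [sum(m*r[i] for m, r in zip(masses, positions))/M for i in range(3)]
    return [tuple(r[i] - com[i] for i in range(3)) for r in positions]

# Illustrative data (assumptions)
masses = [1.0, 2.0, 3.0, 0.5]
raw = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 0.5), (0.3, -0.4, 2.0)]
positions = shift_to_center_of_mass(masses, raw)
a = (0.5, -1.0, 2.0)          # position of O' relative to the center of mass
M = sum(masses)

I_cm = inertia_tensor(masses, positions)

# Direct calculation: particle positions measured from O' are r - a
shifted = [tuple(r[i] - a[i] for i in range(3)) for r in positions]
I_direct = inertia_tensor(masses, shifted)

# Formula: I'_ik = I_ik + M (a^2 delta_ik - a_i a_k)
a2 = sum(c*c for c in a)
I_formula = [[I_cm[i][k] + M*(a2*(1.0 if i == k else 0.0) - a[i]*a[k])
              for k in range(3)] for i in range(3)]

max_diff = max(abs(I_direct[i][k] - I_formula[i][k])
               for i in range(3) for k in range(3))
print(max_diff)   # rounding-level difference
```

The $zz$ component gives the familiar scalar statement: the moment about the $z$-direction axis through $O'$ exceeds that through the center of mass by $M(a_x^2 + a_y^2)$, the mass times the squared perpendicular distance between the axes.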