Rational Canonical Form III - Computation

We have a wonderful result about rational canonical forms, but how do we actually compute the rational canonical form of a given square matrix? Fortunately, there is a very straightforward algorithm. Given an n×n matrix, A, both its invariant factors and the change-of-basis matrix needed to put A into rational canonical form can be obtained from the computation of something called the Smith normal form for A.

The Smith normal form


Suppose A is an n×n matrix over a field F. Consider the n×n matrix xIn−A, which has entries in the ring F[x]. As usual in linear algebra, we will perform three basic types of row/column operations on this matrix. The three operations are:

  1. Swap a pair of rows/columns.
  2. Add an F[x]-multiple of one row/column to another.
  3. Multiply a row/column by a unit in F[x].

We call these the elementary row/column operations. We will use these operations to transform xIn−A into a very particular form, in some ways analogous to the classic reduced row-echelon form.

We first quote (without proof[1]) the following fact.

The Smith Normal Form

Let A be an n×n matrix over a field F. Using the three elementary row and column operations above, the n×n matrix xIn−A can be put into the following diagonal form, called the Smith normal form for A:

[1⋱1a1(x)a2(x)⋱am(x)],

where the ai(x) are nonzero nonconstant monic polynomials satisfying a1(x)âˆŖa2(x)âˆŖâ‹¯âˆŖam(x).

The polynomials a1(x),â€Ļ,am(x) are the invariant factors of A.

As a bonus, in computing the Smith normal form for A it turns out that we can also deduce the change-of-basis matrix that will conjugate A into rational canonical form. Although this will initially seem a bit strange, if we keep track of the row operations used to obtain the Smith normal form for A, then we can also write down a change-of-basis matrix, P, such that P−1AP is the rational canonical form for A. See pages Dummit & Foote for the full details, but here is a short summary:

Begin with the identity matrix In=:P′, and then for each row operation used to put the matrix xIn−A into Smith normal form, change the matrix P′ by the following rules:

  1. If rows i and j are swapped in the computation of the Smith normal form for A, then swap columns j and i in P′.
  2. If Ri+p(x)Rjâ†ĻRi in the computation of the Smith normal form for A, then perform Cj−p(A)Ciâ†ĻCj in P′. (Notice that the indices have switched!)
  3. If we multiply row i by a unit u in the computation of the Smith normal form for A, then multiply column i by u−1 in the computation of P′.

By the end of our computations, we will be left with a matrix P′ for which the first n−m columns are zero and the last m columns are nonzero. These last nonzero columns correspond to F[x]-module generators for the summands for each invariant factor. In particular, there will be exactly one nonzero column in P′ for each invariant factor.

Let vi be the ith nonzero column vector in P′, so that vi is the vector in V that corresponds to the F[x]-module generator 1― for F[x]/⟨ai(x)⟩. Since a full F-vector space basis for F[x]/⟨ai(x)⟩ is {1,x―,â€Ļ,x―deg⁥(a(x))−1}, the corresponding F-vector space basis for that invariant subspace of V is {vi,Avi,â€Ļ,Adeg⁥(ai(x))−1vi}. Do this for each nonzero column of P′. Listing the vectors produced (in this order) gives a desired change-of-basis matrix P.

Warning

This auxiliary matrix P′ is not quite unique. The nonzero columns of P′ correspond to F[x]-module generators of the invariant summands. Those summands are cyclic as F[x]-modules, but the generators for those summands are only unique up to scaling by units.

In particular, different sequences of elementary row/column operations (when computing the Smith normal form for A) can lead to slightly different auxiliary matrices. This will lead, in turn, to slightly different change-of-basis matrices. This is exactly the same situation that occurs when diagonalizing a (diagonalizable) matrix: each eigenbasis provides a suitable change-of-basis, but there is no unique eigenbasis.

Examples


Example 1: Following the general algorithm

Let A be the 3×3 matrix

A=[2−21403−7002]

The goal is to diagonalize the matrix xI3−A using row and column operations over the ring Q[x]. See pages 483-484 in Dummit & Foote for the actual row and column operations used to transform the matrix

xI3−A=[x−22−140x−3700x−2]

to the diagonal matrix

[1000x−2000x2−5x+6]

For quick reference, the operations used were (in order):

What's the strategy?

A fair question that we've left open is how did we come up with that sequence of row and column operations? It turns out that the diagonal entries in the Smith normal form are each the gcd of certain elements. Specifically, the first (i.e., top-left-most) diagonal entry a1 in the Smith normal form is the gcd of all of the entries in the matrix xIn−A.

In our example, we have

xI3−A=[x−22−140x−3700x−2]

and so it's clear by inspection that the gcd of the entries of xI3−A is 1. That's the initial goal, then, namely to use row and column operations (over Q[x]) until we can get a 1 in that (1,1)-entry. We then immediately wipe out all other entries in both the first column and first row. In our example, that leaves us with the matrix

[1000−(x−2)(x−3)7(x−2)00x−2]

Now focus on the 2×2 matrix obtained by omitting the first row and column. The second diagonal entry a2 in the Smith normal form is the gcd of the entries in that 2×2 matrix. In our case, this is evidently x−2. So we repeat the previous strategy, using row and column operations until we get x−2 in the (2,2)-entry of our matrix. At that point, we arrive at the matrix

[1000x−2000(x−2)(x−3)]

and we're done.

By our general theory, it follows that the invariant factors of A are a1(x)=x−2 and a2(x)=x2−5x+6. We can now conclude that the minimal polynomial of A is mA(x)=x2−5x+6=(x−2)(x−3), the characteristic polynomial of A is cA(x)=(x−2)(x2−5x+6)=(x−2)2(x−3), and the rational canonical form of A is

R=[20000−6015]

Moreover, if we keep track of the row operations used to diagonalize the matrix xI3−A, we can also compute a change-of-basis matrix P such that P−1AP is the rational canonical form matrix above. For a quick rundown on his this computation looks, first look at the row operations we used to compute the Smith normal form for A, and then write down the corresponding column operations as describe above. Starting from the identity matrix I3, we perform the following column operations:

This sequence of column operations will yield the matrix

P′=[0−7−1071010]

The first nonzero column of P′ then gives deg⁡(a1(x))=1 column of the matrix P, namely the column

p1=v1=[−771]

The second column of P′ gives deg⁡(a2(x))=2 columns of the matrix P, namely the columns

p2=v2=[−110],p3=Av2=[2−21403−7002][−110]=[−430]

Thus, a change-of-basis matrix P is

P=[−7−1−4713100]

Example 2: Shortcuts for small matrices

For small square matrices (sizes 3×3 and below), it's possible to compute the rational canonical form without going through the diagonalization process outlined above. For example, for the matrix A above, we can first directly compute the characteristic polynomial of A:

cA(x)=det(xI3−A)=(x−2)2(x−3).

This immediately tells us that the product of the invariant factors of A is (x−2)2(x−3). By the divisibility condition on the invariant factors, and the fact that the largest invariant factor is the minimal polynomial mA(x), we also know that mA(x) has the same roots as cA(x). In this case, that means there are only two possibilities for mA(x): it's either (x−2)(x−3) or (x−2)2(x−3). To determine which it is, simply recall that mA(x) is the smallest degree monic polynomial which evaluates at A to zero. Then check that

(A−2I3)(A−3I3)=A2−5A+6I3=0.

Thus, mA(x)=(x−2)(x−3).

Now observe that the invariant factors of A are nonzero nonconstant monic polynomials a1(x),â€Ļ,am(x) such that:

  1. a1(x)âˆŖa2(x)âˆŖâ‹¯âˆŖam(x)
  2. am(x)=mA(x)=(x−2)(x−3)
  3. a1(x)a2(x)⋯am(x)=cA(x)=(x−2)2(x−3).

There is only one such possible list, namely

a1(x)=x−2,a2(x)=(x−2)(x−3)

We now know the invariant factors of A and hence the rational canonical form of A.

For 2×2 and 3×3 matrices, this method is generally the fastest way to determine the rational canonical form. However, it has two downsides:

  1. It does not produce the change-of-basis matrix P.
  2. It does not usually work for matrices larger than 3×3.

Example 3: A larger matrix

Consider the 4×4 matrix

A=[12−442−14−8101−201−23]

It is not difficult to show that the characteristic polynomial of A is cA(x)=(x−1)4, which gives four possibilities for the minimal polynomial of A, namely mA(x)=x−1,(x−1)2,(x−1)3,(x−1)4. It is then not too terrible to verify that mA(x)=(x−1)2, simply by noting A−I4≠0 and verifying (A−I4)2=0. However, that leaves two possibilities for the invariant factors of A:

x−1,x−1,(x−1)2or(x−1)2,(x−1)2.

On the other hand, computing the Smith normal form of the matrix xI4−A we eventually obtain

[1000010000(x−1)20000(x−1)2].

Thus, the invariant factors of A are a1(x)=(x−1)2=x2−2x+1 and a2(x)=(x−1)2=x2−2x+1. It now follows that the rational canonical form of A is

R=[0−1001200000−10012].

Moreover, using the row operations used to compute the Smith normal form allows us to compute a change-of-basis matrix. We first compute the auxiliary matrix, P′, finding it to be

P′=[0010000100000000]

From this, we can now compute a change-of-basis matrix, P, ultimately finding it to be

P=[1102021−101000001]

See pages 485-486 of Dummit & Foote for the list of row and column operations used, and how they produce the matrix, P, above.

Suggested next note


Jordan Canonical Form I - Definition


  1. To see a proof, check out Exercises 16-19 in Section 12.1 and Exercises 22-25 in Section 12.2 of Dummit & Foote. â†Šī¸Ž