Rational Canonical Form III - Computation

#module_theory

We have a wonderful result about rational canonical forms, but how do we actually compute the rational canonical form of a given square matrix? Fortunately, there is a very straightforward algorithm. Given an $n \times n$ matrix, $A$ , both its invariant factors and the change-of-basis matrix needed to put $A$ into rational canonical form can be obtained from the computation of something called the Smith normal form for $A$ .

The Smith normal form

Suppose $A$ is an $n \times n$ matrix over a field $F$ . Consider the $n \times n$ matrix $x I_{n} - A$ , which has entries in the ring $F [x]$ . As usual in linear algebra, we will perform three basic types of row/column operations on this matrix. The three operations are:

Swap a pair of rows/columns.
Add an $F [x]$ -multiple of one row/column to another.
Multiply a row/column by a unit in $F [x]$ .

We call these the elementary row/column operations. We will use these operations to transform $x I_{n} - A$ into a very particular form, in some ways analogous to the classic reduced row-echelon form.

We first quote (without proof^[1]) the following fact.

The Smith Normal Form

Let $A$ be an $n \times n$ matrix over a field $F$ . Using the three elementary row and column operations above, the $n \times n$ matrix $x I_{n} - A$ can be put into the following diagonal form, called the Smith normal form for $A$ :

[\begin{matrix} 1 & & & & & & \\ & \ddots & & & & & \\ & & 1 & & & & \\ & & & a_{1} (x) & & & \\ & & & & a_{2} (x) & & \\ & & & & & \ddots & \\ & & & & & & a_{m} (x) \end{matrix}],

where the $a_{i} (x)$ are nonzero nonconstant monic polynomials satisfying $a_{1} (x) ∣ a_{2} (x) ∣ \dots ∣ a_{m} (x)$ .

The polynomials $a_{1} (x), \dots, a_{m} (x)$ are the invariant factors of $A$ .

As a bonus, in computing the Smith normal form for $A$ it turns out that we can also deduce the change-of-basis matrix that will conjugate $A$ into rational canonical form. Although this will initially seem a bit strange, if we keep track of the row operations used to obtain the Smith normal form for $A$ , then we can also write down a change-of-basis matrix, $P$ , such that $P^{- 1} A P$ is the rational canonical form for $A$ . See Dummit & Foote for the full details, but here is a short summary:

Begin with the identity matrix $I_{n} = P^{'}$ , and then for each row operation used to diagonalize the matrix $x I_{n} - A$ , change the matrix $P^{'}$ by the following rules:

If rows $i$ and $j$ are swapped in the computation of the Smith normal form for $A$ , then swap columns $i$ and $j$ in $P^{'}$ .
If $R_{i} + p (x) R_{j} \mapsto R_{i}$ in the computation of the Smith normal form for $A$ , then perform $C_{j} - p (A) C_{i} \mapsto C_{j}$ in $P^{'}$ . (Notice that the indices have switched!)
If we multiply row $i$ by a unit $u$ in the computation of the Smith normal form for $A$ , then multiply column $i$ by $u^{- 1}$ in the computation of $P^{'}$ .

Once we have computed the Smith normal form for $A$ , we will be left with a matrix $P^{'}$ for which the first $n - m$ columns are zero and the last $m$ columns are nonzero. These last nonzero columns correspond to $F [x]$ -module generators for the summands corresponding to each invariant factor. In particular, there will be exactly one nonzero column in $P^{'}$ for each invariant factor.

Let $v_{i}$ be the $i^{th}$ nonzero column vector in $P^{'}$ , so that $v_{i}$ is the vector in $V = F^{n}$ that corresponds to the $F [x]$ -module generator $\bar{1}$ in $F [x] / ⟨ a_{i} (x) ⟩$ . Since a full $F$ -vector space basis for $F [x] / ⟨ a_{i} (x) ⟩$ is $\{\bar{1}, \bar{x}, \dots, \bar{x}^{\deg (a_{i} (x)) - 1}\}$ , the corresponding $F$ -vector space basis for that invariant subspace of $V$ is $\{v_{i}, A v_{i}, \dots, A^{\deg (a_{i} (x)) - 1} v_{i}\}$ . Do this for each nonzero column of $P^{'}$ . Listing the vectors produced (in this order) as columns gives a desired change-of-basis matrix $P$ .
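The expansion of each generator into the vectors $v_{i}, A v_{i}, \dots$ can be sketched in a few lines of Python. This is only an illustration, not a full implementation: the helper names `mat_vec`, `krylov_columns` and `change_of_basis` are hypothetical, and the matrix and generators are borrowed from Example 1 below, where the nonzero columns of $P^{'}$ and the degrees of the invariant factors are already known.

```python
def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a column vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def krylov_columns(A, v, degree):
    """Return the vectors v, Av, ..., A^(degree-1) v as a list of columns."""
    cols = [list(v)]
    for _ in range(degree - 1):
        cols.append(mat_vec(A, cols[-1]))
    return cols

def change_of_basis(A, generators):
    """Build P from (v_i, deg a_i) pairs listed in invariant-factor order.

    The result is returned as a list of rows.
    """
    cols = []
    for v, degree in generators:
        cols.extend(krylov_columns(A, v, degree))
    return [[col[r] for col in cols] for r in range(len(A))]

# Data taken from Example 1: the matrix A, the nonzero columns of P',
# and the degrees of the invariant factors x - 2 and x^2 - 5x + 6.
A = [[2, -2, 14], [0, 3, -7], [0, 0, 2]]
generators = [([-7, 7, 1], 1), ([-1, 1, 0], 2)]

P = change_of_basis(A, generators)
for row in P:
    print(row)
```

The printed rows should agree with the matrix $P$ found by hand in Example 1.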

Warning

This auxiliary matrix $P^{'}$ is not quite unique. The nonzero columns of $P^{'}$ correspond to $F [x]$ -module generators of the invariant summands. Those summands are cyclic as $F [x]$ -modules, but a generator for each summand is only unique up to multiplication by a unit of the quotient ring $F [x] / ⟨ a_{i} (x) ⟩$ .

In particular, different sequences of elementary row/column operations (when computing the Smith normal form for $A$ ) can lead to slightly different auxiliary matrices. This will lead, in turn, to slightly different change-of-basis matrices. This is exactly the same situation that occurs when diagonalizing a (diagonalizable) matrix: each eigenbasis provides a suitable change-of-basis, but there is no unique eigenbasis.

Examples

Example 1: Following the general algorithm

Let $A$ be the $3 \times 3$ matrix

A = [\begin{matrix} 2 & - 2 & 14 \\ 0 & 3 & - 7 \\ 0 & 0 & 2 \end{matrix}]

The goal is to diagonalize the matrix $x I_{3} - A$ using row and column operations over the ring $\mathbb{Q} [x]$ . See pages 483-484 in Dummit & Foote for the actual row and column operations used to transform the matrix

x I_{3} - A = [\begin{matrix} x - 2 & 2 & - 14 \\ 0 & x - 3 & 7 \\ 0 & 0 & x - 2 \end{matrix}]

to the diagonal matrix

[\begin{matrix} 1 & 0 & 0 \\ 0 & x - 2 & 0 \\ 0 & 0 & x^{2} - 5 x + 6 \end{matrix}]

For quick reference, the operations used were (in order):

$R_{1} + R_{2} \mapsto R_{1}$
$C_{1} - C_{2} \mapsto C_{1}$
$- R_{1}$
$R_{2} + (x - 3) R_{1} \mapsto R_{2}$
$C_{2} + (x - 1) C_{1} \mapsto C_{2}$
$C_{3} - 7 C_{1} \mapsto C_{3}$
$- C_{2}$
$R_{2} - 7 R_{3} \mapsto R_{2}$
$R_{2} \leftrightarrow R_{3}$
$C_{2} \leftrightarrow C_{3}$
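The ten operations listed above can be replayed mechanically. The following is a minimal sketch, assuming polynomials are stored as coefficient lists with the lowest degree first (so $x - 2$ is `[-2, 1]` and zero is `[]`); all helper names are hypothetical and indices are 0-based, so $R_{1}$ is row `0`.

```python
def trim(p):
    """Drop trailing zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def row_add(M, i, j, p):
    """R_i + p(x) R_j -> R_i."""
    M[i] = [add(M[i][k], mul(p, M[j][k])) for k in range(len(M))]

def col_add(M, i, j, p):
    """C_i + p(x) C_j -> C_i."""
    for row in M:
        row[i] = add(row[i], mul(p, row[j]))

def row_scale(M, i, u):
    M[i] = [mul([u], entry) for entry in M[i]]

def col_scale(M, i, u):
    for row in M:
        row[i] = mul([u], row[i])

def row_swap(M, i, j):
    M[i], M[j] = M[j], M[i]

def col_swap(M, i, j):
    for row in M:
        row[i], row[j] = row[j], row[i]

# x I_3 - A for the matrix A of this example
M = [[[-2, 1], [2], [-14]],
     [[], [-3, 1], [7]],
     [[], [], [-2, 1]]]

row_add(M, 0, 1, [1])        # R1 + R2 -> R1
col_add(M, 0, 1, [-1])       # C1 - C2 -> C1
row_scale(M, 0, -1)          # -R1
row_add(M, 1, 0, [-3, 1])    # R2 + (x - 3) R1 -> R2
col_add(M, 1, 0, [-1, 1])    # C2 + (x - 1) C1 -> C2
col_add(M, 2, 0, [-7])       # C3 - 7 C1 -> C3
col_scale(M, 1, -1)          # -C2
row_add(M, 1, 2, [-7])       # R2 - 7 R3 -> R2
row_swap(M, 1, 2)            # R2 <-> R3
col_swap(M, 1, 2)            # C2 <-> C3

for row in M:
    print(row)
```

The diagonal entries printed at the end are the coefficient lists of $1$ , $x - 2$ and $x^{2} - 5 x + 6$ .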

By our general theory, it follows that the invariant factors of $A$ are $a_{1} (x) = x - 2$ and $a_{2} (x) = x^{2} - 5 x + 6$ . We can now conclude that the minimal polynomial of $A$ is $m_{A} (x) = x^{2} - 5 x + 6 = (x - 2) (x - 3)$ , the characteristic polynomial of $A$ is $c_{A} (x) = (x - 2) (x^{2} - 5 x + 6) = (x - 2)^{2} (x - 3)$ , and the rational canonical form of $A$ is

R = [\begin{matrix} 2 & 0 & 0 \\ 0 & 0 & - 6 \\ 0 & 1 & 5 \end{matrix}]
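Passing from the invariant factors to the matrix $R$ is mechanical: each invariant factor contributes one companion block. A small sketch, assuming the companion-matrix convention visible in $R$ above (ones on the subdiagonal, negated coefficients in the last column); the function names `companion` and `block_diagonal` are hypothetical.

```python
def companion(coeffs):
    """Companion matrix of the monic polynomial
    x^d + c_{d-1} x^{d-1} + ... + c_0, given coeffs = [c_0, ..., c_{d-1}]."""
    d = len(coeffs)
    C = [[0] * d for _ in range(d)]
    for i in range(1, d):
        C[i][i - 1] = 1           # ones on the subdiagonal
    for i in range(d):
        C[i][d - 1] = -coeffs[i]  # last column holds the negated coefficients
    return C

def block_diagonal(blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, entry in enumerate(row):
                M[offset + i][offset + j] = entry
        offset += len(b)
    return M

# Invariant factors of this example: x - 2 and x^2 - 5x + 6
R = block_diagonal([companion([-2]), companion([6, -5])])
for row in R:
    print(row)
```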

Moreover, if we keep track of the row operations used to diagonalize the matrix $x I_{3} - A$ , we can also compute a change-of-basis matrix $P$ such that $P^{- 1} A P$ is the rational canonical form matrix above. For a quick rundown of how this computation looks, first look at the row operations we used to compute the Smith normal form for $A$ , and then write down the corresponding column operations as described above. Starting from the identity matrix $I_{3}$ , we perform the following column operations:

$C_{2} - C_{1} \mapsto C_{2}$
$- C_{1}$
$C_{1} - (A - 3 I_{3}) C_{2} \mapsto C_{1}$
$C_{3} + 7 C_{2} \mapsto C_{3}$
$C_{2} \leftrightarrow C_{3}$

This sequence of column operations will yield the matrix

P^{'} = [\begin{matrix} 0 & - 7 & - 1 \\ 0 & 7 & 1 \\ 0 & 1 & 0 \end{matrix}]
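The five column operations can be replayed as follows. This is a sketch under a few assumptions: the matrix is stored as a list of columns, a polynomial $p (x)$ is a coefficient list with the lowest degree first, indices are 0-based, and the helper names are hypothetical. The caller supplies $u^{- 1}$ directly for a scaling step.

```python
def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def poly_at_A_times(p, A, v):
    """Compute p(A) v by Horner's rule without forming powers of A."""
    result = [0] * len(v)
    for c in reversed(p):
        result = [r + c * x for r, x in zip(mat_vec(A, result), v)]
    return result

def from_row_add(cols, A, i, j, p):
    """Row step R_i + p(x) R_j -> R_i becomes C_j - p(A) C_i -> C_j."""
    shift = poly_at_A_times(p, A, cols[i])
    cols[j] = [c - s for c, s in zip(cols[j], shift)]

def from_row_scale(cols, i, u_inv):
    """Row step 'multiply R_i by u' becomes 'multiply C_i by u^{-1}'."""
    cols[i] = [u_inv * c for c in cols[i]]

def from_row_swap(cols, i, j):
    cols[i], cols[j] = cols[j], cols[i]

A = [[2, -2, 14], [0, 3, -7], [0, 0, 2]]
cols = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # columns of I_3

from_row_add(cols, A, 0, 1, [1])       # R1 + R2 -> R1        gives C2 - C1 -> C2
from_row_scale(cols, 0, -1)            # -R1                  gives -C1
from_row_add(cols, A, 1, 0, [-3, 1])   # R2 + (x - 3) R1      gives C1 - (A - 3I) C2 -> C1
from_row_add(cols, A, 1, 2, [-7])      # R2 - 7 R3 -> R2      gives C3 + 7 C2 -> C3
from_row_swap(cols, 1, 2)              # R2 <-> R3            gives C2 <-> C3

P_prime = [[col[r] for col in cols] for r in range(3)]
for row in P_prime:
    print(row)
```

Only the row operations from the Smith normal form computation are translated; the column operations used there play no role in building $P^{'}$ .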

The first nonzero column of $P^{'}$ then gives $\deg (a_{1} (x)) = 1$ column of the matrix $P$ , namely the column

p_{1} = v_{1} = [\begin{matrix} - 7 \\ 7 \\ 1 \end{matrix}]

The second nonzero column of $P^{'}$ gives $\deg (a_{2} (x)) = 2$ columns of the matrix $P$ , namely the columns

p_{2} = v_{2} = [\begin{matrix} - 1 \\ 1 \\ 0 \end{matrix}], p_{3} = A v_{2} = [\begin{matrix} 2 & - 2 & 14 \\ 0 & 3 & - 7 \\ 0 & 0 & 2 \end{matrix}] [\begin{matrix} - 1 \\ 1 \\ 0 \end{matrix}] = [\begin{matrix} - 4 \\ 3 \\ 0 \end{matrix}]

Thus, a change-of-basis matrix $P$ is

P = [\begin{matrix} - 7 & - 1 & - 4 \\ 7 & 1 & 3 \\ 1 & 0 & 0 \end{matrix}]
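A quick sanity check on this answer: $P^{- 1} A P = R$ is equivalent to $A P = P R$ together with $P$ being invertible, which avoids computing an inverse. The sketch below uses hypothetical helpers `mat_mul` and `det3`, with $R$ the rational canonical form found earlier in this example.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[2, -2, 14], [0, 3, -7], [0, 0, 2]]
P = [[-7, -1, -4], [7, 1, 3], [1, 0, 0]]
R = [[2, 0, 0], [0, 0, -6], [0, 1, 5]]

print(det3(P) != 0)                   # P is invertible
print(mat_mul(A, P) == mat_mul(P, R)) # AP = PR
```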

Example 2: Shortcuts for small matrices

For small square matrices (sizes $3 \times 3$ and below), it's possible to compute the rational canonical form without going through the diagonalization process outlined above. For example, for the matrix $A$ above, we can first directly compute the characteristic polynomial of $A$ :

c_{A} (x) = \det (x I_{3} - A) = (x - 2)^{2} (x - 3) .

This immediately tells us that the product of the invariant factors of $A$ is $(x - 2)^{2} (x - 3)$ . By the divisibility condition on the invariant factors, and the fact that the largest invariant factor is the minimal polynomial $m_{A} (x)$ , we also know that $m_{A} (x)$ has the same roots as $c_{A} (x)$ . In this case, that means there are only two possibilities for $m_{A} (x)$ : it's either $(x - 2) (x - 3)$ or $(x - 2)^{2} (x - 3)$ . To determine which it is, simply recall that $m_{A} (x)$ is the smallest degree monic polynomial which evaluates at $A$ to zero. Then check that

(A - 2 I_{3}) (A - 3 I_{3}) = A^{2} - 5 A + 6 I_{3} = 0.
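Written out for the matrix $A$ of Example 1, the check is a single matrix product:

(A - 2 I_{3}) (A - 3 I_{3}) = [\begin{matrix} 0 & - 2 & 14 \\ 0 & 1 & - 7 \\ 0 & 0 & 0 \end{matrix}] [\begin{matrix} - 1 & - 2 & 14 \\ 0 & 0 & - 7 \\ 0 & 0 & - 1 \end{matrix}] = [\begin{matrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{matrix}]

The only entries that are not obviously zero are in the third column: the $(1, 3)$ entry is $0 \cdot 14 + (- 2) (- 7) + 14 \cdot (- 1) = 0$ and the $(2, 3)$ entry is $0 \cdot 14 + 1 \cdot (- 7) + (- 7) (- 1) = 0$ .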

Thus, $m_{A} (x) = (x - 2) (x - 3)$ .

Now observe that the invariant factors of $A$ are nonzero nonconstant monic polynomials $a_{1} (x), \dots, a_{m} (x)$ such that:

$a_{1} (x) ∣ a_{2} (x) ∣ \dots ∣ a_{m} (x)$
$a_{m} (x) = m_{A} (x) = (x - 2) (x - 3)$
$a_{1} (x) a_{2} (x) \dots a_{m} (x) = c_{A} (x) = (x - 2)^{2} (x - 3)$ .

There is only one such possible list, namely

a_{1} (x) = x - 2, a_{2} (x) = (x - 2) (x - 3)

We now know the invariant factors of $A$ and hence the rational canonical form of $A$ .

For $2 \times 2$ and $3 \times 3$ matrices, this method is generally the fastest way to determine the rational canonical form. However, it has two downsides:

It does not produce the change-of-basis matrix $P$ .
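It does not usually work for matrices larger than $3 \times 3$ , because the characteristic and minimal polynomials no longer determine the invariant factors in general.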
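The second downside can be made concrete by listing every chain of invariant factors compatible with a given characteristic and minimal polynomial. The sketch below assumes both polynomials are already factored into the same distinct irreducible factors and records only the exponents; the function names are hypothetical. Each invariant factor is reported as a tuple of exponents, one per irreducible factor.

```python
from itertools import product

def partitions_with_largest(total, largest):
    """Yield non-decreasing tuples of positive integers that sum to
    `total` and whose largest entry is exactly `largest`."""
    def parts(remaining, cap):
        # non-increasing sequences of parts <= cap summing to `remaining`
        if remaining == 0:
            yield ()
            return
        for first in range(min(cap, remaining), 0, -1):
            for rest in parts(remaining - first, first):
                yield (first,) + rest
    if largest < 1 or largest > total:
        return
    for rest in parts(total - largest, largest):
        yield tuple(reversed((largest,) + rest))

def invariant_factor_lists(char_exps, min_exps):
    """List all chains a_1 | a_2 | ... | a_m with a_m the minimal
    polynomial and product the characteristic polynomial."""
    per_factor = [list(partitions_with_largest(c, m))
                  for c, m in zip(char_exps, min_exps)]
    results = []
    for choice in product(*per_factor):
        r = max(len(p) for p in choice)
        padded = [(0,) * (r - len(p)) + p for p in choice]
        results.append([tuple(p[i] for p in padded) for i in range(r)])
    return results

# Example 2: c_A = (x-2)^2 (x-3), m_A = (x-2)(x-3); exponents on (x-2), (x-3)
print(invariant_factor_lists([2, 1], [1, 1]))

# A 4 x 4 case: c_A = (x-1)^4, m_A = (x-1)^2; exponents on (x-1)
print(invariant_factor_lists([4], [2]))
```

For Example 2 the search returns a single chain, with exponent tuples `(1, 0)` and `(1, 1)`, that is $x - 2$ and $(x - 2) (x - 3)$ . For the $4 \times 4$ data it returns two chains, so the characteristic and minimal polynomials alone cannot decide between them.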

Example 3: A larger matrix

Consider the $4 \times 4$ matrix

A = [\begin{matrix} 1 & 2 & - 4 & 4 \\ 2 & - 1 & 4 & - 8 \\ 1 & 0 & 1 & - 2 \\ 0 & 1 & - 2 & 3 \end{matrix}]

It is not difficult to show that the characteristic polynomial of $A$ is $c_{A} (x) = (x - 1)^{4}$ , which gives four possibilities for the minimal polynomial of $A$ , namely $m_{A} (x) = x - 1, (x - 1)^{2}, (x - 1)^{3}, (x - 1)^{4}$ . It is then not too terrible to verify that $m_{A} (x) = (x - 1)^{2}$ , simply by noting $A - I_{4} \neq 0$ and verifying $(A - I_{4})^{2} = 0$ . However, that leaves two possibilities for the invariant factors of $A$ :

x - 1, \; x - 1, \; (x - 1)^{2} \quad \text{or} \quad (x - 1)^{2}, \; (x - 1)^{2} .

On the other hand, diagonalizing the matrix $x I_{4} - A$ we eventually obtain

[\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & (x - 1)^{2} & 0 \\ 0 & 0 & 0 & (x - 1)^{2} \end{matrix}] .

Thus, the invariant factors of $A$ are $a_{1} (x) = (x - 1)^{2}$ and $a_{2} (x) = (x - 1)^{2}$ . It now follows that the rational canonical form of $A$ is

R = [\begin{matrix} 0 & - 1 & 0 & 0 \\ 1 & 2 & 0 & 0 \\ 0 & 0 & 0 & - 1 \\ 0 & 0 & 1 & 2 \end{matrix}] .

Moreover, using the actual row and column operations used to diagonalize $x I_{4} - A$ allows us to compute a change-of-basis matrix. One first finds an auxiliary matrix, $P^{'}$ , to be

P^{'} = [\begin{matrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{matrix}]

From this, we can now compute a change-of-basis matrix, $P$ , ultimately finding one to be

P = [\begin{matrix} 1 & 1 & 0 & 2 \\ 0 & 2 & 1 & - 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{matrix}]
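To see where these columns come from, note that the nonzero columns of $P^{'}$ are the standard basis vectors $v_{1} = e_{1}$ and $v_{2} = e_{2}$ (the notation $e_{i}$ is introduced here only for this check), and both invariant factors have degree $2$ . The columns of $P$ are therefore $v_{1}, A v_{1}, v_{2}, A v_{2}$ , where $A v_{1}$ and $A v_{2}$ are simply the first and second columns of $A$ :

A v_{1} = [\begin{matrix} 1 \\ 2 \\ 1 \\ 0 \end{matrix}], A v_{2} = [\begin{matrix} 2 \\ - 1 \\ 0 \\ 1 \end{matrix}]

As a consistency check, $(A - I_{4})^{2} = 0$ gives $A^{2} = 2 A - I_{4}$ , so $A (A v_{i}) = - v_{i} + 2 A v_{i}$ . The coefficients $- 1$ and $2$ are exactly the entries in the second column of each companion block of $R$ .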

See pages 485-486 of Dummit & Foote for the list of row and column operations used, and how they produce the matrix, $P$ , above.

Suggested next note

Jordan Canonical Form I - Definition

[1] To see a proof, check out Exercises 16-19 in Section 12.1 and Exercises 21-25 in Section 12.2 of Dummit & Foote.