2×2 matrices

First, we observe the following formula for the square of a 2×2 matrix over a commutative ring R:

Theorem: Let

\mathbf{A} = \left( \begin{matrix} x & y \\ z & w \end{matrix} \right)

be a 2×2 matrix over a commutative ring R. Define t = \mathrm{tr}( \mathbf{A} ) = x + w , and d = \det( \mathbf{A} ) = xw - yz . Then we have:

\mathbf{A} ^2 = \left( \begin{matrix} tx - d & ty \\ tz & tw - d \end{matrix} \right)

This gives rise to a nice identity:

\mathrm{tr}( \mathbf{A} ^2) = \big[ \mathrm{tr}( \mathbf{A} ) \big]^2 - 2 \det( \mathbf{A} )
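Both the squaring formula and this trace identity are easy to check symbolically; a quick sketch, assuming Python's sympy library is available:

```python
# Symbolic check of A^2 = [[tx-d, ty], [tz, tw-d]] and tr(A^2) = t^2 - 2d.
import sympy as sp

x, y, z, w = sp.symbols('x y z w')
A = sp.Matrix([[x, y], [z, w]])
t = A.trace()   # t = x + w
d = A.det()     # d = x*w - y*z

# The claimed form of A^2:
claimed = sp.Matrix([[t*x - d, t*y], [t*z, t*w - d]])
assert (A**2 - claimed).expand() == sp.zeros(2, 2)

# The trace identity tr(A^2) = t^2 - 2d:
assert sp.expand((A**2).trace() - (t**2 - 2*d)) == 0
```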

Using this formula, several classes of 2×2 matrices can be characterised (apart from a few trivial exceptions) by their trace and determinant:

Theorem: Call a matrix \mathbf{A} k-potent if \mathbf{A} ^2 = k \mathbf{A} , for some k \in R. Any matrix with \det( \mathbf{A} ) = 0 and \mathrm{tr}( \mathbf{A} ) = k is then k-potent; over a field, the k-potent 2×2 matrices are exactly these, together with the scalar matrices \mathbf{0} and k \mathbf{I} . Whenever the upper-right entry b is invertible, they can be parametrised by:

\left( \begin{matrix} ab & b \\ a(k-ab) & k-ab \end{matrix} \right)
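A sketch (again assuming sympy) verifying that this parametrisation has trace k, determinant 0, and hence satisfies \mathbf{A}^2 = k \mathbf{A} :

```python
# Check the k-potent parametrisation [[ab, b], [a(k-ab), k-ab]].
import sympy as sp

a, b, k = sp.symbols('a b k')
M = sp.Matrix([[a*b, b], [a*(k - a*b), k - a*b]])

assert sp.expand(M.trace() - k) == 0          # tr(M) = k
assert sp.expand(M.det()) == 0                # det(M) = 0
assert (M**2 - k*M).expand() == sp.zeros(2, 2)  # M^2 = k*M
```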

Corollary: The idempotent (i.e. 1-potent) 2×2 matrices are those with \det( \mathbf{A} ) = 0, \mathrm{tr}( \mathbf{A} ) = 1, together with the zero matrix \mathbf{0} and the identity matrix \mathbf{I} .

Corollary: The nilpotent (i.e. 0-potent) 2×2 matrices are exactly those with \det( \mathbf{A} ) = 0 and \mathrm{tr}( \mathbf{A} ) = 0 (a family which already contains the zero matrix \mathbf{0} ).

Theorem: Recall that a matrix \mathbf{A} has order n if n \in \mathbb{N} is the smallest positive natural number such that \mathbf{A}^n = \mathbf{I} . Then, over a field, the involutory 2×2 matrices (i.e. order 2 matrices) are exactly those having \det( \mathbf{A} ) = -1, \mathrm{tr}( \mathbf{A} ) = 0, along with - \mathbf{I} . Whenever the entry b is invertible, they can be parametrised by:

\left( \begin{matrix} a & b \\ (1-a^2)/b & -a \end{matrix} \right)
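Indeed, solving \det( \mathbf{A} ) = -1 with \mathrm{tr}( \mathbf{A} ) = 0 for the bottom-left entry (assuming b invertible) gives c = (1 - a^2)/b, and sympy confirms that such a matrix squares to the identity:

```python
# Check the involutory parametrisation: trace 0, determinant -1, M^2 = I.
import sympy as sp

a, b = sp.symbols('a b', nonzero=True)
M = sp.Matrix([[a, b], [(1 - a**2)/b, -a]])

assert M.trace() == 0
assert sp.simplify(M.det() + 1) == 0                 # det(M) = -1
assert sp.simplify(M**2 - sp.eye(2)) == sp.zeros(2, 2)  # M^2 = I
```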


We can write the earlier formula for \mathbf{A} ^2 more succinctly as \mathbf{A}^2 = t \mathbf{A} - d \mathbf{I} , which is precisely the Cayley-Hamilton theorem for 2×2 matrices. Using this, we observe that \mathbf{A} ^3 can also be written as a linear combination of \mathbf{A} and \mathbf{I} [in this case, \mathbf{A}^3 = (t^2 - d) \mathbf{A} - dt\, \mathbf{I} ].
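Both reductions can be checked symbolically; a short sketch with sympy:

```python
# Check A^2 = t*A - d*I and A^3 = (t^2 - d)*A - d*t*I symbolically.
import sympy as sp

x, y, z, w = sp.symbols('x y z w')
A = sp.Matrix([[x, y], [z, w]])
t, d = A.trace(), A.det()

assert (A**2 - (t*A - d*sp.eye(2))).expand() == sp.zeros(2, 2)
assert (A**3 - ((t**2 - d)*A - d*t*sp.eye(2))).expand() == sp.zeros(2, 2)
```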

Generalising further, if we write \mathbf{A}^n = x_n \mathbf{A} + y_n \mathbf{I} , we can use the formula for \mathbf{A}^2 to obtain the recursion

x_{n+1} = tx_n + y_n \\ y_{n+1} = -dx_n

which simplifies by substitution to

x_{n+1} = tx_n - dx_{n-1} \\ y_{n+1} = -dx_n

x_n then has the form of a Lucas sequence, with initial values x_0 = 0 , x_1 = 1 . Therefore, we have that

x_n = U_n(t,d) \\ y_n = -d\, U_{n-1}(t,d)

where U_n(P,Q) is the Lucas sequence of the first kind. Thus, we can get an explicit formula for \mathbf{A}^n in terms of the Lucas sequences U_n(P,Q) :

\mathbf{A}^n = U_n(t,d) \mathbf{A} - d\, U_{n-1}(t,d) \mathbf{I}
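A small numerical sketch of this formula, with U_n computed directly from its recurrence U_n = P\, U_{n-1} - Q\, U_{n-2} (the test matrix here is an arbitrary choice):

```python
# Verify A^n = U_n(t,d)*A - d*U_{n-1}(t,d)*I for a sample integer matrix.
import sympy as sp

def U(n, P, Q):
    """Lucas sequence of the first kind: U_0 = 0, U_1 = 1,
    U_n = P*U_{n-1} - Q*U_{n-2}."""
    u0, u1 = 0, 1
    for _ in range(n):
        u0, u1 = u1, P*u1 - Q*u0
    return u0

A = sp.Matrix([[1, 2], [3, 4]])   # arbitrary test matrix
t, d = A.trace(), A.det()

for n in range(1, 8):
    assert A**n == U(n, t, d)*A - d*U(n-1, t, d)*sp.eye(2)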

and thus generalise our trace identity from before:

\mathrm{tr}(\mathbf{A}^n) = t\, U_n(t,d) - 2d\, U_{n-1}(t,d) = V_n \big( \mathrm{tr}(\mathbf{A}) , \det(\mathbf{A}) \big)
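The trace formula can likewise be spot-checked numerically, with V_n computed from its recurrence V_0 = 2, V_1 = P, V_n = P\, V_{n-1} - Q\, V_{n-2} (again on an arbitrary test matrix, assuming sympy):

```python
# Verify tr(A^n) = V_n(t, d) for a sample integer matrix.
import sympy as sp

def V(n, P, Q):
    """Lucas sequence of the second kind: V_0 = 2, V_1 = P,
    V_n = P*V_{n-1} - Q*V_{n-2}."""
    v0, v1 = 2, P
    for _ in range(n):
        v0, v1 = v1, P*v1 - Q*v0
    return v0

A = sp.Matrix([[2, -1], [5, 3]])  # arbitrary test matrix
t, d = A.trace(), A.det()

for n in range(1, 8):
    assert (A**n).trace() == V(n, t, d)
```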

where V_n(P,Q) is the Lucas sequence of the second kind. Furthermore, applying the identity

\det( x\mathbf{A} + y \mathbf{I} ) = x^2 \det( \mathbf{A} ) + xy\, \mathrm{tr}( \mathbf{A} ) + y^2
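This determinant identity is a straightforward expansion; a symbolic check with sympy:

```python
# Check det(x*A + y*I) = x^2*det(A) + x*y*tr(A) + y^2.
import sympy as sp

a, b, c, e, x, y = sp.symbols('a b c e x y')
A = sp.Matrix([[a, b], [c, e]])

lhs = (x*A + y*sp.eye(2)).det()
rhs = x**2*A.det() + x*y*A.trace() + y**2
assert sp.expand(lhs - rhs) == 0
```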

to our earlier formula for \mathbf{A}^n , and using the fact that \det is multiplicative, we can derive the following identity for U_n :

U_n^2 - U_{n-1} U_{n+1} = Q^{n-1}
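This is the classical companion identity for Lucas sequences of the first kind, and it can be confirmed symbolically for small n (a sketch, assuming sympy):

```python
# Check U_n^2 - U_{n-1}*U_{n+1} = Q^(n-1) symbolically in P and Q.
import sympy as sp

P, Q = sp.symbols('P Q')

def U(n, P, Q):
    """Lucas sequence of the first kind: U_0 = 0, U_1 = 1,
    U_n = P*U_{n-1} - Q*U_{n-2}."""
    u0, u1 = 0, 1
    for _ in range(n):
        u0, u1 = u1, sp.expand(P*u1 - Q*u0)
    return u0

for n in range(1, 7):
    lhs = U(n, P, Q)**2 - U(n-1, P, Q)*U(n+1, P, Q)
    assert sp.expand(lhs - Q**(n-1)) == 0
```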


It would be interesting to see how much of this generalises to higher dimensional matrices.