
Section 6.2 Matrices and Transformations

Subsection 6.2.1 Matrix Representation of Linear Transformations

Now I come to the second major application of matrices. In addition to succinctly encoding linear systems, matrices can also be used very efficiently to encode linear transformations. This is done by defining how a matrix can act on a vector.

Definition 6.2.1.

Let \(A = (a_{ij})\) be an \(m \times n\) matrix and let \(v\) be a vector in \(\RR^n\text{.}\) There is an action of \(A\) on \(v\text{,}\) written \(Av\text{,}\) which defines a new vector in \(\RR^m\text{.}\) That action is given by the following formula.
\begin{equation*} \begin{pmatrix} a_{11} \amp a_{12} \amp \cdots \amp a_{1n} \\ a_{21} \amp a_{22} \amp \cdots \amp a_{2n} \\ \vdots \amp \vdots \amp \ddots \amp \vdots \\ a_{m1} \amp a_{m2} \amp \cdots \amp a_{mn} \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} a_{11}v_1 + a_{12}v_2 + \ldots + a_{1n}v_n \\ a_{21}v_1 + a_{22}v_2 + \ldots + a_{2n}v_n \\ \vdots \\ a_{m1}v_1 + a_{m2}v_2 + \ldots + a_{mn}v_n \end{pmatrix} \end{equation*}
This is a bit cumbersome to work out in general. Let me show what it looks like slightly more concretely in \(\RR^2\) and \(\RR^3\text{.}\)
\begin{align*} \begin{pmatrix} a \amp b \\ c \amp d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \amp = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}\\ \begin{pmatrix} a \amp b \amp c \\ d \amp e \amp f \\ g \amp h \amp i \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \amp = \begin{pmatrix} ax + by + cz \\ dx + ey + fz \\ gx + hy + iz \end{pmatrix} \end{align*}
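To make the formula concrete, here is one small numerical check, with entries chosen arbitrarily just for illustration: a \(2 \times 3\) matrix acting on a vector in \(\RR^3\) produces a vector in \(\RR^2\text{.}\)
\begin{equation*} \begin{pmatrix} 1 \amp 0 \amp 2 \\ -1 \amp 3 \amp 1 \end{pmatrix} \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} (1)(2) + (0)(1) + (2)(1) \\ (-1)(2) + (3)(1) + (1)(1) \end{pmatrix} = \begin{pmatrix} 4 \\ 2 \end{pmatrix} \end{equation*}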
In this way, every \(m \times n\) matrix determines a method of sending vectors in \(\RR^n\) to \(\RR^m\text{:}\) a function \(\RR^n \rightarrow \RR^m\text{.}\) It is not at all obvious from the definition, but matrices completely describe all linear transformations.
Because of this correspondence, the set of linear transformations \(\RR^n \rightarrow \RR^m\) is exactly the same as the set of \(m \times n\) matrices. This is a very powerful result: in order to understand linear transformations of Euclidean space, I only have to understand matrices and their properties.
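One way to get a feel for this correspondence (a quick observation from the definition above, not a proof) is to act on a standard basis vector. The action of a \(2 \times 2\) matrix on the vector \((1,0)\) simply picks out the first column of the matrix, and similarly for the other basis vectors, so the matrix can be recovered, column by column, from what the transformation does to the standard basis.
\begin{equation*} \begin{pmatrix} a \amp b \\ c \amp d \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} (a)(1) + (b)(0) \\ (c)(1) + (d)(0) \end{pmatrix} = \begin{pmatrix} a \\ c \end{pmatrix} \end{equation*}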
I’ll start with some important examples, using \(3 \times 3\) matrices for transformations from \(\RR^3\) to itself.

Example 6.2.3.

Let me interpret the matrix action of the zero matrix (all coefficients are zero).
\begin{equation*} \begin{pmatrix} 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0x + 0y + 0z \\ 0x + 0y + 0z \\ 0x + 0y + 0z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \end{equation*}
The zero matrix corresponds to the transformation that sends all vectors to the origin.

Example 6.2.4.

Let me also interpret the matrix action of the identity matrix.
\begin{equation*} \begin{pmatrix} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1x + 0y + 0z \\ 0x + 1y + 0z \\ 0x + 0y + 1z \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} \end{equation*}
The identity matrix corresponds to the transformation which doesn’t change anything. Appropriately, I called this the identity transformation.

Example 6.2.5.

I can also interpret a more general diagonal matrix. Let \(a,b,c \in \RR\) be scalars.
\begin{equation*} \begin{pmatrix} a \amp 0 \amp 0 \\ 0 \amp b \amp 0 \\ 0 \amp 0 \amp c \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} ax + 0y + 0z \\ 0x + by + 0z \\ 0x + 0y + cz \end{pmatrix} = \begin{pmatrix} ax \\ by \\ cz \end{pmatrix} \end{equation*}
This is a dilation: the \(x\) direction is stretched by the factor \(a\text{,}\) the \(y\) direction by the factor \(b\text{,}\) and the \(z\) direction by the factor \(c\text{.}\) (If any of the three constants is zero, that entire axis direction is collapsed.) Diagonal matrices are dilations in the axis directions, with the possibility of completely collapsing an axis as well.

Subsection 6.2.2 Composition and Matrix Multiplication

In Subsection 6.1.2, I defined the composition of linear transformations. Composition allows me to combine transformations. Since matrices represent transformations, this composition should somehow be accounted for in the matrix representation. If \(A\) is the matrix of the transformation \(S\) and \(B\) is the matrix of the transformation \(T\text{,}\) what is the matrix of \(S \circ T\text{?}\) The answer is given by matrix multiplication.

Definition 6.2.6.

Let \(A\) be a \(k \times m\) matrix and \(B\) an \(m \times n\) matrix. I can think of the rows of \(A\) as vectors in \(\RR^m\text{,}\) and the columns of \(B\) as vectors in \(\RR^m\) as well. To emphasize this perspective, I write the following, using \(u_i\) for the rows of \(A\) and \(v_i\) for the columns of \(B\text{.}\)
\begin{align*} \amp A = \begin{pmatrix} \rightarrow \amp u_1 \amp \rightarrow \\ \rightarrow \amp u_2 \amp \rightarrow \\ \rightarrow \amp u_3 \amp \rightarrow \\ \vdots \amp \vdots \amp \vdots \\ \rightarrow \amp u_k \amp \rightarrow \end{pmatrix} \amp \amp B = \begin{pmatrix} \downarrow \amp \downarrow \amp \downarrow \amp \ldots \amp \downarrow \\ v_1 \amp v_2 \amp v_3 \amp \ldots \amp v_n \\ \downarrow \amp \downarrow \amp \downarrow \amp \ldots \amp \downarrow \end{pmatrix} \end{align*}
With this notation, the matrix multiplication of \(A\) and \(B\) is the \(k \times n\) matrix whose entries are the dot products of the rows of \(A\) with the columns of \(B\text{.}\)
\begin{equation*} AB = \begin{pmatrix} u_1 \cdot v_1 \amp u_1 \cdot v_2 \amp u_1 \cdot v_3 \amp \ldots \amp u_1 \cdot v_n \\ u_2 \cdot v_1 \amp u_2 \cdot v_2 \amp u_2 \cdot v_3 \amp \ldots \amp u_2 \cdot v_n \\ u_3 \cdot v_1 \amp u_3 \cdot v_2 \amp u_3 \cdot v_3 \amp \ldots \amp u_3 \cdot v_n \\ \vdots \amp \vdots \amp \vdots \amp \ldots \amp \vdots \\ u_k \cdot v_1 \amp u_k \cdot v_2 \amp u_k \cdot v_3 \amp \ldots \amp u_k \cdot v_n \end{pmatrix} \end{equation*}
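Here is a small numerical illustration of the definition, with entries chosen arbitrarily: a \(2 \times 3\) matrix \(A\) multiplied by a \(3 \times 2\) matrix \(B\text{,}\) giving a \(2 \times 2\) matrix \(AB\text{.}\) Each entry is the dot product of a row of \(A\) with a column of \(B\text{.}\)
\begin{equation*} AB = \begin{pmatrix} 1 \amp 0 \amp 2 \\ -1 \amp 3 \amp 1 \end{pmatrix} \begin{pmatrix} 4 \amp 1 \\ 2 \amp 0 \\ 1 \amp -1 \end{pmatrix} = \begin{pmatrix} (1)(4) + (0)(2) + (2)(1) \amp (1)(1) + (0)(0) + (2)(-1) \\ (-1)(4) + (3)(2) + (1)(1) \amp (-1)(1) + (3)(0) + (1)(-1) \end{pmatrix} = \begin{pmatrix} 6 \amp -1 \\ 3 \amp -2 \end{pmatrix} \end{equation*}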
This operation has the desired property: the product of the matrices represents the composition of the transformations. (This remarkable fact is presented here without proof; I’ll leave it to you to wonder why this weird combination of dot products has the desired geometric interpretation.) Remember that the composition still works from right to left, so that the matrix multiplication \(AB\) represents the transformation associated with \(B\) first, followed by the transformation associated with \(A\text{.}\) When I defined matrices acting on vectors, I wrote the action on the right: \(Av\text{.}\) Now when a composition acts, as in \(ABv\text{,}\) the closest matrix gets to act first.
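Continuing the numerical illustration above, I can check this composition claim on a specific vector, say \(v = (1,2)\text{:}\) applying \(B\) first and then \(A\) gives the same result as acting by the product \(AB\) all at once.
\begin{align*} A(Bv) \amp = \begin{pmatrix} 1 \amp 0 \amp 2 \\ -1 \amp 3 \amp 1 \end{pmatrix} \left( \begin{pmatrix} 4 \amp 1 \\ 2 \amp 0 \\ 1 \amp -1 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} \right) = \begin{pmatrix} 1 \amp 0 \amp 2 \\ -1 \amp 3 \amp 1 \end{pmatrix} \begin{pmatrix} 6 \\ 2 \\ -1 \end{pmatrix} = \begin{pmatrix} 4 \\ -1 \end{pmatrix}\\ (AB)v \amp = \begin{pmatrix} 6 \amp -1 \\ 3 \amp -2 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 4 \\ -1 \end{pmatrix} \end{align*}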
I have defined a new algebraic operation. As with the new products for vectors (dot and cross), I want to know the properties of this new operation. The most important properties are the following: matrix multiplication is associative, so \((AB)C = A(BC)\) whenever the sizes match; it distributes over matrix addition, so \(A(B+C) = AB + AC\) and \((A+B)C = AC + BC\text{;}\) it is compatible with scalar multiplication, so \((cA)B = A(cB) = c(AB)\text{;}\) and the identity matrix is a multiplicative identity, so \(AI = IA = A\text{.}\)
Note that commutativity is not on this list. In general, \(AB \neq BA\text{.}\) In fact, the sizes may not even allow both orders: if \(A\) is \(k \times m\) and \(B\) is \(m \times n\) with \(k \neq n\text{,}\) then \(AB\) is defined but \(BA\) is not, since the sizes no longer match. Not only am I unable to exchange the order of matrix multiplication, sometimes that multiplication doesn’t even make sense as an operation. Matrix multiplication is a very important example of a non-commutative product.
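As a small concrete check of non-commutativity, here are two \(2 \times 2\) matrices (chosen just for illustration) whose products in the two orders are different.
\begin{align*} \begin{pmatrix} 1 \amp 1 \\ 0 \amp 1 \end{pmatrix} \begin{pmatrix} 1 \amp 0 \\ 1 \amp 1 \end{pmatrix} \amp = \begin{pmatrix} 2 \amp 1 \\ 1 \amp 1 \end{pmatrix} \amp \begin{pmatrix} 1 \amp 0 \\ 1 \amp 1 \end{pmatrix} \begin{pmatrix} 1 \amp 1 \\ 0 \amp 1 \end{pmatrix} \amp = \begin{pmatrix} 1 \amp 1 \\ 1 \amp 2 \end{pmatrix} \end{align*}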