After a good definition of the environment (\(\RR^n\)) and its objects (lines, planes, hyperplanes, etc.), the next mathematical step is to understand the functions that live in the environment and affect its objects. First, I need to generalize the simple notion of a function to linear spaces. Algebra and calculus are concerned with functions of real numbers. These functions are rules \(f: A \rightarrow B\) which go between subsets of the real numbers. The function \(f\) assigns to each number in \(A\) a unique number in \(B\text{.}\) These functions include the very familiar \(f(x) = x^2\text{,}\) \(f(x) = \sin (x)\text{,}\) \(f(x) = e^x\) and many others.
Definition 6.1.1.
Let \(A\) and \(B\) be subsets of \(\RR^n\) and \(\RR^m\text{,}\) respectively. A function between linear spaces is a rule \(f: A \rightarrow B\text{,}\) which assigns to each vector in \(A\) a unique vector in \(B\text{.}\)
Example 6.1.2.
I can define a function \(f: \RR^3 \rightarrow \RR^3\) by \(f\begin{pmatrix}x\\y\\z \end{pmatrix} =
\begin{pmatrix}x^2\\y^2\\z^2 \end{pmatrix}\text{.}\)
Example 6.1.3.
Another function \(f: \RR^3 \rightarrow \RR^2\) could be \(f\begin{pmatrix}x\\y\\z \end{pmatrix} =
\begin{pmatrix}x-y\\z-y \end{pmatrix}\text{.}\)
Definition 6.1.4.
A linear function or linear transformation from \(\RR^n\) to \(\RR^m\) is a function \(f: \RR^n \rightarrow \RR^m\) such that for any two vectors \(u,v \in \RR^n\) and any scalar \(a \in \RR\text{,}\) the function obeys two rules:
\[f(u+v) = f(u) + f(v)\]
\[f(au) = a f(u)\]
Often I will say that the function respects the two main operations on linear spaces: addition of vectors and multiplication by scalars. If I perform addition before or after the function, I get the same result. Likewise for scalar multiplication.
By inspection and testing, I could determine that Example 6.1.2 above fails these two rules, but Example 6.1.3 satisfies them. Therefore, the latter is a linear transformation but the former is not.
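The testing mentioned here can be sketched numerically. The following is a minimal Python sketch, not a proof: it spot-checks the addition rule \(f(u+v) = f(u) + f(v)\) and the scalar rule \(f(au) = af(u)\) on random inputs. The helper names (`looks_linear`, `add`, `scale`, `close`) are my own choices for illustration, and the floating-point tolerance is an assumption.

```python
import random

def f_squares(v):
    """Example 6.1.2: square each component of a vector in R^3."""
    x, y, z = v
    return (x * x, y * y, z * z)

def f_differences(v):
    """Example 6.1.3: send (x, y, z) to (x - y, z - y)."""
    x, y, z = v
    return (x - y, z - y)

def add(u, v):
    """Componentwise vector addition."""
    return tuple(a + b for a, b in zip(u, v))

def scale(a, v):
    """Multiplication of a vector by a scalar."""
    return tuple(a * c for c in v)

def close(u, v, tol=1e-9):
    """Compare two vectors up to a small floating-point tolerance (assumed)."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))

def looks_linear(f, dim, trials=100, seed=0):
    """Spot-check both rules on random inputs.

    A failure shows f is not linear; passing is only evidence, not a proof.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        u = tuple(rng.uniform(-5, 5) for _ in range(dim))
        v = tuple(rng.uniform(-5, 5) for _ in range(dim))
        a = rng.uniform(-5, 5)
        if not close(f(add(u, v)), add(f(u), f(v))):
            return False
        if not close(f(scale(a, u)), scale(a, f(u))):
            return False
    return True

print(looks_linear(f_squares, 3))      # False
print(looks_linear(f_differences, 3))  # True
```

A single explicit counterexample settles the first case by hand: with \(u = v = (1,0,0)\text{,}\) squaring after adding gives \((4,0,0)\text{,}\) while adding after squaring gives \((2,0,0)\text{.}\)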
This definition creates a restrictive but important class of functions. I could easily define linear algebra as the study of these transformations. The definition given so far is algebraic; however, there is also an elegant geometric description of these functions.
Proposition 6.1.5.
A function \(f: \RR^n \rightarrow \RR^m\) is linear if and only if it sends linear objects to linear objects.
Under a linear transformation, points, lines, planes, and so on are sent to other points, lines, planes, and so on. A line can’t be bent into an arc or broken into two different lines. Hopefully, some of the major ideas of the course are starting to fit together: the two basic operations of addition and scalar multiplication give rise to spans, which are flat objects. Linear transformations preserve those operations, so they preserve flat objects. Exactly how they change these objects can be tricky to determine.
Lastly, because linear transformations respect scalar multiplication, if I take \(a = 0\) and any vector \(v\text{,}\) I find that \(f(0) = f(0 \cdot v) = 0 \cdot f(v) = 0\text{.}\) Under a linear transformation, the origin is always sent to the origin. So, in addition to preserving flat objects, linear transformations can’t move the origin. I could drop this condition of preserving the origin to get another class of functions.
Definition 6.1.6.
An affine transformation from \(\RR^n\) to \(\RR^m\) is a transformation that preserves affine subspaces. These transformations preserve flat objects but may move the origin.
Though they are interesting, I don’t spend much time with affine transformations. They can always be realized as a linear transformation combined with a shift or displacement of the entire space by a fixed vector. Since shifts are relatively simple, I can usually reduce problems of affine transformations to problems of linear transformations.
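To make the reduction concrete, here is a small Python sketch. The linear map and the shift vector are hypothetical choices of mine, not taken from the text. The sketch builds an affine transformation as a linear map followed by a shift, then recovers the linear part by subtracting the image of the origin.

```python
def linear_part(v):
    """A hypothetical linear map R^2 -> R^2: (x, y) -> (2x + y, x - y)."""
    x, y = v
    return (2 * x + y, x - y)

# Fixed displacement vector, chosen arbitrarily for illustration.
SHIFT = (3, -1)

def affine(v):
    """Apply the linear map, then shift the whole space by SHIFT."""
    fx, fy = linear_part(v)
    return (fx + SHIFT[0], fy + SHIFT[1])

def recovered_linear(v):
    """Undo the shift by subtracting the image of the origin."""
    gx, gy = affine(v)
    ox, oy = affine((0, 0))
    return (gx - ox, gy - oy)

print(affine((0, 0)))            # (3, -1): the origin has moved
print(recovered_linear((1, 2)))  # (4, -1)
print(linear_part((1, 2)))       # (4, -1)
```

The image of the origin is exactly the shift vector, so subtracting it from every output leaves a function that fixes the origin and agrees with the linear part.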
Subsection 6.1.2 Composition
Composition is a very important idea for the study of real-valued functions in calculus. It likewise is very important for linear transformations.
Definition 6.1.7.
Let \(f : \RR^n \rightarrow \RR^m\) and \(g: \RR^m \rightarrow \RR^l\) be linear transformations. Then \(g \circ f : \RR^n \rightarrow \RR^l\) is the linear transformation formed by first applying \(f\) and then \(g\text{.}\) Note that the middle space \(\RR^m\) has to match: \(f\) outputs to \(\RR^m\text{,}\) which is the input space for \(g\text{.}\) Also note that the notation is read right-to-left: in \(g \circ f\text{,}\) the transformation \(f\) happens first, followed by \(g\text{.}\) This new transformation is called the composition of \(f\) and \(g\text{.}\)
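A short Python sketch can illustrate the order of application and the matching of dimensions. Here \(f\) is the map from Example 6.1.3, while \(g\) and the helper `compose` are hypothetical choices of mine for illustration.

```python
def f(v):
    """Example 6.1.3, a linear map R^3 -> R^2."""
    x, y, z = v
    return (x - y, z - y)

def g(w):
    """A hypothetical linear map R^2 -> R^2: (s, t) -> (s + t, 2s)."""
    s, t = w
    return (s + t, 2 * s)

def compose(outer, inner):
    """Return the function 'outer after inner': apply inner first, then outer."""
    return lambda v: outer(inner(v))

g_after_f = compose(g, f)  # a map R^3 -> R^2

print(f((1, 2, 3)))          # (-1, 1)
print(g_after_f((1, 2, 3)))  # (0, -2)
```

The other order, \(f \circ g\text{,}\) is not defined for this pair: \(g\) outputs vectors in \(\RR^2\text{,}\) but \(f\) expects inputs from \(\RR^3\text{.}\) In this sketch, calling `compose(f, g)((1, 2))` raises a `ValueError` when `f` tries to unpack three components from a two-component vector.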
Subsection 6.1.3 Symmetry
Notice that I’ve defined linear functions by the objects and properties they preserve. This is a very general technique in mathematics: very frequently, functions are classified by what they preserve. As discussed in the very first chapter, I use the word ‘symmetry’ to describe this perspective: the symmetries of a function are the objects or algebraic properties preserved by the function. A function exhibits more symmetry if it preserves more objects or more properties.
The conventional use of symmetry in English relates more to a shape than a function: what are the symmetries of a hexagon? I can connect the two ideas: asking for the symmetries of the hexagon can be thought of as asking for the transformations of \(\RR^2\) that preserve a hexagon. This is a bit of a reversal: the standard usage of the word talks about transformations as the symmetries of a shape. Here I start with a transformation and talk about the shape as a symmetry of the transformation: the hexagon is a symmetry of rotation by one-sixth of a full turn.
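The hexagon example can be checked numerically. The sketch below is a minimal illustration under my own assumptions: a regular hexagon centred at the origin with a vertex at \((1,0)\text{,}\) the standard rotation formula for \(\RR^2\text{,}\) and a small floating-point tolerance. It tests whether a rotation sends every vertex to some vertex. Since a rotation is linear, preserving the vertices is enough to preserve the whole hexagon.

```python
import math

def rotate(p, angle):
    """Rotate the point p in R^2 counterclockwise about the origin."""
    x, y = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

def same_point(p, q, tol=1e-9):
    """Compare two points up to a floating-point tolerance (assumed)."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= tol

# Vertices of a regular hexagon centred at the origin (assumed placement).
hexagon = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3))
           for k in range(6)]

def preserves_vertices(angle, points):
    """True if the rotation sends every listed point to some listed point."""
    rotated = [rotate(p, angle) for p in points]
    return all(any(same_point(r, q) for q in points) for r in rotated)

sixth_turn = 2 * math.pi / 6
quarter_turn = 2 * math.pi / 4

print(preserves_vertices(sixth_turn, hexagon))    # True
print(preserves_vertices(quarter_turn, hexagon))  # False
```

In the language of this subsection, the hexagon is a symmetry of the sixth-turn rotation but not of the quarter-turn rotation.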