You have now visited the fundamental ideas of matrices and linear transformations. Why are these two being taught together? The answer is simple: they are two views of the same thing!
Matrices are a time-tested, powerful tool for performing the calculations related to linear transformations. The essential point of this section is that a matrix is a linear transformation, expressed in a convenient form. Matrices are also a data structure whose operations can be easily parallelised (i.e., many arithmetic operations can be performed in parallel on a computer), so the computations needed to solve linear equations execute quickly.
As an example, let's consider the following system of linear equations.
5*x1 + 4*x2 + 8*x3 = 10
3*x1 + 9*x3 = 8
3*x1 + 4*x2 + 2*x3 = 5
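The goal is to find values of (x1, x2, x3) that satisfy all three equations at once. Matrices give us a very nifty way to express these equations. First, let us place all the coefficients of x1, x2 and x3 in a matrix A (the second equation has no x2 term, so that coefficient is 0):
A = [[5, 4, 8],
     [3, 0, 9],
     [3, 4, 2]]
Now, x becomes the following column vector:
x = [[x1],
     [x2],
     [x3]]
The right hand side is the output vector b:
b = [[10],
     [8],
     [5]]
This has given us an easy way to express the whole system of equations:
Ax = b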
Thus, a system of linear equations can be expressed as a linear transformation (a matrix) applied to an input vector (the x values) to get an output vector.
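The matrix form of the system can be checked with a short NumPy sketch. It assumes the coefficients are read off the three equations above (with 0 for the missing x2 term), and `x_trial` is a hypothetical input chosen only for illustration, not a solution of the system.

```python
import numpy as np

# Coefficient matrix: one row per equation, one column per unknown.
# The second equation has no x2 term, so that coefficient is 0.
A = np.array([[5, 4, 8],
              [3, 0, 9],
              [3, 4, 2]])

# Right-hand sides of the three equations.
b = np.array([10, 8, 5])

# Hypothetical trial values for (x1, x2, x3), for illustration only.
x_trial = np.array([1, 1, 1])

# Each entry of A @ x_trial is the left-hand side of one equation.
lhs = A @ x_trial
print(lhs)

# The trial vector solves the system only if every entry matches b.
print(np.array_equal(lhs, b))
```

Applying the matrix to a candidate vector evaluates all three left-hand sides in one operation, which is exactly what the compact form Ax = b expresses.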
Geometrically, a linear transformation can be interpreted as a distortion of the n-dimensional space in which the transformation is operating. Under this distortion, lines that are parallel before it remain parallel after it.
Let's consider a linear transformation where the basis vectors move to the points (2, 0) and (2, 3).
(Recall that the original basis vectors i and j are present at (1, 0) and (0, 1) respectively.)
This means that i moves to (2, 0) and j moves to (2, 3) as part of the linear transformation. Now, we know that linear transformations are expressed as matrices. To express a linear transformation in 2D space, we can simply place the new positions of the two basis vectors in the two columns of a matrix. Hence, the transformation expressed by the movement of these two vectors is given by
[[2, 2],
 [0, 3]]
Applying a linear transformation to a vector is equivalent to finding out where that vector moves to when space is distorted according to the linear transformation.
We now know that a matrix is a convenient way to express a linear transformation. Let's now revisit some of the operations we previously learnt, with the new knowledge that they are operations performed on linear transformations.
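As a minimal sketch of this idea, the transformation whose basis vectors land on (2, 0) and (2, 3) can be built column by column and applied to a vector. The names `i_hat`, `j_hat` and the sample vector `v` are chosen here for illustration.

```python
import numpy as np

# Columns hold the new positions of the basis vectors:
# (1, 0) moves to (2, 0) and (0, 1) moves to (2, 3).
M = np.array([[2, 2],
              [0, 3]])

i_hat = np.array([1, 0])
j_hat = np.array([0, 1])

# Applying M to each basis vector returns the matching column of M.
print(M @ i_hat, M @ j_hat)

# Hypothetical vector v = 1*i_hat + 2*j_hat.
v = np.array([1, 2])
moved = M @ v

# The same result as taking 1 times the first column plus 2 times the second.
by_columns = 1 * M[:, 0] + 2 * M[:, 1]
print(moved, by_columns)
```

The vector keeps the same coordinates relative to the basis vectors; only the basis vectors themselves have moved, which is why the column combination and the matrix product agree.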
Dot Product
Intuitively, a dot product can be understood in the following way:
Let's say we have a matrix A and an input vector 'x'. The dot product dot(A, x) is the output vector that results when the linear transformation A is applied to the vector 'x'.
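A small sketch of dot(A, x) using NumPy's `np.dot`, assuming the 3x3 coefficient matrix from the earlier system and a hypothetical input vector. It also shows that each entry of the output is the dot product of one row of A with x.

```python
import numpy as np

A = np.array([[5, 4, 8],
              [3, 0, 9],
              [3, 4, 2]])

# Hypothetical input vector, for illustration only.
x = np.array([1, 0, 2])

# Output vector of the transformation A applied to x.
out = np.dot(A, x)

# Entry k of the output is the dot product of row k of A with x.
row_by_row = np.array([np.dot(row, x) for row in A])

print(out)
print(np.array_equal(out, row_by_row))
```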
Cross Product
Let's say we have two matrices, A and B. The cross product A X B will yield another matrix C. (Here, "cross product" means the matrix product of A and B, not the cross product of two 3D vectors.)
Intuitively, a cross product can be understood in the following way:
The cross product of A and B is the composition of the linear transformations expressed by matrices B and A, in that order. Let's say we have an input vector 'x'. We first apply the transformation B on x to find B * x (the * operator here signifies a cross product. This same notation works in NumPy too: when you create NumPy matrices using numpy.matrix(), you can cross-multiply them using this notation).
To the resultant vector, we then apply the transformation A. We then arrive at A * (B * x).
This whole process can be expressed as a single step, i.e. applying a single linear transformation expressed by the matrix C = A * B, which is the composition of transformations A and B, so that C * x = A * (B * x).
This also brings another point to our notice: matrix cross multiplication is associative, i.e. (A * B) * x = A * (B * x).
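The composition can be sketched in NumPy as follows. This assumes plain `np.array` objects with the `@` operator rather than `numpy.matrix` (which the NumPy documentation no longer recommends). A is the 2D transformation from earlier in this section, while B and x are hypothetical values chosen for illustration.

```python
import numpy as np

# A is the 2D transformation from this section; B is a hypothetical
# 90-degree counter-clockwise rotation chosen only for illustration.
A = np.array([[2, 2],
              [0, 3]])
B = np.array([[0, -1],
              [1, 0]])
x = np.array([1, 1])

# Two steps: apply B first, then apply A to the result.
step_by_step = A @ (B @ x)

# One step: compose the transformations into C, then apply C.
C = A @ B
single_step = C @ x

print(step_by_step, single_step)

# The order of composition matters: B @ A is a different transformation.
print(np.array_equal(A @ B, B @ A))
```

Both routes give the same output vector, which is the associativity described above. Swapping the order of A and B changes the result, which is why the text specifies "B and A, in that order".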
Inverse of a Matrix
If the inverse of a matrix is known, then we can very easily solve the system of linear equations. Let's say A is an invertible matrix representing a linear transformation, and 'x' is an input vector. Let's say the output vector is 'b', so that Ax = b. Then, by the definition of an inverse of a matrix,
x = A^(-1) b
Intuitively, this means that the inverse of A is the transformation you can apply to the output vector to get back the input vector. Thus, all you need to do is multiply 'b' by the inverse, and you can find your 'x' values.
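A minimal sketch of recovering the input from the output, assuming the invertible 2D matrix from the geometric example and a hypothetical input `x_true`. The 3x3 coefficient matrix from the earlier system is not used here because its determinant works out to 0, so it has no inverse.

```python
import numpy as np

# The 2D transformation from the geometric example (determinant 6, so invertible).
A = np.array([[2.0, 2.0],
              [0.0, 3.0]])

# Hypothetical input vector, for illustration only.
x_true = np.array([1.0, 2.0])

# Output vector produced by the transformation.
b = A @ x_true

# Apply the inverse transformation to the output to get the input back.
A_inv = np.linalg.inv(A)
x_recovered = A_inv @ b
print(np.allclose(x_recovered, x_true))

# Applying A and then its inverse leaves every vector where it started.
print(np.allclose(A_inv @ A, np.eye(2)))

# np.linalg.solve finds x directly without forming the inverse explicitly.
x_solved = np.linalg.solve(A, b)
print(np.allclose(x_solved, x_true))
```

In practice `np.linalg.solve` is often preferred over forming the inverse explicitly, but the inverse makes the "undo the transformation" intuition visible.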
You could try this quick exercise: Sketch out a quick grid on paper. Then, draw the four original points on the grid and make a rectangle. Then, draw the output points on the same grid and make a rectangle using dotted lines. Does this help you visualise the transformation better?
What happens when certain vectors can be produced by a linear combination of certain other vectors? In the next section, you will learn about the concept of linear dependence.