In machine learning, we often encounter problems that change with time. If x is a vector (perhaps an input vector) and A is a square matrix (perhaps a linear transformation applied to the input), the fundamental equation we are interested in is

Ax = λx

This type of equation cannot be solved by the ordinary methods of elimination, because the unknown λ multiplies the unknown vector x. We need some special apparatus to tackle this sort of problem.
Let's consider a matrix A. Almost all vectors in space change direction when they are multiplied by A. However, certain exceptional vectors x remain in the same direction when multiplied by A. That is, Ax points in the same direction as x. These vectors are called "eigenvectors". Multiply an eigenvector by A, and the vector Ax is a number λ times the original x.
If eigenvector x is multiplied by matrix A, we get

Ax = λx

where λ is known as the 'eigenvalue' of A corresponding to the eigenvector x.
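As a quick numerical sanity check of the fundamental equation, here is a minimal NumPy sketch (the matrix and the eigenpair below are illustrative choices, not taken from the text):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # an example square matrix
x = np.array([1.0, 1.0])     # an eigenvector of this particular A
lam = 3.0                    # its corresponding eigenvalue

# A applied to x gives exactly lam times x: Ax = λx
print(np.allclose(A @ x, lam * x))
```

Here A @ x equals [3, 3], which is indeed 3 times [1, 1], so the check prints True.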
Let's try and express this intuitively in terms of linear transformations.
As we know by now, a matrix represents a linear transformation that can be applied to a vector. Let A be that transformation, and let x be that vector. The above equation says that when the transformation is applied to the vector, the direction of the vector remains the same, and it only gets stretched by a factor of λ.
We also know that a linear transformation is a transformation of the space on which it acts. What does an eigenvector mean in this context? When a transformation is applied to a space, most vectors end up pointing in a different direction than before the transformation. In other words, the spans of the vectors change after a linear transformation is applied to them.
An eigenvector is a special vector that is an exception to this. As you can see from the fundamental equation, when transformation A is applied to vector x, the vector x is only scaled by a factor of λ. Hence, its span remains the same: it points along the same line as before the space was distorted. The eigenvalue λ is the factor by which the vector is stretched or shrunk (a negative λ flips the vector to the opposite direction along that same line).
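This contrast between eigenvectors and ordinary vectors can be made concrete with a small sketch (the matrix and vectors below are illustrative assumptions): a generic vector is knocked off its span by the transformation, while the eigenvector keeps its direction.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # an example transformation

def direction(v):
    """Unit vector along v, i.e. the direction of its span."""
    return v / np.linalg.norm(v)

x = np.array([1.0, 1.0])     # an eigenvector of A: span is preserved
y = np.array([1.0, 0.0])     # a generic vector: span changes

print(np.allclose(direction(A @ x), direction(x)))  # eigenvector keeps direction
print(np.allclose(direction(A @ y), direction(y)))  # generic vector does not
```

Applying A to y gives [2, 1], which points along a different line than [1, 0], while A applied to x just rescales it.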
Let's consider the case of a rotation in three dimensions, where an eigenvalue is 1. A rotation leaves exactly one line of vectors unmoved: those along the axis of rotation. That axis is the eigenvector with eigenvalue 1, since the rotation neither stretches nor tilts it. A related idea is used in machine learning: the eigenvectors of a data covariance matrix give the axes along which variance is the highest. We shall see this on a future page.
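The rotation-axis picture can be verified numerically. The sketch below (an illustrative example, assuming rotation about the z-axis) builds a 3D rotation matrix and checks that the axis vector is mapped to itself, i.e. it is an eigenvector with eigenvalue 1:

```python
import numpy as np

theta = np.pi / 4  # rotate 45 degrees about the z-axis
Rz = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])

axis = np.array([0.0, 0.0, 1.0])  # the axis of rotation

# The axis is unchanged by the rotation: Rz @ axis = 1 * axis
print(np.allclose(Rz @ axis, 1.0 * axis))
```

Any other vector, such as [1, 0, 0], gets swung to a new direction by Rz, but the axis stays put: its eigenvalue is exactly 1.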
For now, let's learn how to calculate eigenvalues and eigenvectors.
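Before working through the calculation by hand, it may help to see what the answer looks like. As a sketch (using NumPy's `np.linalg.eig`, with an illustrative matrix of my choosing), a library call returns all eigenvalues and eigenvectors at once:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigenvalues is a 1-D array; each COLUMN of eigenvectors is a
# unit-length eigenvector paired with the eigenvalue at the same index
eigenvalues, eigenvectors = np.linalg.eig(A)

# Verify Ax = λx for each returned pair
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    print(np.allclose(A @ v, eigenvalues[i] * v))
```

For this A the eigenvalues are 3 and 1; the hand calculation on the next page recovers the same values.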