In the previous section, you learnt about the basics of matrices and how to create them. In this section, we will cover some common operations on matrices. These operations are performed regularly in machine learning tasks.
Before going further, it is worth stating an important fact: not all matrices can be multiplied with each other. Specific rules must be met before two matrices can be multiplied.
Say you have two matrices, A & B.
dim(A) = a1 x a2
dim(B) = b1 x b2
For the product AB to be defined, a2 must equal b1; the result then has dimensions a1 x b2.
For the product BA to be defined, b2 must equal a1; the result then has dimensions b1 x a2.
For example, you can multiply a (3x2) matrix by a (2x3) matrix, because the inner dimensions match (2 = 2); the result is a (3x3) matrix.
You cannot, however, multiply a (3x2) matrix by another (3x2) matrix, because the inner dimensions (2 and 3) do not match.
In machine learning problems, it is always useful to know the sizes of the inputs and outputs passed between layers. These sizes are usually determined by the dimensions of the results of matrix multiplications, which can be worked out ahead of time, before any computation is run. With this knowledge in place, you can design your data processing pipeline more efficiently.
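As a minimal sketch of this shape rule (assuming NumPy, a common choice for matrix work in Python), you can check whether two matrices are compatible before multiplying them:

```python
import numpy as np

A = np.random.rand(3, 2)   # dim(A) = 3 x 2
B = np.random.rand(2, 4)   # dim(B) = 2 x 4

# AB is defined because A's column count (2) equals B's row count (2).
if A.shape[1] == B.shape[0]:
    C = A @ B              # @ is NumPy's matrix-multiplication operator
    print(C.shape)         # (3, 4): the outer dimensions a1 x b2

# BA is not defined here: B has 4 columns but A has only 3 rows.
if B.shape[1] != A.shape[0]:
    print("B @ A is not defined and would raise a ValueError")
```

Checking shapes like this before running a pipeline is cheap and catches dimension mismatches early.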
Intuitively, the transpose of a matrix is formed by interchanging its rows and columns, so the transpose of a 4x3 matrix is a 3x4 matrix. For example, a value at position [2, 3] (2nd row, 3rd column) of the original matrix moves to position [3, 2] (3rd row, 2nd column) in the transpose.
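A quick sketch of the transpose (again assuming NumPy; note that NumPy uses 0-based indexing, so the [2, 3] entry from the text corresponds to index [1, 2] below):

```python
import numpy as np

M = np.arange(12).reshape(4, 3)   # a 4x3 matrix
T = M.T                           # its 3x4 transpose

print(M[1, 2])                    # value at the 2nd row, 3rd column of M
print(T[2, 1])                    # the same value, now at the 3rd row, 2nd column of T
print(M.shape, T.shape)           # (4, 3) (3, 4)
```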
Before we can calculate the inverse of a matrix, we must first know whether the matrix is invertible at all. A matrix that is not invertible is called a singular matrix.
Let's say A is a non-singular matrix. Then the inverse of A (denoted by A⁻¹) is defined as the matrix satisfying:
A A⁻¹ = A⁻¹ A = I
where I is the identity matrix.
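As a small sketch (assuming NumPy), you can compute an inverse numerically and verify the defining property above; np.linalg.inv raises an error if the matrix is singular:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])        # a non-singular (invertible) matrix

A_inv = np.linalg.inv(A)          # numerical inverse of A

# A times its inverse should give the identity matrix,
# up to floating-point rounding error.
I = np.eye(2)
print(np.allclose(A @ A_inv, I))  # True
print(np.allclose(A_inv @ A, I))  # True
```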
You will learn a direct way to compute the inverse of a matrix in the section on Determinants.