It's important to know what goes on inside a machine learning algorithm, but it's hard. There is some pretty intense math happening, much of which is linear algebra. When I took Andrew Ng's course on machine learning, I found the hardest part was the linear algebra. I'm writing this for myself as much as for you.
So here is a quick review, so that the next time you look under the hood of an algorithm, you're more confident. You can view the IPython notebook (usually easier to code with) on my GitHub.
matrix – a rectangular array of values
vector – a one-dimensional matrix (a single row or a single column)
identity matrix I – an n x n diagonal matrix with ones on the diagonal from the top left to the bottom right and zeros everywhere else, for example:
[[ 1., 0., 0.], [ 0., 1., 0.], [ 0., 0., 1.]]
When a matrix A is multiplied by its inverse A^-1, the result is the identity matrix I. Only square matrices have inverses (and not every square matrix has one). Example below.
Note – the inverse of a matrix is not, in general, its transpose.
Matrices are notated m x n, or rows x columns. A 2×3 matrix has 2 rows and 3 columns. Read this multiple times.
You can only add matrices of the same dimensions. You can only multiply two matrices if the first is m x n, and the second is n x p. The n-dimension has to match.
Now the basics in Python
import numpy as np

A = np.array([[3, 2, 4]])        # 1 x 3 row
B = np.array([[1], [2], [3]])    # 3 x 1 column (illustrative values)

print("rows by columns, or m by n")
print("A is", A.shape)
print("B is", B.shape)
# note -> A*B is not matrix multiplication in numpy!!! Use np.dot instead.
print("A * B = ", np.dot(A, B))

[Output]
rows by columns, or m by n
A is (1, 3)
B is (3, 1)
A * B =  [[19]]
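To make the warning in that code comment concrete, here is a minimal sketch of how `*` differs from a real matrix product, and how the "n has to match" rule shows up in practice. The arrays `row` and `col` are made-up examples, not values from the notebook. One caveat: numpy broadcasts mismatched shapes where it can, so `*` (and `+`) will not always stop you when the dimensions differ.

```python
import numpy as np

# Hypothetical example arrays (not from the original notebook).
row = np.array([[1, 2, 3]])      # shape (1, 3)
col = np.array([[4], [5], [6]])  # shape (3, 1)

# Matrix product: (1 x 3) times (3 x 1) gives a 1 x 1 matrix.
product = row @ col              # same result as np.dot(row, col)

# `*` is element-wise. With shapes (1, 3) and (3, 1) numpy broadcasts
# both operands to (3, 3) instead of doing a matrix multiplication.
elementwise = row * col

# Mismatched inner dimensions are rejected: (1 x 3) times (1 x 3).
try:
    row @ row
    inner_mismatch_raised = False
except ValueError:
    inner_mismatch_raised = True

print(product.shape, elementwise.shape, inner_mismatch_raised)
```

The `@` operator is used here only because it reads closer to the math; `np.dot` behaves the same way for 2-D arrays.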
And for using identity matrices in numpy, use the eye() function.
# If a matrix A is multiplied by the identity matrix, the result is A.
A = np.array([[1, 2, 3, 4]])
print(np.dot(A, np.eye(4)))  # equals A!!!

[Output]
[[ 1. 2. 3. 4.]]
Also, calculating the inverse using inv() or pinv() is important. Another important function is transpose().

B = np.array([[1, 2], [3, 4]])
print(np.dot(B, np.linalg.inv(B)))  # returns the identity matrix (approximately)
print(B.transpose())

[Output]
[[ 1.00000000e+00 0.00000000e+00]
 [ 8.88178420e-16 1.00000000e+00]]
[[1 3]
 [2 4]]
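The code above only exercises inv(). As a rough sketch of where pinv() fits: it computes the Moore-Penrose pseudo-inverse, which exists even for matrices that are not square. The matrix `C` below is a made-up example. Because it has full row rank, `C` times its pseudo-inverse comes out as the identity, up to floating-point error.

```python
import numpy as np

# Made-up 2 x 3 matrix: not square, so np.linalg.inv(C) would raise an error.
C = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

C_pinv = np.linalg.pinv(C)   # Moore-Penrose pseudo-inverse, shape (3, 2)

# C has full row rank, so C @ C_pinv is (approximately) the 2 x 2 identity.
right_product = C @ C_pinv
print(C_pinv.shape)
print(np.allclose(right_product, np.eye(2)))

# The other order, C_pinv @ C, is 3 x 3 and is NOT the identity here.
left_product = C_pinv @ C
print(np.allclose(left_product, np.eye(3)))
```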
An eigenvalue of a matrix A is a number you can multiply some nonzero vector X by and get the same answer you would get by multiplying A and X. In this situation, the vector X is an eigenvector. More formally –
Def: Let A be an n x n matrix. A scalar λ is called an eigenvalue of A if there is a nonzero vector X such that AX = λX.
Such a vector X is called an eigenvector of A corresponding to λ.
There is a way to compute the eigenvalues of a matrix by hand, and then a corresponding eigenvector, but it's a bit beyond the scope of this tutorial.

# *** eigenvalues and eigenvectors *** #
A = np.array([[2, -4], [-1, -1]])
x = np.array([[4], [-1]])  # a suspected eigenvector
eigVal = 3                 # a suspected eigenvalue
print(np.dot(A, x), "\n")
print(eigVal * x)          # They match!

[output]
[[12]
 [-3]]

[[12]
 [-3]]
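For a 2 x 2 matrix, though, the by-hand route is short enough to sketch. The eigenvalues are the roots of the characteristic polynomial det(A - λI) = 0, which for a 2 x 2 matrix works out to λ^2 - trace(A)·λ + det(A) = 0. The snippet below is only a sketch of that idea for the matrix A used above. The variable names are illustrative, and `np.roots` stands in for solving the quadratic by hand.

```python
import numpy as np

A = np.array([[2, -4], [-1, -1]])

# For a 2 x 2 matrix, det(A - lambda*I) expands to
#   lambda^2 - trace(A)*lambda + det(A)
trace = A[0, 0] + A[1, 1]                      # 2 + (-1) = 1
det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]    # (2)(-1) - (-4)(-1) = -6

# Roots of lambda^2 - 1*lambda - 6 = (lambda - 3)(lambda + 2)
eigenvalues = np.roots([1, -trace, det])
print(np.sort(eigenvalues))
```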
Now that we know matrix A has a real eigenvalue, let's compute it with numpy!

w, v = np.linalg.eig(A)
print(w)  # the eigenvalues of matrix A

[output]
[ 3. -2.]
Ok, so the square matrix A has two eigenvalues, 3 and -2! But what about the corresponding eigenvector?

v[:, 0]  # this is the normalized eigenvector corresponding to w[0], or 3.
# let's unnormalize it to see if we were right.
# (eigenvectors are only defined up to scale, so the sign may come out flipped)
length = np.linalg.norm(x)  # the length of our original eigenvector x
print(v[:, 0] * length)
print("Our original eigenvector was [4, -1]")

[output]
[ 4. -1.]
Our original eigenvector was [4, -1]
Woohoo! Note – it's important to remember that every nonzero multiple of this eigenvector is also an eigenvector of A corresponding to the same eigenvalue (lambda).
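A quick way to convince yourself of this is to scale the eigenvector and check that AX = λX still holds. This is just a sketch; the scale factors are arbitrary values picked for illustration.

```python
import numpy as np

A = np.array([[2, -4], [-1, -1]])
x = np.array([[4], [-1]])   # eigenvector for the eigenvalue 3
eig_val = 3

# Arbitrary nonzero scale factors, chosen only for illustration.
scales = [2, -1, 0.5, 10]

# For each multiple c*x, check that A(c*x) equals eig_val * (c*x).
still_eigenvectors = [
    np.allclose(A @ (c * x), eig_val * (c * x)) for c in scales
]
print(still_eigenvectors)
```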
A determinant is a single value calculated from a square matrix. Determinants show up in most of linear algebra beyond matrix multiplication.
We can see where this comes from if we look at the determinant for a 2 x 2 matrix.
Imagine we have a square matrix A.
A = [[a, b], [c, d]]
We can define its inverse using the formula below.
A^-1 = 1 / (ad - bc) * [[d, -b], [-c, a]]
That bit in the denominator, ad - bc, that's the determinant. If it is 0, the matrix is singular (no inverse!).
The determinant has a ton of properties. For example, the determinant of a matrix equals that of its transpose.
# ************************ Determinants ************************ #
A = np.array([[1, 2], [3, 4]])
print("det(A) = ", np.linalg.det(A))
det(A) = -2.0
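Here is a small sketch tying those ideas together in code: the 2 x 2 inverse built from the determinant, the transpose property, and a singular matrix. The helper `inverse_2x2` and the matrix `S` are hypothetical names introduced for this example. In practice you would just call np.linalg.inv.

```python
import numpy as np

def inverse_2x2(m):
    """Invert a 2 x 2 matrix with the ad - bc formula (illustrative helper)."""
    a, b = m[0]
    c, d = m[1]
    det = a * d - b * c            # the determinant sits in the denominator
    # Exact comparison with 0 is fine for the integer inputs used below.
    if det == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[1, 2], [3, 4]])
A_inv = inverse_2x2(A)
print(np.allclose(A_inv, np.linalg.inv(A)))               # compare with numpy's inverse
print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))   # compare det(A) with det(A^T)

# A made-up singular matrix: the second row is twice the first, so ad - bc = 0.
S = np.array([[1, 2], [2, 4]])
try:
    inverse_2x2(S)
    singular_detected = False
except ValueError:
    singular_detected = True
print(singular_detected)
```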
Singular Value Decomposition
SVD is a technique to factorize a matrix, or a way of breaking the matrix up into three matrices.
SVD is used in techniques like Principal Component Analysis. The singular values in the SVD can help you determine which features are redundant, and therefore reduce dimensionality!
It's actually considered its own data mining algorithm.
It uses the formula M = UΣV^T, then uses the properties of these matrices (i.e. U and V are orthogonal, and Σ is a diagonal matrix with non-negative entries) to further break them up.
Here is a slightly more math-intensive example.
And of course, there's a function in numpy :D

# ******** Singular Value Decomposition *********** #
A = np.array([[1, 2, 3, 4, 5, 6, 7, 8],
              [9, 10, 11, 12, 4, 23, 45, 2],
              [5, 3, 5, 2, 56, 3, 6, 4]])
U, s, V = np.linalg.svd(A)  # note: numpy returns the third factor already transposed
print(U)

[output]
[[-0.18149711  0.07590154  0.98045793]
 [-0.65271926  0.73643135 -0.17783826]
 [-0.73553815 -0.6722409  -0.08411777]]
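To connect the numpy call back to the idea of breaking a matrix into three matrices, here is a sketch that rebuilds A from its factors and then keeps only the largest singular values. The names `Vt`, `Sigma`, `k`, and `A_approx` are illustrative choices, and `full_matrices=False` is used so the shapes line up without padding.

```python
import numpy as np

A = np.array([[1, 2, 3, 4, 5, 6, 7, 8],
              [9, 10, 11, 12, 4, 23, 45, 2],
              [5, 3, 5, 2, 56, 3, 6, 4]])

# full_matrices=False gives U: (3, 3), s: (3,), Vt: (3, 8).
# numpy returns the third factor already transposed, hence the name Vt.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rebuild A from the three factors: U times diag(s) times Vt.
Sigma = np.diag(s)
A_rebuilt = U @ Sigma @ Vt
print(np.allclose(A, A_rebuilt))

# Singular values come back sorted from largest to smallest.
print(s)

# Keep only the k largest singular values for a rank-k approximation.
k = 2  # arbitrary choice for illustration
A_approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# In the Frobenius norm, the error equals the size of the dropped singular values.
error = np.linalg.norm(A - A_approx)
print(np.isclose(error, s[2]))
```

Dropping small singular values like this is the basic move behind using SVD for dimensionality reduction: the smaller the dropped values, the less of the original matrix you lose.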
I tried not to get too bogged down with the math in this tutorial, but there is a lot more to explore. There is a significant difference in how much math data scientists and machine learning researchers need. For ML researchers, this stuff is a foundation.
While in data science it's not as important, I personally think understanding (if possible) the algorithm you're using is a noble goal.