Now that you've gotten a high level overview of the use cases for PCA and some general notes regarding the algorithm's implementation, its time to dive deeper into the theory behind PCA. In particular, you'll break down some of the primary components of the algorithm, including the covariance matrix and eigenvectors.
You will be able to:
- Understand covariance matrix calculation with implementation in numpy
- Understand and explain Eigendecomspoistion with its basic characteristics
- Explain the role of eigenvectors and eigenvalues in eigendecomposition
- Decompose and re-construct a matrix using eigendecomposition
To start, recall the general procedure for PCA:
- Recenter each feature of the dataset by subtracting that feature's mean from the feature vector
- Calculate the covariance matrix for your centered dataset
- Calculate the eigenvectors of the covariance matrix
- Project the dataset into the new feature space: Multiply the eigenvectors by the mean centered features
With that, let's start deconstructing some of these concepts.
You've looked into correlation and covariance as measures to calculate how one random variable changes with respect to another.
Covariance is very similar to correlation, but may not be normalized. The formula for the covariance of two variable X and Y is:
Now that you've seen covariance on its own, lets briefly examine a covariance matrix.
The covariance matrix for a set of data with n dimensions would be:
where
So if you have an n-dimensional data set, then the matrix has n rows and n columns (square matrix) and each entry in the matrix is the result of calculating the covariance between two separate dimensions as shown below:
-
Down the main diagonal, we can see that the covariance value is between one of the dimensions and itself. These are the variances for that dimension.
-
Since
$cov(a,b) = cov(b,a)$ , the matrix is symmetrical about the main diagonal.
In numpy, you can calculate the covariance of a given matrix using np.cov()
function, as shown below:
import numpy as np
X = np.array([ [0.1, 0.3, 0.4, 0.8, 0.9],
[3.2, 2.4, 2.4, 0.1, 5.5],
[10., 8.2, 4.3, 2.6, 0.9]
])
print( np.cov(X) )
[[ 0.115 0.0575 -1.2325]
[ 0.0575 3.757 -0.8775]
[-1.2325 -0.8775 14.525 ]]
The diagonal elements,
print(np.var(X, axis=1, ddof=1))
[ 0.115 3.757 14.525]
The eigendecomposition is one form of matrix decomposition. Decomposing a matrix means that you want to find a product of matrices that is equal to the initial matrix. In the case of the eigendecomposition, you decompose the initial matrix into the product of its eigenvectors and eigenvalues.
A vector
Here, lambda (
A matrix can have one eigenvector and eigenvalue for each dimension of the parent matrix.
Also , remember that not all square matrices can be decomposed into eigenvectors and eigenvalues, and some can only be decomposed in a way that requires complex numbers. The parent matrix can be shown to be a product of the eigenvectors and eigenvalues.
A decomposition operation breaks down a matrix into constituent parts to make certain operations on the matrix easier to perform. Eigendecomposition is used as an element to simplify the calculation of other more complex matrix operations.
Eigenvectors are unit vectors, with length or magnitude is equal to 1.0. They are often referred as right vectors, which simply means a column vector (as opposed to a row vector or a left vector). Imagine a transformation matrix that, when multiplied on the left, reflected vectors in the line
Eigenvalues are coefficients applied to eigenvectors that give the vectors their length or magnitude. For example, a negative eigenvalue may reverse the direction of the eigenvector as part of scaling it. Eigenvalues are closely related to eigenvectors.
A matrix that has only positive eigenvalues is referred to as a positive definite matrix, whereas if the eigenvalues are all negative, it is referred to as a negative definite matrix.
Decomposing a matrix in terms of its eigenvalues and its eigenvectors gives valuable insights into the properties of the matrix. Certain matrix calculations, like computing the power of the matrix, become much easier when we use the eigendecomposition of the matrix. The eigendecomposition can be calculated in NumPy using the eig()
function.
The example below first defines a 3×3 square matrix. The eigendecomposition is calculated on the matrix returning the eigenvalues and eigenvectors using eig()
method.
# eigendecomposition
from numpy import array
from numpy.linalg import eig
# define matrix
A = array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(A)
# calculate eigendecomposition
values, vectors = eig(A)
print(values)
print(vectors)
[[1 2 3]
[4 5 6]
[7 8 9]]
[ 1.61168440e+01 -1.11684397e+00 -9.75918483e-16]
[[-0.23197069 -0.78583024 0.40824829]
[-0.52532209 -0.08675134 -0.81649658]
[-0.8186735 0.61232756 0.40824829]]
Above, The eigenvectors are returned as a matrix with the same dimensions as the parent matrix (3x3), where each column is an eigenvector, e.g. the first eigenvector is vectors[:,0]. Eigenvalues are returned as a list, where value indices in the returned array are paired with eigenvectors by column index, e.g. the first eigenvalue at values[0] is paired with the first eigenvector at vectors[: 0].
You can also test whether the first vector and value are in fact an eigenvalue and eigenvector for the matrix.
# confirm first eigenvector
B = A.dot(vectors[:, 0])
print(B)
C = vectors[:, 0] * values[0]
print(C)
[ -3.73863537 -8.46653421 -13.19443305]
[ -3.73863537 -8.46653421 -13.19443305]
You can also reverse the process and reconstruct the original matrix given only the eigenvectors and eigenvalues.
First, the list of eigenvectors must be converted into a matrix, where each vector becomes a row. The eigenvalues need to be arranged into a diagonal matrix. The NumPy diag()
function can be used for this. Next, you need to calculate the inverse of the eigenvector matrix, which we can achieve with the inv()
function. Finally, these elements need to be multiplied together with the dot()
function.
from numpy.linalg import inv
# create matrix from eigenvectors
Q = vectors
# create inverse of eigenvectors matrix
R = inv(Q)
# create diagonal matrix from eigenvalues
L = np.diag(values)
# reconstruct the original matrix
B = Q.dot(L).dot(R)
print(B)
[[1. 2. 3.]
[4. 5. 6.]
[7. 8. 9.]]
## Summary
In this lesson, you looked at calculating covariance matrix for a given matrix. You also looked at eigen decomposition and its implementation in python. You can now go ahead and use these skills to apply principle component analysis for a given multidimensional dataset using these skills.