Description:

This supplement is meant to illustrate and clarify applications of the key concepts of linear algebra using R.

# install.packages("MASS")

# install.packages("knitr")

# install.packages("matlib")

# install.packages("pcaPP")

# install.packages("devtools")

# devtools::install_github("vqv/ggbiplot")



library(MASS)

library(knitr)

library(kableExtra)

library(tidyverse)

library(matlib)

library(pcaPP)

library(devtools)

library(ggbiplot)



example.data <- mtcars

1. Introduction to Matrices/Vectors

a. Definitions and Terms

  • Create a vector with 3 elements.

\[v=\begin{pmatrix} 1 \\ 7 \\ 3 \end{pmatrix}\]

v = c(1,7,3)



# The exact same result when wrapped in as.vector()

v = as.vector(c(1,7,3))



# Or create a one-column matrix (a column vector)

v = matrix(c(1,7,3))
  • Create a matrix with 6 elements filling by rows.

\[A=\begin{bmatrix} 2 & 4 & 3 \\ 1 & 5 & 7 \end{bmatrix}\]

A = matrix( 

  c(2, 4, 3, 1, 5, 7), # the data elements 

  nrow=2,              # number of rows 

  ncol=3,              # number of columns 

  byrow = TRUE)        # fill matrix by rows (note: byrow = FALSE, filling by columns, is the default)
  • Create a matrix with the same 6 elements filling by columns.

\[B=\begin{bmatrix} 2 & 1 \\ 4 & 5 \\ 3 & 7 \end{bmatrix}\]

B = matrix( 

  c(2, 4, 3, 1, 5, 7), # the data elements 

  nrow=3,              # number of rows 

  ncol=2,              # number of columns 

  byrow = FALSE)       # fill matrix by columns
  • Create a square matrix with 9 elements.

\[C=\begin{bmatrix} 25 & 4 & 31 \\ 1 & 52 & 7 \\ 81 & 22 & 13 \end{bmatrix}\]

C = matrix( 

  c(25, 4, 31, 1, 52, 7, 81, 22, 13), # the data elements 

  nrow=3,              # number of rows 

  byrow = TRUE)        # fill matrix by rows (the default is by columns)
  • Create an identity matrix with 9 elements.

\[I=\begin{bmatrix} 1 &0&0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\]

I = diag(1,      # The element(s) on the diagonal of matrix

         nrow=3) # number of rows 

                 # (assumed square when ncol is not given)
  • Create a unit (all-ones) matrix with 9 elements.

\[U =\begin{bmatrix} 1 &1&1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}\]

U = matrix(1,

           nrow=3,

           ncol=3)
  • Label the rows and columns of the matrix.

\[A=\begin{matrix} & col1 & col2 & col3 \\ row1 & 2 & 4 & 3 \\ row2 & 1 & 5 & 7 \end{matrix}\]

dimnames(A) = list( 

   c("row1", "row2"),         # row names 

   c("col1", "col2", "col3")) # column names 



# Check the row and column labels (in list format)

# labels(A)
  • Transpose a matrix.

\[A^T=\begin{matrix} & row1 & row2 \\ col1 & 2 & 1 \\ col2 & 4 & 5 \\ col3 & 3 & 7 \end{matrix}\]

# t(A)

b. Operations and Properties

Multiplication:

  • Multiply every element in a matrix by a constant.

\[5\times B=\begin{bmatrix} 10&5 \\ 20&25 \\ 15&35 \\ \end{bmatrix}\]

k = 5



kB = k*B
  • Multiply each column of a matrix by a different constant.

\[\begin{bmatrix} 0\times B_{col1} & 2\times B_{col2} \\ \end{bmatrix}= \begin{bmatrix} 0&2 \\ 0&10 \\ 0&14 \\ \end{bmatrix}\]

# WRONG

# c(0,2)*B recycles the vector down the columns, silently giving the wrong result

# t(c(0,2))*B

# Error: non-conformable arrays



# 3 CORRECT methods

B.scaled = B%*%diag(c(0,2))



B.scaled = t(apply( B , 1 , function(x) x*c(0,2)))



B.scaled = sweep( B , 2 , c(0,2),'*')
  • Multiply a vector and a matrix using matrix multiplication.

\[v^T*B=\begin{bmatrix} 39&57 \\ \end{bmatrix}\] \[B^T*v=\begin{bmatrix} 39 \\ 57 \\ \end{bmatrix}\]

vB = t(v)%*%B



Bv = t(B)%*%v
  • Multiply two matrices using matrix multiplication.

\[AB=\begin{bmatrix} 29&43 \\ 43&75 \\ \end{bmatrix}\]

\[BA=\begin{bmatrix} 5&13&13 \\ 13&41&47 \\ 13&47&58 \\ \end{bmatrix}\]

# Using element-wise multiplication ( * ): 

# A*B

# Error: non-conformable arrays 



AB = A%*%B #returns a 2x2 matrix



BA = B%*%A #returns a 3x3 matrix
  • Multiply 3 matrices together.

\[BAC = \begin{bmatrix} 1191&982&415 \\ 4173&3218&1301 \\ 5070&3772&1486 \\ \end{bmatrix}\]

BAC = B%*%A%*%C

Addition:

  • Add a constant (-3) to every element of a matrix (C).

\[C^{*} = \begin{bmatrix} 22&1&28 \\ -2&49&4 \\ 78&19&10 \\ \end{bmatrix}\]

C.centered = C-3
  • Add a different constant to each column of a matrix.

\[\hat{C}=\begin{bmatrix} C_{col1}+0 & C_{col2}-5 & C_{col3}+7 \\ \end{bmatrix}= \begin{bmatrix} 25&-1&38 \\ 1&47&14 \\ 81&17&20 \\ \end{bmatrix}\]

C.hat = C + matrix(c(0,-5,7), nrow=3, ncol=3, byrow=TRUE)



C.hat = t(apply( C , 1 , function(x) x+c(0,-5,7)))



C.hat = sweep( C , 2 , c(0,-5,7),'+')

Hands-on Example:

  • Before: Can it theoretically be done?

  • Attempt the example!

  • After: Does R complete the task or return an error? (A sketch follows below.)
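As a concrete illustration of this workflow, here is a minimal sketch using the matrices A (2 × 3) and C (3 × 3) defined above:

# Before: A %*% C multiplies a 2x3 by a 3x3; the inner dimensions match,
# so this should work and return a 2x3 matrix.
AC <- A %*% C

# Before: C %*% A multiplies a 3x3 by a 2x3; the inner dimensions (3 and 2)
# do not match, so R should refuse.
# C %*% A
# Error: non-conformable arguments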

2. Systems of Linear Equations

a. Gaussian Elimination

Solve the following for \(x\):

\[Dx=e\] \[D=\begin{bmatrix} 1&1&1 \\ 2&3&7 \\ -1&0&-9 \end{bmatrix}\]

\[e=\begin{pmatrix} 3 \\ 0 \\ 17 \end{pmatrix}\]

Solution: \[x=\begin{pmatrix} 1 \\ 4 \\ -2 \end{pmatrix}\]

D <- matrix(c(1,1,1,2,3,7,-1,0,-9), nrow=3, byrow = TRUE)



e <- c(3,0,17)



# gaussianElimination(D,e, verbose=TRUE, fractions=TRUE)
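As a quick check on the solution (gaussianElimination() above shows the row operations step by step), base R's solve() returns the same x:

# Sketch: solve the system directly
x <- solve(D, e)   # returns c(1, 4, -2)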

b. Linear Independence

Are the following vectors linearly independent?

\[v_1 = \begin{pmatrix} 2 \\ 5 \\ 3 \end{pmatrix}\]

\[v_2 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}\]

\[v_3 = \begin{pmatrix} 4 \\ -2 \\ 0 \end{pmatrix}\]

V <- matrix(c(2, 5, 3,1, 1, 1,4,-2,0), nrow=3, byrow = FALSE)



# gaussianElimination(V, verbose=TRUE, fractions=TRUE)

One way to answer this question is to find all solutions of:

\[k_1v_1+k_2v_2+k_3v_3=0\]

Through Gaussian elimination we see that the only solution is \(k_1 = k_2 = k_3 = 0\), which proves that the given vectors are linearly independent.
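An equivalent check, sketched below: the columns of V are linearly independent exactly when V has full rank, or (since V is square) a nonzero determinant.

# Sketch: rank and determinant checks for linear independence
R(V) == ncol(V)   # TRUE: the rank equals the number of columns
det(V) != 0       # TRUE: det(V) = 6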

Solving Systems of Equations Examples

  • Three consistent equations.

\[x_1 - x_2 = 2 \] \[2x_1 + 2x_2 = 1 \] \[3x_1 + x_2 = 3 \] Solution:

A <- matrix(c(1,2,3, -1, 2, 1), 3, 2)

b <- c(2,1,3)



# Print the system of equations

# showEqn(A,b)



# Test for consistency, are the ranks equal?

# R(A)==R(cbind(A,b))



# Solve for the solution

Solve(A,b, fractions = TRUE)
## x1    =   5/4 

##   x2  =  -3/4 

##    0  =     0

From the plot we can see all 3 equations have a consistent solution.

plotEqn(A,b)
##   x1 - 1*x2  =  2 

## 2*x1 + 2*x2  =  1 

## 3*x1   + x2  =  3

  • Three inconsistent equations.

\[x_1 - x_2 = 2 \] \[2x_1 + 2x_2 = 1 \] \[3x_1 + x_2 = 6 \]

Solution:

b.new <- c(2,1,6)



# Print the system of equations

# showEqn(A,b.new)



# Test for consistency, are the ranks equal?

# R(A)==R(cbind(A,b.new))



# Solve for the solution; the last row reads 0 = -3, which is impossible

Solve(A,b.new, fractions = TRUE)
## x1    =  11/4 

##   x2  =  -9/4 

##    0  =    -3

From the plot we can see that each pair of equations has a solution, but all three together have no common solution.

plotEqn(A,b.new)
##   x1 - 1*x2  =  2 

## 2*x1 + 2*x2  =  1 

## 3*x1   + x2  =  6

c. Inverses

\[F=\begin{bmatrix} 2&1&0 \\ -2&-1&3 \\ 0&1&-2 \\ \end{bmatrix}\]

\[F^{-1}=\begin{bmatrix} \frac{1}{6}&-\frac{1}{3}&-\frac{1}{2} \\ \frac{2}{3}&\frac{2}{3}&1 \\ \frac{1}{3}&\frac{1}{3}&0 \\ \end{bmatrix}\]

F <- matrix(c(2,-2,0,1,-1,1,0,3,-2), nrow=3)



# Step-by-step process to solve for inverse

# gaussianElimination(F,diag(1,nrow=3), verbose=TRUE, fractions=TRUE)
FI <- fractions(inv(F))



FI <- fractions(Inverse(F))



FI <- fractions(solve(F))

Show the following hold (a verification sketch follows the list):

  • inv(inv(A)) = A

  • inv(t(A)) = t(inv(A))

  • inv( k*A ) = (1/k) * inv(A)

  • inv(A %*% B) = inv(B) %*% inv(A)

  • inv(A) %*% A = diag(nrow(A))
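A quick numerical verification of these identities, sketched with the square, invertible matrices C and F defined above (A is not square, so it has no inverse):

# Sketch: verify the inverse identities numerically with C and F
all.equal(inv(inv(C)), C)
all.equal(inv(t(C)), t(inv(C)))
all.equal(inv(5*C), (1/5) * inv(C))
all.equal(inv(C %*% F), inv(F) %*% inv(C))
all.equal(inv(C) %*% C, diag(nrow(C)))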

d. Norm, Trace, and Determinant

\[norm(C)=107\] \[det(C)=-114624\]

\[tr(C)=90\]

How to code:

# Default norm type is "one" or maximum absolute column sum

# For other norm types see documentation

norm.C <- norm(C)



det.C <- det(C)



trace.C = sum(diag(C))

Some properties include:

  • trace(A) == trace(t(A))

  • trace(alpha * A) == alpha * trace(A)

  • trace(A %*% B) == trace(B %*% A)

  • trace(A) == trace(crossprod(crossprod(B,A),solve(B)))

  • det(inv(A))=1/det(A)=inv(det(A))

  • if det(A) != 0, then the inverse of A exists
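A sketch checking a few of these properties numerically with C and F from above:

# Sketch: verify trace and determinant properties with C and F
all.equal(sum(diag(C)), sum(diag(t(C))))           # trace(A) == trace(t(A))
all.equal(sum(diag(C %*% F)), sum(diag(F %*% C)))  # trace(A %*% B) == trace(B %*% A)
all.equal(det(solve(C)), 1/det(C))                 # det(inv(A)) == 1/det(A)
det(C) != 0                                        # TRUE, so the inverse of C exists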

3. Additional Topics

a. The Moore-Penrose Pseudoinverse

\[G_C=\begin{bmatrix} -0.005&-0.005&0.014 \\ -0.005&0.019&0.001 \\ 0.037&0.002&-0.011 \\ \end{bmatrix}\]

How to code:

G.C <- ginv(C)
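Because C is square and invertible, its pseudoinverse coincides with the ordinary inverse. A quick sketch checking two of the Moore-Penrose conditions and the comparison with solve():

# Sketch: Moore-Penrose conditions and comparison with the ordinary inverse
all.equal(C %*% G.C %*% C, C)       # C G C = C
all.equal(G.C %*% C %*% G.C, G.C)   # G C G = G
all.equal(G.C, solve(C))            # same as the ordinary inverse here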

b. Linear Regression

This is a very simple example. You will learn a lot more in class this semester about how to build and interpret linear models.

Here we fit a linear model that predicts fuel consumption (miles per gallon) for the 1974 cars from car weight. The graph shows that as car weight increases, mpg decreases, so there is a negative relationship between weight and mpg in this sample.

lin.model <- lm(mpg~wt,data=example.data)



#summary(lin.model)



plot(example.data$wt, example.data$mpg, main="Car Weight vs. Fuel Consumption", xlab="Weight (1000 lbs)", ylab="mpg")

abline(lm(mpg~wt,data=example.data), col="red")
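To tie this back to the matrix material above, the least-squares coefficients can also be computed directly from the normal equations, \(\hat{\beta}=(X^TX)^{-1}X^Ty\). A sketch (the result should match coef(lin.model)):

# Sketch: least-squares coefficients via the normal equations
X <- cbind(1, example.data$wt)   # design matrix: intercept column and weight
y <- example.data$mpg
beta.hat <- solve(t(X) %*% X) %*% t(X) %*% y

# Compare to:
# coef(lin.model)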

c. Principal Component Analysis

mtcars: This dataset consists of data on 32 car models, taken from the 1974 issue of the American magazine Motor Trend. For each car, there are 11 features, expressed in varying units.

We will consider only the numeric variables for the PCA (not the categorical ones vs and am). We fit a PCA object to the scaled and centered data. Each of these 9 components explains a percentage of the total variation in the dataset.

“PC1 explains 63% of the total variance, which means that nearly two-thirds of the information in the dataset (9 variables) can be encapsulated by just that one Principal Component. PC2 explains 23% of the variance. So, by knowing the position of a sample in relation to just PC1 and PC2, you can get a very accurate view on where it stands in relation to other samples, as just PC1 and PC2 can explain 86% of the variance.”

mtcars.pca <- prcomp(mtcars[,c(1:7,10,11)], center = TRUE,scale. = TRUE)



summary(mtcars.pca)
## Importance of components:

##                           PC1    PC2     PC3     PC4     PC5     PC6

## Standard deviation     2.3782 1.4429 0.71008 0.51481 0.42797 0.35184

## Proportion of Variance 0.6284 0.2313 0.05602 0.02945 0.02035 0.01375

## Cumulative Proportion  0.6284 0.8598 0.91581 0.94525 0.96560 0.97936

##                            PC7    PC8     PC9

## Standard deviation     0.32413 0.2419 0.14896

## Proportion of Variance 0.01167 0.0065 0.00247

## Cumulative Proportion  0.99103 0.9975 1.00000
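Because the data were scaled and centered, these quantities can also be recovered from the eigendecomposition of the correlation matrix: the squared standard deviations above are its eigenvalues. A sketch:

# Sketch: proportion of variance explained from the correlation matrix
R.cor <- cor(mtcars[, c(1:7, 10, 11)])
eig.R <- eigen(R.cor)

# eig.R$values matches mtcars.pca$sdev^2
round(eig.R$values / sum(eig.R$values), 4)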

From the plot of PC1 vs. PC2 we see variables hp, cyl, and disp all contribute to PC1, with higher values in those variables moving the samples to the right on this plot. This lets you see how the data points relate to the axes, but it’s not very informative without knowing which point corresponds to which sample or car.

ggbiplot(mtcars.pca)

Label each point on the graph with the car names:

ggbiplot(mtcars.pca, labels=rownames(mtcars))

In order to interpret these results we compare groups. Here we create a mtcars.country variable to represent the country of origin of each car. We will also circle these groups on the graph:

“We see that the American cars are characterized by high values for cyl, disp, and wt. Japanese cars, on the other hand, are characterized by high mpg. European cars are somewhat in the middle and less tightly clustered than either group.”

mtcars.country <- c(rep("Japan", 3), rep("US",4), rep("Europe", 7),rep("US",3), "Europe", rep("Japan", 3), rep("US",4), rep("Europe", 3), "US", rep("Europe", 3))



ggbiplot(mtcars.pca,ellipse=TRUE,  labels=rownames(mtcars), groups=mtcars.country)

Now let's look at PC3 vs. PC4. The trends are less distinct here, but that is expected given the small proportion of variance explained by each of these components.

ggbiplot(mtcars.pca,ellipse=TRUE,choices=c(3,4),   labels=rownames(mtcars), groups=mtcars.country)

Extras

a. Random Functions

See code:

# matrix of logical variables returned

upC <- upper.tri(C, diag=TRUE)

lowC <- lower.tri(C)



# return the values in the upper triangle

# C[upC]



# replace the values in the lower triangle with 0's

# C[lowC] <- 0



# logical value returned

sym.C <- is_symmetric_matrix(C) 

b. Eigenvalues/Eigenvectors and Singular Value Decomposition

How to code:

eig.C <- eigen(C)



C.svd <- svd(C)
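As a quick sanity check, the defining relation \(Cv=\lambda v\) can be verified for all eigenpairs at once as \(CV=V\Lambda\), where the columns of \(V\) are the eigenvectors. A sketch:

# Sketch: verify C V = V Lambda for the eigendecomposition of C
V.eig <- eig.C$vectors
lambda <- eig.C$values

all.equal(C %*% V.eig, V.eig %*% diag(lambda))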

How the SVD connects with generalized inverses:

ds <- diag(1/C.svd$d)

u <- C.svd$u

v <- C.svd$v

us <- as.matrix(u)

vs <- as.matrix(v)



C.ginv <- vs %*% ds %*% t(us)



# Compare to:

# ginv(C)
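A sketch of that comparison (this reconstruction assumes every singular value is nonzero, which holds here because C is invertible):

# C.ginv should agree with MASS::ginv(C) and act as the inverse of C
all.equal(C.ginv, ginv(C))

round(C %*% C.ginv, 10)   # the 3x3 identity matrix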