6. Linear algebra in R#

This chapter is a quick guide to working with matrices in R. Following the discussion you should be able to define and conduct basic operations with matrices.

The best way to work through the chapter is to read the discussion and then reproduce all code in your own R/RStudio installation. You can do this by copying code from a notebook cell to your R script and run the code line by line.

The chapter includes some exercises that allow you to test your understanding. It is recommended that you try these exercises on your own before choosing the option to reveal the solutions.

6.1. Defining a matrix#

To define a matrix use the matrix() function with usage

matrix(c(a1,a2,...,an), nrow=number, ncol=number,...)

which takes a vector of \(n\) elements, c(a1,a2,...,an) and rearranges it into a nrow\(\times\)ncol matrix. By default the elements of the vector are arranged into the matrix so that the first nrow elements comprise the first column of the matrix, the next nrow elements comprise the second column of the matrix, and so forth.

For example, to define the \(3\times3\) matrix

\[\begin{split} A= \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \end{split}\]

type

A <- matrix(c(1,4,7,2,5,8,3,6,9), nrow=3, ncol=3)
A
     [,1] [,2] [,3]
[1,] 1    2    3   
[2,] 4    5    6   
[3,] 7    8    9   

Note

The number of rows and columns specified should be consistent with the number of elements (i.e. the number of elements of the vector c(a1,a2,...,an) should be exactly equal to nrow\(\times\)ncol). Otherwise, R will return an error.

For example,

N <- matrix(c(1,4,7,2,5,8,3,6,9), nrow=2, ncol=3)

yields the error message

Warning message:
In matrix(c(1, 4, 7, 2, 5, 8, 3, 6, 9), nrow = 2, ncol = 3) :
  data length [9] is not a sub-multiple or multiple of the number of rows [2]

because the number of elements given is 9, while we are trying to place them inside a \(2\times3\) matrix.

In fact, given the vector of elements, c(a1,a2,...,an), it is sufficient

  • to only supply either the number of rows, nrow, (in which case the number of columns will be automatically deduced as equal to length(c(a1,a2,...,an))/nrow)

  • or to only supply the number of columns, ncol, (in which case the number of rows will be automatically deduced as equal to length(c(a1,a2,...,an))/ncol)

For example,

A <- matrix(c(1,4,7,2,5,8,3,6,9), nrow=3)
A
     [,1] [,2] [,3]
[1,] 1    2    3   
[2,] 4    5    6   
[3,] 7    8    9   

or

A <- matrix(c(1,4,7,2,5,8,3,6,9), ncol=3)
A
     [,1] [,2] [,3]
[1,] 1    2    3   
[2,] 4    5    6   
[3,] 7    8    9   

Note

Again, the number of rows or columns specified should be consistent with the number of elements.

For example

N <- matrix(c(1,4,7,2,5,8,3,6,9), nrow=4)

yields and error message

Warning message:
In matrix(c(1, 4, 7, 2, 5, 8, 3, 6, 9), nrow = 4) :
  data length [9] is not a sub-multiple or multiple of the number of rows [4]

as the number of elements in c(1, 4, 7, 2, 5, 8, 3, 6, 9) is not divisible by 4.

6.1.1. Arranging the elements of the vector into a matrix by rows#

As seen above, by default, the elements of the vector passed to matrix() are arranged into a matrix column-wise. However, it may be more convenient for writing to arrange the elements row-wise. To do so use the optional argument byrow=TRUE. For example, the same matrix A as above can be equivalently defined by

## More convenient for writing

A <- matrix(c(1,2,3,4,5,6,7,8,9), ncol=3, byrow=TRUE)
A
     [,1] [,2] [,3]
[1,] 1    2    3   
[2,] 4    5    6   
[3,] 7    8    9   

This is often more convenient for inputting matrices because we can input our code on several lines to make us quickly visualize what matrix we are inputting. For example,

A <- matrix(c(1,2,3,
              4,5,6,
              7,8,9), ncol=3, byrow=TRUE)
A
     [,1] [,2] [,3]
[1,] 1    2    3   
[2,] 4    5    6   
[3,] 7    8    9   

generates the same matrix but makes the code much more readable.

For the purposes of readability, henceforth, this is how we will enter matrices in these notes.

Now let’s continue with another example of entering a matrix.

6.1.2. Example#

To define the \(3\times2\) matrix

\[\begin{split} B= \begin{bmatrix} 7 & 2 \\ 3 & 8 \\ 1 & 9 \end{bmatrix} \end{split}\]

type

B <- matrix(c(7,2,
              3,8,
              1,9), ncol=2, byrow=TRUE)
B
     [,1] [,2]
[1,] 7    2   
[2,] 3    8   
[3,] 1    9   

Exercise 6.1

Define the matrix

\[\begin{split} C= \begin{bmatrix} 7 & 2 & 5 & 12 \\ 3 & 8 & 8 & -3 \\ 1 & 9 & 7 & 0 \end{bmatrix} \end{split}\]

6.1.3. Defining column and row vectors.#

Column and row vectors can be also specified by using the matrix() function, setting the number of columns and rows appropriately.

For example, to define the \(1\times9\) vector \(a= \begin{bmatrix}1,2,3,4,5,6,7,8,9\end{bmatrix}\) type

a <- matrix(c(1,2,3,4,5,6,7,8,9), nrow=1)
a
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1    2    3    4    5    6    7    8    9   

To instead define th \(9\times1\) vector \(b= \begin{bmatrix}1\\2\\3\\4\end{bmatrix}\) type

b <- matrix(c(1,2,3,4), ncol=1)
b
     [,1]
[1,] 1   
[2,] 2   
[3,] 3   
[4,] 4   

A few more examples

c <- matrix(c(3,7,1,2), ncol=4)
c
     [,1] [,2] [,3] [,4]
[1,] 3    7    1    2   
d <- matrix(c(3,7,1,2), nrow=1)
d
     [,1] [,2] [,3] [,4]
[1,] 3    7    1    2   

6.2. Working with matrices#

6.2.1. Checking the order of a matrix#

Given a matrix A, then dim(A) returns the order (or dimension) of the matrix. This can be used to query the order of a given matrix.

For example,

B <- matrix(c(7,2,
              3,8,
              1,9), ncol=2, byrow=TRUE)
dim(B)
[1] 3 2

as the defined matrix B is of order \(3\times2\).

6.2.2. Transpose of a matrix#

Given a matrix A, then t(A) returns the transpose of the matrix.

For example, given

A <- matrix(c(1,2,3,
              4,5,6,
              7,8,9), ncol=3, byrow=TRUE)
A
     [,1] [,2] [,3]
[1,] 1    2    3   
[2,] 4    5    6   
[3,] 7    8    9   

then its transpose is returned by

t(A)
     [,1] [,2] [,3]
[1,] 1    4    7   
[2,] 2    5    8   
[3,] 3    6    9   

As another example,

B <- matrix(c(7,2,
              3,8,
              1,9), ncol=2, byrow=TRUE)
B
     [,1] [,2]
[1,] 7    2   
[2,] 3    8   
[3,] 1    9   
t(B)
     [,1] [,2] [,3]
[1,] 7    3    1   
[2,] 2    8    9   

6.2.3. Matrix addition#

Given matrices A and B of the same order

  • their sum can be found by A+B

  • their difference can be found by A-B

A <- matrix(c(7,2,
              3,8,
              1,9), ncol=2, byrow=TRUE)

B <- matrix(c(1,3,
              6,1,
              2,3), ncol=2, byrow=TRUE)
A+B
     [,1] [,2]
[1,] 8     5  
[2,] 9     9  
[3,] 3    12  
A-B
     [,1] [,2]
[1,]  6   -1  
[2,] -3    7  
[3,] -1    6  

Note that if the matrices added/subtracted are not of the same order, R will return an error.

For example,

A <- matrix(c(7,2,
              3,8,
              1,9), ncol=2, byrow=TRUE)
C <- matrix(c(7,2,
              3,8), ncol=2, byrow=TRUE)

results in the error message Error in A + C: non-conformable arrays

Warning

Just like we can’t add matrices of different orders, a matrix cannot be added to a scalar. For example, the expression

\[\begin{split} \begin{bmatrix}1 & 2\\3 & 4\end{bmatrix} + 1 \end{split}\]

is mathematically meaningless.

However, in R the sum of a matrix and scalar is defined. For example, the code

matrix(c(1,2,3,4),ncol=2,byrow=TRUE)+1

yields the output

     [,1] [,2]
[1,] 2    3   
[2,] 4    5   

so the sum of a matrix and a scalar is interpreted as the sum of the matrix with a matrix of the same order with all elements being equal to the scalar. In other words the code

matrix(c(1,2,3,4),ncol=2,byrow=TRUE)+1

implements the mathematical operation

\[\begin{split} \begin{bmatrix}1 & 2\\3 & 4\end{bmatrix} + \begin{bmatrix}1 & 1\\1 & 1\end{bmatrix} \end{split}\]

The reason this is allowed in R is that it allows a convenient shortcut for the above operation, but you should be careful to recognise that, mathematically, this is not a sum of a scalar and a matrix (which is not defined mathematically).

6.2.4. Scalar multiplication of matrices.#

Given a matrix A and a scalar b, the product of A and b can be obtained as A*b or b*A.

A <- matrix(c(7,2,
              3,8,
              1,9), ncol=2, byrow=TRUE)
A
     [,1] [,2]
[1,] 7    2   
[2,] 3    8   
[3,] 1    9   
A*2
     [,1] [,2]
[1,] 14    4  
[2,]  6   16  
[3,]  2   18  

6.2.5. Matrix multiplication#

Given conformable matrices \(A\) and \(B\), implemented by A and B, the matrix product \(A\times B\) is obtained in R by A %*% B.

Note that the multiplication sign should be enclosed in % signs for R to recognise that what we want is a matrix product.

For example, consider the matrices

\[\begin{split} A_{2\times2}=\begin{bmatrix}1&2\\3&4\end{bmatrix} \text{ and } B_{2\times1}=\begin{bmatrix}1\\2\end{bmatrix} \end{split}\]
A <- matrix(c(1,2,
              3,4), ncol=2, byrow=TRUE)
B <- matrix(c(1,
              2), ncol=1, byrow=TRUE)

Then \(A\times B\) exists and can be found in R by typing

A%*%B
     [,1]
[1,]  5  
[2,] 11  

This is mathematically correct because, indeed, you can easily verify that

\[\begin{split} \begin{bmatrix}1&2\\3&4\end{bmatrix}\times \begin{bmatrix}1\\2\end{bmatrix}= \begin{bmatrix}5\\11\end{bmatrix} \end{split}\]

On the other hand, given \(A\) and \(B\) above, the matrix product \(B\times A\) does not exist because the number of columns of \(B\) is different from the number of rows of \(A\). If we type

B%*%A

we unsurprisingly get the error message Error in B %*% A: non-conformable arrays.

Warning

In R you should be extra careful to use %*% rather than * for matrix multiplication because

  • the operator %*% correctly implements matrix multiplication

  • while the operator * is also defined for matrices but implements pointwise multiplication which is a useful operation but distinct from matrix multiplication.

For example consider the matrices \(A=\begin{bmatrix}1&1\\1&1\end{bmatrix}\) and \(B=\begin{bmatrix}1&1\\1&1\end{bmatrix}\), which can be defined in R by the code

A=matrix(c(1,1,1,1),ncol=2)
B=matrix(c(1,1,1,1),ncol=2)

Mathematically, the matrix product \(A\times B\) is defined as

\[\begin{split} A\times B = \begin{bmatrix}1&1\\1&1\end{bmatrix}\times\begin{bmatrix}1&1\\1&1\end{bmatrix}=\begin{bmatrix}1\times1 + 1\times1& 1\times1 + 1\times1\\ 1\times1 + 1\times1 &1\times1 + 1\times1\end{bmatrix}=\begin{bmatrix}2&2\\2&2\end{bmatrix} \end{split}\]

We can conduct this operation consistently in R by typing

A %*% B

yielding the output

     [,1] [,2]
[1,] 2    2   
[2,] 2    2   

On the other hand, the pointwise multiple of two matrices of the same order is defined as a matrix of the same order where each element is obtained as the product of the corresponding elements of the two original matrices. For example, if we have \(A\) and \(B\) defined as above, their pointwise multiple is the matrix

\[\begin{split} \begin{bmatrix}1\times1&1\times1\\1\times1&1\times1\end{bmatrix}=\begin{bmatrix}1&1\\1&1\end{bmatrix} \end{split}\]

In R the operator * implements pointwise multiplication. So given A and B above, typing

A * B

yields the output

     [,1] [,2]
[1,] 1    1   
[2,] 1    1   

You should be extra careful to use the correct operator %*% for matrix multiplication!

6.2.5.1. Another example#

Consider the matrices \(A\) and \(B\) defined below

A <- matrix(c(13,23,32,78,
              24,15,76,15,
              76,88,19,34), ncol=4, byrow=TRUE)

B <- matrix(c(17,65,72,
              42,56,12,
              81,0 ,13,
              21,18,35), ncol=3, byrow=TRUE)

Their product \(A\times B\) is cumbersome to calculate by hand but can be easily found using R

A%*%B
     [,1] [,2]  [,3]
[1,] 5417  3537 4358
[2,] 7509  2670 3421
[3,] 7241 10480 7965

By correctly using R we can easily calculate arbitrarily complex arithmetic expressions involving matrices.

For example, given

\[\begin{split} A = \begin{bmatrix} 1&5&7\\2&-3&-5\\5&-9&1 \end{bmatrix} \text{ and } B = \begin{bmatrix} 4&1&8\\-2&6&-12\\2&13&0 \end{bmatrix} \end{split}\]

we can easily calculate something like

\[ (A\times B)^T + (B\times A) - A \]

by simply typing

A = matrix(c(1,5,7,
             2,-3,-5,
             5,-9,1), ncol=3, byrow=TRUE)

B = matrix(c(4,1,8,
             -2,6,-12,
             2,13,0), ncol=3, byrow=TRUE)

t(A%*%B) + (B%*%A) - A
     [,1] [,2] [,3]
[1,]  53  -56   64 
[2,]  70    2  -87 
[3,] -29   32   96 

Exercise 6.2

Consider the matrices

\[\begin{split} A = \begin{bmatrix} 1&5&8\\2&-3&-5\end{bmatrix} \text{ and } B = \begin{bmatrix} 4&1&8\\-2&6&-12\\2&13&0 \end{bmatrix} \end{split}\]

Use R to evaluate

\[ (A\times B)^T + B\times A^T - A^T \]

6.3. Matrices and systems of linear equations#

6.3.1. Determinant of a matrix#

A determinant of a matrix can be calculated in R using the det() function.

Given a square matrix \(A\), its determinant is found by typing det(A).

A <- matrix(c(1,2,
              3,4), ncol=2, byrow=TRUE)
det(A)
[1] -2
B <- matrix(c(17,65,72,
              42,56,12,
              81,0 ,13), ncol=3, byrow=TRUE)
det(B)
[1] -286526

Notice that a determinant of a matrix is mathematically defined only for square matrices, and this is also the case computationally. For example, the code

C <- matrix(c(1,3,
              6,1,
              2,3), ncol=2, byrow=TRUE)
det(C)

returns the error Error in determinant.matrix(x, logarithm = TRUE, ...): 'x' must be a square matrix because the matrix C is not square matrix and has no determinant.

6.3.2. (Optional) Systems of linear equations#

Consider a system of \(n\) linear equations in \(n\) unknowns written in matrix form as

\[ Ax=b \]

where

  • \(A\) is an \(n\times n\) matrix of coefficients

  • \(x\) is an \(n\times1\) vector of unknowns

  • \(b\) is an \(n\times1\) vector of coefficients

Suppose that \(det(A)\neq0\) so the system has a unique solution.

In the lectures we have seen one method (Cramer’s rule) of obtaining the solution to the system by using matrix algebra.

Another well known method, which we have not discussed in the lecture, is one based on the notion of an inverse of a function. While implementing this method by hand is much more cumbersome than Cramer’s rule, it is very easy to implement computationally. In what follows we define the notion of an inverse and explain how to use it to solve arbitrarily complex systems of linear equations computationally.

Note

The mathematical material in this section goes beyond the core material for the module. Do not worry if you find the discussion difficult to follow, and feel free to omit it if this is the case.

6.3.2.1. Inverse of a matrix#

First some mathematics.

(Definition) Identity matrix

An identity matrix of order \(n\times n\), commonly denoted \(I_{n\times n}\), is a square matrix of order \(n\times n\) with all diagonal elements equal to 1 and all off-diagonal elements equal to zero.

For example,

  • the \(2\times2\) identity matrix is

\[\begin{split} I_{2\times2}=\begin{bmatrix}1&0\\0&1 \end{bmatrix} \end{split}\]
  • the \(3\times3\) identity matrix is

\[\begin{split} I_{3\times3}=\begin{bmatrix}1&0&0\\0&1&0\\0&0&1 \end{bmatrix} \end{split}\]
  • and so forth.

Property of identity matrix

Given a square matrix \(A_{n\times n}\) or order \(n\times n\), and the identity matrix \(I_{n\times n}\) or order \(n\times n\) then

\[ A_{n\times n}\times I_{n\times n} = I_{n\times n}\times A_{n\times n} = A_{n\times n} \]

In words, multiplying a square matrix \(A\) by the identity matrix of the same order always yields the original matrix \(A\). You can think that in the context of matrix multiplication the identity matrix has the same property as the number 1 has in the context of multiplying numbers.

Similarly, given an \(n\times1\) column vector \(a_{n\times1}\), and the identity matrix \(I_{n\times n}\) or order \(n\times n\) then

\[ I_{n\times n}\times a_{n\times 1} = a_{n\times n} \]

In words, multiplying a vector \(a\) by the comformable identity matrix always yields the original vector \(a\). Again, you can think that in the context of matrix multiplication the identity matrix has the same property as the number 1 has in the context of multiplying numbers.

For example, it is easy to verify that

\[\begin{split} \begin{bmatrix}1&0\\0&1 \end{bmatrix} \times \begin{bmatrix}7&3\\1&2 \end{bmatrix} = \begin{bmatrix}7&3\\1&2 \end{bmatrix}\times \begin{bmatrix}1&0\\0&1 \end{bmatrix} = \begin{bmatrix}7&3\\1&2 \end{bmatrix} \end{split}\]

or

\[\begin{split} \begin{bmatrix}1&0&0\\0&1&0\\0&0&1 \end{bmatrix} \times \begin{bmatrix}5&4&2\\-3&2&7\\0&-1&4 \end{bmatrix} = \begin{bmatrix}5&4&2\\-3&2&7\\0&-1&4 \end{bmatrix} \times \begin{bmatrix}1&0&0\\0&1&0\\0&0&1 \end{bmatrix} = \begin{bmatrix}5&4&2\\-3&2&7\\0&-1&4 \end{bmatrix} \end{split}\]

and so forth.

(Definition) Inverse of a matrix

Given a square matrix \(A\) or order \(n\times n\), the inverse matrix of \(A\) is another \(n\times n\) matrix, commonly denoted by \(A^{-1}\), such that

\[ A^{-1}\times A = A\times A^{-1} = I_{n\times n} \]

In words, the product of a matrix and its inverse is the identity matrix.

For example, we can easily verify that the inverse of the matrix \(A=\begin{bmatrix}1&2\\3&4 \end{bmatrix} \) is the matrix \(A^{-1}=\begin{bmatrix}-2&1\\1.5&-0.5 \end{bmatrix} \) because

\[\begin{split} \begin{bmatrix}-2&1\\1.5&-0.5 \end{bmatrix} \times \begin{bmatrix}1&2\\3&4 \end{bmatrix}= \begin{bmatrix}1&2\\3&4 \end{bmatrix}\times \begin{bmatrix}-2&1\\1.5&-0.5 \end{bmatrix} = \begin{bmatrix}1&0\\0&1 \end{bmatrix} \end{split}\]

Inverses and determinants

For reasons beyond the scope of this discussion, it turns out that a square matrix \(A\) has an inverse if and only if \(det(A)\neq0\).

Now back to R. While there are procedures for obtaining the inverse of a matrix by hand, these are typically quite cumbersome for matrices of order higher than \(2\times2\). On the other hand, they are very easy to find computationally. In R, the inverse of a matrix A can be found by solve(A). For example, given the matrix \(A\) above

A <- matrix(c(1,2,
              3,4), ncol=2, byrow=TRUE)
solve(A)
     [,1] [,2]
[1,] -2.0  1.0
[2,]  1.5 -0.5

But also for much more complicated matrices

B <- matrix(c(13,23,32,72,15,
              24,15,76,81,13,
              76,88,19,93,18,
              27,94,62,73,27,
              12,22,31,62,18), ncol=5, byrow=TRUE)
solve(B)
     [,1]        [,2]         [,3]          [,4]         [,5]       
[1,] -0.06555317  0.015105056  0.0169165843 -0.013068083  0.04640397
[2,]  0.03326893 -0.008685351 -0.0016609482  0.015518808 -0.04306862
[3,] -0.01924799  0.018045231 -0.0043228707  0.006412682 -0.00228883
[4,]  0.05705412 -0.005707990 -0.0008036537 -0.003578446 -0.03725134
[5,] -0.16033034 -0.010871654  0.0009654094 -0.008973683  0.20951104

6.3.3. (Optional) Solving systems of linear equations#

Solving systems of equations using inverse matrices

Consider again a general system of \(n\) linear equations in \(n\) unknowns written in matrix form as

\[ Ax=b \]

with \(det(A)\neq0\).

Since \(det(A)\neq0\), then the inverse of \(A\), say \(A^{-1}\) exists.

Now, pre-multiply both sides of the matrix equation by \(A^{-1}\)

\[ A^{-1}Ax=A^{-1}b \]

Note that by the definition of an inverse of a matrix, \(A^{-1}A=I\) so we have

\[ Ix=A^{-1}b \]

Further, by the proprerty of identity matrices \(Ix = x\) so we have

\[ x=A^{-1}b \]

In other words, given the system of linear equations in matrix form \(Ax=b\), the solution can be found simply as

\[ x=A^{-1}b \]

This allows a very simple method to computationally solve the system.

For example, consider the system of equations \(Ax=b\) where

\[\begin{split} A = \begin{bmatrix}1&2\\3&4\end{bmatrix},\quad x = \begin{bmatrix}x_1\\x_2\end{bmatrix}, \quad b = \begin{bmatrix}3\\7\end{bmatrix} \end{split}\]

To obtain the solution for \(x = \begin{bmatrix}x_1\\x_2\end{bmatrix}\) in R simply type

A <- matrix(c(1,2,
              3,4), ncol=2, byrow=TRUE)

b <- matrix(c(3,
              7), ncol=1)

x <- solve(A)%*%b
x
     [,1]
[1,] 1   
[2,] 1   

So the solution to the system is simply \(x = \begin{bmatrix}1\\1\end{bmatrix}\)

Of course, the usefulness of using numerical methods is especially relevant when we have more difficult systems.

For example, consider the system \(Ax=b\) where

\[\begin{split} A = \begin{bmatrix} 13&23&32&72&15\\ 24&15&76&81&13\\ 76&88&19&93&18\\ 27&94&62&73&27\\ 12&22&31&62&18 \end{bmatrix},\quad x = \begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ x_5 \end{bmatrix}, \quad b = \begin{bmatrix} 12\\ 31\\ 91\\ 17\\ 12 \end{bmatrix} \end{split}\]

which is extremely cumbersome to solve by hand.

First define the matrices in R and note that the determinant of \(A\) is not zero, so \(A\) is invertible.

A <- matrix(c(13,23,32,72,15,
              24,15,76,81,13,
              76,88,19,93,18,
              27,94,62,73,27,
              12,22,31,62,18), ncol=5, byrow=TRUE)

b <- matrix(c(12,
              31,
              91,
              17,
              12), ncol=1)

det(A)
[1] -64782877

Next simply construct the solution as the product of the inverse of \(A\) times \(b\)

x_sol <- solve(A)%*%b
x_sol
     [,1]       
[1,]  1.55571816
[2,] -0.27416868
[3,]  0.01659472
[4,] -0.07328040
[5,]  0.18844674

So the solution of the system is

\[\begin{split} x=\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ x_5 \end{bmatrix} = \begin{bmatrix} 1.556\\ -0.274\\ 0.017\\ -0.073\\ 0.188 \end{bmatrix} \end{split}\]

In this simple way we can solve arbitrarily complex systems of linear equations.

This completes the discussion of Linear algebra in R. We next turn attention to numerical integration in R.