EPPS Math and Coding Camp

Vectors and Matrices

Instructors: Prajyna Barua and Azharul Islam

https://forms.gle/uLaMnx5amMzwCKav9

12.1 SCALARS

  • A scalar refers to any single element of some set. For example, the number 3.5 is a scalar, as is any variable \(x \in \mathbb{R}\).

12.2 VECTORS

  • One way of specifying a vector in three-dimensional space is as an arrow from the origin, \((0,0,0)\), to the point in question, such as from \((0,0,0)\) to \((3,1,2)\). The arrow points to \((3,1,2)\) in this case.
  • The arrow signifies that a vector indicates motion, in general from zero to some point \((x,y,z)\). Figure 12.1 displays a graphical representation of the vector from \((0,0)\) to \((5,2)\).

[Figure 12.1: the vector from \((0,0)\) to \((5,2)\)]
  • A vector will either be a lowercase letter in a bold font, such as \(\mathbf{x}\) or \(\mathbf{a}\), or a lowercase letter with an arrow over it, such as \(\vec{x}\).
  • Each element, or component, of the vector will be denoted by a subscript signifying its place in the vector. So, if \(\mathbf{x} = (3,1,2)\), then \(x_1 = 3\), \(x_2 = 1\), and \(x_3 = 2\). The dimension of a vector is the number of components in the vector.

12.2.1 Vector Length

  • The length of a vector, not to be confused with a vector’s dimension, tells us how big it is. In one dimension this is straightforward: the scalar 5 has “length” 5.
  • For any vector of dimension \(n\), its length is given by \(\|\mathbf{a}\| = \sqrt{a_1^2 + a_2^2 + \ldots + a_n^2}\). For example, the length of the vector \((2,4,4,1)\) is \(\sqrt{2^2 + 4^2 + 4^2 + 1^2} = \sqrt{4 + 16 + 16 + 1} = \sqrt{37}\).
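
As a quick check in code, here is a minimal sketch using Python with NumPy (an illustration added here, not part of the original text); the length can be computed either from the formula directly or with np.linalg.norm:

```python
import numpy as np

a = np.array([2, 4, 4, 1])

# Length (Euclidean norm) from the definition: square root of the sum of squared components.
length_manual = np.sqrt(np.sum(a**2))

# NumPy's built-in norm gives the same value.
length_np = np.linalg.norm(a)

print(length_manual, length_np)  # both equal sqrt(37), about 6.083
```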

12.2.2 Vector Addition

  • Vectors add just like scalars (i.e., numbers). To add (or subtract) vectors, they must have the same dimension.
  • The first component of the first vector adds to the first component of the second vector to form the first component of the added vector, and so on.
  • \(\mathbf{a} + \mathbf{b} = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)\) for two \(n\)-dimensional vectors. The same is true for subtraction, so that \(\mathbf{a} - \mathbf{b} = (a_1 - b_1, a_2 - b_2, \ldots, a_n - b_n)\).

Example: \[(5,-3,-6)+(1,8,7)=(6,5,1)\]

\[ (5, 1, 4, 1) - (1, 2, 3, 4) = (4, -1, 1, -3) \]
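
A short NumPy sketch (an illustration added here, not from the original text) reproduces these component-wise operations:

```python
import numpy as np

# Vector addition: corresponding components are added.
print(np.array([5, -3, -6]) + np.array([1, 8, 7]))       # [6 5 1]

# Vector subtraction works component-wise as well.
print(np.array([5, 1, 4, 1]) - np.array([1, 2, 3, 4]))   # [ 4 -1  1 -3]
```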

12.2.3 Scalar Multiplication

  • Scalar multiplication is multiplication of a vector by a scalar. To accomplish this, one needs to multiply each element in the vector by the scalar.
  • So, in general, if \(\mathbf{x}\) is an \(n\)-dimensional vector and \(c\) is a scalar, then \(c\mathbf{x} = (cx_1, cx_2, \ldots, cx_n)\).
  • A more concrete example: if \(\mathbf{a} = (2, 1)\), then \(5\mathbf{a} = (10, 5)\), where each of 2 and 1 has been multiplied by the scalar 5.
  • Dividing by a scalar works the same as multiplying by one over that scalar (i.e., multiplying each element by \(\frac{1}{c}\)).
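
A minimal NumPy sketch (again an illustration, not from the original text) shows scalar multiplication and division of a vector:

```python
import numpy as np

a = np.array([2, 1])

print(5 * a)   # [10  5]   -- each component multiplied by the scalar 5
print(a / 2)   # [1.  0.5] -- dividing by 2 is the same as multiplying by 1/2
```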

12.2.4 Vector Multiplication

  • The scalar product (also called the dot product) is a way of multiplying vectors that results in a scalar.
  • In general, if \(\mathbf{a}\) and \(\mathbf{b}\) are both \(n\)-dimensional vectors, then \(\mathbf{a} \cdot \mathbf{b} = a_1b_1 + a_2b_2 + \ldots + a_nb_n\).
  • Or, using summations, \(\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^n a_ib_i\).

Example: \[(3, 1) \cdot (2, 3) = 6 + 3 = 9\]

\[(6, 5, 4) \cdot (9, 8, 7) = 54 + 40 + 28 = 122\]
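
The same dot products in NumPy (a small illustrative sketch, not part of the original text):

```python
import numpy as np

# The scalar (dot) product: multiply matching components, then sum.
print(np.dot(np.array([3, 1]), np.array([2, 3])))        # 9
print(np.dot(np.array([6, 5, 4]), np.array([9, 8, 7])))  # 122
```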

12.3 MATRICES

  • A matrix is a rectangular table of numbers or variables that are arranged in a specific order in rows and columns.
  • The size of a matrix in mathematics is known as its dimensions and is expressed in terms of how many rows, \(n\), and columns, \(m\), it has, written as \(n \times m\).
  • A matrix \(A_{n \times m}\) thus is a matrix with \(n\) rows and \(m\) columns.
  • A matrix is a column vector if it has only one column but two or more rows, and a row vector if it has only one row but two or more columns.
  • A scalar is a matrix with only one column and one row.

12.3.1 Some Special Types of Matrices

  • A square matrix is a matrix that has an equal number of columns and rows, i.e., \(m = n\). For example,

\[ A_{3 \times 3} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}. \]

  • A zero matrix is a matrix in which all elements are 0.
  • A diagonal matrix is a square matrix in which all elements other than those on the main diagonal are zero.
  • An identity matrix is a diagonal matrix in which all elements on the main diagonal are 1.
  • A lower triangular matrix has non-zero elements only on or below the main diagonal, while an upper triangular matrix has non-zero elements only on or above the main diagonal. Examples of these are:

\[ L_{3 \times 3} = \begin{pmatrix} a_{11} & 0 & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, \quad U_{3 \times 3} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{pmatrix}. \]

  • A symmetric matrix is a square matrix in which the elements are symmetric about the main diagonal, or more formally one in which \(a_{ij} = a_{ji}\). For example,

\[ A_{3 \times 3} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{pmatrix} \]

  • An idempotent matrix \(A\) is a matrix with the property \(AA = A\). That is, when you multiply it by itself, it returns the original matrix.
  • A singular matrix is one that has a determinant of 0, while a nonsingular matrix has a determinant that is not 0.

The transpose of a matrix is another matrix in which the rows and columns have been switched: the rows of the first matrix become the columns of the second, and the columns of the first become the rows of the second.

  • The typical notation for the transpose of \(A\) is either \(A^T\) or \(A'\).

\[A = \begin{bmatrix} 1 & 3 & 0 \\ -1 & 6 & 2 \end{bmatrix}\]

\[A^T = \begin{bmatrix} 1 & -1 \\ 3 & 6 \\ 0 & 2 \end{bmatrix}\]
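
In code, the same transpose can be taken with NumPy's .T attribute (a minimal sketch added here, not part of the original text):

```python
import numpy as np

A = np.array([[1, 3, 0],
              [-1, 6, 2]])

# .T swaps rows and columns: the 2x3 matrix A becomes the 3x2 matrix A^T.
print(A.T)
```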

Matrix Addition and Subtraction

  • Given two matrices \(A\) and \(B\) of equal dimensions, the operation \(A + B\) will result in a matrix \(C\) with the same dimensions, where each element \(c_{i,j} = a_{i,j} + b_{i,j}\).

\[\begin{bmatrix} 1 & -2 \\ 0 & 5 \\ 4 & 3 \end{bmatrix} + \begin{bmatrix} 3 & 9 \\ -1 & 1 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 4 & 7 \\ -1 & 6 \\ 4 & 5 \end{bmatrix}\]

  • For subtraction, e.g., \(A - B\), each element \(c_{i,j}\) in \(C\) will equal \(a_{i,j} - b_{i,j}\).

\[\begin{bmatrix} 1 & -2 \\ 0 & 5 \\ 4 & 3 \end{bmatrix} - \begin{bmatrix} 3 & 9 \\ -1 & 1 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} -2 & -11 \\ 1 & 4 \\ 4 & 1 \end{bmatrix}\]

12.3.4.1 Scalar Multiplication

  • Multiply each individual element of the matrix by the scalar to find the product. Formally, \(C = rA\), where each \(c_{i,j} = r \times a_{i,j}\).

\[5 \times \begin{bmatrix} 1 & -2 \\ 0 & 5 \\ 4 & 3 \end{bmatrix} = \begin{bmatrix} 5 & -10 \\ 0 & 25 \\ 20 & 15 \end{bmatrix}\]
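
The matrix addition, subtraction, and scalar multiplication examples above can be reproduced with a short NumPy sketch (an illustration added here, not from the original text):

```python
import numpy as np

A = np.array([[1, -2], [0, 5], [4, 3]])
B = np.array([[3, 9], [-1, 1], [0, 2]])

print(A + B)   # element-wise sum, matching the addition example
print(A - B)   # element-wise difference, matching the subtraction example
print(5 * A)   # every element multiplied by the scalar 5
```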

12.3.4.2 Matrix Multiplication

  • In order to be able to multiply two matrices, the number of columns in the first must match the number of rows in the second matrix, e.g., \(A_{n \times m}\) and \(B_{m \times p}\): \(A_{n \times m}B_{m \times p} = C_{n \times p}\)

Example:

\[AB = \begin{bmatrix} 1 & -2 \\ 0 & 5 \\ 4 & 3 \end{bmatrix} \times \begin{bmatrix} 3 & 1 & 4 \\ -1 & 2 & 5 \end{bmatrix}\]

\[= \begin{bmatrix} (1 \times 3) + (-2 \times -1) & (1 \times 1) + (-2 \times 2) & (1 \times 4) + (-2 \times 5) \\ (0 \times 3) + (5 \times -1) & (0 \times 1) + (5 \times 2) & (0 \times 4) + (5 \times 5) \\ (4 \times 3) + (3 \times -1) & (4 \times 1) + (3 \times 2) & (4 \times 4) + (3 \times 5) \end{bmatrix}\]

\[= \begin{bmatrix} 3 + 2 & 1 - 4 & 4 - 10 \\ 0 - 5 & 0 + 10 & 0 + 25 \\ 12 - 3 & 4 + 6 & 16 + 15 \end{bmatrix}\]

\[= \begin{bmatrix} 5 & -3 & -6 \\ -5 & 10 & 25 \\ 9 & 10 & 31 \end{bmatrix}\]

Example:

\[BA = \begin{bmatrix} 3 & 1 & 4 \\ -1 & 2 & 5 \end{bmatrix} \times \begin{bmatrix} 1 & -2 \\ 0 & 5 \\ 4 & 3 \end{bmatrix}\]

\[= \begin{bmatrix} (3 \times 1) + (1 \times 0) + (4 \times 4) & (3 \times -2) + (1 \times 5) + (4 \times 3) \\ (-1 \times 1) + (2 \times 0) + (5 \times 4) & (-1 \times -2) + (2 \times 5) + (5 \times 3) \end{bmatrix}\]

\[= \begin{bmatrix} 3 + 0 + 16 & -6 + 5 + 12 \\ -1 + 0 + 20 & 2 + 10 + 15 \end{bmatrix}\]

\[= \begin{bmatrix} 19 & 11 \\ 19 & 27 \end{bmatrix}\]
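
A short NumPy sketch (an illustration added here, not from the original text) reproduces both products; note that \(AB\) is \(3 \times 3\) while \(BA\) is \(2 \times 2\), so the two products are not equal:

```python
import numpy as np

A = np.array([[1, -2], [0, 5], [4, 3]])   # 3 x 2
B = np.array([[3, 1, 4], [-1, 2, 5]])     # 2 x 3

print(A @ B)   # 3 x 3 product, matches AB above
print(B @ A)   # 2 x 2 product, matches BA above
```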

12.3.5 Trace

  • The trace of an \(n \times n\) square matrix is the sum of its diagonal elements. Formally, \(\text{Tr}(A) = \sum_{i=1}^n a_{ii} = a_{11} + a_{22} + \ldots + a_{nn}\).
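
As a quick illustration (the matrix below is arbitrary, chosen here only for the example and not taken from the text), np.trace sums the diagonal elements:

```python
import numpy as np

# An arbitrary 3x3 example matrix.
A = np.array([[1, -2, 0],
              [3,  5, 7],
              [4,  3, 2]])

print(np.trace(A))  # 1 + 5 + 2 = 8, the sum of the diagonal elements
```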

12.3.6 Determinant

  • The determinant of a matrix is a commonly used function that converts the matrix into a scalar. \(|A|\) is the determinant of matrix \(A\).

Consider the two-by-two matrix,

\[A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}\]

  • The determinant of \(A\) is the difference of the diagonal products:

\[|A| = (a_{11} \cdot a_{22}) - (a_{12} \cdot a_{21})\]

Consider the following three-by-three matrix:

\[B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix}\]

  • Define the minor of element \(b_{23}\), written \(M_{23}\), as the determinant of the submatrix formed by deleting the row and column containing \(b_{23}\) (row 2 and column 3). The minor of \(b_{23}\) is:

\[M_{23} = \begin{vmatrix} b_{11} & b_{12} \\ b_{31} & b_{32} \end{vmatrix} = (b_{11} \cdot b_{32}) - (b_{31} \cdot b_{12})\]
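
A small numeric sketch (the matrices below are arbitrary examples, not from the text) checks the \(2 \times 2\) determinant formula against np.linalg.det and builds the minor \(M_{23}\) by deleting row 2 and column 3:

```python
import numpy as np

# 2x2 determinant from the formula |A| = a11*a22 - a12*a21.
A = np.array([[2, 1],
              [6, 7]])
print(A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0])  # 8
print(np.linalg.det(A))                        # 8.0 (up to floating-point error)

# Minor M_23 of a 3x3 matrix: delete row 2 and column 3, then take the determinant.
B = np.array([[2, 1, 3],
              [0, 4, 5],
              [6, 7, 8]])
M23 = np.linalg.det(np.delete(np.delete(B, 1, axis=0), 2, axis=1))
print(M23)  # b11*b32 - b31*b12 = 2*7 - 6*1 = 8
```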

12.3.7 Inverse

  • An \(n \times n\) matrix, \(A\), is invertible (only square matrices can be inverted) if one can find a second \(n \times n\) matrix, \(B\), such that the product \(AB\) and the product \(BA\) both produce the \(n \times n\) identity matrix, \(I_{n \times n}\).
  • In such a situation, \(B\) is the inverse of \(A\). The inverse of a square matrix is the matrix that produces the identity matrix when it is multiplied by it on either the left or the right:

\[A \cdot B = B \cdot A = I\]

  • One denotes the inverse of a matrix by using a \(-1\) superscript. So \(A^{-1}\) is the inverse of \(A\):

\[A \cdot A^{-1} = A^{-1} \cdot A = I\]

  • A general way to compute a matrix inverse is given by the following formula, where \(C\) is the cofactor matrix of \(A\):

\[A^{-1} = \frac{1}{|A|}C^T\]

  • The transpose of the cofactor matrix, known as the adjoint matrix, is then multiplied by \(\frac{1}{|A|}\) to find \(A^{-1}\).

Example: Let's find the inverse of the following matrix in this manner:

\[A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 4 & 3 \\ -6 & -2 & 2 \end{bmatrix}\]

The determinant can be found by expanding along the first row, multiplying each element by its minor with alternating signs (the minors \(M_{1j}\) are computed below):

\[|A| = 1 \cdot M_{11} - 2 \cdot M_{12} + 1 \cdot M_{13} = 1(14) - 2(18) + 1(24) = 2\]

Now we need to construct the cofactor matrix. Thus we first have to find all the minors:

\[M_{11} = \begin{vmatrix} 4 & 3 \\ -2 & 2 \end{vmatrix} = (4 \cdot 2) - ((-2) \cdot 3) = 8 + 6 = 14\]

\[M_{12} = \begin{vmatrix} 0 & 3 \\ -6 & 2 \end{vmatrix} = (0 \cdot 2) - ((-6) \cdot 3) = 0 + 18 = 18\]

… and so forth, until we find that all the minors are

\[\begin{aligned} M_{11} &= 14, \\ M_{12} &= 18, \\ M_{13} &= 24, \\ M_{21} &= 6, \\ M_{22} &= 8, \\ M_{23} &= 10, \\ M_{31} &= 2, \\ M_{32} &= 3, \\ M_{33} &= 4. \end{aligned}\]

With this information we can now construct our cofactor matrix (remember to multiply each minor by \((-1)^{i+j}\)):

\[C = \begin{bmatrix} 14 & -18 & 24 \\ -6 & 8 & -10 \\ 2 & -3 & 4 \end{bmatrix}\]

Now we can transpose the cofactor matrix to find the adjoint matrix of \(A\):

\[\text{adj}(A) = \begin{bmatrix} 14 & -6 & 2 \\ -18 & 8 & -3 \\ 24 & -10 & 4 \end{bmatrix}\]

Finally, we multiply this matrix by \(\frac{1}{|A|}\) to find the inverse of \(A\):

\[A^{-1} = \frac{1}{|A|}\text{adj}(A) = \frac{1}{2}\begin{bmatrix} 14 & -6 & 2 \\ -18 & 8 & -3 \\ 24 & -10 & 4 \end{bmatrix} = \begin{bmatrix} 7 & -3 & 1 \\ -9 & 4 & -\frac{3}{2} \\ 12 & -5 & 2 \end{bmatrix}\]

Again, it’s good to check to see whether \(AA^{-1} = A^{-1}A = I\) (it does).
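
That check is easy to do in code (a minimal NumPy sketch added here, not part of the original text): np.linalg.inv returns the same inverse derived above, and np.allclose confirms that both products give the identity.

```python
import numpy as np

A = np.array([[ 1,  2, 1],
              [ 0,  4, 3],
              [-6, -2, 2]])

A_inv = np.linalg.inv(A)
print(A_inv)                              # matches the inverse derived above
print(np.allclose(A @ A_inv, np.eye(3)))  # True: A A^{-1} = I
print(np.allclose(A_inv @ A, np.eye(3)))  # True: A^{-1} A = I
```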

Matrix and Vector Properties:

Associative property \((AB)C = A(BC)\)
Additive distributive property \((A + B)C = AC + BC\)
Scalar commutative property \(xAB = (xA)B = A(xB) = ABx\)

12.4 PROPERTIES OF VECTORS AND MATRICES

  • \(AB \neq BA\) in general
  • \(IA = AI = A\)
  • \(AA^{-1} = A^{-1}A = I\)
  • \(\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a}\)
  • \(\text{tr}(A + B) = \text{tr}(A) + \text{tr}(B)\)
  • \(\text{tr}(A^T) = \text{tr}(A)\)
  • \(\text{tr}(AB) = \text{tr}(BA)\)

Matrix and Vector Transpose Properties

Double transpose \((A^T)^T = A\)
Additive property \((A + B)^T = A^T + B^T\)
Multiplicative property \((AB)^T = B^T A^T\)
Scalar multiplication \((cA)^T = cA^T\)
Inverse transpose \((A^{-1})^T = (A^T)^{-1}\)
If A is symmetric \(A^T = A\)

Matrix Determinant Properties

Transpose property \(\det(A) = \det(A^T)\)
Identity matrix \(\det(I) = 1\)
Multiplicative property \(\det(AB) = \det(A) \det(B)\)
Inverse property \(\det(A^{-1}) = \frac{1}{\det(A)}\)

Matrix Inverse Properties

Inverse \((A^{-1})^{-1} = A\)
Multiplicative property \((AB)^{-1} = B^{-1}A^{-1}\)
Scalar multiplication (\(n \times n\)) \((cA)^{-1} = c^{-1}A^{-1}\) if \(c \neq 0\)

12.5 MATRIX ILLUSTRATION OF OLS ESTIMATION

  • Consider the OLS regression equation \(\mathbf{y} = \alpha + \beta \mathbf{x} + \epsilon\), where \(\mathbf{x}\) and \(\mathbf{y}\) are vectors containing the values of the independent and dependent variables, respectively, for each observation;
  • \(\alpha\) is a scalar containing the \(y\)-intercept (i.e., the expected value of \(y\) when \(x = 0\));
  • \(\epsilon\) is a vector that holds the errors (i.e., the distance between the regression line and the value of \(\mathbf{y}\));
  • and \(\beta\) is a scalar holding the average change in \(\mathbf{y}\) given a one-unit increase in \(\mathbf{x}\).
  • We know the values of \(\mathbf{x}\) and \(\mathbf{y}\), and our problem is to “estimate” the values of \(\alpha\) and \(\beta\) that produce the regression line.
  • OLS regression proposes that the best fit is produced by selecting the values of \(\alpha\) and \(\beta\) that minimize the sum of squared errors \(\left( \sum_i \epsilon_i^2 \right)\).
  • You will likely learn that the OLS estimator is the best linear unbiased estimator (BLUE) in your statistics coursework, and we sketch a proof of this fact using matrix algebra in the next chapter.
  • It turns out that we can calculate a vector that contains the values of \(\alpha\) and \(\beta\), call it \(\hat{\beta}\), by using the equation:

\[\hat{\beta} = (X^TX)^{-1}X^T \mathbf{y}\]

Example:

  • Let’s call \(\mathbf{y}\) “size of government” and \(\mathbf{x}\) “per capita income.”
Table 12.5: Per capita income and size of government

State             Per Capita Income    % Gov't Employees
Alabama           $24,028              19.2
Florida           $30,446              14.5
Georgia           $29,442              16.4
Mississippi       $23,448              21.8
North Carolina    $28,235              17.3
South Carolina    $26,132              18.2
Tennessee         $28,455              15.5
  • This gives us the vector \(\mathbf{y} ^T = (19.2, 14.5, 16.4, 21.8, 17.3, 18.2, 15.5)\).
  • Because we are estimating both \(\alpha\) and \(\beta\), we need to add a column of 1s to the \(\mathbf{x}\) vector, producing the matrix \(X\).

\[ X = \begin{bmatrix} 1 & 24,028 \\ 1 & 30,446 \\ 1 & 29,442 \\ 1 & 23,448 \\ 1 & 28,235 \\ 1 & 26,132 \\ 1 & 28,445 \end{bmatrix} \]

  • We can now take the product of the matrices \(X^T\) and \(X\), yielding:

\[ X^TX = \begin{bmatrix} 7 & 190,356 \\ 190,356 & 5,218,840,922 \end{bmatrix} \]

  • The next step is to calculate the inverse of that matrix:

\[ (X^TX)^{-1} = \begin{bmatrix} 17.603 & -0.001 \\ -0.001 & 0.000 \end{bmatrix} \]

  • Let’s now calculate the product of \(X^T\) and \(\mathbf{y}\):

\[ X^T \mathbf{y} = \begin{bmatrix} 1 & 24,028 \\ 1 & 30,446 \\ 1 & 29,442 \\ 1 & 23,448 \\ 1 & 28,235 \\ 1 & 26,132 \\ 1 & 28,445 \end{bmatrix} \begin{bmatrix} 19.2 \\ 14.5 \\ 16.4 \\ 21.8 \\ 17.3 \\ 18.2 \\ 15.5 \end{bmatrix} = \begin{bmatrix} 122.9 \\ 3,301,785.2 \end{bmatrix} \]

  • That completed, we are now ready to calculate the OLS estimates of \(\alpha\) and \(\beta\) given the data in Table 12.5 and the equation \(\hat{\beta} = (X^TX)^{-1}X^T \mathbf{y}\):

\[ \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} 17.603 & -0.001 \\ -0.001 & 0.000 \end{bmatrix} \begin{bmatrix} 122.9 \\ 3,301,785.2 \end{bmatrix} = \begin{bmatrix} 50.228 \\ -0.079 \end{bmatrix} \]

  • We thus have an OLS estimate of 50.228 for \(\alpha\) (the intercept) and -0.079 for \(\beta\) (the slope).
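
As a check, here is a minimal NumPy sketch (an illustration added here, not part of the original text) that carries out \(\hat{\beta} = (X^TX)^{-1}X^T \mathbf{y}\) directly on the table's values; because the intermediate matrices displayed above are rounded for display, results computed this way can differ somewhat from the rounded figures shown.

```python
import numpy as np

# Per capita income (x) and % government employees (y) from Table 12.5.
x = np.array([24028, 30446, 29442, 23448, 28235, 26132, 28455], dtype=float)
y = np.array([19.2, 14.5, 16.4, 21.8, 17.3, 18.2, 15.5])

# Add the column of 1s so the first estimated coefficient is the intercept alpha.
X = np.column_stack([np.ones_like(x), x])

# beta_hat = (X^T X)^{-1} X^T y; np.linalg.solve is the numerically safer equivalent
# of explicitly inverting X^T X.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # [alpha_hat, beta_hat]
```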

Problems

Problem 1

Given the matrices:

\[ A = \begin{bmatrix} 2 & 8 \\ 3 & 0 \\ 5 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 0 \\ 3 & 8 \end{bmatrix} \]

  • Is \(AB\) defined?
  • Calculate \(AB\).
  • Can you calculate \(BA\)? Why or why not?

Problem 2

Find the inverse of the matrix \(B = \begin{bmatrix} 4 & 1 & -1 \\ 0 & 3 & 2 \\ 3 & 0 & 7 \end{bmatrix}\).

https://forms.gle/Uf3aQsEQPCiNX1EDA

Any Questions?
