Linear Algebra for Machine Learning-Part 1

Aditya Raj
7 min readAug 29, 2021

Linear algebra is one of the most important fields of mathematics: it solves great technological problems on its own, and it also acts as the backbone of other sought-after mathematical subjects such as multivariate calculus and probability.

It involves many great concepts, theories, and techniques, but here we will learn, from the very basics, the concepts of linear algebra required as prerequisites for machine learning. This is the first part of the series.

Part 1 will cover basic concepts of linear algebra such as vectors, matrices, their operations, forms, and conversions, plus some prerequisites for the advanced linear algebra needed for machine learning.

I will assume that readers are familiar with basic high school mathematics.

Let's start with vectors. A vector can be treated as a list of numbers, a point in space, or an object with both magnitude and direction. We will cover the geometrical interpretation of linear algebra in the next part; for now, let's define a vector formally.

A vector V can be defined as an element of an n-dimensional coordinate system, i.e. V ∈ R^n.

So for the 2D plane, [x, y] is the general form of a vector; for 3D it's [x, y, z], and so on for higher dimensions.

Transpose of a vector (V^T) = a vector is represented in column form, and its transpose is the row representation of the same vector.

vector transpose
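A quick NumPy sketch of the same idea (NumPy is not used in the article itself; the names here are illustrative):

```python
import numpy as np

# A vector in R^3, written as a column (3x1) so the transpose is visible.
v = np.array([[1], [2], [3]])   # column form, shape (3, 1)
v_T = v.T                       # row form, shape (1, 3)

print(v.shape)    # (3, 1)
print(v_T.shape)  # (1, 3)
```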

A matrix M can be defined as an element M ∈ R^(m×n), i.e. a collection of m n-dimensional vectors taken as the rows of a matrix of dimension m×n.

matrix of dim mxn

Transpose of a matrix:-

matrix transpose
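The same in NumPy (an illustrative sketch, not code from the article):

```python
import numpy as np

# A 2x3 matrix: 2 row vectors, each in R^3.
M = np.array([[1, 2, 3],
              [4, 5, 6]])
print(M.shape)    # (2, 3)
print(M.T.shape)  # (3, 2)
# Transposing twice returns the original matrix.
print(np.array_equal(M.T.T, M))  # True
```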

We can consider a vector as a one-dimensional array and a matrix as a two-dimensional array.

In machine learning we may need multidimensional arrays of data to perform computations and operations; such an array is referred to as a tensor.

A tensor is the generalization of vectors and matrices, most commonly understood as a multidimensional array.

A vector is a 1st-order (one-dimensional) tensor, while a matrix is a 2nd-order (2D) tensor.

elements in linear algebra

We will now study the properties and operations of 1D (vector) and 2D (matrix) tensors; these generalize to all higher orders.

multi-dimensional tensor
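In NumPy terms, the order of a tensor corresponds to the number of array dimensions (a sketch with made-up shapes):

```python
import numpy as np

vector = np.array([1.0, 2.0, 3.0])   # 1st-order tensor
matrix = np.zeros((2, 3))            # 2nd-order tensor
tensor3 = np.zeros((2, 3, 4))        # 3rd-order tensor, e.g. a stack of matrices

print(vector.ndim, matrix.ndim, tensor3.ndim)  # 1 2 3
```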

Operations on vectors

Operations with a scalar:- any operation with a scalar is performed with every element of the vector.

Addition and subtraction:- vectors of the same dimension are added and subtracted element-wise.

Dot product/inner product:- the dot product and the inner product are two different things in general mathematics, but they coincide in our context. It is the sum of the element-wise products of two vectors of the same dimension.

dot product of vectors
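These vector operations can be sketched in NumPy (the vectors here are made up for illustration):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

print(2 * a)         # scalar multiplication: [2. 4. 6.]
print(a + 1)         # scalar addition applied to every element: [2. 3. 4.]
print(a + b)         # element-wise addition: [5. 7. 9.]
print(a - b)         # element-wise subtraction: [-3. -3. -3.]
print(np.dot(a, b))  # dot product: 1*4 + 2*5 + 3*6 = 32.0
```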

Outer product:- the element-wise scaling of one vector by another, which results in a matrix, is called the outer product.

outer product of vectors
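A small sketch of the outer product, where each element of the first vector scales the whole second vector:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5])

# Row i of the result is a[i] * b, giving a 3x2 matrix.
print(np.outer(a, b))
# [[ 4  5]
#  [ 8 10]
#  [12 15]]
```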

Operations on matrices

Matrix-matrix multiplication:- it can be regarded as taking the inner product of every row of the 1st matrix with every column of the 2nd matrix.

inner product explanation of matrix multiplication

It can also be defined as the sum of the outer products of each column of matrix 1 with the corresponding row of matrix 2.

outer product intuition of multiplication

Both intuitions of matrix multiplication give the same result, as the mathematics requires.
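A sketch verifying both views on a made-up 2×2 example:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Inner-product view: entry (i, j) is row i of A dotted with column j of B.
inner_view = np.array([[np.dot(A[i, :], B[:, j]) for j in range(2)]
                       for i in range(2)])

# Outer-product view: sum over k of (column k of A) outer (row k of B).
outer_view = sum(np.outer(A[:, k], B[k, :]) for k in range(2))

print(np.array_equal(inner_view, A @ B))  # True
print(np.array_equal(outer_view, A @ B))  # True
```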

Row reduction/Gaussian elimination:- certain operations can be performed on the rows of a matrix that change the matrix itself but do not change the data or the system of equations it represents. These techniques are generally used to solve systems of linear equations and to find the inverse of a matrix.

Types and representations of matrices

Diagonal matrix = A matrix in which all elements except those on the diagonal are 0 is called a diagonal matrix.

diagonal matrix

Lower triangular and upper triangular matrix = A matrix with all elements below the diagonal being 0 is called upper triangular, and one with all elements above the diagonal being 0 is called lower triangular.

Identity matrix = A diagonal matrix with all diagonal elements equal to one, having the property A·I = A, is called the identity matrix.

Identity matrix
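The defining property A·I = A can be checked in NumPy (the matrix is made up for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
I = np.eye(2)  # 2x2 identity matrix

print(np.array_equal(A @ I, A))  # True
```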

Symmetric matrix = A matrix whose transpose is the same as the original matrix.

Skew-symmetric matrix = A matrix whose transpose is equal to the negative of the original matrix.
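Both properties are easy to check with a transpose (example matrices made up for illustration):

```python
import numpy as np

S = np.array([[1, 2],
              [2, 3]])   # symmetric: S.T equals S
K = np.array([[0, -2],
              [2,  0]])  # skew-symmetric: K.T equals -K

print(np.array_equal(S.T, S))   # True
print(np.array_equal(K.T, -K))  # True
```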

Row echelon form = A matrix which satisfies the following properties is said to be in row echelon form.

  • The first non-zero number from the left in each row (the "leading coefficient") is always to the right of the leading coefficient of the row above.
  • Rows consisting entirely of zeros are at the bottom of the matrix.

Reduced row echelon form = A matrix which satisfies the following properties is said to be in reduced row echelon form.

  • The first non-zero number in each non-zero row (the leading entry) is 1.
  • Each leading 1 is further to the right than the leading 1 in the row above.
  • The leading entry in each row is the only non-zero number in its column.
  • Any zero rows are placed at the bottom of the matrix.

A matrix can be converted to row echelon or reduced row echelon form using the row reduction method discussed above.
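The row reduction just described can be sketched as a small routine. This is a simplified implementation with partial pivoting, written for illustration rather than numerical robustness:

```python
import numpy as np

def rref(M, tol=1e-12):
    """Reduce a matrix to reduced row echelon form via Gaussian elimination."""
    A = M.astype(float).copy()
    rows, cols = A.shape
    pivot_row = 0
    for col in range(cols):
        if pivot_row >= rows:
            break
        # Pick the row with the largest entry in this column as the pivot.
        pivot = pivot_row + np.argmax(np.abs(A[pivot_row:, col]))
        if abs(A[pivot, col]) < tol:
            continue  # no pivot in this column
        A[[pivot_row, pivot]] = A[[pivot, pivot_row]]  # swap rows
        A[pivot_row] /= A[pivot_row, col]              # scale leading entry to 1
        for r in range(rows):                          # clear the rest of the column
            if r != pivot_row:
                A[r] -= A[r, col] * A[pivot_row]
        pivot_row += 1
    return A

M = np.array([[1, 2, 3],
              [4, 5, 6]])
print(rref(M))
# [[ 1.  0. -1.]
#  [ 0.  1.  2.]]
```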

System of linear equations

We all know about linear equations:

a0 + a1x1 + … + anxn = c

Below is a system of m linear equations in n variables.

m linear equations of n variables

If all the bi's (the constant terms on the right-hand side) are 0, it is called a homogeneous system of linear equations.

homogeneous system of linear equation

Solving a system of linear equations

A system of linear equations can be solved by converting its augmented matrix into row echelon or reduced row echelon form and then substituting back.
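In practice a library routine usually performs the elimination for us. A minimal sketch with NumPy's solver, on a made-up 2×2 system:

```python
import numpy as np

# Solve:  x + 2y = 5
#        3x + 4y = 11
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 11.0])

x = np.linalg.solve(A, b)
print(x)  # [1. 2.]
```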

Types of Solutions

There are three types of solutions possible when solving a system of linear equations.

Independent

  • Consistent
  • Unique solution
  • The row-reduced matrix has the same number of non-zero rows as variables
  • The left-hand side is usually the identity matrix, but not necessarily
  • There must be at least as many equations as variables to get an independent solution.

When you convert the augmented matrix back into equation form, you get x = 3, y = 1, and z = 2.

Dependent

  • Consistent
  • Infinitely many solutions
  • Write the answer in parametric form
  • The row-reduced matrix has more variables than non-zero rows
  • There doesn't have to be a row of zeros, but there usually is.
  • This can also happen when there are fewer equations than variables.

The first equation will be x + 3z = 4. Solving for x gives x = 4 − 3z.

The second equation will be y − 2z = 3. Solving for y gives y = 3 + 2z.

The z column is not cleared out (all zeros except for one number), so the other variables will be defined in terms of z. Therefore z becomes the parameter t, and the solution is x = 4 − 3t, y = 3 + 2t, z = t. Since t can take any value, there are infinitely many solutions; put simply, there are fewer equations than variables, so infinitely many values are possible for at least one variable.

Inconsistent

  • No solution
  • The row-reduced matrix has a row of zeros on the left side, but the right-hand side of that row isn't zero.

There is no solution here. You can write that as the null set Ø, the empty set {}, or simply "no solution".

It indicates that one of the equations contradicts the others, so the system is inconsistent.

Inverse of matrix

The inverse of a matrix A is a matrix that, when multiplied by A, results in the identity matrix. The notation for this inverse matrix is A^−1.

Matrices that have an inverse are called invertible; the others are called singular. That is, a matrix A for which A·A^−1 = I exists is invertible.

The inverse of a matrix can be found using the following steps:-

  1. Write the augmented matrix consisting of the original matrix on the left and the identity matrix on the right.
  2. Perform Gaussian elimination so that the original matrix on the left is converted to the identity.
  3. The matrix obtained on the right side (where the identity started) after these operations is the inverse of the original matrix.
finding inverse of invertible matrix
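The three steps above can be sketched as a short routine and checked against NumPy's built-in inverse (a simplified illustration assuming the matrix is invertible):

```python
import numpy as np

def inverse_by_elimination(A):
    """Invert A by row-reducing the augmented matrix [A | I] to [I | A^-1]."""
    n = A.shape[0]
    aug = np.hstack([A.astype(float), np.eye(n)])  # step 1: [A | I]
    for col in range(n):                           # step 2: Gaussian elimination
        pivot = col + np.argmax(np.abs(aug[col:, col]))
        aug[[col, pivot]] = aug[[pivot, col]]      # partial pivoting
        aug[col] /= aug[col, col]                  # scale pivot row
        for r in range(n):
            if r != col:
                aug[r] -= aug[r, col] * aug[col]   # clear the column
    return aug[:, n:]                              # step 3: right half is A^-1

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
print(inverse_by_elimination(A))
print(np.linalg.inv(A))  # same result
```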

This was all for Linear Algebra for Machine Learning - Part 1.

In case of any doubt or suggestion, ping me on +918292098293, and please follow me on Twitter https://twitter.com/AdityaR71244890?s=08 and LinkedIn https://www.linkedin.com/in/aditya-raj-553322197

In the next part we will study eigenvalues, eigenvectors, PCA, singular value decomposition, projection of matrices, and geometrical intuition.

Thanks for reading and please share and support.
