2021 EE769 Tutorial Sheet 1

This document provides an introduction to basic mathematics concepts for machine learning, including vectors, matrices, functions, probability distributions, and gradient descent. It contains examples and exercises related to computing dot products, norms, determinants, eigenvalues, means, standard deviations, conditional probabilities, and applying concepts like continuity, convexity/concavity to functions.


EE 769 Introduction to Machine Learning

Sheet 1 — 2020-21-2
Basic Mathematics for ML

1. Vectors:
(a) Compute the dot product of the given two vectors:
(i) $a = [1, 2, 3]^T$, $b = [1, 0, 1]^T$
(ii) $a = [1, 1]^T$, $b = [1, -1]^T$
(b) Find the norm of the following vector:
$a = [1, 0, -1]^T$

(c) Find the cosine of the angle between the two vectors:
$a = [1, 1, 0]^T$, $b = [0, -1, 0]^T$

Hint: Get unit vectors in the direction of each of the two vectors, compute the dot
product of the two unit vectors. Draw it out.
(d) What is the projection of the first vector onto the second one?
$a = [1, 0]^T$, $b = [1, 1]^T$

Hint: Get a unit vector in the direction of the second, compute its dot product
with the first vector, multiply the dot product with the unit vector. Draw it out.
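The two hints above can be sketched in a few lines of plain Python. This is only an illustration that treats vectors as lists of floats; the helper names (dot, norm, unit, cosine, project) are made up for this sketch and are not part of the sheet.

```python
import math

def dot(u, v):
    """Dot product: sum of element-wise products."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    """Euclidean norm, i.e. the square root of the dot product of u with itself."""
    return math.sqrt(dot(u, u))

def unit(u):
    """Unit vector in the direction of u (u must be non-zero)."""
    n = norm(u)
    return [ui / n for ui in u]

def cosine(u, v):
    """Cosine of the angle between u and v: dot product of their unit vectors."""
    return dot(unit(u), unit(v))

def project(u, v):
    """Projection of u onto v: (u . v_hat) * v_hat, where v_hat is the unit vector of v."""
    v_hat = unit(v)
    scale = dot(u, v_hat)
    return [scale * vi for vi in v_hat]

# Vectors from parts (c) and (d) of the sheet.
cos_c = cosine([1.0, 1.0, 0.0], [0.0, -1.0, 0.0])
proj_d = project([1.0, 0.0], [1.0, 1.0])
print(cos_c)
print(proj_d)
```

Drawing the vectors by hand, as the hints suggest, is still the best check on the sign of the cosine and the direction of the projection.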

Department of Electrical Engineering, Indian Institute of Technology Bombay Page 1 of 4


Amit Sethi asethi@iitb.ac.in

2. Matrices:
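(a) Compute the following matrix-vector products, if they can be computed.
(i) $\begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$
(ii) $\begin{bmatrix} 1 & 2 \\ 4 & 5 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 5 \\ 6 \end{bmatrix}$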
(b) Compute the determinant of the following matrix:
$\begin{bmatrix} 0 & 1 \\ -2 & -3 \end{bmatrix}$
(c) Check whether the vector $[1, -1]^T$ is an eigenvector of the matrix given in part (b) and, if so,
find the corresponding eigenvalue.
(d) Confirm that the following eigenvalue decomposition is valid for the given matrix
$\begin{bmatrix} 5 & 3 \\ 3 & 5 \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 8 \end{bmatrix} \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}$
by checking that the eigenvectors are of norm 1, that the last matrix is the inverse of the first matrix, and that
the second matrix is a diagonal matrix. Also confirm that $A v_i = \lambda_i v_i$ for both $i$.
(e) Find the rank of the following matrices.
(i) $\begin{bmatrix} 0 & -1 & 5 \\ 2 & 4 & -6 \\ 1 & 1 & 5 \end{bmatrix}$
(ii) $\begin{bmatrix} -5 & -7 \\ 5 & 7 \end{bmatrix}$
(f) Compute the trace of the matrices above (in part e).
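The checks asked for in part (d), and the rank and trace of matrix (e)(ii), can be cross-checked numerically. The sketch below assumes NumPy is available (the sheet itself does not mention it); the variable names are illustrative only.

```python
import numpy as np

# Matrices from part (d): A = V D V_inv
A = np.array([[5.0, 3.0],
              [3.0, 5.0]])
s = 1.0 / np.sqrt(2.0)
V = np.array([[ s, s],
              [-s, s]])          # columns are the candidate eigenvectors
D = np.diag([2.0, 8.0])          # candidate eigenvalues on the diagonal
V_inv = np.array([[s, -s],
                  [s,  s]])

# Each column of V should have norm 1.
columns_unit_norm = np.allclose(np.linalg.norm(V, axis=0), 1.0)
# The last matrix should be the inverse of the first.
is_inverse = np.allclose(V @ V_inv, np.eye(2))
# The product should reproduce A.
reconstructs = np.allclose(V @ D @ V_inv, A)
# A v_i = lambda_i v_i for both i.
eigen_ok = all(np.allclose(A @ V[:, i], D[i, i] * V[:, i]) for i in range(2))
print(columns_unit_norm, is_inverse, reconstructs, eigen_ok)

# Matrix (e)(ii): rank and trace.
B = np.array([[-5, -7],
              [ 5,  7]])
rank_B = np.linalg.matrix_rank(B)
trace_B = np.trace(B)
print(rank_B, trace_B)
```

A numerical check like this only confirms a hand computation; it does not replace working through the definitions.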


3. Functions: For each of the following functions, determine whether it is continuous, has a
finite derivative everywhere, has a sub-derivative that exists everywhere (the limits of the
derivative from both sides exist, and left limit ≤ right limit), has a global maximum
(confirm concavity), has a global minimum (confirm convexity), has a local maximum (if
so, at what value of x), or has a local minimum (if so, at what value of x).
Make rough drawings to clarify the concepts.
(a) $x^2 - 2x + 4$
(b) $-x^2 - 2x + 4$
(c) $x^3 - 9x$
(d) $-e^{-x^2}$
(e) $\sqrt{|x|}$
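Rough drawings can be complemented by a numerical probe. The sketch below estimates the second derivative by central differences on a grid to test for convexity or concavity, and scans the grid for local minima. The interval [-5, 5], the grid size, the step h and the tolerance are arbitrary assumptions, and a grid test can only suggest an answer, never prove one.

```python
import math

def second_difference(f, x, h=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def grid(lo, hi, n):
    """n evenly spaced points from lo to hi inclusive."""
    return [lo + i * (hi - lo) / (n - 1) for i in range(n)]

def looks_convex(f, lo=-5.0, hi=5.0, n=201, tol=1e-6):
    """True if the estimated second derivative is non-negative on the whole grid."""
    return all(second_difference(f, x) >= -tol for x in grid(lo, hi, n))

def looks_concave(f, lo=-5.0, hi=5.0, n=201, tol=1e-6):
    """True if -f looks convex on the grid."""
    return looks_convex(lambda x: -f(x), lo, hi, n, tol)

def local_minima(f, lo=-5.0, hi=5.0, n=201):
    """Interior grid points whose value is below both neighbours."""
    xs = grid(lo, hi, n)
    ys = [f(x) for x in xs]
    return [xs[i] for i in range(1, n - 1)
            if ys[i] < ys[i - 1] and ys[i] < ys[i + 1]]

def quadratic(x):   # function (a)
    return x**2 - 2*x + 4

def cubic(x):       # function (c)
    return x**3 - 9*x

print(looks_convex(quadratic))
print(looks_convex(cubic), looks_concave(cubic))
print(local_minima(cubic))
```

Points where a function is not differentiable, such as x = 0 in part (e), need separate care, since a finite difference across the kink is not a meaningful derivative estimate.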

4. Python: Set up an ipython notebook in Google CoLab (https://colab.research.google.com/),
import the pandas library by typing: import pandas as pd, and perform
the following operations:
(a) Read the "california_housing_train.csv" file from the sample data folder of the colab
environment into a pandas dataframe.
hint: Read the csv file using the method: df = pd.read_csv("./location/of/file.csv")
(b) Print the dataframe and extract the 'median_income' and 'population' columns
from the dataframe.
hint: Try df.head() and df[['column_name1', 'column_name2']]
(c) Compute the mean and standard deviation of the 'median_income' and 'population'
columns.
hint: Try df['column_name'].mean() and df['column_name'].std()
(d) Create a new dataframe with zero-mean, unit-standard-deviation versions of
these columns.
hint: new_df['new_column_name(s)'] = operation to be performed
(e) Write this new dataframe into a new csv file.
hint: Try new_df.to_csv("./location/to/save/file.csv", index=False)
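Steps (c) to (e) can be sketched as follows. The toy dataframe holds made-up numbers so that the snippet runs without the CSV file. The helper name standardize, the "_z" column suffix, the "sample_data/" path and the output file name are all assumptions for illustration.

```python
import pandas as pd

def standardize(df, columns):
    """Return a new dataframe with each listed column shifted to zero mean
    and scaled to unit standard deviation (pandas' default sample std)."""
    return pd.DataFrame({
        f"{col}_z": (df[col] - df[col].mean()) / df[col].std()
        for col in columns
    })

# Toy stand-in with made-up values; in Colab one would instead read the file,
# e.g. (path is an assumption):
# df = pd.read_csv("sample_data/california_housing_train.csv")
df = pd.DataFrame({
    "median_income": [1.5, 3.0, 4.5, 6.0],
    "population": [100.0, 300.0, 200.0, 400.0],
})

print(df[["median_income", "population"]].mean())
print(df[["median_income", "population"]].std())

new_df = standardize(df, ["median_income", "population"])
print(new_df.head())

# Step (e), with a hypothetical output name:
# new_df.to_csv("standardized.csv", index=False)
```

Note that pandas' std() uses the sample standard deviation (ddof=1) by default, so the new columns have unit standard deviation under that same convention.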


5. Gradient descent or ascent, and Lagrange multiplier:

(a) For $f(x) = x^3 - 9x$ at $x = -1$, a small positive step is taken (i.e., $x \leftarrow x + \epsilon$, $\epsilon > 0$).
Will such a step lead us closer to a local maximum or a local minimum?
(b) For the same function $f(x) = x^3 - 9x$ at $x = -1$, a gradient ascent update is
performed using $x \leftarrow x + \eta f'(x)$, $\eta = 1$. Is such a value of $\eta$ desirable?
(c) For the same function $f(x) = x^3 - 9x$ at $x = -1$, compute the optimal update step
that should be added to $x$ as per Newton's method and determine if it will reach
the local maximum. If not, then why not?
(d) For the following function, find the expression for the gradient: $f(x) = 3x_1^2 + 2x_2 + 5x_3^3 + 4x_1 x_2$,
where $x = [x_1 \; x_2 \; x_3]^T$.
(e) Find the minimum of the function $f(x) = x_1^2 + x_2^2$ subject to the condition
$g(x) = (x_1 - 1)^2 + (x_2 - 1)^2 - 1 = 0$.
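The updates in parts (b) and (c) can be traced numerically. The sketch below hard-codes the first and second derivatives of the cubic from parts (a) to (c), applies one gradient ascent step with step size 1 and one Newton step from x = -1, and then repeats the Newton step a few times. The function names and the choice of six iterations are assumptions for illustration.

```python
import math

def f_prime(x):
    """First derivative of f(x) = x^3 - 9x."""
    return 3.0 * x**2 - 9.0

def f_second(x):
    """Second derivative of f(x) = x^3 - 9x."""
    return 6.0 * x

x0 = -1.0

# Part (b): one gradient ascent update, x <- x + eta * f'(x), with eta = 1.
eta = 1.0
x_ascent = x0 + eta * f_prime(x0)

# Part (c): one Newton update, x <- x - f'(x) / f''(x).
newton_step = -f_prime(x0) / f_second(x0)
x_newton = x0 + newton_step

# Repeating the Newton update shows where the iteration settles.
x = x0
for _ in range(6):
    x = x - f_prime(x) / f_second(x)

print(x_ascent, x_newton, x)
print(-math.sqrt(3.0))  # stationary point with f'' < 0, for comparison
```

Comparing x_ascent and x_newton with the stationary point printed on the last line gives a concrete basis for answering (b) and (c).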

6. Probability distributions:
(a) What is the probability mass function of a random variable x that represents the
total number of heads if a fair coin is tossed two times?
(b) Between a fair coin and a biased coin, whose number of heads in two tosses has
higher entropy?
(c) If we map the number of heads in three tosses of a fair coin to variable x, and the
maximum number of consecutive heads in those three tosses to variable y, then
what is their joint probability mass function? Write this as a 2-dimensional table.
(d) For the previous question, compute the marginal distributions of x and y using the
joint PMF table.
(e) What is the conditional PMF of x given y = 1 for part (c)?
(f) Two Gaussian random variables (continuous, of course) x and y both have mean
$\mu_x = \mu_y = 0$, but $\sigma_x = 1$ while $\sigma_y = 2$. Which of them is more likely to have an
absolute value greater than 3? Use the following approach:
(i) Using the formula for a Gaussian PDF, draw two overlapping graphs in Python
using the matplotlib library, one for the PDF of x, and another for y.
(ii) Observe where the two PDFs cross each other to answer part (f) qualitatively.
(g) Assume that the slant of the eyes and the size of the nose of an animal are captured as a
two-dimensional feature vector. For cats, this vector has a Gaussian distribution
with $\mu_{\text{cat}} = [1, 0]^T$ and $\Sigma_{\text{cat}} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. For dogs, the distribution is also Gaussian,
but with $\mu_{\text{dog}} = [0, 1]^T$ and $\Sigma_{\text{dog}} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$. Given that an individual animal has the
slant and nose size $[0.5, 0]^T$, is it more likely to be a cat or a dog, judging
simply by comparing the PDF values of cats and dogs at that point?
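Two of the ideas in this question can be sketched with the Python standard library alone. The first part enumerates all outcomes of three fair tosses to build the joint PMF of parts (c) to (e) with exact fractions. The second part gives a univariate Gaussian PDF and a two-sided tail probability that may help with part (f). All helper and variable names are illustrative assumptions.

```python
import math
from collections import Counter
from fractions import Fraction
from itertools import product

def longest_head_run(tosses):
    """Length of the longest run of consecutive heads ('H') in a sequence."""
    best = run = 0
    for t in tosses:
        run = run + 1 if t == "H" else 0
        best = max(best, run)
    return best

# Joint PMF of x = number of heads and y = longest run of heads.
outcomes = list(product("HT", repeat=3))
p_outcome = Fraction(1, len(outcomes))   # fair coin: all outcomes equally likely

joint = Counter()
for o in outcomes:
    joint[(o.count("H"), longest_head_run(o))] += p_outcome

# Marginals: sum the joint PMF over the other variable.
marginal_x, marginal_y = Counter(), Counter()
for (x, y), pr in joint.items():
    marginal_x[x] += pr
    marginal_y[y] += pr

# Conditional PMF of x given y = 1: joint divided by the marginal of y at 1.
cond_x_given_y1 = {x: pr / marginal_y[1]
                   for (x, y), pr in joint.items() if y == 1}

def gaussian_pdf(x, mu, sigma):
    """Density of a univariate Gaussian with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def two_sided_tail(threshold, sigma):
    """P(|X| > threshold) for a zero-mean Gaussian with std sigma."""
    return math.erfc(threshold / (sigma * math.sqrt(2.0)))

print(dict(joint))
print(cond_x_given_y1)
print(two_sided_tail(3.0, 1.0), two_sided_tail(3.0, 2.0))
```

For part (g), note that with an identity covariance matrix the two-dimensional Gaussian density factorises into a product of two unit-variance univariate densities, so gaussian_pdf can be reused coordinate by coordinate.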
