

CS229 Problem Set 0 1

CS 229, Summer 2020


Problem Set 0: Linear Algebra, Multivariable Calculus, and Probability

Due Wednesday, June 29 at 11:59 pm on Gradescope.


Notes:
(1) These questions require thought, but do not require long answers. Please be as concise as
possible.
(2) If you have a question about this homework, we encourage you to post your question on our
Piazza forum at https://piazza.com/stanford/summer2020/cs229.
(3) If you missed the first lecture or are unfamiliar with the collaboration or honor code policy,
please read the policy before you start.
(4) This specific homework is not graded, but we encourage you to solve each of the problems to
brush up on your linear algebra and probability. Some of them may even be useful for subsequent
problem sets. It also serves as your introduction to using Gradescope for submissions. We
strongly suggest you use LaTeX to submit your psets (not only is it helpful for this class, but it
is a good skill to learn). However, if you are scanning your document by cellphone, please use a
scanning app such as CamScanner. There will not be any late days allowed for this particular
assignment.
Honor code: We strongly encourage students to form study groups. Students may discuss and
work on homework problems in groups. However, each student must write down the solutions
independently, and without referring to written notes from the joint session. That being said,
if students are submitting in a pair, they act as one unit - they may share resources (such as
notes) with each other and write the solutions together. Note that both of the two students
should fully understand all the answers in their submission, even though only one of them needs
to write up a solution to a question. In other words, each student must understand the solution
well enough in order to reconstruct it by him/herself. In addition, each student should write on
the problem set the set of people with whom s/he collaborated. Further, because we occasionally
reuse problem set questions from previous years, we expect students not to copy, refer to, or
look at the solutions in preparing their answers. It is an honor code violation to intentionally
refer to a previous year’s solutions.

1. [0 points] Gradients and Hessians


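Recall that a matrix $A \in \mathbb{R}^{n \times n}$ is symmetric if $A^T = A$, that is, $A_{ij} = A_{ji}$ for all $i, j$. Also
recall the gradient $\nabla f(x)$ of a function $f : \mathbb{R}^n \to \mathbb{R}$, which is the $n$-vector of partial derivatives
\[
\nabla f(x) =
\begin{bmatrix}
\frac{\partial}{\partial x_1} f(x) \\
\vdots \\
\frac{\partial}{\partial x_n} f(x)
\end{bmatrix}
\quad \text{where} \quad
x =
\begin{bmatrix}
x_1 \\
\vdots \\
x_n
\end{bmatrix}.
\]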
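As a way to sanity-check a hand-derived gradient against the definition above, here is a minimal sketch that approximates each partial derivative with central differences. The helper `numerical_gradient` and the test function `example_f` are hypothetical names chosen for illustration; they are not part of the problem set.

```python
def numerical_gradient(f, x, eps=1e-6):
    """Approximate the gradient of f at x, one coordinate at a time."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        # Central difference for the i-th partial derivative.
        grad.append((f(x_plus) - f(x_minus)) / (2 * eps))
    return grad


def example_f(x):
    # Illustrative function (an assumption): f(x) = x1^2 + 3 * x1 * x2
    return x[0] ** 2 + 3 * x[0] * x[1]


approx = numerical_gradient(example_f, [1.0, 2.0])
# The analytic gradient is (2*x1 + 3*x2, 3*x1), which equals (8, 3) at x = (1, 2).
```

Comparing `approx` with a closed-form answer is a quick way to catch sign or transpose mistakes.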

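The Hessian $\nabla^2 f(x)$ of a function $f : \mathbb{R}^n \to \mathbb{R}$ is the $n \times n$ symmetric matrix of second-order
partial derivatives,
\[
\nabla^2 f(x) =
\begin{bmatrix}
\frac{\partial^2}{\partial x_1^2} f(x) & \frac{\partial^2}{\partial x_1 \partial x_2} f(x) & \cdots & \frac{\partial^2}{\partial x_1 \partial x_n} f(x) \\
\frac{\partial^2}{\partial x_2 \partial x_1} f(x) & \frac{\partial^2}{\partial x_2^2} f(x) & \cdots & \frac{\partial^2}{\partial x_2 \partial x_n} f(x) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2}{\partial x_n \partial x_1} f(x) & \frac{\partial^2}{\partial x_n \partial x_2} f(x) & \cdots & \frac{\partial^2}{\partial x_n^2} f(x)
\end{bmatrix}.
\]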
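The Hessian definition above can be checked numerically in the same spirit. The sketch below estimates every second partial derivative with a four-point finite difference. The names `numerical_hessian` and `cubic_f` are hypothetical, and the example function is an assumption chosen so that the exact Hessian is easy to write down.

```python
def numerical_hessian(f, x, eps=1e-4):
    """Approximate the n x n matrix of second partial derivatives of f at x."""
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate f after moving coordinate i by di and coordinate j by dj.
                y = list(x)
                y[i] += di
                y[j] += dj
                return f(y)

            hess[i][j] = (
                shifted(eps, eps) - shifted(eps, -eps)
                - shifted(-eps, eps) + shifted(-eps, -eps)
            ) / (4 * eps ** 2)
    return hess


def cubic_f(x):
    # Illustrative function (an assumption): f(x) = x1^3 + x1 * x2^2
    return x[0] ** 3 + x[0] * x[1] ** 2


H = numerical_hessian(cubic_f, [1.0, 2.0])
# The exact Hessian is [[6*x1, 2*x2], [2*x2, 2*x1]] = [[6, 4], [4, 2]] at x = (1, 2).
```

The off-diagonal entries `H[0][1]` and `H[1][0]` agree up to rounding, which matches the symmetry stated in the definition.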

(a) Let $f(x) = \frac{1}{2} x^T A x + b^T x$, where $A$ is a symmetric matrix and $b \in \mathbb{R}^n$ is a vector. What
is $\nabla f(x)$?
(b) Let $f(x) = g(h(x))$, where $g : \mathbb{R} \to \mathbb{R}$ is differentiable and $h : \mathbb{R}^n \to \mathbb{R}$ is differentiable.
What is $\nabla f(x)$?
(c) Let $f(x) = \frac{1}{2} x^T A x + b^T x$, where $A$ is symmetric and $b \in \mathbb{R}^n$ is a vector. What is $\nabla^2 f(x)$?
(d) Let $f(x) = g(a^T x)$, where $g : \mathbb{R} \to \mathbb{R}$ is continuously differentiable and $a \in \mathbb{R}^n$ is a vector.
What are $\nabla f(x)$ and $\nabla^2 f(x)$? (Hint: your expression for $\nabla^2 f(x)$ may have as few as 11
symbols, including $'$ and parentheses.)

2. [0 points] Positive definite matrices


A matrix $A \in \mathbb{R}^{n \times n}$ is positive semi-definite (PSD), denoted $A \succeq 0$, if $A = A^T$ and $x^T A x \ge 0$
for all $x \in \mathbb{R}^n$. A matrix $A$ is positive definite, denoted $A \succ 0$, if $A = A^T$ and $x^T A x > 0$ for
all $x \neq 0$, that is, all non-zero vectors $x$. The simplest example of a positive definite matrix is
the identity $I$ (the diagonal matrix with 1s on the diagonal and 0s elsewhere), which satisfies
$x^T I x = \|x\|_2^2 = \sum_{i=1}^n x_i^2$.
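To make the quadratic form in these definitions concrete, the following sketch evaluates $x^T A x$ directly from its double-sum expansion. The helper `quadratic_form` and both example matrices are assumptions made for illustration. The second matrix is symmetric yet not PSD, which shows that symmetry alone does not imply $x^T A x \ge 0$.

```python
def quadratic_form(A, x):
    """Return x^T A x as the double sum over A[i][j] * x[i] * x[j]."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x = [1.0, -2.0, 3.0]
value = quadratic_form(identity, x)      # 1 + 4 + 9 = 14
squared_norm = sum(xi ** 2 for xi in x)  # also 14

# A symmetric matrix that is not PSD: one direction gives a negative value.
indefinite = [[1.0, 2.0], [2.0, 1.0]]
negative_value = quadratic_form(indefinite, [1.0, -1.0])  # 1 - 2 - 2 + 1 = -2
```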
(a) Let $z \in \mathbb{R}^n$ be an $n$-vector. Show that $A = zz^T$ is positive semidefinite.
(b) Let $z \in \mathbb{R}^n$ be a non-zero $n$-vector. Let $A = zz^T$. What is the null-space of $A$? What is
the rank of $A$?
(c) Let $A \in \mathbb{R}^{n \times n}$ be positive semidefinite and $B \in \mathbb{R}^{m \times n}$ be arbitrary, where $m, n \in \mathbb{N}$. Is
$BAB^T$ PSD? If so, prove it. If not, give a counterexample with explicit $A, B$.

3. [0 points] Eigenvectors, eigenvalues, and the spectral theorem


The eigenvalues of an $n \times n$ matrix $A \in \mathbb{R}^{n \times n}$ are the roots of the characteristic polynomial
$p_A(\lambda) = \det(\lambda I - A)$, which may (in general) be complex. They are also defined as the values
$\lambda \in \mathbb{C}$ for which there exists a vector $x \in \mathbb{C}^n$ such that $Ax = \lambda x$. We call such a pair $(x, \lambda)$ an
eigenvector, eigenvalue pair. In this question, we use the notation $\mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ to denote
the diagonal matrix with diagonal entries $\lambda_1, \ldots, \lambda_n$, that is,
\[
\mathrm{diag}(\lambda_1, \ldots, \lambda_n) =
\begin{bmatrix}
\lambda_1 & 0 & 0 & \cdots & 0 \\
0 & \lambda_2 & 0 & \cdots & 0 \\
0 & 0 & \lambda_3 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \lambda_n
\end{bmatrix}.
\]
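For a $2 \times 2$ matrix the characteristic polynomial reduces to $\lambda^2 - \mathrm{tr}(A)\lambda + \det(A)$, so its roots can be computed with the quadratic formula. The sketch below does that with Python's `cmath` module. The function name `eigenvalues_2x2` and the two example matrices are assumptions for illustration. The rotation matrix is a real matrix with complex eigenvalues, as the definition above allows.

```python
import cmath


def eigenvalues_2x2(A):
    """Roots of det(lambda*I - A) = lambda^2 - tr(A)*lambda + det(A) for a 2x2 matrix."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    # cmath.sqrt returns a complex result, so a negative discriminant is handled.
    disc = cmath.sqrt(trace ** 2 - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2


# Rotation by 90 degrees: real entries, purely imaginary eigenvalues +i and -i.
rotation = [[0.0, -1.0], [1.0, 0.0]]
rot_eigs = eigenvalues_2x2(rotation)

# A diagonal matrix diag(2, 5): the eigenvalues are its diagonal entries.
diagonal = [[2.0, 0.0], [0.0, 5.0]]
diag_eigs = eigenvalues_2x2(diagonal)
```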

(a) Suppose that the matrix $A \in \mathbb{R}^{n \times n}$ is diagonalizable, that is, $A = T \Lambda T^{-1}$ for an invertible
matrix $T \in \mathbb{R}^{n \times n}$, where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ is diagonal. Use the notation $t^{(i)}$ for the
columns of $T$, so that $T = [t^{(1)} \cdots t^{(n)}]$, where $t^{(i)} \in \mathbb{R}^n$. Show that $A t^{(i)} = \lambda_i t^{(i)}$, so
that the eigenvector/eigenvalue pairs of $A$ are $(t^{(i)}, \lambda_i)$.

A matrix $U \in \mathbb{R}^{n \times n}$ is orthogonal if $U^T U = I$. The spectral theorem, perhaps one of the most
important theorems in linear algebra, states that if $A \in \mathbb{R}^{n \times n}$ is symmetric, that is, $A = A^T$,
then $A$ is diagonalizable by a real orthogonal matrix. That is, there are a diagonal matrix
$\Lambda \in \mathbb{R}^{n \times n}$ and orthogonal matrix $U \in \mathbb{R}^{n \times n}$ such that $U^T A U = \Lambda$, or, equivalently,

\[
A = U \Lambda U^T.
\]
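As one concrete instance of the decomposition stated above, the sketch below takes a hand-picked $2 \times 2$ symmetric matrix together with a candidate $U$ and $\Lambda$, then checks $U^T U = I$ and $U \Lambda U^T = A$ numerically. The specific matrices and the helpers `matmul` and `transpose` are assumptions chosen for illustration. This checks a single example and is not a proof of the theorem.

```python
import math


def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
        for i in range(len(P))
    ]


def transpose(P):
    return [list(row) for row in zip(*P)]


A = [[2.0, 1.0], [1.0, 2.0]]      # symmetric example matrix (assumed)
s = 1.0 / math.sqrt(2.0)
U = [[s, s], [s, -s]]             # columns are (1, 1)/sqrt(2) and (1, -1)/sqrt(2)
Lam = [[3.0, 0.0], [0.0, 1.0]]    # diag(3, 1)

UtU = matmul(transpose(U), U)                         # should be close to the identity
reconstructed = matmul(matmul(U, Lam), transpose(U))  # should be close to A
```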

Let $\lambda_i = \lambda_i(A)$ denote the $i$th eigenvalue of $A$.


(b) Let $A$ be symmetric. Show that if $U = [u^{(1)} \cdots u^{(n)}]$ is orthogonal, where $u^{(i)} \in
\mathbb{R}^n$ and $A = U \Lambda U^T$, then $u^{(i)}$ is an eigenvector of $A$ and $A u^{(i)} = \lambda_i u^{(i)}$, where $\Lambda =
\mathrm{diag}(\lambda_1, \ldots, \lambda_n)$.
(c) Show that if $A$ is PSD, then $\lambda_i(A) \ge 0$ for each $i$.

4. [0 points] Probability and multivariate Gaussians


Suppose $X = (X_1, \ldots, X_n)$ is sampled from a multivariate Gaussian distribution with mean $\mu$
in $\mathbb{R}^n$ and covariance $\Sigma$ in $S_+^n$ (i.e. $\Sigma$ is positive semidefinite). This is commonly also written
as $X \sim \mathcal{N}(\mu, \Sigma)$.
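One common way to draw from $\mathcal{N}(\mu, \Sigma)$ is to factor $\Sigma = L L^T$ and set $X = \mu + L z$ with $z$ standard normal. The sketch below does this with the standard library only. It assumes $\Sigma$ is strictly positive definite so that the Cholesky factor exists, which is stronger than the PSD condition above. The values of `mu` and `Sigma` and the helper names are hypothetical.

```python
import math
import random


def cholesky(S):
    """Lower-triangular L with L L^T = S, assuming S is positive definite."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(S[i][i] - acc)
            else:
                L[i][j] = (S[i][j] - acc) / L[j][j]
    return L


def sample_gaussian(mu, L, rng):
    """Draw one sample mu + L z, where z has independent standard normal entries."""
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    return [mu[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(mu))]


mu = [1.0, -2.0]                    # assumed mean
Sigma = [[2.0, 0.6], [0.6, 1.0]]    # assumed positive definite covariance
L = cholesky(Sigma)

rng = random.Random(0)              # fixed seed for reproducibility
samples = [sample_gaussian(mu, L, rng) for _ in range(20000)]
sample_mean = [sum(s[i] for s in samples) / len(samples) for i in range(2)]
```

With this many samples, `sample_mean` should land close to `mu`. The same samples can be used to compare empirical statistics against answers derived by hand.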

(a) Describe the random variable $Y = X_1 + X_2 + \ldots + X_n$. What are its mean and variance?
Is this a well-known distribution, and if so, which?
(b) Now, further suppose that $\Sigma$ is invertible. Find $E[X^T \Sigma^{-1} X]$. (Hint: use the property of
trace that $x^T A x = \mathrm{tr}(x^T A x)$.)
