Honours Course
SEMESTER -4
HONOURS
CODE COURSE NAME CATEGORY L T P CREDIT Year of Introduction
Preamble: This is the foundational course for awarding B. Tech. Honours in Computer Science and
Engineering with specialization in Security in Computing. The purpose of this course is to create
awareness among learners about the important areas of number theory used in computer science. This
course covers Divisibility & Modular Arithmetic, Primes & Congruences, Euler's Function, Quadratic
Residues and Arithmetic Functions, Sum of Squares and Continued fractions. Concepts in Number
Theory help the learner to apply them eventually in practical applications in Computer organization &
Security, Coding & Cryptography, Random number generation, Hash functions and Graphics.
Course Outcomes: After the completion of the course the student will be able to
CO2 Use the methods - Induction, Contraposition or Contradiction to verify the correctness of mathematical assertions (Cognitive Knowledge Level: Apply)
CO3 Utilize theorems and results about prime numbers, congruences, quadratic residues and integer factorization for ensuring security in computing systems (Cognitive Knowledge Level: Analyse)
CO4 Illustrate uses of Chinese Remainder Theorem & Euclidean algorithm in Cryptography and Security (Cognitive Knowledge Level: Apply)
CO5 Explain applications of arithmetic functions in Computer Science (Cognitive Knowledge Level: Understand)
CO6 Implement Number Theoretic Algorithms using a programming language (Cognitive Knowledge Level: Apply)
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CO1 ! ! ! ! ! !
CO2 ! ! ! ! !
CO3 ! ! ! ! ! !
CO4 ! ! ! ! ! !
CO5 ! ! ! ! ! !
CO6 ! ! ! ! ! ! !
Abstract POs defined by National Board of Accreditation
PO# Broad PO PO# Broad PO
Assessment Pattern
Remember 30 30 30
Understand 30 30 30
Apply 40 40 40
Analyse
Evaluate
Create
Mark Distribution
Total Marks CIE Marks ESE Marks ESE Duration
Attendance : 10 marks
First Internal Examination shall be preferably conducted after completing the first half of the syllabus
and the Second Internal Examination shall be preferably conducted after completing remaining part of
the syllabus.
There will be two parts: Part A and Part B. Part A contains 5 questions (preferably, 2 questions each
from the completed modules and 1 question from the partly covered module), having 3 marks for each
question adding up to 15 marks for part A. Students should answer all questions from Part A. Part B
contains 7 questions (preferably, 3 questions each from the completed modules and 1 question from
the partly covered module), each with 7 marks. Out of the 7 questions in Part B, a student should
answer any 5.
There will be two parts: Part A and Part B. Part A contains 10 questions with 2 questions from each
module, having 3 marks for each question. Students should answer all questions. Part B contains 2
questions from each module of which a student should answer any one. Each question can have
maximum 2 sub-divisions and carries 14 marks.
SYLLABUS
Module 1
Modular Arithmetic- Properties, Euclid's algorithm for the greatest common divisor, Extended Euclid’s
Algorithm, Least Common multiple, Solving Linear Diophantine Equations, Modular Division.
Module 2
Module 3
The Group of units- The group U_n, Primitive roots, Existence of primitive roots, Applications of
primitive roots.
Module 4
Quadratic Residues- Quadratic Congruences, The group of Quadratic residues, Legendre symbol,
Jacobi Symbol, Quadratic reciprocity.
Arithmetic Functions- Definition and examples, Perfect numbers, Mobius function and its properties,
Mobius inversion formula, The Dirichlet Products.
Module 5
Sum of Squares- Sum of two squares, The Gaussian Integers, Sum of three squares, Sum of four
squares.
Continued Fractions -Finite continued fractions, Infinite continued fractions, Pell's Equation, Solution
of Pell’s equation by continued fractions.
Text Books
1. G.A. Jones & J.M. Jones, Elementary Number Theory, Springer UTM, 2007.
Reference Books
1. William Stallings, Cryptography and Network Security Principles and Practice, Pearson Ed.
2. Tom M.Apostol, ‘Introduction to Analytic Number Theory’, Narosa Publishing House Pvt. Ltd,
New Delhi, (1996).
3. Neal Koblitz, A course in Number Theory and Cryptography, 2nd Edition, Springer ,2004.
Course Outcome 1 (CO1): Describe the properties of modular arithmetic and modulo operator.
Course Outcome 2 (CO2): Prove that the equation y^2 = x^3 - 2 has only the integer solution (3, ±5).
Course Outcome 3 (CO3): State the law of reciprocity for Jacobi symbols and use it to determine
whether 888 is a quadratic residue or non residue of the prime 1999.
Course Outcome 4 (CO4): Using the Chinese remainder theorem, solve the system of congruences
x ≡ 2 (mod 3), x ≡ 3 (mod 5), x ≡ 2 (mod 7).
Course Outcome 5(CO5): State and prove Dirichlet product.
Course Outcome 6 (CO6): Use extended Euclid's algorithm to solve Diophantine equations
efficiently. Given three numbers a > 0, b > 0, and c, the algorithm should return some x and y such that
ax + by = c.
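The task described in Course Outcome 6 can be sketched as follows. This is only an illustrative sketch, not a prescribed solution: the function names extended_gcd and solve_linear_diophantine are assumptions introduced here, and the choice to return None when no integer solution exists is also an assumption.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Keep the invariants a*old_x + b*old_y == old_r and a*x + b*y == r.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_linear_diophantine(a, b, c):
    """Return one integer pair (x, y) with a*x + b*y == c, or None if none exists."""
    g, x0, y0 = extended_gcd(a, b)
    if c % g != 0:
        # A solution exists only when gcd(a, b) divides c.
        return None
    scale = c // g
    return x0 * scale, y0 * scale
```

If (x, y) is one solution and g = gcd(a, b), every other solution has the form (x + (b/g)t, y - (a/g)t) for an integer t.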
PART A
4. Use Fermat’s Little theorem to show that 91 is not a prime.
5. If m is relatively prime to n , show that Φ(mn) = Φ(m) Φ(n).
6. Explain how public key cryptography can be used for digital signatures.
7. Define the Mobius function and prove that it is multiplicative.
8. State and prove Dirichlet product.
9. Show that every prime of the form 4k+1 can be represented uniquely as the sum of two
squares.
10. Find the continued fraction representation of the rational number 55/89.
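For question 10, the finite continued fraction of a rational number comes directly from the quotients of Euclid's algorithm. The sketch below is illustrative only; the function names continued_fraction and evaluate are assumptions introduced here, and evaluate simply folds the terms back into a fraction as a self-check.

```python
from fractions import Fraction


def continued_fraction(p, q):
    """Return the partial quotients [a0, a1, ...] of p/q, for q > 0."""
    terms = []
    while q != 0:
        a, r = divmod(p, q)  # one step of Euclid's algorithm
        terms.append(a)
        p, q = q, r
    return terms


def evaluate(terms):
    """Rebuild the rational number from its partial quotients."""
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return value
```

Calling evaluate(continued_fraction(55, 89)) should give back Fraction(55, 89), which is a quick way to check a hand computation.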
Part B
11. (a) State the Euclidean algorithm and its extension with an example. (7)
(b) Find all the solutions of 24x + 34 y = 6. (7)
OR
12. (a) Describe the properties of modular arithmetic and modulo operator. (7)
(b) Explain Extended Euclidean algorithm. Using the algorithm find the
OR
14. (a) Using Chinese remainder theorem, solve the system of congruences,
x ≡2(mod 3), x ≡3(mod 5), x ≡2(mod 7) (7)
(b) Define Fermat primes. Show that any two distinct Fermat numbers are
relatively prime. (7)
15. (a) Distinguish between public key and private key encryption techniques.
Also point out the merits and demerits of both. (7)
(b) Define Carmichael number and show that a Carmichael number must
be the product of at least three distinct primes. (7)
OR
16. (a) Define a pseudo prime to a base and find all non trivial bases for which
15 is a pseudo prime. (6)
(b) Find an element of
i) order 5 modulo 11 ii) order 4 modulo 13
iii) order 8 modulo 17 iv) order 6 modulo 19 (8)
17. (a) Determine the quadratic residues and non residues modulo 17. Also
determine whether 219 is a quadratic residue or non residue of the prime 383.
(8)
(b) State the law of quadratic reciprocity. Determine those odd primes p for
which 3 is a quadratic residue and those for which it is a non residue. (6)
OR
18. (a) State and prove properties of Legendre’s symbol. (7)
(b) State the law of reciprocity for Jacobi symbols and using it determine
whether 888 is a quadratic residue or non residue of the prime 1999. (7)
19. (a) Prove that the equation y^2 = x^3 - 2 has only the integer solution (3, ±5). (7)
(b) Define a Gaussian integer. Factorize the Gaussian integer 440 − 55i. (7)
OR
20. (a) If m and n can each be expressed as a sum of four squares, then show that mn can
also be expressed as a sum of four squares. (7)
(b) Find all the solutions of the Diophantine equation x^2 – 6y^2 = 1. (7)
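Question 20(b) ties in with the Module 5 topic "Solution of Pell's equation by continued fractions". A minimal sketch of that approach is given below, assuming d is a positive non-square integer. The names pell_fundamental and pell_solutions are hypothetical helpers introduced for illustration, not part of the syllabus.

```python
from math import isqrt


def pell_fundamental(d):
    """Smallest positive (x, y) with x*x - d*y*y == 1, for non-square d > 0."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    # Standard recurrence for the continued fraction of sqrt(d).
    m, den, a = 0, 1, a0
    h_prev, h = 1, a0  # numerators of the convergents
    k_prev, k = 0, 1   # denominators of the convergents
    while h * h - d * k * k != 1:
        m = den * a - m
        den = (d - m * m) // den
        a = (a0 + m) // den
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k


def pell_solutions(d, count):
    """First `count` positive solutions, from powers of x1 + y1*sqrt(d)."""
    x1, y1 = pell_fundamental(d)
    x, y = x1, y1
    solutions = []
    for _ in range(count):
        solutions.append((x, y))
        x, y = x1 * x + d * y1 * y, x1 * y + y1 * x
    return solutions
```

For d = 6 the search stops at the first convergent 5/2, since 25 - 6·4 = 1. Solutions with negative x or y follow by changing signs, together with the trivial pair (±1, 0).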
Teaching Plan
1.5 Modular Arithmetic- Properties of congruences, Modular Arithmetic Operations, Properties of Modular Arithmetic. 1 hour
1.6 Euclid's algorithm for the greatest common divisor, Extended Euclid’s Algorithm. 1 hour
2.3 Primality testing and factorization, Miller -Rabin Test for Primality. 1 hour
3.4 Definition of Euler Totient function, Examples and properties. 1 hour
3.5
3.6 Multiplicativity of Euler's Totient function. 1 hour
3.9 Existence of primitive roots for Primes, Applications of primitive roots. 1 hour
4.7 Mobius inversion formula., application of the Mobius inversion formula. 1 hour
5.5 Continued Fractions, Finite continued fractions. 1 hour
5.6 Continued Fractions, Finite continued fractions. 1 hour
5.7 Infinite continued fractions. 1 hour
Preamble: This is the foundational course for awarding B. Tech. Honours in Computer
Science and Engineering with specialization in Machine Learning. The purpose of this course
is to introduce mathematical foundations of basic Machine Learning concepts among learners, on
which Machine Learning systems are built. This course covers Linear Algebra, Vector Calculus,
Probability and Distributions, Optimization and Machine Learning problems. Concepts in this course
help the learners to understand the mathematical principles in Machine Learning and aid in the
creation of new Machine Learning solutions, understand & debug existing ones, and learn about the
inherent assumptions & limitations of the current methodologies.
Course Outcomes: After the completion of the course the student will be able to
CO 1 Make use of the concepts, rules and results about linear equations, matrix algebra, vector spaces, eigenvalues & eigenvectors and orthogonality & diagonalization to solve computational problems (Cognitive Knowledge Level: Apply)
CO 3 Utilize the concepts, rules and results about probability, random variables, additive & multiplicative rules, conditional probability, probability distributions and Bayes’ theorem to find solutions of computational problems (Cognitive Knowledge Level: Apply)
PO 1 PO 2 PO 3 PO 4 PO 5 PO 6 PO 7 PO 8 PO 9 PO 10 PO 11 PO 12
CO 1 √ √ √ √ √
CO 2 √ √ √ √
CO 3 √ √ √ √ √
CO 4 √ √ √ √ √ √
CO 5 √ √ √ √ √ √ √ √
Assessment Pattern
Understand 40% 40% 40%
Apply
Evaluate
Create
Mark Distribution
Attendance : 10 marks
First Internal Examination shall be preferably conducted after completing the first half of the
syllabus and the Second Internal Examination shall be preferably conducted after completing
remaining part of the syllabus.
There will be two parts: Part A and Part B. Part A contains 5 questions (preferably, 2
questions each from the completed modules and 1 question from the partly covered module),
having 3 marks for each question adding up to 15 marks for part A. Students should answer
all questions from Part A. Part B contains 7 questions (preferably, 3 questions each from the
completed modules and 1 question from the partly covered module), each with 7 marks. Out
of the 7 questions in Part B, a student should answer any 5.
End Semester Examination Pattern: There will be two parts: Part A and Part B. Part A
contains 10 questions with 2 questions from each module, having 3 marks for each question.
Students should answer all questions. Part B contains 2 questions from each module of which
a student should answer any one. Each question can have maximum 2 sub-divisions and carries
14 marks.
Module 1
Module 2
Module 3
Continuous Probabilities, Sum Rule, Product Rule, and Bayes’ Theorem. Summary Statistics
and Independence – Important Probability distributions - Conjugacy and the Exponential
Family - Change of Variables/Inverse Transform.
Module 4
Module 5
Text book:
1. Mathematics for Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, and
Cheng Soon Ong, published by Cambridge University Press (freely available at
https://mml-book.github.io)
Reference books:
2018 published by Cambridge University Press
4. Convex Optimization by Stephen Boyd and Lieven Vandenberghe, 2004, published by
Cambridge University Press
5. Pattern Recognition and Machine Learning by Christopher M Bishop, 2006, published
by Springer
4. A set of n linearly independent vectors in R^n forms a basis. Does the set of vectors (2, 4, −3),
(0, 1, 1), (0, 1, −1) form a basis for R^3? Explain your reasons.
5. Consider the transformation T(x, y) = (x + y, x + 2y, 2x + 3y). Obtain ker T and use this to
calculate the nullity. Also find the transformation matrix for T.
6. Find the characteristic equation, eigenvalues, and eigenspaces corresponding to each
eigenvalue of the following matrix
[matrix missing in source]
7. Diagonalize the following matrix, if possible
[matrix missing in source]
1. For a scalar function f(x, y, z) = x^2 + 3y^2 + 2z^2, find the gradient and its magnitude at the
point (1, 2, -1).
2. Find the maximum and minimum values of the function f(x, y) = 4x + 4y - x^2 - y^2 subject to
the condition x^2 + y^2 <= 2.
3. Suppose you were trying to minimize f(x, y) = x^2 + 2y + 2y^2. Along what vector
should you travel from (5, 12)?
4. Find the second order Taylor series expansion for f(x, y) = (x + y)^2 about (0, 0).
5. Find the critical points of f(x, y) = x^2 – 3xy + 5x - 2y + 6y^2 + 8.
6. Compute the gradient of the Rectified Linear Unit (ReLU) function ReLU(z) = max(0, z).
7. Let L = ||Ax - b||_2^2, where A is a matrix and x and b are vectors. Derive dL in terms of dx.
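For question 7, the differential can be worked out by expanding the squared norm as an inner product. The derivation below is a sketch under the assumption that A and b are constant and all quantities are real; the gradient notation in the last line is introduced here for illustration.

```latex
\begin{aligned}
L &= \lVert Ax - b \rVert_2^2 = (Ax - b)^{\top}(Ax - b), \\
dL &= (A\,dx)^{\top}(Ax - b) + (Ax - b)^{\top}(A\,dx)
    = 2\,(Ax - b)^{\top} A \, dx, \\
\nabla_x L &= 2\,A^{\top}(Ax - b).
\end{aligned}
```

The two terms in the middle line are equal because each is a scalar and a scalar equals its own transpose.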
Course Outcome 3 (CO3):
1. Let J and T be independent events, where P(J)=0.4 and P(T)=0.7.
i. Find P(J∩T)
ii. Find P(J∪T)
iii. Find P(J∩T′)
[question stem missing in source]
i. Given that E(R)=2.85, find a and b.
ii. Find P(R>2).
4. A biased coin (with probability of obtaining a head equal to p > 0) is tossed repeatedly
and independently until the first head is observed. Compute the probability that the
first head appears at an even numbered toss.
5. Two players A and B are competing at a trivia quiz game involving a series of
questions. On any individual question, the probabilities that A and B give the correct
answer are p and q respectively, for all questions, with outcomes for different
questions being independent. The game finishes when a player wins by answering a
question correctly. Compute the probability that A wins if
i. A answers the first question,
ii. B answers the first question.
6. A coin for which P(heads) = p is tossed until two successive tails are obtained. Find
the probability that the experiment is completed on the nth toss.
7. You roll a fair dice twice. Let the random variable X be the product of the outcomes of
the two rolls. What is the probability mass function of X? What are the expected value
and the standard deviation of X?
COMPUTER SCIENCE AND ENGINEERING
8. While watching a game of Cricket, you observe someone who is clearly supporting
Mumbai Indians (MI). What is the probability that they were actually born within 25 km of
Mumbai? Assume that:
• the probability that a randomly selected person is born within 25 km of
Mumbai is 1/20;
• the chance that a person born within 25 km of Mumbai actually supports MI
is 7/10;
• the probability that a person not born within 25 km of Mumbai supports MI
is 1/10.
9. What is an exponential family? Why are exponential families useful?
10. Let Z1 and Z2 be independent random variables each having the standard normal
distribution. Define the random variables X and Y by X = Z1 + 3Z2 and Y = Z1 + Z2.
Argue that the joint distribution of (X, Y) is a bivariate normal distribution. What are
the parameters of this distribution?
11. Given a continuous random variable x, with cumulative distribution function Fx(x),
show that the random variable y = Fx(x) is uniformly distributed.
12. Explain Normal distribution, Binomial distribution and Poisson distribution in the
exponential family form.
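Several of the questions above, for example question 8 on the Mumbai Indians supporter, are direct applications of Bayes' theorem with two hypotheses. The sketch below shows the computation with exact fractions. The function name posterior and its parameter names are assumptions made for illustration.

```python
from fractions import Fraction


def posterior(prior, likelihood, likelihood_if_not):
    """P(H | E) from P(H), P(E | H) and P(E | not H), by Bayes' theorem."""
    # Total probability of the evidence over the two hypotheses.
    evidence = prior * likelihood + (1 - prior) * likelihood_if_not
    return prior * likelihood / evidence


# Question 8: H = born within 25 km of Mumbai, E = supports MI.
born_near_given_fan = posterior(Fraction(1, 20), Fraction(7, 10), Fraction(1, 10))
```

Using Fraction rather than floats keeps the result exact, which makes it easier to compare with a hand calculation.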
Course Outcome 4(CO4):
5. Consider the update equation for stochastic gradient descent. Write down the update when
we use a mini-batch size of one.
6. Consider the function
[function definition missing in source]
9. Solve the following LP problem with the simplex method.
[objective function missing in source]
subject to the constraints
[constraints missing in source]
Course Outcome 5 (CO5):
1. What is a loss function? Give examples.
2. What are training/validation/test sets? What is cross-validation? Name one or two examples
of cross-validation methods.
3. Explain generalization, overfitting, model selection, kernel trick, Bayesian learning.
4. Distinguish between Maximum Likelihood Estimation (MLE) and Maximum A Posteriori
Estimation (MAP).
5. What is the link between structural risk minimization and regularization?
6. What is a kernel? What is a dot product? Give examples of kernels that are valid dot
products.
7. What is ridge regression? How can one train a ridge regression linear model?
8. What is Principal Component Analysis (PCA)? Which eigenvalue indicates the direction of
largest variance? In what sense is the representation obtained from a projection onto the
eigen directions corresponding to the largest eigenvalues optimal for data reconstruction?
9. Suppose that you have a linear support vector machine (SVM) binary classifier. Consider a
point that is currently classified correctly, and is far away from the decision boundary. If you
remove the point from the training set, and re-train the classifier, will the decision boundary
change or stay the same? Explain your answer in one sentence.
10. Suppose you have n independent and identically distributed (i.i.d) sample data points
x1, ..., xn. These data points come from a distribution where the probability of a given
datapoint x is
[formula missing in source]
i. What are the prior and posterior odds for the fair coin?
ii. What are the prior and posterior predictive probabilities of heads on the next
flip? Here prior predictive means prior to considering the data of the first four
flips.
3 Let f(x, y, z) = xye^r, where r = x^2 + z^2 - 5. Calculate the gradient of f at the point
(1, 3, -2).
4 Compute the Taylor polynomials T_n, n = 0, ..., 5 of f(x) = sin(x) + cos(x) at x_0 = 0.
5 Let X be a continuous random variable with probability density function on
0 <= x <= 1 defined by f(x) = 3x^2. Find the pdf of Y = X^2.
6 Show that if two events A and B are independent, then A and B' are independent.
7 Explain the principle of the gradient descent algorithm.
8 Briefly explain the difference between (batch) gradient descent and stochastic
gradient descent. Give an example of when you might prefer one over the other.
9 What is the empirical risk? What is “empirical risk minimization”?
10 Explain the concept of a Kernel function in Support Vector Machines. Why are
kernels so useful? What properties should a kernel possess to be used in an SVM?
PART B
Answer any one Question from each module. Each question carries 14 Marks
11 a) i. Find all solutions to the system of linear equations (6)
[system of equations missing in source]
[...] vectors orthogonal to [2, −3, 1]^T forms a subspace W of R^3. What is dim (W) and why?
b) Use the Gram-Schmidt process to find an orthogonal basis for the column space
of the following matrix (8)
[matrix missing in source]
OR
12 a) i. Let L be the line through the origin in R^2 that is parallel to the vector
[3, 4]^T. Find the standard matrix of the orthogonal projection onto L. Also
find the point on L which is closest to the point (7, 1) and find the point on
L which is closest to the point (-3, 5). (6)
ii. Find the rank-1 approximation of
[matrix missing in source]
13 a) A skier is on a mountain with equation z = 100 – 0.4x^2 – 0.3y^2, where z
denotes height. (8)
i. The skier is located at the point with xy-coordinates (1, 1), and
wants to ski downhill along the steepest possible path. In which
direction (indicated by a vector (a, b) in the xy-plane) should the
skier begin skiing?
ii. The skier begins skiing in the direction given by the xy-vector (a, b)
you found in part (i), so the skier heads in a direction in space
given by the vector (a, b, c). Find the value of c.
b) Find the linear approximation to the function f(x,y) = 2 - sin(-x - 3y)
at the point (0, π), and then use your answer to estimate
f(0.001, π). (6)
OR
14 a) Let g be the function given by (8)
[function definition missing in source]
i. Calculate the partial derivatives of g at (0, 0).
ii. Show that g is not differentiable at (0, 0).
b) Find the second order Taylor series expansion for f(x,y) = e^(-(x^2+y^2)) cos(xy)
about (0, 0). (6)
15 a) There
bag are twofour
contains bags. The first
mangos andbag
fourcontains
apples. four mangos
We also haveand two apples;
a biased (6)
coin, which
the second
shows bagwith
“heads” contains four mangos
probability 0.6 andand fourwith
“tails” apples. We also0.4.
probability haveIf athe coin
biased coin, which shows “heads” with probability 0.6 and “tails” with
probability 0.4. If the coin shows “heads”. we pick a fruit at
showsrandom
“heads”.from bag 1;
we pick otherwise
a fruit at we pick a fruit at random from bag 2. Your
random fromflips
friend bag the
1; otherwise
coin (youwe pick see
cannot
S . I
a fruit
N
theatresult),
random froma bag
picks fruit2.atYour friend
random
OTE
flips the coin (you cannot see the result), picks a fruit at random from the
from the corresponding bag, and presents you a mango.
What What
is the is
K T U N
corresponding bag, and presents you a mango.
the probability
probability that
that the the mango
mango was picked
was picked from2?bag 2?
from bag
b) Suppose that one has written a computer program that sometimes compiles and sometimes not (code does not change). You decide to model the apparent stochasticity (success vs. no success) x of the compiler using a Bernoulli distribution with parameter μ: (8)
p(x | μ) = μ^x (1 - μ)^(1 - x), x ∈ {0 , 1}
Choose a conjugate prior for the Bernoulli likelihood and compute the posterior distribution p( μ | x1 , ... , xN).
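One standard choice of conjugate prior for a Bernoulli likelihood is the Beta distribution: a Beta(α, β) prior combined with N Bernoulli observations gives a Beta(α + Σ xi, β + N - Σ xi) posterior. The sketch below only performs that parameter update; the function name, the uniform Beta(1, 1) prior and the sample data are assumptions made for illustration.

```python
def beta_bernoulli_posterior(alpha, beta, observations):
    """Update a Beta(alpha, beta) prior with Bernoulli observations (0 or 1).

    Returns the parameters of the Beta posterior.
    """
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures

# Hypothetical data: 1 = compiled, 0 = failed to compile.
runs = [1, 0, 1, 1, 0]
a_post, b_post = beta_bernoulli_posterior(1, 1, runs)  # uniform Beta(1, 1) prior
posterior_mean = a_post / (a_post + b_post)
```

With three successes and two failures the posterior is Beta(4, 3), whose mean is 4/7.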
OR
16 a) Consider a mixture of two Gaussian distributions (8)
i. Compute the marginal distributions for each dimension.
ii. Compute the mean, mode and median for each marginal distribution.
iii. Compute the mean and mode for the two-dimensional distribution.
b) Express the Binomial distribution as an exponential family distribution. Also express the Beta distribution as an exponential family distribution. Show that the product of the Beta and the Binomial distribution is also a member of the exponential family. (6)
17 a) Find the extrema of f(x,y,z) = x - y + z subject to g(x,y,z) = x^2 + y^2 + z^2 = 2. (8)
b) Let
[definition missing in source]
Show that x* = (1 , 1/2 , -1) is optimal for the optimization problem (6)
OR
18 a) Derive the gradient descent training rule assuming that the target function is represented as o_d = w_0 + w_1 x_1 + ... + w_n x_n. Define explicitly the cost/error function E, assuming that a set of training examples D is provided, where each training example d ∈ D is associated with the target output t_d. (8)
b) Find the maximum value of f(x,y,z) = xyz given that g(x,y,z) = x + y + z = 3 and (6)
x,y,z >= 0.
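For question 18 a), one common way to state the rule is to take E(w) = 1/2 Σ_d (t_d - o_d)^2, which gives the update Δw_i = η Σ_d (t_d - o_d) x_id. The following is a minimal batch gradient descent sketch under that assumption; the function name, learning rate, epoch count and sample data are all illustrative choices, not values from the question paper.

```python
def train_linear_unit(examples, eta=0.05, epochs=2000):
    """Batch gradient descent for o = w0 + w1*x1 + ... + wn*xn.

    examples: list of (x, t) pairs, x a list of n inputs, t the target output.
    Minimises E(w) = 1/2 * sum_d (t_d - o_d)^2.
    """
    n = len(examples[0][0])
    w = [0.0] * (n + 1)  # w[0] is the bias weight w0
    for _ in range(epochs):
        delta = [0.0] * (n + 1)
        for x, t in examples:
            xs = [1.0] + list(x)  # constant input x0 = 1 pairs with w0
            o = sum(wi * xi for wi, xi in zip(w, xs))
            for i, xi in enumerate(xs):
                delta[i] += eta * (t - o) * xi  # equals -eta * dE/dw_i
        w = [wi + di for wi, di in zip(w, delta)]
    return w

# Hypothetical data generated from t = 1 + 2*x1.
data = [([0.0], 1.0), ([1.0], 3.0), ([2.0], 5.0), ([3.0], 7.0)]
weights = train_linear_unit(data)
```

On this noise-free data the weights approach w0 = 1 and w1 = 2. The learning rate has to be small enough for the updates to converge, so other data sets may need a different η.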
19 a) Consider the following probability distribution (7)
where θ is a parameter and x is a positive real number. Suppose you get m i.i.d. samples xi drawn from this distribution. Compute the maximum likelihood estimator for θ based on these samples.
b) Consider the following Bayesian network with boolean variables. (7)
i. List variable(s) conditionally independent of X33 given X11 and X12
ii. List variable(s) conditionally independent of X33 and X22
iii. Write the joint probability P(X11, X12, X13, X21, X22, X31, X32, X33)
factored according to the Bayes net. How many parameters are
necessary to define the conditional probability distributions for
this Bayesian network?
iv. Write an expression for P(X13 = 0,X22 = 1,X33 = 0) in terms of the
conditional probability distributions given in your answer to part
(iii). Justify your answer.
OR
20 a) Consider the following one dimensional training data set, ’x’ denotes negative examples and ’o’ positive examples. The exact data points and their labels are given in the table below. Suppose an SVM is used to classify this data. (6)
[table of data points missing in source]
i. Indicate which are the support vectors and mark the decision boundary.
ii. Give the value of the cost function and the model parameter after training.
b) Suppose that we are fitting a Gaussian mixture model for data items consisting of a single real value, x, using K = 2 components. We have N = 5 training cases, in which the values of x are 5, 15, 25, 30, 40. Using the EM algorithm to find the maximum likelihood estimates for the model parameters, what are the mixing proportions for the two components, π1 and π2, and the means for the two components, μ1 and μ2? The standard deviations for the two components are fixed at 10. (8)
[table missing in source]
What values for the parameters π1, π2, μ1, and μ2 will be found in the next M step of the algorithm?
****
Teaching Plan
No Topic No. of Lectures (45)
7. Cholesky Decomposition, Eigen decomposition and Diagonalization 1
8. Singular Value Decomposition - Matrix Approximation 1
Module-II (VECTOR CALCULUS) 6
5. Convex Optimization 1
6. Linear Programming 1
7. Quadratic Programming 1
Module-V (CENTRAL MACHINE LEARNING PROBLEMS) 14
14. Kernels 1
*Assignments may include applications of the above theory. With respect to module V,
programming assignments may be given.
CODE     COURSE NAME                                      CATEGORY  L  T  P  CREDIT  YEAR OF INTRODUCTION
CST 296  Principles of Program Analysis and Verification  HONOURS   3  1  0  4       2019
Preamble: This is the foundational course for awarding B. Tech. Honours in Computer Science
and Engineering with specialization in Formal Methods. Program Analysis and Program
Verification are two important areas of study, discussing Methods, Technologies and Tools to
ensure reliability and correctness of software systems. The syllabus for this course is prepared
with the view of introducing the Foundational Concepts, Methods and Tools in Program Analysis
and Program Verification.
Prerequisite: Topics covered in the course Discrete Mathematical Structures (MAT 203).
Course Outcomes: After the completion of the course the student will be able to
CO1 Explain the concepts and results about Lattices, Chains, Fixed Points, Galois Connections, Monotone and Distributive Frameworks, Hoare Triples, Weakest Preconditions, Loop Invariants and Verification Conditions to perform Analysis and Verification of programs (Cognitive knowledge level: Understand)
CO2 Illustrate methods for doing intraprocedural/interprocedural Data flow Analysis for a given Program Analysis problem (Cognitive knowledge level: Analyse)
CO3 Formulate an Abstract Interpretation framework for a given Data flow Analysis problem and perform the analysis using the tool WALA (Cognitive knowledge level: Analyse)
CO6 Use the tool VCC to specify and verify the correctness of a C Program with respect to a given set of properties (Cognitive knowledge level: Analyse)
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CO1
CO2
CO3
CO4
CO5
CO6
Abstract POs defined by National Board of Accreditation
PO# Broad PO PO# Broad PO
Assessment Pattern:
Understand 30 30 30
Apply 40 40 40
Analyze
Evaluate
Create
Mark Distribution
Total Marks CIE Marks ESE Marks ESE Duration
150 50 100 3 hours
Continuous Internal Evaluation Pattern:
Attendance : 10 Marks
Assignment : 15 Marks
First series test shall be preferably conducted after completing the first half of the syllabus and
the second series test shall be preferably conducted after completing the remaining part of the
syllabus.
There will be two parts: Part A and Part B. Part A contains 5 questions (preferably, 2 questions
each from the completed modules and 1 question from the partly covered module), having 3
marks for each question adding up to 15 marks for part A. Students should answer all questions
from Part A. Part B contains 7 questions (preferably, 3 questions each from the completed
modules and 1 question from the partly covered module), each with 7 marks. Out of the 7
questions in Part B, a student should answer any 5.
There will be two parts; Part A and Part B. Part A contains 10 questions with 2 questions from
each module, having 3 marks for each question. Students should answer all questions from Part
A. Part B contains 2 questions from each module of which a student should answer any one, each
question carries 14 marks. Each question in part B can have a maximum of 2 sub-divisions.
SYLLABUS
Module 1
Mathematical Foundations – Partially Ordered Set, Complete Lattice, Construction of
Complete Lattices, Chains, Fixed Points, Knaster-Tarski Fixed Point Theorem.
Module 2
Introduction to Program Analysis – The WHILE language, Reaching Definition Analysis, Data
Flow Analysis, Abstract Interpretation, Algorithm to find the least solutions for the Data Flow
Analysis problem.
Module 3
Module 4
Module 5
Program Verification - Why should we Specify and Verify Code, A framework for software
verification - A core programming Language, Hoare Triples, Partial and Total Correctness,
Program Variables and Logical Variables, Proof Calculus for Partial Correctness, Loop
Invariants, Verifying code using the tool VCC (Verifier for Concurrent C).
Text Books
1. Flemming Nielson, Hanne Riis Nielson and Chris Hankin, Principles of Program Analysis,
Springer (1998).
2. Michael Huth and Mark Ryan, Logic in Computer Science - Modelling and Reasoning about Systems.
1. Julian Dolby and Manu Sridharan, Core WALA Tutorial (PLDI 2010), available online at
http://wala.sourceforge.net/files/PLDI_WALA_Tutorial.pdf
2. Cohen, Ernie & Hillebrand, Mark & Tobies, Stephan (2012), Verifying C Programs: A VCC
Tutorial.
1. Find a lattice to represent the data states of a given program and propose a sound abstract
interpretation framework to do a given analysis on the program.
2. When is an abstract interpretation framework said to be sound? Illustrate with an
example.
3. When is an abstract interpretation framework said to be precise? Illustrate with an
example.
1. Illustrate how one can do Interprocedural Data Flow Analysis using the tool WALA.
1. Using the tool VCC prove that a given code segment satisfies a given property.
QP CODE: PAGES:3
PART A
3. Write a program in the WHILE language to find the factorial of a number. Explain the
statements of your program.
4. Consider a program that calculates x^y through repeated multiplications. Draw the flow
graph of the program.
5. What is Available Expression (AE) analysis? Give an application for AE analysis.
6. What is Live variable (LV) analysis? Give an application for LV analysis.
7. Let P be a program analysis problem (like LV, AE etc.) and (A, F_A, γ_AC) and (B, F_B, γ_BC)
be two abstract interpretations such that B is more abstract than A. Let α and γ be the
abstraction and concretization functions between A and B. Then, what are the conditions
required for α and γ to form a Galois Connection?
8. When is Kildall’s algorithm for abstract interpretation guaranteed to terminate? Justify
your answer.
9. Is it possible to verify total correctness of a program using Hoare Logic? If yes, how is it
possible?
10. Define loop invariant. Show a simple loop with a loop invariant.
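As an illustration for question 10, a loop invariant is a property that holds before the loop and after every iteration. The sketch below checks one at run time with assertions; it is only a dynamic check on the inputs actually tried, not a proof in Hoare Logic, and the function name is an illustrative choice.

```python
import math

def factorial_with_invariant(n):
    """Compute n! while checking the loop invariant result == i! each iteration."""
    result, i = 1, 0
    assert result == math.factorial(i)  # invariant holds on entry
    while i < n:
        i += 1
        result *= i
        assert result == math.factorial(i)  # invariant preserved by the body
    # On exit i == n, so the invariant gives result == n!.
    return result
```

The invariant together with the negated loop guard (i >= n, hence i == n here) yields the postcondition, which is the same shape of argument used in the proof rule for while loops.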
PART B
Answer any one Question from each module. Each question carries 14 Marks
11.
a. What is an infinite ascending chain in a lattice? Show an example lattice with an
infinite ascending chain. Is it possible for a complete lattice to contain an infinite
ascending chain? (7 marks)
b. State and prove Knaster-Tarski fixed point theorem. (7 marks)
OR
12.
a. Consider the lattice (ℕ, ≤). Let f : ℕ → ℕ be a function defined as follows:
when x < 100, f(x) = x + 1; when x > 100, f(x) = x − 1; otherwise f(x) = x.
Then, show the following for f: (i) the set of all fixpoints, (ii) the set of all
pre-fixpoints and (iii) the set of all post-fixpoints. (7 marks)
b. Let (D, ≤) be a lattice with a least upper bound for each subset of D. Then, prove
that every subset of D has a greatest lower bound. (7 marks)
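For part (a) of question 12, the three sets can be explored numerically on a finite window of ℕ. This is a sketch only: the window 0..200 is an assumption made for illustration, and because textbooks differ on which of the two inequalities is called "pre-fixpoint" and which "post-fixpoint", the code names the sets by the inequality they satisfy.

```python
def f(x):
    """The function from the question on the natural numbers."""
    if x < 100:
        return x + 1
    if x > 100:
        return x - 1
    return x

# Assumption: a finite window of N, chosen only for illustration.
window = range(0, 201)

fixpoints = [x for x in window if f(x) == x]
extensive = [x for x in window if x <= f(x)]  # points with x <= f(x)
reductive = [x for x in window if f(x) <= x]  # points with f(x) <= x
```

Within the window the only fixpoint is 100, the points with x ≤ f(x) are 0..100, and the points with f(x) ≤ x are 100..200. Over all of ℕ the last set continues without bound.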
13.
a. With a suitable example, explain the equational approach in Data Flow Analysis.
(7 marks)
b. With a suitable example, explain how you obtain the collecting semantics of a
program point. (7 marks)
OR
14.
a. With an example, explain the Constraint Based Approach in Data Flow
Analysis. (7 marks)
b. Discuss the properties of an algorithm to solve the problem of computing the least
solution to the program analysis problems in Data Flow Analysis. (7 marks)
15.
a. Using Intraprocedural Reaching Definition Analysis, find the assignments killed
and generated by each of the blocks in the program
[x:=5]1;
[y:=1]2 ;
while [x>1]3 do
([y:=x*y]4 ; [x:=x-1]5)
(7 marks)
b. Analyse the following program using Intraprocedural Very Busy Expression
analysis
if [a>b]1 then
([x:=b-a]2 ; [y:=a-b]3)
else
([y:=b-a]4; [x:=a-b]5)
(7 marks)
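For part (a) of question 15, the kill and gen sets of Reaching Definitions can be tabulated mechanically. The sketch below assumes the common formulation in which an assignment [x:=a]^ℓ generates (x, ℓ) and kills every definition of x in the program, including its own and the entry marker (x, ?). The `blocks` encoding and the function name are illustrative.

```python
# Labelled blocks of the while program in part (a):
# label -> assigned variable (None for a test).
blocks = {1: "x", 2: "y", 3: None, 4: "y", 5: "x"}

def kill_gen(blocks):
    """Return {label: (kill, gen)} for Reaching Definitions.

    A definition is a pair (variable, label); "?" marks the unknown
    definition that reaches the program entry.
    """
    result = {}
    for label, var in blocks.items():
        if var is None:  # tests such as [x>1]^3 define nothing
            result[label] = (set(), set())
            continue
        kill = {(var, "?")} | {(var, l) for l, v in blocks.items() if v == var}
        gen = {(var, label)}
        result[label] = (kill, gen)
    return result

for label, (kill, gen) in kill_gen(blocks).items():
    print(label, sorted(kill, key=str), sorted(gen, key=str))
```

Under this formulation blocks 1 and 5 share the kill set {(x, ?), (x, 1), (x, 5)}, blocks 2 and 4 share {(y, ?), (y, 2), (y, 4)}, and the test at label 3 has empty kill and gen sets.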
OR
16.
a. Find Maximal Fixed Point (MFP) solution for the program
[x:=a+b]1;
[y:=a*b]2 ;
while [y>a+b]3 do
([a:=a+1]4; [x:=a+b]5)
(7 marks)
b. With examples, explain the difference between flow sensitive and flow insensitive
analysis. (7 marks)
17.
a. Prove that (L, α, γ, M) is an adjunction if and only if (L, α, γ, M) is a Galois
connection. (7 marks)
b. Prove that if α : L → M is completely additive then there exists γ : M → L such
that (L, α, γ, M) is a Galois connection. Similarly, if γ : M → L is completely
multiplicative then there exists α : L → M such that (L, α, γ, M) is a Galois
connection. (7 marks)
OR
18.
a. Show that if (Li, αi, γi, Mi) are Galois connections and βi : Vi ➝ Li are
representation functions then
((α1 o β1) ↠ (α2 o β2)) (↝) = α2 o ((β1 ↠ β2) (↝)) o γ1
(7 marks)
b. Briefly explain Kildall’s algorithm for abstract interpretation (7 marks)
19.
a. Briefly explain the need of specification and verification of code. (7 marks)
b. Argue that Hoare Logic is sound. When is Hoare Logic complete? Let {A}P{B}
be a Hoare triple such that Hoare Logic is complete for the program P. Then, is it
always possible to check the validity of the Hoare Triple? If not, what is the
difficulty? (7 marks)
OR
20.
a. With suitable examples, show the difference between partial and total correctness.
(7 marks)
b. With a suitable example, show how a basic program segment can be verified
using the tool VCC. (7 marks)
Teaching Plan
Module 1 (Mathematical Foundations) 6 Hours
2.3 Reaching Definition Analysis 1 Hour
2.4 Abstract Interpretation 1 Hour
2.5 Algorithm to find the least solutions for the Data Flow Analysis problem 1 Hour
4.1 A Mundane Approach to Correctness 1 Hour
4.2 Approximations of Fixed Points 1 Hour
4.3 Galois Connections, 1 Hour
5.11 Verifying C programs using the tool VCC (Lecture 3) 1 Hour