
MSc in Big Data Analytics

Department of Computer Science

RKMVERI, Belur Campus

Program Outcomes

Program Specific Outcomes

Course Outcomes

Program outcomes
• Inculcate critical thinking to carry out scientific investigation objectively, without being biased by preconceived notions.

• Equip the student with skills to analyze problems, formulate a hypothesis, evaluate and validate results, and draw reasonable conclusions thereof.

• Prepare students for pursuing research or careers in industry in the mathematical sciences and allied fields.

• Imbibe effective scientific and/or technical communication, both oral and written.

• Continue to acquire relevant knowledge and skills appropriate to professional activities, and demonstrate the highest ethical standards in the mathematical sciences.

• Create awareness to become an enlightened citizen with commitment to deliver one's responsibilities within the scope of bestowed rights and privileges.

Program Specific Outcomes


• Basic understanding of statistical methods, probability, mathematical foundations, and computing
methods relevant to data analytics.

• Knowledge about storage, organization, and manipulation of structured data.

• Understand the challenges associated with big data computing.

• Training in contemporary big data technologies

• Understanding of the analytics chain, beginning with problem identification and translation, followed by model building and validation, with the aim of knowledge discovery in the given domain.

• Applying dimensionality reduction techniques in finding patterns/features/factors in big data.

• Estimation of various statistics from stored and/or streaming data in the iterative process of model
selection and model building.

• Future event prediction associated with a degree of uncertainty.

• Modelling optimization techniques such as linear programming, non-linear programming, and transportation techniques in various problem domains such as marketing and supply chain management.

• Interpret analytical models to make better business decisions.

DA102
Basic Statistics
Time: TBA
Place: IH402 & Bhaskara Lab

Dr. Sudipta Das

jusudipta@gmail.com
Office: IH404, Prajnabhavan, RKMVERI, Belur
Office Hours: 11 am–12 noon, 3 pm–4 pm
(+91) 99039 73750

Course Description: DA102 provides an introduction to basic statistical methods for the analysis of categorical and continuous data. Students will also learn to make practical use of the statistical software package R.

Prerequisite(s): NA
Note(s): Syllabus changes yearly and may be modified during the term itself, depending on the circum-
stances. However, students will be evaluated only on the basis of topics covered in the course.
Course url:
Credit Hours: 4

Text(s):
Statistics; David Freedman, Robert Pisani and Roger Purves
The Visual Display of Quantitative Information; Edward Tufte
Mathematical Statistics with Applications; Kandethody M. Ramachandran and Chris P. Tsokos

Course Objectives:
Knowledge acquired: Students will get to know
(1) fundamental statistical concepts and some of their basic applications in the real world,
(2) how to organize, manage, and present data,
(3) how to use a wide variety of specific statistical methods, and
(4) computer programming in R.
Skills gained: The students will be able to
(1) apply technologies in organizing different types of data,
(2) present results effectively by making appropriate displays, summaries, and tables of data,
(3) perform simple statistical analyses using R, and
(4) analyze the data and come up with correct interpretations and relevant conclusions.

1
Course Outline (tentative) and Syllabus:
The weekly coverage might change as it depends on the progress of the class. However, you must keep up
with the reading assignments. Each week assumes 4 hour lectures. Quizzes will be unannounced.

Week Content
Week 1: Introduction, Types of Data, Data Collection, Introduction to R, R fundamentals, Arithmetic with R
Week 2: Tabular Representation: Frequency Tables, Numerical Data Handling, Vectors, Matrices, Categorical Data Handling
Week 3: Data frames, Lists, R programming, Conditionals and Control Flow, Loops, Functions
Week 4: Graphical Representation: Bar diagram, Pie-chart, Histogram, Data Visualization in R, Basic R graphics, Different plot types, Plot customizations
Week 5: Descriptive Numerical Measures: Measures of Central Tendency, Measures of Variability, Measures of Skewness, Kurtosis; Quiz 1
Week 6: Descriptive Statistics using R: Exploring Categorical Data, Exploring Numerical Data
Week 7: Numerical Summaries, Box and Whiskers Plot
Week 8: Problem Session, Review for Midterm exam
Week 9: Concept of sample and population, Empirical distribution, Fitting probability distributions
Week 10: Goodness of fit, Distribution fitting in R
Week 11: Analysis of bivariate data: Correlation, Scatter plot; Representing bivariate data in R
Week 12: Simple linear regression
Week 13: Linear Regression in R; Quiz 2
Week 14: Two-way contingency tables, Measures of association, Testing for dependence
Week 15: Problem Session, Review for Final Exam
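For quick reference, the descriptive measures introduced in Week 5 are built from the first few sample moments; the two most basic ones, the sample mean and the (unbiased) sample variance of observations x_1, ..., x_n, are

\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2,

with the measures of skewness and kurtosis defined analogously from the third and fourth central moments.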

DA321 Modeling for Operations Management

Instructor
Sudeep Mallick, Ph.D.
Sudeep.mallick@gmail.com

Course Description:
DA321 deals with modelling techniques for operations management tasks in business. In particular, the course will cover advanced techniques of operations research and modelling, along with their applications in various business domains, with a special focus on supply chain management and supply chain analytics.

Prerequisite(s): Basic course in Operations Research covering Linear Programming fundamentals.
Credit Hours: 4

Text(s):
Operations Research, seventh revised edition (2014)
P K Gupta and D S Hira
ISBN: 81-219-0218-9

Introduction to Operations Research, eighth edition


Frederick S. Hillier & Gerald J. Lieberman
ISBN: 0-07-252744-7

Operations Research: An Introduction, ninth edition


Hamdy A. Taha
ISBN: 978-93-325-1822-3

AMPL: A Modeling Language for Mathematical Programming, second Edition


www.ampl.com

Course Objectives:
Knowledge acquired:
1. Different operations research modelling techniques.
2. Application of the modelling techniques in business domains.
3. Hands-on implementation of the models using computer software such as
MS-EXCEL, CPLEX solvers.

Skills acquired: Students will be able to


1. apply the appropriate operations research technique to formulate
mathematical models of the business problem
2. implement and evaluate alternative models of the problem in computer
software
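As an illustrative sketch only (the official lab toolchain named above uses AMPL/CPLEX and MS-EXCEL, not Python), a small balanced transportation problem of the kind covered in Week 5 can be formulated and solved with scipy.optimize.linprog; the costs, supplies and demands below are made-up example data:

import numpy as np
from scipy.optimize import linprog

# Hypothetical data: 2 supply points, 3 demand points (balanced problem).
cost = np.array([[4.0, 6.0, 8.0],      # unit shipping cost from supply i to demand j
                 [5.0, 3.0, 7.0]])
supply = np.array([60.0, 40.0])        # availability at each supply point
demand = np.array([30.0, 50.0, 20.0])  # requirement at each demand point

m, n = cost.shape
c = cost.flatten()                     # decision variables x_ij, flattened row-wise

# Equality constraints: every supply is shipped out fully, every demand is met exactly.
A_eq = np.zeros((m + n, m * n))
for i in range(m):
    A_eq[i, i * n:(i + 1) * n] = 1.0   # sum_j x_ij = supply_i
for j in range(n):
    A_eq[m + j, j::n] = 1.0            # sum_i x_ij = demand_j
b_eq = np.concatenate([supply, demand])

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
print(res.x.reshape(m, n))             # optimal shipment plan
print(res.fun)                         # minimum total cost

The same model is what one would write declaratively in AMPL and hand to the CPLEX solver during the lab sessions.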
Grade Distribution:
Assignments 20%, Internal Test 20%, Mid-term exam 30%, Final exam 30%
Course Outline (tentative) and Syllabus:
Week Content
Week 1
• Advanced Linear Programming: Duality theory, Dual Simplex method
• Reading assignment: Chapter 6, GH / Chapter 4, HT
Week 2
• Lab session on Linear Programming and Sensitivity Analysis with AMPL (CPLEX solver)
• Lab assignment 1; Reading assignment: AMPL manual
Week 3
• Supply chain management modelling: supply chain management definition, modelling, production planning decisions
• Reading assignment: Instructor notes
Week 4
• Lab session on modelling aggregate planning problems
Week 5
• Transportation problem: transportation model, solution techniques, variations
• Reading assignment: Chapter 3, GH / Chapter 5, HT
• Transportation problem lab sessions
• Lab instructions: Instructor notes
Week 6
• Multi-stage transportation problem: formulation, solution techniques, truck allocation problem, Travelling Salesman Problem, vehicle routing problem
• Reading assignment: Instructor notes
• Internal test 1
Week 7
• Assignment problem: assignment model, solution techniques
• Reading assignment: Chapter 4, GH / Chapter 5, HT
• Lab assignment 2
Week 8
• Integer programming: problem formulation and solution techniques
• Reading assignment: Chapter 6, GH / Chapter 9, HT
• Review for Midterm Exam
Week 9
• Non-linear programming: problem formulation and solution techniques
• Reading assignment: Chapter 16, GH / Chapter 21, HT
• Lab assignment 3
Week 10
• Inventory management: deterministic inventory models, cycle inventory models
• Reading assignment: Chapter 12, GH / Chapter 13, HT
• Internal test 2
Week 11
• Inventory management: stochastic inventory models, safety stock models
• Reading assignment: Chapter 12, GH / Chapter 13, HT
• Lab session: Inventory management modelling
• Reading assignment: Instructor notes
Week 12
• Lab session: Supply chain management beer game
Week 13
• Queueing theory: pure birth and death models
• Reading assignment: Chapter 10, GH / Chapter 18, HT
Week 14
• Queueing theory: general Poisson model, specialised Poisson queues
• Lab session: queueing theory
• Reading assignment: Chapter 10, GH / Chapter 18, HT
• Lab assignment 4
Week 15
• Queueing theory: queueing decision models
• Reading assignment: Chapter 10, GH / Chapter 18, HT
DA205 Data Mining
Instructor: Prof. Aditya Bagchi

Course Description: The quantity and variety of online data is increasing very rapidly. The data
mining process includes data selection and cleaning, machine learning techniques to “learn” knowledge that
is “hidden” in data, and the reporting and visualization of the resulting knowledge. This course will cover
these issues.

Prerequisite(s): First course in DBMS.


Credit Hours: 2

Text(s):
• Data Mining: Concepts and Techniques; J. Han and M. Kamber; Morgan Kaufmann.
• Mining of Massive Datasets; A. Rajaraman, J. Leskovec, J. D. Ullman.
• Mining the Web; S. Chakrabarti; Morgan Kaufmann.

Course Objectives:
Knowledge acquired: By the end of this course, students will know
(1) standard data mining problems and the associated algorithms, and
(2) how to apply and implement standard algorithms in similar problems.
Competence Developed: The student will be able to
(1) understand a data environment, extract relevant features, and identify the algorithms necessary for the required analysis, and
(2) accumulate, extract, and analyze social network data.
Course Outline (tentative) and Syllabus: The weekly coverage might change as it depends on
the progress of the class. However, you must keep up with the reading assignments. Each week assumes 4
hour lectures.
1. Introduction to Data Mining concept, Data Cleaning, transformation, reduction and summarization.
(1 lecture = 2 hours)
2. Data Integration - Multi and federated database design, Data Warehouse concept and architecture. (2
lectures = 4 hours)
3. Online Analytical Processing and Data Cube. (2 lectures =4 Hours)
4. Mining frequent patterns and association of items, Apriori algorithm with fixed and variable support,
improvements over Apriori method - Hash-based method, Transaction reduction method, Partitioning
technique, Dynamic itemset counting method. (2 Lectures = 4 Hours)
5. Frequent Pattern growth and generation of FP-tree, Mining closed itemsets. (1 Lecture = 2 Hours)
6. Multilevel Association rule, Association rules with constraints, discretization of data and association
rule clustering system. (1 Lecture = 2 Hours)
7. Association mining to Correlation analysis. (1 Lecture = 2 Hours)
8. Mining time-series and sequence data. (2 Lectures = 4 Hours)
9. Finding similar items and functions for distance measures. (4 Lectures = 8 Hours)
10. Recommendation system, content based and collaborative filtering methods. (5 Lectures = 10 Hours)
11. Graph mining and social network analysis. (5 Lectures = 10 Hours)
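As a minimal illustration of the frequent-itemset idea behind the Apriori material in items 4 and 5 above, the following toy Python sketch (made-up transactions, an arbitrary support threshold, and only single items and pairs rather than the full level-wise Apriori algorithm) shows the counting-and-pruning pattern:

from itertools import combinations
from collections import Counter

# Toy transaction database: each transaction is a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]
min_support = 3  # absolute support threshold chosen for this example

# Count frequent single items first; candidate pairs are then built only from
# frequent items (the Apriori pruning idea: a pair can be frequent only if both items are).
item_counts = Counter(item for t in transactions for item in t)
frequent_items = {i for i, c in item_counts.items() if c >= min_support}

pair_counts = Counter()
for t in transactions:
    for pair in combinations(sorted(t & frequent_items), 2):
        pair_counts[pair] += 1

frequent_pairs = {p: c for p, c in pair_counts.items() if c >= min_support}
print(frequent_items)
print(frequent_pairs)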

DA220 Machine Learning
Instructor: Tanmay Basu

Course Description: DA220 deals with topics in supervised and unsupervised learning methodolo-
gies. In particular, the course will cover different advanced models of data classification and clustering
techniques, their merits and limitations, different use cases and applications of these methods. Moreover,
different advanced unsupervised and supervised feature engineering schemes to improve the performance of
the learning techniques will be discussed.

Prerequisite(s): (1) Linear Algebra and (2) Probability and Stochastic processes
Credit Hours: 4

Text(s):

Introduction to Machine Learning E. Alpaydin ISBN: 978-0262-32573-8


The Elements of Statistical Learning J. H. Friedman, R. Tibshirani, and T. Hastie ISBN: 978-0387-84884-6
Pattern Recognition S. Theodoridis and K. Koutroumbas ISBN: 0-12-685875-6
Pattern Classification R. O. Duda, P. E. Hart and D. G. Stork ISBN: 978-0-471-05669-0
Introduction to Information Retrieval C. D. Manning, P. Raghavan and H. Schutze ISBN: 978-0-521-86571-5

Course Objectives:
Knowledge Acquired:
1) The background and working principles of various supervised learning techniques, viz., linear regression, logistic regression, Bayes and naive Bayes classifiers, support vector machines, etc., and their applications.
2) The importance of cross validation to optimize the parameters of a classifier.
3) The idea of different kinds of clustering techniques, e.g., the k-means, k-medoid, single-linkage and DBSCAN algorithms, and their merits and demerits.
4) The significance of feature engineering to improve the performance of the learning techniques, and an overview of various supervised and unsupervised feature engineering techniques.
5) The essence of different measures, e.g., precision, recall, etc., to evaluate the performance of machine learning techniques.

Skills Gained: The students will be able to


1) pre-process and analyze the characteristics of different types of standard data,
2) work on scikit-learn, a standard machine learning library,
3) evaluate the performance of different machine learning techniques for a particular application and
validate the significance of the results obtained.
Competence Developed:
1) Build skills to implement different classification and clustering techniques as per requirement to
extract valuable information from any type of data set.
2) Can train a classifier on an unknown data set to optimize its performance
3) Develop novel solutions to identify significant features in data e.g., identify the feedback of po-
tential buyers over online markets to increase the popularity of different products.
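A minimal scikit-learn sketch of the workflow described above (train/test split, cross-validation for parameter tuning, and precision/recall-style evaluation); the built-in iris data is used purely as a stand-in data set, not as course material:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import classification_report

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Tune the SVM regularization parameter C by 5-fold cross validation on the training set.
grid = GridSearchCV(SVC(kernel="linear"), param_grid={"C": [0.1, 1, 10]}, cv=5)
grid.fit(X_train, y_train)

# Evaluate the tuned classifier on the held-out test set (precision, recall, f-measure).
y_pred = grid.predict(X_test)
print(grid.best_params_)
print(classification_report(y_test, y_pred))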
Evaluation:
Assignments 50%, Midterm Exam 25%, Endterm Exam 25%
Course Outline (tentative) and Syllabus:
The weekly coverage might change as it depends on the progress of the class. However, you must keep up
with the reading assignments. Each week assumes 4 hour lectures.

1
Week Contents
Week 1
• Overview of machine learning: idea of supervised and unsupervised learning, regression vs classification, concept of training and test sets, classification vs clustering, and significance of feature engineering
• Linear regression: least square and least mean square methods
Week 2
• Bayes decision rule: Bayes theorem, Bayes classifier and error rate of the Bayes classifier
• Minimum distance classifier and linear discriminant function as derived from the Bayes decision rule
• Naive Bayes classifier: Gaussian model, multinomial model, Bernoulli model
Week 3
• k-Nearest Neighbor (kNN) decision rule: idea of the kNN classifier, distance-weighted kNN decision rule and other variations of the kNN decision rule
Week 4
• Perceptron learning algorithm: incremental and batch versions, proof of convergence
• XOR problem, two-layer perceptrons to resolve the XOR problem, introduction to multilayer perceptrons
Week 5
• Discussion on different aspects of linear discriminant functions for data classification
• Logistic regression and maximum margin classifier
Week 6
• Support vector machine (SVM): hard margin
• Soft margin SVM classifier
Week 7
• Cross validation and parameter tuning
• Different techniques to evaluate classifiers, e.g., precision, recall and f-measure
Week 8
• The basics of working with scikit-learn, a machine learning library in Python
• How to implement different classifiers in scikit-learn, tune the parameters and evaluate the performance
Week 9
• Text classification (case study for data classification): overview of text data, stemming and stopword removal, tf-idf weighting scheme and n-gram approach
• How to work with text data in scikit-learn
Week 10
• Assignment 1: Evaluate the performance of different classifiers on a newswire corpus, e.g., Reuters-21578
• Review for midterm exam
Week 11
• Data clustering: overview, cluster validity index
• Partitional clustering methods: k-means, bisecting k-means
• k-medoid and buckshot clustering techniques
Week 12
• Hierarchical clustering techniques: single linkage, average linkage and group average hierarchical clustering algorithms
• Density-based clustering techniques, e.g., DBSCAN
Week 13
• Feature engineering: overview of feature selection, supervised and unsupervised feature selection techniques
• Overview of principal component analysis for feature extraction
Week 14
• How to work with WordNet, an English lexical database
• Sentiment analysis (case study for data clustering): overview, description of a data set of interest for sentiment identification, sentiment analysis using WordNet
• Assignment 2: Sentiment analysis from short message texts
Week 15
• Practice class for the second assignment
• Review for endterm exam

DA104 Probability and Stochastic Processes

Instructor
Dr. Arijit Chakraborty (ISI Kolkata)

Course Description:
DA104 covers the foundations of probability and stochastic processes. More specifically, it deals with sample spaces and probability measures, random variables and their standard distributions, joint and conditional distributions, expectation, moment-generating functions, limit theorems, and an introduction to stochastic processes through Markov chains and the Poisson process.

Prerequisite(s): (1) Basic knowledge of python and Java programming languages (2) Tabular data
processing / SQL queries. (3) Basic knowledge of common machine learning algorithms.
Credit Hours: 4

Text(s):
1. Introduction to time series analysis; PJ Brockwell and RA Davis
2. Time Series Analysis and Its Applications; Robert H. Shumway and David S. Stoffer
3. Introduction to Statistical time series; WA Fuller
4. A first course in Probability, Sheldon Ross, Pearson Education, 2010
5. Time Series Analysis; Wilfredo Palma
6. P. G. Hoel, S. C. Port and C. J. Stone: Introduction to Probability Theory, University Book
Stall/Houghton Mifflin, New Delhi/New York, 1998/1971.

Syllabus
1. Basic Probability
a. Introduction
b. Sample Spaces
c. Probability Measures
d. Computing Probabilities: Counting Methods
i. The Multiplication Principle
ii. Permutations and Combinations
e. Conditional Probability
f. Independence

2. Random Variables
a. Discrete Random Variables
i. Bernoulli Random Variables
ii. The Binomial Distribution
iii. Geometric and Negative Binomial Distributions
iv. The Hypergeometric Distribution
v. The Poisson Distribution
b. Continuous Random Variables
i. The Exponential Density
ii. The Gamma Density
iii. The Normal Distribution
iv. The Beta Density
c. Functions of a Random Variable

3. Joint Distributions
a. Introduction
b. Discrete Random Variables
c. Continuous Random Variables
d. Independent Random Variables
e. Conditional Distributions
i. The Discrete Case
ii. The Continuous Case
f. Functions of Jointly Distributed Random Variables
i. Sums and Quotients
ii. The General Case

4. Expected Values

a. The Expected Value of a Random Variable


i. Expectations of Functions of Random Variables
ii. Expectation of Linear Combinations of Random Variables
b. Variance and Standard Deviation
c. Covariance and Correlation
d. Conditional Expectation
e. Definitions and Examples
f. The Moment-Generating Function

5. Limit Theorems
a. Introduction
b. The Law of Large Numbers
c. Convergence in Distribution and the Central Limit Theorem

6. Stochastic Process
a. Markov chain
i. State transition matrix
ii. Hitting time
iii. Different States
b. Poisson process
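A small numerical illustration of the Markov chain topics above (state transition matrix, long-run behaviour, hitting times), written as a hedged Python/numpy sketch with an arbitrary 3-state transition matrix chosen only for the example:

import numpy as np

# Transition matrix of a 3-state Markov chain: P[i, j] = P(next state = j | current state = i).
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.3, 0.5]])

# Distribution after n steps starting from state 0: row vector times the n-th matrix power.
pi0 = np.array([1.0, 0.0, 0.0])
print(pi0 @ np.linalg.matrix_power(P, 50))   # close to the stationary distribution

# Simulate trajectories and estimate the expected hitting time of state 2 from state 0.
rng = np.random.default_rng(0)
hits = []
for _ in range(5000):
    state, steps = 0, 0
    while state != 2:
        state = rng.choice(3, p=P[state])
        steps += 1
    hits.append(steps)
print(np.mean(hits))   # Monte Carlo estimate of the expected hitting time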
DA230
Enabling Technologies for Big Data Computing

Instructor
Sudeep Mallick, Ph.D.
Sudeep.mallick@gmail.com

Course Description:
DA230 deals with technologies and engineering solutions for enabling big data processing and analytics. More specifically, it deals with the tools for data processing, data management and programming in the distributed programming paradigm, using techniques of MapReduce programming, NoSQL distributed databases, streaming data processing, data ingestion, graph processing and distributed machine learning for big data use cases.
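The canonical first MapReduce-style exercise on this stack is a word count; a minimal PySpark sketch is shown below (the input path is a placeholder, and the course labs may equally use plain Hadoop MapReduce in Java):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("WordCount").getOrCreate()
sc = spark.sparkContext

# "Map" each line to (word, 1) pairs, then "reduce" by key to sum the counts.
lines = sc.textFile("hdfs:///path/to/input.txt")   # placeholder input path
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

for word, count in counts.take(10):
    print(word, count)

spark.stop()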

Prerequisite(s): (1) Basic knowledge of the Python and Java programming languages, (2) tabular data processing / SQL queries, and (3) basic knowledge of common machine learning algorithms.
Credit Hours: 4

Text(s):
Hadoop: The Definitive Guide, fourth edition
Tom White
ISBN: 978-1-491-90163-2

Hadoop in Action, edition: 2011


Chuck Lam
ISBN: 978-1-935-18219-1

Spark in Action, edition: 2017


Petar Zecevic & Marko Bonaci
ISBN: 978-93-5119-948-9

Data-Intensive Text Processing with MapReduce, edition: 2010


Jimmy Lin & Chris Dyer
ISBN: 978-1-608-45342-9
Course Outline (tentative) and Syllabus:
The weekly coverage might change as it depends on the progress of the class.
Each week assumes 4 hour lectures.
Week Content
Week 1
• Big data computing paradigm and Hadoop: big data, Hadoop architecture
• Reading assignment: Chapter 1, LD & Chapter 1, TW
• Lab: setting up the Hadoop platform in standalone mode
Week 2
• Hadoop MapReduce (MR): Lab session with simple MR algorithms in Hadoop standalone mode
• Reading assignment: Chapter 2, LD & Chapter 2, TW
Week 3
• Hadoop Distributed File System (HDFS), YARN and MR architecture, daemons, serialization concept, command line parameters: Lab session
• Reading assignment: Chapters 3-5 & 7, TW
Week 4
• Implementing algorithms in MR - joins, sort, text processing, etc.: Lab session
• Reading assignment: Chapter 3, LD & Chapter 7, TW
• Lab assignment 1
Week 5
• Hadoop operations in Cluster Mode, Hadoop on AWS Cloud: Lab session
• Reading assignment: Instructor notes
Week 6
• Understanding NoSQL using Pig: Lab session
• Reading assignment: Chapter 16, TW
• Lab assignment 2
Week 7
• Introduction to the Apache Spark platform and architecture, RDD
• Reading assignment: Chapters 1-3, ZB
Week 8
• Mapping, joining, sorting, grouping data with Spark RDD: Lab session
• Reading assignment: Chapter 4, ZB
• Review for Midterm exam
Week 9
• Advanced usage of the Spark API: Lab session
• Reading assignment: Chapter 4, ZB
• Lab assignment 3
Week 10
• NoSQL queries using Spark DataFrame and Spark SQL: Lab session
• Reading assignment: Chapter 5, ZB
Week 11
• Using SQL commands with Spark: Lab session
• Reading assignment: Chapter 5, ZB
Week 12
• Machine Learning using Spark MLlib: Lab session
• Reading assignment: Chapter 7, ZB
Week 13
• Machine Learning using Spark ML: Lab session
• Reading assignment: Chapter 8, ZB
• Lab assignment 4
Week 14
• Spark operations in Cluster Mode, Spark on AWS Cloud: Lab session
• Reading assignment: Chapter 11, ZB
Week 15
• Graph processing with Spark GraphX: Lab session
• Reading assignment: Chapter 9, ZB
DA210
Advanced Statistics
Time: TBA
Place: IH402 & Bhaskara Lab

Instructor: TBA

Course Description: DA210 introduces the conceptual foundations of statistical methods and how to apply them to address more advanced statistical questions. The goal of the course is to teach students how to use data and statistical methods effectively to make evidence-based business decisions. Statistical analyses will be performed using R and Excel.

Prerequisite(s): NA
Note(s): Syllabus changes yearly and may be modified during the term itself, depending on the circum-
stances. However, students will be evaluated only on the basis of topics covered in the course.
Course url:
Credit Hours: 4

Text(s):

Statistical Inference;
P. J. Bickel and K. A. Docksum

Introduction to Linear Regression Analysis;


Douglas C. Montgomery

Course Objectives:
Knowledge acquired: Students will get to know
(1) advanced statistical concepts and some of their basic applications in the real world,
(2) the appropriate statistical analysis technique for a business problem,
(3) the appropriateness of statistical analyses, results, and inferences, and
(4) advanced data analysis in R.
Skills gained: The students will be able to
(1) use data to make evidence-based decisions that are technically sound,
(2) communicate the purposes of the data analyses,
(3) interpret the findings from the data analysis, and the implications of those findings, and
(4) implement the statistical methods using R and Excel.

Course Outline (tentative) and Syllabus:
The weekly coverage might change as it depends on the progress of the class. However, you must keep up
with the reading assignments. Each week assumes 4 hour lectures. Quizzes will be unannounced.

Week Content
Week 1: Point Estimation, Method of moments, Likelihood function, Maximum likelihood equations, Unbiased estimator
Week 2: Mean square error, Minimum variance unbiased estimator, Consistent estimator, Efficiency
Week 3: Uniformly minimum variance unbiased estimator, Efficient estimator, Sufficient estimator, Jointly sufficient statistics, Minimal sufficient statistic
Week 4: Interval Estimation, Large Sample Confidence Intervals: One Sample Case
Week 5: Small Sample Confidence Intervals for µ, Confidence Interval for the Population Variance, Confidence Interval Concerning Two Population Parameters
Week 6: Types of Hypotheses, Two types of errors, The level of significance, The p-value or attained significance level
Week 7: The Neyman-Pearson Lemma, Likelihood Ratio Tests, Parametric tests for equality of means and variances
Week 8: Problem Session, Review for Midterm exam
Week 9: Linear Model, Gauss-Markov Model
Week 10: Inferences on the Least-Squares Estimators
Week 11: Analysis of variance
Week 12: Multiple linear regression, Matrix Notation for Linear Regression
Week 13: Regression Diagnostics, Forward, backward and stepwise regression
Week 14: Logistic Regression
Week 15: Problem Session, Review for Final Exam
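For reference, Weeks 9-12 revolve around the linear model; in the simple linear regression model y_i = \beta_0 + \beta_1 x_i + \varepsilon_i covered in Week 12, the least-squares estimators whose inference is studied in Week 10 are

\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x},

and their optimality among linear unbiased estimators rests on the Gauss-Markov assumptions introduced in Week 9.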

DA330
Advanced Machine Learning

Tanmay Basu

Email: welcometanmay@gmail.com
URL: https://www.researchgate.net/profile/Tanmay Basu
Office: IH 405, Prajna Bhavan, RKMVERI, Belur, West Bengal, 711 202
Office Hours: 11 am–5 pm
Phone: (+91)33 2654 9999

Course Description: DA330 deals with topics in supervised and unsupervised learning methodologies.
In particular, the course will cover different advanced models of data classification and clustering techniques,
their merits and limitations, different use cases and applications of these methods. Moreover, different ad-
vanced unsupervised and supervised feature engineering schemes to improve the performance of the learning
techniques will be discussed.

Prerequisite(s): (1) Machine Learning, (2) Linear Algebra and (3) Basic Statistics.
Note(s): Syllabus changes yearly and may be modified during the term itself, depending on the circum-
stances. However, students will be evaluated only on the basis of topics covered in the course.
Course URL:
Credit Hours: 4

Text(s):

Introduction to Machine Learning; E. Alpaydin; ISBN: 978-0262-32573-8

Pattern Recognition and Machine Learning; C. M. Bishop; ISBN: 978-0387-31073-2

The Elements of Statistical Learning; J. H. Friedman, R. Tibshirani, and T. Hastie; ISBN: 978-0387-84884-6

Probabilistic Graphical Models: Principles and Techniques; D. Koller and N. Friedman; ISBN: 978-0262-01319-2

Neural Networks and Learning Machines; S. Haykin; ISBN: 978-0-13-14713-99

Introduction to Information Retrieval; C. D. Manning, P. Raghavan and H. Schutze; ISBN: 978-0-521-86571-5

Deep Learning; I. Goodfellow, Y. Bengio and A. Courville; ISBN: 978-0262-03561-3

Course Objectives:
Knowledge acquired: (1) Different advanced models of learning techniques,
(2) their merits and limitations, and,
(3) applications.
Skills gained: The students will be able to
(1) analyze complex characteristics of different types of data,
(2) efficiently discover knowledge from high-dimensional, large-volume data, and
(3) create advanced machine learning tools for data analysis.
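As a small hedged illustration of the Week 3 material (non-linear SVMs via kernels), the scikit-learn sketch below compares a linear and an RBF-kernel SVM on a synthetic, non-linearly separable data set; it is only an example, not part of the prescribed course material:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-moons: not separable by a linear decision boundary.
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel, C=1.0, gamma="scale").fit(X_train, y_train)
    print(kernel, clf.score(X_test, y_test))   # the RBF kernel typically scores noticeably higher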

Grade Distribution:
Assignments 50%, Midterm Exam 20%, Endterm Exam 30%

Course Outline (tentative) and Syllabus:


The weekly coverage might change as it depends on the progress of the class. However, you must keep up
with the reading assignments. Each week assumes 4 hour lectures.

Week Contents
Week 1
• Overview of machine learning: concept of supervised and unsupervised learning
• Decision tree classification: C4.5 algorithm
Week 2
• Random forest classifier
• Discussion on overfitting of data; boosting and bagging techniques
Week 3
• Non-linear support vector machine (SVM): method and applications
• Detailed discussion on SVM using kernels
Week 4
• Neural networks: overview, XOR problem, two-layer perceptrons
• Architecture of multilayer feedforward networks
Week 5
• Backpropagation algorithm for multilayer neural networks
• Neural networks using radial basis functions: method and applications
Week 6
• Design and analysis of recurrent neural networks
• Deep learning: a case study
Week 7
• Assignment 1: design of efficient neural networks for large and complex data of interest
• Overview of data clustering and the expectation maximization method
• Spectral clustering method
Week 8
• Non-negative matrix factorization for data clustering
• Review for midterm exam
Week 9
• Fuzzy c-means clustering technique
• Overview of recommender systems
Week 10
• Different types of recommender systems and their applications
• Probabilistic graphical models: an overview
Week 11
• Learning in Bayesian networks
• Markov random fields
Week 12
• Hidden Markov models: methods and applications
• Temporal data mining
Week 13
• Conditional random fields (CRF)
• Overview of named entity recognition (NER) in text: a case study
Week 14
• Named entity recognition: inherent vs contextual features, rule-based method
• Rule-based text mining using regular expressions
• Gazetteer-based and CRF-based methods for NER
Week 15
• Assignment 2: Automatic de-identification of protected information from clinical notes
• Review for endterm exam

Ramakrishna Mission Vivekananda Educational and Research Institute
Syllabus for Linear Algebra I
Prepared by: Dr. Soumya Bhattacharya

1 Linear equations
• Systems of linear equations

• Matrices and elementary row operations

• Row reduced Echelon matrices

• Matrix multiplication

• Invertible matrices

• Transpose of a matrix

• Systems of homogeneous equations

• Equivalence of row rank and column rank of a matrix

• Determinant and volume of the fundamental parallelepiped

• Permutation matrices

• Cramer’s rule

2 Vector spaces
• Vector spaces and subspaces

• Bases and dimensions

• Coordinates and change of bases

• Direct sums
3 Linear transformations
• The Rank-Nullity theorem

• Matrix of a linear transformation

• Linear operators and isomorphism of vector spaces

• Determinant of a linear operator

• Linear functionals

• Annihilators

• The double dual

4 Eigenvalues and eigenvectors


• Eigenvalues and eigenvectors of matrices

• The characteristic polynomial

• Algebraic and geometric multiplicities of eigenvalues

• Diagonalizability

• Cayley-Hamilton theorem

• Solving linear recurrences

5 Bilinear forms
• Matrix of a bilinear form

• Symmetric and positive definite bilinear forms

• Normed spaces

• Cauchy-Schwarz inequality and triangle inequality

• Angle between two vectors

• Orthogonal complement

• Projection

• Gram-Schmidt orthogonalization

• Hermitian operators

• The Spectral theorem

6 Introduction to linear programming
• Bounded and unbounded sets

• Convex functions

• Convex cone

• Interior points and boundary points

• Extreme points or vertices

• Convex hulls and convex polyhedra

• Supporting and separating hyperplanes

• Formulating linear programming problems

• Feasible solutions and optimal solutions

• Graphical method

• The basic principle of Simplex method

• Big-M method

Reference books
1. M. Artin, Algebra, Prentice Hall.

2. K. M. Hoffmann, R. Kunze, Linear Algebra, Prentice Hall.

3. G. Strang, Introduction to Linear Algebra, Wellesley-Cambridge Press.

4. S. I. Gass, Linear Programming, Tata McGraw-Hill.

5. G. Hadley, Linear Programming, Narosa Publishing House.

By the end of the course, the students will be able to explain:


• How to check whether a given system of linear equations has any solution or not.

• How to find the solutions (if any) of a system of linear equations.

• Why a system of linear equations with more variables than equations always has a solution,
whereas a system of such equations with more equations than variables may not have any
solution at all.

• How to find the rank and nullity of a matrix.

• Why each permutation matrix is of full rank.

• Why a matrix is invertible if and only if it has nonzero determinant and how to find the
inverse of such a matrix.

• Why a matrix with more columns than rows (resp. more rows than columns) does not
have a left (resp. right) inverse.

• How to extend a basis of a subspace of a vector space V to a basis of V .

• How a change of basis affects the coordinates of a given vector.

• Why both the ranks of a matrix A and its transpose A^T are the same as that of A^T A.

• Why the determinant of the matrix of a linear operator does not depend on the choice of
the basis of the ambient space.

• Why the sum of the dimension of a subspace W of a vector space V and the dimension of
the annihilator of W is the dimension of V .

• Why the double dual of a vector space V is canonically isomorphic to V itself.

• Why the fact that a certain conjugate of a given matrix A is diagonal is equivalent to the
fact that the space on which A acts by left multiplication is a direct sum of the eigenspaces
of A.

• Why every idempotent matrix is diagonalizable.

• Why conjugate matrices have the same eigenvalues with the same algebraic and geometric
multiplicities.

• What Cayley-Hamilton theorem states and why replacing the variable t by the square
matrix A in det(tI − A) does not lead to a proof of this theorem.

• How to solve a linear recurrence whose associated matrix is diagonalizable.

• Why the determinant of an upper or lower triangular matrix is the product of its diagonal
entries.

• Why two diagonalizable matrices commute if and only if they are simultaneously diago-
nalizable.

• Why, for a matrix to represent the dot product with respect to some basis, it is necessary and sufficient that it be symmetric and positive definite.

• Why for a symmetric matrix to be positive definite, it is necessary and sufficient for it to
have strictly positive eigenvalues.

• What is the role of the Cauchy-Schwarz inequality in defining the angle between two
vectors.

• Why the elements in a basis of a subspace W of V and the elements in a basis of the orthogonal complement of W are linearly independent.

• How to orthogonalize a given basis of an inner product space.

• Why each inner product on a real vector space V induces an isomorphism between V and
its dual.

• Why any symmetric matrix is diagonalizable and why all its eigenvalues are real.

• Why in a closed and bounded convex region, a convex function attains its maximum at
the boundary.

• Why it suffices to check only the corner points to find a solution to a given linear pro-
gramming problem, whose feasible region is a convex polyhedron.

Sample questions

Linear equations
1. Let A be a square matrix. Show that the following conditions are equivalent:

(i) The system of equations AX = 0 has only the trivial solution X = 0.

(ii) A is invertible.

2. Show that a matrix with more columns than rows (resp. more rows than columns) does not
have a left (resp. right) inverse.

3. Explain why a system of linear equations with more variables than equations always has a
solution, whereas a system of such equations with more equations than variables may not have
any solution at all.

4. Let A^n = 0. Let I denote the identity matrix of the same size as that of A. Compute the inverse of A − I.

5. Prove that if A is invertible, then (A^t)^{-1} = (A^{-1})^t.

6. Compute the determinant of the following matrix:

\begin{pmatrix}
2 & 1 &   &        &   &   \\
1 & 2 & 1 &        &   &   \\
  & 1 & 2 & 1      &   &   \\
  &   & \ddots & \ddots & \ddots &   \\
  &   &   & 1      & 2 & 1 \\
  &   &   &        & 1 & 2
\end{pmatrix}_{n \times n}

(all unspecified entries are 0).

7. Let n be a positive integer and let

A = \begin{pmatrix}
-x_placeholder
\end{pmatrix}

A = \begin{pmatrix}
2 & -1 &    &        &    &    \\
-1 & 2 & -1 &        &    &    \\
   & -1 & 2 & -1     &    &    \\
   &    & \ddots & \ddots & \ddots &    \\
   &    &    & -1     & 2  & -1 \\
   &    &    &        & -1 & 2
\end{pmatrix}_{n \times n}

(all unspecified entries are 0). Find the value of the determinant of the matrix A.


8. Show that every permutation matrix is of full rank.
9. Compute the determinant of the following matrix:

\begin{pmatrix}
2  & -2 &    &        &    &    &    \\
-1 & 5  & -2 &        &    &    &    \\
   & -2 & 5  & -2     &    &    &    \\
   &    & \ddots & \ddots & \ddots &    &    \\
   &    &    & -2     & 5  & -2 &    \\
   &    &    &        & -2 & 5  & -1 \\
   &    &    &        &    & -2 & 2
\end{pmatrix}_{n \times n}

(all unspecified entries are 0).

10. Compute the determinant of the following matrix:

\begin{pmatrix}
3 & 2 &   &        &   &   \\
1 & 3 & 2 &        &   &   \\
  & 1 & 3 & 2      &   &   \\
  &   & \ddots & \ddots & \ddots &   \\
  &   &   & 1      & 3 & 2 \\
  &   &   &        & 1 & 3
\end{pmatrix}_{n \times n}

(all unspecified entries are 0).

11. If possible, find all the solutions of the equation XY − Y X = I in 3 × 3 real matrices X, Y .

12. Let A ∈ M_{n,n}(R). Show that

(\det A)^2 \le \prod_{i=1}^{n} \left( \sum_{k=1}^{n} A_{k,i}^2 \right),

where A_{k,i} denotes the (k, i)-th entry of A.

13. Let

A = \begin{pmatrix} 2 & -2 & -4 \\ -1 & 3 & 4 \\ 1 & -2 & -3 \end{pmatrix} \in M_{3,3}(R).

Find the inverse of the matrix 37 · A^{372} + 2 · I.


Vector spaces and linear transformations


14. Let f and g be two nonzero linear functionals on a finite dimensional real vector space V
such that their nullspaces (i.e. kernels) coincide. Show that there exists a c ∈ R such that
f = cg.

15. Show that if the product of two n × n matrices is 0, then the sum of their ranks is less than or equal to n.

16. The cross product of two vectors in R^3 can be generalized, for n ≥ 3, to a product of n − 1 vectors in R^n as follows: For x^{(1)}, ..., x^{(n−1)} ∈ R^n, define

x^{(1)} \times \dots \times x^{(n-1)} := \sum_{i=1}^{n} (-1)^{i+1} (\det A_i)\, e_i,

where A ∈ M_{n−1,n}(R) is the matrix whose rows are x^{(1)}, ..., x^{(n−1)} and A_i is the submatrix of A obtained by deleting the i-th column of A. Similarly as in the case n = 3, the cross product x^{(1)} × ... × x^{(n−1)} is given by the formal expansion of

\det \begin{pmatrix} e_1 & e_2 & \cdots & e_n \\ x_1^{(1)} & x_2^{(1)} & \cdots & x_n^{(1)} \\ \vdots & \vdots & & \vdots \\ x_1^{(n-1)} & x_2^{(n-1)} & \cdots & x_n^{(n-1)} \end{pmatrix}

w.r.t. the first row. Show that the following assertions hold for the generalized cross product:
a) x^{(1)} × ... × x^{(i−1)} × (x + y) × x^{(i+1)} × ... × x^{(n−1)} = x^{(1)} × ... × x^{(i−1)} × x × x^{(i+1)} × ... × x^{(n−1)} + x^{(1)} × ... × x^{(i−1)} × y × x^{(i+1)} × ... × x^{(n−1)}.
b) x^{(1)} × ... × x^{(i−1)} × λx × x^{(i+1)} × ... × x^{(n−1)} = λ (x^{(1)} × ... × x^{(i−1)} × x × x^{(i+1)} × ... × x^{(n−1)}).
c) x^{(1)} × ... × x^{(n−1)} = 0 ⇔ x^{(1)}, ..., x^{(n−1)} are linearly dependent.
d) ⟨x^{(1)} × ... × x^{(n−1)}, y⟩ = \det \begin{pmatrix} y_1 & y_2 & \cdots & y_n \\ x_1^{(1)} & x_2^{(1)} & \cdots & x_n^{(1)} \\ \vdots & \vdots & & \vdots \\ x_1^{(n-1)} & x_2^{(n-1)} & \cdots & x_n^{(n-1)} \end{pmatrix}.
e) ⟨x^{(1)} × ... × x^{(n−1)}, x^{(i)}⟩ = 0 for i ∈ {1, ..., n − 1}.

17. For any matrix A, show that the ranks of A and A^T A are the same.

18. Let n ≥ 3, A ∈ O_n and x^{(1)}, ..., x^{(n−1)} ∈ R^n. Define the linear map φ_A : R^n → R^n by φ_A(v) = Av and let the generalized cross product of n − 1 vectors in R^n be defined as in the last exercise. Show that:

φ_A(x^{(1)}) × ... × φ_A(x^{(n−1)}) = det A · φ_A(x^{(1)} × ... × x^{(n−1)}).

19. Let V and W be finite dimensional vector spaces and let iV : V → V and iW : W → W be
identity maps. Let φ : V → W and ψ : W → V be two linear maps. Show that iV − ψ ◦ φ is
invertible if and only if iW − φ ◦ ψ is invertible.

20. If W_1 and W_2 are two subspaces of a vector space V, then show that

(W_1 + W_2)^0 = W_1^0 ∩ W_2^0.

21. If W_1 and W_2 are two subspaces of a vector space V, then show that

(W_1 ∩ W_2)^0 = W_1^0 + W_2^0.


     
 1 1 0 
22. Let V = R3 and let B = 0 , 1 , 1 be a basis of V . Compute the dual basis B∗
3 2 1
 
of V ∗ .

23. Let V, W be finite dimensional vector spaces over a field K and let φ : V → W be a linear map.
(1) Show that φ^* : W^* → V^* is a linear map.
(2) Show that ψ : Hom_K(V, W) → Hom_K(W^*, V^*), φ ↦ φ^*, is an isomorphism.

24. Let V, W be finite dimensional vector spaces over a field K and let φ : V → W be a linear map.
(1) Show that if φ is surjective, then φ^* is injective.
(2) Show that if φ is injective, then φ^* is surjective.

Eigenvalues and eigenvectors


25. Let A be a diagonalizable matrix. Show that A and A^T are conjugate.

26. Let v, w ∈ R^n be eigenvectors of a matrix A ∈ M_{n,n}(R) with corresponding eigenvalues λ and µ respectively. Show that if v + w is also an eigenvector of A, then λ = µ.

27. Let V = Rn and A ∈ Mn,n (R) be a diagonalizable matrix. Show that:

V = (ker ϕA ) ⊕ (Im ϕA ),

where the map ϕA : V −→ V is defined by ϕA (v) := Av for all v ∈ V .

28. Find a closed formula for the n-th term of the linear recurrence defined as follows: F0 =
0, F1 = 1 and
Fn+1 = 3Fn − 2Fn−1 .

29. Let A ∈ On with det A = −1. Show that −1 is an eigenvalue of A with an odd algebraic
multiplicity.

30. Let n be a positive odd integer and let A ∈ SOn . Show that 1 is an eigenvalue of A.

31. If each row sum of a real square matrix A is 1, show that 1 is an eigenvalue of A.

32. Let A be a 2017 × 2017 matrix with all its diagonal entries equal to 2017. If all the rest of
the entries of A are 1, find the distinct eigenvalues of A.
33. Let λ be an eigenvalue of the n × n matrix A = (a_{ij}). Show that there exists a positive integer k ≤ n such that

|\lambda - a_{kk}| \le \sum_{j=1,\, j \neq k}^{n} |a_{jk}|.

34. Let A be a diagonalizable matrix. Show that A and A^T have the same eigenvalues with the same algebraic and geometric multiplicities.
35. (a) Let A be a 3 × 3 matrix with real entries such that A^3 = A. Show that A is diagonalizable.
(b) Let n be a positive integer. Let A be an n × n matrix with real entries such that A^2 = A. Show that A is diagonalizable.
36. Let A be a diagonalizable matrix. Show that A and A^T have the same eigenvalues with the same algebraic and geometric multiplicities.
37. Let A be a 3 × 3 matrix with positive determinant. Let PA (t) denote the characteristic
polynomial of A. If PA (−1) > 1, show that A is diagonalizable.
38. Let A be a 3 × 3 matrix with real entries. If PA (−1) > 0 > PA (1), where PA (t) denotes the
characteristic polynomial of A, show that A is diagonalizable.
39. (a) Show that similar matrices (i.e. conjugate matrices) have the same eigenvalues with the
same algebraic and geometric multiplicities.
(b) Give examples of two matrices with the same characteristic polynomial but with an eigenvalue
which does not have the same geometric multiplicity.
40. Let A be a 3 × 3 matrix with real entries such that A3 = A. Show that A is diagonalizable.
41. Let n be a positive integer and let A be a n × n matrix with real entries such that A3 = A.
Show that A is diagonalizable.
42. For an n × n matrix A with characteristic polynomial P_A(t), is the following a correct proof of the Cayley-Hamilton theorem?

P_A(A) = det(A · I_n − A) = det(A − A) = 0.

Justify your answer.

43. Determine the eigenvalues of the orthogonal matrix

A = \frac{1}{2} \begin{pmatrix} 1 + \tfrac{1}{\sqrt{2}} & -1 & \tfrac{1}{\sqrt{2}} - 1 \\ 1 - \tfrac{1}{\sqrt{2}} & 1 & -\tfrac{1}{\sqrt{2}} - 1 \\ 1 & \sqrt{2} & 1 \end{pmatrix}.

44. (a) Find a closed formula for the n-th term of the linear recurrence defined as follows: F_0 = 0, F_1 = 1 and

F_{n+1} = 2F_n + F_{n-1},

by diagonalizing the matrix \begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix}.
(b) Explain why the above method fails to help us in finding a closed formula for the n-th term of the linear recurrence defined as follows: F_0 = 0, F_1 = 1 and

F_{n+1} = 2F_n − F_{n-1}.

45. Let A be a 5 × 5 real matrix with negative determinant. If PA (±2) > 0 > PA (±1), where
PA (t) denotes the characteristic polynomial of A, show that A is diagonalizable.

46. We say that two matrices A and B are simultaneously diagonalizable if there exists an
invertible matrix P such that both P AP −1 and P BP −1 are diagonal. Show that two diago-
nalizable matrices A and B commute with each other if and only if they are simultaneously
diagonalizable.

47. Find a closed formula for the n-th term of the linear recurrence defined as follows: F0 = 0,
F1 = 1 and
Fn+1 = 3Fn − 2Fn−1 .

48. Solve the following equation for a 2 × 2 matrix X:

X^2 = \begin{pmatrix} 5 & 4 \\ 4 & 5 \end{pmatrix}.

49. Let

A = \begin{pmatrix} 3 & -1 & 1 \\ 1 & -1 & 1 \\ 1 & -1 & 3 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 3 & 1 & 1 \\ -1 & -1 & -1 \\ 1 & 1 & 3 \end{pmatrix}.

(a) Without doing any calculations, explain for which one of the matrices A + B and AB the eigenvectors form a basis of R^3.
(b) Determine that basis of eigenvectors of R^3 for one of the matrices A + B or AB.

50. Construct an example of the scenario where α, β, γ ∈ R^n are such that α ⊥ β, γ ≠ 0, and A, B are n × n matrices such that A · α = aγ and B · β = bγ, where a is a nonzero eigenvalue of A and b is a nonzero eigenvalue of B.

Bilinear forms
51. How many n × n real matrices are both symmetric and orthogonal? Justify your answer.

52. We call a linear map R^n → R^n an isometry if it preserves the dot product on R^n. Show that left multiplication by a real square matrix A defines an isometry on R^n if and only if A is orthogonal.

53. How many n × n complex matrices are there which are positive definite, self-adjoint as well
as unitary?

54. For any complex square matrix A, show that the ranks of A and A∗ are equal.

55. Show that if the columns of a square matrix form an orthonormal basis of Cn , then its rows
do too.

56. Let B ∈ M_{n,n}(R). Show that

ker φ_B = (Im φ_{B^T})^⊥,

where the map φ_B : R^n → R^n is defined by φ_B(v) = Bv.

57. Let V = R4 and let f : V −→ V such that f 2 = 0. Show that for each triplet v1 , v2 , v3 ∈
Im f , we have
Vol(v1 , v2 , v3 ) = 0.

58. Let V = C^2 and let s be a symmetric bilinear form on V. Let q : V → R be the quadratic form corresponding to s. Suppose, for all z_1, z_2 ∈ C, we have

q\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = |z_1|^2 + |z_2|^2 + i(z_1 \bar{z}_2 - \bar{z}_1 z_2).

Compute the determinant of the matrix representing s with respect to the basis

B = \left\{ \begin{pmatrix} 1 \\ i \end{pmatrix}, \begin{pmatrix} 1+i \\ 1 \end{pmatrix} \right\}.

59. Let V be a real vector space with inner product s and let v_1, ..., v_n ∈ V \ {0} be such that s(v_i, v_j) = 0 for all i ≠ j in {1, ..., n}. For v ∈ V, we define \|v\| = \sqrt{s(v, v)}.
(1) Show that for all v ∈ V, we have

\sum_{i=1}^{n} \frac{s(v, v_i)^2}{\|v_i\|^2} \le \|v\|^2. \qquad (1)

(2) Determine all the cases when equality holds in (1).

60. Let V be a finite dimensional vector space and let P and Q be projection maps from V to
V . Show that the following are equivalent:

(a) P ◦ Q = Q ◦ P = 0.

(b) P + Q is a projection.

(c) P ◦ Q + Q ◦ P = 0.

61. Let V = R^3 be the three dimensional Euclidean space with the usual dot product and let U be the subspace of V which is spanned by

\begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}.

Determine the matrix of the orthogonal projection P_U with respect to the standard basis of V.

62. Do the following exercise without using the Spectral Theorem:
(1) Let A = \begin{pmatrix} a & b \\ b & d \end{pmatrix} ∈ M_{2,2}(R). Show that A is diagonalizable.
(2) Let B ∈ M_{3,3}(R) be a symmetric matrix. Show that B is diagonalizable.

63. Let V be a finite dimensional real vector space. For v, w ∈ V \ {0}, we define the angle ∠(v, w) between the vectors v and w as the uniquely determined number ϑ ∈ [0, π] for which

s(v, w) = cos(ϑ) \|v\| \|w\|.

We call φ ∈ End(V) conformal if φ is injective and if

∠(v, w) = ∠(φ(v), φ(w)) for all v, w ∈ V \ {0}.

Show that a linear map φ is conformal if and only if there exist an isometry ψ ∈ End(V) and a λ ∈ R \ {0} such that φ = λ · ψ.

64. Find all the unitary matrices A such that s(v, w) := ⟨v, Aw⟩ defines an inner product on C^n, where ⟨ , ⟩ denotes the canonical inner product on C^n.

65. Let V be a finite dimensional vector space over R. Show that each bilinear form on V can
be uniquely written as the sum of a symmetric and a skew-symmetric bilinear form.

66. Let s be a symmetric bilinear form on a vector space V. If there are vectors v, w ∈ V such that s(v, w) ≠ 0, show that there is a vector u ∈ V such that s(u, u) ≠ 0.

67. Let V be the vector space of the complex-valued continuous functions on the unit circle in C.
a) Show that

\langle f, g \rangle := \int_0^{2\pi} f(e^{i\theta})\, \overline{g(e^{i\theta})}\, d\theta

defines an inner product on V.
b) Define the subspace W ⊆ V by W := {f(e^{iθ}) : f(x) ∈ C[x] and deg(f) ≤ n}. Find an orthonormal basis of W w.r.t. the above inner product.

68. Let A be the following 3 × 3 matrix:

\begin{pmatrix} 1 & 1 & 1 \\ 1 & -1 & -1 \\ 1 & -1 & 1 \end{pmatrix}.

(a) Without any computation, explain why there must exist a basis of R3 consisting only of the
eigenvectors of A.
(b) Find such a basis of R3 .
(c) Determine whether or not the bilinear form s : R^3 × R^3 → R given by s(u, v) := u^T A v defines an inner product on R^3.

69. (a) Let V be a finite dimensional vector space over R and let f and g be two linear functionals on V such that ker f = ker g. Show that there exists an r ∈ R such that g = rf.
(b) Let φ_1, φ_2, ..., φ_5 be linear functionals on a vector space V such that there does not exist any nonzero vector v ∈ V for which φ_1(v) = φ_2(v) = · · · = φ_5(v) = 0. Show that dim V ≤ 5.
 
70. Let w = (1, 2, 3)^T and let the linear map f : R^3 → R be defined by

f(v) = v^T w

for all v ∈ R^3.
a) Find an orthonormal basis of Ker f w.r.t. the dot product.
b) Extend this orthonormal basis of Ker f to an orthonormal basis of R^3.

71. Let P_2(R) denote the set of polynomials of degree ≤ 2 with real coefficients. Define the linear map φ : P_2(R) → R by φ(f) = f(1). Determine (Ker φ)^⊥ with respect to the following inner product:

s(f, g) = \int_{-1}^{1} f(t)\, g(t)\, dt.

72. Let P_3(R) denote the set of polynomials of degree ≤ 3 with real coefficients. On P_3(R), we define the symmetric bilinear form s by

s(f, g) = \int_{-1}^{1} f(t)\, g(t)\, dt.

a) Determine the matrix representation of s w.r.t. the basis {1, t, t^2, t^3}.
b) Show that s is positive definite.
c) Determine an orthonormal basis of P_3(R).

73. Show that the eigenvectors associated with distinct eigenvalues of a self-adjoint matrix are
orthogonal.

74. Let A ∈ M_{n,n}(R) have eigenvalues λ_1, λ_2, ..., λ_n ∈ R which are not necessarily distinct. Suppose v_1, v_2, ..., v_n ∈ R^n are eigenvectors of A associated with the eigenvalues λ_1, λ_2, ..., λ_n respectively, such that v_i ⊥ v_j if i ≠ j. Show that A is symmetric.

75. Let A ∈ M_{n,n}(R) be a skew-symmetric matrix. Let v and w be two eigenvectors of A corresponding respectively to the distinct eigenvalues λ_1 and λ_2. Show that v and w are orthogonal to each other (w.r.t. the dot product).

76. Let A ∈ Mn,n (C) be a self-adjoint matrix. Show that the eigenvalues of A are real.

77. How many orthonormal bases (w.r.t. the dot product) are there in Rn , so that all the
entries of the basis vectors are integers?

78. Let V = C^n, let A ∈ M_{n,n}(C) be a self-adjoint matrix and let the linear operator φ_A : V → V be defined by φ_A(v) = Av. Let W be a subspace of V such that φ_A(W) ⊆ W (i.e. φ_A(w) ∈ W for all w ∈ W). Show that

φ_A(W^⊥) ∩ W = {0}.

79. Let V = R^2 and let s be a symmetric bilinear form on V. Let q : V → R be the quadratic form corresponding to s, given by

q\begin{pmatrix} x \\ y \end{pmatrix} = x^2 + 5xy + y^2.

Determine the matrix of s w.r.t. the basis

B = \left\{ \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 2 \end{pmatrix} \right\}

of R^2.

80. Let V be a finite dimensional vector space over R with an inner product ⟨ , ⟩ and let f : V → R be a linear map. Show that there is a uniquely determined vector v_f such that for all v ∈ V, we have

f(v) = ⟨v, v_f⟩.

81. Given

A = \begin{pmatrix} 3 & -1 & 0 \\ -1 & 0 & 1 \\ 0 & 1 & -1 \end{pmatrix} \in M_{3,3}(R),

find a matrix g ∈ GL_3(R) such that g^T A g is of the form

\begin{pmatrix} I_k & & \\ & -I_l & \\ & & O \end{pmatrix}.

  
82. Draw the curve

C := \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \in R^2 \;\middle|\; 3x^2 + 4xy + 3y^2 = 5 \right\}.

83. Let X ∈ M_{n,n}(C) be a self-adjoint matrix and suppose m is a positive integer such that X^m = I. Show that X^3 − 2X^2 − X + 2I = 0.

84. Let n ∈ Z_{≥2}. Show that s(A, B) := tr(A · B^T) defines an inner product on V = M_{n,n}(R). Let φ ∈ End(V) be defined by

φ(A) = A^T.

(1) Show that φ is hermitian.
(2) Show that φ is an isometry.
(3) Find the eigenvalues of φ.
(4) Find an orthonormal basis B of V made up of the eigenvectors of φ.
(5) Find the algebraic multiplicities of the eigenvalues of φ.
85. For x ∈ R, let the matrix A_x be defined by

A_x := \frac{1}{1 + x + x^2} \begin{pmatrix} -x & x + x^2 & 1 + x \\ 1 + x & -x & x + x^2 \\ x + x^2 & 1 + x & -x \end{pmatrix}.

(1) Show that for all x ∈ R, we have A_x ∈ SO_3.
(2) Conclude from (1) that for all real x ≠ ±1, there exist a g_x ∈ O_3 and an α_x ∈ (0, π) ∪ (π, 2π) such that

g_x A_x g_x^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha_x & -\sin\alpha_x \\ 0 & \sin\alpha_x & \cos\alpha_x \end{pmatrix}.
√ √ √
(3) Determine the complex eigenvalues of Ax for x = 1 + 2 + 3 + 1+√2 3 .

 
86. (1) Find a matrix g ∈ O_2 which diagonalizes the matrix

A = \begin{pmatrix} 13 & 12 \\ 12 & 13 \end{pmatrix}.

(2) Find a matrix X ∈ M_{2,2}(R) which defines a scalar product through s(v, w) = ⟨v, Xw⟩ on R^2 and which satisfies the following equation:

X^2 − A = 0.

87. Let A ∈ M_{n,n}(R) be a symmetric matrix and let B ∈ M_{n,n}(R) be a skew-symmetric matrix. Let M = A + iB and let v := (λ_1, ..., λ_n)^T, where λ_1, ..., λ_n are the eigenvalues of M. Show that

\|v\| = \sqrt{\sum_{j,k=1}^{n} |M_{jk}|^2}

w.r.t. the canonical norm on C^n.

88. Let φ : Cn → Cn be a nilpotent, hermitian endomorphism. Show that: φ = 0.

89. Let A, B ∈ M_{n,n}(C) be two self-adjoint matrices. Show that the following are equivalent:
(1) There is a unitary matrix g such that both gAg^{-1} and gBg^{-1} are diagonal matrices.
(2) The matrix AB is self-adjoint.
(3) AB = BA.

90. (1) Let A, B ∈ Mn,n (C) be nilpotent matrices such that AB = BA holds. Show that A + B
is nilpotent.
(2) Let A, B ∈ Mn,n (C) and r, s ∈ Z>0 such that Ar = I, B s = 0 and AB = BA. Show that
A − B is invertible.

91. Let

A = \begin{pmatrix} 1 & -2 & 2 \\ 0 & -2 & 1 \\ -2 & 1 & -2 \end{pmatrix} \in M_{3,3}(R).

(1) Find a decomposition A = D + N, where D is a diagonal matrix and N is a nilpotent matrix.
(2) Compute A^{2012}.

92. Let A ∈ Mn,n (R) be a nilpotent matrix and let V = Mn,n (R). Let ϕ ∈ End(V ) defined by

ϕ(B) = AB − BA for B ∈ V .

Show that ϕ is nilpotent on V .

93. Let V = R^n with s = ⟨·, ·⟩ and let B = {v_1, ..., v_n} be an orthonormal basis of V. Let U_i = (span{v_i})^⊥ for i ∈ {1, ..., n}. Show that

S_{U_i} ∘ S_{U_j} = S_{U_j} ∘ S_{U_i}

for i, j ∈ {1, ..., n}, where S_{U_i} and S_{U_j} are the reflections in U_i and U_j.

94. Let V be a finite dimensional vector space and let P ∈ End(V) be a projection. Let Id ∈ End(V) be the identity map of V (i.e. Id(v) = v for all v ∈ V). Show that
(1) Id − P is a projection.
(2) Id − 2P is bijective.
(3) E_0 ⊕ E_1 = V, where E_0 and E_1 are respectively the eigenspaces of P corresponding to the eigenvalues 0 and 1.

95. Let A ∈ Mn,n (C) and let B = A − A∗ . Show that B is diagonalizable and the real parts of
all the eigenvalues of B are zero.

96. Let A ∈ SO2 . Show that there is a skew symmetric matrix X ∈ M2,2 (R), such that

exp(X) = A.
 
97. Let V = R^5 and let ℓ ∈ V^* be given by ℓ(v) = v_1 + 2v_2 + 3v_3 + 4v_4 + 5v_5 for v = (v_1, ..., v_5)^T ∈ V.
(1) Find an orthonormal basis of ker ℓ w.r.t. the dot product.
(2) Extend this basis of ker ℓ to an orthonormal basis of V.

98. Let V = R^4, let

A = \frac{1}{2} \begin{pmatrix} 2 & 1 & 2 & -3 \\ 1 & 2 & -3 & 2 \\ 2 & -3 & 2 & 1 \\ -3 & 2 & 1 & 2 \end{pmatrix} \in M_{4,4}(R)

and let s be the symmetric bilinear form whose associated matrix is A.
(1) Determine a basis A of V such that M_A(s) is a diagonal matrix.
(2) Determine a basis B of V such that

M_B(s) = \begin{pmatrix} I_k & & \\ & -I_l & \\ & & O \end{pmatrix}.
   
 2 1 
3
99. Let V = R with s = h·, ·i (the dot product), let U = span  1 , 0  be a subspace
 
0 −1
 
of V and let SU be the reflection in in U .
(1) Determine a matrix representation of SU , w.r.t. the canonical basis A of V .
(2) Show that M (SU )A ∈ O3 and decide whether M (SU )A ∈ SO3 oder M (SU )A 6∈ SO3 or not.

Introduction to linear programming


100. Maximize f (x, y, z) := 6x+3y +10z using Simplex method under the following constraints:

4x + y + z ≤ 5,

2x + y + 4z ≤ 5,
x + 5y + z ≤ 6,
where x, y and z are non-negative rational numbers.

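The simplex exercises 100–108 can be cross-checked with a linear-programming solver; for instance, a sketch for problem 100 using scipy (which minimizes, so the objective is negated):

from scipy.optimize import linprog

# Problem 100: maximize 6x + 3y + 10z, i.e. minimize -(6x + 3y + 10z).
c = [-6, -3, -10]
A_ub = [[4, 1, 1],
        [2, 1, 4],
        [1, 5, 1]]
b_ub = [5, 5, 6]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3, method="highs")
print(res.x, "maximum value =", -res.fun)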
101. Minimize f (x, y, z) := x + 2y + 9z using big-M method under the following constraints:

2x + y + 4z ≥ 5,

2x + 3y + z ≥ 4,
where x, y and z are non-negative rational numbers.

102. (a) A convex linear combination of v1 , v2 , . . . , vn ∈ Rm is a linear combination of the
form t1 v1 + · · · + tn vn , where t1 , . . . , tn ≥ 0 and t1 + · · · + tn = 1. For example, the points on the straight line
connecting v1 and v2 are given by tv1 + (1 − t)v2 , where t lies in the interval [0, 1] ⊂ R. Show that
any arbitrary point in a triangle in Rm with vertices v1 , v2 and v3 is given by a convex linear
combination of its vertices.
(b) Show that any arbitrary point in a tetrahedron in Rm with vertices v1 , v2 , v3 and v4 is given
by a convex linear combination of its vertices.
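To make the definition concrete: the coefficients t_i of a point inside a triangle can be recovered by solving a small linear system. A sketch with an arbitrarily chosen triangle and point (not taken from the exercise):

import numpy as np

# Vertices of a triangle in R^2 and a point p inside it (illustrative values).
v1, v2, v3 = np.array([0.0, 0.0]), np.array([4.0, 0.0]), np.array([1.0, 3.0])
p = 0.2 * v1 + 0.5 * v2 + 0.3 * v3

# Solve t1*v1 + t2*v2 + t3*v3 = p together with t1 + t2 + t3 = 1.
M = np.vstack([np.column_stack([v1, v2, v3]), np.ones(3)])
t = np.linalg.solve(M, np.append(p, 1.0))
print(t)                                        # [0.2, 0.5, 0.3]
assert np.all(t >= 0) and np.isclose(t.sum(), 1.0)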
103. Let f : R2 → R be defined by f (x, y) := 2x + 3y. Find the maximum value attained by f
in the region where 2y − x ≤ 10, 3x + 2y ≤ 9 and 2x + 5y ≥ 8.
104. Maximize f (x, y, z) := 2x + 5y + 3z using Simplex method under the following constraints:

14x + 8y + 5z ≤ 15,

12x + 7y + 8z ≤ 14,
3x + 17y + 9z ≤ 16,
where x, y and z are non-negative rational numbers.
105. Minimize f (x, y, z) := x + 9y + 9z using big-M method under the following constraints:

6x + y + 5z ≥ 11,

4x + 7y + 2z ≥ 9,
where x, y and z are non-negative rational numbers.
106. (a) Recall that any arbitrary point in a convex polyhedron is given by a convex linear
combination of its vertices. Using this, show that the minimum and the maximum values
attained by a linear functional f : Rn → R on a convex polyhedron P ⊂ Rn are the same as the
minimum and the maximum values attained by f on the set of the vertices of P.
(b) Let f : R2 → R be defined by f (x, y) := 5x − 3y. Find the maximum value attained by f in
the region where 4y − 3x ≤ 10, 7x + 2y ≤ 9 and 2x + 5y ≥ 8.
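Part (a) suggests a brute-force check for part (b): enumerate the pairwise intersection points of the boundary lines, keep the feasible ones, and evaluate f there. The sketch below assumes the feasible region is bounded, so the maximum is attained at a vertex.

import itertools
import numpy as np

# Constraints of 106(b) written as a_i . (x, y) <= b_i.
A = np.array([[-3.0, 4.0],      # 4y - 3x <= 10
              [ 7.0, 2.0],      # 7x + 2y <= 9
              [-2.0, -5.0]])    # 2x + 5y >= 8
b = np.array([10.0, 9.0, -8.0])

def f(p):
    return 5 * p[0] - 3 * p[1]

vertices = []
for i, j in itertools.combinations(range(3), 2):
    try:
        p = np.linalg.solve(A[[i, j]], b[[i, j]])
    except np.linalg.LinAlgError:
        continue                            # parallel boundary lines
    if np.all(A @ p <= b + 1e-9):           # keep only feasible intersection points
        vertices.append(p)

best = max(vertices, key=f)
print(best, "maximum of f =", f(best))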
107. Maximize f (x, y, z) := 3x + y + 3z using Simplex method under the following constraints:

2x + y + z ≤ 2,

x + 2y + 3z ≤ 5,
2x + 2y + z ≤ 6,
where x, y and z are non-negative rational numbers.
108. Maximize f (x, y, z) := 3x + y + 4z using big-M method under the following constraints:

x + 3y + 4z ≤ 20,

2x + y + z ≥ 8,
3x + 2y + 3z = 18,
where x, y and z are non-negative rational numbers.

Department of Computer Sc. RKMVERI Belur CS 244 Syllabus

CS 244 : Introduction to Optimization Techniques


Course Overview: The process of making optimal judgements according to various criteria is known as
the science of decision making. A mathematical programming problem, also known as an optimization
problem, is a special class of problem concerned with the optimal use of limited resources to meet some
desired objective(s). Mathematical models (simulation based and/or analytical) are used to provide
guidelines for making effective decisions under constraints. This course covers three major analytical
topics in mathematical programming [linear, nonlinear and integer programming]. For each topic, the
theory and modeling aspects are discussed first, and subsequently solution techniques or algorithms are covered.

Prerequisite(s): Linear Algebra


Credit Hours: 4
Course Objectives: Optimization techniques are used in various fields like machine learning, graph
theory, VLSI design and complex networks. In all these applications/fields, mathematical programming
theory supplies the notion of an optimal solution via the optimality conditions, and mathematical
programming algorithms provide tools for training and/or solving large-scale models. Students will gain
knowledge of the theory and applications of several classes of mathematical programs.

Text(s): The course material will be drawn from multiple book chapters, journal articles, reviewed tutorials
etc. However, the following two books are recommended texts for this course.
• Linear Programming and Network Flows, Wiley-Blackwell; 4th Edition, 2010
M. S. Bazaraa, John J. Jarvis and Hanif D. Sherali, ISBN-13: 978-0470462720
• Nonlinear Programming: Theory and Algorithms, Wiley-Blackwell; 3rd Edition (2006)
M. S. Bazaraa, Hanif D. Sherali, C. M. Shetty, ISBN-13: 978-0471486008
Course Policies:
• Grades
Grades in the C range represent performance that meets expectations; Grades in the B range
represent performance that is substantially better than the expectations; Grades in the A
range represent work that is excellent.
• Assignments
1. Students are expected to work independently. Discussion amongst students is encouraged but
offering and accepting solutions from others is an act of dishonesty and students can be penalized
according to the Academic Honesty Policy.
2. No late assignments will be accepted under any circumstances.
• Attendance and Absence
Students are not supposed to miss class without prior notice/permission. Students are responsible
for all missed work, regardless of the reason for absence. It is also the absentee’s responsibility to
get all missing notes or materials.
Grade Distribution:
Assignments 40%
Midterm Exam 20%
Final Exam 40%
Grading Policy: Approximate grade assignments:
>= 90.0 % A+
75.0 – 89.9 % A
60.0 – 74.9 % B
50.0 – 59.9 % C
about 35.0 – 49.9 % D
<= 34.9% F


Table 1: Topics Covered


Mathematical Preliminaries

• Theory of Sets and Functions,


• Vector spaces,
• Matrices and Determinants,
• Convex sets and convex cones,
• Convex and concave functions,
• Generalized concavity
Linear Programming

• The (Conventional) Linear Programming Model


• The Simplex Method: Tableau And Computation
• Special Simplex Method And Implementations
• Duality And Sensitivity Analysis
Integer Programming

• Formulating Integer Programming Problems


• Solving Integer Programs (Branch-and-Bound Enumeration, Implicit Enumeration,
Cutting Plane Methods)
Nonlinear Programming: Theory

• Constrained Optimization Problem (equality and inequality constraints)


• Necessary and Sufficient Conditions
• Constraint Qualification
• Lagrangian Duality and Saddle Point Optimality Criteria
Nonlinear Programming: Algorithms

• The concept of Algorithm


• Algorithms for Unconstrained Optimization
• Constraint Qualification
• Algorithms for Constrained Optimization (Penalty Function, Barrier Function,
Feasible Direction)
Special Topics (if time permits)

• Semi-definite and Semi-infinite Programs


• Quadratic Programming
• Linear Fractional programming
• Separable Programming

DA311
Time Series
Time: TBA
Place: IH402 & Bhaskara Lab

Dr. Sudipta Das

jusudipta@gmail.com
Office: IH404, Prajnabhavan, RKMVERI, Belur
Office Hours: 11 am—12 noon, 3 pm—4 pm
(+91) 99039 73750

Course Description: DA311 provides a broad introduction to the most fundamental methodologies
and techniques used in time series analysis.

Prerequisite(s): (1) Probability & Stochastic Process and (2) Linear Algebra.
Note(s): Syllabus changes yearly and may be modified during the term itself, depending on the circum-
stances. However, students will be evaluated only on the basis of topics covered in the course.
Course url:
Credit Hours: 4

Text(s):
Introduction to time series analysis;
PJ Brockwell and RA Davis

Time Series Analysis and Its Applications;


Robert H. Shumway and David S. Stoffer

Introduction to Statistical time series;


WA Fuller

Time Series Analysis;


Wilfredo Palma

Course Objectives:
Knowledge acquired: Students will get to know
(1) Different time series models MA, AR, ARMA, ARIMA
(2) Autocorrelation and Partial Autocorrelation functions,
(3) Method of time series modelling, in presence of seasonality, and,
(4) Different non-linear time series models such as ARCH and GARCH.
Skills gained: The students will be able to
(1) explore trend and seasonality in time series data by exploratory data analysis,
(2) implement stationary as well as non-stationary models through parameter estimation,
(3) compute forecast for time series data.

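As an illustration of the modelling and forecasting skills listed above (not part of the assessed syllabus), a minimal statsmodels sketch on simulated data might look as follows:

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Simulated AR(1) series standing in for real data.
rng = np.random.default_rng(1)
e = rng.normal(size=200)
y = np.zeros(200)
for t in range(1, 200):
    y[t] = 0.7 * y[t - 1] + e[t]

fit = ARIMA(pd.Series(y), order=(1, 0, 0)).fit()   # order = (p, d, q)
print(fit.params)                                  # AR coefficient should be near 0.7
print(fit.forecast(steps=5))                       # 5-step-ahead forecast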
Grade Distribution:
Assignments 20%
Quizzes 10%
Midterm Exam 20%
Final Exam 50%

Grading Policy: There will be relative grading such that the cutoff for A grade will not be less than
75% and cutoff for F grade will not be more than 34.9%. Grade distribution will follow normal bell curve
(usually, A: ≥ µ + 3σ/2, B: µ + σ/2 . . . µ + 3σ/2 C: µ − σ/2 . . . µ + σ/2, D: µ − 3σ/2 . . . µ − σ/2, and F:
< µ − 3σ/2)
Approximate grade assignments:
>= 90.0 A+
75.0 – 89.9 A
60.0 – 74.9 B
50.0 – 59.9 C
about 35.0 – 49.9 D
<= 34.9 F
Course Policies:
• General
1. Computing devices are not to be used during any exams unless instructed to do so.
2. Quizzes and exams are closed books and closed notes.
3. Quizzes are unannounced but they are frequently held after a topic has been covered.
4. No makeup quizzes or exams will be given.
• Grades

Grades in the C range represent performance that meets expectations; Grades in the B range
represent performance that is substantially better than the expectations; Grades in the A
range represent work that is excellent.
• Labs and Assignments

1. Students are expected to work independently. Offering and accepting solutions from others is
an act of dishonesty and students can be penalized according to the Academic Honesty Policy.
Discussion amongst students is encouraged, but when in doubt, direct your questions to the
professor, tutor, or lab assistant. Many students find it helpful to consult their peers while doing
assignments. This practice is legitimate and to be expected. However, it is not acceptable practice
to pool thoughts and produce common answers. To avoid this situation, it is suggested that
students not write anything down during such talks, but keep mental notes for later development
of their own.
2. No late assignments will be accepted under any circumstances.
• Attendance and Absences

1. Attendance is expected and will be taken each class. Students are not supposed to miss class
without prior notice/permission. Any absences may result in point and/or grade deductions.
2. Students are responsible for all missed work, regardless of the reason for absence. It is also the
absentee’s responsibility to get all missing notes or materials.

Course Outline (tentative) and Syllabus:
The weekly coverage might change as it depends on the progress of the class. However, you must keep up
with the reading assignments. Each week assumes 4 hour lectures. Quizzes will be unannounced.

Week Content
• The Nature of Time Series Data
Week 1 • Financial, Economic, Climatic, Biomedical, Sociological Data.
• Reading assignment: Chapter 1, BD
• Time Series Statistical Models
Week 2 • Components of time series: Trend, Seasonality and randomness
• Whiteness Testing
• Quiz 1
• Stationary time series
• Linear process
Week 3 • Strong and weak stationarity
• Causality, invertibility and minimality
• Reading assignment: Chapter 2, BD
• Auto Regressive model
Week 4 • Moving Average model
• Auto Regressive model
• Moving Average models
• Auto-covariance Function
Week 5 • Auto-correlation Function
• Partial Auto-correlation Function
• Reading assignment: Chapter 3, BD
• Estimating Sample mean,
Week 6 • Estimating Auto-correlation function
• Estimating Partial autocorrelation functions
• Quiz 2
• Yule-Walker estimation
Week 7 • Burg's algorithm
• Maximum Likelihood Estimation
• Reading assignment: Chapter 5, BD
• Order Selection
Week 8 • The AIC, BIC and AICC criteria
• Review for Midterm Exam

Week Content
• Forecasting
Week 9 • Minimum MSE Forecast
• Forecast Error
• Forecasting Stationary Time Series
Week 10 • The Durbin-Levinson Algorithm
• The Innovations Algorithm
• Non-stationary time series
Week 11 • Unit root tests
• Reading assignment: Chapter 6, BD
• ARIMA Processes
Week 12 • Forecasting ARIMA Models
• Quiz 3
• Modelling seasonal time series
Week 13 • Seasonal ARIMA Models
• Forecasting SARIMA Processes
• Nonlinear Time Series
Week 14 • Testing for Linearity
• Heteroskedastic Data
• Auto-regressive conditional heteroskedastic model
Week 15 • Generalized auto-regressive conditional heteroskedastic model
• Reading assignment: Chapter 5, SS
• Review for Final Exam

DA101
Computing for Data Science
Time: TBA

Place: MB212 / Vijnana Computing Lab

Instructor: Dhyanagamyananda

dhyangamyananda@gmail.ac.in, swathyprabhu@gmail.com
url: http://cs.rkmvu.ac.in/~swat/
Office: MB205, Medhabhavan, RKMVERI, Belur
Office Hours: 10 am—12 noon, 3 pm—5 pm
(+91) 033-2654 9999

Course Description: DA101 is an introductory course in Data Science giving an
overview of programming and computing techniques. This course is specially designed for
students of Mathematics, Physics, and Statistics.

Prerequisite(s): (1) Basic logic and mathematics.


Note(s): Syllabus changes yearly and may be modified during the term itself, depending on
the circumstances. However, students will be evaluated only on the basis of topics covered
in the course.
Moodle url: http://moodle.rkmvu.ac.in/course/view.php?id=58
Credit Hours: 4

Text(s):

Algorithms in Data Science, First edition


Brian Steele, John Chandler, & Swarna Reddy

How to program in Python


Louden & Louden

How to program in Java


Louden & Louden

Relevant Internet resources

Course Objectives:
Knowledge acquired:

(1) Turing machine model of computing.
(2) Computer programming in python and java.
(3) Algorithm design and analysis
(4) Simulation.
Skills gained: The students will be able to
1. distinguish between computing and non-computing tasks.
2. read and understand a program written in Python, and Java.
3. represent basic data as data structures suited to computing.
4. break down a computing problem into individual steps and code them in python or
java.
5. measure the performance and efficiency of an algorithm in terms of time and space
complexity.
6. understand graph-theoretical concepts applied to algorithms.
7. interact with relational databases using SQL.
8. use simulation techniques in solving computational problems.

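As an example of the simulation skill above (and of week 13's Monte-Carlo topic), a minimal Python sketch estimating π by random sampling:

import random

def estimate_pi(n_samples):
    # Fraction of random points in the unit square falling inside the quarter circle.
    inside = 0
    for _ in range(n_samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples

for n in (1_000, 100_000, 1_000_000):
    print(n, estimate_pi(n))       # the estimate improves roughly like 1/sqrt(n)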
Grade Distribution:
Assignments 20%
Quizzes 10%
Midterm Exam 20%
Final Exam 40%

Grading Policy: There will be relative grading such that the cutoff for A grade will not
be less than 75% and cutoff for F grade will not be more than 34.9%. Grade distribution will
follow normal bell curve (usually, A: ≥ µ+3σ/2, B: µ+σ/2 . . . µ+3σ/2 C: µ−σ/2 . . . µ+σ/2,
D: µ − 3σ/2 . . . µ − σ/2, and F: < µ − 3σ/2)
Approximate grade assignments:
>= 90.0 A+
75.0 – 89.9 A
60.0 – 74.9 B
50.0 – 59.9 C
about 35.0 – 49.9 D
<= 34.9 F
Course Policies:
• General course policies, Grades, Labs and assignments, Attendance and
Absences: these clauses are common to all courses and can be found in the
program schedule.

Course Outline (tentative) and Syllabus:


The weekly coverage might change as it depends on the progress of the class. However, you
must keep up with the reading assignments. Each week assumes 4 hour lectures. Quizzes
will be unannounced.

Week Content
• Definition of computing. Binary representation of numbers: integers,
Week 1 floating point, text.
• Reading assignment:
• Unconventional / application-specific file formats, like media. Bitmap
representation for a monochromatic image and generalizing the representation
Week 2 for RGB. File metadata. Speed of CPU, memory, secondary storage, and
DMA. Hard disk organization into cylinders, tracks, and sectors for
storing data.
• Reading assignment: XBitmap from Wiki.
• Programming assignment 1:
• Quiz 1

Week 3 • Using and understanding the basics of Linux.


• Lab activity.
• Learning programming using Python. arrays([], [][]), conditional struc-
Week 4 tures (if), and iterative structures (while, for), defining functions, using
library functions.
• Programming assignment:
• Dictionary data structure in python, File access in python, Sorting and
Searching algorithms, appreciating complexity of algorithms. Program-
Week 5 ming using numerical methods.
• Programming assignment:
• Quiz 2
• Basics of the Turing machine as a model of computing, analysing the
Week 6 performance of a program, time complexity, space complexity, the difference
between efficiency and performance. Analyse the first sorting algorithm.
• Home assignment:
• Basic notations of complexity like Big-Oh, Omega, etc., and their
mathematical definitions; given a program, compute its complexity
Week 7 measures.
• Reading assignment: Chapter 2.4, BJS
• Home assignment:
• Quiz 3

Week 8 • Discussion on the reading assignment, and implementing in the lab.


• Review for Midterm Exam

Week Content
• Programming in SQL (Structured query language) to query relational
Week 9,10,11 databases.
• Home assignment 4
• Quiz at the end of three weeks.
• Representation of graphs, basic algorithms like minimum spanning tree,
Week 12 matching etc.
• Home assignment 7
• Quiz 5
• Monte-Carlo simulation
Week 13 • Reading assignment:
• Home assignment 8
Week 14,15,16 • Object oriented programming using Java

DA310 Multivariate Statistics
Instructor: Sudipta Das

Course Description: This course, DA310, provides a broad introduction to the most fundamental
methodologies and techniques used in multivariate statistical analysis.

Prerequisite(s): Basic Statistics, Probability and Stochastic Processes


Note(s): Syllabus changes yearly and may be modified during the term itself, depending on the circum-
stances. However, students will be evaluated only on the basis of topics covered in the course.
Credit: 2 (four), approximately 32 credit hours

Text(s):
1. Applied multivariate statistical analysis: Richard A. Johnson and Dean W. Wichern, Prentice Hall
2002.
Evaluation: Theory 60% + Practical/lab 40%
Course Objectives:
Knowledge gained: At the end of the course the student will know
• Different matrix operations and SVD
• Multivariate normal distribution and its properties
• Multivariate hypothesis testing
• Multivariate analysis of variance and covariance
• Regression analysis
• principal component analysis
• Discriminant analysis
• Factor analysis
Skills acquired : The student will be able to
• Carry out exploratory multivariate data analysis in R and Excel
• To plot multivariate data and compute descriptive statistics
• Test data for multivariate normality both graphically and computationally in R
• Perform statistical inference on multivariate means including hypothesis testing, confidence
ellipsoid calculation and different types of confidence interval estimation
• Build multivariate regression model in R
• Extract the features of the data by principal component analysis in R
• Express the data as functions of a number of important causes by the method of factor analysis
in R
• To assign objects (or data points) to one group among a number of groups by the method of
discriminant analysis in R
Competence developed: The course covers theoretical, computational, and interpretive issues of
multivariate data analysis using R and Excel. Overall, given real data from varied disciplines, students will
be able to apply their mathematical knowledge, methodologies and computational tools to characterize
and analyse it. As a result, important features of the data can be extracted and some statistical
conclusions can be drawn.

Course Outline (tentative) and Syllabus:
1. Representation of multivariate data, bivariate and multivariate distributions, multinomial distribution,
multivariate normal distribution, sample mean and sample dispersion matrix, concepts of location
depth in multivariate data.(20hrs)
2. Principal component analysis (10hrs)
3. Classification (10hrs)

4. Factor Analysis (10hrs)


5. Clustering (10hrs)

DA320 Operations Research
Instructor: Sudeep Mallick

Course Description: DA320 deals with the topics of problem formulation, modelling and basic
solution techniques in operations research. It is intended as a first course in this area, enabling students
to take up advanced study in operations research and in analytics based on operations research.

Prerequisite(s): Basic course in Linear Algebra.


Credit Hours: 4

Text(s):
1. Operations Research, seventh revised edition (2014), P K Gupta and D S Hira, ISBN: 81-219-0218-9
2. Introduction to Operations Research, eighth edition, Frederick S. Hillier & Gerald J. Lieberman, ISBN:
0-07-252744-7

3. Operations Research: An Introduction, ninth edition, Hamdy A. Taha, ISBN: 978-93-325-1822-3


4. AMPL: A Modeling Language for Mathematical Programming, second Edition, www.ampl.com

Course Objectives:
Knowledge gained: At the end of the course the student will know
1) Problem formulation in operations research for problems in various application domains such as
operations management, marketing, production, finance and others.
2) Modelling techniques such as linear programming and translation of any given problem description
to a linear programming mathematical model.
3) Solution techniques such as simplex method and its variations and special cases.
4) Effect of changes of parameters on a model using basic algebraic sensitivity analysis techniques.
5) Use of software tools to solve simple models
Skills acquired: The students will be able to
1) develop a mathematical model, clearly state model building assumptions starting from a problem
description.
2) apply the appropriate operations research technique to formulate optimization models.
3) implement and evaluate alternative models of optimization problems using CPLEX software in
AMPL modelling language as well as MS-EXCEL.
Competence developed: The student develops the
1. Ability to translate a given problem description into a mathematical model for optimization.
2. Ability to identify and elicit information about the essential parameters of any given optimization
problem.
3. Ability to identify and use appropriate optimization modelling tools (software) for a given problem
size and description.
Evaluation: Midterm Lab Exam 20% Term Project 40% Endterm Theory Exam 40%

Course Outline (tentative) and Syllabus:

• Problem formulation for linear programming problems I
Week 1
• Reading assignment: Chapter 1, HT

• Problem formulation for linear programming problems II


Week 2
• Reading assignment: Chapter 2, HT
• Problem formulation for linear programming problems III
Week 3
• Reading assignment: Chapter 2, HT
• Problem formulation for linear programming problems IV
Week 4
• Reading assignment: Chapter 1-3, HL
• Problem formulation for linear programming problems V
Week 5
• Reading assignment: Chapter 1-3, HL
• Solving linear programming problem graphical approach
Week 6 • Reading assignment: Chapter 3, HT
• Internal test 1
• Solving linear programming problem algebraic approach
Week 7
• Reading assignment: Chapter 3, HT / Chapter 4, HL
• Solving linear programming problem simplex method
Week 8
• Reading assignment: Chapter 3, HT

• Solving linear programming problem simplex method variations: Big-M method and
Week 9 artificial variables
• Reading assignment: Chapter 3, HT / Chapter 4, HL
• Solving linear programming problem simplex method special cases: degeneracy,
Week 10 alternative optima, unbounded solution and infeasible solution
• Reading assignment: Chapter 3, HT / Chapter 4, HL
Week 11 • Lab Session: Solving LP problems using AMPL / CPLEX I

Week 12 • Lab Session: Solving LP problems using AMPL / CPLEX - II


• Internal test 2
• Sensitivity analysis graphical approach
Week 13
• Reading assignment: Chapter 3, HT / Chapter 4, HL
• Sensitivity analysis algebraic approach
Week 14
• Reading assignment: Chapter 3, HT / Chapter 4, HL

Week 15 • Lab Session: Sensitivity analysis of LP problems using AMPL / CPLEX


• Course review

DA240 Introduction to Econometrics
Instructor:

Course Description: This course provides a broad introduction to the most fundamental
methodologies and techniques used in Econometrics. Students will learn the details of regression analysis
and its applications in real-life scenarios.

Prerequisite(s): None
Credit: 2 (four), approximately 32 credit hours

Text(s):
1. Introduction to Econometrics by G. S. MADDALA.
Knowledge: The students get to know
• Assumptions of Linear Regression and why they are required.
• The “BLUE” properties of Least Square Estimators.
• Relation between R² and r², where r is the correlation coefficient between x and y.
• Pairwise correlation tells little about multicollinearity except when the correlation is very close to 1. Even
with a small correlation coefficient (like 0.2), multicollinearity may occur.
• Test of multicollinearity: the VIF test and its threshold value.
• Dropping a variable from the model due to multicollinearity is not always the right remedy.
• Distribution of β (the LS estimator) applying the Law of Large Numbers.
• Detection of heteroscedasticity using different statistical hypothesis tests like the Goldfeld–Quandt
test and the Glejser test.
• Impact of heteroscedasticity on β.
• Generalized Least Square Estimation of β.
• Linear Regression when x is stochastic.
• Definition of Exogeneity and Endogeneity.
• Problem of Endogeneity.
• Hypothesis testing (Hausman test) to detect Endogeneity
• Handling of Endogeneity by the IV (Instrumental Variable) estimator.
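A minimal sketch of the VIF check mentioned above, using statsmodels on simulated regressors (variable names and data are illustrative only, not course material):

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + 0.1 * rng.normal(size=500)      # nearly collinear with x1
x3 = rng.normal(size=500)
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# Rule of thumb: VIF values above 5 (or 10) signal problematic multicollinearity.
for i, col in enumerate(X.columns):
    print(col, round(variance_inflation_factor(X.values, i), 2))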
Evaluation: Theory 60% + Practical/lab 40%

Course Outline (tentative) and Syllabus:


1. Brief discussion about regression analysis.
2. Least Square Estimators
3. Multicollinearity
4. Heteroscedasticity
5. Generalized Least Square Estimation.
6. Exogeneity and Endogeneity.
7. IV estimator (Instrumental Variable)

DA241 Introduction to Finance
Instructor:

Course Description: DA241 covers theoretical, computational, and interpretive issues of Finance
using R, Python and Excel.

Prerequisite(s): Basic Statistics, probability and stochastic processes.


Credit: 2 (four), approximately 32 credit hours

Text(s):
1. John C.Hull- Options, Futures and Other Derivatives
2. Sheldon M. Ross- An elementary introduction to mathematical finance
3. Chi-fu Huang, Robert H. Litzenberger- Foundations for financial economics
4. Gopinath Kallianpur, Rajeeva L. Karandikar- Introduction to option pricing theory
Knowledge gained: The students get to know
• Overview of portfolio, asset, stock
• Optimal portfolio selection
• Portfolio frontier
• Minimum variance portfolio, zero co-variance portfolio and Risk Neutral portfolio
• Overview of Option Pricing, call and put option, Payoff, arbitrage and derivative
• Overview of Hedging parameter
• Trading strategy and self financing
• Binomial model for option pricing and complete market
• American and European option pricing
• Distribution of stock prices by Cox-Ross-Rubinstein formula
• Derivation and application of the Black–Scholes formula
Skills acquired: The student will be able to
• Optimize a portfolio on collected historical Sensex data of different companies to obtain maximum
return with minimum risk.
• Analyze the pattern of returns of different companies from historical Sensex data.
• Predict the returns of different companies over a certain period and check the prediction
accuracy against the actual data.
• Apply the Binomial Model to real-life put-call parity problems and also understand the model's working
procedure on simulated data.
• Apply the Black–Scholes formula in real-life scenarios and also on simulated data.
Course Syllabus:
1. Concept of portfolio, portfolio optimization, different kinds of portfolios
2. Concept of options, assets, stocks, derivatives, put and call options (American and European)
3. Arbitrage and hedging, and their uses in market scenarios
4. Binomial model, Cox-Ross-Rubinstein formula, Black–Scholes formula and their derivations

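The Black–Scholes price of a European call, listed in item 4 above, can be coded directly; a minimal Python sketch with illustrative parameter values (not taken from course material):

from math import log, sqrt, exp
from statistics import NormalDist

def black_scholes_call(S, K, T, r, sigma):
    # S: spot price, K: strike, T: time to maturity in years,
    # r: risk-free rate, sigma: volatility.
    N = NormalDist().cdf
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * N(d1) - K * exp(-r * T) * N(d2)

print(round(black_scholes_call(S=100, K=105, T=0.5, r=0.05, sigma=0.2), 4))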