M.SC - Data Science AY 2019 2020
Curriculum
(2019-2020 onwards)
VISION STATEMENT OF VELLORE INSTITUTE OF TECHNOLOGY
Our mission is to educate students from all over India, including those from the
local and rural areas, and from other countries, so they become enlightened
individuals, improving the living standards of their families, industry and
society. We will provide individual attention, world-class quality education and
take care of character building.
CREDIT STRUCTURE
Category Credits
University core (UC) 29
Programme core (PC) 23
Programme elective (PE) 22
University elective (UE) 06
Total credits 80
S. No. Course Code Course Title L T P J C
10 MATXXXX Bio-Statistics 2 0 2 0 3
.Wiley (2019).
Reference Books
1. Doing Data Science, Straight Talk From The Frontline, Cathy O'Neil and Rachel Schutt, O'Reilly (2014).
2. Data Mining: Concepts and Techniques, Third Edition, Jiawei Han, Micheline Kamber and Jian Pei, ISBN 0123814790, (2011).
3. Big Data and Business Analytics, Jay Liebowitz, CRC Press (2013).
4. Data mining methods, 2nd edition, C. Rajan, Narosa (2016).
Mode of Evaluation: CAT / Assignment / Quiz / FAT / Project / Seminar
Recommended by Board of Studies 24-06-2020
Approved by Academic Council No. 59 Date 24-09-2020
Text Book(s)
1. Echo-1, Méthode de français, J. Girardet, J. Pécheur, CLE International, Paris, 2010.
2. Echo-1, Cahier d'exercices, J. Girardet, J. Pécheur, CLE International, Paris, 2010.
Reference Books
1. CONNEXIONS 1, Méthode de français, Régine Mérieux, Yves Loiseau, Les Éditions Didier, 2004.
2. CONNEXIONS 1, Le cahier d'exercices, Régine Mérieux, Yves Loiseau, Les Éditions Didier, 2004.
3. ALTER EGO 1, Méthode de français, Annie Berthet, Catherine Hugo, Véronique M. Kizirian, Béatrix Sampsonis, Monique Waendendries, Hachette Livre, 2006.
Module:2 3 hours
Conjugation of verbs (regular / irregular), the months, the days of the week, hobbies, professions,
seasons, articles, numbers (one hundred to one million), yes/no questions, imperative with Sie
Learning objective:
Writing sentences, talking about hobbies, talking about professions, etc.
Module:3 4 hours
Possessive pronouns, negation, cases: accusative and dative (definite and indefinite article), separable
verbs, modal verbs, adjectives, telling the time, prepositions, meals, food, drinks
Learning objective:
Sentences with modal verbs, use of articles, talking about countries and languages, describing an
apartment.
Module:4 6 hours
Translations: (German – English / English – German)
Learning objective:
Grammar – vocabulary – practice
Module:6 3 hours
Essays:
My university, food, my friend, my family, a festival in Germany,
etc.
Module:7 4 hours
Dialogues:
e) Conversations with family members, at the railway station,
f) Conversations while shopping; in a supermarket; in a bookshop;
g) in a hotel - at the reception; an appointment at the doctor's.
Meeting in a café
Module:8 2 hours
Guest Lectures / Native Speakers / Nuances of the German language, basic information about the
German-speaking countries
Total Lecture hours: 30 hours
Text Book(s)
1. Studio d A1 Deutsch als Fremdsprache, Hermann Funk, Christina Kuhn, Silke Demme, 2012.
Reference Books
1. Netzwerk Deutsch als Fremdsprache A1, Stefanie Dengler, Paul Rusch, Helen Schmitz, Tanja Sieber, 2013.
2. Lagune, Hartmut Aufderstrasse, Jutta Müller, Thomas Storz, 2012.
3. Deutsche Sprachlehre für Ausländer, Heinz Griesbach, Dora Schulz, 2011.
4. Themen Aktuell 1, Hartmut Aufderstrasse, Heiko Bock, Mechthild Gerdes, Jutta Müller and Helmut Müller, 2010.
www.goethe.de
wirtschaftsdeutsch.de
hueber.de, klett-sprachen.de
www.deutschtraning.org
Mode of Evaluation: CAT / Assignment / Quiz / FAT
Recommended by Board of Studies 04-03-2016
Approved by Academic Council No. 41 Date 17-06-2016
7. Having Computational thinking (Ability to translate vast data into abstract concepts and to understand
database reasoning)
Value, Manners, Customs, Language, Tradition, Building a blog, Developing brand message, FAQs, Assessing
Competition, Open and objective Communication, Two-way dialogue, Understanding the audience,
Identifying, Gathering Information, Analysis, Determining, Selecting plan, Progress check, Types of planning,
Write a short, catchy headline, Get to the point: summarize your subject in the first paragraph, Body:
make it relevant to your audience
Number of factors, Factorials, Remainder Theorem, Unit digit position, Tens digit position, Averages,
Weighted Average, Arithmetic Progression, Geometric Progression, Harmonic Progression, Increase &
Decrease or successive increase, Types of ratios and proportions
Data Arrangement(Linear and circular & Cross Variable Relationship), Blood Relations,
Ordering/ranking/grouping, Puzzle test, Selection Decision table
Synonyms & Antonyms, One-word substitutes, Word Pairs, Spellings, Idioms, Sentence completion,
Analogies
Reference Books
1. Kerry Patterson, Joseph Grenny, Ron McMillan, Al Switzler (2001) Crucial Conversations: Tools for
Talking When Stakes are High. Bangalore. McGraw-Hill Contemporary
2. Dale Carnegie (1936) How to Win Friends and Influence People. New York. Gallery Books
3. M. Scott Peck (1978) Road Less Travelled. New York City. M. Scott Peck.
4. FACE (2016) Aptipedia Aptitude Encyclopedia. Delhi. Wiley publications
5. ETHNUS (2013) Aptimithra. Bangalore. McGraw-Hill Education Pvt. Ltd.
Websites:
1. www.chalkstreet.com
2. www.skillsyouneed.com
3. www.mindtools.com
4. www.thebalance.com
5. www.eguru.ooo
Mode of Evaluation: FAT, Assignments, Projects, Case studies, Roleplays,
3 Assessments with Term End FAT (Computer Based Test)
Structured and unstructured interview orientation, Closed questions and hypothetical questions,
Interviewers’ perspective, Questions to ask/not ask during an interview, Video interview, Recorded
feedback, Phone interview preparation, Tips to customize preparation for personal interview, Practice
rounds
Structure of a standard resume, Content, color, font, Introduction to Power verbs and Write up, Quiz on
types of resume, Frequent mistakes in customizing resume, Layout - Understanding different company's
requirement, Digitizing career portfolio
Syllogisms, Binary logic, Sequential output tracing, Crypto arithmetic, Data Sufficiency, Data
interpretation-Advanced, Interpretation tables, pie charts & bar charts
Reading comprehension, Para Jumbles, Critical Reasoning (a) Premise and Conclusion, (b) Assumption &
Inference, (c) Strengthening & Weakening an Argument
Total Lecture hours: 45 hours
Reference Books
1. Michael Farr and JIST Editors (2011) Quick Resume & Cover Letter Book: Write and Use an
Effective Resume in Just One Day. Saint Paul, Minnesota. Jist Works
2. Daniel Flage Ph.D (2003) The Art of Questioning: An Introduction to Critical Thinking. London.
Pearson
3. David Allen (2002) Getting Things Done: The Art of Stress-Free Productivity. New York City.
Penguin Books.
4. FACE (2016) Aptipedia Aptitude Encyclopedia. Delhi. Wiley publications
5. ETHNUS (2013) Aptimithra. Bangalore. McGraw-Hill Education Pvt. Ltd.
Websites:
1. www.chalkstreet.com
2. www.skillsyouneed.com
3. www.mindtools.com
4. www.thebalance.com
5. www.eguru.ooo
Mode of Evaluation: FAT, Assignments, Projects, Case studies, Role plays,
3 Assessments with Term End FAT (Computer Based Test)
Modalities / Requirements
1. Individual or group projects can be taken up
2. Involve in literature survey in the chosen field
3. Use Science/Engineering principles to solve identified issues
4. Adopt relevant and well-defined / innovative methodologies to fulfill the specified objective
5. Submission of scientific report in a specified format (after plagiarism check)
Text Book(s)
1. Catherine Dawson, Introduction to research methods: a practical guide for anyone undertaking
a research project, Oxford: How To Books, Reprint 2010.
2. Julius S. Bendat, Allan G. Piersol, Random Data: Analysis and Measurement Procedures,
4th Edition, ISBN: 978-1-118-21082-6, 640 pages, September 2011.
3. Research in Medical and Biological Sciences, 1st Edition, From Planning and Preparation to
Grant Application and Publication, Editors: Petter Laake, Haakon Benestad, Bjorn Olsen,
ISBN: 9780128001547, Academic Press, March 2015.
Reference Books
1. John Creswell, Research Design: Qualitative, Quantitative, and Mixed Methods
Approaches, Fourth Edition (March 14, 2013).
Mode of Evaluation: CAT / Assignment / Quiz / FAT / Project / Seminar
1. Can be a theoretical analysis, modelling & simulation, experimentation & analysis, prototype
design, correlation and analysis of data, software development, applied research and any
other related activities.
2. The project can be for one or two semesters based on the completion of the required
number of credits as per the academic regulations.
3. Should be individual work.
4. Carried out inside or outside the university, in any relevant industry or research institution.
5. Publications in the peer-reviewed journals / International Conferences will be an added
advantage
Mode of Evaluation: Periodic reviews, Presentation, Final oral viva, Poster submission
Text Book(s)
1. Sheldon Ross; A First Course in Probability, Pearson, 2014.
2. Parimal Mukhopadhyay; An Introduction to the Theory of Probability, World Scientific, 2012.
3. Irwin Miller, Marylees Miller; John E. Freund’s Mathematical Statistics, Pearson, 2017.
Reference Book(s)
1. Fetsje Bijma, Marianne Jonker and Aad van der Vaart; Introduction to Mathematical Statistics,
Amsterdam University Press, 2018.
2. Krishnamoorthy, K., Handbook of Statistical Distributions with Applications, Chapman &
Hall/CRC, 2006.
3. Rohatgi, V.K. and Ehsanes Saleh, A.K. Md., An Introduction to Probability and Statistics, 2nd Ed.,
John Wiley & Sons, 2002.
4. Shanmugam, R., Chattamvelli, R., Statistics for Scientists and Engineers, John Wiley, 2015.
[7] Having computational thinking (Ability to translate vast data into abstract concepts and to understand database reasoning)
[9] Having problem-solving ability- solving social issues and engineering problems
Text Book(s)
1. Manoj Kumar Srivastava and Namita Srivastava, Statistical Inference – Testing of
Hypotheses, Prentice Hall of India, 2014.
2. Robert V. Hogg, Elliot A. Tanis and Dale L. Zimmerman, Probability and Statistical
Inference, 9th edition, Pearson, 2013.
Reference Book(s)
1. Marc S. Paolella, Fundamental Statistical Inference: A Computational Approach, Wiley, 2018.
2. B. K. Kale and K. Muralidharan, Parametric Inference, Narosa Publishing House, 2016.
3. Miller, I. and Miller, M., John E. Freund's Mathematical Statistics with Applications, Pearson
Education, 2002.
4. Rao, C.R., Linear Statistical Inference and its Applications, 2nd Edition, Wiley Eastern, 1973.
5. Gibbons, J.D., Non-Parametric Statistical Inference, 2/e, Marcel Dekker, 1985.
6. Bansilal, Sanjay Arora and Sudha Arora, Introducing Probability and Statistics, 2/e, Satya
Prakash Publications, 2006.
7. George Casella and Roger L. Berger, Statistical Inference, 2nd edition, Casebound Engelska,
2002.
Text Book(s)
1. Douglas C. Montgomery, Cheryl L. Jennings, Murat Kulahci, Introduction to Time Series
Analysis and Forecasting, Second Ed., Wiley, 2016.
2. George E. P. Box, Gwilym M. Jenkins, Gregory C. Reinsel, Greta M. Ljung, Time Series
Analysis: Forecasting and Control, Fifth Ed., Wiley, 2016.
Reference Books
1. Brockwell, P. J., & Davis, R. A., Introduction to Time Series and Forecasting, Third Ed.,
Springer, 2016.
2. Terence C. Mills, Applied Time Series Analysis: A Practical Guide to Modeling and
Forecasting, Academic Press, 2019.
Text Book(s)
1. Douglas C. Montgomery, Elizabeth A. Peck, G. Geoffrey Vining, Introduction to
Linear Regression Analysis, Third Ed., Wiley India Pvt. Ltd., 2016.
2. Norman R. Draper, Harry Smith; Applied Regression Analysis, Wiley India Pvt.
Ltd., New Delhi; Third Edition, 2015.
Reference Books
1. Johnson, R. A., Wichern, D. W., Applied Multivariate Statistical Analysis, Sixth Ed., PHI
Learning Pvt. Ltd., 2013.
2. Iain Pardoe, Applied Regression Modeling, John Wiley and Sons, Inc., 2012.
Mode of Evaluation: CAT / Digital Assignment / Quiz / FAT
List of Challenging Experiments
1. Correlation analysis using a scatter diagram and Karl Pearson’s correlation coefficient, and drawing inferences. 2 hours
2. Simple linear regression: model fitting, estimation of parameters, computing R² and adjusted R², and model interpretation. 4 hours
3. Residual analysis and forecast accuracy for a given data set. 2 hours
4. Validating simple linear regression using the t-test, F-test and p-values. 4 hours
5. Developing confidence intervals and testing the model for simple and multiple regression. 4 hours
6. Multiple regression: estimation of parameters, fitting of the model, error analysis, model validation, variable selection and testing. 4 hours
7. Problem of multicollinearity and determination of VIF. 2 hours
8. Diagnostic measures and outlier detection, Durbin-Watson test, variable selection and model building. 4 hours
9. Autocorrelation, autoregressive model. 2 hours
10. Fitting of a nonlinear regression model. 2 hours
Total Laboratory Hours: 30 hours
Mode of assessment: Continuous Assessment and FAT
Recommended by Board of Studies 10-09-2019
Approved by Academic Council No. 56 Date 24-09-2019
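As an illustration of several of the regression experiments listed above (Pearson correlation, simple linear regression with R² and adjusted R², the Durbin-Watson statistic and VIF), the following is a minimal sketch in plain Python. It is not part of the prescribed syllabus material: the function names and the sample data are assumptions chosen for illustration, and a lab session would more likely use a statistics package.

```python
import math


def fit_simple_linear_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x, with basic diagnostics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)

    r2 = 1 - sse / syy
    p = 1  # number of predictors in simple regression
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)

    # Durbin-Watson: sum of squared successive residual differences over SSE.
    # Values near 2 suggest no first-order autocorrelation.
    dw = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n)) / sse

    return {
        "slope": slope,
        "intercept": intercept,
        "pearson_r": sxy / math.sqrt(sxx * syy),
        "r2": r2,
        "adj_r2": adj_r2,
        "durbin_watson": dw,
    }


def vif(r2_j):
    """Variance inflation factor of predictor j, given the R² obtained by
    regressing predictor j on the remaining predictors."""
    return 1.0 / (1.0 - r2_j)


# Hypothetical sample data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
result = fit_simple_linear_regression(x, y)
print(result)
print("VIF for R_j^2 = 0.75:", vif(0.75))
```

A VIF well above 1 indicates that a predictor is largely explained by the other predictors; which cut-off to treat as problematic is a judgment call left to the course.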
Reference Books
Joseph F. Hair, Jr., William C. Black, Barry J. Babin, Rolph E. Anderson and Ronald L.
Tatham, Multivariate Data Analysis, 7th Edition, Pearson Education India, 2014.
Rao, C. R. and Rao, M. M., Multivariate Statistics and Probability, Elsevier & Academic
Press, 2014.
Kshirsagar, A. M., Multivariate Analysis, Marcel Dekker, 2006.
Anderson T.W., An Introduction to Multivariate Statistical Analysis, John Wiley & sons,
3rd Edition, 2009.
Bhuyan, K. C., Multivariate Analysis and its Applications, New Central book Agency Pvt.
Ltd., 2005.
Weisberg S., Applied Linear Regression, 4th Edition, Wiley, 2013.
Kollo T., and Rosen D. Von, Advanced Multivariate Statistical Analysis with Matrices,
Springer, New York, 2005.
Reference Book(s)
Robert Johansson, Numerical Python – Scientific Computing and Data Science Applications with
NumPy, SciPy and Matplotlib, Apress, 2019
Robert Sedgewick, Kevin Wayne, Robert Dondero, Introduction to Programming in Python: An
Inter-disciplinary Approach, Pearson India Education Services Pvt. Ltd., 2016
Nelli, F., Python Data Analytics: with Pandas, NumPy and Matplotlib, Apress, 2018.
K. V. S. Sarma, Statistics Made Simple Do It Yourself, 2nd Ed, Prentice-Hall, 2010.
Reference Book(s)
1. Murtaza Haider, Getting Started with Data Science: Making Sense of Data with Analytics,
IBM Press, 2015.
2. J.P. Verma, Data Analysis in Management with SPSS Software, Springer, 2013.
Mode of Evaluation: Continuous Assessment and FAT.
Recommended by Board of Studies 10.09.2019
Approved by Academic Council No. 56 Date 24-09-2019
Text Book(s)
1. E. Alpaydin, Introduction to Machine Learning, 3rd Edition, MIT Press, 2015.
2. Pratap Dangeti, Statistics for Machine Learning, Packt Publishing, 2017.
Reference Book(s)
1. C.M. Bishop, Pattern Recognition and Machine Learning, Springer, 2016.
2. K. P. Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012.
Mode of Evaluation: CAT, Quiz, Digital Assignment and FAT
List of Challenging Experiments (Indicative)
1 Exploring and Understanding data and formats 2 hours
2 Classification techniques using Decision Trees 4 hours
3 Support Vector Machines 4 hours
4 Clustering Algorithms 4 hours
5 Computation of missing values and multivariate classification 4 hours
6 Dimensionality reduction: A factor analysis. 4 hours
7 Discriminant analysis 4 hours
8 Canonical Correlation analysis 4 hours
Total Laboratory hours: 30 hours
Mode of evaluation: Continuous Assessment and FAT.
Recommended by Board of Studies 10.09.2019
Approved by Academic Council No. 56 Date 24-09-2019
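For the decision-tree experiment in the list above, the core step is choosing a split that makes the resulting groups as pure as possible. The sketch below shows one common way to do this, using Gini impurity on a single numeric feature. It is only an illustration under assumed names and toy data; the syllabus does not prescribe this criterion or any particular tool.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Find the threshold t for the rule 'value <= t' that minimises the
    size-weighted Gini impurity of the two groups.

    Returns (threshold, weighted_gini); threshold is None when no candidate
    split improves on the impurity of the unsplit data.
    """
    n = len(values)
    pairs = list(zip(values, labels))
    best = (None, gini(labels))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [lab for val, lab in pairs if val <= threshold]
        right = [lab for val, lab in pairs if val > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical toy data: one feature, two classes.
feature = [1, 2, 3, 10, 11, 12]
classes = ["a", "a", "a", "b", "b", "b"]
print(best_split(feature, classes))
```

A full decision tree applies this search recursively to each resulting group and across all features; entropy (information gain) is a frequently used alternative to Gini impurity.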
Mode of Evaluation: CAT, Quiz, Digital Assignment and FAT.
List of Challenging Experiments (Indicative)
1 Study of facts, objects, predicates and variables in PROLOG 2 hours
2 Study of Rules and Unification in PROLOG 2 hours
3 Study of “cut” and “fail” predicate in PROLOG 2 hours
4 Study of arithmetic operators, simple input/output and compound goals in 4 hours
PROLOG
5 Study of recursion in PROLOG 4 hours
6 Study of Lists in PROLOG 2 hours
7 Study of dynamic database in PROLOG 2 hours
8 Study of string operations in PROLOG (Implement string operations like 4 hours
substring, string position, palindrome etc.)
9 Write a prolog program to maintain family tree 4 hours
10 Write a prolog program to implement all set operations (Union, intersection, 4 hours
complement etc.)
Total Laboratory hours 30 hours
Mode of Evaluation: Continuous assessment and FAT.
Recommended by Board of Studies 24.06.2020
Approved by Academic Council No. 59 Date 24-09-2020
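The experiments above are to be carried out in PROLOG. Purely as a reading aid, the sketch below mirrors two of them (the family tree and the set operations) in Python, to show the underlying ideas of facts, recursive rules and set algebra. The names, the sample facts and the choice of Python are all assumptions made for illustration; this is not a substitute for the PROLOG programs the lab asks for.

```python
# Facts: each tuple plays the role of a PROLOG fact parent(Parent, Child).
# The names are hypothetical.
PARENT = {
    ("asha", "bala"),
    ("asha", "chitra"),
    ("bala", "deepa"),
}


def children(person):
    """All X such that parent(person, X) holds."""
    return {child for parent, child in PARENT if parent == person}


def is_ancestor(a, d):
    """Recursive rule: a is an ancestor of d if a is a parent of d, or a is
    a parent of some x who is an ancestor of d.
    Assumes the parent relation has no cycles."""
    return any(child == d or is_ancestor(child, d) for child in children(a))


def siblings(person):
    """People who share at least one parent with `person`."""
    parents = {parent for parent, child in PARENT if child == person}
    return {child for parent, child in PARENT
            if parent in parents and child != person}


def set_operations(a, b, universe):
    """Union, intersection, and complement of `a` relative to `universe`."""
    return {
        "union": a | b,
        "intersection": a & b,
        "complement_of_a": universe - a,
    }


print(is_ancestor("asha", "deepa"))
print(siblings("bala"))
print(set_operations({1, 2, 3}, {3, 4}, set(range(1, 7))))
```

In PROLOG the same relationships would be written as facts and rules and queried through unification and backtracking, rather than computed by explicit loops.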
Text Book(s)
1. Eugene L. Grant, Richard S. Leavenworth, Statistical Quality Control, 7th edition, McGraw Hill
Education, India, 2017.
2. Douglas C. Montgomery, Introduction to Statistical Quality Control, Seventh Edition, John
Wiley and Sons, New York, 2013.
Reference Books
1. Edward G. Schilling, Dean V. Neubauer, Acceptance Sampling in Quality Control, Second
Edition, Taylor & Francis, 2009.
2. Poornima M. Charantimath, Total Quality Management, 3/e, Pearson India Limited, 2017.
Mode of Evaluation: Continuous assessment, Quiz, Digital Assignment and FAT.
List of Challenging Experiments (Indicative)
1 Mean and Range charts: Experimental control charts for process control. 4 hours
2 Control chart for nonconformities. 4 hours
3 A control chart for nonconformities per unit with variable subgroup size. 4 hours
4 C chart used to control errors on forms. 2 hours
5 Acceptance decisions based on plotted frequency distributions. 4 hours
6 AOQL inspection to produce quality improvement. 4 hours
7 Construction of rectifying inspection using AOQL normal inspection plans 4 hours
8 Acceptance sampling under standard sampling plans. 4 hours
Total Laboratory hours 30 hours
Mode of Evaluation: Continuous assessment and FAT
Recommended by Board of Studies 24.06.2020
Approved by Academic Council No. 59 Date 24-09-2020
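The control-chart experiments listed above (mean and range charts, the c chart for nonconformities, and the chart for nonconformities per unit with variable subgroup size) all come down to computing a centre line and 3-sigma limits. The sketch below shows those calculations in plain Python. The function names and sample data are assumptions for illustration; the constants A2, D3 and D4 are the standard tabulated values for subgroups of size 5, so the mean and range functions apply only to that subgroup size.

```python
import math

# Standard Shewhart chart constants for subgroup size n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """(LCL, centre line, UCL) for the mean chart and the range chart.
    Every subgroup must contain exactly 5 observations."""
    if any(len(group) != 5 for group in subgroups):
        raise ValueError("constants A2, D3, D4 are for subgroups of size 5")
    means = [sum(group) / len(group) for group in subgroups]
    ranges = [max(group) - min(group) for group in subgroups]
    grand_mean = sum(means) / len(means)
    r_bar = sum(ranges) / len(ranges)
    return {
        "xbar": (grand_mean - A2 * r_bar, grand_mean, grand_mean + A2 * r_bar),
        "range": (D3 * r_bar, r_bar, D4 * r_bar),
    }


def c_chart_limits(counts):
    """(LCL, centre line, UCL) for counts of nonconformities per inspected
    unit of constant size. The lower limit is floored at zero."""
    c_bar = sum(counts) / len(counts)
    half_width = 3 * math.sqrt(c_bar)
    return (max(0.0, c_bar - half_width), c_bar, c_bar + half_width)


def u_chart_limits(defects, sizes):
    """Per-subgroup (LCL, centre line, UCL) for nonconformities per unit
    when the subgroup size varies."""
    u_bar = sum(defects) / sum(sizes)
    limits = []
    for n in sizes:
        half_width = 3 * math.sqrt(u_bar / n)
        limits.append((max(0.0, u_bar - half_width), u_bar, u_bar + half_width))
    return limits


# Hypothetical measurements, for illustration only.
samples = [[10, 12, 11, 9, 13], [11, 10, 12, 10, 12]]
print(xbar_r_limits(samples))
print(c_chart_limits([4, 4, 4, 4]))
print(u_chart_limits([4, 8], [2, 4]))
```

A point falling outside its limits signals that the process may be out of statistical control and should be investigated.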
and 3³ factorial designs, Analysis of confounded factorial designs; Fractional Replication.
Module:6 Balanced Incomplete Block design 6 hours
Balanced Incomplete Block Design (BIBD)– Types of BIBD – Simple construction methods –
Concept of connectedness and balancing – Intra Block analysis of BIBD.
Module:7 Partially Balanced Incomplete Block design 6 hours
Partially Balanced Incomplete Block Design with two associate classes – intra block analysis -