BD Course Handout (Spring 2024)


KALINGA INSTITUTE OF INDUSTRIAL TECHNOLOGY

Deemed to be University
BHUBANESWAR-751024

School of Computer Engineering


Spring Semester 2024

Course Handout

1. Course code : CS 3032


2. Course Title : Big Data
3. LTP Structure :
L T P Total Credit
3 0 0 3 3
4. Course Faculty : Sourajit Behera
5. Course offered to the School : Computer Engineering
6. Course Objective:
• To understand the concept and principles of big data.
• To explore the big data stacks and the technologies associated with it.
• To evaluate the different NoSQL databases and frameworks required to handle the big data.
• To formulate the concepts, principles and techniques focusing on the applications to industry
and real world experience.
• To contextually integrate and correlate large amounts of information to gain faster insights for
real time scenarios.
7. Course Outcome:
CO # Detail
CO1 Understand the concept of big data and its analytics in the real world
CO2 Analyse various big data technology foundations
CO3 Apply filtering technique to stream data
CO4 Apply Hadoop ecosystem paradigm using MapReduce, YARN, Pig, Hive, Sqoop,
HBase to solve data intensive problems
CO5 Analyse big data frameworks like Hadoop and NoSQL to efficiently store and process
big data to generate analytics
CO6 Present appropriate solutions to big data analytics frameworks and visualization.
8. Course Contents
The course focuses on basic and essential topics in Big Data.
Unit # | Unit | Detailed Area
1 | Overview of Big Data | Importance of Data, Characteristics of Data, Analysis of unstructured data, Introduction to Big Data, Challenges of conventional systems, Data analytics, Evolution of analytic scalability, Big Data Analytics, Key Big Data terminologies, Big Data analytics lifecycle, Cloud Computing and Big Data.
2 | Big Data Technology Foundations | Exploring the Big Data Stack, Data Sources Layer, Ingestion Layer, Storage Layer, Physical Infrastructure Layer, Platform Management Layer, Security Layer, Monitoring Layer, Analytics Engine, Visualization Layer, Big Data Applications, Virtualization.
3 | Streaming | Introduction to Streams Concepts, Stream data model and architecture, Stream Computing, Sampling data in a stream, Filtering streams, Counting distinct elements in a stream.
4 | Hadoop Ecosystem | Introduction to Hadoop, Hadoop Ecosystem, Hadoop Distributed File System, MapReduce, YARN, Pig and Pig Latin, Hive, Sqoop, HBase.
5 | Storing Data in Big Data context | Data Models, RDBMS and Hadoop, Non-Relational Database, Introduction to NoSQL, Types of NoSQL, Polyglot Persistence, Sharding.
6 | Frameworks and Visualization | Distributed and Parallel Computing for Big Data, Big Data Visualizations: Visual data analysis techniques, interaction techniques, applications.

9. Text Book:
TB1. Big Data, Black Book, DT Editorial Services, Dreamtech Press, 2016
10. Reference Books:
RB1. Big Data and Analytics, Seema Acharya, Subhashini Chellappan, Infosys Limited,
Publication: Wiley India Private Limited, 1st Edition, 2015.
RB2. Discovering, Analyzing, Visualizing and Presenting Data by EMC Education
Services (Editor), Wiley, 2014.
RB3. Stephan Kudyba, Thomas H. Davenport, Big Data, Mining, and Analytics: Components of
Strategic Decision Making, CRC Press, Taylor & Francis Group, 2014.
RB4. Norman Matloff, The Art of R Programming, No Starch Press, Inc., 2011.
RB5. Big Data For Dummies, Judith Hurwitz et al., Wiley, 2013.
RB6. Glenn J. Myatt, Making Sense of Data, John Wiley & Sons, 2007; Pete Warden, Big
Data Glossary, O’Reilly, 2011.
11. Pre-requisites:
• DBMS
12. Lesson Plan:
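Lecture No. | Unit | Lesson # | Topics
1-6 | Overview of Big Data | 1 | Importance of Data; Characteristics of Data, Analysis of Unstructured Data; Combining Structured and Unstructured Sources
1-6 | Overview of Big Data | 2 | Introduction to Big Data; Challenges of conventional systems
1-6 | Overview of Big Data | 3 | Data analytics; Evolution of Analytic scalability
1-6 | Overview of Big Data | 4 | Big Data Analytics; Key Big Data terminologies
1-6 | Overview of Big Data | 5 | Big Data analytics lifecycle
1-6 | Overview of Big Data | 6 | Cloud Computing and Big Data; Discussion
7-11 | Big Data Technology Foundations | 7 | Exploring the Big Data Stack; Data Sources Layer; Ingestion Layer
7-11 | Big Data Technology Foundations | 8 | Storage Layer; Physical Infrastructure Layer; Platform Management Layer
7-11 | Big Data Technology Foundations | 9 | Security Layer; Monitoring Layer
7-11 | Big Data Technology Foundations | 10 | Analytics Engine; Visualization Layer
7-11 | Big Data Technology Foundations | 11 | Big Data Applications, Virtualization
12-14 | Streaming | 12 | Introduction to Streams Concepts; Stream data model and architecture
12-14 | Streaming | 13 | Stream Computing; Sampling data in a stream
12-14 | Streaming | 14 | Filtering streams; Counting distinct elements in a stream
15-22 | Hadoop Ecosystem | 15 | Introduction to Hadoop; Hadoop Ecosystem
15-22 | Hadoop Ecosystem | 16 | Hadoop Distributed File System; MapReduce
15-22 | Hadoop Ecosystem | 17 | YARN
15-22 | Hadoop Ecosystem | 18 | Hive
15-22 | Hadoop Ecosystem | 19 | Pig and Pig Latin
15-22 | Hadoop Ecosystem | 20 | HBase
15-22 | Hadoop Ecosystem | 21 | Sqoop
15-22 | Hadoop Ecosystem | 22 | Discussion
23-30 | Storing Data in Big Data context | 23 | Data Models
23-30 | Storing Data in Big Data context | 24 | RDBMS and Hadoop
23-30 | Storing Data in Big Data context | 25 | Non-Relational Database
23-30 | Storing Data in Big Data context | 26 | Introduction to NoSQL
23-30 | Storing Data in Big Data context | 27 | Types of NoSQL
23-30 | Storing Data in Big Data context | 28 | Types of NoSQL (continued)
23-30 | Storing Data in Big Data context | 29 | Polyglot Persistence
23-30 | Storing Data in Big Data context | 30 | Sharding; Discussion
31-36 | Frameworks and Visualization | 31 | Distributed and Parallel Computing for Big Data, Gossip Protocol, Paxos Consensus
31-36 | Frameworks and Visualization | 32 | Big Data Visualizations: Visual data analysis techniques
31-36 | Frameworks and Visualization | 33 | Interaction techniques and applications
31-36 | Frameworks and Visualization | 34 | Big Data Visualizations: Visual data analysis techniques (continued)
31-36 | Frameworks and Visualization | 35 | Big Data Visualizations: Visual data analysis techniques (continued)
31-36 | Frameworks and Visualization | 36 | Interaction techniques and applications; Discussions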
13. Assessment Components:
Sr # | Assessment Component | Time | Weightage/Marks | Course Lecture No. (From-To) | Mode
1 | Mid-Semester Examination | 1.5 Hrs | 20 | 1-18 | Closed Book
2 | Activity based Teaching and Learning | Throughout semester | 30 | 1-36 | Open Book, Closed Book
3 | End-Semester Examination | 3 Hrs | 50 | 1-36 | Closed Book

14. Assessment plan for activity based learning:

Considering the guidelines circulated, and after discussion with the faculty members, the following
ongoing activity based teaching and learning plan (comprising 6 ongoing activities) is proposed. The
component-wise distribution of the activities is listed below.

Activities may include one/multiple Individual/Group Assignment(s), Class Test(s), Quiz/Quizzes etc.

Sl.No. Activity Tentative Date of Activity Marks

1 Activity 1 20-01-2024 5

2 Activity 2 03-02-2024 5

3 Activity 3 17-02-2024 5

4 Activity 4 02-03-2024 5

5 Activity 5 16-03-2024 5

6 Activity 6 30-03-2024 5

15. Attendance: Every student is expected to attend all lecture classes, tutorials, labs, tests,
quizzes, seminars, etc. regularly and to fulfil all tasks assigned to him/her. Attendance will be
recorded, and 75% attendance is compulsory.
16. Make-up:
• No make-up examination will be scheduled for the mid-semester examination. However,
official permission to take a make-up examination will be given under exceptional
circumstances such as admission to a hospital due to illness/injury, or a calamity in the family at
the time of examination.
• A student who misses a mid-semester examination because of extenuating circumstances such
as admission to a hospital due to illness/injury, or a calamity in the family, may apply in writing
via an application form with supporting document(s) and medical certificate to the Dean of the
School for a make-up examination.
• Applications should be made within five working days of the missed examination.
17. Chamber consultation hour for doubts clarification:
Sr# | Cabin No | Day & Time
1 | Faculty block - 402, Campus - 14, Block - C | Will be informed in due time.
18. Academic Dishonesty:
• It may be noted that any kind of copying/plagiarism by any student and/or malpractice in
examinations is strictly prohibited.
• In case of violation of the above, the Institute will take appropriate and necessary action.
19. Notices: All notices regarding the course will be communicated through email or WhatsApp.
