BDA - Assignment and Submission Guidelines PDF


A D Patel Institute of Technology

Department of Information Technology


Sem-7 AY: 2020-21
2171607 – Big Data Analytics
ASSIGNMENT
Unit-1: INTRODUCTION TO BIG DATA
[1] What is Big Data? What are the challenges with big data? Explain three basic types of
Big Data in brief.
[2] Define Big Data Analytics. Explain 5 applications of Big Data Analytics.
[3] Explain the architecture of HDFS.
[4] What is a distributed file system? Explain the important features of HDFS.
[5] Explain the ‘Four V’ characteristics of Big Data with a suitable example.
Unit-2: INTRODUCTION TO HADOOP AND HADOOP ARCHITECTURE
[1] What is Hadoop? Explain the important components of Hadoop with a suitable diagram.
[2] Explain the working of MapReduce using the ‘WordCount’ example.
[3] List the components of the Hadoop ecosystem and describe their functionality in brief.
[4] What is data serialization? How is data serialized in Hadoop? How does Hadoop
serialization differ from Java serialization?
[5] What is scheduling? Explain any three schedulers used in Apache Hadoop.
[6] Explain the following terms with reference to the working of Hadoop:
InputSplit, InputFormat, Shuffle, Sort, Reducer, Combiner, OutputFormat,
RecordWriter, Distributed Cache
[7] How is data moved into and out of Hadoop? Explain in brief.
[8] Explain the features and key advantages of Hadoop.
[9] Differentiate between the following:
RDBMS vs. Hadoop
[10] Explain the following important daemons of HDFS and MapReduce:
DataNode, NameNode and Secondary NameNode
JobTracker and TaskTracker
Unit-3: HDFS, HIVE AND HIVEQL, HBASE
[1] What is Hive? Explain its architecture and its working with suitable diagrams.
[2] Differentiate between the following:
Hive vs. RDBMS
HDFS vs. HBase
HBase vs. RDBMS
[3] Differentiate between the following:
Apache Pig vs. MapReduce
Pig vs. SQL
Pig vs. Hive
[4] What is Pig? Why do we need Pig? Explain the architecture of Pig and its data model.
[5] What is ZooKeeper? How does it help in monitoring a cluster?
[6] Why is Apache ZooKeeper useful? Explain the architecture of Apache ZooKeeper.
[7] Explain the following terms with reference to Apache ZooKeeper:
Ensemble, Leader, Znodes, Sessions, Watches
[8] Explain the workflow of Apache ZooKeeper with a suitable diagram.
[9] What is the importance of HBase? Explain the data model supported by HBase. Also
differentiate between row-oriented and column-oriented databases.
[10] Explain the important components of HDFS.
Unit-4: SPARK
[1] What is Spark? Explain its important features.
[2] Explain the important components of Spark.
[3] Explain RDD in detail.
[4] How is Spark faster than MapReduce?
[5] Explain the interactive and iterative architecture of MapReduce and Spark.
[6] Explain the architecture of Spark Streaming.
[7] Explain the various data types supported by MLlib.
[8] Explain any five machine learning functionalities supported by MLlib.
[9] Explain the working of the WordCount example using Spark.
[10] List the types of operations supported by RDDs and explain both in brief.
Unit-5: NoSQL
[1] What is NoSQL? Where is it used?
[2] Explain various types of NoSQL databases with suitable examples.
[3] Explain the use of NoSQL databases in industry.
[4] Differentiate between SQL and NoSQL.
[5] Explain NewSQL with its characteristics, advantages and drawbacks.
Unit-6: DATABASE FOR MODERN WEB
[1] What is MongoDB? Explain the important features of MongoDB.
[2] Compare MongoDB with RDBMS. Mention advantages of MongoDB over RDBMS.
[3] Explain the following terms with reference to MongoDB, with suitable examples:
Cursor, Indexes, mongoimport, mongoexport
[4] Explain CRUD operations in MongoDB.
[5] Explain the significance of the following methods with reference to the MongoDB Query
Language:
find(), pretty(), count(), limit(), skip(), sort(), update(), insert(), save()

Submission Guidelines:

1. All students are required to answer one question from each unit, as per the guidelines in
Sr. No. 2 below. (Hence, a total of 6 questions are to be answered. The rest should be studied
for exam preparation.)
2. The question to be answered from each unit is selected based on the last digit of your
Enrolment number, i.e.
a. If your Enr.No. ends with 1 then answer 1st question from each unit.
b. If your Enr.No. ends with 2 then answer 2nd question from each unit.
c. If your Enr.No. ends with 3 then answer 3rd question from each unit.
d. If your Enr.No. ends with 4 then answer 4th question from each unit.
e. If your Enr.No. ends with 5 then answer 5th question from each unit.
f. If your Enr.No. ends with 6 then answer 1st question from units-1,5,6 and 6th question
from units-2,3,4 respectively.
g. If your Enr.No. ends with 7 then answer 2nd question from units-1,5,6 and 7th question
from units-2,3,4 respectively.
h. If your Enr.No. ends with 8 then answer 3rd question from units-1,5,6 and 8th question
from units-2,3,4 respectively.
i. If your Enr.No. ends with 9 then answer 4th question from units-1,5,6 and 9th question
from units-2,3,4 respectively.
j. If your Enr.No. ends with 0 then answer 5th question from units-1,5,6 and 10th question
from units-2,3,4 respectively.
3. The assignment is to be submitted by 7th November, 2019 as a soft copy through Microsoft
Teams.
4. The assignment will have a weightage of 5 marks out of the 30 ‘M’ component marks.
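
The selection rule in guideline 2 can be sketched as a small lookup. This is only an illustration: the function name `question_for_unit` and the two constants are hypothetical, and the guideline text above remains the authoritative statement of the rule.

```python
# Hypothetical helper illustrating guideline 2; all names are illustrative.
TEN_QUESTION_UNITS = {2, 3, 4}   # units that list ten questions
FIVE_QUESTION_UNITS = {1, 5, 6}  # units that list five questions


def question_for_unit(enrolment_no: str, unit: int) -> int:
    """Return the question number to answer in `unit` for this enrolment number."""
    last_digit = int(enrolment_no.strip()[-1])
    if unit in TEN_QUESTION_UNITS:
        # Digits 1-9 map to questions 1-9; a trailing 0 maps to question 10.
        return 10 if last_digit == 0 else last_digit
    if unit in FIVE_QUESTION_UNITS:
        # Digits 1-5 map to themselves; 6, 7, 8, 9, 0 wrap around to 1-5.
        return (last_digit - 1) % 5 + 1
    raise ValueError(f"unknown unit: {unit}")


def questions_to_answer(enrolment_no: str) -> dict:
    """Map each of the six units to the question number to be answered."""
    return {unit: question_for_unit(enrolment_no, unit) for unit in range(1, 7)}
```

For example, an enrolment number ending in 7 maps to the 2nd question in Units 1, 5 and 6 and the 7th question in Units 2, 3 and 4, as stated in item g.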
