Big Data Quiz1.1

Uploaded by

yitej21617

BIG DATA saMA

1. What class does the ApplicationMaster use to communicate with the ResourceManager? -- AMRMClient or AMRMClientAsync

2. True or False: The AppMaster is itself a container. -- TRUE

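3. True or False: The AppMaster asks the NodeManager whether it is not too busy to start a container for the AppMaster. Justify if it is wrong (Golden Rule). -- FALSE. The ResourceManager instructs a NodeManager to start the AppMaster's container; the AM then communicates with NodeManagers to create the containers it has been allocated.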

4. Hadoop is a fault-tolerant system. What does Hadoop do if data in HDFS is no longer available due to disk corruption or machine failure?
The NameNode re-replicates the affected blocks from the surviving replicas to another machine/rack.

5. Difference in hardware requirements for NameNode and DataNode. Is the NameNode machine the same as the DataNode machine in terms of hardware?

No. The NameNode needs more memory: it is a memory-based server that determines and maintains how the chunks (blocks) of data are distributed across the DataNodes. It holds the namespace, metadata, and block map.
A DataNode needs more storage: it stores the chunks of data, replicates chunks to other DataNodes, and handles read and write requests. It performs block creation, deletion, and replication on instruction from the NameNode, and sends heartbeats and block reports to the NameNode.

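6. How will the memory requirements of the NameNode change if we increase the size of the files stored in HDFS without increasing the number of files? -- Memory requirements will increase once the larger files span more blocks, because the NameNode keeps metadata for every block. What decreases memory is storing the same total data in fewer, larger files:
One 512 MB file = 4 blocks (128 MB each)

Four files of 150 + 150 + 150 + 62 = 512 MB (multiple files, same total size)

= 2 + 2 + 2 + 1 = 7 blocks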
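
The block arithmetic above can be checked with a short sketch. It assumes the 128 MB default block size and that the NameNode keeps one in-memory object per file and one per block; the function and variable names are made up for illustration.

```python
import math

BLOCK_SIZE_MB = 128  # assumed HDFS default block size (Hadoop 2)


def blocks_for(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Number of HDFS blocks needed for one file (the last block may be partial)."""
    return math.ceil(file_size_mb / block_size_mb)


one_large_file = [512]                  # a single 512 MB file
four_small_files = [150, 150, 150, 62]  # the same 512 MB spread over four files

large_blocks = sum(blocks_for(size) for size in one_large_file)
small_blocks = sum(blocks_for(size) for size in four_small_files)

# Assumption: the NameNode tracks one object per file plus one per block.
large_objects = len(one_large_file) + large_blocks
small_objects = len(four_small_files) + small_blocks

print(large_blocks, small_blocks)    # 4 7
print(large_objects, small_objects)  # 5 11
```

The same 512 MB costs the NameNode more objects when it is split across more files, which is why many small files are the expensive case.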

7. When a client contacts the NameNode to access a file, the NameNode responds with -- the block locations (which DataNodes hold each block of the file).

8. Which of the following are part of Hadoop 1.x? (Multiple choices are possible.)
A. JobTracker +
B. NodeManager
C. TaskTracker +
D. NameNode +
E. DataNode +

9. Where does the JobTracker run in Hadoop 1?

A. container
B. master +
C. both
D. slaves
E. none of them

10. What is the difference between Hadoop 1 and Hadoop 2?
Hadoop 1 is a single-purpose system (batch processing); default block size is 64 MB.
Hadoop 2 is a multi-purpose platform (batch, interactive, online, ...); default block size is 128 MB.
Hadoop 1 stack: MapReduce and HDFS. It supports only the MapReduce processing model and not non-MR tools. Scaling is limited to about 4,000 nodes per cluster. A single NameNode manages the entire namespace, so a NameNode failure affects the whole stack. Does not support Microsoft Windows.
Hadoop 2 stack: MapReduce and other data processing models, YARN, and HDFS. It scales better, up to about 10,000 nodes per cluster. Multiple NameNode services manage multiple namespaces. The Hadoop stack (Hive, Pig, HBase, etc.) is equipped to handle NameNode failure. Supports Microsoft Windows.

11. What is the default web UI port number of the NodeManager? -- 8042

12. The default interval for the heartbeat sent from a DataNode to the NameNode is -- 3 seconds

13. Hadoop ecosystem: "HIVE" is a query engine that supports the parts of SQL specific to querying data. -- HIVE

14. MapReduce is a computing model that is OPTIMIZED for HIGH SCALABILITY but not for LOW LATENCY
-Ambari
-MapReduce
-Low
-High
-Optimize
-NameNode
-Scalability
-latency

15. Components of YARN

Resource Manager, Container, Node Manager, Application Master.

16. Worker Node = DataNode + Node Manager

17. True or False: If a container fails to complete its task successfully, the ResourceManager starts …. on a different NodeManager. -- TRUE

18. If you had 10 MapReduce jobs running on your cluster, how many ApplicationMaster instances would you have running? -- 10. Each job has its own AM.

19. What 2 types of resources can the ApplicationMaster request for a container? -- Processing and memory (CPU + RAM)

20. Data Mining and Analytics is a cross-disciplinary area of research that includes which disciplines?
Machine learning, statistics, artificial intelligence, signal processing, data engineering, probability models, statistical learning, database management systems, cloud computing

21. Briefly explain the main concepts of MapReduce. -- It is based on functional programming and works well for big data because it can process large data sets. It is a programming model designed for processing large volumes of data in parallel by dividing the work into a set of independent tasks. It provides a flexible and scalable foundation for analytics, from traditional reporting to leading-edge machine learning algorithms.

22. Hadoop ecosystem: MAPREDUCE is a general-purpose computing model and runtime system for distributed data analytics

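23. Explain the difference between a block and a replica. What are their default values?
Block (default size 128 MB): the unit into which HDFS splits a file; each block is stored as a file on the DataNode's underlying file system, which allows fast reading.
Replica (default replication factor 3): a copy of a block kept on another DataNode for fault tolerance.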
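
As a rough illustration of how the two defaults combine, the sketch below estimates the footprint of one file. The helper name `hdfs_footprint` and the 300 MB example are assumptions for illustration; the calculation relies on HDFS not padding the final partial block.

```python
import math


def hdfs_footprint(file_size_mb, block_size_mb=128, replication=3):
    """Estimate blocks, block replicas, and raw storage for one HDFS file."""
    blocks = math.ceil(file_size_mb / block_size_mb)
    return {
        "blocks": blocks,                              # logical blocks of the file
        "block_replicas": blocks * replication,        # copies stored on DataNodes
        "raw_storage_mb": file_size_mb * replication,  # last block is not padded
    }


footprint = hdfs_footprint(300)
print(footprint)
# {'blocks': 3, 'block_replicas': 9, 'raw_storage_mb': 900}
```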

24. List the daemons of Hadoop (version 1) and briefly explain their roles in a Hadoop cluster
Master services
1. NameNode -- holds the metadata for HDFS
2. Secondary NameNode -- performs housekeeping (checkpointing) for the NameNode; its checkpoint can be used to recover the NameNode, but it is not a hot standby
3. JobTracker -- manages MapReduce jobs, distributes individual tasks to the machines running a TaskTracker, and coordinates the MapReduce stages

Slave services
1. DataNode -- stores the actual HDFS data blocks
2. TaskTracker -- responsible for instantiating and monitoring individual Map and Reduce tasks

25. What do the following commands do? (Learn all the commands in the slides.)
-cat: displays file content (uncompressed)
-text: like -cat, but also works on compressed files
-chgrp, -chmod, -chown: change a file's group, permissions, and owner
-put, -get, -copyFromLocal, -copyToLocal: copy files from the local file system to HDFS and vice versa
-ls, -ls -R: list files/directories (-R lists recursively)
-mv, -moveFromLocal, -moveToLocal: move files
-stat: status information for a given file (block size, replication, file type, etc.)

26. Give 2 use cases showing how Big Data Management and Analytics can help businesses in multiple sectors increase profit and effectiveness.
Use case 1 -- Financial services transformed by analytics
Customer profiling -- financial firms use parameters about customers to determine risk
Fraud detection -- credit card companies look at transaction factors to detect fraud

Use case 2 -- Retail transformed by analytics
Market basket analysis

27. 5 V's of Big Data

1. High volume (data at rest): terabytes, records, files
2. High velocity (data in motion): batch, near real time, real time, streams
3. High variety (data in many forms): structured, unstructured, multi-factor, probabilistic, ...
4. High veracity (data in doubt): trustworthiness, availability, accountability
5. High value (data in limbo): statistical, correlation, hypothetical

28. Briefly describe the primary steps of a MapReduce job in Hadoop

1. Input
2. Split
3. Map
4. Shuffle
5. Reduce
6. Result
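
The six steps can be walked through with a small in-memory word count. This is only a single-process sketch of the model, not Hadoop's API; all function names here are hypothetical.

```python
from collections import defaultdict


def split_input(text):
    """Split: break the input into records (here, non-empty lines)."""
    return [line for line in text.splitlines() if line.strip()]


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def run_job(text):
    splits = split_input(text)                                          # input + split
    mapped = [pair for record in splits for pair in map_phase(record)]  # map
    grouped = shuffle(mapped)                                           # shuffle
    return dict(reduce_phase(k, v) for k, v in sorted(grouped.items())) # reduce + result


result = run_job("big data\nbig cluster\ndata node data")
print(result)  # {'big': 2, 'cluster': 1, 'data': 3, 'node': 1}
```

In a real cluster the map and reduce calls run as independent tasks on different nodes, and the shuffle moves data between them over the network.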

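29. What is a heartbeat in Hadoop, and why is it important for the cluster?

It is a periodic signal from a DataNode to the NameNode indicating that the DataNode is alive. If the heartbeats stop, the NameNode marks the DataNode as dead and re-replicates its blocks on other nodes.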
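
A minimal sketch of how missed heartbeats translate into a dead-node decision. It assumes the commonly documented HDFS defaults (3-second heartbeat, 5-minute recheck interval) and the usual timeout rule of twice the recheck interval plus ten heartbeat intervals; the function names are illustrative, and the configured values on a given cluster may differ.

```python
HEARTBEAT_INTERVAL_S = 3   # assumed default of dfs.heartbeat.interval
RECHECK_INTERVAL_S = 300   # assumed default of dfs.namenode.heartbeat.recheck-interval (5 min)


def dead_node_timeout_s(heartbeat_s=HEARTBEAT_INTERVAL_S, recheck_s=RECHECK_INTERVAL_S):
    """Seconds without a heartbeat before a DataNode is treated as dead."""
    return 2 * recheck_s + 10 * heartbeat_s


def is_dead(last_heartbeat_s, now_s, timeout_s=None):
    """True if the DataNode has been silent for longer than the timeout."""
    if timeout_s is None:
        timeout_s = dead_node_timeout_s()
    return (now_s - last_heartbeat_s) > timeout_s


print(dead_node_timeout_s())  # 630 (10 minutes 30 seconds)
print(is_dead(0, 600))        # False
print(is_dead(0, 631))        # True
```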

30.
vagrant Centos VM
Centos box Hadoop

31.
The RM instructs a NodeManager to create the AM.
The AM sends a request to the RM to allocate resources.
The RM allocates resources for the AM.
The AM sends a request to the NodeManager to create a container.
The AM communicates directly with the container.
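32. all(sample(15:25, 11) > 15) -- FALSE
any(sample(15:25, 11) > 20) -- TRUE
(sample(15:25, 11) returns all 11 values from 15 to 25 in random order, so 15 itself fails the first test and 21 to 25 pass the second.)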

33. rep(seq(2,10,2), each=3)

[1] 2 2 2 4 4 4 6 6 6 8 8 8 10 10 10
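
For readers more familiar with Python, the reasoning behind questions 32 and 33 can be reproduced as a sketch. It assumes question 32 is meant as `sample(15:25, 11) > 15`, and uses `random.sample` as a stand-in for R's `sample`.

```python
import random

# Drawing 11 values without replacement from 15..25 returns every value once,
# so the outcome of all()/any() does not depend on the random order.
values = random.sample(range(15, 26), 11)

all_above_15 = all(v > 15 for v in values)  # 15 itself is not > 15
any_above_20 = any(v > 20 for v in values)  # 21..25 are > 20

print(all_above_15)  # False
print(any_above_20)  # True

# Equivalent of rep(seq(2, 10, 2), each = 3): repeat each element three times.
repeated = [x for x in range(2, 11, 2) for _ in range(3)]
print(repeated)  # [2, 2, 2, 4, 4, 4, 6, 6, 6, 8, 8, 8, 10, 10, 10]
```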

34. salesDT <- as.data.table(sales)

setkey(salesDT, total)
salesDT[nrow(salesDT), ][, 6] -- 199

35. Components that Hadoop uses for streaming

Hive -- data warehouse infrastructure that supports ad hoc SQL queries
Kafka -- fast, scalable, durable, and fault-tolerant publish-subscribe messaging system
HBase -- scalable, distributed NoSQL database that supports structured data storage for large tables

36. How to open a CSV file in R

dat <- read.csv("spam.csv", header = TRUE)
37.
