
SCHOOL OF COMPUTING

DEPARTMENT OF INFORMATION TECHNOLOGY

LAB MANUAL

213INT1306- BIG DATA ANALYTICS

Name of the Student: ………………………………………….

Register No: ………………….

Year: ………………...

Semester: ……………

Branch: ………………………………………
DEPARTMENT OF INFORMATION TECHNOLOGY

BONAFIDE CERTIFICATE

This is a bonafide record of work done by …………………………………………… studying in

……………………… Year/Semester in the ……………………………………………………Laboratory

during the even/odd semester of the academic year 2024.

Signature of the Staff-in-Charge Signature of Head of Department

Submitted for the practical examination held at Kalasalingam Academy of Research and Education, Anand Nagar, Krishnankovil on ……………………..

Register No.

Internal Examiner External Examiner


Ex.No   Date   Title   Page No   Marks   Staff Sign

List of Experiments

1. Downloading and installing Hadoop; understanding different Hadoop modes, startup scripts, and configuration files.

2. Hadoop implementation of file management tasks, such as adding files and directories, retrieving files, and deleting files.

3. Implementation of matrix multiplication with Hadoop MapReduce.

4. Run a basic word count MapReduce program to understand the MapReduce paradigm.

5. Installation of Hive along with practice examples.

6. Installation of HBase, installing Thrift, along with practice examples.

7. Practice importing and exporting data from various databases.

Ex. No. 1 Date :

DOWNLOADING AND INSTALLING HADOOP; UNDERSTANDING DIFFERENT HADOOP MODES, STARTUP SCRIPTS, CONFIGURATION FILES

Aim : To install Hadoop and understand the different Hadoop modes, startup scripts, and configuration files.

A. Installation of Hadoop:

Hadoop software can be installed in three modes of operation:

a. Stand-Alone Mode: Hadoop is distributed software and is designed to run on a cluster of commodity machines. However, we can install it on a single node in stand-alone mode. In this mode, the Hadoop software runs as a single monolithic Java process. This mode is extremely useful for debugging purposes. You can first test-run your MapReduce application in this mode on small data, before actually executing it on a cluster with big data.

b. Pseudo-Distributed Mode: In this mode also, the Hadoop software is installed on a single node. The various Hadoop daemons run on the same machine as separate Java processes. Hence all the daemons, namely NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker, run on a single machine. (In Hadoop 2 and later with YARN, the JobTracker/TaskTracker roles are taken over by the ResourceManager and NodeManager daemons.)

c. Fully Distributed Mode: In fully distributed mode, the daemons NameNode, JobTracker and SecondaryNameNode (optional, and can be run on a separate node) run on the master node. The daemons DataNode and TaskTracker run on the slave nodes.

Hadoop Installation: Ubuntu Operating System in stand-alone mode

Steps for Installation

1. sudo apt-get update

2. In this step, we will install the Oracle JDK (version 7 or 8) on the machine.

The Oracle JDK is the official JDK; however, it is no longer provided by Oracle as a default installation for Ubuntu. You can still install it using apt-get.

To install any version, first execute the following commands:

a. sudo apt-get install python-software-properties

b. sudo add-apt-repository ppa:webupd8team/java

c. sudo apt-get update

Then, depending on the version you want to install, execute one of the following commands:

Oracle JDK 7: sudo apt-get install oracle-java7-installer

Oracle JDK 8: sudo apt-get install oracle-java8-installer

3. Now, let us set up a new user account for the Hadoop installation. This step is optional, but recommended because it gives you the flexibility to keep the Hadoop installation separate from other software installations.

a. sudo adduser hadoop_dev (Upon executing this command, you will be prompted to enter the new password for this user. Please enter the password and the other details. Don't forget to save the details at the end.)

b. su - hadoop_dev (Switches from the current user to the newly created user, i.e. hadoop_dev)

4. Download the Hadoop distribution.

a. Visit this URL and choose one of the mirror sites. You can copy the download link and use "wget" to download it from the command prompt:

wget http://apache.mirrors.lucidnetworks.net/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz

5. Untar the file

• tar xvzf hadoop-2.6.0.tar.gz

6. Rename the folder to hadoop2

• mv hadoop-2.6.0 hadoop2

7. Edit the configuration file /home/hadoop_dev/hadoop2/etc/hadoop/hadoop-env.sh and set JAVA_HOME in that file.

a. vim /home/hadoop_dev/hadoop2/etc/hadoop/hadoop-env.sh

b. uncomment JAVA_HOME and update it as in the following line:

export JAVA_HOME=/usr/lib/jvm/java-8-oracle (Please check for your relevant Java installation and set this value accordingly. Recent versions of Hadoop require JDK 1.7 or later.)

8. Let us verify whether the installation is successful or not (change to the Hadoop directory: cd /home/hadoop_dev/hadoop2/):

a. bin/hadoop (running this command should prompt you with various usage options)

9. This finishes the Hadoop setup in stand-alone mode.

10. Let us run a sample Hadoop program that is provided to you in the download package:

$ mkdir input (create the input directory)

$ cp etc/hadoop/*.xml input (copy all the xml files to the input folder)

$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+' (run the grep example: it finds all strings matching the pattern 'dfs[a-z.]+' in the input files and writes the matches to the output directory)

$ cat output/* (look for the output in the output directory that Hadoop creates for you)

Hadoop Installation: Pseudo-Distributed Mode (Locally)


Steps for Installation

1. Edit the file /home/hadoop_dev/hadoop2/etc/hadoop/core-site.xml as below:


<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

Note: This change sets the NameNode IP and port.

2. Edit the file /home/hadoop_dev/hadoop2/etc/hadoop/hdfs-site.xml as below:

<configuration>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

</configuration>

Note: This change sets the default replication count for blocks used by HDFS.

3. We need to set up password-less login so that the master will be able to do a password-less ssh to start the daemons on all the slaves.

Check if the ssh server is running on your host or not:

a. ssh localhost (enter your password; if you are able to log in, then the ssh server is running)

b. In step a, if you are unable to log in, then install ssh as follows:

sudo apt-get install ssh

c. Set up password-less login as below:

i. ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

ii. cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

4. We can run Hadoop jobs locally or on YARN in this mode. In this section, we will focus on running the jobs locally.

5. Format the file system. When we format the NameNode, it formats the metadata related to the DataNodes. By doing that, all the information on the DataNodes is lost and they can be reused for new data:

a. bin/hdfs namenode -format

6. Start the daemons

a. sbin/start-dfs.sh (starts the NameNode and DataNode)

You can check whether the NameNode has started successfully by using the following web interface: http://0.0.0.0:50070 . If you are unable to see it, check the logs in the /home/hadoop_dev/hadoop2/logs folder.

7. You can check whether the daemons are running or not by issuing the jps command.

8. This finishes the installation of Hadoop in pseudo-distributed mode.

9. Let us run the same example as in stand-alone mode:

i) Create a new directory on HDFS

bin/hdfs dfs -mkdir -p /user/hadoop_dev

ii) Copy the input files for the program to hdfs:

bin/hdfs dfs -put etc/hadoop input


iii) Run the program:

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'

iv) View the output on hdfs:


bin/hdfs dfs -cat output/*

10. Stop the daemons when you are done executing the jobs, with the below
command: sbin/stop-dfs.sh
Hadoop Installation – Pseudo-Distributed Mode (YARN)
Steps for Installation

1. Edit the file /home/hadoop_dev/hadoop2/etc/hadoop/mapred-site.xml as below:


<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

2. Edit the file /home/hadoop_dev/hadoop2/etc/hadoop/yarn-site.xml as below:


<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>

Note: This particular configuration tells MapReduce how to do its shuffle. In this case it uses mapreduce_shuffle.

3. Format the NameNode: bin/hdfs namenode -format

4. Start the daemons using the command:

sbin/start-yarn.sh

This starts the ResourceManager and NodeManager daemons.

Once this command is run, you can check whether the ResourceManager is running by visiting the following URL in a browser: http://0.0.0.0:8088 . If you are unable to see it, check the logs in the directory /home/hadoop_dev/hadoop2/logs.

5. To check whether the services are running, issue a jps command. The following shows all the services necessary to run YARN on a single server:

$ jps

15933 Jps

15567 ResourceManager

15785 NodeManager
6. Let us run the same example as we ran before:

i) Create a new directory on HDFS

bin/hdfs dfs -mkdir -p /user/hadoop_dev

ii) Copy the input files for the program to hdfs:


bin/hdfs dfs -put etc/hadoop input

iii) Run the program:


bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'

iv) View the output on hdfs:


bin/hdfs dfs -cat output/*

7. Stop the daemons when you are done executing the jobs, with the below command:
sbin/stop-yarn.sh

This completes the installation part of Hadoop.

Hadoop Installation – WINDOWS 10

Steps
1. Prerequisites before installing
2. Set Environment
3. Hadoop set-up

Java installation:

Oracle JDK 8 download link:

https://www.oracle.com/in/java/technologies/javase/javase8-archive-downloads.html

Click download, then install it.

Download Hadoop:

Link : https://hadoop.apache.org/releases.html
2. Set Environment Variables

a. Click Start Button -> Settings

b. Click System

Checking the Java installation:

a. Go to the command prompt

b. Type javac and press Enter

Set Environment Variable for Hadoop
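A minimal sketch of typical values (the exact paths are assumptions; substitute the folders where you actually installed the JDK and extracted Hadoop, and avoid paths containing spaces):

JAVA_HOME = C:\java\jdk1.8.0_202
HADOOP_HOME = C:\hadoop-3.3.6
Path (append) = %JAVA_HOME%\bin;%HADOOP_HOME%\bin;%HADOOP_HOME%\sbin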

core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:50071</value>
</property>
</configuration>
Create a new data folder (with namenode and datanode subfolders) inside the Hadoop folder; these paths are referenced in hdfs-site.xml below.

hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>

<property>
<name>dfs.namenode.name.dir</name>
<value>file:///C:/hadoop-3.3.6/data/namenode</value>
<final>true</final>
</property>

<property>
<name>dfs.datanode.data.dir</name>
<value>/C:/hadoop-3.3.6/data/datanode</value>
<final>true</final>
</property>
</configuration>

Mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>

</property>
</configuration>
Yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>

hadoop-env.cmd : set JAVA_HOME as given below
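A sketch of the line to set in %HADOOP_HOME%\etc\hadoop\hadoop-env.cmd (the JDK path is an assumption; use your own installation directory, preferably one without spaces):

set JAVA_HOME=C:\java\jdk1.8.0_202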

Checking the Hadoop installation
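A minimal sketch of the verification steps (these are the standard Hadoop commands; the web UI ports are the Hadoop 3.x defaults):

hdfs namenode -format
start-dfs.cmd
start-yarn.cmd
jps

Then open http://localhost:9870 (NameNode web UI) and http://localhost:8088 (YARN ResourceManager UI) in a browser.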


Hadoop Local Host
Hadoop Cluster:
Ex. No. 2 Date :

HADOOP IMPLEMENTATION OF FILE MANAGEMENT TASKS, SUCH AS ADDING FILES AND DIRECTORIES, RETRIEVING FILES AND DELETING FILES

Aim : To add files and directories, and to retrieve and delete files in the Hadoop environment.
1. Create a directory in HDFS at given path(s).
Usage : hadoop fs -mkdir <paths>
Example : hadoop fs -mkdir /user/saurzcode/dir1 /user/saurzcode/dir2

2. List the contents of a directory.


Usage :hadoop fs -ls <args>

Example: hadoop fs -ls /user/saurzcode

3. Upload and download a file in HDFS.


a. Upload:

hadoop fs -put: Copy a single src file, or multiple src files, from the local file system to the Hadoop distributed file system.

Usage: hadoop fs -put <localsrc> ... <HDFS_dest_Path>

Example: hadoop fs -put /home/saurzcode/Samplefile.txt /user/saurzcode/dir3/


b. Download:

hadoop fs -get: Copies/Downloads files to the local file system

Usage: hadoop fs -get <hdfs_src> <localdst>

Example: hadoop fs -get /user/saurzcode/dir3/Samplefile.txt /home/

4. See contents of a file


Same as unix cat command:

Usage : hadoop fs -cat <path[filename]>

Example : hadoop fs -cat /user/saurzcode/dir1/abc.txt


5. Copy a file from source to destination
This command allows multiple sources as well, in which case the destination must be a directory.

Usage: hadoop fs -cp <source> <dest>

Example: hadoop fs -cp /user/saurzcode/dir1/abc.txt /user/saurzcode/dir2

6. Copy a file from/to the local file system to/from HDFS


copyFromLocal

Usage: hadoop fs -copyFromLocal <localsrc> URI

Example: hadoop fs -copyFromLocal /home/saurzcode/abc.txt /user/saurzcode/abc.txt

Similar to the put command, except that the source is restricted to a local file reference.

copyToLocal

Usage: hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst>

Similar to the get command, except that the destination is restricted to a local file reference.

7. Move a file from source to destination.

Note:- Moving files across file systems is not permitted.

Usage : hadoop fs -mv <src> <dest>

Example: hadoop fs -mv /user/saurzcode/dir1/abc.txt /user/saurzcode/dir2

8. Remove a file or directory in HDFS.

Removes the files specified as arguments. Deletes a directory only when it is empty.

Usage : hadoop fs -rm <arg>

Example: hadoop fs -rm /user/saurzcode/dir1/abc.txt


Recursive version of delete (in recent Hadoop releases, hadoop fs -rm -r is the preferred equivalent).

Usage : hadoop fs -rmr <arg>

Example: hadoop fs -rmr /user/saurzcode/

9. Display last few lines of a file.


Similar to tail command in Unix.

Usage : hadoop fs -tail <path[filename]>

Example: hadoop fs -tail /user/saurzcode/dir1/abc.txt

10. Display the aggregate length of a file.

Usage : hadoop fs -du <path>

Example: hadoop fs -du /user/saurzcode/dir1/abc.txt


Ex. No. 3 Date :

IMPLEMENTATION OF MATRIX MULTIPLICATION WITH HADOOP MAPREDUCE

Aim : To implement matrix multiplication with Hadoop MapReduce.

Program:

MatrixMultiplication.java

package matrix;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
public class MatrixMultiplication {
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
// A is an m-by-n matrix; B is an n-by-p matrix.
conf.set("m", "2");
conf.set("n", "5");
conf.set("p", "3");
Job job = new Job(conf, "MatrixMultiplication");
job.setJarByClass(MatrixMultiplication.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
job.setMapperClass(MatrixMapper.class);
job.setReducerClass(MatrixReducer.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.waitForCompletion(true);
}
}
MatrixMapper.java
package matrix;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
public class MatrixMapper extends Mapper<LongWritable, Text, Text, Text> {
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        int m = Integer.parseInt(conf.get("m"));
        int p = Integer.parseInt(conf.get("p"));
        // Each input line has the form: matrixName,rowIndex,colIndex,value
        String line = value.toString();
        String[] indicesAndValue = line.split(",");
        Text outputKey = new Text();
        Text outputValue = new Text();
        if (indicesAndValue[0].equals("A")) {
            // A(i,j) contributes to every output cell (i,k), k = 0..p-1
            for (int k = 0; k < p; k++) {
                outputKey.set(indicesAndValue[1] + "," + k);
                outputValue.set("A," + indicesAndValue[2] + "," + indicesAndValue[3]);
                context.write(outputKey, outputValue);
            }
        } else {
            // B(j,k) contributes to every output cell (i,k), i = 0..m-1
            for (int i = 0; i < m; i++) {
                outputKey.set(i + "," + indicesAndValue[2]);
                outputValue.set("B," + indicesAndValue[1] + "," + indicesAndValue[3]);
                context.write(outputKey, outputValue);
            }
        }
    }
}

MatrixReducer.java
package matrix;

import java.io.IOException;
import java.util.HashMap;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
public class MatrixReducer extends Reducer<Text, Text, Text, Text> {
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        String[] value;
        // hashA holds A(i,j) keyed by j; hashB holds B(j,k) keyed by j
        HashMap<Integer, Float> hashA = new HashMap<Integer, Float>();
        HashMap<Integer, Float> hashB = new HashMap<Integer, Float>();
        for (Text val : values) {
            value = val.toString().split(",");
            if (value[0].equals("A")) {
                hashA.put(Integer.parseInt(value[1]), Float.parseFloat(value[2]));
            } else {
                hashB.put(Integer.parseInt(value[1]), Float.parseFloat(value[2]));
            }
        }
        int n = Integer.parseInt(context.getConfiguration().get("n"));
        float result = 0.0f;
        float a_ij;
        float b_jk;
        // Dot product over the shared dimension j = 0..n-1
        for (int j = 0; j < n; j++) {
            a_ij = hashA.containsKey(j) ? hashA.get(j) : 0.0f;
            b_jk = hashB.containsKey(j) ? hashB.get(j) : 0.0f;
            result += a_ij * b_jk;
        }
        if (result != 0.0f) {
            // Output line: i,k,result (the key is left null, so only the value is written)
            context.write(null, new Text(key.toString() + "," + Float.toString(result)));
        }
    }
}

Running Steps:
1. Open Eclipse
2. Create Java Project as “matrix”
3. Add three class files in it

4. Add referenced libraries by right-clicking the matrix project -> Build Path -> Add External Archives
o hadoop-common.jar
o hadoop-mapreduce-client-core-0.23.1.jar

Input Data:
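Each input line must have the form matrixName,rowIndex,columnIndex,value; this follows from the split(",") calls in MatrixMapper. A minimal illustrative input for two 2x2 matrices is sketched below (if you use it, change the conf.set calls in the driver to m = 2, n = 2, p = 2):

A,0,0,1.0
A,0,1,2.0
A,1,0,3.0
A,1,1,4.0
B,0,0,5.0
B,0,1,6.0
B,1,0,7.0
B,1,1,8.0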
Run and Output:
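A sketch of the run commands (the jar name and HDFS paths are placeholders; use the jar you exported from Eclipse and the paths where you placed the input file):

hdfs dfs -mkdir -p /matrix/input
hdfs dfs -put input.txt /matrix/input
hadoop jar matrix.jar matrix.MatrixMultiplication /matrix/input /matrix/output (omit the class name if it was set as the jar's main class)
hdfs dfs -cat /matrix/output/*

For the 2x2 sample above, the expected output lines are i,k,value of the product matrix:

0,0,19.0
0,1,22.0
1,0,43.0
1,1,50.0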
Ex. No. 4 Date :

IMPLEMENTATION OF A BASIC WORD COUNT MAPREDUCE PROGRAM TO UNDERSTAND THE MAPREDUCE PARADIGM

Aim : To implement a word count MapReduce program in Hadoop

Procedure :

• After installing Hadoop and starting its services, write the program for word count in Java.
• Before writing it, we need to create an input file and a directory to store the input and output files.
• Create an input file with vi <file_name>.txt and write some text in it.
• After creating the file, create a directory and put the file inside the directory using the commands:
hdfs dfs -mkdir /map
hdfs dfs -put test.txt /map

• Then create a mapreduce folder to store the Java programs using the command: mkdir mapreduce
• Inside the mapreduce folder, write the programs for word count.

WordCountMapper.java:
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.Mapper;
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable>
{
private final static IntWritable one= new IntWritable(1);
private Text word = new Text();
public void map(LongWritable key, Text value, Context context) throws IOException,
InterruptedException
{
String line = value.toString();
StringTokenizer tokenizer = new StringTokenizer (line);
while(tokenizer.hasMoreTokens())
{
word.set(tokenizer.nextToken());
context.write(word,one);
} }}

WordCountReducer.java:

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable>


{
public void reduce(Text key, Iterable<IntWritable> values, Context context) throws
IOException, InterruptedException
{
int sum = 0;
for (IntWritable value : values)
{
sum += value.get();
}
context.write(key, new IntWritable(sum));
}
}

WordCount.java:
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCount extends Configured implements Tool


{
public int run(String[] args) throws Exception
{
Configuration conf = getConf();
Job job = new Job(conf, "Word Count hadoop-0.20");
job.setJarByClass(WordCount.class);
job.setMapperClass(WordCountMapper.class);
job.setReducerClass(WordCountReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);

FileInputFormat.addInputPath(job, new Path(args[0]));


FileOutputFormat.setOutputPath(job, new Path(args[1]));
return job.waitForCompletion(true) ? 0 : 1;
}

public static void main(String[] args) throws Exception


{
int res = ToolRunner.run(new Configuration(), new WordCount(), args);
System.exit(res);
}}

• After writing the programs, we need hadoop-core-3.3.6.jar. Download the jar file from
(https://repo1.maven.org/maven2/org/apache/hadoop/hadoopcore/3.3.6/) and move hadoop-core-3.3.6.jar to the mapreduce folder in Hadoop using the command cp hadoop-core-3.3.6.jar /opt/hadoop/mapreducer/

• Then extract the hadoop-core-3.3.6.jar using the command jar -xvf hadoop-core-3.3.6.jar.

• After extracting hadoop-core-3.3.6.jar, compile the Java files:

javac WordCountMapper.java
javac WordCountReducer.java
javac WordCount.java

• Package the classes into a jar file using the command jar cvfe WordCount.jar WordCount *.class.

• Then give the paths of the input file and the output directory for the word count program using the command below (a sample run is shown after these steps):
hadoop jar WordCount.jar /map/test.txt /map/out.txt

• In order to access Hadoop services from a remote browser, open the Hadoop web UI:

http://localhost:9008 (if this port does not respond, the Hadoop 3.x defaults are 9870 for the NameNode UI and 8088 for the YARN ResourceManager UI)
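A minimal sketch of a sample run (the input text below is only an illustration; the counts are simply what the program produces for it):

If test.txt contains the two lines

hello world
hello hadoop

then, after running hadoop jar WordCount.jar /map/test.txt /map/out.txt, view the result with:

hdfs dfs -cat /map/out.txt/part-r-00000

The output lists each word and its count, separated by a tab:

hadoop 1
hello 2
world 1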
Ex. No. 5 Date :
Installation of Hive along with practice examples
Hive Installation on Windows:
Download Apache Derby (used here as the Hive metastore database). After the download completes, unzip the file and copy it to the C drive.

Rename it as “derby”
Download hive-2.1.0 using the link https://archive.apache.org/dist/hive/hive-2.1.0

Unzip the hive file and copy it to C Drive.


Rename the folder as "hive"

Navigate to the derby folder->Open it->lib->Copy all the files in the folder

Open hive->lib->Paste the copied files here

Setup the path for Hive and Derby :

Search bar -> Edit the system environment variables

Click Environment Variables -> Click New in User Variables -> Type the variable name and value -> Click OK

Do the same in System Variables.
Download the “hive-site.xml” file in the link –
https://drive.google.com/file/d/1tsBbHdvM1fFktmn9O0-u0pbG1vWWFoyE/view?pli=1

Copy the "hive-site.xml"


Open hive folder->conf->Paste it

Open the command prompt -> Run as Administrator

Start hadoop data nodes:
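A sketch of the commands (these are the standard Hadoop Windows scripts; they assume %HADOOP_HOME%\sbin is on the PATH):

start-dfs.cmd
start-yarn.cmd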


Start the derby server using the following command
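A sketch, assuming Derby was copied to C:\derby as in the earlier step:

C:\derby\bin\startNetworkServer.bat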

Open a command prompt -> type "hive" -> type commands such as the following to create a database and insert data into it (a sketch is shown below).
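A minimal sketch of HiveQL practice commands (the database name, table name, and sample rows are illustrative, not part of the original manual):

SHOW DATABASES;
CREATE DATABASE studentdb;
USE studentdb;
CREATE TABLE student (id INT, name STRING, marks INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
INSERT INTO TABLE student VALUES (1, 'Arun', 85), (2, 'Priya', 92);
SELECT * FROM student;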
Hive Installation on Ubuntu :
Ex. No. 6 Date :
Installation of HBase along with practice examples

HBase Installation Steps in windows 10:

Step 1:

Download HBase 2.2.5 using the link : www.hbase.apache.org/downloads.html -> click bin

Download using any one of the mirror links.

Unzip the downloaded HBase and place it in some common path, say C:/Document/hbase-2.2.5
Unzipped file :

Step 2:

Create the folders shown below inside the root folder, for HBase data and ZooKeeper:

-> C:/Document/hbase-2.2.5/hbase
-> C:/Document/hbase-2.2.5/zookeeper

Step 3:

Open C:/Document/hbase-2.2.5/bin/hbase.cmd in Notepad++. Search for the line given below and remove %HEAP_SETTINGS% from it:

set java_arguments=%HEAP_SETTINGS% %HBASE_OPTS% -classpath "%CLASSPATH%" %CLASS% %hbase-command-arguments%
Step 4:

Open C:/Document/hbase-2.2.5/conf/hbase-env.cmd in Notepad++. Add the lines below to the file after the comment section.

set JAVA_HOME=%JAVA_HOME%
set HBASE_CLASSPATH=%HBASE_HOME%\lib\client-facing-thirdparty\*
set HBASE_HEAPSIZE=8000
set HBASE_OPTS="-XX:+UseConcMarkSweepGC" "-Djava.net.preferIPv4Stack=true"
set SERVER_GC_OPTS="-verbose:gc" "-XX:+PrintGCDetails" "-XX:+PrintGCDateStamps"
%HBASE_GC_OPTS%
set HBASE_USE_GC_LOGFILE=true

set HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false" "-Dcom.sun.management.jmxremote.authenticate=false"
set HBASE_MASTER_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10101"
set HBASE_REGIONSERVER_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10102"
set HBASE_THRIFT_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10103"
set HBASE_ZOOKEEPER_OPTS=%HBASE_JMX_BASE% "-Dcom.sun.management.jmxremote.port=10104"
set HBASE_REGIONSERVERS=%HBASE_HOME%\conf\regionservers
set HBASE_LOG_DIR=%HBASE_HOME%\logs
set HBASE_IDENT_STRING=%USERNAME%
set HBASE_MANAGES_ZK=true

Step 5:

Open C:/Document/hbase-2.2.5/conf/hbase-site.xml in Notepad++. Add the lines below inside the <configuration> tag.

<property>
<name>hbase.rootdir</name>
<value>file:///C:/Document/hbase-2.2.5/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/C:/Document/hbase-2.2.5/zookeeper</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>

Step 6:

Set up the environment variable HBASE_HOME and add its bin folder to the Path variable as shown in the image below.

Edit the Path variable and add the HBase bin path.


Start the HBase server using the command below (a sketch of these commands is given below).

Check the HBase version: hbase version

To open the HBase shell window

To create a table

Insert data into the table
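A minimal sketch of the corresponding commands (the table and column family names are illustrative; run the first two from a command prompt in %HBASE_HOME%\bin, and the rest inside the HBase shell):

start-hbase.cmd
hbase version

hbase shell
create 'student', 'info'
put 'student', '1', 'info:name', 'Arun'
put 'student', '1', 'info:marks', '85'
scan 'student'
get 'student', '1'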
