Content-Length: 278239 | pFad | http://github.com/shask9/Matrix-Multiplication-Hadoop

5E GitHub - shask9/Matrix-Multiplication-Hadoop: Hadoop MapReduce program to compute multiplication of two sparse matrices
Skip to content

Hadoop MapReduce program to compute multiplication of two sparse matrices

Notifications You must be signed in to change notification settings

shask9/Matrix-Multiplication-Hadoop

Repository files navigation

Matrix-Multiplication-Hadoop

Simple Program to multiply two matrices using Hadoop MapReduce

Program executed on SDSC Comet Cluster, using XSEDE Login (Special thanks to professor and XSEDE for making this possible)

Input File Format:

i,j,value
.
.
.
i,j,value

Output File Format (Resultant Multiplied Matrix):

i,j,value
.
.
.
.
i,j,value

Input Files with two capacities are provided.

  1. something-small.txt (To check the program in local mode)
  2. something-large.txt (To check the program in distributed mode)

Build and Run

Use sbatch to queue jobs in cluster

  1. Using .build file to compile the program
  2. Using .local.run files to test the program in local mode.
  3. Using .distr.run files to test the program in distributed mode.

Program Stages

Stage 1

Map the input matrices line by line and emits the matrix element. Use a reducer to multiply value for same indices.

Stage 2

Use the output from the Stage 1 Reducer and pass along the same input to the Stage 2 reducer, where all the values having same index pairs are summed up to get the final output value.

Pseudocode:

First Map-Reduce job:
map(key,line) =               // mapper for matrix M
    split line into 3 values: i, j, and v
    emit(j,new Elem(0,i,v))

map(key,line) =               // mapper for matrix N
    split line into 3 values: i, j, and v
    emit(i,new Elem(1,j,v))

reduce(index,values) =
A = all v in values with v.tag==0
B = all v in values with v.tag==1
for a in A
for b in B
emit(new Pair(a.index,b.index),a.value*b.value)
Second Map-Reduce job:
map(key,value) = // do nothing
emit(key,value)

reduce(pair,values) = // do the summation
m = 0
for v in values
m = m+v
emit(pair,pair.i+","+pair.j+","+m)

About

Hadoop MapReduce program to compute multiplication of two sparse matrices

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published








ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: http://github.com/shask9/Matrix-Multiplication-Hadoop

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy