CP4292 Multicore Architecture lab manual
AIM:
To write a simple OpenMP program that demonstrates fork-join parallelism by creating multiple threads, each of which prints its thread ID and the total number of threads.
ALGORITHM:
Step 1: Start
Step 2: Create a program that opens a parallel region using the #pragma omp parallel directive.
Step 3: Inside the parallel region, obtain the ID of each thread and the total number of threads.
Step 4: Print a hello message from every thread.
Step 5: Output the thread count after the parallel region ends.
Step 6: Stop
THEORY:
In this program, the #pragma omp parallel directive is used to define
a parallel region.
All the code inside this region will be executed by multiple threads
in parallel. The private(thread_id) clause ensures that each thread has
its own copy of the thread_id variable.
Inside the parallel region, we use the omp_get_num_threads()
function to get the total number of threads and
omp_get_thread_num() function to get the ID of the current thread.
Each thread then prints a hello message along with its thread ID and
the total number of threads.
When you compile and run this program, you will see output similar
to the following:
Hello from thread 0 out of 4 threads.
Hello from thread 2 out of 4 threads.
Hello from thread 1 out of 4 threads.
Hello from thread 3 out of 4 threads.
The exact output may vary depending on your system and the
number of threads available for parallel execution.
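A minimal sketch of the version the theory describes, with an explicit private(thread_id) clause (thread_id and total_threads are illustrative names; the PROGRAM listing further below uses a shorter form without the private clause):
#include <stdio.h>
#include <omp.h>
int main(void)
{
    int thread_id;
    // Each thread gets its own copy of thread_id because of private()
    #pragma omp parallel private(thread_id)
    {
        thread_id = omp_get_thread_num();          // ID of the current thread
        int total_threads = omp_get_num_threads(); // threads in this region
        printf("Hello from thread %d out of %d threads\n",
               thread_id, total_threads);
    }
    return 0;
}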
Note that you need to compile this program with OpenMP support.
For example, using the GCC compiler, you can compile it with the
following command: gcc -o gfg -fopenmp filename.c
PROGRAM:
#include<stdio.h>
#include <omp.h>
int main(void)
{
printf("Before: total thread number is %d\n", omp_get_num_threads());
#pragma omp parallel
{
printf("Thread id is %d\n",omp_get_thread_num());
}
printf("After: total thread number is %d\n", omp_get_num_threads());
return 0;
}
OUTPUT:
RESULT:
AIM:
To create a program that computes a simple matrix-vector multiplication using OpenMP.
THEORY:
In this program, we have a matrix A and a vector x, and we want to compute the
matrix-vector multiplication b = Ax.
Inside the parallel region defined by the #pragma omp parallel directive, we
perform the matrix-vector multiplication in parallel.
Each thread calculates a chunk of rows in the result vector b. The number of
threads and the range of rows assigned to each thread are determined using
OpenMP functions.
The omp_get_num_threads() function returns the total number of threads, and
omp_get_thread_num() function returns the ID of the current thread.
Each thread computes its assigned chunk of rows in parallel using nested loops.
The outer loop iterates over the assigned rows, and the inner loop iterates over
the columns of matrix A.
Each thread updates its portion of the result vector b by multiplying the
corresponding row of matrix A with the vector x.
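A minimal sketch of this row-chunking scheme, using the names A, x and b from the explanation and the same 2 x 2 data as the listing below (the PROGRAM itself uses the simpler #pragma omp parallel for form instead):
#include <stdio.h>
#include <omp.h>
#define N 2
int main(void)
{
    float A[N][N] = {{1, 2}, {3, 4}};
    float x[N] = {8, 10};
    float b[N];
    #pragma omp parallel
    {
        int nthreads = omp_get_num_threads();
        int tid = omp_get_thread_num();
        int chunk = (N + nthreads - 1) / nthreads;           // rows per thread
        int start = tid * chunk;
        int end = (start + chunk < N) ? start + chunk : N;
        // Each thread computes its own block of rows of b = A * x
        for (int i = start; i < end; i++)
        {
            b[i] = 0;
            for (int j = 0; j < N; j++)
                b[i] += A[i][j] * x[j];
        }
    }
    for (int i = 0; i < N; i++)
        printf("b[%d] = %f\n", i, b[i]);
    return 0;
}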
After the parallel region, we print the resulting vector b.
To compile and run this program with OpenMP support, you can use the GCC
compiler with the following command: gcc -o gfg -fopenmp filename.c
PROGRAM:
#include <stdio.h>
#include <omp.h>
int main()
{
float A[2][2] = {{1,2},{3,4}};
float b[] = {8,10};
float c[2];
int i,j;
// computes A*b
#pragma omp parallel for private(j)
for (i=0; i<2; i++)
{
c[i]=0;
for(j=0;j<2;j++)
{
c[i]=c[i]+A[i][j]*b[j];
}
}
// prints result
for(i=0; i<2; i++)
{
printf("c[%i]=%f \n",i,c[i]);
}
return 0;
}
OUTPUT:
Input:
Output:
RESULT:
AIM:
To create a program that computes the sum of all the elements in an array.
ALGORITHM:
Step 1: Start
Step 2: Creation of a program for computing the sum of all the elements in an
array.
Step 3: Input the array elements.
Step 4: Compute the sum of the elements in parallel.
Step 5: Output the sum.
Step 6: Stop.
THEORY:
In this program, we have an array A, and we want to compute the sum of all
the elements in parallel using OpenMP.
Inside the parallel region defined by the #pragma omp parallel directive, we
calculate a partial sum for each thread.
The number of threads and the range of array indices assigned to each thread
are determined using OpenMP functions.
Each thread computes its assigned chunk of array elements in parallel using a
loop.
The loop iterates over the assigned indices, and each thread accumulates its
partial sum.
To avoid race conditions while updating the shared sum variable, we use the
#pragma omp atomic directive to perform a reduction operation.
When you run the program, you will see the sum of all the elements printed:
Sum: 55
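A minimal sketch of the partial-sum approach described above, assuming an illustrative array holding the values 1 to 10 (which gives Sum: 55); the PROGRAM listing below takes a different, queue-based route:
#include <stdio.h>
#include <omp.h>
int main(void)
{
    int A[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    int n = 10, sum = 0;
    #pragma omp parallel
    {
        int nthreads = omp_get_num_threads();
        int tid = omp_get_thread_num();
        int chunk = (n + nthreads - 1) / nthreads;         // indices per thread
        int start = tid * chunk;
        int end = (start + chunk < n) ? start + chunk : n;
        // Each thread accumulates a private partial sum over its chunk
        int partial = 0;
        for (int i = start; i < end; i++)
            partial += A[i];
        // Atomic update avoids a race condition on the shared sum
        #pragma omp atomic
        sum += partial;
    }
    printf("Sum: %d\n", sum);
    return 0;
}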
PROGRAM:
#include <omp.h>
#include <bits/stdc++.h>
using namespace std;
int main()
{
    vector<int> arr{3, 1, 2, 5, 4, 0};
    queue<int> data;
    int arr_size = arr.size();
    // Push every array element into the shared queue
    for (int i = 0; i < arr_size; i++)
    {
        data.push(arr[i]);
    }
    omp_set_num_threads((int)ceil(arr_size / 2.0));
    #pragma omp parallel
    {
        // The critical section serialises access to the shared queue:
        // repeatedly pop two values and push back their sum until only
        // the total remains
        #pragma omp critical
        {
            while (data.size() > 1)
            {
                int x = data.front(); data.pop();
                int y = data.front(); data.pop();
                data.push(x + y);
            }
        }
    }
    cout << "Array of elements: ";
    for (int i = 0; i < arr_size; i++)
        cout << arr[i] << " ";
    cout << "\nSum: " << data.front() << endl;
    return 0;
}
OUTPUT:
Array of elements: 3 1 2 5 4 0
Sum: 15
RESULT:
AIM:
To create a simple program demonstrating message-passing logic using OpenMP together with MPI.
ALGORITHM:
Step 1: Start
Step 2: Creation of a simple program demonstrating message-passing logic.
Step 3: Create the message to be transmitted between the processes.
Step 4: Input the message.
Step 5: Process and print the result
Step 6: Stop
THEORY:
OpenMP is primarily designed for shared memory parallel programming,
where threads share memory and work cooperatively on a shared task.
Message passing, on the other hand, is typically associated with
distributed memory systems, where processes communicate by sending
and receiving messages. While OpenMP does not natively support
message passing,
it is possible to use OpenMP in conjunction with a message passing
interface (MPI) library to achieve a hybrid parallel programming model.
In this approach, OpenMP can be used to parallelize code within each MPI
process, while MPI handles communication between processes.
In this program, we have a hybrid parallel programming model where we
use OpenMP within each MPI process.
Each process will have its own set of threads that can execute code in
parallel. Inside the parallel region defined by the #pragma omp parallel
directive, we use OpenMP to parallelize code within each MPI process.
Each thread prints a hello message, including the process rank and the
thread ID. To compile and run this program, you'll need to have both
OpenMP and MPI installed and properly configured on your system.
The compilation command may vary depending on the MPI
implementation you are using.
For example, using the mpicc compiler wrapper around GCC, you can compile it
with the following command: mpicc -fopenmp -o gfg filename.c
and run it with: mpirun -np 4 ./gfg
PROGRAM:
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
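The listing above shows only the headers; a minimal sketch of the hybrid MPI + OpenMP hello program described in the theory might look like the following (compiled with mpicc -fopenmp as noted above):
#include <stdio.h>
#include <mpi.h>
#include <omp.h>
int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);                  // start the MPI runtime
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    // rank of this process
    MPI_Comm_size(MPI_COMM_WORLD, &size);    // total number of processes
    // OpenMP parallel region inside each MPI process
    #pragma omp parallel
    {
        printf("Hello World from thread %d of process %d (out of %d processes)\n",
               omp_get_thread_num(), rank, size);
    }
    MPI_Finalize();
    return 0;
}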
OUTPUT:
Hello World
RESULT:
AIM:
To implement the all-pairs shortest path problem (Floyd's algorithm) using OpenMP.
ALGORITHM:
Step 1: Start
Step 2: Get the cost matrix of the graph as input.
Step 3: Process the graph and compute the shortest path between every pair of vertices.
THEORY:
In this program, we have a graph represented by an adjacency matrix
graph. The number of vertices is num_vertices, and the value INF
represents infinity or an unreachable path.
Inside the parallel region defined by the #pragma omp parallel
directive, we use OpenMP to parallelize the outer loop of Floyd's
algorithm, where k represents the intermediate vertex.
The #pragma omp for directive distributes the iterations of the loop
among the available threads, with a dynamic scheduling policy.
Each thread updates a subset of the graph's elements based on the k
intermediate vertex. The innermost loop checks if the path from
vertex i to vertex j through vertex k is shorter than the current
distance.
If so, the distance is updated. After the parallel region, we print the
shortest path distances.
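The relaxation step described above can be sketched as a small routine (relax_through is an illustrative name; the PROGRAM below expresses the same update with a min() macro):
#include <omp.h>
// One step of Floyd's algorithm: relax every pair (i, j) through vertex k.
// dist is an n x n matrix stored row by row; the INF value used for
// unreachable pairs must be small enough that INF + INF does not overflow.
void relax_through(int *dist, int n, int k)
{
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (dist[i * n + k] + dist[k * n + j] < dist[i * n + j])
                dist[i * n + j] = dist[i * n + k] + dist[k * n + j];
}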
To compile and run this program with OpenMP support, you can use the GCC
compiler with the following command: gcc -o gfg -fopenmp filename.c
When you run the program, you will see the shortest path distances printed:
Shortest Path Distances:
0 5 8 9
INF 0 3 4
INF INF 0 1
INF INF INF 0
PROGRAM:
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <time.h>
#include <omp.h>
#define N 1200
// Define minimum function that will be used later on to calculate the
// minimum value between two numbers
#ifndef min
#define min(a,b) (((a) < (b)) ? (a) : (b))
#endif
static int distance_matrix[N][N];
int main(void)
{
    int nthreads, src, dst, middle;
    srand(time(NULL));
    // Repeat the computation for 1, 2, 4 and 8 threads to compare timings
    for (nthreads = 1; nthreads <= 8; nthreads *= 2)
    {
        // Initialise the distance matrix with random edge weights
        for (src = 0; src < N; src++)
        {
            for (dst = 0; dst < N; dst++)
            {
                distance_matrix[src][dst] = (src == dst) ? 0 : (rand() % 100) + 1;
            }
        }
        omp_set_num_threads(nthreads);
        double start_time = omp_get_wtime();
        // Floyd's algorithm: relax every pair (src, dst) through each middle vertex
        for (middle = 0; middle < N; middle++)
        {
            int *dm = distance_matrix[middle];
            #pragma omp parallel for private(src, dst) schedule(dynamic)
            for (src = 0; src < N; src++)
            {
                int *ds = distance_matrix[src];
                for (dst = 0; dst < N; dst++)
                {
                    ds[dst] = min(ds[dst], ds[middle] + dm[dst]);
                }
            }
        }
        double time = omp_get_wtime() - start_time;
        printf("Total time for thread %d (in sec):%.2f\n", nthreads, time);
    }
    return 0;
}
Input:
The cost matrix of the graph.
0 3 6 ∞ ∞ ∞ ∞
3 0 2 1 ∞ ∞ ∞
6 2 0 1 4 2 ∞
∞ 1 1 0 2 ∞ 4
∞ ∞ 4 2 0 2 1
∞ ∞ 2 ∞ 2 0 1
∞ ∞ ∞ 4 1 1 0
Output:
Matrix of all pair shortest path.
0 3 4 5 6 7 7
3 0 2 1 3 4 4
4 2 0 1 3 2 3
5 1 1 0 2 3 3
6 3 3 2 0 2 1
7 4 2 3 2 0 1
7 4 3 3 1 1 0
RESULT:
AIM:
To implement a program that estimates the value of Pi using Monte Carlo methods in OpenMP.
ALGORITHM:
Step 1: Start
Step 2: Generate the required number of random points.
Step 3: Process them using Monte Carlo methods in OpenMP to estimate the value of Pi.
THEORY:
In this program, we use Monte Carlo simulation to estimate the value of Pi.
We generate random (x, y) coordinates within the unit square and check if
the point is inside the unit circle.
By counting the number of points inside the circle, we can estimate the
value of Pi.
Inside the parallel region defined by the #pragma omp parallel directive, we
perform the Monte Carlo simulation in parallel using OpenMP.
The #pragma omp for directive distributes the iterations of the loop among
the available threads, and the reduction(+:count) clause ensures that the
count variable is correctly updated in a thread-safe manner.
Each thread generates its own set of random (x, y) coordinates using a
random number generator that produces values between 0.0 and 1.0.
Then, the thread checks if the point is inside the unit circle and updates the
count variable accordingly.
After the parallel region, we estimate the value of Pi by dividing the total
count of points inside the circle by the total number of points and
multiplying by 4.
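A compact sketch of the estimation loop as just described, with the shared count protected by a reduction clause (rand_r() with a per-thread seed stands in here for the generator mentioned above):
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
int main(void)
{
    const int total = 1000000;
    int count = 0;   // points that fall inside the unit circle
    #pragma omp parallel reduction(+ : count)
    {
        unsigned int seed = 1234u + omp_get_thread_num();  // per-thread seed
        #pragma omp for
        for (int i = 0; i < total; i++)
        {
            double x = (double)rand_r(&seed) / RAND_MAX;   // in [0, 1]
            double y = (double)rand_r(&seed) / RAND_MAX;
            if (x * x + y * y <= 1.0)
                count++;               // reduction makes this thread-safe
        }
    }
    printf("Estimated Pi = %f\n", 4.0 * count / total);
    return 0;
}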
To compile and run this program with OpenMP support, you can use the GCC
compiler with the following command: gcc -o gfg -fopenmp filename.c
PROGRAM:
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
// Estimates the value of PI using N random points and K threads
void monteCarlo(int N, int K)
{
    int pCircle = 0;  // Points falling inside the unit circle
    int pSquare = 0;  // Total points generated inside the unit square
    omp_set_num_threads(K);
    // The reduction clause combines the per-thread counts safely
    #pragma omp parallel reduction(+ : pCircle, pSquare)
    {
        // Per-thread seed so that rand_r() is thread-safe
        unsigned int seed = (unsigned int)time(NULL) ^ omp_get_thread_num();
        #pragma omp for
        for (int i = 0; i < N; i++)
        {
            // Random (x, y) coordinates in [0, 1]
            double x = (double)rand_r(&seed) / RAND_MAX;
            double y = (double)rand_r(&seed) / RAND_MAX;
            double d = x * x + y * y;
            // If d is less than or equal to 1
            if (d <= 1)
            {
                // Increment pCircle by 1
                pCircle++;
            }
            // Increment pSquare by 1
            pSquare++;
        }
    }
    // Stores the estimated value of PI
    double pi = 4.0 * ((double)pCircle / (double)(pSquare));
    printf("Estimated value of PI = %lf\n", pi);
}
// Driver Code
int main()
{
    // Input
    int N = 100000;
    int K = 8;
    // Function call
    monteCarlo(N, K);
    return 0;
}
OUTPUT:
RESULT:
EX:NO:7 MPI-BROADCAST-AND-COLLECTIVE-COMMUNICATION
DATE:
AIM:
To write a program to demonstrate MPI broadcast and collective communication.
ALGORITHM:
Step 1: Start
Step 2: Broadcast the data from the root process to all processes using MPI_Bcast.
Step 3: Gather the data from all processes using MPI_Allgather and compute the sum.
Step 4: Stop
THEORY:
Each process has its own rank and can communicate with other processes.
First, we initialize MPI and obtain the rank and size of the
MPI_COMM_WORLD communicator.
The rank 0 process initializes the data variable with a value of 123. Then, we
use the MPI_Bcast function to broadcast the data from rank 0 to all other
processes in MPI_COMM_WORLD.
The MPI_Bcast function takes the address of the data, the count, the data type,
the root process (in this case, 0), and the communicator.
All processes, including rank 0, print the received data using printf.
Next, we perform an all-gather operation using the MPI_Allgather function.
This function gathers data from all processes and distributes it to all
processes.
In this example, we gather the data variable from each process into the
recv_buf array.
The MPI_Allgather function takes the address of the data, the send count,
the data type, the receive buffer, the receive count, the data type, and the
communicator.
After the collective communication, each process calculates the sum of all the
received values.
To compile and run this program with MPI support, you can use the following
commands: mpicc -o gfg filename.c and mpirun -np 4 ./gfg
When you run the program, each process will print the received data, and the
root process (rank 0) will also print the sum of all the received data.
For example, if you run the program with four processes, you may see output like
the following:
PROGRAM:
#include<mpi.h>
#include<stdio.h>
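Only the headers are shown above; a minimal sketch of the broadcast-and-gather flow described in the theory (the value 123 broadcast from rank 0, followed by MPI_Allgather and a sum on every process) might look like this:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int main(int argc, char *argv[])
{
    int rank, size, data = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (rank == 0)
        data = 123;                       // root initialises the value
    // Broadcast data from rank 0 to every process in MPI_COMM_WORLD
    MPI_Bcast(&data, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Process %d received data = %d\n", rank, data);
    // Gather each process's data into recv_buf on every process
    int *recv_buf = malloc(size * sizeof(int));
    MPI_Allgather(&data, 1, MPI_INT, recv_buf, 1, MPI_INT, MPI_COMM_WORLD);
    // Each process sums everything it gathered
    int sum = 0;
    for (int i = 0; i < size; i++)
        sum += recv_buf[i];
    if (rank == 0)
        printf("Sum of gathered data = %d\n", sum);
    free(recv_buf);
    MPI_Finalize();
    return 0;
}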
OUTPUT:
RESULT:
EX:NO:8 MPI-SCATTER-GATHER-AND-ALLGATHER
DATE:
AIM:
To write a program demonstrating the MPI scatter, gather and all-gather collective operations.
ALGORITHM:
Step 1: Start
Step 2: Scatter the data from the root process, compute the partial averages, and gather the results using MPI_Gather and MPI_Allgather.
Step 3: Stop
THEORY:
Each process has its own rank and can communicate with other processes.
First, we initialize MPI and obtain the rank and size of the
MPI_COMM_WORLD communicator.
In the root process (rank 0), we initialize the sendbuf array with values from
1 to 12.
We then use the MPI_Scatter function to scatter the data from the root
process to all the other processes.
The root process provides the sendbuf array as the send buffer, and each
process receives its own portion of the data into the recvbuf array.
The MPI_Scatter function takes the send buffer, send count, send type,
receive buffer, receive count, receive type, root process (in this case, 0), and
the communicator.
Next, we use the MPI_Gather function to gather the data from all processes
back to the root process.
Each process provides its portion of the data from the recvbuf array, and
the root process gathers the data into the sendbuf array.
The MPI_Gather function takes the send buffer, send count, send type,
receive buffer, receive count, receive type, root process, and the
communicator.
Finally, the MPI_Allgather function performs an all-gather
operation, where all processes gather the data from all processes into their
own receive buffers.
The MPI_Allgather function takes the send buffer, send count, send type,
receive buffer, receive count, receive type, and the communicator.
To compile and run this program with MPI support, you can use the
following commands: mpicc -o gfg filename.c and mpirun -np 4 ./gfg 100
(where 100 is the number of elements per process).
When you run the program, each process will print the received data, the
root process will print the gathered data, and each process will print the
average it computes.
PROGRAM:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <mpi.h>
#include <assert.h>
// Creates an array of random numbers. Each number has a value from 0 - 1
float *create_rand_nums(int num_elements) {
  float *rand_nums = (float *)malloc(sizeof(float) * num_elements);
  assert(rand_nums != NULL);
  int i;
  for (i = 0; i < num_elements; i++) {
    rand_nums[i] = (rand() / (float)RAND_MAX);
  }
  return rand_nums;
}
// Computes the average of an array of numbers
float compute_avg(float *array, int num_elements) {
  float sum = 0.f;
  int i;
  for (i = 0; i < num_elements; i++) {
    sum += array[i];
  }
  return sum / num_elements;
}
int main(int argc, char **argv) {
  if (argc != 2) {
    fprintf(stderr, "Usage: %s num_elements_per_proc\n", argv[0]);
    exit(1);
  }
  int num_elements_per_proc = atoi(argv[1]);
  srand(time(NULL));
  MPI_Init(NULL, NULL);
  int world_rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
  int world_size;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);
  // Create a random array of elements on the root process
  float *rand_nums = NULL;
  if (world_rank == 0)
  {
    rand_nums = create_rand_nums(num_elements_per_proc * world_size);
  }
  // For each process, create a buffer that will hold a subset of the entire
  // array
  float *sub_rand_nums = (float *)malloc(sizeof(float) * num_elements_per_proc);
  assert(sub_rand_nums != NULL);
  // Scatter the random numbers from the root process to all processes in
  // the MPI world
  MPI_Scatter(rand_nums, num_elements_per_proc, MPI_FLOAT, sub_rand_nums,
              num_elements_per_proc, MPI_FLOAT, 0, MPI_COMM_WORLD);
  // Compute the average of the local subset on every process
  float sub_avg = compute_avg(sub_rand_nums, num_elements_per_proc);
  // Gather all partial averages to every process
  float *sub_avgs = (float *)malloc(sizeof(float) * world_size);
  assert(sub_avgs != NULL);
  MPI_Allgather(&sub_avg, 1, MPI_FLOAT, sub_avgs, 1, MPI_FLOAT,
                MPI_COMM_WORLD);
  // Average of the gathered partial averages
  float avg = compute_avg(sub_avgs, world_size);
  printf("Avg of all elements from proc %d is %f\n", world_rank, avg);
  // Clean up
  if (world_rank == 0)
  {
    free(rand_nums);
  }
  free(sub_avgs);
  free(sub_rand_nums);
  MPI_Barrier(MPI_COMM_WORLD);
  MPI_Finalize();
}
OUTPUT:
RESULT:
EX:NO:9 MPI-SEND-AND-RECEIVE
DATE:
AIM:
To create a program to demonstrate MPI send and receive operations.
ALGORITHM:
Step 1: Start
Step 2: Create a program to demonstrate MPI send and receive.
Step 3: Stop
THEORY:
MPI_Send and MPI_Recv provide point-to-point communication between two
processes. In this program, the process with rank 0 allocates an array of 10
integers and sends it to the process with rank 1 using MPI_Send, specifying the
buffer, count, data type, destination rank, tag, and communicator. The process
with rank 1 receives the array with MPI_Recv, which additionally takes a status
object. Both calls are blocking, so the receive completes only after a matching
send has been posted.
To compile and run this program with MPI support, you can use the following
commands: mpicc -o gfg filename.c and mpirun -np 2 ./gfg
PROGRAM:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int main(int argc, char *argv[])
{
    int rank, tag = 0;
    int *array = NULL;
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
    {
        array = malloc(10 * sizeof(int)); // Array of 10 elements
        if (!array) // error checking
        {
            MPI_Abort(MPI_COMM_WORLD, 1);
        }
        MPI_Send(array, 10, MPI_INT, 1, tag, MPI_COMM_WORLD);
    }
    if (rank == 1)
    {
        array = malloc(10 * sizeof(int)); // Receive buffer of 10 elements
        MPI_Recv(array, 10, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
        // more code here
    }
    MPI_Finalize();
    return 0;
}
OUTPUT:
RESULT:
EX:NO:10 PARALLEL-RANK-WITH-MPI
DATE :
AIM:
To write a program to compute the parallel rank of each process using MPI.
ALGORITHM:
Step 1: Start
Step 2: We have multiple processes executing in parallel using MPI.
Step 3: First, we initialize MPI and obtain the rank and size of the
MPI_COMM_WORLD communicator.
Step 4: Each process then prints its rank and size using printf. The %d format
specifier is used to print integer values.
Step 5: Finally, we finalize MPI and clean up resources using MPI_Finalize().
Step 6: Stop
THEORY:
In this program, we have multiple processes executing in parallel using MPI.
Each process has its own rank and can communicate with other processes.
First, we initialize MPI and obtain the rank and size of the
MPI_COMM_WORLD communicator.
Each process then prints its rank and size using printf. The %d format
specifier is used to print integer values.
Finally, we finalize MPI and clean up resources using MPI_Finalize().
To compile and run this program with MPI support, you can use the
following command: mpicc -o gfg filename.c
For execution, use the following command: mpirun -np 4 ./gfg
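The theory above describes the simplest case, in which every process just reports its rank and the communicator size; a minimal sketch of that program is shown here, while the PROGRAM listing below goes one step further and ranks a random number on each process using the TMPI_Rank helper from tmpi_rank.h:
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  // rank of this process
    MPI_Comm_size(MPI_COMM_WORLD, &size);  // number of processes
    printf("Process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}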
PROGRAM:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include "tmpi_rank.h"
#include <time.h>
int main(int argc, char **argv) {
  MPI_Init(NULL, NULL);
  int world_rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
  int world_size;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);
  // Seed the random number generator to get different results each time
  srand(time(NULL) * world_rank);
  float rand_num = rand() / (float)RAND_MAX;
  // Compute the parallel rank of this process's random number.
  // TMPI_Rank is assumed to be the helper declared in tmpi_rank.h, taking
  // (send_data, recv_data, datatype, communicator).
  int rank;
  TMPI_Rank(&rand_num, &rank, MPI_FLOAT, MPI_COMM_WORLD);
  printf("Rank for %f on process %d - %d\n", rand_num, world_rank, rank);
  MPI_Barrier(MPI_COMM_WORLD);
  MPI_Finalize();
}
OUTPUT:
RESULT: