CSC 429 Mid Fall 14


CSC429 Fall 2014 Name:

CSC429 Midterm Exam

Solve all the problems in the blue book provided.


Duration: 1 hour 55 minutes
Read the Problems CAREFULLY!
100 points total

ASSUMPTIONS: Throughout the exam, unless otherwise specified, you may assume that n is a
multiple of p, and p is a power of two. You may assume that it takes one operation to compute
a+b and one operation to compute a*b (the product of a and b), given a and b.

Problem 1 Multiple choice (18 points, 3 points each):

1. In the ________ PRAM, processors must write the same value into the shared memory
cell.

A. Arbitrary
B. Common
C. Priority
D. Combining

2. Which of the following PRAM models is the most powerful? ___________

A. EREW
B. CREW
C. ERCW
D. CRCW

3. Which of the following is not a fixed (static) connection network? ___________

A. Star-Connected Network
B. Crossbar Network
C. 3D-mesh Network
D. Hypercube Network

4. In the completely connected network with N nodes, there are ________ links (edges)
in total.

A. N
B. 2N
C. N(N-1)/2
D. N^2

5. What is the diameter of a hypercube network with 64 processors? __________

A. 6
B. 8
C. 12
D. 16

6. Consider the problem of sorting n input numbers. Bubble Sort runs in approximately n^2
steps and Heap Sort (one of the best sequential sorting algorithms) runs in approximately
n log n steps. With n processing nodes, the Odd-even Transposition Sort algorithm runs in
approximately n steps. What is the speedup of the Odd-even Transposition Sort algorithm?
__________

A. n
B. log n
C. n/log n
D. none of the above

Problem 2 (10 points)


Explain the dichotomy of parallel computing platforms based on control structure (Flynn’s
taxonomy of computer architectures). Vector Processing applies a single instruction to multiple
data sets. Which structure does Vector Processing belong to? In which structure may different
processors be executing different instructions on different pieces of data at any time?

Problem 3 (10 points)


What is the structure of a complete omega network? Please draw a figure of a complete omega
network connecting four inputs and four outputs.

Problem 4 (8 points)
What is the output of the following MPI program? (You may assume any execution order of
processes.)

#include <stdio.h>
#include <mpi.h>
int main(int argc, char **argv) {
    int nprocs, mypid, myval, i, total;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &mypid);
    printf("Hello world from process %d of total %d\n", mypid, nprocs);
    if (mypid == 0)
        myval = 15;
    MPI_Bcast(&myval, 1, MPI_INT, 0, MPI_COMM_WORLD);
    myval = myval + mypid;
    printf("Process %d: myvalue is %d. \n", mypid, myval);
    MPI_Finalize();
    return 0;
}

Problem 5 Programming and short-answer questions (20 points)

A programmer needs to write a program such that (1) each process computes and prints the
product of its own part of the numbers; (2) every process except process 0 sends its product to
process 0 and prints a message saying so; and (3) after process 0 receives the products from the
other processes, it prints a confirmation for each value received, multiplies all the products
together (including its own), and prints the product of all the numbers, 1*2*3*…*1000 (that is,
1000!). Note that he is going to request 10 processing nodes for his program.

Note: You may use the following MPI functions:

• int MPI_Init(int *argc, char ***argv);
• int MPI_Finalize(void);
• int MPI_Comm_size(MPI_Comm comm, int *size);
• int MPI_Comm_rank(MPI_Comm comm, int *rank);
• int MPI_Send(void *buf, int count, MPI_Datatype dtype, int dest, int tag, MPI_Comm comm);
• int MPI_Recv(void *buf, int count, MPI_Datatype dtype, int src, int tag, MPI_Comm comm, MPI_Status *stat);
• int MPI_Isend(void *buf, int count, MPI_Datatype dtype, int dest, int tag, MPI_Comm comm, MPI_Request *req);
• int MPI_Irecv(void *buf, int count, MPI_Datatype dtype, int src, int tag, MPI_Comm comm, MPI_Request *req);
• int MPI_Wait(MPI_Request *preq, MPI_Status *stat);

1. (15 points) The programmer has written the partial program parallelProduct.c shown below.
Please help him complete the program.

// include the header file (1 point)

#define N 1000
int main(int argc, char **argv)
{
    int nprocs, mypid, i, myproduct = 1;
    int total;
    MPI_Status status;

    // initialize the MPI environment (2 points)

    // find the process id (2 points)

    // find the total number of processes (2 points)

    // compute the product of this process's part of the numbers (3 points)

    printf("Product from process %d of total %d is : %d. \n", mypid, nprocs, myproduct);
    if (mypid != 0)
    {
        // send the product to process 0 (2 points)

        printf("Send %d to process 0 by process %d\n", myproduct, mypid);
    }
    else
    {
        total = myproduct;
        for (i = 1; i < nprocs; i++)
        {
            // receive the product from process i (2 points)

            printf("Receive %d from process %d\n", myproduct, i);
            total *= myproduct;
        }
        printf("The product of all numbers is %d\n", total);
    }

    // finalize the MPI environment (1 point)

    return 0;
}

2. (2 points) The programmer opened the job script file, which is shown below. Please help
him revise the script so that it is correct for his program.

#!/bin/bash
#PBS -N job4
#PBS -q production
#PBS -l select=8:ncpus=1
#PBS -l place=free
#PBS -V

cd $PBS_O_WORKDIR

mpirun -np 8 -machinefile $PBS_NODEFILE ./a.out

3. (2 points) What is the command to submit the job?

4. (1 point) What is the command to check the status of this job?

Problem 6 (34 points)


Input: A p-processor EREW PRAM, n numbers x_1, x_2, …, x_n, and an associative operator +.

Output: The parallel sum x = x_1 + x_2 + … + x_n.

(a) Assume p >= n. Give or cite an efficient EREW PRAM algorithm that finds the parallel sum x. (10 points)
(b) Based on (a), what are the parallel running time, speedup, efficiency, and cost of your proposed algorithm? Is this algorithm cost-optimal? (7 points)
(c) Assume p < n. Give an EREW PRAM algorithm that finds the parallel sum x in at most n/p + lg p steps. (10 points)
(d) Based on (c), what are the speedup and efficiency of your algorithm? (3 points)
(e) Based on (d), for which values of p (in terms of n) is the speedup of the proposed algorithm Θ(p)? And for which values of p (in terms of n) is the algorithm cost-optimal? (4 points)
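
For reference only, and not as the official answer key, the step bound in part (c) can be illustrated with a sequential C simulation of one possible approach: each of the p simulated processors first sums its own block of n/p numbers, and the p partial sums are then combined by a binary-tree reduction. The function name pram_sum, the MAX_P limit, and the step counter are hypothetical and were introduced for this sketch; it assumes, as stated in the exam's ASSUMPTIONS, that n is a multiple of p and p is a power of two.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_P 1024  /* hypothetical upper limit on simulated processors */

/* Hypothetical helper: simulates a p-processor EREW PRAM sum on one CPU.
 * Assumes p is a power of two and n is a multiple of p.
 * If steps is non-NULL, it receives the number of parallel steps counted. */
long pram_sum(const long *x, size_t n, size_t p, size_t *steps)
{
    long partial[MAX_P];
    size_t block, count = 0;

    assert(p > 0 && p <= MAX_P && n >= p && n % p == 0);
    block = n / p;

    /* Phase 1: processor i sums its own block of n/p numbers. */
    for (size_t i = 0; i < p; i++) {
        partial[i] = x[i * block];
        for (size_t j = 1; j < block; j++)
            partial[i] += x[i * block + j];
    }
    count += block - 1;  /* n/p - 1 additions, done by all processors at once */

    /* Phase 2: tree reduction. In each round, every memory cell is read and
     * written by at most one processor, so no concurrent access is needed. */
    for (size_t stride = 1; stride < p; stride *= 2) {
        for (size_t i = 0; i + stride < p; i += 2 * stride)
            partial[i] += partial[i + stride];
        count++;  /* one parallel step per round, lg p rounds in total */
    }

    if (steps)
        *steps = count;
    return partial[0];
}
```

With n = 16 and p = 4, the simulation counts (16/4 - 1) + lg 4 = 5 parallel steps, which is within the n/p + lg p bound. With p = n, phase 1 does no additions and only the lg n tree rounds remain.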

Extra Credit (15 points)


Input: A p-processor CRCW PRAM, and two integers x and n (Note: p >= n).
Output: The value of x^n.
(1) If n is a power of 2, how fast can you calculate the value of x^n on a single processor? Can
you calculate it faster with p processors? (10 points)
(2) If n is not a power of 2, how fast can you calculate the value of x^n on a single processor?
Can you calculate it faster with p processors? (10 points)
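
For reference only, and not as the official answer key, the single-processor half of these questions can be explored with repeated squaring, sketched below in C. The function name power_by_squaring and the multiplication counter are hypothetical and were introduced for this sketch; overflow is ignored, so it is only meaningful for results that fit in an unsigned long long. The sketch does not address the p-processor case.

```c
/* Hypothetical helper: computes x^n by repeated squaring on one processor.
 * If mults is non-NULL, it receives the number of multiplications performed
 * (including the first, trivial multiplication by 1). Overflow is ignored. */
unsigned long long power_by_squaring(unsigned long long x, unsigned n,
                                     unsigned *mults)
{
    unsigned long long result = 1, base = x;
    unsigned count = 0;

    while (n > 0) {
        if (n & 1u) {        /* current binary digit of n is 1 */
            result *= base;
            count++;
        }
        n >>= 1;
        if (n > 0) {         /* square only if more digits remain */
            base *= base;
            count++;
        }
    }

    if (mults)
        *mults = count;
    return result;
}
```

For example, power_by_squaring(3, 8, &m) returns 6561 after 4 multiplications, and power_by_squaring(2, 10, &m) returns 1024 after 5. In general the count grows with the number of binary digits of n rather than with n itself, whether or not n is a power of 2.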

Note: Your final grade for this midterm is min{100, (your grade on Problems 1-6) + (your grade
on extra credit)}

THE END OF EXAM
