Nscet E-Learning Presentation: Listen Learn Lead
E-LEARNING
PRESENTATION
LISTEN … LEARN… LEAD…
COMPUTER SCIENCE AND ENGINEERING
P.MAHALAKSHMI, M.E., MISTE
ASSISTANT PROFESSOR
Nadar Saraswathi College of Engineering & Technology,
Vadapudupatti, Annanji (po), Theni – 625531.
UNIT IV
DISTRIBUTED MEMORY
PROGRAMMING WITH MPI
Introduction
Distributed Memory
A distributed-memory system consists of a collection of core-memory pairs connected by a
network, and the memory associated with a core is directly accessible only to that core.
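As a minimal illustration (a sketch, not from the slides), each MPI process owns a private copy of every variable; data can move between processes only through explicit message passing:

```c
/* Sketch: each process has its own private memory. The variable x below
   exists once per process, with a different value in each. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
    int rank, comm_sz;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's id       */
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz); /* total number of processes */

    int x = rank * 10;  /* lives only in this process's private memory */
    printf("Process %d of %d: local x = %d\n", rank, comm_sz, x);

    MPI_Finalize();
    return 0;
}
```

No process can read another's x directly; it would have to be sent and received through the network.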
Syntax
int MPI_Bcast(
    void*        data          /* in/out */,
    int          count         /* in     */,
    MPI_Datatype datatype      /* in     */,
    int          root          /* in     */,
    MPI_Comm     communicator  /* in     */);
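A hedged usage sketch (the variable names are illustrative): the root process, here rank 0, holds the value, and after the call every process in the communicator has its own copy:

```c
/* Sketch: broadcast a single double from rank 0 to all processes. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
    int rank;
    double data = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        data = 3.14;  /* only the root has the value initially */

    /* After this call, every process's copy of data holds 3.14. */
    MPI_Bcast(&data, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    printf("Process %d: data = %.2f\n", rank, data);
    MPI_Finalize();
    return 0;
}
```

Note that MPI_Bcast is called by every process, not just the root; the root argument tells the call which process is the sender.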
For example, given two doubles a and b and an int n, the following derived datatype could represent these three data items:
{(MPI_DOUBLE, 0),(MPI_DOUBLE, 16),(MPI_INT, 24)}.
The first element of each pair corresponds to the type of the data, and the second element
of each pair is the displacement of the data element from the beginning of the type. We’ve
assumed that the type begins with a, so it has displacement 0, and the other elements have
displacements measured, in bytes, from a: b is 40 - 24 = 16 bytes beyond the start of a,
and n is 48 - 24 = 24 bytes beyond the start of a.
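Such a type could be built with MPI_Get_address and MPI_Type_create_struct. This is a sketch (the function name build_mpi_type and the variables a, b, n follow the example above and are otherwise illustrative):

```c
#include <mpi.h>

/* Sketch: build a derived datatype describing one double (a), another
   double (b), and an int (n), with displacements measured from a. */
void build_mpi_type(double* a, double* b, int* n, MPI_Datatype* new_type) {
    int          block_lengths[3] = {1, 1, 1};
    MPI_Datatype types[3] = {MPI_DOUBLE, MPI_DOUBLE, MPI_INT};
    MPI_Aint     displacements[3];
    MPI_Aint     a_addr, b_addr, n_addr;

    /* Compute displacements of b and n relative to a, exactly as in the
       example: e.g. 40 - 24 = 16 and 48 - 24 = 24 bytes. */
    MPI_Get_address(a, &a_addr);
    MPI_Get_address(b, &b_addr);
    MPI_Get_address(n, &n_addr);
    displacements[0] = 0;
    displacements[1] = b_addr - a_addr;
    displacements[2] = n_addr - a_addr;

    MPI_Type_create_struct(3, block_lengths, displacements, types, new_type);
    MPI_Type_commit(new_type);  /* commit before using in communication */
}
```

Once committed, the new type can be passed to a call such as MPI_Bcast to transfer all three variables in a single message.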
comm_sz is the number of processes. The times for comm_sz = 1 are the run-times of the
serial program running on a single core of the distributed-memory system.
In MPI programs, the parallel overhead typically comes from communication, and it can
depend on both the problem size and the number of processes.
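This can be summarized with the standard decomposition (assuming the overhead term captures all communication and coordination costs):

```latex
T_{parallel}(n, p) = T_{serial}(n) / p + T_{overhead}(n, p)
```

When T_overhead is small relative to T_serial / p, the parallel run-time is close to ideal; as p grows or n shrinks, the overhead term dominates.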
However, the parallel program also needs to complete a call to MPI_Allgather before it can carry out the local matrix-vector multiplication, and this communication is a major source of the parallel overhead.
Recall that the most widely used measure of the relation between the serial and the parallel run-times is the speedup. It is simply the ratio of the serial run-time to the parallel run-time:

S(n, p) = T_serial(n) / T_parallel(n, p)
The ideal value for S(n, p) is p. If S(n, p) = p, then our parallel program
with comm_sz = p processes is running p times faster than the serial program. In
practice, this speedup, sometimes called linear speedup, is rarely achieved. Our
matrix-vector multiplication program got the speedups shown in Table. For small p and
large n, our program obtained nearly linear speedup. On the other hand, for large p and
small n, the speedup was considerably less than p. The worst case was n = 1024
and p = 16, when we only managed a speedup of 2.4.