Tutorial Presentation 8

This document provides an introduction to OpenMP, an application programming interface used to explicitly direct multi-threaded, shared-memory parallelism. It discusses how chip manufacturers are moving to multi-core CPUs, OpenMP's shared memory model, the fork-join execution model, key components of the OpenMP API including compiler directives and runtime routines, how variables can be classified as private or shared, examples of work-sharing constructs such as parallel loops, and different scheduling strategies for loop iterations.

OpenMP

Arash Bakhtiari
bakhtiar@in.tum.de

2012-12-18 Tue

Introduction

- Chip manufacturers are rapidly moving to multi-core CPUs.

Figure: Quad-core processor, Intel Sandy Bridge

Shared Memory Model

- All processors can access all memory in a global address space.
- Threads model: a single process can have multiple, concurrent execution paths.
- On a multi-core system, the threads run at the same time, with each core running a particular thread or task.

Figure: Shared Memory Model [1]

What is OpenMP?

- An Application Program Interface (API)
- Used to explicitly direct multi-threaded, shared memory parallelism
- Provides a portable, scalable model
- Supports C/C++ and Fortran on a wide variety of architectures

Fork-Join Model

- An OpenMP program starts as a single thread (the master).
- Additional threads (the team) are created when the master hits a parallel region.
- When all threads have finished the parallel region, the extra threads are given back to the runtime or operating system.
- The master continues after the parallel region.

Fork-Join Model (cont.)

Figure: Fork-Join Model [1]

OpenMP API
Primary API components:

- Compiler directives:

  #pragma omp parallel

- Run-time library routines:

  int omp_get_num_threads(void);

- Environment variables:

  export OMP_NUM_THREADS=2

Example
Listing 1: OpenMP Hello World!

#include <iostream>
#include <omp.h>

int main(int argc, char *argv[])
{
#pragma omp parallel
    {
        std::cout << "THREAD: " << omp_get_thread_num() << "\tHello, World!\n";
    }
    return 0;
}

Listing 2: Compiling

g++ -o hello hello.c -fopenmp

Classification of Variables

- private(var-list): Variables in var-list are private to each thread.
- shared(var-list): Variables in var-list are shared among all threads.
- default(private | shared | none): Sets the default for all variables in this region.

Example
Listing 3: OpenMP Private Variable

#include <iostream>
#include <omp.h>

int main(int argc, char *argv[])
{
    int i, j;
    i = 1;
    j = 2;
    std::cout << "BEFORE: i,j= " << i << "," << j << std::endl;
#pragma omp parallel private(i)
    {
        i = 3;
        j = 5;
        std::cout << "INLOOP: i,j= " << i << "," << j << std::endl;
    }
    std::cout << "AFTER:  i,j= " << i << "," << j << std::endl;
    return 0;
}

Work-Sharing Constructs

- Work-sharing constructs distribute the specified work to all threads within the current team.
- Types:
  - Parallel loop
  - Parallel section
  - Master region
  - Single region

Parallel Loop

- Syntax:

  #pragma omp for [clause ...]

- The iterations of the loop are distributed to the threads.
- The scheduling of loop iterations: static, dynamic, guided, and runtime.

Scheduling Strategies

- Schedule clause:

  schedule(type [, size])

- static: Chunks of the specified size are assigned in a round-robin fashion to the threads.
- dynamic: The iterations are broken into chunks of the specified size. When a thread finishes the execution of a chunk, the next chunk is assigned to that thread.
- guided: Similar to dynamic, but the size of the chunks is exponentially decreasing. The size parameter specifies the smallest chunk. The size of the initial chunk is implementation dependent.
- runtime: The scheduling type and the chunk size are determined via environment variables.

Example
Listing 4: OpenMP Parallel Loop with Dynamic Scheduling

#include <iostream>
#include <omp.h>
#include <cstdlib>
#include <ctime>

#define CHUNKSIZE 100
#define N         1000

int main()
{
    int i, chunk;
    double a[N], b[N], c[N];
    srand(time(NULL));
    for (i = 0; i < N; i++) {
        a[i] = generate_random_double(0.0, 10.0);   // user-defined helper
        b[i] = generate_random_double(0.0, 10.0);
    }
    chunk = CHUNKSIZE;
#pragma omp parallel shared(a, b, c, chunk) private(i)
    {
#pragma omp for schedule(dynamic, chunk) nowait
        for (i = 0; i < N; i++)
            c[i] = a[i] + b[i];
    }
    return 0;
}

References

[1] Blaise Barney, Lawrence Livermore National Laboratory, "OpenMP Tutorial", https://computing.llnl.gov/tutorials/openMP/
