
OpenMP

OpenMP is an API for shared-memory parallel programming in C, C++, and Fortran, utilizing compiler directives to parallelize code. It allows for thread-based parallelism, with features like controlling the number of threads and work-sharing constructs for loop iterations. While OpenMP simplifies parallel programming, it also has drawbacks such as reduced insight into parallelism and potential for incorrect directives leading to non-deterministic behavior.


Introduction to OpenMP and Matrix-Vector Multiplication
What is OpenMP?

•Definition: OpenMP (Open Multi-Processing) is an API for shared-memory parallel programming in C, C++, and Fortran.
•Key Features:
•Portable across different platforms.
•Uses compiler directives (#pragma omp) to parallelize code.
•Thread-based parallelism on shared-memory architectures.
Basic OpenMP Directive

•#pragma omp parallel
•Tells the compiler to create a team of threads.
•Each thread executes the code within the parallel region.
•Default behavior: All threads synchronize (implicit join) at the end of the region.
•Controlling the Number of Threads
•Environment variable: OMP_NUM_THREADS
•Code or library function overrides: omp_set_num_threads()
•Conditional parallelization: if (condition) clause in #pragma omp parallel
When you enter an OpenMP parallel region, the exact number of threads is decided
by several factors in the following order of precedence:
1.if Clause in the Parallel Directive
2.num_threads() Clause
3.omp_set_num_threads() Library Function
4.OMP_NUM_THREADS Environment Variable
5.Implementation Default
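A minimal sketch of these controls (our own illustration, not from the slides) is below; it combines the if clause, the num_threads clause, and omp_set_num_threads() in one program. The threshold n > 1000 and the thread counts are arbitrary.

```c
#include <omp.h>
#include <stdio.h>

int main(void) {
    int n = 5000;            /* arbitrary problem size for the if clause */

    omp_set_num_threads(8);  /* library call: overrides OMP_NUM_THREADS */

    /* num_threads(4) takes precedence over omp_set_num_threads(8);
       if n were <= 1000, the region would run with a single thread. */
    #pragma omp parallel if (n > 1000) num_threads(4)
    {
        printf("thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}
```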
OpenMP Execution Model

•Thread Team Creation
1.Master thread spawns additional threads to form a team.
2.Each thread gets an ID (0 to N-1).
3.The team executes the parallel region.
•Implicit Barrier
•By default, threads wait at the end of a parallel region (join).
•Can be modified with nowait in certain constructs (e.g., #pragma omp for nowait).
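A minimal fork-join illustration (our own sketch): the master thread has ID 0, the team executes the parallel region, and every thread waits at the implicit barrier before the master continues alone.

```c
#include <omp.h>
#include <stdio.h>

int main(void) {
    printf("before: only the master thread runs\n");

    #pragma omp parallel                 /* fork: master spawns a team */
    {
        int id = omp_get_thread_num();   /* IDs run from 0 to N-1 */
        printf("hello from thread %d\n", id);
    }                                    /* implicit barrier: threads join here */

    printf("after: back to the master thread alone\n");
    return 0;
}
```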
Environment Variables
•Common examples: OMP_NUM_THREADS (default team size) and OMP_SCHEDULE (the schedule used by loops declared schedule(runtime)).
Matrix-Vector Multiplication Example
OpenMP Directive Breakdown
Execution Flow Among Threads
Data Sharing and Synchronization
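The slides above (Matrix-Vector Multiplication Example through Data Sharing and Synchronization) walk through a worked example. As a representative sketch of y = A·x (our own reconstruction, not the original slide code), the directive breakdown and data-sharing choices appear as comments; the names A, x, y and the dimensions ROWS, COLS are illustrative.

```c
#include <omp.h>
#include <stdio.h>

#define ROWS 4
#define COLS 3

int main(void) {
    double A[ROWS][COLS] = {{1,2,3},{4,5,6},{7,8,9},{10,11,12}};
    double x[COLS] = {1, 1, 1};
    double y[ROWS];
    int i, j;

    /* Directive breakdown:
       - parallel for: creates a team and splits the row loop among threads
       - shared(A, x, y): all threads read A and x; each writes disjoint y[i]
       - private(i, j): every thread keeps its own loop counters */
    #pragma omp parallel for shared(A, x, y) private(i, j)
    for (i = 0; i < ROWS; i++) {
        double sum = 0.0;        /* declared inside the loop: private by default */
        for (j = 0; j < COLS; j++)
            sum += A[i][j] * x[j];
        y[i] = sum;              /* no synchronization needed: rows are disjoint */
    }

    for (i = 0; i < ROWS; i++)
        printf("y[%d] = %g\n", i, y[i]);
    return 0;
}
```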
Work-Sharing Clauses & Scheduling
•#pragma omp for
•Distributes loop iterations among threads.
•Includes an implicit barrier at the end unless nowait is used.
•Scheduling Options:
1.static: Each thread is assigned a fixed chunk of iterations in a round-robin fashion.
2.dynamic: Iterations are assigned to threads dynamically as they finish.
3.guided: Similar to dynamic, but chunk size shrinks over time.
4.auto: Let the compiler/runtime decide.
•Choosing a schedule depends on load balance and overhead considerations.
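A short sketch contrasting two of these schedules (the chunk size 4 and loop bound are arbitrary): static deals each thread a fixed block of iterations up front, while dynamic hands out chunks as threads become free, which suits loops with uneven per-iteration cost.

```c
#include <omp.h>
#include <stdio.h>

#define N 16

int main(void) {
    /* static: iterations dealt out in fixed chunks of 4, round-robin */
    #pragma omp parallel for schedule(static, 4)
    for (int i = 0; i < N; i++)
        printf("static  iter %2d -> thread %d\n", i, omp_get_thread_num());

    /* dynamic: each thread grabs the next chunk of 4 when it finishes one;
       better when iteration costs vary */
    #pragma omp parallel for schedule(dynamic, 4)
    for (int i = 0; i < N; i++)
        printf("dynamic iter %2d -> thread %d\n", i, omp_get_thread_num());

    return 0;
}
```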
OpenMP work-sharing for loops
1. Default Behavior: Implicit Barrier
•When you use a work-sharing construct like #pragma omp for, OpenMP inserts an implicit barrier at the end of the loop by default: all threads must finish executing their assigned iterations before any thread proceeds past that loop. This ensures a consistent state for subsequent operations.
2. nowait Clause
•Purpose: Tells OpenMP not to synchronize (no implicit barrier) at the end of the work-sharing region.
•Effect: Threads that complete their loop iterations can immediately move on to subsequent code. This can improve performance if you do not need to wait for all threads to finish before continuing (see the combined sketch after this list).
3. ordered Clause
•Purpose: Ensures the iterations of the loop are executed in the same order they would be in a serial
execution.
•Why Use It: In some algorithms, you may need each iteration to occur in strict ascending (or
descending) sequence (e.g., if you’re generating a strictly ordered output).
4. collapse Clause
•Purpose: Allows you to collapse multiple nested loops into a single iteration space
for parallelization.
•When to Use: When you have nested loops and want OpenMP to treat them as one
large loop. This helps if the iterations of the inner loops can also be distributed
among threads.
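The following sketch (our own, with arbitrary array sizes) exercises all three clauses in one program: nowait between two independent loops, ordered for in-order output, and collapse(2) over a small loop nest.

```c
#include <omp.h>
#include <stdio.h>

#define N 8

int main(void) {
    double a[N], b[N];

    #pragma omp parallel
    {
        /* nowait: threads finishing this loop fall through to the next one
           immediately instead of waiting at an implicit barrier.
           Safe here because the second loop does not read a[]. */
        #pragma omp for nowait
        for (int i = 0; i < N; i++)
            a[i] = i * 2.0;

        /* ordered: the marked block runs in serial iteration order even
           though the surrounding loop body executes in parallel. */
        #pragma omp for ordered
        for (int i = 0; i < N; i++) {
            b[i] = i * i;                       /* computed in parallel */
            #pragma omp ordered
            printf("iteration %d done\n", i);   /* printed in order 0..N-1 */
        }
    }

    /* collapse(2): the 4x4 nest becomes one 16-iteration space divided
       among the threads. */
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            printf("cell (%d,%d) on thread %d\n", i, j, omp_get_thread_num());

    printf("a[0] = %g, b[0] = %g\n", a[0], b[0]);
    return 0;
}
```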
Constraints for the OpenMP for Loop

When using #pragma omp for, the loop must adhere to specific rules to ensure it can be
safely and efficiently parallelized:
1.No break Statements
•You cannot have a break statement in the loop, because it disrupts the iteration
space. Each thread expects a predictable range of iterations.
2.Loop Control Variable Must Be an Integer
•OpenMP requires an integer loop variable (e.g., int i or long i) for determining
iteration boundaries.
3.Work-Sharing Constructs Should Not Be Nested
• For example, you cannot place a for loop inside a sections block or vice versa. Each
work-sharing construct (like for, sections, or single) must stand on its own within a
parallel region.
4.Critical Sections Cannot Be Nested
You should not put one critical block inside another. If you need mutual exclusion,
use separate, non-overlapping critical regions.
5.Barriers Must Be Placed Outside Certain Blocks
Avoid placing a barrier directive inside work-sharing constructs such as for, sections,
single, master, or critical. Barriers are meant to synchronize threads, so they need to
be outside these regions.
6.Master Directive Usage
The master directive should not be used inside other work-sharing constructs like for,
sections, or single. The master block is intended only for the master thread.
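A short sketch (our own) of a structure that respects these rules: the for and single constructs are siblings inside one parallel region rather than nested, and a single critical region protects the shared counter.

```c
#include <omp.h>
#include <stdio.h>

#define N 100

int main(void) {
    int count = 0;

    #pragma omp parallel
    {
        /* work-sharing construct: not nested inside another one */
        #pragma omp for
        for (int i = 0; i < N; i++) {
            if (i % 7 == 0) {
                /* one critical region, never placed inside another */
                #pragma omp critical
                count++;
            }
        }
        /* implicit barrier at the end of the for construct */

        /* single: one thread reports; a sibling of the for above,
           not nested within it */
        #pragma omp single
        printf("multiples of 7 in [0,%d): %d\n", N, count);
    }
    return 0;
}
```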
OpenMP Advantages:
1.Simpler to use than lower-level approaches such as Pthreads.
2.OpenMP implementation responsible for most organizational aspects.
3.Simplicity facilitates experimentation with alternatives, e.g.,
scheduling techniques.
4.Large applications may be parallelized incrementally.
OpenMP Disadvantages:
1.Simplicity reduces how much programmers reason about the parallelism.
2.Insight into the impact of certain directives/clauses is reduced.
3.Incorrect directives may lead to wrong or even non-deterministic behavior.
4.Incorrect directives are easy to write but hard to find.
5.Performance considerations lead to low-level "expert" code.
