
Running in parallel

• First of all, to find out how many processors/cores are available on your computer, type in the terminal:
• $> lscpu
• The output for this particular workstation is the following:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 44
Model name:            Intel(R) Xeon(R) CPU X5670 @ 2.93GHz
Stepping:              2
CPU MHz:               1600.000
CPU max MHz:           2934.0000
CPU min MHz:           1600.0000
BogoMIPS:              5851.91
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
NUMA node0 CPU(s):     0-5,12-17
NUMA node1 CPU(s):     6-11,18-23

• CPU(s) is the total number of cores available after hyper-threading (virtual cores).
• Thread(s) per core is the number of threads per core (hyper-threading).
• Core(s) per socket is the number of cores per socket or physical processor.
• Socket(s) is the number of sockets (physical processors).
• Total number of physical cores = Core(s) per socket x Socket(s) = 6 x 2 = 12 cores.
• The cache memory (L1, L2, L3) is what makes a processor expensive.
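• As a convenience (a one-liner not in the original slides), you can filter the lscpu output to show only the two entries needed to compute the number of physical cores:
• $> lscpu | grep -E '^(Socket|Core)'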


Running in parallel
• OpenFOAM® does not take advantage of hyper threading technology (HT).
• HT is basically used by the OS to improve multitasking performance.
• This is what we have in the workstation of the previous example:
• 24 virtual cores (hyper threaded)
• 12 physical cores

• To take full advantage of the hardware, we use the maximum number of physical
cores (12 physical cores in this case) when running in parallel.
• If you use the maximum number of virtual cores, OpenFOAM® will run, but it will be slower than running with the maximum number of physical cores (or even fewer cores).
• The same rule applies when running on clusters/supercomputers, so always read the hardware specifications to know the limitations.
Running in parallel
Why use parallel computing?
• Solve larger and more complex problems (scale-up):
Thanks to parallel computing we can solve bigger problems (scalability). A single computer has limited physical memory; many interconnected computers have access to more memory (distributed memory).

• Provide concurrency (scale-out):
A single computer or processor can only do one thing at a time. Multiple processors or computing resources can do many things simultaneously.

• Save time (speed-up):
Run faster (speed-up) and increase your productivity, with the potential of saving money in the design process.

• Save money:
In theory, throwing more resources at a task will shorten its time to completion, with potential cost
savings. Parallel computers can be built from cheap, commodity components.

• Limits to serial computing:
Both physical and practical reasons pose significant constraints to simply building ever faster serial computers (e.g., transmission speed, CPU clock rate, limits to miniaturization, hardware cooling).
Running in parallel
Speed-up and scalability example

• In the context of high performance computing (HPC), there are two common metrics that measure the scalability
of the application:
• Strong scaling (Amdahl’s law): how the solution time varies with the number of processors for a fixed total problem size (number of cells in CFD).
• Weak scaling (Gustafson’s law): how the solution time varies with the number of processors for a fixed problem size per processor (i.e., the total problem size grows together with the number of processors).
• In this example, when we reach 12 cores the inter-processor communication slows down the computation. But if we increase the problem size for a fixed number of processors, we will increase the speed-up.
• The parallel case with 1 processor runs slower than the serial case due to the extra overhead of calling the MPI library.
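For reference (these formulas are not in the original slides, but they are the standard statements of the two laws), with N processors and p the fraction of the code that can run in parallel:

Amdahl’s law (strong scaling): speed-up S(N) = 1 / ((1 - p) + p/N)
Gustafson’s law (weak scaling): scaled speed-up S(N) = N - (1 - p)(N - 1)

For example, with p = 0.95, Amdahl’s law caps the speed-up at 1 / (1 - 0.95) = 20, no matter how many processors are used.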
Running in parallel
• The method of parallel computing used by OpenFOAM® is known as domain
decomposition, in which the geometry and associated fields are broken into pieces
and distributed among different processors.

• Shared memory architectures: workstations and portable computers.
• Distributed memory architectures: clusters and supercomputers.

Running in parallel
Some facts about running OpenFOAM® in parallel:
• Applications generally do not require parallel-specific coding. The
parallel programming implementation is hidden from the user.
• In order to run in parallel you will need an MPI library installation in your
system.
• Most of the applications and utilities run in parallel.
• If you write a new solver, it will run in parallel (most of the time).
• We have been able to run in parallel on up to 14000 processors.
• We have been able to run OpenFOAM® using single GPU and multiple
GPUs.
• Do not ask about scalability, that is problem/hardware specific.
• If you want to learn more about MPI and GPU programming, do not look
in my direction.
• And of course, to run in parallel you need the hardware.
Running in parallel
To run OpenFOAM® in parallel you will need to:

• Decompose the domain.
To do so we use the decomposePar utility. You also need the dictionary decomposeParDict, which is located in the system directory.

• Distribute the jobs among the processors or computing nodes.
To do so, OpenFOAM® uses the standard message passing interface (MPI). By using MPI, each processor runs a copy of the solver on a separate part of the decomposed domain.

• Additionally, you might want to reconstruct (put back together) the decomposed domain.
This is done by using the reconstructPar utility. You do not need a dictionary to use this utility.
Running in parallel
Domain Decomposition in OpenFOAM®
• The mesh and fields are decomposed using the decomposePar utility.
• They are broken up according to a set of parameters specified in a dictionary named
decomposeParDict that is located in the system directory of the case.
• In the decomposeParDict dictionary the user must set the number of domains in
which the case should be decomposed (using the keyword numberOfSubdomains).
The value used should correspond to the number of physical cores available.

numberOfSubdomains 128;    // number of subdomains
method scotch;             // decomposition method

• In this example, we are subdividing the domain into 128 subdomains, therefore we should have 128 physical cores available (a minimal complete dictionary is sketched below).
• The main goal of domain decomposition is to minimize the inter-processor communication and balance the processor workload.
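As an illustration (this is a sketch, not taken from the slides; the value 4 assumes a 4-core workstation), a complete minimal decomposeParDict could look like this:

FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      decomposeParDict;
}

numberOfSubdomains  4;       // one subdomain per physical core available

method              scotch;  // scotch only needs the number of subdomains

With this dictionary in the system directory of the case, decomposePar will create the directories processor0 to processor3.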
Running in parallel
Domain Decomposition Methods
• These are the decomposition methods available in OpenFOAM® 5.x:
• hierarchical
• manual
• metis
• multiLevel
• none
• scotch
• simple
• structured

• We highly recommend the scotch method. The only input it requires from the user is the number of subdomains/cores, and it attempts to minimize the number of processor boundaries.

• If you want more information about each decomposition method, just read
the source code:
• $WM_PROJECT_DIR/src/parallel/decompose/
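• As a quick check (a convenience command, not from the original slides), you can list that directory; in a standard installation each decomposition method lives in its own subdirectory:
• $> ls $WM_PROJECT_DIR/src/parallel/decompose/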
Running in parallel
Running in parallel – Gathering all together

• The information inside the directories polyMesh/ and 0/ is decomposed using the utility decomposePar.
• Running decomposePar creates one directory per subdomain: processor0, processor1, processor2, processor3, and so on.

• Inside each processorN directory you will have the mesh information, boundary conditions,
initial conditions, and the solution for that processor.
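To make this concrete, a sketch of a case decomposed into four subdomains (the directory names are the standard ones created by decomposePar; the exact contents depend on the case) looks roughly like this:

case/
    0/
    constant/polyMesh/
    system/decomposeParDict
    processor0/
        0/
        constant/polyMesh/
    processor1/
        0/
        constant/polyMesh/
    processor2/
        ...
    processor3/
        ...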
Running in parallel
Running in parallel – Gathering all together
• After decomposing the mesh, we can run in parallel using MPI.

$> mpirun -np <NPROCS> <application/utility> -parallel

• The number of processors to use, <NPROCS>, needs to be the same as the number of partitions (numberOfSubdomains).
• Do not forget to use the flag -parallel.
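• For example (an illustrative command, assuming the case was decomposed into 4 subdomains and is solved with pimpleFoam), redirecting the solver output to a log file:
• $> mpirun -np 4 pimpleFoam -parallel > log.pimpleFoam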
Running in parallel
Running in parallel – Gathering all together

• In the decomposed case, you will find the mesh information, boundary conditions, initial conditions, and the solution for every processor.
• The information is inside the directories processorN (where N is the processor number).
• When you reconstruct the case (using reconstructPar), you glue together all the information contained in the decomposed case.
• All the information (mesh, boundary conditions, initial conditions, and the solution) is transferred back to the original case folder (polyMesh and time solution directories).
Running in parallel
Running in parallel – Gathering all together

• Summarizing, to run in parallel we proceed in the following way:

1. $> decomposePar
2. $> mpirun -np <NPROCS> <application/utility> -parallel
3. $> reconstructPar

• You can do the post-processing and visualization on the decomposed case or on the reconstructed case. We are going to address this later on.
• If you are doing remeshing or using AMR, you will need to use reconstructParMesh before reconstructPar.
Running in parallel
Kelvin-Helmholtz instability in a coarse mesh

Processors    Clock time (seconds)    Mesh size (x, y, z)
1             955                     800 x 160 x 1
2             564                     800 x 160 x 1
4             333                     800 x 160 x 1
8             234                     800 x 160 x 1
12            244                     800 x 160 x 1

Volume fraction animation: www.wolfdynamics.com/wiki/kelvin_helmholtz/ani1.gif

You will find this case in the directory: $PTOFC/parallel/kelvin_helmholtz
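As a quick sanity check on these numbers (using the standard definition of speed-up, clock time on 1 processor divided by clock time on N processors):

Speed-up on 2 cores  = 955 / 564 ≈ 1.7
Speed-up on 4 cores  = 955 / 333 ≈ 2.9
Speed-up on 8 cores  = 955 / 234 ≈ 4.1
Speed-up on 12 cores = 955 / 244 ≈ 3.9

Going from 8 to 12 cores the speed-up actually drops, which is the inter-processor communication slow-down mentioned earlier for this coarse mesh.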


Running in parallel
Comparison of the Kelvin-Helmholtz instability in a coarse mesh (200 x 40 x 1 cells) and a fine mesh (3200 x 640 x 1 cells).


Running in parallel
Visualization of a parallel case

• The traditional way is to first reconstruct the case and then do the post-
processing and visualization on the reconstructed case.

• To do so, we type in the terminal:

1. $> reconstructPar
2. $> paraFoam

• Step 1 reconstructs the case. Remember, you can choose to reconstruct all the time steps, the last time step, or a range of time steps (see the examples below).

• In step 2, we use paraFoam to visualize the reconstructed case.
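For instance (these command-line options exist in the standard reconstructPar utility; the time values are made up for illustration):

• $> reconstructPar (reconstructs all the saved time steps)
• $> reconstructPar -latestTime (reconstructs only the last time step)
• $> reconstructPar -time 0.1:0.5 (reconstructs only the time steps in the range 0.1 to 0.5)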


Running in parallel
Visualization of a parallel case

• An alternative way to visualize the solution is to proceed in the following way:
• $> paraFoam -builtin

• The option -builtin lets us post-process the decomposed case directly.
• Remember, you will need to select the Decomposed Case option in the object inspector.
Running in parallel
Visualization of a parallel case

• Both of the previous methods are valid.
• When we use the option -builtin with paraFoam, we have the option to work on the decomposed case directly.
• That is to say, we do not need to reconstruct the case.
• But wait, there is a third option.
• The third option consists of post-processing each decomposed domain individually.
• To load all processor directories, you will need to manually create the file processorN.OpenFOAM (where N is the processor number) in each processor folder.
• After creating all the processorN.OpenFOAM files, you can launch paraFoam and load each of them.
• As you can see, this option requires more input from the user.
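• A small shell loop can create these dummy files for you (a convenience sketch, not from the original slides; run it from the top level of the case directory):
• $> for d in processor*; do touch $d/$d.OpenFOAM; done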
Running in parallel
Decomposing big meshes
• One final word, the utility decomposePar does not run in parallel. So, it is
not possible to distribute the mesh among different computing nodes to do
the partitioning in parallel.

• If you need to partition big meshes, you will need a computing node with enough memory to handle the mesh. We have been able to decompose meshes with up to 500,000,000 elements, but we used a computing node with 512 gigs of memory.
• For example, in a computing node with 16 gigs of memory, it is not possible to decompose a mesh with 30,000,000 elements. You will need to use a computing node with at least 32 gigs of memory.
• The same applies to the utility reconstructPar.
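• Before decomposing or reconstructing a very large mesh, it is worth checking how much memory the computing node actually has; on a Linux node with the standard procps tools, for example:
• $> free -g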


Running in parallel
Do all utilities run in parallel?
• At this point, you might be wondering whether all solvers/utilities run in parallel.
• To find out which solvers/utilities do not run in parallel, type in the terminal:
• $> find $WM_PROJECT_DIR -type f | xargs grep -sl 'noParallel'
• Paradoxically, the utilities used to decompose the domain and reconstruct the domain do not run in parallel.
• Another important utility that does not run in parallel is blockMesh. So to generate big meshes with blockMesh you need to use a big fat computing node.
• Another important utility that does not run in parallel by default is paraFoam.
• To compile paraFoam with MPI support, in the file makeParaView4 (located in the directory $WM_THIRD_PARTY_DIR), set the option withMPI to true:
• withMPI = true
• While you are working with the file makeParaView4, you might also consider enabling Python support:
• withPYTHON = true
Running in parallel
Running in a cluster using a job scheduler
• Running OpenFOAM® in a cluster is similar to running in a normal
workstation with shared memory.
• The only difference is that you will need to launch your job using a job
scheduler.
• Common job schedulers are:
• Terascale Open-Source Resource and Queue Manager (TORQUE).
• Simple Linux Utility for Resource Management (SLURM).
• Portable Batch System (PBS).
• Sun Grid Engine (SGE).
• Maui Cluster Scheduler.
• BlueGene LoadLeveler (LL).
• Ask your system administrator which job scheduler is installed on your system. Hereafter we will assume that you are using PBS.
Running in parallel
• To launch a job in a cluster with PBS, you will need to write a small shell script where you tell the job scheduler the resources you want to use and what you want to do.
• Ask the system administrator about how to write these scripts.

#!/bin/bash
#
# Simple PBS batch script that reserves 16 nodes and runs an
# MPI program on 128 processors (8 processors on each node).
# The walltime is 24 hours.
#
#PBS -N openfoam_simulation
#PBS -l nodes=16:ppn=8,walltime=24:00:00
#PBS -m abe -M joel.guerrero@unige.it

# -N sets the name of the job.
# -l requests the maximum resources and execution time.
# -m abe -M sends an email as soon as the job is launched or terminated.

cd PATH_TO_DIRECTORY                          # go to the case directory

decomposePar                                  # decompose the case

mpirun -np 128 pimpleFoam -parallel > log     # run the parallel solver

Note that the #PBS lines are directives read by the job scheduler, while the remaining lines starting with # are ordinary shell comments.
Running in parallel
• To launch your job you need to use the qsub command (part of the PBS job scheduler). The command qsub will send your job to the queue.

• $> qsub script_name
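• Once the job is submitted, you can check its state in the queue with qstat, which is also part of PBS (the -u option filters the list by user name):
• $> qstat -u $USER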

• Remember, running in a cluster is no different from running in your workstation or portable computer. The only difference is that you need to schedule your jobs.
• Depending on the current demand for resources on the system, the resources you request, and your job priority, sometimes you can be in the queue for hours, even days, so be patient and wait for your turn.
• Remember to always double-check your scripts.


Running in parallel
• Finally, remember to always plan how you will use the resources available.
• For example, if each computing node has 8 gigs of memory and 8 cores available, you will need to distribute the workload so as not to exceed the maximum resources available per computing node.
• So if you are running a simulation that requires 32 gigs of memory, the following
options are valid:
• Use 4 computing nodes and ask for 32 cores. Each node will use 8 gigs of
memory and 8 cores.
• Use 8 computing nodes and ask for 32 cores. Each node will use 4 gigs of
memory and 4 cores.
• Use 8 computing nodes and ask for 64 cores. Each node will use 4 gigs of
memory and 8 cores.
• But the following options are not valid:
• Use 2 computing nodes. Each node would need 16 gigs of memory, which exceeds the 8 gigs available per node.
• Use 16 computing nodes and ask for 256 cores. That would require 16 cores per node; the maximum number of cores for this job is 16 x 8 = 128.
