CS462 Project Report
Name: Samuel Day

1. The Nature of the Project

My project is a slight modification of a pre-approved project idea, and is meant to measure statistical performance on the Newton supercluster: while there is a large number of different suites designed to aid programmers in parallelizing their code, there has not been much statistical analysis of the speed advantages that particular types of parallelization provide, or of whether there is a noticeable difference between them. Below is my original project proposal, which has been deprecated (too complicated to execute in the time-frame of this project) and should be ignored:
"My project proposal is to analyze the effects of intercommunication and speed between the cores of a multiprocessor application through the use of Beaglebone Black microcomputers. I will do this by creating a program for them that simulates a musicians' live jamming setup, where each core is set to play a single instrument alongside the other cores. The general algorithm I will be using is as follows: 1. A host computer will serve to connect the Beaglebone Blacks to a switch to communicate messages. 2. The host will establish each Beaglebone's instrument. One of the Beaglebones will be guaranteed to be a drum kit. In addition, one of the non-drum instruments will be a "lead" instrument, from which the other instruments will learn the key of the song. 3. The drum kit will begin to play a drum pattern; the lead instrument will listen to this pattern, detect the tempo, then begin to play songs in the chosen key; the other instruments will also detect the tempo from the drum pattern, but their key will be chosen by whatever the lead instrument plays. 4. The drummer can change tempo and the lead instrument can change pitch, and it will be up to the other instruments to analyze this and "react". This will primarily be done on the Beaglebone Blacks using C/C++ threading. I will be using the STK and aubio open source libraries to generate and detect sound, respectively. Lastly, I will probably have to borrow a few Beaglebones to run all of the tests for the project write-up."
2. Reasons Why the Project is Important or Worthwhile

There are plenty of reasons why this project is important to the overall study of parallel computing. For one, benchmarking these varying methods of parallel computing provides valuable statistical information for studying the benefits of parallel computing from an academic standpoint; it is one thing to theorize about the benefits of parallelizing code, but it is another to collect precise timing data from controlled tests and see those benefits directly. Secondly, it is important because there is an increasingly wide range of software suites, plugins, and methods for parallelizing code, and it is a worthy endeavor to analyze not only which is best in terms of speed, but also which is best in terms of usability. As scientists, mathematicians, and other users from non-computing fields become increasingly responsible for writing and maintaining their own code, they may not be familiar with the variety of techniques and paradigms involved in parallelizing code, such as dealing with deadlock, race conditions, and so on, so finding a middle ground
between functionality, usability, and speed is a practice worth investigating. Last, if nothing else, hopefully my implementations of the varying techniques will prove useful for others who are curious about parallelization aids but unsure how to use them.

3. Related Work

This project is intrinsically tied to the various implementations of the particles code that I have written for earlier assignments throughout the semester. Though this may seem somewhat lazy, and it means the timing data collected could be more open to variance because the problem has no fixed running time, I felt it was a relevant choice for two specific reasons. First, I know the initial code works. Having done a number of assignments with it, I am aware of its timing intricacies and can recognize faulty data more readily. Alternate implementations of the parallel code could produce a variety of results depending on the methods used in each program; by limiting all of my tests to a program with a known output, interpreting that output becomes much easier (and more accurate). Second, the nature of the program means there is a fair amount of variability in the results. This is good because execution time is almost never a given in any application, and it helps ensure that while the code is executing, each processor is performing actual work rather than simply sleeping for a number of seconds.

4. Key Steps I Took to Implement My Parallel Code

There were a number of modifications I had to make to run these tests. First, I had to obtain the particles code; this was simple, as I was able to re-use a lot of my code from this semester. Second, I had to modify some of the code so that it would output results in a fashion manageable for collecting data, which mostly meant modifying the print statements of each program. Third, I had to create an efficient way to run tests on the machine. This led to a series of Python scripts I wrote to obtain results for different values of n (the number of processes). I then created job files for each implementation of the particles code, along with another Python script called 'runALLJOBS.py' (and a companion, 'runALLJOBS16.py', for running on the 16-core machines) to submit all of the job files I had created. Last, for graphing my results I opted to use OpenOffice rather than gnuplot for my visualizations. While it was somewhat annoying to transfer results from the shell into OpenOffice without them losing any meaning, the graphs produced provide a much cleaner view of the results.
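To give a concrete picture of the print-statement changes described above, a minimal sketch follows. The function name, CSV-style output format, and hard-coded particle count are illustrative assumptions rather than the project's actual output code, but MPI_Wtime() is the standard MPI timer for this kind of measurement.

#include <stdio.h>
#include <mpi.h>

/* Hypothetical timing report: print one easily parsed line per run so a
 * driver script can collect (implementation, processes, particles, seconds). */
static void report_timing(const char *impl, int nprocs, int nparticles, double seconds)
{
    printf("%s,%d,%d,%f\n", impl, nprocs, nparticles, seconds);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    double start = MPI_Wtime();
    /* ... run the particle simulation here ... */
    double elapsed = MPI_Wtime() - start;

    if (rank == 0)                 /* only one process writes the result line */
        report_timing("mpi", nprocs, 40000, elapsed);

    MPI_Finalize();
    return 0;
}

Keeping one line per run in a fixed field order is what makes it painless for the Python driver scripts to collect the numbers and paste them into OpenOffice.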
5. The Nature of My Test Cases

My general schema for running the tests was as follows: implement the different types of test cases for my code, run the code to obtain the test results, then graph and analyze those results. I chose four different configurations to run tests on:

No parallelization: this serves as a control group, so that the parallelized results have something to be compared against.

Parallelization with MPI: this means manually spawning each process, passing data between the workers and a 'parent' process that collects the data obtained, then compiling all of the data received and calculating the results.

Parallelization with OpenMP: OpenMP allows the user to simply use pragmas to designate which loops and pieces of execution will be parallelized.

Parallelization with both OpenMP and MPI: for this implementation I took my MPI-parallelized code and added OpenMP pragmas to some of the loops (a sketch of this pattern appears at the end of this section).

When running the code to obtain the test results, I used particle counts of 40000, 20000, 10000, 5000, 2500, 1000, and 500 to graph data over a variety of execution lengths, and I analyzed results using 1 to 12 processes (the n values shown in the graphs).
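As promised above, here is a stripped-down sketch of the hybrid configuration. The particle arrays, the force computation, and the even split of particles across processes are placeholder assumptions rather than the actual particles code, but the overall shape matches the configurations described above: MPI processes each own a slice of the particles, a parent process gathers the results, and an OpenMP pragma parallelizes the rank-local loop.

#include <math.h>
#include <mpi.h>
#include <omp.h>

#define N 40000                 /* one of the tested particle counts */

static double x[N], y[N];       /* particle positions (placeholder state) */
static double fx[N], fy[N];     /* forces computed by this process        */
static double all_fx[N];        /* gathered results on the parent process */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each MPI process owns a contiguous slice of the particles;
     * for brevity this assumes the process count divides N evenly. */
    int chunk = N / size;
    int lo = rank * chunk;

    /* The OpenMP pragma spreads the rank-local loop across the available cores.
     * Each iteration writes only its own fx[i]/fy[i], so there is no race.     */
    #pragma omp parallel for
    for (int i = lo; i < lo + chunk; i++) {
        fx[i] = fy[i] = 0.0;
        for (int j = 0; j < N; j++) {
            if (j == i) continue;
            double dx = x[j] - x[i], dy = y[j] - y[i];
            double r2 = dx * dx + dy * dy + 1e-9;     /* softened to avoid /0 */
            double inv = 1.0 / (r2 * sqrt(r2));
            fx[i] += dx * inv;
            fy[i] += dy * inv;
        }
    }

    /* The parent process collects every slice, mirroring the MPI-only setup
     * in which a parent gathers the data and computes the final results.    */
    MPI_Gather(&fx[lo], chunk, MPI_DOUBLE,
               all_fx, chunk, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}

Compiling with something like 'mpicc -fopenmp' enables both layers; dropping the pragma gives the MPI-only configuration, and running on a single process while keeping the pragma roughly corresponds to the OpenMP-only one.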
6. The Way a User Can Run My Code

Good news, user: I've done all of the hard work for you! If you look in the main directory you'll notice subdirectories for the executables used and the data collected. In the executables directory there are a number of different files for the different configurations of the particles code. To compile them, all one has to do is type 'make all' and the makefile will take care of compiling all of the different files for you. A bunch of error messages may or may not appear, but it is safe to ignore them; the code will compile and run just fine. After that you'll need to submit the jobs used to gather the data, which is simple as well: run 'python runALLJOBS.py' and the script will submit all of the jobs required to gather data. If you want to run the code on the 16-core Newton clusters, run 'python runALLJOBS16.py' instead. Easy. If you want to clean up the executables directory, type 'make pristine' and everything except the source code for each test will be deleted.

7. Test Results
[Figures: overall execution time vs. number of particles on the 8-core and 16-core machines]
As you can see, the execution time increases as the number of particles increases, which is to be expected. Two things are somewhat interesting: 1. The processors on the 16-core machines appear to be faster than those on the 8-core machines, given that the overall execution times are quicker. 2. The effects of process switching can be seen at the n = 8 values on the 8-core machines: a detailed view shows that once all of the available cores have been used up, there is some slowdown to account for switching processes in and out between the available cores.
And for the most part these trends hold in each implementation of the code:
[Figures: No Parallelization with 8 available cores and with 16 available cores; execution time vs. number of particles, one series per process count n = 1 to 12]
With no parallelization there is, of course, no speedup as the number of processes grows, but at the very least the benefit of staying within the number of available cores should be clear: in the tests with only 8 cores available, the execution time rose once the number of processes went above 8.
[Figures: MPI Parallelization with 8 and 16 available cores, and OpenMP with 8 and 16 available cores; execution time vs. number of particles, one series per process count n = 1 to 12]
Oh my, this looks exactly like the graphs without any parallelization, only slower! Something can't be right. Let's compare with Amdahl's Law.

8. Comparisons with Amdahl's Law
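For reference, the prediction being compared against is Amdahl's Law, which bounds the speedup of a program in which a fraction p of the work can be parallelized across n processes. With the p = 0.99 used in the graph titles, the predicted curve is simply the single-process time scaled down by that speedup:

S(n) = \frac{1}{(1 - p) + \frac{p}{n}}, \qquad T_{\text{predicted}}(n) = T_1 \left( (1 - p) + \frac{p}{n} \right)

With p = 0.99 and n = 12 (the largest process count tested), S(12) = 1 / (0.01 + 0.0825) is about 10.8; the larger p = 0.99999 discussed below would push that to roughly 12.0, essentially the ideal linear speedup.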
[Figures: Amdahl's Law (p = 0.99) comparisons for the No Parallelization, OpenMP, MPI, and MPI/OpenMP implementations; measured test data vs. the Amdahl's Law prediction, execution time (in s) against number of particles]
So there are some predictable results and some unpredictable results. The predictable ones are fairly clear: no parallelization results in a far longer execution time than Amdahl's Law predicts for code that has been parallelized to a significant degree. Similarly, when we parallelize with MPI we see results that mimic Amdahl's Law, or even beat it for a couple of data points! In all honesty, this is probably because the value of p was too low, even though it already implies that nearly all of the code is being parallelized. Perhaps using a larger value (0.99999, or as close to 1 as I can get) would provide a more accurate prediction, but regardless we can still see the law in effect, as the actual results fall right on top of the estimated results.

With the OpenMP results, we can see that something is very wrong, probably an incorrect implementation of the code. It is here that I will make my point about OpenMP: despite appearing to be easier to use, and despite the code itself being much simpler, OpenMP is significantly harder to implement and make use of correctly than MPI. The lack of access to the under-the-hood pieces of the parallelization, in addition to various compilation issues and version checking, makes OpenMP difficult to use, especially in code that needs to be portable between different machines. MPI shares some portability problems as well, but its logic is similar to a library-less implementation of threads. With this information we can conclude that MPI is probably the safest way to parallelize this code: it is significantly faster than no parallelization, appears to shadow Amdahl's Law in its results, and is not as difficult to use correctly as OpenMP.
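As one small illustration of the portability point above (a generic pattern, not code from this project): the OpenMP specification requires conforming compilers to define the _OPENMP macro, so guarding OpenMP-specific headers and calls with it keeps a file compiling on machines or compiler versions without OpenMP support, which is the sort of version checking referred to here.

#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>   /* only available when compiling with OpenMP support */
#endif

int main(void)
{
#ifdef _OPENMP
    /* _OPENMP expands to the release date (yyyymm) of the supported spec,
     * which is one way to check which OpenMP version the compiler provides. */
    printf("OpenMP enabled, spec date %d, max threads %d\n",
           _OPENMP, omp_get_max_threads());
#else
    printf("Compiled without OpenMP; running serially\n");
#endif
    return 0;
}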
.. - n$ to tar/all (he tarball for this project can be found at: https:22www.dropbo".com2s2d+hp?g.wm%z?Df"2sdayGcs0:+project.tar