Case Study
ABC.com is a website where you can watch original movie DVDs. It currently maintains
the list of visitors and the details of their visits. The website gets almost 1 billion visitors
every day, and at midnight it processes all of the information. It takes almost 5 hours to
process all the information, and the system remains down for that long, which causes the
company a huge loss. The company decided to buy a supercomputer for faster
analysis. The supercomputer has 10 processors. The need now is to design a parallel
algorithm for the following problems:
We now have the list of visitors for the day and the number of movies they watched.
Question 1: Design a parallel algorithm that would sort the names alphabetically.
Question 2: Write a parallel search algorithm that would find a visitor "John" in this
sorted list and show how many movies he watched.
The degree of increase in computational speed of a parallel algorithm over a
corresponding sequential algorithm is called speedup, and it is expressed as the ratio of
T(sequential) to T(parallel).
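As a worked instance (a sketch only: the 0.5-hour parallel time below is a hypothetical
figure, since the case study gives only the 5-hour sequential time):

\[
S \;=\; \frac{T_{\text{sequential}}}{T_{\text{parallel}}}
  \;=\; \frac{5\ \text{hours}}{0.5\ \text{hours}}
  \;=\; 10 \;=\; p,
\]

which on the company's p = 10 processors would be exactly linear speedup.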
If this ratio exceeds p, where p is the number of processors (cores) used,
superlinear speedup takes place. The most common reason for it is the cache effect: the
total cache size in a multiprocessor system is larger, which increases the effective
data-transfer rate between RAM and CPU; this is crucial when working with
large data sets.
Traditional parallel computer performance evaluation fixed the problem size and varied
the number of processors, the so-called fixed-size model. In the mid-1980s the scaled-size
model was developed and subsequently substantiated by experiments on a 1024-
processor hypercube. The scaled-size model specifies that the storage complexity
grows in proportion to the number of processors. A third model is the fixed-time model,
in which the problem is scaled to take a constant time as processors are added; it is
rarely used in real-world applications. The algorithm described here is optimized for the
fixed-size model. It is a modification of the Quicksort algorithm by C. A. R. Hoare (1962),
adapted for a system with several processors (or cores).
In the first step, the original data set is viewed as blocks of twice the size of the L1
cache (which is typically 32 or 64 kB). The processor with the smallest PID chooses the
pivot element. Then all processors in parallel invoke a "neutralization" function on the
leftmost and the rightmost remaining blocks, swapping elements according to their
relation to the pivot, which leaves at most P+1 blocks still to be sorted. After that, the
remaining non-neutralized blocks are swapped with neutralized ones and sorted
sequentially, as sketched below.
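The following C++ sketch illustrates one possible shape of such a neutralization step
(Block, Side and neutralize are illustrative names, not taken from the case study): one
block from the left half and one from the right half are scanned, misplaced elements are
swapped across the pivot, and at least one of the two blocks ends up fully "neutralized".

#include <algorithm>

// A minimal sketch of the "neutralization" step, assuming two blocks
// and an already chosen pivot.
struct Block { int* data; int size; };
enum class Side { Left, Right, Both };

// Scan one block from the left half and one from the right half,
// swapping misplaced elements; a block is "neutralized" once every
// element in it lies on the correct side of the pivot.
Side neutralize(Block left, Block right, int pivot) {
    int i = 0, j = 0;
    while (i < left.size && j < right.size) {
        while (i < left.size && left.data[i] <= pivot) ++i;   // already correct
        while (j < right.size && right.data[j] > pivot) ++j;  // already correct
        if (i >= left.size || j >= right.size) break;
        std::swap(left.data[i++], right.data[j++]);           // fix a misplaced pair
    }
    if (i >= left.size && j >= right.size) return Side::Both;
    return (i >= left.size) ? Side::Left : Side::Right;
}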
The next step is to split the given data set at the pivot point and assign processors
to each half in proportion to its size. A stack is used to keep track of the state of the
sorting algorithm, and the sequential steps of the recursion are turned into PUSH and POP
operations on this stack. Whenever a processor encounters a subarray small enough to
fit in its cache, it sorts it with insertion sort instead of PUSHing it onto the stack.
When a processor finishes its own job, it begins helping other processors by POPping
untaken (yet unsorted) subarrays from their stacks; a single-processor sketch of this
stack-driven phase follows.
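A minimal single-processor sketch in C++, assuming a Hoare-style partition and an
illustrative CACHE_ELEMS constant standing in for "fits in cache"; the work-stealing
part (other processors POPping from this stack) is omitted:

#include <algorithm>
#include <stack>
#include <utility>
#include <vector>

constexpr int CACHE_ELEMS = 8192; // illustrative assumption for "fits in cache"

void insertionSort(std::vector<int>& a, int lo, int hi) {
    for (int k = lo + 1; k <= hi; ++k) {
        int x = a[k], m = k - 1;
        while (m >= lo && a[m] > x) { a[m + 1] = a[m]; --m; }
        a[m + 1] = x;
    }
}

void stackQuicksort(std::vector<int>& a) {
    std::stack<std::pair<int, int>> work;          // recursion turned into PUSH/POP
    if (a.size() < 2) return;
    work.push({0, (int)a.size() - 1});
    while (!work.empty()) {
        auto [lo, hi] = work.top(); work.pop();
        if (hi - lo + 1 <= CACHE_ELEMS) {          // small subarray fits in cache:
            insertionSort(a, lo, hi);              // sort it directly, no PUSH
            continue;
        }
        int pivot = a[lo + (hi - lo) / 2];
        int i = lo, j = hi;                        // Hoare-style partition
        while (i <= j) {
            while (a[i] < pivot) ++i;
            while (a[j] > pivot) --j;
            if (i <= j) std::swap(a[i++], a[j--]);
        }
        if (lo < j) work.push({lo, j});            // PUSH both halves
        if (i < hi) work.push({i, hi});
    }
}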
Such optimization brings the average time of the partition phase to O(N/P)
for N >> B, where N is the number of elements, B the number of elements in one block,
and P the number of processors; the sorting phase yields a speedup of O(P), since at
this stage all processors are largely independent from one another and no
synchronization is required. This brings the total speedup to T(s)/T(p) = P, i.e. linear
speedup.
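In symbols (a rough sketch that treats constants loosely and takes the sequential cost
as the standard O(N log N) of Quicksort):

\[
S \;=\; \frac{T_s}{T_p}
  \;=\; \frac{O(N \log N)}{O(N/P) + O\!\left(\tfrac{N \log N}{P}\right)}
  \;\approx\; \frac{O(N \log N)}{O\!\left(\tfrac{N \log N}{P}\right)}
  \;=\; P,
\]

since for large N the sorting term dominates the partition term.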
In addition, the reduced memory-access time due to the cache effect further decreases
overhead and can yield superlinear speedup.
Ans 1
Merge sort first divides the unsorted list into the smallest possible sub-lists, compares
each with its adjacent sub-list, and merges them in sorted order. It lends itself to
parallelism very naturally, since it follows the divide-and-conquer approach, as the
pseudocode and the C++ sketch below show.
procedure parallelmergesort(id, n, data, newdata)
begin
   data = sequentialmergesort(data)
   for dim = 1 to n
      data = parallelmerge(id, dim, data)
   endfor
   newdata = data
end
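A runnable C++ sketch of the same idea for Question 1, assuming the 10-processor
machine from the case study (NUM_PROCS and parallelMergeSort are illustrative names):
each thread sorts one chunk of the names, and the sorted chunks are then merged
pairwise until one sorted run remains.

#include <algorithm>
#include <string>
#include <thread>
#include <vector>

constexpr size_t NUM_PROCS = 10; // processors available in the case study

void parallelMergeSort(std::vector<std::string>& names) {
    const size_t n = names.size();
    if (n < 2) return;
    const size_t chunks = std::min(NUM_PROCS, n);

    std::vector<size_t> bounds;                     // chunk boundaries
    for (size_t c = 0; c <= chunks; ++c) bounds.push_back(c * n / chunks);

    std::vector<std::thread> pool;                  // sort the chunks in parallel
    for (size_t c = 0; c < chunks; ++c)
        pool.emplace_back([&, c] {
            std::sort(names.begin() + bounds[c], names.begin() + bounds[c + 1]);
        });
    for (auto& t : pool) t.join();

    // Merge sorted chunks pairwise until a single sorted run remains.
    for (size_t width = 1; width < chunks; width *= 2)
        for (size_t c = 0; c + width < chunks; c += 2 * width)
            std::inplace_merge(names.begin() + bounds[c],
                               names.begin() + bounds[c + width],
                               names.begin() + bounds[std::min(c + 2 * width, chunks)]);
}

Calling parallelMergeSort on the day's visitor list leaves the names sorted
alphabetically (note that std::string compares by byte order; locale-aware collation
would need extra work).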
Ans 2
In the conventional sequential BFS algorithm, two data structures are created to store
the frontier and the next frontier. The frontier contains the vertices that have the same
distance (also called the "level") from the source vertex; these vertices need to be
explored in BFS. Every neighbor of these vertices is checked, and the neighbors that
have not been explored yet are discovered and put into the next frontier. At
the beginning of the BFS algorithm, a given source vertex s is the only vertex in the
frontier. All direct neighbors of s are visited in the first step, and they form the next
frontier. After each layer traversal, the "next frontier" becomes the frontier, and newly
discovered vertices are stored in the new next frontier. The following pseudocode
outlines the idea, in which the data structures for the frontier and the next frontier are
called FS and NS respectively.
define bfs_sequential(graph(V, E), source s):
   for all v in V do
      d[v] = -1;
   d[s] = 0; level = 1; FS = {}; NS = {};
   push(s, FS);
   while FS !empty do
      for u in FS do
         for each neighbour v of u do
            if d[v] = -1 then
               push(v, NS);
               d[v] = level;
      FS = NS, NS = {}, level = level + 1;
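A compact C++ translation of this FS/NS scheme, assuming an adjacency-list graph
(bfsLevels is an illustrative name); parallelizing the loop over FS, e.g. one chunk of
the frontier per processor, is the natural next step but is omitted here:

#include <utility>
#include <vector>

// Returns the BFS level of every vertex, or -1 if unreachable from s.
std::vector<int> bfsLevels(const std::vector<std::vector<int>>& adj, int s) {
    std::vector<int> d(adj.size(), -1);   // -1 marks "not yet discovered"
    std::vector<int> FS{s}, NS;           // frontier and next frontier
    d[s] = 0;
    int level = 1;
    while (!FS.empty()) {
        NS.clear();
        for (int u : FS)                  // explore the whole frontier
            for (int v : adj[u])          // check every neighbour
                if (d[v] == -1) {         // newly discovered vertex
                    d[v] = level;
                    NS.push_back(v);
                }
        std::swap(FS, NS);                // next frontier becomes frontier
        ++level;
    }
    return d;
}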