Lecture 1

Trends in Computing Architecture
CMSC828E
Ramani Duraiswami

Several slides taken from a Microway/NVIDIA webinar

Some figures adapted from web sources
Problem sizes in simulation and data processing are increasing
• Change in paradigm in science
– Simulate then test
– Fidelity demands larger simulations
– Problems being simulated are also much more
• Sensors are getting more varied and cheaper, and storage is getting cheaper
– Cameras, microphones
• Other large data
– Text (all the newspapers, books, technical papers)
– Genome data
– Medical/biological data (X-ray, PET, MRI, ultrasound, electron microscopy, …)
– Climate (temperature, salinity, pressure, wind, oxygen content, …)
Ways to attack problem size growth
• Faster algorithms with better asymptotic complexity
• Faster processors
– “Moore’s law will take care of it”
• Go parallel!
– Clusters of computers
– New data parallel chips (multicore processors, GPUs)
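As a rough illustration of the "go parallel" option, the sketch below splits one array into chunks, processes each chunk independently, and combines the partial results. The function names `chunked` and `parallel_sum_of_squares` are hypothetical, and a thread pool stands in for real parallel hardware; in standard Python, threads show the data-parallel pattern but not a real speedup for this kind of arithmetic.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, n_chunks):
    """Split `data` into at most `n_chunks` contiguous pieces."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum_of_squares(data, n_workers=4):
    """Apply the same operation to every chunk, then combine the partial sums."""
    chunks = chunked(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda c: sum(x * x for x in c), chunks))
    return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10))))
```

The same split, compute, combine structure is what clusters, multicore processors, and GPUs exploit, each at a different granularity.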
“Moore’s Law will take care of it”
• Not a law but an observation by Gordon Moore in 1965
• Number of transistors on a chip doubles at a regular interval, popularly quoted as every 18 months (Moore’s own 1975 revision put it at about every two years)
• Basically has been taken to mean that the “standard computer’s” performance improves exponentially, with a doubling time of 18 months
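To make the exponential claim concrete, here is a minimal sketch of the improvement factor implied by a fixed doubling time. The function name `growth_factor` is illustrative, and the 18-month default is simply the figure quoted on the slide, not a measured value.

```python
def growth_factor(years, doubling_time_years=1.5):
    """Improvement factor after `years` if capability doubles every `doubling_time_years`."""
    return 2.0 ** (years / doubling_time_years)


if __name__ == "__main__":
    for years in (1.5, 3, 6, 15):
        print(f"{years:>4} years -> {growth_factor(years):,.0f}x")
```

Under this assumption, ten doubling periods (15 years) give a factor of 2^10 = 1024.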
Refuting the Moore’s law argument
• Argument:
– Moore’s law: processor speed doubles every 18 months
– If we wait long enough, the computer will get fast enough to let my inefficient algorithm tackle the problem
• Is this true?
– Yes, for algorithms with linear asymptotic complexity
– No, for algorithms with higher (superlinear) asymptotic complexity
– Most scientific algorithms are O(N^2) or O(N^3)
– For a million variables, an O(N^2) algorithm does about N = 10^6 ≈ 2^20 times the work of an O(N) algorithm, so with equal constant factors we would need about 20 generations of Moore’s law (roughly 30 years) before the O(N^2) algorithm was comparable with the O(N) algorithm
• Did no one tell you that Moore’s law is dead?
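A back-of-the-envelope check of the generation count can be sketched as follows. It assumes both algorithms have the same constant factor, so the work ratio is simply N raised to the difference in exponents; real constants differ and would shift the count. The helper names are illustrative.

```python
import math


def generations_needed(n, slow_exponent=2, fast_exponent=1):
    """Hardware doublings needed to absorb the work gap between O(N^slow) and O(N^fast)."""
    work_ratio = float(n) ** (slow_exponent - fast_exponent)
    return math.log2(work_ratio)


def years_needed(n, doubling_time_years=1.5, **exponents):
    """Waiting time implied by `generations_needed` at a fixed doubling time."""
    return generations_needed(n, **exponents) * doubling_time_years


if __name__ == "__main__":
    n = 1_000_000
    print(f"O(N^2) vs O(N): {generations_needed(n):.1f} doublings, {years_needed(n):.1f} years")
    print(f"O(N^3) vs O(N): {generations_needed(n, slow_exponent=3):.1f} doublings")
```

Because the number of doublings grows with log2 of the work ratio, an O(N^3) algorithm needs twice as many generations as an O(N^2) one to catch up with O(N) at the same N.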
Moore’s Law is dead: “Issues at small scales”
- Lithography not possible
- 2D electrostatics harder to control
- “Parasitic resistance” degrades performance
- Device-to-device variations will be larger
- Ultra-thin bodies and hyper-abrupt junctions make manufacturing difficult
Moore’s Law is dead!
• Feature sizes and clock speeds on commodity chips have been stagnant over the past 4 years
– ~3 GHz and 45 nm
• All manufacturers are going with multicore to maintain performance
– Core 2, Core 2 Duo, quad-core, …
• Shared-memory multiprocessing
– Intel has demonstrated several many-core systems
• Graphics processors and gaming consoles have already been on the multicore path for a decade!
Gamer Power
• Sony PlayStation 3: 2.18 teraflops, < $400, difficult to program
• Microsoft Xbox 360: 1.04 teraflops, < $300, difficult to program
• Multicore Intel box with 3 GeForce 8800 GTX GPUs in slots (shown with 1 GPU): ~1 teraflop for < $3000
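The price and performance figures on this slide can be compared as cost per teraflop. The sketch below uses the slide's numbers directly; since the prices are quoted as upper bounds ("<") and the Intel box as "~1 teraflop", the results are rough upper bounds only, and reading the Intel box price as US dollars is an assumption.

```python
# (peak teraflops, price upper bound in dollars) as quoted on the slide
SYSTEMS = {
    "Sony PlayStation 3": (2.18, 400),
    "Microsoft Xbox 360": (1.04, 300),
    "Intel box with GPUs": (1.0, 3000),  # "~1 teraflop"; dollar unit assumed
}


def dollars_per_teraflop(teraflops, price):
    """Upper bound on cost per peak teraflop."""
    return price / teraflops


if __name__ == "__main__":
    for name, (tflops, price) in SYSTEMS.items():
        print(f"{name}: < ${dollars_per_teraflop(tflops, price):,.0f} per teraflop")
```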
Programming on the GPU
• GPU organized as groups of multiprocessors (8 relatively slow processors), each with a small amount of its own memory and access to common shared memory
• Factor of 100s difference in speed as one goes up the memory hierarchy
• To achieve gains, problems must fit the GPU programming paradigm and manage memory
• Fortunately, many practically important tasks do map well, and work on converting others is ongoing
– Image and audio processing
– Some types of linear algebra cores
– Many machine learning algorithms
• Research issues:
– Identifying important tasks and mapping them to the architecture
– Making it convenient for programmers to call GPU code from host code
[Figure: memory hierarchy. Local memory ~50 kB; GPU shared memory ~1 GB; host memory ~2-32 GB]
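One common way to "manage memory" is to cut a problem into tiles small enough to sit in the fastest level of the hierarchy. The sketch below is plain Python, not GPU code. It first estimates the largest square tile that fits in a local memory of about 50 kB, assuming 4-byte values and three tiles resident at once (both assumptions), then multiplies two matrices tile by tile. The function names are hypothetical.

```python
import math


def max_square_tile(local_bytes, bytes_per_value=4, tiles_resident=3):
    """Largest T such that `tiles_resident` T-by-T tiles fit in `local_bytes`."""
    values_per_tile = local_bytes // (bytes_per_value * tiles_resident)
    return math.isqrt(values_per_tile)


def tiled_matmul(a, b, tile):
    """Multiply square matrices (lists of lists), touching one tile of each at a time."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, n, tile):
                # On a GPU, these three tiles would be staged in fast local memory.
                for i in range(i0, min(i0 + tile, n)):
                    for k in range(k0, min(k0 + tile, n)):
                        a_ik = a[i][k]
                        for j in range(j0, min(j0 + tile, n)):
                            c[i][j] += a_ik * b[k][j]
    return c


if __name__ == "__main__":
    print(max_square_tile(50_000))
    print(tiled_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], tile=1))
```

The result does not depend on the tile size; only the order of memory accesses changes, which is where the factor of 100s in memory speed comes into play.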
What is GPU Computing?
• Computing with CPU + GPU: heterogeneous computing
[Figure: CPU + GPU diagram; label: 4 cores]
Not 2x or 3x: Speedups are 20x to 150x
• 146X: Medical Imaging (U of Utah)
• 36X: Molecular Dynamics (U of Illinois, Urbana)
• 18X: Video Transcoding (Elemental Tech)
• 50X: Matlab Computing (AccelerEyes)
• 100X: Astrophysics (RIKEN)
• 149X: Financial Simulation (Oxford)
• 47X: Linear Algebra (Universidad Jaime)
• 20X: 3D Ultrasound (Techniscan)
• 130X: Quantum Chemistry (U of Illinois, Urbana)
• 30X: Gene Sequencing (U of Maryland)
Accelerating Time to Discovery
[Figure: bar chart of run times for four applications. CPU only: 4.6 days, 2.7 days, 3 hours, 8 hours. With GPU: 30 minutes, 27 minutes, 16 minutes, 13 minutes.]
Molecular Dynamics

Available MD software
• NAMD / VMD (alpha release)
• HOOMD
• ACE-MD
• MD-GPU

Ongoing work
• LAMMPS
• CHARMM
• GROMACS
• AMBER

Source: Stone, Phillips, Hardy, Schulten
Source: Anderson, Lorenz, Travesset
Quantum Chemistry

Available MD software
• NAMD / VMD (alpha release)
• HOOMD
• ACE-MD

Ongoing work
• LAMMPS
• CHARMM
• Q-Chem
• Gaussian
• GAMESS

Source: Ufimtsev, Martinez
Source: Yasuda
Computational Fluid Dynamics (CFD)

Ongoing work
• Navier-Stokes
• Lattice Boltzmann
• 3D Euler solver
• Weather and ocean modeling

Source: Thibault, Senocak
Source: Tolke, Krafczyk

Electromagnetics / Electrodynamics

FDTD Solvers
• Acceleware
• EM Photonics
• CUDA Tutorial

Ongoing work
• Maxwell equation solver
• Ring Oscillator (FDTD)
• Particle beam dynamics simulator

[Figure: FDTD acceleration using GPUs. Source: Acceleware]
Weather, Atmospheric, & Ocean Modeling
• CUDA-accelerated WRF available
• Other kernels in WRF being ported

Ongoing work
• Tsunami modeling
• Ocean modeling
• Several CFD codes

Source: Michalakes, Vachharajani
Source: Matsuoka, Akiyama, et al.
Computational Finance

Financial computing software vendors
• SciComp: Derivatives pricing modeling
• Hanweck: Options pricing & risk analysis
• Aqumin: 3D visualization of market data
• Exegy: High-volume tickers & risk analysis
• QuantCatalyst: Pricing & hedging engine
• Oneye: Algorithmic trading
• Arbitragis Trading: Trinomial options pricing

Ongoing work
• LIBOR Monte Carlo market model
• Callable Swaps and Continuous Time

Source: SciComp
Source: CUDA SDK
