Lecture 1

Uploaded by

iakambamu
Trends in Computing Architecture
CMSC828E
Ramani Duraiswami

Several slides taken from a Microway/NVIDIA webinar


Some figures adapted from web sources
Problem sizes in simulation and data processing are increasing
• Change in paradigm in science
– Simulate then test
– Fidelity demands larger simulations
– Problems being simulated are also much more complex
• Sensors are getting more varied and cheaper, and storage is getting cheaper
– Cameras, microphones
• Other large data
– Text (all the newspapers, books, technical papers)
– Genome data
– Medical/biological data (X-Ray, PET, MRI, Ultrasound, Electron
microscopy …)
– Climate (Temperature, Salinity, Pressure, Wind, Oxygen content, …)
Ways to attack problem size growth
• Faster algorithms with better asymptotic
complexity
• Faster processors
– “Moore’s law will take care of it”
• Go parallel!
– Clusters of computers
– New data parallel chips (multicore processors,
GPUs)
“Moore’s Law will take care of it”
• Not a law but an observation made by Gordon Moore in the 1960s
• Number of transistors on a chip doubles about every 18 months (the commonly quoted figure; Moore’s own 1975 revision put it at about two years)
• Basically has been taken to mean that the “standard computer’s” performance improves exponentially, with a doubling time of 18 months
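To make the doubling claim concrete, here is a minimal sketch of the growth factor implied by a fixed doubling time. The function name `growth_factor` and the sample time spans are illustrative assumptions, not part of the lecture; the 18-month doubling time is the figure quoted on the slide.

```python
def growth_factor(years, doubling_time_years=1.5):
    """Multiplicative improvement after `years` of steady doubling.

    Assumes the idealized reading of Moore's law from the slide:
    one doubling every 18 months (1.5 years), with no slowdown.
    """
    return 2.0 ** (years / doubling_time_years)


if __name__ == "__main__":
    # Illustrative time spans (hypothetical, chosen for round numbers).
    for years in (1.5, 3, 6, 15):
        print(f"{years:>4} years -> {growth_factor(years):,.0f}x")
```

Under this idealized model, fifteen years is ten doublings, which is roughly a thousandfold improvement.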
Refuting the Moore’s law argument
• Argument:
– Moore’s law: Processor speed doubles every 18 months
– If we wait long enough, the computer will get fast enough to let my inefficient algorithm tackle the problem
• Is this true?
– Yes for algorithms with linear asymptotic complexity
– No!! For algorithms with different asymptotic complexity
– Most scientific algorithms are O(N^2) or O(N^3)
– For a million variables, we would need about 16 generations of Moore’s law before an O(N^2) algorithm was comparable with an O(N) algorithm
• Did no one tell you that Moore’s law is dead?
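The "generations" estimate can be checked with a short calculation. The sketch below counts how many hardware doublings are needed before a slow algorithm on future hardware matches a fast algorithm on today's hardware. It assumes equal constant factors and operation counts of exactly N^2, N log2 N, and N, which is a simplification the slide does not state. Under that assumption the gap to an O(N) algorithm works out to about 20 doublings, while the gap to an O(N log N) algorithm is about 16, which is the figure quoted on the slide. The helper name `doublings_needed` is hypothetical.

```python
import math


def doublings_needed(slow_ops, fast_ops):
    """Hardware doublings required for the slow algorithm to catch up.

    Each doubling halves the run time, so we need 2**k >= slow/fast.
    """
    return math.log2(slow_ops / fast_ops)


N = 1_000_000                      # "a million variables" from the slide
quadratic = N ** 2                 # O(N^2) operation count (constant = 1, assumed)
linear = N                         # O(N)
linearithmic = N * math.log2(N)    # O(N log N)

vs_linear = doublings_needed(quadratic, linear)
vs_nlogn = doublings_needed(quadratic, linearithmic)

if __name__ == "__main__":
    # 1.5 years per doubling is the 18-month figure quoted on the slide.
    for label, k in (("O(N)", vs_linear), ("O(N log N)", vs_nlogn)):
        print(f"O(N^2) vs {label}: {k:.1f} doublings, about {k * 1.5:.0f} years")
```

Either way, the wait is measured in decades, which is the point of the argument: a better algorithm beats waiting for faster hardware.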
Moore’s Law is dead:
“Issues at small scales”

- Lithography not possible
- 2D electrostatics harder to control
- “parasitic resistance” degrades performance
- device-to-device variations will be larger
- ultra-thin bodies and hyper-abrupt junctions make manufacturing difficult
Moore’s Law is dead!
• Feature sizes and clock speeds on commodity chips have been stagnant over the past 4 years
– ~3 GHz and 45 nm
• All manufacturers are going with multicore to maintain performance
– Core 2, Core 2 Duo, quad-core, …
• Shared memory multiprocessing
– Intel has demoed several many-core systems
• Graphics processors and gaming consoles have already been on the multicore path for a decade!
Gamer Power
• Sony Playstation 3: 2.18 teraflops, <$400, difficult to program
• Microsoft X-Box 360: 1.04 teraflops, <$300, difficult to program
• Multicore Intel box with 3 GPUs (GeForce 8800 GTX) in slots: ~1 teraflop for <$3000 (shown with 1 GPU)
Programming on the GPU
• GPU organized as groups of multiprocessors (8 relatively slow processors) with a small amount of their own memory and access to common shared memory
• Factor of 100s difference in speed as one goes up the memory hierarchy
• To achieve gains, problems must fit the GPU programming paradigm and manage memory
• Fortunately many practically important tasks do map well, and work is ongoing on converting others
– Image and audio processing
– Some types of linear algebra cores
– Many machine learning algorithms
• Research issues:
– Identifying important tasks and mapping them to the architecture
– Making it convenient for programmers to call GPU code from host code
[Figure: memory hierarchy. Local memory ~50 kB; GPU shared memory ~1 GB; host memory ~2-32 GB]
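Managing memory largely means cutting a problem into pieces that fit each level of the hierarchy. The sketch below is a back-of-the-envelope planner, not GPU code: it takes the approximate sizes from the slide (~50 kB local, ~1 GB on the GPU) and estimates how many GPU-sized batches and local-memory-sized tiles a flat array would need. The function `plan_tiles`, the constants, the 1 kB = 1024 bytes convention, and the assumption that the whole capacity is available for input data are all illustrative; real kernels also need room for outputs and temporaries.

```python
# Approximate capacities taken from the slide (assumed 1 kB = 1024 bytes).
LOCAL_BYTES = 50 * 1024        # ~50 kB of memory local to a multiprocessor
GPU_BYTES = 1 * 1024 ** 3      # ~1 GB of memory on the GPU board


def ceil_div(a, b):
    """Integer ceiling division without floating-point rounding."""
    return -(-a // b)


def plan_tiles(n_elements, bytes_per_element=4):
    """Return (gpu_batches, tiles_per_full_batch, elements_per_tile).

    gpu_batches: host-to-GPU transfers needed to stream the whole array.
    tiles_per_full_batch: local-memory tiles in the largest batch.
    elements_per_tile: elements that fit in local memory at once.
    """
    elements_per_tile = LOCAL_BYTES // bytes_per_element
    elements_per_batch = GPU_BYTES // bytes_per_element
    gpu_batches = ceil_div(n_elements, elements_per_batch)
    largest_batch = min(n_elements, elements_per_batch)
    tiles_per_full_batch = ceil_div(largest_batch, elements_per_tile)
    return gpu_batches, tiles_per_full_batch, elements_per_tile


if __name__ == "__main__":
    # Hypothetical workload: one billion 4-byte floats (about 4 GB on the host).
    print(plan_tiles(1_000_000_000))
```

A workload that needs many tiles per batch only pays off if each tile is reused heavily once loaded, since each step away from local memory costs a large factor in access speed.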
What is GPU Computing?
• Computing with CPU + GPU
• Heterogeneous computing
[Figure: CPU (4 cores) paired with a GPU]
Not 2x or 3x: Speedups are 20x to 150x

Speedup  Application           Source
146X     Medical Imaging       U of Utah
36X      Molecular Dynamics    U of Illinois, Urbana
18X      Video Transcoding     Elemental Tech
50X      Matlab Computing      AccelerEyes
100X     Astrophysics          RIKEN
149X     Financial simulation  Oxford
47X      Linear Algebra        Universidad Jaime
20X      3D Ultrasound         Techniscan
130X     Quantum Chemistry     U of Illinois, Urbana
30X      Gene Sequencing       U of Maryland
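To put these speedup factors in wall-clock terms, the following small sketch converts a CPU run time into the corresponding accelerated run time. The 24-hour baseline is a hypothetical example, not a measurement from the slides; only the 36X and 146X factors come from the list above.

```python
def accelerated_hours(cpu_hours, speedup):
    """Run time in hours after applying a speedup factor to a CPU-only run."""
    return cpu_hours / speedup


if __name__ == "__main__":
    baseline_hours = 24  # hypothetical CPU-only run time
    for name, speedup in (("Molecular Dynamics", 36), ("Medical Imaging", 146)):
        minutes = accelerated_hours(baseline_hours, speedup) * 60
        print(f"{name}: {baseline_hours} h on CPU -> {minutes:.0f} min at {speedup}X")
```

At these factors a day-long job becomes something that finishes within the hour, which changes how often a simulation can be rerun.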
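Accelerating Time to Discovery
[Figure: bar chart of run times, CPU only vs. with GPU. CPU-only labels: 4.6 days, 2.7 days, 3 hours, 8 hours. With-GPU labels: 30 minutes, 27 minutes, 16 minutes, 13 minutes. Labels are listed in extraction order and are not matched pairwise.]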
Molecular Dynamics
• Available MD software
– NAMD / VMD (alpha release)
– HOOMD
– ACE-MD
– MD-GPU
• Ongoing work
– LAMMPS
– CHARMM
– GROMACS
– AMBER
Source: Stone, Phillips, Hardy, Schulten
Source: Anderson, Lorenz, Travesset
Quantum Chemistry
• Available MD software
– NAMD / VMD (alpha release)
– HOOMD
– ACE-MD
• Ongoing work
– LAMMPS
– CHARMM
– Q-Chem
– Gaussian
– GAMESS
Source: Ufimtsev, Martinez
Source: Yasuda
Computational Fluid Dynamics (CFD)

• Ongoing work
– Navier-Stokes
– Lattice Boltzmann
– 3D Euler Solver
– Weather and ocean modeling
Source: Thibault, Senocak
Source: Tolke, Krafczyk


Electromagnetics / Electrodynamics
• FDTD Solvers
– Acceleware
– EM Photonics
– CUDA Tutorial
• Ongoing work
– Maxwell equation solver
– Ring Oscillator (FDTD)
– Particle beam dynamics simulator
[Figure: FDTD Acceleration using GPUs. Source: Acceleware]
Weather, Atmospheric, & Ocean Modeling
• CUDA-accelerated WRF available
• Other kernels in WRF being ported
• Ongoing work
– Tsunami modeling
– Ocean modeling
– Several CFD codes
Source: Michalakes, Vachharajani
Source: Matsuoka, Akiyama, et al
Computational Finance
• Financial Computing Software vendors
– SciComp: Derivatives pricing modeling
– Hanweck: Options pricing & risk analysis
– Aqumin: 3D visualization of market data
– Exegy: High-volume Tickers & Risk Analysis
– QuantCatalyst: Pricing & Hedging Engine
– Oneye: Algorithmic Trading
– Arbitragis Trading: Trinomial Options Pricing
• Ongoing work
– LIBOR Monte Carlo market model
– Callable Swaps and Continuous Time
Source: SciComp
Source: CUDA SDK
