Accelerating Marching Cubes With Graphics Hardware
Gunnar Johansson
Hamish Carr
Linköping University
Abstract
For large data sets in medicine and science, efficient isosurface extraction and rendering is crucial for interactive visualization. Previous GPU
acceleration techniques have been restricted to
tetrahedral meshes. We generalize this work
to arbitrary meshes by caching local topology
on the video card to reduce both CPU load and
bandwidth consumption, demonstrating our results with the Marching Cubes cases. We also
present improvements to span space techniques
that pre-classify the ranges over which individual cases are used in a given cube. Our results
indicate that speedups in excess of tenfold are
feasible, compared with speedups of less than
twofold demonstrated in previous papers.
Introduction
and optimizations based on GPUs. Algorithmically, we improve span space performance with
a time-space trade-off that stores the topology
of the isosurface in each cell in the span space
structure, avoiding computations at runtime.
Our main contribution is to improve existing
GPU-based isosurface acceleration by caching
the topology on the video card and using a vertex program to perform geometric interpolation. Previous work [10, 6] demonstrated that
acceleration by 30% was feasible for tetrahedral
data. Reck et al. [11] further accelerated this
method to approximately 850% using an interval tree. However, for cubic data, tetrahedral
subdivision generates surfaces with pronounced
artifacts [2]. Goetz et al. [5] therefore accelerated Marching Cubes on the GPU, but without
span space techniques or the correct normals.
By comparison, we generate isosurfaces that
are more accurate than those of Goetz et al. [5], with
acceleration at least as good as that of Pascucci [10], and superior in both quality and speed
once tetrahedral subdivision is accounted for.
Moreover, combining the GPU acceleration
with the span space improvements results in
acceleration of as much as 1300%. Our method
is also applicable to higher-order isosurface interpolation, while further acceleration is anticipated in the next generation of GPUs.
Section 2 will describe the data and outline
the GPU pipeline. Section 3 presents the previous work on Marching Cubes and isosurface
acceleration. Section 4 then presents the contributions of this paper. Finally, results, conclusions and future work are presented in Sections 5, 6 and 7.
Background
For a 3D scalar field $f : \mathbb{R}^3 \to \mathbb{R}$, an isosurface is the inverse image $f^{-1}(h)$ of a particular isovalue $h$. In practice $f$ is assumed to be
reconstructed from discrete samples by dividing the domain of the function into polyhedral
cells, often cubes or tetrahedra, with simple interpolants. Isosurfaces are then visualized by
generating an approximation of the mathematically precise surface, usually by constructing a
triangulated approximation in each cell [8].
Since many problems in graphics are decomposable into smaller problems for later
compositing, modern GPUs use deep parallel
pipelines. The input to the pipeline is a stream
of vertices arriving at the vertex processor, followed by rasterization to fragments for final
computations and compositing.
Vertex programs run in the vertex processor(s) and allow the programmer to control the
properties of each vertex individually. However, the parallel nature of the processor prohibits access to properties of other vertices, and
prevents the creation or deletion of any vertices. Although triangles of zero area can be
deleted later in the pipeline, the inability to
create new vertices is a major limitation in migrating isosurface extraction to the GPU.
Fragment programs run in the fragment
processor(s), and allow the programmer to control properties of individual pixels. As with the
vertex processor, access to other fragments is
prohibited, and the fragment's position in the
image cannot be changed.
Previous Work
3.1 Marching Cubes
Marching Cubes [8] exploits the decomposability of isosurface extraction. Each corner in a
cubic cell is classified as above (black) or below
(white) a given isovalue h. Since each cube has
[Figure: Marching Cubes cases, including 0: Empty, 1: Triangle, 2: Quad.]
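The corner classification described above can be sketched in a few lines of Python. Eight corners with two states each give 2^8 = 256 possible bit patterns, so the classification fits in one byte that can index a case table. The function name `cube_case_index`, the corner ordering, and the choice to treat a corner equal to the isovalue as "below" are assumptions made for this illustration, not details taken from the paper.

```python
from typing import Sequence


def cube_case_index(corner_values: Sequence[float], isovalue: float) -> int:
    """Pack the above/below classification of 8 corners into one byte."""
    if len(corner_values) != 8:
        raise ValueError("a cubic cell has exactly 8 corners")
    index = 0
    for bit, value in enumerate(corner_values):
        # "Black" corner: strictly above the isovalue (tie-breaking is assumed).
        if value > isovalue:
            index |= 1 << bit
    return index


# Hypothetical corner values; the first three lie above the isovalue 100.
corners = [240, 189, 150, 78, 43, 23, 18, 4]
print(cube_case_index(corners, 100))  # 7, i.e. 0b00000111
```

An index of 0 or 255 means the cell is entirely below or entirely above the isovalue, so no triangles are generated for it.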
3.2 Accelerating Algorithmically
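[Figures: Marching Cubes case diagrams (legend: black = corner value is above isovalue, white = corner value is below isovalue; labelled cases include 7, 10 to 14 and the complementary cases 0C to 7C) and a span space plot with axes minimum and maximum, with the current isovalue marked on each axis.]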
The span space is often implemented using a kd-tree [1, 7], which uses $O(N)$ storage
and $O(N \log N)$ construction time, with searches
in $O(\sqrt{N} + k)$ time. Other implementations
use the interval tree [3], with $O(N)$ storage,
$O(N \log N)$ construction time and a search
cost of $O(\log N + k)$ time, at the expense of
greater memory requirements than the kd-tree.
These methods discard cubes not intersected
by the surface, but still use Marching Cubes or
one of its derivatives to compute the triangles
to be rendered. Accordingly, improvements to
triangle extraction are broadly applicable.
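The query that these structures accelerate can be illustrated with a much simpler stand-in. In span space each cell becomes a point (minimum, maximum), and a cell is intersected by the isosurface exactly when minimum <= isovalue <= maximum. The sketch below sorts the points by minimum and uses a binary search followed by a linear filter on the maximum; it is neither a kd-tree nor an interval tree and does not achieve their search bounds. The names `build_span_space` and `active_cells` are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

SpanPoint = Tuple[float, float, int]  # (cell minimum, cell maximum, cell id)


def build_span_space(cells: Sequence[Sequence[float]]) -> Tuple[List[float], List[SpanPoint]]:
    """Map each cell to its (min, max, id) point and sort the points by minimum."""
    points = sorted((min(c), max(c), i) for i, c in enumerate(cells))
    minima = [p[0] for p in points]
    return minima, points


def active_cells(minima: List[float], points: List[SpanPoint], isovalue: float) -> List[int]:
    """Return the ids of cells whose value range contains the isovalue."""
    upper = bisect_right(minima, isovalue)  # cells with minimum <= isovalue
    return sorted(cid for _, cmax, cid in points[:upper] if cmax >= isovalue)


# Three hypothetical cells, each given by its 8 corner values.
cells = [list(range(0, 8)), list(range(10, 18)), list(range(5, 13))]
minima, points = build_span_space(cells)
print(active_cells(minima, points, 6))   # [0, 2]
print(active_cells(minima, points, 11))  # [1, 2]
```

Only the cells returned by the query are passed on to triangle extraction, which is why any speedup in extraction itself carries over to every span space variant.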
3.3 GPU Acceleration
GPU Isosurfaces
In this section, we describe our GPU-accelerated solution. We start by describing the
technique for caching cell topology, which makes it
possible to apply complex topology such as the
Marching Cubes cases on the GPU. We then
present a pre-computation step for optimizing
the case classification on the CPU.
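The division of labour between cached topology and a vertex program can be emulated on the CPU as a rough sketch. The assumption here is that the cached topology lists, for each triangle vertex, the cube edge it lies on, and that the per-vertex program only has to interpolate along that edge from the cell's corner values. The corner numbering, the single cached case, and the name `vertex_program` are all hypothetical; the actual shader and data layout are not shown in this excerpt.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]

# Assumed corner numbering: bit 0 selects x, bit 1 selects y, bit 2 selects z.
CORNER_OFFSETS = [(i & 1, (i >> 1) & 1, (i >> 2) & 1) for i in range(8)]

# Assumed cached topology for one case: only corner 0 is above the isovalue,
# so a single triangle cuts the three edges that meet at corner 0.
CACHED_TRIANGLE_EDGES = [(0, 1), (0, 2), (0, 4)]


def vertex_program(origin: Vec3, corner_values: Sequence[float],
                   edge: Tuple[int, int], isovalue: float) -> Vec3:
    """Place one vertex on a cube edge by linear interpolation."""
    a, b = edge
    fa, fb = corner_values[a], corner_values[b]
    t = 0.5 if fa == fb else (isovalue - fa) / (fb - fa)
    t = min(1.0, max(0.0, t))  # keep the vertex on the edge
    pa, pb = CORNER_OFFSETS[a], CORNER_OFFSETS[b]
    return tuple(o + p + t * (q - p) for o, p, q in zip(origin, pa, pb))


cell_origin = (2.0, 3.0, 4.0)
values = [10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
triangle: List[Vec3] = [vertex_program(cell_origin, values, e, 5.0)
                        for e in CACHED_TRIANGLE_EDGES]
print(triangle)  # [(2.5, 3.0, 4.0), (2.0, 3.5, 4.0), (2.0, 3.0, 4.5)]
```

Each call depends only on its own edge and the corner values of its cell, which matches the restriction that a vertex program cannot read other vertices or create new ones.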
4.1
4.2
[Figure: a cell with corner values 240, 189, 150, 78, 43, 23, 18 and 4 compared against the isovalue, showing the case in use over each range: case 0, case 1, case 3, case 5, case 8, case 5C, case 2C, case 1C, case 0C.]
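The pre-classification of case ranges can be sketched as follows, under the assumption that a cell's case changes only when the isovalue passes one of its corner values. The sorted distinct corner values then split the isovalue axis into intervals, each with a fixed case, and a runtime lookup needs only a binary search rather than the original samples. The sketch stores raw 8-bit corner masks instead of canonical Marching Cubes case numbers, and the names `precompute_case_ranges` and `lookup_case` are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def classify(corner_values: Sequence[float], isovalue: float) -> int:
    """Reference classification: bit i is set when corner i is above the isovalue."""
    return sum(1 << bit for bit, v in enumerate(corner_values) if v > isovalue)


def precompute_case_ranges(corner_values: Sequence[float]) -> Tuple[List[float], List[int]]:
    """Return breakpoints and one corner mask per isovalue interval."""
    breakpoints = sorted(set(corner_values))
    # cases[k] holds for isovalues h with breakpoints[k-1] <= h < breakpoints[k].
    cases = [sum(1 << bit for bit, v in enumerate(corner_values) if v >= b)
             for b in breakpoints]
    cases.append(0)  # at or above the largest corner value nothing is above h
    return breakpoints, cases


def lookup_case(breakpoints: List[float], cases: List[int], isovalue: float) -> int:
    """Find the stored case for an isovalue without touching the corner values."""
    return cases[bisect_right(breakpoints, isovalue)]


# Hypothetical corner ordering for the eight values 240 ... 4.
corners = [240, 23, 150, 43, 78, 4, 189, 18]
breakpoints, cases = precompute_case_ranges(corners)
print(lookup_case(breakpoints, cases, 100) == classify(corners, 100))  # True
```

The trade-off is extra storage per cell (up to nine stored cases for eight distinct corner values) in exchange for removing the classification work at runtime.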
Results
Table 1
                        Fuel        Hydrogen atom  Engine       Aneurism     Skull
Size                    64x64x64    128x128x128    256x256x128  256x256x256  256x256x256
Isovalue                10          20             155          100          80
CPU Marching Cubes      58          9.4            2.3          1.3          1.0
CPU Kd-Tree             117 (2.0)   23 (2.5)       5.3 (2.3)    5.1 (3.8)    1.7 (1.7)
CPU Case Kd-Tree        125 (2.2)   24 (2.5)       5.8 (2.6)    6.2 (4.6)    2.1 (2.1)
CPU Interval Tree       122 (2.1)   27 (2.9)       6.2 (2.7)    6.3 (4.7)    -
CPU Case Interval Tree  130 (2.2)   32 (3.4)       -            6.7 (5.0)
GPU Marching Cubes      91 (1.6)    12 (1.3)       2.8 (1.2)    1.6 (1.2)    1.5 (1.5)
GPU Kd-Tree             89 (1.5)    22 (2.4)       5.3 (2.3)    5.8 (4.4)    2 (2.0)
GPU Case Kd-Tree        89 (1.5)    22 (2.4)       5.3 (2.4)    5.9 (4.4)    2.1 (2.1)
GPU Interval Tree       91 (1.6)    23 (2.5)       5.5 (2.4)    6.0 (4.5)    -
GPU Case Interval Tree  97 (1.7)    25 (2.7)       -            6.1 (4.6)
Conclusions
We have described an approach for accelerating isosurface extraction using graphics hardware, by storing the Marching Cubes cases on
the GPU and interpolating the vertices using a
vertex program. We have also extended the use
of the kd-tree and the interval tree to contain
pre-computed cases. This transfers the case
classification to a pre-processing stage on the
CPU, and completely removes the need for the
CPU to access the original dataset.
Our results demonstrate that the principal
bottleneck in isosurface extraction is in the
CPU rather than the GPU, and that with judicious algorithmic and pipeline optimization,
significant acceleration can be achieved for any
isosurface extraction kernel. We note that the
limiting factor on performance appeared to be
the vertex texture performance, and in particular the lack of efficient 3D vertex textures. We
expect that future hardware will improve on
this situation.
Future Work
                        Fuel        Hydrogen atom  Engine       Aneurism     Skull
Size                    64x64x64    128x128x128    256x256x128  256x256x256  256x256x256
Isovalue                10          20             155          100          80
CPU Marching Cubes      65          9.4            2.5          1.4          1.1
CPU Kd-Tree             146 (2.3)   22 (2.4)       6.9 (2.8)    6.4 (4.6)    2.1 (1.9)
CPU Case Kd-Tree        160 (2.5)   25 (2.7)       8 (3.2)      8.1 (5.8)    2.4 (2.4)
CPU Interval Tree       157 (2.4)   27 (2.9)       8.3 (3.3)    8.1 (5.8)    -
CPU Case Interval Tree  168 (2.6)   31 (3.3)       -            -
GPU Marching Cubes      91 (1.4)    12.4 (1.3)     3.1 (1.2)
GPU Kd-Tree             448 (6.9)   90 (9.6)       24 (9.6)
GPU Case Kd-Tree        450 (6.9)   121 (12.9)     28 (11.2)
GPU Interval Tree       459 (7.1)   112 (12.0)     28 (11.2)
GPU Case Interval Tree  479 (7.4)   134 (14.3)     -
Table 2: Framerate (frames per second) with pre-computed normals. Numbers in parentheses show ratio between the respective acceleration technique and the CPU implementation of
brute force Marching Cubes. Missing results indicate that our data structures would not fit in main
memory, or that the pre-computed normals would not fit in VRAM.
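As a small worked example of how the parenthesised ratios are read, the sketch below divides an accelerated frame rate by the CPU brute force Marching Cubes frame rate for the same data set, using values from Table 2. The helper name `speedup` is hypothetical. Recomputing from the rounded frame rates printed in the table can differ from the listed ratio in the last digit for some entries (146 / 65 rounds to 2.2, while the table lists 2.3).

```python
def speedup(accelerated_fps: float, baseline_fps: float) -> float:
    """Ratio of an accelerated frame rate to the CPU Marching Cubes baseline."""
    return round(accelerated_fps / baseline_fps, 1)


# Fuel: GPU Case Interval Tree (479 fps) against CPU Marching Cubes (65 fps).
print(speedup(479, 65))   # 7.4
# Hydrogen atom: GPU Case Interval Tree (134 fps) against CPU Marching Cubes (9.4 fps).
print(speedup(134, 9.4))  # 14.3
```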
References
[1] Jon Louis Bentley. Multidimensional binary
search trees used for associative searching.
Commun. ACM, 18(9):509-517, 1975.
[2] Hamish Carr, Torsten Möller, and Jack
Snoeyink. Artifacts caused by simplicial subdivision. IEEE Transactions on Visualization and Computer Graphics, 12(2):231-242,
March 2006.
[3] Paolo Cignoni, Paola Marino, Claudio Montani, Enrico Puppo, and Roberto Scopigno.
Speeding up isosurface extraction using interval trees. IEEE Transactions on Visualization
and Computer Graphics, 3(2):158-170, 1997.
[4] Martin J. Durst. Re: Additional reference
to marching cubes. SIGGRAPH Comput.
Graph., 22(5):243, 1988.
[5] Frank Goetz, Theodor Junklewitz, and Gitta
Domik. Real-time marching cubes on the vertex shader. In Eurographics 2005 Short Presentations. Eurographics Association, 2005.
[6] Thomas Klein, Simon Stegmaier, and Thomas
Ertl. Hardware-accelerated reconstruction of
polygonal isosurface representations on unstructured grids. In Computer Graphics
and Applications, 12th Pacific Conference on
(PG'04), pages 186-195, 2004.
[7] Yarden Livnat, Han-Wei Shen, and Christopher R. Johnson.
A near optimal isosurface extraction algorithm using the span