Speeding up pixel_to_pixel and more generally WCS transformations in astropy.wcs #18113

Open
astrofrog opened this issue May 9, 2025 · 1 comment

Comments

@astrofrog
Member

astrofrog commented May 9, 2025

The pixel_to_pixel function in astropy.wcs is used heavily in the reproject package, and in some cases is one of the main performance bottlenecks. This issue is for discussions relating to how we can make it (and more generally WCS transformations) faster. I did some initial exploration/benchmarking:

In [1]: import numpy as np
   ...: from astropy.wcs.utils import pixel_to_pixel
   ...: from astropy.wcs import WCS
   ...: 
   ...: wcs1 = WCS(naxis=2)
   ...: wcs1.wcs.ctype = 'RA---TAN', 'DEC--TAN'
   ...: wcs1.wcs.crpix = 1, 1
   ...: wcs1.wcs.crval = 10., 20.
   ...: wcs1.wcs.cdelt = -1e-3, 1e-3
   ...: wcs1.wcs.cunit = 'deg', 'deg'
   ...: wcs1.wcs.set()
   ...: 
   ...: N = 10_000_000
   ...: xp = np.random.uniform(0, 10, N)
   ...: yp = np.random.uniform(0, 10, N)

In [4]: %timeit  pixel_to_pixel(wcs1, wcs2, xp, yp)
2.39 s ± 94.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

So in this case the transformation takes around 2.4 s. This is not much slower than doing the transformations manually without going through the high-level objects, since in this specific case the world coordinate systems match:

%timeit wcs2.wcs_world2pix(*wcs1.wcs_pix2world(xp, yp, 0), 0)
2.13 s ± 101 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

If we look at just the pixel-to-world part, it takes roughly half the time, as expected:

In [2]: %timeit wcs1.wcs_pix2world(xp, yp, 0)
1.14 s ± 84.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Let's take a simpler projection, CAR:

In [3]:  wcs2 = WCS(naxis=2)
   ...:  wcs2.wcs.ctype = 'RA---CAR', 'DEC--CAR'
   ...:  wcs2.wcs.crpix = 1, 1
   ...:  wcs2.wcs.crval = 10.003, 20.004
   ...:  wcs2.wcs.cdelt = -1e-3, 1e-3
   ...:  wcs2.wcs.cunit = 'deg', 'deg'
   ...:  wcs2.wcs.set()

We can implement our own version of this transformation:

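def manual_pix_to_world_car(x_pix, y_pix, crpix, crval, cdelt, pc):

    # Convert 0-based pixel coordinates to the 1-based FITS convention
    x_pix = x_pix + 1
    y_pix = y_pix + 1

    # Offsets from the reference pixel
    dx = x_pix - crpix[0]
    dy = y_pix - crpix[1]

    # Apply the PC matrix
    x_intermediate = pc[0, 0] * dx + pc[0, 1] * dy
    y_intermediate = pc[1, 0] * dx + pc[1, 1] * dy

    # Scale by CDELT to get intermediate world coordinates
    x_intermediate *= cdelt[0]
    y_intermediate *= cdelt[1]

    # Offset from the reference value
    ra = crval[0] + x_intermediate
    dec = crval[1] + y_intermediate

    return ra, dec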

Let's compare the performance of wcs_pix2world and manual_pix_to_world_car:

In [6]: %timeit manual_pix_to_world_car(xp, yp, wcs2.wcs.crpix, wcs2.wcs.crval, wcs2.wcs.cdelt, wcs2.wcs.get_pc())
136 ms ± 1.39 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [7]: %timeit wcs2.wcs_pix2world(xp, yp, 0)
978 ms ± 29.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The wcs_pix2world call is a little faster than before because the projection is simpler, but it is still almost a second. The manual implementation above is around 7x faster and gives the same results (although there are some small differences at higher precision which would need to be understood).
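
One way to start understanding those small differences is to measure them in angular units. The following is a minimal sketch, not part of astropy: the helper name `max_abs_diff_arcsec` is hypothetical, and the arrays are synthetic stand-ins rather than real outputs. It ignores RA wrap-around at 0/360 degrees and the cos(dec) factor, so it is only a rough diagnostic.

```python
import numpy as np


def max_abs_diff_arcsec(world_a, world_b):
    """Largest per-axis absolute difference, in arcseconds.

    Both arguments are (lon, lat) tuples of arrays in degrees.
    """
    return tuple(
        float(np.max(np.abs(np.asarray(a) - np.asarray(b)))) * 3600.0
        for a, b in zip(world_a, world_b)
    )


# Synthetic stand-ins for two sets of results (degrees); the second set
# is shifted by 1e-6 deg in RA and 2e-6 deg in Dec.
ra = np.array([10.0, 10.001, 10.002])
dec = np.array([20.0, 20.001, 20.002])
d_ra, d_dec = max_abs_diff_arcsec((ra, dec), (ra + 1e-6, dec - 2e-6))
```

In the session above, the same helper could be applied to the outputs of manual_pix_to_world_car(...) and wcs2.wcs_pix2world(xp, yp, 0).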

But here's the fun thing: we can try to use e.g. jax to speed things up further (using e.g. the GPU). Since the manual function above was written without any explicit numpy calls, we can easily jit it:

In [12]: from jax import jit

In [13]: jit_pix_to_world_car = jit(manual_pix_to_world_car)

In [14]: %time jit_pix_to_world_car(xp, yp, wcs2.wcs.crpix, wcs2.wcs.crval, wcs2.wcs.cdelt, wcs2.wcs.get_pc())
CPU times: user 125 ms, sys: 141 ms, total: 266 ms
Wall time: 270 ms
...

In [15]: %time jit_pix_to_world_car(xp, yp, wcs2.wcs.crpix, wcs2.wcs.crval, wcs2.wcs.cdelt, wcs2.wcs.get_pc())
CPU times: user 28.1 ms, sys: 8.02 ms, total: 36.1 ms
Wall time: 29.6 ms
...

In [17]: %timeit jit_pix_to_world_car(xp, yp, wcs2.wcs.crpix, wcs2.wcs.crval, wcs2.wcs.cdelt, wcs2.wcs.get_pc())
24.4 ms ± 188 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)

The first call takes 270 ms because it includes the compilation, but subsequent calls take less than 30 ms: an improvement of a factor of 4-5x over the plain NumPy function, and over 30x faster than the original wcs_pix2world call.
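
For reference, the ratios can be recomputed from the timings reported above. This small sketch uses the %timeit figure of 24.4 ms for the jitted function rather than the rounded 30 ms, so it gives somewhat larger ratios than the conservative ones quoted in the text; the dictionary keys are just labels chosen here.

```python
# Timings reported in the session above, in milliseconds.
timings_ms = {
    "wcs_pix2world": 978.0,   # WCSLIB-backed call, CAR projection
    "numpy_manual": 136.0,    # manual_pix_to_world_car with NumPy arrays
    "jit_timeit": 24.4,       # jitted version, steady state (%timeit)
}


def speedup(slow, fast):
    """Ratio of two reported timings (how many times faster `fast` is)."""
    return timings_ms[slow] / timings_ms[fast]


numpy_vs_wcslib = speedup("wcs_pix2world", "numpy_manual")
jit_vs_numpy = speedup("numpy_manual", "jit_timeit")
jit_vs_wcslib = speedup("wcs_pix2world", "jit_timeit")

print(f"{numpy_vs_wcslib:.1f}x, {jit_vs_numpy:.1f}x, {jit_vs_wcslib:.1f}x")
# prints "7.2x, 5.6x, 40.1x"
```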

The point of these experiments is not to suggest re-writing all of WCSLIB, but rather that we might want to consider writing optimized versions of certain transformations for common projections, and this could have a big impact on performance in packages such as reproject or other packages for which WCS transformations may be a bottleneck.

We could then still use WCSLIB to parse, validate, etc. the WCSes, but one could imagine that in e.g. wcs_pix2world we would have some special cases that take a more optimal route.
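
A rough sketch of what such special-casing could look like is shown below. Everything here is hypothetical: `FAST_PATHS`, `pix_to_world`, and the `general_path` callback are illustrative names, not astropy API, and a plain `SimpleNamespace` stands in for the parsed WCS parameters. Whether the linear shortcut is accurate enough for a given WCS is left open, as noted above.

```python
from types import SimpleNamespace

import numpy as np


def linear_pix_to_world(p, x_pix, y_pix):
    """Linear shortcut mirroring manual_pix_to_world_car (0-based pixels)."""
    dx = x_pix + 1 - p.crpix[0]
    dy = y_pix + 1 - p.crpix[1]
    lon = p.crval[0] + p.cdelt[0] * (p.pc[0, 0] * dx + p.pc[0, 1] * dy)
    lat = p.crval[1] + p.cdelt[1] * (p.pc[1, 0] * dx + p.pc[1, 1] * dy)
    return lon, lat


# Hypothetical registry: projection code -> optimized implementation.
FAST_PATHS = {"CAR": linear_pix_to_world}


def pix_to_world(p, x_pix, y_pix, general_path):
    """Use an optimized route when one is registered, else fall back."""
    code = p.ctype[0][-3:]  # e.g. 'RA---CAR' -> 'CAR'
    fast = FAST_PATHS.get(code)
    if fast is not None:
        return fast(p, x_pix, y_pix)
    # In astropy this would be the existing WCSLIB-backed call.
    return general_path(x_pix, y_pix)


car = SimpleNamespace(
    ctype=("RA---CAR", "DEC--CAR"),
    crpix=(1.0, 1.0),
    crval=(10.003, 20.004),
    cdelt=(-1e-3, 1e-3),
    pc=np.eye(2),
)
x = np.array([0.0, 1.0])
y = np.array([0.0, 2.0])
lon, lat = pix_to_world(car, x, y, general_path=None)
```

Keeping the general path as the default means that only projections with a registered and validated shortcut would change behaviour.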

The related reproject issue is astropy/reproject#489, and I'll also ping @crhea93 since this issue was inspired by the approach in their https://github.com/DragonflyTelescope/dfreproject package

@astrofrog astrofrog changed the title from "Speeding up pixel_to_pixel in astropy.wcs" to "Speeding up pixel_to_pixel and more generally WCS transformations in astropy.wcs" on May 9, 2025
@astrofrog
Member Author

Another related thing to think about is that the Montage package supports plane-to-plane transformations for certain projections (e.g. TAN), see more here. It might also be worth looking into implementing something like this in pixel_to_pixel.
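
To illustrate the general idea (this is a sketch of the underlying geometry, not a description of how Montage implements it): two gnomonic (TAN) planes are related by a projective transformation, so coordinates can be mapped from one plane to the other with a 3x3 matrix and a division, without per-point trigonometry. The function names below are hypothetical, plane coordinates are offsets east/north of the tangent point in radians, and the pixel-to-plane linear step (CRPIX, PC, CDELT) is omitted.

```python
import numpy as np


def tangent_basis(ra0_deg, dec0_deg):
    """Rows: unit vectors pointing east, north, and at the tangent point."""
    ra0, dec0 = np.radians([ra0_deg, dec0_deg])
    east = [-np.sin(ra0), np.cos(ra0), 0.0]
    north = [-np.sin(dec0) * np.cos(ra0), -np.sin(dec0) * np.sin(ra0), np.cos(dec0)]
    center = [np.cos(dec0) * np.cos(ra0), np.cos(dec0) * np.sin(ra0), np.sin(dec0)]
    return np.array([east, north, center])


def plane_to_plane_tan(xi1, eta1, center1, center2):
    """Map TAN plane coordinates (radians) from plane 1 directly to plane 2."""
    h = tangent_basis(*center2) @ tangent_basis(*center1).T
    u = h[0, 0] * xi1 + h[0, 1] * eta1 + h[0, 2]
    v = h[1, 0] * xi1 + h[1, 1] * eta1 + h[1, 2]
    w = h[2, 0] * xi1 + h[2, 1] * eta1 + h[2, 2]
    return u / w, v / w


def deproject_tan(xi, eta, center):
    """Reference route, step 1: plane coordinates -> (ra, dec) in degrees."""
    b = tangent_basis(*center)
    d = b[0][:, None] * xi + b[1][:, None] * eta + b[2][:, None]
    ra = np.degrees(np.arctan2(d[1], d[0]))
    dec = np.degrees(np.arctan2(d[2], np.hypot(d[0], d[1])))
    return ra, dec


def project_tan(ra_deg, dec_deg, center):
    """Reference route, step 2: (ra, dec) -> plane coordinates, using trig."""
    ra, dec = np.radians(ra_deg), np.radians(dec_deg)
    ra0, dec0 = np.radians(center)
    dra = ra - ra0
    denom = np.sin(dec0) * np.sin(dec) + np.cos(dec0) * np.cos(dec) * np.cos(dra)
    xi = np.cos(dec) * np.sin(dra) / denom
    eta = (np.cos(dec0) * np.sin(dec) - np.sin(dec0) * np.cos(dec) * np.cos(dra)) / denom
    return xi, eta


# Tangent points taken from the CRVAL values used earlier in this issue.
c1, c2 = (10.0, 20.0), (10.003, 20.004)
xi1 = np.array([0.0, 1e-3, -2e-3])
eta1 = np.array([0.0, -1e-3, 3e-3])

xi2, eta2 = plane_to_plane_tan(xi1, eta1, c1, c2)
xi_ref, eta_ref = project_tan(*deproject_tan(xi1, eta1, c1), c2)
```

The direct route costs a handful of multiplications and one division per point, whereas the reference route goes through the sky coordinates and evaluates several trigonometric functions per point.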
