
BUG: Fix copy performance regression for structured arrays #29270


Open

bonpyt wants to merge 1 commit into main

Conversation

@bonpyt commented Jun 25, 2025

Fixes #28194

This adds a special case to dtype_transfer.c::get_fields_transfer_function to shortcut the copying of structured arrays in certain cases.
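For context, the regression tracked in #28194 shows up on plain copies of structured arrays. A minimal illustration of the kind of copy the shortcut targets (the field layout below is made up for illustration; it is not the benchmark from the issue):

import numpy as np

# A "simple" structured dtype: no object references, no padding holes, no nesting.
# This is the kind of layout the shortcut is meant to copy in bulk rather than
# building per-field transfer functions.
dt = np.dtype([('x', 'f8'), ('y', 'f8'), ('z', 'i4'), ('w', 'i4')])
arr = np.zeros(1_000_000, dtype=dt)

copied = arr.copy()   # the copy path that regressed for structured dtypes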

@bonpyt (Author) commented Jun 26, 2025

So CI fails because of this:

dtype = np.dtype([('a', {'names': ['aa', 'ab'], 'formats': ['f', 'f'], 'offsets': [0, 4]}, (2, 3))])
sparse_dtype = np.dtype([('a', {'names': ['ab'], 'formats': ['f'], 'offsets': [4]}, (2, 3))])

def test_sparse_field_assignment(self):
          arr = np.zeros(3, self.dtype)
          sparse_arr = arr.view(self.sparse_dtype)
      
          sparse_arr[...] = np.finfo(np.float32).max
          # dtype is reduced when accessing the field, so shape is (3, 2, 3):
  >       assert_array_equal(arr["a"]["aa"], np.zeros((3, 2, 3)))
  E       AssertionError: 
  E       Arrays are not equal
  E       
  E       Mismatched elements: 15 / 18 (83.3%)
  E       Max absolute difference among violations: 1.36988107e-17
  E       Max relative difference among violations: inf
  E        ACTUAL: array([[[-1.369881e-17,  0.000000e+00,  2.802597e-45],
  E               [ 2.242078e-44,  1.121039e-44,  5.605194e-45]],
  E       ...
  E        DESIRED: array([[[0., 0., 0.],
  E               [0., 0., 0.]],
  E       ...

It seems to me that the shortcut I added is triggering here:

sparse_arr[...] = np.finfo(np.float32).max

which causes the test to fail. In my view, the shortcut shouldn't be used for this copy: I would expect is_dtype_struct_simple_unaligned_layout(dst_dtype) to return false for sparse_arr's dtype, but it returns true, so the shortcut is applied. Could a maintainer advise on how to address this? Thanks.
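To make the failure mode concrete: the sparse dtype only "owns" the bytes at offset 4 of each 8-byte struct element, while the view shares its buffer with arr, so any copy that treats the destination as one flat block of itemsize bytes would also overwrite the 'aa' slots it should skip, which is presumably what the garbage values in ACTUAL above are. A sketch of the layout, using the dtypes from the test:

import numpy as np

dtype = np.dtype([('a', {'names': ['aa', 'ab'], 'formats': ['f', 'f'],
                         'offsets': [0, 4]}, (2, 3))])
sparse_dtype = np.dtype([('a', {'names': ['ab'], 'formats': ['f'],
                                'offsets': [4]}, (2, 3))])

arr = np.zeros(3, dtype)
sparse_arr = arr.view(sparse_dtype)   # same buffer, same 48-byte itemsize

# Only every second float32 belongs to the sparse view; the 'aa' slots at
# byte offsets 0, 8, 16, ... of each element must stay untouched.
sparse_arr['a']['ab'] = 1.0           # writes offsets 4, 12, 20, ... only
assert np.all(arr['a']['aa'] == 0.0)  # 'aa' is still zero

# A full-itemsize copy through the sparse view ("equivalent layout, so just
# copy the whole element") would also write the 'aa' bytes, which is what
# the CI test caught.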

@seberg (Member) commented Jun 28, 2025

Yeah, but it seems is_dtype_struct_simple_unaligned_layout isn't even used anywhere anymore.
The problem is nesting: this dtype has a nested structured dtype, and the check doesn't look into that. We could probably fix it there. I do wonder whether pushing the logic lower might make sense (unless the "is view" logic aligns, but I don't think it does :/).
(I suspect it doesn't; pushing it up might help a bit instead, but that is the flags idea.)

@bonpyt (Author) commented Jun 30, 2025

What do you mean by "pushing the logic lower"?

@seberg (Member) commented Jun 30, 2025

What do you mean by "pushing the logic lower"?

Sorry, never mind: you are already as low as it gets. I thought we might be able to include the logic in what follows, but it's effectively at the same place.
We should possibly still move it, because it avoids doing a lot of the same work twice, but it won't help with the underlying issue unless PyArray_GetDTypeTransferFunction informs us that this is a trivial copy.

Signaling that something is trivially copyable would probably be the most useful thing to do, but it may require doing so in general in casts, or maybe as a dtype flag (I am not 100% sure a dtype flag covers everything, but probably; the viewable flag might help, but unfortunately I don't think it does).
So my guess is that the best way may be a new "trivially copyable" flag on the dtype instance (if two dtypes are equivalent and trivially copyable, we can assume a memcpy is OK; the opposite, "not trivially copyable", may actually be more practical, because in practice it is just structs with holes, which could even contain references).

As a quicker fix that doesn't add a flag, I honestly don't have much of an idea beyond recursing, or just giving up when you find an embedded structured dtype (and keeping the assumption that all others without references are fine).
If that is good enough for you, I might lean towards ignoring embedded structures. It's just simpler/quicker, and if you have a case where the embedded structure is large, that copy will use the optimization anyway.

(Although, I would be happy to investigate a new flag, since that seems useful in a lot more places to do similar optimizations.)
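A rough Python-level sketch of what the "trivially copyable" rule could mean; the function name and the exact rules are hypothetical, and the real version would be a flag computed once on the C dtype instance as described above. The quicker fix discussed above would simply replace the recursion into nested structs with giving up (return False):

import numpy as np

def is_trivially_copyable(dt: np.dtype) -> bool:
    """Hypothetical check: can equivalent instances of `dt` be copied with one memcpy?"""
    if dt.hasobject:                 # object references always need special handling
        return False
    if dt.subdtype is not None:      # subarray dtype: decide based on the element base
        return is_trivially_copyable(dt.subdtype[0])
    if dt.names is None:             # plain scalar dtype
        return True
    if not all(is_trivially_copyable(dt.fields[name][0]) for name in dt.names):
        return False
    covered = sum(dt.fields[name][0].itemsize for name in dt.names)
    # A hole (covered < itemsize) means a full-itemsize copy would touch bytes
    # the dtype does not own, as with the sparse view dtype from the failing test;
    # overlapping fields are rejected conservatively as well.
    return covered == dt.itemsize

# The dtypes from the failing test:
dtype = np.dtype([('a', {'names': ['aa', 'ab'], 'formats': ['f', 'f'],
                         'offsets': [0, 4]}, (2, 3))])
sparse_dtype = np.dtype([('a', {'names': ['ab'], 'formats': ['f'],
                                'offsets': [4]}, (2, 3))])

assert is_trivially_copyable(dtype)             # dense layout, no holes
assert not is_trivially_copyable(sparse_dtype)  # hole at offset 0 of the nested struct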

@bonpyt (Author) commented Jul 2, 2025

Ignoring embedded structures sounds good to me; I'll try it out. I agree a flag might be the better long-term solution :)

@seberg (Member) commented Jul 2, 2025

Thanks. Could you see how annoying it is to fold the logic into the branch afterwards? Otherwise PyArray_EquivTypes and the new check basically add twice the work that we'll be doing in that branch again anyway.

(If it's annoying, I think I can accept not doing it. I hid the fields in NumPy 2 in the hope that we could find a better way to store this information than extracting it from tuples again and again anyway...)

@bonpyt (Author) commented Jul 3, 2025

So I've played around with it and couldn't quite figure out how to access the "underlying" field type information. Basically, in the example from the test:

dtype = np.dtype([('a', {'names': ['aa', 'ab'], 'formats': ['f', 'f'], 'offsets': [0, 4]}, (2, 3))])
sparse_dtype = np.dtype([('a', {'names': ['ab'], 'formats': ['f'], 'offsets': [4]}, (2, 3))])

I could get the descriptor of the field a, which is of type "void", but helpers like PyDataType_FIELDS didn't return anything when called on it (i.e. sparse_dtype['a'].fields is None / PyDataType_HASFIELDS(fld_type) == 0). So I'm not sure how to get the information that a has some subfields and what their properties (like offsets) are.

I could just check that the field dtype is not void; I'd go with that if there is no better way, but it seems too broad a check.

@seberg (Member) commented Jul 3, 2025

didn't return anything when called on a (i.e. sparse_dtype['a'].fields is None / PyDataType_HASFIELDS(fld_type) == 0). So I'm not sure how to get the information that a has some subfields and what their properties (like offsets) are.

That dtype is a subarray dtype, which should be OK; you could check the base, I guess (of course, that could be another subarray dtype).

I dunno, my gut feeling is that if you don't take the trivial route here, adding the flag is probably actually the easier solution (and it could be a negative flag, on the assumption that such dtypes are rare).
It does require finding the 1-2 places where structured dtypes get created in odd ways and need to propagate the flags, but that's probably about it.
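In Python terms, the base referred to here is reachable via the subarray wrapper's .base (or .subdtype), which is where the nested fields that PyDataType_FIELDS could not see on the field dtype itself actually live; the C code would go through the corresponding subarray base descriptor, and as noted above that base could itself be another subarray dtype, so a robust check has to recurse:

import numpy as np

sparse_dtype = np.dtype([('a', {'names': ['ab'], 'formats': ['f'],
                                'offsets': [4]}, (2, 3))])

fld = sparse_dtype['a']        # the subarray dtype wrapping the nested struct
print(fld.fields is None)      # True: the subarray wrapper itself reports no fields
print(fld.shape)               # (2, 3): the subarray shape
print(fld.base.names)          # ('ab',): the nested struct is on .base
print(fld.base.fields['ab'])   # (dtype('float32'), 4): subfield dtype and byte offset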

Status: Pending authors' response

Successfully merging this pull request may close these issues:

BUG: Significant copy performance regression for structured arrays







