
BUG: Fix copy performance regression for structured arrays #29270


Open

bonpyt wants to merge 1 commit into main

Conversation

@bonpyt commented Jun 25, 2025

Fixes #28194

This adds a special case to dtype_transfer.c::get_fields_transfer_function to shortcut the copying of structured arrays in certain cases.
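For context, the regression tracked in #28194 shows up on plain copies of structured arrays. A minimal illustration of the kind of copy the shortcut targets (the field layout below is made up for illustration; it is not the benchmark from the issue):

import numpy as np

# A "simple" structured dtype: no object references, no padding holes, no nesting.
# This is the kind of layout the shortcut is meant to copy in bulk rather than
# building per-field transfer functions.
dt = np.dtype([('x', 'f8'), ('y', 'f8'), ('z', 'i4'), ('w', 'i4')])
arr = np.zeros(1_000_000, dtype=dt)

copied = arr.copy()   # the copy path that regressed for structured dtypes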

@bonpyt (Author) commented Jun 26, 2025

So CI fails because of this:

dtype = np.dtype([('a', {'names': ['aa', 'ab'], 'formats': ['f', 'f'], 'offsets': [0, 4]}, (2, 3))])
sparse_dtype = np.dtype([('a', {'names': ['ab'], 'formats': ['f'], 'offsets': [4]}, (2, 3))])

def test_sparse_field_assignment(self):
          arr = np.zeros(3, self.dtype)
          sparse_arr = arr.view(self.sparse_dtype)
      
          sparse_arr[...] = np.finfo(np.float32).max
          # dtype is reduced when accessing the field, so shape is (3, 2, 3):
  >       assert_array_equal(arr["a"]["aa"], np.zeros((3, 2, 3)))
  E       AssertionError: 
  E       Arrays are not equal
  E       
  E       Mismatched elements: 15 / 18 (83.3%)
  E       Max absolute difference among violations: 1.36988107e-17
  E       Max relative difference among violations: inf
  E        ACTUAL: array([[[-1.369881e-17,  0.000000e+00,  2.802597e-45],
  E               [ 2.242078e-44,  1.121039e-44,  5.605194e-45]],
  E       ...
  E        DESIRED: array([[[0., 0., 0.],
  E               [0., 0., 0.]],
  E       ...

It seems to me that the shortcut I added is triggering here:

sparse_arr[...] = np.finfo(np.float32).max

which causes the test to fail. In my view, the shortcut shouldn't be used for this copy: I would expect is_dtype_struct_simple_unaligned_layout(dst_dtype) to return false for sparse_arr's dtype, but it returns true, so the shortcut is applied. Could a maintainer advise on how to address this? Thanks.
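To make the failure mode concrete: the sparse dtype only "owns" the bytes at offset 4 of each 8-byte struct element, while the view shares its buffer with arr, so any copy that treats the destination as one flat block of itemsize bytes would also overwrite the 'aa' slots it should skip, which is presumably what the garbage values in ACTUAL above are. A sketch of the layout, using the dtypes from the test:

import numpy as np

dtype = np.dtype([('a', {'names': ['aa', 'ab'], 'formats': ['f', 'f'],
                         'offsets': [0, 4]}, (2, 3))])
sparse_dtype = np.dtype([('a', {'names': ['ab'], 'formats': ['f'],
                                'offsets': [4]}, (2, 3))])

arr = np.zeros(3, dtype)
sparse_arr = arr.view(sparse_dtype)   # same buffer, same 48-byte itemsize

# Only every second float32 belongs to the sparse view; the 'aa' slots at
# byte offsets 0, 8, 16, ... of each element must stay untouched.
sparse_arr['a']['ab'] = 1.0           # writes offsets 4, 12, 20, ... only
assert np.all(arr['a']['aa'] == 0.0)  # 'aa' is still zero

# A full-itemsize copy through the sparse view ("equivalent layout, so just
# copy the whole element") would also write the 'aa' bytes, which is what
# the CI test caught.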

@seberg (Member) commented Jun 28, 2025

Yeah, but it seems is_dtype_struct_simple_unaligned_layout isn't even used anywhere anymore.
The problem is nesting: this dtype has a nested structured dtype, and the check doesn't look into that. We could probably fix it there. I do wonder whether pushing the logic lower might make sense (unless the "is view" logic aligns, but I don't think it does :/).
(I suspect it doesn't; pushing it up might help a bit instead, but that is the flags idea.)

@bonpyt (Author) commented Jun 30, 2025

What do you mean by "pushing the logic lower"?

@seberg (Member) commented Jun 30, 2025

What do you mean by "pushing the logic lower"?

Sorry, never mind: you are already as low as it gets. I thought we might be able to include the logic in what follows, but it's effectively at the same place.
We should possibly still move it, because it avoids doing a lot of the same work twice, but it won't help with the underlying issue unless PyArray_GetDTypeTransferFunction informs us that this is a trivial copy.

Signaling that something is trivially copyable would probably be the most useful thing to do, but it may require doing so in general in casts, or maybe as a dtype flag (I am not 100% sure a dtype flag covers everything, but probably; the viewable flag might help, but unfortunately I don't think it does).
So my guess is that the best way may be a new "trivially copyable" flag on the dtype instance (if two dtypes are equivalent and trivially copyable, we can assume a memcpy is OK; the opposite, "not trivially copyable", may actually be more practical, because in practice it is just structs with holes, which could even contain references).

As a quicker fix that doesn't add a flag, I honestly don't have much of an idea beyond recursing, or just giving up when you find an embedded structured dtype (and keeping the assumption that all others without references are fine).
If that is good enough for you, I might lean towards ignoring embedded structures. It's just simpler/quicker, and if you have a case where the embedded structure is large, that copy will use the optimization anyway.

(Although, I would be happy to investigate a new flag, since that seems useful in a lot more places to do similar optimizations.)
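A rough Python-level sketch of what the "trivially copyable" rule could mean; the function name and the exact rules are hypothetical, and the real version would be a flag computed once on the C dtype instance as described above. The quicker fix discussed above would simply replace the recursion into nested structs with giving up (return False):

import numpy as np

def is_trivially_copyable(dt: np.dtype) -> bool:
    """Hypothetical check: can equivalent instances of `dt` be copied with one memcpy?"""
    if dt.hasobject:                 # object references always need special handling
        return False
    if dt.subdtype is not None:      # subarray dtype: decide based on the element base
        return is_trivially_copyable(dt.subdtype[0])
    if dt.names is None:             # plain scalar dtype
        return True
    if not all(is_trivially_copyable(dt.fields[name][0]) for name in dt.names):
        return False
    covered = sum(dt.fields[name][0].itemsize for name in dt.names)
    # A hole (covered < itemsize) means a full-itemsize copy would touch bytes
    # the dtype does not own, as with the sparse view dtype from the failing test;
    # overlapping fields are rejected conservatively as well.
    return covered == dt.itemsize

# The dtypes from the failing test:
dtype = np.dtype([('a', {'names': ['aa', 'ab'], 'formats': ['f', 'f'],
                         'offsets': [0, 4]}, (2, 3))])
sparse_dtype = np.dtype([('a', {'names': ['ab'], 'formats': ['f'],
                                'offsets': [4]}, (2, 3))])

assert is_trivially_copyable(dtype)             # dense layout, no holes
assert not is_trivially_copyable(sparse_dtype)  # hole at offset 0 of the nested struct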

@bonpyt (Author) commented Jul 2, 2025

Ignoring embedded structures sounds good to me; I'll try it out. I agree a flag might be the better long-term solution :)

@seberg (Member) commented Jul 2, 2025

Thanks. Could you see how annoying it is to fold the logic into the branch afterwards? Otherwise PyArray_EquivTypes and the new check basically add twice the work that we'll be doing in that branch again anyway.

(If it's annoying, I think I can accept not doing it. I hid the fields in NumPy 2 in the hope that we could find a better way to store this information than extracting it from tuples again and again anyway...)

@bonpyt (Author) commented Jul 3, 2025

So I've played around with it and couldn't quite figure out how to access the "underlying" field type information. Basically, in the example from the test:

dtype = np.dtype([('a', {'names': ['aa', 'ab'], 'formats': ['f', 'f'], 'offsets': [0, 4]}, (2, 3))])
sparse_dtype = np.dtype([('a', {'names': ['ab'], 'formats': ['f'], 'offsets': [4]}, (2, 3))])

I could get the descriptor of the field a, which is of type "void", but helpers like PyDataType_FIELDS didn't return anything when called on it (i.e. sparse_dtype['a'].fields is None / PyDataType_HASFIELDS(fld_type) == 0). So I'm not sure how to get the information that a has some subfields and what their properties (like offsets) are.

I could just check that the field dtype is not void; I'd go with that if there is no better way, but it seems too broad a check.

@seberg (Member) commented Jul 3, 2025

didn't return anything when called on a (i.e. sparse_dtype['a'].fields is None / PyDataType_HASFIELDS(fld_type) == 0). So I'm not sure how to get the information that a has some subfields and what their properties (like offsets) are.

That dtype is a subarray dtype, which should be OK; you could check the base, I guess (of course, that could be another subarray dtype).

I dunno, my gut feeling is that if you don't take the trivial route here, adding the flag is probably actually the easier solution (and it could be a negative flag, on the assumption that such dtypes are rare).
It does require finding the 1-2 places where structured dtypes get created in odd ways and need to propagate the flags, but that's probably about it.
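In Python terms, the base referred to here is reachable via the subarray wrapper's .base (or .subdtype), which is where the nested fields that PyDataType_FIELDS could not see on the field dtype itself actually live; the C code would go through the corresponding subarray base descriptor, and as noted above that base could itself be another subarray dtype, so a robust check has to recurse:

import numpy as np

sparse_dtype = np.dtype([('a', {'names': ['ab'], 'formats': ['f'],
                                'offsets': [4]}, (2, 3))])

fld = sparse_dtype['a']        # the subarray dtype wrapping the nested struct
print(fld.fields is None)      # True: the subarray wrapper itself reports no fields
print(fld.shape)               # (2, 3): the subarray shape
print(fld.base.names)          # ('ab',): the nested struct is on .base
print(fld.base.fields['ab'])   # (dtype('float32'), 4): subfield dtype and byte offset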

Status: Pending authors' response

Successfully merging this pull request may close these issues:

BUG: Significant copy performance regression for structured arrays







