-
-
Notifications
You must be signed in to change notification settings - Fork 11.1k
BUG: Fix copy performance regression for structured arrays #29270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
So CI fails because of this
It seems to me that the shortcut I added is triggering here
which causes the test to fail. In my mind, in this case we shouldn't be using the shortcut to copy - I would expect |
Yeah, but it seems |
What do you mean by "pushing the logic lower"? |
Sorry, nvm. you are already as low as it gets, I thought we might be able to include the logic in that follows, but it's effectively at the same place. Signaling a trivially copyable would probably be the most useful thing to do, but it may require doing so in general in casts or maybe as a dtype flag (I am not 100% sure a dtype flag covers everything, but probably; The As a quicker fix that doesn't add a flag, I honestly don't have much of an idea except recursing or just giving up when you find an embedded structured dtype (and keep the assumption that all others without references are fine). (Although, I would be happy to investigate a new flag, since that seems useful in a lot more places to do similar optimizations.) |
Ignoring embedded structures sounds good to me, I'll try this out. Agree a flag might be a better solution long-term :) |
Thanks, could you see how annoying it is to fold the logic into the branch after wards? Otherwise (If it's annoying, I think I can accept not doing it. I hid the fields in NumPy 2, with the hope that we could have a better way to store this information that extract it from tuples every time again and again anyway...) |
so I've played around with it and couldn't quite figure out how to access the "underlying" field type information. basically in the example from the test:
I could get the description of the field I could just check that the field dtype is not void, I'd go with that if there is no better way, but it seems to be too broad of a check |
That dtype is a subarray dtype, which should be OK, you could check the base I guess (of course, that could be another subarray dtype. I dunno, my gut feeling is that if you don't take the trivial route here, adding the flag is probably actually the easier solution (and that could be a negative slag on a whim that such dtypes are rare). |
Fixes #28194
This adds a special case to
dtype_transfer.c::get_fields_transfer_function
to shortcut the copying of structured arrays in certain cases.