ENH: avoid thread safety issues around uses of PySequence_Fast
#29394
Conversation
LGTM in general (I might do one more quick pass to double-check ref counts, but it all seemed good).
One thing that isn't clear to me: is it OK to do a goto from within the block?
So I would have to double-check the code, but it is not clear to me that you need any of this, with possibly one careful change: merge the success/fail labels into a single `finish:` label, so that you can write:
```c
res = NULL;
BEGIN_CRITICAL_SECTION;
if (failure) {
    Py_CLEAR(res);  /* ensure res is NULL if needed */
    goto finish;
}
finish:;  /* may need the semicolon, not sure when exactly */
/* cleanup */
END_CRITICAL_SECTION;
return res;
```
There is no return within the block, so all is good, I think. I suspect you could do a `goto fail` that does the `Py_CLEAR()`, but I am not sure if there are subtleties, and it is probably just more awkward.
Anyway, LGTM, with some small comments you can ignore. But I think it would be good to explore that pattern, because it would be nice to diverge from CPython as little as possible, and the single `finish:` isn't bad at all.
Not sure why the debug Python build crashed - will look at that closer.
It turns out doing this without the semicolon is allowed in C23.
Otherwise, it looks like your suggestion to merge the error and success paths into a cleanup goto does work. Thanks!
Spoke a little too soon.
One nitpick, otherwise looks good.
FWIW, I am tempted to just split the workspace macro into a separate definition now that we have a reason to do so, but I'm happy to not do it here either; the no-bracket version isn't terrible (even if it seems to me that avoiding it isn't that bad in the end).
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
I think I'll do that in a followup.
See https://docs.python.org/3.14/howto/free-threading-extensions.html#general-api-guidelines for more details, but in short: on the free-threaded build it's not safe to use this API without some kind of locking if the accessed container can be modified while we're iterating over its contents.
To fix this, I added locking around all uses of the `PySequence_Fast` API. I also added a critical section around the object that is coerced to an ndarray in `PyArray_FromAny_int`. Taken together, these changes fix the possibility of thread safety issues from mutating the argument to `np.array()`, `np.broadcast()`, and `np.nditer()`, as well as a few other more minor spots.

This does not do anything to lock arbitrary nested array-likes; I only lock the outermost level of nesting. Doing anything more complicated is tricky without causing deadlocks. We could instead lock only whichever "leaf" list in a nested list of lists we are currently looking at, which avoids deadlocks, but that leaves the outermost list unlocked. IMO the outermost list is the most likely to be mutated in practice.
To aid all that, I added several new macros to `npy_pycompat.h` which wrap the public critical section API in the CPython C API: https://docs.python.org/3/c-api/init.html#python-critical-section-api. The versions in the C API include brackets, and if I wanted to use those, I would need to refactor the places I touched in this PR more substantially. I did use the versions with brackets in a few places where that makes sense.
I also added macros specifically for applying critical sections to sequences, following the private macros for this purpose in CPython. These macros add an extra check that skips applying a critical section for tuples. The private macros are defined in terms of the public C API, so I can just lift them out of CPython. If the private macros are ever made public in the future, we can switch to those versions.
It'd be nice to get tests for `einsum` and `__array_function__`, but it wasn't clear to me how to write a test that would trigger the thread safety issues. The three tests I added all fail on current `main` on the free-threaded build and pass on the GIL-enabled build.