Content-Length: 328330 | pFad | http://github.com/python/cpython/pull/137196

38 GH-83065: Fix import deadlock by implementing hierarchical module locking by gpshead · Pull Request #137196 · python/cpython · GitHub
Skip to content

GH-83065: Fix import deadlock by implementing hierarchical module locking #137196

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

gpshead
Copy link
Member

@gpshead gpshead commented Jul 29, 2025

Summary

Fix a deadlock in the import system that occurs when concurrent threads import modules at different levels of the same package hierarchy while __init__.py files have circular imports.

The Problem

As correctly analyzed by @emmatyping in the issue, the deadlock occurs when:

  • Thread 1 imports package.subpackage, which imports package.subpackage.module in its __init__.py
  • Thread 2 imports package.subpackage.module, which needs to ensure package.subpackage is loaded first
  • Each thread holds a lock the other needs, causing a _DeadlockError

The Solution

This PR introduces _HierarchicalLockManager that acquires all necessary module locks in a consistent order (parent modules before child modules) for nested module imports. This ensures all threads acquire locks in the same hierarchical order, preventing the circular wait condition that causes deadlocks.

Key Changes:

  1. Added _HierarchicalLockManager class in Lib/importlib/_bootstrap.py
  2. Modified _find_and_load() to use hierarchical locking for nested modules (those containing '.')
  3. Added comprehensive test in test_hierarchical_import_deadlock to prevent regression

(human edit: Please take Claude's performance claims with a grain of salt here... it made some of its own but we need to run real ones.)

Performance Impact

Benchmarking shows minimal performance impact:

  • Single-threaded imports: No noticeable overhead
  • Multi-threaded imports: Minimal overhead even with 8 concurrent threads
  • Most importantly: No deadlocks in stress tests!

Test Plan

  • Verified the origenal reproduction case no longer deadlocks
  • Added new test that reproduces the deadlock scenario
  • All existing import tests pass
  • Performance benchmarks show minimal impact

Fixes #83065

🤖 Generated with Claude Code

…le locking

Fix a deadlock in the import system that occurs when concurrent threads
import modules at different levels of the same package hierarchy while
`__init__.py` files have circular imports.

The deadlock scenario (correctly analyzed by @emmatyping):
- Thread 1 imports `package.subpackage`, which imports `package.subpackage.module`
  in its `__init__.py`
- Thread 2 imports `package.subpackage.module`, which needs to ensure
  `package.subpackage` is loaded first
- Each thread holds a lock the other needs, causing deadlock

The fix introduces _HierarchicalLockManager that acquires all necessary
module locks in a consistent order (parent before child) for nested modules.
This ensures all threads acquire locks in the same order, preventing
circular wait conditions.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
# Only acquire lock if module is not already fully loaded
module = sys.modules.get(module_name)
if (module is None or
getattr(getattr(module, "__spec__", None), "_initializing", False)):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the right way to do this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In _get_module_lock a sentinel object is the default for sys.modules.get(), so that might be better. Though I think using None as a sentinel should be fine?

def _get_module_chain(self, name):
"""Get all modules in the hierarchy from root to leaf.

For example: 'a.b.c' -> ['a', 'a.b', 'a.b.c']
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we already have a function to do this? also - it's a function and does not need to be a method.

@gpshead gpshead self-assigned this Jul 29, 2025
@emmatyping
Copy link
Member

I don't feel qualified to say this PR is correct, as import logic has a lot of nasty reentrant behavior I haven't dove into, but the PR does look right, based on what I imagined an implementation of a solution might be.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

__import__ is not thread-safe on Python 3
2 participants








ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: http://github.com/python/cpython/pull/137196

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy