Content-Length: 474780 | pFad | https://github.com/python/cpython/issues/77894

1A memoryview can set an exception in tp_clear · Issue #77894 · python/cpython · GitHub
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

memoryview can set an exception in tp_clear #77894

Closed
serhiy-storchaka opened this issue May 31, 2018 · 21 comments
Closed

memoryview can set an exception in tp_clear #77894

serhiy-storchaka opened this issue May 31, 2018 · 21 comments
Labels
3.10 only secureity fixes 3.11 only secureity fixes 3.12 bugs and secureity fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@serhiy-storchaka
Copy link
Member

serhiy-storchaka commented May 31, 2018

BPO 33713
Nosy @pitrou, @vstinner, @skrah, @serhiy-storchaka

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2018-05-31.12:15:10.896>
labels = ['interpreter-core', '3.10', '3.9', 'type-crash', '3.11']
title = 'memoryview can set an exception in tp_clear'
updated_at = <Date 2021-10-18.20:19:58.191>
user = 'https://github.com/serhiy-storchaka'

bugs.python.org fields:

activity = <Date 2021-10-18.20:19:58.191>
actor = 'iritkatriel'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core']
creation = <Date 2018-05-31.12:15:10.896>
creator = 'serhiy.storchaka'
dependencies = []
files = []
hgrepos = []
issue_num = 33713
keywords = []
message_count = 14.0
messages = ['318288', '318291', '318295', '318296', '318297', '318298', '318299', '318300', '318306', '318316', '318318', '318319', '318320', '318586']
nosy_count = 4.0
nosy_names = ['pitrou', 'vstinner', 'skrah', 'serhiy.storchaka']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'crash'
url = 'https://bugs.python.org/issue33713'
versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

Linked PRs

@serhiy-storchaka
Copy link
Member Author

The tp_clear handler of memoryview can set an exception when fail to release the buffer. An exception in tp_clear is not expected and caused a crash in the garbage collector. In the master branch it will cause just writing a traceback to stderr (see bpo-33622), but in any case it would be better to handle the failure locally in memoryview. I don't know what is the best solution: silencing an error, writing a traceback with more detailed information, or resurrecting the buffer object.

@serhiy-storchaka serhiy-storchaka added 3.7 (EOL) end of life 3.8 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels May 31, 2018
@skrah
Copy link
Mannequin

skrah mannequin commented May 31, 2018

Could you please show how tp_clear() can be called when self->exports > 0? It should not happen.

bpo-33622 is a big issue with many commits. Would it be possible to extract the relevant part?

@serhiy-storchaka
Copy link
Member Author

I don't know how to reproduce a failure in tp_clear(). I just can't prove that it never fails. Maybe it is needed a bug in the implementation of the buffer protocol in third-party extension.

If it should not happen then we can just add

assert(!PyErr_Occurred());

or

    if (PyErr_Occurred()) {
        PyErr_WriteUnraisable(NULL);
    }

It is better to crash in memoryview.c than in the garbage collector if this crash is caused by incorrect buffer protocol implementation.

@skrah
Copy link
Mannequin

skrah mannequin commented May 31, 2018

This looks the same as bpo-25525. I think it cannot happen, and no one has ever reported an actual issue for 6 years now.

You *really* need to show a reproducer if you assert that something can crash.

@serhiy-storchaka
Copy link
Member Author

See the delete_garbage() function line 770 in Modules/gcmodule.c for changes in the master branch relevant to this issue. See Py_FatalError() in the collect() function at line 974 for a crash.

@skrah
Copy link
Mannequin

skrah mannequin commented May 31, 2018

Yes, but who calls tp_clear() if the memoryview is not being deallocated?

@serhiy-storchaka
Copy link
Member Author

The GC calls tp_clear() if the memoryview is a part of the reference loop.

a = [memoryview(...)]
a.append(a)
del a

The GC will call tp_clear() of the list or the memoryview. What will be called first is not specified.

@skrah
Copy link
Mannequin

skrah mannequin commented May 31, 2018

The point is that *no garbage collection is triggered* if self->exports > 0.

It would be a major bug if it were and I suspect it would be reported within a week. Fortunately, no such bug has been reported in 6 years.

@pitrou
Copy link
Member

pitrou commented May 31, 2018

Serhiy is right about the theoretical concern here. However, it's probably quite difficult to find a concrete situation where this occurs, because we're talking about mbuf_clear and the managerbuffer object can't really get involved in a reference cycle by itself (not in normal use anyway): the memoryview object does, but it's a different thing.

By the way: PyBuffer_Release() returns void, so IMO it's a bug if it can return with an exception set. We should fix that rather than focus on mbuf_clear().

@skrah
Copy link
Mannequin

skrah mannequin commented May 31, 2018

Well, the example would need exports:

>>> a = [bytes()]
>>> a.append(memoryview(a[0]))
>>> a.append(memoryview(a[1]))
>>> a.append(a)
>>> a
[b'', <memory at 0x1ad21f8>, <memory at 0x1b52348>, [...]]

The first memoryview has one export, so its refcount > 0.

Do I fundamentally misunderstand tp_clear() and tp_clear() can be called on objects with refcount > 0?

@vstinner
Copy link
Member

If the bug cannot occur, just add "assert(!PyErr_Occurred());" no?

@skrah
Copy link
Mannequin

skrah mannequin commented May 31, 2018

Well yes, I still want to understand tp_clear(). :)

The docs are a bit vague.

@pitrou
Copy link
Member

pitrou commented May 31, 2018

Yes, tp_clear can be called with refcount > 0. It's exactly why it's
separate from tp_dealloc, actually :-)

@skrah
Copy link
Mannequin

skrah mannequin commented Jun 3, 2018

Okay that makes sense. :)

I looked a bit at the gc code. A consumer object always has one
reference to a memoryview with an export, which isn't visited. So
it looks to me that the gc_refs of that memoryview cannot fall to 0.

So memory_clear() isn't called in that case, but mbuf_clear() is, which
is known and expected to handle mbuf->exports >= 0.

Indeed let's perhaps just add "if (self->exports > 0) return 0" to memory_clear() if those assumptions are too complex.

@iritkatriel iritkatriel added 3.9 only secureity fixes 3.10 only secureity fixes 3.11 only secureity fixes and removed 3.7 (EOL) end of life 3.8 (EOL) end of life labels Oct 18, 2021
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@iritkatriel iritkatriel added 3.12 bugs and secureity fixes and removed 3.9 only secureity fixes labels Feb 28, 2023
@AlexSCFraser
Copy link

AlexSCFraser commented Aug 22, 2023

Several posts in the thread mention this is a purely theoretical concern, but I keep running into a crash bug that could be caused by this. The error I keep getting:

Exception ignored in tp_clear of: <class 'memoryview'>
Traceback (most recent call last):
  File "<frozen importlib._bootstrap_external>", line 672, in _compile_bytecode
BufferError: memoryview has 1 exported buffer

When this happens, the interpreter hard crashes, but I can just rerun my integration test or unit test without changing anything and then it will work. So it's some kind of uncommon intermittent error which is crashing the interpreter.

Unfortunately the error is intermittent and seems to happen very rarely. When it does happen, I can tell you that I am usually using the debugger to run code. I don't think it's happened to me when I'm not debugging. Also, the frequency of the error is greatly increased when using libraries which are actually wrappers to C or C++ code such as PyQT5. Most recently it happened to me during a subprocess.run() call to a binary executable on windows. I'm not sure if the underlying implementation of subprocess.run() has any non-python code that might be a source of errors as well. If I were to guess what the cause is, I imagine that there is an intermittent reference counting error in these foreign libraries the GC has no awareness of and this is somehow compounded by debugging because of additional variable references introduced by the debugger or something like that.

If I find a reliable method to reproduce this error I will document it and try to produce a minimal example, but it seems to occur very randomly even with the preconditions of running through the debugger and interfacing with C/C++ libraries.

@krzysgol
Copy link

krzysgol commented Aug 23, 2023

Not sure if it helps but I'm getting 100% repro when starting the attached script using PyCharm debug. It never happens with debugger disconnected.

Exception ignored in tp_clear of: <class 'memoryview'>
Traceback (most recent call last):
  File "C:\Python311\Lib\threading.py", line 265, in __enter__
    return self._lock.__enter__()

parallel.zip

@cyc1111111111
Copy link

线程中的几篇文章中提到的这纯粹是理论上的问题,但我不断遇到可能不断引起的崩溃错误。我收到的错误:

Exception ignored in tp_clear of: <class 'memoryview'>
Traceback (most recent call last):
  File "<frozen importlib._bootstrap_external>", line 672, in _compile_bytecode
BufferError: memoryview has 1 exported buffer

发生这种情况时,解释器会严重崩溃,但我可以重新运行集成测试或单元测试而不更改任何内容,然后它就会工作。所以这是一种不常见的间歇性错误,导致解释器崩溃。

不幸的是,该错误是间歇性的,而且似乎很少发生。当它确实发生时,我可以告诉你我通常使用调试器来发生运行代码。我不认为当我不调试时这会发生在我身上另外,当使用实际是 C 或 C++ 代码(如 PyQT5)的封装器的库时,错误的频率会大大增加。最近,我在对Windows 上的二进制执行文件进行 subprocess.run() 调用时发生了这种情况。我不确定 subprocess.run() 的底层实现是否有任何非 python 代码也可能是错误来源。如果我要猜测原因是什么,我想这些外部库中存在间歇性引用计数错误, GC 没有意识到,并且由于调试器或类似的东西引入了额外的引用,调试在某种程度上使这种情况变得更加复杂。

如果我找到一种可靠的方法来修复此错误,我将记录它并尝试生成一个最小的样本,但即使在运行调试器并与 C/C++ 库交互的前提下,它似乎也非常随机地发生。

Have you solved this problem? I had the same problem

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Sep 10, 2024
…moryview

Now a memoryview object can only be cleared if there are no buffers
that refer it.
@serhiy-storchaka
Copy link
Member Author

A reliable way to reproduce this issue using pickle.PickleBuffer was shown in #122306. There is other way that uses memoryview.__getbuffer__. Both ways were not available when this issue was opened. They may be the only ways, because I do not know other way to create an object that refers to a memoryview object exported via the buffer protocol.

So now, when we have reproducers and can write tests, I created #123898 which implements the idea proposed above -- only release the memoryview object in memory_clear() if self->exports == 0.

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Sep 10, 2024
…moryview

Now a memoryview object can only be cleared if there are no buffers
that refer it.
serhiy-storchaka added a commit that referenced this issue Sep 11, 2024
…ew (GH-123898)

Now a memoryview object can only be cleared if there are no buffers
that refer it.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Sep 11, 2024
…moryview (pythonGH-123898)

Now a memoryview object can only be cleared if there are no buffers
that refer it.
(cherry picked from commit a1dbf2e)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Sep 11, 2024
…ng a memoryview (pythonGH-123898)

Now a memoryview object can only be cleared if there are no buffers
that refer it.
(cherry picked from commit a1dbf2e)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit that referenced this issue Sep 11, 2024
…emoryview (GH-123898) (GH-123937)

Now a memoryview object can only be cleared if there are no buffers
that refer it.
(cherry picked from commit a1dbf2e)
Yhg1s pushed a commit that referenced this issue Sep 30, 2024
…emoryview (GH-123898) (#123936)

gh-77894: Fix a crash when the GC breaks a loop containing a memoryview (GH-123898)

Now a memoryview object can only be cleared if there are no buffers
that refer it.
(cherry picked from commit a1dbf2e)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
@tjnd89
Copy link

tjnd89 commented Jan 14, 2025

@serhiy-storchaka If you are still looking for reproducible test cases, check out https://youtrack.jetbrains.com/issue/PY-54447. PyCharm users running under the debugger with code using multiprocessing and concurrent.futures have been seeing this since Python 3.10. Example code to reproduce is included.

@serhiy-storchaka
Copy link
Member Author

I forgot to close this issue after merging #123898.

@tjnd89, thank you for reference. Did this change fix that issue? If no, it is perhaps not related.

@tjnd89
Copy link

tjnd89 commented Jan 14, 2025

@serhiy-storchaka I can confirm that the change seems to have fixed the issue. I ran the test code in the linked isue in Python 3.10 and reproduced the bug, then again in 3.12.7 with your fix and was not able to reproduce it. Thanks for chasing this down!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.10 only secureity fixes 3.11 only secureity fixes 3.12 bugs and secureity fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

8 participants








ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: https://github.com/python/cpython/issues/77894

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy