-
-
Notifications
You must be signed in to change notification settings - Fork 32.3k
gh-127146: Skip test_open_undecodable_uri on Emscripten #136510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-127146: Skip test_open_undecodable_uri on Emscripten #136510
Conversation
PR python#136326 removed the Emscripten skip for this file but it is still broken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am surprised. Is test_open_with_undecodable_path passed?
@serhiy-storchaka Both FWIW - in both cases, the test path being used is When I run this locally, I get a warning in the console:
which suggests to me that the bad unicode isn't round-tripping consistently through the filesystem. I agree that this should be a "skip both or skip neither" situation; and I'd definitely prefer to understand better why it needs to be skipped explicitly before restoring the skip. |
I suppose that paths are Unicode strings on Emscripten. Python and SQLite can use different ways to decode bytes path to Unicode, so This is similar to Windows where we need to keep a separate skip. |
Well |
Interesting. Locally only |
I changed the test to this and added syscall tracing: def get_undecodable_path(self):
path = TESTFN_UNDECODABLE
print("open", path)
f = open(path, 'wb')
print("close", path)
f.close()
print("unlink", path)
unlink(path)
return path
@unittest.skipIf(sys.platform == "win32", "skipped on Windows")
def test_open_with_undecodable_path(self):
path = self.get_undecodable_path()
self.addCleanup(unlink, path)
print("sqlite.connect", path)
c = sqlite.connect(path)
with contextlib.closing(c) as cx:
print("exists", path)
exists = os.path.exists(path)
print(" .. ", exists)
self.assertTrue(exists) The relevant part of the log looks like this:
|
Okay I got it: the problem is that // When using conditional TextDecoder, skip it for short strings as the overhead of the native call is not worth it.
if (endPtr - idx > 16 && heapOrArray.buffer && UTF8Decoder) {
return UTF8Decoder.decode({{{ getUnsharedTextDecoderView('heapOrArray', 'idx', 'endPtr') }}});
} https://github.com/emscripten-core/emscripten/blob/main/src/lib/libstrings.js#L57-L59 So when the file system decodes the string, it checks the length. If it's short, it uses the JS decoder, if it's long it uses the native decoder. sqlite fully resolves the path before opening the file, and the absolute path is longer than 16 bytes. Whereas the not fully resolved path is 15 bytes long and gets the correct slow path. Adding two extra bytes to |
Upstream report: |
PR #136326 removed the Emscripten skip for this file but it is still broken.