Use robust shebang code for all shebang modification #13390

flying-sheep · 2025-05-11T14:29:33Z

news/13389.bugfix.rst

pfmoore · 2025-05-11T15:02:02Z

src/pip/_internal/operations/install/wheel.py

-        firstline = script.readline()
-        if not firstline.startswith(b"#!python"):
+        prelude = script.readline()
+        if (m := re.match(rb"^#!python[^\s]*(\s.*)?$", prelude)) is None:


This doesn't conform to the spec, which only allows rewriting lines that start with precisely b'#!python'.

I think you misread: that’s exactly what that regex matches.

The only reason I switched from startswith to a regex is to capture the stuff behind the #!python and pass it on.

Sorry, I did misread. But I'd still argue that the regex is harder to read. Why not simply split firstline on whitespace?

It's worth noting that there are some edge cases in the spec - #!python3.12 should be rewritten, but it's not clear what "the correct interpreter" is if pip is running under Python 3.13 🙁

The current code doesn't get this right, either, but if we're going to claim the new code "is more robust" or "will no longer break", we should at least try to follow the spec.

> Why not simply split firstline on whitespace?

...and then strip the trailing newline from the second part. I find the regex more readable than that.

OK, we disagree.

(BTW, without documentation on _get_shebang, I don't know whether stripping the newline is necessary).
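For comparison, here is a minimal sketch of the two approaches discussed in this thread. The regex is the one from the diff; the function names `args_via_regex` and `args_via_split` are hypothetical and exist only for illustration. Neither function is pip's actual code.

```python
import re

# Pattern taken from the diff above: "#!python", an optional non-whitespace
# suffix (e.g. "3.12" or "w"), then an optional whitespace-led remainder.
SHEBANG_RE = re.compile(rb"^#!python[^\s]*(\s.*)?$")


def args_via_regex(prelude: bytes):
    """Return what the regex captures after the interpreter name, or None."""
    m = SHEBANG_RE.match(prelude)
    if m is None:
        return None
    return m.group(1) or b""


def args_via_split(prelude: bytes):
    """Hypothetical alternative: startswith check plus a whitespace split."""
    if not prelude.startswith(b"#!python"):
        return None
    parts = prelude.split(None, 1)
    # The remainder keeps its line ending, so it is stripped explicitly here.
    return parts[1].rstrip(b"\r\n") if len(parts) > 1 else b""


print(args_via_regex(b"#!python -E\n"))   # b' -E'
print(args_via_regex(b"#!python\n"))      # b'\n'
print(args_via_split(b"#!python -E\n"))   # b'-E'
print(args_via_split(b"#!python\n"))      # b''
```

Note that for a bare `#!python\n` line the regex's group 1 is the newline itself, which bears on the question of whether the value passed on needs stripping.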

pfmoore · 2025-05-11T15:12:35Z

src/pip/_internal/operations/install/wheel.py

            return False
-        exename = sys.executable.encode(sys.getfilesystemencoding())
-        firstline = b"#!" + exename + os.linesep.encode("ascii")
+        prelude = ScriptMaker(None, None)._get_shebang("utf-8", m.group(1) or b"")


Actually, _get_shebang does the wrong thing here. It doesn't correctly rewrite #!pythonw as the GUI version of the interpreter, if pip is being run under the CLI version. Entry points use the ScriptMaker.make(..., {'gui': True}) mechanism to do that.

I'd suggest that we need a test here that ensures that #!pythonw gets rewritten as Path(sys.executable).parent / (Path(sys.executable).stem + "w" + Path(sys.executable).suffix), as that's the correct behaviour according to the spec. The code as given would fail that test.
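The expected path in that suggestion can be sketched as a small helper. The name `gui_variant` is hypothetical, and the function takes the executable path as an argument (rather than reading `sys.executable`) only so that the example is deterministic; it is a sketch of the expression quoted above, not of pip's implementation.

```python
from pathlib import PurePath, PureWindowsPath


def gui_variant(executable: PurePath) -> PurePath:
    """Insert a "w" between the stem and the suffix of the executable name."""
    return executable.parent / (executable.stem + "w" + executable.suffix)


cli = PureWindowsPath(r"C:\Python313\python.exe")
print(gui_variant(cli))  # C:\Python313\pythonw.exe
```

A test along the lines proposed would compare the rewritten shebang's interpreter against `gui_variant(Path(sys.executable))`. One caveat, assuming pathlib's usual stem/suffix split on the last dot: a name such as `python3.13` has suffix `.13`, so the expression is mainly meaningful for names like `python.exe`.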

pfmoore · 2025-05-11T15:14:52Z

tests/functional/test_install_wheel.py

@@ -388,7 +406,12 @@ def test_wheel_record_lines_have_updated_hash_for_scripts(

    script_path = script.bin_path / "dostuff"
    script_contents = script_path.read_bytes()
-    assert not script_contents.startswith(b"#!python\n")
+    expected_prefix = (
+        b"#!/bin/sh\n'''exec'"


Strong -1 on having the test rely on the precise mechanism used in the generated shebang.

How should I do it? Without knowledge of the mechanism, checks like exe_path in script_contents aren’t reliable either, because the space might be escaped with \ or similar.

I don't know. It's a hard problem. Distlib has a bunch of tests around shebangs, so if we were using documented and supported distlib APIs, I'd say we could rely on them (and the existing check, that just confirms we did rewrite, would be enough). But because we're using the internal _get_shebang function, we can't do that, as we have no way to ensure that we're using the function in a way that is covered by the distlib tests.

The fact that tests are failing on Windows demonstrates that this test isn't sufficient - the shebang you're checking for isn't what gets used on Windows (because Windows shells don't support it).

pfmoore

See comments inline.

flying-sheep added 2 commits May 11, 2025 16:23

Re-use shebang code and add test

6da1037

relnote

b5917be

psf-chronographer bot added the bot:chronographer:provided label May 11, 2025

flying-sheep changed the title ~~Re-use shebang code and add test~~ Use robust shebang code for all shebang modification May 11, 2025

flying-sheep added 2 commits May 11, 2025 16:33

fmt

cb9d513

remove redundant code

1f3452f

notatallshaw reviewed May 11, 2025


pfmoore reviewed May 11, 2025


pfmoore requested changes May 11, 2025


flying-sheep added 2 commits May 11, 2025 18:01

WIP pythonw support

a460e1f

reword relnote

fa92def
