Content-Length: 416928 | pFad | http://github.com/pypa/pip/pull/13390

BC Use robust shebang code for all shebang modification by flying-sheep · Pull Request #13390 · pypa/pip · GitHub
Skip to content

Use robust shebang code for all shebang modification #13390

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

flying-sheep
Copy link

Fixes #13389

@flying-sheep flying-sheep changed the title Re-use shebang code and add test Use robust shebang code for all shebang modification May 11, 2025
firstline = script.readline()
if not firstline.startswith(b"#!python"):
prelude = script.readline()
if (m := re.match(rb"^#!python[^\s]*(\s.*)?$", prelude)) is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't conform to the spec, which only allows rewriting lines starting with precisely b'#!python.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you misread: that’s exactly what that regex matches.

The only reason I switched from startswith to a regex is to capture the stuff behind the #!python and pass it on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I did misread. But I'd still argue that the regex is harder to read. Why not simply split firstline on whitespace?

It's worth noting that there are some edge cases in the spec - #!python3.12 should be rewritten, but it's not clear what "the correct interpreter" is if pip is running under Python 3.13 🙁

The current code doesn't get this right, either, but if we're going to claim the new code "is more robust" or "will no longer break", we should at least try to follow the spec.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not simply split firstline on whitespace?

and strip the trailing newline from the second part. I find the regex to be more readable than that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, we disagree.

(BTW, without documentation on _get_shebang, I don't know whether stripping the newline is necessary).

return False
exename = sys.executable.encode(sys.getfilesystemencoding())
firstline = b"#!" + exename + os.linesep.encode("ascii")
prelude = ScriptMaker(None, None)._get_shebang("utf-8", m.group(1) or b"")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, _get_shebang does the wrong thing here. It doesn't correctly rewrite #!pythonw as the GUI version of the interpreter, if pip is being run under the CLI version. Entry points use the ScriptMaker.make(..., {'gui': True}) mechanism to do that.

I'd suggest that we need a test here that ensures that #!pythonw gets rewritten as Path(sys.executable).parent / (Path(sys.executable).stem + "w" + Path(sys.executable).suffix), as that's the correct behaviour according to the spec. The code as given would fail that test.

@@ -388,7 +406,12 @@ def test_wheel_record_lines_have_updated_hash_for_scripts(

script_path = script.bin_path / "dostuff"
script_contents = script_path.read_bytes()
assert not script_contents.startswith(b"#!python\n")
expected_prefix = (
b"#!/bin/sh\n'''exec'"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strong -1 on having the test rely on the precise mechanism used in the generated shebang..

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how should I do it? without knowledge of the mechanism, checks like exe_path in script_contents aren’t reliable either, because maybe the space is escaped with \ or so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. It's a hard problem. Distlib has a bunch of tests around shebangs, so if we were using documented and supported distlib APIs, I'd say we could rely on them (and the existing check, that just confirms we did rewrite, would be enough). But because we're using the internal _get_shebang function, we can't do that, as we have no way to ensure that we're using the function in a way that is covered by the distlib tests.

The fact that tests are failing on Windows demonstrates that this test isn't sufficient - the shebang you're checking for isn't what gets used on Windows (because Windows shells don't support it).

Copy link
Member

@pfmoore pfmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments inline.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use robust shebang manipulation also for (data)/scripts (e.g. to support spaces in venv paths)
3 participants








ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: http://github.com/pypa/pip/pull/13390

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy