BUG: Include python-including headers first #29281
Please add comments so we don’t undo this in a code refactor.
Might just put |
Could also just run clang-format on the file after, it needs a cleanup.
I added an explicit Python.h include to those two files, with comments to explain why. I then went a bit overboard and added those comments to every other Python.h include in the repository, moving it higher in the file if necessary.
I guess this all makes sense. What do others think?
/* Any file that includes Python.h must include it before any other files */
/* https://docs.python.org/3/extending/extending.html#a-simple-example */
/* npy_common.h includes Python.h so it also counts in this list */
If we're going to touch every single header, I'd prefer it if this comment could be formatted like other comments in the numpy repo
/* Any file that includes Python.h must include it before any other files */
/* https://docs.python.org/3/extending/extending.html#a-simple-example */
/* npy_common.h includes Python.h so it also counts in this list */
/*
Any file that includes Python.h must include it before any other files
https://docs.python.org/3/extending/extending.html#a-simple-example
*/
I also think the addendum about npy_common.h is only relevant for files that include it, and even then is maybe more confusing than helpful, since the most important thing is that Python.h comes first and the comment makes that clear.
To be honest, I might be happiest with just adding it as a post-fix comment like #include <Python.h> /* Python.h include must be first. */ and leave it to the user to google if they care enough rather than this amount of comment everywhere.
(Or yeah, just don't add it and hope for a future style check instead.)
"I also think the addendum about npy_common.h is only relevant for files that include it,"
The intent was that npy_common.h is a file that includes Python.h, and so either it or Python.h needs to be the first include in any file that includes it, but that's not relevant for this batch of comments and I can see how that wasn't clear.
Just a drive-by thought: maybe we could do this using a lint check instead of adding identical comments all over the codebase? Unfortunately I let #28634 sit, otherwise we'd already have a C linter you could add to....
There's the Python script to check this I wrote for SciPy. Should I revert the last two comments, add that as a check, and adjust the files it flags? If so, should I move existing headers to the top of the files or add an explicit |
Yes, I think that would be better. We have a linter check on azure
I think that would be a less noisy change. |
I am not sure this works. I reverted the change to |
It looks like |
CI is failing with
|
Note that if Python.h is present, clang-format will put it first. |
Not whichever was last included
tools/get_submodule_paths.py
Outdated
submodule_paths = [os.path.join(root_directory, path) for path in
                   submodule_paths]
# vendored with a script rather than via gitmodules
submodule_paths.append(os.path.join(root_directory, 'scipy/_lib/pyprima'))
Just a minor observation as I scan through my GitHub notifications--I recall suggesting/adding this line specifically for SciPy in some work with Lucas I think, so it can probably be removed here.
I'm a bit rusty on the NumPy vendoring situation--I suppose if NumPy vendors things without using git submodules there could be other relevant entries to add here?
If we want to, could try with linguist-vendored files from .gitattributes. Not that the list is probably complete, but it is a list that could be expanded.
(Not saying I think it's a blocker here)
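For reference, a rough sketch of what reading linguist-vendored entries out of .gitattributes could look like. This is only an illustration of the idea above, not what tools/get_submodule_paths.py does; the function name `vendored_patterns` and the sample paths are made up, and it ignores quoted patterns containing spaces.

```python
def vendored_patterns(gitattributes_text):
    """Return path patterns marked linguist-vendored in .gitattributes text.

    Hypothetical helper: handles only the simple `pattern attr attr ...`
    form and treats `linguist-vendored` / `linguist-vendored=true` as set.
    """
    patterns = []
    for raw_line in gitattributes_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        pattern, *attrs = line.split()
        if "linguist-vendored" in attrs or "linguist-vendored=true" in attrs:
            patterns.append(pattern)
    return patterns


# Made-up sample content, for illustration only.
sample = (
    "# comment\n"
    "vendored/** linguist-vendored\n"
    "*.c text\n"
    "old/** -linguist-vendored\n"
    "gen/* linguist-vendored=true linguist-generated\n"
)
print(vendored_patterns(sample))
```

The returned patterns are still glob patterns, so a caller would need to expand them against the repository tree before excluding files.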
Also allow passing a specific directory.
I wrote the initial version of |
I dunno if it matters much, since I guess you mostly wrote it either way. It seems to me we should just get it in, or does anyone of the original reviewers want to have another look?
Left a couple comments. The changes to the headers look much more minimal than the version of this PR I last looked at, so that's good.
I didn't try to understand the logic in the script. It seems pretty complicated? But if SciPy is using it, it's probably reasonably battle-tested, so I think I'm ok with including it.
not included_python
and not warned_python_construct
and ".h" not in basename_to_check
) and ("py::" in line or "PYBIND11_" in line):
numpy doesn't use pybind11 so this is unnecessary but probably harmless?
@@ -56,6 +56,9 @@ stages:
python tools/linter.py
displayName: 'Run Lint Checks'
failOnStderr: true
- script: |
python tools/check_python_h_first.py
displayName: 'Check Python.h is first file included'
Did you check whether introducing an intentional mistake causes this to fail as expected? I see the linter needs failOnStderr: true - that's not needed here because the script exits with a nonzero status code if it fails, right?
I ran it and saw failures in places that made sense until I fixed them; that's how I found the changes beyond the two in the first commit. I probably won't write a test for that until I separate it out as its own package, which I should probably do soon.
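To illustrate the exit-status question, here is a small sketch of the two failure signals a CI step can react to: a nonzero exit status, and output on stderr with a zero exit status (which is the case failOnStderr exists for). The helper `run_check` is hypothetical and only runs throwaway one-line programs; it is not part of the PR.

```python
import subprocess
import sys


def run_check(code):
    """Run a tiny Python program in a child process.

    Returns (exit status, captured stderr). Hypothetical helper used only
    to show the two signals side by side.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr


# A check that reports problems through its exit status: a step running
# it fails on the status alone, whatever it prints.
status, _ = run_check("import sys; sys.exit(3)")
print("exit status:", status)

# A check that only writes to stderr and still exits with 0: the step
# passes unless something like failOnStderr is turned on.
status, err = run_check("import sys; print('warning', file=sys.stderr)")
print("exit status:", status, "stderr:", err.strip())
```

If check_python_h_first.py exits with a nonzero status when it finds an out-of-order include, as described above, the first case is the one that applies.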
I did this up into a proper package with tests over at https://github.com/DWesl/check-python-h-first. I did have to make changes to get all the tests to pass; the most relevant change was telling the script that a function starting with Py probably implies Python.h should have been included.
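A guess at what such a heuristic might look like, as a sketch only: flag any line that appears to call a function whose name starts with Py. The regex and the name `uses_python_api` are assumptions made for illustration and are not taken from the package.

```python
import re

# A name starting with "Py" followed by an opening parenthesis. The \b
# keeps names such as my_PyThing( from matching in the middle of a word.
PY_CALL_RE = re.compile(r"\bPy[A-Za-z0-9_]*\s*\(")


def uses_python_api(line):
    """Return True if the line looks like it calls a Py-prefixed function."""
    return PY_CALL_RE.search(line) is not None


print(uses_python_api("PyObject *x = PyLong_FromLong(1);"))
print(uses_python_api("int y = compute(3);"))
```

A check like this will also fire on calls mentioned in comments or string literals, so it is best treated as a hint that Python.h should have been included rather than as proof.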
A short version of the core function: for each file in the list, it checks each line in that file for
If it reaches the end of the file without including any files, that file is added to the list of files known not to include any other file. The rest of it is basically a wrapper to exclude files in submodules or vendored projects from the list of files it should check, and to try to sort the list so headers appear before the files that include them, then to count the number of files that first include a non-
I think I added it to SciPy a year or so back (after some searching, as scipy/scipy#20536), as a cheaper alternative to a Cygwin CI run to catch the most common failures: I think SciPy takes an hour to build, and likely longer to run the tests.
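A much-simplified sketch of the per-file check described above, assuming the rule is "Python.h, or a header known to include it, must not come after any other include". The names `INCLUDE_RE` and `check_file` and the returned labels are invented here; the real script also handles submodule exclusion, ordering of headers, and other details this leaves out.

```python
import re

# Matches `#include <foo.h>` or `#include "foo.h"` at the start of a line
# and captures the included file name.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]')


def check_file(source_text, python_including_headers):
    """Classify one source file by where its Python include appears.

    python_including_headers is a set of header basenames already known to
    include Python.h first (npy_common.h is the example from this thread).
    Simplification: include lines inside multi-line comments still count.
    """
    seen_other_include = False
    for line in source_text.splitlines():
        match = INCLUDE_RE.match(line)
        if match is None:
            continue
        basename = match.group(1).rsplit("/", 1)[-1]
        if basename == "Python.h" or basename in python_including_headers:
            # The first Python-including header decides the result.
            return "bad" if seen_other_include else "ok"
        seen_other_include = True
    # No Python-including header was found anywhere in the file.
    return "no-python" if seen_other_include else "no-includes"


print(check_file("#include <stdio.h>\n#include <Python.h>\n", set()))
print(check_file("#include <Python.h>\n#include <stdio.h>\n", set()))
```

Checking headers before the files that include them lets a header that passes be added to `python_including_headers` for later files.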
Ran into this trying to install scikit-image