gh-85222: Document the side effect in multiprocessing #136426
Doc/library/multiprocessing.rst
Outdated
@@ -867,6 +867,10 @@ For an example of the usage of queues for interprocess communication see
 locks/semaphores. When a process first puts an item on the queue a feeder
 thread is started which transfers objects from a buffer into the pipe.
+
+If the global start method has not been set, calling this function will
+have the side effect of setting the current global start method.
+See the :func:`get_context` function.
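For readers following the thread, here is a minimal sketch of the side effect this hunk documents. It uses `get_context()` with no argument as the triggering call, since that is the function the added text points to. The helper name `observe_side_effect` is made up for illustration and is not part of the PR.

```python
import multiprocessing as mp


def observe_side_effect():
    """Return the global start method before and after an implicit set."""
    # allow_none=True only inspects the global state; it does not fix it.
    before = mp.get_start_method(allow_none=True)

    # Asking for the default context fixes the global start method
    # if nothing has set it yet.
    mp.get_context()

    after = mp.get_start_method(allow_none=True)
    return before, after


if __name__ == "__main__":
    before, after = observe_side_effect()
    print("before:", before)
    print("after:", after)
```

In a fresh interpreter `before` should be `None`, and `after` should be the platform default, which varies by operating system and Python version.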
I think repeating the same words here may be a bit 'noisy'. Could we just mention it once?
My concern is that if a user only uses one of these functions and checks that function's documentation, and unfortunately that is not the function that contains this document, they may mess it up.
But I'm unsure about this.
I may be biased but I don't think the extra clarity hurts... I'm with @aisk
Yeah, I asked for this to be mentioned everywhere for that reason, as this is non-obvious behavior of multiprocessing. This could probably be refined a bit; I'll point some docs-focused reviewers at it.
> My concern is that if a user only uses one of these functions and checks that function's documentation, and unfortunately that is not the function that contains this document, they may mess it up.
> But I'm unsure about this.

IMO it may be better if we mentioned it once and created a reference link to the mention in every function. I'm not sure either (。・ω・。)
The repeated part could be a single sentence like "Ensures that the current global start method is set.", with "global start method" linking to a longer explanation in a dedicated section.
I'm not sure it needs to be mentioned in every method. There's a section at the top about contexts that can say the global start method is set by any function that does real work (or some appropriate language). I find we often have overarching concerns that apply throughout a module and don't repeat them. Sometimes people have to read widely in a module page in order to understand all the nuances.
Maybe I missed the discussion: why is this particular caveat important enough to sprinkle everywhere? What footgun are we preventing?
The callout about the surprising behavior is also (in addition to the issue for this PR) due to #109070, which only just added the caveat about `.get_context()` having the setting side effect behavior via https://github.com/python/cpython/pull/136341/files.
I agree that the https://docs.python.org/3.15/library/multiprocessing.html#contexts-and-start-methods section up top should be clearer about this. It really only ever mentions `get_context()` as an alternative today and never highlights that the context is implicitly set at instantiation time by all sorts of APIs... We do say "To select a start method you use the `set_start_method()` in the `if __name__ == '__main__'` clause of the main module." but we never actually explain the restrictions leading to that advice, the why, there... That would help.
After that, these mentions about context setting could turn into a single sentence, similar to what encoku suggested, that refs back to that section.
(The other part of this PR is documenting which things accept a ctx= parameter at all.)
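As a rough illustration of the `get_context()` alternative mentioned above: passing an explicit method name returns a context object without fixing the global start method, so library code can use it without constraining the application. This is only a sketch; the helper name `local_context_leaves_global_alone` is invented here, and `"spawn"` is chosen because it is available on every platform.

```python
import multiprocessing as mp


def local_context_leaves_global_alone():
    """Fetch an explicit context and report the global state around it."""
    before = mp.get_start_method(allow_none=True)

    # An explicit method name returns that context directly;
    # the global start method is left as it was.
    ctx = mp.get_context("spawn")

    after = mp.get_start_method(allow_none=True)
    return ctx.get_start_method(), before, after


if __name__ == "__main__":
    name, before, after = local_context_leaves_global_alone()
    print("context method:", name)
    print("global unchanged:", before == after)
```

Objects created through `ctx` (for example `ctx.Queue()` or `ctx.Process(...)`) then use that context rather than the global one.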
Hi, thank you all for reviewing this PR. I updated the document and added a new section to explain the implicit setting of the start method, and changed the other methods/types to link to this section. Please review it to see if this is suitable.
I noticed we say
Calling this may set the global start method. See
:ref:`global-start-method` for more details.
And we say this nearly all the time
But wouldn't it make more sense to say the following in some cases as we are dealing with classes ↓
Instantiating this class may set the global start method. See
:ref:`global-start-method` for more details.
Anyway that's what I think... feel free to correct me if I'm wrong :)
I probably should have mentioned this before but I forgot.
As of now we say something like this
Instantiating this class may set the global start method. See
:ref:`global-start-method` for more details.
But if you prefer we could also say something like this:
Instantiating this class may override the global start method. See
:ref:`global-start-method` for more details
Set --> override
But again up to you
I wouldn't use the word override for this behavior. If the global start method has already been set, it will be used. It isn't changed at that point. It's just that it is a set-once-before-first-use global. So if it has not yet been explicitly set, it will be set to the default by the APIs doing this, as having the right mp context is required.
@gpshead Do you think that then maybe saying "either set or alter" instead is better? Something like
|
@sharktide I believe "set" is the correct term, since you are not able to alter it after it is set.
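The "set once, not override" point can be checked with a short sketch. It assumes the default `force=False` behavior of `set_start_method()`, and the helper name `try_to_change_after_implicit_set` is hypothetical.

```python
import multiprocessing as mp


def try_to_change_after_implicit_set():
    """Fix the global start method implicitly, then try to set it again."""
    # With no argument, get_context() fixes the global start method
    # if it has not been set yet.
    mp.get_context()
    current = mp.get_start_method()

    try:
        # Without force=True this is rejected once the method is fixed.
        mp.set_start_method("spawn")
    except RuntimeError as exc:
        return current, str(exc)
    return current, None


if __name__ == "__main__":
    current, error = try_to_change_after_implicit_set()
    print("current method:", current)
    print("second set rejected with:", error)
```

The existing value is kept and the later call fails, which is why "set" describes the behavior better than "override" or "alter".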
I believe the requested change has been addressed.
Small suggestion, this probably doesn't matter too much
Around line 1142, we say
If *method* is ``None`` then the default context is returned. Note that if the global start method has not been set, this will set it.
See :ref:`global-start-method` for more details.
Earlier it used to say:
If *method* is ``None`` then the default context is returned. Note that if
the global start method has not been set, this will set it to the
default method.
Just asking, why did we remove "this will set it to the default method" and instead just say "this will set it"?
I think it might be a little better to say
If *method* is ``None`` then the default context is returned. Note that if the global start method has not been set, this will set it to the default method.
See :ref:`global-start-method` for more details.
But this doesn't really matter. Up to you all if you think this is worth it.
@aisk Thanks for your work on this PR. 🎉 I'm fine with any of the above options. @sharktide Thanks for taking the time to review and make suggestions. Usually, we will only mark a review "required changes" if there is technically something wrong and it is important to change the PR. For things that are suggestions, we typically use the comment function in the review. Keep up the good work. ☀️
The latest review comment is a suggested change so the requested change is not necessary.
Noted. Thanks!
Co-authored-by: Carol Willing <carolcode@willingconsulting.com>
Small grammatical error
The global start method can only be done once. If you need to change the
start method from the system default, you must proactively set the global start method
before calling functions or methods, or creating these objects.
It should instead say "Setting the global start method…", because without the word "Setting" the sentence doesn't make a lot of sense if you think about it. Sorry for being nitty :)
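The advice in the quoted paragraph (set the start method proactively, before anything else touches multiprocessing) might look like the sketch below. The helper `configure_start_method` is a made-up name, and the guard around `set_start_method()` is an assumption added so that the function is safe to call more than once.

```python
import multiprocessing as mp


def configure_start_method(method="spawn"):
    """Set the global start method only if nothing has fixed it yet."""
    # allow_none=True inspects the state without triggering the implicit set.
    if mp.get_start_method(allow_none=True) is None:
        mp.set_start_method(method)
    return mp.get_start_method()


if __name__ == "__main__":
    # Do this first, before creating queues, pools, locks or processes.
    print(configure_start_method("spawn"))
```

Calling this at the top of the `if __name__ == "__main__":` block keeps later calls such as `mp.Queue()` or `mp.Pool()` from fixing the method to the system default first.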
Doc/library/multiprocessing.rst
Outdated
Several multiprocessing functions and methods, as well as creating some objects, will implicitly
set the global start method to the system's default, if the global start method is not already
set. The global start method can only be done once. If you need to change the
-set. The global start method can only be done once. If you need to change the
+set. The global start method can only be set once. If you need to change the
@sharktide Hi, how about this?
sure !
EDIT: my other suggestion includes this
This time I left a bunch of suggestions at once so I don't bother anyone too much :)
@@ -2313,7 +2367,9 @@ with the :class:`Pool` class.
 the worker processes. Usually a pool is created using the
 function :func:`multiprocessing.Pool` or the :meth:`Pool` method
 of a context object. In both cases *context* is set
-appropriately.
+appropriately. If ``None``, calling this function will have the side effect
+of setting the current global start method if it has not been set already.
-of setting the current global start method if it has not been set already.
+of setting the current global start method to the
+system default if it has not been set already.
More small suggestions to improve reading flow
Up to you if you wanna take this one
Co-authored-by: R Chintan Meher <meherrihaan@gmail.com>
Ahh linter… I’ll give you a fix for that
This should make the linter happy
Co-authored-by: R Chintan Meher <meherrihaan@gmail.com>
I just had a thought
Where we say
Instantiating this class may set the global start method. See…
We could explicitly tell the user that it’ll be set to the system default, for 2 reasons:
1. Reading flow: the period seems a bit abrupt.
2. Clarity: people may not want to click the link for more info; this would give them a basic overview for most scenarios, so as not to interrupt reading in cases where this doesn’t really matter.
So maybe something like
Instantiating this class may set the global start method to the system default. See …
@aisk feel free to adapt this if you wish
Hi @sharktide, thank you for your review. I think we shouldn't mention too many details in every function's documentation. I think users can guess that the start method will be set to the system's default, or if they are confused, they can go to the link to check the details.
All right
Fine by me! Thanks for the clarification
📚 Documentation preview 📚: https://cpython-previews--136426.org.readthedocs.build/