Fixed #31169 -- Adapted the parallel test runner to use spawn. #15421

Merged
merged 1 commit into from
Mar 15, 2022

Conversation

smithdc1
Member

Ticket : #31169
Previous PR : #12646

This is my attempt at rebasing the previous PR and accommodating the comments. There was a lot of discussion on the previous PR, so I may not have captured everything yet, and I have likely not interpreted it all correctly either.

I've had to make some small changes so that this passes locally for me on both Windows and Linux, but I'm interested to see what the test suites here have to say. I've also changed get_max_test_processes() so the test suite will now run in parallel by default when spawn is in play.
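
For readers unfamiliar with the idea, here is a minimal sketch of how a default process count could depend on the multiprocessing start method. It is not the code in this PR: the function name `max_test_processes` and the `TEST_PROCESSES` environment variable are hypothetical, chosen only for illustration.

```python
import multiprocessing
import os


def max_test_processes(env_var="TEST_PROCESSES"):
    """Hypothetical helper: default number of worker processes."""
    # Only run in parallel when the start method is one the runner can handle.
    if multiprocessing.get_start_method() not in {"fork", "spawn"}:
        return 1
    # An explicit setting in the environment wins over the CPU count.
    try:
        return max(1, int(os.environ[env_var]))
    except (KeyError, ValueError):
        return multiprocessing.cpu_count()


if __name__ == "__main__":
    print(multiprocessing.get_start_method(), max_test_processes())
```

On platforms where the default start method is spawn (Windows, and macOS on recent Python versions), a check like this is what lets the suite default to more than one process.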

Member

@ngnpope ngnpope left a comment

Thanks for picking this up @smithdc1.

I've left a bunch of comments.

Comment on lines 144 to 147
second_db = sqlite3.connect(worker_db, uri=True)
source_db.backup(second_db)
source_db.close()
self.connection.settings_dict["NAME"] = worker_db
self.connection.connect()
second_db.close()
Member

So this section could do with a bit of thought. Looking at the Python sqlite docs, it seems we should do:

Suggested change
- second_db = sqlite3.connect(worker_db, uri=True)
- source_db.backup(second_db)
- source_db.close()
- self.connection.settings_dict["NAME"] = worker_db
- self.connection.connect()
- second_db.close()
+ second_db = sqlite3.connect(worker_db, uri=True)
+ with second_db:
+     source_db.backup(second_db)
+ source_db.close()
+ # Re-open connection to in-memory database before closing copy connection.
+ self.connection.settings_dict["NAME"] = worker_db
+ self.connection.connect()
+ second_db.close()

Not really sure what the context manager is for 🤷🏻‍♂️

Can we also rename second_db to target_db? While we're at it, perhaps we should also change the name of worker_db as it is just the connection string.

Other thoughts related to .backup() that might be worth exploring:

  • It takes a sleep argument with a default of 0.25. Maybe we can reduce this to speed up setup time?
  • It takes a progress argument accepting a callable. Perhaps we can pass verbosity to setup_worker_connection() and display progress information?
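
As a rough illustration of those two arguments, here is a standalone sketch using only the standard library. The table name, row count, and the `report` callback are made up for the example and are not part of the patch.

```python
import sqlite3

# Build a small source database to copy from.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany(
    "INSERT INTO item (name) VALUES (?)", [(f"n{i}",) for i in range(1000)]
)
source.commit()

target = sqlite3.connect(":memory:")
steps = []


def report(status, remaining, total):
    # Called after each backup step with the SQLite status code,
    # the number of pages still to copy, and the total page count.
    steps.append((remaining, total))


# pages=1 copies one page per step so the callback fires repeatedly;
# with the default pages=-1 the whole database is copied in a single step.
# sleep is the number of seconds to wait between successive attempts.
source.backup(target, pages=1, progress=report, sleep=0.001)

copied = target.execute("SELECT COUNT(*) FROM item").fetchone()[0]
print(copied, len(steps))

source.close()
target.close()
```

Whether a smaller sleep would measurably shorten test setup is an open question here; it would need timing on a real run.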

Member Author

I'm not sure what the with block is for either. I read the PR earlier and that didn't make it any clearer for me, as the implementation is written in C.

The second example shows it written more like it is here, but I agree that it is a little untidy.

I've not yet tested Nick's proposal, which I'd judge as more readable than the current attempt.

Member Author

I tested Nick's proposals and got a handful of test failures. I'm not entirely sure why, so I just went ahead with the renaming suggestions.

This still leaves the question about changing the other arguments to backup().

Contributor

FWIW, the with ... usage is documented as "Connection objects can be used as context managers that automatically commit or rollback transactions. In the event of an exception, the transaction is rolled back; otherwise, the transaction is committed", which is presumably useful and necessary when the data pages to be backed up are locked for writing etc (hence also, the sleep ...)?
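
A small standalone sketch of that documented behaviour, unrelated to the runner code itself (the table and values are invented for the example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (x INTEGER UNIQUE)")

# Normal exit from the block: the pending transaction is committed.
with con:
    con.execute("INSERT INTO t VALUES (1)")

# Exception inside the block: everything done in it is rolled back.
try:
    with con:
        con.execute("INSERT INTO t VALUES (2)")
        con.execute("INSERT INTO t VALUES (1)")  # violates UNIQUE
except sqlite3.IntegrityError:
    pass

rows = [row[0] for row in con.execute("SELECT x FROM t ORDER BY x")]
print(rows)  # [1]

# The context manager only manages the transaction; the connection
# stays open and still has to be closed explicitly.
con.close()
```

Note that leaving the with block does not close the connection, which is why the suggested change still calls close() afterwards.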

Member Author

I looked at this again. When using the context manager I again saw test failures (different to those before) and they were intermittent. I hadn't seen that with the current patch.

However given they are intermittent I'm not sure if it's an issue with the context manager or this patch.

@carltongibson
Member

Hey @smithdc1 — thanks for picking this up.

This is quite exciting:

[Image: Screenshot 2022-02-15 at 08 49 32]

First run, hitting an error (quickly 🙂) on macOS:

Exception in thread Thread-1 (create_object):
Traceback (most recent call last):
  File "/Users/carlton/Projects/Django/django/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "/Users/carlton/Projects/Django/django/django/db/backends/sqlite3/base.py", line 366, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.OperationalError: no such table: backends_object
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/threading.py", line 946, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/carlton/Projects/Django/django/tests/backends/sqlite/tests.py", line 277, in create_object
    Object.objects.create()
  File "/Users/carlton/Projects/Django/django/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/Users/carlton/Projects/Django/django/django/db/models/query.py", line 541, in create
    obj.save(force_insert=True, using=self.db)
  File "/Users/carlton/Projects/Django/django/django/db/models/base.py", line 830, in save
    self.save_base(
  File "/Users/carlton/Projects/Django/django/django/db/models/base.py", line 881, in save_base
    updated = self._save_table(
  File "/Users/carlton/Projects/Django/django/django/db/models/base.py", line 1024, in _save_table
    results = self._do_insert(
  File "/Users/carlton/Projects/Django/django/django/db/models/base.py", line 1065, in _do_insert
    return manager._insert(
  File "/Users/carlton/Projects/Django/django/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/Users/carlton/Projects/Django/django/django/db/models/query.py", line 1552, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/Users/carlton/Projects/Django/django/django/db/models/sql/compiler.py", line 1638, in execute_sql
    cursor.execute(sql, params)
  File "/Users/carlton/Projects/Django/django/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(
  File "/Users/carlton/Projects/Django/django/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/Users/carlton/Projects/Django/django/django/db/backends/utils.py", line 84, in _execute
    with self.db.wrap_database_errors:
  File "/Users/carlton/Projects/Django/django/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/Users/carlton/Projects/Django/django/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "/Users/carlton/Projects/Django/django/django/db/backends/sqlite3/base.py", line 366, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.OperationalError: no such table: backends_object
test_database_sharing_in_threads (backends.sqlite.tests.ThreadSharing) failed:

    AssertionError('1 != 2')
...

I shall have a little dig-in to that.

👍

@smithdc1
Member Author

sqlite3.OperationalError: no such table: backends_object

I was seeing something similar on the Pi with the previous patch. This is why I introduced the extra logic in setup_worker_connection to do something different for fork and spawn.

Maybe that step isn't quite right for macOS? 🤷

Building on Nick's comments, something doesn't quite seem right at the moment, but I find it hard to describe. It seems that we're 'cloning' a file for spawn (fork gets its own in-memory copy) but then copying (converting the various files back to in-memory) / reopening (fork?) again in each process in 'setup_worker_connection'.
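
To make the spawn flow being described a little more concrete, here is a rough standalone sketch: an on-disk clone is copied into a named, shared-cache in-memory database, and a second connection using the same URI then sees the same tables. This is not the patch's code; the file name `clone_1.sqlite3` and the database name `memorydb_worker_1` are hypothetical.

```python
import pathlib
import sqlite3
import tempfile

# Hypothetical name for the worker's shared in-memory database.
memory_uri = "file:memorydb_worker_1?mode=memory&cache=shared"

with tempfile.TemporaryDirectory() as tmpdir:
    # Parent side: an on-disk clone that a spawned worker can open by path.
    clone_path = pathlib.Path(tmpdir) / "clone_1.sqlite3"
    clone = sqlite3.connect(clone_path.as_uri(), uri=True)
    clone.execute("CREATE TABLE backends_object (id INTEGER PRIMARY KEY)")
    clone.commit()
    clone.close()

    # Worker side: copy the file into the in-memory database.
    source = sqlite3.connect(clone_path.as_uri(), uri=True)
    target = sqlite3.connect(memory_uri, uri=True)
    source.backup(target)
    source.close()

# A second connection to the same URI (for example from another thread)
# shares the data as long as at least one connection remains open.
other = sqlite3.connect(memory_uri, uri=True)
tables = [
    row[0]
    for row in other.execute("SELECT name FROM sqlite_master WHERE type='table'")
]
print(tables)

other.close()
target.close()
```

If the copy step is skipped, or the last connection to the in-memory database is closed before a new one is opened, the second connection would see an empty database, which would look like the "no such table" error above. Whether that is what happens on macOS is only a guess.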

@carltongibson
Member

carltongibson commented Feb 15, 2022

OK, I'm seeing two intermittent errors.

1

======================================================================
FAIL: test_delete_signals (signals.tests.SignalTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 594, in run
    self._callTearDown()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 552, in _callTearDown
    self.tearDown()
  File "/Users/carlton/Projects/Django/django/tests/signals/tests.py", line 32, in tearDown
    self.assertEqual(self.pre_signals, post_signals)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 837, in assertEqual
    assertion_func(first, second, msg=msg)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 1054, in assertTupleEqual
    self.assertSequenceEqual(tuple1, tuple2, msg, seq_type=tuple)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 1025, in assertSequenceEqual
    self.fail(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 667, in fail
    raise self.failureException(msg)
AssertionError: Tuples differ: (1, 0, 2, 1) != (1, 0, 2, 0)

First differing element 3:
1
0

- (1, 0, 2, 1)
?           ^

+ (1, 0, 2, 0)
?           ^


----------------------------------------------------------------------

2

======================================================================
FAIL: test_database_sharing_in_threads (backends.sqlite.tests.ThreadSharing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/Users/carlton/Projects/Django/django/tests/backends/sqlite/tests.py", line 283, in test_database_sharing_in_threads
    self.assertEqual(Object.objects.count(), 2)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 837, in assertEqual
    assertion_func(first, second, msg=msg)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 830, in _baseAssertEqual
    raise self.failureException(msg)
AssertionError: 1 != 2

3

Plus, occasionally, a "7 leaked semaphores" issue at shutdown:

Full output...
(django) carlton@Carltons-MacBook-Pro tests % ./runtests.py                      
Testing against Django installed in '/Users/carlton/Projects/Django/django/django' with up to 10 processes
Found 15580 test(s).
Creating test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Creating test database for alias 'other'...
Cloning test database for alias 'other'...
Cloning test database for alias 'other'...
Cloning test database for alias 'other'...
Cloning test database for alias 'other'...
Cloning test database for alias 'other'...
Cloning test database for alias 'other'...
Cloning test database for alias 'other'...
Cloning test database for alias 'other'...
Cloning test database for alias 'other'...
Cloning test database for alias 'other'...
System check identified no issues (17 silenced).
................................................................................................................................................................s.....................................................................................................................................................................................................................................................................................................s...........................................................................................................................................................s.................................................................................................................................................................................................................................................................................................................................................................................................................................................................................sssssss.........................................s....................................s..........................s.......................s...s...s....sss..................................................................................................................................................................................................................................................................................................................s............................s..............s..s....s.s....................s.s...s...s.s.............................................................................................................................................................................s..................................................................................................s..s..s..s...................................s.....................................sssssssssssssssssssssss
ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss.s..................................................s.......ss.....................................................ssss..............................s..............................................................................................................................................................sss.............................................................................................................................s.................s......................................................................................................................ssss........................sssss.sssssssssssssssssssssssssssssssssssss..........................................s.s.s...............................s......s................................................................................................................................................s...................................................s.sss............s..................s.........................x..s.............................x......................................s.................s......................................................................................................................s..................................................................................................s........................................s...s...........................................................................................ssss............s.......................ssss..............................s................................ss...........s..s.................................................................................................................................................................................................................................................................................................
.............................................................................................................................................................s.................................s......................................................................s.....................s.......................................................................................................................................................................sssssssssssssssssssssss....................................s.ss.....................................................................................................................................................sssssssss..........................................................................................................................................................s.................................ss...................................................................................................................................................................................................................sssssssssssssssssssss.....s...........sssssssssssssssssssssssss.........................................................................................................................................s..................................................................................................................................x................ss.sss.s...................s..........s...............................................................................................................................................................................................................ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss
ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss.....................................................................................................................................................................................................................................................................................................................s...........................................................................................................................................................................................................................................................................................................................................................................x.............................................................................................................................................sss..ss...........................................................................................................s........................................................................................................................................ssss..........................................................................s.....................ssssss..............................................................................................................................................................................................................................................................................................................................................................x..............................ss.............................................................................................................................................................................................................................................................................................................................................................................................
.............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ss............................................................................................................................................................................................................................................................................s....................................................................................................................................................................................................................................................................................................................................................................................ssssss................................................................................................................................................................................................................................s..........................................................................................................................................................................................ssssssssssssssssssssssssssssssssssssssssssssssss...............................................................................
..........................................................s.......................sssssss.....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................s.................................................................................................................................................................................................................................................................................................s........................................ssss....................................................ssssssssssssssssssssssssss.......Exception in thread Thread-1 (create_object):
Traceback (most recent call last):
  File "/Users/carlton/Projects/Django/django/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "/Users/carlton/Projects/Django/django/django/db/backends/sqlite3/base.py", line 366, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.OperationalError: no such table: backends_object

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/threading.py", line 946, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/carlton/Projects/Django/django/tests/backends/sqlite/tests.py", line 277, in create_object
    Object.objects.create()
  File "/Users/carlton/Projects/Django/django/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/Users/carlton/Projects/Django/django/django/db/models/query.py", line 541, in create
    obj.save(force_insert=True, using=self.db)
  File "/Users/carlton/Projects/Django/django/django/db/models/base.py", line 830, in save
    self.save_base(
  File "/Users/carlton/Projects/Django/django/django/db/models/base.py", line 881, in save_base
    updated = self._save_table(
  File "/Users/carlton/Projects/Django/django/django/db/models/base.py", line 1024, in _save_table
    results = self._do_insert(
  File "/Users/carlton/Projects/Django/django/django/db/models/base.py", line 1065, in _do_insert
    return manager._insert(
  File "/Users/carlton/Projects/Django/django/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/Users/carlton/Projects/Django/django/django/db/models/query.py", line 1552, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/Users/carlton/Projects/Django/django/django/db/models/sql/compiler.py", line 1638, in execute_sql
    cursor.execute(sql, params)
  File "/Users/carlton/Projects/Django/django/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(
  File "/Users/carlton/Projects/Django/django/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/Users/carlton/Projects/Django/django/django/db/backends/utils.py", line 84, in _execute
    with self.db.wrap_database_errors:
  File "/Users/carlton/Projects/Django/django/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/Users/carlton/Projects/Django/django/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "/Users/carlton/Projects/Django/django/django/db/backends/sqlite3/base.py", line 366, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.OperationalError: no such table: backends_object
F.........ss................................sss.s............................ss...ss...s.s................................................................................................................sss.......................................................................................................................................................................ss................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................s................................sss...........s...................................................................................................................................................................................................................................................................................................................................................................sss.sssssssssssssssssssssssss.......................s...s.............ssssssssssssssssssss.s.sssss.ssssssss......................................................s................................................................................................................................s.............................................................................................................................................s...........................s.............................................................................s............................................................................................................................................................................
.............................................................................ssssssssssssssssssss.......................................................................................................................................................................................................................................................................................................................................................sss...............................................s...........................................................................................................................................................................................................................................ssssssssssss...........................................................................................................................................................................................................................................................................................................................................................................................ss.............sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss.....sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss.........................ss....sssssss..sssss..............s...............................................sssssssssssssssss.sss..ssssssssssssssssssssssssssssssssssssssssssssssssssssss.......................................................................................................................................................s...........

test_database_writes (servers.tests.LiveServerDatabase) failed:

    <HTTPError 500: 'Internal Server Error'>

Unfortunately, the exception it raised cannot be pickled, making it impossible
for the parallel test runner to handle it cleanly.

Here's the error encountered while trying to pickle the exception:

    TypeError("cannot pickle '_io.BufferedReader' object")

You should re-run this test with the --parallel=1 option to reproduce the
failure and get a correct traceback.

Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'other'...
Destroying test database for alias 'other'...
Destroying test database for alias 'other'...
Destroying test database for alias 'other'...
Destroying test database for alias 'other'...
Destroying test database for alias 'other'...
Destroying test database for alias 'other'...
Destroying test database for alias 'other'...
Destroying test database for alias 'other'...
Destroying test database for alias 'other'...
Destroying test database for alias 'other'...
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/Users/carlton/Projects/Django/django/django/test/runner.py", line 443, in _run_subsuite
    result = runner.run(subsuite)
  File "/Users/carlton/Projects/Django/django/django/test/runner.py", line 366, in run
    test(result)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/suite.py", line 122, in run
    test(result)
  File "/Users/carlton/Projects/Django/django/django/test/testcases.py", line 258, in __call__
    self._setup_and_call(result)
  File "/Users/carlton/Projects/Django/django/django/test/testcases.py", line 293, in _setup_and_call
    super().__call__(result)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 650, in __call__
    return self.run(*args, **kwds)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 599, in run
    self._feedErrorsToResult(result, outcome.errors)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 518, in _feedErrorsToResult
    result.addError(test, exc_info)
  File "/Users/carlton/Projects/Django/django/django/test/runner.py", line 291, in addError
    self.check_picklable(test, err)
  File "/Users/carlton/Projects/Django/django/django/test/runner.py", line 216, in check_picklable
    self._confirm_picklable(err)
  File "/Users/carlton/Projects/Django/django/django/test/runner.py", line 186, in _confirm_picklable
    pickle.loads(pickle.dumps(obj))
TypeError: cannot pickle '_io.BufferedReader' object
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/carlton/Projects/Django/django/tests/./runtests.py", line 757, in <module>
    failures = django_tests(
  File "/Users/carlton/Projects/Django/django/tests/./runtests.py", line 421, in django_tests
    failures = test_runner.run_tests(test_labels)
  File "/Users/carlton/Projects/Django/django/django/test/runner.py", line 1044, in run_tests
    result = self.run_suite(suite)
  File "/Users/carlton/Projects/Django/django/django/test/runner.py", line 966, in run_suite
    return runner.run(suite)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/runner.py", line 176, in run
    test(result)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/Users/carlton/Projects/Django/django/django/test/runner.py", line 512, in run
    subsuite_index, events = test_results.next(timeout=0.1)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/pool.py", line 870, in next
    raise value
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/Users/carlton/Projects/Django/django/django/test/runner.py", line 443, in _run_subsuite
    result = runner.run(subsuite)
  File "/Users/carlton/Projects/Django/django/django/test/runner.py", line 366, in run
    test(result)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/suite.py", line 122, in run
    test(result)
  File "/Users/carlton/Projects/Django/django/django/test/testcases.py", line 258, in __call__
    self._setup_and_call(result)
  File "/Users/carlton/Projects/Django/django/django/test/testcases.py", line 293, in _setup_and_call
    super().__call__(result)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 650, in __call__
    return self.run(*args, **kwds)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 599, in run
    self._feedErrorsToResult(result, outcome.errors)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 518, in _feedErrorsToResult
    result.addError(test, exc_info)
  File "/Users/carlton/Projects/Django/django/django/test/runner.py", line 291, in addError
    self.check_picklable(test, err)
  File "/Users/carlton/Projects/Django/django/django/test/runner.py", line 216, in check_picklable
    self._confirm_picklable(err)
  File "/Users/carlton/Projects/Django/django/django/test/runner.py", line 186, in _confirm_picklable
    pickle.loads(pickle.dumps(obj))
TypeError: cannot pickle '_io.BufferedReader' object
^CException ignored in atexit callback: <function _exit_function at 0x1175c36d0>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/util.py", line 334, in _exit_function
Process SpawnPoolWorker-2:
    _run_finalizers(0)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/util.py", line 300, in _run_finalizers
    finalizer()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/util.py", line 224, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/pool.py", line 729, in _terminate_pool
    p.join()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/process.py", line 149, in join
    res = self._popen.wait(timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/popen_fork.py", line 43, in wait
    return self.poll(os.WNOHANG if timeout == 0.0 else 0)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/popen_fork.py", line 27, in poll
    pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt: 
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/queues.py", line 365, in get
    with self._rlock:
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/synchronize.py", line 95, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Exception ignored in atexit callback: <function rmtree at 0x10287e680>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/shutil.py", line 717, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/shutil.py", line 644, in _rmtree_safe_fd
    onerror(os.lstat, fullname, sys.exc_info())
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/shutil.py", line 641, in _rmtree_safe_fd
    orig_st = entry.stat(follow_symlinks=False)
FileNotFoundError: [Errno 2] No such file or directory: 'django_mxdmsp6q'
Exception ignored in: <Finalize object, dead>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/util.py", line 224, in __call__
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/synchronize.py", line 86, in _cleanup
ImportError: sys.meta_path is None, Python is likely shutting down
Exception ignored in: <Finalize object, dead>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/util.py", line 224, in __call__
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/synchronize.py", line 86, in _cleanup
ImportError: sys.meta_path is None, Python is likely shutting down
Exception ignored in: <Finalize object, dead>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/util.py", line 224, in __call__
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/synchronize.py", line 86, in _cleanup
ImportError: sys.meta_path is None, Python is likely shutting down
Exception ignored in: <Finalize object, dead>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/util.py", line 224, in __call__
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/synchronize.py", line 86, in _cleanup
ImportError: sys.meta_path is None, Python is likely shutting down
Exception ignored in: <Finalize object, dead>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/util.py", line 224, in __call__
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/synchronize.py", line 86, in _cleanup
ImportError: sys.meta_path is None, Python is likely shutting down
Exception ignored in: <Finalize object, dead>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/util.py", line 224, in __call__
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/synchronize.py", line 86, in _cleanup
ImportError: sys.meta_path is None, Python is likely shutting down
Exception ignored in: <Finalize object, dead>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/util.py", line 224, in __call__
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/synchronize.py", line 86, in _cleanup
ImportError: sys.meta_path is None, Python is likely shutting down
Exception ignored in: <function Pool.__del__ at 0x140f9fc70>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/pool.py", line 265, in __del__
ResourceWarning: unclosed running multiprocessing pool <multiprocessing.pool.Pool state=RUN pool_size=10>
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 7 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

The traceback just ends there. It gets blocked at shutdown. I have to hit ^C at:

TypeError: cannot pickle '_io.BufferedReader' object
^CException ignored in atexit callback: <function _exit_function at 0x1103c00d0>

(See the ^C there.)

... to get it to exit.


I also saw an Exception ignored in: <function CoroutineClearingView.__del__ at 0x...>
#15429 should address that.

@smithdc1
Member Author

Hi @carltongibson thanks for looking at this.

Thank you for highlighting those test failures. I'll have a look and see if I can get them to reproduce on Windows as I don't have macOS.

On the final, semaphore, issue I found this but I'm not sure if it is related to the issue shown in the traceback above.

python/cpython#30617

@smithdc1
Member Author

intermittent errors.

Ahh the best type of error. 😄

Running python .\runtests.py backends.sqlite numerous times I'm not able to reproduce a failure on Windows with Python 3.9.0. What version of Python are you using?

@carltongibson
Member

Python 3.10.

It passes more than it fails. I've just run it three times. Only on the third did I hit the TypeError: cannot pickle '_io.BufferedReader' object followed by ^C and then resource_tracker: There appear to be 7 leaked semaphore objects to clean up at shutdown. I'm pretty sure the last comes from the first. (Wonder if #15381 is relevant 🤔)

It's also significantly faster, so — if it's stable on Windows — I'm half-inclined to say let's have it and fix the isolation issues as we can. (We've had extended test failure issues on macOS for an age... until we commit to running CI on macOS I don't see that changing.) However... let me have another play.

@carltongibson
Member

I can't quite tell (before lunch at least 🥪) if @ngnpope's comments are all resolved? 🤔

@smithdc1
Member Author

I can't quite tell ... if .... comments are all resolved? 🤔

Not quite -- I've pushed a couple of small edits, squashed and rebased.

I've also been through and re-marked resolved/not resolved as I think this patch now stands. There are 3 comments that I've not yet had a chance to think about.

@smithdc1
Member Author

Am I right in thinking that Django's PR runners run the tests with parallel=1?

@smithdc1
Member Author

FAIL: test_database_sharing_in_threads (backends.sqlite.tests.ThreadSharing)

So I have seen this too, but on Linux, and like you say it was intermittent.

I think it's to do with this comment, so I've now reset the connection settings, which was missing for both fork and spawn until now.

        # connection.settings_dict must be updated in place for changes to be
        # reflected in django.db.connections. Otherwise new threads would
        # connect to the default database instead of the appropriate clone.

Carlton -- with my latest amends, does this fix the test failures you were seeing?

Am I right in thinking that Django's PR runners run the tests with parallel=1?

I therefore think this patch had a regression in it which wasn't being caught as CI doesn't run in parallel. While I've done some more testing with long run times on my devices I can't do enough "reps" to prove this is stable.

It would therefore be useful, I think, if this could be tested a little bit more widely before merging?

I'm getting there slowly, just 1 more of Nick's comments to investigate.

@smithdc1
Member Author

Hi @carltongibson

I think this is ready for review again. I'd appreciate your feedback to understand if the previous batch of test failures are now resolved.

There is one outstanding question about use of the context manager when migrating the DB back to memory.

As your timings are c. 7x-8x quicker than mine (I think I'm now memory constrained) I wondered if you could help test the reliability of this patch with/without the suggested context manager? 🙏

@carltongibson
Member

Thanks @smithdc1. Let me give it another run. 👍

Member

@carltongibson carltongibson left a comment

OK… so…

macOS.

Running with --parallel=1 works without issue, which is the situation ante, since that's all that works on main.

With this patch runtime is ≈38s vs 246s on main.

It occasionally fails, maybe 1 in 4 runs? Here I can re-run it (twice if need be) before the --parallel=1 version has finished.

Issues

The main error I'm seeing is this one:

test_database_writes (servers.tests.LiveServerDatabase) failed:

    <HTTPError 500: 'Internal Server Error'>

Unfortunately, the exception it raised cannot be pickled, making it impossible
for the parallel test runner to handle it cleanly.

Here's the error encountered while trying to pickle the exception:

    TypeError("cannot pickle '_io.BufferedReader' object")

This leads to a problem on shutdown:

(snipped)

multiprocessing.pool.RemoteTraceback: 

...

TypeError: cannot pickle '_io.BufferedReader' object

...

Exception ignored in: <function Pool.__del__ at 0x109c2dea0>

Previously this was freezing, requiring a ^C but it's exiting cleanly now.
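
The pickling failure itself can be reproduced in isolation. This is a minimal sketch, assuming the exception keeps a reference to an open binary stream (as an HTTP error carrying its response body might); `StreamError` and `pickling_failure` are hypothetical names, and the round trip mirrors the `pickle.loads(pickle.dumps(obj))` call visible in the traceback.

```python
import io
import pickle


class StreamError(Exception):
    """Hypothetical error that keeps a reference to a binary stream."""

    def __init__(self, message, fp=None):
        super().__init__(message)
        self.fp = fp


def pickling_failure(obj):
    """Return the exception raised by a pickle round trip, or None on success."""
    try:
        pickle.loads(pickle.dumps(obj))
    except Exception as exc:
        return exc
    return None


# Without a stream attached, the exception survives the round trip.
plain = pickling_failure(StreamError("Internal Server Error"))

# With a buffered reader attached, pickling the exception's state fails.
body = io.BufferedReader(io.BytesIO(b"Internal Server Error"))
with_stream = pickling_failure(StreamError("Internal Server Error", fp=body))

print(plain, type(with_stream).__name__)
```

Because results travel between worker and parent process as pickles, an exception like this cannot be passed back as-is, which is why the runner asks for a re-run with --parallel=1.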

Then I've seen this morning, one time each (over quite a few runs):

FAIL: test_delete_signals (signals.tests.SignalTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 594, in run
    self._callTearDown()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 552, in _callTearDown
    self.tearDown()
  File "/Users/carlton/Projects/Django/django/tests/signals/tests.py", line 32, in tearDown
    self.assertEqual(self.pre_signals, post_signals)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 837, in assertEqual
    assertion_func(first, second, msg=msg)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 1054, in assertTupleEqual
    self.assertSequenceEqual(tuple1, tuple2, msg, seq_type=tuple)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 1025, in assertSequenceEqual
    self.fail(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 667, in fail
    raise self.failureException(msg)
AssertionError: Tuples differ: (1, 0, 2, 1) != (1, 0, 2, 0)

First differing element 3:
1
0

- (1, 0, 2, 1)
?           ^

+ (1, 0, 2, 0)
?           ^

and


======================================================================
FAIL: test_database_sharing_in_threads (backends.sqlite.tests.ThreadSharing)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/Users/carlton/Projects/Django/django/tests/backends/sqlite/tests.py", line 271, in test_database_sharing_in_threads
    self.assertEqual(Object.objects.count(), 2)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 837, in assertEqual
    assertion_func(first, second, msg=msg)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/unittest/case.py", line 830, in _baseAssertEqual
    raise self.failureException(msg)
AssertionError: 1 != 2

# This was printed to stderr: 
Exception in thread Thread-1 (create_object):
Traceback (most recent call last):
  File "/Users/carlton/Projects/Django/django/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "/Users/carlton/Projects/Django/django/django/db/backends/sqlite3/base.py", line 357, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.OperationalError: no such table: backends_object

I think if we could work out the issue with test_database_writes (servers.tests.LiveServerDatabase) that would be the most of it.
(Obviously why the other two come up would be worth resolving too.)
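
One mechanism that can produce a "no such table" error in a thread, offered here only as a possible explanation, is the thread connecting to a different in-memory database than the one holding the tables. The sketch below uses the standard library `sqlite3` module with shared-cache in-memory URIs; the database names `worker_clone` and `default_db` are made up for illustration.

```python
import sqlite3
import threading

# Hypothetical names for two separate shared-cache in-memory databases.
CLONE_URI = "file:worker_clone?mode=memory&cache=shared"
DEFAULT_URI = "file:default_db?mode=memory&cache=shared"

# Keep one connection open so the in-memory clone stays alive.
keeper = sqlite3.connect(CLONE_URI, uri=True)
keeper.execute("CREATE TABLE backends_object (id INTEGER PRIMARY KEY)")
keeper.commit()

results = {}


def create_object(uri, label):
    # Each thread opens its own connection, as a new thread would in a test.
    conn = sqlite3.connect(uri, uri=True)
    try:
        conn.execute("INSERT INTO backends_object DEFAULT VALUES")
        conn.commit()
        results[label] = "ok"
    except sqlite3.OperationalError as exc:
        results[label] = str(exc)
    finally:
        conn.close()


for uri, label in [(CLONE_URI, "clone"), (DEFAULT_URI, "default")]:
    thread = threading.Thread(target=create_object, args=(uri, label))
    thread.start()
    thread.join()

row_count = keeper.execute("SELECT COUNT(*) FROM backends_object").fetchone()[0]
print(results, row_count)
```

The thread that connects to the clone finds the table; the thread that connects to the other name gets a fresh, empty database and fails.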

The issue with test_database_writes (servers.tests.LiveServerDatabase) doesn't reproduce running just servers.tests.LiveServerDatabase or servers tests.

I'm minded to take this and work on the edges in a much happier (faster) land.

@carltongibson
Member

There is one outstanding question about use of the context manager when migrating the DB back to memory.

As your timings are c. 7x-8x quicker than mine (I think I'm now memory constrained) I wondered if you could help test the reliability of this patch with/without the suggested context manager?

I need to look at this still.

I tested already with disk vs in memory SQLite, which made no difference to the behaviour.

@carltongibson
Member

carltongibson commented Feb 23, 2022

I'm not seeing any difference with the context manager approach (discussed above):

    with target_db:
        source_db.backup(target_db)

By the time we reach the end of the Cloning test database for alias ... phase, it's already played out I guess. 🤔
(If this were to go wrong, we'd want to abort no?)
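
For reference, the context manager form can be tried standalone with the standard library. This is a minimal sketch using two in-memory databases and a made-up table, not the patch's actual cloning code; `sqlite3.Connection.backup()` is the real API, and using a connection as a context manager commits on success or rolls back on an exception without closing the connection.

```python
import sqlite3

# Source database with some data to copy (table name is illustrative only).
source_db = sqlite3.connect(":memory:")
source_db.execute("CREATE TABLE demo_item (id INTEGER PRIMARY KEY, name TEXT)")
source_db.executemany("INSERT INTO demo_item (name) VALUES (?)", [("a",), ("b",)])
source_db.commit()

# Target database that receives the copy.
target_db = sqlite3.connect(":memory:")

# The with block commits if the backup succeeds and rolls back if it raises.
with target_db:
    source_db.backup(target_db)
source_db.close()

copied_rows = target_db.execute("SELECT COUNT(*) FROM demo_item").fetchone()[0]
print(copied_rows)
target_db.close()
```

If the backup raised, the exception would propagate out of the `with` block, so a caller could abort at that point.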

@smithdc1 — what were the errors you saw? (Similar to those I'm seeing or others?)

@smithdc1
Member Author

@smithdc1 — what were the errors you saw? (Similar to those I'm seeing or others?)

Last time I tried it I had a file access error (so something different). But I only saw it the once, so hard to tell if it is related to this approach.

@smithdc1
Member Author

Or revert 'em and see if anything breaks. 😄

It breaks. I see 5 failures running postgres tests in parallel without this change, see gist.

Looking at the previous PR, there was some discussion here and here but nothing on the why from my reading.

Assuming we're happy to keep it, I think all comments are now addressed. I've also rebased for the hook that was pulled out into a separate PR earlier today.

@smithdc1
Member Author

Hi All,

I've been trying to replicate some of the test failures seen here using Windows this morning. I tried using --shuffle and saw a number of tests fail but the same tests don't always fail each time. After a few attempts I've been able to get one case narrowed down but am a little unsure at the moment where to go next with this.

With this command / setup I see repeated test failures. (Not so when running the whole test suite.) The tests pass when running in a single process with --parallel=1
python runtests.py --shuffle 2714725009 .\admin_scripts\ .\asgi\


(django) PS C:\Users\smith\PycharmProjects\django\tests> python .\runtests.py --shuffle 2714725009 .\admin_scripts\ .\asgi\
Testing against Django installed in 'c:\users\smith\pycharmprojects\django\django' with up to 8 processes
Using shuffle seed: 2714725009 (given)
Found 218 test(s).
Creating test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
System check identified no issues (0 silenced).
.....................................................................E..................................................................................................................................................s.Exception ignored in: <_io.FileIO name='C:\\Users\\smith\\PycharmProjects\\django\\tests\\asgi\\urls.py' mode='rb' closefd=True>
ResourceWarning: unclosed file <_io.FileIO name='C:\\Users\\smith\\PycharmProjects\\django\\tests\\asgi\\urls.py' mode='rb' closefd=True>

======================================================================
ERROR: test_file_response (asgi.tests.ASGITest)
Makes sure that FileResponse works over ASGI.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\smith\AppData\Local\Programs\Python\Python39\lib\unittest\case.py", line 59, in testPartExecutor
    yield
  File "C:\Users\smith\AppData\Local\Programs\Python\Python39\lib\unittest\case.py", line 593, in run
    self._callTestMethod(testMethod)
  File "C:\Users\smith\AppData\Local\Programs\Python\Python39\lib\unittest\case.py", line 550, in _callTestMethod
    method()
  File "C:\Users\smith\PycharmProjects\venv\django\lib\site-packages\asgiref\sync.py", line 223, in __call__
    return call_result.result()
  File "C:\Users\smith\AppData\Local\Programs\Python\Python39\lib\concurrent\futures\_base.py", line 433, in result
    return self.__get_result()
  File "C:\Users\smith\AppData\Local\Programs\Python\Python39\lib\concurrent\futures\_base.py", line 389, in __get_result
    raise self._exception
  File "C:\Users\smith\PycharmProjects\venv\django\lib\site-packages\asgiref\sync.py", line 292, in main_wrap
    result = await self.awaitable(*args, **kwargs)
  File "C:\Users\smith\PycharmProjects\django\tests\asgi\tests.py", line 77, in test_file_response
    response_start = await communicator.receive_output()
  File "C:\Users\smith\PycharmProjects\venv\django\lib\site-packages\asgiref\testing.py", line 85, in receive_output
    raise e
  File "C:\Users\smith\PycharmProjects\venv\django\lib\site-packages\asgiref\testing.py", line 74, in receive_output
    return await self.output_queue.get()
  File "C:\Users\smith\PycharmProjects\venv\django\lib\site-packages\asgiref\timeout.py", line 66, in __aexit__
    self._do_exit(exc_type)
  File "C:\Users\smith\PycharmProjects\venv\django\lib\site-packages\asgiref\timeout.py", line 103, in _do_exit
    raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError

----------------------------------------------------------------------
Ran 218 tests in 102.811s

FAILED (errors=1, skipped=1)
Used shuffle seed: 2714725009 (given)
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
(django) PS C:\Users\smith\PycharmProjects\django\tests> python .\runtests.py --shuffle 2714725009 .\admin_scripts\ .\asgi\ --failfast
Testing against Django installed in 'c:\users\smith\pycharmprojects\django\django' with up to 8 processes
Using shuffle seed: 2714725009 (given)
Found 218 test(s).
Creating test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
Cloning test database for alias 'default'...
System check identified no issues (0 silenced).
......................................................E
======================================================================
ERROR: test_file_response (asgi.tests.ASGITest)
Makes sure that FileResponse works over ASGI.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\smith\AppData\Local\Programs\Python\Python39\lib\unittest\case.py", line 59, in testPartExecutor
    yield
  File "C:\Users\smith\AppData\Local\Programs\Python\Python39\lib\unittest\case.py", line 593, in run
    self._callTestMethod(testMethod)
  File "C:\Users\smith\AppData\Local\Programs\Python\Python39\lib\unittest\case.py", line 550, in _callTestMethod
    method()
  File "C:\Users\smith\PycharmProjects\venv\django\lib\site-packages\asgiref\sync.py", line 223, in __call__
    return call_result.result()
  File "C:\Users\smith\AppData\Local\Programs\Python\Python39\lib\concurrent\futures\_base.py", line 433, in result
    return self.__get_result()
  File "C:\Users\smith\AppData\Local\Programs\Python\Python39\lib\concurrent\futures\_base.py", line 389, in __get_result
    raise self._exception
  File "C:\Users\smith\PycharmProjects\venv\django\lib\site-packages\asgiref\sync.py", line 292, in main_wrap
    result = await self.awaitable(*args, **kwargs)
  File "C:\Users\smith\PycharmProjects\django\tests\asgi\tests.py", line 77, in test_file_response
    response_start = await communicator.receive_output()
  File "C:\Users\smith\PycharmProjects\venv\django\lib\site-packages\asgiref\testing.py", line 85, in receive_output
    raise e
  File "C:\Users\smith\PycharmProjects\venv\django\lib\site-packages\asgiref\testing.py", line 74, in receive_output
    return await self.output_queue.get()
  File "C:\Users\smith\PycharmProjects\venv\django\lib\site-packages\asgiref\timeout.py", line 66, in __aexit__
    self._do_exit(exc_type)
  File "C:\Users\smith\PycharmProjects\venv\django\lib\site-packages\asgiref\timeout.py", line 103, in _do_exit
    raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError

----------------------------------------------------------------------
Ran 55 tests in 28.541s

FAILED (errors=1)
Used shuffle seed: 2714725009 (given)
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Destroying test database for alias 'default'...
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "C:\Users\smith\AppData\Local\Programs\Python\Python39\lib\shutil.py", line 617, in _rmtree_unsafe
    os.rmdir(path)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\smith\\AppData\\Local\\Temp\\django_84lgia__\\django_4vu32qt5\\tmp5n4kffn1\\test_project'

Full output from this morning's logs here

@carltongibson
Member

Hey @smithdc1 — interesting. Typically, I'm not seeing that failure running the same here. 🤔 (at e445604)

I think we should try to divide the issues into:

  1. Problems with this patch
  2. Latent test-isolation bugs, that only manifest with this patch.

I can well believe we'll find plenty of 2. But do we think there are any 1s left? If not I think we can probably go for it, and resolve the 2s with time and visibility, once folks are running the test suite in parallel in more environments. (Or so I might argue.)

@smithdc1
Member Author

I think that's difficult to tell as we need this patch to find the isolation issues. Especially as they seem hard to replicate.

I was thinking if we are happy that this patch doesn't create a regression in the following scenarios then we should progress.

  • parallel=1 on all platforms
  • multiprocess with fork/Linux.

People using spawn can always revert to parallel=1 which is what they had before this patch if they encounter an issue?

@carltongibson
Member

@smithdc1 Yes, that's more or less my reasoning too. (Plus it's so much faster that I'd take the failures whilst we track 'em down as a trade-off without regret.)

Member

@felixxm felixxm left a comment

@smithdc1 Thanks 👍 I squashed commits, rebased, and pushed small edits.

@felixxm felixxm force-pushed the ticket_31169 branch 2 times, most recently from 29dd40c to f667201 Compare March 1, 2022 12:18
@felixxm
Member

felixxm commented Mar 1, 2022

@carltongibson Can you check this with selenium tests?

@giff-h
Contributor

giff-h commented Mar 1, 2022

Question: Could this be considered an extended support upgrade for 3.2?

I ask because 3.2 is the last version of Django to support Python 3.7, and 3.7 is the last version of Python that can run parallel tests on mac by setting the OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES environment variable. If this is only intended for 4.0+, it could mean a slight disruption to development environments on mac machines as they upgrade the Python of their deployment environments and then the Django of the project.

@carltongibson
Member

@hamstap85 This should be part of Django 4.1. It doesn't qualify for a backport to Django 3.2, which is now in extended support, and only receives security and data loss fixes anyways.

@carltongibson
Member

@carltongibson Can you check this with selenium tests?

As it stands, Selenium tests are all skipped unless --parallel=1 (Presumably because we're not passing the selenium option to the forked process... 🤔)

I'd suggest getting that working as a follow-up.

Maybe an error if fork is the start method and parallel is not 1, like we do with --pdb...

% ./runtests.py --pdb
...
ValueError: You cannot use --pdb with parallel tests; pass --parallel=1 to use it.

@giff-h
Contributor

giff-h commented Mar 2, 2022

@carltongibson thanks for clarifying 👍

@smithdc1
Member Author

smithdc1 commented Mar 2, 2022

I've not had a chance to test it on Windows yet, but maybe something like this?

diff --git a/tests/runtests.py b/tests/runtests.py
index 06755688ea..48840b0042 100755
--- a/tests/runtests.py
+++ b/tests/runtests.py
@@ -3,6 +3,7 @@ import argparse
 import atexit
 import copy
 import gc
+import multiprocessing
 import os
 import shutil
 import socket
@@ -683,6 +684,16 @@ if __name__ == "__main__":
 
     options = parser.parse_args()
 
+    if (
+        options.selenium
+        and options.parallel != 1
+        and multiprocessing.get_start_method() != "fork"
+    ):
+        raise ValueError(
+            "You cannot use --selenium with parallel tests; "
+            "pass --parallel=1 to use it."
+        )
+
     using_selenium_hub = options.selenium and options.selenium_hub
     if options.selenium_hub and not options.selenium:
         parser.error(

@carltongibson
Member

@smithdc1 I think that would be just right. Do you have the capacity to add that, and a test? — I think then we can get this in. 😜

@carltongibson
Member

carltongibson commented Mar 9, 2022

@smithdc1 — I added a check for the --selenium and --parallel flags (with spawn). Since the --selenium option doesn't pass down to the discover runner, there's no convenient test location (as you already saw 🙂). We'll go without.
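For readers following along, here is a minimal sketch of the kind of guard being discussed, written as a standalone function so it can be exercised without Django. The function name `check_selenium_parallel` and its parameters are hypothetical and are not taken from the merged commit; only `multiprocessing.get_start_method()` is a real standard-library call. The sketch assumes the check applies when the start method is spawn, as described in the comment above.

```python
import multiprocessing


def check_selenium_parallel(selenium, parallel, start_method=None):
    """Reject Selenium runs that use parallel workers under spawn.

    Hypothetical helper for illustration only.
    selenium: truthy when the --selenium option was given.
    parallel: the requested number of worker processes.
    start_method: overrides the detected start method (useful in tests).
    """
    if start_method is None:
        # Ask multiprocessing which start method is active on this platform.
        start_method = multiprocessing.get_start_method()
    if selenium and parallel != 1 and start_method == "spawn":
        raise ValueError(
            "You cannot use --selenium with parallel tests; "
            "pass --parallel=1 to use it."
        )
```

Passing `start_method` explicitly is a design choice of this sketch: it lets the check be tested on any platform without changing the global start method.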

Just now discussing the mark_expected_failures_and_skips() question with @felixxm

@smithdc1
Member Author

smithdc1 commented Mar 9, 2022

Thank you for pushing this along. I was still contemplating how to add a test 😄

Co-authored-by: Valz <ahmadahussein0@gmail.com>
Co-authored-by: Nick Pope <nick@nickpope.me.uk>
@carltongibson carltongibson merged commit 3b3f38b into django:main Mar 15, 2022
@smithdc1 smithdc1 deleted the ticket_31169 branch March 15, 2022 15:30
@felixxm
Member

felixxm commented Mar 16, 2022

@smithdc1 @carltongibson This seems to have caused some issues:

Running tests...
----------------------------------------------------------------------
Process SpawnPoolWorker-2:
Traceback (most recent call last):
  File "C:\Python310\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "C:\Python310\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Python310\lib\multiprocessing\pool.py", line 109, in worker
    initializer(*initargs)
  File "C:\Jenkins\workspace\django-windows\database\sqlite3\label\windows\python\Python310\django\test\runner.py", line 429, in _init_worker
    connection.settings_dict.update(initial_settings[alias])
TypeError: 'NoneType' object is not subscriptable
Process SpawnPoolWorker-1:
Traceback (most recent call last):
  File "C:\Python310\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "C:\Python310\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Python310\lib\multiprocessing\pool.py", line 109, in worker
    initializer(*initargs)
  File "C:\Jenkins\workspace\django-windows\database\sqlite3\label\windows\python\Python310\django\test\runner.py", line 429, in _init_worker
    connection.settings_dict.update(initial_settings[alias])
TypeError: 'NoneType' object is not subscriptable
Process SpawnPoolWorker-3:
Traceback (most recent call last):
  File "C:\Python310\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "C:\Python310\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Python310\lib\multiprocessing\pool.py", line 109, in worker
    initializer(*initargs)
  File "C:\Jenkins\workspace\django-windows\database\sqlite3\label\windows\python\Python310\django\test\runner.py", line 429, in _init_worker
    connection.settings_dict.update(initial_settings[alias])
TypeError: 'NoneType' object is not subscriptable

...

  File "C:\Python310\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "C:\Python310\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Python310\lib\multiprocessing\pool.py", line 109, in worker
    initializer(*initargs)
  File "C:\Jenkins\workspace\django-windows\database\sqlite3\label\windows\python\Python310\django\test\runner.py", line 429, in _init_worker
    connection.settings_dict.update(initial_settings[alias])
TypeError: 'NoneType' object is not subscriptable
Process SpawnPoolWorker-70432:

I stopped this build on 70432 workers 🤯 See logs.

@felixxm
Member

felixxm commented Mar 16, 2022

This is probably related to the fact that we use

TEST_RUNNER = 'xmlrunner.extra.djangotestrunner.XMLTestRunner'

on Jenkins. However, folks can use different test runners and it shouldn't crash.
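The traceback above fails on `initial_settings[alias]` because `initial_settings` is `None` in the worker. As an illustration only, the following sketch shows one defensive pattern for an initializer whose optional argument may not be supplied by every caller, such as a third-party test runner. The function `init_worker` and its parameters are hypothetical stand-ins; this is not Django's `_init_worker` and not necessarily the fix proposed in the follow-up PR.

```python
def init_worker(settings_by_alias, initial_settings=None):
    """Restore per-alias settings only when they were actually provided.

    Hypothetical stand-in for a pool initializer.
    settings_by_alias: dict mapping a database alias to its settings dict.
    initial_settings: dict of per-alias overrides, or None when the caller
    does not pass any.
    """
    for alias, settings in settings_by_alias.items():
        # Indexing None raises
        # "TypeError: 'NoneType' object is not subscriptable",
        # so skip the update when no initial settings were given.
        if initial_settings is not None and alias in initial_settings:
            settings.update(initial_settings[alias])
    return settings_by_alias
```

A guard like this matters more than usual in a pool initializer: `multiprocessing.Pool` replaces workers that exit, so an initializer that always raises can cause workers to be respawned repeatedly, which is consistent with the runaway worker count in the log.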

@smithdc1
Member Author

A draft PR with a potential fix at #15520

@adamchainz
Member

A bit late, but I think this deserves a release note!

At the least, projects that have patched to allow fork on macOS will be able to remove the patch. Possibly there are other workarounds and test scripts users will want to update now they can run tests in parallel on all platforms!

@carltongibson
Member

carltongibson commented Apr 14, 2022

Good call. @adamchainz How about #15599 ?
