DISABLED test_graph_partition_forward_with_skipped_cudagraphed_backward (__main__.CudaGraphTreeTests) #157723

Description

Platforms: rocm

This test was disabled because it is failing in CI. See recent examples and the most recent trunk workflow logs.

Over the past 6 hours, it has been determined flaky in 16 workflow(s) with 32 failures and 16 successes.

Debugging instructions (after clicking on the recent samples link):
DO NOT ASSUME THINGS ARE OKAY IF THE CI IS GREEN. We now shield flaky tests from developers, so CI will be green, but the relevant failures will be harder to find in the logs.
To find relevant log snippets:

  1. Click on the workflow logs linked above
  2. Click on the Test step of the job so that it is expanded. Otherwise, the grepping will not work.
  3. Grep for test_graph_partition_forward_with_skipped_cudagraphed_backward
  4. There should be several runs of the test (flaky tests are rerun in CI) whose logs you can study; one way to filter a downloaded log for this test is sketched below.
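
As a minimal illustrative sketch (not part of the bot's standard instructions): filtering a locally downloaded CI log for lines that mention this test. The log file name "ci_log.txt" is a placeholder for whatever raw log you saved from the workflow page.

# Sketch: scan a locally downloaded CI log for lines mentioning the flaky test.
# "ci_log.txt" is a placeholder file name, not something produced by CI itself.
from pathlib import Path

TEST_NAME = "test_graph_partition_forward_with_skipped_cudagraphed_backward"

def find_test_lines(log_path: str, test_name: str = TEST_NAME) -> list[str]:
    """Return every log line that mentions the test, prefixed with its line number."""
    hits = []
    text = Path(log_path).read_text(errors="replace")
    for lineno, line in enumerate(text.splitlines(), start=1):
        if test_name in line:
            hits.append(f"{lineno}: {line.strip()}")
    return hits

if __name__ == "__main__":
    for hit in find_test_lines("ci_log.txt"):
        print(hit)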
Sample error message
Traceback (most recent call last):
  File "/var/lib/jenkins/pytorch/test/inductor/test_cudagraph_trees.py", line 2936, in test_graph_partition_forward_with_skipped_cudagraphed_backward
    out.backward(back_inp)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py", line 653, in backward
    torch.autograd.backward(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/__init__.py", line 354, in backward
    _engine_run_backward(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py", line 829, in _engine_run_backward
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py", line 311, in apply
    return user_fn(self, *args)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 2260, in backward
    return impl_fn()
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 2246, in impl_fn
    out = CompiledFunction._backward_impl(ctx, all_args)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 2348, in _backward_impl
    CompiledFunction.compiled_bw = aot_config.bw_compiler(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 482, in __call__
    return self.compiler_fn(gm, example_inputs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/backends/common.py", line 76, in _wrapped_bw_compiler
    disable(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 962, in _fn
    return fn(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_utils_internal.py", line 98, in wrapper_function
    return function(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 2373, in bw_compiler
    return inner_compile(
  File "/opt/conda/envs/py_3.10/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 767, in compile_fx_inner
    return wrap_compiler_debug(_compile_fx_inner, compiler_name="inductor")(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/repro/after_aot.py", line 124, in debug_wrapper
    inner_compiled_fn = compiler_fn(gm, example_inputs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1021, in _compile_fx_inner
    compiled_graph.post_compile(example_inputs, constants, graph_kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/output_code.py", line 637, in post_compile
    cudagraph_partition_post_compile(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/output_code.py", line 274, in cudagraph_partition_post_compile
    maybe_handle_backward_generation(compiled_graph, boxed_forward_device_index)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/output_code.py", line 161, in maybe_handle_backward_generation
    assert manager is not None
AssertionError

Test file path: inductor/test_cudagraph_trees_expandable_segments.py
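
As a rough sketch (not taken from the original report): one way to re-run only this test from a local PyTorch source checkout, assuming pytest is installed and a ROCm (or CUDA) build of torch is importable. The checkout path is a placeholder.

# Sketch: re-run only the flaky test from a PyTorch source checkout.
# Assumes pytest is installed and a GPU-enabled torch build is importable;
# "path/to/pytorch" is a placeholder for your local checkout directory.
import subprocess
import sys

TEST_FILE = "test/inductor/test_cudagraph_trees_expandable_segments.py"
TEST_NAME = "test_graph_partition_forward_with_skipped_cudagraphed_backward"

result = subprocess.run(
    [sys.executable, "-m", "pytest", TEST_FILE, "-k", TEST_NAME, "-v"],
    cwd="path/to/pytorch",  # placeholder: your local pytorch checkout
)
print(f"pytest exited with code {result.returncode}")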

For all disabled tests (by GitHub issue), see https://hud.pytorch.org/disabled.

cc @ptrblck @msaroufim @eqy @jerryzh168 @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd @clee2000

Labels

    module: cuda (Related to torch.cuda, and CUDA support in general)
    module: flaky-tests (Problem is a flaky test in CI)
    module: rocm (AMD GPU support for PyTorch)
    skipped (Denotes a (flaky) test currently skipped in CI)
    triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

