Description
🐛 Describe the bug
If you run detectron2_fcos_r_50_fpn with automatic_dynamic_shapes, you'll notice dynamic-shape recompilations that aren't really caused by dynamic shapes:
```
[2023-06-15 05:43:07,747] torch.fx.experimental.symbolic_shapes: [INFO] 12.0: create_env
[2023-06-15 05:43:08,412] torch.fx.experimental.symbolic_shapes: [INFO] 12.0: produce_guards
[2023-06-15 05:43:08,423] torch.fx.experimental.symbolic_shapes: [INFO] 12.1: create_env
[2023-06-15 05:43:08,425] torch._dynamo.variables.builder: [WARNING] automatic dynamic L['input'] size(1) 256 != 64
[2023-06-15 05:43:08,425] torch.fx.experimental.symbolic_shapes: [INFO] 12.1: create_symbol s0 = 256 for L['input'].size()[1]
[2023-06-15 05:43:08,525] torch.fx.experimental.symbolic_shapes: [INFO] 12.1: eval Eq(s0, 256) [guard added] at home/ezyang/local/b/pytorch-env/lib/python3.10/site-packages/detectron2/layers/wrappers.py:106 in forward (_subclasses/fake_tensor.py:612 in conv)
[2023-06-15 05:43:09,075] torch.fx.experimental.symbolic_shapes: [INFO] 12.1: produce_guards
```
Instead, they're due to the architecture of detectron2_fcos_r_50_fpn, which stacks a number of similar blocks together, each with differing parameters. For Dynamo's purposes, these blocks are all running the same code object, and so every differently configured block gets chucked into the same code cache.
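To make the pattern concrete, here is a minimal sketch of that shape of model; the Block class, channel counts, and input sizes below are made up for illustration, not taken from detectron2:

```python
import torch
import torch._dynamo
import torch.nn as nn

torch._dynamo.config.automatic_dynamic_shapes = True  # flag named in the report


class Block(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        return torch.relu(self.conv(x))


# Every instance shares the single Block.forward code object, which is where
# Dynamo hangs its code cache.
blocks = [Block(3, 64), Block(64, 256), Block(256, 512)]
compiled = [torch.compile(b) for b in blocks]

x = torch.randn(1, 3, 32, 32)
for block in compiled:
    x = block(x)
# x.size(1) changes across calls to the same code object (3 -> 64 -> 256), so
# automatic dynamic shapes decides the channel dim is "dynamic" and recompiles,
# even though each block is really just a different static specialization.
```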
I think if our policy is that we want to specialize on every unique parameter size (this seems reasonable to me), it would be best if we could arrange for our compiled code to NOT all live on the same code object.
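As a strawman illustration of what that could mean (this is not an existing Dynamo or torch.compile API, just a sketch of the idea), one could imagine giving every instance its own copy of the forward code object, so that the per-code-object cache no longer mixes configurations:

```python
import types

import torch.nn as nn


def clone_forward_code(module: nn.Module) -> nn.Module:
    """Hypothetical helper: rebind this instance's forward onto a fresh copy of
    the forward code object, so Dynamo's code cache (which hangs off the code
    object) ends up per-instance rather than shared across all instances."""
    fn = type(module).forward
    fresh = types.FunctionType(
        fn.__code__.replace(),  # identical bytecode, but a distinct code object
        fn.__globals__,
        fn.__name__,
        fn.__defaults__,
        fn.__closure__,
    )
    module.forward = types.MethodType(fresh, module)
    return module


# Usage with the Block instances from the sketch above:
#   blocks = [clone_forward_code(b) for b in blocks]
# after which each block's forward specializes independently instead of
# tripping automatic dynamic shapes on its siblings' sizes.
```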
@anijain2305 also mentioned to me that he noticed this could contribute to guard overhead, although ISTR that in the end it wasn't clear how much this actually mattered.
Versions
main