
flex attention: fix dispatch order for tensor subclasses, avoid hardcoding call to faketensor impl in dynamo #151719


Closed
wants to merge 14 commits into from

Conversation

bdhirsh
Contributor

@bdhirsh bdhirsh commented Apr 18, 2025

This is enough to get @XilunWu's stack into a state where his flex_attention DTensor implementations worked E2E for me. It also required these changes on the DTensor side, to properly add a DTensor rule for flex backward: P1789852198

There are two problems:

(1) in the normal dispatcher, we have a precedence ordering between modes and subclasses. Modes are dispatched to first, but modes are allowed to return NotImplemented, giving subclasses a chance to run.

This normally happens automatically in `FakeTensorMode.__torch_dispatch__` and `FunctionalTensorMode.__torch_dispatch__`. However, since HOPs implement these two modes themselves, HOPs do not get this benefit. For now, I ended up hardcoding this `NotImplemented` logic directly into the functional/fake rules for flex attention.

Having to do this for every HOP seems a bit painful. If we could plumb every HOP through `Fake[|Functional]TensorMode.__torch_dispatch__` then we would get this support. Another option could be to just assume that most HOP <> mode implementations want the same treatment by default, and hardcode this `NotImplemented` logic into `torch/_ops.py`. I'm not sure if we'd need a way for the HOP to opt out of this though.
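For context, here is a minimal sketch of the precedence pattern described above (not this PR's code; the mode below is a toy for illustration only). A dispatch mode returns `NotImplemented` when it sees a tensor subclass it doesn't recognize, which hands dispatch back to that subclass's `__torch_dispatch__` -- the behavior `FakeTensorMode`/`FunctionalTensorMode` already provide for regular ops:

```python
import torch
from torch.utils._python_dispatch import TorchDispatchMode
from torch.utils._pytree import tree_flatten


class ExampleMode(TorchDispatchMode):
    """Toy mode illustrating mode-before-subclass precedence (illustrative only)."""

    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        flat_args, _ = tree_flatten((args, kwargs))
        # If any input is a tensor subclass this mode doesn't understand,
        # decline to handle the op so the subclass's __torch_dispatch__
        # gets a chance to run.
        if any(
            isinstance(a, torch.Tensor) and type(a) is not torch.Tensor
            for a in flat_args
        ):
            return NotImplemented
        # Otherwise handle the op here; a real mode would do its own thing.
        return func(*args, **kwargs)
```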

(2) We were hardcoding a call to flex attention's fake implementation in dynamo to run fake prop. This is technically wrong for subclasses, because it doesn't give subclasses the chance to interpose on the op and desugar it before fake prop runs. I tweaked dynamo's logic to call the op, and let the dispatcher handle invoking the fake implementation.
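As a rough illustration of the principle (using an ordinary aten op rather than the flex_attention HOP, and not showing the dynamo change itself): running the op on fake inputs and letting the dispatcher route to the registered fake implementation, instead of hand-calling a specific fake impl:

```python
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

# Tensors created under FakeTensorMode are FakeTensors; calling the op lets
# the dispatcher pick the fake/meta implementation, and any subclass in the
# inputs would get to interpose and desugar the op first.
with FakeTensorMode():
    q = torch.empty(2, 8, 128, 64)
    k = torch.empty(2, 8, 128, 64)
    v = torch.empty(2, 8, 128, 64)
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v)
    print(out.shape)  # torch.Size([2, 8, 128, 64]) -- no real kernel ran
```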

**Testing** Xilun is adding some DTensor tests in his PR that will end up testing this logic. If folks would prefer, though, I can try to add a test that uses another, more basic subclass instead.

This is the tlparse that his DTensor test generated for me: https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/hirsheybar/0196c1d3-a9a2-46ea-a46d-aa21618aa060/custom/rank_0/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=10000

Stack from ghstack (oldest at bottom):

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov

…oding call to faketensor impl in dynamo

@bdhirsh bdhirsh requested a review from zou3519 as a code owner April 18, 2025 23:14

pytorch-bot bot commented Apr 18, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/151719

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 1ba7ad0 with merge base c92f107:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

bdhirsh added a commit that referenced this pull request Apr 18, 2025
…oding call to faketensor impl in dynamo

ghstack-source-id: c4d5d49
Pull Request resolved: #151719
bdhirsh added a commit that referenced this pull request Apr 25, 2025
…oding call to faketensor impl in dynamo

ghstack-source-id: ba6f1ee
Pull Request resolved: #151719
bdhirsh added a commit that referenced this pull request Apr 25, 2025
…oding call to faketensor impl in dynamo

ghstack-source-id: cc7c17d
Pull Request resolved: #151719
bdhirsh added a commit that referenced this pull request Apr 28, 2025
…oding call to faketensor impl in dynamo

ghstack-source-id: e402015
Pull Request resolved: #151719
bdhirsh added a commit that referenced this pull request Apr 29, 2025
…oding call to faketensor impl in dynamo

ghstack-source-id: 0eeb838
Pull Request resolved: #151719
bdhirsh added a commit that referenced this pull request May 2, 2025
…oding call to faketensor impl in dynamo

ghstack-source-id: 17535e0
Pull Request resolved: #151719
XilunWu added a commit that referenced this pull request Jun 12, 2025
XilunWu added a commit that referenced this pull request Jun 12, 2025
ghstack-source-id: 02f4618
Pull Request resolved: #155851
@drisspg
Contributor

drisspg commented Jun 17, 2025

@zou3519 @bdhirsh @ydwu4
Updated the code, added a test, and addressed all comments

drisspg pushed a commit that referenced this pull request Jun 17, 2025
…oding call to faketensor impl in dynamo

ghstack-source-id: d7f1a8e
Pull Request resolved: #151719
drisspg pushed a commit that referenced this pull request Jun 17, 2025
…oding call to faketensor impl in dynamo

ghstack-source-id: 872151d
Pull Request resolved: #151719
drisspg pushed a commit that referenced this pull request Jun 17, 2025
…oding call to faketensor impl in dynamo

ghstack-source-id: 0daefe7
Pull Request resolved: #151719
@@ -2892,7 +2884,7 @@ def call_function(
),
kwargs={},
),
example_value=example_value,
example_value=None,
Contributor

By passing in `None` for `example_value`, `wrap_fx_proxy` will implicitly do the above.

drisspg pushed a commit that referenced this pull request Jun 18, 2025
…oding call to faketensor impl in dynamo

ghstack-source-id: 18b3717
Pull Request resolved: #151719
@drisspg
Contributor

drisspg commented Jun 18, 2025

@pytorchbot merge -i

@pytorch-bot pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) Jun 18, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged while ignoring the following 1 check: inductor / linux-jammy-cpu-py3.9-gcc11-inductor / test (inductor_torchbench_cpu_smoketest_perf, 1, 1, linux.24xl.spr-metal)

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

Comment on lines +404 to +418
if has_user_subclass(
(
query,
key,
value,
score_mod,
block_mask,
scale,
kernel_options,
score_mod_other_buffers,
mask_mod_other_buffers,
),
allowed_subclasses=(FakeTensor, FunctionalTensor),
):
return NotImplemented
Contributor

This should go into py_functionalize_impl so that other HOPs can benefit:

pytorch/torch/_ops.py

Lines 173 to 188 in 1bb9b18

def functionalize_dk_fn(*args: _P.args, **kwargs: _P.kwargs) -> _T:
    return fn(CppFunctionalizeAPI(), *args, **kwargs)

def functionalize_dispatch_mode_fn(
    mode: Optional[FunctionalTensorMode], *args: _P.args, **kwargs: _P.kwargs
) -> _T:
    return fn(PythonFunctionalizeAPI(mode), *args, **kwargs)

def functionalize_functorch_fn(
    interpreter, *args: _P.args, **kwargs: _P.kwargs
) -> _T:
    return fn(FunctorchFunctionalizeAPI(interpreter), *args, **kwargs)

self.py_impl(DispatchKey.Functionalize)(functionalize_dk_fn)
self.py_impl(FunctionalTensorMode)(functionalize_dispatch_mode_fn)
self.py_impl(TransformType.Functionalize)(functionalize_functorch_fn)

All HOPs need this behavior; this PR only adds it for FlexAttention.
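For illustration, a rough sketch of what that could look like (not part of this PR; it assumes the `has_user_subclass` helper added here is available in `torch/_ops.py`'s scope and that it accepts a pytree of the HOP's inputs):

```python
# Hypothetical change to py_functionalize_impl's FunctionalTensorMode wrapper,
# sketching the suggestion above -- not what this PR actually does.
def functionalize_dispatch_mode_fn(
    mode: Optional[FunctionalTensorMode], *args: _P.args, **kwargs: _P.kwargs
) -> _T:
    # Defer to unrecognized tensor subclasses before running the HOP's
    # functionalization rule, mirroring what FunctionalTensorMode does
    # for regular ops.
    if has_user_subclass(
        (args, kwargs), allowed_subclasses=(FakeTensor, FunctionalTensor)
    ):
        return NotImplemented
    return fn(PythonFunctionalizeAPI(mode), *args, **kwargs)
```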

Contributor

I think Brian mentioned that he wasn't sure if we should make this the default for everyone; do we need a way for HOPs to opt out?

Contributor

We could just make it an option on py_functionalize_impl if we do need a way for HOPs to opt out. But I don't think we should allow HOPs to opt out; there's no way for operators to opt out of this.

Comment on lines +481 to +495
if has_user_subclass(
(
query,
key,
value,
score_mod,
block_mask,
scale,
kernel_options,
score_mod_other_buffers,
mask_mod_other_buffers,
),
allowed_subclasses=(FakeTensor,),
):
return NotImplemented
Contributor

I don't see why this is necessary -- `@register_fake` will directly call FakeTensor's `__torch_dispatch__`, and FakeTensor already has this logic. So we shouldn't need to duplicate it.

Comment on lines +1127 to +1144
if has_user_subclass(
(
query,
key,
value,
out,
logsumexp,
grad_out,
grad_logsumexp,
block_mask,
scale,
kernel_options,
score_mod_other_buffers,
mask_mod_other_buffers,
),
allowed_subclasses=(FakeTensor, FunctionalTensor),
):
return NotImplemented
Contributor

Same as above -- if we put this into py_functionalize_impl, then we don't need to do this again for flex_attention_backward

Contributor

@zou3519 zou3519 left a comment

@drisspg, if you have bandwidth, could you try to address the comments?

@drisspg
Contributor

drisspg commented Jun 18, 2025

@zou3519 yeah, I'll open up a new one
