
[Graph Partition] use pinned memory and foreach when moving cpu scalar tensor to gpu #155360


Open
BoyuanFeng opened this issue Jun 6, 2025 · 0 comments
Labels: module: inductor, oncall: pt2, triaged

BoyuanFeng (Contributor) commented Jun 6, 2025

Graph partition automatically moves CPU scalar tensors to the GPU when possible (#154464). It would be better to use pinned memory and copy with `non_blocking`. This depends on #155121. More context in this issue.
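For illustration only, the pinned-memory, non-blocking copy can be sketched in eager PyTorch as below. This is not the Inductor change itself: `move_scalars_to_device` is a hypothetical helper, it copies the tensors one at a time rather than batching them with a foreach op, and it falls back to a plain copy when CUDA is unavailable.

```python
import torch


def move_scalars_to_device(cpu_scalars, device):
    """Hypothetical helper: copy 0-dim CPU tensors to `device`.

    When the target is a CUDA device, each tensor is first placed in
    pinned (page-locked) host memory so the host-to-device copy can be
    issued with non_blocking=True.
    """
    use_cuda = device.type == "cuda" and torch.cuda.is_available()
    moved = []
    for t in cpu_scalars:
        if use_cuda:
            # pin_memory() returns a copy of the tensor in page-locked memory.
            t = t.pin_memory()
        # non_blocking only has an effect for pinned host -> CUDA copies,
        # so it is disabled on the CPU fallback path.
        moved.append(t.to(device, non_blocking=use_cuda))
    return moved


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
scalars = [torch.tensor(1.5), torch.tensor(3)]
on_device = move_scalars_to_device(scalars, device)
```

Kernels launched afterwards on the same CUDA stream are ordered after the copy, so GPU consumers of `on_device` need no explicit synchronization.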

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov

BoyuanFeng self-assigned this Jun 6, 2025
BoyuanFeng added the triaged and module: inductor labels Jun 6, 2025