Nixl optimization for llama4 local attention #87

Open
wants to merge 59 commits into
base: pd-launch-branch
from

Conversation

mgoin
Member

@mgoin mgoin commented May 15, 2025

No description provided.

ekagra-ranjan and others added 30 commits May 14, 2025 12:31
…aft model to free ~1GB for llama 3 model (vllm-project#17326)

Co-authored-by: root <root@ekagra-8xh100.us-east5-a.c.serving-efficiency-poc.internal>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Aaron Pham <Aaronpham0103@gmail.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
)

Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…llm-project#18013)

Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Andy Xie <andy.xning@gmail.com>
Signed-off-by: inkcherry <mingzhi.liu@intel.com>
Signed-off-by: David Xia <david@davidxia.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: omahs <73983677+omahs@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
schoennenbeck and others added 25 commits May 15, 2025 09:00
…8190)

Signed-off-by: Sebastian Schönnenbeck <sebastian.schoennenbeck@comma-soft.com>
…Error to ValueError (vllm-project#18181)

Signed-off-by: Abatom <abzhonghua@gmail.com>
… unquantizedMethod to reenable LLama4 BF16 (vllm-project#18205)

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: Lucia Fang <fanglu@fb.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
…-project#18229)

Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
…attention on ROCm (vllm-project#18093)

Signed-off-by: kf <kuanfu.liu@embeddedllm.com>
Signed-off-by: lisiqi23 <lisiqi23@xiaomi.com>
Signed-off-by: skylee-01 <497627264@qq.com>
Co-authored-by: lisiqi23 <lisiqi23@xiaomi.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: David Xia <david@davidxia.com>
vllm-project#17973)

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@centml.ai>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Felix Marty <felmarty@amd.com>
Signed-off-by: learner0810 <zhongjun.li@daocloud.io>
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: mgoin <mgoin64@gmail.com>