Tags: dltn/vllm
[Bugfix] Fix deepseekv3 grouped topk error (vllm-project#13474) Signed-off-by: Chen-XiaoBing <chenxb002@whu.edu.cn>
[Misc] Improve error message for incorrect pynvml (vllm-project#12809) Signed-off-by: youkaichao <youkaichao@gmail.com>
Disable chunked prefill and/or prefix caching when MLA is enabled (vllm-project#12642) From @mgoin in vllm-project#12638. I cannot push to that branch, therefore a new PR to unblock release. --------- Signed-off-by: mgoin <michael@neuralmagic.com> Signed-off-by: simon-mo <simon.mo@hey.com> Co-authored-by: mgoin <michael@neuralmagic.com>
[Bugfix] Fix Granite 3.0 MoE model loading (vllm-project#12446) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Deepseek v3 (vllm-project#11502) Signed-off-by: mgoin <michael@neuralmagic.com> Co-authored-by: mgoin <michael@neuralmagic.com> Co-authored-by: robertgshaw2-neuralmagic <rshaw@neuralmagic.com>
[BugFix] Fix quantization for all other methods (vllm-project#11547)
[Bugfix] Fix request cancellation without polling (vllm-project#11190)
[Misc] bump mistral common version (vllm-project#10367) Signed-off-by: simon-mo <simon.mo@hey.com>
[CI/Build] remove .github from .dockerignore, add dirty repo check (vllm-project#9375)