🚀 Describe the new functionality needed
Given that vLLM has become a very popular choice as an inference solution, I would like to suggest we add a remote-vllm integration test to the GitHub Actions workflow, similar to the PR that added the Ollama test. Testing the CPU version of vLLM on a 1B/3B model should be enough; a rough sketch of such a job is included at the end of this description.
💡 Why is this needed? What if we don't build it?
Without such a test, the vLLM provider may break unnoticed, and many users/companies would not be able to use llama-stack with vLLM.
Other thoughts
This will add some inference cost to CI, but I believe making sure the vLLM provider works well with llama-stack is very important.
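For illustration, here is a minimal sketch of what such a CI job could run, assuming a CPU build of vLLM and a small instruct model; the pytest path and selector below are placeholders, not the actual llama-stack test layout:

# Start a small Llama model on CPU with vLLM's OpenAI-compatible server
uv run --with vllm --python 3.12 vllm serve meta-llama/Llama-3.2-1B-Instruct --device cpu --port 8000 &
# Wait until the server lists its models before running the suite
until curl -sf http://localhost:8000/v1/models > /dev/null; do sleep 5; done
# Run the remote-vllm integration tests against it (hypothetical path/selector)
uv run pytest tests/integration/inference -k vllm

The readiness loop matters here, since loading even a 1B model on CPU can take a while.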
uv run --with vllm --python 3.12 vllm serve meta-llama/Llama-3.2-3B-Instruct
This probably needs a Hugging Face token with permission to read the protected Llama repository, though :/
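Once that server is up (it listens on port 8000 by default), a quick check against vLLM's OpenAI-compatible API is enough to confirm it is serving before pointing the remote-vllm provider at it; the request below is just an example:

# List the served models
curl -sf http://localhost:8000/v1/models
# Minimal chat completion to confirm inference works end to end
curl -sf http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.2-3B-Instruct", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 8}'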
Do we have a way to store secrets in the GitHub Action? I also wonder how we are testing the meta-reference server, since it also needs some credentials to get our PyTorch weights.
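For reference, GitHub Actions supports encrypted repository secrets, so one option is to inject the token as an environment variable on the step; huggingface_hub, which vLLM uses for downloads, picks up HF_TOKEN automatically. A sketch with assumed names:

# In the workflow YAML, the step would set something like:
#   env:
#     HF_TOKEN: ${{ secrets.HF_TOKEN }}
# The script then only needs the variable to be present:
: "${HF_TOKEN:?HF_TOKEN must be injected from the repository secret}"
uv run --with vllm --python 3.12 vllm serve meta-llama/Llama-3.2-3B-Instruct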