Expose vLLM Metrics to serve.llm API #52719

Merged 12 commits on May 13, 2025

Conversation

eicherseiji
Contributor

@eicherseiji eicherseiji commented May 1, 2025

Why are these changes needed?

This change provides visibility into Ray Serve LLM deployments, including vLLM-specific statistics.

Dashboard panels:

[Four dashboard screenshots, captured 2025-05-08]

Docs:
[Two documentation screenshots, captured 2025-05-12]

Related issue number

JR-1864

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Tested by following the steps at https://docs.ray.io/en/latest/cluster/metrics.html

@eicherseiji eicherseiji self-assigned this May 1, 2025
@eicherseiji eicherseiji force-pushed the JR_1864 branch 2 times, most recently from 33cca0b to 56e7858 Compare May 7, 2025 02:52
@hainesmichaelc hainesmichaelc added the community-contribution Contributed by the community label May 7, 2025
Contributor

@kouroshHakha kouroshHakha left a comment

Just some V0 vs. V1 stuff. Could you also ask the observability team to review as well?

@kouroshHakha kouroshHakha requested a review from alanwguo May 8, 2025 16:58
@kouroshHakha kouroshHakha removed the community-contribution Contributed by the community label May 8, 2025
@hainesmichaelc hainesmichaelc added the community-contribution Contributed by the community label May 8, 2025
@eicherseiji eicherseiji marked this pull request as ready for review May 9, 2025 00:46
@eicherseiji eicherseiji requested a review from a team as a code owner May 9, 2025 00:46
Contributor

@kouroshHakha kouroshHakha left a comment

The changes to server_models and vllm_engine look good to me. Thanks a ton.

@eicherseiji eicherseiji added the go add ONLY when ready to merge, run all tests label May 10, 2025
Contributor

@kouroshHakha kouroshHakha left a comment

Could you create docs for logging?

Basically you want to cover:

  1. How to enable logging?
  2. What does logging give you? That is, engine-emitted metrics such as vLLM metrics for cache hit rate and speculative decoding hit rate, plus service-level metrics such as the number of input tokens served and output tokens.
    Maybe with some nice screenshots.

You don't need to create an extensive list of all metrics.
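
To make the two kinds of metrics mentioned above concrete, here is a minimal sketch of how a cache hit rate and token totals could be derived from a Prometheus-style text scrape. It uses only the Python standard library. The metric names (`demo_prefix_cache_hits_total` and so on) and the sample payload are hypothetical placeholders for illustration, not the names that vLLM or Ray Serve actually export.

```python
import re
from collections import defaultdict

# Hypothetical scrape payload in Prometheus text exposition format.
# The metric names are illustrative assumptions, NOT the real names
# exported by vLLM or Ray Serve.
SAMPLE = """\
# HELP demo_prefix_cache_queries_total Prefix cache lookups.
# TYPE demo_prefix_cache_queries_total counter
demo_prefix_cache_queries_total{model="m1"} 200
demo_prefix_cache_hits_total{model="m1"} 50
demo_input_tokens_total{model="m1"} 1200
demo_output_tokens_total{model="m1"} 300
"""

# name, optional {labels}, value, optional integer timestamp.
# Simplification: label values containing "}" are not handled.
LINE_RE = re.compile(
    r"^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)(?:\s+-?\d+)?$"
)


def parse_metrics(text):
    """Sum sample values per metric name, ignoring labels and comments."""
    totals = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match:
            totals[match.group(1)] += float(match.group(2))
    return dict(totals)


def hit_rate(totals, hits_name, queries_name):
    """Return hits / queries, or None when no queries were recorded."""
    queries = totals.get(queries_name, 0.0)
    if queries == 0:
        return None
    return totals.get(hits_name, 0.0) / queries


if __name__ == "__main__":
    totals = parse_metrics(SAMPLE)
    rate = hit_rate(
        totals, "demo_prefix_cache_hits_total", "demo_prefix_cache_queries_total"
    )
    print("cache hit rate:", rate)
    print("input tokens:", totals["demo_input_tokens_total"])
    print("output tokens:", totals["demo_output_tokens_total"])
```

In a real deployment a dashboard would typically compute these ratios from the scraped counters directly; the sketch only shows the arithmetic behind an engine-level ratio and service-level token counts.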

@eicherseiji eicherseiji requested review from edoakes, zcin, akshay-anyscale and a team as code owners May 12, 2025 20:48
Contributor

@kouroshHakha kouroshHakha left a comment

Some minor change requests:

Contributor

@kouroshHakha kouroshHakha left a comment

LGTM

@kouroshHakha kouroshHakha enabled auto-merge (squash) May 12, 2025 22:16
@masoudcharkhabi masoudcharkhabi added serve Ray Serve Related Issue usability labels May 12, 2025
@dstrodtman dstrodtman left a comment

Some suggestions, mostly for clarity and to improve readability and SEO.

@eicherseiji
Contributor Author

Thanks @dstrodtman for comments!

Contributor

@angelinalg angelinalg left a comment

Just some nits. Thanks for doing the tech writer review, Douglas, and for the quick resolutions, @eicherseiji!

@eicherseiji
Contributor Author

Thanks @angelinalg!

eicherseiji and others added 12 commits May 13, 2025 18:17
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
…1 only

Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com>
@kouroshHakha kouroshHakha merged commit 881cd91 into ray-project:master May 13, 2025
5 checks passed
zhaoch23 pushed a commit to Bye-legumes/ray that referenced this pull request May 14, 2025
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: zhaoch23 <c233zhao@uwaterloo.ca>
iamjustinhsu pushed a commit to iamjustinhsu/ray that referenced this pull request May 15, 2025
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: iamjustinhsu <jhsu@anyscale.com>
lk-chen pushed a commit to lk-chen/ray that referenced this pull request May 17, 2025
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
matthewdeng pushed a commit that referenced this pull request May 20, 2025
Adding back some default panel configurations that were accidentally
removed in a prior PR #52719


Signed-off-by: Alan Guo <aguo@anyscale.com>
kenmcheng pushed a commit to kenmcheng/ray that referenced this pull request May 27, 2025
Adding back some default panel configurations that were accidentally
removed in a prior PR ray-project#52719


Signed-off-by: Alan Guo <aguo@anyscale.com>
Labels

  • community-backlog
  • community-contribution: Contributed by the community
  • go: add ONLY when ready to merge, run all tests
  • serve: Ray Serve Related Issue
  • usability
7 participants