Content-Length: 249913 | pFad | http://github.com/RustPython/RustPython/issues/5742

64 Make RustPython benchmarks readable · Issue #5742 · RustPython/RustPython · GitHub
Skip to content

Make RustPython benchmarks readable #5742

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
e-dervieux opened this issue Apr 24, 2025 · 3 comments
Open

Make RustPython benchmarks readable #5742

e-dervieux opened this issue Apr 24, 2025 · 3 comments

Comments

@e-dervieux
Copy link

TL;DR: benchmarks are poorly readable and could be greatly improved. This is key element in convincing people of the soundness of RustPython so it should probably not be neglected IMHO.


The violin plots available here are not easily readable and their Y-axes labels are hardly readable at all because they got left-cut at some point. This is especially troublesome for the MICROBENCHMARKS section, for which it is impossible to tell RustPython from CPython.

This issue could be alleviated by doing the following:

  • Use a specific color for CPython and another one for RustPython (and keep this color pair consistent across all plots).
  • Always have CPython data on top and RustPython data on bottom (this is not consistent: in the EXECUTION tab, CPython is on top and RustPython on bottom, while in tab PARSE_TO_AST it is the other way around).
  • Only keep the name of the benchmark in the Y-axis labels, i.e. replace execution/mandelbrot.py/cpython by either Mandelbrot (and use a legend to indicate which color is which interpreter), or make a plot title saying Mandelbrot and use the Y-axis labels to tell whether it is CPython or RustPython.

In addition to these visual issues, some other improvements could be implemented:

  • Make the plots user-friendly using some interactive backend such as plotly.
  • Put hyperlinks to the benchmark script location / source-code, so that users can check what the benchmarks are actually doing.
  • In the same line of thought, add a small descriptive text about what the benchmark does / why it is relevant (for instance "benchmark X is particularly I/O intensive" or whatnot).
  • On top of the page, give the hash of the commit / version (possibly with release date to know at a glance if they're outdated or not) of both CPython and RustPython binaries that were used, whether they were recompiled with -o3 locally, as well as the machine specs (this would allow for meaningful comparison and reproducibility).

I think that benchmarks one of the key element that might convince anyone to switch from one interpreter to another (apart from functionalities / low-level bindings). Hence they should not be neglected.

If someone could point me to where these plots are generated, I'd be happy to help typesetting them / add further info (although I might need some technical support about why benchmark X is especially relevant or not).

@arihant2math
Copy link
Collaborator

Thanks for bringing this up. These issues do need to be addressed. Currently our benchmarks graphs are handled by criterion: https://bheisler.github.io/criterion.rs/book/user_guide/comparing_functions.html

Sadly from what I know, criterion doesn’t support graphing config.

The workflow that updates these benchmarks is here:

Now everything else (display-wise) is located here:
https://github.com/RustPython/rustpython.github.io/blob/master/_layouts/benchmarks.html

I think the last 3 bullet points should be very feasible to implement. The graph related ones could either be done during CI via a custom script or on the frontend with a js plotting library.

@Undersk0re
Copy link
Contributor

I don't think anyone needs to be convinced of anything; in my opinion, you use the right tool depending on the project. Maybe it would be a good idea to ask if RustPython should be included in various distribution tools like pyenv or similar.

In any case, I can try to understand and solve this issue.

@Undersk0re
Copy link
Contributor

From what I could understand, Criterion uses a parameter in the function criterion.rs/src/plot/gnuplot_backend/summary.rs::fn violin(), setting the plot color via the constant DARK_BLUE.

I'm not experienced enough in Rust to confirm whether this constant can be modified before compile-time depending on whether you're testing "RustPython" or "CPython."

Even when changing Criterion's backend, this parameter doesn't seem to be available, not to mention criterion-plot (an unmaintained library).

One method to change the color scheme appears to be using a Criterion.toml in the project's root, but it still doesn't modify the violin plots, which are far too dense to be distinguishable.

[colors]
# These are used in many charts to compare the current measurement against 
# the previous one.
current_sample = {r = 31, g = 120, b = 180}
previous_sample = {r = 7, g = 26, b = 28}

# These are used by the full PDF chart to highlight which samples were outliers.
not_an_outlier = {r = 31, g = 120, b = 180}
mild_outlier = {r = 5, g = 127, b = 0}
severe_outlier = {r = 7, g = 26, b = 28}

# These are used for the line chart to compare multiple different functions.
comparison_colors = [
    {r = 230, g = 25,  b = 75},   # red
    {r = 60,  g = 180, b = 75},   # green
    {r = 255, g = 225, b = 25},   # yellow
    {r = 0,   g = 130, b = 200},  # blue
    {r = 245, g = 130, b = 48},   # orange
    {r = 145, g = 30,  b = 180},  # purple
    {r = 70,  g = 240, b = 240},  # light cyan
    {r = 240, g = 50,  b = 230},  # magenta
]

The solution shold be to find a library that can replace Criterion as quickly as possible while also returning a similar HTML report.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants








ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: http://github.com/RustPython/RustPython/issues/5742

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy