Unexpected benchmark results: torchcodec with GPU slower than CPU in some scenarios #765

@XingyuHu109

📚 The doc issue

Hi torchcodec team,

I was reviewing the benchmark chart in the README (specifically, benchmarks/decoders/benchmark_readme_chart.png), and I noticed something unexpected: in certain scenarios, torchcodec with GPU acceleration appears to be slower than the CPU-only version.

For context, I'm interested in using torchcodec for video decoding tasks, and GPU support is appealing for performance reasons. However, the chart suggests potential regressions or overheads on GPU in some cases. Could you please provide some explanation or insights into why this might be happening? For example:

- Is this due to data transfer overhead between CPU and GPU?
- Are there specific video formats, resolutions, or batch sizes where GPU is expected to underperform?
- Were these benchmarks run on particular hardware (e.g., NVIDIA GPU models, CUDA versions), and could that influence the results?
- Any recommendations for optimizing GPU usage to avoid these slowdowns?

I'd appreciate any details or updates to the documentation to help users understand when to prefer GPU vs. CPU. Thanks for your work on this library. It's really promising!
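
To make the data-transfer question concrete, here is a minimal sketch of how the comparison might be reproduced locally. It is not the script behind the README chart. The helper names (`time_callable`, `make_decode_fn`, `compare`) and the video path are hypothetical, and it assumes a CUDA-enabled torchcodec build plus a video with at least `num_frames` frames. It times decoding on CPU, on GPU, and on GPU followed by a copy back to host memory, so the transfer cost can be seen separately.

```python
import time
from statistics import median
from typing import Callable, Dict, List


def time_callable(fn: Callable[[], object], warmup: int = 2, repeats: int = 5) -> float:
    """Return the median wall-clock seconds of fn() over `repeats` timed runs."""
    for _ in range(warmup):
        fn()  # warm-up runs absorb one-time costs such as CUDA context creation
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return median(samples)


def make_decode_fn(path: str, device: str, num_frames: int, to_cpu: bool) -> Callable[[], object]:
    """Build a closure that decodes the first `num_frames` frames on `device`.

    Imports are deferred so the timing helper above works without torchcodec.
    """
    import torch
    from torchcodec.decoders import VideoDecoder

    def decode():
        # Decoder construction is deliberately inside the timed region.
        decoder = VideoDecoder(path, device=device)
        frames = decoder.get_frames_at(indices=list(range(num_frames))).data
        if to_cpu:
            frames = frames.cpu()  # include the device-to-host copy in the timing
        if device.startswith("cuda"):
            torch.cuda.synchronize()  # wait for queued GPU work before stopping the clock
        return frames

    return decode


def compare(path: str, num_frames: int = 100) -> Dict[str, float]:
    """Time CPU decoding against GPU decoding, with and without the copy back to host."""
    return {
        "cpu": time_callable(make_decode_fn(path, "cpu", num_frames, to_cpu=False)),
        "cuda": time_callable(make_decode_fn(path, "cuda", num_frames, to_cpu=False)),
        "cuda + copy to cpu": time_callable(make_decode_fn(path, "cuda", num_frames, to_cpu=True)),
    }
```

Calling `compare("video.mp4")` (a placeholder path) on a GPU machine returns a dictionary of median times in seconds. If the third entry is much larger than the second, the copy back to host is a likely contributor to the slowdown.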

