[XPU User Empathy Day][whisper][Arc770][Win]XPU performance is worse than CPU #151985

@yinghu5

🐛 Describe the bug

Running openai/whisper-tiny on a desktop machine with an Arc 770 GPU under Windows 11, XPU performance is worse than CPU performance, which is not expected: generation takes 3.58 s on XPU versus 0.57 s on CPU.

Steps to reproduce:

  1. Install Windows 11 on the target machine.
  2. Install the XPU driver from https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html (version 32.0.101.6734, WHQL Certified, 4/8/2025).
    After installing and restarting, check in Task Manager that the XPU is working.
  3. Install Python 3.12.10 from https://www.python.org/downloads/release/python-31210/.
  4. Set up the PyTorch environment:
    open CMD, run C:\Users\gta\AppData\Local\Programs\Python.exe
    python -m venv venv_py27_xpu
    venv_py27_xpu\Scripts\activate
    (venv_py27_xpu) C:\Users\gta>pip3 install torch==2.7.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/test/xpu
    Then verify that PyTorch can see the XPU:
    python -c "import torch; print(torch.xpu.is_available())"
  5. Install the Whisper dependencies:
    pip install transformers
    pip install datasets
    pip install librosa
  6. Run python test_whisper (script below).
    The XPU run is slower than the CPU run.

import torch
from transformers import WhisperProcessor, WhisperForConditionalGeneration
from datasets import load_dataset

# For a CPU run, uncomment the next line and comment out the torch.device line below it.
#device = "cpu"
device = torch.device("xpu" if torch.xpu.is_available() else "cpu")
print(f"Using device: {device}")

import time
start_time = time.time()
print(f"Starting voice conversion at {time.strftime('%Y-%m-%d %H:%M:%S')}")

from torch.profiler import profile, ProfilerActivity

# load model and processor
print("\nLoading Whisper Model...")
whisper_start_time = time.time()

processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny").to(device)
forced_decoder_ids = processor.get_decoder_prompt_ids(language="en", task="transcribe")

whisper_end_time = time.time()
print(f"Loading Whisper Model time taken: {whisper_end_time - whisper_start_time:.2f} seconds") 

# load dummy dataset and read audio files
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
sample = ds[2]["audio"]
input_features = processor(
    sample["array"],
    sampling_rate=sample["sampling_rate"],
    return_tensors="pt",
    return_attention_mask=True,  # Critical for reliable results
    padding=True,  # Required if batching multiple audios
).input_features.to(device)

# generate token ids
print("\nRunning Whisper Model...")
whisper_start_time = time.time()
#with profile(activities=[ProfilerActivity.CPU,
#                       ProfilerActivity.XPU]) as prof:
predicted_ids = model.generate(input_features,forced_decoder_ids=forced_decoder_ids)
#print(prof.key_averages().table(sort_by="xpu_time_total"))

# decode token ids to text
#transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)

transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
whisper_end_time = time.time()
print(f"Running Whisper Model time taken: {whisper_end_time - whisper_start_time:.2f} seconds")

print(transcription) 

end_time = time.time()
print(f"Voice conversion completed at {time.strftime('%Y-%m-%d %H:%M:%S')}")
print(f"Total time taken: {end_time - start_time:.2f} seconds") 
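
The script above times a single call to model.generate, so the measurement includes any one-time setup cost of the first call on a device. The following is a minimal sketch of a timing helper that runs warm-up iterations first and then reports the median of several timed runs. The names time_call and workload are hypothetical and not part of the original script, and the stand-in workload is plain Python so the block runs without PyTorch. Whether warm-up changes the CPU/XPU gap reported here has not been established.

```python
import time
from statistics import median


def time_call(fn, warmup=1, iters=5, sync=None):
    """Return the median wall-clock seconds of fn() over `iters` timed runs.

    `sync` is an optional zero-argument callable invoked after each run so
    that asynchronous device work has finished before the clock is read.
    """
    # Untimed warm-up runs absorb one-time setup costs.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        samples.append(time.perf_counter() - start)
    return median(samples)


def workload():
    # Hypothetical stand-in for model.generate(...); pure Python.
    return sum(i * i for i in range(10_000))


elapsed = time_call(workload, warmup=2, iters=3)
print(f"median time: {elapsed:.6f} s")
```

In the Whisper script, fn could be a lambda wrapping model.generate(input_features, forced_decoder_ids=forced_decoder_ids), and sync could be torch.xpu.synchronize when the device is XPU (assumption: the default sync=None is adequate for the CPU run).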

Results for openai/whisper-tiny (input: 12s mp3; hf-internal-testing/librispeech_asr_dummy · Datasets at Hugging Face [2]):

| Stage | CPU (s) | XPU (s) | Torch.compile (s) |
|---|---|---|---|
| Load | 1.57 | 2.34 | 2.67 |
| Run Generate | 0.57 | 3.58 | 3.58 |
| Total | 7.33 | 10.63 | 11.12 |
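
To make the size of the gap explicit, the CPU and XPU timings reported above can be turned into slowdown ratios. This is only arithmetic on the posted numbers; the helper name slowdown is hypothetical.

```python
# (CPU seconds, XPU seconds) as reported in the results above.
timings = {
    "Load": (1.57, 2.34),
    "Run Generate": (0.57, 3.58),
    "Total": (7.33, 10.63),
}


def slowdown(cpu_s, xpu_s):
    """Return how many times slower the XPU run is than the CPU run."""
    return xpu_s / cpu_s


ratios = {stage: round(slowdown(cpu, xpu), 2) for stage, (cpu, xpu) in timings.items()}
for stage, ratio in ratios.items():
    print(f"{stage}: XPU is {ratio}x the CPU time")
```

The generate step dominates the regression at about 6.28x, while load and total time are each roughly 1.5x.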

Versions

pip3 install torch==2.7.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/test/xpu
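
The install command alone does not show which build and device were actually picked up at run time. PyTorch ships python -m torch.utils.collect_env for full environment reports; as a lighter alternative, here is a minimal sketch that prints the Python, OS, and torch versions plus the XPU device name. The function name describe_env and the dictionary keys are hypothetical, and the block falls back gracefully when torch or its xpu module is not present.

```python
import platform
import sys


def describe_env():
    """Collect basic version and XPU information into a dict."""
    info = {
        "python": sys.version.split()[0],
        "os": platform.platform(),
        "torch": None,
        "xpu_available": False,
        "xpu_device": None,
    }
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment; report what we have.
        return info

    info["torch"] = torch.__version__
    # Older or non-XPU builds may lack torch.xpu, so look it up defensively.
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        info["xpu_available"] = True
        info["xpu_device"] = xpu.get_device_name(0)
    return info


report = describe_env()
for key, value in report.items():
    print(f"{key}: {value}")
```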

cc @msaroufim @jerryzh168 @gujinghui @EikanWang @fengyuan14 @guangyey

Labels

module: performance: Issues related to performance, either of kernel code or framework glue
module: xpu: Intel XPU related issues
triaged: This issue has been looked at a team member, and triaged and prioritized into an appropriate module
