quantizer_hqq should not require a gpu/cuda device to run #38439

`quantizer_hqq.py` currently requires a CUDA device:

https://github.com/huggingface/transformers/blob/badc71b9f604ca910bb87a43979c795eaf6e7d64/src/transformers/quantizers/quantizer_hqq.py#L74-L75

However, the original HQQ library also runs on CPU by falling back to the default ATen operators: https://github.com/mobiusml/hqq?tab=readme-ov-file#usage-with-models


    if not torch.cuda.is_available():
        raise RuntimeError("No GPU found. A GPU is needed for quantization.")
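
One way the check could be relaxed is sketched below. This is only an illustration of the requested behavior, not the actual transformers code: `pick_hqq_device` and its parameters are hypothetical names, and the CUDA availability flag is passed in as a plain boolean (for example the result of `torch.cuda.is_available()`) so the sketch stays self-contained.

```python
import warnings
from typing import Optional


def pick_hqq_device(cuda_available: bool, requested: Optional[str] = None) -> str:
    """Hypothetical helper: choose a device instead of failing when no GPU exists.

    cuda_available: e.g. the result of torch.cuda.is_available().
    requested: an explicit device string from the user, if any.
    """
    if requested is not None:
        # Only fail when the user explicitly asked for CUDA and it is missing.
        if requested.startswith("cuda") and not cuda_available:
            raise RuntimeError(
                f"Device {requested!r} was requested, but CUDA is not available."
            )
        return requested

    if cuda_available:
        return "cuda"

    # Fall back to CPU rather than raising; warn because this is likely slower.
    warnings.warn("No GPU found. HQQ quantization will run on CPU and may be slow.")
    return "cpu"
```

With this shape, a CPU-only machine gets a warning and continues, while an explicit request for a CUDA device that does not exist still fails loudly.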
