Skip to content

mit-han-lab/haq

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

HAQ: Hardware-Aware Automated Quantization with Mixed Precision

Introduction

This repo contains PyTorch implementation for paper HAQ: Hardware-Aware Automated Quantization with Mixed Precision (CVPR2019, oral)

overview

@inproceedings{haq,
author = {Wang, Kuan and Liu, Zhijian and Lin, Yujun and Lin, Ji and Han, Song},
title = {HAQ: Hardware-Aware Automated Quantization With Mixed Precision},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2019}
}

Other papers related to automated model design:

  • AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV 2018)

  • ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware (ICLR 2019)

Dependencies

We evaluate this code with Pytorch 1.1 (cuda10) and torchvision 0.3.0, you can install pytorch with conda:

# install pytorch
conda install -y pytorch torchvision cudatoolkit=10.0 -c pytorch

And you can use the following command to set up the environment:

# install packages and download the pretrained model
bash run/setup.sh

(If the server is down, you can download the pretrained model from google drive: mobilenetv2-150.pth.tar)

Current code base is tested under following environment:

  1. Python 3.7.3
  2. PyTorch 1.1
  3. torchvision 0.3.0
  4. numpy 1.14
  5. matplotlib 3.0.1
  6. scikit-learn 0.21.0
  7. easydict 1.8
  8. progress 1.4
  9. tensorboardX 1.7

Dataset

If you already have the ImageNet dataset for pytorch, you could create a link to data folder and use it:

# prepare dataset, change the path to your own
ln -s /path/to/imagenet/ data/

If you do not have the ImageNet yet, you can download the ImageNet dataset and move validation images to labeled subfolders. To do this, you can use the following script: https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh

We use a subset of ImageNet in the linear quantizaiton search phase to save the training time, to create the link of the subset, you can use the following tool:

# prepare imagenet100 dataset
python lib/utils/make_data.py

Reinforcement learning search

  • You can run the bash file as following to search the K-Means quantization strategy, which only quantizes the weights with K-Means to compress model size of specific model.
# K-Means quantization, for model size
bash run/run_kmeans_quantize_search.sh
  • You can run the bash file as following to search the linear quantization strategy, which linearly quantizes both the weights and activations to reduce latency/energy of specific model.
# Linear quantization, for latency/energy
bash run/run_linear_quantize_search.sh
  • Usage details
python rl_quantize.py --help

Finetune Policy

  • After searching, you can get the quantization strategy list, and you can replace the strategy list in finetune.py to finetune and evaluate the performance on ImageNet dataset.
  • We set the default K-Means quantization strategy searched under preserve ratio = 0.1 like:
# preserve ratio 10%
strategy = [6, 6, 5, 5, 5, 5, 4, 5, 5, 4, 5, 5, 5, 5, 5, 5, 3, 5, 4, 3, 5, 4, 3, 4, 4, 4, 2, 5, 4, 3, 3, 5, 3, 2, 5, 3, 2, 4, 3, 2, 5, 3, 2, 5, 3, 4, 2, 5, 2, 3, 4, 2, 3, 4]

You can follow the following bash file to finetune the K-Means quantized model to get a better performance:

bash run/run_kmeans_quantize_finetune.sh
  • We set the default linear quantization strategy searched under preserve ratio = 0.6 like:
# preserve ratio 60%
strategy = [[8, -1], [7, 7], [5, 6], [4, 6], [5, 6], [5, 7], [5, 6], [7, 4], [4, 6], [4, 6], [7, 7], [5, 6], [4, 6], [7, 3], [5, 7], [4, 7], [7, 3], [5, 7], [4, 7], [7, 7], [4, 7], [4, 7], [6, 4], [6, 7], [4, 7], [7, 4], [6, 7], [5, 7], [7, 4], [6, 7], [5, 7], [7, 4], [6, 7], [6, 7], [6, 4], [5, 7], [6, 7], [6, 4], [5, 7], [6, 7], [7, 7], [4, 7], [7, 7], [7, 7], [4, 7], [7, 7], [7, 7], [4, 7], [7, 7], [7, 7], [4, 7], [7, 7], [8, 8]]

You can follow the following bash file to finetune the linear quantized model to get a better performance:

bash run/run_linear_quantize_finetune.sh
  • Usage details
python finetune.py --help

Evaluate

You can download the pretrained quantized model like this:

# download checkpoint
mkdir -p checkpoints/resnet50/
mkdir -p checkpoints/mobilenetv2/
cd checkpoints/resnet50/
wget https://hanlab.mit.edu/files/haq/resnet50_0.1_75.48.pth.tar
cd ../mobilenetv2/
wget https://hanlab.mit.edu/files/haq/qmobilenetv2_0.6_71.23.pth.tar
cd ../..

(If the server is down, you can download the pretrained model from google drive: qmobilenetv2_0.6_71.23.pth.tar)

You can evaluate the K-Means quantized model like this:

# evaluate K-Means quantization
bash run/run_kmeans_quantize_eval.sh
Models preserve ratio Top1 Acc (%) Top5 Acc (%)
resnet50 (original) 1.0 76.15 92.87
resnet50 (10x compress) 0.1 75.48 92.42

You can evaluate the linear quantized model like this:

# evaluate linear quantization
bash run/run_linear_quantize_eval.sh
Models preserve ratio Top1 Acc (%) Top5 Acc (%)
mobilenetv2 (original) 1.0 72.05 90.49
mobilenetv2 (0.6x latency) 0.6 71.23 90.00

About

[CVPR 2019, Oral] HAQ: Hardware-Aware Automated Quantization with Mixed Precision

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy