Image Segmentation on AWS EC2 using Semantic Segmentation on MIT ADE20K dataset in PyTorch

Online segmentation application

This is a simple Flask API that serves the segmentation model described below. You can use the included Dockerfile to build an image and run it wherever you like.

Create image

docker build -t image_semantic_segmentation .

Create container

docker run --name segmentation -p 80:80 image_semantic_segmentation
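
Once the container is running, the API is reachable on port 80 of the host. The route and form-field name are not documented in this README, so the sketch below treats both as assumptions: it uploads an image as multipart/form-data to a hypothetical `/predict` route with a hypothetical `file` field, using only the Python standard library. Check the Flask application's source for the real route and field name before relying on it.

```python
import urllib.request
import uuid


def build_multipart(field_name, filename, payload, content_type="image/jpeg"):
    """Encode one file as a multipart/form-data body.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def segment_image(image_path, url="http://localhost:80/predict", field_name="file"):
    """POST an image to the API and return the raw response bytes.

    ASSUMPTION: the "/predict" route and the "file" field name are
    hypothetical placeholders, not taken from this project.
    """
    with open(image_path, "rb") as f:
        body, content_type = build_multipart(field_name, "image.jpg", f.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read()
```

With the container up, `segment_image("my_photo.jpg")` would send the file and return whatever the server responds with; the response format is likewise not specified here.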

Semantic Segmentation on MIT ADE20K dataset in PyTorch

This project uses a PyTorch implementation of semantic segmentation models on the MIT ADE20K scene parsing dataset (http://sceneparsing.csail.mit.edu/).

ADE20K is the largest open-source dataset for semantic segmentation and scene parsing, released by the MIT Computer Vision team. Follow the link below to find the repository for our dataset and implementations on Caffe and Torch7: https://github.com/CSAILVision/sceneparsing

Color encoding of semantic categories can be found here: https://docs.google.com/spreadsheets/d/1se8YEtb2detS7OuPE86fXGyD269pMycAWe2mtKUj2W8/edit?usp=sharing
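
As a rough illustration of how such a color encoding is applied, the sketch below maps a small grid of predicted class indices to RGB tuples. The palette values and the `colorize` helper are made up for the example; the real colors are the ones listed in the spreadsheet.

```python
# ASSUMPTION: placeholder palette for illustration only; the real
# per-category colors are the ones listed in the linked spreadsheet.
EXAMPLE_PALETTE = {
    0: (255, 0, 0),
    1: (0, 255, 0),
    2: (0, 0, 255),
}
UNKNOWN_COLOR = (0, 0, 0)  # used for indices missing from the palette


def colorize(label_map, palette=EXAMPLE_PALETTE, unknown=UNKNOWN_COLOR):
    """Turn a 2-D grid of class indices into a grid of RGB tuples."""
    return [[palette.get(label, unknown) for label in row] for row in label_map]


colored = colorize([[0, 1], [2, 7]])
```

A dictionary lookup with a fallback keeps the example short; for full-size images an array-based lookup would be the more practical choice.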

Model description

Encoder:

MobileNetV2dilated
ResNet18/ResNet18dilated
ResNet50/ResNet50dilated
ResNet101/ResNet101dilated
HRNetV2 (W48)

Decoder:

C1 (one convolution module)
C1_deepsup (C1 + deep supervision trick)
PPM (Pyramid Pooling Module, see PSPNet paper for details.)
PPM_deepsup (PPM + deep supervision trick)
UPerNet (Pyramid Pooling + FPN head, see the UPerNet paper for details.)
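
To give a feel for what the pyramid pooling step in PPM does, here is a minimal pure-Python sketch that average-pools one 2-D feature grid into several coarse grids. It covers only the pooling; the convolution, upsampling and concatenation of a real PPM head are left out. The bin sizes 1, 2, 3 and 6 follow the PSPNet paper; that this repository uses the same values is an assumption, and the function names are illustrative.

```python
import math


def adaptive_avg_pool(feature, bins):
    """Average-pool a 2-D grid (list of rows) down to bins x bins cells."""
    height, width = len(feature), len(feature[0])
    pooled = []
    for i in range(bins):
        # Row range covered by output row i (floor for start, ceil for end).
        r0, r1 = (i * height) // bins, math.ceil((i + 1) * height / bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * width) // bins, math.ceil((j + 1) * width / bins)
            cell = [feature[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cell) / len(cell))
        pooled.append(row)
    return pooled


def pyramid_pool(feature, scales=(1, 2, 3, 6)):
    """Return one pooled grid per scale, keyed by the scale.

    ASSUMPTION: scales taken from the PSPNet paper, not from this repository.
    """
    return {scale: adaptive_avg_pool(feature, scale) for scale in scales}
```

The coarsest grid (scale 1) summarizes the whole feature map, while the finer grids keep some spatial layout, which is how the module mixes global and local context.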

Model performance:

IMPORTANT: The base ResNet in our repository is a customized version (different from the one in torchvision). The base models will be automatically downloaded when needed.

Architecture	MultiScale Testing	Mean IoU	Pixel Accuracy(%)	Overall Score	Inference Speed(fps)
MobileNetV2dilated + C1_deepsup	No	34.84	75.75	54.07	17.2
MobileNetV2dilated + C1_deepsup	Yes	33.84	76.80	55.32	10.3
MobileNetV2dilated + PPM_deepsup	No	35.76	77.77	56.27	14.9
MobileNetV2dilated + PPM_deepsup	Yes	36.28	78.26	57.27	6.7
ResNet18dilated + C1_deepsup	No	33.82	76.05	54.94	13.9
ResNet18dilated + C1_deepsup	Yes	35.34	77.41	56.38	5.8
ResNet18dilated + PPM_deepsup	No	38.00	78.64	58.32	11.7
ResNet18dilated + PPM_deepsup	Yes	38.81	79.29	59.05	4.2
ResNet50dilated + PPM_deepsup	No	41.26	79.73	60.50	8.3
ResNet50dilated + PPM_deepsup	Yes	42.14	80.13	61.14	2.6
ResNet101dilated + PPM_deepsup	No	42.19	80.59	61.39	6.8
ResNet101dilated + PPM_deepsup	Yes	42.53	80.91	61.72	2.0
UperNet50	No	40.44	79.80	60.12	8.4
UperNet50	Yes	41.55	80.23	60.89	2.9
UperNet101	No	42.00	80.79	61.40	7.8
UperNet101	Yes	42.66	81.01	61.84	2.3
HRNetV2	No	42.03	80.77	61.40	5.8
HRNetV2	Yes	43.20	81.47	62.34	1.9

Training is benchmarked on a server with 8 NVIDIA Pascal Titan Xp GPUs (12GB GPU memory). Inference speed is benchmarked on a single NVIDIA Pascal Titan Xp GPU, without visualization.
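
For readers unfamiliar with the metrics in the table, the sketch below computes pixel accuracy and mean IoU for one prediction in plain Python. It is an illustration, not the evaluation code of this repository, which would accumulate counts over the whole validation set. The `overall_score` helper assumes the Overall Score is the plain average of the two metrics; most rows agree with that, for example (38.81 + 79.29) / 2 = 59.05 for ResNet18dilated + PPM_deepsup with multi-scale testing.

```python
def segmentation_scores(pred, target, num_classes):
    """Pixel accuracy and mean IoU (both in %) for flat lists of class indices."""
    correct = sum(p == t for p, t in zip(pred, target))
    pixel_acc = 100.0 * correct / len(target)

    ious = []
    for cls in range(num_classes):
        intersection = sum(p == cls and t == cls for p, t in zip(pred, target))
        union = sum(p == cls or t == cls for p, t in zip(pred, target))
        if union:  # skip classes absent from both prediction and ground truth
            ious.append(intersection / union)
    mean_iou = 100.0 * sum(ious) / len(ious)
    return pixel_acc, mean_iou


def overall_score(pixel_acc, mean_iou):
    # ASSUMPTION: the overall score is the plain average of the two metrics.
    return (pixel_acc + mean_iou) / 2


acc, miou = segmentation_scores([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In this toy case three of four pixels match, and the per-class IoUs are 1/2 and 2/3, so mean IoU is noticeably lower than pixel accuracy, the same pattern the table shows.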

Quick start: Test on an image using our trained model

Install the Python dependencies

pip install -r requirements.txt

Launch init.py

# download the model parameters and a test image
python init.py

To test on an image or a folder of images ($PATH_IMG), you can simply do the following:

python test.py --imgs $PATH_IMG

Reference

If you find the code or pre-trained models useful, please cite the following papers:

Semantic Understanding of Scenes through ADE20K Dataset. B. Zhou, H. Zhao, X. Puig, T. Xiao, S. Fidler, A. Barriuso and A. Torralba. International Journal of Computer Vision (IJCV), 2018. (https://arxiv.org/pdf/1608.05442.pdf)

@article{zhou2018semantic,
  title={Semantic understanding of scenes through the ade20k dataset},
  author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Xiao, Tete and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
  journal={International Journal of Computer Vision},
  year={2018}
}

Scene Parsing through ADE20K Dataset. B. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso and A. Torralba. Computer Vision and Pattern Recognition (CVPR), 2017. (http://people.csail.mit.edu/bzhou/publication/scene-parse-camera-ready.pdf)

@inproceedings{zhou2017scene,
    title={Scene Parsing through ADE20K Dataset},
    author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
    year={2017}
}

