R-FCN: Object Detection via Region-based Fully Convolutional Networks

By Jifeng Dai, Yi Li, Kaiming He, Jian Sun

It is highly recommended to use the deformable R-FCN implementation in MXNet, which significantly improves accuracy at very little extra computational cost.

A Python version of R-FCN is also available, which supports end-to-end training and inference of R-FCN for object detection.

Introduction

R-FCN is an accurate and efficient region-based object detection framework built on deep fully convolutional networks. In contrast to previous region-based detectors such as Fast/Faster R-CNN, which apply a costly per-region sub-network hundreds of times, R-FCN is fully convolutional, with almost all computation shared across the entire image. R-FCN can naturally adopt powerful fully convolutional image classifier backbones, such as ResNets, for object detection.

R-FCN was initially described in a NIPS 2016 paper.
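
The key operation behind this computation sharing is position-sensitive RoI pooling, introduced in the paper: a bank of k x k x (C+1) position-sensitive score maps is computed once per image, and each region proposal is then scored by pooling over its own k x k grid of bins. The following is a minimal MATLAB sketch of that operation, not the code in this repository; the function name ps_roi_pool, the rounding of bin boundaries, and the class-major channel layout are illustrative assumptions.

    function cls_scores = ps_roi_pool(scores, roi, k, num_cls)
    % PS_ROI_POOL  Minimal sketch of position-sensitive RoI pooling.
    %   scores  : H x W x (k*k*num_cls) position-sensitive score maps, assumed
    %             class-major: channels (c-1)*k*k+1 .. c*k*k belong to class c
    %   roi     : [x1 y1 x2 y2] in feature-map coordinates (1-based integers)
    %   k       : grid size (k x k bins per RoI)
    %   num_cls : number of classes including background (C+1)
        x1 = roi(1); y1 = roi(2); x2 = roi(3); y2 = roi(4);
        bin_w = (x2 - x1 + 1) / k;
        bin_h = (y2 - y1 + 1) / k;
        cls_scores = zeros(1, num_cls);
        for i = 1:k                                   % vertical bin index
            for j = 1:k                               % horizontal bin index
                ys = round(y1 + (i-1)*bin_h) : round(y1 + i*bin_h) - 1;
                xs = round(x1 + (j-1)*bin_w) : round(x1 + j*bin_w) - 1;
                for c = 1:num_cls
                    ch  = (c-1)*k*k + (i-1)*k + j;    % map dedicated to (class c, bin (i,j))
                    bin = scores(ys, xs, ch);
                    % average pooling inside the bin, then voting by averaging over bins
                    cls_scores(c) = cls_scores(c) + mean(bin(:)) / (k*k);
                end
            end
        end
    end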

This code has been tested on Windows 7/8 (64-bit), Windows Server 2012 R2, and Ubuntu 14.04, with MATLAB 2014a.

License

R-FCN is released under the MIT License (refer to the LICENSE file for details).

Citing R-FCN

If you find R-FCN useful in your research, please consider citing:

@article{dai16rfcn,
    Author = {Jifeng Dai and Yi Li and Kaiming He and Jian Sun},
    Title = {{R-FCN}: Object Detection via Region-based Fully Convolutional Networks},
    Journal = {arXiv preprint arXiv:1605.06409},
    Year = {2016}
}

Main Results

model               training data        test data     mAP     time/img (K40)   time/img (Titan X)
R-FCN, ResNet-50    VOC 07+12 trainval   VOC 07 test   77.4%   0.12 sec         0.09 sec
R-FCN, ResNet-101   VOC 07+12 trainval   VOC 07 test   79.5%   0.17 sec         0.12 sec

Requirements: software

  1. Caffe build for R-FCN (included in this repository, see external/caffe)
    • If you are using Windows, you may download a pre-compiled mex file by running fetch_data/fetch_caffe_mex_windows_vs2013_cuda75.m
    • If you are using Linux, or if you want to compile for Windows yourself, please recompile our Caffe branch.
  2. MATLAB 2014a or later

Requirements: hardware

GPU: Titan, Titan X, K40, K80.

Demo

  1. Run fetch_data/fetch_caffe_mex_windows_vs2013_cuda75.m to download a compiled Caffe mex (for Windows only).
  2. Run fetch_data/fetch_demo_model_ResNet101.m to download an R-FCN model based on ResNet-101 (trained on VOC 07+12 trainval).
  3. Run rfcn_build.m.
  4. Run startup.m.
  5. Run experiments/script_rfcn_demo.m to apply the R-FCN model on demo images.
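
For reference, these demo steps can be issued from the MATLAB prompt roughly as follows. This is a sketch, assuming the repository root is the current working directory; on Linux, skip the mex download and recompile the Caffe branch instead (see Requirements above).

    % Demo, from the MATLAB prompt with the repository root as the working directory
    run('fetch_data/fetch_caffe_mex_windows_vs2013_cuda75.m');  % Windows only
    run('fetch_data/fetch_demo_model_ResNet101.m');             % R-FCN ResNet-101 demo model
    rfcn_build;                                                 % compile the mex wrappers
    startup;                                                    % set up MATLAB paths
    run('experiments/script_rfcn_demo.m');                      % apply R-FCN to the demo images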

Preparation for Training & Testing

  1. Run fetch_data/fetch_caffe_mex_windows_vs2013_cuda75.m to download a compiled Caffe mex (for Windows only).
  2. Run fetch_data/fetch_model_ResNet50.m to download an ImageNet-pre-trained ResNet-50 net.
  3. Run fetch_data/fetch_model_ResNet101.m to download an ImageNet-pre-trained ResNet-101 net.
  4. Run fetch_data/fetch_region_proposals.m to download the pre-computed region proposals.
  5. Download VOC 2007 and 2012 data to ./datasets.
  6. Run rfcn_build.m.
  7. Run startup.m.
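
Equivalently, the preparation steps can be run from the MATLAB prompt as sketched below, assuming VOC 2007 and 2012 have already been placed under ./datasets and the repository root is the current working directory.

    % Preparation, from the MATLAB prompt
    run('fetch_data/fetch_caffe_mex_windows_vs2013_cuda75.m');  % Windows only
    run('fetch_data/fetch_model_ResNet50.m');                   % ImageNet-pre-trained ResNet-50
    run('fetch_data/fetch_model_ResNet101.m');                  % ImageNet-pre-trained ResNet-101
    run('fetch_data/fetch_region_proposals.m');                 % pre-computed region proposals
    rfcn_build;
    startup;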

Training & Testing

  1. Run experiments/script_rfcn_VOC0712_ResNet50_OHEM_ss.m to train a model using the ResNet-50 net with online hard example mining (OHEM), leveraging selective search proposals. The accuracy should be ~75.4% mAP.
    • Note: the training time is ~13 hours on a Titan X.
  2. Run experiments/script_rfcn_VOC0712_ResNet50_OHEM_rpn.m to train a model using the ResNet-50 net with OHEM, leveraging RPN proposals (generated with a ResNet-50 net). The accuracy should be ~77.4% mAP.
    • Note: the training time is ~13 hours on a Titan X.
  3. Run experiments/script_rfcn_VOC0712_ResNet101_OHEM_rpn.m to train a model using the ResNet-101 net with OHEM, leveraging RPN proposals (generated with a ResNet-101 net). The accuracy should be ~79.5% mAP.
    • Note: the training time is ~19 hours on a Titan X.
  4. Check other scripts in ./experiments for more settings.
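
For example, the third configuration above can be launched from the MATLAB prompt as follows (a sketch, assuming the preparation steps have been completed and startup.m has been run in the same session).

    % Train and test R-FCN with ResNet-101, OHEM, and RPN proposals
    run('experiments/script_rfcn_VOC0712_ResNet101_OHEM_rpn.m');
    % The other experiment scripts under ./experiments are launched the same way.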

Note:

  • In all the experiments, training is performed on VOC 07+12 trainval, and testing is performed on VOC 07 test.
  • Results are subject to some random variation. We ran experiments/script_rfcn_VOC0712_ResNet50_OHEM_rpn.m five times; the resulting mAPs were 77.1%, 77.3%, 77.7%, 77.9%, and 77.0%, giving a mean of 77.4% and a standard deviation of 0.39% (see the sketch after this list).
  • The reported running time is measured with an optimized implementation, not with the (slower) timing recorded in the test log.
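
The reported mean and standard deviation over the five runs can be reproduced in MATLAB as follows.

    % mAPs of the five ResNet-50 OHEM + RPN runs reported above
    mAPs = [77.1 77.3 77.7 77.9 77.0];
    fprintf('mean = %.1f%%, std = %.2f%%\n', mean(mAPs), std(mAPs));  % mean = 77.4%, std = 0.39%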

Resources

  1. Experiment logs: OneDrive, BaiduYun

If the automatic "fetch_data" scripts fail, you may manually download the resources from:

  1. Pre-compiled Caffe mex (Windows):
  2. Demo R-FCN model:
  3. ImageNet-pretrained networks:
  4. Pre-computed region proposals:
