【ECCV 2018 .Jian Sun】DetNet: A Backbone network for Object Detection

May 5, 2019 | 225 views | Source: Moonsmile

【Background】: ECCV is one of the top conferences in computer vision. In this blog post, I will introduce a paper from Jian Sun's team that proposes a backbone network designed for object detection. It is worth mentioning that the paper does not contain a single formula.

paper name: DetNet: A Backbone network for Object Detection
paper url: https://arxiv.org/abs/1804.06215

[Figure: paper-information]

1、【Abstract】

Object detection is a heavily researched topic in computer vision. It aims at finding “where” and “what” each object instance is when given an image.

In terms of network structure, recent CNN-based detectors are usually split into two parts: one is the backbone network, and the other is the detection business part. The picture below illustrates this.
[Figure: 1.png]

2、【Motivation】

Existing backbone networks have some problems because they were originally designed for the classification task, and classification and detection are clearly different tasks. Designing a new backbone network specifically for detection therefore becomes necessary.

[Figure: 2.png]

3、【The problems of existing backbone networks】

[Figure: 2.png]

4、【Comparisons of different backbones used in FPN】

[Figure: 3.png]

5、【Contributions】

[Figure: 4.png]

6、【DetNet design】

[Figure: 5.png]
[Figure: 6.png]

7、【More details】

The authors adopt ResNet-50 as the baseline, since it is widely used as the backbone network in many object detectors. To compare fairly with ResNet-50, they keep stages 1, 2, 3, and 4 of DetNet the same as in the original ResNet-50. More details are shown in 7.png.

[Figure: 7.png]
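To make the stage layout a bit more concrete, here is a minimal sketch of the cumulative downsampling factor after each stage. The per-stage factors are my reading of the paper and are not stated in the text above: ResNet-50 halves the resolution again in stage 5, while DetNet adds a stage 6 and keeps stages 5 and 6 at the stage-4 resolution. The names `RESNET50_STAGE_FACTORS`, `DETNET_STAGE_FACTORS`, and `cumulative_strides` are made up for this illustration.

```python
from itertools import accumulate
from operator import mul

# Per-stage downsampling factors (assumption: taken from my reading of the
# paper, not from this post). Stages 1-4 are identical in both networks.
RESNET50_STAGE_FACTORS = [2, 2, 2, 2, 2]      # stages 1-5
DETNET_STAGE_FACTORS = [2, 2, 2, 2, 1, 1]     # stages 1-6


def cumulative_strides(stage_factors):
    """Return the total downsampling factor after each stage."""
    return list(accumulate(stage_factors, mul))


def feature_map_sizes(input_size, stage_factors):
    """Return the feature-map side length after each stage."""
    return [input_size // s for s in cumulative_strides(stage_factors)]


if __name__ == "__main__":
    print("ResNet-50:", cumulative_strides(RESNET50_STAGE_FACTORS))
    print("DetNet:   ", cumulative_strides(DETNET_STAGE_FACTORS))
    print("DetNet sizes for a 224 input:",
          feature_map_sizes(224, DETNET_STAGE_FACTORS))
```

Under these assumptions, the deepest DetNet feature maps stay at the same resolution as stage 4 instead of shrinking further, which is the property the dilated blocks described next are meant to make affordable.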

DetNet applies a bottleneck with dilation as the basic network block to efficiently enlarge the receptive field. Since dilated convolution is still time consuming, stage 5 and stage 6 keep the same number of channels as stage 4 (256 input channels for the bottleneck block). This differs from traditional backbone design, which doubles the channels in each later stage.
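The effect of dilation can be checked with a little arithmetic. The sketch below uses the standard convolution output-size formula and compares a plain 3x3 convolution with a dilated one. The dilation rate of 2, the padding values, and the 14x14 input are assumptions chosen for illustration, and the function names are hypothetical. The width lists only restate the channel rule described above (doubling per stage versus staying at 256).

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def conv_output_size(in_size, kernel_size, stride=1, padding=0, dilation=1):
    """Standard output-size formula for a convolution along one axis."""
    k_eff = effective_kernel_size(kernel_size, dilation)
    return (in_size + 2 * padding - k_eff) // stride + 1


# Bottleneck widths per stage (stages 2 onward). The traditional design
# doubles the width at each stage; DetNet stops doubling after stage 4.
TRADITIONAL_WIDTHS = [64 * 2 ** i for i in range(4)]    # stages 2-5
DETNET_WIDTHS = TRADITIONAL_WIDTHS[:3] + [256, 256]     # stages 2-6

if __name__ == "__main__":
    # A plain 3x3 kernel covers 3 pixels; with dilation 2 it covers 5.
    print(effective_kernel_size(3, dilation=1))
    print(effective_kernel_size(3, dilation=2))
    # With padding equal to the dilation rate and stride 1, the spatial
    # size is preserved; a stride-2 convolution would halve it instead.
    print(conv_output_size(14, 3, stride=1, padding=2, dilation=2))
    print(conv_output_size(14, 3, stride=2, padding=1, dilation=1))
    print(TRADITIONAL_WIDTHS, DETNET_WIDTHS)
```

In other words, a dilated 3x3 convolution sees a wider context with the same nine weights and without shrinking the feature map, while holding the width at 256 limits the extra cost of running at the higher resolution.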

More information about dilation: https://blog.csdn.net/jzrita/article/details/72639969