Object Detection with Detectron

Background

Last month (January 2018), Facebook finally open-sourced their object detection platform, Detectron: facebookresearch/Detectron. It arrived two years after the aging py-faster-rcnn stopped being maintained, so intuitively the changes should be substantial. In this article, Gemfield walks through the steps for using Detectron. One thing to note: Detectron is built on Caffe2, and it requires the GPU build of Caffe2 (there is no CPU-only path).

Building and Installing

1, Build and install Caffe2. Gemfield simply uses a Caffe2 docker container; if you also want to use a Caffe2 container like Gemfield does, pull the docker image with the following command:

docker pull gemfield/caffe2

2, Verify that the Caffe2 installation succeeded:

# Check whether Caffe2 was built and installed successfully
gemfield@caffe2:/# python2 -c 'from caffe2.python import core' 2>/dev/null && echo "Success" || echo "Failure"
Success

# Check Caffe2's GPU dependencies: the GPU count printed by the command below
# must be greater than 0, otherwise Detectron cannot be used
gemfield@caffe2:/# python2 -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'                                     
2

3, Install the Python packages Detectron depends on. Note that Detectron is based on Python 2:

# quote the version specifiers so the shell does not treat '>' as redirection
pip install 'numpy>=1.13' 'pyyaml>=3.12' \
    matplotlib 'opencv-python>=3.2' setuptools Cython \
    mock scipy networkx

4, Build and install the COCO API:

gemfield@caffe2:~# git clone https://github.com/cocodataset/cocoapi.git 
gemfield@caffe2:~# cd cocoapi/PythonAPI/
gemfield@caffe2:~/cocoapi/PythonAPI# make install

5, Build Detectron:

gemfield@caffe2:~# git clone https://github.com/facebookresearch/detectron
gemfield@caffe2:~# cd detectron/lib/
gemfield@caffe2:~/detectron/lib# make
......
Installed /root/detectron/lib
Processing dependencies for Detectron==0.0.0
Finished processing dependencies for Detectron==0.0.0

The build produces the following 4 shared libraries:

gemfield@caffe2:~/detectron/lib# find . -name "*.so"
./utils/cython_nms.so
./utils/cython_bbox.so
./build/lib.linux-x86_64-2.7/utils/cython_nms.so
./build/lib.linux-x86_64-2.7/utils/cython_bbox.so                                                                                                              
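
These Cython extensions provide the fast IoU and NMS routines that Detectron's Python code relies on. As a quick smoke test you can call bbox_overlaps directly (a sketch; run it from the lib directory so that the utils package is importable):

# Exercise the freshly built cython_bbox extension.
import numpy as np
from utils.cython_bbox import bbox_overlaps

boxes = np.array([[0, 0, 10, 10]], dtype=np.float64)   # one reference box
query = np.array([[5, 5, 15, 15]], dtype=np.float64)   # one query box
print(bbox_overlaps(boxes, query))   # (1, 1) IoU matrix, roughly 0.17 here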

6, Run the unit test cases:

gemfield@caffe2:~/detectron/lib# python2 ~/detectron/tests/test_spatial_narrow_as_op.py                                              
Found Detectron ops lib: /usr/local/lib/libcaffe2_detectron_ops_gpu.so                                                                 
...                                                                                                                                    
----------------------------------------------------------------------
Ran 3 tests in 1.492s

OK

Running Detectron

You can use the built-in tools/infer_simple.py to run a pretrained model on real photos; internally, infer_simple.py calls Detectron's vis_utils.vis_one_image API (a sketch of this follows the log below).

gemfield@caffe2:~/detectron# python2 tools/infer_simple.py \
>     --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
>     --output-dir /tmp/detectron-visualizations \
>     --image-ext jpg \
>     --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
>     demo
Found Detectron ops lib: /usr/local/lib/libcaffe2_detectron_ops_gpu.so
......
INFO infer_simple.py: 111: Processing demo/15673749081_767a7fa63a_k.jpg -> /tmp/detectron-visualizations/15673749081_767a7fa63a_k.jpg.pdf
INFO infer_simple.py: 119: Inference time: 0.215s
INFO infer_simple.py: 121:  | im_detect_bbox: 0.118s
INFO infer_simple.py: 121:  | misc_mask: 0.076s
INFO infer_simple.py: 121:  | im_detect_mask: 0.018s
INFO infer_simple.py: 121:  | misc_bbox: 0.003s
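
For reference, the core of infer_simple.py does roughly the following (a simplified sketch of the early-2018 source tree; exact signatures may differ across revisions, and the weights path is a placeholder):

# Simplified sketch of tools/infer_simple.py's inner workings.
import cv2

from core.config import assert_and_infer_cfg, cfg, merge_cfg_from_file
import core.test_engine as infer_engine
import datasets.dummy_datasets as dummy_datasets
import utils.c2 as c2_utils
import utils.vis as vis_utils

c2_utils.import_detectron_ops()   # loads libcaffe2_detectron_ops_gpu.so
merge_cfg_from_file('configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml')
cfg.TEST.WEIGHTS = '/path/to/model_final.pkl'   # placeholder for the --wts file
assert_and_infer_cfg()

model = infer_engine.initialize_model_from_cfg()
dummy_coco_dataset = dummy_datasets.get_coco_dataset()

im = cv2.imread('demo/15673749081_767a7fa63a_k.jpg')
with c2_utils.NamedCudaScope(0):   # run the detection ops on GPU 0
    cls_boxes, cls_segms, cls_keyps = infer_engine.im_detect_all(model, im, None)

# The vis_utils.vis_one_image API mentioned above: draws boxes/masks
# and writes the PDF into the output directory.
vis_utils.vis_one_image(
    im[:, :, ::-1],                    # BGR (OpenCV) -> RGB
    '15673749081_767a7fa63a_k.jpg',    # output file name stem
    '/tmp/detectron-visualizations',   # --output-dir
    cls_boxes, cls_segms, cls_keyps,
    dataset=dummy_coco_dataset,
    thresh=0.7                         # only draw detections above this score
)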

In the invocation above, infer_simple.py took 5 arguments:

1, --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml

--cfg specifies the model configuration file, which plays the role of solver.prototxt plus the config file in py-faster-rcnn;

2, --output-dir /tmp/detectron-visualizations

visualizes the detection results and writes them as PDF files into the /tmp/detectron-visualizations directory;

3, --image-ext jpg

looks for files with the jpg extension;

4, --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl

the model weights file; HTTP URLs are supported, in which case the model file is downloaded into a local cache under /tmp:

gemfield@caffe2:~/detectron/tools# ls -l /tmp/detectron-download-cache/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl
-rw-r--r-- 1 gemfield gemfield 514281564 Feb 26 12:50 /tmp/detectron-download-cache/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl

Note: this command also downloads the pretrained model configured in the cfg file (if it is given as an http URL).

5, demo

runs detection on the jpg images in the demo directory.

Results

Finally, the detection results are written as PDF files to the /tmp/detectron-visualizations directory:

gemfield@caffe2:~/detectron# ls -l /tmp/detectron-visualizations
total 8756
-rw-r--r-- 1 root root 1124568 Feb 26 12:50 15673749081_767a7fa63a_k.jpg.pdf
......

Training Your Own Model

1, First, prepare your dataset

The dataset names and directory layouts currently supported by Detectron are defined in the file lib/datasets/dataset_catalog.py:

DATASETS = {
    'cityscapes_fine_instanceonly_seg_train': {
        IM_DIR: _DATA_DIR + '/cityscapes/images',
        ANN_FN: _DATA_DIR + '/cityscapes/annotations/instancesonly_gtFine_train.json',
        RAW_DIR: _DATA_DIR + '/cityscapes/raw'
    },
    'cityscapes_fine_instanceonly_seg_val': {
        IM_DIR: _DATA_DIR + '/cityscapes/images',
        # use filtered validation as there is an issue converting contours
        ANN_FN: _DATA_DIR + '/cityscapes/annotations/instancesonly_filtered_gtFine_val.json',
        RAW_DIR: _DATA_DIR + '/cityscapes/raw'
    },
    'cityscapes_fine_instanceonly_seg_test': {
        IM_DIR: _DATA_DIR + '/cityscapes/images',
        ANN_FN: _DATA_DIR + '/cityscapes/annotations/instancesonly_gtFine_test.json',
        RAW_DIR: _DATA_DIR + '/cityscapes/raw'
    },
    'coco_2014_train': {
        IM_DIR: _DATA_DIR + '/coco/coco_train2014',
        ANN_FN: _DATA_DIR + '/coco/annotations/instances_train2014.json'
    },
    'coco_2014_val': {
        IM_DIR: _DATA_DIR + '/coco/coco_val2014',
        ANN_FN: _DATA_DIR + '/coco/annotations/instances_val2014.json'
    },
    'coco_2014_minival': {
        IM_DIR: _DATA_DIR + '/coco/coco_val2014',
        ANN_FN: _DATA_DIR + '/coco/annotations/instances_minival2014.json'
    },
    'coco_2014_valminusminival': {
        IM_DIR: _DATA_DIR + '/coco/coco_val2014',
        ANN_FN: _DATA_DIR + '/coco/annotations/instances_valminusminival2014.json'
    },
    'coco_2015_test': {
        IM_DIR: _DATA_DIR + '/coco/coco_test2015',
        ANN_FN: _DATA_DIR + '/coco/annotations/image_info_test2015.json'
    },
    'coco_2015_test-dev': {
        IM_DIR: _DATA_DIR + '/coco/coco_test2015',
        ANN_FN: _DATA_DIR + '/coco/annotations/image_info_test-dev2015.json'
    },
    'coco_2017_test': {  # 2017 test uses 2015 test images
        IM_DIR: _DATA_DIR + '/coco/coco_test2015',
        ANN_FN: _DATA_DIR + '/coco/annotations/image_info_test2017.json',
        IM_PREFIX: 'COCO_test2015_'
    },
    'coco_2017_test-dev': {  # 2017 test-dev uses 2015 test images
        IM_DIR: _DATA_DIR + '/coco/coco_test2015',
        ANN_FN: _DATA_DIR + '/coco/annotations/image_info_test-dev2017.json',
        IM_PREFIX: 'COCO_test2015_'
    },
    'coco_stuff_train': {
        IM_DIR: _DATA_DIR + '/coco/coco_train2014',
        ANN_FN: _DATA_DIR + '/coco/annotations/coco_stuff_train.json'
    },
    'coco_stuff_val': {
        IM_DIR: _DATA_DIR + '/coco/coco_val2014',
        ANN_FN: _DATA_DIR + '/coco/annotations/coco_stuff_val.json'
    },
    'keypoints_coco_2014_train': {
        IM_DIR: _DATA_DIR + '/coco/coco_train2014',
        ANN_FN: _DATA_DIR + '/coco/annotations/person_keypoints_train2014.json'
    },
    'keypoints_coco_2014_val': {
        IM_DIR: _DATA_DIR + '/coco/coco_val2014',
        ANN_FN: _DATA_DIR + '/coco/annotations/person_keypoints_val2014.json'
    },
    'keypoints_coco_2014_minival': {
        IM_DIR: _DATA_DIR + '/coco/coco_val2014',
        ANN_FN: _DATA_DIR + '/coco/annotations/person_keypoints_minival2014.json'
    },
    'keypoints_coco_2014_valminusminival': {
        IM_DIR: _DATA_DIR + '/coco/coco_val2014',
        ANN_FN: _DATA_DIR + '/coco/annotations/person_keypoints_valminusminival2014.json'
    },
    'keypoints_coco_2015_test': {
        IM_DIR: _DATA_DIR + '/coco/coco_test2015',
        ANN_FN: _DATA_DIR + '/coco/annotations/image_info_test2015.json'
    },
    'keypoints_coco_2015_test-dev': {
        IM_DIR: _DATA_DIR + '/coco/coco_test2015',
        ANN_FN: _DATA_DIR + '/coco/annotations/image_info_test-dev2015.json'
    },
    'voc_2007_trainval': {
        IM_DIR: _DATA_DIR + '/VOC2007/JPEGImages',
        ANN_FN: _DATA_DIR + '/VOC2007/annotations/voc_2007_trainval.json',
        DEVKIT_DIR: _DATA_DIR + '/VOC2007/VOCdevkit2007'
    },
    'voc_2007_test': {
        IM_DIR: _DATA_DIR + '/VOC2007/JPEGImages',
        ANN_FN: _DATA_DIR + '/VOC2007/annotations/voc_2007_test.json',
        DEVKIT_DIR: _DATA_DIR + '/VOC2007/VOCdevkit2007'
    },
    'voc_2012_trainval': {
        IM_DIR: _DATA_DIR + '/VOC2012/JPEGImages',
        ANN_FN: _DATA_DIR + '/VOC2012/annotations/voc_2012_trainval.json',
        DEVKIT_DIR: _DATA_DIR + '/VOC2012/VOCdevkit2012'
    }
}

Make sure the dataset name you put in the cfg file matches an entry here, otherwise you will get AssertionError: Unknown dataset name: xxxx.
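
If your own dataset is not in this list, you can register it by adding an entry of the same shape. Below is a hypothetical example for a COCO-style custom dataset; the name 'gemfield_train' and the paths are placeholders for illustration, not real entries:

# Hypothetical entry appended to DATASETS in lib/datasets/dataset_catalog.py
'gemfield_train': {
    IM_DIR: _DATA_DIR + '/gemfield/images',
    ANN_FN: _DATA_DIR + '/gemfield/annotations/instances_train.json'
},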

2, Convert your dataset's format

As the previous step shows, Detectron supports 3 dataset families: COCO, PASCAL VOC, and Cityscapes. Note that PASCAL VOC here uses json in place of the traditional xml format. Gemfield's existing dataset was exactly PASCAL VOC xml, so Gemfield used the following Python script to convert the xml to json:

https://github.com/CivilNet/Gemfield/blob/master/src/python/pascal_voc_xml2json/pascal_voc_xml2json.py
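
The core of such a conversion is mechanical: walk the xml files and collect the categories, images, and annotations arrays of the COCO format. Below is a minimal illustrative sketch (not the script linked above; the paths are placeholders, and segmentation info and error handling are omitted):

# Minimal PASCAL VOC xml -> COCO-style json sketch (illustrative only).
# Assumes 'Annotations' contains nothing but the xml files.
import json, os
import xml.etree.ElementTree as ET

coco = {'images': [], 'annotations': [], 'categories': []}
cat_ids = {}   # class name -> category id
ann_id = 0

for img_id, name in enumerate(sorted(os.listdir('Annotations'))):
    tree = ET.parse(os.path.join('Annotations', name))
    size = tree.find('size')
    coco['images'].append({
        'id': img_id,
        'file_name': tree.findtext('filename'),
        'width': int(size.findtext('width')),
        'height': int(size.findtext('height')),
    })
    for obj in tree.iter('object'):
        cls = obj.findtext('name')
        if cls not in cat_ids:
            cat_ids[cls] = len(cat_ids) + 1
            coco['categories'].append({'id': cat_ids[cls], 'name': cls})
        b = obj.find('bndbox')
        x1, y1 = float(b.findtext('xmin')), float(b.findtext('ymin'))
        x2, y2 = float(b.findtext('xmax')), float(b.findtext('ymax'))
        ann_id += 1
        coco['annotations'].append({
            'id': ann_id, 'image_id': img_id, 'category_id': cat_ids[cls],
            'bbox': [x1, y1, x2 - x1, y2 - y1],   # COCO boxes are x, y, w, h
            'area': (x2 - x1) * (y2 - y1), 'iscrowd': 0,
        })

with open('instances.json', 'w') as f:
    json.dump(coco, f)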

Finally, the data is laid out under Detectron's lib/datasets/data directory with the following structure:

gemfield@detectron:~/detectron/lib/datasets/data# ls -lR
.:
total 8
-rw-r--r-- 1 root root 3074 Mar 20 07:15 README.md
drwxr-xr-x 1 root root 4096 Mar 17 11:24 VOC2012

./VOC2012:
total 4
lrwxrwxrwx 1 root root   24 Mar 17 11:23 JPEGImages -> /bigdata/VOC/JPEGImages/
drwxr-xr-x 2 root root 4096 Mar 20 07:09 annotations

./VOC2012/annotations:
total 0
lrwxrwxrwx 1 root root 27 Mar 20 07:09 voc_2012_trainval.json -> /bigdata/VOC/instances.json

Here, /bigdata/VOC/instances.json is exactly the file converted from the old xml with the pascal_voc_xml2json.py script above.

3, Change the number of classes in the yaml config file

In the file configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml, change the value of NUM_CLASSES from 81 to your own value, e.g. 225:

NUM_CLASSES: 225

Note that this value must be the number of classes in your dataset plus 1, where the extra 1 is the background class. If NUM_CLASSES is wrong, you will get the following error:

Exception encountered running PythonOp function: ValueError: could not broadcast input array from shape (4) into shape (0)
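
Since the categories live in the converted json, you can double-check the value from there (a small sketch, assuming the json path used above):

# Sanity-check NUM_CLASSES: number of categories + 1 for the background.
import json
with open('/bigdata/VOC/instances.json') as f:
    n = len(json.load(f)['categories'])
print('NUM_CLASSES should be %d' % (n + 1))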

Of course, you may also need to modify other values, mainly the solver parameters.

4, Start training

If you don't need to test the final model, you can start training right away by passing the --skip-test flag. Use the command below; note that --skip-test is placed before the OUTPUT_DIR override, because train_net.py passes everything after the named flags through as config overrides (the opts in the log below):

gemfield@detectron:~/detectron# python2 tools/train_net.py --cfg configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml \
    --skip-test OUTPUT_DIR /tmp/detectron-output

Some of the output logs:

Found Detectron ops lib: /usr/local/lib/libcaffe2_detectron_ops_gpu.so
Found Detectron ops lib: /usr/local/lib/libcaffe2_detectron_ops_gpu.so
E0321 07:01:54.091614   573 init_intrinsics_check.cc:59] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0321 07:01:54.091797   573 init_intrinsics_check.cc:59] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0321 07:01:54.091819   573 init_intrinsics_check.cc:59] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
INFO train_net.py: 104: Called with args:
INFO train_net.py: 105: Namespace(cfg_file='configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml', multi_gpu_testing=False, opts=['OUTPUT_DIR', '/tmp/detectron-output'], skip_test=True)
json_stats: {"accuracy_cls": 0.977539, "eta": "10:59:50", "iter": 93800, "loss": 0.163753, "loss_bbox": 0.085133, "loss_cls": 0.069081, "loss_rpn_bbox_fpn2": 0.000000, "loss_rpn_bbox_fpn3": 0.002045, "loss_rpn_bbox_fpn4": 0.000435, "loss_rpn_bbox_fpn5": 0.000000, "loss_rpn_bbox_fpn6": 0.000000, "loss_rpn_cls_fpn2": 0.000024, "loss_rpn_cls_fpn3": 0.000489, "loss_rpn_cls_fpn4": 0.000079, "loss_rpn_cls_fpn5": 0.000000, "loss_rpn_cls_fpn6": 0.000000, "lr": 0.002500, "mb_qsize": 64, "mem": 3145, "time": 0.191999}
json_stats: {"accuracy_cls": 0.963867, "eta": "10:59:46", "iter": 93820, "loss": 0.193032, "loss_bbox": 0.090194, "loss_cls": 0.091983, "loss_rpn_bbox_fpn2": 0.000000, "loss_rpn_bbox_fpn3": 0.001847, "loss_rpn_bbox_fpn4": 0.000357, "loss_rpn_bbox_fpn5": 0.000000, "loss_rpn_bbox_fpn6": 0.000000, "loss_rpn_cls_fpn2": 0.000025, "loss_rpn_cls_fpn3": 0.000385, "loss_rpn_cls_fpn4": 0.000027, "loss_rpn_cls_fpn5": 0.000000, "loss_rpn_cls_fpn6": 0.000000, "lr": 0.002500, "mb_qsize": 64, "mem": 3145, "time": 0.192000}
......
INFO test_engine.py: 225: im_detect: range [1, 76717] of 76717: 41831/76717 0.099s + 0.009s (eta: 1:02:38)
INFO test_engine.py: 225: im_detect: range [1, 76717] of 76717: 41841/76717 0.099s + 0.009s (eta: 1:02:37)
......
INFO test_engine.py: 225: im_detect: range [1, 76717] of 76717: 76691/76717 0.099s + 0.009s (eta: 0:00:02)
INFO test_engine.py: 225: im_detect: range [1, 76717] of 76717: 76701/76717 0.099s + 0.009s (eta: 0:00:01)
INFO test_engine.py: 225: im_detect: range [1, 76717] of 76717: 76711/76717 0.099s + 0.009s (eta: 0:00:00)

5, If you do not pass --skip-test

then the dataset prepared above is not enough, and you will eventually run into an error like this:

INFO test_engine.py: 258: Wrote detections to: /tmp/detectron-output/test/voc_2012_trainval/generalized_rcnn/detections.pkl
......
  File "/root/detectron/lib/datasets/voc_dataset_evaluator.py", line 169, in voc_info
    'Devkit directory {} not found'.format(devkit_path)
AssertionError: Devkit directory /root/detectron/lib/datasets/data/VOC2012/VOCdevkit2012 not found

So you also need to prepare:

1, an empty directory detectron/lib/datasets/data/VOC2012/VOCdevkit2012/results/VOC2012/Main;

2, the following Annotations directory, which holds the traditional pascal voc xml files:

gemfield@detectron:~/detectron/lib/datasets/data# ls -l VOC2012/VOCdevkit2012/VOC2012/Annotations/ | head
total 306868
-rw-r--r-- 1 1000 1000 1400 Mar 20 05:47 self1.mp4_1040.xml
...

3, the following ImageSets directory, which holds the traditional ImageSets/Main/trainval.txt file:

gemfield@detectron:~/detectron# ls -l lib/datasets/data/VOC2012/VOCdevkit2012/VOC2012/ImageSets/Main/trainval.txt
-rw-rw-r-- 1 1000 1000 1278209 Mar 22 09:40 lib/datasets/data/VOC2012/VOCdevkit2012/VOC2012/ImageSets/Main/trainval.txt

Note: the image order in trainval.txt must match the order of the image list in the json file's images field exactly, i.e. the following order (a generation sketch follows the example below):

gemfield@detectron:~/detectron# head lib/datasets/data/VOC2012/VOCdevkit2012/VOC2012/ImageSets/Main/trainval.txt
wzry101.mp4_2540
wzry102.mp4_2610
......
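
Rather than maintaining that order by hand, it is safer to generate trainval.txt from the json itself. A minimal sketch (assumes the file_name values carry an extension to strip, as in the listing above):

# Write trainval.txt in exactly the order of the json 'images' list,
# so the annotations and the VOC-style image index stay in sync.
import json

with open('/bigdata/VOC/instances.json') as f:
    images = json.load(f)['images']

with open('trainval.txt', 'w') as f:
    for im in images:
        f.write(im['file_name'].rsplit('.', 1)[0] + '\n')  # drop the extension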

6, The final test results

The final model is written to /tmp/detectron-output/train/voc_2012_trainval/generalized_rcnn/model_final.pkl; the name model_final is all but a keyword. Below are some logs from testing the model:

......
INFO voc_dataset_evaluator.py: 127: AP for gemfield1 = 0.9881
INFO voc_dataset_evaluator.py: 127: AP for gemfield2 = 0.9878
INFO voc_dataset_evaluator.py: 127: AP for gemfield3 = 0.9937
INFO voc_dataset_evaluator.py: 127: AP for gemfield4 = 0.9761
INFO voc_dataset_evaluator.py: 127: AP for gemfield5 = 0.8923
INFO voc_dataset_evaluator.py: 130: Mean AP = 0.9843
......
INFO task_evaluation.py:  61: Evaluating bounding boxes is done!
INFO task_evaluation.py: 180: copypaste: Dataset: voc_2012_trainval
INFO task_evaluation.py: 182: copypaste: Task: box
INFO task_evaluation.py: 185: copypaste: AP,AP50,AP75,APs,APm,APl
INFO task_evaluation.py: 186: copypaste: -1.0000,-1.0000,-1.0000,-1.0000,-1.0000,-1.0000

Testing Your Own Model

Before running actual inference, you also need to modify the lib/datasets/dummy_datasets.py file, replacing the classes there with your own dataset's classes, as in the sketch below.
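
dummy_datasets.py only provides the class-id-to-name mapping that vis_one_image uses for labels, so the edit amounts to swapping in your own class list. A sketch (the gemfield* names below echo the AP log in step 6; fill in your real classes):

# Sketch of lib/datasets/dummy_datasets.py after the edit: keep
# '__background__' at index 0, then your classes in category-id order.
from utils.collections import AttrDict

def get_coco_dataset():
    """Dummy dataset holding only the id -> class-name mapping."""
    ds = AttrDict()
    classes = [
        '__background__',
        'gemfield1', 'gemfield2', 'gemfield3', 'gemfield4', 'gemfield5',
        # ...the rest of your classes
    ]
    ds.classes = {i: name for i, name in enumerate(classes)}
    return ds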

Then use infer_simple.py:

gemfield@detectron:~/detectron# python2 tools/infer_simple.py --cfg configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml --output-dir /bigdata/VOC/gemfield/detectron-visualizations --image-ext jpg --wts /tmp/detectron-output/train/voc_2012_trainval/generalized_rcnn/model_final.pkl gemfield_demo
Found Detectron ops lib: /usr/local/lib/libcaffe2_detectron_ops_gpu.so
......
INFO infer_simple.py: 111: Processing gemfield_demo/Screenshot_20180324_100714.jpg -> /bigdata/VOC/gemfield/detectron-visualizations/Screenshot_20180324_100714.jpg.pdf
INFO infer_simple.py: 119: Inference time: 0.096s
INFO infer_simple.py: 121:  | im_detect_bbox: 0.089s
INFO infer_simple.py: 121:  | misc_bbox: 0.007s

    Original author: Gemfield
    Original article: https://zhuanlan.zhihu.com/p/34036460