Using TensorRT on the Jetson TX2 (JetPack 3.3)

I. Install TensorRT

  1. Install JetPack 3.3 (b39)
  2. CUDA 9.0 + OpenCV4Tegra + cuDNN + TensorRT 4.0.2.0
  3. Install pip3 (Python 3.5.2)

II. Build the TensorRT Samples

cd /usr/src/tensorrt/samples 
sudo mkdir ../bin
sudo chmod -R 777 ../bin/
make

The build output is installed to /usr/src/tensorrt/bin/ automatically (see Makefile.config).

III. Try TensorRT
Example 1:
Image classification with TensorFlow models using NVIDIA TensorRT
NV tf_to_trt_image_classification
Download location for TensorRT and the matching whl files:
TensorRT 4.0.1.6 for TX2

Step 1. Download and build tf_to_trt_image_classification

cd tf_to_trt_image_classification
mkdir build
cd build
cmake ..
make 
cd ..

If the build fails in uff_to_plan.cpp with the error below, modify the code as follows:
uff_to_plan.cpp:71:79: error: no matching function for call to ‘nvuffparser::IUffParser::registerInput(const char*, nvinfer1::DimsCHW)’

  //parser->registerInput(inputName.c_str(), DimsCHW(3, inputHeight, inputWidth));
  parser->registerInput(inputName.c_str(), DimsCHW(3, inputHeight, inputWidth), UffInputOrder::kNCHW);

Step 2. Install the uff exporter on the Jetson TX2
TensorRT 4.0 Tar

i. Download the TensorRT 4.0 tar package from [TensorRT 4.0]
ii. Extract the tarball
tar zxf TensorRT-4.0.1.6.Ubuntu-16.04.4.x86_64-gnu.cuda-9.0.cudnn7.1.tar.gz
iii. Install uff & graphsurgeon using pip
sudo pip3 install TensorRT-4.0.1.6/uff/uff-0.4.0-py2.py3-none-any.whl 
sudo pip install TensorRT-4.0.1.6/uff/uff-0.4.0-py2.py3-none-any.whl 
sudo pip3 install TensorRT-4.0.1.6/graphsurgeon/graphsurgeon-0.2.0-py2.py3-none-any.whl
sudo pip install TensorRT-4.0.1.6/graphsurgeon/graphsurgeon-0.2.0-py2.py3-none-any.whl

Step 3. Download models and create frozen graphs
Download models

Run the following bash script to download all of the pretrained models:
source scripts/download_models.sh

Create frozen graphs
If there are models you do not want, edit scripts/download_models.sh to remove them from the download list.
Next, because the TensorFlow models are distributed as checkpoints, they must be converted to frozen graphs before TensorRT can optimize them. Run the scripts/models_to_frozen_graphs.py script:

$ python3 scripts/models_to_frozen_graphs.py

Bug 1. ImportError: No module named 'slim'

Fix:
$ pip3 install aiohttp==2.1.0 --user
$ pip3 install slim --user

Bug 2. ImportError: No module named 'slim.nets'

$ python3 scripts/models_to_frozen_graphs.py 
Traceback (most recent call last):
  File "scripts/models_to_frozen_graphs.py", line 12, in <module>
    import slim.nets as nets
ImportError: No module named 'slim.nets'
Fix:
Download the TF models repo and put it under third_party/models/:
$ git clone https://github.com/tensorflow/models
$ rm -rf ./third_party/models
$ mv models ./third_party/
$ export PYTHONPATH=$PYTHONPATH:$PWD/third_party/models/research/slim/
Run it again and it succeeds:
$ python3 scripts/models_to_frozen_graphs.py
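Note that the `export PYTHONPATH=...` line only affects the current shell session. The same effect can be achieved from inside a Python script; a minimal sketch, assuming the directory layout produced by the clone above:

```python
import os
import sys

# Equivalent of: export PYTHONPATH=$PYTHONPATH:$PWD/third_party/models/research/slim/
slim_path = os.path.join(os.getcwd(), "third_party", "models", "research", "slim")
sys.path.insert(0, slim_path)

# With this in place, `import nets` resolves once the TF models repo
# has been cloned into third_party/models as above.
print(slim_path in sys.path)  # → True
```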

Step 4. Convert the frozen graph to a TensorRT engine
From the repo root, run scripts/convert_plan.py, taking the parameters for each model from the models table.
For example, to convert the Inception V1 model, run:

python3 scripts/convert_plan.py data/frozen_graphs/inception_v1.pb data/plans/inception_v1.plan input 224 224 InceptionV1/Logits/SpatialSqueeze 1 0 float

The positional arguments to convert_plan.py are:

  • frozen graph path
  • output plan path
  • input node name
  • input height
  • input width
  • output node name
  • max batch size
  • max workspace size
  • data type (float or half)

This script assumes single-input, single-output image models, and may not work out of the box for models other than those in the table above.
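The nine positional arguments above can be illustrated with a small sketch (the variable names are illustrative, not taken from convert_plan.py itself):

```python
# Hypothetical sketch of the argument order convert_plan.py expects;
# the variable names below are illustrative, not the script's own.
argv = [
    "data/frozen_graphs/inception_v1.pb",  # frozen graph path
    "data/plans/inception_v1.plan",        # output plan path
    "input",                               # input node name
    "224",                                 # input height
    "224",                                 # input width
    "InceptionV1/Logits/SpatialSqueeze",   # output node name
    "1",                                   # max batch size
    "0",                                   # max workspace size
    "float",                               # data type (float or half)
]
(graph_path, plan_path, input_name, height, width,
 output_name, batch_size, workspace, dtype) = argv
print(input_name, int(height), int(width), dtype)  # → input 224 224 float
```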

Step 5. Execute the TensorRT engine

$./build/examples/classify_image/classify_image data/images/person.jpg data/plans/inception_v1.plan data/imagenet_labels_1001.txt input InceptionV1/Logits/SpatialSqueeze inception

Loading TensorRT engine from plan file...
Preprocessing input...
Executing inference engine...

The top-5 indices are: 356 349 228 350 271 
Which corresponds to class labels: 
0. llama
1. ram, tup
2. kelpie
3. bighorn, bighorn sheep, cimarron, Rocky Mountain bighorn, Rocky Mountain sheep, Ovis canadensis
4. white wolf, Arctic wolf, Canis lupus tundrarum

For reference, the command above takes six arguments:

  1. input image path
  2. plan file path
  3. labels file (one label per line, line number corresponds to index in output)
  4. input node name
  5. output node name
  6. preprocessing function (either vgg or inception)

Two label files are provided in the repo's data folder for use with these models.
Some TensorFlow models were trained with an extra "background" class, so the model has 1001 outputs instead of 1000. To check whether a given model has 1001 or 1000 outputs, look up the NETS variable in scripts/model_meta.py, e.g. 'num_classes': 1000.
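How the labels file convention works can be sketched as follows (the three-line label list is a made-up excerpt, not the real 1001-entry file):

```python
# Illustrative label lookup: one label per line, the line number is the
# class index reported by classify_image. The labels below are a made-up
# three-line excerpt, not the real 1001-entry label file.
labels_text = "background\ntench\ngoldfish\n"
labels = labels_text.splitlines()

top_indices = [2, 1]                 # indices as printed by classify_image
names = [labels[i] for i in top_indices]
print(names)  # → ['goldfish', 'tench']
```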

Step 6. Benchmark all models
To benchmark all of the models, first convert the models downloaded above into TensorRT engines. Run the following script to convert them all:

python scripts/frozen_graphs_to_plans.py

If you want to change the parameters related to TensorRT optimization, edit the scripts/frozen_graphs_to_plans.py file. Next, to benchmark all of the models, run the scripts/test_trt.py script:

python scripts/test_trt.py

Once finished, the timing results will be stored in data/test_output_trt.txt. If you also want to benchmark the TensorFlow models, simply run:

python scripts/test_tf.py

The results will be stored in data/test_output_tf.txt. The benchmarking scripts load an example image as input, so make sure you have downloaded the sample images as described above.
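The benchmark scripts presumably time repeated inference after a warm-up pass; a generic, framework-free sketch of that timing pattern (the function name is illustrative):

```python
import time

def average_latency(run_once, iterations=50, warmup=5):
    """Return the mean wall-clock seconds per call, after a warm-up phase."""
    for _ in range(warmup):          # warm-up excludes one-time setup cost
        run_once()
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    return (time.perf_counter() - start) / iterations

# Trivial stand-in for an inference call:
latency = average_latency(lambda: sum(range(1000)))
print(latency >= 0.0)  # → True
```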

    Original author: 吾家七月
    Original article: https://www.jianshu.com/p/c2f7a9f0e166
    Reposted from the web for knowledge sharing; if there is any infringement, please contact the blog owner for removal.