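Converting a PyTorch Model to TVM

Background

I recently tried converting a PyTorch model to TVM and running the forward pass with the TVM framework. In short, the PyTorch model is first exported to ONNX, and the ONNX model is then converted to a TVM model. Gemfield used ONNX opset version 9.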
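The export step itself is not shown in this article. A minimal sketch of it might look like the following, assuming a torch.nn.Module called model, an input named gemfield (the name the later scripts use), a 1x3x256x256 input, and a made-up output name. torch.onnx.export and its input_names, output_names, and opset_version arguments belong to PyTorch; the helper names and defaults are placeholders:

```python
def onnx_export_options(input_name="gemfield", opset_version=9):
    """Keyword arguments handed to torch.onnx.export in this sketch."""
    return {
        "input_names": [input_name],
        "output_names": ["output"],  # hypothetical output name
        "opset_version": opset_version,
    }


def export_to_onnx(model, onnx_path="gemfield.onnx", input_shape=(1, 3, 256, 256)):
    """Trace `model` with a dummy input and write the ONNX file to `onnx_path`."""
    import torch  # imported lazily so the options helper works without PyTorch

    model.eval()  # export in inference mode
    dummy_input = torch.ones(*input_shape)
    torch.onnx.export(model, dummy_input, onnx_path, **onnx_export_options())
    return onnx_path


print(onnx_export_options())
```

Calling export_to_onnx(model) on a loaded network would then produce the gemfield.onnx file that the conversion scripts read.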

Installing TVM

1. Clone the repository

git clone --recursive https://github.com/dmlc/tvm

2. Install dependencies

sudo apt-get update
sudo apt-get install -y python3 python3-dev python3-setuptools gcc \
     libtinfo-dev zlib1g-dev build-essential cmake

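3. Install LLVM

LLVM 4.0 or newer is required. The official Ubuntu 16.04 apt repository only offers 3.x, while Ubuntu 18.04 is fine (it installs 6.0). If the newest LLVM in your distribution's apt repository is older than 4.0, use LLVM's own apt repository (see https://apt.llvm.org/):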

apt install software-properties-common

apt-add-repository "deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-7 main"
apt-add-repository "deb-src http://apt.llvm.org/xenial/ llvm-toolchain-xenial-7 main"
apt-get update

#apt-get install llvm-<version>-dev libclang-<version>-dev clang-<version>
# For example, with version 7:
apt-get install llvm-7-dev libclang-7-dev clang-7

Gemfield, though, is on KDE Ubuntu 19.04, which makes this much simpler:

gemfield@ThinkPad-X1C:~$ sudo apt install clang libclang-dev llvm-dev

Install Maven (mvn) to prepare for the Android development later on:

gemfield@ThinkPad-X1C:~$ sudo apt install maven

4. Customize config.cmake

mkdir build
cp cmake/config.cmake build
cd build

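Edit build/config.cmake, which contains a number of feature switches:

USE_CUDA, NVIDIA GPU computing;
USE_ROCM, AMD's GPU computing platform (its purpose is obvious);
USE_SDACCEL, FPGA computing;
USE_AOCL, Intel FPGA SDK for OpenCL (AOCL) runtime;
USE_OPENCL, a framework for writing programs for heterogeneous platforms, which may consist of CPUs, GPUs, DSPs, FPGAs, or other processors and hardware accelerators;
USE_METAL, GPU computing on iOS;
USE_VULKAN, the next-generation successor to OpenGL, supported since Android 7.x (not supported on iOS, which has its own Metal 2);
USE_OPENGL, a 2D/3D rendering API standard, implemented and supported by GPU vendors;
USE_SGX, Intel SGX;
USE_RPC, remote procedure calls, so that a PC and a phone can be debugged together over the network;
USE_STACKVM_RUNTIME, embed stackvm into the runtime;
USE_GRAPH_RUNTIME, enable tiny embedded graph runtime;
USE_GRAPH_RUNTIME_DEBUG, enable additional graph debug functions;
USE_LLVM, LLVM support;
USE_BLAS, an API standard for numerical libraries that provide basic linear algebra operations (such as vector or matrix multiplication); implementations include openblas, mkl, atlas, and apple;
USE_RANDOM, the contrib.random runtime;
USE_NNPACK,
USE_CUDNN,
USE_CUBLAS,
USE_MIOPEN,
USE_MPS,
USE_ROCBLAS,
USE_SORT, use contrib sort;
USE_ANTLR,
USE_VTA_TSIM,
USE_RELAY_DEBUG, Relay debug mode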

Gemfield only turned on set(USE_LLVM ON), USE_SORT, USE_GRAPH_RUNTIME, and USE_RPC. Everything else stayed off, either because it was not needed or because its purpose was not yet clear.
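These switches can also be flipped with a short script instead of by hand. The sketch below is only an illustration: enable_options is a hypothetical helper, and the sample text stands in for the real build/config.cmake, whose defaults may differ.

```python
import re


def enable_options(config_text, options):
    """Return config_text with every listed set(<OPTION> <value>) switched to ON."""
    for name in options:
        # Match e.g. "set(USE_LLVM OFF)" but not "set(USE_LLVM_SOMETHING OFF)".
        pattern = re.compile(
            r"^(\s*set\(\s*" + re.escape(name) + r"\s+)[^)\s]+(\s*\))",
            re.MULTILINE,
        )
        config_text, count = pattern.subn(r"\g<1>ON\g<2>", config_text)
        if count == 0:
            raise KeyError("option not found in config: " + name)
    return config_text


# Hand-written sample, not the real file contents.
sample = """\
set(USE_CUDA OFF)
set(USE_LLVM OFF)
set(USE_RPC ON)
set(USE_SORT OFF)
set(USE_GRAPH_RUNTIME ON)
set(USE_GRAPH_RUNTIME_DEBUG OFF)
"""

updated = enable_options(sample, ["USE_LLVM", "USE_SORT", "USE_GRAPH_RUNTIME", "USE_RPC"])
print(updated)
```

To apply it for real, read build/config.cmake into a string, pass it through enable_options, and write the result back.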

5. Build

With LLVM enabled, several hundred compilation units are built in total:

#cmake -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON .. for verbose, by civilnet
cmake ..
make -j4

In the end the following shared libraries are linked:

[  5%] Linking CXX shared library libvta.so
[ 12%] Linking CXX shared library libtvm_runtime.so
[ 86%] Linking CXX shared library libtvm.so
[ 94%] Linking CXX shared library libtvm_topi.so
[100%] Linking CXX shared library libnnvm_compiler.so

Here is a brief introduction to these shared libraries:

1. libvta.so (VTA stands for Versatile Tensor Accelerator; see https://docs.tvm.ai/vta/index.html), built from the following compilation units:

vta/src/device_api.cc 
vta/src/runtime.cc 
vta/src/sim/sim_driver.cc

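2. libtvm_runtime.so

As the name suggests, this is the TVM runtime. More precisely, it is a minimal TVM runtime library built from the "Minimum runtime related codes", that is, the following source files: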

src/runtime/builtin_fp16.cc
src/runtime/c_dsl_api.cc
src/runtime/c_runtime_api.cc
src/runtime/cpu_device_api.cc
src/runtime/dso_module.cc
src/runtime/file_util.cc
src/runtime/module.cc
src/runtime/module_util.cc
src/runtime/ndarray.cc
src/runtime/registry.cc
src/runtime/system_lib_module.cc
src/runtime/thread_pool.cc
src/runtime/threading_backend.cc
src/runtime/vm/memory_manager.cc
src/runtime/vm/object.cc
src/runtime/vm/vm.cc
src/runtime/workspace_pool.cc
3rdparty/bfloat16/bfloat16.cc
src/runtime/rpc/*.cc
src/runtime/graph/graph_runtime.cc
src/contrib/sort/sort.cc

3. libtvm.so

The full TVM, consisting of the compiler, the runtime, the RPC part, and so on:

common: Internal common utilities.
api: API function registration.
lang: The definition of DSL related data structure.
arithmetic: Arithmetic expression and set simplification.
op: The detail implementations about each operation(compute, scan, placeholder).
schedule: The operations on the schedule graph before converting to IR.
pass: The optimization pass on the IR structure.
codegen: The code generator.
runtime: Minimum runtime related codes.
autotvm: The auto-tuning module.
relay: Implementation of Relay. The second generation of NNVM, a new IR for deep learning frameworks.
contrib: Contrib extension libraries.

This library is fairly large, with more than 200 compilation units:

src/api/*.cc
src/arithmetic/*.cc
src/autotvm/*.cc
src/codegen/*.cc
src/lang/*.cc
src/op/*.cc
src/pass/*.cc
src/schedule/*.cc
src/relay/backend/*.cc
src/relay/ir/*.cc
src/relay/op/*.cc
src/relay/pass/*.cc
3rdparty/HalideIR/src/*.cpp
src/runtime/stackvm/*.cc
src/codegen/opt/*.cc
src/codegen/llvm/*.cc
src/runtime/*.cc
src/contrib/hybrid/codegen_hybrid.cc
3rdparty/bfloat16/bfloat16.cc
src/contrib/sort/sort.cc

4. libtvm_topi.so

TOPI (TVM Operator Inventory) is the operator collection library for TVM, intended to share the effort of crafting and optimizing TVM-generated kernels. It is built from the following compilation unit:

topi/src/topi.cc

5. libnnvm_compiler.so

The NNVM compiler, built from the following compilation units:

nnvm/src/c_api/*.cc
nnvm/src/compiler/*.cc
nnvm/src/core/*.cc
nnvm/src/pass/*.cc
nnvm/src/top/nn/*.cc
nnvm/src/top/tensor/*.cc
nnvm/src/top/vision/nms.cc
nnvm/src/top/vision/ssd/mutibox_op.cc
nnvm/src/top/vision/yolo/reorg.cc
nnvm/src/top/image/resize.cc

6. Set PYTHONPATH

export TVM_HOME=/home/gemfield/github/Gemfield/tvm/
export PYTHONPATH=$TVM_HOME/python:$TVM_HOME/topi/python:$TVM_HOME/nnvm/python:${PYTHONPATH}
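A quick way to confirm that these directories exist before importing tvm is sketched here. The helper names are made up for illustration, and the directory layout (python, topi/python, nnvm/python) is simply the one used in the export line above.

```python
import os


def tvm_python_paths(tvm_home):
    """Python package directories of a TVM source checkout, in PYTHONPATH order."""
    subdirs = ("python", os.path.join("topi", "python"), os.path.join("nnvm", "python"))
    return [os.path.join(tvm_home, sub) for sub in subdirs]


def missing_paths(tvm_home):
    """Return the expected directories that do not exist on disk."""
    return [path for path in tvm_python_paths(tvm_home) if not os.path.isdir(path)]


# Falls back to the checkout location used in this article if TVM_HOME is unset.
tvm_home = os.environ.get("TVM_HOME", "/home/gemfield/github/Gemfield/tvm/")
print(os.pathsep.join(tvm_python_paths(tvm_home)))
print("missing:", missing_paths(tvm_home))
```

An empty "missing" list suggests the checkout path is right; a non-empty one usually means TVM_HOME points at the wrong directory.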

7. Install Python dependencies

Note that TVM has dropped support for Python 2.

# Required dependencies
gemfield@ThinkPad-X1C:~$ pip3 install numpy decorator attrs

# If you want to use the RPC Tracker
gemfield@ThinkPad-X1C:~$ pip3 install tornado

# If you want to use the auto-tuning module
gemfield@ThinkPad-X1C:~$ pip3 install tornado psutil xgboost

Installing onnx

gemfield@ThinkPad-X1C:~$ pip3 install onnx

Starting the Conversion

Use the following code:

import onnx
import numpy as np
import tvm
import tvm.relay as relay

onnx_model = onnx.load('gemfield.onnx')

x = np.ones([1,3,256,256])
# arch = "arm64"
# target =  "llvm -target=%s-linux-android" % arch
target = 'llvm'
input_name = 'gemfield'
shape_dict = {input_name: x.shape}
sym, params = relay.frontend.from_onnx(onnx_model, shape_dict)

with relay.build_config(opt_level=1):
    intrp = relay.build_module.create_executor('graph', sym, tvm.cpu(0), target)

dtype = 'float32'
tvm_output = intrp.evaluate(sym)(tvm.nd.array(x.astype(dtype)), **params).asnumpy()

with relay.build_config(opt_level=2):
    graph, lib, params = relay.build_module.build(sym, target, params=params)

libpath = "gemfield.so"
lib.export_library(libpath)

graph_json_path = "gemfield.json"
with open(graph_json_path, 'w') as fo:
    fo.write(graph)

param_path = "gemfield.params"
with open(param_path, 'wb') as fo:
    fo.write(relay.save_param_dict(params))

The conversion was at first blocked on the upsample op; evidently TVM did not yet support opset 9. A PR has been created:

Enhance upsample operator to adapt onnx opset version 9 by gemfield · Pull Request #2840 · dmlc/tvm (github.com)

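After that fix, the next error was that group kernels were not supported. The new error message:

"not support arbitrary group number for now"

About a month later this error had been fixed too, so the conversion could continue. A successful conversion produces the following 3 files: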

-rw-rw-r-- 1 gemfield gemfield  124561 5月  17 19:38 gemfield.json
-rw-rw-r-- 1 gemfield gemfield  407658 5月  17 19:38 gemfield.params
-rwxrwxr-x 1 gemfield gemfield  585760 5月  17 19:38 gemfield.so

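The gemfield.so generated this time is an x86-64 shared library that depends only on the basic C library. The op computations of the original network have been turned into C functions such as these: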

fused_concatenate
fused_concatenate_1
fused_concatenate_multiply_add_nn_prelu
fused_concatenate_multiply_add_nn_prelu_1
fused_nn_avg_pool2d
fused_nn_avg_pool2d_1
fused_nn_conv2d
fused_nn_conv2d_1
fused_nn_conv2d_add
fused_nn_conv2d_add_2
fused_nn_conv2d_multiply_add
fused_nn_conv2d_multiply_add_add_nn_prelu
fused_nn_conv2d_multiply_add_add_nn_prelu_1
fused_nn_conv2d_multiply_add_nn_prelu
fused_nn_conv2d_multiply_add_nn_prelu_1
fused_nn_pad_1
fused_nn_upsampling
fused_nn_upsampling_concatenate
fused_nn_upsampling_nn_upsampling_nn_upsampling_nn_upsampling_concatenate
...

These functions were compiled into the shared library gemfield.so. In addition, gemfield.json describes the network structure in JSON, and gemfield.params contains the network weights.
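To see how the JSON description refers to those compiled functions, a small inspection script can help. The sketch below assumes the graph runtime JSON has a top-level "nodes" list in which operator nodes have "op" equal to "tvm_op" and store the function name under attrs["func_name"]; check this against your own gemfield.json. The sample graph is hand-written, not real output.

```python
import json
from collections import Counter


def fused_function_counts(graph_json_text):
    """Count how often each compiled function is referenced by the graph nodes."""
    graph = json.loads(graph_json_text)
    counts = Counter()
    for node in graph.get("nodes", []):
        # Assumption: operator nodes carry op == "tvm_op" and attrs["func_name"].
        if node.get("op") == "tvm_op":
            counts[node.get("attrs", {}).get("func_name", "<unknown>")] += 1
    return counts


# Tiny hand-written stand-in for gemfield.json.
sample_graph = json.dumps({
    "nodes": [
        {"op": "null", "name": "gemfield", "inputs": []},
        {"op": "tvm_op", "name": "conv_a",
         "attrs": {"func_name": "fused_nn_conv2d"}, "inputs": [[0, 0, 0]]},
        {"op": "tvm_op", "name": "up_a",
         "attrs": {"func_name": "fused_nn_upsampling"}, "inputs": [[1, 0, 0]]},
        {"op": "tvm_op", "name": "conv_b",
         "attrs": {"func_name": "fused_nn_conv2d"}, "inputs": [[2, 0, 0]]},
    ]
})

for name, count in sorted(fused_function_counts(sample_graph).items()):
    print(name, count)
```

For the real file, pass open("gemfield.json").read() instead of sample_graph.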

Forward Inference

The gemfield.so compiled above is loaded through tvm.module. The following code shows how to run forward inference with gemfield.so and the tvm module:

import os

import cv2
import numpy as np
import tvm
from PIL import Image
from tvm.contrib import graph_runtime

loaded_json = open("gemfield.json").read()
loaded_lib = tvm.module.load('gemfield.so')
loaded_params = bytearray(open('gemfield.params', "rb").read())

ctx = tvm.cpu()
module = graph_runtime.create(loaded_json, loaded_lib, ctx)
module.load_params(loaded_params)

files = os.listdir("input/")
mean = [109.496254,118.698456,124.68751]
std = 58.50182

for f in files:
    img_in = cv2.imread("input/"+f)
    img = cv2.resize(img_in,(256, 256))
    img=img.astype(np.float32)
    for j in range(3):
        img[:,:,j]-=mean[j]
    for j in range(3):
        img[:,:,j]/=std

    img/=255
    img=img.transpose((2,0,1))
    img=np.expand_dims(img, axis=0)
    module.set_input("gemfield",img.astype(np.float32))
    module.run()

    img_out = module.get_output(0).asnumpy()
    img = np.argmax(img_out,axis=1)
    img=np.squeeze(img)

    palette=np.array([[0, 0, 0],[128, 0, 0],[0, 128, 0],[128, 128, 0],\
                     [0, 0, 128],[128, 0, 128],[0, 128, 128],[128, 128, 128],\
                     [64, 0, 0],[192, 0, 0],[64, 128, 0],[192, 128, 0],\
                     [64, 0, 128],[192, 0, 128],[64, 128, 128],[192, 128, 128],\
                     [0, 64, 0],[128, 64, 0],[0, 192, 0],[128, 192, 0],\
                     [0, 64, 128]], dtype='uint8').flatten()

    img=Image.fromarray(img.astype('uint8'))
    img.putpalette(palette)
    #only mask, by gemfield
    img.save("output_mask/"+os.path.splitext(f)[0]+".png")

    #blend with original image
    img1 = Image.open("output_mask/"+os.path.splitext(f)[0]+".png")
    img1 = img1.convert('RGBA')
    img2 = Image.open("input/"+f)
    img2 = img2.resize((256,256))
    img2 = img2.convert('RGBA')
    
    img = Image.blend(img1, img2, 0.5)
    img.save("output_blend/"+os.path.splitext(f)[0]+".png")
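The 21 colors hard-coded in the script above appear to follow the PASCAL VOC colormap convention, in which the bits of the class index are spread over the high bits of R, G, and B. Under that assumption, the palette can be generated rather than typed out. voc_palette is a hypothetical helper name; the spot checks in the comments compare against entries from the list in the script.

```python
def voc_palette(num_classes=21):
    """Build a flat [r, g, b, r, g, b, ...] palette using the VOC bit-shuffling scheme."""
    palette = []
    for index in range(num_classes):
        r = g = b = 0
        c = index
        # Each group of 3 index bits feeds one bit of R, G and B, from bit 7 downwards.
        for shift in range(7, -1, -1):
            r |= (c & 1) << shift
            g |= ((c >> 1) & 1) << shift
            b |= ((c >> 2) & 1) << shift
            c >>= 3
        palette.extend([r, g, b])
    return palette


palette = voc_palette()
print(palette[3:6])    # class 1  -> [128, 0, 0]
print(palette[60:63])  # class 20 -> [0, 64, 128]
```

The returned list can be passed to img.putpalette() in place of the hand-written array.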

Measuring Forward-Pass Speed

A simple performance test can be run on the target device:

import onnx
import numpy as np
import tvm
import tvm.relay as relay
from tvm.contrib import graph_runtime, rpc
import time

onnx_model = onnx.load('gemfield.onnx')

x = np.ones([1,3,256,256])
# target can be "opencl", "llvm", "metal" or any target supported by tvm
# arch = "arm64"
# target = "llvm -target=%s-linux-android" % arch
target = "llvm"
# target = "opencl"
input_name = 'gemfield'
shape_dict = {input_name: x.shape}
sym, params = relay.frontend.from_onnx(onnx_model, shape_dict)
ctx = tvm.context(target, 0)
with relay.build_config(opt_level=0):
    intrp = relay.build_module.create_executor('graph', sym, ctx, target)

with relay.build_config(opt_level=0):
    graph, lib, params = relay.build_module.build(sym, target, params=params)

x = np.ones([1,3,256,256])
dtype = np.float32

module = graph_runtime.create(graph, lib, ctx)
module.set_input('gemfield', tvm.nd.array(x.astype(dtype)))
module.set_input(**params)
module.run()

img_out = module.get_output(0).asnumpy()

print('benchmark by gemfield on cpu')
t1 = time.time()
ftimer = module.module.time_evaluator("run", ctx, 100)
prof_res = ftimer()
print(prof_res)

Output:

ProfileResult(mean=0.02616951292, results=(0.02616951292,))
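The mean reported by time_evaluator is in seconds per run, so the figure above corresponds to roughly 26 ms per forward pass. A small conversion helper makes that easier to read; summarize_latency is a made-up name for this sketch, not part of TVM.

```python
def summarize_latency(mean_seconds):
    """Convert a mean per-run latency in seconds into (milliseconds, runs per second)."""
    if mean_seconds <= 0:
        raise ValueError("mean_seconds must be positive")
    return mean_seconds * 1000.0, 1.0 / mean_seconds


# Value copied from the ProfileResult printed above.
latency_ms, runs_per_second = summarize_latency(0.02616951292)
print("%.2f ms per run, about %.1f runs/s" % (latency_ms, runs_per_second))
# prints: 26.17 ms per run, about 38.2 runs/s
```

In the benchmark script, prof_res.mean can be passed in directly.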

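Continuing the Conversion (Android Platform)

Once the model runs on the host, the next step is to run it on a phone. This build needs a different target, here target = "llvm -target=arm64-linux-android". Besides the target, a cross compiler must also be set; the one shipped in the NDK can be used directly: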

import onnx
import numpy as np
import tvm
import tvm.relay as relay

onnx_model = onnx.load('gemfield.onnx')

x = np.ones([1,3,256,256])
arch = "arm64"
target =  "llvm -target=%s-linux-android" % arch

input_name = 'gemfield'
shape_dict = {input_name: x.shape}
sym, params = relay.frontend.from_onnx(onnx_model, shape_dict)

with relay.build_config(opt_level=0):
    intrp = relay.build_module.create_executor('graph', sym, tvm.cpu(0), target)

with relay.build_config(opt_level=0):
    graph, lib, params = relay.build_module.build(sym, target, params=params)

libpath = "gemfield.so"
#lib.export_library(libpath)
# Adjacent string literals are concatenated, so the path contains no stray whitespace.
lib.export_library(libpath, cc="/home/gemfield/Android/android-ndk-r19c/"
        "toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang")

graph_json_path = "gemfield.json"
with open(graph_json_path, 'w') as fo:
    fo.write(graph)

param_path = "gemfield.params"
with open(param_path, 'wb') as fo:
    fo.write(relay.save_param_dict(params))

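After a successful build, 3 files are generated just as on the host. The difference is that this time gemfield.so is an ELF file for ARM aarch64.

Note 1: if you set a target that differs from the host architecture, such as arm64-linux-android, but then evaluate on the host, you get this error: TVMError: Cannot run module, architecture mismatch module=arm64-linux-android system=x86_64-pc-linux-gnu. In that case the right thing to do is to stop evaluating on the host.

Note 2: if the target is arm64-linux-android but the specified compiler is not for that architecture, you get errors like the following: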

/usr/bin/ld: /tmp/tmpdepff9wm/lib.o: Relocations in generic ELF (EM: 183)
/usr/bin/ld: /tmp/tmpdepff9wm/lib.o: Relocations in generic ELF (EM: 183)
/usr/bin/ld: /tmp/tmpdepff9wm/lib.o: error adding symbols: file in wrong format
collect2: error: ld returned 1 exit status
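The "EM: 183" in the linker message is the ELF machine number for AArch64, which is why the host linker rejects the object file. The same field can be read from any ELF file to check which architecture it was built for. The sketch below uses only the standard library; the function names and the short machine table are choices made for this example, and the header at the end is synthetic rather than read from a real gemfield.so.

```python
import struct

# A few e_machine values from the ELF specification.
ELF_MACHINES = {3: "x86", 40: "arm", 62: "x86-64", 183: "aarch64"}


def elf_machine(header):
    """Return the architecture name encoded in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    byte_order = "<" if header[5] == 1 else ">"  # EI_DATA: 1 = little endian
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)  # e_machine
    return ELF_MACHINES.get(machine, "unknown (%d)" % machine)


def elf_machine_of_file(path):
    """Read just enough of a file to identify its ELF machine type."""
    with open(path, "rb") as handle:
        return elf_machine(handle.read(20))


# Synthetic 64-bit little-endian header with e_type = 3 and e_machine = 183.
fake_header = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 183)
print(elf_machine(fake_header))  # aarch64
```

Running elf_machine_of_file("gemfield.so") on the cross-compiled library should then report aarch64, and x86-64 for the host build.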

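Building and Installing TVM4J

TVM4J is TVM for Java, the Java frontend of TVM. To use TVM's Java API you need to build the TVM4J library (a jar package). If you need RPC, TVM4J also provides a simple API for the RPC server and client.

1. Build TVM4J with make jvmpkg, which downloads many Maven pom configuration files: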

gemfield@ThinkPad-X1C:~/github/Gemfield/tvm$ make jvmpkg

Note that TVM currently only supports openjdk8. With openjdk11 you will hit errors like this:

[ERROR] /home/gemfield/github/Gemfield/tvm/jvm/core/src/main/java/ml/dmlc/tvm/rpc/Server.java:[20,16] cannot find symbol
  symbol:   class SharedSecrets
  location: package sun.misc

And this one:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.3:testCompile (default-testCompile) on project tvm4j-core: Compilation failure: Compilation failure: 
[ERROR] /home/gemfield/github/Gemfield/tvm/jvm/core/src/test/java/ml/dmlc/tvm/contrib/GraphRuntimeTest.java:[46,5] reference to Module is ambiguous
[ERROR]   both class ml.dmlc.tvm.Module in ml.dmlc.tvm and class java.lang.Module in java.lang match

Patching these one by one just kept surfacing more incompatibilities, so Gemfield simply downgraded to openjdk8.
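Since the build is sensitive to the JDK version, it can be worth checking it before running make jvmpkg. The sketch below parses the kind of text that java -version prints (to stderr) and handles both the legacy 1.8 style and the newer 11 style of version string. The function name and the sample strings are illustrative assumptions, not output captured from the author's machine.

```python
import re


def java_major_version(version_output):
    """Extract the major Java version from `java -version` style text."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found")
    major = int(match.group(1))
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))  # legacy scheme: 1.8.0_212 means Java 8
    return major


print(java_major_version('openjdk version "1.8.0_212"'))          # 8
print(java_major_version('openjdk version "11.0.3" 2019-04-16'))  # 11
```

A result other than 8 would be a hint to switch JDKs before building TVM4J at this TVM revision.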

A successful build prints:

[INFO] 
[INFO] --- maven-javadoc-plugin:2.9.1:jar (attach-javadocs) @ tvm4j-full-linux-x86_64 ---
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for TVM4J Package - Parent 0.0.1-SNAPSHOT:
[INFO] 
[INFO] TVM4J Package - Parent ............................. SUCCESS [  1.510 s]
[INFO] TVM4J Package - Core ............................... SUCCESS [  3.713 s]
[INFO] TVM4J Package - Native Parent ...................... SUCCESS [  0.025 s]
[INFO] TVM4J Package - Native Linux-x86_64 ................ SUCCESS [ 58.494 s]
[INFO] TVM4J Package - Full Parent ........................ SUCCESS [  0.028 s]
[INFO] TVM4J Package - Full Linux-x86_64 .................. SUCCESS [01:47 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  02:51 min
[INFO] Finished at: 2019-05-20T21:47:02+08:00
[INFO] ------------------------------------------------------------------------

2. Install

Use make jvminstall to install the TVM4J jar packages under $HOME/.m2/repository/:

gemfield@ThinkPad-X1C:~/github/Gemfield/tvm$ make jvminstall
......
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/pom.xml to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-parent/0.0.1-SNAPSHOT/tvm4j-parent-0.0.1-SNAPSHOT.pom
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/target/tvm4j-parent-0.0.1-SNAPSHOT-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-parent/0.0.1-SNAPSHOT/tvm4j-parent-0.0.1-SNAPSHOT-javadoc.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/core/target/tvm4j-core-0.0.1-SNAPSHOT.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-core/0.0.1-SNAPSHOT/tvm4j-core-0.0.1-SNAPSHOT.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/core/pom.xml to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-core/0.0.1-SNAPSHOT/tvm4j-core-0.0.1-SNAPSHOT.pom
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/core/target/tvm4j-core-0.0.1-SNAPSHOT-sources.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-core/0.0.1-SNAPSHOT/tvm4j-core-0.0.1-SNAPSHOT-sources.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/core/target/tvm4j-core-0.0.1-SNAPSHOT-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-core/0.0.1-SNAPSHOT/tvm4j-core-0.0.1-SNAPSHOT-javadoc.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/core/target/tvm4j-core-0.0.1-SNAPSHOT-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-core/0.0.1-SNAPSHOT/tvm4j-core-0.0.1-SNAPSHOT-javadoc.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/native/pom.xml to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-native-parent/0.0.1-SNAPSHOT/tvm4j-native-parent-0.0.1-SNAPSHOT.pom
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/native/target/tvm4j-native-parent-0.0.1-SNAPSHOT-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-native-parent/0.0.1-SNAPSHOT/tvm4j-native-parent-0.0.1-SNAPSHOT-javadoc.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/native/linux-x86_64/target/libtvm4j-linux-x86_64.so to /home/gemfield/.m2/repository/ml/dmlc/tvm/libtvm4j-linux-x86_64/0.0.1-SNAPSHOT/libtvm4j-linux-x86_64-0.0.1-SNAPSHOT.so
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/native/linux-x86_64/pom.xml to /home/gemfield/.m2/repository/ml/dmlc/tvm/libtvm4j-linux-x86_64/0.0.1-SNAPSHOT/libtvm4j-linux-x86_64-0.0.1-SNAPSHOT.pom
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/native/linux-x86_64/target/libtvm4j-linux-x86_64-sources.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/libtvm4j-linux-x86_64/0.0.1-SNAPSHOT/libtvm4j-linux-x86_64-0.0.1-SNAPSHOT-sources.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/native/linux-x86_64/target/libtvm4j-linux-x86_64-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/libtvm4j-linux-x86_64/0.0.1-SNAPSHOT/libtvm4j-linux-x86_64-0.0.1-SNAPSHOT-javadoc.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/assembly/pom.xml to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-full-parent/0.0.1-SNAPSHOT/tvm4j-full-parent-0.0.1-SNAPSHOT.pom
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/assembly/target/tvm4j-full-parent-0.0.1-SNAPSHOT-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-full-parent/0.0.1-SNAPSHOT/tvm4j-full-parent-0.0.1-SNAPSHOT-javadoc.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/assembly/linux-x86_64/target/tvm4j-full-linux-x86_64-0.0.1-SNAPSHOT.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-full-linux-x86_64/0.0.1-SNAPSHOT/tvm4j-full-linux-x86_64-0.0.1-SNAPSHOT.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/assembly/linux-x86_64/pom.xml to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-full-linux-x86_64/0.0.1-SNAPSHOT/tvm4j-full-linux-x86_64-0.0.1-SNAPSHOT.pom
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/assembly/linux-x86_64/target/tvm4j-full-linux-x86_64-0.0.1-SNAPSHOT-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-full-linux-x86_64/0.0.1-SNAPSHOT/tvm4j-full-linux-x86_64-0.0.1-SNAPSHOT-javadoc.jar

Installing the RPC App on an Android Phone

This step depends on TVM4J.

1. Set the Android SDK path:

gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ export ANDROID_HOME=~/Android/Sdk/
# Optional, if ndk-build cannot be found
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ export PATH=$PATH:/home/gemfield/Android/Sdk/ndk-bundle/

2. Build the RPC APK

gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ gradle clean build
......
16:12:24.695 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] FAILURE: Build failed with an exception.
16:12:24.695 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] 
16:12:24.695 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] * What went wrong:
16:12:24.695 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] Gradle build daemon disappeared unexpectedly (it may have been killed or may have crashed)

# If the error above appears, add the --no-daemon flag, as follows:
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ gradle clean build --no-daemon

After a successful build, the following apk files are generated:

gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ find . -name "*.apk" -exec ls -l {} \+
-rw-rw-r-- 1 gemfield gemfield 4098714 May 21 16:25 ./app/build/outputs/apk/debug/app-debug.apk
-rw-rw-r-- 1 gemfield gemfield 3576289 May 21 16:25 ./app/build/outputs/apk/release/app-release-unsigned.apk

Generate Gemfield's certificate:

keytool -genkey -keystore /home/gemfield/github/Gemfield/tvm/apps/android_rpc/dev_tools/tvmrpc.keystore -alias tvmrpc -keyalg RSA -validity 10000

Sign the apk generated above (note that this uses a self-signed certificate):

jarsigner -keystore /home/gemfield/github/Gemfield/tvm/apps/android_rpc/dev_tools/tvmrpc.keystore \
          -signedjar /home/gemfield/github/Gemfield/tvm/apps/android_rpc/dev_tools/../app/build/outputs/apk/release/tvmrpc-release.apk \
          /home/gemfield/github/Gemfield/tvm/apps/android_rpc/dev_tools/../app/build/outputs/apk/release/app-release-unsigned.apk 'tvmrpc'

Then install this apk on your Android phone:

gemfield@ThinkPad-X1C:~$ adb install -r /home/gemfield/github/Gemfield/tvm/apps/android_rpc/dev_tools/../app/build/outputs/apk/release/tvmrpc-release.apk
Success

Debugging the Model's Ops on Android via RPC

1. Run the RPC tracker on the host:

gemfield@ThinkPad-X1C:~$ python -m tvm.exec.rpc_tracker --port 7030
INFO:root:If you are running ROCM/Metal, fork will cause compiler internal error. Try to launch with arg ```--no-fork```
INFO:RPCTracker:bind to 0.0.0.0:7030

This service listens on port 7030 of the host.
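Before entering the address in the phone app, it can help to confirm that the tracker port actually accepts connections. The sketch below is a plain TCP reachability check using the standard library; is_port_open is a made-up helper, and it only shows that something is listening, not that it is the TVM tracker.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 7030 is the tracker port used in this article.
print("tracker reachable:", is_port_open("127.0.0.1", 7030))
```

Running the same check from another machine against the host's LAN address is a quick way to rule out firewall problems before blaming the phone app.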

2. Open the TVM RPC app on the Android phone and fill in the following:

Address:192.168.31.74
Port:7030
Key:android

Then enable RPC.

3. Check the RPC connection information on the host:

gemfield@ThinkPad-X1C:~$ python -m tvm.exec.query_rpc_tracker --port 7030
Tracker address localhost:7030

Server List
----------------------------
server-address  key
----------------------------
192.168.31.5:55630      server:android
----------------------------

Queue Status
-------------------------------
key       total  free  pending
-------------------------------
android   1      1     0      
-------------------------------

4. Run tests/android_rpc_test.py

This compiles TVM IR into a shared library, uploads it to the Android phone, and runs it there:

gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ export TVM_TRACKER_HOST=0.0.0.0
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ export TVM_TRACKER_PORT=7030
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ export TVM_NDK_CC=/home/gemfield/Android/./Sdk/ndk-bundle/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang++
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ python3 tests/android_rpc_test.py
Run CPU test ...
5.18708e-05 secs/op

Running the ONNX Model on an Android Phone via RPC

First set the environment variables TVM_TRACKER_HOST, TVM_TRACKER_PORT, and TVM_NDK_CC, then run the following script:

import os
import sys
import onnx
import numpy as np
import tvm
import tvm.relay as relay
from tvm.contrib import graph_runtime, rpc

from time import time
from tvm.contrib import util, ndk

opt_level = 0
num_iter = 100
dtype = np.float32

onnx_model = onnx.load('gemfield.onnx')

x = np.ones([1,3,256,256])
# target can be "opencl", "llvm", "metal" or any target supported by tvm
arch = "arm64"
target =  "llvm -target=%s-linux-android" % arch
# target = "llvm"
# target = "opencl"
input_name = 'gemfield'
shape_dict = {input_name: x.shape}
sym, params = relay.frontend.from_onnx(onnx_model, shape_dict)

tracker_host = os.environ["TVM_TRACKER_HOST"]
tracker_port = int(os.environ["TVM_TRACKER_PORT"])
key = "android"

tracker = rpc.connect_tracker(tracker_host, tracker_port)
remote = tracker.request(key, priority=0,session_timeout=60)
target_host = None
ctx = remote.cpu(0)

with relay.build_config(opt_level=0):
    graph, lib, params = relay.build_module.build(sym, target, params=params)

so_name = "gemfield.so"
temp = util.tempdir()
path_so = temp.relpath(so_name)
lib.export_library(path_so, ndk.create_shared)

print("gemfield upload file: ", path_so)
remote.upload(path_so)
rlib = remote.load_module(so_name)

### run on remote device
x = np.ones([1,3,256,256])
rmodule = graph_runtime.create(graph, rlib, ctx)
rmodule.set_input('gemfield', tvm.nd.array(x.astype(dtype)))
rmodule.set_input(**params)

rmodule.run()

img_out = rmodule.get_output(0).asnumpy()

print('benchmark by gemfield on Android cpu')
ftimer = rmodule.module.time_evaluator("run", ctx, num_iter)
prof_res = ftimer()
print(prof_res)
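The script above fails with a bare KeyError if one of the environment variables is missing. A small pre-flight check gives a clearer message. This is only a sketch: read_rpc_settings is a hypothetical helper, and the example values are the ones used earlier in this article.

```python
REQUIRED_VARS = ("TVM_TRACKER_HOST", "TVM_TRACKER_PORT", "TVM_NDK_CC")


def read_rpc_settings(environ):
    """Validate the RPC-related variables and return (host, port, ndk_cc)."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    port = int(environ["TVM_TRACKER_PORT"])
    if not 0 < port < 65536:
        raise ValueError("TVM_TRACKER_PORT out of range: %d" % port)
    return environ["TVM_TRACKER_HOST"], port, environ["TVM_NDK_CC"]


# Example values taken from the shell session earlier in this article.
example_env = {
    "TVM_TRACKER_HOST": "0.0.0.0",
    "TVM_TRACKER_PORT": "7030",
    "TVM_NDK_CC": "/home/gemfield/Android/./Sdk/ndk-bundle/toolchains/llvm/"
                  "prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang++",
}
print(read_rpc_settings(example_env))
```

In the real script, os.environ would be passed instead of example_env.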

A Real Android Project

Install ninja:

sudo apt install ninja-build

Switch to the tvm/apps/android_deploy directory and build the bundled official android_deploy project:

#setup env
export PATH=$PATH:/home/gemfield/Android/Sdk/ndk-bundle/
export ANDROID_HOME=~/Android/Sdk/

#build
gemfield@ThinkPad-X1C:~/projects/tvm/apps/android_deploy$ gradle clean build --no-daemon
......

When the build finishes, apk files are generated:

gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_deploy$ find . -name "*.apk" -exec ls -l {} \+
-rw-rw-r-- 1 gemfield gemfield 91319321 5月  21 18:49 ./app/build/outputs/apk/debug/app-debug.apk
-rw-rw-r-- 1 gemfield gemfield 90565871 5月  21 18:49 ./app/build/outputs/apk/release/app-release-unsigned.apk

The apk must be signed; otherwise adb install reports an error:

gemfield@ThinkPad-X1C:~$ adb install ./app/build/outputs/apk/release/app-release-unsigned.apk
adb: failed to install ./app/build/outputs/apk/release/app-release-unsigned.apk: Failure [INSTALL_PARSE_FAILED_NO_CERTIFICATES: Package /data/app/vmdl1784974544.tmp/base.apk has no certificates at entry AndroidManifest.xml]

Sign it (you may need to generate a certificate beforehand with gen_keystore.sh):

bash dev_tools/sign_apk.sh

Install again with adb:

gemfield@ThinkPad-X1C:~$ adb install -r /home/gemfield/projects/tvm/apps/android_deploy/dev_tools/../app/build/outputs/apk/release/tvmdemo-release.apk

Summary

Compared with Caffe2 in PyTorch, which was used before, inference speed has not turned out noticeably faster or slower so far.

    Original author: Gemfield
    Original URL: https://zhuanlan.zhihu.com/p/58995914