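Background
Recently I have been trying to convert a PyTorch model to TVM and run its forward pass with the TVM framework. In short: export the PyTorch model to ONNX, then convert the ONNX model into a TVM model. Gemfield uses ONNX opset version 9.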
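As a rough sketch of the first step (PyTorch to ONNX), the export could look like the following. The helper name `export_to_onnx`, the output file name, and the 1x3x256x256 dummy input are assumptions for illustration; the input name `gemfield` and the shape match what the conversion script later in this post uses. `torch` is imported lazily so the snippet can be loaded on a machine without PyTorch.

```python
def export_to_onnx(model, onnx_path="gemfield.onnx", input_name="gemfield",
                   shape=(1, 3, 256, 256), opset=9):
    """Export a PyTorch nn.Module to ONNX (hypothetical helper, not part of TVM)."""
    import torch  # lazy import: only needed when the export actually runs

    model.eval()                 # export in inference mode
    dummy = torch.ones(*shape)   # example input used to trace the model
    torch.onnx.export(model, dummy, onnx_path,
                      input_names=[input_name],  # the name the TVM script looks up
                      opset_version=opset)       # opset 9, as used in this post
    return onnx_path
```

Calling `export_to_onnx(my_model)` with your own module (here `my_model` is a placeholder) would write `gemfield.onnx` in the current directory.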
Installing TVM
1. Clone the repository
git clone --recursive https://github.com/dmlc/tvm
2. Install dependencies
sudo apt-get update
sudo apt-get install -y python python-dev python-setuptools gcc \
libtinfo-dev zlib1g-dev build-essential cmake
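3. Install LLVM
LLVM 4.0 or newer is required. The official apt source on Ubuntu 16.04 only goes up to 3.x, while Ubuntu 18.04 is fine (it installs 6.0). If the newest LLVM in your official apt source is older than 4, use LLVM's own apt source (see https://apt.llvm.org/):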
apt install software-properties-common
apt-add-repository "deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-7 main"
apt-add-repository "deb-src http://apt.llvm.org/xenial/ llvm-toolchain-xenial-7 main"
apt-get update
#apt-get install llvm-<version>-dev libclang-<version>-dev clang-<version>
# with version 7, this becomes
apt-get install llvm-7-dev libclang-7-dev clang-7
Gemfield, however, is on KDE Ubuntu 19.04, so it is much simpler:
gemfield@ThinkPad-X1C:~$ sudo apt install clang libclang-dev llvm-dev
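To double-check that whatever got installed meets the version requirement, a small sketch like the one below can compare a version string against the 4.0 minimum. The helper names are made up for this example; the version text would typically come from `llvm-config --version` (on some systems the binary carries a suffix such as `llvm-config-7`).

```python
import re

def parse_llvm_version(text):
    """Extract (major, minor) from text such as '6.0.0' or '7.0.1'."""
    m = re.search(r"(\d+)\.(\d+)", text)
    if m is None:
        raise ValueError("no version number in %r" % text)
    return int(m.group(1)), int(m.group(2))

def llvm_is_new_enough(text, minimum=(4, 0)):
    """True if the reported version is at least the required minimum."""
    return parse_llvm_version(text) >= minimum

# Example with the versions mentioned in the text above.
for version in ("3.8.0", "6.0.0", "7.0.1"):
    print(version, "ok" if llvm_is_new_enough(version) else "too old")
```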
Install Maven to prepare for the Android development later on:
gemfield@ThinkPad-X1C:~$ sudo apt install maven
4. Customize config.cmake
mkdir build
cp cmake/config.cmake build
cd build
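Edit build/config.cmake. It contains a number of feature switches:
USE_CUDA, GPU computing on NVIDIA hardware;
USE_ROCM, general-purpose GPU computing, put forward by AMD (the motive is obvious...);
USE_SDACCEL, FPGA computing;
USE_AOCL, Intel FPGA SDK for OpenCL (AOCL) runtime;
USE_OPENCL, a framework for writing programs for heterogeneous platforms, which may consist of CPUs, GPUs, DSPs, FPGAs, or other processors and hardware accelerators;
USE_METAL, GPU computing on iOS;
USE_VULKAN, the next-generation successor to OpenGL, supported from Android 7.x onward (not supported on iOS, which has its own Metal 2);
USE_OPENGL, the 2D/3D rendering API standard, implemented and supported by the graphics card vendors;
USE_SGX, Intel SGX;
USE_RPC, remote procedure calls, so that a PC and a phone can be debugged together over the network;
USE_STACKVM_RUNTIME, embed stackvm into the runtime;
USE_GRAPH_RUNTIME, enable tiny embedded graph runtime;
USE_GRAPH_RUNTIME_DEBUG, enable additional graph debug functions;
USE_LLVM, llvm support;
USE_BLAS, an API standard for numerical libraries that provide basic linear algebra operations (such as vector or matrix multiplication); the available implementations are openblas, mkl, atlas, and apple;
USE_RANDOM, the contrib.random runtime;
USE_NNPACK;
USE_CUDNN;
USE_CUBLAS;
USE_MIOPEN;
USE_MPS;
USE_ROCBLAS;
USE_SORT, use contrib sort;
USE_ANTLR;
USE_VTA_TSIM;
USE_RELAY_DEBUG, Relay debug mode.
Gemfield only turned on set(USE_LLVM ON), USE_SORT, USE_GRAPH_RUNTIME, and USE_RPC. Everything else stayed off. Why? Some are not needed, and for some I do not yet know what they mean.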
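Instead of editing the file by hand, the switches can also be flipped with a few lines of Python. This is only a sketch: it assumes each option appears in config.cmake as a line of the form `set(USE_XXX OFF)`, and the helper name `enable_options` is made up for this example.

```python
import re

def enable_options(cmake_text, options):
    """Return cmake_text with set(<OPT> OFF) rewritten to set(<OPT> ON)."""
    for opt in options:
        # Require whitespace after the name so that USE_GRAPH_RUNTIME
        # does not also match USE_GRAPH_RUNTIME_DEBUG.
        pattern = r"^(\s*set\(\s*%s\s+)OFF(\s*\))" % re.escape(opt)
        cmake_text = re.sub(pattern, r"\1ON\2", cmake_text, flags=re.M)
    return cmake_text

sample = (
    "set(USE_CUDA OFF)\n"
    "set(USE_LLVM OFF)\n"
    "set(USE_GRAPH_RUNTIME OFF)\n"
    "set(USE_GRAPH_RUNTIME_DEBUG OFF)\n"
    "set(USE_SORT OFF)\n"
)
print(enable_options(sample, ["USE_LLVM", "USE_SORT", "USE_GRAPH_RUNTIME", "USE_RPC"]))
```

For the real file, read build/config.cmake into a string, pass it through `enable_options`, and write the result back. Options that are already ON or missing from the file are simply left untouched.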
5. Build
With LLVM enabled, several hundred compilation units are built in total:
#cmake -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON .. for verbose, by civilnet
cmake ..
make -j4
The following shared libraries are linked at the end:
[ 5%] Linking CXX shared library libvta.so
[ 12%] Linking CXX shared library libtvm_runtime.so
[ 86%] Linking CXX shared library libtvm.so
[ 94%] Linking CXX shared library libtvm_topi.so
[100%] Linking CXX shared library libnnvm_compiler.so
Gemfield will briefly introduce these shared libraries:
1. libvta.so (VTA stands for Versatile Tensor Accelerator; see https://docs.tvm.ai/vta/index.html), built from the following compilation units.
vta/src/device_api.cc
vta/src/runtime.cc
vta/src/sim/sim_driver.cc
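2. libtvm_runtime.so
As the name suggests, this is the TVM runtime. More precisely, it is a minimal TVM runtime library built from the "Minimum runtime related codes", that is, the following source files: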
src/runtime/builtin_fp16.cc
src/runtime/c_dsl_api.cc
src/runtime/c_runtime_api.cc
src/runtime/cpu_device_api.cc
src/runtime/dso_module.cc
src/runtime/file_util.cc
src/runtime/module.cc
src/runtime/module_util.cc
src/runtime/ndarray.cc
src/runtime/registry.cc
src/runtime/system_lib_module.cc
src/runtime/thread_pool.cc
src/runtime/threading_backend.cc
src/runtime/vm/memory_manager.cc
src/runtime/vm/object.cc
src/runtime/vm/vm.cc
src/runtime/workspace_pool.cc
3rdparty/bfloat16/bfloat16.cc
src/runtime/rpc/*.cc
src/runtime/graph/graph_runtime.cc
src/contrib/sort/sort.cc
3. libtvm.so
The complete TVM, made up of the compile-time part, the runtime, the RPC part, and so on:
common: Internal common utilities.
api: API function registration.
lang: The definition of DSL related data structure.
arithmetic: Arithmetic expression and set simplification.
op: The detail implementations about each operation(compute, scan, placeholder).
schedule: The operations on the schedule graph before converting to IR.
pass: The optimization pass on the IR structure.
codegen: The code generator.
runtime: Minimum runtime related codes.
autotvm: The auto-tuning module.
relay: Implementation of Relay. The second generation of NNVM, a new IR for deep learning frameworks.
contrib: Contrib extension libraries.
This library is fairly large, with more than 200 compilation units:
src/api/*.cc
src/arithmetic/*.cc
src/autotvm/*.cc
src/codegen/*.cc
src/lang/*.cc
src/op/*.cc
src/pass/*.cc
src/schedule/*.cc
src/relay/backend/*.cc
src/relay/ir/*.cc
src/relay/op/*.cc
src/relay/pass/*.cc
3rdparty/HalideIR/src/*.cpp
src/runtime/stackvm/*.cc
src/codegen/opt/*.cc
src/codegen/llvm/*.cc
src/runtime/*.cc
src/contrib/hybrid/codegen_hybrid.cc
3rdparty/bfloat16/bfloat16.cc
src/contrib/sort/sort.cc
4. libtvm_topi.so
TOPI (TVM Operator Inventory) is the operator collection library for TVM, intended to share the effort of crafting and optimizing TVM-generated kernels. It is built from the following compilation unit:
topi/src/topi.cc
5. libnnvm_compiler.so
The NNVM compiler, built from the following compilation units:
nnvm/src/c_api/*.cc
nnvm/src/compiler/*.cc
nnvm/src/core/*.cc
nnvm/src/pass/*.cc
nnvm/src/top/nn/*.cc
nnvm/src/top/tensor/*.cc
nnvm/src/top/vision/nms.cc
nnvm/src/top/vision/ssd/mutibox_op.cc
nnvm/src/top/vision/yolo/reorg.cc
nnvm/src/top/image/resize.cc
6. Set PYTHONPATH
export TVM_HOME=/home/gemfield/github/Gemfield/tvm/
export PYTHONPATH=$TVM_HOME/python:$TVM_HOME/topi/python:$TVM_HOME/nnvm/python:${PYTHONPATH}
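A quick way to confirm the export took effect is to compare the three expected directories against the current PYTHONPATH. The sketch below assumes a POSIX-style path layout; the helper names are invented for this example.

```python
import os

def tvm_python_paths(tvm_home):
    """The three source directories that step 6 puts on PYTHONPATH."""
    subdirs = ("python", os.path.join("topi", "python"), os.path.join("nnvm", "python"))
    return [os.path.normpath(os.path.join(tvm_home, sub)) for sub in subdirs]

def missing_from_pythonpath(tvm_home, pythonpath):
    """Return the expected directories that are absent from a PYTHONPATH string."""
    present = {os.path.normpath(p) for p in pythonpath.split(os.pathsep) if p}
    return [p for p in tvm_python_paths(tvm_home) if p not in present]

# Check the live environment; an empty list means everything is in place.
home = os.environ.get("TVM_HOME", "/home/gemfield/github/Gemfield/tvm/")
print(missing_from_pythonpath(home, os.environ.get("PYTHONPATH", "")))
```

`os.path.normpath` is used on both sides so that a trailing slash in TVM_HOME (which produces `tvm//python` in the shell export) does not cause a false mismatch.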
7. Install the Python dependencies
Note that TVM has dropped support for Python 2.
# required dependencies
gemfield@ThinkPad-X1C:~$ pip3 install numpy decorator attrs
# if you want to use the RPC Tracker
gemfield@ThinkPad-X1C:~$ pip3 install tornado
# if you want to use the auto-tuning module
gemfield@ThinkPad-X1C:~$ pip3 install tornado psutil xgboost
Installing onnx
gemfield@ThinkPad-X1C:~$ pip3 install onnx
Starting the conversion
Use the following code:
import onnx
import numpy as np
import tvm
import tvm.relay as relay
onnx_model = onnx.load('gemfield.onnx')
x = np.ones([1,3,256,256])
# arch = "arm64"
# target = "llvm -target=%s-linux-android" % arch
target = 'llvm'
input_name = 'gemfield'
shape_dict = {input_name: x.shape}
sym, params = relay.frontend.from_onnx(onnx_model, shape_dict)
with relay.build_config(opt_level=1):
    intrp = relay.build_module.create_executor('graph', sym, tvm.cpu(0), target)
dtype = 'float32'
tvm_output = intrp.evaluate(sym)(tvm.nd.array(x.astype(dtype)), **params).asnumpy()
with relay.build_config(opt_level=2):
    graph, lib, params = relay.build_module.build(sym, target, params=params)
libpath = "gemfield.so"
lib.export_library(libpath)
graph_json_path = "gemfield.json"
with open(graph_json_path, 'w') as fo:
    fo.write(graph)
param_path = "gemfield.params"
with open(param_path, 'wb') as fo:
    fo.write(relay.save_param_dict(params))
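For now the conversion is blocked on the upsample op; evidently TVM does not support opset 9 yet. A PR has already been created for it.
After this fix, the next error is that grouped kernels are not supported. The new error is:
not support arbitrary group number for now
About a month later this error had been fixed as well, and the conversion can now proceed. On success it produces the following 3 files: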
-rw-rw-r-- 1 gemfield gemfield 124561 5月 17 19:38 gemfield.json
-rw-rw-r-- 1 gemfield gemfield 407658 5月 17 19:38 gemfield.params
-rwxrwxr-x 1 gemfield gemfield 585760 5月 17 19:38 gemfield.so
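The gemfield.so generated this time is an x86-64 shared library that depends only on the basic C library. The op computations of the original network have been turned into C functions such as: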
fused_concatenate
fused_concatenate_1
fused_concatenate_multiply_add_nn_prelu
fused_concatenate_multiply_add_nn_prelu_1
fused_nn_avg_pool2d
fused_nn_avg_pool2d_1
fused_nn_conv2d
fused_nn_conv2d_1
fused_nn_conv2d_add
fused_nn_conv2d_add_2
fused_nn_conv2d_multiply_add
fused_nn_conv2d_multiply_add_add_nn_prelu
fused_nn_conv2d_multiply_add_add_nn_prelu_1
fused_nn_conv2d_multiply_add_nn_prelu
fused_nn_conv2d_multiply_add_nn_prelu_1
fused_nn_pad_1
fused_nn_upsampling
fused_nn_upsampling_concatenate
fused_nn_upsampling_nn_upsampling_nn_upsampling_nn_upsampling_concatenate
...
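These functions are compiled into the shared library file gemfield.so. In addition, gemfield.json describes the neural network structure as JSON, and gemfield.params holds the network weights.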
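To get a feel for how the JSON file and the fused functions relate, the graph can be summarized with the standard library alone. The sketch below assumes the graph-runtime JSON layout as I understand it for this TVM version: a top-level "nodes" list in which inputs and weights have "op" set to "null", and each compiled kernel is a "tvm_op" node whose "attrs" carry a "func_name". The tiny sample graph is made up for illustration.

```python
import json
from collections import Counter

def summarize_graph(graph_json_text):
    """Count placeholder nodes and fused kernels in a graph-runtime JSON string."""
    graph = json.loads(graph_json_text)
    kernels = Counter()
    placeholders = 0
    for node in graph["nodes"]:
        if node["op"] == "null":      # inputs and weights
            placeholders += 1
        else:                         # a call into a function inside the .so
            kernels[node.get("attrs", {}).get("func_name", node["name"])] += 1
    return placeholders, kernels

# A made-up two-kernel graph in the assumed layout.
sample = json.dumps({
    "nodes": [
        {"op": "null", "name": "gemfield", "inputs": []},
        {"op": "null", "name": "p0", "inputs": []},
        {"op": "tvm_op", "name": "fused_nn_conv2d",
         "attrs": {"func_name": "fused_nn_conv2d"}, "inputs": [[0, 0, 0], [1, 0, 0]]},
        {"op": "tvm_op", "name": "fused_nn_upsampling",
         "attrs": {"func_name": "fused_nn_upsampling"}, "inputs": [[2, 0, 0]]},
    ],
    "arg_nodes": [0, 1],
    "heads": [[3, 0, 0]],
})
placeholders, kernels = summarize_graph(sample)
print(placeholders, dict(kernels))
```

On the real artifact this would be `summarize_graph(open("gemfield.json").read())`, and the kernel names should line up with the function list above.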
Forward inference
The gemfield.so built above is loaded through tvm.module. The code below shows how to run forward inference with gemfield.so and the tvm module:
from PIL import Image
import numpy as np
import cv2
import tvm
from tvm.contrib import util, ndk, graph_runtime
import os
loaded_json = open("gemfield.json").read()
loaded_lib = tvm.module.load('gemfield.so')
loaded_params = bytearray(open('gemfield.params', "rb").read())
ctx = tvm.cpu()
module = graph_runtime.create(loaded_json, loaded_lib, ctx)
module.load_params(loaded_params)
files = os.listdir("input/")
mean = [109.496254,118.698456,124.68751]
std = 58.50182
for f in files:
    img_in = cv2.imread("input/"+f)
    img = cv2.resize(img_in,(256, 256))
    img=img.astype(np.float32)
    for j in range(3):
        img[:,:,j]-=mean[j]
    for j in range(3):
        img[:,:,j]/=std
    img/=255
    img=img.transpose((2,0,1))
    img=np.expand_dims(img, axis=0)
    module.set_input("gemfield",img.astype(np.float32))
    module.run()
    img_out = module.get_output(0).asnumpy()
    img = np.argmax(img_out,axis=1)
    img=np.squeeze(img)
    palette=np.array([[0, 0, 0],[128, 0, 0],[0, 128, 0],[128, 128, 0],\
        [0, 0, 128],[128, 0, 128],[0, 128, 128],[128, 128, 128],\
        [64, 0, 0],[192, 0, 0],[64, 128, 0],[192, 128, 0],\
        [64, 0, 128],[192, 0, 128],[64, 128, 128],[192, 128, 128],\
        [0, 64, 0],[128, 64, 0],[0, 192, 0],[128, 192, 0],\
        [0, 64, 128]], dtype='uint8').flatten()
    img=Image.fromarray(img.astype('uint8'))
    img.putpalette(palette)
    #only mask, by gemfield
    img.save("output_mask/"+os.path.splitext(f)[0]+".png")
    #blend with original image
    img1 = Image.open("output_mask/"+os.path.splitext(f)[0]+".png")
    img1 = img1.convert('RGBA')
    img2 = Image.open("input/"+f)
    img2 = img2.resize((256,256))
    img2 = img2.convert('RGBA')
    img = Image.blend(img1, img2, 0.5)
    img.save("output_blend/"+os.path.splitext(f)[0]+".png")
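The 21 hard-coded palette entries in the script appear to follow the bit-interleaving scheme commonly used for the PASCAL VOC segmentation colour map, so they can also be generated rather than typed out. The function below is a sketch of that scheme (the name `voc_colormap` is made up here); it reproduces the same 21 RGB triples.

```python
def voc_colormap(n=21):
    """Generate n RGB triples by spreading the bits of the class index."""
    cmap = []
    for index in range(n):
        r = g = b = 0
        c = index
        for j in range(8):
            # Bits 0, 1, 2 of c feed R, G, B, filled from the most
            # significant colour bit downward.
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        cmap.append([r, g, b])
    return cmap

cmap = voc_colormap()
print(cmap[:4])
```

Flattening the result (for example `[v for rgb in cmap for v in rgb]`) gives a list that `Image.putpalette` accepts in place of the NumPy array.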
Printing the forward-pass speed
A simple performance test can be run on the target device:
import onnx
import numpy as np
import tvm
import tvm.relay as relay
from tvm.contrib import graph_runtime, rpc
import time
onnx_model = onnx.load('gemfield.onnx')
x = np.ones([1,3,256,256])
# target can be "opencl", "llvm", "metal" or any target supported by tvm
# arch = "arm64"
# target = "llvm -target=%s-linux-android" % arch
target = "llvm"
# target = "opencl"
input_name = 'gemfield'
shape_dict = {input_name: x.shape}
sym, params = relay.frontend.from_onnx(onnx_model, shape_dict)
ctx = tvm.context(target, 0)
with relay.build_config(opt_level=0):
    intrp = relay.build_module.create_executor('graph', sym, ctx, target)
with relay.build_config(opt_level=0):
    graph, lib, params = relay.build_module.build(sym, target, params=params)
x = np.ones([1,3,256,256])
dtype = np.float32
module = graph_runtime.create(graph, lib, ctx)
module.set_input('gemfield', tvm.nd.array(x.astype(dtype)))
module.set_input(**params)
module.run()
img_out = module.get_output(0).asnumpy()
print('benchmark by gemfield on cpu')
t1 = time.time()
ftimer = module.module.time_evaluator("run", ctx, 100)
prof_res = ftimer()
print(prof_res)
Output:
ProfileResult(mean=0.02616951292, results=(0.02616951292,))
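The mean reported by the time evaluator is, as far as I know, in seconds per run, so the figure above is easier to read after a small conversion. The helper name below is made up for this example.

```python
def latency_summary(mean_seconds):
    """Convert a mean latency in seconds to (milliseconds, runs per second)."""
    return mean_seconds * 1000.0, 1.0 / mean_seconds

ms, fps = latency_summary(0.02616951292)
print("%.2f ms per run, about %.1f runs/s" % (ms, fps))
```

For the result shown above this works out to roughly 26 ms per forward pass, or about 38 forward passes per second.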
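Continuing the conversion (Android platform)
Now that the model runs on the host, the next step is to run it on a phone. This build needs a different target; here it is set to target = "llvm -target=arm64-linux-android". Besides the target, a cross compiler must also be specified; the one shipped in the NDK can be used directly: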
import onnx
import numpy as np
import tvm
import tvm.relay as relay
onnx_model = onnx.load('gemfield.onnx')
x = np.ones([1,3,256,256])
arch = "arm64"
target = "llvm -target=%s-linux-android" % arch
input_name = 'gemfield'
shape_dict = {input_name: x.shape}
sym, params = relay.frontend.from_onnx(onnx_model, shape_dict)
with relay.build_config(opt_level=0):
    intrp = relay.build_module.create_executor('graph', sym, tvm.cpu(0), target)
with relay.build_config(opt_level=0):
    graph, lib, params = relay.build_module.build(sym, target, params=params)
libpath = "gemfield.so"
#lib.export_library(libpath)
# adjacent string literals are concatenated, so the path contains no stray whitespace
lib.export_library(libpath, cc="/home/gemfield/Android/android-ndk-r19c/"
                   "toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang")
graph_json_path = "gemfield.json"
with open(graph_json_path, 'w') as fo:
    fo.write(graph)
param_path = "gemfield.params"
with open(param_path, 'wb') as fo:
    fo.write(relay.save_param_dict(params))
After a successful build, 3 files are generated, just as on the host earlier. The difference is that this time gemfield.so is an ELF file for ARM aarch64.
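One way to confirm which architecture a generated .so targets, without any extra tools, is to read the `e_machine` field of its ELF header. The sketch below relies on the standard ELF layout (magic bytes, byte-order flag at offset 5, `e_machine` as a 16-bit value at offset 18) and only names the two machine codes relevant here: 62 for x86-64 and 183 for AArch64. The number 183 is the same one that appears as `EM: 183` in the linker error quoted in Note 2. The function names are made up for this example.

```python
import struct

EM_NAMES = {62: "x86-64", 183: "AArch64"}   # e_machine values from the ELF spec

def elf_machine(header):
    """Return the architecture name from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    endian = "<" if header[5] == 1 else ">"   # EI_DATA: 1 means little-endian
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return EM_NAMES.get(machine, "unknown (%d)" % machine)

def elf_machine_of(path):
    """Convenience wrapper that reads the header from a file on disk."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))

# A hand-built little-endian 64-bit header with e_machine = 183.
fake = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 183)
print(elf_machine(fake))
```

On the real artifact this would be `elf_machine_of("gemfield.so")`.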
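Note 1: if you set a target that differs from the host architecture, such as arm64-linux-android, but then evaluate on the host, you get this error: TVMError: Cannot run module, architecture mismatch module=arm64-linux-android system=x86_64-pc-linux-gnu. In that case the right thing to do is to stop evaluating on the host.
Note 2: if the target is arm64-linux-android but the specified compiler is not for that architecture, you get an error like this: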
/usr/bin/ld: /tmp/tmpdepff9wm/lib.o: Relocations in generic ELF (EM: 183)
/usr/bin/ld: /tmp/tmpdepff9wm/lib.o: Relocations in generic ELF (EM: 183)
/usr/bin/ld: /tmp/tmpdepff9wm/lib.o: error adding symbols: file in wrong format
collect2: error: ld returned 1 exit status
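Since the error above comes from handing the object file to the host linker, it helps to build the cross compiler path programmatically rather than typing it. The sketch below assembles the path in the layout used by the NDK r19c command shown earlier; the helper name and its defaults are assumptions, and the simple naming pattern only covers triples such as aarch64 and x86_64 (32-bit ARM uses a different name).

```python
import os

def ndk_clang(ndk_root, arch="aarch64", api=28, host="linux-x86_64", cxx=False):
    """Build the path of the NDK clang wrapper for a given arch and API level."""
    name = "%s-linux-android%d-clang%s" % (arch, api, "++" if cxx else "")
    return os.path.join(ndk_root, "toolchains", "llvm", "prebuilt", host, "bin", name)

cc = ndk_clang("/home/gemfield/Android/android-ndk-r19c")
print(cc)
# Whether the compiler actually exists depends on the local NDK install.
print("found" if os.path.isfile(cc) else "not found on this machine")
```

The resulting string can be passed as the `cc` argument of `export_library`, as in the script above.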
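Building and installing TVM4J
TVM4J is TVM for Java, the Java frontend of TVM. To use TVM's Java API you need to build the TVM4J library (a jar). If you need RPC, TVM4J also provides a simple API for the RPC server and client.
1. Build TVM4J with the make jvmpkg command; this downloads many Maven pom files: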
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm$ make jvmpkg
Note that TVM currently supports only OpenJDK 8. With OpenJDK 11 you will hit the following error:
[ERROR] /home/gemfield/github/Gemfield/tvm/jvm/core/src/main/java/ml/dmlc/tvm/rpc/Server.java:[20,16] cannot find symbol
symbol: class SharedSecrets
location: package sun.misc
And this one:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.3:testCompile (default-testCompile) on project tvm4j-core: Compilation failure: Compilation failure:
[ERROR] /home/gemfield/github/Gemfield/tvm/jvm/core/src/test/java/ml/dmlc/tvm/contrib/GraphRuntimeTest.java:[46,5] reference to Module is ambiguous
[ERROR] both class ml.dmlc.tvm.Module in ml.dmlc.tvm and class java.lang.Module in java.lang match
Fixing these one by one just keeps turning up further incompatibilities, so Gemfield simply downgraded to OpenJDK 8.
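To catch the wrong JDK before starting the Maven build, the major version can be extracted from the text that `java -version` prints. This is a sketch with a made-up helper name; it handles both the old `1.8.0_xxx` numbering and the newer `11.0.x` style.

```python
import re

def java_major(version_output):
    """Return the Java major version from `java -version` style output."""
    m = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if m is None:
        raise ValueError("cannot find a version string")
    first, second = int(m.group(1)), m.group(2)
    # Releases up to Java 8 report themselves as 1.x (for example "1.8.0_212").
    if first == 1 and second is not None:
        return int(second)
    return first

for line in ('openjdk version "1.8.0_212"', 'openjdk version "11.0.3" 2019-04-16'):
    major = java_major(line)
    print(major, "ok for TVM4J" if major == 8 else "expect build errors")
```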
A successful build prints:
[INFO]
[INFO] --- maven-javadoc-plugin:2.9.1:jar (attach-javadocs) @ tvm4j-full-linux-x86_64 ---
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for TVM4J Package - Parent 0.0.1-SNAPSHOT:
[INFO]
[INFO] TVM4J Package - Parent ............................. SUCCESS [ 1.510 s]
[INFO] TVM4J Package - Core ............................... SUCCESS [ 3.713 s]
[INFO] TVM4J Package - Native Parent ...................... SUCCESS [ 0.025 s]
[INFO] TVM4J Package - Native Linux-x86_64 ................ SUCCESS [ 58.494 s]
[INFO] TVM4J Package - Full Parent ........................ SUCCESS [ 0.028 s]
[INFO] TVM4J Package - Full Linux-x86_64 .................. SUCCESS [01:47 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:51 min
[INFO] Finished at: 2019-05-20T21:47:02+08:00
[INFO] ------------------------------------------------------------------------
2. Install
Use the make jvminstall command to install the TVM4J jars under $HOME/.m2/repository/:
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm$ make jvminstall
......
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/pom.xml to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-parent/0.0.1-SNAPSHOT/tvm4j-parent-0.0.1-SNAPSHOT.pom
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/target/tvm4j-parent-0.0.1-SNAPSHOT-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-parent/0.0.1-SNAPSHOT/tvm4j-parent-0.0.1-SNAPSHOT-javadoc.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/core/target/tvm4j-core-0.0.1-SNAPSHOT.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-core/0.0.1-SNAPSHOT/tvm4j-core-0.0.1-SNAPSHOT.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/core/pom.xml to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-core/0.0.1-SNAPSHOT/tvm4j-core-0.0.1-SNAPSHOT.pom
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/core/target/tvm4j-core-0.0.1-SNAPSHOT-sources.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-core/0.0.1-SNAPSHOT/tvm4j-core-0.0.1-SNAPSHOT-sources.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/core/target/tvm4j-core-0.0.1-SNAPSHOT-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-core/0.0.1-SNAPSHOT/tvm4j-core-0.0.1-SNAPSHOT-javadoc.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/core/target/tvm4j-core-0.0.1-SNAPSHOT-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-core/0.0.1-SNAPSHOT/tvm4j-core-0.0.1-SNAPSHOT-javadoc.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/native/pom.xml to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-native-parent/0.0.1-SNAPSHOT/tvm4j-native-parent-0.0.1-SNAPSHOT.pom
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/native/target/tvm4j-native-parent-0.0.1-SNAPSHOT-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-native-parent/0.0.1-SNAPSHOT/tvm4j-native-parent-0.0.1-SNAPSHOT-javadoc.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/native/linux-x86_64/target/libtvm4j-linux-x86_64.so to /home/gemfield/.m2/repository/ml/dmlc/tvm/libtvm4j-linux-x86_64/0.0.1-SNAPSHOT/libtvm4j-linux-x86_64-0.0.1-SNAPSHOT.so
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/native/linux-x86_64/pom.xml to /home/gemfield/.m2/repository/ml/dmlc/tvm/libtvm4j-linux-x86_64/0.0.1-SNAPSHOT/libtvm4j-linux-x86_64-0.0.1-SNAPSHOT.pom
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/native/linux-x86_64/target/libtvm4j-linux-x86_64-sources.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/libtvm4j-linux-x86_64/0.0.1-SNAPSHOT/libtvm4j-linux-x86_64-0.0.1-SNAPSHOT-sources.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/native/linux-x86_64/target/libtvm4j-linux-x86_64-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/libtvm4j-linux-x86_64/0.0.1-SNAPSHOT/libtvm4j-linux-x86_64-0.0.1-SNAPSHOT-javadoc.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/assembly/pom.xml to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-full-parent/0.0.1-SNAPSHOT/tvm4j-full-parent-0.0.1-SNAPSHOT.pom
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/assembly/target/tvm4j-full-parent-0.0.1-SNAPSHOT-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-full-parent/0.0.1-SNAPSHOT/tvm4j-full-parent-0.0.1-SNAPSHOT-javadoc.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/assembly/linux-x86_64/target/tvm4j-full-linux-x86_64-0.0.1-SNAPSHOT.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-full-linux-x86_64/0.0.1-SNAPSHOT/tvm4j-full-linux-x86_64-0.0.1-SNAPSHOT.jar
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/assembly/linux-x86_64/pom.xml to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-full-linux-x86_64/0.0.1-SNAPSHOT/tvm4j-full-linux-x86_64-0.0.1-SNAPSHOT.pom
[INFO] Installing /home/gemfield/github/Gemfield/tvm/jvm/assembly/linux-x86_64/target/tvm4j-full-linux-x86_64-0.0.1-SNAPSHOT-javadoc.jar to /home/gemfield/.m2/repository/ml/dmlc/tvm/tvm4j-full-linux-x86_64/0.0.1-SNAPSHOT/tvm4j-full-linux-x86_64-0.0.1-SNAPSHOT-javadoc.jar
Installing the RPC app on an Android phone
This step depends on TVM4J.
1. Set the Android SDK path:
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ export ANDROID_HOME=~/Android/Sdk/
# optional, if ndk-build cannot be found
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ export PATH=$PATH:/home/gemfield/Android/Sdk/ndk-bundle/
2. Build the RPC APK
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ gradle clean build
......
16:12:24.695 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] FAILURE: Build failed with an exception.
16:12:24.695 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]
16:12:24.695 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] * What went wrong:
16:12:24.695 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] Gradle build daemon disappeared unexpectedly (it may have been killed or may have crashed)
# if the error above appears, add the --no-daemon flag, as shown below:
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ gradle clean build --no-daemon
After a successful build, the following apk files are generated:
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ find . -name "*.apk" -exec ls -l {} \+
-rw-rw-r-- 1 gemfield gemfield 4098714 May 21 16:25 ./app/build/outputs/apk/debug/app-debug.apk
-rw-rw-r-- 1 gemfield gemfield 3576289 May 21 16:25 ./app/build/outputs/apk/release/app-release-unsigned.apk
Generate Gemfield's certificate:
keytool -genkey -keystore /home/gemfield/github/Gemfield/tvm/apps/android_rpc/dev_tools/tvmrpc.keystore -alias tvmrpc -keyalg RSA -validity 10000
Sign the apk generated above (note that this uses a self-signed certificate):
jarsigner -keystore /home/gemfield/github/Gemfield/tvm/apps/android_rpc/dev_tools/tvmrpc.keystore \
-signedjar /home/gemfield/github/Gemfield/tvm/apps/android_rpc/dev_tools/../app/build/outputs/apk/release/tvmrpc-release.apk \
/home/gemfield/github/Gemfield/tvm/apps/android_rpc/dev_tools/../app/build/outputs/apk/release/app-release-unsigned.apk 'tvmrpc'
Then install the apk on your Android phone:
gemfield@ThinkPad-X1C:~$ adb install -r /home/gemfield/github/Gemfield/tvm/apps/android_rpc/dev_tools/../app/build/outputs/apk/release/tvmrpc-release.apk
Success
Debugging the model's ops on Android over RPC
1. Run the RPC tracker on the host:
gemfield@ThinkPad-X1C:~$ python -m tvm.exec.rpc_tracker --port 7030
INFO:root:If you are running ROCM/Metal, fork will cause compiler internal error. Try to launch with arg ```--no-fork```
INFO:RPCTracker:bind to 0.0.0.0:7030
This service listens on port 7030 of the host.
2. Open the TVM RPC app on the Android phone and fill in the following:
Address:192.168.31.74
Port:7030
Key:android
Then enable RPC.
3. Check the RPC connection information on the host:
gemfield@ThinkPad-X1C:~$ python -m tvm.exec.query_rpc_tracker --port 7030
Tracker address localhost:7030
Server List
----------------------------
server-address key
----------------------------
192.168.31.5:55630 server:android
----------------------------
Queue Status
-------------------------------
key total free pending
-------------------------------
android 1 1 0
-------------------------------
4. Run tests/android_rpc_test.py
This compiles TVM IR into a shared library, uploads the library to the Android phone, and runs it there:
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ export TVM_TRACKER_HOST=0.0.0.0
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ export TVM_TRACKER_PORT=7030
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ export TVM_NDK_CC=/home/gemfield/Android/./Sdk/ndk-bundle/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang++
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_rpc$ python3 tests/android_rpc_test.py
Run CPU test ...
5.18708e-05 secs/op
Running the ONNX model on an Android phone over RPC
First set the environment variables TVM_TRACKER_HOST, TVM_TRACKER_PORT, and TVM_NDK_CC, then run the script below:
import os
import sys
import onnx
import numpy as np
import tvm
import tvm.relay as relay
from tvm.contrib import graph_runtime, rpc
from time import time
from tvm.contrib import util, ndk
opt_level = 0
num_iter = 100
dtype = np.float32
onnx_model = onnx.load('gemfield.onnx')
x = np.ones([1,3,256,256])
# target can be "opencl", "llvm", "metal" or any target supported by tvm
arch = "arm64"
target = "llvm -target=%s-linux-android" % arch
# target = "llvm"
# target = "opencl"
input_name = 'gemfield'
shape_dict = {input_name: x.shape}
sym, params = relay.frontend.from_onnx(onnx_model, shape_dict)
tracker_host = os.environ["TVM_TRACKER_HOST"]
tracker_port = int(os.environ["TVM_TRACKER_PORT"])
key = "android"
tracker = rpc.connect_tracker(tracker_host, tracker_port)
remote = tracker.request(key, priority=0,session_timeout=60)
target_host = None
ctx = remote.cpu(0)
with relay.build_config(opt_level=0):
    graph, lib, params = relay.build_module.build(sym, target, params=params)
so_name = "gemfield.so"
temp = util.tempdir()
path_so = temp.relpath(so_name)
lib.export_library(path_so, ndk.create_shared)
print("gemfield upload file: ", path_so)
remote.upload(path_so)
rlib = remote.load_module(so_name)
### run on remote device
x = np.ones([1,3,256,256])
rmodule = graph_runtime.create(graph, rlib, ctx)
rmodule.set_input('gemfield', tvm.nd.array(x.astype(dtype)))
rmodule.set_input(**params)
rmodule.run()
img_out = rmodule.get_output(0).asnumpy()
print('benchmark by gemfield on Android cpu')
ftimer = rmodule.module.time_evaluator("run", ctx, num_iter)
prof_res = ftimer()
print(prof_res)
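The script above indexes `os.environ` directly, so a forgotten variable only shows up as a bare KeyError. A small pre-flight check along the following lines can give a clearer message. It is a sketch: the function name is made up, and it validates only that the three variables named in this section are set and that the port is a sensible number.

```python
import os

REQUIRED = ("TVM_TRACKER_HOST", "TVM_TRACKER_PORT", "TVM_NDK_CC")

def read_rpc_settings(env=None):
    """Validate the three variables and return (host, port, ndk_cc)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("please export: " + ", ".join(missing))
    port = int(env["TVM_TRACKER_PORT"])
    if not 0 < port < 65536:
        raise ValueError("TVM_TRACKER_PORT out of range: %d" % port)
    return env["TVM_TRACKER_HOST"], port, env["TVM_NDK_CC"]

# Example with an explicit dict instead of the live environment.
example = {
    "TVM_TRACKER_HOST": "0.0.0.0",
    "TVM_TRACKER_PORT": "7030",
    "TVM_NDK_CC": "/home/gemfield/Android/Sdk/ndk-bundle/toolchains/llvm/"
                  "prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang++",
}
print(read_rpc_settings(example)[:2])
```

Calling `read_rpc_settings()` with no argument at the top of the script checks the real environment before any compilation starts.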
A real Android project
Install ninja:
sudo apt install ninja-build
Switch to the tvm/apps/android_deploy directory and build the official android_deploy project:
#setup env
export PATH=$PATH:/home/gemfield/Android/Sdk/ndk-bundle/
export ANDROID_HOME=~/Android/Sdk/
#build
gemfield@ThinkPad-X1C:~/projects/tvm/apps/android_deploy$ gradle clean build --no-daemon
......
When the build finishes, apk files are generated:
gemfield@ThinkPad-X1C:~/github/Gemfield/tvm/apps/android_deploy$ find . -name "*.apk" -exec ls -l {} \+
-rw-rw-r-- 1 gemfield gemfield 91319321 5月 21 18:49 ./app/build/outputs/apk/debug/app-debug.apk
-rw-rw-r-- 1 gemfield gemfield 90565871 5月 21 18:49 ./app/build/outputs/apk/release/app-release-unsigned.apk
The apk must be signed, otherwise adb install reports an error:
gemfield@ThinkPad-X1C:~$ adb install ./app/build/outputs/apk/release/app-release-unsigned.apk
adb: failed to install ./app/build/outputs/apk/release/app-release-unsigned.apk: Failure [INSTALL_PARSE_FAILED_NO_CERTIFICATES: Package /data/app/vmdl1784974544.tmp/base.apk has no certificates at entry AndroidManifest.xml]
Sign it (you may need to generate a certificate with gen_keystore.sh beforehand):
bash dev_tools/sign_apk.sh
Install again with adb:
gemfield@ThinkPad-X1C:~$ adb install -r /home/gemfield/projects/tvm/apps/android_deploy/dev_tools/../app/build/outputs/apk/release/tvmdemo-release.apk
Summary
Compared with the earlier Caffe2 (from PyTorch) setup, inference with TVM has so far not turned out noticeably faster or slower.