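Passing large arrays from C++ into Python is a real pain point, and I could not find a reliable, simple method online. After discussing the problem with the Boost.Python developers, I arrived at the following solution.
It uses the Boost.Python library: https://github.com/boostorg/python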
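Environment setup:
- Download the Boost C++ libraries: http://www.boost.org/
- Extract the Boost archive and run .\bootstrap.bat in its root directory. This generates the build tools b2.exe and bjam.exe.
- Edit project-config.jam to add the Python version and paths (if omitted, Python 2 is used by default):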
import option ;
using msvc ;
option.set keep-going : false ;
using python
: 3.6 # Version
: D:\\Anaconda3\\python.exe # Python Path
: D:\\Anaconda3\\include # include path
: D:\\Anaconda3\\libs # lib path(s)
;
- Run the command
.\bjam.exe toolset=msvc-14.0 --with-python threading=multi link=shared address-model=64
This generates files named boost_numpy3-* and boost_python3-* in the stage\lib directory. These are the libraries to link against when building the C++ program.
- Create a new C++ project. Under Project → Project Property → Configuration Properties → VC++ Directories, Library Directories must include D:\Anaconda3\libs and D:\boost\boost_1_66_0\stage\lib, and Include Directories must include D:\Anaconda3\include and D:\boost\boost_1_66_0\build\include\boost-1_66
C++ code:
#include <boost/python.hpp>
#include <boost/python/numpy.hpp>
#include <cstdint>
#include <iostream>
namespace p = boost::python;
namespace np = boost::python::numpy;
int main(int argc, char *argv[]) {
    // Initialize the Python interpreter
    Py_Initialize();
    // Import the Python module
    p::object pModule = p::import("mine");
    // Look up the Python function
    p::object func1 = pModule.attr("func1");
    // Initialize NumPy
    np::initialize();
    // Wrap a C array in an ndarray (strides are given in bytes)
    uint8_t data_in_c[] = {1,2,3,4,5,6,7,8,1,3,5,7};
    p::tuple shape = p::make_tuple(3, 4);
    p::tuple stride = p::make_tuple(4, 1);
    np::dtype dt1 = np::dtype::get_builtin<uint8_t>();
    np::ndarray data_from_c = np::from_data(data_in_c, dt1,
                                            shape, stride, p::object());
    // Print the ndarray from C++
    std::cout << "C++ ndarray:" << std::endl;
    std::cout << p::extract<char const *>(p::str(data_from_c)) << std::endl;
    std::cout << std::endl;
    // Call the Python function with the ndarray and take back the result
    p::object data_obj_from_python = func1(data_from_c);
    // The return value is a p::object; convert it to an np::ndarray
    np::ndarray data = np::array(data_obj_from_python);
    // Get the ndarray's data pointer.
    // data.get_data() returns a char *, so it has to be cast to the
    // element type of the array (float64 here, i.e. double).
    // See: https://github.com/boostorg/python/blob/develop/include/boost/python/numpy/ndarray.hpp#L41-L55
    double *pp = reinterpret_cast<double*>(data.get_data());
    // Print the array returned from Python, then read it through the raw pointer
    std::cout << "data from python:" << std::endl;
    std::cout << p::extract<char const *>(p::str(data)) << std::endl;
    std::cout << std::endl;
    std::cout << "pointer:" << std::endl;
    for (int i = 0; i < 9; i++) {
        std::cout << *(pp+i) << std::endl;
    }
}
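The stride tuple handed to np::from_data is expressed in bytes, which is why a 3x4 uint8_t array uses (4, 1). As a rough sketch of how those numbers come about for a row-major array, the helper below derives byte strides from a shape and an element size. The name c_contiguous_strides is hypothetical and not part of Boost.Python; it only illustrates the arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Boost.Python): byte strides of a
// row-major (C-contiguous) array with the given shape and element size.
std::vector<std::size_t> c_contiguous_strides(const std::vector<std::size_t>& shape,
                                              std::size_t itemsize) {
    assert(itemsize > 0);
    std::vector<std::size_t> strides(shape.size());
    std::size_t step = itemsize;  // the last axis advances one element at a time
    for (std::size_t k = shape.size(); k-- > 0;) {
        strides[k] = step;
        step *= shape[k];  // each outer axis skips one whole inner block
    }
    return strides;
}
```

For the array in the example, c_contiguous_strides({3, 4}, sizeof(uint8_t)) gives the same (4, 1) that was written by hand; for a 3x3 array of double the byte strides would be (24, 8).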
Python code, mine.py (must be in the same directory as the compiled exe):
import numpy as np
data = np.random.random((3,3))
def func1(data_from_C):
    print("data_from_C++.shape:")
    print(data_from_C.shape)
    print('')
    print("data_from_C++:")
    print(data_from_C)
    print('')
    print("data.type:")
    print(data.dtype)
    print('')
    print("data in python")
    print(data)
    print('')
    return data
Output:
C++ ndarray:
[[1 2 3 4]
[5 6 7 8]
[1 3 5 7]]
data_from_C++.shape:
(3, 4)
data_from_C++:
[[1 2 3 4]
[5 6 7 8]
[1 3 5 7]]
data.type:
float64
data in python
[[ 0.70759695 0.39755579 0.9951812 ]
[ 0.97369017 0.57502282 0.25125566]
[ 0.92008613 0.74213496 0.64438868]]
data from python:
[[ 0.70759695 0.39755579 0.9951812 ]
[ 0.97369017 0.57502282 0.25125566]
[ 0.92008613 0.74213496 0.64438868]]
pointer:
0.707597
0.397556
0.995181
0.97369
0.575023
0.251256
0.920086
0.742135
0.644389
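This method passes the data by pointer: the ndarray seen by Python shares its memory with the C++ array, so nothing is copied when the array is handed over. Because np::from_data only wraps the existing buffer, the C++ array must stay alive for as long as the ndarray is in use. The overhead of the call and the return is small enough to ignore, and the approach is simpler to use than the other methods I have found online.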
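Since the raw pointer refers to the ndarray's own memory, reading through it has to follow the array's layout. The loop in the example assumes a contiguous float64 array; a slightly more defensive sketch reads each element at its byte offset. The function read_double_2d is a hypothetical helper, not a Boost.Python API, and it assumes the element type really is float64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: read element (i, j) of a 2-D float64 array through a
// raw byte pointer, using byte strides instead of assuming contiguity.
double read_double_2d(const char* base, std::size_t stride0, std::size_t stride1,
                      std::size_t i, std::size_t j) {
    assert(base != nullptr);
    double value;
    // memcpy sidesteps alignment and aliasing concerns of a char* to double* cast.
    std::memcpy(&value, base + i * stride0 + j * stride1, sizeof(double));
    return value;
}
```

In the program above, base would presumably be data.get_data(), with the strides taken from data.strides(0) and data.strides(1). With those values the accessor also works for a transposed or sliced array, where stepping through the pointer linearly would read the elements in the wrong order.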
Note: if the following error appears at run time:
Fatal Python error: Py_Initialize: unable to load the file system codec.
ImportError: No module named 'encodings'
add stage\lib to the system environment variable PATH, then set the system environment variable PYTHONPATH to D:\Anaconda3\DLLs;D:\Anaconda3\Lib\site-packages;D:\Anaconda3\Lib