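I am currently writing a library, and I would like to let users define a function (declared restrict(amp)) and pass it to one of my library functions for use inside a concurrency::parallel_for_each loop. For example: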
template <typename T, typename Func>
void Foo( const concurrency::array_view<const T>& avParam, Func f )
{
    concurrency::array<T, 1> arrResult( avParam.extent );
    concurrency::parallel_for_each( avParam.extent, [=, &arrResult]( concurrency::index<1> index ) restrict(amp) {
        arrResult[index] = f( avParam[index] );
    } );
    // Do stuff...
}
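I would expect this to work provided f is declared as a valid AMP-compatible function, because if I replace the function pointer f in the kernel directly with the function itself, everything works as expected. However, using f results in the following error: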
Function pointer, function reference, or pointer to member function is not supported.
Is there any way to get the behavior I want without preventing my users from using functors other than lambdas?
Best answer
References and pointers (to a compatible type) may be used locally but cannot be captured by a lambda. Function pointers, pointer-to-pointer, and the like are not allowed; neither are static or global variables.
Classes must meet more rules if you wish to use instances of them. They must have no virtual functions or virtual inheritance. Constructors, destructors, and other non-virtual functions are allowed. The member variables must all be of compatible types, which could of course include instances of other classes as long as those classes meet the same rules.
The actual code in your amp-compatible function is not running on a CPU and therefore can’t do certain things that you might be used to doing:
- recursion
- pointer casting
- use of virtual functions
- new or delete
- RTTI or dynamic casting
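Two of the rules quoted above (no function pointers, no virtual functions) can be approximated at compile time in standard C++. The following is only a rough sketch: is_amp_friendly_callable is a hypothetical helper name that is not part of C++ AMP, AddTwo, VirtualOp and PlainFunction are made-up examples, and the check does not replace the compiler's own restrict(amp) validation (it says nothing about member types, recursion, and so on).

```cpp
#include <type_traits>

// Hypothetical helper, NOT part of C++ AMP: true when Func (after decay) is a
// class type without virtual functions. Function pointers fail is_class;
// classes with virtual members fail the is_polymorphic test.
template <typename Func>
struct is_amp_friendly_callable
    : std::integral_constant<bool,
          std::is_class<typename std::decay<Func>::type>::value &&
          !std::is_polymorphic<typename std::decay<Func>::type>::value>
{
};

// Plain functor: a class type with a non-virtual call operator.
struct AddTwo
{
    int operator()(int v) const { return v + 2; }
};

// Functor with virtual functions: violates the "no virtual functions" rule.
struct VirtualOp
{
    virtual int operator()(int v) const { return v; }
    virtual ~VirtualOp() {}
};

// Free function: passing it by name to a template yields a function pointer.
int PlainFunction(int v) { return v + 2; }

static_assert(is_amp_friendly_callable<AddTwo>::value,
              "a plain functor passes the check");
static_assert(!is_amp_friendly_callable<VirtualOp>::value,
              "virtual functions are rejected");
static_assert(!is_amp_friendly_callable<decltype(&PlainFunction)>::value,
              "function pointers are rejected");
```

A library could place a similar static_assert at the top of a function such as Foo to produce a friendlier diagnostic than the one emitted from inside the kernel, although whether that is worthwhile is a design choice.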
You should write your library in terms of functors or lambdas, because these can be called from restrict(amp) kernels. You could do the following:
#include <amp.h>
#include <tchar.h>
#include <vector>

template <typename T, typename Func>
void Foo(const concurrency::array_view<const T>& avParam, Func f)
{
    concurrency::array<T, 1> arrResult(avParam.extent);
    // f is a class-type functor captured by value, so the kernel can call it.
    concurrency::parallel_for_each(avParam.extent, [=, &arrResult](concurrency::index<1> index) restrict(amp)
    {
        arrResult[index] = f(avParam[index]);
    });
    // Do stuff...
}

// Functor whose call operator is declared restrict(amp).
template <typename T>
class Bar
{
public:
    T operator()(const T& v) const restrict(amp)
    {
        return v + 2;
    }
};

int _tmain(int argc, _TCHAR* argv[])
{
    std::vector<int> result(100, 0);
    // array_view lives in namespace concurrency; its extent is an int.
    concurrency::array_view<const int, 1> result_av(static_cast<int>(result.size()), result);
    Foo(result_av, Bar<int>());
    return 0;
}
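One way to think about this is that a functor or lambda effectively creates a container that the compiler can verify has no dependencies and that the C++ AMP runtime can instantiate on the GPU. This would be much harder to achieve with function pointers.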
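The difference can be illustrated in standard C++, with restrict(amp) left out so the sketch compiles with any compiler. Apply is a hypothetical CPU stand-in for the library function, and AddTwoFn and AddTwo are made-up callables. The sketch shows that passing a named function deduces Func as a function pointer (the case the kernel rejects), while a functor, or a lambda wrapping the function on the caller's side, deduces as a class type. In real C++ AMP code the wrapping lambda and the wrapped function would both need to be declared restrict(amp).

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Free function: its name decays to int (*)(int) when passed by value.
int AddTwoFn(int v) { return v + 2; }

// Functor: a class type whose call is resolved at compile time.
struct AddTwo
{
    int operator()(int v) const { return v + 2; }
};

// Hypothetical CPU stand-in for the library function. It applies f to each
// element and records whether Func was deduced as a pointer type.
template <typename T, typename Func>
std::vector<T> Apply(const std::vector<T>& in, Func f, bool* sawFunctionPointer)
{
    *sawFunctionPointer = std::is_pointer<Func>::value;
    std::vector<T> out;
    out.reserve(in.size());
    for (const T& v : in)
    {
        out.push_back(f(v));
    }
    return out;
}

void Demo()
{
    const std::vector<int> in{1, 2, 3};
    bool isPtr = false;

    Apply(in, AddTwoFn, &isPtr);  // Func = int (*)(int)
    assert(isPtr);

    Apply(in, AddTwo(), &isPtr);  // Func = AddTwo
    assert(!isPtr);

    // Caller-side workaround: wrap the free function in a lambda so that
    // Func is the lambda's closure type rather than a pointer.
    const std::vector<int> out =
        Apply(in, [](int v) { return AddTwoFn(v); }, &isPtr);
    assert(!isPtr);
    assert(out[0] == 3 && out[1] == 4 && out[2] == 5);
}
```

With this shape the library keeps its template signature unchanged, and users who only have a free function pay one extra line at the call site.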