I have a 4-core machine and an OpenMP application with two heavy tasks.
int main()
{
    #pragma omp parallel sections
    {
        #pragma omp section
        WeightyTask1();
        #pragma omp section
        WeightyTask2();
    }
    return 0;
}
Each task contains a heavy part like this:
#pragma omp parallel for
for (int i = 0; i < N; i++)
{
    ...
}
I compile the program with the -fopenmp flag and export OMP_NUM_THREADS=4.
The problem is that only two cores are loaded. How can I use all cores in my tasks?
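Best answer:
My initial reaction was: you have to declare more parallelism.
You have defined two tasks that can run in parallel. Any attempt by OpenMP to run them on more than two cores would slow you down (because of cache locality and possible false sharing).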
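To make that point concrete, here is a minimal sketch (not the asker's code) in which each section only records the ID of the thread that ran it. The function name threads_used_by_two_sections and the num_threads(4) clause are assumptions for illustration; the serial fallback only exists so the sketch also builds without -fopenmp.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch also builds without -fopenmp. */
static int omp_get_thread_num(void) { return 0; }
#endif

/* Hypothetical helper: returns how many distinct threads
 * executed one of the two sections (1 or 2). */
int threads_used_by_two_sections(void)
{
    int id1 = -1, id2 = -1;

    /* Request a team of 4, as OMP_NUM_THREADS=4 would. */
    #pragma omp parallel sections num_threads(4)
    {
        #pragma omp section
        id1 = omp_get_thread_num();   /* stands in for WeightyTask1() */

        #pragma omp section
        id2 = omp_get_thread_num();   /* stands in for WeightyTask2() */
    }

    return (id1 == id2) ? 1 : 2;
}
```

Even when the team has four threads, there are only two sections to hand out, so the function can never report more than two working threads. The other threads in the team simply wait at the end of the region.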
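Edit: If the parallel for loops are of any significant volume (say, not fewer than 8 iterations), and you are not seeing more than 2 cores used, look at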
> omp_set_nested()
> OMP_NESTED=TRUE|FALSE environment variable
This environment variable enables or disables nested parallelism. The setting of this environment variable can be overridden by calling the omp_set_nested() runtime library function. If nested parallelism is disabled, nested parallel regions are serialized and run in the current thread.
In the current implementation, nested parallel regions are always serialized. As a result, OMP_SET_NESTED does not have any effect, and omp_get_nested() always returns 0. If the -qsmp=nested_par option is on (only in non-strict OMP mode), nested parallel regions may employ additional threads as available. However, no new team will be created to run nested parallel regions.
The default value for OMP_NESTED is FALSE.
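As a rough sketch of how the question's layout could use all four cores on a runtime that does support nesting, the code below enables nested parallelism and gives each section an inner team of two threads. The names weighty_task and run_both_tasks, the value of N, and the summing loop body are placeholders, not the asker's code. omp_set_nested() is the call named above; it is deprecated since OpenMP 5.0 in favor of omp_set_max_active_levels(), so the sketch calls both, which may produce a deprecation warning on newer compilers.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define N 1000

/* Hypothetical stand-in for WeightyTask1()/WeightyTask2(): sums 0..N-1. */
static long weighty_task(void)
{
    long sum = 0;

    /* Inner region: with nesting enabled it gets its own team of 2 threads;
     * with nesting disabled it is serialized on the calling thread. */
    #pragma omp parallel for num_threads(2) reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += i;

    return sum;
}

/* Runs both tasks as parallel sections and returns the combined result. */
long run_both_tasks(void)
{
    long r1 = 0, r2 = 0;

#ifdef _OPENMP
    omp_set_nested(1);              /* deprecated since OpenMP 5.0 */
    omp_set_max_active_levels(2);   /* allow two active levels of parallelism */
#endif

    /* Outer region: one thread per section. */
    #pragma omp parallel sections num_threads(2)
    {
        #pragma omp section
        r1 = weighty_task();

        #pragma omp section
        r2 = weighty_task();
    }

    return r1 + r2;
}
```

Calling run_both_tasks() from main() and building with -fopenmp should, on a runtime that honors nesting, yield 2 outer threads times 2 inner threads. Setting OMP_NESTED=TRUE in the environment is the non-code alternative to the two runtime calls. The result is the same whether or not nesting takes effect, so only the core usage changes.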