Parallel processing - OpenMP: using all cores in parallel

I have a 4-core computer and an OpenMP application with two heavy tasks.

int main()
{
    #pragma omp parallel sections
    {
        #pragma omp section
        WeightyTask1();

        #pragma omp section
        WeightyTask2();
    }

    return 0;
}

Each task contains a heavy part like this:

#pragma omp parallel for
for (int i = 0; i < N; i++)
{
    ...
}

I compile the program with the -fopenmp flag and export OMP_NUM_THREADS=4.
The problem is that only two cores are loaded. How can I use all cores in my tasks?

Best answer: My initial reaction was: you have to declare more parallelism.

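You have defined two tasks that can run in parallel. Any attempt by OpenMP to run them on more than two cores would slow you down (because of cache locality and possible false sharing).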

Edit: if the parallel for loops have any significant volume (say, at least 8 iterations) and you are not seeing more than 2 cores in use, look at:

> omp_set_nested()
> OMP_NESTED=TRUE|FALSE environment variable

This environment variable enables or disables nested parallelism. The setting of this environment variable can be overridden by calling the omp_set_nested() runtime library function.

If nested parallelism is disabled, nested parallel regions are serialized and run in the current thread.

In the current implementation, nested parallel regions are always serialized. As a result, OMP_SET_NESTED does not have any effect, and omp_get_nested() always returns 0. If -qsmp=nested_par option is on (only in non-strict OMP mode), nested parallel regions may employ additional threads as available. However, no new team will be created to run nested parallel regions.
The default value for OMP_NESTED is FALSE.
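To make the nesting advice concrete, here is a minimal sketch in C, assuming a runtime that actually supports nested parallelism (unlike the implementation quoted above). The names `weighty_task` and `run_both_tasks` are hypothetical stand-ins for the question's WeightyTask1/WeightyTask2, and the 2 x 2 thread split is an assumption for a 4-core machine. It uses `omp_set_max_active_levels()`, the newer replacement for `omp_set_nested()`; older runtimes may additionally need `omp_set_nested(1)` or `OMP_NESTED=TRUE`.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical stand-in for WeightyTask1/WeightyTask2: sums 0..n-1 in an
   inner parallel loop and records how many threads that inner team had. */
static long weighty_task(int n, int *team_size)
{
    long sum = 0;
    assert(n >= 0 && team_size != NULL);
    *team_size = 1; /* value kept when built without OpenMP */

    /* Assumed split for 4 cores: 2 outer sections x 2 inner threads. */
    #pragma omp parallel num_threads(2)
    {
#ifdef _OPENMP
        /* One thread of the inner team reports the team size. */
        #pragma omp single
        *team_size = omp_get_num_threads();
#endif
        #pragma omp for reduction(+:sum)
        for (int i = 0; i < n; i++)
            sum += i;
    }
    return sum;
}

/* Runs two sections concurrently; each section opens its own nested
   parallel region. team[0] and team[1] receive the inner team sizes. */
static long run_both_tasks(int n, int team[2])
{
    long r1 = 0, r2 = 0;
#ifdef _OPENMP
    /* Ask the runtime to allow two active nesting levels. */
    omp_set_max_active_levels(2);
#endif
    #pragma omp parallel sections num_threads(2)
    {
        #pragma omp section
        r1 = weighty_task(n, &team[0]);

        #pragma omp section
        r2 = weighty_task(n, &team[1]);
    }
    return r1 + r2;
}
```

Calling `run_both_tasks(N, team)` from `main` in a program built with -fopenmp and then inspecting `team[0]` and `team[1]` shows whether the runtime really gave each section an inner team of 2. A value of 1 would suggest that nested regions are still being serialized.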
