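[C++] OpenMP automatic parallelization
Techniques related to automatic parallelization
Documentation and tutorial: OpenMP (recommended!). The blog posts below are translated and reorganized versions of this tutorial.
Official site: OpenMP Application Programming Interface
Introductory blog posts:
- The simplest parallel computing: using OpenMP
- OpenMP introductory tutorial (1)
- OpenMP introductory tutorial (2)
- OpenMP introductory tutorial (3)
- Learning the OpenMP framework with GCC
- Introduction to OpenMP programming, part 1
- Easily confused OpenMP functions (number of threads / thread ID / maximum number of threads) and how the number of threads in a parallel region is determined
- Parallel programming: OpenMP (must read!)
- Nested parallel operations (must read!)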
OpenMP example
#include <cstdio>
#include <iostream>
#include <omp.h>
#include <thread>

// Print the size of the current team and the ID of the calling thread.
void report_num_threads(int level)
{
    // #pragma omp single
    {
        printf("Level %d: number of threads in the team: %d\n",
               level, omp_get_num_threads());
        // Get the ID of the current thread
        std::thread::id this_id = std::this_thread::get_id();
        // Print the thread ID
        std::cout << " Current thread ID: " << this_id << std::endl;
    }
}

int main() {
    // Note: omp_get_nested()/omp_set_nested() are deprecated since OpenMP 5.0;
    // omp_set_max_active_levels() is the replacement.
    bool is_nested = omp_get_nested();
    std::cout << "Nested parallelism enabled: " << is_nested << "\n";
    // Enable nested parallelism: each of n outer threads can start an inner
    // team of m threads, giving up to n*m threads with distinct thread IDs.
    omp_set_nested(1);
    is_nested = omp_get_nested();
    std::cout << "Nested parallelism enabled: " << is_nested << "\n";
    // Sets the default team size, not a global cap: a parallel region with a
    // num_threads clause uses the value given there; a region without one
    // (including a nested region once nesting is enabled) requests this default.
    omp_set_num_threads(4);
    double start_time = omp_get_wtime();
    for (int j = 0; j < 1; j++) {
        #pragma omp parallel for // num_threads(4)
        for (int i = 0; i < 1; ++i) {
            std::cout << "Thread " << omp_get_thread_num() << ": " << std::endl;
            int sum = 0;
            report_num_threads(1);
            // sum is shared by the inner team; reduction(+:sum) avoids a data race.
            #pragma omp parallel for reduction(+:sum) // num_threads(3)
            for (int k = 0; k < 10; ++k) {
                sum = sum + 1;
                report_num_threads(2);
            }
            report_num_threads(3);
            std::cout << "Thread " << omp_get_thread_num()
                      << " sum=" << sum
                      << std::endl;
        }
    }
    report_num_threads(4);
    double end_time = omp_get_wtime();
    std::cout << "Time taken: " << end_time - start_time << " seconds" << std::endl;
    return 0;
}
// cmd: g++ -fopenmp -o sum_openmp openmp_test.cpp && ./sum_openmp
/*
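output (the format differs from what the listing above prints; it appears to come
from a different test run that reported thread ID, max threads and num threads):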
ID: 0, Max threads: 52, Num threads: 1
ID: 0, Max threads: 5, Num threads: 1
ID: 0, Max threads: 5, Num threads: 5
ID: 4, Max threads: 5, Num threads: 5
ID: 2, Max threads: 5, Num threads: 5
ID: 1, Max threads: 5, Num threads: 5
ID: 3, Max threads: 5, Num threads: 5
ID: 0, Max threads: 5, Num threads: 1
ID: 0, Max threads: 6, Num threads: 1
ID: 0, Max threads: 6, Num threads: 6
ID: 3, Max threads: 6, Num threads: 6
ID: 1, Max threads: 6, Num threads: 6
ID: 4, Max threads: 6, Num threads: 6
ID: 2, Max threads: 6, Num threads: 6
*/
Nested parallel operations
Example:
#include <cstdio>
#include <iostream>
#include <omp.h>

// Print the size of the team that executes the current region.
void report_num_threads(int level)
{
    // #pragma omp single
    {
        printf("Level %d: number of threads in the team: %d\n",
               level, omp_get_num_threads());
    }
}

int main() {
    // Note: omp_get_nested()/omp_set_nested() are deprecated since OpenMP 5.0;
    // omp_set_max_active_levels() is the replacement.
    bool is_nested = omp_get_nested();
    std::cout << "Nested parallelism enabled: " << is_nested << "\n";
    omp_set_nested(1); // Enable nested parallelism
    is_nested = omp_get_nested();
    std::cout << "Nested parallelism enabled: " << is_nested << "\n";
    omp_set_num_threads(4);
    double start_time = omp_get_wtime();
    for (int j = 0; j < 1; j++) {
        // Outer team: 4 threads share the 2 iterations.
        #pragma omp parallel for num_threads(4)
        for (int i = 0; i < 2; ++i) {
            std::cout << "Thread " << omp_get_thread_num() << ": " << std::endl;
            int sum = 0;
            report_num_threads(1);
            // Inner team: 16 threads share the 3 iterations.
            // sum is shared by the inner team; reduction(+:sum) avoids a data race.
            #pragma omp parallel for num_threads(16) reduction(+:sum)
            for (int k = 0; k < 3; ++k) {
                sum = sum + 1;
                report_num_threads(2);
            }
            report_num_threads(3);
            std::cout << "Thread " << omp_get_thread_num()
                      << " sum=" << sum
                      << std::endl;
        }
    }
    report_num_threads(4);
    double end_time = omp_get_wtime();
    std::cout << "Time taken: " << end_time - start_time << " seconds" << std::endl;
    return 0;
}
// cmd: g++ -fopenmp -o sum_openmp openmp_test.cpp && ./sum_openmp
/*
Nested parallelism enabled: 0
Nested parallelism enabled: 1
Thread 0: Thread
1:
Level 1: number of threads in the team: 4
Level 1: number of threads in the team: 4
Level 2: number of threads in the team: 16
Level 2: number of threads in the team: 16
Level 2: number of threads in the team: 16
Level 2: number of threads in the team: 16
Level 2: number of threads in the team: 16
Level 2: number of threads in the team: 16
Level 3: number of threads in the team: 4
Thread 1 sum=3
Level 3: number of threads in the team: 4
Thread 0 sum=3
Level 4: number of threads in the team: 1
Time taken: 0.000629196 seconds*/