I compile my code with
-O3, and now I need to profile it. For profiling, I have two main options:
valgrind --tool=callgrind and
gprof.
The Valgrind (callgrind) documentation states:
As with Cachegrind, you probably want to compile with debugging info (the -g option) and with optimization turned on.
However, in Agner Fog's C++ optimization book I read the following:
Many optimization options are incompatible with debugging. A debugger can execute a
code one line at a time and show the values of all variables. Obviously, this is not possible
when parts of the code have been reordered, inlined, or optimized away. It is common to
make two versions of a program executable: a debug version with full debugging support
which is used during program development, and a release version with all relevant
optimization options turned on. Most IDE’s (Integrated Development Environments) have
facilities for making a debug version and a release version of object files and executables.
Make sure to distinguish these two versions and turn off debugging and profiling support in
the optimized version of the executable.
This seems to conflict with callgrind's instruction to compile the code with the debugging info flag -g. If I enable debugging as follows:
-ggdb -DFULLDEBUG
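won't this option conflict with the -O3 optimization flag? After what I have read so far, using these two options together makes no sense to me.
If I use, say, the -O3 optimization flag, can I compile the code with additional profiling information using: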
-pg
and still profile it with valgrind?
Does it make sense to profile code compiled with the
-ggdb -DFULLDEBUG -O0
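flags? That seems silly: inlining functions and unrolling loops may shift the bottlenecks in the code, so this configuration should be used only during development, to get the code working correctly.
Does it make sense to profile the code compiled with one optimization flag and then run the code compiled with a different one?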
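Best answer: Why are you profiling? Just to take measurements, or to find speedups?
The common wisdom that you should only profile optimized code rests on the assumption that the code is nearly optimal to begin with. If significant speedups are available, it is not.
You should treat finding speedups the way you treat finding bugs. Many people use this method to do so.
After you have removed the needless computation, if you still have tight CPU loops (that is, you are not spending all your time in system, library, or I/O routines that the optimizer never sees), then turn on -O3 and let it do its magic.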
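To make "needless computation" concrete, here is a minimal sketch. The functions prefix_sums_slow and prefix_sums_fast are hypothetical names invented for this illustration and are not taken from the question's code. The first version redoes work on every iteration. The second is what you might end up with after a profile (even of an unoptimized build) points at the inner loop. The compiler is not expected to make this kind of algorithmic change for you, whatever the -O level.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical "before": every prefix sum is recomputed from scratch,
// so the work grows quadratically with the input size.
std::vector<long long> prefix_sums_slow(const std::vector<int>& values) {
    std::vector<long long> out;
    out.reserve(values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
        long long sum = 0;
        for (std::size_t j = 0; j <= i; ++j) {
            sum += values[j];
        }
        out.push_back(sum);
    }
    return out;
}

// Hypothetical "after": a running total removes the repeated work,
// so the function does a single pass over the input.
std::vector<long long> prefix_sums_fast(const std::vector<int>& values) {
    std::vector<long long> out;
    out.reserve(values.size());
    long long running = 0;
    for (int v : values) {
        running += v;
        out.push_back(running);
    }
    return out;
}
```

Both functions return the same result. Only once a fix like this is in place, and the remaining time is spent in a tight loop of your own code, does the choice of optimization flags start to dominate.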