Here are some suggestions for using -qhot:
- Try using -qhot along with -O3 for all of your code. The option is
designed to have a neutral effect when no opportunities for
transformation exist.
- If the runtime performance of your code can significantly benefit
from automatic inlining and memory locality optimizations, try using -O4 with -qhot=level=0 or -qhot=novector.
- If compile times become unacceptably long (this can happen
with complex loop nests), try -qhot=level=0.
- If your code size is unacceptably large, try using -qcompact along
with -qhot.
- You can compile some source files with the -qhot option
and others without it, so that the compiler transforms
only the parts of your code that need optimization.
- Use -qreport along with -qsimd=auto to
generate a loop transformation listing. The listing file shows
how loops are transformed in a section marked LOOP TRANSFORMATION
SECTION. Use this information as feedback: it can show where you
might adjust your code so that the compiler can transform
loops more effectively. For example, you can use this
section of the listing to identify non-stride-one references that
may prevent loop vectorization.
- Use -qreport along with -qhot or
any optimization option that implies -qhot to generate
information about nested loops in the LOOP TRANSFORMATION
SECTION of the listing file. In addition, when you use
-qprefetch=assistthread to generate prefetching assist threads,
the message "Assist thread for data prefetching was generated"
also appears in this section of the report. To list the aggressive
loop transformations and parallelizations performed on loop nests
in the LOOP TRANSFORMATION SECTION of the listing file, use
-qhot=level=2 and -qsmp together with -qreport.
- If you specify -qassert=refalign,
you assert to the compiler that all pointers inside the compilation
unit point only to data that is naturally aligned with respect to
the length of the pointer types. With this assertion, the compiler
might generate more efficient code. This assertion is particularly
useful when you target a SIMD architecture and use -qhot=level=0 or
-qhot=level=1 together with the -qsimd=auto option.
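
The non-stride-one references mentioned in the -qreport suggestion above
can be illustrated with a minimal C sketch. The function names, the array
dimensions, and the fill pattern are hypothetical and chosen only for
illustration. The first summation walks down a column in its inner loop,
so consecutive iterations touch elements that are COLS doubles apart. The
second walks along a row, so consecutive iterations touch adjacent
elements. Whether the compiler interchanges or vectorizes either loop
depends on the options and the target, so treat the LOOP TRANSFORMATION
SECTION of your own listing as the authority.

```c
#include <stddef.h>

#define ROWS 64   /* hypothetical dimensions, for illustration only */
#define COLS 64

/* Fill the matrix with small integer values so that both summation
   orders below produce exactly the same floating-point result. */
void fill_matrix(double a[ROWS][COLS])
{
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            a[i][j] = (double)(i * COLS + j);
}

/* Non-stride-one: C stores arrays row by row, so stepping i in the
   inner loop jumps COLS elements between consecutive accesses. */
double sum_column_order(double a[ROWS][COLS])
{
    double total = 0.0;
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            total += a[i][j];
    return total;
}

/* Stride-one: stepping j in the inner loop touches adjacent elements,
   which is the access pattern that loop vectorization favors. */
double sum_row_order(double a[ROWS][COLS])
{
    double total = 0.0;
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            total += a[i][j];
    return total;
}
```

Both functions return the same sum. Only the memory access order differs,
which is the kind of difference the listing can help you spot. To try
this, call the functions from a small driver of your own and compile with
-O3 -qhot -qreport.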