Getting the most out of target machine options

Using -qarch options

You use the -qarch compiler option to generate instructions that are optimized for a specific machine architecture. For example, if you want to generate object code that contains instructions optimized for POWER7, you use -qarch=pwr7. If your application runs on the same machine on which you compile it, you can use the -qarch=auto option, which automatically detects the architecture of the compiling machine and generates code that takes advantage of instructions available only on that machine (or on a system that supports the equivalent processor architecture). Otherwise, use the -qarch option to specify the smallest possible family of machines that can run your code reasonably well.

If you want to run your application on a system architecture that provides specific features, you must specify the corresponding -qarch suboption to generate object code for that architecture. For example, if you want to deploy your application on a POWER6™ or POWER7 machine and fully exploit vector processing and large-page support, you must specify -qarch=pwr6 for POWER6 or -qarch=pwr7 for POWER7 on your compiling machine. Specifying -qarch=auto or -qarch does not give you the support you want. However, if you deploy your application on both POWER6 and POWER7, you must make sure that -qarch is set to the lowest common architecture. This way, your application contains only instructions that are common to all processors on which it is deployed. In this example, the lowest common architecture is POWER6, so it is best to use -qarch=pwr6. For details about -qarch and its suboptions, see -qarch in the XL C/C++ Compiler Reference. For details about the system architectures that each -qarch suboption supports, see the Features support in processor architectures table in -qarch.
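
The lowest-common-architecture rule can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name lowest_common_arch and the source file name myprogram.c are hypothetical, the helper handles only plain pwrN suboptions, and the script prints the xlc command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the compiler): given the pwrN suboption
# of every deployment target, print the lowest one.
# Assumes plain "pwr<number>" names only.
lowest_common_arch() {
    lowest=""
    for arch in "$@"; do
        n=${arch#pwr}    # strip the "pwr" prefix, leaving the number
        if [ -z "$lowest" ] || [ "$n" -lt "$lowest" ]; then
            lowest=$n
        fi
    done
    echo "pwr$lowest"
}

# Deployment targets from the example above: POWER6 and POWER7.
ARCH=$(lowest_common_arch pwr6 pwr7)

# Dry run: show the command that would be used (myprogram.c is a placeholder).
echo "xlc -O3 -qarch=$ARCH myprogram.c"
```

For the two targets in the example, the script selects pwr6, which matches the recommendation in the text.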

Using -qtune options

You use the -qtune compiler option to control the scheduling of instructions that are optimized for your machine architecture. If you specify a particular architecture with -qarch, -qtune automatically selects the suboption that generates instruction sequences with the best performance for that architecture. If you specify a group of architectures with -qarch, compiling with -qtune=auto generates code that runs on all of the architectures in the specified group, but the instruction sequences are those with the best performance on the architecture of the compiling machine.

With -qtune, specify the particular architecture that the compiler should target for best performance. The resulting object file still runs on all architectures specified in the -qarch option. For information on the valid combinations of -qarch and -qtune, see Acceptable -qarch/-qtune combinations in the -qtune section of the XL C/C++ Compiler Reference.
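
As a rough sketch of pairing the two options, the following script builds a command line that keeps the binary runnable on POWER6 while tuning the instruction scheduling for POWER7. The pairing shown is an assumption for illustration; confirm it against the Acceptable -qarch/-qtune combinations table for your compiler release. The file name myprogram.c is a placeholder, and the command is printed, not run.

```shell
#!/bin/sh
ARCH=pwr6    # lowest architecture the binary must run on
TUNE=pwr7    # architecture the binary is expected to run on most often (assumed)

# Dry run: build and print the compile command without invoking the compiler.
CMD="xlc -O3 -qarch=$ARCH -qtune=$TUNE myprogram.c"
echo "$CMD"
```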

If you need to create a single binary that runs on a range of PowerPC® hardware, consider using the -qtune=balanced option. With this option in effect, optimization decisions made by the compiler are not targeted to a specific version of hardware. Instead, tuning decisions try to include features that are generally helpful across a broad range of hardware and avoid those optimizations that might be harmful on some hardware.
Note: You must verify the performance of code compiled with the -qtune=balanced option before distributing it.

The main difference between -qtune=balanced and -qtune=auto is that, with -qtune=auto and a specified -qarch suboption, the compiler generates instructions that are optimized for that specific version of the hardware architecture, and the resulting code might not perform well on other versions. For example, if you want to use -qtune=auto to generate optimized instructions that are deployable on a POWER7 machine, you use -qarch=pwr7 -qtune=auto. To generate instructions that perform reasonably well across a range of Power hardware, use -qtune=balanced instead. For details, see -qtune in the XL C/C++ Compiler Reference.
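
The two choices can be compared side by side. In this sketch, the first command is the POWER7 example from the text; the second assumes pwr6 as the baseline architecture for a broader deployment, which is a hypothetical choice for illustration. The file name myprogram.c is a placeholder, and both commands are printed, not run.

```shell
#!/bin/sh
# Single known target, compiled on a POWER7 machine: tune for that hardware.
SINGLE="xlc -O3 -qarch=pwr7 -qtune=auto myprogram.c"

# One binary for a range of hardware: avoid tuning for any one version.
# pwr6 as the baseline is an assumption made for this example.
BROAD="xlc -O3 -qarch=pwr6 -qtune=balanced myprogram.c"

# Dry run: print both commands.
echo "$SINGLE"
echo "$BROAD"
```

Whichever form you choose, measure the resulting binary on each target machine before distributing it.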

Using -qcache options

Before using the -qcache option, use the -qlistopt option to generate a listing of the current settings and verify whether they are satisfactory. If you decide to specify your own -qcache suboptions, use -qhot or -qsmp along with them. For the full set of suboptions, option syntax, and guidelines for use, see -qcache in the XL C/C++ Compiler Reference.
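
The two-step workflow might look like the following sketch. The cache values (a level 1 data cache of 32 KB with 128-byte lines) are placeholders chosen for illustration, not recommendations for any particular processor; take the real values from your hardware documentation. The file name myprogram.c is also a placeholder, and the commands are printed, not run.

```shell
#!/bin/sh
# Step 1: compile with -qlistopt and review the option settings
# reported in the compiler listing.
LIST_CMD="xlc -O3 -qhot -qlistopt myprogram.c"

# Step 2: only if the reported settings are unsatisfactory, describe the
# cache explicitly. Suboptions are separated by colons.
# All values below are illustrative placeholders.
CACHE_OPTS="level=1:type=d:size=32:line=128"
CACHE_CMD="xlc -O3 -qhot -qcache=$CACHE_OPTS myprogram.c"

# Dry run: print both commands.
echo "$LIST_CMD"
echo "$CACHE_CMD"
```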