Compiling for Intel with Intel Composer XE, MKL, and Intel MPI
Contents
General Notes
- Static linking is not possible because Red Hat does not distribute a static libm (standard math library)
Motivation
Intel Compilers + MKL can produce executables which run significantly faster on Intel CPUs than those produced by GCC. For example, see the metrics reported here comparing the linear algebra performance of MKL with ATLAS (Automatically Tuned Linear Algebra Software).
Versions Available
The cluster management vendor Bright Computing provides the Intel Composer suite as multiple modules:
[juser@proteusi01 ~]$ module avail intel
--------------------------- /cm/shared/modulefiles ---------------------------
intel/compiler/64/14.0/2013_sp1.3.174       intel-cluster-runtime/mic/3.6
intel/ipp/64/8.1/2013_sp1.3.174             intel-itac/8.1.3/037
intel/mkl/64/11.1/2013_sp1.3.174            intel-mpi/32/4.1.1/036
intel/sourcechecker/64/14.0/2013_sp1.3.174  intel-mpi/64/4.1.1/036
intel/tbb/32/4.2/2013_sp1.3.174             intel-mpi/mic/4.1.1/036
intel/tbb/64/4.2/2013_sp1.3.174             intel-tbb-oss/ia32/42_20140601oss
intel-cluster-checker/2.1.2                 intel-tbb-oss/intel64/42_20140601oss
intel-cluster-runtime/ia32/3.6
intel-cluster-runtime/intel64/3.6
--------------------------- /mnt/HA/opt/modulefiles ---------------------------
intel/composerxe/2013.3.174  intel/composerxe/2015.1.133
intel/composerxe/2016.0.109  intel/composerxe/current
The modules under /cm/shared/modulefiles are provided by Bright. The modules under /mnt/HA/opt/modulefiles are locally-installed.
For convenience, use the locally-installed modules.
Intel Composer XE
In all the versions described below, all associated packages (MKL, TBB, IPP) are loaded with a single module.
Version 2013
Intel Composer XE is a suite of tools including compilers, parallel debugger, optimized libraries, the Math Kernel Library, and tools for profiling and tuning applications.[1]
[juser@proteusi01 ~]$ module load intel/composerxe/2013.3.174
With Composer XE 2013.3.174, MKL 11.1 is installed.
Version 2015
Version 2015 is also installed, with all components loaded by a single module:
[juser@proteusi01 ~]$ module load intel/composerxe/2015.1.133
With Composer XE 2015.1.133, MKL 11.2 is installed.
Version 2016
Version 2016 is installed, with all components loaded by a single module:
[juser@proteusi01 ~]$ module load intel/composerxe/2016.0.109
With Composer XE 2016.0.109, MKL 11.3 is installed.
Optimization Flags
Please see Hardware for details on what hardware-specific optimizations may be used.
- 2015-04-15: -xHost -- the CPU architecture of proteusi01 is identical to that of all Intel compute nodes, so code compiled on the login node with -xHost will run on the compute nodes
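As a sketch of how the flag above is used (the source file name and -O3 level are illustrative, not site requirements):

```shell
# Load a Composer XE module, then compile with host-specific optimization.
# -xHost generates code for the highest instruction set available on the
# build host, which on Proteus matches the Intel compute nodes.
module load intel/composerxe/2015.1.133
icc -O3 -xHost -o myprog myprog.c
```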
Intel Math Kernel Library (MKL)
For best performance on Intel CPUs, do not use generic linear algebra libraries (BLAS, LAPACK). Instead, use the MKL.[2][3]
- MKL 11.1 is installed with Composer XE 2013
- MKL 11.2 is installed with Composer XE 2015
- MKL 11.3 is installed with Composer XE 2016
The installations on Proteus also include interfaces for BLAS95, LAPACK95, FFTW2 (double), and FFTW3 (double).
Choice of Integer Size
The MKL offers the choice of standard 32-bit integers (denoted LP64) or 64-bit integers (denoted ILP64).[4][5] The installations on Proteus default to 32-bit integers (LP64).
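The choice of integer size shows up in the link line: the interface library and (for Fortran) the compiler's integer width must agree. A minimal sketch, using sequential (non-threaded) MKL for simplicity and a hypothetical source file prog.f90:

```shell
# LP64 (32-bit integers, the default here): link the lp64 interface library
ifort prog.f90 -L${MKLROOT}/lib/intel64 \
    -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm

# ILP64 (64-bit integers): compile with -i8 so default INTEGERs are 64-bit,
# and link the matching ilp64 interface library
ifort -i8 prog.f90 -L${MKLROOT}/lib/intel64 \
    -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lpthread -lm
```

Mixing an ILP64 interface library with 32-bit integers (or vice versa) typically fails at runtime rather than at link time, so check this pairing carefully.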
Interfaces for BLAS95, LAPACK95, FFTW2, and FFTW3
The interfaces for BLAS95, LAPACK95, FFTW2, and FFTW3 are available as well. They are provided as static library files, compiled locally against the MKL, in the directory $MKLROOT/lib/intel64.
The library files themselves are:
libmkl_blas95_lp64.a          libmkl_blas95_ilp64.a
libmkl_lapack95_lp64.a        libmkl_lapack95_ilp64.a
libfftw3xf_intel.a            libfftw3xc_intel.a
libfftw3x_cdft_lp64.a         libfftw3x_cdft_ilp64.a
libfftw2xf_single_intel.a     libfftw2xf_double_intel.a
libfftw2xc_single_intel.a     libfftw2xc_double_intel.a
libfftw2x_cdft_SINGLE_lp64.a  libfftw2x_cdft_DOUBLE_lp64.a
As these are not part of the base MKL libraries, the Link Line Advisor will not generate link flags for these libraries. You should manually include them in your link line, e.g.
-L$MKLROOT/lib/intel64 -lmkl_blas95_lp64 -lfftw3xc_intel
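A fuller sketch for a Fortran program using the BLAS95 interface (the file name solve.f90 is hypothetical; the include path holds the locally-built Fortran module files for LP64):

```shell
# Compile against the BLAS95 module files, then link the static interface
# library ahead of the base MKL libraries (sequential threading shown).
ifort solve.f90 -I${MKLROOT}/include/intel64/lp64 \
    -L${MKLROOT}/lib/intel64 -lmkl_blas95_lp64 \
    -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm
```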
Compiling Numpy and Scipy with MKL
Intel has instructions on using Intel Compilers + MKL to compile Numpy and Scipy:
https://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl
They also include comparative performance numbers (against ATLAS).
MKL Link Line Advisor
Linking against the MKL can be complicated: consult the MKL User's Guide for detailed documentation.[6] The MKL Link Line Advisor web-based tool will generate the proper compilation options to compile and link against the MKL:[7]
http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor
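For orientation, the advisor's output for a common case (Intel Fortran, dynamic linking, LP64 interface, OpenMP threading) looks roughly like this; treat it as a sketch and generate the line for your actual configuration:

```shell
# Typical Link Line Advisor result: threaded MKL with the Intel OpenMP runtime
ifort prog.f90 -L${MKLROOT}/lib/intel64 \
    -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm
```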
MPI implementation
Intel MPI
NOTE As of 2015-01-01 we do not have a license for Intel MPI. Please use MVAPICH2 or OpenMPI (the latter is recommended).
IN PROGRESS
Intel MPI is Intel's implementation of MPI-2.[8][9] It is available via the module:
[juser@proteusi01 ~]$ module load intel-mpi/64
The compiler commands are:
- mpiicc (note the two letters "i")
- mpiifort
See Intel MPI for Linux Getting Started Guide.[10] Also see the article on Message Passing Interface.
Open MPI
For the 2013 version, use:
proteus-openmpi/intel/64/1.8.1-mlnx-ofed
For the 2015 version, use:
proteus-openmpi/intel/2015/1.8.1-mlnx-ofed
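Putting the modules together (hello.c is an illustrative source file; the wrapper names are the standard Open MPI ones):

```shell
# Load the matching compiler and Open MPI modules, then use the MPI
# compiler wrappers, which add the MPI include and link flags for you.
module load intel/composerxe/2015.1.133
module load proteus-openmpi/intel/2015/1.8.1-mlnx-ofed
mpicc -O2 -o hello hello.c    # mpif90 for Fortran, mpicxx for C++
```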
Hybrid MPI-OpenMP
Intel MPI supports hybrid MPI-OpenMP code.[11]
- Use the thread-safe MPI library by passing the compiler option: -mt_mpi
- Set the environment variable I_MPI_PIN_DOMAIN to "omp": export I_MPI_PIN_DOMAIN=omp. This sets the pinning domain size to be equal to the value given by the environment variable OMP_NUM_THREADS. If OMP_NUM_THREADS is not set, Intel MPI will assume all cores are to be used.
NOTE: Grid Engine may assign only a subset of a node's cores to a job, in which case the MPI pinning domains cover only those assigned cores.
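The steps above can be sketched as follows (hybrid.c, the thread count, and the rank count are illustrative; -openmp is the OpenMP flag for the Intel 14/15 compilers):

```shell
# Compile hybrid MPI-OpenMP code against the thread-safe Intel MPI library
mpiicc -mt_mpi -openmp -o hybrid hybrid.c

# One pinning domain per MPI rank, sized by OMP_NUM_THREADS
export OMP_NUM_THREADS=4
export I_MPI_PIN_DOMAIN=omp
mpirun -np 8 ./hybrid
```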
Recommended Combination for Proteus
This is the combination of Intel compilers/libraries and MPI implementation that we recommend:
intel/composerxe/2015.1.133 proteus-openmpi/intel/2015/1.8.1-mlnx-ofed
This combination supports hybrid OpenMP-MPI code, though performance improvement of hybrid code over MPI-only may be small.
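A minimal build sketch for the recommended combination (the source file and optimization flags are illustrative; with Open MPI, OpenMP support comes from the underlying Intel compiler's -openmp flag):

```shell
module load intel/composerxe/2015.1.133
module load proteus-openmpi/intel/2015/1.8.1-mlnx-ofed
mpicc -openmp -O2 -xHost -o hybrid hybrid.c
```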
See Also
- Compiling with GCC
- For a concrete example of using the Intel 2015 + OpenMPI toolchain, see Compiling LAMMPS
References
- ↑ Intel Composer XE information website
- ↑ Intel MKL information website
- ↑ Intel MKL 11.1 Reference Manual
- ↑ Intel Math Kernel Library for Linux OS User's Guide: Using the ILP64 Interface vs. LP64 Interface
- ↑ Intel Math Kernel Library for Linux OS User's Guide: Support for ILP64 Programming
- ↑ Intel Math Kernel Library for Linux OS User's Guide: Linking Your Application with the Intel Math Kernel Library
- ↑ Intel MKL Link Line Advisor
- ↑ Intel MPI 4.1 Reference Manual
- ↑ Intel MPI Reference Manual - Interoperability with OpenMP* API
- ↑ File:IntelMPIforLinuxGettingStarted.pdf
- ↑ Intel Developer Zone - Hybrid applications: Intel MPI Library and OpenMP