Matrix-Matrix Multiplication - Native mode (MPI+OpenMP)

Here you'll find a program that performs matrix-matrix multiplication using MPI + OpenMP. It works only with 2 MPI tasks.

 

First, load the correct compiler and the MPI library, set up the environment, and enable MPI for MIC:

module load intel                                # compiler suite
module load intelmpi                             # MPI library
source $INTEL_HOME/bin/compilervars.sh intel64   # set up the environment variables
export I_MPI_MIC=enable                          # enable MPI on MIC
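
Because the launch depends on I_MPI_MIC being exported in the current shell, a job script may want to verify it before calling mpirun.mic. The following is a minimal sketch; check_mic_env is a hypothetical helper name, not part of the Intel tools or of the CINECA environment.

```shell
# Hypothetical helper (not part of Intel MPI): report whether MPI on MIC
# has been enabled in the current shell, as required by the step above.
check_mic_env() {
    if [ "${I_MPI_MIC:-}" = "enable" ]; then
        echo "I_MPI_MIC is enabled"
    else
        echo "I_MPI_MIC is not set to 'enable'" >&2
        return 1
    fi
}
```

Calling check_mic_env at the top of a job script returns a non-zero status when the export was forgotten, so the script can stop before the launch.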

 

Now you can cross-compile the code to run on the MIC coprocessor, using mpicc for C programs and mpifc for Fortran ones:

Fortran: mpifc mm_hybrid.F90  -mmic -openmp -o mm_hybrid_F.x 

 

Now you can run your job without logging in to the MIC coprocessor. Specify the MIC you intend to run on with the -host option of mpirun.mic, and set the number of threads you intend to use with OMP_NUM_THREADS:

mpirun.mic -host node021-mic0 -np 2 -genv OMP_NUM_THREADS 120 -genv LD_LIBRARY_PATH=/cineca/prod/compilers/intel/cs-xe-2013/binary/lib/mic/ ./mm_hybrid_F.x 2028
Hybirid version with threads, tasks = 120 2
MPI version with task = 2
Matrix size is 2028
Check on a random element: 0.000000000000000E+000 511.852700478611
Elapsed time 0.380214929580688 s
Tot Gflops 43.8737414609068
All done...
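
The "Tot Gflops" figure can be cross-checked from the matrix size and the elapsed time. The sketch below assumes the program counts 2*N^3 floating-point operations for an N x N product; the source of mm_hybrid.F90 is not shown here, so this is an assumption, but it is consistent with the value printed above. The function name gflops is hypothetical.

```shell
# Assumption: the benchmark counts 2*N^3 floating-point operations
# (one multiply and one add per inner-loop step of an N x N product).
# Usage: gflops <matrix size N> <elapsed seconds>
gflops() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.2f\n", 2 * n * n * n / t / 1e9 }'
}

gflops 2028 0.380214929580688   # prints 43.87
```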


If you want to run one task on each of several MIC coprocessors, list them in -host and use the -perhost flag:

mpirun.mic -host node021-mic0,node021-mic1 -np 2 -genv OMP_NUM_THREADS 120 -genv LD_LIBRARY_PATH=/cineca/prod/compilers/intel/cs-xe-2013/binary/lib/mic/ -perhost 1 ./mm_hybrid_F.x 2028
Hybirid version with threads, tasks = 120 2
MPI version with task = 2
Matrix size is 2028
Check on a random element: 0.000000000000000E+000 511.852700478611
Elapsed time 0.662999153137207 s
Tot Gflops 25.1605925001050
All done...
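
When several coprocessors are involved, the -host list can be generated rather than typed by hand. This is a minimal sketch that assumes the <node>-mic<index> naming seen in the commands above; mic_hosts is a hypothetical helper, not an Intel MPI or CINECA tool.

```shell
# Hypothetical helper: build a comma-separated -host list for the first
# COUNT coprocessors of NODE, assuming the <node>-mic<index> naming scheme.
# Usage: mic_hosts <node> <count>
mic_hosts() {
    node=$1
    count=$2
    hosts=""
    i=0
    while [ "$i" -lt "$count" ]; do
        # Prepend a comma only when the list is not empty.
        hosts="${hosts:+$hosts,}${node}-mic${i}"
        i=$((i + 1))
    done
    echo "$hosts"
}

mic_hosts node021 2   # prints node021-mic0,node021-mic1
```

The result can be passed directly to the launcher, for example as -host "$(mic_hosts node021 2)".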