numpy and scipy are numerical computing libraries. For linear algebra calculations, OpenBLAS is used. On Intel CPUs, Intel MKL is expected to be faster, so I tried it.
Machine
MacBook Pro (15-inch, 2018)
Processor: 2.9 GHz Intel Core i9
Memory: 32 GB 2400 MHz DDR4
numpy scipy with openblas
I checked that numpy and scipy are linked against OpenBLAS.
>>> import numpy
>>> numpy.show_config()
blas_mkl_info:
NOT AVAILABLE
blis_info:
NOT AVAILABLE
openblas_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
NOT AVAILABLE
openblas_lapack_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
>>> import scipy
>>> scipy.show_config()
lapack_mkl_info:
NOT AVAILABLE
openblas_lapack_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
blas_mkl_info:
NOT AVAILABLE
blis_info:
NOT AVAILABLE
openblas_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
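Reading this output by eye is error-prone, because sections such as blas_mkl_info appear even when MKL is not available. The following is a minimal sketch of a parser for this text layout. The helper name parse_show_config is hypothetical, and it assumes the legacy "section: / key = value" format shown above; other numpy releases may print a different layout.

```python
import ast


def parse_show_config(text):
    """Map each section name to its 'libraries' list, or None if unavailable.

    Hypothetical helper: it assumes the legacy show_config() text layout.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            # A section header such as "blas_opt_info:"
            current = line[:-1]
            sections[current] = None
        elif current is not None and line.startswith("libraries"):
            # A line such as: libraries = ['openblas', 'openblas']
            sections[current] = ast.literal_eval(line.split("=", 1)[1].strip())
    return sections


# A shortened copy of the output above, used as sample input.
sample = """\
blas_mkl_info:
NOT AVAILABLE
openblas_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
blas_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
"""

config = parse_show_config(sample)
print("blas_opt_info:", config["blas_opt_info"])
print("blas_mkl_info:", config["blas_mkl_info"])
```

A section that maps to None was reported as NOT AVAILABLE, so blas_opt_info is the entry to look at when deciding which library is actually in use.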
I measured the calculation time with the following code.
import numpy as np
import time
import scipy.linalg.blas
N = 10000
A = np.random.rand(N,N)
B = np.random.rand(N,N)
t1 = time.time()
C = scipy.linalg.blas.dgemm(alpha=1.0, a=A, b=B)
t2 = time.time()
print(t2-t1)
The results are as follows.
$ python test.py
12.556735038757324
$ python test.py
12.65819001197815
$ python test.py
12.273000955581665
The average time is 12.496 seconds.
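The average can be reproduced from the three runs, and it can also be turned into a throughput figure. The sketch below assumes the usual estimate of about 2*N^3 floating-point operations for a dense N x N matrix product; the variable names are only for illustration.

```python
from statistics import mean

N = 10000
# Wall-clock times of the three OpenBLAS runs above, in seconds.
timings = [12.556735038757324, 12.65819001197815, 12.273000955581665]

avg = mean(timings)
# Assumption: a dense N x N matrix product costs about 2 * N^3 operations.
flops = 2 * N**3
gflops = flops / avg / 1e9

print(f"average: {avg:.3f} s")
print(f"throughput: {gflops:.1f} GFLOP/s")
```

With these numbers the throughput comes out at roughly 160 GFLOP/s.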
numpy scipy with Intel MKL
How to install Intel MKL is described in a previous post.
I installed numpy and scipy as follows.
Adjust the paths in “.numpy-site.cfg” to match your environment.
$ python -m venv venv
$ source ./venv/bin/activate
(venv) $ echo "[mkl]
library_dirs = /opt/intel/mkl/lib/intel64
include_dirs = /opt/intel/mkl/include
mkl_libs = mkl_rt
lapack_libs =" > .numpy-site.cfg
(venv) $ pip install --no-binary :all: numpy
(venv) $ pip install --no-binary :all: scipy
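Since only the MKL root directory differs between environments, the [mkl] section can also be generated from a single path. This is a minimal sketch using the standard configparser module; the helper name render_mkl_site_cfg is hypothetical, and the directory layout (lib/intel64 and include under the root) is simply copied from the commands above, so it may not match your installation.

```python
import configparser
import io


def render_mkl_site_cfg(mkl_root):
    """Return the text of an [mkl] section that points at mkl_root.

    Hypothetical helper; the sub-directories follow the example in this post.
    """
    cfg = configparser.ConfigParser()
    cfg["mkl"] = {
        "library_dirs": f"{mkl_root}/lib/intel64",
        "include_dirs": f"{mkl_root}/include",
        "mkl_libs": "mkl_rt",
        "lapack_libs": "",
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


text = render_mkl_site_cfg("/opt/intel/mkl")
print(text)
```

The returned text can then be written to “.numpy-site.cfg” in place of the echo command.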
I checked that numpy and scipy are linked against Intel MKL.
>>> import numpy
>>> numpy.show_config()
blas_mkl_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/include', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
blas_opt_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/include', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
lapack_mkl_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/include', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
lapack_opt_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/include', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
>>> import scipy
>>> scipy.show_config()
lapack_mkl_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/include', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
lapack_opt_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/include', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
blas_mkl_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/include', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
blas_opt_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/include', '/opt/intel/compilers_and_libraries_2019.1.144/mac/mkl/lib']
The results are as follows.
(venv) $ python test.py
11.162160873413086
(venv) $ python test.py
11.657418966293335
(venv) $ python test.py
11.576632022857666
The average time is 11.465 seconds.
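To put the two averages side by side, the sketch below computes the mean and the sample standard deviation of each set of runs and the resulting speedup. With only three runs per library this is a rough indication, not a rigorous benchmark.

```python
from statistics import mean, stdev

# Wall-clock times in seconds, copied from the runs in this post.
openblas = [12.556735038757324, 12.65819001197815, 12.273000955581665]
mkl = [11.162160873413086, 11.657418966293335, 11.576632022857666]

for name, runs in (("OpenBLAS", openblas), ("MKL", mkl)):
    print(f"{name}: mean {mean(runs):.3f} s, stdev {stdev(runs):.3f} s")

# Ratio of average times: values above 1 mean MKL was faster.
speedup = mean(openblas) / mean(mkl)
print(f"MKL speedup: {speedup:.3f}x")
```

The speedup is about 1.09, that is, MKL was roughly 8% faster in this test.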
Calculation with C/C++
I tried the same calculation in C++ in a previous post. See the following.
This time, I compiled the same code with the -O3 option.
$ c++ -DMKL_ILP64 -m64 -I${MKLROOT}/include ${MKLROOT}/lib/libmkl_intel_ilp64.a ${MKLROOT}/lib/libmkl_intel_thread.a ${MKLROOT}/lib/libmkl_core.a -liomp5 -lpthread -lm -ldl -O3 -std=c++11 -o dgemm_mkl dgemm_mkl.cpp
$ ./dgemm_mkl
cblas_dgemm with 10000 * 10000 matrix.
Realtime: 7.507 sec.
CPU time: 44.7587sec.
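The two times reported by the program can be related to each other. The sketch below assumes that the CPU time is summed over all threads, so that dividing it by the real time gives the average number of busy cores, and it again assumes about 2*N^3 operations for the matrix product.

```python
N = 10000
wall_seconds = 7.507    # "Realtime" reported by dgemm_mkl
cpu_seconds = 44.7587   # "CPU time" reported by dgemm_mkl

# Assumption: CPU time is accumulated across all threads.
busy_cores = cpu_seconds / wall_seconds
# Assumption: a dense N x N matrix product costs about 2 * N^3 operations.
gflops = 2 * N**3 / wall_seconds / 1e9

print(f"average busy cores: {busy_cores:.2f}")
print(f"throughput: {gflops:.1f} GFLOP/s")
```

Under these assumptions, close to six cores were busy on average during the multiplication.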
Summary
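I installed numpy and scipy built against Intel MKL. With Intel MKL they are slightly faster than with OpenBLAS (about 11.5 seconds versus 12.5 seconds on average for this test), which shows that OpenBLAS is excellent. Compiling numpy and scipy against Intel MKL takes a long time, so the small gain is hardly worth the effort.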