Ching-Chuan Chen's Blogger

Statistics, Machine Learning and Programming


Compile Julia on Windows with Intel MKL

This post records how to compile Julia on Windows.

First, install cygwin64. After downloading the installer exe, run it with the following command to install the required packages:

setup-x86_64.exe -s http://ftp.yzu.edu.tw/cygwin/ -q -P cmake,gcc-g++,git,make,patch,curl,m4,python,p7zip,mingw64-x86_64-gcc-g++,mingw64-x86_64-gcc-fortran
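
Once the installer finishes, it may be worth confirming from a Cygwin terminal that the tools are actually on the PATH. The following is only a minimal sketch and not part of the original steps: the helper name `missing_tools` is hypothetical, and the executable names (for example `7z` for the p7zip package, and the `x86_64-w64-mingw32-` prefixed compilers) are assumptions about what the packages install.

```shell
# Hypothetical helper: print every command from the argument list that is
# not found on PATH. Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed executable names for the packages installed above.
missing_tools cmake git make patch curl m4 python 7z \
    x86_64-w64-mingw32-gcc x86_64-w64-mingw32-g++ x86_64-w64-mingw32-gfortran
```

If the call prints any names, re-running the installer with the matching package is probably the simplest fix.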

Open the Cygwin terminal:

# clone source code
git clone https://github.com/JuliaLang/julia.git
mv julia julia-master
cd julia-master
# write the build settings to Make.user
tee Make.user << EOF
XC_HOST = x86_64-w64-mingw32
USE_INTEL_MKL = 1
MKLROOT = /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/compilers_and_libraries_2017/windows/mkl
EOF

# copy the MKL and OpenMP runtime DLLs into the build tree
MKLROOT=/cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/compilers_and_libraries_2017/windows/mkl
mkdir -p usr/bin
cp "$MKLROOT/../redist/intel64_win/compiler/libiomp5md.dll" usr/bin/
# keep the glob outside the quotes so the shell expands mkl*.dll
cp "$MKLROOT"/../redist/intel64_win/mkl/mkl*.dll usr/bin/
cp usr/bin/mkl_rt.dll usr/bin/libmkl_rt.dll

# start the build (adjust -j to your core count)
make -j 28
# build the extras
make win-extras
# create binary distribution
make binary-dist
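
Because a missing DLL only shows up late in a long build, a quick check before running `make` can save time. This is a rough sketch under stated assumptions rather than part of the original procedure: the helper name `have_files` is hypothetical, and the script assumes it is run from the `julia-master` directory after the copy steps above.

```shell
# Hypothetical helper: succeed only if every named file exists in the
# given directory; report the first missing file on stderr.
have_files() {
    dir=$1
    shift
    for f in "$@"; do
        [ -f "$dir/$f" ] || { printf 'missing: %s\n' "$dir/$f" >&2; return 1; }
    done
}

# Check the DLLs copied by the steps above before starting a long build.
if have_files usr/bin libiomp5md.dll mkl_rt.dll libmkl_rt.dll; then
    echo "MKL runtime DLLs are in place"
fi
```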

A file named something like julia-53db863142-win64.exe will then appear in your Cygwin folder. Double-click it to run the installer.

Below is a quick comparison of OpenBLAS and MKL (Ryzen ThreadRipper 1950X @ 3.7GHz, 128 GB RAM).

The code is as follows:

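using LinearAlgebra
# problem size; the results below use 500, 1500 and 2500
m = n = p = 500
A = Array(ones(m, n))
# upper bidiagonal matrix taken from the diagonal and superdiagonal of ones(n, p)
B = Bidiagonal(ones(n, p), :U)
@elapsed inv(B) * A
@elapsed B * A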

Results:

1. m = n = p = 500
1 - openblas
The elapsed time of inv(B) * A: 0.18717380649999998
The elapsed time of B * A: 0.1825800896
1 - mkl
The elapsed time of inv(B) * A: 0.19427489685000002
The elapsed time of B * A: 0.16237521085

2. m = n = p = 1500
2 - openblas
The elapsed time of inv(B) * A: 1.6726551640500003
The elapsed time of B * A: 1.23033070435

2 - mkl
The elapsed time of inv(B) * A: 1.6964857918000007
The elapsed time of B * A: 1.1328988554000001

3. m = n = p = 2500
3 - openblas
The elapsed time of inv(B) * A: 4.77680297335
The elapsed time of B * A: 3.5322971791500004

3 - mkl
The elapsed time of inv(B) * A: 4.916846529599999
The elapsed time of B * A: 3.2682480265000002

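Basically, each wins some and loses some: MKL is slightly faster on B * A and slightly slower on inv(B) * A at all three sizes… although I suspect Intel MKL is simply de-optimized for AMD processors…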
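
To make the comparison easier to read, the timings can be turned into MKL-to-OpenBLAS ratios. The snippet below is a small sketch and not part of the original benchmark: the helper name `ratio` is hypothetical, and the numbers are copied from the m = n = p = 500 case above. A ratio below 1 means MKL was faster for that operation.

```shell
# Hypothetical helper: ratio of the MKL time to the OpenBLAS time.
# Values below 1 mean MKL was faster for that case.
ratio() {
    awk -v mkl="$1" -v openblas="$2" 'BEGIN { printf "%.3f\n", mkl / openblas }'
}

# Timings copied from the results above (m = n = p = 500).
echo "inv(B) * A: $(ratio 0.19427489685000002 0.18717380649999998)"
echo "B * A:      $(ratio 0.16237521085 0.1825800896)"
```

The same two calls can be repeated with the 1500 and 2500 timings to see whether the gap changes with problem size.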