UnixBench on the CentOS 7 Kernel for Raspberry Pi 2 / 3

I tested the CentOS kernel for Raspberry Pi that includes the Meltdown / Spectre fixes.

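The Cyclictest results from rt-tests alone were not very interesting, so I also ran UnixBench.
In short, the fixes again appear to have little impact: most results are within about 2% of the older kernel, and the largest drop is System Call Overhead at 96% on the standard kernel.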

■Setting up the test environment
・Preparing to build UnixBench
First, install the required packages.

# yum groupinstall "Development Tools"
# yum install perl-Time-HiRes

・Downloading and building UnixBench

# cd /usr/local/src/
# wget -N https://github.com/kdlucas/byte-unixbench/tarball/61663da4fd51a0a5d514ce670884b3ed0ef81608
# tar zxvf ./61663da4fd51a0a5d514ce670884b3ed0ef81608
# chown -R admin: /usr/local/src/kdlucas-byte-unixbench-61663da/
# cd /usr/local/src/kdlucas-byte-unixbench-61663da/UnixBench/
# cp -piav Makefile Makefile.org
# sed -i -e 's/^OPTON += -march=native -mtune=native/OPTON += -mcpu=cortex-a15 -mfpu=neon-vfpv4/g' Makefile
# su - admin -c "cd /usr/local/src/kdlucas-byte-unixbench-61663da/UnixBench/ ; make all"
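
One thing to watch in the steps above: sed exits successfully even when the pattern matches nothing, so a Makefile whose OPTON line differs (for example, by being indented) would be built with its original flags and no warning. A minimal sketch of a guarded version of the same substitution follows. The function name patch_optflags is hypothetical; the flags are the ones used above.

```shell
# patch_optflags MAKEFILE
# Replace the native tuning flags with the Cortex-A15 / NEON flags used above,
# and fail loudly if the Makefile did not contain the expected line.
patch_optflags() {
    makefile=$1
    old='OPTON += -march=native -mtune=native'
    new='OPTON += -mcpu=cortex-a15 -mfpu=neon-vfpv4'

    # Same substitution as the sed command above (GNU sed, in-place edit).
    sed -i -e "s/^${old}/${new}/g" "$makefile" || return 1

    # sed returns 0 even when no line matched, so confirm the result explicitly.
    if grep -q -F -e "$new" "$makefile"; then
        echo "patched: $makefile"
    else
        echo "pattern not found, Makefile unchanged: $makefile" >&2
        return 1
    fi
}
```

Calling `patch_optflags Makefile` in place of the plain sed line would stop the procedure before `make all` if the substitution did not take effect.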

・Running the test

# cd /usr/local/src/kdlucas-byte-unixbench-61663da/UnixBench/
# ./Run
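
The comparisons in this post use the raw result value of each test. A minimal sketch of how those values could be pulled out of a saved run log is shown here. It assumes the log contains UnixBench's "System Benchmarks Index Values" table with BASELINE, RESULT and INDEX as the last three columns. The function name extract_results and the log file name are hypothetical.

```shell
# extract_results LOGFILE
# Print "test name<TAB>result" for every row of each
# "System Benchmarks Index Values" table found in the log.
extract_results() {
    awk '
        /^System Benchmarks Index Values/ { in_table = 1; next }
        /^System Benchmarks Index Score/  { in_table = 0; next }
        in_table && NF >= 4 && $(NF-1) ~ /^[0-9.]+$/ {
            result = $(NF-1)              # RESULT is the second-to-last column
            name = $1                     # the test name is everything before BASELINE
            for (i = 2; i <= NF - 3; i++) name = name " " $i
            print name "\t" result
        }
    ' "$1"
}
```

For example, after `./Run 2>&1 | tee run.log`, running `extract_results run.log` would list one line per test for the single-copy run, followed by the multi-copy run.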

■Comparison before and after the Meltdown / Spectre fixes
●4.9.68 vs 4.9.76 (Single)

| Test case | Kernel-4.9.68 (Single) | Kernel-4.9.76 (Single) | Diff |
|---|---|---|---|
| Dhrystone 2 using register variables | 4444482.7 | 4446369.6 | 100% |
| Double-Precision Whetstone | 925.4 | 927.6 | 100% |
| Execl Throughput | 859.1 | 859.7 | 100% |
| File Copy 1024 bufsize 2000 maxblocks | 122908.2 | 122338.7 | 100% |
| File Copy 256 bufsize 500 maxblocks | 34494 | 34371.4 | 100% |
| File Copy 4096 bufsize 8000 maxblocks | 327961.5 | 327680.2 | 100% |
| Pipe Throughput | 206251.7 | 206728.9 | 100% |
| Pipe-based Context Switching | 49852.1 | 49842.8 | 100% |
| Process Creation | 2185 | 2200.8 | 101% |
| Shell Scripts (1 concurrent) | 1413.7 | 1412.3 | 100% |
| Shell Scripts (8 concurrent) | 386.1 | 384.4 | 100% |
| System Call Overhead | 359893.9 | 345096 | 96% |
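
The Diff column appears to be the second kernel's result as a percentage of the first kernel's result, rounded to a whole number (the values in the tables are consistent with that). A minimal sketch of the calculation, with pct_diff as a hypothetical helper name:

```shell
# pct_diff OLD NEW
# Print NEW as a percentage of OLD, rounded to a whole number.
pct_diff() {
    awk -v old="$1" -v new="$2" 'BEGIN {
        if (old == 0) { print "n/a"; exit }   # avoid dividing by zero
        printf "%.0f%%\n", new / old * 100
    }'
}

pct_diff 359893.9 345096    # System Call Overhead, 4.9.68 vs 4.9.76 (Single): prints 96%
pct_diff 2185 2200.8        # Process Creation, same comparison: prints 101%
```

Values above 100% mean the second kernel scored higher, because every UnixBench result here is a "higher is better" rate.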

●4.9.68 vs 4.9.76 (Multi)

| Test case | Kernel-4.9.68 (Multi) | Kernel-4.9.76 (Multi) | Diff |
|---|---|---|---|
| Dhrystone 2 using register variables | 17747491.1 | 17748069.1 | 100% |
| Double-Precision Whetstone | 3705 | 3704.6 | 100% |
| Execl Throughput | 2059.2 | 2070.2 | 101% |
| File Copy 1024 bufsize 2000 maxblocks | 234876.6 | 234681 | 100% |
| File Copy 256 bufsize 500 maxblocks | 63297.1 | 63788 | 101% |
| File Copy 4096 bufsize 8000 maxblocks | 594215.6 | 596785.1 | 100% |
| Pipe Throughput | 818667.2 | 825419.7 | 101% |
| Pipe-based Context Switching | 188870.9 | 188815.8 | 100% |
| Process Creation | 4639.2 | 4714 | 102% |
| Shell Scripts (1 concurrent) | 2976 | 2986.7 | 100% |
| Shell Scripts (8 concurrent) | 434.9 | 433.7 | 100% |
| System Call Overhead | 1394643.9 | 1344020.8 | 96% |

●4.9.68.rt60 vs 4.9.76.rt61 (Single)

| Test case | Kernel-4.9.68.rt60 (Single) | Kernel-4.9.76.rt61 (Single) | Diff |
|---|---|---|---|
| Dhrystone 2 using register variables | 4427651.6 | 4423413.4 | 100% |
| Double-Precision Whetstone | 923.2 | 921.8 | 100% |
| Execl Throughput | 663.7 | 667.5 | 101% |
| File Copy 1024 bufsize 2000 maxblocks | 90040.7 | 88057 | 98% |
| File Copy 256 bufsize 500 maxblocks | 24912.3 | 24404.4 | 98% |
| File Copy 4096 bufsize 8000 maxblocks | 254216 | 250157.8 | 98% |
| Pipe Throughput | 158194.7 | 159817.2 | 101% |
| Pipe-based Context Switching | 42960 | 42581.9 | 99% |
| Process Creation | 1570 | 1597.7 | 102% |
| Shell Scripts (1 concurrent) | 1213 | 1216.6 | 100% |
| Shell Scripts (8 concurrent) | 243.1 | 244.2 | 100% |
| System Call Overhead | 278139.2 | 285093.3 | 103% |

●4.9.68.rt60 vs 4.9.76.rt61 (Multi)

| Test case | Kernel-4.9.68.rt60 (Multi) | Kernel-4.9.76.rt61 (Multi) | Diff |
|---|---|---|---|
| Dhrystone 2 using register variables | 17657614.4 | 17656872.1 | 100% |
| Double-Precision Whetstone | 3688 | 3689.6 | 100% |
| Execl Throughput | 1145.7 | 1146.8 | 100% |
| File Copy 1024 bufsize 2000 maxblocks | 59137.5 | 58784.8 | 99% |
| File Copy 256 bufsize 500 maxblocks | 14873 | 14858 | 100% |
| File Copy 4096 bufsize 8000 maxblocks | 215322.4 | 213830.9 | 99% |
| Pipe Throughput | 629543.9 | 634778.9 | 101% |
| Pipe-based Context Switching | 173011.4 | 173843.6 | 100% |
| Process Creation | 3207.7 | 3210.3 | 100% |
| Shell Scripts (1 concurrent) | 2053.9 | 2057.6 | 100% |
| Shell Scripts (8 concurrent) | 192.4 | 193 | 100% |
| System Call Overhead | 1088837.3 | 1115143.2 | 102% |

■[Bonus] Comparisons of limited use
●Standard kernel vs RT kernel
・4.9.68 vs 4.9.68.rt60 (Single)

| Test case | Kernel-4.9.68 (Single) | Kernel-4.9.68.rt60 (Single) | Diff |
|---|---|---|---|
| Dhrystone 2 using register variables | 4444482.7 | 4427651.6 | 100% |
| Double-Precision Whetstone | 925.4 | 923.2 | 100% |
| Execl Throughput | 859.1 | 663.7 | 77% |
| File Copy 1024 bufsize 2000 maxblocks | 122908.2 | 90040.7 | 73% |
| File Copy 256 bufsize 500 maxblocks | 34494 | 24912.3 | 72% |
| File Copy 4096 bufsize 8000 maxblocks | 327961.5 | 254216 | 78% |
| Pipe Throughput | 206251.7 | 158194.7 | 77% |
| Pipe-based Context Switching | 49852.1 | 42960 | 86% |
| Process Creation | 2185 | 1570 | 72% |
| Shell Scripts (1 concurrent) | 1413.7 | 1213 | 86% |
| Shell Scripts (8 concurrent) | 386.1 | 243.1 | 63% |
| System Call Overhead | 359893.9 | 278139.2 | 77% |

・4.9.68 vs 4.9.68.rt60 (Multi)

| Test case | Kernel-4.9.68 (Multi) | Kernel-4.9.68.rt60 (Multi) | Diff |
|---|---|---|---|
| Dhrystone 2 using register variables | 17747491.1 | 17657614.4 | 99% |
| Double-Precision Whetstone | 3705 | 3688 | 100% |
| Execl Throughput | 2059.2 | 1145.7 | 56% |
| File Copy 1024 bufsize 2000 maxblocks | 234876.6 | 59137.5 | 25% |
| File Copy 256 bufsize 500 maxblocks | 63297.1 | 14873 | 23% |
| File Copy 4096 bufsize 8000 maxblocks | 594215.6 | 215322.4 | 36% |
| Pipe Throughput | 818667.2 | 629543.9 | 77% |
| Pipe-based Context Switching | 188870.9 | 173011.4 | 92% |
| Process Creation | 4639.2 | 3207.7 | 69% |
| Shell Scripts (1 concurrent) | 2976 | 2053.9 | 69% |
| Shell Scripts (8 concurrent) | 434.9 | 192.4 | 44% |
| System Call Overhead | 1394643.9 | 1088837.3 | 78% |

・4.9.76 vs 4.9.76.rt61 (Single)

| Test case | Kernel-4.9.76 (Single) | Kernel-4.9.76.rt61 (Single) | Diff |
|---|---|---|---|
| Dhrystone 2 using register variables | 4446369.6 | 4423413.4 | 99% |
| Double-Precision Whetstone | 927.6 | 921.8 | 99% |
| Execl Throughput | 859.7 | 667.5 | 78% |
| File Copy 1024 bufsize 2000 maxblocks | 122338.7 | 88057 | 72% |
| File Copy 256 bufsize 500 maxblocks | 34371.4 | 24404.4 | 71% |
| File Copy 4096 bufsize 8000 maxblocks | 327680.2 | 250157.8 | 76% |
| Pipe Throughput | 206728.9 | 159817.2 | 77% |
| Pipe-based Context Switching | 49842.8 | 42581.9 | 85% |
| Process Creation | 2200.8 | 1597.7 | 73% |
| Shell Scripts (1 concurrent) | 1412.3 | 1216.6 | 86% |
| Shell Scripts (8 concurrent) | 384.4 | 244.2 | 64% |
| System Call Overhead | 345096 | 285093.3 | 83% |

・4.9.76 vs 4.9.76.rt61 (Multi)

| Test case | Kernel-4.9.76 (Multi) | Kernel-4.9.76.rt61 (Multi) | Diff |
|---|---|---|---|
| Dhrystone 2 using register variables | 17748069.1 | 17656872.1 | 99% |
| Double-Precision Whetstone | 3704.6 | 3689.6 | 100% |
| Execl Throughput | 2070.2 | 1146.8 | 55% |
| File Copy 1024 bufsize 2000 maxblocks | 234681 | 58784.8 | 25% |
| File Copy 256 bufsize 500 maxblocks | 63788 | 14858 | 23% |
| File Copy 4096 bufsize 8000 maxblocks | 596785.1 | 213830.9 | 36% |
| Pipe Throughput | 825419.7 | 634778.9 | 77% |
| Pipe-based Context Switching | 188815.8 | 173843.6 | 92% |
| Process Creation | 4714 | 3210.3 | 68% |
| Shell Scripts (1 concurrent) | 2986.7 | 2057.6 | 69% |
| Shell Scripts (8 concurrent) | 433.7 | 193 | 45% |
| System Call Overhead | 1344020.8 | 1115143.2 | 83% |

●1 core vs 4 cores
・4.9.76 (Single) vs 4.9.76 (Multi)

| Test case | Kernel-4.9.76 (Single) | Kernel-4.9.76 (Multi) | Diff |
|---|---|---|---|
| Dhrystone 2 using register variables | 4446369.6 | 17748069.1 | 399% |
| Double-Precision Whetstone | 927.6 | 3704.6 | 399% |
| Execl Throughput | 859.7 | 2070.2 | 241% |
| File Copy 1024 bufsize 2000 maxblocks | 122338.7 | 234681 | 192% |
| File Copy 256 bufsize 500 maxblocks | 34371.4 | 63788 | 186% |
| File Copy 4096 bufsize 8000 maxblocks | 327680.2 | 596785.1 | 182% |
| Pipe Throughput | 206728.9 | 825419.7 | 399% |
| Pipe-based Context Switching | 49842.8 | 188815.8 | 379% |
| Process Creation | 2200.8 | 4714 | 214% |
| Shell Scripts (1 concurrent) | 1412.3 | 2986.7 | 211% |
| Shell Scripts (8 concurrent) | 384.4 | 433.7 | 113% |
| System Call Overhead | 345096 | 1344020.8 | 389% |

・4.9.76.rt61 (Single) vs 4.9.76.rt61 (Multi)

| Test case | Kernel-4.9.76.rt61 (Single) | Kernel-4.9.76.rt61 (Multi) | Diff |
|---|---|---|---|
| Dhrystone 2 using register variables | 4423413.4 | 17656872.1 | 399% |
| Double-Precision Whetstone | 921.8 | 3689.6 | 400% |
| Execl Throughput | 667.5 | 1146.8 | 172% |
| File Copy 1024 bufsize 2000 maxblocks | 88057 | 58784.8 | 67% |
| File Copy 256 bufsize 500 maxblocks | 24404.4 | 14858 | 61% |
| File Copy 4096 bufsize 8000 maxblocks | 250157.8 | 213830.9 | 85% |
| Pipe Throughput | 159817.2 | 634778.9 | 397% |
| Pipe-based Context Switching | 42581.9 | 173843.6 | 408% |
| Process Creation | 1597.7 | 3210.3 | 201% |
| Shell Scripts (1 concurrent) | 1216.6 | 2057.6 | 169% |
| Shell Scripts (8 concurrent) | 244.2 | 193 | 79% |
| System Call Overhead | 285093.3 | 1115143.2 | 391% |
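
The single vs multi figures can also be read as a speedup and a parallel efficiency, which makes the rows that lose throughput with more cores stand out. A minimal sketch, assuming the Multi run used 4 copies on the 4 cores; the helper name scaling is hypothetical.

```shell
# scaling SINGLE MULTI [CORES]
# Print the speedup (MULTI / SINGLE) and the parallel efficiency
# (speedup / CORES) for one benchmark row. CORES defaults to 4.
scaling() {
    awk -v s="$1" -v m="$2" -v n="${3:-4}" 'BEGIN {
        speedup = m / s
        printf "speedup %.2fx, efficiency %.0f%%\n", speedup, speedup / n * 100
    }'
}

scaling 4423413.4 17656872.1   # Dhrystone 2, 4.9.76.rt61: speedup 3.99x, efficiency 100%
scaling 88057 58784.8          # File Copy 1024, 4.9.76.rt61: speedup 0.67x, efficiency 17%
```

A speedup below 1.00x, as in the File Copy row, means the four-copy run moved less data in total than the single-copy run.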

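■Thoughts
・As expected, the RT kernel behaves quite differently from the standard kernel.
・Even the way performance fails to scale with more cores differs. On the RT kernel, the File Copy tests and Shell Scripts (8 concurrent) actually score lower with 4 cores than with 1.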
