[LU-16042] Sanity 155e,155f,155g,155h fail due to no cache size get on Arm64 Created: 25/Jul/22  Updated: 19/Oct/23  Resolved: 08/Aug/22

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0, Lustre 2.15.4

Type: Bug Priority: Minor
Reporter: Kevin Zhao Assignee: Kevin Zhao
Resolution: Fixed Votes: 0
Labels: arm-server

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

== sanity test 155e: Verify big file correctness: read cache:on write_cache:on ========================================================== 14:04:03 (1658239443)
CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-*.*.writethrough_cache_enable);
[[ -z "lustre-OST0000" ]] && param= ||
param=$(grep lustre-OST0000 <<< "$params");
[[ -z $param ]] && param="$params";
while read s; do echo ost1 $s;
done <<< "$param"
CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-*.*.writethrough_cache_enable);
[[ -z "lustre-OST0001" ]] && param= ||
param=$(grep lustre-OST0001 <<< "$params");
[[ -z $param ]] && param="$params";
while read s; do echo ost2 $s;
done <<< "$param"
CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-*.*.writethrough_cache_enable);
[[ -z "lustre-OST0002" ]] && param= ||
param=$(grep lustre-OST0002 <<< "$params");
[[ -z $param ]] && param="$params";
while read s; do echo ost3 $s;
done <<< "$param"
CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-*.*.writethrough_cache_enable);
[[ -z "lustre-OST0003" ]] && param= ||
param=$(grep lustre-OST0003 <<< "$params");
[[ -z $param ]] && param="$params";
while read s; do echo ost4 $s;
done <<< "$param"
CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-*.*.writethrough_cache_enable);
[[ -z "lustre-OST0004" ]] && param= ||
param=$(grep lustre-OST0004 <<< "$params");
[[ -z $param ]] && param="$params";
while read s; do echo ost5 $s;
done <<< "$param"
CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-*.*.writethrough_cache_enable);
[[ -z "lustre-OST0005" ]] && param= ||
param=$(grep lustre-OST0005 <<< "$params");
[[ -z $param ]] && param="$params";
while read s; do echo ost6 $s;
done <<< "$param"
CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-*.*.writethrough_cache_enable);
[[ -z "lustre-OST0006" ]] && param= ||
param=$(grep lustre-OST0006 <<< "$params");
[[ -z $param ]] && param="$params";
while read s; do echo ost7 $s;
done <<< "$param"
Waiting for MDT destroys to complete
OST kbytes available: 9083632 9083628 9083628 9083628 9083624 9083624 9083624
Min free space: OST 4: 9083624
Max free space: OST 0: 9083632
CMD: lustre-xwcomty2-05 awk '/cache/ {sum+=$4} END {print sum}' /proc/cpuinfo
OSS cache size: KB
Large file size: 0 KB
dd: invalid number: '0'
sanity test_155e: @@@@@@ FAIL: dd of=/tmp/f155e.sanity bs=0 count=1k failed
Trace dump:
= /root/test/build-06244k/lustre/lustre/tests/test-framework.sh:6406:error()
= /root/test/build-06244k/lustre/lustre/tests/sanity.sh:15542:test_155_big_load()
= /root/test/build-06244k/lustre/lustre/tests/sanity.sh:15627:test_155e()
= /root/test/build-06244k/lustre/lustre/tests/test-framework.sh:6723:run_one()
= /root/test/build-06244k/lustre/lustre/tests/test-framework.sh:6770:run_one_logged()
= /root/test/build-06244k/lustre/lustre/tests/test-framework.sh:6596:run_test()
= /root/test/build-06244k/lustre/lustre/tests/sanity.sh:15631:main()
Dumping lctl log to /tmp/test_logs/2022-07-19/135519/sanity.test_155e.*.1658239459.log
FAIL 155e (19s)



 Comments   
Comment by Kevin Zhao [ 25/Jul/22 ]

CMD: lustre-xwcomty2-05 awk '/cache/ {sum+=$4} END {print sum}' /proc/cpuinfo
OSS cache size: KB

It looks like this command cannot get the CPU cache size on an Arm64 VM.

 

Some test results:

On X86_64 VM:
root@iZj6ce071s2zz3reioxn93Z:~# awk '/cache/ {sum+=$4} END {print sum}' /proc/cpuinfo
33792

On Arm64:

[root@lustre-xwcomty2-01 ~]#  awk '/cache/ {sum+=$4} END {print sum}' /proc/cpuinfo

[root@lustre-xwcomty2-01 ~]#

 

The traditional "cache size" field is not listed in /proc/cpuinfo on Arm64:
processor    : 7
BogoMIPS    : 400.00
Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid asimdrdm
CPU implementer    : 0x43
CPU architecture: 8
CPU variant    : 0x1
CPU part    : 0x0af
CPU revision    : 1
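
To make the difference concrete, here is a minimal sketch that runs the same awk expression from the test log against two inline samples instead of the real /proc/cpuinfo. The x86_64 sample lines are assumed values chosen only to reproduce the 33792 total shown above (the ticket does not show the per-CPU entries), and `sum_cache_kb` is a hypothetical helper name. The Arm64 sample uses fields copied from the listing above.

```shell
#!/bin/sh
# Sum the 4th field of every line containing "cache", exactly as the awk
# command in the test log does. Reads cpuinfo-style text from stdin.
sum_cache_kb() {
	awk '/cache/ {sum+=$4} END {print sum}'
}

# Illustrative x86_64-style entries (assumed values, two CPUs).
x86_sample='processor : 0
cache size : 16896 KB
processor : 1
cache size : 16896 KB'

# Arm64 entries taken from the ticket: no line mentions "cache".
arm_sample='processor : 7
BogoMIPS : 400.00
CPU implementer : 0x43'

x86_kb=$(printf '%s\n' "$x86_sample" | sum_cache_kb)
arm_kb=$(printf '%s\n' "$arm_sample" | sum_cache_kb)

echo "x86_64 sum: '${x86_kb}'"
echo "Arm64 sum:  '${arm_kb}'"
```

With no matching line, awk never assigns `sum`, so it prints an empty line. That empty value is what appears as "OSS cache size: KB" in the log and ends up as the zero block size passed to dd.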

Comment by Kevin Zhao [ 25/Jul/22 ]

On Arm64 bare metal, the CPU cache size is available on some platforms, such as Marvell ThunderX2 and HiSilicon Kunpeng 920.

Marvell THX2 :~$ lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              224
On-line CPU(s) list: 0-223
Thread(s) per core:  4
Core(s) per socket:  28
Socket(s):           2
NUMA node(s):        2
Vendor ID:           Cavium
Model:               1
Model name:          ThunderX2 99xx
Stepping:            0x1
BogoMIPS:            400.00
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            32768K
NUMA node0 CPU(s):   0-111
NUMA node1 CPU(s):   112-223
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid asimdrdm

But the same info is not available on some older chips such as Cavium ThunderX1 and HiSilicon Kunpeng 916.

Kunpeng 916~$ lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              64
On-line CPU(s) list: 0-63
Thread(s) per core:  1
Core(s) per socket:  16
Socket(s):           4
NUMA node(s):        4
Vendor ID:           ARM
Model:               2
Model name:          Cortex-A72
Stepping:            r0p2
BogoMIPS:            100.00
NUMA node0 CPU(s):   0-15
NUMA node1 CPU(s):   16-31
NUMA node2 CPU(s):   32-47
NUMA node3 CPU(s):   48-63
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid

In a VM, lscpu reports no cache information at all.

Arm64 VM # lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  1
Core(s) per cluster: 8
Socket(s):           8
Cluster(s):          1
NUMA node(s):        1
Vendor ID:           Cavium
BIOS Vendor ID:      QEMU
Model:               1
Model name:          ThunderX2 99xx
BIOS Model name:     virt-5.2
Stepping:            0x1
BogoMIPS:            400.00
NUMA node0 CPU(s):   0-7
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid asimdrdm
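
The three lscpu listings above can be compared mechanically. The sketch below is only an illustration: it assumes the `L<n> cache: <size>K` line format shown in the ThunderX2 output (other lscpu versions may print sizes differently), and `lscpu_cache_kb` is a hypothetical helper, not part of the Lustre test framework. It adds up the listed per-level values and prints nothing when no cache line is present.

```shell
#!/bin/sh
# Sum every "L<n> cache: <size>K" line of lscpu-style text read from stdin.
# Prints nothing when no such line exists.
lscpu_cache_kb() {
	awk -F: '/^L[0-9][a-z]* cache/ {
		gsub(/[ K]/, "", $2)	# strip padding and the K suffix
		sum += $2
		found = 1
	}
	END { if (found) print sum }'
}

# Cache lines copied from the ThunderX2 listing in this ticket.
thx2_sample='Model name:          ThunderX2 99xx
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            32768K'

# Kunpeng 916 listing: no cache lines at all.
kp916_sample='Model name:          Cortex-A72
BogoMIPS:            100.00'

thx2_kb=$(printf '%s\n' "$thx2_sample" | lscpu_cache_kb)
kp916_kb=$(printf '%s\n' "$kp916_sample" | lscpu_cache_kb)

echo "ThunderX2 listed total: ${thx2_kb} KB"
echo "Kunpeng 916 listed total: '${kp916_kb}'"
```

The result is a plain sum of the values lscpu lists, not a per-core total. The point is only that any probe of this kind can come back empty, so the caller has to handle that case.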

 

So it is better to fall back to a pre-set value on Arm64. I will work on a fix.
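
A minimal sketch of such a fallback is shown below. It is not the merged patch: `cache_kb_or_default` and `DEFAULT_CACHE_KB` are hypothetical names, and the value 256 is an assumption for illustration, since this ticket does not state which pre-set value the fix uses.

```shell
#!/bin/sh
# Hypothetical default used only for this sketch; the ticket does not
# state which pre-set value the actual fix uses.
DEFAULT_CACHE_KB=256

# Print the summed cache size from a cpuinfo-style file, or the default
# when the file yields no usable positive number.
cache_kb_or_default() {
	kb=$(awk '/cache/ {sum+=$4} END {print sum}' "$1" 2>/dev/null)
	case "$kb" in
		''|*[!0-9]*) kb=$DEFAULT_CACHE_KB ;;	# empty or non-numeric
	esac
	[ "$kb" -gt 0 ] || kb=$DEFAULT_CACHE_KB	# guard against a zero sum
	echo "$kb"
}

cache_kb=$(cache_kb_or_default /proc/cpuinfo)
echo "OSS cache size: ${cache_kb} KB"
```

Checking for an empty or zero result, rather than checking the architecture name, also covers the older bare-metal chips and VMs described above, where the value is missing for different reasons.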

Comment by Gerrit Updater [ 25/Jul/22 ]

"Kevin Zhao <kevin.zhao@linaro.org>" uploaded a new patch: https://review.whamcloud.com/48030
Subject: LU-16042 tests: can not get cache size on Arm64
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: c0bf6843a963cc22428d75bac4c8e0c4f76a058b

Comment by Gerrit Updater [ 08/Aug/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/48030/
Subject: LU-16042 tests: can not get cache size on Arm64
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: f276f1cb0859e8718448e69bd99ee305f5e62d42

Comment by Peter Jones [ 08/Aug/22 ]

Landed for 2.16

Comment by Gerrit Updater [ 31/May/23 ]

"xinliang <xinliang.liu@linaro.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51179
Subject: LU-16042 tests: can not get cache size on Arm64
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: 089f6c4f51d1819c5ddcc93b18d5eef823fa6317

Comment by Gerrit Updater [ 19/Oct/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51179/
Subject: LU-16042 tests: can not get cache size on Arm64
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: a9e47f3bf9255047890d9aa886954432fe058ef5

Generated at Sat Feb 10 03:23:28 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.