
Sanity 155e,155f,155g,155h fail due to no cache size get on Arm64

Details


    Description

      == sanity test 155e: Verify big file correctness: read cache:on write_cache:on ========================================================== 14:04:03 (1658239443)
      CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-..writethrough_cache_enable);
      [[ -z "lustre-OST0000" ]] && param= ||
      param=$(grep lustre-OST0000 <<< "$params");
      [[ -z $param ]] && param="$params";
      while read s; do echo ost1 $s;
      done <<< "$param"
      CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-..writethrough_cache_enable);
      [[ -z "lustre-OST0001" ]] && param= ||
      param=$(grep lustre-OST0001 <<< "$params");
      [[ -z $param ]] && param="$params";
      while read s; do echo ost2 $s;
      done <<< "$param"
      CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-..writethrough_cache_enable);
      [[ -z "lustre-OST0002" ]] && param= ||
      param=$(grep lustre-OST0002 <<< "$params");
      [[ -z $param ]] && param="$params";
      while read s; do echo ost3 $s;
      done <<< "$param"
      CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-..writethrough_cache_enable);
      [[ -z "lustre-OST0003" ]] && param= ||
      param=$(grep lustre-OST0003 <<< "$params");
      [[ -z $param ]] && param="$params";
      while read s; do echo ost4 $s;
      done <<< "$param"
      CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-..writethrough_cache_enable);
      [[ -z "lustre-OST0004" ]] && param= ||
      param=$(grep lustre-OST0004 <<< "$params");
      [[ -z $param ]] && param="$params";
      while read s; do echo ost5 $s;
      done <<< "$param"
      CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-..writethrough_cache_enable);
      [[ -z "lustre-OST0005" ]] && param= ||
      param=$(grep lustre-OST0005 <<< "$params");
      [[ -z $param ]] && param="$params";
      while read s; do echo ost6 $s;
      done <<< "$param"
      CMD: lustre-xwcomty2-05 params=$(/root/test/build-06244k/lustre/lustre/utils/lctl get_param osd-..writethrough_cache_enable);
      [[ -z "lustre-OST0006" ]] && param= ||
      param=$(grep lustre-OST0006 <<< "$params");
      [[ -z $param ]] && param="$params";
      while read s; do echo ost7 $s;
      done <<< "$param"
      Waiting for MDT destroys to complete
      OST kbytes available: 9083632 9083628 9083628 9083628 9083624 9083624 9083624
      Min free space: OST 4: 9083624
      Max free space: OST 0: 9083632
      CMD: lustre-xwcomty2-05 awk '/cache/ {sum+=$4} END {print sum}' /proc/cpuinfo
      OSS cache size: KB
      Large file size: 0 KB
      dd: invalid number: '0'
      sanity test_155e: @@@@@@ FAIL: dd of=/tmp/f155e.sanity bs=0 count=1k failed
      Trace dump:
      = /root/test/build-06244k/lustre/lustre/tests/test-framework.sh:6406:error()
      = /root/test/build-06244k/lustre/lustre/tests/sanity.sh:15542:test_155_big_load()
      = /root/test/build-06244k/lustre/lustre/tests/sanity.sh:15627:test_155e()
      = /root/test/build-06244k/lustre/lustre/tests/test-framework.sh:6723:run_one()
      = /root/test/build-06244k/lustre/lustre/tests/test-framework.sh:6770:run_one_logged()
      = /root/test/build-06244k/lustre/lustre/tests/test-framework.sh:6596:run_test()
      = /root/test/build-06244k/lustre/lustre/tests/sanity.sh:15631:main()
      Dumping lctl log to /tmp/test_logs/2022-07-19/135519/sanity.test_155e.*.1658239459.log
      FAIL 155e (19s)
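
      The step from the empty "OSS cache size" to the 0 KB file size can be reproduced outside the test. A minimal sketch, assuming the test derives the file size by multiplying the cache size by a fixed factor; the helper name large_file_kb and the factor 2 are illustrative and not taken from the log:

      ```shell
      # Hypothetical helper: derive the "large file" size in KB from a cache
      # size in KB. The factor 2 is an assumption made for illustration.
      large_file_kb() {
          local cache_size
          cache_size=$1
          # Inside $(( )), an empty variable evaluates to 0, so an empty
          # cache size silently becomes a 0 KB file size.
          echo $((cache_size * 2))
      }

      echo "Large file size: $(large_file_kb "") KB"      # empty input, as on the Arm64 VM
      echo "Large file size: $(large_file_kb 33792) KB"   # a non-empty cache size
      ```

      With a 0 KB result, dd is invoked with bs=0 and rejects it, which matches the "dd: invalid number: '0'" line in the log.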

        Activity

          [LU-16042] Sanity 155e,155f,155g,155h fail due to no cache size get on Arm64

          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51179/
          Subject: LU-16042 tests: can not get cache size on Arm64
          Project: fs/lustre-release
          Branch: b2_15
          Current Patch Set:
          Commit: a9e47f3bf9255047890d9aa886954432fe058ef5


          "xinliang <xinliang.liu@linaro.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51179
          Subject: LU-16042 tests: can not get cache size on Arm64
          Project: fs/lustre-release
          Branch: b2_15
          Current Patch Set: 1
          Commit: 089f6c4f51d1819c5ddcc93b18d5eef823fa6317

          pjones Peter Jones added a comment -

          Landed for 2.16


          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/48030/
          Subject: LU-16042 tests: can not get cache size on Arm64
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: f276f1cb0859e8718448e69bd99ee305f5e62d42


          "Kevin Zhao <kevin.zhao@linaro.org>" uploaded a new patch: https://review.whamcloud.com/48030
          Subject: LU-16042 tests: can not get cache size on Arm64
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: c0bf6843a963cc22428d75bac4c8e0c4f76a058b

          kevin.zhao Kevin Zhao added a comment -

          On Arm64 bare metal, the CPU cache size is reported on some platforms, such as Marvell ThunderX2 and HiSilicon Kunpeng 920.

          Marvell THX2 :~$ lscpu
          Architecture:        aarch64
          Byte Order:          Little Endian
          CPU(s):              224
          On-line CPU(s) list: 0-223
          Thread(s) per core:  4
          Core(s) per socket:  28
          Socket(s):           2
          NUMA node(s):        2
          Vendor ID:           Cavium
          Model:               1
          Model name:          ThunderX2 99xx
          Stepping:            0x1
          BogoMIPS:            400.00
          L1d cache:           32K
          L1i cache:           32K
          L2 cache:            256K
          L3 cache:            32768K
          NUMA node0 CPU(s):   0-111
          NUMA node1 CPU(s):   112-223
          Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid asimdrdm

          But the same information is not available on some older chips, such as Cavium Thx1 and HiSilicon Kunpeng 916.

          Kunpeng 916~$ lscpu
          Architecture:        aarch64
          Byte Order:          Little Endian
          CPU(s):              64
          On-line CPU(s) list: 0-63
          Thread(s) per core:  1
          Core(s) per socket:  16
          Socket(s):           4
          NUMA node(s):        4
          Vendor ID:           ARM
          Model:               2
          Model name:          Cortex-A72
          Stepping:            r0p2
          BogoMIPS:            100.00
          NUMA node0 CPU(s):   0-15
          NUMA node1 CPU(s):   16-31
          NUMA node2 CPU(s):   32-47
          NUMA node3 CPU(s):   48-63
          Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid

          In a VM, no cache information is reported at all.

          Arm64 VM # lscpu
          Architecture:        aarch64
          Byte Order:          Little Endian
          CPU(s):              8
          On-line CPU(s) list: 0-7
          Thread(s) per core:  1
          Core(s) per cluster: 8
          Socket(s):           8
          Cluster(s):          1
          NUMA node(s):        1
          Vendor ID:           Cavium
          BIOS Vendor ID:      QEMU
          Model:               1
          Model name:          ThunderX2 99xx
          BIOS Model name:     virt-5.2
          Stepping:            0x1
          BogoMIPS:            400.00
          NUMA node0 CPU(s):   0-7
          Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid asimdrdm

           

          So it is better to fall back to a preset value on Arm64. I will work on a fix.
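
          The fallback idea could look roughly like the sketch below. It is not the landed patch: the function name cache_size_kb, the variable DEFAULT_CACHE_KB, and the 256 KB default are all assumptions chosen for illustration.

          ```shell
          # Hypothetical default, in KB, used when no cache size can be detected.
          DEFAULT_CACHE_KB=256

          # Print the summed cache size in KB from a cpuinfo-style file
          # (default: /proc/cpuinfo), or the preset value if none is found.
          cache_size_kb() {
              local cpuinfo size
              cpuinfo=${1:-/proc/cpuinfo}
              size=$(awk '/cache/ {sum+=$4} END {print sum}' "$cpuinfo")
              # awk prints an empty string when no line matched /cache/;
              # a sum of 0 is treated as "not detected" as well.
              if [ -z "$size" ] || [ "$size" -eq 0 ]; then
                  size=$DEFAULT_CACHE_KB
              fi
              echo "$size"
          }
          ```

          Calling cache_size_kb with no argument reads /proc/cpuinfo; passing a saved cpuinfo file makes it easy to check both the detected and the fallback path.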

          kevin.zhao Kevin Zhao added a comment -

          CMD: lustre-xwcomty2-05 awk '/cache/ {sum+=$4} END {print sum}' /proc/cpuinfo
          OSS cache size: KB

          It looks like this command cannot get the CPU cache size on an Arm64 VM.

           

          Some test results:

          On X86_64 VM:
          root@iZj6ce071s2zz3reioxn93Z:~# awk '/cache/ {sum+=$4} END {print sum}' /proc/cpuinfo
          33792

          On Arm64:

          [root@lustre-xwcomty2-01 ~]#  awk '/cache/ {sum+=$4} END {print sum}' /proc/cpuinfo

          [root@lustre-xwcomty2-01 ~]#

           

          The usual "cache size" line is not listed in /proc/cpuinfo on Arm64:
          processor    : 7
          BogoMIPS    : 400.00
          Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics cpuid asimdrdm
          CPU implementer    : 0x43
          CPU architecture: 8
          CPU variant    : 0x1
          CPU part    : 0x0af
          CPU revision    : 1
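
          The difference can be reproduced on saved samples without access to either machine. A small sketch, assuming a helper named sum_cache_kb (hypothetical) that runs the same awk program on a given file; the x86_64 sample is made up apart from its "cache size" value:

          ```shell
          # Hypothetical helper: run the same awk program as the test, but on a
          # file passed as an argument so it can be tried on cpuinfo samples.
          sum_cache_kb() {
              awk '/cache/ {sum+=$4} END {print sum}' "$1"
          }

          x86_sample=$(mktemp)
          arm_sample=$(mktemp)

          # x86_64 style (illustrative sample): the 4th field of the
          # "cache size" line is the size in KB.
          cat > "$x86_sample" <<'EOF'
          processor       : 0
          model name      : Example x86_64 CPU
          cache size      : 33792 KB
          EOF

          # Arm64 style, shortened from the listing above: no line mentions "cache".
          cat > "$arm_sample" <<'EOF'
          processor       : 7
          BogoMIPS        : 400.00
          CPU implementer : 0x43
          CPU part        : 0x0af
          EOF

          echo "x86_64: '$(sum_cache_kb "$x86_sample")'"   # prints x86_64: '33792'
          echo "Arm64:  '$(sum_cache_kb "$arm_sample")'"   # prints Arm64:  ''

          rm -f "$x86_sample" "$arm_sample"
          ```

          Because no line matches /cache/ in the Arm64 sample, the awk variable sum is never assigned and prints as an empty string rather than 0.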


          People

            kevin.zhao Kevin Zhao
            kevin.zhao Kevin Zhao