LU-11803

sanity test 255c fails with 'Ladvise test 12, bad lock count, returned 1, actual 0'

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.14.0
    • Affects Version/s: Lustre 2.12.0
    • Environment: Ubuntu 18.04 clients
    • Severity: 3

    Description

      sanity test_255c fails with 'Ladvise test 12, bad lock count, returned 1, actual 0'. It looks like this is only impacting Ubuntu 18.04 client testing.

      Looking at the client test_log from https://testing.whamcloud.com/test_sets/d1b23f66-fdd4-11e8-a97c-52540065bddc , one problem is that lctl get_param cannot find the lock_unused_count parameter:

      == sanity test 255c: suite of ladvise lockahead tests ================================================ 19:46:58 (1544471218)
      CMD: trevis-19vm3 /usr/sbin/lctl get_param -n version 2>/dev/null ||
      				/usr/sbin/lctl lustre_build_version 2>/dev/null ||
      				/usr/sbin/lctl --version 2>/dev/null | cut -d' ' -f2
      Starting test test10 at 1544471219
      Finishing test test10 at 1544471219
      CMD: trevis-19vm3 /usr/sbin/lctl get_param -n ost.OSS.ost.stats
      Starting test test11 at 1544471219
      Finishing test test11 at 1544471219
      CMD: trevis-19vm3 /usr/sbin/lctl get_param -n ost.OSS.ost.stats
      error: get_param: param_path 'ldlm/namespaces/lustre-OST0000*osc-f*/lock_unused_count': No such file or directory
      Starting test test12 at 1544471220
      Finishing test test12 at 1544471220
      error: get_param: param_path 'ldlm/namespaces/lustre-OST0000*osc-f*/lock_unused_count': No such file or directory
      

      There are several examples of this failure, all on Ubuntu, at:
      https://testing.whamcloud.com/test_sets/bd048afa-f713-11e8-b67f-52540065bddc
      https://testing.whamcloud.com/test_sets/b223ccee-f778-11e8-b67f-52540065bddc
      https://testing.whamcloud.com/test_sets/50f5970e-fa16-11e8-8a18-52540065bddc
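To see at a glance which parameters a run failed to resolve, the get_param errors can be pulled out of a saved client test log. The following is only a minimal sketch and not part of the Lustre test framework: the function name `missing_params` is hypothetical, and it relies on nothing beyond the error-line format shown in the log above.

```shell
#!/bin/bash
# Hypothetical helper: list each parameter path that lctl get_param
# reported as missing, prefixed by how many times it was reported.
# Reads a test log on stdin; assumes only the error format quoted above.
missing_params() {
    sed -n "s/.*error: get_param: param_path '\([^']*\)': No such file or directory.*/\1/p" |
        sort | uniq -c | sed 's/^ *//'
}
```

It could be run as `missing_params < client_test_log.txt`, where the file name is a placeholder for a downloaded log.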

          Activity

            gerrit Gerrit Updater added a comment:

            James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/33894
            Subject: LU-11803 tests: don't assume obd device name
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: ce075a8726fe822803f24d1b336301ad87e9ce12
            simmonsja James A Simmons added a comment (edited):

            I see the issue. It's a test problem in this case. The people who wrote these tests assumed the obd device name has the format $FSNAME-OST0000-osc-ffff*, but that is not always the case. In newer kernels the ffff is filtered out. This also affects older kernels running on non-x86 platforms. For example, on Power8 the device name is lustre-OST0000-osc-c000001e4de63000. I will push a patch to fix this.
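The mismatch described in this comment can be reproduced without a Lustre client, since it comes down to shell-style glob matching on the device name. The sketch below is illustrative only: `name_matches` is a hypothetical helper, the x86_64-style name is made up for the example, and the looser pattern `osc-[-0-9a-f]*` is one possible alternative, not necessarily what the patch uses. The Power8 name is the one quoted in the comment.

```shell
#!/bin/bash
# Hypothetical helper: succeed if device name $2 matches glob pattern $1.
name_matches() {
    local pattern=$1 name=$2
    # An unquoted right-hand side of == inside [[ ]] is treated as a glob.
    [[ $name == $pattern ]]
}

old_pattern='lustre-OST0000-osc-ffff*'         # what the tests assumed
loose_pattern='lustre-OST0000-osc-[-0-9a-f]*'  # assumption: any hex-like suffix

for name in lustre-OST0000-osc-ffff8800b9c3e800 \
            lustre-OST0000-osc-c000001e4de63000; do
    for pattern in "$old_pattern" "$loose_pattern"; do
        if name_matches "$pattern" "$name"; then
            echo "match:    $pattern  $name"
        else
            echo "no match: $pattern  $name"
        fi
    done
done
```

The Power8 name fails only against the `ffff*` pattern, which is the situation where get_param reports "No such file or directory".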

            simmonsja James A Simmons added a comment:

            It's the uuid issue. Newer kernels no longer allow you to expose a pointer address to userland. We need to create a new UUID method that does not use the pointer of an internal kernel object.
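When triaging a log, it can help to check what the suffix after "-osc-" in a device name actually looks like, since the sanity-flr output quoted in this ticket shows a space-padded "(ptrval)" where an address would otherwise appear. The helper below is a rough sketch under that assumption; `osc_suffix_kind` is a hypothetical name and not part of any Lustre tool.

```shell
#!/bin/bash
# Hypothetical helper: classify the instance suffix of an OSC device name
# (the part after the last "-osc-").
osc_suffix_kind() {
    local suffix=${1##*-osc-}
    # Strip the space padding seen in front of "(ptrval)" in the logs.
    suffix=${suffix#"${suffix%%[! ]*}"}
    case $suffix in
        "(ptrval)")        echo placeholder ;;  # no usable address was printed
        "" | *[!0-9a-f]*)  echo unknown ;;      # empty or not lowercase hex
        *)                 echo hex ;;          # looks like an address
    esac
}
```

For example, `osc_suffix_kind 'lustre-OST0000-osc-c000001e4de63000'` classifies the Power8 name as hex, while a name ending in the padded "(ptrval)" is reported as a placeholder.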

            simmonsja James A Simmons added a comment:

            Let me look.

            adilger Andreas Dilger added a comment:

            simmonsja, this looks like fallout from moving parameters from /proc to /sys? The Ubuntu kernel is 4.15, which is definitely the newest one we have running, so it may be more likely to be affected by these changes.

            jamesanunez James Nunez (Inactive) added a comment:

            We are also seeing sanity-flr tests 0g and 31 fail in a similar way:

            == sanity-flr test 0g: lfs mirror create flags support =============================================== 00:45:20 (1544489120)
            osc.lustre-OST0000-osc-        (ptrval).stats=clear
            osc.lustre-OST0001-osc-        (ptrval).stats=clear
            osc.lustre-OST0002-osc-        (ptrval).stats=clear
            osc.lustre-OST0003-osc-        (ptrval).stats=clear
            osc.lustre-OST0004-osc-        (ptrval).stats=clear
            osc.lustre-OST0005-osc-        (ptrval).stats=clear
            osc.lustre-OST0006-osc-        (ptrval).stats=clear
            error: get_param: param_path 'osc/lustre-OST0000-osc-ffff*/stats': No such file or directory
             sanity-flr test_0g: @@@@@@ FAIL: read was not provided by OST1

            and

            == sanity-flr test 31: make sure glimpse request can be retried ====================================== 00:46:46 (1544489206)
            fail_loc=0x1A00
            CMD: trevis-19vm3 grep -c /mnt/lustre-ost1' ' /proc/mounts || true
            Stopping /mnt/lustre-ost1 (opts:) on trevis-19vm3
            CMD: trevis-19vm3 umount -d /mnt/lustre-ost1
            CMD: trevis-19vm3 lsmod | grep lnet > /dev/null &&
            lctl dl | grep ' ST ' || true
            CMD: trevis-19vm1.trevis.whamcloud.com lctl get_param -n at_min
            can't get osc.lustre-OST0000-osc-ffff*.ost_server_uuid in 40 secs
            error: get_param: param_path 'ldlm/namespaces/lustre-OST0000-osc-ffff*/lock_count': No such file or directory
            error: get_param: param_path 'ldlm/namespaces/lustre-OST0001-osc-ffff*/lock_count': No such file or directory
             sanity-flr test_31: @@@@@@ FAIL: OST 1: no glimpse request was sent

            People

              simmonsja James A Simmons
              jamesanunez James Nunez (Inactive)