Lustre / LU-10940

sanity test_802: set mdt quota type failed

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Affects Version/s: Lustre 2.12.0
    • Fix Version/s: Lustre 2.12.0
    • Labels: None
    • Severity: 3

    Description

      This issue was created by maloo for sarah_lw <wei3.liu@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/d3a87ae4-471b-11e8-95c0-52540065bddc

      test_802 failed with the following error:

      set mdt quota type failed
      

This failure seems to have started showing on 2.11.50.51, b3738, on April 9, 2018

      test log

      CMD: trevis-4vm4 lctl get_param -n timeout
      Using TIMEOUT=20
      CMD: trevis-4vm4 lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
      CMD: trevis-4vm1.trevis.hpdd.intel.com lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
      error: get_param: param_path 'mdc/*/connect_flags': No such file or directory
      jobstats not supported by server
      enable quota as required
      CMD: trevis-4vm4 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.quota_slave.enabled
      CMD: trevis-4vm3 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-OST0000.quota_slave.enabled
      [HOST:trevis-4vm1.trevis.hpdd.intel.com] [old_mdt_qtype:ug] [old_ost_qtype:ug] [new_qtype:ug3]
      CMD: trevis-4vm4 /usr/sbin/lctl conf_param lustre.quota.mdt=ug3
      trevis-4vm4: error: conf_param: Read-only file system
       sanity test_802: @@@@@@ FAIL: set mdt quota type failed 
       Trace dump:
      
      

      MDS dmesg

      [ 7400.522030] Lustre: DEBUG MARKER: SKIP: sanity test_801c
      [ 7400.803247] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanity test 802: simulate readonly device ========================================================= 23:28:21 \(1524439701\)
      [ 7400.993579] Lustre: DEBUG MARKER: == sanity test 802: simulate readonly device ========================================================= 23:28:21 (1524439701)
      [ 7401.164206] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null ||
       /usr/sbin/lctl lustre_build_version 2>/dev/null ||
       /usr/sbin/lctl --version 2>/dev/null | cut -d' ' -f2
      [ 7401.912727] Lustre: DEBUG MARKER: lctl set_param -n os[cd]*.*MDT*.force_sync=1
      [ 7405.691947] Lustre: DEBUG MARKER: lctl set_param -n os[cd]*.*MDT*.force_sync=1
      [ 7407.328094] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
      [ 7407.639629] Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
      [ 7412.735986] Lustre: lustre-MDT0000: Not available for connect from 10.9.4.31@tcp (stopping)
      [ 7412.738161] Lustre: Skipped 3 previous similar messages
      [ 7417.729349] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.9.4.31@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
      [ 7417.733806] LustreError: Skipped 15 previous similar messages
      [ 7419.993052] Lustre: 7085:0:(client.c:2099:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1524439714/real 1524439714] req@ffff880061347900 x1598483208664528/t0(0) o251->MGC10.9.4.32@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1524439720 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
      [ 7420.027093] Lustre: server umount lustre-MDT0000 complete
      [ 7420.199653] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
      lctl dl | grep ' ST ' || true
      [ 7420.520293] Lustre: DEBUG MARKER: modprobe dm-flakey;
       dmsetup targets | grep -q flakey
      [ 7432.809742] Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts);
      mpts=$(mount | grep -c /mnt/lustre-mds1' ');
      if [ $running -ne $mpts ]; then
       echo $(hostname) env are INSANE!;
       exit 1;
      fi
      [ 7433.175430] Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts);
      mpts=$(mount | grep -c /mnt/lustre-mds1' ');
      if [ $running -ne $mpts ]; then
       echo $(hostname) env are INSANE!;
       exit 1;
      fi
      [ 7434.341465] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1
      [ 7434.650345] Lustre: DEBUG MARKER: modprobe dm-flakey;
       dmsetup targets | grep -q flakey
      [ 7434.951416] Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
      [ 7435.251720] Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1
      [ 7435.550827] Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey
      [ 7435.846295] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey
      [ 7436.141605] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o rdonly_dev /dev/mapper/mds1_flakey /mnt/lustre-mds1
      [ 7436.313923] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
      [ 7436.317692] Turning device dm-3 (0xfc00003) read-only
      [ 7436.319501] Lustre: lustre-MDT0000-osd: set dev_rdonly on this device
      [ 7436.395144] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180
      [ 7436.566211] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
      [ 7436.878021] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
      [ 7437.467981] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-4vm4.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7437.468228] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-4vm4.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7437.655837] Lustre: DEBUG MARKER: trevis-4vm4.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7437.655862] Lustre: DEBUG MARKER: trevis-4vm4.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7437.827912] Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
      [ 7438.131105] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]\{3}[0-9]\{4}'
      [ 7438.434939] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]\{3}[0-9]\{4}'
      [ 7438.762419] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null
      [ 7439.087312] Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
      [ 7442.556447] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7442.755098] Lustre: DEBUG MARKER: trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7447.098744] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7447.288273] Lustre: DEBUG MARKER: trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7451.644719] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7451.830105] Lustre: DEBUG MARKER: trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7456.196503] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7456.388152] Lustre: DEBUG MARKER: trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7460.757075] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7460.957519] Lustre: DEBUG MARKER: trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7465.345849] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7465.538568] Lustre: DEBUG MARKER: trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7469.992735] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7470.193712] Lustre: DEBUG MARKER: trevis-4vm3.trevis.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [ 7471.395975] Lustre: DEBUG MARKER: lctl get_param -n timeout
      [ 7471.796323] Lustre: DEBUG MARKER: /usr/sbin/lctl mark Using TIMEOUT=20
      [ 7471.989513] Lustre: DEBUG MARKER: Using TIMEOUT=20
      [ 7472.150495] Lustre: DEBUG MARKER: lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
      [ 7472.495736] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.quota_slave.enabled
      [ 7473.143673] Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.mdt=ug3
      [ 7473.296428] LustreError: 9740:0:(osd_handler.c:1689:osd_trans_create()) lustre-MDT0000: someone try to start transaction under readonly mode, should be disabled.
      [ 7473.302031] CPU: 0 PID: 9740 Comm: llog_process_th Tainted: G OE ------------ 3.10.0-693.21.1.el7_lustre.x86_64 #1
      [ 7473.307279] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
      [ 7473.309907] Call Trace:
      [ 7473.312287] [<ffffffff816ae7c8>] dump_stack+0x19/0x1b
      [ 7473.314849] [<ffffffffc0d2ea9c>] osd_trans_create+0x5cc/0x610 [osd_ldiskfs]
      [ 7473.317607] [<ffffffffc0877c71>] llog_write+0x91/0x3d0 [obdclass]
      [ 7473.320207] [<ffffffffc0db012a>] mgs_modify_handler+0x36a/0x440 [mgs]
      [ 7473.322805] [<ffffffffc08759c9>] llog_process_thread+0x839/0x1560 [obdclass]
      [ 7473.325492] [<ffffffffc089fc19>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
      [ 7473.328172] [<ffffffffc08770ff>] llog_process_thread_daemonize+0x9f/0xe0 [obdclass]
      [ 7473.330884] [<ffffffffc0877060>] ? llog_backup+0x500/0x500 [obdclass]
      [ 7473.333483] [<ffffffff810b4031>] kthread+0xd1/0xe0
      [ 7473.335897] [<ffffffff810b3f60>] ? insert_kthread_work+0x40/0x40
      [ 7473.338389] [<ffffffff816c0577>] ret_from_fork+0x77/0xb0
      [ 7473.340792] [<ffffffff810b3f60>] ? insert_kthread_work+0x40/0x40
      [ 7473.343238] LustreError: 9739:0:(mgs_llog.c:954:mgs_modify()) MGS: modify lustre/quota.mdt failed: rc = -30
      [ 7473.345910] LustreError: 9739:0:(mgs_llog.c:1940:mgs_write_log_direct_all()) MGS: Can't modify llog lustre-MDT0000: rc = -30
      [ 7473.348694] CPU: 1 PID: 9739 Comm: lctl Tainted: G OE ------------ 3.10.0-693.21.1.el7_lustre.x86_64 #1
      [ 7473.351406] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
      [ 7473.353790] Call Trace:
      [ 7473.355798] [<ffffffff816ae7c8>] dump_stack+0x19/0x1b
      [ 7473.358052] [<ffffffffc0d2ea9c>] osd_trans_create+0x5cc/0x610 [osd_ldiskfs]
      [ 7473.360387] [<ffffffffc0877c71>] llog_write+0x91/0x3d0 [obdclass]
      [ 7473.362665] [<ffffffffc0dad80e>] record_marker+0x15e/0x2b0 [mgs]
      [ 7473.364843] [<ffffffffc0dae9f2>] mgs_write_log_direct+0xe2/0x2d0 [mgs]
      [ 7473.367092] [<ffffffffc0dbd6cb>] mgs_write_log_direct_all+0x38b/0x640 [mgs]
      [ 7473.369279] [<ffffffffc0dd06ea>] mgs_write_log_quota+0x2d7/0x31d [mgs]
      [ 7473.371448] [<ffffffffc0dbe4bb>] mgs_write_log_param+0x5ab/0x1e30 [mgs]
      [ 7473.373529] [<ffffffffc0dbfd87>] ? mgs_find_fsdb+0x47/0x70 [mgs]
      [ 7473.375591] [<ffffffffc0dc2677>] ? mgs_find_or_make_fsdb+0x67/0x1c0 [mgs]
      [ 7473.377614] [<ffffffffc0dc6d6c>] mgs_set_param+0xabc/0xd40 [mgs]
      [ 7473.379604] [<ffffffffc0dac23a>] mgs_iocontrol+0xd2a/0xde0 [mgs]
      [ 7473.381507] [<ffffffffc088aae3>] class_handle_ioctl+0x18d3/0x1de0 [obdclass]
      [ 7473.383517] [<ffffffff811b1f16>] ? do_read_fault.isra.44+0xe6/0x130
      [ 7473.385376] [<ffffffff812b72be>] ? security_capable+0x1e/0x20
      [ 7473.387227] [<ffffffffc086f802>] obd_class_ioctl+0xd2/0x170 [obdclass]
      [ 7473.389074] [<ffffffff81219e90>] do_vfs_ioctl+0x350/0x560
      [ 7473.390832] [<ffffffff816bb521>] ? __do_page_fault+0x171/0x450
      [ 7473.392525] [<ffffffff8121a141>] SyS_ioctl+0xa1/0xc0
      [ 7473.394199] [<ffffffff816c0655>] ? system_call_after_swapgs+0xa2/0x146
      [ 7473.395942] [<ffffffff816c0715>] system_call_fastpath+0x1c/0x21
      [ 7473.397679] [<ffffffff816c0661>] ? system_call_after_swapgs+0xae/0x146
      [ 7473.399422] LustreError: 9739:0:(mgs_llog.c:1948:mgs_write_log_direct_all()) MGS: writing log lustre-MDT0000: rc = -30
      [ 7473.401661] CPU: 0 PID: 9741 Comm: llog_process_th Tainted: G OE ------------ 3.10.0-693.21.1.el7_lustre.x86_64 #1
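      The rc = -30 in the MGS errors above is -EROFS, i.e. the same "Read-only file system" error that conf_param reported in the test log. A quick way to confirm the mapping on a test node (assuming perl is available, as it is on the test VMs):

      ```shell
      # Map the kernel return code 30 from the MGS log to its error string.
      # Assigning to perl's $! performs the strerror(3) lookup.
      perl -e '$! = 30; print "$!\n"'
      # prints: Read-only file system
      ```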
      
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_802 - set mdt quota type failed

Attachments

Issue Links

Activity


            Patch landed to master

jamesanunez James Nunez (Inactive) added a comment

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32900/
            Subject: LU-10940 tests: skip sanity test 802 when quota enabled
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: ddb3d0b61ded0b9507baa25de08a2d51af17b284

gerrit Gerrit Updater added a comment

            James Nunez (jnunez@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/32900
            Subject: LU-10940 tests: skip sanity test 802 when quota enabled
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: acf177a62c538cf8517a697fe57a20f340de5538

gerrit Gerrit Updater added a comment

            Since enabling quota requires changes to the filesystem on the targets, it doesn't make sense to enable it on a read-only filesystem.

adilger Andreas Dilger added a comment

sanity test 802 fails when we run the ‘full’ test group but passes for all other testing (review-ldiskfs, review-dne, etc.). One difference between ‘full’ and all other testing is that, for ‘full’ testing, we enable quotas for all test suites, while for all other testing we don’t.

For sanity test 802, we stop all servers and then mount them read-only. When we bring the server up in read-only mode, we try to reset quotas in setup_quota(), and the following call to conf_param on the MGS fails:

            2119         do_facet mgs $LCTL conf_param $FSNAME.quota.mdt=$QUOTA_TYPE ||
            2120                 error "set mdt quota type failed"
            2121         do_facet mgs $LCTL conf_param $FSNAME.quota.ost=$QUOTA_TYPE ||
            2122                 error "set ost quota type failed"
            

One question is whether Lustre is behaving properly in rejecting conf_param calls while a server is read-only; or, more specifically, should we be able to set quotas by calling conf_param on a read-only server?
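            The straightforward workaround is to bail out of the quota setup before the read-only remount. A minimal runnable sketch (hypothetical, not the landed patch; QUOTA_TYPE and skip_env are stand-ins for the test-framework.sh variable and helper, stubbed here so the sketch runs standalone):

            ```shell
            #!/bin/sh
            # Stubs standing in for test-framework.sh (assumptions for this sketch).
            QUOTA_TYPE=${QUOTA_TYPE:-ug3}
            skip_env() { echo "SKIP: $*"; }

            test_802_guard() {
                # conf_param must modify the MGS configuration llog, which fails
                # with -EROFS (-30) once the backend is mounted read-only, so
                # skip before stopping and remounting the servers.
                if [ -n "$QUOTA_TYPE" ]; then
                    skip_env "quota is not functional on a read-only device"
                    return 0
                fi
                echo "proceeding with read-only remount"
            }

            test_802_guard
            ```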

jamesanunez James Nunez (Inactive) added a comment

This failure seems to have started showing on 2.11.50.51, b3738, on April 9, 2018

            It probably makes sense to see which quota-related patches landed just before then.

adilger Andreas Dilger added a comment
            pjones Peter Jones added a comment -

            Hongchao

            Could you please investigate?

            Thanks

            Peter


People

  Assignee: hongchao.zhang Hongchao Zhang
  Reporter: maloo Maloo
  Votes: 0
  Watchers: 6

              Dates

                Created:
                Updated:
                Resolved: