Lustre / LU-12777

conf-sanity test 103 fails with ‘set mdt quota type failed’


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Affects Versions: Lustre 2.13.0, Lustre 2.12.3
    • Fix Version: Lustre 2.13.0
    • Environment: ZFS with RHEL 8 clients
    • 3
    • 9223372036854775807

    Description

      conf-sanity test_103 fails with ‘set mdt quota type failed’ for ZFS with RHEL 8.0 clients only; the failures started on 14-JULY-2019.

      Looking at the client test_log, we see

      mount mylustre on /mnt/lustre.....
      Starting client: trevis-17vm1.trevis.whamcloud.com:  -o user_xattr,flock trevis-17vm4@tcp:/mylustre /mnt/lustre
      CMD: trevis-17vm1.trevis.whamcloud.com mkdir -p /mnt/lustre
      CMD: trevis-17vm1.trevis.whamcloud.com mount -t lustre -o user_xattr,flock trevis-17vm4@tcp:/mylustre /mnt/lustre
      mount.lustre: mount trevis-17vm4@tcp:/mylustre at /mnt/lustre failed: Input/output error
      Is the MGS running?
      Starting client trevis-17vm1.trevis.whamcloud.com,trevis-17vm2:  -o user_xattr,flock trevis-17vm4@tcp:/mylustre /mnt/lustre
      CMD: trevis-17vm1.trevis.whamcloud.com,trevis-17vm2 mkdir -p /mnt/lustre
      CMD: trevis-17vm1.trevis.whamcloud.com,trevis-17vm2 
      running=\$(mount | grep -c /mnt/lustre' ');
      rc=0;
      if [ \$running -eq 0 ] ; then
      	mkdir -p /mnt/lustre;
      	mount -t lustre  -o user_xattr,flock trevis-17vm4@tcp:/mylustre /mnt/lustre;
      	rc=\$?;
      fi;
      exit \$rc
      trevis-17vm2: mount.lustre: mount trevis-17vm4@tcp:/mylustre at /mnt/lustre failed: Input/output error
      trevis-17vm2: Is the MGS running?
      pdsh@trevis-17vm1: trevis-17vm2: ssh exited with exit code 5
      trevis-17vm1: mount.lustre: mount trevis-17vm4@tcp:/mylustre at /mnt/lustre failed: Input/output error
      trevis-17vm1: Is the MGS running?
      pdsh@trevis-17vm1: trevis-17vm1: ssh exited with exit code 5
      CMD: trevis-17vm4 lctl get_param -n timeout
      Using TIMEOUT=20
      CMD: trevis-17vm4 lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
      CMD: trevis-17vm1.trevis.whamcloud.com lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
      error: set_param: param_path 'osc/*/idle_timeout': No such file or directory
      error: get_param: param_path 'mdc/*/connect_flags': No such file or directory
      jobstats not supported by server
      enable quota as required
      CMD: trevis-17vm4 /usr/sbin/lctl get_param -n osd-zfs.lustre-MDT0000.quota_slave.enabled
      trevis-17vm4: error: get_param: param_path 'osd-zfs/lustre-MDT0000/quota_slave/enabled': No such file or directory
      pdsh@trevis-17vm1: trevis-17vm4: ssh exited with exit code 2
      CMD: trevis-17vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0000.quota_slave.enabled
      trevis-17vm3: error: get_param: param_path 'osd-zfs/lustre-OST0000/quota_slave/enabled': No such file or directory
      pdsh@trevis-17vm1: trevis-17vm3: ssh exited with exit code 2
      [HOST:trevis-17vm1.trevis.whamcloud.com] [old_mdt_qtype:] [old_ost_qtype:] [new_qtype:ug3]
      CMD: trevis-17vm4 /usr/sbin/lctl conf_param mylustre.quota.mdt=ug3
      trevis-17vm4: No device found for name MGS: Invalid argument
      trevis-17vm4: This command must be run on the MGS.
      trevis-17vm4: error: conf_param: No such device
      pdsh@trevis-17vm1: trevis-17vm4: ssh exited with exit code 19
       conf-sanity test_103: @@@@@@ FAIL: set mdt quota type failed 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:6115:error()
        = /usr/lib64/lustre/tests/test-framework.sh:2209:setup_quota()
        = /usr/lib64/lustre/tests/test-framework.sh:5188:init_param_vars()
        = /usr/lib64/lustre/tests/test-framework.sh:4920:setupall()
        = /usr/lib64/lustre/tests/conf-sanity.sh:7512:test_103()
      

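      The final failure in the log comes from `lctl conf_param` being run on a node where no MGS device exists ("No device found for name MGS"). For reference, a minimal sketch of the quota setup that `setup_quota()` attempts, assuming the standard Lustre quota tunables (the filesystem name `mylustre` is taken from the logs above; the test uses the value `ug3`, while `ug` is the common user/group setting):

      ```shell
      # Must be run on the node where the MGS is mounted (trevis-17vm4 here).
      # "No device found for name MGS" above means the MGS device was never
      # set up, so this step cannot succeed regardless of the arguments.
      lctl conf_param mylustre.quota.mdt=ug   # enable user/group quota on MDTs
      lctl conf_param mylustre.quota.ost=ug   # enable user/group quota on OSTs
      # Verify that the quota slaves picked up the setting:
      lctl get_param osd-*.*.quota_slave.enabled
      ```

      This requires a live MGS, which is exactly what is missing in the failed runs; the earlier client mount failures ("Is the MGS running?") point at the same root cause.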
      In the console log for client 1 (vm1), we see a couple of errors:

      [63663.392258] LustreError: 29000:0:(mgc_request.c:250:do_config_log_add()) MGC10.9.4.201@tcp: failed processing log, type 1: rc = -5
      [63670.752274] LustreError: 29006:0:(mgc_request.c:598:do_requeue()) failed processing log: -5
      [63674.656290] LustreError: 15c-8: MGC10.9.4.201@tcp: Confguration from log mylustre-client failed from MGS -5. Communication error between node & MGS, a bad configuration, or other errors. See syslog for more info
      [63674.659682] Lustre: Unmounted mylustre-client
      [63674.660992] LustreError: 29000:0:(obd_mount.c:1669:lustre_fill_super()) Unable to mount  (-5)
      [63674.942608] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre
      [63675.766879] Lustre: DEBUG MARKER: 
      [63675.766879] running=$(mount | grep -c /mnt/lustre' ');
      [63675.766879] rc=0;
      [63675.766879] if [ $running -eq 0 ] ; then
      [63675.766879] 	mkdir -p /mnt/lustre;
      [63675.766879] 	mount -t lustre  -o user_xattr,flock trevis-17vm4@tcp:/mylustre /mnt/lustre;
      [63675.766879] 	rc=$?;
      [63675.766879] fi;
      [63675.766879] exit $rc
      [63682.336278] LustreError: 29245:0:(mgc_request.c:250:do_config_log_add()) MGC10.9.4.201@tcp: failed processing log, type 1: rc = -5
      [63692.256268] LustreError: 29250:0:(mgc_request.c:598:do_requeue()) failed processing log: -5
      [63693.600289] LustreError: 15c-8: MGC10.9.4.201@tcp: Confguration from log mylustre-client failed from MGS -5. Communication error between node & MGS, a bad configuration, or other errors. See syslog for more info
      [63693.603727] Lustre: Unmounted mylustre-client
      [63693.604849] LustreError: 29245:0:(obd_mount.c:1669:lustre_fill_super()) Unable to mount  (-5)
      [63696.068194] Lustre: DEBUG MARKER: /usr/sbin/lctl mark Using TIMEOUT=20
      [63696.538630] Lustre: DEBUG MARKER: Using TIMEOUT=20
      [63697.213487] Lustre: DEBUG MARKER: lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
      [63699.496142] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  conf-sanity test_103: @@@@@@ FAIL: set mdt quota type failed 
      
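      All of these errors carry rc = -5, which is -EIO and matches the "Input/output error" reported by mount.lustre on the clients. A quick, generic way to confirm the errno mapping (not part of the test framework, just a lookup helper):

      ```shell
      # Map errno 5 to its symbolic name and message via Python's errno module.
      python3 -c 'import errno, os; print(errno.errorcode[5], "-", os.strerror(5))'
      # prints: EIO - Input/output error
      ```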

      Logs for conf-sanity test_103 failures are at:
      https://testing.whamcloud.com/test_sets/513fcf0c-d708-11e9-98c8-52540065bddc
      https://testing.whamcloud.com/test_sets/b4dd0e36-a705-11e9-861b-52540065bddc
      https://testing.whamcloud.com/test_sets/7683ebf6-d2b5-11e9-9fc9-52540065bddc
      https://testing.whamcloud.com/test_sets/61f82b1a-d59f-11e9-90ad-52540065bddc

            People

              wshilong Wang Shilong (Inactive)
              jamesanunez James Nunez (Inactive)