[LU-14404] lustre-initialization fails with “auster : @@@@@@ FAIL: /usr/bin/lfs setquota -u quota_usr -b 13363604 -B 14031784 -i 838864 -I 880807 /mnt/lustre FAILED!”

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Minor
    • None
    • Lustre 2.14.0
    • PPC64 clients
    • 3

    Description

      lustre-initialization fails with the error message "lustre-initialization timed out". So far, I’ve seen this issue only twice, and only in PPC64 client testing.

      Looking at the autotest log for a recent failure, https://testing.whamcloud.com/test_sets/4579843f-9edb-47cc-9a0c-5f326eeee193, we see that the failure occurs while setting quotas:

      2021-02-08T16:17:57 enable quota as required
      2021-02-08T16:17:57 CMD: trevis-4vm2 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.quota_slave.enabled
      2021-02-08T16:17:57 CMD: trevis-4vm1 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-OST0000.quota_slave.enabled
      2021-02-08T16:17:57 [HOST:trevis-77vm9.trevis.whamcloud.com] [old_mdt_qtype:none] [old_ost_qtype:none] [new_qtype:ug3]
      2021-02-08T16:17:57 CMD: trevis-4vm2 /usr/sbin/lctl conf_param lustre.quota.mdt=ug3
      2021-02-08T16:17:57 CMD: trevis-4vm2 /usr/sbin/lctl conf_param lustre.quota.ost=ug3
      2021-02-08T16:17:57 Total disk size: 13362580  block-softlimit: 13363604 block-hardlimit: 14031784 inode-softlimit: 838864 inode-hardlimit: 880807
      2021-02-08T16:17:57 Setting up quota on trevis-77vm9.trevis.whamcloud.com:/mnt/lustre for quota_usr...
      2021-02-08T16:17:57 + /usr/bin/lfs setquota -u quota_usr -b 13363604 -B 14031784 -i 838864 -I 880807 /mnt/lustre
      2021-02-08T16:17:57  auster : @@@@@@ FAIL: /usr/bin/lfs setquota -u quota_usr -b 13363604 -B 14031784 -i 838864 -I 880807 /mnt/lustre FAILED! 
      2021-02-08T16:17:57   Trace dump:
      2021-02-08T16:17:57   = /usr/lib64/lustre/tests/test-framework.sh:6273:error()
      2021-02-08T16:17:57   = /usr/lib64/lustre/tests/test-framework.sh:2302:setup_quota()
      2021-02-08T16:17:57   = /usr/lib64/lustre/tests/test-framework.sh:5329:init_param_vars()
      2021-02-08T16:17:57   = /usr/lib64/lustre/tests/test-framework.sh:5061:setupall()
      2021-02-08T16:17:57   = auster:146:setup_if_needed()
      2021-02-08T16:17:57   = auster:331:main()
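
The limits passed to lfs setquota appear to follow a simple pattern relative to the logged totals: the block soft limit is the total disk size plus 1024, and each hard limit is 5% above its soft limit (integer division). This relation is inferred from the numbers in the log above, not quoted from test-framework.sh, and the variable names are illustrative only. A minimal sketch that checks the arithmetic:

```python
# Values copied from the autotest log line starting "Total disk size:".
total_size = 13362580   # "Total disk size"
inode_soft = 838864     # "inode-softlimit"

# Assumed relation, inferred from the logged numbers:
# soft block limit = total + 1024, hard limit = soft + 5% (integer division).
block_soft = total_size + 1024
block_hard = block_soft + block_soft // 20
inode_hard = inode_soft + inode_soft // 20

# Rebuild the command line that the log shows failing.
command = (
    f"/usr/bin/lfs setquota -u quota_usr "
    f"-b {block_soft} -B {block_hard} "
    f"-i {inode_soft} -I {inode_hard} /mnt/lustre"
)
print(command)
```

The reconstructed values match the failing command, which suggests the limits themselves were computed as usual and are unlikely to be the cause of the failure.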
      

      Looking at the MDS (vm2) console log, we see the error

      [  212.037292] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.quota_slave.enabled
      [  212.889389] Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.mdt=ug3
      [  213.315973] Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=ug3
      [  213.527799] LustreError: 11349:0:(mdt_handler.c:2964:mdt_quotactl()) lustre-MDT0000: unsupported quotactl command 134250496: rc = -14
      [  213.791766] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  auster : @@@@@@ FAIL: \/usr\/bin\/lfs setquota -u quota_usr -b 13363604 -B 14031784 -i 838864 -I 880807 \/mnt\/lustre FAILED! 
      [  214.037691] Lustre: DEBUG MARKER: auster : @@@@@@ FAIL: /usr/bin/lfs setquota -u quota_usr -b 13363604 -B 14031784 -i 838864 -I 880807 /mnt/lustre FAILED!
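
The MDS reports rc = -14 for the rejected command. Assuming standard Linux errno numbering, this value can be decoded with a short check (Python is used here only for illustration):

```python
import errno
import os

rc = -14  # return code from the mdt_quotactl() error line

# Kernel return codes are negated errno values, so look up the positive number.
name = errno.errorcode[-rc]
message = os.strerror(-rc)
print(name, "-", message)
```

On Linux this resolves to EFAULT, so the server is returning EFAULT for a quotactl command it does not recognize.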
      

      On the client (vm9) console log, we see

      [  314.690859] LustreError: 9025:0:(mdc_request.c:2039:mdc_quotactl()) lustre-MDT0000-mdc-c0000000b3f26800: ptlrpc_queue_wait failed: rc = -14
      [  314.849768] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  auster : @@@@@@ FAIL: \/usr\/bin\/lfs setquota -u quota_usr -b 13363604 -B 14031784 -i 838864 -I 880807 \/mnt\/lustre FAILED! 
      [  315.077149] Lustre: DEBUG MARKER: auster : @@@@@@ FAIL: /usr/bin/lfs setquota -u quota_usr -b 13363604 -B 14031784 -i 838864 -I 880807 /mnt/lustre FAILED!
      

          Activity

            Hello,

            [  213.527799] LustreError: 11349:0:(mdt_handler.c:2964:mdt_quotactl()) lustre-MDT0000: unsupported quotactl command 134250496: rc = -14 

            134250496 in hex is 0x8008000, while quota commands lie in the interval [0x800100-0x800014].
            The wrong command ID comes from the client, so the problem is somehow related to the PPC64 client code.

            scherementsev Sergey Cheremencev added a comment
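
The hex conversion in the comment above can be checked directly. As a side observation that the comment does not make: reversing the byte order of 0x08008000 gives 0x00800008, which is the value of Q_SETQUOTA in the Linux <linux/quota.h> header. If the PPC64 client is big-endian, that would be consistent with a missing byte swap somewhere on the request path, but this is only a hypothesis; the ticket was resolved as Cannot Reproduce. The helper byteswap32 is made up for this sketch.

```python
import struct

# Q_SETQUOTA as defined in Linux <linux/quota.h>; assumed here to be the
# command the client intended to send.
Q_SETQUOTA = 0x800008

def byteswap32(value: int) -> int:
    """Reverse the byte order of a 32-bit unsigned integer."""
    return struct.unpack("<I", struct.pack(">I", value))[0]

received = 134250496  # command ID from the MDS console log
swapped = byteswap32(received)

print(hex(received))          # hex form quoted in the comment
print(hex(swapped))           # same bytes in reversed order
print(swapped == Q_SETQUOTA)  # does the swapped value match Q_SETQUOTA?
```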

            Sergey -
            Would you please take a look at this failure and see if this could be related to any changes made for OST pools or if you understand what the issue is?

            Thank you

            jamesanunez James Nunez (Inactive) added a comment

            People

              wc-triage WC Triage
              jamesanunez James Nunez (Inactive)