sanity-quota test_75: 'write failed, expect succeed (2)'

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.16.0
    • Affects Version/s: Lustre 2.15.0, Lustre 2.15.3, Lustre 2.15.6
    • Severity: 3

    Description

      This issue was created by maloo for S Buisson <sbuisson@ddn.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/310e8154-d9b2-4028-8e0a-fe683a01e469

      test_75 failed with the following error:

      write failed, expect succeed (2)
      

      Test log shows:

      Disk quotas for usr 60000 (uid 60000):
           Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
          /mnt/lustre    9052*  10240   20480    none       1       0       0       -
      lustre-MDT0000_UUID
                            0       -       0       -       1       -       0       -
      lustre-MDT0001_UUID
                            0       -       0       -       0       -       0       -
      lustre-MDT0002_UUID
                            0       -       0       -       0       -       0       -
      lustre-MDT0003_UUID
                            0       -    2052       -       0       -       0       -
      lustre-OST0000_UUID
                            0       -      48       -       -       -       -       -
      lustre-OST0001_UUID
                            0       -       0       -       -       -       -       -
      lustre-OST0002_UUID
                         9052       -    9060       -       -       -       -       -
      lustre-OST0003_UUID
                            0       -       0       -       -       -       -       -
      lustre-OST0004_UUID
                            0       -       0       -       -       -       -       -
      lustre-OST0005_UUID
                            0       -       0       -       -       -       -       -
      lustre-OST0006_UUID
                            0       -       0       -       -       -       -       -
      lustre-OST0007_UUID
                            0       -       0       -       -       -       -       -
      Total allocated inode limit: 0, total allocated block limit: 9108
      Files for user (60000):
        File: /mnt/lustre
        Size: 12288     	Blocks: 24         IO Block: 4096   directory
      Device: 2c54f966h/743766374d	Inode: 144115188193296385  Links: 5
      Access: (0755/drwxr-xr-x)  Uid: (60000/quota_usr)   Gid: (60000/quota_usr)
      Access: 1970-01-01 00:00:00.000000000 +0000
      Modify: 2021-10-18 21:24:13.000000000 +0000
      Change: 2021-10-18 21:24:13.000000000 +0000
       Birth: -
        File: /mnt/lustre/ffsx.sanity-dom.fsxgood
        Size: 0         	Blocks: 0          IO Block: 4194304 regular empty file
      Device: 2c54f966h/743766374d	Inode: 144115675051327527  Links: 1
      Access: (0644/-rw-r--r--)  Uid: (60000/quota_usr)   Gid: (60000/quota_usr)
      Access: 2021-10-18 16:52:16.000000000 +0000
      Modify: 2021-10-18 16:52:16.000000000 +0000
      Change: 2021-10-18 16:52:16.000000000 +0000
       Birth: -
        File: /mnt/lustre/d75.sanity-quota
        Size: 4096      	Blocks: 8          IO Block: 4096   directory
      Device: 2c54f966h/743766374d	Inode: 144116245459894494  Links: 2
      Access: (0777/drwxrwxrwx)  Uid: (60000/quota_usr)   Gid: (60000/quota_usr)
      Access: 2021-10-18 21:26:24.000000000 +0000
      Modify: 2021-10-18 21:26:00.000000000 +0000
      Change: 2021-10-18 21:26:00.000000000 +0000
       Birth: -
        File: /mnt/lustre/d75.sanity-quota/file
        Size: 9269248   	Blocks: 18104      IO Block: 4194304 regular file
      Device: 2c54f966h/743766374d	Inode: 144116245459894498  Links: 1
      Access: (0644/-rw-r--r--)  Uid: (60000/quota_usr)   Gid: (60000/quota_usr)
      Access: 2021-10-18 21:26:00.000000000 +0000
      Modify: 2021-10-18 21:26:23.000000000 +0000
      Change: 2021-10-18 21:26:23.000000000 +0000
       Birth: -
      CMD: trevis-66vm7 /usr/sbin/lctl get_param -n version 2>/dev/null
       sanity-quota test_75: @@@@@@ FAIL: write failed, expect succeed (2)
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity-quota test_75 - write failed, expect succeed (2)

          Activity

            [LU-15129] sanity-quota test_75: 'write failed, expect succeed (2)'
            yujian Jian Yu added a comment - +1 on Lustre b2_15 branch: https://testing.whamcloud.com/test_sets/8102d3fc-f790-4053-9e6b-51923ad0e251

            gerrit Gerrit Updater added a comment -

            "Minh Diep <mdiep@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51168
            Subject: LU-15129 tests: sanity-quota_75_dom fix
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: d73758030096243d4915b60d5988046ba44d3269

            pjones Peter Jones added a comment - Landed for 2.16

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50164/
            Subject: LU-15129 tests: sanity-quota_75_dom fix
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 7d05a687ee5d4f4b95585244a7f60394475fe0ba

            gerrit Gerrit Updater added a comment -

            "Sergey Cheremencev <scherementsev@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50164
            Subject: LU-15129 tests: sanity-quota_75_dom fix
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: b41175be794beac336a815fbc3a98c402f537cc9

            scherementsev Sergey Cheremencev added a comment - - edited

            It fails because dd uses "oflag=sync". As a result, the client writes page by page instead of writing a big chunk of pages at once, which takes time.

             00040000:04000000:0.0:1676593722.914518:0:713014:0:(qsd_handler.c:748:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0005 qtype:usr id:60000 enforced:1 granted: 8644 pending:0 waiting:0 req:0 usage: 8636 qunit:1024 qtune:512 edquot:0 default:no
            00040000:04000000:0.0:1676593723.069069:0:713014:0:(qsd_handler.c:748:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0005 qtype:usr id:60000 enforced:1 granted: 8648 pending:0 waiting:0 req:0 usage: 8640 qunit:1024 qtune:512 edquot:0 default:no
            00040000:04000000:0.0:1676593723.365384:0:713014:0:(qsd_handler.c:748:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0005 qtype:usr id:60000 enforced:1 granted: 8652 pending:0 waiting:0 req:0 usage: 8644 qunit:1024 qtune:512 edquot:0 default:no
            ...

            Furthermore, when the granted space gets close to soft_limit (i.e. over 9MB if soft_limit is 10MB), the OST cannot preacquire space anymore. The OST can also acquire only the requested amount of space; see qmt_alloc_expand. Thus the OST has to send a quota acquire request to the MDT for each BRW request from the client. Sometimes 20 seconds is not enough to write 10MB.

            I'll send a patch to change test_dom_75.
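
A rough back-of-the-envelope check of the explanation above, as a minimal Python sketch. It parses the three qsd_op_begin0 samples quoted in the comment and estimates how quickly the granted space was growing. Several things here are assumptions rather than facts from the ticket: that the fourth colon-separated field is a timestamp in seconds, that "granted" is expressed in KB, and that extrapolating from three samples up to the 10MB soft limit is meaningful. The names parse, grant_rate_kb_per_s, remaining_kb and eta_s are illustrative only.

```python
import re

# The three qsd_op_begin0 samples, copied verbatim from the debug log above.
LOG_LINES = [
    "00040000:04000000:0.0:1676593722.914518:0:713014:0:(qsd_handler.c:748:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0005 qtype:usr id:60000 enforced:1 granted: 8644 pending:0 waiting:0 req:0 usage: 8636 qunit:1024 qtune:512 edquot:0 default:no",
    "00040000:04000000:0.0:1676593723.069069:0:713014:0:(qsd_handler.c:748:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0005 qtype:usr id:60000 enforced:1 granted: 8648 pending:0 waiting:0 req:0 usage: 8640 qunit:1024 qtune:512 edquot:0 default:no",
    "00040000:04000000:0.0:1676593723.365384:0:713014:0:(qsd_handler.c:748:qsd_op_begin0()) $$$ op_begin space:12  qsd:lustre-OST0005 qtype:usr id:60000 enforced:1 granted: 8652 pending:0 waiting:0 req:0 usage: 8644 qunit:1024 qtune:512 edquot:0 default:no",
]

# Assumed layout: subsystem:mask:cpu:timestamp:... and a "granted: <KB>" field.
PATTERN = re.compile(r"^[0-9a-f]+:[0-9a-f]+:[\d.]+:(\d+\.\d+):.*granted:\s*(\d+)")


def parse(line):
    """Return (timestamp_seconds, granted_kb) for one log line."""
    match = PATTERN.match(line)
    if match is None:
        raise ValueError("unrecognized log line: " + line[:60])
    return float(match.group(1)), int(match.group(2))


def grant_rate_kb_per_s(lines):
    """Average growth of 'granted' between the first and last sample."""
    samples = [parse(line) for line in lines]
    (t_first, g_first), (t_last, g_last) = samples[0], samples[-1]
    return (g_last - g_first) / (t_last - t_first)


rate = grant_rate_kb_per_s(LOG_LINES)

# Space left between the last sample and a 10MB soft limit (assumed to be KB).
remaining_kb = 10 * 1024 - parse(LOG_LINES[-1])[1]
eta_s = remaining_kb / rate

print(f"grant rate: {rate:.1f} KB/s")
print(f"remaining to soft limit: {remaining_kb} KB, estimated {eta_s:.0f} s")
```

Under these assumptions the estimated time to reach the soft limit comes out well above the 20 seconds mentioned in the comment, which is consistent with the intermittent failure. It is only an extrapolation from three log lines, not a measurement of the test itself.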

            gerrit Gerrit Updater added a comment -

            "Arshad Hussain <arshad.hussain@aeoncomputing.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50129
            Subject: LU-15129 tests: sanity-quota/test_75
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 70e5c1a89a941e2166b88146f13b929f7161532d

            adilger Andreas Dilger added a comment - It looks like this subtest failed in 40 of 474 runs in the past week: https://testing.whamcloud.com/search?test_set_script_id=61149410-4a46-11e0-a7f6-52540025f9af&sub_test_script_id=217bd2d8-23d7-4016-a521-48d60f5a070f&start_date=2023-02-14&end_date=2023-02-21&source=sub_tests#redirect
            Sergey, could you please investigate?
            eaujames Etienne Aujames added a comment - +1 on master: https://testing.whamcloud.com/test_sets/585a6fbc-4dc5-4ba1-a591-47abfd633df9
            arshad512 Arshad Hussain added a comment - +1 on Master https://testing.whamcloud.com/test_sets/ec4e6eb3-fc08-4789-99f1-89853a3421c6

            People

              scherementsev Sergey Cheremencev
              maloo Maloo