
sanity-quota test 1 fails with 'user write success, but expect EDQUOT'

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.13.0, Lustre 2.12.3
    • Affects Version/s: Lustre 2.12.0, Lustre 2.13.0, Lustre 2.12.1
    • Labels: DNE
    • Severity: 3

    Description

      sanity-quota test_1 is failing in a DNE configuration, review-dne-part-4, with

      'user write success, but expect EDQUOT' 
      

      meaning that the user write, dd, should not have succeeded.
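      The failing check can be pictured with a small stand-alone sketch (plain Python, not the actual sanity-quota script; the QuotaFS class and its method names are illustrative). With a 10240 kB block hard limit, a write that would push usage past the limit must fail with EDQUOT; the test reports the error above when the write succeeds instead (note the stat output below shows the file reached 11534336 bytes, past the 10 MB limit):

```python
# Hypothetical model of what sanity-quota test_1 asserts (not Lustre code).
import errno

class QuotaFS:
    """Toy filesystem enforcing a per-user block hard limit in kB."""
    def __init__(self, hard_limit_kb):
        self.hard_limit_kb = hard_limit_kb
        self.used_kb = 0

    def write(self, kb):
        # A correct quota implementation rejects writes past the hard limit.
        if self.used_kb + kb > self.hard_limit_kb:
            raise OSError(errno.EDQUOT, "Disk quota exceeded")
        self.used_kb += kb

fs = QuotaFS(hard_limit_kb=10240)   # analogous to: lfs setquota -u quota_usr -B 10M
fs.write(10240)                     # filling exactly to the limit succeeds
try:
    fs.write(1024)                  # the next 1 MB must hit EDQUOT
    failed = False
except OSError as e:
    failed = (e.errno == errno.EDQUOT)
assert failed   # otherwise: "user write success, but expect EDQUOT"
```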

      Looking at the failure at https://testing.whamcloud.com/test_sets/43baf332-e791-11e8-b67f-52540065bddc , the last lines seen in the client test_log are

      Disk quotas for usr quota_usr (uid 60000):
           Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
          /mnt/lustre   10240*      0   10240       -       1       0       0       -
      lustre-MDT0000_UUID
                            0       -       0       -       1       -       0       -
      lustre-MDT0001_UUID
                            0       -       0       -       0       -       0       -
      lustre-MDT0002_UUID
                            0       -       0       -       0       -       0       -
      lustre-MDT0003_UUID
                            0       -       0       -       0       -       0       -
      lustre-OST0000_UUID
                            0       -       0       -       -       -       -       -
      lustre-OST0001_UUID
                            0       -       0       -       -       -       -       -
      lustre-OST0002_UUID
                            0       -       0       -       -       -       -       -
      lustre-OST0003_UUID
                            0       -       0       -       -       -       -       -
      lustre-OST0004_UUID
                            0       -       0       -       -       -       -       -
      lustre-OST0005_UUID
                        10240*      -   10240       -       -       -       -       -
      lustre-OST0006_UUID
                            0       -       0       -       -       -       -       -
      lustre-OST0007_UUID
                            0       -       0       -       -       -       -       -
      Total allocated inode limit: 0, total allocated block limit: 10240
      uid 60000 is using default file quota setting
      Files for user (quota_usr):
        File: '/mnt/lustre/d1.sanity-quota/f1.sanity-quota-0'
        Size: 11534336  	Blocks: 20480      IO Block: 4194304 regular file
      Device: 2c54f966h/743766374d	Inode: 144119902892982281  Links: 1
      Access: (0644/-rw-r--r--)  Uid: (60000/quota_usr)   Gid: (60000/quota_usr)
      Access: 2018-11-13 20:03:03.000000000 +0000
      Modify: 2018-11-13 20:03:07.000000000 +0000
      Change: 2018-11-13 20:03:07.000000000 +0000
       Birth: -
       sanity-quota test_1: @@@@@@ FAIL: user write success, but expect EDQUOT 
      

      There are no errors in any of the console or dmesg logs to indicate that there was a problem.

      This may be related to, or the same as, LU-5245. Although this test failed several times in October with the same error message, those failures had many errors in their logs.

      It looks like this test started failing again on 2018-11-13.

      Logs for other recent failures are at
      https://testing.whamcloud.com/test_sets/52c657cc-e8d6-11e8-86c0-52540065bddc
      https://testing.whamcloud.com/test_sets/9608d2d6-e79f-11e8-bfe1-52540065bddc
      https://testing.whamcloud.com/test_sets/b126d406-e889-11e8-b67f-52540065bddc

    Attachments

    Issue Links

    Activity

            [LU-11678] sanity-quota test 1 fails with 'user write success, but expect EDQUOT'

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34916/
            Subject: LU-11678 quota: make overquota flag for old req
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 831e564f76d05cc21df21e7990388bcc45fef7cb

            adilger Andreas Dilger added a comment -

            Still seeing failures of this test:
            https://testing.whamcloud.com/sub_tests/fa32c096-b0b7-11e9-912a-52540065bddc
            https://testing.whamcloud.com/sub_tests/84a6529c-b20c-11e9-b023-52540065bddc
            etc.

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34915/
            Subject: LU-11678 quota: protect quota flags at OSC
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 090ff485c0ef584c9180a21d830ce546483c2982


            Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34916
            Subject: LU-11678 quota: make overquota flag for old req
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: ea6ecadf04cb36950a9e8577a65d7c5830bfb1e1


            Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34915
            Subject: LU-11678 quota: protect quota flags at OSC
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 2ce0eb6699a23f677b9f1a585ec1c1c59c6531b5

            pjones Peter Jones added a comment -

            Landed for 2.13


            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34645/
            Subject: LU-11678 quota: make overquota flag for old req
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: c59cf862c3c06758c270564dd6e8948e167316b9


            Hongchao Zhang (hongchao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34645
            Subject: LU-11678 quota: make overquota flag for old req
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 70efe426cbf3b1bae4e750a7e5078b1f53078bbe


            hongchao.zhang Hongchao Zhang added a comment -

            The cause of this new occurrence is out-of-order processing of the actual write requests at the OST.

            00000008:04000000:0.0:1553773046.731149:0:14482:0:(osc_request.c:1753:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 4000
            00000008:00010000:0.0:1553773046.731470:0:14482:0:(osc_cache.c:871:osc_extent_finish()) ### extent: ffff8ca93a946630 ns: lustre-OST0000-osc-ffff8ca93a40d000 lock: ffff8ca93938b200/0x18cb0c311b7fec30 lrc: 7/0,0 mode: PW/PW res: [0x4:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29480020000 nid: local remote: 0xf9a6e43c4b4526f3 expref: -99 pid: 24698 timeout: 0 lvb_type: 1
            00000008:00010000:0.0:1553773046.733357:0:14482:0:(osc_cache.c:493:__osc_extent_remove()) ### extent: ffff8ca93a946630 ns: lustre-OST0000-osc-ffff8ca93a40d000 lock: ffff8ca93938b200/0x18cb0c311b7fec30 lrc: 7/0,0 mode: PW/PW res: [0x4:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29480020000 nid: local remote: 0xf9a6e43c4b4526f3 expref: -99 pid: 24698 timeout: 0 lvb_type: 1
            00000100:00100000:0.0:1553773046.733367:0:14482:0:(client.c:2061:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd_00_01:57a70172-96b7-525e-4422-c6adf43d1fb9:14482:1629248967152176:10.9.5.42@tcp:4
            00000008:04000000:0.0:1553773046.755084:0:14481:0:(osc_request.c:1753:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 4000
            00000008:00010000:0.0:1553773046.756072:0:14481:0:(osc_cache.c:871:osc_extent_finish()) ### extent: ffff8ca93a946790 ns: lustre-OST0000-osc-ffff8ca93a40d000 lock: ffff8ca93938b200/0x18cb0c311b7fec30 lrc: 6/0,0 mode: PW/PW res: [0x4:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29480020000 nid: local remote: 0xf9a6e43c4b4526f3 expref: -99 pid: 24698 timeout: 0 lvb_type: 1
            00000008:00010000:0.0:1553773046.757667:0:14481:0:(osc_cache.c:493:__osc_extent_remove()) ### extent: ffff8ca93a946790 ns: lustre-OST0000-osc-ffff8ca93a40d000 lock: ffff8ca93938b200/0x18cb0c311b7fec30 lrc: 6/0,0 mode: PW/PW res: [0x4:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29480020000 nid: local remote: 0xf9a6e43c4b4526f3 expref: -99 pid: 24698 timeout: 0 lvb_type: 1
            00000100:00100000:0.0:1553773046.757681:0:14481:0:(client.c:2061:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd_00_00:57a70172-96b7-525e-4422-c6adf43d1fb9:14481:1629248967152272:10.9.5.42@tcp:4
            00000008:04000000:0.0:1553773046.780222:0:14481:0:(osc_request.c:1753:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 4000
            00000008:00010000:0.0:1553773046.780234:0:14481:0:(osc_cache.c:871:osc_extent_finish()) ### extent: ffff8ca93a946420 ns: lustre-OST0000-osc-ffff8ca93a40d000 lock: ffff8ca93938b200/0x18cb0c311b7fec30 lrc: 5/0,0 mode: PW/PW res: [0x4:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29480020000 nid: local remote: 0xf9a6e43c4b4526f3 expref: -99 pid: 24698 timeout: 0 lvb_type: 1
            00000008:00010000:0.0:1553773046.798046:0:14481:0:(osc_cache.c:493:__osc_extent_remove()) ### extent: ffff8ca93a946420 ns: lustre-OST0000-osc-ffff8ca93a40d000 lock: ffff8ca93938b200/0x18cb0c311b7fec30 lrc: 5/0,0 mode: PW/PW res: [0x4:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29480020000 nid: local remote: 0xf9a6e43c4b4526f3 expref: -99 pid: 24698 timeout: 0 lvb_type: 1
            00000100:00100000:0.0:1553773046.799223:0:14481:0:(client.c:2061:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd_00_00:57a70172-96b7-525e-4422-c6adf43d1fb9:14481:1629248967152240:10.9.5.42@tcp:4
            00000008:04000000:1.0:1553773046.810498:0:14481:0:(osc_request.c:1753:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 4100
            00000008:00010000:1.0:1553773046.810509:0:14481:0:(osc_cache.c:871:osc_extent_finish()) ### extent: ffff8ca93a9464d0 ns: lustre-OST0000-osc-ffff8ca93a40d000 lock: ffff8ca93938b200/0x18cb0c311b7fec30 lrc: 4/0,0 mode: PW/PW res: [0x4:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29480020000 nid: local remote: 0xf9a6e43c4b4526f3 expref: -99 pid: 24698 timeout: 0 lvb_type: 1
            00000008:00010000:1.0:1553773046.810619:0:14481:0:(osc_cache.c:493:__osc_extent_remove()) ### extent: ffff8ca93a9464d0 ns: lustre-OST0000-osc-ffff8ca93a40d000 lock: ffff8ca93938b200/0x18cb0c311b7fec30 lrc: 4/0,0 mode: PW/PW res: [0x4:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29480020000 nid: local remote: 0xf9a6e43c4b4526f3 expref: -99 pid: 24698 timeout: 0 lvb_type: 1
            00000100:00100000:1.0:1553773046.810628:0:14481:0:(client.c:2061:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd_00_00:57a70172-96b7-525e-4422-c6adf43d1fb9:14481:1629248967152224:10.9.5.42@tcp:4
            00000008:04000000:1.0:1553773046.811851:0:14481:0:(osc_request.c:1753:osc_brw_fini_request()) setdq for [60000 60000 0] with valid 0x18000006b584fb9, flags 4100
            00000008:00010000:1.0:1553773046.811855:0:14481:0:(osc_cache.c:871:osc_extent_finish()) ### extent: ffff8ca93a9462c0 ns: lustre-OST0000-osc-ffff8ca93a40d000 lock: ffff8ca93938b200/0x18cb0c311b7fec30 lrc: 3/0,0 mode: PW/PW res: [0x4:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29480020000 nid: local remote: 0xf9a6e43c4b4526f3 expref: -99 pid: 24698 timeout: 0 lvb_type: 1
            00000008:00010000:1.0:1553773046.811950:0:14481:0:(osc_cache.c:493:__osc_extent_remove()) ### extent: ffff8ca93a9462c0 ns: lustre-OST0000-osc-ffff8ca93a40d000 lock: ffff8ca93938b200/0x18cb0c311b7fec30 lrc: 3/0,0 mode: PW/PW res: [0x4:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29480020000 nid: local remote: 0xf9a6e43c4b4526f3 expref: -99 pid: 24698 timeout: 0 lvb_type: 1
            00000100:00100000:1.0:1553773046.811956:0:14481:0:(client.c:2061:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd_00_00:57a70172-96b7-525e-4422-c6adf43d1fb9:14481:1629248967152256:10.9.5.42@tcp:4
            

            The last request sent, XID 1629248967152272, was processed at the OST before the other write requests:

            00000100:00100000:1.0:1553773046.108916:0:26612:0:(service.c:2148:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io00_003:57a70172-96b7-525e-4422-c6adf43d1fb9+7:14482:x1629248967152176:12345-10.9.5.40@tcp:4
            00000100:00100000:1.0:1553773046.129278:0:17859:0:(service.c:2148:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io00_002:57a70172-96b7-525e-4422-c6adf43d1fb9+10:14481:x1629248967152272:12345-10.9.5.40@tcp:4
            00000100:00100000:0.0:1553773046.130214:0:26614:0:(service.c:2148:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io00_005:57a70172-96b7-525e-4422-c6adf43d1fb9+11:14481:x1629248967152240:12345-10.9.5.40@tcp:4
            00000100:00100000:1.0:1553773046.135298:0:17858:0:(service.c:2148:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io00_001:57a70172-96b7-525e-4422-c6adf43d1fb9+14:14481:x1629248967152224:12345-10.9.5.40@tcp:4
            00000100:00100000:1.0:1553773046.146322:0:26612:0:(service.c:2198:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io00_003:57a70172-96b7-525e-4422-c6adf43d1fb9+16:14482:x1629248967152176:12345-10.9.5.40@tcp:4 Request processed in 37405us (37455us total) trans 4294967318 rc 0/0
            00000100:00100000:1.0:1553773046.146349:0:26612:0:(service.c:2148:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io00_003:57a70172-96b7-525e-4422-c6adf43d1fb9+15:14481:x1629248967152256:12345-10.9.5.40@tcp:4
            00000100:00100000:1.0:1553773046.169798:0:17859:0:(service.c:2198:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io00_002:57a70172-96b7-525e-4422-c6adf43d1fb9+17:14481:x1629248967152272:12345-10.9.5.40@tcp:4 Request processed in 40520us (40576us total) trans 4294967319 rc 0/0
            00000100:00100000:0.0:1553773046.195849:0:26614:0:(service.c:2198:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io00_005:57a70172-96b7-525e-4422-c6adf43d1fb9+17:14481:x1629248967152240:12345-10.9.5.40@tcp:4 Request processed in 65634us (66655us total) trans 4294967320 rc 0/0
            00000100:00100000:0.0:1553773046.214748:0:17858:0:(service.c:2198:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io00_001:57a70172-96b7-525e-4422-c6adf43d1fb9+17:14481:x1629248967152224:12345-10.9.5.40@tcp:4 Request processed in 79446us (79509us total) trans 4294967321 rc 0/0
            00000100:00100000:1.0:1553773046.229365:0:26612:0:(service.c:2198:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io00_003:57a70172-96b7-525e-4422-c6adf43d1fb9+17:14481:x1629248967152256:12345-10.9.5.40@tcp:4 Request processed in 83015us (94114us total) trans 4294967322 rc 0/0
            

            The OST processing order is 1629248967152176, 1629248967152272, 1629248967152240, 1629248967152224, 1629248967152256.

            The previous patch https://review.whamcloud.com/33747/ fixed the disorder of replies. In https://testing.whamcloud.com/test_sets/d083d9e2-ee39-11e8-86c0-52540065bddc :

            00000008:00100000:1.0:1542871521.676809:0:14034:0:(osc_request.c:1548:osc_brw_prep_request()) brw rpc ffff8b2f59ab4900 - object 0x0:4418 offset 8388608<>10485760
            00000100:00100000:1.0:1542871521.676827:0:27649:0:(client.c:1631:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc ptlrpcd_00_01:9b13e76e-5635-ce95-1a07-bf23bb86c908:27649:1617804100297424:10.9.5.235@tcp:4
            00000008:04000000:1.0:1542871521.787877:0:27649:0:(osc_request.c:1746:osc_brw_fini_request()) setdq for [60000 60000 1000] with valid 0x18000006b584fb9, flags 4080
            00000001:04000000:1.0:1542871521.787887:0:27649:0:(osc_quota.c:139:osc_quota_setdq()) lustre-OST0006-osc-ffff8b2f7abbd000: setdq to insert for prj 1000 (0)
            00000008:00010000:1.0:1542871521.787896:0:27649:0:(osc_cache.c:866:osc_extent_finish()) ### extent: ffff8b2f62feca50 ns: lustre-OST0006-osc-ffff8b2f7abbd000 lock: ffff8b2f597e38c0/0xf74760ed13c30a9c lrc: 4/0,0 mode: PW/PW res: [0x1142:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29400020000 nid: local remote: 0x334cf6a11bdefff expref: -99 pid: 20636 timeout: 0 lvb_type: 1
            00000008:00010000:1.0:1542871521.788118:0:27649:0:(osc_cache.c:488:__osc_extent_remove()) ### extent: ffff8b2f62feca50 ns: lustre-OST0006-osc-ffff8b2f7abbd000 lock: ffff8b2f597e38c0/0xf74760ed13c30a9c lrc: 4/0,0 mode: PW/PW res: [0x1142:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29400020000 nid: local remote: 0x334cf6a11bdefff expref: -99 pid: 20636 timeout: 0 lvb_type: 1
            00000100:00100000:1.0:1542871521.788130:0:27649:0:(client.c:2059:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd_00_01:9b13e76e-5635-ce95-1a07-bf23bb86c908:27649:1617804100297424:10.9.5.235@tcp:4
            00000008:04000000:1.0:1542871522.192045:0:27648:0:(osc_request.c:1746:osc_brw_fini_request()) setdq for [60000 60000 1000] with valid 0x18000006b584fb9, flags 4000
            00000001:04000000:1.0:1542871522.192049:0:27648:0:(osc_quota.c:152:osc_quota_setdq()) lustre-OST0006-osc-ffff8b2f7abbd000: setdq to remove for prj 1000 (00000000deadbeef)
            00000008:00010000:1.0:1542871522.192055:0:27648:0:(osc_cache.c:866:osc_extent_finish()) ### extent: ffff8b2f5a3be420 ns: lustre-OST0006-osc-ffff8b2f7abbd000 lock: ffff8b2f597e38c0/0xf74760ed13c30a9c lrc: 3/0,0 mode: PW/PW res: [0x1142:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29400020000 nid: local remote: 0x334cf6a11bdefff expref: -99 pid: 20636 timeout: 0 lvb_type: 1
            00000008:00010000:1.0:1542871522.192496:0:27648:0:(osc_cache.c:488:__osc_extent_remove()) ### extent: ffff8b2f5a3be420 ns: lustre-OST0006-osc-ffff8b2f7abbd000 lock: ffff8b2f597e38c0/0xf74760ed13c30a9c lrc: 3/0,0 mode: PW/PW res: [0x1142:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x29400020000 nid: local remote: 0x334cf6a11bdefff expref: -99 pid: 20636 timeout: 0 lvb_type: 1
            00000100:00100000:1.0:1542871522.192528:0:27648:0:(client.c:2059:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd_00_00:9b13e76e-5635-ce95-1a07-bf23bb86c908:27648:1617804100297392:10.9.5.235@tcp:4
            

            The reply of the earlier request, XID 1617804100297392, was processed on the client after the reply of XID 1617804100297424, even though at the OST the 1617804100297392 request was processed before the 1617804100297424 request.

            Will try to create a new patch to fix this.
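            The race described above can be sketched in a few lines of plain Python (not Lustre code; class and method names are illustrative). The client keeps a per-OST "over quota" flag that each write reply updates via setdq(); when replies complete out of order, an older reply without the overquota flag can clear the state just set by a newer reply, so the next write skips quota enforcement and succeeds instead of returning EDQUOT. One possible guard, assuming XIDs increase monotonically with send order, is to ignore updates from replies older than the newest one already applied:

```python
# Illustrative sketch of the stale quota-flag race (hypothetical names, not
# the actual osc_quota_setdq() implementation).

class OscQuota:
    def __init__(self):
        self.over_quota = False
        self.last_xid = 0            # XID of the newest reply applied so far

    def setdq_unguarded(self, xid, over):
        # Racy: a stale (older-XID) reply can overwrite newer state.
        self.over_quota = over

    def setdq_guarded(self, xid, over):
        # Drop updates from replies older than the last one applied.
        if xid < self.last_xid:
            return
        self.last_xid = xid
        self.over_quota = over

# Replies complete out of order: the newer request (XID 424, over quota)
# finishes before the older one (XID 392, not over quota), mirroring the
# 1617804100297424 / 1617804100297392 logs above.
q1, q2 = OscQuota(), OscQuota()
for xid, over in [(424, True), (392, False)]:
    q1.setdq_unguarded(xid, over)
    q2.setdq_guarded(xid, over)

assert q1.over_quota is False   # flag lost: next write is sent without EDQUOT
assert q2.over_quota is True    # guarded flag survives the stale reply
```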


            Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34627
            Subject: LU-11678 tests: Additional debug
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: b0987312838e2fd5ef47e09457ea94a4f3450e05


            People

              Assignee: hongchao.zhang Hongchao Zhang
              Reporter: jamesanunez James Nunez (Inactive)
              Votes: 0
              Watchers: 9