[LU-8807] racer test_1: (layout.c:2062:__req_capsule_get()) LBUG

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.10.0
    • Affects Version/s: Lustre 2.9.0
    • Labels: None
    • Environment: Full - EL7.2 Server/EL7.2 Client - DNE
      master, build# 3468
    • Severity: 3
    • Rank: 9223372036854775807

    Description

      This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/e9d2ad98-a26d-11e6-bf05-5254006e85c2.

      The sub-test test_1 failed with the following error:

      test failed to respond and timed out
      

      client test_log:

      04:31:40:[53110.788399] LustreError: 20828:0:(ldlm_resource.c:874:ldlm_resource_complain()) lustre-OST0001-osc-ffff88004691b800: namespace resource [0xa6:0x0:0x0].0x0 (ffff880079103b40) refcount nonzero (1) after lock cleanup; forcing cleanup.
      04:31:40:[53110.793207] LustreError: 20828:0:(ldlm_resource.c:1455:ldlm_resource_dump()) --- Resource: [0xa6:0x0:0x0].0x0 (ffff880079103b40) refcount = 2
      04:31:40:[53110.797501] LustreError: 20828:0:(ldlm_resource.c:1458:ldlm_resource_dump()) Granted locks (in reverse order):
      04:31:40:[53110.799997] LustreError: 20828:0:(ldlm_resource.c:1461:ldlm_resource_dump()) ### ### ns: lustre-OST0001-osc-ffff88004691b800 lock: ffff88004c9f8200/0x63275ddd857fc6d lrc: 3/0,1 mode: PW/PW res: [0xa6:0x0:0x0].0x0 rrc: 2 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x526400000000 nid: local remote: 0x6943ada1f873b7a5 expref: -99 pid: 1080 timeout: 0 lvb_type: 1
      04:31:40:[53510.956205] Lustre: 27937:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478172086/real 1478172086]  req@ffff880046af1500 x1549972390972608/t0(0) o36->lustre-MDT0001-mdc-ffff8800415bc000@10.2.4.176@tcp:12/10 lens 872/952 e 5 to 1 dl 1478172687 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
      04:31:40:[53510.964116] Lustre: lustre-MDT0001-mdc-ffff8800415bc000: Connection to lustre-MDT0001 (at 10.2.4.176@tcp) was lost; in progress operations using this service will wait for recovery to complete
      04:31:40:[53510.969142] Lustre: Skipped 1 previous similar message
      04:31:40:[53510.976402] Lustre: lustre-MDT0001-mdc-ffff8800415bc000: Connection restored to 10.2.4.176@tcp (at 10.2.4.176@tcp)
      04:31:40:[53510.979289] Lustre: Skipped 7 previous similar messages
      04:31:40:[53518.256263] LustreError: 11171:0:(layout.c:2062:__req_capsule_get()) ASSERTION( msg != ((void *)0) ) failed: 
      04:31:40:[53518.263177] LustreError: 11171:0:(layout.c:2062:__req_capsule_get()) LBUG
      04:31:40:[53518.266145] Pid: 11171, comm: lfs
      04:31:40:[53518.268672] 
      

          Activity

            mdiep Minh Diep added a comment -

            Landed in 2.10


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/23666/
            Subject: LU-8807 llite: check reply status in ll_migrate()
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 056783782eab03b341c464c85ce4a803508e390b

            gerrit Gerrit Updater added a comment -

            Niu Yawei (yawei.niu@intel.com) uploaded a new patch: http://review.whamcloud.com/23666
            Subject: LU-8807 llite: check reply status in ll_migrate()
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: ad151b8fa56d4560ca778df757ac4eb949fb38de

            niu Niu Yawei (Inactive) added a comment -

            ll_migrate() tries to read the reply buffer without checking whether the request was replied to successfully. I think the fix for LU-7396 isn't complete.
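
The pattern described in this comment can be illustrated with a minimal, self-contained user-space sketch. This is not the llite code and not the merged patch: every name here (mock_request, mock_reply, mock_send, mock_reply_get, mock_migrate) is a hypothetical stand-in, and the only thing taken from the ticket is the idea that the reply buffer must not be read unless the request completed successfully. The assert in mock_reply_get plays the role of the "msg != NULL" assertion seen in the log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real Lustre ptlrpc types. */
struct mock_reply {
	unsigned int flags;
};

struct mock_request {
	struct mock_reply *reply;	/* NULL when no reply was received */
};

/*
 * Simulates sending an RPC. On failure (simulated_rc != 0) no reply
 * buffer is attached to the request, as with a timed-out request.
 */
static int mock_send(struct mock_request *req, struct mock_reply *storage,
		     int simulated_rc)
{
	req->reply = NULL;
	if (simulated_rc != 0)
		return simulated_rc;

	storage->flags = 0x1;
	req->reply = storage;
	return 0;
}

/* Reply accessor that asserts on a missing reply, like the LBUG above. */
static const struct mock_reply *mock_reply_get(const struct mock_request *req)
{
	assert(req->reply != NULL);
	return req->reply;
}

/*
 * Checks the request status first and only then touches the reply.
 * Without the rc check, a failed request would trip the assert in
 * mock_reply_get().
 */
static int mock_migrate(int simulated_rc, unsigned int *flags_out)
{
	struct mock_request req;
	struct mock_reply storage;
	int rc;

	rc = mock_send(&req, &storage, simulated_rc);
	if (rc != 0)
		return rc;	/* bail out before reading the reply */

	*flags_out = mock_reply_get(&req)->flags;
	return 0;
}
```

Calling mock_migrate(0, &flags) returns 0 and fills in flags, while mock_migrate(-ETIMEDOUT, &flags) returns the error and leaves flags untouched instead of asserting.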
            pjones Peter Jones added a comment -

            Niu

            Could you please advise on this issue?

            Thanks

            Peter


            People

              Assignee: niu Niu Yawei (Inactive)
              Reporter: maloo Maloo