Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-1659

cl_lock_hold_release()) failed at lock->cll_state != CLS_HELD

    XMLWordPrintable

Details

    • 3
    • 6361

    Description

      We hit a CL_LOCK_ASSERT in cl_lock_hold_release() on about 20 clients when they were evicted. Note that we were testing 2.1.1-17chaos (which includes patchset 10 of the LU-1299 patch).

      Here is a relevant log excerpt from the client:

      2012-07-22 18:38:14 LustreError: 11-0: lc2-OST000b-osc-ffff880434f91800: Communicating with 10.1.1.48@o2ib9, operation ost_read failed with -107.
      2012-07-22 18:38:14 LustreError: Skipped 6 previous similar messages
      2012-07-22 18:38:14 Lustre: lc2-OST000b-osc-ffff880434f91800: Connection to lc2-OST000b (at 10.1.1.48@o2ib9) was lost; in progress operations using this service will wait for recovery to complete
      2012-07-22 18:38:14 LustreError: 167-0: This client was evicted by lc2-OST000b; in progress operations using this service will fail.
      2012-07-22 18:38:14 LustreError: 84597:0:(osc_lock.c:816:osc_ldlm_completion_ast()) lock@ffff8806c64d5978[4755 3 0 1 1 00000000] R(1):[491520, 665087]@[0x1000b0000:0x71c66f4:0x0] {
      2012-07-22 18:38:14 LustreError: 84597:0:(osc_lock.c:816:osc_ldlm_completion_ast())     lovsub@ffff880781ea9ca0: [0 ffff88070951a200 R(1):[983040, 1329919]@[0x20018eee8:0x1f2e2:0x0]] 
      2012-07-22 18:38:14 LustreError: 84597:0:(osc_lock.c:816:osc_ldlm_completion_ast())     osc@ffff88077320eb50: ffff880822868240 40100001 0x798339c36387a9e3 3 ffff8808336bdad0 size: 4294967296 mtime: 1343005207 atime: 1343002227 ctime: 1343005207 blocks: 8316944
      2012-07-22 18:38:14 LustreError: 84597:0:(osc_lock.c:816:osc_ldlm_completion_ast()) } lock@ffff8806c64d5978
      2012-07-22 18:38:14 LustreError: 84597:0:(osc_lock.c:816:osc_ldlm_completion_ast()) dlmlock returned -5
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:1412:cl_unuse_try()) result = -5, this is unlikely!
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:1426:cl_unuse_locked()) lock@ffff8806c64d5438[1 3 0 1 0 00000000] R(1):[983040, 1329919]@[0x20018eee8:0x1f2e2:0x0] {
      2012-07-22 18:38:14 LustreError: 84597:0:(ldlm_resource.c:755:ldlm_resource_complain()) Namespace lc2-OST000b-osc-ffff880434f91800 resource refcount nonzero (7) after lock cleanup; forcing cleanup.
      2012-07-22 18:38:14 LustreError: 84597:0:(ldlm_resource.c:761:ldlm_resource_complain()) Resource: ffff8807c9fb56c0 (119301876/0/0/0) (rc: 7)
      2012-07-22 18:38:14 Lustre: lc2-OST000b-osc-ffff880434f91800: Connection restored to lc2-OST000b (at 10.1.1.48@o2ib9)
      2012-07-22 18:38:14 Lustre: Skipped 1 previous similar message
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:1426:cl_unuse_locked())     vvp@ffff880774bb94e8: 
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:1426:cl_unuse_locked())     lov@ffff88070951a200: 1
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:1426:cl_unuse_locked())     0 0: ---
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:1426:cl_unuse_locked()) 
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:1426:cl_unuse_locked()) } lock@ffff8806c64d5438
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:1426:cl_unuse_locked()) unuse return -5
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:906:cl_lock_hold_release()) lock@ffff8806c64d5438[1 3 0 0 0 00000000] R(1):[983040, 1329919]@[0x20018eee8:0x1f2e2:0x0] {
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:906:cl_lock_hold_release())     vvp@ffff880774bb94e8: 
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:906:cl_lock_hold_release())     lov@ffff88070951a200: 1
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:906:cl_lock_hold_release())     0 0: ---
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:906:cl_lock_hold_release()) 
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:906:cl_lock_hold_release()) } lock@ffff8806c64d5438
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:906:cl_lock_hold_release()) failed at lock->cll_state != CLS_HELD.
      2012-07-22 18:38:14 LustreError: 84123:0:(cl_lock.c:906:cl_lock_hold_release()) LBUG
      

      Attachments

        Activity

          People

            jay Jinshan Xiong (Inactive)
            morrone Christopher Morrone
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: