Details

    • Technical task
    • Resolution: Duplicate
    • Major
    • None
    • Lustre 2.1.1
    • None
    • 9747

    Description

      Hit the following LBUG:

      2012-04-29 07:43:52 LustreError: 83833:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) lock@ffff8802615024d8[4 2 0 0 0 00000005] P(0):[0, 18446744073709551615]@[0x16b9ac4cbe:0xc:0x0] {
      2012-04-29 07:43:52 LustreError: 83833:0:(lovsub_lock.c:381:lovsub_lock_delete_one())     vvp@ffff88025946e9e8: 
      2012-04-29 07:43:52 LustreError: 83833:0:(lovsub_lock.c:381:lovsub_lock_delete_one())     lov@ffff880431e80cf8: 2
      2012-04-29 07:43:52 LustreError: 83833:0:(lovsub_lock.c:381:lovsub_lock_delete_one())     0 0: ---
      2012-04-29 07:43:52 LustreError: 83833:0:(lovsub_lock.c:381:lovsub_lock_delete_one())     1 1: lock@ffff8803c465faf8[1 3 0 1 1 00000000] R(1):[0, 18446744073709551615]@[0x100a80000:0x1b3b212:0x0] {
      2012-04-29 07:43:52 LustreError: 83833:0:(lovsub_lock.c:381:lovsub_lock_delete_one())     lovsub@ffff8801f6d645a0: [1 ffff880431e80cf8 P(0):[0, 18446744073709551615]@[0x16b9ac4cbe:0xc:0x0]] 
      2012-04-29 07:43:52 LustreError: 83833:0:(lovsub_lock.c:381:lovsub_lock_delete_one())     osc@ffff8803c4694b50: ffff88010cd766c0 00101001 0xe12a56a3ad7ca7fa 3 ffff880428397e48 size: 0 mtime: 1335700029 atime: 1335700029 ctime: 1335700029 blocks: 0
      2012-04-29 07:43:52 LustreError: 83833:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) } lock@ffff8803c465faf8
      2012-04-29 07:43:52 LustreError: 83833:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) 
      2012-04-29 07:43:52 LustreError: 83833:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) } lock@ffff8802615024d8
      2012-04-29 07:43:52 LustreError: 83833:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) Delete CLS_HELD lock
      2012-04-29 07:43:52 LustreError: 83833:0:(lovsub_lock.c:383:lovsub_lock_delete_one()) Impossible state: 2
      2012-04-29 07:43:52 LustreError: 83833:0:(lovsub_lock.c:384:lovsub_lock_delete_one()) LBUG
      

      Attachments

        Activity

          [LU-1355] LBUG in lovsub_lock_delete_one -- Impossible state: 2 (CLS_ENQUEUED)

          Actually my previous comment was inaccurate. This bug is concealed by LU-1299 because LU-1299 should be hit first if it was not fixed.

          Please check patch: http://review.whamcloud.com/2632 for a fix.

          This is still about signal handling problem.

          jay Jinshan Xiong (Inactive) added a comment - Actually my previous comment was inaccurate. This bug is concealed by LU-1299 because LU-1299 should be hit first if it was not fixed. Please check patch: http://review.whamcloud.com/2632 for a fix. This is still about signal handling problem.

          Do you still need the debug log? Now that I'm trying to reproduce it, of course it is being elusive.

          nedbass Ned Bass (Inactive) added a comment - Do you still need the debug log? Now that I'm trying to reproduce it, of course it is being elusive.

          I confirm that this bug was imported by LU-1299. I will cook a patch tomorrow.

          jay Jinshan Xiong (Inactive) added a comment - I confirm that this bug was imported by LU-1299 . I will cook a patch tomorrow.

          I need more time on this bug.

          Ned, if you can reproduce this issue quite often, is it possible to collect a debug log at the client side. I need DLMTRACE to be set.

          jay Jinshan Xiong (Inactive) added a comment - I need more time on this bug. Ned, if you can reproduce this issue quite often, is it possible to collect a debug log at the client side. I need DLMTRACE to be set.

          That's great news! We hit it very often. I just unintentionally reproduced it running ls on a login node. I wonder if we landed something that makes it more likely to hit since we never saw it before.

          nedbass Ned Bass (Inactive) added a comment - That's great news! We hit it very often. I just unintentionally reproduced it running ls on a login node. I wonder if we landed something that makes it more likely to hit since we never saw it before.

          No, this is a pre-existing bug. How often do you hit this one? I've found the root cause of this one and am working on 1356.

          jay Jinshan Xiong (Inactive) added a comment - No, this is a pre-existing bug. How often do you hit this one? I've found the root cause of this one and am working on 1356.

          Hi Jinshan,

          Do you think this may be a regression in our recent 2.1.1-10chaos tag? If not, do you have any ideas about the nature of the bug?

          nedbass Ned Bass (Inactive) added a comment - Hi Jinshan, Do you think this may be a regression in our recent 2.1.1-10chaos tag? If not, do you have any ideas about the nature of the bug?

          Here is the patch stack that was running at the time.

          https://github.com/chaos/lustre/commits/2.1.1-10chaos

          nedbass Ned Bass (Inactive) added a comment - Here is the patch stack that was running at the time. https://github.com/chaos/lustre/commits/2.1.1-10chaos

          Sure. This was hit on many nodes, this one printed the stack trace to the console:

          2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) lock@ffff880320159b38[4 2 0 0 0 00000005] P(0):[0, 18446744073709551615]@[0x1bb30df888:0x4:0x0] {
          2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one())     vvp@ffff8803515befb0: 
          2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one())     lov@ffff88037119c6e0: 2
          2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one())     0 0: ---
          2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one())     1 1: lock@ffff88031bb1eeb8[1 3 0 1 1 00000000] R(1):[0, 18446744073709551615]@[0x100950000:0x1a45bfa:0x0] {
          2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one())     lovsub@ffff8804276ae3a0: [1 ffff88037119c6e0 P(0):[0, 18446744073709551615]@[0x1bb30df888:0x4:0x0]] 
          2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one())     osc@ffff880425f1c2b0: ffff8801354436c0 00101001 0x92774b263242bbe4 3 ffff880812898b10 size: 0 mtime: 1335712511 atime: 1335712511 ctime: 1335712511 blocks: 0
          2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) } lock@ffff88031bb1eeb8
          2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) 
          2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) } lock@ffff880320159b38
          2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) Delete CLS_HELD lock
          2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:383:lovsub_lock_delete_one()) Impossible state: 2
          2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:384:lovsub_lock_delete_one()) LBUG
          2012-04-29 12:15:02 Pid: 72532, comm: rm
          2012-04-29 12:15:02 
          2012-04-29 12:15:02 Call Trace:
          2012-04-29 12:15:02  [<ffffffffa03e2895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
          2012-04-29 12:15:02  [<ffffffffa03e2dc7>] lbug_with_loc+0x47/0xb0 [libcfs]
          2012-04-29 12:15:02  [<ffffffffa08bfff2>] lovsub_lock_delete+0x1a2/0x370 [lov]
          2012-04-29 12:15:02  [<ffffffffa055a665>] cl_lock_delete0+0xb5/0x150 [obdclass]
          2012-04-29 12:15:02  [<ffffffffa055a816>] cl_lock_hold_release+0x116/0x170 [obdclass]
          2012-04-29 12:15:02  [<ffffffffa055bf73>] cl_lock_unhold+0x33/0xc0 [obdclass]
          2012-04-29 12:15:02  [<ffffffffa08b7e78>] lov_sublock_release+0x198/0x260 [lov]
          2012-04-29 12:15:02  [<ffffffffa08b8ed8>] lov_lock_cancel+0xe8/0x230 [lov]
          2012-04-29 12:15:02  [<ffffffffa0559715>] cl_lock_cancel0+0x75/0xf0 [obdclass]
          2012-04-29 12:15:02  [<ffffffffa055a7f5>] cl_lock_hold_release+0xf5/0x170 [obdclass]
          2012-04-29 12:15:02  [<ffffffffa055beb2>] cl_lock_release+0x82/0x110 [obdclass]
          2012-04-29 12:15:02  [<ffffffffa09722d6>] cl_glimpse_lock+0x196/0x330 [lustre]
          2012-04-29 12:15:02  [<ffffffffa0972914>] cl_glimpse_size+0x124/0x130 [lustre]
          2012-04-29 12:15:02  [<ffffffffa093ab8a>] ll_inode_revalidate_it+0x5a/0x140 [lustre]
          2012-04-29 12:15:02  [<ffffffff811843e5>] ? putname+0x45/0x50
          2012-04-29 12:15:02  [<ffffffffa093acb9>] ll_getattr_it+0x49/0x170 [lustre]
          2012-04-29 12:15:02  [<ffffffffa093ae17>] ll_getattr+0x37/0x40 [lustre]
          2012-04-29 12:15:02  [<ffffffff8117cb21>] vfs_getattr+0x51/0x80
          2012-04-29 12:15:02  [<ffffffff810d152f>] ? audit_putname+0x3f/0x50
          2012-04-29 12:15:02  [<ffffffff8117cbb8>] vfs_fstatat+0x68/0x80
          2012-04-29 12:15:02  [<ffffffff8117cbf4>] sys_newfstatat+0x24/0x50
          2012-04-29 12:15:02  [<ffffffff8100c5b5>] ? math_state_restore+0x45/0x60
          2012-04-29 12:15:02  [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
          2012-04-29 12:15:02 
          2012-04-29 12:15:02 Apr 29 12:15:02 Kernel panic - not syncing: LBUG
          2012-04-29 12:15:02 cab283 kernel: LPid: 72532, comm: rm Not tainted 2.6.32-220.13.1.2chaos.ch5.x86_64 #1
          2012-04-29 12:15:02 ustreError: 7253Call Trace:
          2012-04-29 12:15:02 2:0:(lovsub_lock [<ffffffff814ee6c8>] ? panic+0x78/0x143
          2012-04-29 12:15:02 .c:384:lovsub_lo [<ffffffffa03e2e1b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
          2012-04-29 12:15:02 ck_delete_one()) [<ffffffffa08bfff2>] ? lovsub_lock_delete+0x1a2/0x370 [lov]
          2012-04-29 12:15:02  LBUG
          2012-04-29 12:15:02 Apr 29 12: [<ffffffffa055a665>] ? cl_lock_delete0+0xb5/0x150 [obdclass]
          2012-04-29 12:15:02 15:02 cab283 ker [<ffffffffa055a816>] ? cl_lock_hold_release+0x116/0x170 [obdclass]
          2012-04-29 12:15:02 nel: Kernel pani [<ffffffffa055bf73>] ? cl_lock_unhold+0x33/0xc0 [obdclass]
          2012-04-29 12:15:02 c - not syncing: [<ffffffffa08b7e78>] ? lov_sublock_release+0x198/0x260 [lov]
          2012-04-29 12:15:02  LBUG
          2012-04-29 12:15:02  [<ffffffffa08b8ed8>] ? lov_lock_cancel+0xe8/0x230 [lov]
          2012-04-29 12:15:02  [<ffffffffa0559715>] ? cl_lock_cancel0+0x75/0xf0 [obdclass]
          2012-04-29 12:15:02  [<ffffffffa055a7f5>] ? cl_lock_hold_release+0xf5/0x170 [obdclass]
          2012-04-29 12:15:02  [<ffffffffa055beb2>] ? cl_lock_release+0x82/0x110 [obdclass]
          2012-04-29 12:15:02  [<ffffffffa09722d6>] ? cl_glimpse_lock+0x196/0x330 [lustre]
          2012-04-29 12:15:02  [<ffffffffa0972914>] ? cl_glimpse_size+0x124/0x130 [lustre]
          2012-04-29 12:15:02  [<ffffffffa093ab8a>] ? ll_inode_revalidate_it+0x5a/0x140 [lustre]
          2012-04-29 12:15:03  [<ffffffff811843e5>] ? putname+0x45/0x50
          2012-04-29 12:15:03  [<ffffffffa093acb9>] ? ll_getattr_it+0x49/0x170 [lustre]
          2012-04-29 12:15:03  [<ffffffffa093ae17>] ? ll_getattr+0x37/0x40 [lustre]
          2012-04-29 12:15:03  [<ffffffff8117cb21>] ? vfs_getattr+0x51/0x80
          2012-04-29 12:15:03  [<ffffffff810d152f>] ? audit_putname+0x3f/0x50
          2012-04-29 12:15:03  [<ffffffff8117cbb8>] ? vfs_fstatat+0x68/0x80
          2012-04-29 12:15:03  [<ffffffff8117cbf4>] ? sys_newfstatat+0x24/0x50
          2012-04-29 12:15:03  [<ffffffff8100c5b5>] ? math_state_restore+0x45/0x60
          2012-04-29 12:15:03  [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
          

          We also have a core dump which I can poke at if you need me to.

          prakash Prakash Surya (Inactive) added a comment - Sure. This was hit on many nodes, this one printed the stack trace to the console: 2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) lock@ffff880320159b38[4 2 0 0 0 00000005] P(0):[0, 18446744073709551615]@[0x1bb30df888:0x4:0x0] { 2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) vvp@ffff8803515befb0: 2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) lov@ffff88037119c6e0: 2 2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) 0 0: --- 2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) 1 1: lock@ffff88031bb1eeb8[1 3 0 1 1 00000000] R(1):[0, 18446744073709551615]@[0x100950000:0x1a45bfa:0x0] { 2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) lovsub@ffff8804276ae3a0: [1 ffff88037119c6e0 P(0):[0, 18446744073709551615]@[0x1bb30df888:0x4:0x0]] 2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) osc@ffff880425f1c2b0: ffff8801354436c0 00101001 0x92774b263242bbe4 3 ffff880812898b10 size: 0 mtime: 1335712511 atime: 1335712511 ctime: 1335712511 blocks: 0 2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) } lock@ffff88031bb1eeb8 2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) 2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) } lock@ffff880320159b38 2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:381:lovsub_lock_delete_one()) Delete CLS_HELD lock 2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:383:lovsub_lock_delete_one()) Impossible state: 2 2012-04-29 12:15:02 LustreError: 72532:0:(lovsub_lock.c:384:lovsub_lock_delete_one()) LBUG 2012-04-29 12:15:02 Pid: 72532, comm: rm 2012-04-29 12:15:02 2012-04-29 12:15:02 Call Trace: 2012-04-29 12:15:02 [<ffffffffa03e2895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs] 2012-04-29 12:15:02 [<ffffffffa03e2dc7>] lbug_with_loc+0x47/0xb0 [libcfs] 2012-04-29 12:15:02 [<ffffffffa08bfff2>] lovsub_lock_delete+0x1a2/0x370 [lov] 2012-04-29 12:15:02 [<ffffffffa055a665>] cl_lock_delete0+0xb5/0x150 [obdclass] 2012-04-29 12:15:02 [<ffffffffa055a816>] cl_lock_hold_release+0x116/0x170 [obdclass] 2012-04-29 12:15:02 [<ffffffffa055bf73>] cl_lock_unhold+0x33/0xc0 [obdclass] 2012-04-29 12:15:02 [<ffffffffa08b7e78>] lov_sublock_release+0x198/0x260 [lov] 2012-04-29 12:15:02 [<ffffffffa08b8ed8>] lov_lock_cancel+0xe8/0x230 [lov] 2012-04-29 12:15:02 [<ffffffffa0559715>] cl_lock_cancel0+0x75/0xf0 [obdclass] 2012-04-29 12:15:02 [<ffffffffa055a7f5>] cl_lock_hold_release+0xf5/0x170 [obdclass] 2012-04-29 12:15:02 [<ffffffffa055beb2>] cl_lock_release+0x82/0x110 [obdclass] 2012-04-29 12:15:02 [<ffffffffa09722d6>] cl_glimpse_lock+0x196/0x330 [lustre] 2012-04-29 12:15:02 [<ffffffffa0972914>] cl_glimpse_size+0x124/0x130 [lustre] 2012-04-29 12:15:02 [<ffffffffa093ab8a>] ll_inode_revalidate_it+0x5a/0x140 [lustre] 2012-04-29 12:15:02 [<ffffffff811843e5>] ? putname+0x45/0x50 2012-04-29 12:15:02 [<ffffffffa093acb9>] ll_getattr_it+0x49/0x170 [lustre] 2012-04-29 12:15:02 [<ffffffffa093ae17>] ll_getattr+0x37/0x40 [lustre] 2012-04-29 12:15:02 [<ffffffff8117cb21>] vfs_getattr+0x51/0x80 2012-04-29 12:15:02 [<ffffffff810d152f>] ? audit_putname+0x3f/0x50 2012-04-29 12:15:02 [<ffffffff8117cbb8>] vfs_fstatat+0x68/0x80 2012-04-29 12:15:02 [<ffffffff8117cbf4>] sys_newfstatat+0x24/0x50 2012-04-29 12:15:02 [<ffffffff8100c5b5>] ? math_state_restore+0x45/0x60 2012-04-29 12:15:02 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b 2012-04-29 12:15:02 2012-04-29 12:15:02 Apr 29 12:15:02 Kernel panic - not syncing: LBUG 2012-04-29 12:15:02 cab283 kernel: LPid: 72532, comm: rm Not tainted 2.6.32-220.13.1.2chaos.ch5.x86_64 #1 2012-04-29 12:15:02 ustreError: 7253Call Trace: 2012-04-29 12:15:02 2:0:(lovsub_lock [<ffffffff814ee6c8>] ? panic+0x78/0x143 2012-04-29 12:15:02 .c:384:lovsub_lo [<ffffffffa03e2e1b>] ? lbug_with_loc+0x9b/0xb0 [libcfs] 2012-04-29 12:15:02 ck_delete_one()) [<ffffffffa08bfff2>] ? lovsub_lock_delete+0x1a2/0x370 [lov] 2012-04-29 12:15:02 LBUG 2012-04-29 12:15:02 Apr 29 12: [<ffffffffa055a665>] ? cl_lock_delete0+0xb5/0x150 [obdclass] 2012-04-29 12:15:02 15:02 cab283 ker [<ffffffffa055a816>] ? cl_lock_hold_release+0x116/0x170 [obdclass] 2012-04-29 12:15:02 nel: Kernel pani [<ffffffffa055bf73>] ? cl_lock_unhold+0x33/0xc0 [obdclass] 2012-04-29 12:15:02 c - not syncing: [<ffffffffa08b7e78>] ? lov_sublock_release+0x198/0x260 [lov] 2012-04-29 12:15:02 LBUG 2012-04-29 12:15:02 [<ffffffffa08b8ed8>] ? lov_lock_cancel+0xe8/0x230 [lov] 2012-04-29 12:15:02 [<ffffffffa0559715>] ? cl_lock_cancel0+0x75/0xf0 [obdclass] 2012-04-29 12:15:02 [<ffffffffa055a7f5>] ? cl_lock_hold_release+0xf5/0x170 [obdclass] 2012-04-29 12:15:02 [<ffffffffa055beb2>] ? cl_lock_release+0x82/0x110 [obdclass] 2012-04-29 12:15:02 [<ffffffffa09722d6>] ? cl_glimpse_lock+0x196/0x330 [lustre] 2012-04-29 12:15:02 [<ffffffffa0972914>] ? cl_glimpse_size+0x124/0x130 [lustre] 2012-04-29 12:15:02 [<ffffffffa093ab8a>] ? ll_inode_revalidate_it+0x5a/0x140 [lustre] 2012-04-29 12:15:03 [<ffffffff811843e5>] ? putname+0x45/0x50 2012-04-29 12:15:03 [<ffffffffa093acb9>] ? ll_getattr_it+0x49/0x170 [lustre] 2012-04-29 12:15:03 [<ffffffffa093ae17>] ? ll_getattr+0x37/0x40 [lustre] 2012-04-29 12:15:03 [<ffffffff8117cb21>] ? vfs_getattr+0x51/0x80 2012-04-29 12:15:03 [<ffffffff810d152f>] ? audit_putname+0x3f/0x50 2012-04-29 12:15:03 [<ffffffff8117cbb8>] ? vfs_fstatat+0x68/0x80 2012-04-29 12:15:03 [<ffffffff8117cbf4>] ? sys_newfstatat+0x24/0x50 2012-04-29 12:15:03 [<ffffffff8100c5b5>] ? math_state_restore+0x45/0x60 2012-04-29 12:15:03 [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b We also have a core dump which I can poke at if you need me to.

          Here's the stack trace.

          crash> bt
          PID: 72532  TASK: ffff880431796ae0  CPU: 3   COMMAND: "rm"
           #0 [ffff880332151920] machine_kexec at ffffffff8103216b
           #1 [ffff880332151980] crash_kexec at ffffffff810b8c12
           #2 [ffff880332151a50] panic at ffffffff814ee6cf
           #3 [ffff880332151ad0] lbug_with_loc at ffffffffa03e2e1b [libcfs]
           #4 [ffff880332151af0] lovsub_lock_delete at ffffffffa08bfff2 [lov]
           #5 [ffff880332151b50] cl_lock_delete0 at ffffffffa055a665 [obdclass]
           #6 [ffff880332151b80] cl_lock_hold_release at ffffffffa055a816 [obdclass]
           #7 [ffff880332151ba0] cl_lock_unhold at ffffffffa055bf73 [obdclass]
           #8 [ffff880332151bd0] lov_sublock_release at ffffffffa08b7e78 [lov]
           #9 [ffff880332151c20] lov_lock_cancel at ffffffffa08b8ed8 [lov]
          #10 [ffff880332151c90] cl_lock_cancel0 at ffffffffa0559715 [obdclass]
          #11 [ffff880332151cc0] cl_lock_hold_release at ffffffffa055a7f5 [obdclass]
          #12 [ffff880332151ce0] cl_lock_release at ffffffffa055beb2 [obdclass]
          #13 [ffff880332151d10] cl_glimpse_lock at ffffffffa09722d6 [lustre]
          #14 [ffff880332151d70] cl_glimpse_size at ffffffffa0972914 [lustre]
          #15 [ffff880332151dc0] ll_inode_revalidate_it at ffffffffa093ab8a [lustre]
          #16 [ffff880332151e00] ll_getattr_it at ffffffffa093acb9 [lustre]
          #17 [ffff880332151e30] ll_getattr at ffffffffa093ae17 [lustre]
          #18 [ffff880332151e70] vfs_getattr at ffffffff8117cb21
          #19 [ffff880332151eb0] vfs_fstatat at ffffffff8117cbb8
          #20 [ffff880332151ef0] sys_newfstatat at ffffffff8117cbf4
          #21 [ffff880332151f80] system_call_fastpath at ffffffff8100b0f2
              RIP: 00002aaaaada549a  RSP: 00007fffffffd2d8  RFLAGS: 00010246
              RAX: 0000000000000106  RBX: ffffffff8100b0f2  RCX: 3144324c2f303144
              RDX: 000000000060e600  RSI: 000000000060d0f0  RDI: ffffffffffffff9c
              RBP: 000000000060e600   R8: 0000000000000100   R9: 0000000000000000
              R10: 0000000000000100  R11: 0000000000000246  R12: 000000000060d060
              R13: 000000000060e590  R14: 0000000000000004  R15: 000000000060d060
              ORIG_RAX: 0000000000000106  CS: 0033  SS: 002b
          
          nedbass Ned Bass (Inactive) added a comment - Here's the stack trace. crash> bt PID: 72532 TASK: ffff880431796ae0 CPU: 3 COMMAND: "rm" #0 [ffff880332151920] machine_kexec at ffffffff8103216b #1 [ffff880332151980] crash_kexec at ffffffff810b8c12 #2 [ffff880332151a50] panic at ffffffff814ee6cf #3 [ffff880332151ad0] lbug_with_loc at ffffffffa03e2e1b [libcfs] #4 [ffff880332151af0] lovsub_lock_delete at ffffffffa08bfff2 [lov] #5 [ffff880332151b50] cl_lock_delete0 at ffffffffa055a665 [obdclass] #6 [ffff880332151b80] cl_lock_hold_release at ffffffffa055a816 [obdclass] #7 [ffff880332151ba0] cl_lock_unhold at ffffffffa055bf73 [obdclass] #8 [ffff880332151bd0] lov_sublock_release at ffffffffa08b7e78 [lov] #9 [ffff880332151c20] lov_lock_cancel at ffffffffa08b8ed8 [lov] #10 [ffff880332151c90] cl_lock_cancel0 at ffffffffa0559715 [obdclass] #11 [ffff880332151cc0] cl_lock_hold_release at ffffffffa055a7f5 [obdclass] #12 [ffff880332151ce0] cl_lock_release at ffffffffa055beb2 [obdclass] #13 [ffff880332151d10] cl_glimpse_lock at ffffffffa09722d6 [lustre] #14 [ffff880332151d70] cl_glimpse_size at ffffffffa0972914 [lustre] #15 [ffff880332151dc0] ll_inode_revalidate_it at ffffffffa093ab8a [lustre] #16 [ffff880332151e00] ll_getattr_it at ffffffffa093acb9 [lustre] #17 [ffff880332151e30] ll_getattr at ffffffffa093ae17 [lustre] #18 [ffff880332151e70] vfs_getattr at ffffffff8117cb21 #19 [ffff880332151eb0] vfs_fstatat at ffffffff8117cbb8 #20 [ffff880332151ef0] sys_newfstatat at ffffffff8117cbf4 #21 [ffff880332151f80] system_call_fastpath at ffffffff8100b0f2 RIP: 00002aaaaada549a RSP: 00007fffffffd2d8 RFLAGS: 00010246 RAX: 0000000000000106 RBX: ffffffff8100b0f2 RCX: 3144324c2f303144 RDX: 000000000060e600 RSI: 000000000060d0f0 RDI: ffffffffffffff9c RBP: 000000000060e600 R8: 0000000000000100 R9: 0000000000000000 R10: 0000000000000100 R11: 0000000000000246 R12: 000000000060d060 R13: 000000000060e590 R14: 0000000000000004 R15: 000000000060d060 ORIG_RAX: 0000000000000106 CS: 0033 SS: 002b

          Do you have a stack trace for this?

          jay Jinshan Xiong (Inactive) added a comment - Do you have a stack trace for this?

          People

            jay Jinshan Xiong (Inactive)
            prakash Prakash Surya (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: