Lustre / LU-3669

(mdt_xattr.c:178:mdt_getxattr()) ASSERTION( lu_object_assert_exists(&info->mti_object->mot_obj) ) failed:

    Description

      Using today's master 2.4.52-72-g643e0ae with the extra racer operations from http://review.whamcloud.com/#/c/5936/ I see LDLM hangs and LBUGs in
      the getxattr handler.

      # MOUNT_2=y llmount.sh
      # sh ./lustre/tests/racer.sh
      
      LNet: Service thread pid 16149 was inactive for 40.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      Pid: 16149, comm: mdt01_004
      
      # cat /proc/16149/stack
      [<ffffffffa04ed641>] cfs_waitq_timedwait+0x11/0x20 [libcfs]
      [<ffffffffa07f912d>] ldlm_completion_ast+0x4ed/0x960 [ptlrpc]
      [<ffffffffa07f8870>] ldlm_cli_enqueue_local+0x1f0/0x5c0 [ptlrpc]
      [<ffffffffa0cd3c0b>] mdt_object_lock0+0x33b/0xaf0 [mdt]
      [<ffffffffa0cd4484>] mdt_object_lock+0x14/0x20 [mdt]
      [<ffffffffa0cd4520>] mdt_intent_getxattr+0x90/0x160 [mdt]
      [<ffffffffa0cd111e>] mdt_intent_policy+0x3ae/0x770 [mdt]
      [<ffffffffa07d9471>] ldlm_lock_enqueue+0x361/0x8c0 [ptlrpc]
      [<ffffffffa080219f>] ldlm_handle_enqueue0+0x4ef/0x10b0 [ptlrpc]
      [<ffffffffa0cd15e6>] mdt_enqueue+0x46/0xe0 [mdt]
      [<ffffffffa0cd7db7>] mdt_handle_common+0x647/0x16d0 [mdt]
      [<ffffffffa0d12d25>] mds_regular_handle+0x15/0x20 [mdt]
      [<ffffffffa0831fd8>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
      [<ffffffffa083335d>] ptlrpc_main+0xabd/0x1700 [ptlrpc]
      [<ffffffff81096936>] kthread+0x96/0xa0
      [<ffffffff8100c0ca>] child_rip+0xa/0x20
      [<ffffffffffffffff>] 0xffffffffffffffff
      
      LustreError: 11483:0:(mdt_xattr.c:178:mdt_getxattr()) ASSERTION( lu_object_assert_exists(&info->mti_object->mot_obj) ) failed: 
      LustreError: 11483:0:(mdt_xattr.c:178:mdt_getxattr()) LBUG
      Pid: 11483, comm: mdt00_005
      
      crash> bt
      PID: 11483  TASK: ffff8801f2651540  CPU: 0   COMMAND: "mdt00_005"
       #0 [ffff8801feb478f8] machine_kexec at ffffffff81035c9b
       #1 [ffff8801feb47958] crash_kexec at ffffffff810c0d22
       #2 [ffff8801feb47a28] panic at ffffffff8150eeff
       #3 [ffff8801feb47aa8] lbug_with_loc at ffffffffa0c08eeb [libcfs]
       #4 [ffff8801feb47ac8] mdt_getxattr at ffffffffa06ad35f [mdt]
       #5 [ffff8801feb47b68] mdt_intent_getxattr at ffffffffa068a52e [mdt]
       #6 [ffff8801feb47bb8] mdt_intent_policy at ffffffffa068711e [mdt]
       #7 [ffff8801feb47bf8] ldlm_lock_enqueue at ffffffffa0e6e471 [ptlrpc]
       #8 [ffff8801feb47c58] ldlm_handle_enqueue0 at ffffffffa0e9719f [ptlrpc]
       #9 [ffff8801feb47cc8] mdt_enqueue at ffffffffa06875e6 [mdt]
      #10 [ffff8801feb47ce8] mdt_handle_common at ffffffffa068ddf7 [mdt]
      #11 [ffff8801feb47d38] mds_regular_handle at ffffffffa06c8ca5 [mdt]
      #12 [ffff8801feb47d48] ptlrpc_server_handle_request at ffffffffa0ec6fd8 [ptlrpc]
      #13 [ffff8801feb47e48] ptlrpc_main at ffffffffa0ec835d [ptlrpc]
      #14 [ffff8801feb47ee8] kthread at ffffffff81096936
      #15 [ffff8801feb47f48] kernel_thread at ffffffff8100c0ca
      

      We can't keep adding IBITs without defining some ordering or hierarchy for when we use them.

      00010000:00010000:6.0:1375198422.885704:0:5685:0:(ldlm_resource.c:1448:ldlm_resource_dump()) --- Resource: [0x200000401:0x8:0x0].0 (ffff8801eceee480) refcount = 8
      00010000:00010000:6.0:1375198422.885707:0:5685:0:(ldlm_resource.c:1451:ldlm_resource_dump()) Granted locks (in reverse order):
      00010000:00010000:6.0:1375198422.885708:0:5685:0:(ldlm_resource.c:1454:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801e9cc9d00/0x132cecfff5acfc2f lrc: 3/0,0 mode: PW/PW res: [0x200000401:0x8:0x0].0 bits 0x20 rrc: 8 type: IBT flags: 0x60200000000020 nid: 0@lo remote: 0x132cecfff5acfc05 expref: 30 pid: 3274 timeout: 4294866401 lvb_type: 0
      00010000:00010000:6.0:1375198422.885715:0:5685:0:(ldlm_resource.c:1454:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801eeae4d40/0x132cecfff5acf6e1 lrc: 2/0,0 mode: CR/CR res: [0x200000401:0x8:0x0].0 bits 0x9 rrc: 8 type: IBT flags: 0x40200000000000 nid: 0@lo remote: 0x132cecfff5acf6c5 expref: 30 pid: 5252 timeout: 0 lvb_type: 0
      00010000:00010000:6.0:1375198422.885721:0:5685:0:(ldlm_resource.c:1454:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801ec5246c0/0x132cecfff5acebf8 lrc: 2/0,0 mode: CR/CR res: [0x200000401:0x8:0x0].0 bits 0x9 rrc: 8 type: IBT flags: 0x40200000000000 nid: 0@lo remote: 0x132cecfff5acebdc expref: 39 pid: 3274 timeout: 0 lvb_type: 0
      00010000:00010000:6.0:1375198422.885727:0:5685:0:(ldlm_resource.c:1454:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801eb589280/0x132cecfff5ace354 lrc: 2/0,0 mode: CR/CR res: [0x200000401:0x8:0x0].0 bits 0x8 rrc: 8 type: IBT flags: 0x40000000000000 nid: 0@lo remote: 0x132cecfff5ace34d expref: 30 pid: 3810 timeout: 0 lvb_type: 3
      00010000:00010000:6.0:1375198422.885734:0:5685:0:(ldlm_resource.c:1469:ldlm_resource_dump()) Waiting locks:
      00010000:00010000:6.0:1375198422.885736:0:5685:0:(ldlm_resource.c:1471:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801eea356c0/0x132cecfff5ad0e42 lrc: 3/1,0 mode: --/PR res: [0x200000401:0x8:0x0].0 bits 0x20 rrc: 8 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 5306 timeout: 0 lvb_type: 0
      00010000:00010000:6.0:1375198422.885742:0:5685:0:(ldlm_resource.c:1471:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801e95e8d00/0x132cecfff5ad0f1b lrc: 3/0,1 mode: --/EX res: [0x200000401:0x8:0x0].0 bits 0x2 rrc: 8 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 3274 timeout: 0 lvb_type: 0
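
      To make the "bits" fields in the dump above easier to read, here is a small standalone decoder. The MDS_INODELOCK_* values are assumed from a
      2.4-era lustre_idl.h rather than copied from this tree, so treat the names as a best guess; under these values 0x20 is XATTR, 0x9 is
      LOOKUP|LAYOUT, 0x8 is LAYOUT, and 0x2 is UPDATE.

      #include <stdio.h>

      /* Assumed MDS_INODELOCK_* values (2.4-era lustre_idl.h); decoding aid only. */
      #define MDS_INODELOCK_LOOKUP  0x000001
      #define MDS_INODELOCK_UPDATE  0x000002
      #define MDS_INODELOCK_OPEN    0x000004
      #define MDS_INODELOCK_LAYOUT  0x000008
      #define MDS_INODELOCK_PERM    0x000010
      #define MDS_INODELOCK_XATTR   0x000020

      static void decode(unsigned bits)
      {
              printf("0x%x =", bits);
              if (bits & MDS_INODELOCK_LOOKUP) printf(" LOOKUP");
              if (bits & MDS_INODELOCK_UPDATE) printf(" UPDATE");
              if (bits & MDS_INODELOCK_OPEN)   printf(" OPEN");
              if (bits & MDS_INODELOCK_LAYOUT) printf(" LAYOUT");
              if (bits & MDS_INODELOCK_PERM)   printf(" PERM");
              if (bits & MDS_INODELOCK_XATTR)  printf(" XATTR");
              printf("\n");
      }

      int main(void)
      {
              /* The bits values that appear in the dump above. */
              decode(0x20);
              decode(0x09);
              decode(0x08);
              decode(0x02);
              return 0;
      }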
      

      Attachments

        Issue Links

          Activity

            [LU-3669] (mdt_xattr.c:178:mdt_getxattr()) ASSERTION( lu_object_assert_exists(&info->mti_object->mot_obj) ) failed:

            panda Andrew Perepechko added a comment -

            > The hang reproduces within seconds of starting the two loops.

            OK, I see.

            > I pulled your patch for the crash and it seems OK.

            Great! Thanks a lot for the feedback.

            > Eventually the setattr gets a PW XATTR lock, the unlink enqueues a blocked EX FULL lock, and the setattr enqueues a blocked EX UPDATE lock. Sorry I didn't mention it earlier, I only thought of it this morning.

            Hm, right. It looks like nested LDLM locking is often a headache.
            jhammond John Hammond added a comment -

            Hi Andrew, I was just updating the ticket when I saw your comment.

            > It reproduces within seconds of starting the two loops.

            Sorry, I misread your question. The hang reproduces within seconds of starting the two loops. For the crash use http://review.whamcloud.com/#/c/5936/ and run racer. Four times out of five it will hang; the other time it will crash. I pulled your patch for the crash and it seems OK.

            > John, is there a chance that 7148 will be landed or should locking be redesigned anyway?

            Not sure. I'm working to get reviewers on it.

            I'm still concerned that 7148 + 7191 are not enough. Consider the following case:

            # MOUNT_2=y llmount.sh
            # cd /mnt/lustre
            # while true; do touch f0; rm f0; done &
            # cd /mnt/lustre2
            # while true; do setfattr -n user.ATTR -v LULZ f0; done &
            

            Eventually the setattr gets a PW XATTR lock, the unlink enqueues a blocked EX FULL lock, and the setattr enqueues a blocked EX UPDATE lock. Sorry I didn't mention it earlier, I only thought of it this morning.


            panda Andrew Perepechko added a comment -

            Using http://review.whamcloud.com/#/c/7148/ and http://review.whamcloud.com/#/c/7191/ seems to make racer.sh pass (no traces of eviction or crashes for a few runs).

            John, is there a chance that 7148 will be landed or should locking be redesigned anyway?
            panda Andrew Perepechko added a comment -

            http://review.whamcloud.com/#/c/7191/1 for the crash part

            panda Andrew Perepechko added a comment -

            Never mind, I was able to reproduce it on my hardware.

            panda Andrew Perepechko added a comment -

            How could you see the deadlock then?
            jhammond John Hammond added a comment -

            It reproduces within seconds of starting the two loops.

            panda Andrew Perepechko added a comment - edited

            Right. Thanks for the detailed explanation, John.
            By the way, how easily can you reproduce the crash?

            jhammond John Hammond added a comment -

            Try it and see. Or you can just do something like:

            MOUNT_2=y llmount.sh
            cd /mnt/lustre
            while true; do setfattr -n user.ATTR -v LULZ f0; done &
            cd /mnt/lustre2
            while true; do getfattr -n user.ATTR f0; done &
            

            Soon both handlers will hang and you'll have a resource that looks like:

            Resource: [0x200000400:0x6f:0x0].0 (ffff880194bb20c0) refcount = 6
            Granted locks (in reverse order):
            ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff880187b3e2c0/0x20a0719b52891d17 lrc: 3/0,0 mode: PW/PW res: [0x200000400:0x6f:0x0].0 bits 0x20 rrc: 6 type: IBT flags: 0x60200000000020 nid: 0@lo remote: 0x20a0719b52891d09 expref: 6 pid: 20067 timeout: 4296110059 lvb_type: 0
            ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff88018fbe3d40/0x20a0719b52891bdc lrc: 2/0,0 mode: CR/CR res: [0x200000400:0x6f:0x0].0 bits 0x8 rrc: 6 type: IBT flags: 0x40000000000000 nid: 0@lo remote: 0x20a0719b52891bd5 expref: 8 pid: 17987 timeout: 0 lvb_type: 3
            Waiting locks:
            ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff880194bfd3c0/0x20a0719b52891d2c lrc: 3/1,0 mode: --/PR res: [0x200000400:0x6f:0x0].0 bits 0x20 rrc: 6 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 17987 timeout: 0 lvb_type: 0
            ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff88018f920180/0x20a0719b52891d33 lrc: 3/0,1 mode: --/EX res: [0x200000400:0x6f:0x0].0 bits 0x2 rrc: 6 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 20067 timeout: 0 lvb_type: 0
            

            Indeed, the EX lock was added to the waiting list to wait for the cancellation of some other lock, and the state shown above is from after that lock was canceled. But the EX lock will not be granted since the head of the waiting list is blocked. Please see http://review.whamcloud.com/#/c/7148/ for some discussion.
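
            To make that concrete, here is a minimal standalone C model of a strictly FIFO waiting list over inodebits. It is only a sketch, not the
            ldlm code; the bit values 0x20 and 0x2 are taken from the dump above. The PR waiter on 0x20 conflicts with the granted PW lock, the scan
            stops at that blocked head, and the EX waiter on 0x2 is never reached even though nothing granted conflicts with it.

            #include <stdbool.h>
            #include <stdio.h>

            enum mode { PR, PW, EX };

            struct lock { enum mode mode; unsigned bits; };

            static bool modes_conflict(enum mode a, enum mode b)
            {
                    return !(a == PR && b == PR);   /* PR/PR is the only compatible pair used here */
            }

            static bool can_grant(const struct lock *granted, int n, const struct lock *w)
            {
                    for (int i = 0; i < n; i++)
                            if ((granted[i].bits & w->bits) &&
                                modes_conflict(granted[i].mode, w->mode))
                                    return false;
                    return true;
            }

            int main(void)
            {
                    /* Granted: PW on bit 0x20, as in the dump above. */
                    struct lock granted[] = { { PW, 0x20 } };
                    /* Waiting, in FIFO order: PR on 0x20, then EX on 0x02. */
                    struct lock waiting[] = { { PR, 0x20 }, { EX, 0x02 } };

                    for (int i = 0; i < 2; i++) {
                            if (!can_grant(granted, 1, &waiting[i])) {
                                    printf("waiter %d blocked; FIFO scan stops here\n", i);
                                    break;  /* the EX waiter behind it is never reached */
                            }
                            printf("waiter %d could be granted\n", i);
                    }
                    return 0;
            }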


            panda Andrew Perepechko added a comment -

            > We can't keep adding IBITs without defining some ordering or hierarchy for when we use them.

            That supposedly means that the lock dump following that sentence in the description shows a deadlock or something of that kind.

            However, nothing prevents EX from being granted. After the EX lock is granted and setxattr succeeds, the client will release the PW lock and the PR lock request can succeed.

            People

              jhammond John Hammond
              jhammond John Hammond
              Votes: 0
              Watchers: 7
