[LU-14379] ls hangs on client and console log message LNet: Service thread pid 25241 was inactive performing mdt_reint_rename

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Environment: zfs-0.7
      kernel-3.10.0-1160.4.1.1chaos.ch6.x86_64
      lustre-2.12.5_10.llnl-3.ch6.x86_64

    Description

      At some point on the morning of Jan 27th, we got reports of directory listings ("ls") hanging, where the directories were on MDT5 and MDT7.

      The console logs of MDT5 and MDT7 both reported repeated watchdog dumps, all with very similar stacks. The first one on MDT7 appeared at Thu Jan 21 12:01:33 2021:

      # From zinc8 dmesg log
      LNet: Service thread pid 25241 was inactive for 200.13s. The thread might be hung, or it might only be slow and will resume later. D
      Pid: 25241, comm: mdt01_049 3.10.0-1160.4.1.1chaos.ch6.x86_64 #1 SMP Fri Oct 9 17:56:20 PDT 2020
      Call Trace:
       [<ffffffffc141a460>] ldlm_completion_ast+0x440/0x870 [ptlrpc]
       [<ffffffffc141be2f>] ldlm_cli_enqueue_fini+0x96f/0xdf0 [ptlrpc]
       [<ffffffffc141ef3e>] ldlm_cli_enqueue+0x40e/0x920 [ptlrpc]
       [<ffffffffc1982342>] osp_md_object_lock+0x162/0x2d0 [osp]
       [<ffffffffc1895194>] lod_object_lock+0xf4/0x780 [lod]
       [<ffffffffc1916ace>] mdd_object_lock+0x3e/0xe0 [mdd]
       [<ffffffffc17ae681>] mdt_remote_object_lock_try+0x1e1/0x750 [mdt]
       [<ffffffffc17aec1a>] mdt_remote_object_lock+0x2a/0x30 [mdt]
       [<ffffffffc17c407e>] mdt_rename_lock+0xbe/0x4b0 [mdt]
       [<ffffffffc17c6400>] mdt_reint_rename+0x2c0/0x2900 [mdt]
       [<ffffffffc17cf113>] mdt_reint_rec+0x83/0x210 [mdt]
       [<ffffffffc17ab303>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
       [<ffffffffc17b6b37>] mdt_reint+0x67/0x140 [mdt] 
       [<ffffffffc14b8b1a>] tgt_request_handle+0xada/0x1570 [ptlrpc]
       [<ffffffffc145d80b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
       [<ffffffffc1461bfd>] ptlrpc_main+0xc4d/0x2280 [ptlrpc]
       [<ffffffffadecafc1>] kthread+0xd1/0xe0
       [<ffffffffae5c1ff7>] ret_from_fork_nospec_end+0x0/0x39
       [<ffffffffffffffff>] 0xffffffffffffffff
      LustreError: dumping log to /tmp/lustre-log.1611259694.25241 

      The remaining dumps took a different path within ldlm_cli_enqueue, blocked in ptlrpc_queue_wait waiting for the RPC reply rather than in ldlm_completion_ast after the reply:

      ptlrpc_set_wait+0x4d8/0x800 [ptlrpc]
      ptlrpc_queue_wait+0x83/0x230 [ptlrpc]
      ldlm_cli_enqueue+0x3d2/0x920 [ptlrpc]
      osp_md_object_lock+0x162/0x2d0 [osp]
      lod_object_lock+0xf4/0x780 [lod]
      mdd_object_lock+0x3e/0xe0 [mdd]
      mdt_remote_object_lock_try+0x1e1/0x750 [mdt]
      mdt_remote_object_lock+0x2a/0x30 [mdt]
      mdt_rename_lock+0xbe/0x4b0 [mdt]
      mdt_reint_rename+0x2c0/0x2900 [mdt]
      mdt_reint_rec+0x83/0x210 [mdt]
      mdt_reint_internal+0x6e3/0xaf0 [mdt]
      mdt_reint+0x67/0x140 [mdt]
      tgt_request_handle+0xada/0x1570 [ptlrpc]
      ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
      ptlrpc_main+0xc4d/0x2280 [ptlrpc]
      kthread+0xd1/0xe0
      ret_from_fork_nospec_end+0x0/0x39
      0xffffffffffffffff
      

      I can provide MDT7 debug logs and an LDLM namespace dump; I have a core dump from MDT5, and dmesg logs for both servers.
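
      For reference, a minimal sketch of how the debug data above is typically collected on the MDS with standard lctl tooling (parameter names should be double-checked against the installed release; ldlm.dump_namespaces in particular is an assumption here):

      # Decode the binary dump the watchdog already wrote (path taken from the console message above)
      lctl df /tmp/lustre-log.1611259694.25241 /tmp/lustre-log.1611259694.25241.txt

      # Capture the current kernel debug buffer to a file
      lctl dk /tmp/lustre-debug.txt

      # Ask LDLM to dump its lock namespaces into the debug buffer, then capture that too
      lctl set_param ldlm.dump_namespaces=1
      lctl dk /tmp/ldlm-namespaces.txt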

          Activity

            laisiyao Lai Siyao added a comment -

            This looks to be a duplicate of LU-13437; the fix landed on the latest 2.12 branch, so you can cherry-pick it from there.
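
            For what it's worth, a rough sketch of that cherry-pick against the standard lustre-release tree (the branch name and repository URL are assumptions; the exact commit for the LU-13437 fix would need to be looked up):

            # Sketch only: find and cherry-pick the LU-13437 fix from the upstream b2_12 branch
            git fetch git://git.whamcloud.com/fs/lustre-release.git b2_12
            git log --oneline FETCH_HEAD | grep -i LU-13437
            git cherry-pick <commit-found-above>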

            ofaaland Olaf Faaland added a comment -

            PID 25241, given in the watchdog warning above, appears in console log messages regarding changelog handling:

            # [Sat Jan 23 12:21:13 2021] Lustre: 25161:0:(mdd_device.c:1811:mdd_changelog_clear()) lsh-MDD0007: Failure to clear the changelog for user 6: -22
            >> [Sat Jan 23 12:21:53 2021] Lustre: 25241:0:(mdd_device.c:1811:mdd_changelog_clear()) lsh-MDD0007: Failure to clear the changelog for user 6: -22
            # [Sat Jan 23 12:24:08 2021] Lustre: 25188:0:(mdd_device.c:1811:mdd_changelog_clear()) lsh-MDD0007: Failure to clear the changelog for user 6: -22 
            

            There are many messages like that; some of the others are:

            [Sat Jan 23 13:38:06 2021] Lustre: 25241:0:(mdd_device.c:1811:mdd_changelog_clear()) lsh-MDD0007: Failure to clear the changelog for user 6: -22
            [Sat Jan 23 13:38:06 2021] Lustre: 25241:0:(mdd_device.c:1811:mdd_changelog_clear()) Skipped 12 previous similar messages
            [Sat Jan 23 15:25:52 2021] Lustre: 25241:0:(mdd_device.c:1811:mdd_changelog_clear()) lsh-MDD0007: Failure to clear the changelog for user 6: -22
            [Sat Jan 23 15:25:52 2021] Lustre: 25241:0:(mdd_device.c:1811:mdd_changelog_clear()) Skipped 9 previous similar messages
            [Sat Jan 23 19:39:04 2021] Lustre: 25241:0:(mdd_device.c:1811:mdd_changelog_clear()) lsh-MDD0007: Failure to clear the changelog for user 6: -22
            

            The changelog clear messages come from Starfish, a policy manager similar to Robinhood, which frequently attempts to clear changelog entries that have already been cleared.
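
            For context, a sketch of how the changelog consumer state and a clear request can be inspected (the device name comes from the messages above; "cl6" for user 6 and <endrec> are assumed/hypothetical values):

            # Show registered changelog consumers and the last record each has cleared
            lctl get_param mdd.lsh-MDT0007.changelog_users

            # Clearing up to a record that is already behind the consumer's cleared index
            # fails with -22 (EINVAL), matching the console messages above
            lfs changelog_clear lsh-MDT0007 cl6 <endrec>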

            ofaaland Olaf Faaland added a comment -

            SysRq show-backtrace-all-active-cpus (l) dumped no stacks other than the core processing the SysRq; all other cores were idle:

            # From [Wed Jan 27 13:13:19 2021] 
            SysRq : Show backtrace of all active CPUs
            sending NMI to all CPUs:
            NMI backtrace for cpu 0 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 1 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 2 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 3 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 4 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 5 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 6 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 7 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 8
            CPU: 8 PID: 0 Comm: swapper/8 Kdump: loaded Tainted: P           OE  ------------ T 3.10.0-1160.4.1.1chaos.ch6.x86_64 #1
            Hardware name: Intel Corporation S2600WTTR/S2600WTTR, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016
            task: ffff98d5ca74e300 ti: ffff98d5cab50000 task.ti: ffff98d5cab50000
            RIP: 0010:[<ffffffffade6102b>]  [<ffffffffade6102b>] default_send_IPI_mask_sequence_phys+0x7b/0x100
            RSP: 0018:ffff9914bf203ce8  EFLAGS: 00000046
            RAX: ffff9914bf240000 RBX: 000000000000e02e RCX: 0000000000000020
            RDX: 0000000000000009 RSI: 0000000000000012 RDI: 0000000000000000
            RBP: ffff9914bf203d20 R08: ffffffffaeb5b5a0 R09: 0000000000008032
            R10: 0000000000000000 R11: ffff9914bf203a2e R12: ffffffffaeb5b5a0
            R13: 0000000000000400 R14: 0000000000000086 R15: 0000000000000002
            FS:  0000000000000000(0000) GS:ffff9914bf200000(0000) knlGS:0000000000000000
            CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
            CR2: 00007ffff7ff8000 CR3: 0000007bf1ed8000 CR4: 00000000001607e0
            Call Trace:
             <IRQ> [<ffffffffade66d4e>] physflat_send_IPI_mask+0xe/0x10
             [<ffffffffade6153a>] arch_trigger_all_cpu_backtrace+0x2ea/0x2f0
             [<ffffffffae2932f3>] sysrq_handle_showallcpus+0x13/0x20
             [<ffffffffae2939ed>] __handle_sysrq+0x11d/0x180
             [<ffffffffae293a76>] handle_sysrq+0x26/0x30
             ...
             [<ffffffffae5c6b2d>] do_IRQ+0x4d/0xf0
             [<ffffffffae5b836a>] common_interrupt+0x16a/0x16a
             <EOI> [<ffffffffae3eab67>] ? cpuidle_enter_state+0x57/0xd0
             [<ffffffffae3eacbe>] cpuidle_idle_call+0xde/0x270
             [<ffffffffade3919e>] arch_cpu_idle+0xe/0xc0
             [<ffffffffadf0835a>] cpu_startup_entry+0x14a/0x1e0
             [<ffffffffade5cb87>] start_secondary+0x207/0x280
             [<ffffffffade000d5>] start_cpu+0x5/0x14
            Code: c2 01 4c 89 e7 48 63 d2 e8 93 8f 35 00 89 c2 48 63 05 ae e6 cf 00 48 39 c2 73 53 48 8b 04 d5 a0 15 b5 ae 41 83 ff 02 0f b7 34 03 <74> 5a 8b 04 25 00 93 5a ff f6 c4 10 74 15 0f 1f 80 00 00 00 00 
            NMI backtrace for cpu 9 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 10 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 11 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 12 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 13 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 14 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 15 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 16 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 17 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 18 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 19 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 20 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 21 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 22 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 23 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 24 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 25 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 26 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 27 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 28 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 29 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 30 skipped: idling at pc 0xffffffffae5b6aa1
            NMI backtrace for cpu 31 skipped: idling at pc 0xffffffffae5b6aa1
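
            (For the record, output like the above is produced with magic SysRq 'l', e.g. on a node with sysrq enabled:)

            # Trigger "show backtrace of all active CPUs" via magic SysRq
            echo l > /proc/sysrq-trigger
            # The backtraces land in the kernel log / console
            dmesg | tail -n 200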
            ofaaland Olaf Faaland added a comment -

            Code listings for the two points where the stacks differ:

            (gdb) l *(ldlm_cli_enqueue+0x3d2)
            0x25f32 is in ldlm_cli_enqueue (/usr/src/debug/lustre-2.12.5_10.llnl/lustre/ldlm/ldlm_request.c:1036).
            1031	
            1032		LDLM_DEBUG(lock, "sending request");
            1033	
            1034		rc = ptlrpc_queue_wait(req);
            1035	
            1036		err = ldlm_cli_enqueue_fini(exp, req, einfo->ei_type, policy ? 1 : 0,
            1037					    einfo->ei_mode, flags, lvb, lvb_len,
            1038					    lockh, rc);
            1039	
            1040		/* If ldlm_cli_enqueue_fini did not find the lock, we need to free
            
            
            (gdb) l *(ldlm_cli_enqueue+0x40e)
            0x25f6e is in ldlm_cli_enqueue (/usr/src/debug/lustre-2.12.5_10.llnl/lustre/ldlm/ldlm_request.c:1042).
            1037					    einfo->ei_mode, flags, lvb, lvb_len,
            1038					    lockh, rc);
            1039	
            1040		/* If ldlm_cli_enqueue_fini did not find the lock, we need to free
            1041		 * one reference that we took */
            1042		if (err == -ENOLCK)
            1043			LDLM_LOCK_RELEASE(lock);
            1044		else
            1045			rc = err;
            1046	 
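
            For reference, listings like these come from pointing gdb at the ptlrpc module built with debug info, roughly as follows (the module path is an assumption for this build):

            # Resolve the return addresses from the two stacks against the source
            gdb /path/to/debuginfo/ptlrpc.ko
            (gdb) l *(ldlm_cli_enqueue+0x3d2)
            (gdb) l *(ldlm_cli_enqueue+0x40e)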
            ofaaland Olaf Faaland added a comment -

            Possibly related to LU-14378

            ofaaland Olaf Faaland added a comment -

            For my records, my local ticket is TOSS5037


            People

              Assignee:  laisiyao Lai Siyao
              Reporter:  ofaaland Olaf Faaland
              Votes:     0
              Watchers:  3
