Details

    • Type: Bug
    • Resolution: Incomplete
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.10.4, Lustre 2.10.5, Lustre 2.10.6
    • Labels: None
    • Environment: CentOS 7, 3.10.0-862.2.3.el7_lustre.x86_64, 1 MDS (+1 HA pair), 4 OSS
    • Severity: 3
    • 9223372036854775807

    Description

      After upgrading the Lustre servers from CentOS 6 and the IEEL/DDN build of Lustre 2.5 to CentOS 7 and Lustre 2.10.5, we are experiencing stability issues related to an MDS/OSS deadlock, most likely caused by a hung OST thread. The issue is visible on only one of our two filesystems, the smaller one; both run on an identical hardware and software stack, so the issue is most likely related to a specific workload on that filesystem. It is mostly a mid-term storage filesystem, so more metadata operations happen on it than on the scratch filesystem.

      We initially thought the issue might be caused by our attempt to enable project quotas (tune2fs -O project on all targets), so we turned the feature off again (tune2fs -O ^project), which did not change anything. All combinations of MDT/OST quota enforcement (initially 'g', then 'gp', now 'none') made no difference either, so quota enforcement is probably not related to the problem. For completeness, we run with sync_journal=1 as a workaround for a client memory accounting problem suggested by DDN some time ago.

      The crash happens at least once every few hours, sometimes sequentially one after another, on different OSS nodes. I have also tagged 2.10.6 as affected, since we have tried everything from 2.10.4 up to the current b2_10.
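
      For reference, the quota and journal settings mentioned above correspond roughly to the following commands; this is only a sketch with placeholder names (fsname, /dev/<target>), not the exact invocations used on our systems:

        # enable / later disable the project quota feature on each ldiskfs target
        # (run against the target device; /dev/<target> is a placeholder)
        tune2fs -O project /dev/<target>
        tune2fs -O ^project /dev/<target>

        # quota enforcement combinations tried, set from the MGS (fsname is a placeholder)
        lctl conf_param fsname.quota.mdt=g      # initially
        lctl conf_param fsname.quota.ost=g
        lctl conf_param fsname.quota.mdt=gp     # later
        lctl conf_param fsname.quota.ost=gp
        lctl conf_param fsname.quota.mdt=none   # current
        lctl conf_param fsname.quota.ost=none

        # sync_journal workaround on the OSS nodes
        lctl set_param obdfilter.*.sync_journal=1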

      On OSS we get:


      [Tue Oct  2 11:56:44 2018] LNet: Service thread pid 11401 was inactive for 200.27s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      [Tue Oct  2 11:56:44 2018] Pid: 11401, comm: ll_ost_out00_00 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:44 2018] Call Trace:
      [Tue Oct  2 11:56:44 2018]  [<ffffffff8395aa77>] call_rwsem_down_write_failed+0x17/0x30
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0eeaa8c>] osd_write_lock+0x5c/0xe0 [osd_ldiskfs]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b5b119>] out_tx_attr_set_exec+0x69/0x3f0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b55591>] out_tx_end+0xe1/0x5c0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b596d3>] out_handle+0x1453/0x1bc0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b4f38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0af7e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0afb592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffff836bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:44 2018]  [<ffffffff83d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:44 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      [Tue Oct  2 11:56:44 2018] LustreError: dumping log to /tmp/lustre-log.1538474204.11401
      [Tue Oct  2 11:56:44 2018] Pid: 35664, comm: ll_ost_io00_098 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:44 2018] Call Trace:
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0e7e495>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0e76533>] jbd2_journal_stop+0x343/0x3d0 [jbd2]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc112ab4c>] __ldiskfs_journal_stop+0x3c/0xb0 [ldiskfs]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0efb783>] osd_trans_stop+0x183/0x850 [osd_ldiskfs]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc107d352>] ofd_trans_stop+0x22/0x60 [ofd]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc10835f4>] ofd_commitrw_write+0x7e4/0x1c90 [ofd]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc10877a9>] ofd_commitrw+0x4c9/0xae0 [ofd]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b80864>] obd_commitrw+0x2f3/0x336 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b5338d>] tgt_brw_write+0xffd/0x17d0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b4f38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0af7e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0afb592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffff836bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:45 2018]  [<ffffffff83d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:45 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      [Tue Oct  2 11:56:45 2018] Pid: 35678, comm: ll_ost_io00_112 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:45 2018] Call Trace:
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0e7e495>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0e76533>] jbd2_journal_stop+0x343/0x3d0 [jbd2]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc112ab4c>] __ldiskfs_journal_stop+0x3c/0xb0 [ldiskfs]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0efb783>] osd_trans_stop+0x183/0x850 [osd_ldiskfs]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc107d352>] ofd_trans_stop+0x22/0x60 [ofd]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc10835f4>] ofd_commitrw_write+0x7e4/0x1c90 [ofd]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc10877a9>] ofd_commitrw+0x4c9/0xae0 [ofd]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0b80864>] obd_commitrw+0x2f3/0x336 [ptlrpc]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0b5338d>] tgt_brw_write+0xffd/0x17d0 [ptlrpc]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0b4f38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0af7e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0afb592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:45 2018]  [<ffffffff836bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:45 2018]  [<ffffffff83d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:45 2018]  [<ffffffffffffffff>] 0xffffffffffffffff

      On MDS:

      [Tue Oct  2 11:56:44 2018] LNet: Service thread pid 69588 was inactive for 200.49s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      [Tue Oct  2 11:56:44 2018] Pid: 69588, comm: mdt00_095 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:44 2018] Call Trace:
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0f7e140>] ptlrpc_set_wait+0x4c0/0x920 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0f7e61d>] ptlrpc_queue_wait+0x7d/0x220 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc14de463>] osp_remote_sync+0xd3/0x200 [osp]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc14c6dd0>] osp_attr_set+0x4c0/0x5d0 [osp]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc1421c6b>] lod_sub_attr_set+0x1cb/0x460 [lod]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc1403be6>] lod_obj_stripe_attr_set_cb+0x16/0x30 [lod]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc140fa96>] lod_obj_for_each_stripe+0xb6/0x230 [lod]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc1411043>] lod_attr_set+0x2f3/0x9a0 [lod]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc14810a0>] mdd_attr_set_internal+0x120/0x2a0 [mdd]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc1481e8d>] mdd_attr_set+0x8ad/0xce0 [mdd]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc13645f5>] mdt_reint_setattr+0xba5/0x1060 [mdt]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc1364b33>] mdt_reint_rec+0x83/0x210 [mdt]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc134636b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc1351f07>] mdt_reint+0x67/0x140 [mdt]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0fee38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0f96e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0f9a592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffff946bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:44 2018]  [<ffffffff94d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:44 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      [Tue Oct  2 11:56:44 2018] LustreError: dumping log to /tmp/lustre-log.1538474204.69588
      [Tue Oct  2 11:56:49 2018] LNet: Service thread pid 7634 was inactive for 200.51s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      [Tue Oct  2 11:56:49 2018] Pid: 7634, comm: mdt_rdpg00_000 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:49 2018] Call Trace:
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef085>] wait_transaction_locked+0x85/0xd0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef368>] add_transaction_credits+0x268/0x2f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef5e1>] start_this_handle+0x1a1/0x430 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06efa93>] jbd2__journal_start+0xf3/0x1f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc118aa99>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc12cac8e>] osd_trans_start+0x1ae/0x460 [osd_ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1002512>] top_trans_start+0x702/0x940 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc13ec3f1>] lod_trans_start+0x31/0x40 [lod]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc148c1ba>] mdd_trans_start+0x1a/0x20 [mdd]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1481b69>] mdd_attr_set+0x589/0xce0 [mdd]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136c1d6>] mdt_mfd_close+0x1a6/0x610 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1371951>] mdt_close_internal+0x121/0x220 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1371c70>] mdt_close+0x220/0x780 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fee38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f96e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f9a592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffff946bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:49 2018]  [<ffffffff94d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:49 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      [Tue Oct  2 11:56:49 2018] LustreError: dumping log to /tmp/lustre-log.1538474210.7634
      [Tue Oct  2 11:56:49 2018] Pid: 69508, comm: mdt00_067 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:49 2018] Call Trace:
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef085>] wait_transaction_locked+0x85/0xd0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef368>] add_transaction_credits+0x268/0x2f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef5e1>] start_this_handle+0x1a1/0x430 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06efa93>] jbd2__journal_start+0xf3/0x1f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc118aa99>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc12cac8e>] osd_trans_start+0x1ae/0x460 [osd_ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1369de7>] mdt_empty_transno+0xf7/0x840 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136cf1e>] mdt_mfd_open+0x8de/0xe70 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136da2b>] mdt_finish_open+0x57b/0x690 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136f308>] mdt_reint_open+0x17c8/0x3190 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1364b33>] mdt_reint_rec+0x83/0x210 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc134636b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1346892>] mdt_intent_reint+0x162/0x430 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1351671>] mdt_intent_policy+0x441/0xc70 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f3b2ba>] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f64b53>] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fea452>] tgt_enqueue+0x62/0x210 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fee38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f96e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f9a592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffff946bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:49 2018]  [<ffffffff94d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:49 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      [Tue Oct  2 11:56:49 2018] Pid: 69607, comm: mdt00_105 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:49 2018] Call Trace:
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef085>] wait_transaction_locked+0x85/0xd0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef368>] add_transaction_credits+0x268/0x2f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef5e1>] start_this_handle+0x1a1/0x430 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06efa93>] jbd2__journal_start+0xf3/0x1f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc118aa99>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc12cac8e>] osd_trans_start+0x1ae/0x460 [osd_ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1369de7>] mdt_empty_transno+0xf7/0x840 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136cf1e>] mdt_mfd_open+0x8de/0xe70 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136da2b>] mdt_finish_open+0x57b/0x690 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136f308>] mdt_reint_open+0x17c8/0x3190 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1364b33>] mdt_reint_rec+0x83/0x210 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc134636b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1346892>] mdt_intent_reint+0x162/0x430 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1351671>] mdt_intent_policy+0x441/0xc70 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f3b2ba>] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f64b53>] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fea452>] tgt_enqueue+0x62/0x210 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fee38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f96e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f9a592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffff946bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:49 2018]  [<ffffffff94d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:49 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      [Tue Oct  2 11:56:49 2018] Pid: 69594, comm: mdt00_098 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:49 2018] Call Trace:
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef085>] wait_transaction_locked+0x85/0xd0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef368>] add_transaction_credits+0x268/0x2f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef5e1>] start_this_handle+0x1a1/0x430 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06efa93>] jbd2__journal_start+0xf3/0x1f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc118aa99>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc12cac8e>] osd_trans_start+0x1ae/0x460 [osd_ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1002512>] top_trans_start+0x702/0x940 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc13ec3f1>] lod_trans_start+0x31/0x40 [lod]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc148c1ba>] mdd_trans_start+0x1a/0x20 [mdd]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc14763c0>] mdd_create+0xbe0/0x1400 [mdd]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136fcb5>] mdt_reint_open+0x2175/0x3190 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1364b33>] mdt_reint_rec+0x83/0x210 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc134636b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1346892>] mdt_intent_reint+0x162/0x430 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1351671>] mdt_intent_policy+0x441/0xc70 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f3b2ba>] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f64b53>] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fea452>] tgt_enqueue+0x62/0x210 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fee38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f96e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f9a592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffff946bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:49 2018]  [<ffffffff94d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:49 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      

      Please have a look at the above; full dmesg and Lustre log files are attached as an archive.
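
      The binary debug dumps referenced in the console output (e.g. /tmp/lustre-log.1538474204.11401) can be converted to plain text with lctl before inspection; a minimal example, assuming the dump file from the first OSS trace above:

        lctl debug_file /tmp/lustre-log.1538474204.11401 /tmp/lustre-log.1538474204.11401.txt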


      Attachments

        1. cyf_dec10_client_stacks.log
          28 kB
        2. cyf_dec10_server_stacks.log
          117 kB
        3. logs.zip
          15.44 MB

        Issue Links

          Activity

            [LU-11465] OSS/MDS deadlock in 2.10.5
            m.magrys Marek Magrys added a comment -

            We will bump the server kernels to the latest RHEL 7.6 kernel with patches from 2.10.6; our kernels from 2.10.4 lacked some ext4 patches, which might be the root cause of the stack traces above.


            bzzz Alex Zhuravlev added a comment -

            Hm, I see only writing threads on the OSS (ofd_commitrw_write() and around), but nobody else. Either the log is missing important trace(s) or it's something different. Is it possible to dump all threads (echo t > /proc/sysrq-trigger) and attach them to the ticket?
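
            For reference, a minimal way to capture such a dump on a server node, assuming sysrq is enabled and the kernel ring buffer is large enough to hold all stacks:

              # enable sysrq if needed, dump all task stacks, then save the ring buffer
              sysctl -w kernel.sysrq=1
              echo t > /proc/sysrq-trigger
              dmesg > /tmp/all-task-stacks.$(hostname).log
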
            lflis Lukasz Flis added a comment - - edited

            We have updated all servers, MDS and OSS, together.

            Please find attached the stack traces from the servers, along with the corresponding stacks from clients affected by the hangs:

            cyf_dec10_client_stacks.log cyf_dec10_server_stacks.log

            There were no stack traces on the MDS today. There were some in past days, so I can attach those as well if needed.


            bzzz Alex Zhuravlev added a comment -

            Have you updated the MDS? Can you provide all stack traces, please?
            lflis Lukasz Flis added a comment - - edited

            Short update:

            After moving to 2.10.6 RC2 with LU-5152 reverted, we still see hung threads on the OSSes:


            [507798.983947] LNet: Service thread pid 17130 was inactive for 200.47s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
             [507799.001471] Pid: 17130, comm: ll_ost_io00_094 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
             [507799.012231] Call Trace:
             [507799.014947] [<ffffffffc0eac085>] wait_transaction_locked+0x85/0xd0 [jbd2]
             [507799.022084] [<ffffffffc0eac368>] add_transaction_credits+0x268/0x2f0 [jbd2]
             [507799.029399] [<ffffffffc0eac5e1>] start_this_handle+0x1a1/0x430 [jbd2]
             [507799.036186] [<ffffffffc0eaca93>] jbd2__journal_start+0xf3/0x1f0 [jbd2]
             [507799.043074] [<ffffffffc0f22009>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
             [507799.050909] [<ffffffffc0fecb1e>] osd_trans_start+0x1ae/0x460 [osd_ldiskfs]
             [507799.058151] [<ffffffffc11242ae>] ofd_trans_start+0x6e/0xf0 [ofd]
             [507799.064513] [<ffffffffc112a75b>] ofd_commitrw_write+0x94b/0x1c90 [ofd]
             [507799.071400] [<ffffffffc112e7a9>] ofd_commitrw+0x4c9/0xae0 [ofd]
             [507799.077676] [<ffffffffc0bb7824>] obd_commitrw+0x2f3/0x336 [ptlrpc]
             [507799.084277] [<ffffffffc0b8a38d>] tgt_brw_write+0xffd/0x17d0 [ptlrpc]
             [507799.091046] [<ffffffffc0b8638a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
             [507799.098231] [<ffffffffc0b2ee4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
             [507799.106355] [<ffffffffc0b32592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
             [507799.112928] [<ffffffffbc4bae31>] kthread+0xd1/0xe0
             [507799.118077] [<ffffffffbcb1f5f7>] ret_from_fork_nospec_end+0x0/0x39
             [507799.124610] [<ffffffffffffffff>] 0xffffffffffffffff

            bzzz Alex Zhuravlev added a comment -

            Revert of LU-5152 should be enough.
            m.magrys Marek Magrys added a comment -

            Ok, we could take the patches for a spin on one of our production systems. Should we pull both https://review.whamcloud.com/33682 (revert of LU-5152) and https://review.whamcloud.com/33586 ("LU-10048 osd: async truncate"), or would you recommend sticking with just 33682 for now?

            The frequency of the issue has dropped considerably over the last few weeks, so it will take some time before we can confirm that the problem is solved.

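            For reference, a sketch of how a change like 33682 can be pulled from Gerrit into a local b2_10 tree; the repository URL and the patchset placeholder <N> are assumptions and should be checked against the change page:

              git clone git://git.whamcloud.com/fs/lustre-release.git
              cd lustre-release
              git checkout b2_10
              # <N> is the patchset number shown on https://review.whamcloud.com/33682
              git fetch https://review.whamcloud.com/fs/lustre-release refs/changes/82/33682/<N>
              git cherry-pick FETCH_HEAD
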

            adilger Andreas Dilger added a comment -

            For 2.10.5 the https://review.whamcloud.com/33682 revert of LU-5152 has passed testing and is the simplest solution. For 2.12 or 2.13 we are looking into a proper solution that preserves chgrp behavior without impacting the performance or inducing deadlocks.

            adilger Andreas Dilger added a comment -

            In either case, reverting the LU-5152 patch should avoid this problem. I'll note that this issue is marked as affecting 2.10.4, 2.10.5, and 2.10.6, and the LU-5152 patch landed in 2.10.4, so that is at least a strong correlation for it being the root cause.

            bzzz Alex Zhuravlev added a comment -

            Andreas, I think the deadlock wasn't caused by dt_sync() but rather by the synchronous setattr RPC to the OST (which is used to verify that the group has enough quota).

            adilger Andreas Dilger added a comment -

            Hongchao, in the LU-5152 patch it may be OK to remove the dt_sync() call, and instead ensure in the OSP code that the chgrp is properly ordered with previous setattr requests. That could avoid the deadlock here, as well as the performance problems reported in LU-11303. Also, there is already a mechanism to force synchronous permission changes via permission_needs_sync(), so we definitely should not be doing two syncs per setattr.

            I've pushed https://review.whamcloud.com/33676 to revert the LU-5152 patch since I think it is causing as many problems as it solves.

            People

              Assignee: bzzz Alex Zhuravlev
              Reporter: m.magrys Marek Magrys
              Votes: 3
              Watchers: 19

              Dates

                Created:
                Updated:
                Resolved: