Details

    • Type: Bug
    • Resolution: Incomplete
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.10.4, Lustre 2.10.5, Lustre 2.10.6
    • Labels: None
    • Environment: CentOS 7, 3.10.0-862.2.3.el7_lustre.x86_64, 1 MDS (+1 HA pair), 4 OSS
    • Severity: 3

    Description

      After upgrading our Lustre servers from CentOS 6 and the IEEL/DDN build of Lustre 2.5 to CentOS 7 and Lustre 2.10.5, we are experiencing stability issues related to an MDS/OSS deadlock, most likely caused by a hung OST thread. The issue is visible on only one of our two filesystems (the smaller one), even though both run on an identical hardware and software stack, so it is most likely related to a specific workload on that filesystem. It serves mainly as mid-term storage, so it sees more metadata operations than the scratch filesystem.

      We initially thought the problem might be caused by our attempt to enable project quotas (tune2fs -O project on all targets), so we turned the feature off again (tune2fs -O ^project), which changed nothing. Trying all combinations of MDT/OST quota enforcement (initially 'g', then 'gp', now 'none') also made no difference, so quota enforcement is probably unrelated. For completeness, we run with sync_journal=1 as a workaround for a client-side memory accounting problem, as suggested by DDN some time ago.

      The crash happens at least once every few hours, sometimes repeatedly one after another, on different OSS nodes. I have also tagged 2.10.6 as affected, since we have tried everything from 2.10.4 up to the current b2_10.
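
      For reference, a minimal sketch of the commands behind the tuning steps mentioned above (project quota toggling, quota enforcement changes, and the sync_journal workaround). The device path and the filesystem name "fsname" are placeholders, and the exact invocations used on our systems may have differed slightly:

      # Toggle the project quota feature on an ldiskfs target (device path is a placeholder)
      tune2fs -O project /dev/mapper/ost_dev
      tune2fs -O ^project /dev/mapper/ost_dev

      # Quota enforcement combinations we cycled through, set on the MGS ("fsname" is a placeholder)
      lctl conf_param fsname.quota.mdt=g
      lctl conf_param fsname.quota.ost=g
      lctl conf_param fsname.quota.ost=gp
      lctl conf_param fsname.quota.ost=none

      # Journal sync workaround suggested by DDN, applied on the OSS nodes
      lctl set_param 'obdfilter.*.sync_journal=1'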

      On OSS we get:

      [Tue Oct  2 11:56:44 2018] LNet: Service thread pid 11401 was inactive for 200.27s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      [Tue Oct  2 11:56:44 2018] Pid: 11401, comm: ll_ost_out00_00 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:44 2018] Call Trace:
      [Tue Oct  2 11:56:44 2018]  [<ffffffff8395aa77>] call_rwsem_down_write_failed+0x17/0x30
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0eeaa8c>] osd_write_lock+0x5c/0xe0 [osd_ldiskfs]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b5b119>] out_tx_attr_set_exec+0x69/0x3f0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b55591>] out_tx_end+0xe1/0x5c0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b596d3>] out_handle+0x1453/0x1bc0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b4f38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0af7e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0afb592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffff836bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:44 2018]  [<ffffffff83d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:44 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      [Tue Oct  2 11:56:44 2018] LustreError: dumping log to /tmp/lustre-log.1538474204.11401
      [Tue Oct  2 11:56:44 2018] Pid: 35664, comm: ll_ost_io00_098 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:44 2018] Call Trace:
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0e7e495>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0e76533>] jbd2_journal_stop+0x343/0x3d0 [jbd2]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc112ab4c>] __ldiskfs_journal_stop+0x3c/0xb0 [ldiskfs]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0efb783>] osd_trans_stop+0x183/0x850 [osd_ldiskfs]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc107d352>] ofd_trans_stop+0x22/0x60 [ofd]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc10835f4>] ofd_commitrw_write+0x7e4/0x1c90 [ofd]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc10877a9>] ofd_commitrw+0x4c9/0xae0 [ofd]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b80864>] obd_commitrw+0x2f3/0x336 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b5338d>] tgt_brw_write+0xffd/0x17d0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0b4f38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0af7e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0afb592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffff836bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:45 2018]  [<ffffffff83d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:45 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      [Tue Oct  2 11:56:45 2018] Pid: 35678, comm: ll_ost_io00_112 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:45 2018] Call Trace:
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0e7e495>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0e76533>] jbd2_journal_stop+0x343/0x3d0 [jbd2]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc112ab4c>] __ldiskfs_journal_stop+0x3c/0xb0 [ldiskfs]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0efb783>] osd_trans_stop+0x183/0x850 [osd_ldiskfs]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc107d352>] ofd_trans_stop+0x22/0x60 [ofd]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc10835f4>] ofd_commitrw_write+0x7e4/0x1c90 [ofd]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc10877a9>] ofd_commitrw+0x4c9/0xae0 [ofd]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0b80864>] obd_commitrw+0x2f3/0x336 [ptlrpc]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0b5338d>] tgt_brw_write+0xffd/0x17d0 [ptlrpc]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0b4f38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0af7e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:45 2018]  [<ffffffffc0afb592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:45 2018]  [<ffffffff836bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:45 2018]  [<ffffffff83d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:45 2018]  [<ffffffffffffffff>] 0xffffffffffffffff

      On MDS:

      [Tue Oct  2 11:56:44 2018] LNet: Service thread pid 69588 was inactive for 200.49s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      [Tue Oct  2 11:56:44 2018] Pid: 69588, comm: mdt00_095 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:44 2018] Call Trace:
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0f7e140>] ptlrpc_set_wait+0x4c0/0x920 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0f7e61d>] ptlrpc_queue_wait+0x7d/0x220 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc14de463>] osp_remote_sync+0xd3/0x200 [osp]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc14c6dd0>] osp_attr_set+0x4c0/0x5d0 [osp]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc1421c6b>] lod_sub_attr_set+0x1cb/0x460 [lod]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc1403be6>] lod_obj_stripe_attr_set_cb+0x16/0x30 [lod]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc140fa96>] lod_obj_for_each_stripe+0xb6/0x230 [lod]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc1411043>] lod_attr_set+0x2f3/0x9a0 [lod]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc14810a0>] mdd_attr_set_internal+0x120/0x2a0 [mdd]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc1481e8d>] mdd_attr_set+0x8ad/0xce0 [mdd]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc13645f5>] mdt_reint_setattr+0xba5/0x1060 [mdt]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc1364b33>] mdt_reint_rec+0x83/0x210 [mdt]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc134636b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc1351f07>] mdt_reint+0x67/0x140 [mdt]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0fee38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0f96e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffffc0f9a592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:44 2018]  [<ffffffff946bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:44 2018]  [<ffffffff94d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:44 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      [Tue Oct  2 11:56:44 2018] LustreError: dumping log to /tmp/lustre-log.1538474204.69588
      [Tue Oct  2 11:56:49 2018] LNet: Service thread pid 7634 was inactive for 200.51s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      [Tue Oct  2 11:56:49 2018] Pid: 7634, comm: mdt_rdpg00_000 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:49 2018] Call Trace:
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef085>] wait_transaction_locked+0x85/0xd0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef368>] add_transaction_credits+0x268/0x2f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef5e1>] start_this_handle+0x1a1/0x430 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06efa93>] jbd2__journal_start+0xf3/0x1f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc118aa99>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc12cac8e>] osd_trans_start+0x1ae/0x460 [osd_ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1002512>] top_trans_start+0x702/0x940 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc13ec3f1>] lod_trans_start+0x31/0x40 [lod]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc148c1ba>] mdd_trans_start+0x1a/0x20 [mdd]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1481b69>] mdd_attr_set+0x589/0xce0 [mdd]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136c1d6>] mdt_mfd_close+0x1a6/0x610 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1371951>] mdt_close_internal+0x121/0x220 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1371c70>] mdt_close+0x220/0x780 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fee38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f96e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f9a592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffff946bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:49 2018]  [<ffffffff94d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:49 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      [Tue Oct  2 11:56:49 2018] LustreError: dumping log to /tmp/lustre-log.1538474210.7634
      [Tue Oct  2 11:56:49 2018] Pid: 69508, comm: mdt00_067 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:49 2018] Call Trace:
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef085>] wait_transaction_locked+0x85/0xd0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef368>] add_transaction_credits+0x268/0x2f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef5e1>] start_this_handle+0x1a1/0x430 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06efa93>] jbd2__journal_start+0xf3/0x1f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc118aa99>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc12cac8e>] osd_trans_start+0x1ae/0x460 [osd_ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1369de7>] mdt_empty_transno+0xf7/0x840 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136cf1e>] mdt_mfd_open+0x8de/0xe70 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136da2b>] mdt_finish_open+0x57b/0x690 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136f308>] mdt_reint_open+0x17c8/0x3190 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1364b33>] mdt_reint_rec+0x83/0x210 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc134636b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1346892>] mdt_intent_reint+0x162/0x430 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1351671>] mdt_intent_policy+0x441/0xc70 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f3b2ba>] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f64b53>] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fea452>] tgt_enqueue+0x62/0x210 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fee38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f96e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f9a592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffff946bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:49 2018]  [<ffffffff94d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:49 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      [Tue Oct  2 11:56:49 2018] Pid: 69607, comm: mdt00_105 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:49 2018] Call Trace:
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef085>] wait_transaction_locked+0x85/0xd0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef368>] add_transaction_credits+0x268/0x2f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef5e1>] start_this_handle+0x1a1/0x430 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06efa93>] jbd2__journal_start+0xf3/0x1f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc118aa99>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc12cac8e>] osd_trans_start+0x1ae/0x460 [osd_ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1369de7>] mdt_empty_transno+0xf7/0x840 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136cf1e>] mdt_mfd_open+0x8de/0xe70 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136da2b>] mdt_finish_open+0x57b/0x690 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136f308>] mdt_reint_open+0x17c8/0x3190 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1364b33>] mdt_reint_rec+0x83/0x210 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc134636b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1346892>] mdt_intent_reint+0x162/0x430 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1351671>] mdt_intent_policy+0x441/0xc70 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f3b2ba>] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f64b53>] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fea452>] tgt_enqueue+0x62/0x210 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fee38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f96e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f9a592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffff946bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:49 2018]  [<ffffffff94d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:49 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      [Tue Oct  2 11:56:49 2018] Pid: 69594, comm: mdt00_098 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
      [Tue Oct  2 11:56:49 2018] Call Trace:
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef085>] wait_transaction_locked+0x85/0xd0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef368>] add_transaction_credits+0x268/0x2f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06ef5e1>] start_this_handle+0x1a1/0x430 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc06efa93>] jbd2__journal_start+0xf3/0x1f0 [jbd2]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc118aa99>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc12cac8e>] osd_trans_start+0x1ae/0x460 [osd_ldiskfs]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1002512>] top_trans_start+0x702/0x940 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc13ec3f1>] lod_trans_start+0x31/0x40 [lod]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc148c1ba>] mdd_trans_start+0x1a/0x20 [mdd]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc14763c0>] mdd_create+0xbe0/0x1400 [mdd]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc136fcb5>] mdt_reint_open+0x2175/0x3190 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1364b33>] mdt_reint_rec+0x83/0x210 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc134636b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1346892>] mdt_intent_reint+0x162/0x430 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc1351671>] mdt_intent_policy+0x441/0xc70 [mdt]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f3b2ba>] ldlm_lock_enqueue+0x38a/0x980 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f64b53>] ldlm_handle_enqueue0+0x9d3/0x16a0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fea452>] tgt_enqueue+0x62/0x210 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0fee38a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f96e4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffffc0f9a592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      [Tue Oct  2 11:56:49 2018]  [<ffffffff946bae31>] kthread+0xd1/0xe0
      [Tue Oct  2 11:56:49 2018]  [<ffffffff94d1f5f7>] ret_from_fork_nospec_end+0x0/0x39
      [Tue Oct  2 11:56:49 2018]  [<ffffffffffffffff>] 0xffffffffffffffff
      

      Please have a look at the traces above; the full dmesg and Lustre debug log files are attached as an archive.
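
      As a side note for anyone looking at the attached logs: the binary dumps referenced in the console output (e.g. /tmp/lustre-log.1538474204.11401) can be converted to readable text with lctl debug_file. A minimal sketch (the output file name is arbitrary):

      # Decode a binary Lustre debug dump into plain text for inspection
      lctl debug_file /tmp/lustre-log.1538474204.11401 /tmp/lustre-log.1538474204.11401.txt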

       

      Attachments

        1. cyf_dec10_client_stacks.log
          28 kB
        2. cyf_dec10_server_stacks.log
          117 kB
        3. logs.zip
          15.44 MB

        Issue Links

          Activity

            [LU-11465] OSS/MDS deadlock in 2.10.5

            bzzz Alex Zhuravlev added a comment -

            Have you updated the MDS? Can you provide all the stack traces, please?
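
            (For completeness: one generic way to capture the stacks of all tasks on a server node, assuming sysrq is enabled, is the sysrq "t" trigger; this is plain Linux functionality rather than anything Lustre-specific.)

                # Dump the stack of every task to the kernel ring buffer, then save it
                echo t > /proc/sysrq-trigger
                dmesg > /tmp/all-stacks.log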

            lflis Lukasz Flis added a comment - - edited

            Short update:

            After moving to 2.10.6 RC2 with LU-5152 reverted, we still see hung threads on the OSSes:

            [507798.983947] LNet: Service thread pid 17130 was inactive for 200.47s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
             [507799.001471] Pid: 17130, comm: ll_ost_io00_094 3.10.0-862.2.3.el7_lustre.x86_64 #1 SMP Tue May 22 17:36:23 UTC 2018
             [507799.012231] Call Trace:
             [507799.014947] [<ffffffffc0eac085>] wait_transaction_locked+0x85/0xd0 [jbd2]
             [507799.022084] [<ffffffffc0eac368>] add_transaction_credits+0x268/0x2f0 [jbd2]
             [507799.029399] [<ffffffffc0eac5e1>] start_this_handle+0x1a1/0x430 [jbd2]
             [507799.036186] [<ffffffffc0eaca93>] jbd2__journal_start+0xf3/0x1f0 [jbd2]
             [507799.043074] [<ffffffffc0f22009>] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
             [507799.050909] [<ffffffffc0fecb1e>] osd_trans_start+0x1ae/0x460 [osd_ldiskfs]
             [507799.058151] [<ffffffffc11242ae>] ofd_trans_start+0x6e/0xf0 [ofd]
             [507799.064513] [<ffffffffc112a75b>] ofd_commitrw_write+0x94b/0x1c90 [ofd]
             [507799.071400] [<ffffffffc112e7a9>] ofd_commitrw+0x4c9/0xae0 [ofd]
             [507799.077676] [<ffffffffc0bb7824>] obd_commitrw+0x2f3/0x336 [ptlrpc]
             [507799.084277] [<ffffffffc0b8a38d>] tgt_brw_write+0xffd/0x17d0 [ptlrpc]
             [507799.091046] [<ffffffffc0b8638a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
             [507799.098231] [<ffffffffc0b2ee4b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
             [507799.106355] [<ffffffffc0b32592>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
             [507799.112928] [<ffffffffbc4bae31>] kthread+0xd1/0xe0
             [507799.118077] [<ffffffffbcb1f5f7>] ret_from_fork_nospec_end+0x0/0x39
             [507799.124610] [<ffffffffffffffff>] 0xffffffffffffffff

            bzzz Alex Zhuravlev added a comment -

            A revert of LU-5152 should be enough.

            m.magrys Marek Magrys added a comment -

            OK, we could take the patches for a spin on one of our production systems. Should we pull both https://review.whamcloud.com/33682 (revert of LU-5152) and https://review.whamcloud.com/33586 ("LU-10048 osd: async truncate"), or would you recommend sticking with just 33682 for now?

            The frequency of the issue has dropped considerably over the last few weeks, so it will take some time before we can confirm that the problem is solved.
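
            (For reference, a rough sketch of how such a Gerrit change could be pulled for local testing; the project path and the patchset number below are assumptions and should be taken from the download link on the review page.)

                # Fetch change 33682 from Gerrit and cherry-pick it onto a local b2_10 checkout
                git clone git://git.whamcloud.com/fs/lustre-release.git && cd lustre-release
                git checkout b2_10
                # refs/changes/<last two digits>/<change number>/<patchset>; patchset "1" is a placeholder
                git fetch https://review.whamcloud.com/fs/lustre-release refs/changes/82/33682/1
                git cherry-pick FETCH_HEAD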


            adilger Andreas Dilger added a comment -

            For 2.10.5, the https://review.whamcloud.com/33682 revert of LU-5152 has passed testing and is the simplest solution. For 2.12 or 2.13 we are looking into a proper solution that preserves chgrp behavior without impacting performance or inducing deadlocks.


            adilger Andreas Dilger added a comment -

            In either case, reverting the LU-5152 patch should avoid this problem. I'll note that this issue is marked as affecting 2.10.4, 2.10.5, and 2.10.6, and the LU-5152 patch landed in 2.10.4, so there is at least a strong correlation pointing to it as the root cause.


            bzzz Alex Zhuravlev added a comment -

            Andreas, I think the deadlock wasn't caused by dt_sync() but rather by the synchronous setattr RPC to the OST (which is sent to verify that the group has enough quota).


            adilger Andreas Dilger added a comment -

            Hongchao, in the LU-5152 patch it may be OK to remove the dt_sync() call and instead ensure in the OSP code that the chgrp is properly ordered with previous setattr requests. That could avoid the deadlock here, as well as the performance problems reported in LU-11303. Also, there is already a mechanism to force synchronous permission changes via permission_needs_sync(), so we definitely should not be doing two syncs per setattr.

            I've pushed https://review.whamcloud.com/33676 to revert the LU-5152 patch since I think it is causing as many problems as it solves.


            adilger Andreas Dilger added a comment -

            I think that the LU-5152 patch may be the root cause of this problem. It sends a sync during chgrp to the OSTs, but I suspect that this induces a deadlock when the OSTs are trying to get new quota from the master.

            I'm thinking that we should just revert the LU-5152 patch for now, since it is also causing performance issues. In the meantime, it means that quota would no longer properly handle file group ownership changes if group quota is enabled.

            lflis Lukasz Flis added a comment -

            Alex, we can test Lustre with and without the patch on our testing instance tomorrow if you can share the reproducer code.


            bzzz Alex Zhuravlev added a comment -

            I'm able to reproduce the issue, and a couple of patches under LU-10048 seem to solve it. They are in testing now.


            People

              Assignee: bzzz Alex Zhuravlev
              Reporter: m.magrys Marek Magrys
              Votes: 3
              Watchers: 19
