[LU-13447] Sudden slow file create (MDS problem) Created: 10/Apr/20 Updated: 16/Jan/22 Resolved: 16/Jan/22 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Stephane Thiell | Assignee: | Mikhail Pershin |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: | CentOS 7.6 |
| Attachments: | fir-md1-s3_20200408_vmcore-dmesg.txt |
| Severity: | 3 |
| Description |
|
We faced a sudden MDS problem with 2.12.4, a few hours after LU-13442, where users reported this kind of slowness when creating new files:

sh02-ln03:bp86 09:03:28> time touch $SCRATCH/asdf

real    0m26.923s
user    0m0.000s
sys     0m0.003s

Looking at the MDS in question, we could see backtraces like these:

[1803096.937756] LNet: Service thread pid 41461 completed after 237.28s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[1805310.391443] LNet: Service thread pid 20879 was inactive for 200.02s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1805310.408554] Pid: 20879, comm: mdt03_010 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[1805310.418900] Call Trace:
[1805310.421543] [<ffffffffc1855fa8>] osp_precreate_reserve+0x2e8/0x800 [osp]
[1805310.428575] [<ffffffffc184a949>] osp_declare_create+0x199/0x5f0 [osp]
[1805310.435312] [<ffffffffc179269f>] lod_sub_declare_create+0xdf/0x210 [lod]
[1805310.442330] [<ffffffffc178a86e>] lod_qos_declare_object_on+0xbe/0x3a0 [lod]
[1805310.449586] [<ffffffffc178d80e>] lod_alloc_rr.constprop.19+0xeee/0x1490 [lod]
[1805310.457012] [<ffffffffc179192d>] lod_qos_prep_create+0x12fd/0x1890 [lod]
[1805310.464007] [<ffffffffc177296a>] lod_declare_instantiate_components+0x9a/0x1d0 [lod]
[1805310.472042] [<ffffffffc1785725>] lod_declare_layout_change+0xb65/0x10f0 [lod]
[1805310.479468] [<ffffffffc17f7f82>] mdd_declare_layout_change+0x62/0x120 [mdd]
[1805310.486724] [<ffffffffc1800ec6>] mdd_layout_change+0xb46/0x16a0 [mdd]
[1805310.493473] [<ffffffffc166135f>] mdt_layout_change+0x2df/0x480 [mdt]
[1805310.500130] [<ffffffffc16697d0>] mdt_intent_layout+0x8a0/0xe00 [mdt]
[1805310.506787] [<ffffffffc1666d35>] mdt_intent_policy+0x435/0xd80 [mdt]
[1805310.513459] [<ffffffffc0ffbe06>] ldlm_lock_enqueue+0x356/0xa20 [ptlrpc]
[1805310.520394] [<ffffffffc10244f6>] ldlm_handle_enqueue0+0xa56/0x15f0 [ptlrpc]
[1805310.527688] [<ffffffffc10acb12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[1805310.534034] [<ffffffffc10b564a>] tgt_request_handle+0xada/0x1570 [ptlrpc]
[1805310.541167] [<ffffffffc105843b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[1805310.549063] [<ffffffffc105bda4>] ptlrpc_main+0xb34/0x1470 [ptlrpc]
[1805310.555596] [<ffffffff9fac2e81>] kthread+0xd1/0xe0
[1805310.560688] [<ffffffffa0177c24>] ret_from_fork_nospec_begin+0xe/0x21
[1805310.567340] [<ffffffffffffffff>] 0xffffffffffffffff
[1805310.572544] LustreError: dumping log to /tmp/lustre-log.1586359839.20879
[1805316.373786] LNet: Service thread pid 20879 completed after 206.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).

We took a crash dump, available on the Whamcloud FTP as fir-md1-s3_20200408_vmcore.

Attaching dmesg as fir-md1-s3_20200408_vmcore-dmesg.txt.

A restart of the MDS has fixed the issue for now. |
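Since the threads are stuck in osp_precreate_reserve(), the MDT-side precreation state can also be inspected live, without a crash dump. A minimal diagnostic sketch; the osp target names follow the fir-OSTxxxx-osc-MDT0002 naming from the logs and would need adjusting on another filesystem:

# On the affected MDS: per-OST precreation status; a non-zero value
# means the MDT currently cannot reserve new objects on that OST
lctl get_param osp.fir-OST*-osc-MDT0002.prealloc_status

# Size of the precreate window and how much of it is currently reserved
lctl get_param osp.fir-OST*-osc-MDT0002.create_count
lctl get_param osp.fir-OST*-osc-MDT0002.prealloc_reserved |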
| Comments |
| Comment by Peter Jones [ 10/Apr/20 ] |
|
Mike, could you please advise? Thanks, Peter |
| Comment by Mikhail Pershin [ 13/Apr/20 ] |
|
Logs show a connectivity problem with the OSTs, so the MDT cannot create OST objects for a file being created. I am not sure whether this is a network problem or an OST server problem, but it is not an MDT issue. There are many messages like this:

[1767428.569313] LNetError: 20288:0:(o2iblnd_cb.c:3351:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 0 seconds
[1767428.579482] LNetError: 20288:0:(o2iblnd_cb.c:3426:kiblnd_check_conns()) Timed out RDMA with 10.0.10.109@o2ib7 (5): c: 0, oc: 0, rc: 8
[1767428.591908] LNetError: 20288:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.53@o2ib7 added to recovery queue. Health = 900
[1767428.604016] LNetError: 20288:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 5 previous similar messages

and as a result:

[1767243.181693] LustreError: 20702:0:(osp_precreate.c:970:osp_precreate_cleanup_orphans()) fir-OST0034-osc-MDT0002: cannot cleanup orphans: rc = -11
[1767244.194720] LustreError: 20702:0:(osp_precreate.c:970:osp_precreate_cleanup_orphans()) fir-OST0034-osc-MDT0002: cannot cleanup orphans: rc = -11
[1767327.245811] LustreError: 20710:0:(osp_precreate.c:970:osp_precreate_cleanup_orphans()) fir-OST0038-osc-MDT0002: cannot cleanup orphans: rc = -11
[1767341.462169] LustreError: 20714:0:(osp_precreate.c:970:osp_precreate_cleanup_orphans()) fir-OST003a-osc-MDT0002: cannot cleanup orphans: rc = -11 |
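For cross-checking such LNet timeouts from the MDS side, a minimal sketch, reusing the NIDs and target names from the logs above:

# Can the MDS still reach the OSS NID that timed out?
lctl ping 10.0.10.109@o2ib7

# Local NI health after the failures (2.12 health starts at 1000 per NI)
lnetctl net show -v

# Precreation status of one of the OSP devices reporting rc = -11
lctl get_param osp.fir-OST0034-osc-MDT0002.prealloc_status |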
| Comment by Andreas Dilger [ 14/Apr/20 ] |
|
We shouldn't really be blocking the create if an OST is unreachable unless all of the OSTs are unreachable. |
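A possible stopgap in that situation, not taken from this ticket, is to exclude the unreachable OST from new-object allocation on the MDS; a sketch using the target naming above:

# Stop the MDT from allocating new objects on the stuck OST
lctl set_param osp.fir-OST0034-osc-MDT0002.max_create_count=0

# Restore allocation once the OST is reachable again
# (20000 is the usual default window)
lctl set_param osp.fir-OST0034-osc-MDT0002.max_create_count=20000 |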
| Comment by Stephane Thiell [ 20/Apr/20 ] |
|
Thanks for taking the time to look at this! Still, to me, it's bizarre that this could be due to a network or OST-specific problem, as the other MDTs accessing the same OSTs seemed fine at that time, and all targets are connected to the same IB switch. Also, only one MDT was affected (1 of 4). Anyway, this is the only occurrence of this problem that we have seen so far, so I guess it's very rare. We'll report back if we see it again. |