[LU-13599] LustreError: 30166:0:(service.c:189:ptlrpc_save_lock()) ASSERTION( rs->rs_nlocks < 8 ) failed Created: 25/May/20 Updated: 04/Oct/22 Resolved: 09/Dec/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.4 |
| Fix Version/s: | Lustre 2.14.0, Lustre 2.12.6 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Stephane Thiell | Assignee: | Mikhail Pershin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
We hit the following crash on one of the MDSes of Fir last night, running Lustre 2.12.4. The same problem occurred after re-mount of the MDT and recovery. I had to kill the robinhood client that was running a purge but also an MDT-to-MDT migration (a single lfs migrate -m 0). Then I was able to remount this MDT. It looks a bit like

[1579975.369592] Lustre: fir-MDT0003: haven't heard from client 6fb18a53-0376-4 (at 10.50.6.54@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9150778e2400, cur 1590281140 expire 1590280990 last 1590280913
[1580039.870825] Lustre: fir-MDT0003: Connection restored to b7b3778b-82b4-4 (at 10.50.6.54@o2ib2)
[1600137.924189] Lustre: fir-MDT0003: haven't heard from client ed7f1c7c-f5de-4 (at 10.50.4.54@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff913f6360b800, cur 1590301302 expire 1590301152 last 1590301075
[1619876.303011] Lustre: fir-MDT0003: Connection restored to 796a800c-02e4-4 (at 10.49.20.10@o2ib1)
[1639768.911234] Lustre: fir-MDT0003: Connection restored to 83415c02-51ff-4 (at 10.49.20.5@o2ib1)
[1639821.000702] Lustre: fir-MDT0003: haven't heard from client 83415c02-51ff-4 (at 10.49.20.5@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff913f32f56800, cur 1590340984 expire 1590340834 last 1590340757
[1647672.215034] Lustre: fir-MDT0003: haven't heard from client 19e3d49f-43e4-4 (at 10.50.9.37@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff913f6240bc00, cur 1590348835 expire 1590348685 last 1590348608
[1647717.069200] Lustre: fir-MDT0003: Connection restored to a4c7b337-bfab-4 (at 10.50.9.37@o2ib2)
[1667613.717650] Lustre: fir-MDT0003: haven't heard from client 20e68a82-bbdb-4 (at 10.50.6.54@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff914d1725c400, cur 1590368776 expire 1590368626 last 1590368549
[1667717.713398] Lustre: fir-MDT0003: Connection restored to b7b3778b-82b4-4 (at 10.50.6.54@o2ib2)
[1692403.249073] LustreError: 30166:0:(service.c:189:ptlrpc_save_lock()) ASSERTION( rs->rs_nlocks < 8 ) failed:
[1692403.258985] LustreError: 30166:0:(service.c:189:ptlrpc_save_lock()) LBUG
[1692403.265867] Pid: 30166, comm: mdt00_002 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[1692403.276224] Call Trace:
[1692403.278866] [<ffffffffc0aff7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[1692403.285611] [<ffffffffc0aff87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[1692403.292002] [<ffffffffc0fb8851>] ptlrpc_save_lock+0xc1/0xd0 [ptlrpc]
[1692403.298695] [<ffffffffc14b4bab>] mdt_save_lock+0x20b/0x360 [mdt]
[1692403.305003] [<ffffffffc14b4d5c>] mdt_object_unlock+0x5c/0x3c0 [mdt]
[1692403.311572] [<ffffffffc14b82e7>] mdt_object_unlock_put+0x17/0x120 [mdt]
[1692403.318479] [<ffffffffc150c4fc>] mdt_unlock_list+0x54/0x174 [mdt]
[1692403.324876] [<ffffffffc14d3fd3>] mdt_reint_migrate+0xa03/0x1310 [mdt]
[1692403.331619] [<ffffffffc14d4963>] mdt_reint_rec+0x83/0x210 [mdt]
[1692403.337841] [<ffffffffc14b1273>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[1692403.344586] [<ffffffffc14bc6e7>] mdt_reint+0x67/0x140 [mdt]
[1692403.350463] [<ffffffffc101c64a>] tgt_request_handle+0xada/0x1570 [ptlrpc]
[1692403.357586] [<ffffffffc0fbf43b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[1692403.365483] [<ffffffffc0fc2da4>] ptlrpc_main+0xb34/0x1470 [ptlrpc]
[1692403.371991] [<ffffffffb98c2e81>] kthread+0xd1/0xe0
[1692403.377079] [<ffffffffb9f77c24>] ret_from_fork_nospec_begin+0xe/0x21
[1692403.383725] [<ffffffffffffffff>] 0xffffffffffffffff
[1692403.388918] Kernel panic - not syncing: LBUG
[1692403.393363] CPU: 44 PID: 30166 Comm: mdt00_002 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[1692403.406214] Hardware name: Dell Inc. PowerEdge R6415/07YXFK, BIOS 1.10.6 08/15/2019
[1692403.414040] Call Trace:
[1692403.416670] [<ffffffffb9f65147>] dump_stack+0x19/0x1b
[1692403.421981] [<ffffffffb9f5e850>] panic+0xe8/0x21f
[1692403.426954] [<ffffffffc0aff8cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
[1692403.433351] [<ffffffffc0fb8851>] ptlrpc_save_lock+0xc1/0xd0 [ptlrpc]
[1692403.439978] [<ffffffffc14b4bab>] mdt_save_lock+0x20b/0x360 [mdt]
[1692403.446259] [<ffffffffc14b4d5c>] mdt_object_unlock+0x5c/0x3c0 [mdt]
[1692403.452796] [<ffffffffc14b82e7>] mdt_object_unlock_put+0x17/0x120 [mdt]
[1692403.459687] [<ffffffffc150c4fc>] mdt_unlock_list+0x54/0x174 [mdt]
[1692403.466057] [<ffffffffc14d3fd3>] mdt_reint_migrate+0xa03/0x1310 [mdt]
[1692403.472794] [<ffffffffc0d3cfa9>] ? check_unlink_entry+0x19/0xd0 [obdclass]
[1692403.479942] [<ffffffffc14d4963>] mdt_reint_rec+0x83/0x210 [mdt]
[1692403.486134] [<ffffffffc14b1273>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
[1692403.492843] [<ffffffffc14bc6e7>] mdt_reint+0x67/0x140 [mdt]
[1692403.498721] [<ffffffffc101c64a>] tgt_request_handle+0xada/0x1570 [ptlrpc]
[1692403.505806] [<ffffffffc0ff40b1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[1692403.513553] [<ffffffffc0affbde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[1692403.520811] [<ffffffffc0fbf43b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[1692403.528673] [<ffffffffc0fbb565>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[1692403.535632] [<ffffffffb98cfeb4>] ? __wake_up+0x44/0x50
[1692403.541065] [<ffffffffc0fc2da4>] ptlrpc_main+0xb34/0x1470 [ptlrpc]
[1692403.547537] [<ffffffffc0fc2270>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[1692403.555106] [<ffffffffb98c2e81>] kthread+0xd1/0xe0
[1692403.560158] [<ffffffffb98c2db0>] ? insert_kthread_work+0x40/0x40
[1692403.566424] [<ffffffffb9f77c24>] ret_from_fork_nospec_begin+0xe/0x21
[1692403.573037] [<ffffffffb98c2db0>] ? insert_kthread_work+0x40/0x40

[root@fir-md1-s4 127.0.0.1-2020-05-25-00:59:34]#

Message from syslogd@fir-md1-s4 at May 25 09:46:42 ...
kernel:LustreError: 29280:0:(service.c:189:ptlrpc_save_lock()) ASSERTION( rs->rs_nlocks < 8 ) failed:

Message from syslogd@fir-md1-s4 at May 25 09:46:42 ...
kernel:LustreError: 29280:0:(service.c:189:ptlrpc_save_lock()) LBUG

May 25 09:55:43 fir-md1-s1 kernel: Lustre: fir-MDT0003: Recovery over after 2:45, of 1302 clients 1301 recovered and 1 was evicted. |
| Comments |
| Comment by Peter Jones [ 26/May/20 ] |
|
Mike, any ideas here? Peter |
| Comment by Mikhail Pershin [ 26/May/20 ] |
|
mdt_reint_migrate() can take quite a lot of LDLM locks, while the ptlrpc_reply_state has only 8 slots for saved locks. There are checks meant to keep within that limit and not save more locks than allowed, but it seems there is a flaw in those checks. This part of the code should be reviewed.
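For reference, the failing check is ptlrpc_save_lock() in lustre/ptlrpc/service.c. A simplified sketch of the 2.12-era logic (paraphrased for illustration, not a verbatim copy of the tree; RS_MAX_LOCKS is the 8 in the assertion):

    /* Paraphrased sketch of lustre/ptlrpc/service.c (2.12-era), illustrative
     * only. RS_MAX_LOCKS (8, from lustre_net.h) bounds how many LDLM locks
     * one reply state may keep pinned until the client ACKs the reply. */
    void ptlrpc_save_lock(struct ptlrpc_request *req, struct lustre_handle *lock,
                          int mode, bool no_ack, bool convert_lock)
    {
            struct ptlrpc_reply_state *rs = req->rq_reply_state;
            int idx;

            LASSERT(rs != NULL);
            LASSERT(rs->rs_nlocks < RS_MAX_LOCKS); /* ASSERTION( rs->rs_nlocks < 8 ) */

            idx = rs->rs_nlocks++;
            rs->rs_locks[idx] = *lock;      /* lock stays referenced ...        */
            rs->rs_modes[idx] = mode;
            rs->rs_difficult = 1;           /* ... until the reply is ACKed     */
            rs->rs_no_ack = no_ack;
            rs->rs_convert_lock = convert_lock;
    }

In the crash above, mdt_reint_migrate() -> mdt_unlock_list() funnels each saved lock through mdt_save_lock() into this function, so a migrate that accumulates more than 8 such locks in one request overruns the slots. |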
| Comment by Stephane Thiell [ 14/Jun/20 ] |
|
FYI, we had another MDS crash today with the same LBUG, on our MDT0000 with Lustre 2.12.5 (Fir), while two "lfs migrate -m 3" were running. |
| Comment by Mikhail Pershin [ 14/Jun/20 ] |
|
Stephane, could you report the layout of the migrated directory, and its default LOV layout as well? |
| Comment by Stephane Thiell [ 15/Jun/20 ] |
|
Hi Mike,

One directory being migrated was /scratch/groups/jidoyaga and the crash happened just after this output from lfs migrate:

migrate /scratch/groups/./jidoyaga/rleylek/atac-seq-pipeline/cromwell-executions/atac/49a5f27e-b6bd-46a8-beae-5ec5c01c1e7e/call-overlap_pr/shard-0/inputs/275296689 to MDT3 stripe count 0

Looks like this directory is still on MDT0, and the default LOV layout seems to be our default one:

[root@sh02-01n60 ~]# lfs getdirstripe /scratch/groups/jidoyaga/rleylek/atac-seq-pipeline/cromwell-executions/atac/49a5f27e-b6bd-46a8-beae-5ec5c01c1e7e/call-overlap_pr/shard-0/inputs/275296689
lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none
[root@sh02-01n60 ~]# lfs getstripe -d /scratch/groups/jidoyaga/rleylek/atac-seq-pipeline/cromwell-executions/atac/49a5f27e-b6bd-46a8-beae-5ec5c01c1e7e/call-overlap_pr/shard-0/inputs/275296689
lcm_layout_gen: 0
lcm_mirror_count: 1
lcm_entry_count: 3
lcme_id: N/A
lcme_mirror_id: N/A
lcme_flags: 0
lcme_extent.e_start: 0
lcme_extent.e_end: 134217728
stripe_count: 1 stripe_size: 4194304 pattern: raid0 stripe_offset: -1
lcme_id: N/A
lcme_mirror_id: N/A
lcme_flags: 0
lcme_extent.e_start: 134217728
lcme_extent.e_end: 137438953472
stripe_count: 2 stripe_size: 4194304 pattern: raid0 stripe_offset: -1
lcme_id: N/A
lcme_mirror_id: N/A
lcme_flags: 0
lcme_extent.e_start: 137438953472
lcme_extent.e_end: EOF
stripe_count: 4 stripe_size: 4194304 pattern: raid0 stripe_offset: -1
[970717.500738] Lustre: DEBUG MARKER: Sun Jun 14 20:17:42 2020
[970890.723334] LNet: Service thread pid 48047 was inactive for 200.23s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[970890.740358] Pid: 48047, comm: mdt_rdpg03_008 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019
[970890.751051] Call Trace:
[970890.753601] [<ffffffffbb588c18>] call_rwsem_down_read_failed+0x18/0x30
[970890.760341] [<ffffffffc0f5f0bf>] llog_cat_declare_add_rec+0x4f/0x260 [obdclass]
[970890.767880] [<ffffffffc0f56318>] llog_declare_add+0x78/0x1a0 [obdclass]
[970890.774709] [<ffffffffc12ab8be>] top_trans_start+0x17e/0x940 [ptlrpc]
[970890.781416] [<ffffffffc188a494>] lod_trans_start+0x34/0x40 [lod]
[970890.787632] [<ffffffffc193f6ba>] mdd_trans_start+0x1a/0x20 [mdd]
[970890.793870] [<ffffffffc1932c29>] mdd_attr_set+0x649/0xda0 [mdd]
[970890.800016] [<ffffffffc17bcf5b>] mdt_mfd_close+0x6cb/0x870 [mdt]
[970890.806257] [<ffffffffc17c25b1>] mdt_close_internal+0x121/0x220 [mdt]
[970890.812915] [<ffffffffc17c28d0>] mdt_close+0x220/0x780 [mdt]
[970890.818797] [<ffffffffc129a66a>] tgt_request_handle+0xada/0x1570 [ptlrpc]
[970890.825829] [<ffffffffc123d44b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[970890.833642] [<ffffffffc1240db4>] ptlrpc_main+0xb34/0x1470 [ptlrpc]
[970890.840058] [<ffffffffbb2c2e81>] kthread+0xd1/0xe0
[970890.845058] [<ffffffffbb977c24>] ret_from_fork_nospec_begin+0xe/0x21
[970890.851628] [<ffffffffffffffff>] 0xffffffffffffffff
[970890.856735] LustreError: dumping log to /tmp/lustre-log.1592191235.48047

lfs getdirstripe /scratch/groups/jidoyaga is also hanging. I'm attaching a dump of the tasks on fir-md1-s4 (MDT0003 @ 10.0.10.54@o2ib7), taken while the lfs getdirstripe command was hanging on a client (10.50.0.1@o2ib2), as fir-md1-s4_hang_LU-13599.log |
| Comment by Stephane Thiell [ 15/Jun/20 ] |
|
Also, if I start from the directory of the last migrate message and test each parent dir, I get:

[root@sh02-01n60 ~]# lfs getdirstripe /scratch/groups/jidoyaga/rleylek/atac-seq-pipeline/cromwell-executions/atac/49a5f27e-b6bd-46a8-beae-5ec5c01c1e7e/call-overlap_pr/shard-0/inputs/275296689
lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none
[root@sh02-01n60 ~]# lfs getdirstripe /scratch/groups/jidoyaga/rleylek/atac-seq-pipeline/cromwell-executions/atac/49a5f27e-b6bd-46a8-beae-5ec5c01c1e7e/call-overlap_pr/shard-0/inputs
lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none
[root@sh02-01n60 ~]# lfs getdirstripe /scratch/groups/jidoyaga/rleylek/atac-seq-pipeline/cromwell-executions/atac/49a5f27e-b6bd-46a8-beae-5ec5c01c1e7e/call-overlap_pr/shard-0
<hang> |
| Comment by Stephane Thiell [ 15/Jun/20 ] |
|
Mike,

[1015186.802058] LustreError: 0-0: Forced cleanup waiting for fir-MDT0000-osp-MDT0003 namespace with 1 resources in use, (rc=-110)

But then the recovery was fast and the MDT is working again.

[root@sh02-01n60 ~]# lfs getdirstripe /scratch/groups/jidoyaga
lmv_stripe_count: 2 lmv_stripe_offset: 3 lmv_hash_type: fnv_1a_64,migrating
mdtidx FID[seq:oid:ver]
3 [0x280040561:0x1220:0x0]
0 [0x200000406:0xf1:0x0]
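The "fnv_1a_64,migrating" above indicates the directory was caught mid-migration: while lfs migrate -m runs, the directory is temporarily striped over both the source and target MDTs (hence the two FIDs, on mdtidx 3 and 0), and the LMV hash field carries a migration flag. A rough illustration of how that field is read (constant names and values quoted from memory of 2.12-era headers, so treat them as assumptions; the helper is hypothetical):

    /* Assumed from 2.12-era lustre_lmv.h/lustre_idl.h; values illustrative. */
    #define LMV_HASH_TYPE_FNV_1A_64  2           /* prints as "fnv_1a_64"  */
    #define LMV_HASH_FLAG_MIGRATION  0x80000000  /* prints as ",migrating" */

    /* Hypothetical helper: the low bits carry the hash type, the high flag
     * bit marks a directory with an lfs migrate -m still in flight. */
    static inline bool lmv_hash_is_migrating(__u32 lmv_hash_type)
    {
            return (lmv_hash_type & LMV_HASH_FLAG_MIGRATION) != 0;
    }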
|
| Comment by Mikhail Pershin [ 26/Jun/20 ] |
|
Stephane, I've found a possible source of the problem and am making a fix right now. |
| Comment by Stephane Thiell [ 26/Jun/20 ] |
|
Great, thanks for the update Mike! |
| Comment by Gerrit Updater [ 26/Jun/20 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39191 |
| Comment by Gerrit Updater [ 26/Jun/20 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39194 |
| Comment by Stephane Thiell [ 07/Jul/20 ] |
|
Mike, tomorrow (Wednesday) we have a maintenance window on our systems (including Fir), so I think I'm going to try your patch on top of 2.12.5 on the servers, as it looks ready (unless you tell me otherwise!). Thanks! |
| Comment by Mikhail Pershin [ 07/Jul/20 ] |
|
Stephane, yes, I think the patch is ready to try. |
| Comment by Stephane Thiell [ 08/Jul/20 ] |
|
Hi Mike, we tried the patch, but I think something is wrong with it. With 2.12.5 + patch, we hit the following crash two times, one of them just after the recovery (see fir-md1-s1-vmcore-dmesg_070720.txt):

Jul 07 18:54:12 fir-md1-s1 kernel: Pid: 23205, comm: mdt00_105 3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1 SMP Thu Nov 7 15:26:16 PST 2019

Message from syslogd@fir-md1-s1 at Jul 7 18:54:12 ...
kernel:LustreError: 23205:0:(mdt_handler.c:892:mdt_big_xattr_get()) ASSERTION( info->mti_big_lmm_used == 0 ) failed:

Message from syslogd@fir-md1-s1 at Jul 7 18:54:12 ...
kernel:LustreError: 23205:0:(mdt_handler.c:892:mdt_big_xattr_get()) LBUG

Jul 07 18:54:12 fir-md1-s1 kernel: Call Trace:
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffffc0d6d7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffffc0d6d87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffffc180dcc0>] mdt_big_xattr_get+0x640/0x810 [mdt]
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffffc180e0c7>] mdt_stripe_get+0x237/0x400 [mdt]
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffffc18306b6>] mdt_reint_migrate+0x1056/0x1350 [mdt]
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffffc1830a33>] mdt_reint_rec+0x83/0x210 [mdt]
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffffc180d273>] mdt_reint_internal+0x6e3/0xaf0 [mdt]
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffffc18186e7>] mdt_reint+0x67/0x140 [mdt]
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffffc12e566a>] tgt_request_handle+0xada/0x1570 [ptlrpc]
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffffc128844b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffffc128bdb4>] ptlrpc_main+0xb34/0x1470 [ptlrpc]
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffff872c2e81>] kthread+0xd1/0xe0
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffff87977c24>] ret_from_fork_nospec_begin+0xe/0x21
Jul 07 18:54:12 fir-md1-s1 kernel: [<ffffffffffffffff>] 0xffffffffffffffff |
| Comment by Stephane Thiell [ 08/Jul/20 ] |
|
We have not applied the patch at our system maintenance today, due to the new assertion above, ASSERTION( info->mti_big_lmm_used == 0 ), which seems more frequent than ASSERTION( rs->rs_nlocks < 8 ). We'll stay with 2.12.5 until we understand more. Thanks! |
| Comment by Mikhail Pershin [ 08/Jul/20 ] |
|
Stephane, this assertion was seen on master and was fixed there; I will prepare a patch for 2.12.
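For anyone following along, the new assertion guards a per-thread buffer: mdt_big_xattr_get() reads large LOV/LMV xattrs into the single shared info->mti_big_lmm buffer, and mti_big_lmm_used marks it busy. A simplified, partly hypothetical sketch of that ownership pattern (mdt_read_big_xattr() is an invented placeholder, not the real call):

    /* Simplified sketch of the mti_big_lmm ownership pattern in
     * mdt_handler.c; not the actual 2.12 code. */
    static int mdt_big_xattr_get(struct mdt_thread_info *info,
                                 struct mdt_object *o, const char *name)
    {
            int rc;

            /* Buffer must be free: if an earlier call in the same RPC (e.g.
             * the migrate path reading both the LOV and LMV EAs) still owns
             * it, this is the reported LBUG. */
            LASSERT(info->mti_big_lmm_used == 0);

            rc = mdt_read_big_xattr(info, o, name); /* hypothetical helper */
            if (rc > 0)
                    info->mti_big_lmm_used = 1;     /* caller owns buffer now */

            return rc;
    }

Judging by the patch subject ("mdt: fix mti_big_lmm buffer usage"), the fix reworks how this shared buffer is claimed and released on the migrate path. |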
| Comment by Stephane Thiell [ 08/Jul/20 ] |
|
Great, thanks Mike! |
| Comment by Stephane Thiell [ 22/Jul/20 ] |
|
Hi Mike, |
| Comment by Gerrit Updater [ 28/Jul/20 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39521 |
| Comment by Mikhail Pershin [ 28/Jul/20 ] |
|
Hello Stephane, this fix had no separate ticket on the master branch but was made as side work in |
| Comment by Stephane Thiell [ 28/Jul/20 ] |
|
Thanks Mike!

$ git log --oneline | head -3
8ac362a LU-13599 mdt: fix mti_big_lmm buffer usage
1324114 LU-13599 mdt: fix logic of skipping local locks in reply_state
78d712a New release 2.12.5 |
| Comment by Stephane Thiell [ 29/Jul/20 ] |
|
Hi Mike, I'm glad to report that all 4 MDSes on Fir now have the two patches and so far have been running without any issue, even with multiple parallel lfs migrate -m running on a client. I'll let you know if I see any issue, but it's very promising! Thanks so much! |
| Comment by Olaf Faaland [ 30/Jul/20 ] |
|
Hi Peter or Mike, |
| Comment by Peter Jones [ 31/Jul/20 ] |
|
ofaaland I think that we're all set now - just need things to land |
| Comment by Gerrit Updater [ 07/Aug/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39191/ |
| Comment by Stephane Thiell [ 20/Aug/20 ] |
|
Just checking regarding https://review.whamcloud.com/#/c/39521/. This patch is critical to avoid MDS crashes and has not landed on b2_12 yet. Thanks! |
| Comment by Gerrit Updater [ 01/Sep/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39521/ |
| Comment by Stephane Thiell [ 01/Sep/20 ] |
|
Thanks y'all! |
| Comment by Gerrit Updater [ 08/Sep/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39194/ |
| Comment by Peter Jones [ 09/Dec/20 ] |
|
Landed for 2.12.6. Fixed on master as part of a larger change ( |