[LU-13462] MDS deadlocks in osd_read_lock() Created: 18/Apr/20 Updated: 25/Jul/22 Resolved: 25/Jul/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Mahmoud Hanafi | Assignee: | Yang Sheng |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 2 | ||||||||
| Description |
|
MDS deadlocked. Similar to
[12287155.058187] LNet: Service thread pid 15312 was inactive for 550.48s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[12287155.109703] LNet: Skipped 2 previous similar messages
[12287155.125609] [<ffffffffa5f87398>] call_rwsem_down_read_failed+0x18/0x30
[12287155.130583] [<ffffffffc144acfc>] osd_read_lock+0x5c/0xe0 [osd_ldiskfs]
[12287155.130612] [<ffffffffc16f28ea>] lod_read_lock+0x3a/0xd0 [lod]
[12287155.130625] [<ffffffffc17779aa>] mdd_read_lock+0x3a/0xd0 [mdd]
[12287155.130632] [<ffffffffc177d730>] mdd_xattr_get+0x70/0x5c0 [mdd]
[12287155.130648] [<ffffffffc15e6ea6>] mdt_stripe_get+0xd6/0x400 [mdt]
[12287155.130657] [<ffffffffc15e7a2d>] mdt_attr_get_complex+0x46d/0x850 [mdt]
[12287155.130665] [<ffffffffc15e800c>] mdt_getattr_internal+0x1fc/0xf60 [mdt]
[12287155.130673] [<ffffffffc15ebd60>] mdt_getattr_name_lock+0x950/0x1c30 [mdt]
[12287155.130681] [<ffffffffc15f3c05>] mdt_intent_getattr+0x2b5/0x480 [mdt]
[12287155.130691] [<ffffffffc15f0a18>] mdt_intent_policy+0x2e8/0xd00 [mdt]
[12287155.130736] [<ffffffffc0f2dd26>] ldlm_lock_enqueue+0x366/0xa60 [ptlrpc]
[12287155.130769] [<ffffffffc0f56587>] ldlm_handle_enqueue0+0xa47/0x15a0 [ptlrpc]
[12287155.130815] [<ffffffffc0fde882>] tgt_enqueue+0x62/0x210 [ptlrpc]
[12287155.130853] [<ffffffffc0fe31da>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[12287155.130887] [<ffffffffc0f8880b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[12287155.130921] [<ffffffffc0f8c13c>] ptlrpc_main+0xafc/0x1fc0 [ptlrpc]
[12287155.130925] [<ffffffffa5cc1da1>] kthread+0xd1/0xe0
[12287155.130929] [<ffffffffa6375c37>] ret_from_fork_nospec_end+0x0/0x39
[12287155.130947] [<ffffffffffffffff>] 0xffffffffffffffff |
| Comments |
| Comment by Peter Jones [ 18/Apr/20 ] |
|
Mahmoud, could you please supply details of the kernel version that you are running? Yang Sheng, could you please advise? Thanks, Peter |
| Comment by Yang Sheng [ 18/Apr/20 ] |
|
Hi Mahmoud, could you please provide more info? What do you mean by similar? Thanks, |
| Comment by Mahmoud Hanafi [ 20/Apr/20 ] |
|
The stack trace for hung threads is the same as Our kernel is: 3.10.0-957.21.3.el7_lustre212.x86_64
|
| Comment by Yang Sheng [ 23/Apr/20 ] |
|
Is it possible to provide the sysrq-t output? From the stack trace, I don't think this is the same as LU-13073. |
| Comment by Mahmoud Hanafi [ 23/Apr/20 ] |
|
Attached the stack trace. |
| Comment by Yang Sheng [ 24/Apr/20 ] |
|
Hi Mahmoud, the log you attached is indeed a duplicate of Thanks, |