[LU-14410] 5404:0:(mgs_handler.c:282:mgs_revoke_lock()) MGS: can't take cfg lock for 0x736d61726170/0x3 : rc = -11 Created: 10/Feb/21 Updated: 20/Feb/23 Resolved: 16/Jan/22 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Mahmoud Hanafi | Assignee: | Mikhail Pershin |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
Adding a config setting on the MGS locked up with the following error:
Feb 9 20:54:25 nbp12-srv1 kernel: [9617492.048505] Lustre: Setting parameter *.osd-ldiskfs.*.writethrough_cache_enable in log params
Feb 9 20:54:25 nbp12-srv1 kernel: [9617492.074175] LustreError: 5404:0:(mgs_handler.c:282:mgs_revoke_lock()) MGS: can't take cfg lock for 0x736d61726170/0x3 : rc = -11
Feb 9 20:54:35 nbp12-srv1 kernel: [9617502.527968] Lustre: Setting parameter *.osd-ldiskfs.*.read_cache_enable in log params
Feb 9 21:03:10 nbp12-srv2 kernel: [38810599.981790] Lustre: nbp12-OST000d: Connection restored to fd09a309-d9b4-ecfd-0645-9bd5343738ad (at 10.151.45.247@o2ib)
Feb 9 21:03:10 nbp12-srv2 kernel: [38810599.981793] Lustre: Skipped 379 previous similar messages
Feb 9 21:04:50 nbp12-srv1 kernel: [9618116.154646] Lustre: nbp12-OST0003: Connection restored to 21dfbcba-e039-b886-e94c-6bec8c8725e1 (at 10.151.31.174@o2ib)
Feb 9 21:04:50 nbp12-srv1 kernel: [9618116.154649] Lustre: Skipped 349 previous similar messages
Feb 9 21:08:20 nbp12-srv1 kernel: [9618325.605792] LustreError: 5432:0:(ldlm_request.c:129:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1612932875, 825s ago); not entering recovery in server code, just going back to sleep ns: MGS lock: ffff9ea3c46a6f40/0x3193ac90e5ebeee0 lrc: 3/0,1 mode: --/EX res: [0x736d61726170:0x3:0x0].0x0 rrc: 1000 type: PLN flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 5432 timeout: 0 lvb_type: 0
Feb 9 21:08:20 nbp12-srv1 kernel: [9618325.721807] LustreError: dumping log to /tmp/lustre-log.1612933700.5432
Clients could not mount the filesystem after the error. I uploaded lustre-log.1612933700.5432 to ftp.whamcloud.com/uploads.
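For readers of the log above, the resource id 0x736d61726170 in the "can't take cfg lock" message can be read as ASCII bytes. The sketch below assumes the id is simply the config log name packed in little-endian byte order; that is an inference from the byte pattern and from the "in log params" message, not something confirmed in this ticket. The helper name `decode_mgs_resource_name` is hypothetical.

```python
def decode_mgs_resource_name(res_id: int) -> str:
    """Interpret a resource id as little-endian ASCII bytes.

    Assumption: the id is the config log name packed into an integer,
    least significant byte first. This is inferred from the log lines,
    not taken from Lustre documentation.
    """
    # Number of bytes needed to hold the integer (at least one).
    length = max(1, (res_id.bit_length() + 7) // 8)
    raw = res_id.to_bytes(length, byteorder="little")
    # Drop any zero padding before decoding.
    return raw.rstrip(b"\x00").decode("ascii")


if __name__ == "__main__":
    # Resource id copied from the LustreError line in the description.
    print(decode_mgs_resource_name(0x736D61726170))
```

Under that assumption the id decodes to the name of the log being modified, which suggests the failed lock is the one covering the "params" config log.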
|
| Comments |
| Comment by Peter Jones [ 10/Feb/21 ] |
|
Mike, could you please advise? Thanks, Peter |
| Comment by Andreas Dilger [ 10/Feb/21 ] |
|
Mahmoud, I noticed that the parameter name "*.osd-ldiskfs.*.writethrough_cache_enable" is not correct. It should be just "osd-ldiskfs.*.writethrough_cache_enable" (no leading "*."). The format of the "lctl set_param -P <parameter>" parameter on the MGS should exactly match the format of "lctl {get,set}_param <parameter>" on the client/server. That allows verifying the syntax in advance, unlike "lctl conf_param", for which the correct parameter name was harder to determine. Note that you can delete this parameter from the config log on the MGS:
# lctl --device MGS llog_print params
- { index: 2, event: set_param, device: general, parameter: timeout, value: 10 }
- { index: 8, event: set_param, device: general, parameter: *.osd-ldiskfs.*.writethrough_cache_enable, value: 0 }
# lctl --device MGS llog_cancel params 8
index 8 was canceled.
# lctl --device MGS llog_print params
- { index: 2, event: set_param, device: general, parameter: timeout, value: 10 }
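The cleanup shown above can be sketched as a small script that scans saved "llog_print params" output for records whose parameter name starts with "*." and prints the matching llog_cancel command for review. This is only an illustration: the record layout is taken from the output in this comment, the leading "*." check encodes just the one mistake identified here, and the function names (`parse_llog_records`, `suspect_records`, `cancel_commands`) are hypothetical. The script does not run lctl itself.

```python
import re

# Matches one record line as printed by "lctl --device MGS llog_print params"
# in the comment above. Assumes field values contain no commas.
RECORD_RE = re.compile(
    r"-\s*\{\s*index:\s*(?P<index>\d+),\s*event:\s*(?P<event>\w+),"
    r"\s*device:\s*(?P<device>[^,]+),\s*parameter:\s*(?P<parameter>[^,]+),"
    r"\s*value:\s*(?P<value>.*?)\s*\}"
)


def parse_llog_records(text):
    """Return a list of dicts, one per record line found in text."""
    records = []
    for line in text.splitlines():
        match = RECORD_RE.search(line)
        if match:
            rec = match.groupdict()
            rec["index"] = int(rec["index"])
            rec["device"] = rec["device"].strip()
            rec["parameter"] = rec["parameter"].strip()
            records.append(rec)
    return records


def suspect_records(records):
    """Records whose parameter has the leading '*.' flagged in this ticket."""
    return [r for r in records if r["parameter"].startswith("*.")]


def cancel_commands(records, log="params"):
    """Build (but do not run) one llog_cancel command per suspect record."""
    return [
        f"lctl --device MGS llog_cancel {log} {r['index']}"
        for r in suspect_records(records)
    ]


# Sample input copied from the llog_print output in this comment.
SAMPLE = """\
- { index: 2, event: set_param, device: general, parameter: timeout, value: 10 }
- { index: 8, event: set_param, device: general, parameter: *.osd-ldiskfs.*.writethrough_cache_enable, value: 0 }
"""

if __name__ == "__main__":
    for command in cancel_commands(parse_llog_records(SAMPLE)):
        print(command)
```

Printing the commands instead of executing them leaves the decision to cancel a record with the administrator, who can first check each flagged name with "lctl get_param" on a server.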
|