Details
Type: Bug
Resolution: Cannot Reproduce
Priority: Minor
Fix Version/s: None
Affects Version/s: Lustre 2.12.3
Labels: None
Environment: CentOS 7.6, Lustre 2.12.3_4
Severity: 3
Rank (Obsolete): 9223372036854775807
Description
Following a changelog-related crash reported in LU-13113, MDT0 took ~2h20 to mount:
Jan 04 18:56:01 fir-md1-s1 kernel: Lustre: fir-MDD0000: changelog on
Jan 04 18:56:01 fir-md1-s1 kernel: Lustre: 21788:0:(mdd_device.c:542:mdd_changelog_llog_init()) fir-MDD0000 : orphan changelog records found, starting from index 19457684034 to index 20588833107, being cleared now
Jan 04 21:16:30 fir-md1-s1 kernel: Lustre: fir-MDT0000: in recovery but waiting for the first client to connect
Jan 04 21:16:30 fir-md1-s1 kernel: Lustre: fir-MDT0000: Will be in recovery for at least 5:00, or until 1286 clients reconnect
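For a rough sense of scale, the figures in these log lines can be turned into a record count and a clearing rate. This is only a back-of-the-envelope sketch: it assumes the index range in the message is inclusive, and that the whole gap between the "being cleared now" message and the first recovery message was spent clearing orphan records, neither of which the logs confirm. The variable names are illustrative.

```python
from datetime import timedelta

# Time-of-day values copied from the log lines above (both on Jan 04).
clear_started = timedelta(hours=18, minutes=56, seconds=1)
recovery_started = timedelta(hours=21, minutes=16, seconds=30)

# Orphan changelog index range reported by mdd_changelog_llog_init().
first_index = 19457684034
last_index = 20588833107

# Assumption: the reported range is inclusive on both ends.
orphan_records = last_index - first_index + 1

# Assumption: the entire gap was spent clearing orphan records.
elapsed = recovery_started - clear_started
rate = orphan_records / elapsed.total_seconds()

print(f"orphan records: {orphan_records:,}")
print(f"elapsed:        {elapsed}")
print(f"approx. rate:   {rate:,.0f} records/s")
```

Under those assumptions, this works out to roughly 1.13 billion orphan records cleared over about 2h20.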
I suspect this is a consequence of leaked changelog records. Changelog readers hang quite frequently, and when that happens we see "fir-MDD0000: catalog [0x5:0xa:0x0] crosses index zero".