[LU-16182] llog_osd_prev_block() ASSERTION( last_rec->lrh_index == tail->lrt_index ) failed Created: 22/Sep/22  Updated: 23/Sep/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.10, Lustre 2.15.2
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Andreas Dilger Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-8673 kernel panic when mounting MDS after ... Resolved
Related
Severity: 3

Description

There is an LBUG triggered during MDT mount due to corruption in the Changelog catalog/log file:

LustreError: 29516:0:(llog_osd.c:1075:llog_osd_prev_block()) ASSERTION( last_rec->lrh_index == tail->lrt_index ) failed:
LustreError: 29516:0:(llog_osd.c:1075:llog_osd_prev_block()) LBUG
Pid: 29516, comm: mount.lustre
Call Trace:
libcfs_debug_dumpstack+0x53/0x80 [libcfs]
lbug_with_loc+0x45/0xc0 [libcfs]
llog_osd_prev_block+0x9f7/0xaf0 [obdclass]
llog_reverse_process+0x147/0xac0 [obdclass]
? changelog_init_cb+0x0/0x1f0 [mdd]
llog_cat_reverse_process_cb+0x157/0x540 [obdclass]
llog_reverse_process+0x269/0xac0 [obdclass]
llog_cat_reverse_process+0x199/0x2d0 [obdclass]
mdd_prepare+0x1269/0x1a00 [mdd]
mdt_prepare+0x51/0x3b0 [mdt]
server_start_targets+0x2574/0x2e10 [obdclass]
server_fill_super+0x108d/0x184c [obdclass]
lustre_fill_super+0x328/0x950 [obdclass]
mount_nodev+0x4d/0xb0
lustre_mount+0x38/0x60 [obdclass]
mount_fs+0x39/0x1b0
vfs_kern_mount+0x5f/0xf0
do_mount+0x24e/0xa40

The kernel code should not have an LASSERT() check for data that is read from disk, so this should be removed and replaced with error handling:

                LASSERT(last_rec->lrh_index == tail->lrt_index);

The llog_osd_prev_block() function already returns errors to the caller in many other places, so (at first glance) the LASSERT() should be replaced with an error message that prints the llog record number and FID, and with an error return to the caller, which then stops changelog processing and either clears this record or deletes the whole changelog.
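As a rough illustration of that replacement, here is a minimal user-space sketch, not the actual patch. The struct types are simplified stand-ins that model only the fields involved in the check, and the helper name check_last_rec_index(), the log_name parameter and the choice of -EINVAL are assumptions made for the example. In the kernel the message would presumably go through CERROR() and include the llog FID.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for the on-disk llog record header and tail.
 * Only the fields needed by the index check are modeled here. */
struct rec_hdr_sketch {
	uint32_t lrh_len;
	uint32_t lrh_index;
};

struct rec_tail_sketch {
	uint32_t lrt_len;
	uint32_t lrt_index;
};

/* Hypothetical helper: verify that the last record header in a block and
 * the block tail agree on the record index. Returns 0 when they match and
 * -EINVAL (an assumed error code) when the block looks corrupt, so the
 * caller can stop processing instead of hitting an LBUG. */
static int check_last_rec_index(const struct rec_hdr_sketch *last_rec,
				const struct rec_tail_sketch *tail,
				const char *log_name)
{
	if (last_rec->lrh_index != tail->lrt_index) {
		fprintf(stderr,
			"%s: corrupt llog block: last_rec index %u != tail index %u\n",
			log_name, (unsigned int)last_rec->lrh_index,
			(unsigned int)tail->lrt_index);
		return -EINVAL;
	}
	return 0;
}
```

A caller would propagate the negative return code upward in the same way as the other error paths in the function.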

Also, changelog_init_cb() has LASSERT() checks on the llog records that could fail if the records are corrupted:

        LASSERT(llh->lgh_hdr->llh_flags & LLOG_F_IS_PLAIN);
        LASSERT(rec->cr_hdr.lrh_type == CHANGELOG_REC);

These checks should also be replaced with error handling.
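The same pattern could apply to the callback checks. The sketch below is a hedged user-space example: validate_changelog_rec() is a hypothetical helper, the two SKETCH_ constants are placeholders standing in for LLOG_F_IS_PLAIN and CHANGELOG_REC (the real values come from the Lustre headers), and -EINVAL is an assumed error code.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholders for the real LLOG_F_IS_PLAIN and CHANGELOG_REC definitions;
 * the values here are for illustration only. */
#define SKETCH_F_IS_PLAIN	0x4u
#define SKETCH_CHANGELOG_REC	0x10660000u

/* Hypothetical helper: return 0 if the log header flags and record type
 * look like a plain changelog record, or -EINVAL if either value read
 * from disk is unexpected. */
static int validate_changelog_rec(uint32_t llh_flags, uint32_t lrh_type)
{
	if (!(llh_flags & SKETCH_F_IS_PLAIN))
		return -EINVAL;	/* not a plain llog */

	if (lrh_type != SKETCH_CHANGELOG_REC)
		return -EINVAL;	/* not a changelog record */

	return 0;
}
```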

The llog_cat_reverse_process(changelog_init_cb) handling appears to find the highest changelog index currently in use. If this reverse llog processing fails, the last changelog index may be lost. Options include doing "forward" llog processing, though this may suffer from the same problem, or using the changelog_users file to at least start with a changelog index higher than what the users have processed (e.g. current_index = max(user_index) + 10M or similar).
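The changelog_users fallback could be computed roughly as follows. This is only a sketch of the arithmetic: fallback_changelog_index() and FALLBACK_INDEX_GAP are hypothetical names, the user indices are assumed to have already been read from the changelog_users file into an array, and the 10M gap is the "or similar" guess from the text rather than a tuned value.

```c
#include <stddef.h>
#include <stdint.h>

/* Safety margin added on top of the highest user index; an assumed value. */
#define FALLBACK_INDEX_GAP	10000000ULL

/* Hypothetical helper: pick a starting changelog index that is higher than
 * anything the registered users have processed. With no users, this
 * degenerates to the gap itself. */
static uint64_t fallback_changelog_index(const uint64_t *user_index,
					 size_t nr_users)
{
	uint64_t max_index = 0;
	size_t i;

	for (i = 0; i < nr_users; i++) {
		if (user_index[i] > max_index)
			max_index = user_index[i];
	}

	return max_index + FALLBACK_INDEX_GAP;
}
```

For example, with user indices 5, 42 and 17 this yields 10000042.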