[LU-13450] handle corrupted llog files Created: 14/Apr/20  Updated: 09/Jan/24

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Alex Zhuravlev
Assignee: Alex Zhuravlev
Resolution: Unresolved
Votes: 0
Labels: None

Issue Links:
Blocker
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

If a plain llog gets corrupted for any reason, the MDS crashes:


LustreError: 9838:0:(llog_osd.c:259:llog_osd_read_header()) lustre-MDT0000-osd: bad log [0x1:0x6:0x0] header magic: 0x0 (expected 0x10645539)
Lustre: 9838:0:(llog_cat.c:830:llog_cat_process_common()) lustre-OST0001-osc-MDT0000: can't find llog handle [0x6:0x1:0x0]:0: rc = -5
LustreError: 9838:0:(llog.c:730:llog_process_thread()) lustre-OST0001-osc-MDT0000: Local llog found corrupted #0x5:1:0 catalog index 1 count 3
LustreError: 9838:0:(osp_sync.c:1284:osp_sync_thread()) ASSERTION( kthread_should_stop() ) failed: 0 changes, 0 in progress, 0 in flight
LustreError: 9838:0:(osp_sync.c:1284:osp_sync_thread()) LBUG


The above is reproduced on the master branch; the test is scripted and will be included in the patch.
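
For triage, the first error line in the log above can be picked out of a console log mechanically. The following is a minimal sketch, not part of the patch: the helper name parse_bad_magic and the regular expression are assumptions, written only against the message format visible in this ticket. Other Lustre versions may word the message differently.

```python
import re

# Matches the llog_osd_read_header() message format seen in this ticket.
# The pattern is an assumption derived from that single log line.
BAD_MAGIC_RE = re.compile(
    r"bad log \[(?P<fid>0x[0-9a-fA-F]+:0x[0-9a-fA-F]+:0x[0-9a-fA-F]+)\] "
    r"header magic: (?P<found>0x[0-9a-fA-F]+) "
    r"\(expected (?P<expected>0x[0-9a-fA-F]+)\)"
)


def parse_bad_magic(line):
    """Return the llog id, found magic and expected magic from a log line.

    Returns None when the line is not a "bad log ... header magic" message.
    parse_bad_magic is a hypothetical helper, not a Lustre tool.
    """
    match = BAD_MAGIC_RE.search(line)
    if match is None:
        return None
    return {
        "fid": match.group("fid"),
        "found": int(match.group("found"), 16),
        "expected": int(match.group("expected"), 16),
    }
```

Applied to the first LustreError line above, the helper reports the llog [0x1:0x6:0x0] with a found magic of 0 against the expected 0x10645539, meaning the header type field read back as zero.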



 Comments   
Comment by Gerrit Updater [ 14/Apr/20 ]

Alex Zhuravlev (bzzz@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38219
Subject: LU-13450 tests: reproduce LBUG on llog processing
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 743991bbf2d486610ee2587cf9f50d20af4c188d

Generated at Sat Feb 10 03:01:22 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.