[LU-13411] processing of update logs couldn't delete empty plain llogs Created: 03/Apr/20  Updated: 19/Oct/22  Resolved: 20/May/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.14.0

Type: Bug Priority: Critical
Reporter: Alexander Boyko Assignee: Alexander Boyko
Resolution: Fixed Votes: 0
Labels: patch

Issue Links:
Related
is related to LU-13195 replay-single test_118: dt_declare_re... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

After failover, the update log catalog can contain plain llogs of zero size. The processing logic cannot cancel them because it depends on the LLOG_F_ZAP_WHEN_EMPTY flag, and that flag is not set in a freshly allocated llog header.

00000040:00080000:0.0:1584724451.874095:0:32142:0:(llog.c:806:llog_process_or_fork()) Processing [0x19990:0xc0000405:0x2] flags 0x012 startcat 0 startidx 0 first_idx -1 last_idx -1
00000040:00080000:0.0:1584724451.874257:0:32142:0:(llog.c:649:llog_process_thread()) index: 2, lh_last_idx: 390 synced_idx: 0 lgh_last_idx: 390
00000040:00080000:0.0:1584724451.874259:0:32142:0:(llog_cat.c:814:llog_cat_process_common()) processing log [0x19992:0xc0000405:0x2]:0 at index 2 of catalog [0x19990:0xc0000405:0x2]
00000040:00080000:0.0:1584724451.874335:0:32142:0:(llog_osd.c:233:llog_osd_read_header()) not reading header from 0-byte log
00000040:00080000:0.0:1584724451.874339:0:32142:0:(llog.c:806:llog_process_or_fork()) Processing [0x19992:0xc0000405:0x2] flags 0x004 startcat -1046185984 startidx -25265 first_idx -1 last_idx -1
00000040:00080000:0.0:1584724451.875288:0:32142:0:(llog.c:699:llog_process_thread()) stop processing plain 0x19992:3221226501:0 index 261376 count 1
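
To make the flags values in the trace above easier to read, here is a minimal Python sketch that decodes them. The bit values are an assumption: they are taken from memory of enum llog_flag in lustre_idl.h and should be checked against the source tree in use. The names LlogFlag, decode_llog_flags and zap_when_empty are hypothetical helpers for triage, not Lustre APIs.

```python
from enum import IntFlag


class LlogFlag(IntFlag):
    # Assumed values, recalled from enum llog_flag in lustre_idl.h.
    # Verify against your Lustre tree before relying on them.
    ZAP_WHEN_EMPTY = 0x01
    IS_CAT = 0x02
    IS_PLAIN = 0x04
    EXT_JOBID = 0x08
    IS_FIXSIZE = 0x10


def decode_llog_flags(value: int) -> list[str]:
    """Return the names of the known flags that are set in value."""
    return [flag.name for flag in LlogFlag if value & flag]


def zap_when_empty(value: int) -> bool:
    """True if the (assumed) LLOG_F_ZAP_WHEN_EMPTY bit is set."""
    return bool(value & LlogFlag.ZAP_WHEN_EMPTY)


if __name__ == "__main__":
    # Flags printed by llog_process_or_fork() for the catalog and the plain log.
    for raw in (0x012, 0x004):
        print(hex(raw), decode_llog_flags(raw), zap_when_empty(raw))
```

Under these assumed values, the plain log's 0x004 carries only the plain-log bit and no zap-when-empty bit, which is consistent with the description: the 0-byte log gets a fresh header, so the empty-log cleanup path does not apply to it.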
 

We also see the following error during processing:

2020-03-20 17:01:54 [70530.343769] LustreError: 66765:0:(llog.c:625:llog_process_thread()) fs1-MDT0001-osp-MDT0000: [0x2:0x80070cdc:0x5] Invalid record: index 67727 but expected 67726
2020-03-20 17:01:54 [70530.362675] LustreError: 66765:0:(lod_dev.c:441:lod_sub_recovery_thread()) fs1-MDT0001-osp-MDT0000 get update log failed: rc = -34

Record 67726 is missing from the llog, and its bit in the llog bitmap is zero.
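
The "Invalid record" message reports a break in an otherwise consecutive run of record indices, and rc = -34 matches -ERANGE on Linux. The following Python sketch is a simplified model of that kind of continuity check, not the kernel code in llog_process_thread(); first_index_gap and check_records are hypothetical names, and the real code also consults the header bitmap, which this model ignores.

```python
import errno


def first_index_gap(indices):
    """Return (expected, found) for the first break in a run of
    consecutive record indices, or None if the run is unbroken."""
    expected = None
    for idx in indices:
        if expected is not None and idx != expected:
            return expected, idx
        expected = idx + 1
    return None


def check_records(indices):
    """Model the return code: -ERANGE on the first gap, 0 otherwise."""
    return -errno.ERANGE if first_index_gap(indices) else 0


if __name__ == "__main__":
    # Indices around the failure in the report: 67726 is absent.
    seen = [67724, 67725, 67727, 67728]
    print(first_index_gap(seen), check_records(seen))
```

With the indices from the report, the model returns the pair (67726, 67727), mirroring "index 67727 but expected 67726".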



 Comments   
Comment by Gerrit Updater [ 03/Apr/20 ]

Alexander Boyko (c17825@cray.com) uploaded a new patch: https://review.whamcloud.com/38131
Subject: LU-13411 llog: allow delete of zero size llog, index jump
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a85878accb8896ba71caf7acd64e098f684d83aa

Comment by Gerrit Updater [ 20/May/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38131/
Subject: LU-13411 llog: allow delete of zero size llog
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: bc7f8cfe0fc6a5977d452c4637e340cd63081bdc

Comment by Peter Jones [ 20/May/20 ]

Fixed in 2.14

Comment by Gerrit Updater [ 02/Jun/22 ]

"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/47514
Subject: LU-13411 llog: allow delete of zero size llog
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: 4dc7b177c0f2d269ba97704d478cb361e8504729

Generated at Sat Feb 10 03:01:02 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.