[LU-17204] lod_ea_store_resize()) ASSERTION( info->lti_ea_store_size < round ) failed Created: 17/Oct/23  Updated: 08/Nov/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Alex Zhuravlev Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

I occasionally hit this assertion while running racer:

[  283.409542] LustreError: 12528:0:(lod_lov.c:461:lod_ea_store_resize()) ASSERTION( info->lti_ea_store_size < round ) failed: 
[  283.409804] LustreError: 12528:0:(lod_lov.c:461:lod_ea_store_resize()) LBUG
[  283.409952] Pid: 12528, comm: mdt00_047 4.18.0 #2 SMP Sun Oct 23 17:58:04 UTC 2022
[  283.410083] Call Trace TBD:
[  283.410146] [<0>] libcfs_call_trace+0x67/0x90 [libcfs]
[  283.410232] [<0>] lbug_with_loc+0x3e/0x80 [libcfs]
[  283.410325] [<0>] lod_ea_store_resize+0x475/0x5a0 [lod]
[  283.410410] [<0>] lod_get_ea+0x34b/0x550 [lod]
[  283.410495] [<0>] lod_get_default_lov_striping+0xdc/0x8f0 [lod]
[  283.410596] [<0>] lod_ah_init+0x520/0x14f0 [lod]
[  283.410686] [<0>] mdd_create+0x639/0x25a0 [mdd]

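I suspect this is due to the LOVEA getting "shorter" in another process, so that by the time lod_get_ea() asks for a resize, the requested size no longer exceeds the current buffer.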
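
To illustrate the suspected failure mode, here is a minimal user-space sketch. It is not the Lustre code and not the merged patch. The struct ea_store type and the ea_store_resize() and round_up_pow2() helpers are hypothetical stand-ins, and rounding to a power of two is an assumption about how "round" is computed. The point is only that a resize request which does not exceed the current buffer can be treated as a no-op rather than asserted against:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-thread EA buffer and its size. */
struct ea_store {
	void   *buf;
	size_t  size;
};

/* Round n up to the next power of two (assumed rounding policy). */
static size_t round_up_pow2(size_t n)
{
	size_t r = 1;

	while (r < n)
		r <<= 1;
	return r;
}

/*
 * Make sure the store can hold at least `need` bytes.
 *
 * If the EA shrank in another process after the caller saw -ERANGE,
 * `need` may round to a size that is not larger than the buffer we
 * already have. An assertion of the form (store->size < round) would
 * fire in that case; this sketch returns success and keeps the buffer.
 */
static int ea_store_resize(struct ea_store *store, size_t need)
{
	size_t round = round_up_pow2(need);
	void *buf;

	assert(store != NULL);

	if (store->size >= round)
		return 0;	/* already big enough, nothing to do */

	buf = realloc(store->buf, round);
	if (buf == NULL)
		return -ENOMEM;

	store->buf = buf;
	store->size = round;
	return 0;
}
```

With this shape, a caller that retries the xattr read after a resize simply reuses the existing buffer when the EA turned out to be shorter than expected.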



 Comments   
Comment by Gerrit Updater [ 17/Oct/23 ]

"Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52727
Subject: LU-17204 lod: don't panic on short LOVEA
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 7939c374beb4233e3ab0a381f40f9c9a1e76dddb

Comment by Gerrit Updater [ 08/Nov/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52727/
Subject: LU-17204 lod: don't panic on short LOVEA
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 8fa3532b1ee887be378adbf9432707b2d8a2d814
