[LU-16720] large-scale test_3a osp_precreate_rollover_new_seq() ASSERTION( fid_seq(fid) != fid_seq(last_fid) ) failed: fid [0x240000bd0:0x1:0x0], last_fid [0x240000bd0:0x3fff:0x0] Created: 07/Apr/23  Updated: 12/Apr/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.16.0
Fix Version/s: Lustre 2.16.0

Type: Bug Priority: Minor
Reporter: Oleg Drokin Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-11912 reduce number of OST objects created ... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Starting March 30, right after the landings that day, a new assertion crash appeared in large-scale test 3a (it only gets run in full testing, I guess, so it flew under the radar):

LustreError: 676976:0:(osp_precreate.c:488:osp_precreate_rollover_new_seq()) ASSERTION( fid_seq(fid) != fid_seq(last_fid) ) failed: fid [0x240000bd0:0x1:0x0], last_fid [0x240000bd0:0x3fff:0x0]
LustreError: 676976:0:(osp_precreate.c:488:osp_precreate_rollover_new_seq()) LBUG
Pid: 676976, comm: osp-pre-0-0 4.18.0-425.10.1.el8_lustre.x86_64 #1 SMP Thu Mar 2 00:54:22 UTC 2023
Call Trace TBD:
[<0>] libcfs_call_trace+0x6f/0xa0 [libcfs]
[<0>] lbug_with_loc+0x3f/0x70 [libcfs]
[<0>] osp_precreate_thread+0x121d/0x1230 [osp]
[<0>] kthread+0x10b/0x130
[<0>] ret_from_fork+0x35/0x40 

 

Example crashes:

https://testing.whamcloud.com/test_sets/5173c0c5-ff80-4f5b-aec2-d6e1419cbd85

https://testing.whamcloud.com/test_sets/68c90481-1450-4526-a659-b6d5d6b97f0a

https://testing.whamcloud.com/test_sets/20a4a76a-e1bf-4f46-985c-b8cbed94e51b

I suspect this is due to the LU-11912 patch landing; the timing checks out.



 Comments   
Comment by Andreas Dilger [ 07/Apr/23 ]

Dongyang, this shouldn't be a case involving replay_barrier, just one of creating a lot of files. It isn't exactly the same as LU-16692, since this is an LASSERT that the sequences are different, while that ticket is an LASSERT that they are the same.

It seems like there is an off-by-one in the rollover? Also, we may need to replace the LASSERT with error handling, since these assertions seem too easy to hit.
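
To illustrate the "error handling instead of LASSERT" idea, here is a minimal userspace C sketch (not the Lustre source: the struct and function are simplified stand-ins for struct lu_fid and osp_precreate_rollover_new_seq(), and the -EINVAL choice is an assumption), where a duplicate sequence is reported to the caller instead of LBUGging the node:

#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for struct lu_fid: sequence, object id, version. */
struct fid {
        uint64_t f_seq;
        uint32_t f_oid;
        uint32_t f_ver;
};

/*
 * Hypothetical rollover check: instead of asserting that the newly
 * allocated sequence differs from the one saved in last_fid (and
 * crashing the node when it does not), return an error and let the
 * precreate thread retry or report the failure.
 */
static int rollover_new_seq(const struct fid *new_fid,
                            const struct fid *last_fid)
{
        if (new_fid->f_seq == last_fid->f_seq) {
                fprintf(stderr,
                        "new fid [0x%llx:0x%x:0x%x] reuses the sequence "
                        "of last_fid [0x%llx:0x%x:0x%x]\n",
                        (unsigned long long)new_fid->f_seq,
                        (unsigned)new_fid->f_oid, (unsigned)new_fid->f_ver,
                        (unsigned long long)last_fid->f_seq,
                        (unsigned)last_fid->f_oid, (unsigned)last_fid->f_ver);
                return -EINVAL; /* error handling instead of LASSERT/LBUG */
        }
        return 0;
}

int main(void)
{
        /* Values from the crash report: same SEQ, OID at the reduced width. */
        struct fid last = { .f_seq = 0x240000bd0ULL, .f_oid = 0x3fff, .f_ver = 0 };
        struct fid next = { .f_seq = 0x240000bd0ULL, .f_oid = 0x1,    .f_ver = 0 };

        printf("rollover check: rc = %d\n", rollover_new_seq(&next, &last));
        return 0;
}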

Comment by Dongyang Li [ 10/Apr/23 ]

This is a different issue from LU-16692. It looks like the LASSERT happened in osp_precreate_rollover_new_seq().
During SEQ rollover we get a new SEQ, and it has to be different from the previous in-use SEQ saved in last_used_fid. Note that the object ID from last_used_fid is 0x3fff (the reduced SEQ width), which means the SEQ is used up and due to be changed.
I feel like this is actually a bug exposed by changing the SEQ more frequently, maybe a race when changing the SEQ?
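
Purely as an illustration of the trigger described above, a tiny standalone C sketch (the width constant and helper name are assumptions based on this comment, not OSP code): the object ID in last_used_fid reaching the reduced SEQ width is what makes the rollover due.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed reduced SEQ width from the assertion message: 0x3fff objects
 * per sequence, so rollover becomes due much more often in testing. */
#define REDUCED_SEQ_WIDTH 0x3fffULL

/* Hypothetical helper: the sequence is exhausted once the last used
 * object id reaches the width, so the next precreate must switch to a
 * freshly allocated SEQ. */
static bool seq_exhausted(uint64_t last_oid)
{
        return last_oid >= REDUCED_SEQ_WIDTH;
}

int main(void)
{
        /* last_fid [0x240000bd0:0x3fff:0x0] from the LASSERT message */
        uint64_t last_oid = 0x3fff;

        printf("oid 0x%llx: rollover due = %d\n",
               (unsigned long long)last_oid, (int)seq_exhausted(last_oid));
        return 0;
}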

Comment by Dongyang Li [ 11/Apr/23 ]

I think I know what's going on.
Before large-scale, the previous suite was replay-ost-single, which does a replay_barrier on ost1, and from the logs the MDT0 OSP got a new SEQ after the replay_barrier on ost1:

[ 9541.509199] Lustre: DEBUG MARKER: == replay-ost-single test 12b: write after OST failover to a missing object ========================================================== 03:08:10 (1680059290)
[ 9545.683083] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n debug
[ 9546.092712] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n debug=0
[ 9546.795469] Lustre: lustre-OST0000-osc-MDT0000: update sequence from 0x240000401 to 0x240000bd0

The replay_barrier on ost1 drops writes, so we lost the seq range update. After that, as we progress to large-scale and need to allocate a new SEQ from the OFD, we still get the old one because the seq range update was lost.
Forcing a new seq on all MDTs in replay-ost-single should fix this; I've updated https://review.whamcloud.com/c/fs/lustre-release/+/50478
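
For readers following along, here is a toy standalone C simulation of that scenario (purely illustrative: the struct, function names, and numbers are stand-ins for the OFD's persistent sequence state and replay_barrier, not Lustre code). The grant of the new SEQ never reaches disk, so after failover the OFD hands out the same SEQ again and the rollover LASSERT fires:

#include <stdint.h>
#include <stdio.h>

/* Toy model of the OFD side: the last granted sequence, plus the copy
 * that survives a replay_barrier-style failover. */
struct toy_ofd {
        uint64_t granted_seq;   /* in memory */
        uint64_t on_disk_seq;   /* what survives the barrier */
};

static uint64_t ofd_alloc_seq(struct toy_ofd *ofd)
{
        return ++ofd->granted_seq;      /* grant the next sequence */
}

static void ofd_replay_barrier_failover(struct toy_ofd *ofd)
{
        /* The barrier dropped the write carrying the seq range update,
         * so recovery rolls the grant state back to what was on disk. */
        ofd->granted_seq = ofd->on_disk_seq;
}

int main(void)
{
        /* The OST starts with sequence 0x240000400 granted and committed. */
        struct toy_ofd ofd = { .granted_seq = 0x240000400ULL,
                               .on_disk_seq = 0x240000400ULL };
        uint64_t mdt_seq, next_seq;

        /* MDT0's OSP is granted a new SEQ after replay_barrier on ost1,
         * but the barrier means the grant is never committed. */
        mdt_seq = ofd_alloc_seq(&ofd);
        ofd_replay_barrier_failover(&ofd);

        /* Later, in large-scale, the old SEQ is used up (oid hits the
         * reduced width) and the OSP asks for a new SEQ: same one back. */
        next_seq = ofd_alloc_seq(&ofd);

        printf("SEQ held by MDT0 OSP: 0x%llx, SEQ granted after failover: 0x%llx%s\n",
               (unsigned long long)mdt_seq, (unsigned long long)next_seq,
               mdt_seq == next_seq ? "  <-- duplicate, trips the LASSERT" : "");
        return 0;
}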
