Details
Improvement
Resolution: Fixed
Minor
Lustre 2.15.0
Description
Since Lustre 2.4.0 and DNE1, it has been possible to create OST objects using a different FID SEQ range for each MDT, to avoid contention during MDT object precreation.
Objects that are created by MDT0000 are put into FID SEQ 0 (O/0/d*) on all OSTs and have a filename that is the decimal FID OID in ASCII. However, SEQ=0 objects are remapped to IDIF FID SEQ (0x100000000 | (ost_idx << 16)) so that they are unique across all OSTs.
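A minimal sketch of the SEQ=0 to IDIF remapping, mirroring only the expression quoted above. The function name `idif_seq` and the 16-bit bound on the OST index are assumptions made for illustration; this is not the Lustre implementation.

```python
FID_SEQ_IDIF_BASE = 0x100000000  # base value taken from the expression above


def idif_seq(ost_idx: int) -> int:
    """Return the IDIF FID SEQ for a SEQ=0 object on the given OST index."""
    # Assumption: the OST index fits in 16 bits, so the result stays
    # within one contiguous 32-bit-wide SEQ range.
    if not 0 <= ost_idx < (1 << 16):
        raise ValueError("ost_idx is assumed to fit in 16 bits")
    return FID_SEQ_IDIF_BASE | (ost_idx << 16)


if __name__ == "__main__":
    for idx in (0, 1, 2):
        print(f"OST index {idx}: IDIF SEQ {idif_seq(idx):#x}")
```

Because the OST index is folded into the SEQ, two objects with the same OID on different OSTs end up with different FIDs.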
Objects that are created by other MDTs (or MDT0000 after 2^48 objects are created in SEQ 0) use a unique SEQ in the FID_SEQ_NORMAL range (>= 0x200000400), and use a filename that is the hexadecimal FID OID in ASCII.
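The two naming rules can be summarized in a short sketch. The function name `object_filename` is hypothetical, and treating the FID_SEQ_NORMAL boundary as inclusive is an assumption of this example; it only restates the rules in this description and does not reproduce the on-disk code.

```python
FID_SEQ_NORMAL = 0x200000400  # start of the normal SEQ range named above


def object_filename(seq: int, oid: int) -> str:
    """Return the OST object filename for a FID, per the rules described above."""
    if seq == 0:
        # Legacy objects created by MDT0000: decimal OID in ASCII.
        return str(oid)
    if seq >= FID_SEQ_NORMAL:
        # Objects in a normal SEQ: hexadecimal OID in ASCII.
        # Assumption: the boundary value itself counts as normal.
        return format(oid, "x")
    raise ValueError(f"SEQ {seq:#x} is not covered by this sketch")


if __name__ == "__main__":
    print(object_filename(0, 255))
    print(object_filename(0x200000401, 255))
```

The same OID of 255 is stored as `255` under SEQ 0 and as `ff` under a normal SEQ, which is one of the differences the proposal would eventually remove.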
For compatibility with pre-DNE MDTs and OSTs, the use of SEQ=0 by MDT0000 was kept until now, but there has not been a reason to keep this compatibility for new filesystems. It would be better to have MDT0000 assigned a "regular" FID SEQ range at startup, so that the SEQ=0 compatibility can eventually be removed. That would ensure OST objects have "proper and unique" FIDs, and avoid the complexity of mapping between the old SEQ=0 48-bit OID values and the IDIF FIDs.
Older filesystems using SEQ=0 would eventually delete old objects in this range and/or could be forced to migrate to using new objects to clean up the remaining usage, if necessary.
I will update conf-sanity/84.
Alex, the new crash is a different issue, mostly caused by the landing of https://review.whamcloud.com/c/fs/lustre-release/+/38424/
Now, the patch introduces a SEQ width of 16384 in Maloo, so the SEQ change will happen more frequently and randomly.
To make sure a SEQ change doesn't happen after replay_barrier, the patch from 38424 adds force_new_seq, which changes the SEQ when test suites like replay-single start. According to the log, it did change the SEQ,
but I think a SEQ width of 16384 is not enough for the whole of replay-single: since we have only 2 OSTs, more objects will be created on each OST.
I think there are 2 things we could do: use force_new_seq for every replay_barrier, which I think is a bit too heavy, or enlarge the default SEQ width of 16384 according to the number of OSTs.
Note that we don't really need force_new_seq for conf-sanity/84: the change from IDIF SEQ to normal SEQ happens as soon as the OSP connects, so we just need to wait for that before using replay_barrier.
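The SEQ width concern above can be made concrete with a rough model. This sketch assumes each OST consumes its own SEQ of fixed width and that objects are spread evenly across OSTs. The function name and the object counts are hypothetical, not measurements from replay-single.

```python
import math


def seq_rollovers_per_ost(total_objects: int, ost_count: int,
                          seq_width: int = 16384) -> int:
    """Estimate how many times each OST exhausts its SEQ during a test run."""
    if ost_count <= 0 or seq_width <= 0:
        raise ValueError("ost_count and seq_width must be positive")
    # Assumption: objects are distributed evenly, rounding up per OST.
    per_ost = math.ceil(total_objects / ost_count)
    if per_ost == 0:
        return 0
    # The first seq_width objects fit in the starting SEQ; every further
    # block of seq_width objects needs one more SEQ.
    return (per_ost - 1) // seq_width


if __name__ == "__main__":
    # Hypothetical workload of 40000 objects.
    for osts in (2, 8):
        print(osts, "OSTs:", seq_rollovers_per_ost(40000, osts), "rollover(s)")
```

Under this model the same workload crosses a SEQ boundary with 2 OSTs but not with 8, which is why a fixed width can be too small on small test clusters.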