Details
- Type: Bug
- Resolution: Unresolved
- Priority: Major
- Fix Version/s: None
- Affects Version/s: Lustre 2.14.0, Lustre 2.16.0
- Labels: None
- Severity: 3
Description
After an MDT hit the LU-16692 sequence rollover LASSERT() and rebooted in a loop, one of the MDT lov_objseq files appears to have had the low bytes of every FID SEQ value replaced by LOV_MAGIC_MAGIC or a similar value (unfortunately, several different constants contain "0BD0", such as LUSTRE_MSG_MAGIC_V1):
# od -Ax -tx8 lov_objseq
0000000 40c0000bd0 4900000bd0
0000010 1240000bd0 1ac0000bd0
0000020 23c0000bd0 3100000bd0
0000030 48c0000bd0 c40000bd0
0000040 3d40000bd0 1f00000bd0
0000050 1280000bd0 1ec0000bd0
0000060 4c40000bd0 2880000bd0
0000070 10c0000bd0 12c0000bd0
0000080 2680000bd0 1100000bd0
0000090 4c80000bd0 2d00000bd0
00000a0 2540000bd0 a40000bd0
00000b0 600000bd0 2ec0000bd0
00000c0 1400000bd0 3940000bd0
00000d0 4140000bd0 1e80000bd0
00000e0 940000bd0 780000bd0
00000f0 1080000bd0 2c00000bd0
:
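One way to confirm that every entry in the file carries the same low bits is to tally them. The following is only a minimal sketch, not a Lustre tool: it assumes the input is `od -Ax -tx8` text in the layout shown here, and the helper names (`parse_od_words`, `low16_counts`) and the 16-bit mask are illustrative choices, picked because 0x0BD0 fits in 16 bits.

```python
from collections import Counter

def parse_od_words(text):
    """Parse `od -Ax -tx8` text into a list of integers.

    The first field on each line is the hex offset and is skipped.
    Lines with fewer than two fields (blank lines, a lone ':' or '*')
    carry no data words and are ignored.
    """
    words = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        words.extend(int(field, 16) for field in fields[1:])
    return words

def low16_counts(words):
    """Count how often each low-16-bit pattern occurs (mask width is an assumption)."""
    return Counter(word & 0xFFFF for word in words)

# First two rows of the dump from the affected MDT, copied from this report.
SAMPLE = """0000000 40c0000bd0 4900000bd0
0000010 1240000bd0 1ac0000bd0
:"""

if __name__ == "__main__":
    for pattern, count in low16_counts(parse_od_words(SAMPLE)).items():
        print(f"low16={pattern:#06x} count={count}")
```

A file in which a single pattern accounts for every entry, as in this excerpt, is a strong hint that the low bits were overwritten rather than allocated normally.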
In contrast, the lov_objseq file on another MDT looked as expected: the low bits of each value are close to the original "0x400" starting point, with slight variation between OSTs due to usage and to SEQ values being assigned to MDTs in slightly different orders:
0000000 40c0000405 490000040b
0000010 1240000404 1ac000040d
0000020 23c000040b 310000040b
0000030 48c000040b c4000040b
0000040 3d4000040c 1f00000404
0000050 1280000404 1ec0000404
0000060 4c40000406 2880000409
0000070 10c0000407 12c0000404
0000080 2680000409 1100000407
0000090 4c80000401 2d00000402
00000a0 254000040b a40000401
00000b0 600000408 2ec0000402
00000c0 1400000403 3940000401
00000d0 4140000405 1e80000404
00000e0 940000402 780000403
00000f0 1080000407 2c00000404
:
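Comparing the two dumps entry by entry makes the difference easier to see: in the excerpts in this report, the upper bits of each value match between the two files and only the low bits differ. The sketch below illustrates that comparison on the first four entries of each dump. The 16-bit split point and the names `split_seq`, `BAD` and `GOOD` are assumptions made for illustration, not part of any Lustre API.

```python
# First four entries of each dump, copied from this report.
BAD = [0x40c0000bd0, 0x4900000bd0, 0x1240000bd0, 0x1ac0000bd0]
GOOD = [0x40c0000405, 0x490000040b, 0x1240000404, 0x1ac000040d]

def split_seq(value, low_bits=16):
    """Split a 64-bit value into (upper bits, low bits).

    The 16-bit split point is an assumption chosen to isolate 0x0BD0.
    """
    return value >> low_bits, value & ((1 << low_bits) - 1)

if __name__ == "__main__":
    for bad, good in zip(BAD, GOOD):
        bad_hi, bad_lo = split_seq(bad)
        good_hi, good_lo = split_seq(good)
        print(f"upper={bad_hi:#x} same_upper={bad_hi == good_hi} "
              f"bad_low={bad_lo:#x} good_low={good_lo:#x}")
```

If the same check holds for the full files, it would suggest that only the low part of each stored value was affected, which is consistent with the description above.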
The lov_objid values for the OSTs looked reasonable for a system running with the LU-11912 change, which makes OST FID SEQ rollover happen more quickly. Within each lov_objid file the OID numbers were fairly close to one another, though not very close to those in the other file.
dongyang, do you have any thoughts on how the lov_objseq values could have been affected in this way?
Issue Links
- is related to
  - LU-17658 sanity check when ofd assign a new sequence to osp [Open]
- is related to
  - LU-16692 replay-single: test_70c osp_fid_diff()) ASSERTION( fid_seq(fid1) == fid_seq(fid2) ) [Resolved]
  - LU-16720 large-scale test_3a osp_precreate_rollover_new_seq()) ASSERTION( fid_seq(fid) != fid_seq(last_fid) ) failed: fid [0x240000bd0:0x1:0x0], last_fid [0x240000bd0:0x3fff:0x0] [Resolved]
  - LU-11912 reduce number of OST objects created per MDS Sequence [Resolved]