LU-17538: lov_objseq file contains 0x0BD0 constant in low bytes

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • Lustre 2.14.0, Lustre 2.16.0
    • None
    • 3

    Description

      After an MDT hit the LU-16692 sequence rollover LASSERT() and rebooted in a loop, one of the MDT lov_objseq files looks as though the low bytes of all FID SEQ values were replaced by LOV_MAGIC_MAGIC or a similar value (unfortunately, several different constants contain "0BD0", such as LUSTRE_MSG_MAGIC_V1):

      # od  -Ax -tx8 lov_objseq
      0000000                                 40c0000bd0                               4900000bd0
      0000010                                 1240000bd0                               1ac0000bd0
      0000020                                 23c0000bd0                               3100000bd0
      0000030                                 48c0000bd0                               c40000bd0
      0000040                                 3d40000bd0                               1f00000bd0
      0000050                                 1280000bd0                               1ec0000bd0
      0000060                                 4c40000bd0                               2880000bd0
      0000070                                 10c0000bd0                               12c0000bd0
      0000080                                 2680000bd0                               1100000bd0
      0000090                                 4c80000bd0                               2d00000bd0
      00000a0                                 2540000bd0                               a40000bd0
      00000b0                                 600000bd0                               2ec0000bd0
      00000c0                                 1400000bd0                               3940000bd0
      00000d0                                 4140000bd0                               1e80000bd0
      00000e0                                 940000bd0                               780000bd0
      00000f0                                 1080000bd0                               2c00000bd0
      :
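
A minimal sketch for checking a whole lov_objseq file for this pattern, rather than reading the od output by eye. It assumes the file is a flat array of little-endian 64-bit values (which matches the `od -tx8` view above on a little-endian host) and that the slot number corresponds to the OST index. The function names are hypothetical and not part of any Lustre tool.

```python
import struct

# 0x0BD0 is the value seen in the low bytes above (LOV_MAGIC_MAGIC or similar).
SUSPECT_LOW16 = 0x0BD0


def parse_objseq(data: bytes) -> list:
    """Decode a buffer as consecutive little-endian u64 values (assumed layout)."""
    count = len(data) // 8
    return list(struct.unpack(f"<{count}Q", data[:count * 8]))


def suspect_slots(values):
    """Return (slot, value) pairs whose low 16 bits equal 0x0BD0."""
    return [(slot, v) for slot, v in enumerate(values)
            if v & 0xFFFF == SUSPECT_LOW16]


def scan_file(path):
    """Read a lov_objseq copy from 'path' and return its suspect slots."""
    with open(path, "rb") as f:
        return suspect_slots(parse_objseq(f.read()))
```

Calling `scan_file("lov_objseq")` on a copy of the file returns the slots that carry the constant. An occasional hit on a healthy file would not be surprising, since a real SEQ could end in 0x0BD0 by chance, but a hit in every slot matches the pattern shown above.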
      

      In contrast, the lov_objseq on another MDT looked as expected: the low bytes are close to the original "0x400" starting point, with slight variation between OSTs due to usage and to SEQ values being assigned to MDTs in slightly different orders:

      0000000                                 40c0000405                               490000040b
      0000010                                 1240000404                               1ac000040d
      0000020                                 23c000040b                               310000040b
      0000030                                 48c000040b                               c4000040b
      0000040                                 3d4000040c                               1f00000404
      0000050                                 1280000404                               1ec0000404
      0000060                                 4c40000406                               2880000409
      0000070                                 10c0000407                               12c0000404
      0000080                                 2680000409                               1100000407
      0000090                                 4c80000401                               2d00000402
      00000a0                                 254000040b                               a40000401
      00000b0                                 600000408                               2ec0000402
      00000c0                                 1400000403                               3940000401
      00000d0                                 4140000405                               1e80000404
      00000e0                                 940000402                               780000403
      00000f0                                 1080000407                               2c00000404
      :
      

      The lov_objid fields for the OSTs looked reasonable for a system running with LU-11912, which causes OST FID SEQ rollover to happen more quickly. The OID numbers in each lov_objid file were fairly close to one another, though not very close to those in the other file.

      dongyang, do you have any thoughts on how the lov_objseq values could have been affected in this way?

            People

              dongyang Dongyang Li
              adilger Andreas Dilger