Lustre / LU-18157

MDT crashes from assert when an OST mounts

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.12.9
    • None
    • None
    • CentOS 7.9, Lustre 2.12.7
    • 3

    Description

      After an abrupt power outage, we started to see an assertion that causes a kernel panic on the MDT. The attached vmcore-dmesg.txt shows the following errors:

      [ 6867.143694] Lustre: 79890:0:(llog.c:615:llog_process_thread()) lustre01-OST0032-osc-MDT0000: invalid length 0 in llog [0x52ab:0x1:0x0]record for index 0/2
      [ 6867.143705] Lustre: 79890:0:(llog.c:615:llog_process_thread()) Skipped 1 previous similar message
      [ 6867.143720] LustreError: 79890:0:(osp_sync.c:1272:osp_sync_thread()) lustre01-OST0032-osc-MDT0000: llog process with osp_sync_process_queues failed: -22
      [ 6867.148800] LustreError: 79890:0:(osp_sync.c:1272:osp_sync_thread()) Skipped 1 previous similar message 
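
      For anyone triaging similar output, the following is a minimal sketch (not part of Lustre) that pulls the target name, llog ID, and return code out of messages shaped like the ones above. The regular expressions and the function names parse_llog_error and parse_return_code are hypothetical and were written against these two message formats only; other Lustre versions may word them differently.

```python
import errno
import re

# Hypothetical pattern for the "invalid length" message from llog_process_thread().
# The "index" field is kept as the raw "a/b" string; its meaning is not interpreted here.
LLOG_RE = re.compile(
    r"\((?P<src>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+"
    r"(?P<target>\S+): invalid length (?P<length>\d+) in llog "
    r"\[(?P<llog_id>[0-9a-fx:]+)\]\s*record for index (?P<index>\d+/\d+)"
)

# Hypothetical pattern for a trailing "failed: -N" return code.
RC_RE = re.compile(r"failed: (?P<rc>-\d+)\s*$")


def parse_llog_error(line):
    """Return a dict of fields from an 'invalid length' llog message, or None."""
    match = LLOG_RE.search(line)
    return match.groupdict() if match else None


def parse_return_code(line):
    """Return (rc, errno_name) from a 'failed: -N' message, or None."""
    match = RC_RE.search(line)
    if not match:
        return None
    rc = int(match.group("rc"))
    # Kernel code returns negative errno values; look up the symbolic name.
    return rc, errno.errorcode.get(-rc, "UNKNOWN")


if __name__ == "__main__":
    llog_line = (
        "[ 6867.143694] Lustre: 79890:0:(llog.c:615:llog_process_thread()) "
        "lustre01-OST0032-osc-MDT0000: invalid length 0 in llog "
        "[0x52ab:0x1:0x0]record for index 0/2"
    )
    rc_line = (
        "[ 6867.143720] LustreError: 79890:0:(osp_sync.c:1272:osp_sync_thread()) "
        "lustre01-OST0032-osc-MDT0000: llog process with "
        "osp_sync_process_queues failed: -22"
    )
    print(parse_llog_error(llog_line))
    print(parse_return_code(rc_line))
```

      The return code -22 maps to EINVAL, which is consistent with the "invalid length 0" message reported for llog [0x52ab:0x1:0x0].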

      After a reboot, we can start the MDT and it finishes recovery. With all other OSTs brought online, everything stays up. When the OST named in the log above (OST0032) is mounted, though, we hit the same kernel panic/assert.

      I attempted to mount the MDT with ldiskfs and clear the updatelog and changelog, but this made no difference.

          People

            core-lustre-triage Core Lustre Triage
            makia Makia Minich