Lustre / LU-15809

replay-dual test_29: timeout llog_verify_record() lustre-MDT0000-osp-MDT0001: record is too large: 0 > 32768


Details

    • Bug
    • Resolution: Unresolved
    • Minor

    Description

      This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/9c32c9af-c574-4023-9286-07091f92769c

      test_29 failed with the following error:

      Started lustre-MDT0001
      
      Timeout occurred after 135 mins, last suite running was replay-dual
      

      It looks like the MDS cannot read the remote recovery llog and is stuck retrying forever, logging "retry remote llog process" each time:

      [Mon Dec 27 22:49:21 2021] LustreError: 113045:0:(llog.c:472:llog_verify_record()) lustre-MDT0000-osp-MDT0001: record is too large: 0 > 32768
      [Mon Dec 27 22:49:21 2021] LustreError: 113045:0:(llog.c:656:llog_process_thread()) lustre-MDT0000-osp-MDT0001: invalid record in llog [0x2:0x11d41:0x2] record for index 0/2: rc = -22
      [Mon Dec 27 22:49:21 2021] LustreError: 113045:0:(llog.c:482:llog_verify_record()) lustre-MDT0000-osp-MDT0001: magic 0 is bad
      [Mon Dec 27 22:49:21 2021] LustreError: 113045:0:(llog.c:781:llog_process_thread()) lustre-MDT0000-osp-MDT0001 retry remote llog process
      [Mon Dec 27 22:49:22 2021] Lustre: lustre-MDT0001: in recovery but waiting for the first client to connect
      [Mon Dec 27 22:49:22 2021] LustreError: 113045:0:(llog.c:472:llog_verify_record()) lustre-MDT0000-osp-MDT0001: record is too large: 400547 > 32768
      [Mon Dec 27 22:49:22 2021] LustreError: 113045:0:(llog.c:472:llog_verify_record()) Skipped 205 previous similar messages
      [Mon Dec 27 22:49:22 2021] LustreError: 113045:0:(llog.c:656:llog_process_thread()) lustre-MDT0000-osp-MDT0001: invalid record in llog [0x2:0x11d41:0x2] record for index 96/0: rc = -22
      [Mon Dec 27 22:49:22 2021] LustreError: 113045:0:(llog.c:656:llog_process_thread()) Skipped 309 previous similar messages
      :
      :
      [Mon Dec 27 23:36:25 2021] LustreError: 113045:0:(llog.c:482:llog_verify_record()) lustre-MDT0000-osp-MDT0001: magic 0 is bad
      [Mon Dec 27 23:36:25 2021] LustreError: 113045:0:(llog.c:482:llog_verify_record()) Skipped 129784 previous similar messages
      [Mon Dec 27 23:36:25 2021] LustreError: 113045:0:(llog.c:781:llog_process_thread()) lustre-MDT0000-osp-MDT0001 retry remote llog process
      [Mon Dec 27 23:36:25 2021] LustreError: 113045:0:(llog.c:781:llog_process_thread()) Skipped 32445 previous similar messages
      [Mon Dec 27 23:36:29 2021] Lustre: 113052:0:(ldlm_lib.c:1962:extend_recovery_timer()) lustre-MDT0001: extended recovery timer reached hard limit: 180, extend: 1
      [Mon Dec 27 23:36:29 2021] Lustre: 113052:0:(ldlm_lib.c:1962:extend_recovery_timer()) Skipped 29 previous similar messages
      [Mon Dec 27 23:46:25 2021] LustreError: 113045:0:(llog.c:472:llog_verify_record()) lustre-MDT0000-osp-MDT0001: record is too large: 0 > 32768
      [Mon Dec 27 23:46:25 2021] LustreError: 113045:0:(llog.c:472:llog_verify_record()) Skipped 258999 previous similar messages
      [Mon Dec 27 23:46:25 2021] LustreError: 113045:0:(llog.c:656:llog_process_thread()) lustre-MDT0000-osp-MDT0001: invalid record in llog [0x2:0x11d41:0x2] record for index 0/0: rc = -22
      [Mon Dec 27 23:46:25 2021] LustreError: 113045:0:(llog.c:656:llog_process_thread()) Skipped 388499 previous similar messages
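
A rough way to gauge how fast the retry loop is spinning is to total the messages per call site, counting each "Skipped N previous similar messages" line as N suppressed messages. The Python sketch below is illustrative only: the helper name `tally_call_sites` and both regular expressions are assumptions based on the line format in the excerpt above, not part of Lustre or Maloo. Because the excerpt is elided, the totals are a lower bound.

```python
import re
from collections import Counter

# Assumed format of the call-site tag: "(file.c:line:function())".
SITE_RE = re.compile(r"\((\w+\.c):(\d+):(\w+)\(\)\)")
# Rate-limiting summary line as it appears in the excerpt.
SKIPPED_RE = re.compile(r"Skipped (\d+) previous similar messages?")


def tally_call_sites(lines):
    """Return a Counter mapping 'file:line:function' to a message count."""
    counts = Counter()
    for line in lines:
        site = SITE_RE.search(line)
        if site is None:
            continue  # untagged line, e.g. the ":" elision rows
        key = ":".join(site.groups())
        skipped = SKIPPED_RE.search(line)
        # A "Skipped N" line stands for N suppressed messages;
        # any other tagged line is one printed message.
        counts[key] += int(skipped.group(1)) if skipped else 1
    return counts


# Four lines copied from the 23:36:25 burst in the excerpt above.
sample = [
    "[Mon Dec 27 23:36:25 2021] LustreError: 113045:0:(llog.c:482:llog_verify_record()) lustre-MDT0000-osp-MDT0001: magic 0 is bad",
    "[Mon Dec 27 23:36:25 2021] LustreError: 113045:0:(llog.c:482:llog_verify_record()) Skipped 129784 previous similar messages",
    "[Mon Dec 27 23:36:25 2021] LustreError: 113045:0:(llog.c:781:llog_process_thread()) lustre-MDT0000-osp-MDT0001 retry remote llog process",
    "[Mon Dec 27 23:36:25 2021] LustreError: 113045:0:(llog.c:781:llog_process_thread()) Skipped 32445 previous similar messages",
]

counts = tally_call_sites(sample)
for site, n in counts.most_common():
    print(f"{n:>8}  {site}")
```

For the four sample lines this reports 129785 messages from `llog.c:482` and 32446 from `llog.c:781`, about four "magic 0 is bad" errors for each "retry remote llog process" message in that window.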
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      replay-dual test_29 - Timeout occurred after 135 mins, last suite running was replay-dual

            People

              wc-triage WC Triage
              maloo Maloo