Lustre / LU-2387

Error messages printed in mdt_reint_open, possibly causing evictions

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • Lustre 2.4.0
    • Tag: 2.3.56-2chaos-3surya1
    • 3
    • 5661

    Description

      I just rebooted our MDS and see the following messages on the console:

      Lustre: lstest-MDT0000: Will be in recovery for at least 5:00, or until 265 clients reconnect.
      LustreError: 33073:0:(mdt_open.c:1328:mdt_reint_open()) @@@ [0x2000182dc:0x156a9:0x0]/simul_open.0->[0x2000182dc:0x1f2d1:0x0] cr_flags=03 mode=0100000 msg_flag=0x4 not found in open replay.  req@ffff881fd78c5050 x1419179046425207/t0(201876066869) o101->bfcbc1b8-9a26-425f-5198-348ff50beb40@172.20.3.120@o2ib500:0/0 lens 568/1136 e 0 to 0 dl 1353975967 ref 1 fl Complete:/4/0 rc 0/0
      2012-11-26 16:25:06 LustreError: 33073:0:(mdt_open.c:1328:mdt_reint_open()) @@@ [0x2000182dc:0x156a9:0x0]/simul_close.0->[0x2000182dc:0x1f2d2:0x0] cr_flags=03 mode=0100000 msg_flag=0x4 not found in open replay.  req@ffff881fd7fe1c50 x1419179046425214/t0(201876066900) o101->bfcbc1b8-9a26-425f-5198-348ff50beb40@172.20.3.120@o2ib500:0/0 lens 568/1136 e 0 to 0 dl 1353975967 ref 1 fl Complete:/4/0 rc 0/0
      2012-11-26 16:25:07 LustreError: 33073:0:(mdt_open.c:1328:mdt_reint_open()) @@@ [0x2000182dc:0x156a9:0x0]/simul_open.0->[0x2000182dc:0x1f380:0x0] cr_flags=03 mode=0100000 msg_flag=0x4 not found in open replay.  req@ffff881fd80c9450 x1419179046426510/t0(201876068937) o101->bfcbc1b8-9a26-425f-5198-348ff50beb40@172.20.3.120@o2ib500:0/0 lens 568/1136 e 0 to 0 dl 1353975968 ref 1 fl Complete:/4/0 rc 0/0
      2012-11-26 16:25:07 LustreError: 33073:0:(mdt_open.c:1328:mdt_reint_open()) Skipped 43 previous similar messages
      2012-11-26 16:25:08 LustreError: 33073:0:(mdt_open.c:1328:mdt_reint_open()) @@@ [0x2000182dc:0x156a9:0x0]/simul_lseek.0->[0x2000182dc:0x1f45a:0x0] cr_flags=03 mode=0100000 msg_flag=0x4 not found in open replay.  req@ffff881fd675f850 x1419179046428033/t0(201876070976) o101->bfcbc1b8-9a26-425f-5198-348ff50beb40@172.20.3.120@o2ib500:0/0 lens 568/1136 e 0 to 0 dl 1353975969 ref 1 fl Complete:/4/0 rc 0/0
      2012-11-26 16:25:08 LustreError: 33073:0:(mdt_open.c:1328:mdt_reint_open()) Skipped 56 previous similar messages
      2012-11-26 16:25:09 Lustre: lstest-MDT0000: Recovery over after 1:15, of 265 clients 256 recovered and 9 were evicted.
      

      I haven't looked at the code to determine what the LustreError messages mean, but I wanted to open an issue in the meantime.

      First off, these should really be cleaned up and reworked to print something sane that an admin can understand.

      Second, 9 clients were evicted here, so I'm curious whether these evictions are a result of the error messages printed just before recovery completed.
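
      To help check whether the errors and the evictions line up, here is a minimal sketch that tallies the "not found in open replay" messages per client and pulls out the recovery summary. The regular expressions are assumptions derived only from the console lines quoted above, and the name summarize is hypothetical; other Lustre versions may format these messages differently.

```python
import re
from collections import Counter

# Patterns are assumptions based solely on the console output quoted in
# this report; they are not taken from the Lustre source.
REPLAY_ERR = re.compile(
    r"mdt_reint_open\(\)\) @@@ .*? not found in open replay\."
    r".*? o101->(?P<uuid>[0-9a-f-]+)@(?P<nid>\S+?):\d+/\d+ "
)
SKIPPED = re.compile(
    r"mdt_reint_open\(\)\) Skipped (?P<n>\d+) previous similar messages"
)
RECOVERY = re.compile(
    r"Recovery over after (?P<time>[\d:]+), of (?P<total>\d+) clients "
    r"(?P<ok>\d+) recovered and (?P<evicted>\d+) were evicted"
)


def summarize(lines):
    """Return (errors per client, rate-limited message count, recovery summary)."""
    per_client = Counter()   # (client UUID, NID) -> number of visible errors
    skipped = 0              # messages suppressed by console rate limiting
    recovery = None          # dict parsed from the "Recovery over" line, if any
    for line in lines:
        m = REPLAY_ERR.search(line)
        if m:
            per_client[(m["uuid"], m["nid"])] += 1
            continue
        m = SKIPPED.search(line)
        if m:
            skipped += int(m["n"])
            continue
        m = RECOVERY.search(line)
        if m:
            recovery = {
                key: (value if key == "time" else int(value))
                for key, value in m.groupdict().items()
            }
    return per_client, skipped, recovery
```

      Calling summarize(open("console.log")) on a saved console log (the file name is only an example) returns the three values. In the excerpt above, every visible error names the same client, 172.20.3.120@o2ib500, while 9 clients were evicted; the skipped messages do not identify their clients, so the excerpt by itself does not show whether the two are related.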

            People

              Assignee: bzzz Alex Zhuravlev
              Reporter: prakash Prakash Surya (Inactive)