  1. Lustre
  2. LU-14046

lov tgt 0 not cleaned! deathrow=0, lovrc=1

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • Lustre 2.13.0, Lustre 2.12.5
    • None
    • 2.12.5 servers + 2.13 clients, CentOS 7.6

    Description

      Today we upgraded the Oak servers from 2.10.8 to 2.12.5, and we now have ~50 clients (2.13) out of ~1,500 that cannot mount Oak at all after a reboot. Example with client 10.50.0.63@o2ib2:

      Oct 19 13:31:26 sh02-ln03.stanford.edu kernel: LustreError: 94181:0:(lov_obd.c:828:lov_cleanup()) oak-clilov-ffffa0d562f8a800: lov tgt 0 not cleaned! deathrow=0, lovrc=1
      Oct 19 13:31:26 sh02-ln03.stanford.edu kernel: LustreError: 94181:0:(lov_obd.c:828:lov_cleanup()) Skipped 291 previous similar messages
      Oct 19 13:31:27 sh02-ln03.stanford.edu kernel: Lustre: Unmounted oak-client
      Oct 19 13:31:27 sh02-ln03.stanford.edu kernel: LustreError: 94181:0:(obd_mount.c:1669:lustre_fill_super()) Unable to mount  (-5) 
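For triaging many affected clients, console lines like the ones above can be split into their parts and the return code decoded. The following is only a minimal sketch: the regular expressions are inferred from the sample lines in this report, and `parse_lustre_error` is a hypothetical helper name, not part of Lustre. On Linux, return code -5 corresponds to EIO.

```python
import errno
import os
import re

# Assumed layout, inferred from the lines above:
#   "LustreError: <pid>:0:(<file>:<line>:<func>()) <message>"
LUSTRE_ERR = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\) (?P<msg>.*)"
)
# A negative return code in parentheses, e.g. "(-5)"
RC = re.compile(r"\((-\d+)\)")


def parse_lustre_error(line):
    """Return a dict of fields from one LustreError line, or None if no match."""
    m = LUSTRE_ERR.search(line)
    if m is None:
        return None
    rec = m.groupdict()
    rc = RC.search(rec["msg"])
    if rc:
        code = -int(rc.group(1))
        # Map the numeric code to its symbolic errno name and description
        rec["errno"] = errno.errorcode.get(code, "unknown")
        rec["strerror"] = os.strerror(code)
    return rec


sample = (
    "Oct 19 13:31:27 sh02-ln03.stanford.edu kernel: LustreError: "
    "94181:0:(obd_mount.c:1669:lustre_fill_super()) Unable to mount  (-5) "
)
record = parse_lustre_error(sample)
print(record["func"], record["errno"], record["strerror"])
```

Lines without a parenthesized return code, such as the `lov_cleanup()` message, still parse but carry no `errno` key.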

      On the MGS side, I can see this:

      /tmp/dk:00010000:02000400:7.0:1603137393.190601:0:7903:0:(ldlm_lib.c:1151:target_handle_connect()) MGS: Received new LWP connection from 10.50.0.63@o2ib2, removing former export from same NID
      /tmp/dk:00010000:00080000:7.0:1603137393.190602:0:7903:0:(ldlm_lib.c:1227:target_handle_connect()) MGS: connection from f3832037-ce6f-4@10.50.0.63@o2ib2 t0 exp ffff88f2a4e59c00 cur 12765 last 1603137393 

      2.10 servers with 2.13 clients worked fine; the failure appears with 2.12 servers and 2.13 clients.

      Please advise. Is it the same as in LU-13719?

      Thanks!

      Stephane

            People

              Assignee: Hongchao Zhang (hongchao.zhang)
              Reporter: Stephane Thiell (sthiell)