  1. Lustre
  2. LU-14888

We are unsure whether we have hit the bug


Details

    • Question/Request
    • Resolution: Unresolved
    • Major
    • None
    • Lustre 2.12.3
    • None
    • Client: 2.12.2
      Server: 2.12.3, from HPE Lustre

    Description

      We are seeing an issue in which OSTs are frequently disconnected from the clients.

      Our Lustre servers are OSS1-4 and MDS1-2, running version 2.12.3 as provided by HPE.

      Our clients are running Lustre client 2.12.2.


      On OSS1, we noticed many disconnections and reconnections of Lustre clients from various OSTs as shown below.

      In particular, bulk IO read error was reported for the client at 192.168.3.182 (NFS2).

       

      NFS2 also came under very high load on the morning of 15 Jun, so we rebooted it.

      Since then it has been unable to mount any Lustre file system, with the error below in NFS2's dmesg

       

       

      During the investigation we collected sosreports from MDS1-2, OSS1-2, and one of the compute nodes reporting OST disconnections; they are available at the link below

      https://drive.google.com/open?id=1_tR7DiXCjzXWEd5ctPjPA5NFn_FweFvq

       

      -------------

      We suspect we have hit the bug below, LU-13719, and would like to confirm whether that is the case.

      https://jira.whamcloud.com/browse/LU-13719


          People

            pjones Peter Jones
            itsupport.cgs Hong Kong University
