  1. Lustre
  2. LU-9242

Applications are failing to complete due to connection loss with OSS servers

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Blocker
    • None
    • Lustre 2.10.0
    • None
    • Servers run Lustre 2.9.54 on RHEL7 with ldiskfs. Clients are Cray SLES11SP4, also running Lustre 2.9.54.
    • 3
    • 9223372036854775807

    Description

      While testing the latest master branch, I see jobs failing due to loss of communication with the OSS servers. I see a reconnect storm, but in the end the applications error out.

          People

            yujian Jian Yu
            simmonsja James A Simmons
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: