Lustre / LU-2719

Lustre slowdown, errors with IOR: INFO: task IOR: blocked for more than 120 seconds.

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.1.4
    • Labels: None
    • Severity: 3
    • Rank: 6612

    Description

      Performance has slowed in the test environment that they are using to benchmark HD video streaming. After an initial PoC before Christmas, they restarted their evaluation and are trying to re-establish baseline performance, but are seeing poor results compared with their original testing.

      The system was completely rebooted earlier in the week and is, so far, much improved. I am still waiting on confirmation that the numbers are in line with expectations, but I would like to make sure that we haven't missed anything.

      The problems may have been environment- or configuration-related (e.g. network instability or a configuration error). Since the servers were rebooted, the system appears to be more stable.

      Ideally, I am looking for consensus or confirmation that the issue is not systemic.

      Syslogs attached. They have been scrubbed a bit to remove some very verbose and extraneous informational entries from unrelated software.

      Attachments

        1. hd-client-excerpt.txt (28 kB)
        2. messages-client-00-trunc (150 kB)
        3. messages-client-01-trunc (128 kB)
        4. messages-client-02-trunc (125 kB)
        5. messages-hd-oss-00 (110 kB)
        6. messages-hd-oss-01 (110 kB)
        7. messages-mds-00-trunc (105 kB)

        Activity

          People

            wc-triage WC Triage
            malkolm Malcolm Cowe (Inactive)