Lustre / LU-7393

OSS hung with high load and blocked ll_{*} threads

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Blocker
    • lola
      build: 2.7.62-28-g0754bc8, 0754bc8f2623bea184111af216f7567608db35b6; soakbuild '20151104.1'

    Description

      The error occurred during soak testing of build '20151104.1' on cluster lola (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20151104.1). MDTs are formatted with ldiskfs and OSTs with zfs as the storage backend. DNE is enabled. MDSes are configured for HA failover.
      The OSS nodes were neither restarted nor failed over.

      Symptom:

      • OSS node (lola-3) shows high load due to a large number of blocked processes. No iowait, high disk load, or long queue and wait times can be seen
      • The list of blocked processes can be seen from the 'w' and 't' sysrq-triggers initiated at Nov 5 08:19:12 PST 2015 and 08:23:3 PST 2015 respectively (see attached messages file)
      • The problems most likely started at Nov 4, 18:50
        (see the attached messages file and debug log file lustre-log.1446691819.85273.bz2)
      • 220 additional debug log files have been written, which can be provided on demand
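
      The symptom above (high load without iowait) fits the way Linux counts tasks in uninterruptible sleep (state D) toward the load average. The following is a minimal Python sketch, not part of the original report, that counts D-state tasks by name on a live node. It assumes a Linux /proc filesystem; the function names task_state and blocked_tasks are illustrative, and it does not replace the stack traces that the sysrq 'w' and 't' dumps provide.

```python
from collections import Counter
from pathlib import Path
from typing import Tuple


def task_state(stat_line: str) -> Tuple[str, str]:
    """Return (comm, state) from the contents of a /proc/<pid>/stat file.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')'.
    """
    head, _, tail = stat_line.rpartition(")")
    comm = head.split("(", 1)[1]
    state = tail.split()[0]
    return comm, state


def blocked_tasks(proc_root: str = "/proc") -> Counter:
    """Count tasks in uninterruptible sleep (state 'D'), grouped by name."""
    counts: Counter = Counter()
    # Every thread, including kernel threads, has a task/<tid>/stat entry.
    for stat in Path(proc_root).glob("[0-9]*/task/[0-9]*/stat"):
        try:
            comm, state = task_state(stat.read_text())
        except (OSError, IndexError):
            continue  # task exited or entry unreadable while scanning
        if state == "D":
            counts[comm] += 1
    return counts


if __name__ == "__main__":
    loadavg = Path("/proc/loadavg")
    if loadavg.exists():
        print("loadavg:", loadavg.read_text().strip())
    # Show the 20 most frequent blocked task names.
    for name, n in blocked_tasks().most_common(20):
        print(f"{n:5d}  {name}")
```

      Run on the affected node, a large count for thread names beginning with ll_ alongside a high load average would match the behaviour described in this ticket.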

        Activity

          cliffw Cliff White (Inactive) added a comment - Issue was not reproduced

          heckes Frank Heckes (Inactive) added a comment - Joe, by accident I didn't attach the files mentioned in the description. I was convinced I had. After checking possible locations on the soak nodes and my laptop, I'm sure they're gone. I'm very sorry.

          jgmitter Joseph Gmitter (Inactive) added a comment - Frank, Are the debug logs available? Thanks. Joe

          People

            wc-triage WC Triage
            heckes Frank Heckes (Inactive)