sanity test_17m: mds reboot from OOM killer

Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • None
    • None
    • 3
    • 16136

    Description

      This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/a1d8fa06-5413-11e4-9c8a-5254006e85c2.

      The sub-test test_17m failed with the following error:

      test failed to respond and timed out
      

      While this looked at first like a random timeout, examination of the MDS logs shows a reboot due to the OOM killer running. It is not at all clear why the MDS is OOM. This mod should have no impact at all on the el6 test run in progress at the time of the failure.

      Info required for matching: sanity 17m

          Activity

            [LU-5745] sanity test_17m: mds reboot from OOM killer
            adilger Andreas Dilger made changes -
            Resolution New: Duplicate [ 3 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            adilger Andreas Dilger made changes -
            Link New: This issue duplicates LU-5077 [ LU-5077 ]
            bogl Bob Glossman (Inactive) made changes -
            Description Original: This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

            This issue relates to the following test suite run: [https://testing.hpdd.intel.com/test_sets/a1d8fa06-5413-11e4-9c8a-5254006e85c2].

            While this looked at first like a random timeout, examination of the MDS logs shows a reboot due to the OOM killer running. It is not at all clear why the MDS is OOM. This mod should have no impact at all on the el6 test run in progress at the time of the failure.

            The sub-test test_17m failed with the following error:
            {noformat}
            test failed to respond and timed out
            {noformat}

            Please provide additional information about the failure here.

            Info required for matching: sanity 17m
            New: This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

            This issue relates to the following test suite run: [https://testing.hpdd.intel.com/test_sets/a1d8fa06-5413-11e4-9c8a-5254006e85c2].

            The sub-test test_17m failed with the following error:
            {noformat}
            test failed to respond and timed out
            {noformat}

            While this looked at first like a random timeout, examination of the MDS logs shows a reboot due to the OOM killer running. It is not at all clear why the MDS is OOM. This mod should have no impact at all on the el6 test run in progress at the time of the failure.

            Info required for matching: sanity 17m
            maloo Maloo created issue -

            People

              wc-triage WC Triage
              maloo Maloo
              Votes: 0
              Watchers: 2
