Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.11.0
    • None
    • 3
    • 15940

    Description

      Sometimes the lc_watchdogd thread disappears without any message, and the Lustre logs are not dumped after the watchdog triggers.

      This is how the correct behaviour should look:

      LNet: Service thread pid 7096 was inactive for 10.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      Pid: 7096, comm: lctl
      
      Call Trace:
       [<ffffffff81528eb2>] schedule_timeout+0x192/0x2e0
       [<ffffffff81084220>] ? process_timeout+0x0/0x10
       [<ffffffffa0380df7>] proc_trigger_watchdog+0x67/0x80 [libcfs]
       [<ffffffff811fd8e7>] proc_sys_call_handler+0x97/0xd0
       [<ffffffff811fd934>] proc_sys_write+0x14/0x20
       [<ffffffff81188f68>] vfs_write+0xb8/0x1a0
       [<ffffffff81189861>] sys_write+0x51/0x90
       [<ffffffff8152b2be>] ? do_device_not_available+0xe/0x10
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      
      LustreError: dumping log to /tmp/lustre-log.1411548646.7096
      

      And this is how the kernel log may look when the Lustre logs are not dumped:

      Lustre: DEBUG MARKER: == sanity test 242: Check that watchdog causes kernel log dump == 09:19:38 (1411550378)
      LNet: Service thread pid 12742 stopped after 20.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
      Lustre: DEBUG MARKER: sanity test_242: @@@@@@ FAIL: Lustre log wasn't dumped
      Lustre: DEBUG MARKER: == sanity test complete, duration 29 sec == 09:20:01 (1411550401)
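
The difference between the two excerpts above can be checked mechanically. The following Python sketch is only an illustration: the function name `summarize_watchdog` and the regular expressions are hypothetical, derived solely from the log lines quoted in this ticket, and are not part of Lustre or its test suite. It also assumes that the numeric suffix of the dump file name is the pid of the reported thread, as in the example above.

```python
import re

# Patterns copied from the log excerpts in this ticket (assumption: the
# wording may differ in other Lustre versions).
INACTIVE_RE = re.compile(r"Service thread pid (\d+) was inactive for")
STOPPED_RE = re.compile(r"Service thread pid (\d+) stopped after")
# Assumption: the dump path ends in ".<pid>", as in the example above.
DUMP_RE = re.compile(r"dumping log to (\S*\.(\d+))")


def summarize_watchdog(log_text):
    """Collect watchdog events and log dumps found in kernel log text."""
    summary = {"inactive": [], "stopped": [], "dumps": {}}
    for line in log_text.splitlines():
        m = INACTIVE_RE.search(line)
        if m:
            summary["inactive"].append(int(m.group(1)))
            continue
        m = STOPPED_RE.search(line)
        if m:
            summary["stopped"].append(int(m.group(1)))
            continue
        m = DUMP_RE.search(line)
        if m:
            # Key the dump path by the pid taken from the file name suffix.
            summary["dumps"][int(m.group(2))] = m.group(1)
    return summary


def log_was_dumped(log_text):
    """True if at least one thread reported inactive also has a dump line."""
    summary = summarize_watchdog(log_text)
    return any(pid in summary["dumps"] for pid in summary["inactive"])
```

For the first excerpt, `log_was_dumped` would return true, because pid 7096 is reported inactive and a matching dump line follows. For the second excerpt it would return false: only the "stopped after" message is present, with neither the "was inactive" warning nor a dump line.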
      

          Activity

            [LU-5695] watchdog dispatch thread disappears
            pjones Peter Jones added a comment - Landed for 2.11

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/12155/
            Subject: LU-5695 libcfs: watchdog dispatch thread fix
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 1947bc08c0709ad80611dc65785ccb8dbf7f7214

            zam Alexander Zarochentsev added a comment - Yes, let's go with the one-line fix.

            simmonsja James A Simmons added a comment - Since Alex is okay with a one-line fix, I refreshed the patch. It is very simple and should be landed soon.

            simmonsja James A Simmons added a comment - Is this fixed?

            cliffw Cliff White (Inactive) added a comment - Old issue, already fixed

            zam Alexander Zarochentsev added a comment - The procfs (or sysfs) part of the patch is only for testing; I think at least the actual fix from the patch can be landed.

            cliffw Cliff White (Inactive) added a comment - Still waiting for a patch

            simmonsja James A Simmons added a comment - Please reopen. I plan to update the patch but I was waiting until the port to sysfs happens for Lustre 2.10.

            cliffw Cliff White (Inactive) added a comment - Bug out of date, no patch update. Closing

            People

              simmonsja James A Simmons
              zam Alexander Zarochentsev