ptlrpc_pinger_main is stuck in endless loop

    Description

      In ptlrpc_pinger_main, processing the pingable imports or calling
      obd_update_maxusage can take a long time. The thread can then get stuck in an
      endless loop, because pinger_check_timeout returns a negative timeout.
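
      The failure mode described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code from the merged patch: the helper names sketch_time_to_next_wake and sketch_clamp_delay, the use of whole seconds, and the choice of a minimum delay are all hypothetical. The first helper stands in for a timeout calculation that goes negative once a pass overruns its interval; the second shows one possible guard, clamping the delay before it is used to sleep.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a pinger timeout calculation: seconds left
 * until the next scheduled wakeup. The result is negative when the pass
 * that started at pass_start took longer than interval to finish.
 */
static int64_t sketch_time_to_next_wake(int64_t pass_start, int64_t interval,
					int64_t now)
{
	return pass_start + interval - now;
}

/*
 * Hypothetical guard (an assumption, not the actual fix): never pass a
 * delay below min_delay to the sleep step, so that an overrun pass
 * cannot turn the main loop into a busy loop.
 */
static int64_t sketch_clamp_delay(int64_t delay, int64_t min_delay)
{
	assert(min_delay >= 0);
	return delay < min_delay ? min_delay : delay;
}
```

      With a pass that starts at t=100 and an interval of 25 seconds, finishing at t=110 leaves 15 seconds to sleep, while finishing at t=200 yields -75. A loop that only sleeps for positive delays would then run again immediately, which is the busy loop reported in this ticket.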

          Activity

            [LU-13667] ptlrpc_pinger_main is stuck in endless loop

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39344/
            Subject: LU-13667 ptlrpc: fix endless loop issue
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 95cd26446e16c63b531ed94a844b5f69c8b3730f

            Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/39344
            Subject: LU-13667 ptlrpc: fix endless loop issue
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 5acd6853b4a64057ce55174a15a93b11d2922eab

            pjones Peter Jones added a comment -

            Landed for 2.14

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38915/
            Subject: LU-13667 ptlrpc: fix endless loop issue
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 6be2dbb2595121fabceda86c5f7bdcb45e10b320

            ofaaland Olaf Faaland added a comment -

            Thank you. This should go into b2_12 after it's merged to master.

            hongchao.zhang Hongchao Zhang added a comment -

            Yes, it should be the same issue as this ticket.

            ofaaland Olaf Faaland added a comment -

            Please let me know whether you agree that the symptoms I described match this issue, thanks.

            ofaaland Olaf Faaland added a comment -

            I believe we hit this today on a Lustre 2.12.4 client. The pinger was taking 100% of a core. Over 498 seconds, the "next wakeup in" message appeared in the debug log 76,716 times, and the time_to_next_wake started at -41,852 and ended at -42,350 (getting more and more negative with time).
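
            The figures in the report above can be checked with a little arithmetic. The sketch below is illustrative only. The helper names are hypothetical, and reading time_to_next_wake as a value in seconds is an assumption based on the numbers, not something the ticket states.

```c
#include <assert.h>

/* Average number of log messages per second over the observed window. */
static double sketch_message_rate(long messages, long seconds)
{
	assert(seconds > 0);
	return (double)messages / (double)seconds;
}

/* Change in the reported time_to_next_wake across the observed window. */
static long sketch_wake_drift(long first_value, long last_value)
{
	return last_value - first_value;
}
```

            With the reported values, sketch_message_rate(76716, 498) is about 154 loop iterations per second, and sketch_wake_drift(-41852, -42350) is -498. The drift equals the length of the 498-second window, so the value fell by one unit per elapsed second, which is consistent with a fixed deadline receding into the past while the loop never sleeps.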

            Hongchao Zhang (hongchao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38915
            Subject: LU-13667 ptlrpc: fix endless loop issue
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 9ccad707e88b3b2bc118eac588817737cb9da1c9

            People

              hongchao.zhang Hongchao Zhang
              hongchao.zhang Hongchao Zhang
              Votes: 0
              Watchers: 6