Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-14810

sanity-lnet test_212: lnet_assert_handler_unused() ASSERTION(md->md_handler != handler) failed

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.17.0
    • Lustre 2.16.0
    • None
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/84ed5e32-bda8-4880-a254-dbd96cb6a478

      test_212 failed with the following error:

      trevis-79vm3 crashed during sanity-lnet test_212
      
      LNetError: 944562:0:(lib-md.c:288:lnet_assert_handler_unused()) ASSERTION( md->md_handler != handler ) failed: 
      Pid: 944562, comm: lnet_discovery 4.18.0-240.22.1.el8_3.aarch64 #1 SMP Thu Apr 8 19:01:45 UTC 2021
      Call Trace:
       libcfs_call_trace+0xb8/0x118 [libcfs]
       lbug_with_loc+0x60/0xa0 [libcfs]
       lnet_assert_handler_unused+0xb8/0xe0 [lnet]
       lnet_peer_discovery+0x16c4/0x1cb0 [lnet]
       kthread+0x130/0x138
      

      This test was recently landed via patch https://review.whamcloud.com/43418 "LU-14627 lnet: Ensure ref taken when queueing for discovery"

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity-lnet test_212 - trevis-79vm3 crashed during sanity-lnet test_212

      Attachments

        Issue Links

          Activity

            [LU-14810] sanity-lnet test_212: lnet_assert_handler_unused() ASSERTION(md->md_handler != handler) failed
            adilger Andreas Dilger added a comment - This is still being hit on master, despite patch 54884 landing as commit v2_15_63-8-g3ba393a5cb: https://testing.whamcloud.com/test_sets/979b507b-225a-407d-bc2c-e02925ae9d39 https://testing.whamcloud.com/test_sets/3805363b-bb13-4133-a2cf-41b6bcff7154 https://testing.whamcloud.com/test_sets/7d5b948d-8245-476f-abd1-9fbe8d1cb752 https://testing.whamcloud.com/test_sets/bb60d168-511c-4861-ac7d-5f14389d836e
            pjones Peter Jones added a comment -

            Merged for 2.16

            pjones Peter Jones added a comment - Merged for 2.16

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/54884/
            Subject: LU-14810 lnet: ongoing push when discovery is stopped
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 3ba393a5cb21ff0f8bd8a09c341ee01e936321c7

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/54884/ Subject: LU-14810 lnet: ongoing push when discovery is stopped Project: fs/lustre-release Branch: master Current Patch Set: Commit: 3ba393a5cb21ff0f8bd8a09c341ee01e936321c7

            "Cyril Bordage <cbordage@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/54884
            Subject: LU-14810 lnet: ongoing push when discovery is stopped
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 7c25fd3bc2c4e0cb241688c36acfa4d82cc52078

            gerrit Gerrit Updater added a comment - "Cyril Bordage <cbordage@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/54884 Subject: LU-14810 lnet: ongoing push when discovery is stopped Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 7c25fd3bc2c4e0cb241688c36acfa4d82cc52078

            Any progress on this? It still seems to be crashing on a regular basis.

            adilger Andreas Dilger added a comment - Any progress on this? It still seems to be crashing on a regular basis.
            adilger Andreas Dilger added a comment - - edited 6x crashes in the past week: https://testing.whamcloud.com/search?status%5B%5D=FAIL&status%5B%5D=TIMEOUT&status%5B%5D=CRASH&test_set_script_id=a2b1c4b2-b449-11e9-b88c-52540065bddc&sub_test_script_id=058541e7-0f00-4f82-a6e5-8316ea99a160&start_date=2024-04-12&end_date=2024-04-13&source=sub_tests#redirect
            adilger Andreas Dilger added a comment - +1 on latest master in sanity-let test_205: https://testing.whamcloud.com/test_sets/3cf7e1b8-0fa3-43e0-a2af-06a1c4a538eb
            adilger Andreas Dilger added a comment - Still seeing several crashes per week on master with 2.15.60.58, when patch 53356 was landed as 2.15.59.81: https://testing.whamcloud.com/search?horizon=518400&status%5B%5D=TIMEOUT&status%5B%5D=CRASH&test_set_script_id=a2b1c4b2-b449-11e9-b88c-52540065bddc&sub_test_script_id=058541e7-0f00-4f82-a6e5-8316ea99a160&source=sub_tests#redirect
            adilger Andreas Dilger added a comment - +1 on master: https://testing.whamcloud.com/sub_tests/121f699a-8b7c-40dc-be4e-99f534b874b4
            hornc Chris Horn added a comment -

            The patch that landed only fixes one of the two bugs that result in tripping the assert.

            Reopening.

            hornc Chris Horn added a comment - The patch that landed only fixes one of the two bugs that result in tripping the assert. Reopening.
            pjones Peter Jones added a comment -

            Landed for 2.16

            pjones Peter Jones added a comment - Landed for 2.16

            People

              cbordage Cyril Bordage
              maloo Maloo
              Votes:
              0 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: