Lustre / LU-17962

conf-sanity test_32a: failed with replace_nids operation already in progress

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.17.0
    • Affects Version/s: Lustre 2.16.0, Lustre 2.15.6
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/4c9dfb66-c4ee-4ceb-9fd7-436f4fc46eb8

      test_32a failed with the following error:

      CMD: trevis-128vm8 mount -t lustre -o nosvc t32fs-mdt1/mdt1 /tmp/t32/mnt/mdt
      CMD: trevis-128vm8 /usr/sbin/lctl replace_nids t32fs-OST0000 10.240.45.26@tcp
      trevis-128vm8: error: replace_nids: Operation now in progress
      pdsh@trevis-128vm1: trevis-128vm8: ssh exited with exit code 115
      CMD: trevis-128vm8 /usr/sbin/lctl dl
        0 UP osd-zfs t32fs-MDT0000-osd t32fs-MDT0000-osd_UUID 5
        1 UP mgs MGS MGS 7
        2 UP mgc MGC10.240.45.31@tcp 524e7669-b108-4f05-8270-6dd5e88a654d 5
       conf-sanity test_32a: @@@@@@ FAIL: replace_nids t32fs-OST0000 10.240.45.26@tcp failed 
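
The ssh exit code 115 in the log above appears to be the errno value behind the "Operation now in progress" message. A minimal sketch, assuming it runs on a Linux host (errno numbering is platform-specific, and the variable names are chosen for this illustration only):

```python
import errno
import os

# Exit status reported by pdsh/ssh in the log above.
exit_code = 115

# errno.errorcode maps a numeric errno to its symbolic name on this platform.
name = errno.errorcode.get(exit_code, "unknown")

# os.strerror returns the platform's message text for that errno.
message = os.strerror(exit_code)

print(f"{exit_code}: {name} ({message})")
```

On Linux the symbolic name is EINPROGRESS, which is consistent with the text printed by lctl replace_nids.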
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-reviews/100649 - 4.18.0-477.27.1.el8_8.x86_64
      servers: https://build.whamcloud.com/job/lustre-reviews/100649 - 4.18.0-477.27.1.el8_lustre.x86_64

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      conf-sanity test_32a - replace_nids t32fs-OST0000 10.240.45.26@tcp failed

          Activity

            pjones Peter Jones added a comment -

            Merged for 2.17


            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/56709/
            Subject: LU-17962 mgc: free nidlist correctly
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 6ddf46420826cc66263599ba430c5144eabf766e

            yujian Jian Yu added a comment -

            Lustre 2.16.0 RC5 client with 2.15.5 server: https://testing.whamcloud.com/test_sets/5bf51531-c7ba-462d-aecb-d01083f98aba

            emoly.liu Emoly Liu added a comment -

            "Emoly Liu <emoly@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/56709
            Subject: LU-17962 mgc: free nidlist correctly
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 3
            Commit: 79515b97a31537505b914871b811a4e3cfc1ec1e

            emoly.liu Emoly Liu added a comment -

            leak_finder.pl found the following leak:

            *** Leak: 20 bytes allocated at 00000000718f9558 (mgc_request.c:mgc_apply_recover_logs:1285:(nidlist), debug file line 7005)
            

            I will fix it soon.
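
The fields of the leak_finder.pl line quoted above (size, address, allocation site, variable name) can be pulled apart for triage. A minimal sketch, assuming the line format is exactly the one shown in this comment; the regular expression, the field labels, and the parse_leak helper are hypothetical and not part of leak_finder.pl:

```python
import re

# Pattern written against the single leak line quoted in this issue.
# The group names are labels chosen for this sketch.
LEAK_RE = re.compile(
    r"\*\*\* Leak: (?P<size>\d+) bytes allocated at (?P<addr>[0-9a-fA-F]+) "
    r"\((?P<file>[^:]+):(?P<func>[^:]+):(?P<line>\d+):\((?P<var>[^)]+)\), "
    r"debug file line (?P<debug_line>\d+)\)"
)


def parse_leak(text):
    """Return the leak fields as a dict, or None if the line does not match."""
    match = LEAK_RE.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so they can be summed or compared.
    for key in ("size", "line", "debug_line"):
        fields[key] = int(fields[key])
    return fields


sample = (
    "*** Leak: 20 bytes allocated at 00000000718f9558 "
    "(mgc_request.c:mgc_apply_recover_logs:1285:(nidlist), debug file line 7005)"
)
leak = parse_leak(sample)
print(leak)
```

For the sample line, the parsed size is 20 bytes, the same figure that obdclass_exit() reports as leaked in the console log quoted in this issue.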


            gerrit Gerrit Updater added a comment -

            "Emoly Liu <emoly@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/56709
            Subject: LU-17962 tests: debug conf-sanity.sh test_29 failure
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 1cc5e26a883b4561d639e4bbf3ae6703f802f304

            yujian Jian Yu added a comment - - edited

            The memory leak failure occurred consistently in the following interop test sessions (2.16.0 clients with 2.15.5 servers):
            lustre-reviews_el8.10-x86_64_full-dne-part-3
            lustre-reviews_el8.10-x86_64_el9.4-x86_64_full-dne-part-3
            lustre-reviews_el8.10-x86_64_sles15sp6-x86_64_full-dne-part-3
            lustre-reviews_el8.10-x86_64_ubuntu2404-x86_64_full-dne-part-3

            conf-sanity tests 29, 46a, 50h, 51, 70e, and 93 failed with this issue.

            yujian Jian Yu added a comment -

            Test session details:
            clients: https://build.whamcloud.com/job/lustre-master/4581 - 4.18.0-553.16.1.el8_10.x86_64
            servers: https://build.whamcloud.com/job/lustre-b2_15/94 - 4.18.0-553.5.1.el8_lustre.x86_64
            https://testing.whamcloud.com/test_sets/31e964e5-1404-4a3c-b868-30ad5dd3fcc6
            conf-sanity tests 29, 50h, 51, 70e, and 93 failed with this issue:

            [21909.029507] LustreError: 385894:0:(class_obd.c:895:obdclass_exit()) obd_memory max: 6606559, leaked: 20

            People

              Assignee: emoly.liu Emoly Liu
              Reporter: maloo Maloo