sanity-sec test_9: ASSERTION( config->nmc_default_nodemap )

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.9.0
    • None
    • None
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/abdb13dc-a627-11e6-964e-5254006e85c2.

      The sub-test test_9 failed with the following error:

      trevis-34vm4:LBUG/LASSERT detected
      
      02:04:11:[15785.994958] Lustre: DEBUG MARKER: == sanity-sec test 9: nodemap range add ============================================================== 02:02:49 (1478656969)
      02:04:11:[15792.826885] Lustre: 10421:0:(nodemap_handler.c:1020:nodemap_create()) adding nodemap '27295_7' to config without default nodemap
      02:04:11:[15792.830823] Lustre: 10421:0:(nodemap_handler.c:1020:nodemap_create()) Skipped 3 previous similar messages
      02:04:11:[15800.705743] Lustre: 10421:0:(mgc_request.c:1756:mgc_process_recover_nodemap_log()) MGC10.9.5.176@tcp: error processing nodemap log nodemap: rc = -2
      02:04:11:[15800.709914] LustreError: 10421:0:(nodemap_handler.c:1428:nodemap_config_set_active()) ASSERTION( config->nmc_default_nodemap ) failed: 
      02:04:11:[15800.714076] LustreError: 10421:0:(nodemap_handler.c:1428:nodemap_config_set_active()) LBUG
      02:04:11:[15800.716317] Pid: 10421, comm: ll_cfg_requeue
      02:04:11:[15800.718308] 
      02:04:11:[15800.718308] Call Trace:
      02:04:11:[15800.721741]  [<ffffffffa09387d3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
      02:04:11:[15800.723818]  [<ffffffffa0938d75>] lbug_with_loc+0x45/0xc0 [libcfs]
      02:04:11:[15800.725837]  [<ffffffffa0d34a17>] nodemap_config_set_active+0x2a7/0x2e0 [ptlrpc]
      02:04:11:[15800.727873]  [<ffffffffa0d3d908>] nodemap_config_set_active_mgc+0x38/0x1e0 [ptlrpc]
      02:04:11:[15800.729985]  [<ffffffffa0ca28f0>] ? ptlrpc_request_cache_free+0x90/0x1d0 [ptlrpc]
      02:04:11:[15800.732071]  [<ffffffffa0ca35d5>] ? __ptlrpc_req_finished+0x475/0x690 [ptlrpc]
      02:04:11:[15800.734162]  [<ffffffffa0c43e6b>] mgc_process_recover_nodemap_log+0x34b/0xe10 [mgc]
      02:04:11:[15800.736195]  [<ffffffffa0c46894>] mgc_process_log+0x754/0x880 [mgc]
      02:04:11:[15800.738132]  [<ffffffff816399cd>] ? schedule_timeout+0x17d/0x2d0
      02:04:11:[15800.740126]  [<ffffffffa09439d7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      02:04:11:[15800.742013]  [<ffffffffa0c48908>] mgc_requeue_thread+0x2b8/0x880 [mgc]
      02:04:11:[15800.744113]  [<ffffffff810b8940>] ? default_wake_function+0x0/0x20
      02:04:11:[15800.746313]  [<ffffffffa0c48650>] ? mgc_requeue_thread+0x0/0x880 [mgc]
      02:04:11:[15800.748437]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
      02:04:11:[15800.750331]  [<ffffffff810a5ac0>] ? kthread+0x0/0xe0
      02:04:11:[15800.752203]  [<ffffffff81646c98>] ret_from_fork+0x58/0x90
      02:04:11:[15800.754097]  [<ffffffff810a5ac0>] ? kthread+0x0/0xe0
      02:04:11:[15800.755897] 
      

      Please provide additional information about the failure here.

      Info required for matching: sanity-sec 9

          Activity

            [LU-8824] sanity-sec test_9: ASSERTION( config->nmc_default_nodemap )
            pjones Peter Jones added a comment -

            Landed for 2.9

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/23849/
            Subject: LU-8824 nodemap: load nodemap definitions first
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 89ce9d5b125762f39339916f14c01242107739ed

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/23778/
            Subject: LU-8824 nodemap: properly handle errors loading nodemap conf
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 9be888d56caf73184f72a4ad782196d255331ee2

            gerrit Gerrit Updater added a comment -

            Kit Westneat (kit.westneat@gmail.com) uploaded a new patch: http://review.whamcloud.com/23849
            Subject: LU-8824 nodemap: load nodemap definitions first
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 7d5800455161e0d2fca47a1754b7fc734d4a2999

            pjones Peter Jones added a comment -

            Thanks, Kit! This is encouraging news.

            gerrit Gerrit Updater added a comment -

            Kit Westneat (kit.westneat@gmail.com) uploaded a new patch: http://review.whamcloud.com/23778
            Subject: LU-8824 nodemap: properly handle errors loading nodemap conf
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 0ae8e3db5cd16acc4f3bde47a896b05a01383c9b

            kit.westneat Kit Westneat (Inactive) added a comment -

            Hi Peter,

            I can get a patch up for the error handling tonight or tomorrow. Fixing the config loading and unloading will take a bit longer, but I'll try to get a patch up by the end of the week.

            - Kit

            pjones Peter Jones added a comment -

            Kit

            This is indeed good news. How are things progressing on the changes needed for the error handling?

            Peter

            utopiabound Nathaniel Clark added a comment -

            Kit,

            Awesome find.

            EXCEPTing test_9 just delays the ASSERTION to test_15:
            https://testing.hpdd.intel.com/sub_tests/aaedadbe-a888-11e6-b6bd-5254006e85c2

            I think getting a real fix is necessary for sanity-sec to pass with ZFS.

            kit.westneat Kit Westneat (Inactive) added a comment -

            I think I've figured out what's going on. The config load code expects the index file to return the key/values in key-sorted order, which the ldiskfs index files do. The ZFS index files, however, appear to return the keys in hash-sorted order, at least according to this comment:

                /*
                 * XXX: implement support for fixed-size keys sorted with natural
                 * numerical way (not using internal hash value)
                 */

            We currently embed the config record type in the key so that create records are processed before update records, so not receiving the records in key order breaks this.

            I'm going to investigate how easy it would be to modify the config load/send operation to do a two-pass load, where the create records are loaded first and the other records are loaded afterwards.

            People

              kit.westneat Kit Westneat (Inactive)
              maloo Maloo