  1. Lustre
  2. LU-18167

sanity-sec test_16: nodemap_classify_nid()) ASSERTION( nodemap != ((void *)0) ) failed

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.16.0
    • 3

    Description

      This issue was created by maloo for Chris Horn <chris.horn@hpe.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/4d132b73-7874-4421-ada0-09883de4ed46

      test_16 failed with the following error:

      trevis-63vm6 crashed during sanity-sec test_16
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-reviews/106957 - 4.18.0-513.24.1.el8_9.x86_64
      servers: https://build.whamcloud.com/job/lustre-reviews/106957 - 4.18.0-513.24.1.el8_lustre.x86_64

      [ 1753.046816] LustreError: 164418:0:(nodemap_handler.c:288:nodemap_classify_nid()) ASSERTION( nodemap != ((void *)0) ) failed: 
      [ 1753.048973] LustreError: 164418:0:(nodemap_handler.c:288:nodemap_classify_nid()) LBUG
      [ 1753.050461] CPU: 1 PID: 164418 Comm: lctl Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-513.24.1.el8_lustre.x86_64 #1
      [ 1753.052659] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [ 1753.053732] Call Trace:
      [ 1753.054285]  dump_stack+0x41/0x60
      [ 1753.055017]  lbug_with_loc.cold.8+0x5/0x58 [libcfs]
      [ 1753.056006]  nodemap_classify_nid+0x1fd/0x290 [ptlrpc]
      [ 1753.057516]  ? kmem_cache_alloc_trace+0x142/0x280
      [ 1753.058429]  nm_member_reclassify_nodemap+0x7e/0x2b0 [ptlrpc]
      [ 1753.059626]  nodemap_add_range_helper+0x242/0x2b0 [ptlrpc]
      [ 1753.060751]  nodemap_add_range+0x79/0xd0 [ptlrpc]
      [ 1753.061738]  cfg_nodemap_cmd.constprop.14+0x6cc/0x900 [ptlrpc]
      [ 1753.062917]  ? _cond_resched+0x15/0x30
      [ 1753.063664]  server_iocontrol_nodemap+0xa2e/0xeb0 [ptlrpc]
      [ 1753.064791]  ? mgs_key_init+0x3e/0x130 [mgs]
      [ 1753.065685]  ? keys_fill+0xc8/0x120 [obdclass]
      [ 1753.066799]  ? lu_context_init+0xa8/0x1b0 [obdclass]
      [ 1753.067796]  mgs_iocontrol+0x827/0x10c0 [mgs]
      [ 1753.068662]  class_handle_ioctl+0xea8/0x1ce0 [obdclass]
      [ 1753.069713]  ? bpf_lsm_capset+0x10/0x10
      [ 1753.070487]  ? security_capable+0x38/0x60
      [ 1753.071276]  obd_class_ioctl+0x13b/0x190 [obdclass]
      [ 1753.072255]  do_vfs_ioctl+0xa4/0x690
      [ 1753.072997]  ? syscall_trace_enter+0x1ff/0x2d0
      [ 1753.073877]  ksys_ioctl+0x64/0xa0
      [ 1753.074541]  __x64_sys_ioctl+0x16/0x20
      [ 1753.075271]  do_syscall_64+0x5b/0x1b0
      [ 1753.076001]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
      

      The issue is a bug in range_search() that shows up when large NIDs are used:

      struct lu_nid_range *range_search(struct nodemap_config *config,
                                        struct lnet_nid *nid)
      {
              struct lu_nid_range *range = NULL;
      ...
                      list_for_each_entry_safe(range, range_temp,
                                               &config->nmc_netmask_setup,
                                               rn_collect) {
                              if (nid_same(&range->rn_start, nid))
                                      break;
                              range = NULL;
                      }
              }
      
              return range;
      }
      
      #define list_for_each_entry_safe(pos, n, head, member)                  \
              for (pos = list_first_entry(head, typeof(*pos), member),        \
                      n = list_next_entry(pos, member);                       \
                   !list_entry_is_head(pos, head, member);                    \
                   pos = n, n = list_next_entry(n, member))
      

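      When no entry matches, range does not stay NULL: after the loop body executes range = NULL, the macro's update step (pos = n) immediately overwrites it, and when the loop exits range points at the list head reinterpreted as an entry. The range = NULL assignment is therefore a no-op, and range_search() returns a bogus non-NULL pointer instead of NULL.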
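
      The behaviour described above can be reproduced outside the kernel. The following is a minimal userspace sketch, not the Lustre code or the actual fix: the list helpers are simplified stand-ins for the kernel macros (assuming GCC or Clang for __typeof__), and struct range_entry, search_buggy(), search_fixed() and run_demo() are hypothetical names used only for illustration. search_buggy() mirrors the loop pattern quoted above; search_fixed() shows one common way to avoid the problem, by tracking the match in a separate variable.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel list helpers
 * (assumption: GCC/Clang, for __typeof__). */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)
#define list_next_entry(pos, member) \
	container_of((pos)->member.next, __typeof__(*(pos)), member)
#define list_entry_is_head(pos, head, member) (&(pos)->member == (head))
#define list_for_each_entry_safe(pos, n, head, member)			\
	for (pos = list_first_entry(head, __typeof__(*pos), member),	\
		n = list_next_entry(pos, member);			\
	     !list_entry_is_head(pos, head, member);			\
	     pos = n, n = list_next_entry(n, member))

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *item, struct list_head *h)
{
	item->prev = h->prev;
	item->next = h;
	h->prev->next = item;
	h->prev = item;
}

/* Hypothetical stand-in for struct lu_nid_range. */
struct range_entry {
	int start;
	struct list_head link;
};

/* Mirrors the reported pattern: the cursor doubles as the result. */
static struct range_entry *search_buggy(struct list_head *head, int start)
{
	struct range_entry *range = NULL, *tmp;

	list_for_each_entry_safe(range, tmp, head, link) {
		if (range->start == start)
			break;
		range = NULL;	/* overwritten by "pos = n" in the loop step */
	}
	return range;		/* on a miss: the head cast to an entry */
}

/* One possible remedy: keep the result separate from the cursor. */
static struct range_entry *search_fixed(struct list_head *head, int start)
{
	struct range_entry *range, *tmp, *found = NULL;

	list_for_each_entry_safe(range, tmp, head, link) {
		if (range->start == start) {
			found = range;
			break;
		}
	}
	return found;
}

/* Builds a two-entry list and checks both variants on a hit and a miss. */
void run_demo(void)
{
	struct list_head head;
	struct range_entry a = { .start = 1 }, b = { .start = 2 };

	list_init(&head);
	list_add_tail(&a.link, &head);
	list_add_tail(&b.link, &head);

	assert(search_buggy(&head, 2) == &b);
	assert(search_buggy(&head, 99) != NULL);	/* bogus pointer */
	assert(search_fixed(&head, 2) == &b);
	assert(search_fixed(&head, 99) == NULL);
}
```

      Calling run_demo() from a main() function exercises both variants. On a miss, the pointer returned by search_buggy() does not refer to a real entry, so a caller that dereferences it reads unrelated memory.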

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity-sec test_16 - trevis-63vm6 crashed during sanity-sec test_16

            People

              hornc Chris Horn
              maloo Maloo