Lustre / LU-20136

BUG: unable to handle kernel NULL pointer dereference at 0000000000000029


    • Type: Bug
    • Resolution: Fixed
    • Priority: Medium
    • Lustre 2.18.0

      cfs_cpt_set_node() returns 1 on success (boolean), but its return
      value was left in rc after the for_each_node_mask loop that builds
      the IO CPT table when cpu_npartitions=1.
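
The pattern described above can be reduced to a small user-space sketch. The names `mock_cpt_set_node` and `build_io_cptable_buggy` are hypothetical stand-ins, not the Lustre functions, and the loop is a plain counter rather than for_each_node_mask. The only thing taken from the description is that the helper returns 1 on success and that its result is left in rc.

```c
#include <errno.h>

/* Hypothetical stand-in for a boolean-style helper: 1 on success, 0 on failure. */
static int mock_cpt_set_node(int node)
{
	return node >= 0;
}

/*
 * Buggy shape: rc is reused for the boolean result, so after a fully
 * successful loop it still holds 1 instead of 0.
 */
static int build_io_cptable_buggy(int nr_nodes)
{
	int rc = 0;
	int node;

	for (node = 0; node < nr_nodes; node++) {
		rc = mock_cpt_set_node(node);
		if (!rc)
			return -EINVAL;
	}

	/* Later setup is assumed to assign rc only on error. */
	return rc; /* stale 1: a caller testing rc != 0 sees a failure */
}
```

With at least one node, `build_io_cptable_buggy()` returns 1 even though nothing failed; with zero nodes the loop never runs and it returns 0, which is one way such a bug can stay hidden.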

      In mds_start_ptlrpc_service(), this stale rc=1 causes the function
      to treat success as failure: all just-registered services are stopped
      via mds_stop_ptlrpc_service(), and rc=1 propagates up through
      mds_device_alloc() as ERR_PTR(1) = (void *)0x1. Since IS_ERR()
      only matches the top MAX_ERRNO addresses (negative errno values),
      it does not catch this small positive value, and obd_setup()
      dereferences 0x1 as a struct lu_device pointer. The store to
      ld_obd at offset 0x28 then faults at 0x1 + 0x28 = 0x29:

      BUG: unable to handle kernel NULL pointer dereference at 0x29
      RIP: obd_setup+0x238/0x470 [obdclass]
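
The arithmetic behind the fault address can be checked with a user-space mimic of the kernel's ERR_PTR() and IS_ERR(). This is a sketch only: the `mock_` names are hypothetical, `MOCK_MAX_ERRNO` assumes the kernel's MAX_ERRNO value of 4095, and the 0x28 offset is taken from the description rather than from the struct lu_device definition.

```c
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_ERRNO 4095UL

/* Mimic of ERR_PTR(): encode an error code directly as a pointer value. */
static void *mock_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/*
 * Mimic of IS_ERR(): true only for the top MOCK_MAX_ERRNO addresses,
 * which is where negative errno values land when cast to a pointer.
 */
static int mock_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MOCK_MAX_ERRNO;
}

/* Address that an access to a field at `offset` from `base` would touch. */
static uintptr_t mock_field_addr(const void *base, size_t offset)
{
	return (uintptr_t)base + offset;
}
```

Under these assumptions `mock_is_err(mock_err_ptr(-EINVAL))` is true, `mock_is_err(mock_err_ptr(1))` is false, and `mock_field_addr(mock_err_ptr(1), 0x28)` is 0x29, matching RAX=1 and CR2=0x29 in the oops.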

      The same pattern exists in oss_device_init(). While the OSS success
      path explicitly returns 0 (masking the bug), reset rc there too to
      prevent future regressions.
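
A minimal sketch of the kind of reset described here, again with hypothetical names (`mock_set_node_ok`, `build_io_cptable_fixed`) rather than the actual Lustre code or the actual patch. The idea is simply that the boolean result is discarded once the loop has finished, so rc carries only 0 or a negative errno from that point on.

```c
#include <errno.h>

/* Hypothetical boolean-style helper: 1 on success, 0 on failure. */
static int mock_set_node_ok(int node)
{
	return node >= 0;
}

static int build_io_cptable_fixed(int nr_nodes)
{
	int rc = 0;
	int node;

	for (node = 0; node < nr_nodes; node++) {
		rc = mock_set_node_ok(node);
		if (!rc)
			return -EINVAL;
	}

	/* Drop the boolean success value so it cannot be mistaken for an error. */
	rc = 0;

	/* Later setup is assumed to assign rc only on error. */
	return rc;
}
```

An alternative with the same effect would be to store the helper's result in a separate local variable, so that rc never holds a boolean at all.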

      [337261.506854] BUG: unable to handle kernel NULL pointer dereference at 0000000000000029
      [337261.514776] PGD 130588067 P4D 130588067 PUD 14524e067 PMD 0
      [337261.520519] Oops: 0002 [#1] SMP NOPTI
      [337261.524271] CPU: 0 PID: 435181 Comm: lt-mount.lustre Kdump: loaded Tainted: G           OE     -------- -  - 4.18.0-553.36.1.el8_10.x86_64 #1
      [337261.537036] Hardware name: Viking Enterprise Solutions VSSEP1EA/VSSEP1EA, BIOS 10.09.02 10/26/2020
      [337261.546076] RIP: 0010:obd_setup+0x238/0x470 [obdclass]
      [337261.551464] Code: 85 c0 0f 84 81 00 00 00 4c 89 ea 48 89 ee 48 89 df e8 4c 2e b3 f2 49 89 c5 48 3d 00 f0 ff ff 0f 87 3d 01 00 00 49 89 44 24 10 <4c> 89 60 28 48 89 68 08 48 85 c0 0f 84 37 01 00 00 f6 45 00 02 0f
      [337261.570293] RSP: 0018:ffffab642795f878 EFLAGS: 00010203
      [337261.575598] RAX: 0000000000000001 RBX: ffffab642795f8a8 RCX: 00000000002a0009
      [337261.582815] RDX: 00000000002a000a RSI: 00000000002a0009 RDI: ffff987400004e00
      [337261.590026] RBP: ffffffffc1fb2da0 R08: 0000000000000001 R09: 0000000000000000
      [337261.597244] R10: ffff9874344e4480 R11: ffff987567092500 R12: ffff98756702d9c0
      [337261.604462] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
      [337261.611672] FS:  00007f40c573da80(0000) GS:ffff9892ede00000(0000) knlGS:0000000000000000
      [337261.619845] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [337261.625676] CR2: 0000000000000029 CR3: 0000000264c02000 CR4: 0000000000350ef0
      [337261.632895] Call Trace:
      [337261.635439]  ? __die_body+0x1a/0x60
      [337261.639020]  ? no_context+0x1ba/0x3f0
      [337261.642771]  ? libcfs_debug_msg+0x907/0xc00 [libcfs]
      [337261.647835]  ? __bad_area_nosemaphore+0x157/0x180
      [337261.652624]  ? do_page_fault+0x37/0x12d
      [337261.656542]  ? page_fault+0x1e/0x30
      [337261.660123]  ? obd_setup+0x238/0x470 [obdclass]
      [337261.664783]  ? obd_setup+0x224/0x470 [obdclass]
      [337261.669445]  class_setup+0x5b7/0x760 [obdclass]
      [337261.674109]  lustre_start_simple+0x397/0x6f0 [obdclass]
      [337261.679465]  server_start_targets+0x1360/0x2b20 [ptlrpc]
      [337261.685081]  ? cpumask_next_wrap+0x2d/0x80
      [337261.689264]  ? string+0x44/0x60
      [337261.692495]  ? vsnprintf+0x340/0x520
      [337261.696162]  ? snprintf+0x49/0x70
      [337261.699567]  ? libcfs_debug_msg+0x907/0xc00 [libcfs]
      [337261.704621]  ? lustre_start_mgc+0xd6d/0x1ed0 [obdclass]
      [337261.709975]  ? kfree+0xd3/0x250
      [337261.713207]  ? lustre_start_mgc+0xd6d/0x1ed0 [obdclass]
      [337261.718564]  server_fill_super+0xd11/0x11a0 [ptlrpc]
      [337261.723709]  ? obd_zombie_barrier+0x38/0xb0 [obdclass]
      [337261.728982]  ? ll_alloc_inode+0x110/0x110 [lustre]
      [337261.733943]  lustre_fill_super+0x2eb/0x400 [lustre]
      [337261.738936]  vfs_get_super+0x7f/0x110
      [337261.742688]  vfs_get_tree+0x25/0xc0
      [337261.746267]  do_mount+0x2e9/0x950
      [337261.749671]  ksys_mount+0xbe/0xe0
      [337261.753067]  __x64_sys_mount+0x21/0x30
      [337261.756900]  do_syscall_64+0x5b/0x1a0
      [337261.760652]  entry_SYSCALL_64_after_hwframe+0x66/0xcb
      

            hornc Chris Horn