Details
Type: Bug
Resolution: Fixed
Priority: Major
Description
Hit this in my testing today (recovery-small test 111):
<1>[231830.373084] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
<1>[231830.374591] IP: [<ffffffffa08966a1>] nm_member_reclassify_nodemap+0x71/0x130 [ptlrpc]
<4>[231830.376120] PGD 414cc067 PUD 2ba8f067 PMD 0
<4>[231830.376513] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
<4>[231830.376513] last sysfs file: /sys/devices/virtual/block/loop0/queue/scheduler
<4>[231830.376513] CPU 1
<4>[231830.376513] Modules linked in: lustre ofd osp lod ost mdt mdd mgs osd_ldiskfs ldiskfs lquota lfsck obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass ksocklnd lnet libcfs zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl zlib_deflate exportfs jbd sha512_generic sha256_generic ext4 jbd2 mbcache virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio cxgb3i libcxgbi ipv6 cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
<4>[231830.376513]
<4>[231830.376513] Pid: 31113, comm: mount.lustre Tainted: P -- ------------ 2.6.32-rhe6.7-debug #1 Bochs Bochs
<4>[231830.376513] RIP: 0010:[<ffffffffa08966a1>] [<ffffffffa08966a1>] nm_member_reclassify_nodemap+0x71/0x130 [ptlrpc]
<4>[231830.376513] RSP: 0018:ffff8800825cb628 EFLAGS: 00010286
<4>[231830.376513] RAX: 0000000000000000 RBX: ffff8800338337f0 RCX: 0000000000000000
<4>[231830.376513] RDX: 0000000000000000 RSI: 0000000000000071 RDI: 0000000000000246
<4>[231830.376513] RBP: ffff8800825cb678 R08: 0000000000000ed9 R09: ffff880000000000
<4>[231830.376513] R10: 0000000000000001 R11: 0000000087654321 R12: ffff880033833b58
<4>[231830.376513] R13: ffff88008e29d7f0 R14: ffff880095cf4ef0 R15: ffff880095cf4fa8
<4>[231830.376513] FS: 00007f35b8cbf7a0(0000) GS:ffff880006280000(0000) knlGS:0000000000000000
<4>[231830.376513] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>[231830.376513] CR2: 0000000000000018 CR3: 000000007d53f000 CR4: 00000000000006e0
<4>[231830.376513] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[231830.376513] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>[231830.376513] Process mount.lustre (pid: 31113, threadinfo ffff8800825c8000, task ffff880023028080)
<4>[231830.376513] Stack:
<4>[231830.376513] 0000000000000010 ffffffffa09173c0 ffff880095cf4f18 ffff880095cf4f60
<4>[231830.376513] <d> ffff8800292eae88 ffff880095cf4ef0 ffff880095cf4f18 ffff880095cf4f18
<4>[231830.376513] <d> ffff8800292eae88 ffff880095cf4ef0 ffff8800825cb698 ffffffffa0891efa
<4>[231830.376513] Call Trace:
<4>[231830.376513] [<ffffffffa0891efa>] nodemap_putref+0x7a/0x2f0 [ptlrpc]
<4>[231830.376513] [<ffffffffa08923aa>] nodemap_config_cleanup+0xda/0x120 [ptlrpc]
<4>[231830.376513] [<ffffffffa0892406>] nodemap_config_dealloc+0x16/0xf0 [ptlrpc]
<4>[231830.376513] [<ffffffffa089264e>] nodemap_config_set_active+0x14e/0x270 [ptlrpc]
<4>[231830.376513] [<ffffffffa0897b76>] nm_config_file_register+0x966/0xf50 [ptlrpc]
<4>[231830.376513] [<ffffffffa0c0892d>] ? iam_container_setup+0xad/0x110 [osd_ldiskfs]
<4>[231830.376513] [<ffffffffa0c20000>] ? osd_inode_iteration+0xa30/0xd80 [osd_ldiskfs]
<4>[231830.376513] [<ffffffffa0721043>] mgs_fs_setup+0x2f3/0x6d0 [mgs]
<4>[231830.376513] [<ffffffffa072028f>] mgs_init0+0xe1f/0x16e0 [mgs]
<4>[231830.376513] [<ffffffffa0719689>] ? mgs_type_start+0x19/0x20 [mgs]
<4>[231830.376513] [<ffffffffa0720be8>] mgs_device_alloc+0x98/0x140 [mgs]
<4>[231830.376513] [<ffffffffa058c37f>] obd_setup+0x1bf/0x290 [obdclass]
<4>[231830.376513] [<ffffffffa058c6a8>] class_setup+0x258/0x930 [obdclass]
<4>[231830.376513] [<ffffffffa0592e91>] class_process_config+0x1151/0x23f0 [obdclass]
<4>[231830.376513] [<ffffffffa0427fb1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
<4>[231830.376513] [<ffffffffa059afcf>] do_lcfg+0x2cf/0x8e0 [obdclass]
<4>[231830.376513] [<ffffffffa059b674>] lustre_start_simple+0x94/0x200 [obdclass]
<4>[231830.376513] [<ffffffffa04247f8>] ? libcfs_log_return+0x28/0x40 [libcfs]
<4>[231830.376513] [<ffffffffa05ca1d7>] server_fill_super+0xc37/0x106c [obdclass]
<4>[231830.376513] [<ffffffffa0427fb1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
<4>[231830.376513] [<ffffffffa059d558>] lustre_fill_super+0x328/0x8a0 [obdclass]
<4>[231830.376513] [<ffffffffa059d230>] ? lustre_fill_super+0x0/0x8a0 [obdclass]
<4>[231830.376513] [<ffffffff811965cf>] get_sb_nodev+0x5f/0xa0
<4>[231830.376513] [<ffffffffa0597505>] lustre_get_sb+0x25/0x30 [obdclass]
<4>[231830.376513] [<ffffffff81195bfb>] vfs_kern_mount+0x7b/0x1b0
<4>[231830.376513] [<ffffffff81195da2>] do_kern_mount+0x52/0x130
<4>[231830.376513] [<ffffffff811b7ceb>] do_mount+0x2fb/0x920
<4>[231830.376513] [<ffffffff811b83a0>] sys_mount+0x90/0xe0
<4>[231830.376513] [<ffffffff8100b112>] system_call_fastpath+0x16/0x1b
<4>[231830.376513] Code: 03 00 00 0f 84 bf 00 00 00 49 81 ed 68 03 00 00 eb 12 0f 1f 84 00 00 00 00 00 4c 89 eb 4c 8d a8 98 fc ff ff 48 8b 83 10 01 00 00 <48> 8b 78 18 e8 96 b7 ff ff 49 39 c6 74 70 48 8b 8b 68 03 00 00
<1>[231830.376513] RIP [<ffffffffa08966a1>] nm_member_reclassify_nodemap+0x71/0x130 [ptlrpc]
<4>[231830.376513] RSP <ffff8800825cb628>
<4>[231830.376513] CR2: 0000000000000018
(gdb) l *(nm_member_reclassify_nodemap+0x71)
0xd76d1 is in nm_member_reclassify_nodemap (/home/green/git/lustre-release/lustre/ptlrpc/nodemap_member.c:153).
148 list_for_each_entry_safe(exp, tmp, &nodemap->nm_member_list,
149 exp_target_data.ted_nodemap_member) {
150 lnet_nid_t nid = exp->exp_connection->c_peer.nid;
151
152 /* nodemap_classify_nid requires nmc_range_tree_lock */
153 new_nodemap = nodemap_classify_nid(nid);
154 if (new_nodemap != nodemap) {
155 /* don't use member_del because ted_nodemap
156 * should never be null
157 */
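I guess exp->exp_connection could be NULL after all, so the nid read on line 150 faults (RAX is 0 and CR2 is 0x18, i.e. a load at offset 0x18 from a NULL pointer).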
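For illustration only, here is a minimal userspace sketch of the kind of guard that would avoid this dereference. It is not the actual fix that went into Lustre. The structures are simplified stand-ins for the real ones (only the `exp_connection` and `c_peer.nid` names come from the listing above), and `export_nid()` and `count_classifiable()` are hypothetical helpers. The singly linked `next` pointer stands in for the kernel list.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the Lustre types; the real definitions differ. */
typedef uint64_t nid_t;

struct peer {
	nid_t nid;
};

struct connection {
	struct peer c_peer;
};

struct export {
	struct connection *exp_connection;	/* assumed to be NULL-able */
	struct export *next;			/* stand-in for the kernel list */
};

/*
 * Hypothetical helper: fetch the peer NID of an export.
 * Returns 1 and stores the NID in *nid if the export has a connection,
 * returns 0 (leaving *nid untouched) if exp or exp_connection is NULL.
 */
static int export_nid(const struct export *exp, nid_t *nid)
{
	if (exp == NULL || exp->exp_connection == NULL)
		return 0;

	*nid = exp->exp_connection->c_peer.nid;
	return 1;
}

/*
 * Hypothetical helper: walk the member list and count the exports whose
 * NID can be read, skipping members without a connection instead of
 * dereferencing a NULL pointer.
 */
static size_t count_classifiable(const struct export *head)
{
	size_t count = 0;
	nid_t nid;

	for (const struct export *exp = head; exp != NULL; exp = exp->next) {
		if (export_nid(exp, &nid))
			count++;
	}
	return count;
}
```

Whether skipping such an export is the right behaviour in nm_member_reclassify_nodemap, or whether the export should never be on the member list without a connection in the first place, is a separate question that this sketch does not answer.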