Details
Bug
Resolution: Fixed
Major
Description
[ 572.395773] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 572.405777] IP: [<ffffffffc1599a8e>] mdt_lvb2body+0x2e/0xe0 [mdt]
[ 572.413803] PGD 0
[ 572.417222] Oops: 0000 [#1] SMP
[ 572.421988] Modules linked in: osd_ldiskfs(OE) ldiskfs(OE) ost(OE) osp(OE) ofd(OE) mdt(OE) mdd(OE) lod(OE) mgs(OE) mgc(OE) lquota(OE) lfsck(OE) fid(OE) fld(OE) ptlrpc(OE) obdclass(OE) linear raid10 ko2iblnd(OE) lnet(OE) libcfs(OE) ext4 mbcache jbd2 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack xt_multiport iptable_filter xt_CT nf_conntrack libcrc32c iptable_raw mlx5_ib(OE) mlx5_core(OE) mlxfw rdma_ucm(OE) ib_ucm(OE) ib_uverbs(OE) rdma_cm(OE) iw_cm(OE) ib_umad(OE) ib_ipoib(OE) ib_cm(OE) mlx4_ib(OE) mlx4_en(OE) ib_core(OE) sb_edac intel_powerclamp coretemp intel_rapl iTCO_wdt iosf_mbi iTCO_vendor_support kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul mgag200 glue_helper ablk_helper ttm cryptd drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr mei_me joydev
[ 572.508847] drm_panel_orientation_quirks i2c_i801 lpc_ich dm_mod mei sg mlx4_core(OE) mlx_compat(OE) devlink wmi acpi_cpufreq ip_tables nfsv3 nfs_acl nfs lockd grace fscache sd_mod crc_t10dif crct10dif_generic team_mode_activebackup team crct10dif_pclmul crct10dif_common igb isci ahci crc32c_intel libsas libahci mpt2sas i2c_algo_bit dca ptp libata raid_class pps_core scsi_transport_sas sunrpc bonding ipmi_si ipmi_devintf ipmi_msghandler
[ 572.556084] CPU: 5 PID: 39837 Comm: mdt01_025 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.1.3957.1.3.x4.4.35.x86_64 #1
[ 572.572512] Hardware name: Intel Corporation S2600JF/S2600JF, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[ 572.585206] task: ffff9d6027568000 ti: ffff9d602c57c000 task.ti: ffff9d602c57c000
[ 572.594779] RIP: 0010:[<ffffffffc1599a8e>] [<ffffffffc1599a8e>] mdt_lvb2body+0x2e/0xe0 [mdt]
[ 572.605518] RSP: 0018:ffff9d602c57fa30 EFLAGS: 00010202
[ 572.612653] RAX: 0000000000000000 RBX: ffff9d60987e9a60 RCX: 0000000000000007
[ 572.621813] RDX: 0000000000000000 RSI: ffff9d60987e9a60 RDI: ffff9d6864b29c9c
[ 572.630973] RBP: ffff9d602c57fa48 R08: 00000000000000ff R09: 0000000000000000
[ 572.640160] R10: 0000000000000051 R11: 000000020001b1b7 R12: ffff9d6864b29c9c
[ 572.649289] R13: ffff9d6864b29c80 R14: ffffffffc10f5a10 R15: 0000000000001000
[ 572.658407] FS: 0000000000000000(0000) GS:ffff9d609e140000(0000) knlGS:0000000000000000
[ 572.668585] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 572.676129] CR2: 0000000000000000 CR3: 00000002b1810000 CR4: 00000000000607e0
[ 572.685217] Call Trace:
[ 572.689050] [<ffffffffc159f160>] mdt_glimpse_enqueue+0x1c0/0x4e0 [mdt]
[ 572.697530] [<ffffffffc154a4ef>] mdt_intent_glimpse+0x1f/0x30 [mdt]
[ 572.705703] [<ffffffffc155ba8a>] mdt_intent_opc+0x1ba/0xb50 [mdt]
[ 572.713716] [<ffffffffc1114890>] ? lustre_swab_ldlm_policy_data+0x30/0x30 [ptlrpc]
[ 572.723309] [<ffffffffc154a4d0>] ? mdt_intent_brw+0x30/0x30 [mdt]
[ 572.731236] [<ffffffffc1563bc4>] mdt_intent_policy+0x1a4/0x360 [mdt]
[ 572.739449] [<ffffffffc10c252a>] ldlm_lock_enqueue+0x3da/0xae0 [ptlrpc]
[ 572.747934] [<ffffffffc0cee733>] ? cfs_hash_bd_add_locked+0x63/0x80 [libcfs]
[ 572.756895] [<ffffffffc0cf1ebe>] ? cfs_hash_add+0xbe/0x1a0 [libcfs]
[ 572.764980] [<ffffffffc10eac56>] ldlm_handle_enqueue0+0xa56/0x1610 [ptlrpc]
[ 572.773830] [<ffffffffc1114910>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
[ 572.783153] [<ffffffffc1174cd2>] tgt_enqueue+0x62/0x210 [ptlrpc]
[ 572.790907] [<ffffffffc1179aea>] tgt_request_handle+0x96a/0x1680 [ptlrpc]
[ 572.799495] [<ffffffffc0ceb297>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[ 572.807799] [<ffffffffc111fae6>] ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
[ 572.817246] [<ffffffffc11237ba>] ptlrpc_main+0xcca/0x1ca0 [ptlrpc]
[ 572.825135] [<ffffffffc1122af0>] ? ptlrpc_register_service+0x1010/0x1010 [ptlrpc]
[ 572.834463] [<ffffffff846c1f81>] kthread+0xd1/0xe0
[ 572.840771] [<ffffffff846c1eb0>] ? insert_kthread_work+0x40/0x40
[ 572.848425] [<ffffffff84d77c1d>] ret_from_fork_nospec_begin+0x7/0x21
[ 572.856463] [<ffffffff846c1eb0>] ? insert_kthread_work+0x40/0x40
[ 572.864106] Code: 66 90 55 48 89 e5 41 55 49 89 fd 41 54 4c 8d 67 1c 53 4c 89 e7 48 89 f3 e8 b0 3d 7d c3 49 8b 95 b0 00 00 00 f6 05 cc 3c 77 ff 01 <48> 8b 02 48 89 83 b0 00 00 00 48 8b 42 20 48 89 83 b8 00 00 00
[ 572.887642] RIP [<ffffffffc1599a8e>] mdt_lvb2body+0x2e/0xe0 [mdt]
[ 572.895468] RSP <ffff9d602c57fa30>
[ 572.900227] CR2: 0000000000000000
The crash occurs because mdt_dom_lvbo_update() may skip LVB allocation when the export or OBD is not healthy:
/* Before going further let's check that OBD and export are healthy. */
if (exp != NULL &&
    (exp->exp_disconnected || exp->exp_failed ||
     exp->exp_obd->obd_stopping)) {
	CDEBUG(D_INFO, "Skip LVB update, export is %s, obd is %s\n",
	       exp->exp_failed ? "failed" : "disconnected",
	       exp->exp_obd->obd_stopping ? "stopping" : "OK");
	RETURN(0);
}
and mdt_lvb2reply() then dereferences the NULL pointer:
lock_res(res);
res_lvb = res->lr_lvb_data;
if (lvb)
	*lvb = *res_lvb;
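
The snippet above copies through res_lvb without checking whether the resource ever had an LVB allocated. A minimal sketch of the kind of guard that would avoid the dereference follows. It assumes simplified stand-in types (`ost_lvb_sketch`, `ldlm_resource_sketch`) and a hypothetical helper `lvb_copy_checked()`; none of these are real Lustre identifiers, resource locking is omitted, and this is not necessarily the fix that was actually applied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the Lustre types (assumption). */
struct ost_lvb_sketch {
	uint64_t lvb_size;
	uint64_t lvb_blocks;
};

struct ldlm_resource_sketch {
	/* NULL when the LVB update (and allocation) was skipped. */
	struct ost_lvb_sketch *lr_lvb_data;
};

/*
 * Copy the resource LVB into *lvb, if the caller asked for it.
 * Returns 0 on success, -1 when the resource has no LVB to read.
 * Locking of the resource is deliberately left out of this sketch.
 */
static int lvb_copy_checked(const struct ldlm_resource_sketch *res,
			    struct ost_lvb_sketch *lvb)
{
	const struct ost_lvb_sketch *res_lvb;

	assert(res != NULL);

	res_lvb = res->lr_lvb_data;
	if (res_lvb == NULL)
		return -1; /* LVB was never allocated; do not dereference */

	if (lvb != NULL)
		*lvb = *res_lvb;

	return 0;
}
```

With a guard like this, a caller such as the glimpse path could return an error to the client instead of crashing when the LVB is missing.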