Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Lustre 2.11.0
Description
System crashed while testing under memory pressure:
[432534.561808] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect
[432534.563083] Lustre: Skipped 3 previous similar messages
[432534.593088] BUG: unable to handle kernel NULL pointer dereference at (null)
[432534.594035] IP: [<ffffffffa07d31f7>] tgt_free_reply_data+0x97/0x330 [ptlrpc]
[432534.594035] PGD 3c7cb067 PUD 3836e067 PMD 0
[432534.594035] Oops: 0002 [#1] SMP
[432534.594035] Modules linked in: lustre(OF) ofd(OF) osp(OF) lod(OF) ost(OF) mdt(OF) mdd(OF) mgs(OF) osd_ldiskfs(OF) ldiskfs(OF) lquota(OF) lfsck(OF) obdecho(OF) mgc(OF) lov(OF) osc(OF) mdc(OF) lmv(OF) fid(OF) fld(OF) ptlrpc(OF) obdclass(OF) ksocklnd(OF) lnet(OF) libcfs(OF) loop mbcache jbd2 sha512_generic netconsole sg dm_mirror dm_region_hash dm_log crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd serio_raw virtio_balloon virtio_console dm_mod intel_agp i2c_piix4 intel_gtt nfsd auth_rpcgss nfs_acl lockd sunrpc ip_tables xfs ata_generic libcrc32c virtio_net cirrus syscopyarea sysfillrect sysimgblt virtio_scsi drm_kms_helper virtio_blk ttm drm virtio_pci agpgart ata_piix virtio_ring libata virtio i2c_core [last unloaded: libcfs]
[432534.594035] CPU: 1 PID: 5669 Comm: mdt01_003 Tainted: GF O-------------- 3.10.0-229.7.2.x86_64 #7
[432534.594035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
[432534.594035] task: ffff880025081580 ti: ffff88002da2c000 task.ti: ffff88002da2c000
[432534.594035] RIP: 0010:[<ffffffffa07d31f7>] [<ffffffffa07d31f7>] tgt_free_reply_data+0x97/0x330 [ptlrpc]
[432534.594035] RSP: 0018:ffff88002da2fb90 EFLAGS: 00010293
[432534.594035] RAX: 0000000000000001 RBX: ffff8800133fb8d8 RCX: 0000000000000000
[432534.594035] RDX: 0000000000000000 RSI: ffff88001289f300 RDI: ffff8800133fb8d8
[432534.594035] RBP: ffff88002da2fbd8 R08: ffff8800133fb8d8 R09: 0000000000000000
[432534.594035] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[432534.594035] R13: ffff880000e65718 R14: ffff88001289f3f8 R15: ffff88001289f300
[432534.594035] FS: 0000000000000000(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000
[432534.594035] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[432534.594035] CR2: 0000000000bfc001 CR3: 000000003c289000 CR4: 00000000001406e0
[432534.594035] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[432534.594035] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[432534.594035] Stack:
[432534.594035] 0000000000000000 ffff88001289f3f8 ffffffff810c486d ffff88002da2fc30
[432534.594035] ffff8800133fb8d8 ffff88001289f300 ffff88001289ef60 ffff88001289f3f8
[432534.594035] ffff880000e65718 ffff88002da2fc30 ffffffffa07d34ee 0000000000000246
[432534.594035] Call Trace:
[432534.594035] [<ffffffff810c486d>] ? trace_hardirqs_on+0xd/0x10
[432534.594035] [<ffffffffa07d34ee>] tgt_release_reply_data+0x5e/0x180 [ptlrpc]
[432534.594035] [<ffffffffa07dc128>] tgt_handle_received_xid+0x98/0xe0 [ptlrpc]
[432534.594035] [<ffffffffa07e1d38>] tgt_request_handle+0xb88/0x1330 [ptlrpc]
[432534.594035] [<ffffffffa078d591>] ptlrpc_server_handle_request+0x231/0xac0 [ptlrpc]
[432534.594035] [<ffffffffa078be15>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[432534.594035] [<ffffffffa0791790>] ptlrpc_main+0xab0/0x1e10 [ptlrpc]
[432534.594035] [<ffffffff810c486d>] ? trace_hardirqs_on+0xd/0x10
[432534.594035] [<ffffffff8109b842>] ? finish_task_switch+0x42/0x150
[432534.594035] [<ffffffffa0790ce0>] ? ptlrpc_register_service+0xe50/0xe50 [ptlrpc]
[432534.594035] [<ffffffff8109008a>] kthread+0xea/0xf0
[432534.594035] [<ffffffff8108ffa0>] ? kthread_create_on_node+0x140/0x140
[432534.594035] [<ffffffff81571258>] ret_from_fork+0x58/0x90
[432534.594035] [<ffffffff8108ffa0>] ? kthread_create_on_node+0x140/0x140
[432534.594035] Code: c1 fa 1f c1 ea 0c c1 f9 14 41 8d 04 14 25 ff ff 0f 00 29 d0 83 f9 0f 0f 8f 72 02 00 00 49 8b 95 28 04 00 00 48 63 c9 48 8b 14 ca <f0> 0f b3 02 19 c0 85 c0 0f 84 8b 01 00 00 48 85 db 0f 84 1b 02
[432534.594035] RIP [<ffffffffa07d31f7>] tgt_free_reply_data+0x97/0x330 [ptlrpc]
[432534.594035] RSP <ffff88002da2fb90>
[432534.594035] CR2: 0000000000000000
[432534.712915] ---[ end trace 26ac593d02d07dd0 ]---
[432534.714120] Kernel panic - not syncing: Fatal exception
This issue is caused by an incorrect return value on the allocation-failure path in the following code (see the marked line):
/* reply_data is supported by MDT targets only for now */
if (strncmp(obd->obd_type->typ_name, LUSTRE_MDT_NAME, 3) != 0)
        RETURN(0);

OBD_ALLOC(lut->lut_reply_bitmap,
          LUT_REPLY_SLOTS_MAX_CHUNKS * sizeof(unsigned long *));
if (lut->lut_reply_bitmap == NULL)
        GOTO(out, rc);
------------------^^

memset(&attr, 0, sizeof(attr));
attr.la_valid = LA_MODE;
I'll push a patch for it.
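The pattern flagged above (leaving through a cleanup label without setting an error code) can be reproduced outside the kernel. The following is a minimal user-space sketch, not the Lustre code: struct target, alloc_or_fail(), SLOT_CHUNKS, and both init functions are hypothetical stand-ins, the sketch assumes rc is still 0 when the failed allocation is detected, and setting rc = -ENOMEM is one conventional correction rather than necessarily what the actual patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for LUT_REPLY_SLOTS_MAX_CHUNKS. */
#define SLOT_CHUNKS 16

/* Hypothetical stand-in for the target structure holding the bitmap. */
struct target {
	unsigned long **reply_bitmap;
};

/* Allocator wrapper so that an allocation failure can be simulated. */
void *alloc_or_fail(size_t size, int simulate_failure)
{
	return simulate_failure ? NULL : calloc(1, size);
}

/* Reported pattern: the failure path leaves with rc unchanged (0). */
int reply_data_init_buggy(struct target *t, int simulate_failure)
{
	int rc = 0;

	t->reply_bitmap = alloc_or_fail(SLOT_CHUNKS * sizeof(unsigned long *),
					simulate_failure);
	if (t->reply_bitmap == NULL)
		goto out;	/* rc is still 0: the caller sees success */

	/* ... remaining setup would go here ... */
out:
	return rc;
}

/* Assumed correction: set an error code before jumping to cleanup. */
int reply_data_init_checked(struct target *t, int simulate_failure)
{
	int rc = 0;

	t->reply_bitmap = alloc_or_fail(SLOT_CHUNKS * sizeof(unsigned long *),
					simulate_failure);
	if (t->reply_bitmap == NULL) {
		rc = -ENOMEM;
		goto out;
	}

	/* ... remaining setup would go here ... */
out:
	return rc;
}

/* Exercise both variants with a simulated allocation failure. */
void demo_failure_path(void)
{
	struct target t = { NULL };

	/* Buggy variant: reports success although nothing was allocated. */
	assert(reply_data_init_buggy(&t, 1) == 0);
	assert(t.reply_bitmap == NULL);

	/* Checked variant: the failure is visible to the caller. */
	assert(reply_data_init_checked(&t, 1) == -ENOMEM);
}
```

With the buggy variant the caller has no way to tell that initialization failed, so later code that uses the bitmap runs against memory that was never set up. With the checked variant the caller can abort setup instead.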
Issue Links
- is related to LU-8199 NULL pointer dereference in tgt_free_reply_data (Resolved)