Details
- Bug
- Resolution: Fixed
- Critical
- Lustre 2.10.0
- None
- 3
- 9223372036854775807
Description
This issue was created by maloo for sarah_lw <wei3.liu@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/1a11ae8c-f8f9-11e6-aac4-5254006e85c2.
The sub-test test_compilebench failed with the following error:
test failed to respond and timed out
Not sure whether this is a duplicate of LU-8584.
Server/client: lustre-master tag-2.9.53, el7, zfs
MDS console
09:10:55:[ 327.535441] Lustre: DEBUG MARKER: == parallel-scale-nfsv3 test compilebench: compilebench ============================================== 09:04:15 (1487754255)
09:10:55:[ 327.867477] Lustre: DEBUG MARKER: /usr/sbin/lctl mark .\/compilebench -D \/mnt\/lustre\/d0.compilebench -i 2 -r 2 --makej
09:10:55:[ 328.161392] Lustre: DEBUG MARKER: ./compilebench -D /mnt/lustre/d0.compilebench -i 2 -r 2 --makej
09:10:55:
09:10:55:[ 721.605127] BUG: unable to handle kernel paging request at ffffeb040013bd80
09:10:55:[ 721.605127] IP: [<ffffffffa0aa330f>] lnet_cpt_of_md+0xdf/0x120 [lnet]
09:10:55:[ 721.605127] PGD 0
09:10:55:[ 721.605127] Oops: 0000 [#1] SMP
09:10:55:[ 721.605127] Modules linked in: osc(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_ssse3 sha512_generic crypto_null libcfs(OE) zfs(POE) zunicode(POE) zavl(POE) zcommon(POE) znvpair(POE) spl(OE) zlib_deflate dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core iosf_mbi crc32_pclmul ghash_clmulni_intel ppdev aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr virtio_balloon i2c_piix4 parport_pc parport nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk crct10dif_pclmul crct10dif_common 8139too crc32c_intel cirrus drm_kms_helper serio_raw syscopyarea sysfillrect sysimgblt fb_sys_fops ttm 8139cp mii virtio_pci virtio_ring virtio drm ata_piix libata i2c_core floppy
09:10:55:[ 721.605127] CPU: 1 PID: 8060 Comm: mdt00_001 Tainted: P OE ------------ 3.10.0-514.6.1.el7_lustre.x86_64 #1
09:10:55:[ 721.605127] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
09:10:55:[ 721.605127] task: ffff880045b43ec0 ti: ffff8800454c8000 task.ti: ffff8800454c8000
09:10:55:[ 721.605127] RIP: 0010:[<ffffffffa0aa330f>] [<ffffffffa0aa330f>] lnet_cpt_of_md+0xdf/0x120 [lnet]
09:10:55:[ 721.605127] RSP: 0018:ffff8800454cba18 EFLAGS: 00010202
09:10:55:[ 721.605127] RAX: 000001040013bd80 RBX: 0009000000000000 RCX: 000077ff80000000
09:10:55:[ 721.605127] RDX: ffffea0000000000 RSI: 0000000000000000 RDI: ffff880079de2280
09:10:55:[ 721.605127] RBP: ffff8800454cba18 R08: 0000000000000009 R09: 00000000000003f8
09:10:56:[ 721.605127] R10: ffff88003b92a200 R11: ffffc90004ef6100 R12: ffff880013cca380
09:10:56:[ 721.605127] R13: 0009000000000000 R14: ffff88003b92a200 R15: 0000000000000000
09:10:56:[ 721.605127] FS: 0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
09:10:56:[ 721.605127] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
09:10:56:[ 721.605127] CR2: ffffeb040013bd80 CR3: 00000000019ba000 CR4: 00000000000406e0
09:10:56:[ 721.605127] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
09:10:56:[ 721.605127] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
09:10:56:[ 721.605127] Stack:
09:10:56:[ 721.605127] ffff8800454cbab0 ffffffffa0aaa7ba ffff88003df83df0 ffffffffffffffff
09:10:56:[ 721.605127] 0000000016636b10 0000000000000246 ffff88007d001900 0000000000008050
09:10:56:[ 721.605127] 00000000ffffffff ffffffffa0aa200c 0009000000000000 ffff88003b92a200
09:10:56:[ 721.605127] Call Trace:
09:10:56:[ 721.605127] [<ffffffffa0aaa7ba>] lnet_select_pathway+0x5a/0x1010 [lnet]
09:10:56:[ 721.605127] [<ffffffffa0aa200c>] ? LNetMDBind+0x7c/0x5e0 [lnet]
09:10:56:[ 721.605127] [<ffffffffa0aace71>] lnet_send+0x51/0x180 [lnet]
09:10:56:[ 721.605127] [<ffffffffa0aad1e5>] LNetPut+0x245/0x7a0 [lnet]
09:10:56:[ 721.605127] [<ffffffffa0d79d76>] ptl_send_buf+0x146/0x530 [ptlrpc]
09:10:56:[ 721.605127] [<ffffffffa0a12cce>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
09:10:56:[ 721.605127] [<ffffffffa0d9c637>] ? at_measured+0x1c7/0x380 [ptlrpc]
09:10:56:[ 721.605127] [<ffffffffa0d7cffb>] ptlrpc_send_reply+0x29b/0x830 [ptlrpc]
09:10:56:[ 721.605127] [<ffffffffa0d3b24e>] target_send_reply_msg+0x8e/0x170 [ptlrpc]
09:10:56:[ 721.605127] [<ffffffffa0d45fc6>] target_send_reply+0x306/0x730 [ptlrpc]
09:10:56:[ 721.605127] [<ffffffffa0d83657>] ? lustre_msg_set_last_committed+0x27/0xa0 [ptlrpc]
09:10:56:[ 721.605127] [<ffffffffa0de1f37>] tgt_request_handle+0x587/0x1320 [ptlrpc]
09:10:56:[ 721.605127] [<ffffffffa0d8d7ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
09:10:56:[ 721.605127] [<ffffffffa0d8b368>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
09:10:56:[ 721.605127] [<ffffffff810c4fe2>] ? default_wake_function+0x12/0x20
09:10:56:[ 721.605127] [<ffffffff810ba238>] ? __wake_up_common+0x58/0x90
09:10:56:[ 721.605127] [<ffffffffa0d917b0>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
09:10:56:[ 721.605127] [<ffffffffa0d90d10>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
09:10:56:[ 721.605127] [<ffffffff810b064f>] kthread+0xcf/0xe0
09:10:56:[ 721.605127] [<ffffffff810bf9f3>] ? finish_task_switch+0x53/0x180
09:10:56:[ 721.605127] [<ffffffff810b0580>] ? kthread_create_on_node+0x140/0x140
09:10:56:[ 721.605127] [<ffffffff81696958>] ret_from_fork+0x58/0x90
09:10:56:[ 721.605127] [<ffffffff810b0580>] ? kthread_create_on_node+0x140/0x140
09:10:56:[ 721.605127] Code: ff 77 00 00 48 8b 3d 91 53 03 00 48 01 d0 48 0f 42 0d 16 dd f1 e0 48 ba 00 00 00 00 00 ea ff ff 48 01 c8 48 c1 e8 0c 48 c1 e0 06 <48> 8b 34 10 48 c1 ee 36 e8 c4 fc f6 ff 5d c3 66 90 b8 ff ff ff
09:10:56:[ 721.605127] RIP [<ffffffffa0aa330f>] lnet_cpt_of_md+0xdf/0x120 [lnet]
09:10:56:[ 721.605127] RSP <ffff8800454cba18>
09:10:56:[ 721.605127] CR2: ffffeb040013bd80
09:10:56:[ 0.000000] Initializing cgroup subsys cpuset
09:10:56:[ 0.000000] Initializing cgroup subsys cpu
09:10:56:[ 0.000000] Initializing cgroup subsys cpuacct
09:10:56:[ 0.000000] Linux version 3.10.0-514.6.1.el7_lustre.x86_64 (jenkins@trevis-308.trevis.hpdd.intel.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Tue Feb 14 04:06:44 UTC 2017
09:10:56:[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.10.0-514.6.1.el7_lustre.x86_64 root=UUID=70563313-e0a3-4c81-b456-70fdcd7f6e9f ro console=tty0 LANG=en_US.UTF-8 console=ttyS0,115200 net.ifnames=0 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never disable_cpu_apicid=0 elfcorehdr=867708K
09:10:56:[ 0.000000] Disabled fast string operations
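A possible reading of the register dump, offered as a sketch rather than a confirmed root cause. The faulting instruction bytes `<48> 8b 34 10` decode as `mov (%rax,%rdx,1),%rsi`, so the bad address in CR2 should equal RAX + RDX. The preceding bytes (`shr $0xc`, `shl $0x6`) look like a virtual-address-to-struct-page conversion. Feeding R11, which sits in what is normally the vmalloc range on x86_64, through that arithmetic reproduces RAX exactly. The layout constants below (`START_KERNEL_MAP`, `PAGE_OFFSET`, 4 KiB pages, 64-byte `struct page`) are assumptions about this 3.10 kernel and are not stated in the log; that R11 was the actual input is an inference from the matching numbers.

```python
# Register values copied verbatim from the oops above.
RAX = 0x000001040013bd80
RCX = 0x000077ff80000000
RDX = 0xffffea0000000000
R11 = 0xffffc90004ef6100
CR2 = 0xffffeb040013bd80

MASK64 = (1 << 64) - 1  # emulate 64-bit register wrap-around

# ASSUMED x86_64 layout constants for this kernel (not taken from the log).
START_KERNEL_MAP = 0xffffffff80000000
PAGE_OFFSET = 0xffff880000000000
PHYS_BASE = 0            # assumed; only used for kernel-text addresses
PAGE_SHIFT = 12          # matches "shr $0xc" in the Code: bytes
STRUCT_PAGE_SHIFT = 6    # matches "shl $0x6", i.e. a 64-byte struct page


def faulting_address(base, index):
    """Effective address of mov (%rax,%rdx,1),%rsi."""
    return (base + index) & MASK64


def phys_addr(vaddr):
    """Arithmetic of a direct-map virtual-to-physical conversion.

    Only meaningful for direct-mapped or kernel-text addresses; any other
    input (for example a vmalloc address) yields a bogus result.
    """
    y = (vaddr - START_KERNEL_MAP) & MASK64
    offset = PHYS_BASE if vaddr > y else START_KERNEL_MAP - PAGE_OFFSET
    return (y + offset) & MASK64


def vmemmap_offset(vaddr):
    """Byte offset of the struct page for vaddr within the page array."""
    pfn = phys_addr(vaddr) >> PAGE_SHIFT
    return (pfn << STRUCT_PAGE_SHIFT) & MASK64


if __name__ == "__main__":
    print(hex(faulting_address(RAX, RDX)))  # compare with CR2
    print(hex(vmemmap_offset(R11)))         # compare with RAX
```

If this reading holds, the page lookup in `lnet_cpt_of_md` was applied to a buffer address that is not in the kernel direct map, which would explain the wild pointer into the page array.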
Info required for matching: parallel-scale-nfsv3 compilebench