Details
Bug
Resolution: Fixed
Minor
Lustre 2.11.0, Lustre 2.10.2
None
3
9223372036854775807
Description
This issue was created by maloo for sarah_lw <wei3.liu@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/bbaab274-d4fa-11e7-9c63-52540065bddc.
The sub-test test_metabench failed with the following error:
Timeout occurred after 347 mins, last suite running was parallel-scale-nfsv4, restarting cluster to continue tests
2.10.2 RC1 EL7.4 zfs
MDS crash
[13488.166260] Lustre: DEBUG MARKER: dmesg
[13488.563678] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == parallel-scale-nfsv4 test metabench: metabench ==================================================== 09:11:38 \(1511946698\)
[13488.724793] Lustre: DEBUG MARKER: == parallel-scale-nfsv4 test metabench: metabench ==================================================== 09:11:38 (1511946698)
[13641.720773] LustreError: 1187:0:(lod_qos.c:1624:lod_alloc_qos()) ASSERTION( nfound <= inuse->op_count ) failed: nfound:3, op_count:0
[13641.721991] LustreError: 1187:0:(lod_qos.c:1624:lod_alloc_qos()) LBUG
[13641.722604] Pid: 1187, comm: mdt00_011
[13641.722980]
[13641.722980] Call Trace:
[13641.723386] [<ffffffffc05c27ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
[13641.724029] [<ffffffffc05c283c>] lbug_with_loc+0x4c/0xb0 [libcfs]
[13641.724636] [<ffffffffc125c342>] lod_alloc_qos.constprop.17+0x1582/0x1590 [lod]
[13641.725507] [<ffffffffc125efe1>] lod_qos_prep_create+0x1291/0x17f0 [lod]
[13641.726161] [<ffffffffc062c4b6>] ? nvlist_lookup_byte_array+0x26/0x30 [znvpair]
[13641.726921] [<ffffffffc0fd1cc9>] ? __osd_xattr_get+0xa9/0x210 [osd_zfs]
[13641.727558] [<ffffffffc0fd0007>] ? osd_fid_lookup+0x47/0x3a0 [osd_zfs]
[13641.728196] [<ffffffffc125fab8>] lod_prepare_create+0x298/0x3f0 [lod]
[13641.728900] [<ffffffffc125463e>] lod_declare_striped_create+0x1ee/0x970 [lod]
[13641.729620] [<ffffffffc077dbc5>] ? sa_object_size+0x15/0x20 [zfs]
[13641.730220] [<ffffffffc12583d1>] lod_declare_xattr_set+0x221/0xe40 [lod]
[13641.730953] [<ffffffffc12b2d97>] mdd_create_data+0x487/0x720 [mdd]
[13641.731568] [<ffffffffc1186f8a>] mdt_mfd_open+0xc5a/0xe70 [mdt]
[13641.732152] [<ffffffffc118771b>] mdt_finish_open+0x57b/0x690 [mdt]
[13641.732832] [<ffffffffc1188fcc>] mdt_reint_open+0x179c/0x31a0 [mdt]
[13641.733464] [<ffffffffc0bbb717>] ? upcall_cache_get_entry+0x3f7/0x8f0 [obdclass]
[13641.734187] [<ffffffffc0bc042e>] ? lu_ucred+0x1e/0x30 [obdclass]
[13641.734853] [<ffffffffc116e925>] ? mdt_ucred+0x15/0x20 [mdt]
[13641.735398] [<ffffffffc116f1f1>] ? mdt_root_squash+0x21/0x430 [mdt]
[13641.736035] [<ffffffffc117e8a0>] mdt_reint_rec+0x80/0x210 [mdt]
[13641.736676] [<ffffffffc116030b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
[13641.737313] [<ffffffffc1160832>] mdt_intent_reint+0x162/0x430 [mdt]
[13641.737993] [<ffffffffc116b59e>] mdt_intent_policy+0x43e/0xc70 [mdt]
[13641.738644] [<ffffffffc0d312b7>] ldlm_lock_enqueue+0x387/0x970 [ptlrpc]
[13641.739318] [<ffffffffc0d5ad03>] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc]
[13641.740102] [<ffffffffc0d82ee0>] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc]
[13641.740847] [<ffffffffc0de01d2>] tgt_enqueue+0x62/0x210 [ptlrpc]
[13641.741488] [<ffffffffc0de40d5>] tgt_request_handle+0x925/0x1370 [ptlrpc]
[13641.742174] [<ffffffffc0d8cf16>] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[13641.742975] [<ffffffff810ba598>] ? __wake_up_common+0x58/0x90
[13641.743565] [<ffffffffc0d90652>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[13641.744168] [<ffffffff81029557>] ? __switch_to+0xd7/0x510
[13641.744770] [<ffffffff816a9000>] ? __schedule+0x370/0x8b0
[13641.745310] [<ffffffffc0d8fbc0>] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[13641.745927] [<ffffffff810b099f>] kthread+0xcf/0xe0
[13641.746448] [<ffffffff810b08d0>] ? kthread+0x0/0xe0
[13641.746936] [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90
[13641.747498] [<ffffffff810b08d0>] ? kthread+0x0/0xe0
[13641.747980]
[13641.748184] Kernel panic - not syncing: LBUG
[13641.748599] CPU: 0 PID: 1187 Comm: mdt00_011 Tainted: P OE ------------ 3.10.0-693.5.2.el7_lustre.x86_64 #1
[13641.749598] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[13641.750132] ffff88005a2f2f00 00000000d5284124 ffff88004dbbf558 ffffffff816a3e2d
[13641.750897] ffff88004dbbf5d8 ffffffff8169dd14 ffffffff00000008 ffff88004dbbf5e8
[13641.751664] ffff88004dbbf588 00000000d5284124 00000000d5284124 0000000000000246
[13641.752429] Call Trace:
[13641.752678] [<ffffffff816a3e2d>] dump_stack+0x19/0x1b
[13641.753161] [<ffffffff8169dd14>] panic+0xe8/0x20d
[13641.753619] [<ffffffffc05c2854>] lbug_with_loc+0x64/0xb0 [libcfs]
[13641.754206] [<ffffffffc125c342>] lod_alloc_qos.constprop.17+0x1582/0x1590 [lod]
[13641.754914] [<ffffffffc125efe1>] lod_qos_prep_create+0x1291/0x17f0 [lod]
[13641.755559] [<ffffffffc062c4b6>] ? nvlist_lookup_byte_array+0x26/0x30 [znvpair]
[13641.756260] [<ffffffffc0fd1cc9>] ? __osd_xattr_get+0xa9/0x210 [osd_zfs]
[13641.756892] [<ffffffffc0fd0007>] ? osd_fid_lookup+0x47/0x3a0 [osd_zfs]
[13641.757517] [<ffffffffc125fab8>] lod_prepare_create+0x298/0x3f0 [lod]
[13641.758127] [<ffffffffc125463e>] lod_declare_striped_create+0x1ee/0x970 [lod]
[13641.758822] [<ffffffffc077dbc5>] ? sa_object_size+0x15/0x20 [zfs]
[13641.759407] [<ffffffffc12583d1>] lod_declare_xattr_set+0x221/0xe40 [lod]
[13641.760048] [<ffffffffc12b2d97>] mdd_create_data+0x487/0x720 [mdd]
[13641.760650] [<ffffffffc1186f8a>] mdt_mfd_open+0xc5a/0xe70 [mdt]
[13641.761217] [<ffffffffc118771b>] mdt_finish_open+0x57b/0x690 [mdt]
[13641.761818] [<ffffffffc1188fcc>] mdt_reint_open+0x179c/0x31a0 [mdt]
[13641.762425] [<ffffffffc0bbb717>] ? upcall_cache_get_entry+0x3f7/0x8f0 [obdclass]
[13641.763143] [<ffffffffc0bc042e>] ? lu_ucred+0x1e/0x30 [obdclass]
[13641.763723] [<ffffffffc116e925>] ? mdt_ucred+0x15/0x20 [mdt]
[13641.764266] [<ffffffffc116f1f1>] ? mdt_root_squash+0x21/0x430 [mdt]
[13641.764879] [<ffffffffc117e8a0>] mdt_reint_rec+0x80/0x210 [mdt]
[13641.765448] [<ffffffffc116030b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
[13641.766062] [<ffffffffc1160832>] mdt_intent_reint+0x162/0x430 [mdt]
[13641.766662] [<ffffffffc116b59e>] mdt_intent_policy+0x43e/0xc70 [mdt]
[13641.767285] [<ffffffffc0d312b7>] ldlm_lock_enqueue+0x387/0x970 [ptlrpc]
[13641.767938] [<ffffffffc0d5ad03>] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc]
[13641.768628] [<ffffffffc0d82ee0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
[13641.769359] [<ffffffffc0de01d2>] tgt_enqueue+0x62/0x210 [ptlrpc]
[13641.769961] [<ffffffffc0de40d5>] tgt_request_handle+0x925/0x1370 [ptlrpc]
[13641.770634] [<ffffffffc0d8cf16>] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[13641.771346] [<ffffffff810ba598>] ? __wake_up_common+0x58/0x90
[13641.771925] [<ffffffffc0d90652>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[13641.772515] [<ffffffff81029557>] ? __switch_to+0xd7/0x510
[13641.773028] [<ffffffff816a9000>] ? __schedule+0x370/0x8b0
[13641.773567] [<ffffffffc0d8fbc0>] ? ptlrpc_register_service+0xe30/0xe30 [ptlrpc]
[13641.774255] [<ffffffff810b099f>] kthread+0xcf/0xe0
[13641.774718] [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40
[13641.775281] [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90
[13641.775790] [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 3.10.0-693.5.2.el7_lustre.x86_64 (jenkins@trevis-310-el7-x8664-2.trevis.hpdd.intel.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Mon Nov 27 15:30:51 UTC 2017
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.10.0-693.5.2.el7_lustre.x86_64 root=UUID=d6ed6b92-d297-4df2-adc0-6d2fa1ffd5ed ro console=tty0 LANG=en_US.UTF-8 console=ttyS0,115200 net.ifnames=0 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never disable_cpu_apicid=0 elfcorehdr=867700K
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000009f7ff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009f800-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000021000000-0x0000000034f5cfff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000034fff800-0x0000000034ffffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007fffa000-0x000000007fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 2.4 present.
[ 0.000000] Hypervisor detected: KVM
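
For triage, the failed assertion and the reliable stack frames can be pulled out of a console log like the one above with a short script. The following is only a sketch: the helper names (`parse_assertion`, `reliable_frames`) and the `SAMPLE` excerpt are hypothetical, and the regular expressions are assumptions inferred from the line shapes in this log, not part of any Lustre tool.

```python
import re

# Assumed shape of a Lustre assertion line, inferred from the log above.
ASSERT_RE = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"ASSERTION\( (?P<expr>.+?) \) failed: (?P<detail>.*)"
)

# Assumed shape of a stack frame: [<addr>] [? ]symbol+0xoff/0xsize [module]
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?P<guess>\? )?(?P<sym>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<mod>\w+)\])?"
)

# Hypothetical excerpt: a few lines copied from the log in this report.
SAMPLE = """\
[13641.720773] LustreError: 1187:0:(lod_qos.c:1624:lod_alloc_qos()) ASSERTION( nfound <= inuse->op_count ) failed: nfound:3, op_count:0
[13641.721991] LustreError: 1187:0:(lod_qos.c:1624:lod_alloc_qos()) LBUG
[13641.723386] [<ffffffffc05c27ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
[13641.724029] [<ffffffffc05c283c>] lbug_with_loc+0x4c/0xb0 [libcfs]
[13641.724636] [<ffffffffc125c342>] lod_alloc_qos.constprop.17+0x1582/0x1590 [lod]
[13641.725507] [<ffffffffc125efe1>] lod_qos_prep_create+0x1291/0x17f0 [lod]
[13641.726161] [<ffffffffc062c4b6>] ? nvlist_lookup_byte_array+0x26/0x30 [znvpair]
[13641.745927] [<ffffffff810b099f>] kthread+0xcf/0xe0
"""


def parse_assertion(text):
    """Return the fields of the first assertion line in text, or None."""
    match = ASSERT_RE.search(text)
    return match.groupdict() if match else None


def reliable_frames(text):
    """Return (symbol, module) pairs, skipping frames marked with '?'."""
    return [
        (m.group("sym"), m.group("mod"))
        for m in FRAME_RE.finditer(text)
        if not m.group("guess")
    ]


if __name__ == "__main__":
    print(parse_assertion(SAMPLE))
    for sym, mod in reliable_frames(SAMPLE):
        print(sym, mod or "(kernel)")
```

Frames prefixed with `?` are skipped because the kernel marks them as unreliable stack entries; the remaining frames give the path from `kthread` down to `lod_alloc_qos`.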