[LU-6249] mds-survey test_1: test failed to respond and timed out Created: 14/Feb/15 Updated: 12/Aug/22 Resolved: 12/Aug/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0, Lustre 2.11.0, Lustre 2.10.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Sarah Liu | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | dne, zfs | ||
| Environment: |
client and server: lustre-master build# 2856 |
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 17499 | ||||||||
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/a383e9ac-b2e0-11e4-b42d-5254006e85c2.

The sub-test test_1 failed with the following error:

test failed to respond and timed out

cannot find error message, test just timed out, not sure if this is a dup of

Info required for matching: mds-survey 1 |
| Comments |
| Comment by Oleg Drokin [ 17/Feb/15 ] |
|
MDS crashed and the console log is nowhere to be found to see why. |
| Comment by Minh Diep [ 06/Feb/18 ] |
|
MDS crashed
https://testing.hpdd.intel.com/test_logs/1ebcd592-0b54-11e8-a7cd-52540065bddc/show_text
[15349.075441] LustreError: 21742:0:(echo_client.c:1795:echo_md_lookup()) Skipped 1 previous similar message
[15349.078482] LustreError: 21742:0:(echo_client.c:2027:echo_md_destroy_internal()) Can't find child MDT0002-tests: rc = -2
[15349.081864] LustreError: 21742:0:(echo_client.c:2027:echo_md_destroy_internal()) Skipped 1 previous similar message
[15499.480571] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[15499.481150] IP: [<ffffffffc0ca9f53>] lu_object_alloc+0x73/0x310 [obdclass]
[15499.481150] PGD 800000004f885067 PUD 1d0e4067 PMD 0
[15499.481150] Oops: 0002 [#1] SMP
[15499.481150] Modules linked in: obdecho(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core iosf_mbi crc32_pclmul ghash_clmulni_intel dm_mod ppdev aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr joydev virtio_balloon i2c_piix4 parport_pc parport nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi cirrus drm_kms_helper virtio_blk ata_piix syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm libata crct10dif_pclmul crct10dif_common 8139too crc32c_intel serio_raw virtio_pci 8139cp virtio_ring virtio mii i2c_core floppy
[15499.496961] CPU: 1 PID: 16365 Comm: lctl Tainted: P OE ------------ 3.10.0-693.17.1.el7_lustre.x86_64 #1
[15499.496961] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[15499.496961] task: ffff88005fb58000 ti: ffff880020888000 task.ti: ffff880020888000
[15499.496961] RIP: 0010:[<ffffffffc0ca9f53>] [<ffffffffc0ca9f53>] lu_object_alloc+0x73/0x310 [obdclass]
[15499.496961] RSP: 0018:ffff88002088baf0 EFLAGS: 00010246
[15499.496961] RAX: 0000000240000bd0 RBX: ffff88005d39e180 RCX: 0000000000000000
[15499.496961] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88003520c7e0
[15499.496961] RBP: ffff88002088bb38 R08: 000000000001b920 R09: 0000000000000000
[15499.496961] R10: ffff88003520c7e0 R11: 0000000000000fff R12: ffff88003520c7e0
[15499.496961] R13: ffff88002088bbd8 R14: ffff88004b3e6228 R15: 0000000000000000
[15499.496961] FS: 00007f036cfa5740(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[15499.496961] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[15499.496961] CR2: 0000000000000008 CR3: 000000003b72a000 CR4: 00000000000606e0
[15499.496961] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[15499.496961] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[15499.496961] Call Trace:
[15499.496961] [<ffffffffc0ca7833>] ? htable_lookup+0x153/0x170 [obdclass]
[15499.496961] [<ffffffffc0caa3bc>] lu_object_find_at+0x16c/0x290 [obdclass]
[15499.496961] [<ffffffffc13237de>] echo_md_dir_stripe_choose.isra.43+0x26e/0x680 [obdecho]
[15499.496961] [<ffffffffc1324b87>] echo_md_handler.isra.45+0xf97/0x2c20 [obdecho]
[15499.496961] [<ffffffff816add34>] ? _raw_read_lock+0x14/0x20
[15499.496961] [<ffffffffc13278a1>] echo_client_iocontrol+0x1091/0x1ba0 [obdecho]
[15499.496961] [<ffffffffc0c8aa59>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[15499.496961] [<ffffffffc0c7586d>] class_handle_ioctl+0x18ed/0x1df0 [obdclass]
[15499.496961] [<ffffffff811af746>] ? do_read_fault.isra.44+0xe6/0x130
[15499.496961] [<ffffffff812b3ea8>] ? security_capable+0x18/0x20
[15499.496961] [<ffffffffc0c5a602>] obd_class_ioctl+0xd2/0x170 [obdclass]
[15499.496961] [<ffffffff8121730d>] do_vfs_ioctl+0x33d/0x540
[15499.496961] [<ffffffff81062efe>] ? kvm_clock_get_cycles+0x1e/0x20
[15499.496961] [<ffffffff810ec7ba>] ? __getnstimeofday64+0x3a/0xd0
[15499.496961] [<ffffffff812175b1>] SyS_ioctl+0xa1/0xc0
[15499.496961] [<ffffffff816b8930>] ? system_call_after_swapgs+0x15d/0x214
[15499.496961] [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[15499.496961] [<ffffffff816b889d>] ? system_call_after_swapgs+0xca/0x214
[15499.496961] Code: 48 8b 42 10 ff 10 48 85 c0 49 89 c4 0f 84 3c 02 00 00 48 3d 00 f0 ff ff 0f 87 6f 02 00 00 48 8b 08 49 8b 57 08 49 8b 07 45 31 ff <48> 89 51 08 48 89 01 49 8b 04 24 4c 8d 70 40 48 89 44 24 08 48
[15499.496961] RIP [<ffffffffc0ca9f53>] lu_object_alloc+0x73/0x310 [obdclass]
[15499.496961] RSP <ffff88002088baf0>
[15499.496961] CR2: 0000000000000008
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 3.10.0-693.17.1.el7_lustre.x86_64 (jenkins@trevis-307-el7-x8664-1.trevis.hpdd.intel.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Fri Jan 26 13:49:52 UTC 2018
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.10.0-693.17.1.el7_lustre.x86_64 root=UUID=36a4fa8e-8395-4c4c-9d40-93a0779cd2bb ro console=tty0 LANG=en_US.UTF-8 console=ttyS0,115200 net.ifnames=0 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug transparent_hugepage=never disable_cpu_apicid=0 elfcorehdr=867708K
[ 0.000000] Disabled fast string operations
[ 0.000000] e820: BIOS-provided physical RAM
https://testing.hpdd.intel.com/test_sets/1eb3e98c-0b54-11e8-a7cd-52540065bddc |