Details
Bug
Resolution: Fixed
Minor
Lustre 2.8.0
None
autotest review-dne-part-2
3
9223372036854775807
Description
The insanity test suite hangs while unmounting ost2 after all tests have completed successfully. In the suite_stdout log, we see:
08:51:30:CMD: shadow-22vm7 grep -c /mnt/ost2' ' /proc/mounts
08:51:30:Stopping /mnt/ost2 (opts:-f) on shadow-22vm7
08:51:30:CMD: shadow-22vm7 umount -d -f /mnt/ost2
09:50:45:********** Timeout by autotest system **********
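The gap between the last command and the autotest timeout shows how long the unmount was stuck. Here is a minimal Python sketch, assuming both HH:MM:SS prefixes in the log above fall on the same day. The helper name `seconds_between` is illustrative only and is not part of the test framework.

```python
from datetime import datetime

def seconds_between(start, end):
    """Return the seconds elapsed between two HH:MM:SS stamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

# Stamps taken from the suite_stdout excerpt: the umount command and the timeout.
hung_for = seconds_between("08:51:30", "09:50:45")
print(hung_for)  # 3555 seconds, i.e. 59 minutes 15 seconds
```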
If we look at the test_complete log for shadow-22vm7, we see:
08:51:41:Lustre: DEBUG MARKER: grep -c /mnt/ost1' ' /proc/mounts
08:51:41:Lustre: DEBUG MARKER: umount -d -f /mnt/ost1
08:51:41:LustreError: 4149:0:(client.c:1130:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff880065e860c0 x1524146469973308/t0(0) o101->lustre-MDT0000-lwp-OST0000@10.1.5.16@tcp:23/10 lens 456/496 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1
08:51:41:LustreError: 4149:0:(qsd_reint.c:55:qsd_reint_completion()) lustre-OST0000: failed to enqueue global quota lock, glb fid:[0x200000006:0x20000:0x0], rc:-5
08:51:41:LustreError: 4149:0:(qsd_reint.c:55:qsd_reint_completion()) Skipped 1 previous similar message
08:51:41:Lustre: server umount lustre-OST0000 complete
08:51:41:Lustre: Skipped 6 previous similar messages
08:51:41:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
08:51:41:Lustre: DEBUG MARKER: grep -c /mnt/ost2' ' /proc/mounts
08:51:41:Lustre: DEBUG MARKER: umount -d -f /mnt/ost2
08:51:41:general protection fault: 0000 [#1] SMP
08:51:41:last sysfs file: /sys/devices/system/cpu/online
08:51:41:CPU 1
08:51:41:Modules linked in: osp(U) ofd(U) lfsck(U) ost(U) mgc(U) osd_ldiskfs(U) lquota(U) lustre(U) lov(U) mdc(U) fid(U) lmv(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) sha512_generic libcfs(U) ldiskfs(U) jbd2 nfsd exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
08:51:41:
08:51:41:Pid: 4320, comm: qsd_reint_0.lus Not tainted 2.6.32-573.8.1.el6_lustre.gea97898.x86_64 #1 Red Hat KVM
08:51:41:RIP: 0010:[<ffffffff81059911>] [<ffffffff81059911>] __wake_up_common+0x31/0x90
08:51:41:RSP: 0018:ffff88006201bd80 EFLAGS: 00010096
08:51:41:RAX: 5a5a5a5a5a5a5a42 RBX: ffff88002cd908a0 RCX: 0000000000000000
08:51:41:RDX: 5a5a5a5a5a5a5a5a RSI: 0000000000000003 RDI: ffff88002cd908a0
08:51:41:RBP: ffff88006201bdc0 R08: 0000000000000000 R09: 0000000000000000
08:51:41:R10: 0000000000000000 R11: 000000000000000f R12: 0000000000000282
08:51:41:R13: ffff88002cd908a8 R14: 0000000000000000 R15: 0000000000000000
08:51:41:FS: 0000000000000000(0000) GS:ffff880002300000(0000) knlGS:0000000000000000
08:51:41:CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
08:51:41:CR2: 00007fd119ac7000 CR3: 000000004aba3000 CR4: 00000000000006e0
08:51:41:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
08:51:41:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
08:51:41:Process qsd_reint_0.lus (pid: 4320, threadinfo ffff880062018000, task ffff88005cbc0ab0)
08:51:41:Stack:
08:51:41: ffff88006201bda0 0000000300000001 ffff88006201be00 ffff88002cd908a0
08:51:41:<d> 0000000000000282 0000000000000003 0000000000000001 0000000000000000
08:51:41:<d> ffff88006201be00 ffffffff8105e168 ffff88006201be00 ffff88002cd90800
08:51:41:Call Trace:
08:51:41: [<ffffffff8105e168>] __wake_up+0x48/0x70
08:51:41: [<ffffffffa0c0a1a3>] qsd_reint_main+0x73/0x1950 [lquota]
08:51:41: [<ffffffff81538dde>] ? thread_return+0x4e/0x7d0
08:51:41: [<ffffffff810672c2>] ? default_wake_function+0x12/0x20
08:51:41: [<ffffffffa0c0a130>] ? qsd_reint_main+0x0/0x1950 [lquota]
08:51:41: [<ffffffff810a0fce>] kthread+0x9e/0xc0
08:51:41: [<ffffffff8100c28a>] child_rip+0xa/0x20
08:51:41: [<ffffffff810a0f30>] ? kthread+0x0/0xc0
08:51:41: [<ffffffff8100c280>] ? child_rip+0x0/0x20
08:51:41:Code: 41 56 41 55 41 54 53 48 83 ec 18 0f 1f 44 00 00 89 75 cc 89 55 c8 4c 8d 6f 08 48 8b 57 08 41 89 cf 4d 89 c6 48 8d 42 e8 49 39 d5 <48> 8b 58 18 74 3f 48 83 eb 18 eb 0a 0f 1f 00 48 89 d8 48 8d 5a
08:51:41:RIP [<ffffffff81059911>] __wake_up_common+0x31/0x90
08:51:41: RSP <ffff88006201bd80>
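In the register dump above, RAX and RDX consist almost entirely of the byte 0x5a. Values built from one repeated byte often indicate memory that was overwritten with a fill pattern, for example when it was freed. That reading is an assumption, not something the log itself states. The Python sketch below flags such registers. The helper names `parse_registers` and `fill_byte` are illustrative, and the rule "top seven bytes identical, not 0x00 or 0xff" is an arbitrary heuristic chosen for this example.

```python
import re

# Excerpt of the register dump from the console log above.
OOPS = (
    "RAX: 5a5a5a5a5a5a5a42 RBX: ffff88002cd908a0 RCX: 0000000000000000 "
    "RDX: 5a5a5a5a5a5a5a5a RSI: 0000000000000003 RDI: ffff88002cd908a0"
)

def parse_registers(text):
    """Map register names to integer values from 'REG: <16 hex digits>' pairs."""
    pairs = re.findall(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}

def fill_byte(value):
    """Return the byte repeated across the top seven bytes of a 64-bit value, else None."""
    top = value.to_bytes(8, "big")[:7]
    if len(set(top)) == 1 and top[0] not in (0x00, 0xFF):
        return top[0]
    return None

regs = parse_registers(OOPS)
suspects = {name: hex(fill_byte(v)) for name, v in regs.items() if fill_byte(v) is not None}
print(suspects)  # {'RAX': '0x5a', 'RDX': '0x5a'}
```

Only RAX and RDX are flagged. The remaining registers in the excerpt hold zero, small integers, or ordinary kernel addresses.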
Logs are at https://testing.hpdd.intel.com/test_sets/2ed0029e-c202-11e5-92cf-5254006e85c2
This is the first time we've seen insanity fail in this way.