Details
- Bug
- Resolution: Fixed
- Blocker
- Lustre 2.8.0
- None
- client and server: lustre-master build # 3142 RHEL6.6 DNE
- 3
- 9223372036854775807
Description
This issue was created by maloo for sarah_lw <wei3.liu@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/72f11210-46d3-11e5-90a5-5254006e85c2.
The sub-test test_3a failed with the following error:
test failed to respond and timed out
OST console log:
12:55:26:Lustre: DEBUG MARKER: == obdfilter-survey test 3a: Network survey == 05:48:19 (1439988499)
12:55:28:LustreError: 11-0: lustre-MDT0000-lwp-OST0000: operation obd_ping to node 10.2.4.221@tcp failed: rc = -107
12:55:30:LustreError: Skipped 7 previous similar messages
12:55:31:Lustre: lustre-MDT0000-lwp-OST0000: Connection to lustre-MDT0000 (at 10.2.4.221@tcp) was lost; in progress operations using this service will wait for recovery to complete
12:55:31:Lustre: Skipped 7 previous similar messages
12:55:32:Lustre: 6155:0:(client.c:2014:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1439988511/real 1439988511] req@ffff880014660980 x1509869039556728/t0(0) o400->MGC10.2.4.221@tcp@10.2.4.221@tcp:26/25 lens 224/224 e 0 to 1 dl 1439988518 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
12:55:32:Lustre: 6155:0:(client.c:2014:ptlrpc_expire_one_request()) Skipped 10 previous similar messages
12:55:32:LustreError: 166-1: MGC10.2.4.221@tcp: Connection to MGS (at 10.2.4.221@tcp) was lost; in progress operations using this service will fail
12:55:32:Lustre: DEBUG MARKER: grep -c /mnt/ost1' ' /proc/mounts
12:55:34:Lustre: DEBUG MARKER: umount -d -f /mnt/ost1
12:55:34:Lustre: server umount lustre-OST0000 complete
12:55:34:Lustre: Skipped 1 previous similar message
12:55:34:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
12:55:34:Lustre: DEBUG MARKER: grep -c /mnt/ost2' ' /proc/mounts
12:55:34:Lustre: DEBUG MARKER: umount -d -f /mnt/ost2
12:55:34:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
12:55:35:Lustre: DEBUG MARKER: grep -c /mnt/ost3' ' /proc/mounts
12:55:35:Lustre: DEBUG MARKER: umount -d -f /mnt/ost3
12:55:35:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
12:55:35:Lustre: DEBUG MARKER: grep -c /mnt/ost4' ' /proc/mounts
12:55:35:Lustre: DEBUG MARKER: umount -d -f /mnt/ost4
12:55:35:Lustre: server umount lustre-OST0003 complete
12:55:35:Lustre: Skipped 2 previous similar messages
12:55:35:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
12:55:36:Lustre: DEBUG MARKER: grep -c /mnt/ost5' ' /proc/mounts
12:55:36:Lustre: DEBUG MARKER: umount -d -f /mnt/ost5
12:55:36:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
12:55:36:Lustre: DEBUG MARKER: grep -c /mnt/ost6' ' /proc/mounts
12:55:36:Lustre: DEBUG MARKER: umount -d -f /mnt/ost6
12:55:36:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
12:55:37:Lustre: DEBUG MARKER: grep -c /mnt/ost7' ' /proc/mounts
12:55:37:Lustre: DEBUG MARKER: umount -d -f /mnt/ost7
12:55:37:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
12:55:37:Lustre: DEBUG MARKER: grep -c /mnt/ost8' ' /proc/mounts
12:55:37:Lustre: DEBUG MARKER: umount -d -f /mnt/ost8
12:55:37:LustreError: 8532:0:(lu_object.c:1224:lu_device_fini()) ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 3
12:55:37:LustreError: 8532:0:(lu_object.c:1224:lu_device_fini()) LBUG
12:55:37:Pid: 8532, comm: umount
12:55:38:
12:55:38:Call Trace:
12:55:38: [<ffffffffa049b875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
12:55:38: [<ffffffffa049be77>] lbug_with_loc+0x47/0xb0 [libcfs]
12:55:38: [<ffffffffa05f229b>] lu_device_fini+0xbb/0xc0 [obdclass]
12:55:38: [<ffffffffa05d328d>] ls_device_put+0x7d/0x2e0 [obdclass]
12:55:39: [<ffffffffa05d3662>] local_oid_storage_fini+0x172/0x410 [obdclass]
12:55:40: [<ffffffffa0dc476f>] lfsck_instance_cleanup+0x20f/0x7e0 [lfsck]
12:55:40: [<ffffffffa0dc6f7b>] lfsck_degister+0x4b/0x60 [lfsck]
12:55:40: [<ffffffffa0e8f597>] ofd_device_fini+0x87/0x250 [ofd]
12:55:40: [<ffffffffa05e1802>] class_cleanup+0x572/0xd30 [obdclass]
12:55:40: [<ffffffffa05c1776>] ? class_name2dev+0x56/0xe0 [obdclass]
12:55:41: [<ffffffffa05e3e56>] class_process_config+0x1e96/0x2800 [obdclass]
12:55:41: [<ffffffffa04a7c01>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
12:55:41: [<ffffffff8117523c>] ? __kmalloc+0x21c/0x230
12:55:41: [<ffffffffa05e4c7f>] class_manual_cleanup+0x4bf/0x8e0 [obdclass]
12:55:41: [<ffffffffa05c1776>] ? class_name2dev+0x56/0xe0 [obdclass]
12:55:41: [<ffffffffa061e102>] server_put_super+0x9e2/0xeb0 [obdclass]
12:55:41: [<ffffffff811ac776>] ? invalidate_inodes+0xf6/0x190
12:55:41: [<ffffffff81190b7b>] generic_shutdown_super+0x5b/0xe0
12:55:41: [<ffffffff81190c66>] kill_anon_super+0x16/0x60
12:55:41: [<ffffffffa05e7b36>] lustre_kill_super+0x36/0x60 [obdclass]
12:55:42: [<ffffffff81191407>] deactivate_super+0x57/0x80
12:55:42: [<ffffffff811b10df>] mntput_no_expire+0xbf/0x110
12:55:42: [<ffffffff811b1c2b>] sys_umount+0x7b/0x3a0
12:55:42: [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
12:55:42:
12:55:42:Kernel panic - not syncing: LBUG
12:55:42:Pid: 8532, comm: umount Not tainted 2.6.32-504.30.3.el6_lustre.x86_64 #1
12:55:42:Call Trace:
12:55:43: [<ffffffff81529c9c>] ? panic+0xa7/0x16f
12:55:43: [<ffffffffa049becb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
12:55:43: [<ffffffffa05f229b>] ? lu_device_fini+0xbb/0xc0 [obdclass]
12:55:43: [<ffffffffa05d328d>] ? ls_device_put+0x7d/0x2e0 [obdclass]
12:55:43: [<ffffffffa05d3662>] ? local_oid_storage_fini+0x172/0x410 [obdclass]
12:55:43: [<ffffffffa0dc476f>] ? lfsck_instance_cleanup+0x20f/0x7e0 [lfsck]
12:55:43: [<ffffffffa0dc6f7b>] ? lfsck_degister+0x4b/0x60 [lfsck]
12:55:43: [<ffffffffa0e8f597>] ? ofd_device_fini+0x87/0x250 [ofd]
12:55:43: [<ffffffffa05e1802>] ? class_cleanup+0x572/0xd30 [obdclass]
12:55:43: [<ffffffffa05c1776>] ? class_name2dev+0x56/0xe0 [obdclass]
12:55:45: [<ffffffffa05e3e56>] ? class_process_config+0x1e96/0x2800 [obdclass]
12:55:45: [<ffffffffa04a7c01>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
12:55:45: [<ffffffff8117523c>] ? __kmalloc+0x21c/0x230
12:55:46: [<ffffffffa05e4c7f>] ? class_manual_cleanup+0x4bf/0x8e0 [obdclass]
12:55:46: [<ffffffffa05c1776>] ? class_name2dev+0x56/0xe0 [obdclass]
12:55:46: [<ffffffffa061e102>] ? server_put_super+0x9e2/0xeb0 [obdclass]
12:55:46: [<ffffffff811ac776>] ? invalidate_inodes+0xf6/0x190
12:55:46: [<ffffffff81190b7b>] ? generic_shutdown_super+0x5b/0xe0
12:55:46: [<ffffffff81190c66>] ? kill_anon_super+0x16/0x60
12:55:47: [<ffffffffa05e7b36>] ? lustre_kill_super+0x36/0x60 [obdclass]
12:55:47: [<ffffffff81191407>] ? deactivate_super+0x57/0x80
12:55:47: [<ffffffff811b10df>] ? mntput_no_expire+0xbf/0x110
12:55:48: [<ffffffff811b1c2b>] ? sys_umount+0x7b/0x3a0
12:55:49: [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
12:55:50:Initializing cgroup subsys cpuset
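For triage, the failed assertion and the call path leading to it can be pulled out of a console log like the one above with a small script. The following is a minimal sketch, not part of the Lustre test framework: the function name `summarize_lbug` and both regular expressions are hypothetical, written only against the message shapes visible in this log, and it assumes the log has been split into one console message per line.

```python
import re

# Hypothetical pattern for a failed-assertion line as it appears in this log, e.g.
# "LustreError: 8532:0:(lu_object.c:1224:lu_device_fini()) ASSERTION( ... ) failed: Refcount is 3"
ASSERT_RE = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"ASSERTION\( (?P<expr>.*?) \) failed: (?P<msg>.*)"
)

# Hypothetical pattern for a stack frame, e.g.
# "[<ffffffffa05f229b>] lu_device_fini+0xbb/0xc0 [obdclass]" (the "? " marker is optional)
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?:\? )?(?P<sym>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_lbug(log_lines):
    """Return (assertion fields, frame symbols) for the first failed assertion.

    Frames are collected from the first call trace only; scanning stops at the
    "Kernel panic" line because the panic trace repeats the same call path.
    Returns None when no failed assertion is found.
    """
    assertion = None
    frames = []
    for line in log_lines:
        if assertion is None:
            match = ASSERT_RE.search(line)
            if match:
                assertion = match.groupdict()
            continue
        if "Kernel panic" in line:
            break
        frame = FRAME_RE.search(line)
        if frame:
            frames.append(frame.group("sym"))
    if assertion is None:
        return None
    return assertion, frames


if __name__ == "__main__":
    import sys

    # Usage (assumed invocation): python summarize_lbug.py < ost-console.log
    result = summarize_lbug(sys.stdin)
    if result is None:
        print("no failed assertion found")
    else:
        info, stack = result
        print(f"{info['file']}:{info['line']} {info['func']}(): {info['expr']} ({info['msg']})")
        print(" <- ".join(stack))
```

Applied to this log, such a summary would make it easier to compare the failure against the linked tickets, since the asserting function and the unmount call path are what they have in common.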
Issue Links
- is related to
  - LU-7221 replay-ost-single test_3: ASSERTION( __v > 0 && __v < ((int)0x5a5a5a5a5a5a5a5a) ) failed: value: 0 (Resolved)
  - LU-6365 Eliminate unnecessary loop in lu_cache_shrink to improve performance (Resolved)
  - LU-8412 Intel CAS testing umount triggers lu_object.c:1224:lu_device_fini()) ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 3 (Resolved)
  - LU-7326 ost-pools hangs on OST unmount (Resolved)