Details
Type: Bug
Resolution: Unresolved
Priority: Minor
Fix Version/s: None
Affects Version/s: Lustre 2.11.0
Components: None
Severity: 3
Description
This issue was created by maloo for sarah_lw <wei3.liu@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/2a35151a-2751-11e8-b74b-52540065bddc
test_276 failed with the following error:
Timeout occurred after 175 mins, last suite running was sanity, restarting cluster to continue tests
An LBUG was hit during interop testing between a 2.10.3 server and a master (tag 2.10.59) client, the same failure as LU-10650.
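The assertion that fires is the LASSERT(dev) check at dt_object.h:2509 in dt_statfs(): the filesfree stats handler reaches the OSD through a device pointer that is only valid while the target is fully set up. Below is a minimal userspace sketch of the failing path; the names (dt_device, dt_statfs, lut_bottom) mirror the stack trace for readability, but the code is an illustration, not Lustre source.

/* Minimal userspace model of the failing path. Names mirror the
 * stack trace below; this is an illustration, not Lustre code.
 * Build: cc -o lbug_model lbug_model.c */
#include <assert.h>
#include <stdio.h>

struct dt_device { const char *dd_name; };

/* Stands in for dt_statfs() from dt_object.h: the log line
 * "ASSERTION( dev ) failed" corresponds to this assert firing
 * when the caller hands in a NULL device. */
static int dt_statfs(struct dt_device *dev)
{
	assert(dev != NULL);	/* LASSERT(dev) -> LBUG -> kernel panic */
	printf("statfs on %s\n", dev->dd_name);
	return 0;
}

/* Stands in for the OST's bottom dt device, which is wired up only
 * once mount completes and is torn down again on umount. */
static struct dt_device *lut_bottom;	/* NULL during the race window */

int main(void)
{
	/* A proc read of obdfilter.*.filesfree arriving while the OST
	 * is between umount and mount lands here: */
	return dt_statfs(lut_bottom);	/* aborts, modeling the LBUG */
}

The console log shows exactly this sequence: the background loop polling obdfilter.*.filesfree keeps issuing proc reads while lustre-ost1 is unmounted and remounted, and the read that lands inside that window trips the assertion and panics the node.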
[ 6845.114788] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanity test 276: Race between mount and obd_statfs ================================================ 01:42:10 \(1520905330\)
[ 6845.289408] Lustre: DEBUG MARKER: == sanity test 276: Race between mount and obd_statfs ================================================ 01:42:10 (1520905330)
[ 6845.483951] Lustre: DEBUG MARKER: (while true; do /usr/sbin/lctl get_param obdfilter.*.filesfree > /dev/null 2>&1; done) & pid=$!; echo $pid > /tmp/sanity_276_pid
[ 6845.485916] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost1' ' /proc/mounts || true
[ 6845.798565] Lustre: DEBUG MARKER: umount -d /mnt/lustre-ost1
[ 6846.021507] Lustre: Failing over lustre-OST0000
[ 6846.065295] Lustre: server umount lustre-OST0000 complete
[ 6846.241344] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
[ 6846.241344] lctl dl | grep ' ST ' || true
[ 6846.683589] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-ost1
[ 6846.994289] Lustre: DEBUG MARKER: test -b /dev/lvm-Role_OSS/P1
[ 6847.287819] Lustre: DEBUG MARKER: e2label /dev/lvm-Role_OSS/P1
[ 6847.741015] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-ost1; mount -t lustre /dev/lvm-Role_OSS/P1 /mnt/lustre-ost1
[ 6847.763478] LustreError: 137-5: lustre-OST0000_UUID: not available for connect from 10.2.8.29@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 6848.086287] LDISKFS-fs (dm-0): file extents enabled, maximum tree depth=5
[ 6848.088755] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[ 6848.133350] LustreError: 23480:0:(dt_object.h:2509:dt_statfs()) ASSERTION( dev ) failed:
[ 6848.134405] LustreError: 23480:0:(dt_object.h:2509:dt_statfs()) LBUG
[ 6848.135065] Pid: 23480, comm: lctl
[ 6848.135417]
[ 6848.135417] Call Trace:
[ 6848.135842] [<ffffffffc05d27ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
[ 6848.136522] [<ffffffffc05d283c>] lbug_with_loc+0x4c/0xb0 [libcfs]
[ 6848.137206] [<ffffffffc0a9cde2>] tgt_statfs_internal+0x2f2/0x360 [ptlrpc]
[ 6848.137932] [<ffffffffc0d7d266>] ofd_statfs+0x66/0x470 [ofd]
[ 6848.138690] [<ffffffffc07d00c6>] lprocfs_filesfree_seq_show+0xf6/0x530 [obdclass]
[ 6848.139484] [<ffffffff811f4d72>] ? __mem_cgroup_commit_charge+0xe2/0x2f0
[ 6848.140176] [<ffffffff8119320e>] ? lru_cache_add+0xe/0x10
[ 6848.140744] [<ffffffff811be298>] ? page_add_new_anon_rmap+0xb8/0x170
[ 6848.141421] [<ffffffff811e23d5>] ? __kmalloc+0x55/0x230
[ 6848.141987] [<ffffffff81227eb7>] ? seq_buf_alloc+0x17/0x40
[ 6848.142581] [<ffffffffc0d91142>] ofd_filesfree_seq_show+0x12/0x20 [ofd]
[ 6848.143274] [<ffffffff812283ba>] seq_read+0x10a/0x3b0
[ 6848.143810] [<ffffffff8127248d>] proc_reg_read+0x3d/0x80
[ 6848.144372] [<ffffffff8120295c>] vfs_read+0x9c/0x170
[ 6848.144902] [<ffffffff8120381f>] SyS_read+0x7f/0xe0
[ 6848.145407] [<ffffffff816b8929>] ? system_call_after_swapgs+0x156/0x214
[ 6848.146096] [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[ 6848.146717] [<ffffffff816b889d>] ? system_call_after_swapgs+0xca/0x214
[ 6848.147392]
[ 6848.147570] Kernel panic - not syncing: LBUG
[ 6848.148014] CPU: 0 PID: 23480 Comm: lctl Tainted: G OE ------------ 3.10.0-693.11.6.el7_lustre.x86_64 #1
[ 6848.149072] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 6848.149645] Call Trace:
[ 6848.149913] [<ffffffff816a5e7d>] dump_stack+0x19/0x1b
[ 6848.150436] [<ffffffff8169fd64>] panic+0xe8/0x20d
[ 6848.150935] [<ffffffffc05d2854>] lbug_with_loc+0x64/0xb0 [libcfs]
[ 6848.151590] [<ffffffffc0a9cde2>] tgt_statfs_internal+0x2f2/0x360 [ptlrpc]
[ 6848.152299] [<ffffffffc0d7d266>] ofd_statfs+0x66/0x470 [ofd]
[ 6848.152901] [<ffffffffc07d00c6>] lprocfs_filesfree_seq_show+0xf6/0x530 [obdclass]
[ 6848.153663] [<ffffffff811f4d72>] ? __mem_cgroup_commit_charge+0xe2/0x2f0
[ 6848.154348] [<ffffffff8119320e>] ? lru_cache_add+0xe/0x10
[ 6848.154912] [<ffffffff811be298>] ? page_add_new_anon_rmap+0xb8/0x170
[ 6848.155559] [<ffffffff811e23d5>] ? __kmalloc+0x55/0x230
[ 6848.156104] [<ffffffff81227eb7>] ? seq_buf_alloc+0x17/0x40
[ 6848.156685] [<ffffffffc0d91142>] ofd_filesfree_seq_show+0x12/0x20 [ofd]
[ 6848.157357] [<ffffffff812283ba>] seq_read+0x10a/0x3b0
[ 6848.157891] [<ffffffff8127248d>] proc_reg_read+0x3d/0x80
[ 6848.158436] [<ffffffff8120295c>] vfs_read+0x9c/0x170
[ 6848.158955] [<ffffffff8120381f>] SyS_read+0x7f/0xe0
[ 6848.159458] [<ffffffff816b8929>] ? system_call_after_swapgs+0x156/0x214
[ 6848.160130] [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[ 6848.160745] [<ffffffff816b889d>] ? system_call_after_swapgs+0xca/0x214
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
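One plausible mitigation, sketched here purely for illustration (this is not the actual LU-10650 patch, and filesfree_show() is a hypothetical stand-in for the lprocfs handler): return -ENODEV while the device is unset instead of asserting. A real kernel fix would also need proper synchronization on the device pointer, e.g. a lock or reference count, rather than the bare NULL check shown.

/* Illustrative only: a graceful-failure variant of the same model.
 * The proc reader gets an error during the mount window instead of
 * panicking the server. Real code would need locking around
 * lut_bottom; this sketch ignores concurrency for brevity. */
#include <errno.h>
#include <stdio.h>

struct dt_device { const char *dd_name; };
static struct dt_device *lut_bottom;	/* NULL until mount completes */

static int filesfree_show(void)
{
	struct dt_device *dev = lut_bottom;

	if (dev == NULL)
		return -ENODEV;	/* reader sees an error, not an LBUG */
	printf("filesfree on %s\n", dev->dd_name);
	return 0;
}

int main(void)
{
	int rc = filesfree_show();

	if (rc < 0)
		fprintf(stderr, "filesfree: error %d\n", rc);
	return rc < 0 ? 1 : 0;
}

The trade-off is that a polling client briefly sees a read error during failover instead of blocking, which matches how other not-yet-configured targets already answer (compare the "not available for connect ... no target" message in the log above).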
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity test_276 - Timeout occurred after 175 mins, last suite running was sanity, restarting cluster to continue tests