[LU-10102] conf-sanity test_105: test failed to respond and timed out
Created: 06/Oct/17  Updated: 29/Jul/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: James Casper
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: None
Environment:

trevis, full
servers: el7.4, ldiskfs, branch master, v2.10.53.1, b3642
clients: sles12.2, branch master, v2.10.53.1, b3642


Severity: 3
Rank (Obsolete): 9223372036854775807

Description

https://testing.hpdd.intel.com/test_sessions/bce14d16-91a3-441a-b3ec-8a1817b00025

Nothing relevant was found in any of the console or dmesg logs. It appears that the
step that stops the clients never finishes. This might be SLES-related.

From suite_log:

umount lustre on /mnt/lustre.....
CMD: trevis-3vm1 grep -c /mnt/lustre' ' /proc/mounts
Stopping client trevis-3vm1 /mnt/lustre (opts:)
CMD: trevis-3vm1 lsof -t /mnt/lustre
CMD: trevis-3vm1 umount  /mnt/lustre 2>&1
stop ost1 service on trevis-3vm3
CMD: trevis-3vm3 grep -c /mnt/lustre-ost1' ' /proc/mounts
Stopping /mnt/lustre-ost1 (opts:-f) on trevis-3vm3
CMD: trevis-3vm3 umount -f /mnt/lustre-ost1
CMD: trevis-3vm3 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
stop mds service on trevis-3vm4
CMD: trevis-3vm4 grep -c /mnt/lustre-mds1' ' /proc/mounts
Stopping /mnt/lustre-mds1 (opts:-f) on trevis-3vm4
CMD: trevis-3vm4 umount -f /mnt/lustre-mds1
CMD: trevis-3vm4 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
CMD: trevis-3vm1 lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
modules unloaded.
Stopping clients: trevis-3vm1,trevis-3vm2 /mnt/lustre (opts:)
CMD: trevis-3vm1,trevis-3vm2 running=\$(grep -c /mnt/lustre' ' /proc/mounts);
if [ \$running -ne 0 ] ; then
echo Stopping client \$(hostname) /mnt/lustre opts:;
lsof /mnt/lustre || need_kill=no;
if [ x != x -a x\$need_kill != xno ]; then
    pids=\$(lsof -t /mnt/lustre | sort -u);
    if [ -n \"\$pids\" ]; then
             kill -9 \$pids;
    fi
fi;
while umount  /mnt/lustre 2>&1 | grep -q busy; do
    echo /mnt/lustre is still busy, wait one second && sleep 1;
done;
fi
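
The last command in the log retries umount for as long as its output contains "busy", with no upper bound on attempts. If the mount point stayed busy on a client, that loop alone would keep the test from returning, which would fit the timeout, though nothing in the logs confirms it. Below is a minimal sketch of a bounded variant of that loop. The function name umount_bounded, its max_tries and delay parameters, and the UMOUNT_CMD override are all hypothetical and are not part of the Lustre test framework.

```shell
#!/bin/bash
# Hypothetical helper (not from test-framework.sh): retry an unmount while
# it reports "busy", but give up after a fixed number of attempts instead
# of looping forever.
#
# UMOUNT_CMD is an assumption added so the sketch can be exercised without
# a real mount; it defaults to the real umount.
umount_bounded() {
    local mnt=$1
    local max_tries=${2:-60}   # assumed default: 60 attempts
    local delay=${3:-1}        # assumed default: 1 second between attempts
    local tries=0

    # Like the logged loop, this only checks for the word "busy" in the
    # output; any other umount result (success or a different error)
    # ends the loop.
    while ${UMOUNT_CMD:-umount} "$mnt" 2>&1 | grep -q busy; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            echo "$mnt still busy after $tries attempts, giving up" >&2
            return 1
        fi
        echo "$mnt is still busy, wait $delay second(s)"
        sleep "$delay"
    done
    return 0
}
```

Called as `umount_bounded /mnt/lustre 120 1`, this would fail with a message after roughly two minutes rather than hang, which would at least leave a trace in suite_log for a case like this one.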
