[LU-12260] conf-sanity test 57a times out in client unmount Created: 01/May/19  Updated: 01/May/19

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.0, Lustre 2.13.0, Lustre 2.12.1
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: James Nunez (Inactive) Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: ppc
Environment: PPC clients

Severity: 3

Description

conf-sanity test_57a times out during client unmount. These timeouts are seen only on PPC clients.

In the suite_log for a recent failure, https://testing.whamcloud.com/test_sets/b1449c50-668f-11e9-8bb1-52540065bddc, mounting an OST fails, and the client then tries to unmount and hangs:

CMD: trevis-55vm11 e2label /dev/mapper/ost1_flakey
Starting ost1:   /dev/mapper/ost1_flakey /mnt/lustre-ost1
CMD: trevis-55vm11 mkdir -p /mnt/lustre-ost1; mount -t lustre   /dev/mapper/ost1_flakey /mnt/lustre-ost1
trevis-55vm11: mount.lustre: mount /dev/mapper/ost1_flakey at /mnt/lustre-ost1 failed: Cannot assign requested address
Start of /dev/mapper/ost1_flakey on ost1 failed 99
Stopping clients: trevis-77vm1.trevis.whamcloud.com,trevis-77vm2 /mnt/lustre (opts:-f)
CMD: trevis-77vm1.trevis.whamcloud.com,trevis-77vm2 running=\$(grep -c /mnt/lustre' ' /proc/mounts);
if [ \$running -ne 0 ] ; then
echo Stopping client \$(hostname) /mnt/lustre opts:-f;
lsof /mnt/lustre || need_kill=no;
if [ x-f != x -a x\$need_kill != xno ]; then
    pids=\$(lsof -t /mnt/lustre | sort -u);
    if [ -n \"\$pids\" ]; then
             kill -9 \$pids;
    fi
fi;
while umount -f /mnt/lustre 2>&1 | grep -q busy; do
    echo /mnt/lustre is still busy, wait one second && sleep 1;
done;
fi
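
The loop above retries `umount -f` only while it reports `busy`; if `umount` itself blocks, the loop never returns and nothing is printed. When reproducing by hand, a bounded variant along the following lines may make the hang easier to spot. This is only a sketch, not the test framework's code: the function name `stop_client_bounded`, the `UMOUNT` override variable, and the 60-second default are hypothetical choices made for illustration.

```bash
# Run "umount -f" in the background and give up waiting after a deadline.
# UMOUNT (hypothetical) lets a stand-in command replace umount for dry runs.
stop_client_bounded() {
    local mnt=$1
    local max_wait=${2:-60}   # seconds to wait before reporting a hang
    local waited=0
    local pid

    ${UMOUNT:-umount} -f "$mnt" &
    pid=$!

    # Poll once per second until the umount process exits or time runs out.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$max_wait" ]; then
            echo "umount of $mnt still running after ${max_wait}s (pid $pid)" >&2
            return 1          # the umount process is left running for inspection
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # Propagate the exit status of umount itself.
    wait "$pid"
}
```

For example, `stop_client_bounded /mnt/lustre 120` would wait up to two minutes and return non-zero if the unmount has not finished, leaving the stuck process in place so its stack can be examined.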

The OSS console log shows the OST failing to register with the MGS with rc = -99, the same error code that mount.lustre reports:

[15207.625829] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-ost1; mount -t lustre   /dev/mapper/ost1_flakey /mnt/lustre-ost1
[15207.947844] LDISKFS-fs (dm-10): file extents enabled, maximum tree depth=5
[15207.950074] LDISKFS-fs (dm-10): mounted filesystem with ordered data mode. Opts: errors=remount-ro
[15208.057801] LDISKFS-fs (dm-10): file extents enabled, maximum tree depth=5
[15208.059983] LDISKFS-fs (dm-10): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc
[15208.161527] LustreError: 15f-b: lustre-OST2710: cannot register this server with the MGS: rc = -99. Is the MGS running?
[15208.175587] LustreError: 21832:0:(obd_mount_server.c:1939:server_fill_super()) Unable to start targets: -99
[15208.177395] LustreError: 21832:0:(obd_mount_server.c:1589:server_put_super()) no obd lustre-OST2710
[15208.178929] LustreError: 21832:0:(obd_mount_server.c:132:server_deregister_mount()) lustre-OST2710 not registered
[15208.245160] LustreError: 21832:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-99)
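
The excerpt above reports the return code in three different layouts (`rc = -99`, `: -99`, and `(-99)`). A small filter such as the one below can confirm that every LustreError line in a saved console log agrees on the code. It is a sketch that assumes only those three layouts; the function name `lustre_error_codes` is hypothetical.

```bash
# Print the distinct negative return codes found on LustreError lines (stdin).
# Only the three layouts seen in the excerpt above are recognized.
lustre_error_codes() {
    grep 'LustreError' |
        grep -oE '(rc = |: |\()-[0-9]+' |   # keep "<prefix>-NN" fragments
        grep -oE -- '-[0-9]+$' |            # strip the prefix, keep "-NN"
        sort -u
}
```

Run as `lustre_error_codes < oss-console.log` (the file name is a placeholder). For the five lines quoted above it prints a single value, `-99`.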
 

Logs for other failures are at:
https://testing.whamcloud.com/test_sets/19acf4bc-2555-11e9-830a-52540065bddc
https://testing.whamcloud.com/test_sets/aa4bfab6-b6e4-11e8-8c12-52540065bddc
https://testing.whamcloud.com/test_sets/392a269c-0102-11e9-b970-52540065bddc

