Details
-
Bug
-
Resolution: Unresolved
-
Minor
-
None
-
Lustre 2.11.0, Lustre 2.12.0, Lustre 2.10.3, Lustre 2.10.4, Lustre 2.10.5, Lustre 2.13.0, Lustre 2.10.6, Lustre 2.10.7, Lustre 2.12.1, Lustre 2.12.3, Lustre 2.14.0, Lustre 2.12.5, Lustre 2.12.6, Lustre 2.15.0
-
None
-
3
-
9223372036854775807
Description
parallel-scale-nfsv4 hangs on unmount after all tests have run. In the suite_log, the last thing we see is
== parallel-scale-nfsv4 test complete, duration 2088 sec ============================================= 22:07:24 (1521868044)
Unmounting NFS clients...
CMD: trevis-8vm1,trevis-8vm2 umount -f /mnt/lustre
Unexporting Lustre filesystem...
CMD: trevis-8vm1,trevis-8vm2 chkconfig --list rpcidmapd 2>/dev/null | grep -q rpcidmapd && service rpcidmapd stop || true
CMD: trevis-8vm4 { [[ -e /etc/SuSE-release ]] && service nfsserver stop; } || service nfs stop
CMD: trevis-8vm4 sed -i '/^lustre/d' /etc/exports
CMD: trevis-8vm4 exportfs -v
CMD: trevis-8vm4 grep -c /mnt/lustre' ' /proc/mounts
Stopping client trevis-8vm4 /mnt/lustre (opts:-f)
CMD: trevis-8vm4 lsof -t /mnt/lustre
CMD: trevis-8vm4 umount -f /mnt/lustre 2>&1
Looking at the console logs for vm4, MDS1 and 3, we see
[ 2216.385890] Lustre: DEBUG MARKER: == parallel-scale-nfsv4 test complete, duration 2088 sec ============================================= 22:07:24 (1521868044)
[ 2216.698201] Lustre: DEBUG MARKER: { [[ -e /etc/SuSE-release ]] &&
[ 2216.698201] service nfsserver stop; } ||
[ 2216.698201] service nfs stop
[ 2216.805093] nfsd: last server has exited, flushing export cache
[ 2216.819487] Lustre: DEBUG MARKER: sed -i '/^lustre/d' /etc/exports
[ 2216.885266] Lustre: DEBUG MARKER: exportfs -v
[ 2216.945098] Lustre: DEBUG MARKER: grep -c /mnt/lustre' ' /proc/mounts
[ 2216.982526] Lustre: DEBUG MARKER: lsof -t /mnt/lustre
[ 2217.170422] Lustre: DEBUG MARKER: umount -f /mnt/lustre 2>&1
[ 2217.192827] Lustre: setting import lustre-MDT0000_UUID INACTIVE by administrator request
[ 2217.193476] LustreError: 410:0:(file.c:205:ll_close_inode_openhandle()) lustre-clilmv-ffff880060b4e800: inode [0x200000406:0x3c1b:0x0] mdc close failed: rc = -108
[ 2217.218709] Lustre: 4066:0:(llite_lib.c:2676:ll_dirty_page_discard_warn()) lustre: dirty page discard: 10.9.4.84@tcp:/lustre/fid: [0x200000406:0x3e42:0x0]/ may get corrupted (rc -108)
[ 2217.218732] Lustre: 4066:0:(llite_lib.c:2676:ll_dirty_page_discard_warn()) lustre: dirty page discard: 10.9.4.84@tcp:/lustre/fid: [0x200000406:0x3e7b:0x0]/ may get corrupted (rc -108)
…
[ 5541.474664]
[ 5541.474667] umount D 0000000000000000 0 410 409 0x00000000
[ 5541.474669] ffff88004365fda8 ffff88004365fde0 ffff880048e5ce00 ffff880043660000
[ 5541.474670] ffff88004365fde0 000000010013feb9 ffff88007fc10840 0000000000000000
[ 5541.474671] ffff88004365fdc0 ffffffff81612a95 ffff88007fc10840 ffff88004365fe68
[ 5541.474672] Call Trace:
[ 5541.474674] [<ffffffff81612a95>] schedule+0x35/0x80
[ 5541.474677] [<ffffffff81615851>] schedule_timeout+0x161/0x2d0
[ 5541.474689] [<ffffffffa1457cc7>] ll_kill_super+0x77/0x150 [lustre]
[ 5541.474723] [<ffffffffa09f3a94>] lustre_kill_super+0x34/0x40 [obdclass]
[ 5541.474734] [<ffffffff8120cf5f>] deactivate_locked_super+0x3f/0x70
[ 5541.474742] [<ffffffff812283fb>] cleanup_mnt+0x3b/0x80
[ 5541.474745] [<ffffffff8109d198>] task_work_run+0x78/0x90
[ 5541.474748] [<ffffffff8107b5cf>] exit_to_usermode_loop+0x91/0xc2
[ 5541.474760] [<ffffffff81003ae5>] syscall_return_slowpath+0x85/0xa0
[ 5541.474768] [<ffffffff81616ca7>] int_ret_from_sys_call+0x25/0x9f
[ 5541.476903] DWARF2 unwinder stuck at int_ret_from_sys_call+0x25/0x9f
[ 5541.476904]
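When triaging other runs for the same hang, it may help to scan the client console log for tasks dumped in D state with ll_kill_super in their call trace. The following is only a rough sketch, not part of the test framework: the function names find_blocked_tasks and hung_in are hypothetical, and the regular expressions assume the task-dump layout seen in the excerpt above (a "comm D ..." header followed by "[<address>] function+0x..." frames). SAMPLE is a shortened copy of that excerpt.

```python
import re

# Kernel console timestamp such as "[ 5541.474667]".
TIMESTAMP = re.compile(r"\[\s*\d+\.\d+\]")
# Task header as in the excerpt: "umount D 0000000000000000 0 410 409 0x00000000".
TASK_HEADER = re.compile(
    r"^(?P<comm>\S+)\s+D\s+[0-9a-f]{16}\s+\d+\s+(?P<pid>\d+)\s+\d+\s+0x[0-9a-f]+"
)
# Stack frame as in the excerpt: "[<ffffffffa1457cc7>] ll_kill_super+0x77/0x150 [lustre]".
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(?P<func>[\w.]+)\+0x")

# Shortened copy of the console excerpt from this ticket.
SAMPLE = """
[ 5541.474667] umount D 0000000000000000 0 410 409 0x00000000
[ 5541.474672] Call Trace:
[ 5541.474674] [<ffffffff81612a95>] schedule+0x35/0x80
[ 5541.474677] [<ffffffff81615851>] schedule_timeout+0x161/0x2d0
[ 5541.474689] [<ffffffffa1457cc7>] ll_kill_super+0x77/0x150 [lustre]
[ 5541.474723] [<ffffffffa09f3a94>] lustre_kill_super+0x34/0x40 [obdclass]
"""


def find_blocked_tasks(console_text):
    """Return a list of (comm, pid, [frame functions]) for tasks dumped in D state."""
    tasks = []
    current = None
    # Splitting on timestamps works whether or not the log kept its line breaks.
    for entry in TIMESTAMP.split(console_text):
        entry = entry.strip()
        header = TASK_HEADER.match(entry)
        if header:
            current = (header["comm"], int(header["pid"]), [])
            tasks.append(current)
            continue
        frame = FRAME.search(entry)
        if frame and current is not None:
            current[2].append(frame["func"])
    return tasks


def hung_in(tasks, func):
    """Return (comm, pid) for every blocked task whose trace contains func."""
    return [(comm, pid) for comm, pid, frames in tasks if func in frames]


if __name__ == "__main__":
    for comm, pid in hung_in(find_blocked_tasks(SAMPLE), "ll_kill_super"):
        print(f"{comm} (pid {pid}) is blocked in ll_kill_super")
```

Run against the excerpt, this reports umount (pid 410), matching the trace above. Frames are attached to the most recent task header, so a log that interleaves dumps from several CPUs could need a more careful parser.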
We see this unmount problem on the master and b2_10 branches, for SLES12 SP2 and SP3 testing only.
Logs for the test suite failures are at
https://testing.whamcloud.com/test_sets/4bce5a66-2f2f-11e8-9e0e-52540065bddc
https://testing.whamcloud.com/test_sets/103f280e-2fac-11e8-b3c6-52540065bddc
https://testing.whamcloud.com/test_sets/044a75f0-2eba-11e8-b6a0-52540065bddc
Issue Links
- is related to
  - LU-17154 parallel-scale-nfsv4: hangs on umount after racer_on_nfs (Open)
  - LU-10566 parallel-scale-nfsv4 test_metabench: mkdir: cannot create directory on Read-only file system (Reopened)