[LU-2862] 2.1.4<->2.4.0 interop: replay-dual test_11: rm: cannot remove `/mnt/lustre/f11-[1-5]': No such file or directory Created: 25/Feb/13 Updated: 03/Jun/16 Resolved: 03/Jun/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0, Lustre 2.1.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Jian Yu | Assignee: | WC Triage |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | mq213, yuc2 | ||
| Environment: |
Lustre b2_1 client build: http://build.whamcloud.com/job/lustre-b2_1/176 |
||
| Severity: | 3 |
| Rank (Obsolete): | 6928 |
| Description |
|
replay-dual test_11 failed as follows:

Starting mds1: -o loop,user_xattr,acl /dev/lvm-MDS/P1 /mnt/mds1
CMD: client-19vm3 mkdir -p /mnt/mds1; mount -t lustre -o loop,user_xattr,acl /dev/lvm-MDS/P1 /mnt/mds1
CMD: client-19vm3 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/openmpi/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin: NAME=autotest_config sh rpc.sh set_default_debug \"-1\" \" 0xffb7e3ff\" 2
CMD: client-19vm3 e2label /dev/lvm-MDS/P1
Started lustre-MDT0000
CMD: client-19vm3 lctl set_param fail_loc=0
fail_loc=0
rm: cannot remove `/mnt/lustre/f11-[1-5]': No such file or directory
replay-dual test_11: @@@@@@ FAIL: test_11 failed with 1

Console log on the client node client-19vm1 showed that:

08:16:21:Lustre: lustre-MDT0000-mdc-ffff880033305000: Connection restored to lustre-MDT0000 (at 10.10.4.222@tcp)
08:16:21:Lustre: Skipped 12 previous similar messages
08:16:21:Lustre: 8555:0:(client.c:1817:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1361549717/real 1361549717] req@ffff88004306b800 x1427685198042724/t51539607554(51539607554) o36->lustre-MDT0000-mdc-ffff880075efa000@10.10.4.222@tcp:12/10 lens 488/416 e 1 to 1 dl 1361549778 ref 2 fl Rpc:X/4/ffffffff rc 0/-1
08:16:21:Lustre: 8555:0:(client.c:1817:ptlrpc_expire_one_request()) Skipped 18 previous similar messages
08:16:21:LustreError: 8555:0:(client.c:2576:ptlrpc_replay_interpret()) request replay timed out, restarting recovery
08:16:21:LustreError: 167-0: This client was evicted by lustre-MDT0000; in progress operations using this service will fail.
08:16:22:LustreError: 28715:0:(mdc_locks.c:736:mdc_enqueue()) ldlm_cli_enqueue: -4
08:16:22:LustreError: 28715:0:(dir.c:423:ll_get_dir_page()) lock enqueue: [0x200000007:0x1:0x0] at 0: rc -4
08:16:22:LustreError: 28715:0:(dir.c:648:ll_readdir()) error reading dir [0x200000007:0x1:0x0] at 0: rc -4
08:16:22:Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-dual test_11: @@@@@@ FAIL: test_11 failed with 1

Console log on the MDS client-19vm3 showed that:

08:15:15:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre -o loop,user_xattr,acl /dev/lvm-MDS/P1 /mnt/mds1
08:15:15:LDISKFS-fs (loop0): recovery complete
08:15:15:LDISKFS-fs (loop0): mounted filesystem with ordered data mode. quota=on. Opts:
08:15:16:Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
08:15:16:LNet: 11113:0:(debug.c:324:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
08:15:16:LNet: 11113:0:(debug.c:324:libcfs_debug_str2mask()) Skipped 2 previous similar messages
08:15:16:Lustre: DEBUG MARKER: e2label /dev/lvm-MDS/P1
08:15:47:Lustre: *** cfs_fail_loc=119, val=2147483648***
08:15:47:LustreError: 10976:0:(ldlm_lib.c:2422:target_send_reply_msg()) @@@ dropping reply req@ffff880053917850 x1427685198042724/t51539607554(51539607554) o36->2c21c195-e62a-2d57-459b-4cc847aed904@10.10.4.220@tcp:0/0 lens 488/448 e 1 to 0 dl 1361549749 ref 1 fl Complete:/4/0 rc 0/0
08:15:58:Lustre: DEBUG MARKER: lctl set_param fail_loc=0
08:16:20:Lustre: lustre-MDT0000: recovery is timed out, evict stale exports
08:16:20:Lustre: lustre-MDT0000: disconnecting 1 stale clients
08:16:20:Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-dual test_11: @@@@@@ FAIL: test_11 failed with 1

Maloo report: https://maloo.whamcloud.com/test_sets/a19d48ee-7d78-11e2-85d0-52540035b04c |
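The client timeout and the MDS "dropping reply" message above appear to refer to the same request (x1427685198042724). A minimal Python sketch for correlating the two console lines follows. The helper name `parse_request` and the regular expressions are hypothetical, written only against the two lines quoted in this ticket, and the reading of `sent` and `dl` as Unix timestamps in seconds is an assumption rather than something the logs state.

```python
import re

# The two request dumps quoted in the description (client side and MDS side).
CLIENT_LINE = (
    "08:16:21:Lustre: 8555:0:(client.c:1817:ptlrpc_expire_one_request()) "
    "@@@ Request sent has timed out for slow reply: "
    "[sent 1361549717/real 1361549717] req@ffff88004306b800 "
    "x1427685198042724/t51539607554(51539607554) "
    "o36->lustre-MDT0000-mdc-ffff880075efa000@10.10.4.222@tcp:12/10 "
    "lens 488/416 e 1 to 1 dl 1361549778 ref 2 fl Rpc:X/4/ffffffff rc 0/-1"
)
MDS_LINE = (
    "08:15:47:LustreError: 10976:0:(ldlm_lib.c:2422:target_send_reply_msg()) "
    "@@@ dropping reply req@ffff880053917850 "
    "x1427685198042724/t51539607554(51539607554) "
    "o36->2c21c195-e62a-2d57-459b-4cc847aed904@10.10.4.220@tcp:0/0 "
    "lens 488/448 e 1 to 0 dl 1361549749 ref 1 fl Complete:/4/0 rc 0/0"
)

# Hypothetical patterns, fitted to the lines above only.
REQ_RE = re.compile(r"\sx(\d+)/t(\d+)\(\d+\)\s+o(\d+)->")
SENT_RE = re.compile(r"\[sent (\d+)/real \d+\]")
DEADLINE_RE = re.compile(r"\bdl (\d+)")


def parse_request(line):
    """Extract xid, transno, opcode, deadline and (if present) sent time."""
    req = REQ_RE.search(line)
    deadline = DEADLINE_RE.search(line)
    if req is None or deadline is None:
        raise ValueError("line does not look like a request dump")
    sent = SENT_RE.search(line)
    return {
        "xid": int(req.group(1)),
        "transno": int(req.group(2)),
        "opcode": int(req.group(3)),
        "deadline": int(deadline.group(1)),
        # Only the client-side line carries a [sent .../real ...] field.
        "sent": int(sent.group(1)) if sent else None,
    }


client = parse_request(CLIENT_LINE)
mds = parse_request(MDS_LINE)

same_request = (client["xid"], client["transno"]) == (mds["xid"], mds["transno"])
# Assumed to be seconds between send time and the client-side deadline.
client_wait = client["deadline"] - client["sent"]

print("same request on both nodes:", same_request)
print("client deadline minus sent:", client_wait)
```

Matching on both the xid and the transaction number avoids pairing unrelated requests that merely share an opcode. The same approach could be applied to the other Maloo reports linked in the comments, assuming their console logs use the same line layout.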
| Comments |
| Comment by Jian Yu [ 13/Mar/13 ] |
|
Lustre b2_1 client build: http://build.whamcloud.com/job/lustre-b2_1/186 The replay-dual test 11 passed: https://maloo.whamcloud.com/test_sets/1a958dc8-8b58-11e2-965f-52540035b04c |
| Comment by Sarah Liu [ 22/Apr/13 ] |
|
Another failure: https://maloo.whamcloud.com/test_sets/8d87aa7a-a78a-11e2-b3cc-52540035b04c |
| Comment by Jian Yu [ 14/Aug/13 ] |
|
Lustre client build: http://build.whamcloud.com/job/lustre-b2_1/215/ (2.1.6) replay-dual test 11 hit the same failure: |