Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Lustre 2.4.0
-
None
-
server: 2.1.5
client: lustre-master build# 1481
-
3
-
8197
Description
This issue was created by maloo for sarah <sarah@whamcloud.com>
This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/a00f384c-b88e-11e2-891d-52540035b04c.
The sub-test test_17o failed with the following error:
stat file should fail
client dmesg:
Lustre: DEBUG MARKER: == sanity test 17o: stat file with incompat LMA feature == 01:16:21 (1368000981) LustreError: 11-0: lustre-MDT0000-mdc-ffff88007cc28400: Communicating with 10.10.4.122@tcp, operation ldlm_enqueue failed with -107. Lustre: lustre-MDT0000-mdc-ffff88007cc28400: Connection to lustre-MDT0000 (at 10.10.4.122@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: 8637:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1368000984/real 1368000984] req@ffff88006e13ec00 x1434452628717600/t0(0) o400->MGC10.10.4.122@tcp@10.10.4.122@tcp:26/25 lens 224/224 e 0 to 1 dl 1368000991 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 LustreError: 166-1: MGC10.10.4.122@tcp: Connection to MGS (at 10.10.4.122@tcp) was lost; in progress operations using this service will fail Lustre: Evicted from MGS (at 10.10.4.122@tcp) after server handle changed from 0xe26621f21f318fc6 to 0xe26621f21f31920b Lustre: MGC10.10.4.122@tcp: Connection restored to MGS (at 10.10.4.122@tcp) LustreError: 8635:0:(client.c:2696:ptlrpc_replay_interpret()) @@@ status -116, old was 0 req@ffff88007a953c00 x1434452628717596/t17179869193(17179869193) o35->lustre-MDT0000-mdc-ffff88007cc28400@10.10.4.122@tcp:23/10 lens 392/424 e 0 to 0 dl 1368000997 ref 2 fl Interpret:R/4/0 rc -116/-116 Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanity test_17o: @@@@@@ FAIL: stat file should fail Lustre: DEBUG MARKER: sanity test_17o: @@@@@@ FAIL: stat file should fail Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /logdir/test_logs/2013-05-08/lustre-b2_1-el6-x86_64-vs-lustre-master-el6-x86_64--full--1_4_1__1481__-70235156306540-003534/sanity.test_17o.debug_log.$(hostname -s).1368000991.log; dmesg > /logdir/test_logs/2013-05-08/lustre-b2
OST console:
01:16:23:Lustre: DEBUG MARKER: == sanity test 17o: stat file with incompat LMA feature == 01:16:21 (1368000981) 01:16:34:Lustre: lustre-OST0000: Received new MDS connection from 10.10.4.122@tcp, removing former export from same NID 01:16:34:Lustre: Skipped 5 previous similar messages 01:16:34:LustreError: 11-0: an error occurred while communicating with 10.10.4.122@tcp. The obd_ping operation failed with -107 01:16:34:LustreError: 166-1: MGC10.10.4.122@tcp: Connection to MGS (at 10.10.4.122@tcp) was lost; in progress operations using this service will fail 01:16:34:LustreError: 24007:0:(client.c:1060:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff88007aced400 x1434452657963472/t0(0) o101->MGC10.10.4.122@tcp@10.10.4.122@tcp:26/25 lens 296/352 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1 01:16:34:LustreError: 24007:0:(client.c:1060:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff88007aced400 x1434452657963473/t0(0) o101->MGC10.10.4.122@tcp@10.10.4.122@tcp:26/25 lens 296/352 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1 01:16:34:Lustre: Evicted from MGS (at MGC10.10.4.122@tcp_0) after server handle changed from 0xe26621f21f318ff0 to 0xe26621f21f3191f6 01:16:34:Lustre: MGC10.10.4.122@tcp: Reactivating import 01:16:34:Lustre: MGC10.10.4.122@tcp: Connection restored to MGS (at 10.10.4.122@tcp) 01:16:34:Lustre: 14527:0:(lustre_log.h:476:llog_group_set_export()) lustre-OST0002: export for group 0 is changed: 0xffff88007bcaac00 -> 0xffff88007ab82400 01:16:34:Lustre: 14527:0:(lustre_log.h:476:llog_group_set_export()) Skipped 12 previous similar messages 01:16:34:Lustre: 14527:0:(llog_net.c:168:llog_receptor_accept()) changing the import ffff880067f80000 - ffff88007a285800 01:16:34:Lustre: 14527:0:(llog_net.c:168:llog_receptor_accept()) Skipped 12 previous similar messages 01:16:34:Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanity test_17o: @@@@@@ FAIL: stat file should fail 01:16:34:Lustre: DEBUG MARKER: sanity test_17o: @@@@@@ FAIL: stat file should fail
MDS console
01:16:25:Lustre: DEBUG MARKER: == sanity test 17o: stat file with incompat LMA feature == 01:16:21 (1368000981) 01:16:25:Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts 01:16:25:Lustre: DEBUG MARKER: umount -d /mnt/mds1 01:16:25:Lustre: Failing over lustre-MDT0000 01:16:25:Lustre: 14221:0:(quota_master.c:795:close_quota_files()) quota[0] is off already 01:16:25:Lustre: 14221:0:(quota_master.c:795:close_quota_files()) Skipped 1 previous similar message 01:16:25:Lustre: Failing over mdd_obd-lustre-MDT0000 01:16:25:Lustre: mdd_obd-lustre-MDT0000: shutting down for failover; client state will be preserved. 01:16:25:Lustre: MGS has stopped. 01:16:25:LustreError: 14221:0:(ldlm_request.c:1174:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway 01:16:25:LustreError: 14221:0:(ldlm_request.c:1801:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108 01:16:37:Lustre: 14221:0:(client.c:1817:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1368000982/real 1368000982] req@ffff880063e5f000 x1434452653771035/t0(0) o251->MGC10.10.4.122@tcp@0@lo:26/25 lens 192/192 e 0 to 1 dl 1368000988 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 01:16:37:Lustre: server umount lustre-MDT0000 complete 01:16:37:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST ' 01:16:37:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1 01:16:37:Lustre: DEBUG MARKER: test -b /dev/lvm-MDS/P1 01:16:37:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre -o user_xattr,acl /dev/lvm-MDS/P1 /mnt/mds1 01:16:37:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: 01:16:37:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: 01:16:37:Lustre: MGS MGS started 01:16:37:Lustre: MGC10.10.4.122@tcp: Reactivating import 01:16:37:Lustre: Enabling ACL 01:16:37:Lustre: Enabling user_xattr 01:16:37:Lustre: lustre-MDT0000: used disk, loading 01:16:37:Lustre: 14436:0:(ldlm_lib.c:2028:target_recovery_init()) RECOVERY: service lustre-MDT0000, 2 recoverable clients, last_transno 17179869192 01:16:37:Lustre: 14436:0:(mdt_lproc.c:416:lprocfs_wr_identity_upcall()) lustre-MDT0000: identity upcall set to /usr/sbin/l_getidentity 01:16:37:Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u 01:16:37:Lustre: DEBUG MARKER: e2label /dev/lvm-MDS/P1 2>/dev/null 01:16:37:Lustre: DEBUG MARKER: lctl set_param fail_loc=0x194 01:16:37:LustreError: 14442:0:(mdt_handler.c:2817:mdt_recovery()) operation 101 on unconnected MDS from 12345-10.10.4.121@tcp 01:16:37:Lustre: lustre-MDT0000: sending delayed replies to recovered clients