Lustre / LU-3310

Interop 2.1.5<->2.4 failure on test suite sanity test_17o: stat file should fail


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.4.1, Lustre 2.5.0
    • Affects Version/s: Lustre 2.4.0
    • Labels: None
    • Environment: server: 2.1.5
      client: lustre-master build# 1481
    • Severity: 3
    • 8197

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/a00f384c-b88e-11e2-891d-52540035b04c.

      The sub-test test_17o failed with the following error:

      stat file should fail
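
      The console logs below show the usual sanity.sh failover pattern for this test. A rough sketch of the flow, reconstructed from those logs (the fail_loc value, mount options, and device paths appear in the MDS console; `$DIR/$tfile` and the exact step ordering are assumptions about the test script, not the verbatim source):

      ```shell
      # Sketch of the sanity test_17o flow, reconstructed from the logs.
      # Paths and fail_loc come from the MDS console; everything else is
      # an assumption, not the actual test script.

      # 1. Fail over the MDS (umount + remount, as in the console log)
      umount -d /mnt/mds1
      mount -t lustre -o user_xattr,acl /dev/lvm-MDS/P1 /mnt/mds1

      # 2. Inject the "incompatible LMA feature" fault on the restarted MDS
      lctl set_param fail_loc=0x194

      # 3. stat() of the affected file from the client is expected to fail
      stat $DIR/$tfile && echo "BUG: stat succeeded"
      ```

      In this interop run (2.1.5 server, 2.4 client) the stat evidently did not fail, which is exactly the "stat file should fail" failure reported above.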

      client dmesg:

      Lustre: DEBUG MARKER: == sanity test 17o: stat file with incompat LMA feature == 01:16:21 (1368000981)
      LustreError: 11-0: lustre-MDT0000-mdc-ffff88007cc28400: Communicating with 10.10.4.122@tcp, operation ldlm_enqueue failed with -107.
      Lustre: lustre-MDT0000-mdc-ffff88007cc28400: Connection to lustre-MDT0000 (at 10.10.4.122@tcp) was lost; in progress operations using this service will wait for recovery to complete
      Lustre: 8637:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1368000984/real 1368000984]  req@ffff88006e13ec00 x1434452628717600/t0(0) o400->MGC10.10.4.122@tcp@10.10.4.122@tcp:26/25 lens 224/224 e 0 to 1 dl 1368000991 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
      LustreError: 166-1: MGC10.10.4.122@tcp: Connection to MGS (at 10.10.4.122@tcp) was lost; in progress operations using this service will fail
      Lustre: Evicted from MGS (at 10.10.4.122@tcp) after server handle changed from 0xe26621f21f318fc6 to 0xe26621f21f31920b
      Lustre: MGC10.10.4.122@tcp: Connection restored to MGS (at 10.10.4.122@tcp)
      LustreError: 8635:0:(client.c:2696:ptlrpc_replay_interpret()) @@@ status -116, old was 0  req@ffff88007a953c00 x1434452628717596/t17179869193(17179869193) o35->lustre-MDT0000-mdc-ffff88007cc28400@10.10.4.122@tcp:23/10 lens 392/424 e 0 to 0 dl 1368000997 ref 2 fl Interpret:R/4/0 rc -116/-116
      Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity test_17o: @@@@@@ FAIL: stat file should fail 
      Lustre: DEBUG MARKER: sanity test_17o: @@@@@@ FAIL: stat file should fail
      Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /logdir/test_logs/2013-05-08/lustre-b2_1-el6-x86_64-vs-lustre-master-el6-x86_64--full--1_4_1__1481__-70235156306540-003534/sanity.test_17o.debug_log.$(hostname -s).1368000991.log;
               dmesg > /logdir/test_logs/2013-05-08/lustre-b2
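
      The negative status codes scattered through these logs (-107, -108, -116) are ordinary Linux errno values propagated through the RPC layer. A quick way to decode them, assuming standard Linux errno numbering:

      ```python
      import errno
      import os

      # Decode the negated errno values seen in the logs above
      # (assumes standard Linux errno numbering):
      #   -107 -> ENOTCONN  (transport endpoint is not connected)
      #   -108 -> ESHUTDOWN (cannot send after endpoint shutdown)
      #   -116 -> ESTALE    (stale file handle)
      for code in (107, 108, 116):
          print(-code, errno.errorcode[code], "-", os.strerror(code))
      ```

      So the ldlm_enqueue/obd_ping failures with -107 are connection losses during the forced MDS failover, and the -116 on replay is a stale handle after the server restarted.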
      

      OST console:

      01:16:23:Lustre: DEBUG MARKER: == sanity test 17o: stat file with incompat LMA feature == 01:16:21 (1368000981)
      01:16:34:Lustre: lustre-OST0000: Received new MDS connection from 10.10.4.122@tcp, removing former export from same NID
      01:16:34:Lustre: Skipped 5 previous similar messages
      01:16:34:LustreError: 11-0: an error occurred while communicating with 10.10.4.122@tcp. The obd_ping operation failed with -107
      01:16:34:LustreError: 166-1: MGC10.10.4.122@tcp: Connection to MGS (at 10.10.4.122@tcp) was lost; in progress operations using this service will fail
      01:16:34:LustreError: 24007:0:(client.c:1060:ptlrpc_import_delay_req()) @@@ IMP_INVALID  req@ffff88007aced400 x1434452657963472/t0(0) o101->MGC10.10.4.122@tcp@10.10.4.122@tcp:26/25 lens 296/352 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1
      01:16:34:LustreError: 24007:0:(client.c:1060:ptlrpc_import_delay_req()) @@@ IMP_INVALID  req@ffff88007aced400 x1434452657963473/t0(0) o101->MGC10.10.4.122@tcp@10.10.4.122@tcp:26/25 lens 296/352 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1
      01:16:34:Lustre: Evicted from MGS (at MGC10.10.4.122@tcp_0) after server handle changed from 0xe26621f21f318ff0 to 0xe26621f21f3191f6
      01:16:34:Lustre: MGC10.10.4.122@tcp: Reactivating import
      01:16:34:Lustre: MGC10.10.4.122@tcp: Connection restored to MGS (at 10.10.4.122@tcp)
      01:16:34:Lustre: 14527:0:(lustre_log.h:476:llog_group_set_export()) lustre-OST0002: export for group 0 is changed: 0xffff88007bcaac00 -> 0xffff88007ab82400
      01:16:34:Lustre: 14527:0:(lustre_log.h:476:llog_group_set_export()) Skipped 12 previous similar messages
      01:16:34:Lustre: 14527:0:(llog_net.c:168:llog_receptor_accept()) changing the import ffff880067f80000 - ffff88007a285800
      01:16:34:Lustre: 14527:0:(llog_net.c:168:llog_receptor_accept()) Skipped 12 previous similar messages
      01:16:34:Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity test_17o: @@@@@@ FAIL: stat file should fail 
      01:16:34:Lustre: DEBUG MARKER: sanity test_17o: @@@@@@ FAIL: stat file should fail
      

      MDS console:

      01:16:25:Lustre: DEBUG MARKER: == sanity test 17o: stat file with incompat LMA feature == 01:16:21 (1368000981)
      01:16:25:Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
      01:16:25:Lustre: DEBUG MARKER: umount -d /mnt/mds1
      01:16:25:Lustre: Failing over lustre-MDT0000
      01:16:25:Lustre: 14221:0:(quota_master.c:795:close_quota_files()) quota[0] is off already
      01:16:25:Lustre: 14221:0:(quota_master.c:795:close_quota_files()) Skipped 1 previous similar message
      01:16:25:Lustre: Failing over mdd_obd-lustre-MDT0000
      01:16:25:Lustre: mdd_obd-lustre-MDT0000: shutting down for failover; client state will be preserved.
      01:16:25:Lustre: MGS has stopped.
      01:16:25:LustreError: 14221:0:(ldlm_request.c:1174:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
      01:16:25:LustreError: 14221:0:(ldlm_request.c:1801:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
      01:16:37:Lustre: 14221:0:(client.c:1817:ptlrpc_expire_one_request()) @@@ Request  sent has timed out for slow reply: [sent 1368000982/real 1368000982]  req@ffff880063e5f000 x1434452653771035/t0(0) o251->MGC10.10.4.122@tcp@0@lo:26/25 lens 192/192 e 0 to 1 dl 1368000988 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
      01:16:37:Lustre: server umount lustre-MDT0000 complete
      01:16:37:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
      01:16:37:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1
      01:16:37:Lustre: DEBUG MARKER: test -b /dev/lvm-MDS/P1
      01:16:37:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre -o user_xattr,acl  		                   /dev/lvm-MDS/P1 /mnt/mds1
      01:16:37:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: 
      01:16:37:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: 
      01:16:37:Lustre: MGS MGS started
      01:16:37:Lustre: MGC10.10.4.122@tcp: Reactivating import
      01:16:37:Lustre: Enabling ACL
      01:16:37:Lustre: Enabling user_xattr
      01:16:37:Lustre: lustre-MDT0000: used disk, loading
      01:16:37:Lustre: 14436:0:(ldlm_lib.c:2028:target_recovery_init()) RECOVERY: service lustre-MDT0000, 2 recoverable clients, last_transno 17179869192
      01:16:37:Lustre: 14436:0:(mdt_lproc.c:416:lprocfs_wr_identity_upcall()) lustre-MDT0000: identity upcall set to /usr/sbin/l_getidentity
      01:16:37:Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
      01:16:37:Lustre: DEBUG MARKER: e2label /dev/lvm-MDS/P1 2>/dev/null
      01:16:37:Lustre: DEBUG MARKER: lctl set_param fail_loc=0x194
      01:16:37:LustreError: 14442:0:(mdt_handler.c:2817:mdt_recovery()) operation 101 on unconnected MDS from 12345-10.10.4.121@tcp
      01:16:37:Lustre: lustre-MDT0000: sending delayed replies to recovered clients
      


          People

            Assignee: Zhenyu Xu (bobijam)
            Reporter: Maloo (maloo)
            Votes: 0
            Watchers: 6
