
2.1.1<->2.2 OST0000 cannot be mounted after sanity test_27z failed


Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.4.0
    • None
    • None
    • server: 2.1.1-RHEL6
      client: 2.2-RC2-RHEL6
    • 3
    • 4262

    Description

      sanity subtest 51b hangs after subtest 27z fails. OST0000 cannot be mounted.

      OST dmesg:
      --------------------
      [root@fat-intel-4 ~]# LustreError: 4319:0:(ldlm_lib.c:2129:target_send_reply_msg()) @@@ processing error (19) req@ffff88032a885000 x1397613683283791/t0(0) o-1><?>@<?>:0/0 lens 368/0 e 0 to 0 dl 1332872055 ref 1 fl Interpret:/ffffffff/ffffffff rc -19/-1
      LustreError: 4319:0:(ldlm_lib.c:2129:target_send_reply_msg()) Skipped 357 previous similar messages
      LustreError: 137-5: UUID 'lustre-OST0000_UUID' is not available for connect (no target)
      LustreError: Skipped 357 previous similar messages

      [root@fat-intel-4 ~]# mount
      /dev/sda1 on / type ext3 (rw)
      proc on /proc type proc (rw)
      sysfs on /sys type sysfs (rw)
      devpts on /dev/pts type devpts (rw,gid=5,mode=620)
      tmpfs on /dev/shm type tmpfs (rw)
      none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
      sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
      nfsd on /proc/fs/nfsd type nfsd (rw)
      /dev/sdb2 on /mnt/ost2 type lustre (rw)
      /dev/sdb3 on /mnt/ost3 type lustre (rw)
      /dev/sdc1 on /mnt/ost4 type lustre (rw)
      /dev/sdc2 on /mnt/ost5 type lustre (rw)
      /dev/sdc3 on /mnt/ost6 type lustre (rw)
      /dev/sdb1 on /mnt/ost1 type ldiskfs (rw)
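
In the mount listing above, /mnt/ost1 is the only OST mount point whose type is ldiskfs rather than lustre. A minimal sketch for spotting that kind of mismatch in `mount` output follows. The function name `non_lustre_ost_mounts` and the `/mnt/ost` prefix are assumptions taken from this report's layout, not part of any Lustre tool.

```python
import re

# Matches lines of the form "DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)".
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+) \((.*)\)$")


def non_lustre_ost_mounts(mount_output, ost_prefix="/mnt/ost"):
    """Return (device, mountpoint, fstype) for OST mount points not of type lustre.

    ost_prefix is a hypothetical naming convention copied from this report.
    """
    mismatches = []
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if not match:
            continue
        device, mountpoint, fstype, _options = match.groups()
        if mountpoint.startswith(ost_prefix) and fstype != "lustre":
            mismatches.append((device, mountpoint, fstype))
    return mismatches


# Subset of the mount listing from this report.
SAMPLE = """\
/dev/sda1 on / type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/dev/sdb2 on /mnt/ost2 type lustre (rw)
/dev/sdc3 on /mnt/ost6 type lustre (rw)
/dev/sdb1 on /mnt/ost1 type ldiskfs (rw)
"""

if __name__ == "__main__":
    for device, mountpoint, fstype in non_lustre_ost_mounts(SAMPLE):
        print(f"{mountpoint} ({device}) is mounted as {fstype}, not lustre")
```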

      client console:
      ---------------------
      Lustre: DEBUG MARKER: == sanity test 27z: check SEQ/OID on the MDT and OST filesystems == 10:36:11 (1332869771)
      Lustre: DEBUG MARKER: sanity test_27z: @@@@@@ FAIL: parent SEQ mismatch
      LustreError: 11-0: an error occurred while communicating with 10.10.4.131@tcp. The obd_ping operation failed with -107
      Lustre: lustre-OST0000-osc-ffff880333317c00: Connection to service lustre-OST0000 via nid 10.10.4.131@tcp was lost; in progress operations using this service will wait for recovery to complete.
      Lustre: DEBUG MARKER: == sanity test 51b: mkdir .../t-0 — .../t-70 ====================== 10:40:18 (1332870018)
      LustreError: 11-0: an error occurred while communicating with 10.10.4.131@tcp. The ost_connect operation failed with -19
      LustreError: Skipped 24 previous similar messages
      Lustre: 2597:0:(import.c:524:import_select_connection()) lustre-OST0000-osc-ffff880333317c00: tried all connections, increasing latency to 21s
      Lustre: 2597:0:(import.c:524:import_select_connection()) Skipped 24 previous similar messages
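
The return codes in the client console are negated Linux errno values: -107 is ENOTCONN and -19 is ENODEV. A minimal sketch for pulling them out of such log lines follows. The helper name `decode_failures` and the two-entry lookup table are illustrative assumptions, and the table only covers the codes seen in this report.

```python
import re

# Linux errno values seen in this report (hard-coded so the sketch does not
# depend on the platform's errno numbering).
LINUX_ERRNO = {
    19: "ENODEV",     # No such device
    107: "ENOTCONN",  # Transport endpoint is not connected
}

# Matches e.g. "The obd_ping operation failed with -107".
FAILURE = re.compile(r"The (\w+) operation failed with (-\d+)")


def decode_failures(log_text):
    """Return (operation, return_code, errno_name) for each failed operation."""
    decoded = []
    for operation, code in FAILURE.findall(log_text):
        rc = int(code)
        decoded.append((operation, rc, LINUX_ERRNO.get(-rc, "UNKNOWN")))
    return decoded


# Two lines copied from the client console in this report.
SAMPLE_LOG = """\
LustreError: 11-0: an error occurred while communicating with 10.10.4.131@tcp. The obd_ping operation failed with -107
LustreError: 11-0: an error occurred while communicating with 10.10.4.131@tcp. The ost_connect operation failed with -19
"""

if __name__ == "__main__":
    for operation, rc, name in decode_failures(SAMPLE_LOG):
        print(f"{operation}: rc={rc} ({name})")
```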

      Please find the debug log attached.

            People

              wc-triage WC Triage
              sarah Sarah Liu