Lustre / LU-9875

conf-sanity test 70e fails with 'start mdt1 failed'

Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • Lustre 2.10.0, Lustre 2.11.0
    • Separate MGS and MDS or MGT and MDT
    • 3

    Description

      conf-sanity test_70e fails with the following error when run on a Lustre file system with a separate MDT and MGT:

      conf-sanity test_70e: @@@@@@ FAIL: start mdt1 failed
      

      From the test_log, we can see that the MDT cannot be mounted after being formatted:

      Starting mds1:   /dev/lvm-Role_MDS/P2 /mnt/lustre-mds1
      CMD: onyx-44vm7 mkdir -p /mnt/lustre-mds1; mount -t lustre   		                   /dev/lvm-Role_MDS/P2 /mnt/lustre-mds1
      onyx-44vm7: mount.lustre: mount /dev/mapper/lvm--Role_MDS-P2 at /mnt/lustre-mds1 failed: Address already in use
      onyx-44vm7: The target service's index is already in use. (/dev/mapper/lvm--Role_MDS-P2)
      Start of /dev/lvm-Role_MDS/P2 on mds1 failed 98
      

      Looking at the MGS dmesg log, we see the registration of lustre-MDT0000 being rejected:

      [10988.278664] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre   		                   /dev/lvm-Role_MDS/P2 /mnt/lustre-mds1
      [10988.392026] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: errors=remount-ro
      [10988.518785] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
      [10988.568637] LustreError: 140-5: Server lustre-MDT0000 requested index 0, but that index is already in use. Use --writeconf to force
      [10988.569912] LustreError: 9140:0:(mgs_handler.c:537:mgs_target_reg()) Failed to write lustre-MDT0000 log (-98)
      [10988.575542] LustreError: 15f-b: lustre-MDT0000: cannot register this server with the MGS: rc = -98. Is the MGS running?
      [10988.592411] LustreError: 27288:0:(obd_mount_server.c:1866:server_fill_super()) Unable to start targets: -98
      [10988.593547] LustreError: 27288:0:(obd_mount_server.c:1576:server_put_super()) no obd lustre-MDT0000
      [10988.594456] LustreError: 27288:0:(obd_mount_server.c:135:server_deregister_mount()) lustre-MDT0000 not registered
      [10988.657486] LustreError: 27288:0:(obd_mount.c:1505:lustre_fill_super()) Unable to mount  (-98)
      

      The MGS still remembers that there was already an MDT with index 0 for the existing file system and, thus, refuses to allow the new MDT to use index 0.
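
The return code in the log lines above can be decoded mechanically. The following is a minimal sketch, not part of the Lustre test framework: the helper name `decode_rc` and the regular expression are assumptions made for illustration. It pulls the negative return code out of a dmesg line and looks it up in the local errno table. On Linux, errno 98 is EADDRINUSE, which matches the "Address already in use" message printed by mount.lustre.

```python
import errno
import os
import re

# A minus sign followed by digits, not preceded by a word character,
# so message IDs such as "140-5" are not mistaken for return codes.
RC_PATTERN = re.compile(r"(?<!\w)-(\d+)\b")


def decode_rc(line):
    """Return (rc, errno_name, message) for a log line, or None if no rc is found.

    Hypothetical helper for illustration only. The name and message come
    from the errno table of the platform running this script.
    """
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    return -code, errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


# Lines copied from the dmesg excerpt in this report.
sample_lines = [
    "[10988.569912] LustreError: 9140:0:(mgs_handler.c:537:mgs_target_reg()) "
    "Failed to write lustre-MDT0000 log (-98)",
    "[10988.575542] LustreError: 15f-b: lustre-MDT0000: cannot register this "
    "server with the MGS: rc = -98. Is the MGS running?",
    "[10988.592411] LustreError: 27288:0:(obd_mount_server.c:1866:"
    "server_fill_super()) Unable to start targets: -98",
]

for log_line in sample_lines:
    print(decode_rc(log_line))
```

Every failing step reports the same code, which is consistent with a single root cause (the index conflict at the MGS) propagating up through the mount path.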

      Test sessions with logs for this failure are at:
      https://testing.hpdd.intel.com/test_sets/c59d75f4-7e5c-11e7-b716-5254006e85c2
      https://testing.hpdd.intel.com/test_sets/131fc342-7efb-11e7-9785-5254006e85c2

            People

              wc-triage WC Triage
              jamesanunez James Nunez (Inactive)