Lustre / LU-12925

interop: conf-sanity test 62 fails with “Restart of mds1 failed!”

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Lustre 2.13.0
    • Fix Version/s: Lustre 2.12.4
    • Environment:
      master (2.13) servers with 2.12.3 clients
    • Severity:
      3

      Description

      conf-sanity test_62 fails in interop testing with master servers and b2_12 clients. This test and others started failing on 24 OCT 2019 with master 2.12.58.171 servers and 2.12.3 clients. The test last passed on 17 OCT 2019 with 2.12.58.155 build #3964 servers and 2.12.2 build #18 clients.

      The suite_log for the failure at https://testing.whamcloud.com/test_sets/2d201dd4-f9cc-11e9-be86-52540065bddc shows:

      CMD: trevis-6vm7 mkdir -p /mnt/lustre-mds1; mount -t lustre   /dev/mapper/mds1_flakey /mnt/lustre-mds1
      trevis-6vm7: mount.lustre: mount /dev/mapper/mds1_flakey at /mnt/lustre-mds1 failed: Invalid argument
      trevis-6vm7: This may have multiple causes.
      trevis-6vm7: Are the mount options correct?
      trevis-6vm7: Check the syslog for more info.
      Start of /dev/mapper/mds1_flakey on mds1 failed 22
       conf-sanity test_62: @@@@@@ FAIL: Restart of mds1 failed! 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:5864:error()
        = /usr/lib64/lustre/tests/test-framework.sh:1586:mount_facets()
        = /usr/lib64/lustre/tests/test-framework.sh:3361:facet_failover()
        = /usr/lib64/lustre/tests/test-framework.sh:3455:fail()
        = /usr/lib64/lustre/tests/test-framework.sh:4182:stopall()
        = /usr/lib64/lustre/tests/test-framework.sh:4455:formatall()
        = /usr/lib64/lustre/tests/conf-sanity.sh:108:reformat()
        = /usr/lib64/lustre/tests/conf-sanity.sh:90:reformat_and_config()
        = /usr/lib64/lustre/tests/conf-sanity.sh:4603:test_62()
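
      The status in "Start of /dev/mapper/mds1_flakey on mds1 failed 22" and the "Invalid argument" text from mount.lustre refer to the same error: errno 22 is EINVAL on Linux. A minimal Python sketch for decoding such codes follows; the helper name describe_errno is hypothetical and is not part of the Lustre test framework.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a positive or negative errno value.

    Kernel logs usually print errno values negated (for example -22),
    while shell exit statuses print them as positive numbers (22).
    """
    code = abs(code)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # Both spellings seen in the logs decode to the same error.
    print(describe_errno(22))
    print(describe_errno(-22))
```

      This only translates the number; it says nothing about why the mount was rejected, which is what the MDS console log is needed for.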
      

      The MDS (vm7) console log shows the following errors:

      [38024.851590] Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey
      [38025.161866] Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey
      [38025.494661] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre   /dev/mapper/mds1_flakey /mnt/lustre-mds1
      [38025.715937] LDISKFS-fs (dm-3): mounted filesystem without journal. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
      [38025.717858] LustreError: 19847:0:(osd_handler.c:7696:osd_mount()) lustre-MDT0000-osd: device /dev/mapper/mds1_flakey is mounted w/o journal
      [38025.719942] LustreError: 19847:0:(obd_config.c:575:class_setup()) setup lustre-MDT0000-osd failed (-22)
      [38025.721511] LustreError: 19847:0:(obd_mount.c:205:lustre_start_simple()) lustre-MDT0000-osd setup error -22
      [38025.723385] LustreError: 19847:0:(obd_mount_server.c:1977:server_fill_super()) Unable to start osd on /dev/mapper/mds1_flakey: -22
      [38025.725326] LustreError: 19847:0:(obd_mount.c:1669:lustre_fill_super()) Unable to mount  (-22)
      [38025.955972] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  conf-sanity test_62: @@@@@@ FAIL: Restart of mds1 failed! 
      [38026.141053] Lustre: DEBUG MARKER: conf-sanity test_62: @@@@@@ FAIL: Restart of mds1 failed!
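
      The console log above shows the error chain: ldiskfs mounts the device without a journal, osd_mount() rejects it, and -22 propagates up through class_setup(), lustre_start_simple(), server_fill_super() and lustre_fill_super(). The following Python sketch pulls that chain out of a console log. The regular expressions are assumptions based only on the line layout visible in this excerpt, and parse_lustre_errors is a hypothetical helper, not an existing Lustre tool.

```python
import re

# Assumed layout, taken from the excerpt above:
# [timestamp] LustreError: pid:n:(file.c:line:func()) message
LUSTRE_ERROR = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+LustreError:\s+(?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+(?P<msg>.*)"
)
# A negative error code at the end of the message, with or without parentheses.
TRAILING_CODE = re.compile(r"\(?(-\d+)\)?\s*$")


def parse_lustre_errors(log_text):
    """Return one dict per LustreError line, in log order."""
    entries = []
    for raw in log_text.splitlines():
        match = LUSTRE_ERROR.search(raw)
        if not match:
            continue  # skip DEBUG MARKER, LDISKFS-fs and other lines
        code = TRAILING_CODE.search(match["msg"])
        entries.append({
            "func": match["func"],
            "source": f'{match["file"]}:{match["line"]}',
            "msg": match["msg"].strip(),
            "code": int(code.group(1)) if code else None,
        })
    return entries


# Lines copied from the console log in this report.
SAMPLE = """
[38025.715937] LDISKFS-fs (dm-3): mounted filesystem without journal. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
[38025.717858] LustreError: 19847:0:(osd_handler.c:7696:osd_mount()) lustre-MDT0000-osd: device /dev/mapper/mds1_flakey is mounted w/o journal
[38025.719942] LustreError: 19847:0:(obd_config.c:575:class_setup()) setup lustre-MDT0000-osd failed (-22)
[38025.721511] LustreError: 19847:0:(obd_mount.c:205:lustre_start_simple()) lustre-MDT0000-osd setup error -22
[38025.723385] LustreError: 19847:0:(obd_mount_server.c:1977:server_fill_super()) Unable to start osd on /dev/mapper/mds1_flakey: -22
[38025.725326] LustreError: 19847:0:(obd_mount.c:1669:lustre_fill_super()) Unable to mount  (-22)
"""

if __name__ == "__main__":
    for entry in parse_lustre_errors(SAMPLE):
        print(entry["func"], entry["source"], entry["code"])
```

      The first entry in the chain is the one to read: it names osd_mount() and the "mounted w/o journal" condition, while the later entries only repeat the resulting -22.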
      

      When conf-sanity test 62 fails, we also see tests 64, 65, 66, 68 and 69 fail. Tests 63 and 67 do not fail.

      We’ve seen these tests fail only once before:
      https://testing.whamcloud.com/test_sets/ee1e3636-f75d-11e9-a197-52540065bddc

              People

              • Assignee:
                ys Yang Sheng
                Reporter:
                jamesanunez James Nunez