[LU-10708] replay-single test_20b: Restart of mds1 failed!

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.11.0, Lustre 2.12.0, Lustre 2.13.0, Lustre 2.12.1, Lustre 2.12.3
    • Environment: Hard Failover:
      RHEL 7.4 Server/ZFS
      RHEL 7.4 Client
      2.10.58 master, build 3707
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      replay-single test_20b - Restart of mds1 failed!

      This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/7f078076-15ba-11e8-bd00-52540065bddc

      test_20b failed with the following error:

      Restart of mds1 failed!
      

      test_logs:

      == replay-single test 20b: write, unlink, eviction, replay (test mds_cleanup_orphans) ================ 19:46:25 (1519069585)
      CMD: onyx-32vm7 lctl set_param -n os[cd]*.*MDT*.force_sync=1
      CMD: onyx-32vm6 lctl set_param -n osd*.*OS*.force_sync=1
      /mnt/lustre/f20b.replay-single
      lmm_stripe_count:  1
      lmm_stripe_size:   1048576
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 0
      	obdidx		 objid		 objid		 group
      	     0	          4770	       0x12a2	             0
      
      CMD: onyx-32vm7 /usr/sbin/lctl set_param -n mdt.lustre-MDT0000.evict_client 425b1455-3c86-3ef4-e5f3-8752f5bdb612
      10000+0 records in
      10000+0 records out
      40960000 bytes (41 MB) copied, 1.08016 s, 37.9 MB/s
      CMD: onyx-32vm7 lctl set_param -n osd*.*MDT*.force_sync=1
      CMD: onyx-32vm7 /usr/sbin/lctl dl
      Failing mds1 on onyx-32vm7
      + pm -h powerman --off onyx-32vm7
      Command completed successfully
      reboot facets: mds1
      + pm -h powerman --on onyx-32vm7
      Command completed successfully
      Failover mds1 to onyx-32vm8
      19:46:42 (1519069602) waiting for onyx-32vm8 network 900 secs ...
      19:46:42 (1519069602) network interface is UP
      CMD: onyx-32vm8 hostname
      mount facets: mds1
      CMD: onyx-32vm8 lsmod | grep zfs >&/dev/null || modprobe zfs;
      			zpool list -H lustre-mdt1 >/dev/null 2>&1 ||
      			zpool import -f -o cachefile=none -o failmode=panic -d /dev/lvm-Role_MDS lustre-mdt1
      onyx-32vm8: cannot import 'lustre-mdt1': no such pool available
       replay-single test_20b: @@@@@@ FAIL: Restart of mds1 failed! 
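
The log above shows a single `zpool import` attempt on the failover node right after the power cycle. One way to check whether the failure is a timing problem is to retry the import a few times. The sketch below is only an illustration: `retry` is a hypothetical helper, not part of the Lustre test framework, and the idea that the backing devices become visible late on onyx-32vm8 is an assumption that this report does not confirm.

```shell
# retry TRIES DELAY CMD [ARGS...]
# Hypothetical helper: run CMD until it succeeds or TRIES attempts are used.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    tries=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        "$@" && return 0
        [ "$attempt" -ge "$tries" ] && return 1
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Possible use on the failover node (same import options as in the log;
# the attempt count and delay are arbitrary):
#   retry 5 2 zpool import -f -o cachefile=none -o failmode=panic \
#       -d /dev/lvm-Role_MDS lustre-mdt1
```

Running `zpool import -d /dev/lvm-Role_MDS` without a pool name lists the pools that are visible for import in that directory, which can help tell a missing device apart from a pool that is still imported elsewhere.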
      

          Activity

            sgiraddi Shashidhar Giraddi (Inactive) added a comment - +1  https://testing.whamcloud.com/test_sets/5d4c3c9a-6129-48c0-929a-73cd4fe3a0e1  
            jamesanunez James Nunez (Inactive) added a comment - Another similar failure with replay-single test 3c at https://testing.whamcloud.com/test_sets/4051fd66-682a-11e9-bd0e-52540065bddc .

            jamesanunez James Nunez (Inactive) added a comment - Similar failure for recovery-random-scale test fail_client_mds at https://testing.whamcloud.com/test_sets/e3b58552-fea5-11e8-b837-52540065bddc

            CMD: trevis-25vm11 hostname
            mount facets: mds1
            CMD: trevis-25vm11 lsmod | grep zfs >&/dev/null || modprobe zfs;
            			zpool list -H lustre-mdt1 >/dev/null 2>&1 ||
            			zpool import -f -o cachefile=none -o failmode=panic -d /dev/lvm-Role_MDS lustre-mdt1
            trevis-25vm11: cannot import 'lustre-mdt1': no such pool available
             recovery-random-scale test_fail_client_mds: @@@@@@ FAIL: Restart of mds1 failed!
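
The same `cannot import ... no such pool available` line appears in more than one test suite in this ticket, so it can be useful to pull that signature out of a saved test log. The filter below is a minimal sketch; `pool_import_failures` is a hypothetical name, and the pattern assumes the message format seen in the logs quoted here.

```shell
# pool_import_failures: read a test log on stdin and print "HOST POOL"
# for every "HOST: cannot import 'POOL': no such pool available" line.
pool_import_failures() {
    sed -n "s/^[[:space:]]*\([^:[:space:]]*\): cannot import '\([^']*\)': no such pool available.*/\1 \2/p"
}
```

For example, `pool_import_failures < test.log` (where `test.log` is a placeholder file name) prints one host and pool pair per matching line.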

            People

              Assignee: wc-triage WC Triage
              Reporter: maloo Maloo