Lustre / LU-3264

recovery-*-scale tests failed with FSTYPE=zfs and FAILURE_MODE=HARD

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.4.1, Lustre 2.5.0
    • Affects Version/s: Lustre 2.4.0
    • Environment: FSTYPE=zfs
      FAILURE_MODE=HARD
    • Severity: 3
    • 8083

    Description

      While running recovery-*-scale tests with FSTYPE=zfs and FAILURE_MODE=HARD in a failover configuration, the tests failed as follows:

      Failing mds1 on wtm-9vm3
      + pm -h powerman --off wtm-9vm3
      Command completed successfully
      waiting ! ping -w 3 -c 1 wtm-9vm3, 4 secs left ...
      waiting ! ping -w 3 -c 1 wtm-9vm3, 3 secs left ...
      waiting ! ping -w 3 -c 1 wtm-9vm3, 2 secs left ...
      waiting ! ping -w 3 -c 1 wtm-9vm3, 1 secs left ...
      waiting for wtm-9vm3 to fail attempts=3
      + pm -h powerman --off wtm-9vm3
      Command completed successfully
      reboot facets: mds1
      + pm -h powerman --on wtm-9vm3
      Command completed successfully
      Failover mds1 to wtm-9vm7
      04:28:49 (1367234929) waiting for wtm-9vm7 network 900 secs ...
      04:28:49 (1367234929) network interface is UP
      CMD: wtm-9vm7 hostname
      mount facets: mds1
      Starting mds1:   lustre-mdt1/mdt1 /mnt/mds1
      CMD: wtm-9vm7 mkdir -p /mnt/mds1; mount -t lustre   		                   lustre-mdt1/mdt1 /mnt/mds1
      wtm-9vm7: mount.lustre: lustre-mdt1/mdt1 has not been formatted with mkfs.lustre or the backend filesystem type is not supported by this tool
      Start of lustre-mdt1/mdt1 on mds1 failed 19
      

      Maloo report: https://maloo.whamcloud.com/test_sets/ac7cbc10-b0e3-11e2-b2c4-52540035b04c
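
      For context, here is a minimal sketch (not taken from the test logs) of what the failing step amounts to when done by hand on the failover node; the forced pool import is the piece that appears to be missing before the mount is attempted:

      # On wtm-9vm7 the pool is still owned by the powered-off primary wtm-9vm3,
      # so lustre-mdt1/mdt1 is not visible to mount.lustre until the pool is
      # force-imported on the failover node.
      zpool import -f -o cachefile=none lustre-mdt1
      mkdir -p /mnt/mds1
      mount -t lustre lustre-mdt1/mdt1 /mnt/mds1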

      Attachments

        Activity

          [LU-3264] recovery-*-scale tests failed with FSTYPE=zfs and FAILURE_MODE=HARD

          bzzz Alex Zhuravlev added a comment -

          With REFORMAT=y FSTYPE=zfs sh llmount.sh -v I'm getting:

          Format mds1: lustre-mdt1/mdt1
          CMD: centos grep -c /mnt/mds1' ' /proc/mounts
          CMD: centos lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
          CMD: centos ! zpool list -H lustre-mdt1 >/dev/null 2>&1 ||
          zpool export lustre-mdt1
          CMD: centos /work/lustre/head1/lustre/tests/../utils/mkfs.lustre --mgs --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=lov.stripesize=1048576 --param=lov.stripecount=0 --param=mdt.identity_upcall=/work/lustre/head1/lustre/tests/../utils/l_getidentity --backfstype=zfs --device-size=200000 --reformat lustre-mdt1/mdt1 /tmp/lustre-mdt1

          Permanent disk data:
          Target: lustre:MDT0000
          Index: 0
          Lustre FS: lustre
          Mount type: zfs
          Flags: 0x65
          (MDT MGS first_time update )
          Persistent mount opts:
          Parameters: sys.timeout=20 lov.stripesize=1048576 lov.stripecount=0 mdt.identity_upcall=/work/lustre/head1/lustre/tests/../utils/l_getidentity

          mkfs_cmd = zpool create -f -O canmount=off lustre-mdt1 /tmp/lustre-mdt1
          mkfs_cmd = zfs create -o canmount=off -o xattr=sa lustre-mdt1/mdt1
          Writing lustre-mdt1/mdt1 properties
          lustre:version=1
          lustre:flags=101
          lustre:index=0
          lustre:fsname=lustre
          lustre:svname=lustre:MDT0000
          lustre:sys.timeout=20
          lustre:lov.stripesize=1048576
          lustre:lov.stripecount=0
          lustre:mdt.identity_upcall=/work/lustre/head1/lustre/tests/../utils/l_getidentity
          CMD: centos zpool set cachefile=none lustre-mdt1
          CMD: centos ! zpool list -H lustre-mdt1 >/dev/null 2>&1 ||
          zpool export lustre-mdt1
          ...
          Loading modules from /work/lustre/head1/lustre/tests/..
          detected 2 online CPUs by sysfs
          Force libcfs to create 2 CPU partitions
          debug=vfstrace rpctrace dlmtrace neterror ha config ioctl super
          subsystem_debug=all -lnet -lnd -pinger
          gss/krb5 is not supported
          Setup mgs, mdt, osts
          CMD: centos mkdir -p /mnt/mds1
          CMD: centos zpool import -f -o cachefile=none lustre-mdt1
          cannot import 'lustre-mdt1': no such pool available

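          A minimal sketch of one plausible explanation, assuming the pool really does live on the file vdev /tmp/lustre-mdt1 shown in the mkfs.lustre command above: with cachefile=none set and the pool exported, a bare "zpool import" only scans /dev, so a file-backed pool is reported as unavailable unless the search directory is given explicitly:

          # Illustrative only (not the landed patch): point the device search at the
          # directory holding the vdev file so the exported pool can be found again.
          zpool import -f -d /tmp -o cachefile=none lustre-mdt1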

          bzzz Alex Zhuravlev added a comment -

          Can you confirm that the patch works on a local setup?

          yujian Jian Yu added a comment -

          Patch was landed on master branch.

          yujian Jian Yu added a comment -

          Patch for master branch is in http://review.whamcloud.com/6258.

          liwei Li Wei (Inactive) added a comment -

          Thanks, Brian. There's little info like this on the web. (Perhaps it would be worthwhile to add an FAQ entry on zfsonlinux.org sometime.)

          behlendorf Brian Behlendorf added a comment -

          Until we have MMP for ZFS, we've resolved this issue by delegating full authority for starting and stopping servers to heartbeat. See the ZPOOL_IMPORT_ARGS='-f' line in the lustre/scripts/Lustre.ha_v2 resource script, which always force-imports the pool. We also boot all of our nodes diskless, so they never have a persistent cache file and thus no pool ever gets automatically imported. I admit it's a stopgap until we have real MMP, but in practice it has been working so far.

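          A rough, hypothetical sketch of that division of labour. Only the ZPOOL_IMPORT_ARGS='-f' setting is taken from the comment above; the function names, pool name, and mount point are illustrative, and the real logic lives in the lustre/scripts/Lustre.ha_v2 resource script:

          # Only the HA stack ever imports the pool; the forced import lets the
          # standby take over a pool left imported by a node that died uncleanly.
          ZPOOL_IMPORT_ARGS='-f'

          ha_start() {
              zpool import $ZPOOL_IMPORT_ARGS lustre-mdt1 &&
                  mount -t lustre lustre-mdt1/mdt1 /mnt/mds1
          }

          ha_stop() {
              umount /mnt/mds1
              zpool export lustre-mdt1
          }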

          adilger Andreas Dilger added a comment -

          This kind of problem is why we would want to have MMP for ZFS, but that hasn't been developed yet. However, for the sake of this bug, we just need to fix the ZFS import problem so that our automated testing scripts work.

          yujian Jian Yu added a comment -

          It would be great if two nodes and a shared device are available to experiments.

          Let me set up the test environment and do some experiments.

          liwei Li Wei (Inactive) added a comment -

          (CC'ed Brian. How does LLNL implement failovers with ZFS?)

          The pool lustre-mdt1 needs to be imported via "zpool import -f ..." on wtm-9vm7. The tricky part, however, is how to prevent wtm-9vm3 from playing with the pool after rebooting. It might be doable by never caching Lustre pool configurations ("-o cachefile=none" at creation time), so that none of them will be automatically imported anywhere. It would be great if two nodes and a shared device were available for experiments.
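
          A sketch of the proposed experiment, assuming a shared block device visible to both nodes (the device path below is a placeholder):

          # Primary node: create the pool without a persistent cache file so that no
          # host auto-imports it at boot (the test logs above achieve the same effect
          # with "zpool set cachefile=none").
          zpool create -f -O canmount=off -o cachefile=none lustre-mdt1 /dev/disk/by-id/SHARED_DEVICE

          # Failover node, once the primary has been powered off: take the pool over
          # by force and mount the target.
          zpool import -f -o cachefile=none lustre-mdt1
          mount -t lustre lustre-mdt1/mdt1 /mnt/mds1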

          People

            Assignee: yujian Jian Yu
            Reporter: yujian Jian Yu
            Votes: 0
            Watchers: 12

            Dates

              Created:
              Updated:
              Resolved: