Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.11.0, Lustre 2.10.2
    • Affects Version/s: Lustre 2.10.0
    • Environment: SLES12 SP2, Lustre community pre-release 2.10
    • Severity: 2

    Description

      When trying to format ZFS OSTs, we get the following error:
      mkfs.lustre --reformat --mdt --mgs --servicenode= 211@o2ib --backfstype=zfs --fsname=tempAA --index=0 mgs/mdt

      Permanent disk data:
      Target: tempAA:MDT0000
      Index: 0
      Lustre FS: tempAA
      Mount type: zfs
      Flags: 0x1065
      (MDT MGS first_time update no_primnode )
      Persistent mount opts:
      Parameters: failover.node=10.10.10.213@o2ib:10.10.10.211@o2ib

      mkfs.lustre FATAL: spl_hostid not set. See mkfs.lustre(8)
      mkfs.lustre FATAL: mkfs failed 22
      mkfs.lustre: exiting with 22 (Invalid argument)

        Activity

          [LU-9752] Unable to format zfs osts
          pjones Peter Jones added a comment -

          Landed for 2.11

          gerrit Gerrit Updater added a comment -

          Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/29327/
          Subject: LU-9752 man: Reference zgenhostid instead of genhostid
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: a1eb6de081473545fbd5c1fe33e209fe391bf708

          gerrit Gerrit Updater added a comment -

          Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: https://review.whamcloud.com/29327
          Subject: LU-9752 man: Reference zgenhostid instead of genhostid
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: bbb792bc4373a999f0bbc1fd0dba62faf475b7f7

          adilger Andreas Dilger added a comment -

          Nathaniel, could you please submit a patch that updates the mkfs.lustre.8 and tunefs.lustre.8 man pages to reference zgenhostid instead of genhostid?

          utopiabound Nathaniel Clark added a comment -

          zgenhostid made it into ZFS 0.7.0.

          adilger Andreas Dilger added a comment -

          The ZFS MMP implementation still depends on hostid to determine if the pool is being imported on the same node. There is a patch to create a zgenhostid command to make up for the lack of genhostid on SLES and Ubuntu, but that may only become available with ZFS 0.8.0, so I still think it makes sense for us to add our own genhostid command for those distros until it becomes available. At that point, we should consider changing our documentation to reference zgenhostid, since the upstream genhostid maintainer has said the command may be removed from RHEL completely.
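
          The preference described above (zgenhostid where ZFS provides it, genhostid otherwise) could be sketched as a small shell helper. This is only an illustration: pick_hostid_tool is a hypothetical name, not part of Lustre or ZFS, and the default candidate order is an assumption.

```shell
#!/bin/sh
# Sketch only: pick_hostid_tool is a hypothetical helper, not a Lustre or
# ZFS command. It prints the first candidate command found on PATH.
pick_hostid_tool() {
    # Default candidate order is an assumption: prefer zgenhostid (ZFS),
    # then fall back to genhostid (RHEL initscripts).
    if [ "$#" -eq 0 ]; then
        set -- zgenhostid genhostid
    fi
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    # None of the candidates is installed.
    return 1
}
```

          On a host where neither command is installed, the function returns non-zero, in which case the hostid file would have to be created some other way.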

          utopiabound Nathaniel Clark added a comment -

          Has this become irrelevant with ZFS 0.7.0's MMP, since hostid isn't used for that?

          adilger Andreas Dilger added a comment - - edited

          I wonder if it makes sense to add the above script into a genhostid command as part of osd-zfs-mount.rpm for SLES, so that users don't run into this problem?

          The genhostid command is part of the RHEL initscripts package, while hostid is part of coreutils, so it does appear that genhostid is unique to RHEL, but it would be convenient to have this available on SLES as well.

          malkolm Malcolm Cowe (Inactive) added a comment -

          Thanks for the update. I did not realise that genhostid is not available on SLES. I did discover the following alternative method using Bash, for future reference:

          h=`hostid`; a=${h:6:2}; b=${h:4:2}; c=${h:2:2}; d=${h:0:2}
          sudo sh -c "echo -ne \"\x$a\x$b\x$c\x$d\" > /etc/hostid"

          For example:

          vagrant@sl12sp2-b:~> hostid
          007f0100
          vagrant@sl12sp2-b:~> h=`hostid`; a=${h:6:2}; b=${h:4:2}; c=${h:2:2}; d=${h:0:2}
          vagrant@sl12sp2-b:~> sudo sh -c "echo -ne \"\x$a\x$b\x$c\x$d\" > /etc/hostid"
          vagrant@sl12sp2-b:~> od -An -tx /etc/hostid
           007f0100
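
          The byte reordering in the method above can also be written without Bash substring syntax or echo -ne. The following is a minimal POSIX sh sketch, assuming a little-endian server (as in the example output). write_hostid_file and hex_byte_to_raw are hypothetical helper names, and the output path is passed in explicitly so the function can be tried on a scratch file before touching /etc/hostid.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.

# hex_byte_to_raw HH: emit one raw byte given two hex digits (e.g. "7f").
hex_byte_to_raw() {
    # printf understands octal escapes, so convert hex to octal first.
    printf "\\$(printf '%03o' "0x$1")"
}

# write_hostid_file HEX PATH: write an 8-hex-digit hostid to PATH as four
# bytes, least-significant byte first (assumes a little-endian machine).
write_hostid_file() {
    hex=$1
    out=$2
    case $hex in
        [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]) ;;
        *) echo "expected 8 hex digits, got: $hex" >&2; return 1 ;;
    esac
    {
        # Walk the hex string from its last byte to its first.
        for pos in 7 5 3 1; do
            byte=$(printf '%s' "$hex" | cut -c"$pos"-"$((pos + 1))")
            hex_byte_to_raw "$byte"
        done
    } > "$out"
}
```

          For example, running write_hostid_file "$(hostid)" /tmp/hostid.test and then od -An -tx /tmp/hostid.test should, on a little-endian machine, show the same value that hostid printed.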

          abea@supermicro.com Abe (Inactive) added a comment -

          Hi Malcolm,
          I had to create the hostid file manually, since there is no genhostid CLI on SUSE Linux.
          That solved the problem; please go ahead and close this ticket.

          Thank you,
          Abe

          malkolm Malcolm Cowe (Inactive) added a comment - - edited

          Ignoring, for the moment, that the --servicenode option has an invalid IPv4 address for the NID, the error returned by mkfs states that the server's hostid is not set persistently. A persistent hostid is needed to reduce the chances of double-importing a ZFS pool across multiple servers.

          This is a recent change in behaviour for the mkfs.lustre command, designed to ensure that additional protection of the ZFS storage pools is in place before creating the file system target.

          Try running the following command:

          genhostid
          
          

          This will create a file, /etc/hostid, containing a persistent hostid for the server. To activate, either reload the spl.ko module, or reboot the server. In my experience, the server usually needs to be rebooted, but that may have been fixed in the most recent ZFS versions. When the reboot is complete, the SPL kernel module will pick up the hostid automatically when it is loaded. This will allow the format command to complete.

          Once rebooted, you can verify that the hostid is set properly by comparing the output of the hostid command to the content of /etc/hostid. For example:

          [root@ct73-oss2 ~]# hostid
          000bea11
          [root@ct73-oss2 ~]# od -An -tx /etc/hostid
           000bea11
          
          

          The hostid needs to be set for all Lustre servers with ZFS storage targets.
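
          The comparison described above can be scripted so that it can be repeated on every server. This is a rough sketch, assuming od prints the file in host byte order as in the example, and treating an all-zero value as unset (an assumption of this sketch). read_hostid_file and hostid_is_persistent are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.

# read_hostid_file PATH: print the first 4 bytes of PATH as one 32-bit hex
# word in host byte order, the same view as `od -An -tx /etc/hostid`.
read_hostid_file() {
    od -An -tx4 -N4 "$1" | tr -d ' \n'
}

# hostid_is_persistent [PATH] [EXPECTED]: succeed when the value stored in
# PATH (default /etc/hostid) equals EXPECTED (default: output of `hostid`)
# and is not all zeros.
hostid_is_persistent() {
    path=${1:-/etc/hostid}
    expected=${2:-$(hostid)}
    if [ ! -r "$path" ]; then
        echo "$path is missing or unreadable" >&2
        return 1
    fi
    actual=$(read_hostid_file "$path")
    if [ "$actual" = "00000000" ]; then
        # Assumption: a zero hostid is treated as "not set".
        echo "hostid in $path is zero" >&2
        return 1
    fi
    if [ "$actual" != "$expected" ]; then
        echo "mismatch: file has $actual, expected $expected" >&2
        return 1
    fi
    echo "hostid $actual is set persistently in $path"
}
```

          Called with no arguments, hostid_is_persistent compares /etc/hostid against the output of the hostid command. The optional arguments exist only so the check can be tried against a scratch file.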

          People

            Assignee: utopiabound Nathaniel Clark
            Reporter: abea@supermicro.com Abe (Inactive)
            Votes: 0
            Watchers: 7
