[LU-9752] Unable to format zfs osts Created: 07/Jul/17  Updated: 15/Nov/17  Resolved: 19/Oct/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.0
Fix Version/s: Lustre 2.11.0, Lustre 2.10.2

Type: Bug Priority: Major
Reporter: Abe Assignee: Nathaniel Clark
Resolution: Fixed Votes: 0
Labels: zfs
Environment:

SLES12 SP2
Lustre community pre-release 2.10


Epic/Theme: zfs
Severity: 2
Epic: zfs
Rank (Obsolete): 9223372036854775807

 Description   

When trying to format ZFS OSTs, we get the following error:
mkfs.lustre --reformat --mdt --mgs --servicenode= 211@o2ib --backfstype=zfs --fsname=tempAA --index=0 mgs/mdt

Permanent disk data:
Target: tempAA:MDT0000
Index: 0
Lustre FS: tempAA
Mount type: zfs
Flags: 0x1065
(MDT MGS first_time update no_primnode )
Persistent mount opts:
Parameters: failover.node=10.10.10.213@o2ib:10.10.10.211@o2ib

mkfs.lustre FATAL: spl_hostid not set. See mkfs.lustre(8)
mkfs.lustre FATAL: mkfs failed 22
mkfs.lustre: exiting with 22 (Invalid argument)



 Comments   
Comment by Peter Jones [ 07/Jul/17 ]

Nathaniel

Can you please advise?

Peter

Comment by Malcolm Cowe (Inactive) [ 09/Jul/17 ]

Ignoring, for the moment, the --servicenode option having an invalid IPv4 address for the NID, the error returned by mkfs is stating that the server's hostid is not set persistently. This is needed in order to reduce the chances of double-importing a ZFS pool across multiple servers. 

This is a recent change in behaviour for the mkfs.lustre command, designed to ensure that additional protection of the ZFS storage pools is in place before creating the file system target.

Try running the following command:

genhostid

This will create a file, /etc/hostid, containing a persistent hostid for the server. To activate, either reload the spl.ko module, or reboot the server. In my experience, the server usually needs to be rebooted, but that may have been fixed in the most recent ZFS versions. When the reboot is complete, the SPL kernel module will pick up the hostid automatically when it is loaded. This will allow the format command to complete.

Once rebooted, you can verify that the hostid is set properly by comparing the output of the hostid command to the content of /etc/hostid. For example:

[root@ct73-oss2 ~]# hostid
000bea11
[root@ct73-oss2 ~]# od -An -tx /etc/hostid
 000bea11

The hostid needs to be set for all Lustre servers with ZFS storage targets.
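
The comparison described above can be wrapped in a small helper. This is only a sketch: the function names hostid_from_file and hostid_matches are made up for illustration, and it assumes the hostid file holds a single 32-bit value in the host's native byte order, which is how od -tx4 reads it.

```shell
#!/bin/sh
# Print the 32-bit value stored in a hostid file as 8 hex digits,
# read in the host's native byte order (same as od -tx4).
hostid_from_file() {
    od -An -tx4 -N4 "$1" | tr -d ' \n'
}

# Succeed if the file matches the expected hostid.
# $1: path to the hostid file
# $2: expected value (optional; defaults to the output of hostid)
hostid_matches() {
    file=$1
    want=${2:-$(hostid)}
    [ -r "$file" ] || return 2
    [ "$(hostid_from_file "$file")" = "$want" ]
}
```

On a server, something like hostid_matches /etc/hostid && echo "hostid is persistent" could be run after the reboot and before retrying mkfs.lustre.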

Comment by Abe [ 10/Jul/17 ]

Hi Malcolm,
I had to create the hostid file manually, since there is no genhostid command on SUSE Linux.
That solved the problem; please go ahead and close this ticket.

thank you
Abe

Comment by Malcolm Cowe (Inactive) [ 11/Jul/17 ]

Thanks for the update. I did not realise that genhostid is not available on SLES. I did discover the following alternative method using BASH, for future reference:

h=`hostid`; a=${h:6:2}; b=${h:4:2}; c=${h:2:2}; d=${h:0:2}
sudo sh -c "echo -ne \"\x$a\x$b\x$c\x$d\" > /etc/hostid"

For example:

vagrant@sl12sp2-b:~> hostid
007f0100
vagrant@sl12sp2-b:~> h=`hostid`; a=${h:6:2}; b=${h:4:2}; c=${h:2:2}; d=${h:0:2}
vagrant@sl12sp2-b:~> sudo sh -c "echo -ne \"\x$a\x$b\x$c\x$d\" > /etc/hostid"
vagrant@sl12sp2-b:~> od -An -tx /etc/hostid
 007f0100
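
A variant of the same idea that avoids echo -e and \x escapes, which not every /bin/sh supports, might look like the sketch below. The function name write_hostid_le is hypothetical, and the byte order (least significant byte first) is an assumption that holds for little-endian hosts such as x86_64.

```shell
#!/bin/sh
# Write an 8-hex-digit hostid to a file as 4 raw bytes,
# least significant byte first (little-endian layout).
# $1: hostid as 8 hex digits, e.g. 007f0100
# $2: output file, e.g. /etc/hostid
write_hostid_le() {
    h=$1
    out=$2
    # Reject anything that is not exactly 8 hex digits.
    case $h in
        *[!0-9a-fA-F]*|"") return 1 ;;
    esac
    [ "${#h}" -eq 8 ] || return 1
    n=$((0x$h))
    # POSIX printf understands octal escapes but not \x,
    # so build a format string of four \ooo escapes and print it.
    fmt=$(printf '\\%03o\\%03o\\%03o\\%03o' \
        $((n & 255)) $(((n >> 8) & 255)) \
        $(((n >> 16) & 255)) $(((n >> 24) & 255)))
    printf "$fmt" > "$out"
}
```

Run as root, write_hostid_le "$(hostid)" /etc/hostid would be the equivalent of the two commands above; od -An -tx /etc/hostid should then print the same value as hostid.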

 

Comment by Andreas Dilger [ 13/Jul/17 ]

I wonder if it makes sense to add the above script into a genhostid command as part of osd-zfs-mount.rpm for SLES, so that users don't run into this problem?

The genhostid command is part of the RHEL initscripts package, while hostid is part of coreutils, so it does appear that genhostid is unique to RHEL, but it would be convenient to have this available on SLES as well.

Comment by Nathaniel Clark [ 18/Aug/17 ]

Has this become irrelevant with ZFS 0.7.0's MMP, since hostid isn't used for that?

Comment by Andreas Dilger [ 18/Aug/17 ]

The ZFS MMP implementation still depends on hostid to determine if the pool is being imported on the same node. There is a patch to create a zgenhostid command to make up for the lack of genhostid on SLES and Ubuntu, but that may only become available with ZFS 0.8.0, so I still think it makes sense for us to add our own genhostid command for those distros until it becomes available. At that point, we should consider changing our documentation to reference zgenhostid, since the upstream genhostid maintainer has said the command may be removed from RHEL completely.

Comment by Nathaniel Clark [ 21/Sep/17 ]

zgenhostid made it into 0.7.0.
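
For reference, a guarded call to zgenhostid could be sketched as follows. The wrapper name ensure_hostid and its parameters are hypothetical, and it assumes that running zgenhostid with no arguments writes a new /etc/hostid; check zgenhostid(8) for the installed ZFS version before relying on it.

```shell
#!/bin/sh
# Generate a persistent hostid only if one is not already present.
# $1: hostid file to check (default /etc/hostid)
# $2: generator command (default zgenhostid; assumed to write /etc/hostid)
ensure_hostid() {
    file=${1:-/etc/hostid}
    gen=${2:-zgenhostid}
    # Leave an existing, non-empty hostid file alone.
    if [ -s "$file" ]; then
        echo "hostid already set in $file"
        return 0
    fi
    # Fail clearly if the generator is not installed.
    if ! command -v "$gen" >/dev/null 2>&1; then
        echo "$gen not found" >&2
        return 1
    fi
    "$gen"
}
```

As with genhostid, the SPL module still has to be reloaded, or the server rebooted, before mkfs.lustre sees the new value.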

Comment by Andreas Dilger [ 21/Sep/17 ]

Nathaniel, could you please submit a patch that changes our references from genhostid to zgenhostid in the mkfs.lustre.8 and tunefs.lustre.8 man pages?

Comment by Gerrit Updater [ 05/Oct/17 ]

Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: https://review.whamcloud.com/29327
Subject: LU-9752 man: Reference zgenhostid instead of genhostid
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: bbb792bc4373a999f0bbc1fd0dba62faf475b7f7

Comment by Gerrit Updater [ 19/Oct/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/29327/
Subject: LU-9752 man: Reference zgenhostid instead of genhostid
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: a1eb6de081473545fbd5c1fe33e209fe391bf708

Comment by Peter Jones [ 19/Oct/17 ]

Landed for 2.11

Comment by Gerrit Updater [ 26/Oct/17 ]

Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/29805
Subject: LU-9752 man: Reference zgenhostid instead of genhostid
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: 58b218547b24069fffef45eb22a08439b632e548

Comment by Gerrit Updater [ 15/Nov/17 ]

John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/29805/
Subject: LU-9752 man: Reference zgenhostid instead of genhostid
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: 3dd7a41a9262635c42b8ba575d743b60e27ddbae

Generated at Sat Feb 10 02:28:55 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.