Details
- New Feature
- Resolution: Fixed
- Minor
Description
A new resource agent (RA) script for Pacemaker that manages ZFS pools and Lustre targets.
The RA imports and exports ZFS pools and mounts and unmounts Lustre targets. A resource is created with:
pcs resource create <Resource Name> ocf:heartbeat:LustreZFS \
    pool="<ZFS Pool Name>" \
    volume="<ZFS Volume Name>" \
    mountpoint="<Mount Point>" \
    OCF_CHECK_LEVEL=10
where:
- pool is the name of the ZFS pool, created in advance
- volume is the name of the volume created on the ZFS pool when the Lustre target is formatted (mkfs.lustre)
- mountpoint is the mount point, created in advance on both Lustre servers
- OCF_CHECK_LEVEL is optional and enables an additional monitor check on the status of the pool
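As an illustration of the parameters above, the following sketch assembles the creation command for a single target. The resource, pool, volume and mount point names (ost0, ostpool0, ost0, /lustre/ost0) are hypothetical placeholders, and the command is only printed so it can be reviewed before being run on a cluster node.

```shell
#!/bin/sh
# Hypothetical names for illustration only; replace with the site's values.
RES_NAME="ost0"
POOL="ostpool0"
VOLUME="ost0"
MOUNTPOINT="/lustre/ost0"

# build_create_cmd NAME POOL VOLUME MOUNTPOINT
# Prints the pcs command instead of executing it (dry run).
build_create_cmd() {
    printf 'pcs resource create %s ocf:heartbeat:LustreZFS pool="%s" volume="%s" mountpoint="%s" OCF_CHECK_LEVEL=10\n' \
        "$1" "$2" "$3" "$4"
}

build_create_cmd "$RES_NAME" "$POOL" "$VOLUME" "$MOUNTPOINT"
```

Since OCF_CHECK_LEVEL is optional, it can be dropped from the printed command when the extra pool check is not wanted.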
The script must be installed in /usr/lib/ocf/resource.d/heartbeat/ on both Lustre servers, with permissions 755.
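A minimal sketch of the installation step, assuming the agent has been downloaded to the current directory as ./LustreZFS (that local path is an assumption; the target directory and mode are the ones stated above). The helper takes the destination directory as an argument so it can be tried against a scratch directory first.

```shell
#!/bin/sh
# install_ra SRC DEST_DIR
# Copies the agent into DEST_DIR as LustreZFS with mode 755.
install_ra() {
    install -m 755 "$1" "$2/LustreZFS"
}

# Example, to be run as root on each Lustre server:
# install_ra ./LustreZFS /usr/lib/ocf/resource.d/heartbeat
```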
The script protects against double imports of the pools. To activate this protection, it is important to configure the hostid protection in ZFS using the genhostid command.
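The check below is a sketch of how the hostid prerequisite might be verified on each server before the resource is created. It assumes that genhostid writes the host ID to /etc/hostid, as it does on RHEL/CentOS 7; the helper name has_hostid and the HOSTID_FILE variable are introduced here for illustration only.

```shell
#!/bin/sh
# Assumed location of the host ID file; override HOSTID_FILE to test elsewhere.
HOSTID_FILE="${HOSTID_FILE:-/etc/hostid}"

# has_hostid FILE: succeeds if FILE exists and is not empty.
has_hostid() {
    [ -s "$1" ]
}

if has_hostid "$HOSTID_FILE"; then
    echo "host ID file present: $HOSTID_FILE"
else
    echo "host ID file missing: run genhostid as root on this server"
fi
```

Running hostid afterwards on both servers shows the resulting values, which should differ between the two nodes for the protection to be meaningful.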
Default values:
- no defaults
Default timeouts:
- start timeout 300s
- stop timeout 300s
- monitor timeout 300s interval 20s
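If these values need to be set explicitly on a resource rather than left to the agent's defaults, they can be passed as operation options. The sketch below prints the `op` arguments that would be appended to a pcs resource create command; the variable and function names are illustrative and the timeouts are the defaults listed above.

```shell
#!/bin/sh
# Timeouts taken from the defaults listed above.
START_TIMEOUT="300s"
STOP_TIMEOUT="300s"
MONITOR_TIMEOUT="300s"
MONITOR_INTERVAL="20s"

# Prints the "op" arguments to append to a pcs resource create command.
build_op_args() {
    printf 'op start timeout=%s op stop timeout=%s op monitor timeout=%s interval=%s\n' \
        "$START_TIMEOUT" "$STOP_TIMEOUT" "$MONITOR_TIMEOUT" "$MONITOR_INTERVAL"
}

build_op_args
```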
Compatible and tested:
- pacemaker 1.1.13
- corosync 2.3.4
- pcs 0.9.143
- RHEL/CentOS 7.2
Issue Links
- duplicates LU-8458 Pacemaker script to monitor Lustre servers status (Resolved)
I checked what the pcs resource list command could find with respect to ZFS, and here is what I got:
pcs resource list | grep -i zfs
ocf:heartbeat:Lustre-MDS-ZFS - Lustre and ZFS management when the MDT and MGT
ocf:heartbeat:LustreZFS - Lustre and ZFS management
ocf:llnl:lustre - Lustre ZFS OSD resource agent
ocf:llnl:zpool - ZFS zpool resource agent
ocf:pacemaker:Lustre-MDS-ZFS - Lustre and ZFS management when the MDT and MGT
ocf:pacemaker:LustreZFS - Lustre and ZFS management
ls /usr/lib/ocf/resource.d/pacemaker
ClusterMon Dummy healthLNET HealthSMART LustreZFS pingd Stateful SystemHealth
controld HealthCPU healthLUSTRE Lustre-MDS-ZFS ping remote SysInfo
ls /usr/lib/ocf/resource.d/heartbeat
apache Delay exportfs healthLUSTRE iSCSILogicalUnit LVM nfsnotify oralsnr redis Squid
clvm dhcpd Filesystem iface-vlan iSCSITarget MailTo nfsserver pgsql Route symlink
conntrackd docker galera IPaddr Lustre-MDS-ZFS mysql nginx portblock rsyncd tomcat
CTDB Dummy garbd IPaddr2 LustreZFS nagios ocf-rarun postfix SendArp VirtualDomain
db2 ethmonitor healthLNET IPsrcaddr named oracle rabbitmq-cluster slapd Xinetd
ls /usr/lib/ocf/resource.d/llnl/
lustre zpool
The LLNL agents were installed yesterday by another staff member, and we were able to create the resources successfully using the LLNL RA scripts, but not the Intel ones:
Online: [ mds00 mds01 ]
Full list of resources:
hammer_io6 (stonith:fence_powerman): Started mds00
Anyway, if you have any other suggestions, I'd welcome them, because I'd prefer to use a vendor RA but will settle for the LLNL one for the moment.
Thanks again for the support with this.
Cheers,