Lustre init scripts with heartbeat v1 integration

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.3.0
    • Lustre 2.1.0
    • None
    • 20,165
    • 4555

    Description

      This issue is to request that Lustre initialization scripts developed at LLNL be reviewed for inclusion in Lustre. A Gerrit submission for review is on its way.

          Activity

            [LU-107] Lustre init scripts with heartbeat v1 integration

            Hi Wally,

            Thanks for reporting these problems. We knew the init scripts would probably need work to properly support non-redhat distros. I don't currently have a SLES system to test on, but when I get a chance I'll try to bring up a VM to look into this.

            Are you just using 'make rpm'?

            nedbass Ned Bass (Inactive) added the comment above.

            When we do our packaging on SLES11 SP1/2, we run into the following problems:

            1. no LSB header information:

            E: File `lnet' without LSB header found in /var/tmp/cray-lustre-cray_gem_c-2.3_3.0.34_0.7.9_1.0000.6718.11.1-root/etc/init.d/
            E: File `lustre' without LSB header found in /var/tmp/cray-lustre-cray_gem_c-2.3_3.0.34_0.7.9_1.0000.6718.11.1-root/etc/init.d/

            2. need sysconfig.lustre in /var/adm/fillup-templates:

            cray-lustre-cray_gem_c: "/etc/sysconfig/lustre" is not allowed anymore in SuSE Linux.

            3. if failover is only for servers, it should probably be excluded from client build

            wang Wally Wang (Inactive) added the comment above.
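For readers unfamiliar with the first packaging error, the following is a minimal sketch of what an LSB comment block at the top of an init script looks like, plus a small check for its presence. The tag values (the `$network` dependency, runlevels, description) and the helper name `has_lsb_header` are illustrative assumptions, not taken from the scripts under review, and the check is not the SLES packaging tool itself.

```shell
#!/bin/sh
# Sketch only: the tag values below are illustrative assumptions,
# not the values used by the Lustre init scripts.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/sh
### BEGIN INIT INFO
# Provides:          lnet
# Required-Start:    $network
# Required-Stop:     $network
# Default-Start:     3 5
# Default-Stop:      0 1 2 6
# Short-Description: Start and stop LNET (illustrative)
### END INIT INFO
EOF

# Hypothetical helper: succeed only if both LSB block delimiters are present.
has_lsb_header() {
    grep -q '^### BEGIN INIT INFO' "$1" && grep -q '^### END INIT INFO' "$1"
}

if has_lsb_header "$script"; then
    lsb_status=present
else
    lsb_status=missing
fi
echo "LSB header: $lsb_status"
```

A script without the `### BEGIN INIT INFO` / `### END INIT INFO` delimiters would be reported as `missing` by this helper, which is the condition the packaging errors quoted above describe.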

            If a site hasn't configured /etc/ldev.conf then adding the /etc/init.d/lustre script should have no effect.

            That is unless there was already a site-specific /etc/init.d/lustre script in place and we overwrite it. The safest thing to do may be to mark it %config(noreplace) in the spec file.

            nedbass Ned Bass (Inactive) added the comment above.

            Doug,
            could you please have a look at how the patch in http://review.whamcloud.com/290 complements or conflicts with your proposed changes to LNET configuration? I believe you were already planning to start with an /etc/init.d/lnet startup file. Hopefully, landing this now will give you a starting point for your configuration changes, and users will already be aware of this script, so your changes will be transparent to them.

            adilger Andreas Dilger added the comment above.

            Hi Andreas,

            If a site hasn't configured /etc/ldev.conf then adding the /etc/init.d/lustre script should have no effect. That is, it won't start any services or interfere with whatever method was used to start lustre before updating. Also, it is not run by the init system by default; rather, it is intended to be started by an HA/failover mechanism such as heartbeat.

            /etc/init.d/lnet could start lnet sooner than was the case before updating, but I wouldn't expect that to break anything.

            nedbass Ned Bass (Inactive) added the comment above.
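The "no effect without /etc/ldev.conf" behavior described above can be pictured as a guard like the one below. This is a hedged sketch, not code from the landed init script: the helper name `has_ldev_entries`, the temporary file standing in for /etc/ldev.conf, and the sample entry (host names, label, device) are all assumptions made for illustration; see ldev.conf(5) for the real file format.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the actual init script: succeed only if
# the file is readable and has at least one line that is neither blank nor a comment.
has_ldev_entries() {
    [ -r "$1" ] && grep -q -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

# Temporary stand-in for /etc/ldev.conf containing only a comment and a blank line.
conf=$(mktemp)
printf '# local  foreign/-  label  device\n\n' > "$conf"

if has_ldev_entries "$conf"; then empty_result=start; else empty_result=noop; fi
echo "unconfigured site: $empty_result"

# Illustrative entry only; host names, label, and device are made up.
printf 'oss1 oss2 lustre-OST0000 /dev/sdb\n' >> "$conf"

if has_ldev_entries "$conf"; then populated_result=start; else populated_result=noop; fi
echo "configured site: $populated_result"
```

With a guard of this kind, a site that never populated the file sees the script do nothing, which matches the upgrade-safety concern raised in this thread.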

            Ned, Brian, are there any implications if /etc/init.d/lustre and /etc/init.d/lnet are suddenly added to an existing system?

            What we don't want is that someone upgrades to Lustre 2.4 from 2.1 and suddenly their system is unusable until they generate an /etc/ldev.conf or something. My assumption is that nobody ever reads the manual or release notes when upgrading, so if it doesn't work correctly "out of the box" then something was done incorrectly by the code(r).

            adilger Andreas Dilger added the comment above.

            Integrated in reviews-centos5 #418
            LU-107 Add scripts for implementing heartbeat v1 failover

            Ned Bass : ffdb6f1830fead020e79aab7c32b4633e6bdb179
            Files :

            • lustre/scripts/haconfig
            • lustre/autoconf/lustre-core.m4
            • lustre/doc/ldev.conf.5
            • build/autoconf/lustre-build.m4
            • lustre/doc/lhbadm.8
            • lustre/scripts/Lustre
            • lustre/doc/Makefile.am
            • lustre/scripts/lustre.in
            • lustre/doc/ldev.8
            • lustre/conf/ldev.conf
            • lustre/scripts/lhbadm
            • lustre/doc/nids.5
            • lustre/scripts/Makefile.am
            • lustre/conf/lustre
            • lustre/conf/Makefile.am
            • lustre/scripts/ldev
            • lustre/scripts/lustre
            • lustre/scripts/lnet
            • lustre.spec.in
            hudson Build Master (Inactive) added the comment above.

            Integrated in reviews-centos5 #416
            LU-107 Add scripts for implementing heartbeat v1 failover

            Ned Bass : a7de9d4b241454fa54e8e4638594240fec3bc82d
            Files :

            • lustre/scripts/lustre.in
            • lustre/conf/ldev.conf
            • lustre/scripts/lnet
            • lustre/doc/ldev.conf.5
            • lustre/autoconf/lustre-core.m4
            • lustre/doc/nids.5
            • lustre/scripts/lhbadm
            • lustre/scripts/haconfig
            • lustre/doc/lhbadm.8
            • lustre/conf/Makefile.am
            • lustre/conf/lustre
            • lustre.spec.in
            • build/autoconf/lustre-build.m4
            • lustre/scripts/lustre
            • lustre/doc/Makefile.am
            • lustre/doc/ldev.8
            • lustre/scripts/Lustre
            • lustre/scripts/ldev
            • lustre/scripts/Makefile.am
            hudson Build Master (Inactive) added the comment above.
            nedbass Ned Bass (Inactive) added a comment - That worked! Thanks Brian. http://build.whamcloud.com/job/reviews-ubuntu/256/

            > My apologies--I was accidentally using review/lustre instead of review/fs/lustre-release.

            No worries. Glad you figured out what it was.

            > I rebased and resubmitted and the ubuntu build still fails, but I think I understand why now. My patch removes lustre/scripts/lustre and replaces it with lustre/scripts/lustre.in (due to the name of the tune2fs executable being determined at configure time). So lustre/scripts/lustre gets auto-generated when configure is run. But following configure the build runs
            >
            > fakeroot debian/rules clean
            >
            > which reverts all the patches. Reverting my patch tries to recreate lustre/scripts/lustre, but this fails because it already exists (it was created by configure).

            Nice catch. I also discovered the same yesterday since I was looking at why this build was failing also. It is a "perfect storm" of conditions that causes this.

            > I suppose one way to fix this is to separate out the removal of lustre/scripts/lustre as a separate patch. Thoughts?

            I don't think that will fix it, but my changeset for LU-120 does. Perhaps you can cherry-pick that change and put it in front of yours and see if it fixes it. It does for me, locally.

            brian Brian Murrell (Inactive) added the comment above.
            nedbass Ned Bass (Inactive) added a comment (edited):

            My apologies--I was accidentally using review/lustre instead of review/fs/lustre-release.

            I rebased and resubmitted and the ubuntu build still fails, but I think I understand why now. My patch removes lustre/scripts/lustre and replaces it with lustre/scripts/lustre.in (due to the name of the tune2fs executable being determined at configure time). So lustre/scripts/lustre gets auto-generated when configure is run. But following configure the build runs

            fakeroot debian/rules clean

            which reverts all the patches. Reverting my patch tries to recreate lustre/scripts/lustre, but this fails because it already exists (it was created by configure). I suppose one way to fix this is to separate out the removal of lustre/scripts/lustre as a separate patch. Thoughts?

            Thanks,
            Ned
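The failure sequence described in this comment can be reproduced in miniature. The sketch below is a stand-in, not the real build: `sed` plays the role of configure, the `@TUNE2FS@` placeholder name is an assumption, and the shell's noclobber option (`set -C`) models a revert step that refuses to create a file that already exists. Only the file names `lustre.in` and `lustre` come from the thread.

```shell
#!/bin/sh
# Miniature reproduction in a scratch directory; every command is a stand-in.
work=$(mktemp -d)
cd "$work" || exit 1

# 1. After the patch is applied, only the template exists.
#    The @TUNE2FS@ placeholder name is an assumption for illustration.
printf '#!/bin/sh\nTUNE2FS=@TUNE2FS@\n' > lustre.in

# 2. Stand-in for configure: generate the script from the template.
sed 's|@TUNE2FS@|tune2fs|' lustre.in > lustre

# 3. Stand-in for reverting the patch: restoring the deleted file means
#    creating it fresh. With noclobber set, the redirection fails because
#    the generated file is already in the way.
set -C
if ( printf 'original script contents\n' > lustre ) 2>/dev/null; then
    revert_status=ok
else
    revert_status=conflict
fi
set +C
echo "revert: $revert_status"
```

Removing the generated file before the revert step (for example, as part of the clean target) would let the restore succeed in this model; whether that corresponds to the LU-120 changeset mentioned in the thread is not stated here.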


            People

              green Oleg Drokin
              nedbass Ned Bass (Inactive)