Details

    • Bug
    • Resolution: Fixed
    • Minor

    Description

      The "configuring failover" section in the Whamcloud release of the
      Lustre manual seems rather out of date:

      http://build.whamcloud.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.html#configuringfailover

      The Oracle release says much the same thing:
      http://wiki.lustre.org/manual/LustreManual20_HTML/ConfiguringFailover.html#50540588_50628

      In section 11.1.1 "Power management software", it says:

      "For more information about PowerMan, go to:
      https://computing.llnl.gov/linux/powerman.html"

      That page no longer exists. The link should probably point to
      http://code.google.com/p/powerman/

      Then, in section 11.2, "Setting up High-Availability (HA) Software with
      Lustre", it mentions "Red Hat Cluster Manager" and "Pacemaker".

      "Red Hat Cluster Manager" points to
      http://wiki.lustre.org/index.php/Using_Red_Hat_Cluster_Manager_with_Lustre

      which says "In comparison with other HA solutions, RedHat Cluster as in
      RHEL 5.5 is an old HA solution. We recommend using other HA solutions
      like Pacemaker, if possible."

      "Pacemaker" points to
      http://wiki.lustre.org/index.php/Using_Pacemaker_with_Lustre

      Although that page is titled "Using Pacemaker with Lustre", it starts
      off by saying "In modern clusters, OpenAIS, or more specifically, its
      communication stack corosync, is used for this task".

      In summary:

      1) The manual could do with some updating here.

      2) I suspect I should be using corosync.

        Activity

          [LUDOC-69] Lustre Manual needs updated Failover section

          Linda Bebernes (Inactive) added a comment:

          Changes reviewed and merged. Resolved.

          Linda Bebernes (Inactive) added a comment:

          Changes pushed to Gerrit and ready for review at http://review.whamcloud.com/8058

          Ch 3 Intro to Failover - edits for clarity, fixed missing figure
          Ch 11 Configuring Lustre Failover - major rewrite to update and
          clarify content
          Ch 13 Lustre Operations - edited failover-related entries for clarity,
          updated example from Elan to Ethernet, added cross-ref to Ch 11
          Ch 14 Lustre Maintenance - edited failover-related entries for clarity,
          added cross-ref to Ch 11
          Ch 20 MMP - changed chapter name from "Managing Failover" to
          "Lustre Failover and Multi-Mount Protection", minor edits, added cross-ref to Ch 11
          Ch 36 - updated --servicenode and --failnode descriptions for mkfs.lustre
          and tunefs.lustre

          Jodi Levi (Inactive) added a comment:

          Cliff,
          This is part of the quality improvement project for the Lustre Manual. Please feel free to work with Linda on this or reach out to her with questions.

          People

            linda Linda Bebernes (Inactive)
            cliffw Cliff White (Inactive)
            Votes: 0
            Watchers: 3
