Lustre / LU-3389

Lustre b2_1 build failed on RHEL6.4 with OFA IB stack

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.4.1, Lustre 2.5.0
    • Affects Version/s: Lustre 2.1.6

    • Environment:
      Distro: RHEL6.4
      Network: OFA IB
    • 3
    • 8392

    Description

      After http://review.whamcloud.com/5504 landed on the Lustre b2_1 branch, builds on the RHEL6.4 distro with the OFA IB stack have been failing:

      http://build.whamcloud.com/job/lustre-b2_1/198/
      http://build.whamcloud.com/job/lustre-b2_1/203/

      The OFA version is 1.5.4.

      Attachments

        Activity

          ihara Shuichi Ihara (Inactive) added a comment -

          http://review.whamcloud.com/7216 for b2_4
          http://review.whamcloud.com/7217 for b2_1

          ihara Shuichi Ihara (Inactive) added a comment -

          I ported the patch to b2_1 and b2_4 and will post both ports shortly.
          mdiep Minh Diep added a comment -

          b2_1 and b2_4 are now building with kernel 2.6.32-358.11.1, which has no external OFED support (or the support may be broken even if the build succeeds).
          yujian Jian Yu added a comment -

          Or is more work needed in this ticket for OFED?

          1) the patches need to be landed on Lustre b2_1 and b2_4 branches
          2) OFA builds need to be enabled on Jenkins


          jlevi Jodi Levi (Inactive) added a comment -

          Now that the 2 patches have landed, can this ticket be closed? Or is more work needed in this ticket for OFED?

          ihara Shuichi Ihara (Inactive) added a comment -

          The patches work for building OFED-3.5-1 against RHEL6.4, but Mellanox OFED currently uses different macro names.
          Here is a quick description of how to build MLNX_OFED_LINUX-2.0-2.0.5 against the latest Lustre-patched kernel based on RHEL6.4. For now, we need to add the macro names by hand, which we would rather avoid, because OFED does not detect the kernel version automatically. I asked Mellanox, and they seem to be working on a fix for a future release.

          # EXTRA_LNET_INCLUDE="-DCONFIG_COMPAT_IS_PHYS_ID_STATE -DCONFIG_COMPAT_IS_PCI_PHYSFN \
          -DCONFIG_COMPAT_IS_KSTRTOX -DCONFIG_COMPAT_IS_BITOP \
          -DCONFIG_COMPAT_NETLINK_3_7 -DCONFIG_COMPAT_IS_IP_TOS2PRIO \
          -DCONFIG_COMPAT_IS_NETIF_RSS_QUEUES -DCONFIG_COMPAT_IS_NOOP_LLSEEK \
          -DCONFIG_COMPAT_IS_SIMPLE_OPEN -DCONFIG_COMPAT_RCU \
          -DCONFIG_COMPAT_HAS_NUM_CHANNELS -DCONFIG_COMPAT_ETHTOOL_OPS_EXT" \
          ./configure --with-o2ib=/usr/src/ofa_kernel

          # EXTRA_LNET_INCLUDE="-DCONFIG_COMPAT_IS_PHYS_ID_STATE -DCONFIG_COMPAT_IS_PCI_PHYSFN \
          -DCONFIG_COMPAT_IS_KSTRTOX -DCONFIG_COMPAT_IS_BITOP \
          -DCONFIG_COMPAT_NETLINK_3_7 -DCONFIG_COMPAT_IS_IP_TOS2PRIO \
          -DCONFIG_COMPAT_IS_NETIF_RSS_QUEUES -DCONFIG_COMPAT_IS_NOOP_LLSEEK \
          -DCONFIG_COMPAT_IS_SIMPLE_OPEN -DCONFIG_COMPAT_RCU \
          -DCONFIG_COMPAT_HAS_NUM_CHANNELS -DCONFIG_COMPAT_ETHTOOL_OPS_EXT" make rpms
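
The configure and make rpms commands for MLNX_OFED_LINUX-2.0-2.0.5 repeat the same macro list. A minimal sketch, assuming a POSIX shell, of keeping that list in one place: the variable MLNX_COMPAT_MACROS and the helper lnet_include_flags are hypothetical names introduced for illustration, and the macro names are the ones quoted in the comment. This is not part of any landed patch.

```shell
# Macro suffixes copied from the comment; each becomes -DCONFIG_COMPAT_<suffix>.
# MLNX_COMPAT_MACROS and lnet_include_flags are hypothetical names.
MLNX_COMPAT_MACROS="IS_PHYS_ID_STATE IS_PCI_PHYSFN IS_KSTRTOX IS_BITOP
NETLINK_3_7 IS_IP_TOS2PRIO IS_NETIF_RSS_QUEUES IS_NOOP_LLSEEK
IS_SIMPLE_OPEN RCU HAS_NUM_CHANNELS ETHTOOL_OPS_EXT"

# Print the flags on one line, separated by single spaces.
lnet_include_flags() {
    flags=""
    # Unquoted on purpose: rely on word splitting to iterate over the suffixes.
    for m in $MLNX_COMPAT_MACROS; do
        flags="${flags:+$flags }-DCONFIG_COMPAT_$m"
    done
    printf '%s\n' "$flags"
}

# Usage (not executed here):
#   EXTRA_LNET_INCLUDE="$(lnet_include_flags)" ./configure --with-o2ib=/usr/src/ofa_kernel
#   EXTRA_LNET_INCLUDE="$(lnet_include_flags)" make rpms
```

Building the string once means the configure step and the make rpms step cannot drift apart if a macro is added or dropped later.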

          ihara Shuichi Ihara (Inactive) added a comment -

          http://review.whamcloud.com/6448

          patch to build master on RHEL6.4 + OFED-3.5.1.

          ihara Shuichi Ihara (Inactive) added a comment -

          Peter,
          I just figured out that the patches were not really needed to build Lustre with OFED-3.5.1 against the RHEL6.4 kernel.

          Here is a quick workaround for Lustre (master and b2_1) with the RHEL6.4 kernel and OFED-3.5.1.

          # EXTRA_LNET_INCLUDE="-DCONFIG_COMPAT_RHEL_6_4" ./configure --with-o2ib=/usr/src/compat-rdma --with-linux=/usr/src/kernels/2.6.32-358.6.1.el6_lustre.x86_64

          I've confirmed that the compile works with OFED-3.5.1, and I am working on auto-detecting RHEL6.4 and adding -DCONFIG_COMPAT_RHEL_6_4 to EXTRA_LNET_INCLUDE at configure time. I will do some tests and push patches soon.

          By the way, I would also suggest landing the patch for LU-3166 (http://review.whamcloud.com/6048). It is needed for bonding configurations with the OFED-3.x stack.
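
The auto-detection mentioned in this comment could look roughly like the following. This is only a sketch under stated assumptions, not the patch that was eventually pushed: the helper name rhel64_compat_flag is hypothetical, and it assumes that a kernel release string of the form 2.6.32-358.* carrying an el6 tag (as in the kernels named in this ticket) identifies RHEL 6.4.

```shell
# Hypothetical helper: print the compat flag when the kernel release string
# looks like a RHEL 6.4 kernel (2.6.32-358.* with an el6 tag); print nothing
# for any other release string.
rhel64_compat_flag() {
    case "$1" in
        2.6.32-358.*el6*) printf '%s\n' "-DCONFIG_COMPAT_RHEL_6_4" ;;
        *) : ;;
    esac
}

# Usage (not executed here); paths are the ones from the workaround command:
#   krel=2.6.32-358.6.1.el6_lustre.x86_64
#   EXTRA_LNET_INCLUDE="$(rhel64_compat_flag "$krel")" ./configure \
#       --with-o2ib=/usr/src/compat-rdma --with-linux=/usr/src/kernels/$krel
```

Matching on the release string of the target kernel tree, rather than on the running kernel, keeps the check valid when building for a kernel other than the one the build host is booted into.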
          pjones Peter Jones added a comment -

          Thanks Ihara! For the immediate releases scheduled for this quarter, I think the only options would be either to not test external OFED or to test OFA OFED, though I definitely think there is a case to be made for us to look at Mellanox OFED as a possibility for the future. I did raise this suggestion at the most recent CDWG, but there was not strong interest from others present. So, for the time being, do you expect to be able to supply a patch to allow us to support RHEL 6.4 and OFED 3.5.1 in the next day or so?

          ihara Shuichi Ihara (Inactive) added a comment -

          Yes, I've tested OFED-3.5.1 with RHEL6.4 and it works. However, it still needs some changes to build with Lustre.
          I'm working on this and will update here. By the way, MLNX_OFED_LINUX-2.0-2.0.5 is based on OFED-3.x, and it works on RHEL6.4 with Lustre for both servers and clients.
          ys Yang Sheng added a comment -

          It looks like 3.5.1 is present in the OFED daily builds. The creation date was May 22. I hope it is released soon.

          People

            mdiep Minh Diep
            yujian Jian Yu