Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.10.0

    Description

      Testing on a system that has 4 NUMA nodes, 1 of which has no CPUs assigned, triggered an assert in o2iblnd:

      LASSERT(sched->ibs_nthreads > 0);
      

      LU-6325 libcfs: shortcut to create CPT from NUMA topology
      introduced a method whereby setting the module parameter cpu_pattern to "N" or "n" creates the CPTs from the NUMA topology. This can expose the assert: a scheduler's ibs_nthreads can be 0 when no CPUs are assigned to the CPT to which the scheduler is bound (i.e. cfs_cpt_weight() for the CPT in question returns 0).
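
      The failure mode can be illustrated with a small userspace model. This is only a sketch and not the o2iblnd code: model_cpt_table, model_cpt_weight() and model_sched_nthreads() are hypothetical stand-ins, and the rule that the thread count is capped by the CPT's CPU count is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: ncpus[i] is the number of CPUs in CPT i. */
struct model_cpt_table {
	size_t     ncpt;
	const int *ncpus;
};

/* Stand-in for cfs_cpt_weight(): number of CPUs in the given CPT. */
static int model_cpt_weight(const struct model_cpt_table *tab, size_t cpt)
{
	assert(tab != NULL);
	return cpt < tab->ncpt ? tab->ncpus[cpt] : 0;
}

/* Assumed rule: use the requested thread count, capped by the CPT weight. */
static int model_sched_nthreads(const struct model_cpt_table *tab,
				size_t cpt, int wanted)
{
	int weight = model_cpt_weight(tab, cpt);

	return wanted < weight ? wanted : weight;
}
```

      With per-CPT CPU counts of {8, 8, 0, 8}, the third CPT has weight 0, so any cap based on the weight yields 0 threads, which is the condition the LASSERT rejects.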

      LU-8703 libcfs: use int type for CPT identification
      is what in fact exposed this bug, when the default value for the module parameter cpu_pattern was set to "N".

      We should be able to handle this case in the LND by only creating schedulers for non-empty CPTs, or by not creating an empty CPT in the first place in the libcfs code.
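
      A minimal sketch of the first option (skipping empty CPTs when setting up schedulers) might look as follows. It is a userspace model, not the landed patch: struct model_sched and model_init_scheds() are hypothetical names, and the per-CPT CPU counts are passed in as a plain array instead of being read from the CPT table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scheduler slot, one per CPT. */
struct model_sched {
	int cpt;	/* CPT this scheduler is bound to */
	int nthreads;	/* threads to start; 0 means "not created" */
};

/*
 * Fill scheds[] from per-CPT CPU counts, skipping empty CPTs.
 * Returns the number of schedulers that would actually be started.
 */
static size_t model_init_scheds(const int *ncpus, size_t ncpt,
				int wanted, struct model_sched *scheds)
{
	size_t started = 0;

	assert(ncpus != NULL && scheds != NULL && wanted > 0);

	for (size_t i = 0; i < ncpt; i++) {
		scheds[i].cpt = (int)i;
		scheds[i].nthreads = 0;

		if (ncpus[i] == 0)
			continue;	/* empty CPT: no CPUs to bind threads to */

		scheds[i].nthreads = wanted < ncpus[i] ? wanted : ncpus[i];
		started++;
	}
	return started;
}
```

      Every scheduler that is started in this model has nthreads > 0, so a check equivalent to the LASSERT would hold for each of them. The second option, not creating the empty CPT at all, would instead change how the table itself is built and is not modelled here.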

          Activity

            [LU-9448] Assert on an empty NUMA node
            pjones Peter Jones added a comment -

            Landed for 2.10


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/27145/
            Subject: LU-9448 lnet: handle empty CPTs
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: e711370e13dcbe059e2551aa575c41d62cbcfca9

            gerrit Gerrit Updater added a comment -

            Amir Shehata (amir.shehata@intel.com) uploaded a new patch: https://review.whamcloud.com/27145
            Subject: LU-9448 lnet: handle empty CPTs
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 94307f8eb7791d8d56e55b5db184b2006a59f9fa

            dmiter Dmitry Eremin (Inactive) added a comment - The fix is still under review https://review.whamcloud.com/23222/

            People

              ashehata Amir Shehata (Inactive)
              ashehata Amir Shehata (Inactive)