Lustre / LU-12352

libcfs crashes with certain cpu_npartitions values


    Description

      Due to a bug, libcfs crashes when the number of online CPUs is not evenly divisible by the number of CPU partitions. Based on the checks in cfs_cpt_table_create(), it appears that the original intent was to assign the remaining CPUs to the initial partitions.
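
      The distribution that the checks seem to aim for can be sketched as follows. This is only an illustration of spreading the remainder over the first partitions, not the actual libcfs code or the pending fix; cpus_for_partition() and its parameters are hypothetical names.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of libcfs.
 *
 * Returns how many CPUs partition `cpt` (0-based) receives when `ncpus`
 * online CPUs are split across `nparts` partitions. Each partition gets
 * ncpus / nparts CPUs, and the first (ncpus % nparts) partitions take
 * one extra CPU each, so the counts always add up to ncpus.
 *
 * Returns 0 for invalid input (nparts == 0 or cpt out of range).
 */
static unsigned int cpus_for_partition(unsigned int ncpus,
                                       unsigned int nparts,
                                       unsigned int cpt)
{
    unsigned int base, rem;

    if (nparts == 0 || cpt >= nparts)
        return 0;

    base = ncpus / nparts;   /* CPUs every partition receives */
    rem = ncpus % nparts;    /* leftover CPUs to hand out */

    /* The first `rem` partitions absorb one leftover CPU each. */
    return base + (cpt < rem ? 1 : 0);
}
```

      For example, with 4 online CPUs and cpu_npartitions=3, this sketch would give partition 0 two CPUs and partitions 1 and 2 one CPU each.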

      A simple reproducer for a system whose number of online CPUs is not a multiple of 3:

      insmod libcfs.ko cpu_pattern="" cpu_npartitions=3
      
      [112628.427628] LNetError: 14786:0:(libcfs_cpu.c:770:cfs_cpt_choose_ncpus()) ASSERTION( number > 0 ) failed: 
      [112628.427862] LNetError: 14786:0:(libcfs_cpu.c:770:cfs_cpt_choose_ncpus()) LBUG
      [112628.428073] Pid: 14786, comm: insmod 3.10.0-693.21.1.x3.1.10.x86_64 #1 SMP Wed Nov 14 12:16:53 CST 2018
      [112628.428082] Call Trace:
      [112628.428180]  [<ffffffff8103a212>] save_stack_trace_tsk+0x22/0x40
      [112628.428198]  [<ffffffffc067d7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [112628.428231]  [<ffffffffc067d87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [112628.428261]  [<ffffffffc069137a>] cfs_cpt_choose_ncpus+0x81a/0x820 [libcfs]
      [112628.428294]  [<ffffffffc06915ba>] cfs_cpt_table_create+0x23a/0x8d0 [libcfs]
      [112628.428325]  [<ffffffffc0691d4b>] cfs_cpu_init+0xbb/0xb70 [libcfs]
      [112628.428356]  [<ffffffffc06df031>] libcfs_init+0x31/0x1000 [libcfs]
      [112628.428388]  [<ffffffff810020ea>] do_one_initcall+0xba/0x240
      [112628.428400]  [<ffffffff81104424>] load_module+0x1f84/0x2a10
      [112628.428413]  [<ffffffff81105066>] SyS_finit_module+0xa6/0xd0
      [112628.428423]  [<ffffffff816c1715>] system_call_fastpath+0x1c/0x21
      [112628.428436]  [<ffffffffffffffff>] 0xffffffffffffffff
      [112628.428469] Kernel panic - not syncing: LBUG
      [112628.428572] CPU: 3 PID: 14786 Comm: insmod Tainted: G           OE  ------------   3.10.0-693.21.1.x3.1.10.x86_64 #1
      [112628.428782] Hardware name:                  /D525MWV, BIOS MWPNT10N.86A.0083.2011.0524.1600 05/24/2011
      [112628.428970] Call Trace:
      [112628.429046]  [<ffffffff816ae7c8>] dump_stack+0x19/0x1b
      [112628.429049]  [<ffffffff816a8634>] panic+0xe8/0x21f
      [112628.429049]  [<ffffffffc067d8cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
      [112628.429049]  [<ffffffffc069137a>] cfs_cpt_choose_ncpus+0x81a/0x820 [libcfs]
      [112628.429049]  [<ffffffffc06915ba>] cfs_cpt_table_create+0x23a/0x8d0 [libcfs]
      [112628.429049]  [<ffffffffc06df000>] ? 0xffffffffc06defff
      [112628.429049]  [<ffffffffc0691d4b>] cfs_cpu_init+0xbb/0xb70 [libcfs]
      [112628.429049]  [<ffffffffc06df000>] ? 0xffffffffc06defff
      [112628.429049]  [<ffffffffc06df031>] libcfs_init+0x31/0x1000 [libcfs]
      [112628.429049]  [<ffffffff810020ea>] do_one_initcall+0xba/0x240
      [112628.429049]  [<ffffffff81104424>] load_module+0x1f84/0x2a10
      [112628.429049]  [<ffffffff813523e0>] ? ddebug_proc_write+0xf0/0xf0
      [112628.429049]  [<ffffffff816c514a>] ? ftrace_graph_caller+0x5a/0x85
      [112628.429049]  [<ffffffff81100a83>] ? copy_module_from_fd.isra.42+0x53/0x150
      [112628.429049]  [<ffffffff81105066>] SyS_finit_module+0xa6/0xd0
      [112628.429049]  [<ffffffff816c1715>] system_call_fastpath+0x1c/0x21
      [112628.429049]  [<ffffffff816c1661>] ? system_call_after_swapgs+0xae/0x146
      

      A fix will be uploaded shortly.

          People

            panda Andrew Perepechko