[LU-12299] avoid panic for too large cpu partitions Created: 15/May/19  Updated: 28/Jun/20  Resolved: 29/May/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.13.0, Lustre 2.12.5

Type: Bug Priority: Minor
Reporter: Wang Shilong (Inactive) Assignee: Wang Shilong (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
Related
Severity: 3

 Description   

A panic is still possible when the configured number of CPU partitions is larger than the number of online CPUs:

[ 3574.857103] LNet: Removed LNI 10.0.1.81@tcp
[ 3576.527707] LNet: 28138:0:(linux-cpu.c:677:cfs_cpt_table_create()) CPU partition number 2 is larger than suggested value (1), your system may have performance issue or run out of memory while under pressure
[ 3576.529391] LNetError: 28138:0:(linux-cpu.c:571:cfs_cpt_choose_ncpus()) ASSERTION( number > 0 ) failed:
[ 3576.530270] LNetError: 28138:0:(linux-cpu.c:571:cfs_cpt_choose_ncpus()) LBUG
[ 3576.530907] Pid: 28138, comm: modprobe
[ 3576.531256]
Call Trace:
[ 3576.531628]  [<ffffffffc08d680e>] libcfs_call_trace+0x4e/0x60 [libcfs]
[ 3576.532228]  [<ffffffffc08d6dcc>] lbug_with_loc+0x4c/0xc0 [libcfs]
[ 3576.532792]  [<ffffffffc08da70a>] cfs_cpt_choose_ncpus+0x81a/0x820 [libcfs]
[ 3576.533438]  [<ffffffff811e42b6>] ? kmem_cache_alloc_trace+0x1d6/0x200
[ 3576.534033]  [<ffffffffc08daa36>] cfs_cpu_init+0x2e6/0x12d0 [libcfs]
[ 3576.534613]  [<ffffffffc08e5220>] ? init_libcfs_module+0x0/0x3c0 [libcfs]
[ 3576.535232]  [<ffffffffc08e52cb>] init_libcfs_module+0xab/0x3c0 [libcfs]
[ 3576.535841]  [<ffffffff810020ea>] do_one_initcall+0xba/0x240
[ 3576.536360]  [<ffffffff81104414>] load_module+0x1f84/0x2a10
[ 3576.536865]  [<ffffffff81353070>] ? ddebug_dyndbg_module_param_cb+0x0/0x60
[ 3576.537491]  [<ffffffff81100a73>] ? copy_module_from_fd.isra.42+0x53/0x150
[ 3576.538116]  [<ffffffff81105056>] SyS_finit_module+0xa6/0xd0
[ 3576.538787]  [<ffffffff816c16d5>] system_call_fastpath+0x1c/0x21
[ 3576.539377]  [<ffffffff816c1621>] ? system_call_after_swapgs+0xae/0x146
[ 3576.539961]
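
The log above is consistent with an even split of online CPUs across partitions rounding down to zero, which would trip the `number > 0` assertion. The following user-space sketch illustrates that arithmetic and one possible up-front check. It is not the code from the landed patch: the function names `cpus_per_partition` and `validate_cpt_count` are hypothetical, and the single online CPU is an assumption inferred from the suggested value (1) in the log.

```c
#include <errno.h>

/*
 * Hypothetical helper (not a libcfs function): number of CPUs each
 * partition would receive under an even split. Integer division
 * rounds down, so 1 online CPU split across 2 partitions yields 0.
 */
int cpus_per_partition(int online_cpus, int ncpt)
{
	return online_cpus / ncpt;
}

/*
 * Hypothetical check (an assumption about how such a case could be
 * rejected): return -EINVAL instead of asserting when the requested
 * partition count cannot be satisfied by the online CPUs.
 */
int validate_cpt_count(int online_cpus, int ncpt)
{
	if (online_cpus <= 0 || ncpt <= 0)
		return -EINVAL;
	if (ncpt > online_cpus)
		return -EINVAL;
	return 0;
}
```

With these definitions, `cpus_per_partition(1, 2)` evaluates to 0, the value the assertion rejects, while `validate_cpt_count(1, 2)` returns `-EINVAL` so that a caller could fail module initialization with an error rather than an LBUG.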


 Comments   
Comment by Gerrit Updater [ 15/May/19 ]

Wang Shilong (wshilong@ddn.com) uploaded a new patch: https://review.whamcloud.com/34864
Subject: LU-12299 libcfs: fix panic for too large cpu partions
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 1cf9de787b2c76aac257ef443e0f256b2e8545d4

Comment by Gerrit Updater [ 29/May/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34864/
Subject: LU-12299 libcfs: fix panic for too large cpu partions
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 77771ff24c03a59fc96a7f41199a6b73530a418a

Comment by Peter Jones [ 29/May/19 ]

Landed for 2.13

Comment by Gerrit Updater [ 27/Jan/20 ]

Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37332
Subject: LU-12299 libcfs: fix panic for too large cpu partions
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: 5e382c0b5b0d59de21c43843ef0a91a1b90afd24

Comment by Gerrit Updater [ 06/Apr/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37332/
Subject: LU-12299 libcfs: fix panic for too large cpu partions
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: 77620a096ce75578069af666278a831ad5d0c446
