[LU-5664] assertion in failure handling of LNetNIInit Created: 25/Sep/14 Updated: 19/Feb/15 Resolved: 19/Feb/15 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Liang Zhen (Inactive) | Assignee: | WC Triage |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 15875 |
| Description |
|
I hit this in my testing. The failure handling in LNetNIInit does not look correct: for example, if some NIs were initialised before the failure, those NIs should be finalised before calling lnet_unprepare.

LNetError: 2843:0:(api-ni.c:1505:lnet_startup_lndnis()) Can't load LND tcp, module ksocklnd, rc=256
LNetError: 2843:0:(api-ni.c:823:lnet_unprepare()) ASSERTION( list_empty(&the_lnet.ln_nis) ) failed:
LNetError: 2843:0:(api-ni.c:823:lnet_unprepare()) LBUG
Kernel panic - not syncing: LBUG
Pid: 2843, comm: insmod Tainted: P --------------- 2.6.32.431.lustre #1
Call Trace:
[<ffffffff8152528a>] ? panic+0xa7/0x16f
[<ffffffffa041aeeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
[<ffffffffa04c0d6d>] ? lnet_unprepare+0x2ad/0x320 [lnet]
[<ffffffffa04c4998>] ? LNetNIInit+0x1f8/0x3f0 [lnet]
[<ffffffffa052a06e>] ? srpc_startup+0x5e/0x220 [lnet_selftest]
[<ffffffffa052f585>] ? init_module+0x215/0x500 [lnet_selftest]
[<ffffffffa052f370>] ? init_module+0x0/0x500 [lnet_selftest]
[<ffffffff8100204c>] ? do_one_initcall+0x3c/0x1d0
[<ffffffff810bc511>] ? sys_init_module+0xe1/0x250
[<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b |
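The ordering problem described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every `model_*` name, the fixed-size NI table, and the return codes are hypothetical stand-ins invented for illustration, not the actual LNet code or the patch that fixes it. Where the real `lnet_unprepare` asserts that the NI list is empty, the model returns an error code so the two orderings can be compared without aborting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the Lustre structures. */
#define MODEL_MAX_NIS 8

struct model_net {
    bool   ni_up[MODEL_MAX_NIS]; /* which NIs are currently initialised */
    size_t ni_count;             /* number of initialised NIs */
};

/*
 * Bring up NIs 0..want-1. fail_at simulates an LND that cannot be loaded
 * (pass a value >= want for no failure). Returns 0 on success, -1 on
 * failure. On failure, NIs started earlier are left running, which is the
 * situation the ticket describes.
 */
static int model_startup_nis(struct model_net *net, size_t want, size_t fail_at)
{
    assert(want <= MODEL_MAX_NIS);
    for (size_t i = 0; i < want; i++) {
        if (i == fail_at)
            return -1;
        net->ni_up[i] = true;
        net->ni_count++;
    }
    return 0;
}

/* Finalise every NI that is still up. */
static void model_shutdown_nis(struct model_net *net)
{
    for (size_t i = 0; i < MODEL_MAX_NIS; i++)
        net->ni_up[i] = false;
    net->ni_count = 0;
}

/*
 * Final teardown. Returns -1 in the case where the real code would hit
 * ASSERTION( list_empty(&the_lnet.ln_nis) ), and 0 otherwise.
 */
static int model_unprepare(const struct model_net *net)
{
    return net->ni_count == 0 ? 0 : -1;
}

/*
 * Returns  0 if startup succeeded,
 *         -1 if startup failed and teardown was clean,
 *         -2 if startup failed and teardown found NIs still initialised.
 */
static int model_ni_init(struct model_net *net, size_t want, size_t fail_at,
                         bool finalise_first)
{
    if (model_startup_nis(net, want, fail_at) == 0)
        return 0;
    if (finalise_first)
        model_shutdown_nis(net); /* the step the ticket says is missing */
    return model_unprepare(net) == 0 ? -1 : -2;
}
```

In this model, calling `model_ni_init` with `finalise_first` set to `false` after a partial startup reports the non-empty-list condition, while setting it to `true` lets the teardown complete cleanly.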
| Comments |
| Comment by Amir Shehata (Inactive) [ 25/Sep/14 ] |
|
I believe this is a duplicate of There is already a patch to fix this issue: |
| Comment by Liang Zhen (Inactive) [ 26/Sep/14 ] |
|
thanks Amir! |