Lustre / LU-5664

assertion in failure handling of LNetNIInit


    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • 3
    • 15875

      I hit this in my testing. The failure handling in LNetNIInit does not look correct: if some NIs were already initialised before the failure, those NIs should be finalised before lnet_unprepare is called. Otherwise lnet_unprepare trips its list_empty(&the_lnet.ln_nis) assertion, as in the log below.

      LNetError: 2843:0:(api-ni.c:1505:lnet_startup_lndnis()) Can't load LND tcp, module ksocklnd, rc=256
      LNetError: 2843:0:(api-ni.c:823:lnet_unprepare()) ASSERTION( list_empty(&the_lnet.ln_nis) ) failed: 
      LNetError: 2843:0:(api-ni.c:823:lnet_unprepare()) LBUG
      Kernel panic - not syncing: LBUG
      Pid: 2843, comm: insmod Tainted: P           ---------------    2.6.32.431.lustre #1
      Call Trace:
       [<ffffffff8152528a>] ? panic+0xa7/0x16f
       [<ffffffffa041aeeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
       [<ffffffffa04c0d6d>] ? lnet_unprepare+0x2ad/0x320 [lnet]
       [<ffffffffa04c4998>] ? LNetNIInit+0x1f8/0x3f0 [lnet]
       [<ffffffffa052a06e>] ? srpc_startup+0x5e/0x220 [lnet_selftest]
       [<ffffffffa052f585>] ? init_module+0x215/0x500 [lnet_selftest]
       [<ffffffffa052f370>] ? init_module+0x0/0x500 [lnet_selftest]
       [<ffffffff8100204c>] ? do_one_initcall+0x3c/0x1d0
       [<ffffffff810bc511>] ? sys_init_module+0xe1/0x250
       [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
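
The unwinding order suggested above can be illustrated with a small user-space model. This is only a sketch, not the Lustre code: every name in it (toy_net, toy_startup_nis, toy_shutdown_nis, toy_unprepare, toy_ni_init) is hypothetical, and a plain counter stands in for the the_lnet.ln_nis list. It assumes the fix is simply to finalise any NIs that were already started before running the check that lnet_unprepare performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the LNet API. */
#define MAX_NIS 4

struct toy_net {
    bool ni_up[MAX_NIS];   /* stands in for the list the_lnet.ln_nis */
    int  ni_count;         /* number of NIs currently started */
};

/* Start NIs in order; the NI at index fail_at (if any) fails to start. */
static int toy_startup_nis(struct toy_net *net, int wanted, int fail_at)
{
    if (wanted > MAX_NIS)
        return -1;
    for (int i = 0; i < wanted; i++) {
        if (i == fail_at)
            return -1;     /* e.g. the LND module could not be loaded */
        net->ni_up[i] = true;
        net->ni_count++;
    }
    return 0;
}

/* Finalise every NI that was started, leaving the model list empty. */
static void toy_shutdown_nis(struct toy_net *net)
{
    for (int i = 0; i < MAX_NIS; i++) {
        if (net->ni_up[i]) {
            net->ni_up[i] = false;
            net->ni_count--;
        }
    }
}

/* Models the precondition asserted in lnet_unprepare: no NIs remain. */
static void toy_unprepare(struct toy_net *net)
{
    assert(net->ni_count == 0);
}

/* Init path using the unwinding order proposed in the report. */
static int toy_ni_init(struct toy_net *net, int wanted, int fail_at)
{
    int rc = toy_startup_nis(net, wanted, fail_at);

    if (rc != 0) {
        toy_shutdown_nis(net);   /* finalise partially started NIs first */
        toy_unprepare(net);      /* the emptiness check now holds */
        return rc;
    }
    return 0;
}
```

Calling toy_ni_init(&net, 3, 2) on a zero-initialised toy_net starts two NIs, fails on the third, and returns -1 with no NIs left behind. Dropping the toy_shutdown_nis call from the error path makes the assert in toy_unprepare fire, which is the analogue of the LBUG in the log.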
      

            Assignee: WC Triage (wc-triage)
            Reporter: Liang Zhen (liang, Inactive)
            Votes: 0
            Watchers: 2
