assertion in failure handling of LNetNIInit

Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • None
    • None
    • None
    • 3
    • 15875

    Description

      I hit this in my testing. The failure handling of LNetNIInit does not appear to be correct: if some NIs have already been initialised before the failure, those NIs should be finalised before lnet_unprepare is called. Otherwise lnet_unprepare asserts that the NI list is empty and triggers an LBUG:

      LNetError: 2843:0:(api-ni.c:1505:lnet_startup_lndnis()) Can't load LND tcp, module ksocklnd, rc=256
      LNetError: 2843:0:(api-ni.c:823:lnet_unprepare()) ASSERTION( list_empty(&the_lnet.ln_nis) ) failed: 
      LNetError: 2843:0:(api-ni.c:823:lnet_unprepare()) LBUG
      Kernel panic - not syncing: LBUG
      Pid: 2843, comm: insmod Tainted: P           ---------------    2.6.32.431.lustre #1
      Call Trace:
       [<ffffffff8152528a>] ? panic+0xa7/0x16f
       [<ffffffffa041aeeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
       [<ffffffffa04c0d6d>] ? lnet_unprepare+0x2ad/0x320 [lnet]
       [<ffffffffa04c4998>] ? LNetNIInit+0x1f8/0x3f0 [lnet]
       [<ffffffffa052a06e>] ? srpc_startup+0x5e/0x220 [lnet_selftest]
       [<ffffffffa052f585>] ? init_module+0x215/0x500 [lnet_selftest]
       [<ffffffffa052f370>] ? init_module+0x0/0x500 [lnet_selftest]
       [<ffffffff8100204c>] ? do_one_initcall+0x3c/0x1d0
       [<ffffffff810bc511>] ? sys_init_module+0xe1/0x250
       [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
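
      The cleanup ordering described above can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions and is not the LNet code or the patch referenced in this ticket: every name in it (toy_net, toy_prepare, toy_startup_nis, toy_shutdown_nis, toy_unprepare, toy_init) is hypothetical, and the assert merely stands in for the list_empty(&the_lnet.ln_nis) assertion shown in the log.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the real LNet symbols. */
#define MAX_NIS 4

struct toy_net {
    int ni_up[MAX_NIS]; /* 1 if NI i has been started */
    int ni_count;       /* number of NIs currently started */
    int prepared;       /* 1 between toy_prepare() and toy_unprepare() */
};

static void toy_prepare(struct toy_net *net)
{
    for (int i = 0; i < MAX_NIS; i++)
        net->ni_up[i] = 0;
    net->ni_count = 0;
    net->prepared = 1;
}

/* Stand-in for the invariant in the report: no NI may remain at this point. */
static void toy_unprepare(struct toy_net *net)
{
    assert(net->ni_count == 0);
    net->prepared = 0;
}

/* Finalise every NI that was started, in reverse order. */
static void toy_shutdown_nis(struct toy_net *net)
{
    for (int i = MAX_NIS - 1; i >= 0; i--) {
        if (net->ni_up[i]) {
            net->ni_up[i] = 0;
            net->ni_count--;
        }
    }
}

/*
 * Start NIs 0..n-1. NI number fail_at fails to start (pass -1 for no
 * failure). On failure the NIs started so far are left in place, which
 * is the situation the caller has to clean up.
 */
static int toy_startup_nis(struct toy_net *net, int n, int fail_at)
{
    for (int i = 0; i < n; i++) {
        if (i == fail_at)
            return -1;
        net->ni_up[i] = 1;
        net->ni_count++;
    }
    return 0;
}

/* Init path with the ordering suggested in the description. */
static int toy_init(struct toy_net *net, int n, int fail_at)
{
    if (n < 0 || n > MAX_NIS)
        return -1;

    toy_prepare(net);
    int rc = toy_startup_nis(net, n, fail_at);
    if (rc != 0) {
        toy_shutdown_nis(net); /* finalise the NIs that did start */
        toy_unprepare(net);    /* now the "no NIs left" check holds */
        return rc;
    }
    return 0;
}
```

      In this model, removing the toy_shutdown_nis() call from the error path of toy_init() makes the assert in toy_unprepare() fire whenever at least one NI was started before the failure, which is analogous to the LBUG above.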
      

          Activity

            [LU-5664] assertion in failure handling of LNetNIInit
            jlevi Jodi Levi (Inactive) made changes -
            Resolution New: Duplicate [ 3 ]
            Status Original: Reopened [ 4 ] New: Closed [ 6 ]
            jlevi Jodi Levi (Inactive) made changes -
            Fix Version/s Original: Lustre 2.7.0 [ 10631 ]
            Fix Version/s Original: Lustre 2.5.4 [ 11190 ]
            jlevi Jodi Levi (Inactive) made changes -
            Resolution Original: Duplicate [ 3 ]
            Status Original: Closed [ 6 ] New: Reopened [ 4 ]
            jlevi Jodi Levi (Inactive) made changes -
            Resolution New: Duplicate [ 3 ]
            Status Original: Open [ 1 ] New: Closed [ 6 ]
            jlevi Jodi Levi (Inactive) made changes -
            Link New: This issue is duplicated by LU-5568 [ LU-5568 ]

            liang Liang Zhen (Inactive) added a comment - thanks Amir!

            ashehata Amir Shehata (Inactive) added a comment - I believe this is a duplicate of LU-5568. There is already a patch to fix this issue: http://review.whamcloud.com/#/c/11718/
            liang Liang Zhen (Inactive) created issue -

            People

              wc-triage WC Triage
              liang Liang Zhen (Inactive)