
trivial typo on lnetctl command line generates LBUG on lustre client

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.11.0
    • Affects Version/s: Lustre 2.10.2
    • Labels: None
    • Environment: lustre client 10.2.1
      OS: slf6.8
      kernel: 2.6.32-642.15.1.el6.x86_64

      Custom rpm rebuild:
      rpmbuild --rebuild --without servers --with lnet-dlc --with lustre-utils ./lustre-2.10.2-1.src.rpm
    • Severity: 3

    Description

      I was trying out lnetctl commands. When I specified the interface as ib0, the client node crashed with an LBUG.
      I think lnetctl should simply report the error rather than trigger an LBUG.

      [root@tev0509 rpmbuild]# lnetctl net show
      net:
          - net type: lo
            local NI(s):
              - nid: 0@lo
                status: up
          - net type: o2ib
            local NI(s):
              - nid: 192.168.176.72@o2ib
                status: up
                interfaces:
                    0: ib0
      # lnetctl lnet unconfigure
      # lnetctl lnet configure
      # lnetctl net show
      net:
          - net type: lo
            local NI(s):
              - nid: 0@lo
                status: up
      # lnetctl net add --if ib0

      Message from syslogd@tev0509 at Jan 23 15:30:15 ...
      kernel:LNetError: 30614:0:(api-ni.c:1499:lnet_startup_lndnet()) ASSERTION( libcfs_isknown_lnd(lnd_type) ) failed:

      Message from syslogd@tev0509 at Jan 23 15:30:15 ...
      kernel:LNetError: 30614:0:(api-ni.c:1499:lnet_startup_lndnet()) LBUG

      ==========
      kernel.log :

      2018-01-23 15:30:15 Pid: 30614, comm: lnetctl
      2018-01-23 15:30:15
      2018-01-23 15:30:15 Call Trace:
      2018-01-23 15:30:15 [<ffffffffa0a33885>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      2018-01-23 15:30:15 [<ffffffffa0a339cf>] lbug_with_loc+0x3f/0x90 [libcfs]
      2018-01-23 15:30:15 [<ffffffffa0a95b38>] lnet_startup_lndnet+0x8b8/0x8c0 [lnet]
      2018-01-23 15:30:15 [<ffffffffa0a4a65b>] ? cfs_percpt_lock+0x5b/0x110 [libcfs]
      2018-01-23 15:30:16 [<ffffffffa0a96cf4>] lnet_add_net_common+0x134/0x480 [lnet]
      2018-01-23 15:30:16 [<ffffffffa0a97354>] lnet_dyn_add_ni+0x194/0x1c0 [lnet]
      2018-01-23 15:30:16 [<ffffffffa0a3f311>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      2018-01-23 15:30:16 [<ffffffffa0ab1668>] lnet_ioctl+0x268/0x290 [lnet]
      2018-01-23 15:30:16 [<ffffffffa0a3d2b8>] libcfs_ioctl+0x118/0x4d0 [libcfs]
      2018-01-23 15:30:16 [<ffffffffa0a39231>] libcfs_psdev_ioctl+0x51/0x100 [libcfs]
      2018-01-23 15:30:16 [<ffffffff811af742>] vfs_ioctl+0x22/0xa0
      2018-01-23 15:30:16 [<ffffffff811af8e4>] do_vfs_ioctl+0x84/0x580
      2018-01-23 15:30:16 [<ffffffff811a7bc6>] ? final_putname+0x26/0x50
      2018-01-23 15:30:16 [<ffffffff811afe61>] sys_ioctl+0x81/0xa0
      2018-01-23 15:30:16 [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
      2018-01-23 15:30:16
      2018-01-23 15:30:16 Kernel panic - not syncing: LBUG
      2018-01-23 15:30:16 Pid: 30614, comm: lnetctl Tainted: P -- ------------ 2.6.32-642.15.1.el6.x86_64 #1
      2018-01-23 15:30:16 Call Trace:
      2018-01-23 15:30:16 [<ffffffff815484e1>] ? panic+0xa7/0x179
      2018-01-23 15:30:16 [<ffffffffa0a339e6>] ? lbug_with_loc+0x56/0x90 [libcfs]
      2018-01-23 15:30:16 [<ffffffffa0a95b38>] ? lnet_startup_lndnet+0x8b8/0x8c0 [lnet]
      2018-01-23 15:30:16 [<ffffffffa0a4a65b>] ? cfs_percpt_lock+0x5b/0x110 [libcfs]
      2018-01-23 15:30:16 [<ffffffffa0a96cf4>] ? lnet_add_net_common+0x134/0x480 [lnet]
      2018-01-23 15:30:16 [<ffffffffa0a97354>] ? lnet_dyn_add_ni+0x194/0x1c0 [lnet]
      2018-01-23 15:30:16 [<ffffffffa0a3f311>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
      2018-01-23 15:30:16 [<ffffffffa0ab1668>] ? lnet_ioctl+0x268/0x290 [lnet]
      2018-01-23 15:30:16 [<ffffffffa0a3d2b8>] ? libcfs_ioctl+0x118/0x4d0 [libcfs]
      2018-01-23 15:30:16 [<ffffffffa0a39231>] ? libcfs_psdev_ioctl+0x51/0x100 [libcfs]
      2018-01-23 15:30:16 [<ffffffff811af742>] ? vfs_ioctl+0x22/0xa0
      2018-01-23 15:30:16 [<ffffffff811af8e4>] ? do_vfs_ioctl+0x84/0x580
      2018-01-23 15:30:16 [<ffffffff811a7bc6>] ? final_putname+0x26/0x50
      2018-01-23 15:30:16 [<ffffffff811afe61>] ? sys_ioctl+0x81/0xa0
      2018-01-23 15:30:16 [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
      2018-01-23 15:30:16 -----------[ cut here ]-----------
      2018-01-23 15:30:16 WARNING: at arch/x86/kernel/smp.c:118

          Activity

            [LU-10554] trivial typo on lnetctl command line generates LBUG on lustre client
            pjones Peter Jones added a comment -

            Landed for 2.11

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31100/
            Subject: LU-10554 lnet: Remove LASSERT on userspace data
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 8059dbbe97a61e287efe0ae9d1f7767d362aa2d7

            sharmaso Sonia Sharma (Inactive) added a comment - - edited

            This issue is already resolved by LU-10151, which adds a check for incomplete user data and returns an error when information is missing while adding an NI.
            With the patch for this ticket (https://review.whamcloud.com/31100), the LASSERT on data from userspace is also removed, so the missing information is checked for and handled gracefully in the kernel as well.
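
            The two layers described in this comment can be sketched roughly as follows. This is a minimal, self-contained C illustration, not the code from LU-10151 or from patch 31100: `struct ni_request`, `tool_check_request()`, `handler_add_ni()`, and the `known_types` table are all hypothetical names, and the exact-match lookup is a simplification.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request: which network to add and on which interface. */
struct ni_request {
	const char *net;   /* e.g. "o2ib"; NULL when no network was given */
	const char *intf;  /* e.g. "ib0" */
};

/* Hypothetical table of known network types (illustration only). */
static const char *const known_types[] = { "lo", "tcp", "o2ib" };

static int type_is_known(const char *net)
{
	size_t i;

	for (i = 0; i < sizeof(known_types) / sizeof(known_types[0]); i++) {
		if (strcmp(net, known_types[i]) == 0)
			return 1;
	}
	return 0;
}

/* Layer 1: the command-line tool rejects incomplete input up front. */
static int tool_check_request(const struct ni_request *req)
{
	if (req->net == NULL || req->net[0] == '\0')
		return -EINVAL;
	if (req->intf == NULL || req->intf[0] == '\0')
		return -EINVAL;
	return 0;
}

/* Layer 2: the handler re-checks, because it cannot assume the tool did. */
static int handler_add_ni(const struct ni_request *req)
{
	if (req->net == NULL || !type_is_known(req->net))
		return -EINVAL;   /* report an error instead of asserting */
	return 0;
}
```

            With both checks in place, a request that names only an interface fails with an error code at either layer, and the second check still protects the handler when a caller skips the first.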


            gerrit Gerrit Updater added a comment -

            Sonia Sharma (sonia.sharma@intel.com) uploaded a new patch: https://review.whamcloud.com/31100
            Subject: LU-10554 lnet: Remove LASSERT on userspace data
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 8ad3ae3c99cb600f6542387990809f79b58bbf85

            adilger Andreas Dilger added a comment -

            We should never LASSERT on data from userspace, sent over the network, or read from disk.
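
            As a rough illustration of that rule, the sketch below separates externally supplied data, which is validated and rejected with an error code, from an internal invariant, where an assertion is reasonable. It is plain userspace C under stated assumptions: `struct raw_record`, `record_validate()`, `record_checksum()`, `record_process()`, and the field limits are hypothetical and do not come from the Lustre source.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define REC_MAX_PAYLOAD 64u

/* Hypothetical record as it arrives from outside (user, wire, or disk). */
struct raw_record {
	uint32_t type;                      /* expected range: 1..3 */
	uint32_t len;                       /* payload bytes actually used */
	uint8_t  payload[REC_MAX_PAYLOAD];
};

/* Validate untrusted fields: bad input yields an error code, not a crash. */
static int record_validate(const struct raw_record *rec)
{
	if (rec == NULL)
		return -EINVAL;
	if (rec->type < 1 || rec->type > 3)
		return -EINVAL;
	if (rec->len > REC_MAX_PAYLOAD)
		return -EINVAL;
	return 0;
}

/* Internal helper: callers must validate first, so an assert is fair here. */
static uint32_t record_checksum(const struct raw_record *rec)
{
	uint32_t sum = 0;
	uint32_t i;

	assert(rec->len <= REC_MAX_PAYLOAD);  /* invariant of our own code */
	for (i = 0; i < rec->len; i++)
		sum += rec->payload[i];
	return sum;
}

/* Entry point for external data: validate first, then use. */
static int record_process(const struct raw_record *rec, uint32_t *sum_out)
{
	int rc = record_validate(rec);

	if (rc != 0)
		return rc;
	*sum_out = record_checksum(rec);
	return 0;
}
```

            The assertion in `record_checksum()` can only fire through a programming error in the caller, never through malformed input, because `record_process()` rejects such input before the helper runs.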

            People

              Assignee: sharmaso Sonia Sharma (Inactive)
              Reporter: alex.ku Alex Kulyavtsev
              Votes: 0
              Watchers: 9
