[LU-3093] lstcon_dstnodes_prep() ASSERTION( grp->grp_nnode >= 1 ) Created: 03/Apr/13  Updated: 04/Sep/13  Resolved: 04/Sep/13

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.4
Fix Version/s: Lustre 2.4.1, Lustre 2.5.0

Type: Bug Priority: Minor
Reporter: Ned Bass Assignee: Amir Shehata (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

https://github.com/chaos/lustre/commits/2.1.4-3chaos


Severity: 3
Rank (Obsolete): 7513

 Description   

An lst session node crashed when I tried to run an lnet selftest.

Backtrace from crash:

crash> bt
PID: 2961   TASK: ffff880bed544ae0  CPU: 8   COMMAND: "lst"
 #0 [ffff8808edeff980] machine_kexec at ffffffff81035b6b
 #1 [ffff8808edeff9e0] crash_kexec at ffffffff810c08d2
 #2 [ffff8808edeffab0] panic at ffffffff8150d3f3
 #3 [ffff8808edeffb30] lbug_with_loc at ffffffffa02c1e4b [libcfs]
 #4 [ffff8808edeffb50] lstcon_dstnodes_prep at ffffffffa059cc36 [lnet_selftest]
 #5 [ffff8808edeffbc0] lstcon_testrpc_prep at ffffffffa059eb9a [lnet_selftest]
 #6 [ffff8808edeffc20] lstcon_rpc_trans_ndlist at ffffffffa059f30f [lnet_selftest]
 #7 [ffff8808edeffc90] lstcon_test_add at ffffffffa059b2ce [lnet_selftest]
 #8 [ffff8808edeffd10] lst_test_add_ioctl at ffffffffa05a03e7 [lnet_selftest]
 #9 [ffff8808edeffda0] lstcon_ioctl_entry at ffffffffa05a3f95 [lnet_selftest]
#10 [ffff8808edeffdd0] libcfs_ioctl at ffffffffa02cac84 [libcfs]
#11 [ffff8808edeffe10] libcfs_ioctl at ffffffffa02c5ac4 [libcfs]
#12 [ffff8808edeffe60] vfs_ioctl at ffffffff81194acc
#13 [ffff8808edeffea0] do_vfs_ioctl at ffffffff81194c14
#14 [ffff8808edefff30] sys_ioctl at ffffffff81195191
#15 [ffff8808edefff80] system_call_fastpath at ffffffff8100b072
    RIP: 00002aaaaadada47  RSP: 00007fffffffe1b8  RFLAGS: 00010206
    RAX: 0000000000000010  RBX: ffffffff8100b072  RCX: 00002aaaaae1dac0
    RDX: 00007fffffffe0a0  RSI: 00000000c008653f  RDI: 0000000000000003
    RBP: 0000000000000000   R8: 0000000000000c26   R9: 0000000000000068
    R10: 00007fffffffdf40  R11: 0000000000000246  R12: 00000000c008653f
    R13: 00007fffffffe0a0  R14: 000000000061c2e0  R15: 0000000000000003
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

Console panic message:

LustreError: 2961:0:(conrpc.c:738:lstcon_dstnodes_prep()) ASSERTION( grp->grp_nnode >= 1 ) failed: 
LustreError: 2961:0:(conrpc.c:738:lstcon_dstnodes_prep()) LBUG
Pid: 2961, comm: lst

Call Trace:
 [<ffffffffa02c17e5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa02c1df7>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa059cc36>] lstcon_dstnodes_prep+0x236/0x280 [lnet_selftest]
 [<ffffffff8116041a>] ? alloc_pages_current+0xaa/0x110
 [<ffffffffa059eb9a>] lstcon_testrpc_prep+0xfa/0x310 [lnet_selftest]
 [<ffffffffa0596a40>] ? lstcon_testrpc_condition+0x0/0x1c0 [lnet_selftest]
 [<ffffffffa059f30f>] lstcon_rpc_trans_ndlist+0x24f/0x300 [lnet_selftest]
 [<ffffffffa059b2ce>] lstcon_test_add+0x4ce/0x930 [lnet_selftest]
 [<ffffffffa05a03e7>] lst_test_add_ioctl+0xaf7/0xc20 [lnet_selftest]
 [<ffffffffa05a3f95>] lstcon_ioctl_entry+0x505/0x5f0 [lnet_selftest]
 [<ffffffffa02cac84>] libcfs_ioctl+0x354/0x900 [libcfs]
 [<ffffffffa02c5ac4>] libcfs_ioctl+0x84/0x180 [libcfs]
 [<ffffffff81194acc>] vfs_ioctl+0x7c/0xa0
 [<ffffffff81194c14>] do_vfs_ioctl+0x84/0x580
 [<ffffffff81195191>] sys_ioctl+0x81/0xa0
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

Kernel panic - not syncing: LBUG
Pid: 2961, comm: lst Tainted: G        W  ---------------    2.6.32-358.5chaos.ch5.1.x86_64 #1
Call Trace:
 [<ffffffff8150d3ec>] ? panic+0xa7/0x16f
 [<ffffffffa02c1e4b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
 [<ffffffffa059cc36>] ? lstcon_dstnodes_prep+0x236/0x280 [lnet_selftest]
 [<ffffffff8116041a>] ? alloc_pages_current+0xaa/0x110
 [<ffffffffa059eb9a>] ? lstcon_testrpc_prep+0xfa/0x310 [lnet_selftest]
 [<ffffffffa0596a40>] ? lstcon_testrpc_condition+0x0/0x1c0 [lnet_selftest]
 [<ffffffffa059f30f>] ? lstcon_rpc_trans_ndlist+0x24f/0x300 [lnet_selftest]
 [<ffffffffa059b2ce>] ? lstcon_test_add+0x4ce/0x930 [lnet_selftest]
 [<ffffffffa05a03e7>] ? lst_test_add_ioctl+0xaf7/0xc20 [lnet_selftest]
 [<ffffffffa05a3f95>] ? lstcon_ioctl_entry+0x505/0x5f0 [lnet_selftest]
 [<ffffffffa02cac84>] ? libcfs_ioctl+0x354/0x900 [libcfs]
 [<ffffffffa02c5ac4>] ? libcfs_ioctl+0x84/0x180 [libcfs]
 [<ffffffff81194acc>] ? vfs_ioctl+0x7c/0xa0
 [<ffffffff81194c14>] ? do_vfs_ioctl+0x84/0x580
 [<ffffffff81195191>] ? sys_ioctl+0x81/0xa0
 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b

lst script (note the server's NID was misspelled)

#!/bin/bash
export LST_SESSION=1234
lst new_session read/write
lst add_group servers 172.19.1.101@owib100
lst add_group readers 172.16.66.53@tcp
lst add_batch bulk_rw
lst add_test --batch bulk_rw --from readers --to servers     brw read check=simple size=1M
lst run bulk_rw
# display server stats for 30 seconds
lst stat servers & sleep 30; kill $!
# tear down
lst end_session
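The misspelled network name in the script above ("owib100" in the servers group) is what left that group without any usable node. A user-space tool could catch this before the kernel module ever sees the group; the sketch below is a hypothetical client-side NID sanity check, not part of lst — the set of known network types here is an assumption for illustration only.

```python
# Hypothetical client-side sanity check for LNet NID strings ("addr@net").
# KNOWN_NET_TYPES is an illustrative assumption, not the authoritative list.
KNOWN_NET_TYPES = {"tcp", "o2ib", "lo", "gni"}

def nid_looks_valid(nid):
    """Return True if the NID's network type is recognized."""
    if "@" not in nid:
        return False
    _addr, _, net = nid.partition("@")
    # Strip the trailing network number, e.g. "o2ib100" -> "o2ib".
    net_type = net.rstrip("0123456789")
    return net_type in KNOWN_NET_TYPES

print(nid_looks_valid("172.16.66.53@tcp"))      # True
print(nid_looks_valid("172.19.1.101@owib100"))  # False: "owib" is not a known type
```

With a check like this, the misspelled NID would be rejected at add_group time instead of producing an empty group that later trips the kernel assertion.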


 Comments   
Comment by Peter Jones [ 03/Apr/13 ]

Amir

Could you please look into this one?

Thanks

Peter

Comment by Amir Shehata (Inactive) [ 05/Apr/13 ]

In order to save time, I think the best approach is to agree first on the proposed solution with all interested parties, before going ahead with the implementation.

As Ned pointed out in his comment, and I agree, the fix doesn't resolve the vulnerability in the kernel. I was aware of that, but I was under the false assumption that the fix should be kept as simple as possible.

Since it's not appropriate to assert on user input, the proper fix is to remove the assert completely; the approach is outlined below.

The APIs in question, and their current behavior, are:
===============================================================
Add group
---------
. An empty group is added

Add nodes to a group
---------------------
. A node is added to the group after it has been RPCed
. If the RPC is successful, the node is moved to ACTIVE state
. If the RPC is successful but the NID in the response is LNET_NID_ANY, then the state is set to UNKNOWN
. If the RPC is successful but the reply session doesn't match the console session, then the state is set to BUSY

Furthermore:
. A node is created in UNKNOWN state and stays in UNKNOWN state until it's RPCed
. A node is set to DOWN when there is no response to an RPC message (this is currently buggy: it is set to UNKNOWN instead, but I'll fix it)
. A batch is only run if the nodes are in ACTIVE state
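The state transitions listed above can be summarized as a small decision function. This is an illustrative model of the described behavior only; the function and the LNET_NID_ANY placeholder below are hypothetical and do not correspond to actual lnet_selftest symbols.

```python
# Illustrative model of the node state transitions described above.
LNET_NID_ANY = -1  # placeholder sentinel for "no usable NID in the reply"

def next_state(rpc_ok, reply_nid=0, session_match=True):
    """Compute a node's state from the outcome of its setup RPC."""
    if not rpc_ok:
        return "DOWN"       # no response to the RPC message
    if reply_nid == LNET_NID_ANY:
        return "UNKNOWN"    # reply carried no usable NID
    if not session_match:
        return "BUSY"       # reply session doesn't match the console session
    return "ACTIVE"         # usable; batches only run on ACTIVE nodes

print(next_state(rpc_ok=True))                       # ACTIVE
print(next_state(rpc_ok=False))                      # DOWN
print(next_state(rpc_ok=True, session_match=False))  # BUSY
```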

Update the group
----------------
refresh
clean <[active|busy|down|unknown|invalid]>
. If all nodes in any of the above states are cleaned and the group now
has 0 nodes, then the group itself is implicitly deleted (done in the kernel).
remove <NID>
. If the last NID is removed (i.e. no more nodes in the group), then the group is deleted (done in the kernel).

The problematic cases are:
==========================
1. A group is added but no valid NIDs are added to it (i.e. the group is empty)
2. A group is added but no reachable NIDs are added to it (i.e. the group has no ACTIVE nodes)

General Design Guidelines:
==========================
The selftest kernel module provides the following:
. A set of constructs which can be used to create an LNet test (e.g. group, node, batch)
. The selftest kernel module ensures that use of these constructs adheres to its internal rules (e.g. only send to reachable NIDs)
. The application (user space) can instantiate these constructs and create test cases
. The kernel module shall guard against misuse of these constructs.

The proposed solution is:
=========================
The selftest kernel module would check the source and destination groups when a test is added to a batch (with the add_test command) to ensure that the groups have at least one node in ACTIVE state. If not, then the test is not added to the batch.

The groups remain (possibly empty) and can be updated at a later time.

Kernel Selftest Behavior:
=========================
I propose the following modifications to the selftest kernel module behavior to make it consistent:
1. Groups are added independently (already done)
2. Nodes are created and added to the group (already done)
3. Nodes can be removed from the group and the group can remain empty
4. Groups referenced in a test which is to be added to a batch must have at least one ACTIVE node; otherwise the add_test command fails
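Point 4 above amounts to a precondition check at add_test time rather than an assertion deep inside RPC preparation. The sketch below models that check; it is an illustrative sketch of the proposed behavior, not the actual kernel implementation, and the data layout is assumed.

```python
# Sketch of the proposed add_test precondition (point 4 above): both the
# source and destination groups must contain at least one ACTIVE node.
# Illustrative model only; not lnet_selftest code.
def group_has_active(group):
    return any(node["state"] == "ACTIVE" for node in group)

def can_add_test(src_group, dst_group):
    """Refuse the test up front instead of asserting later on an idle group."""
    return group_has_active(src_group) and group_has_active(dst_group)

readers = [{"nid": "172.16.66.53@tcp", "state": "ACTIVE"}]
servers = []  # the misspelled NID left this group without any node
print(can_add_test(readers, servers))  # False: add_test should fail cleanly
```

With this check in place, the failing add_test returns an error to the user while the (possibly empty) groups remain and can be updated later, as proposed above.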

Justification:
==============
Groups are container entities that can exist independently of nodes; i.e. they can be created and deleted explicitly and independently of other entities. However, nodes can only exist within a group; i.e. you cannot create a node independently and add it to a group later. Thus, to make the interface more consistent, a group should not be deleted implicitly.

This gives a user-space application the flexibility to create reusable groups, adding and removing nodes without having to worry about the group being implicitly deleted.

LST Utility behavior
====================
The lst utility can present its own behavioral model. In this case, the lst utility will not allow a group to exist without at least one node in ACTIVE state.

Other user-space utilities can be implemented differently, but the kernel selftest module will validate the passed-in parameters before carrying out a test case, thus avoiding an assertion that brings down the entire OS.

Is this solution acceptable?

Comment by Ned Bass [ 05/Apr/13 ]

Amir, thanks for the detailed analysis. This solution looks acceptable to me.

Comment by Amir Shehata (Inactive) [ 15/Apr/13 ]

We reached a consensus: make sure the vulnerability in the kernel module is addressed, and also ensure that the lst utility catches this case in its own checks.

The fix will be submitted for review soon. My hard drive crashed, causing a delay in resolving this issue.

Comment by Amir Shehata (Inactive) [ 18/Apr/13 ]

Fix is ready for review:
http://review.whamcloud.com/#change,6092,patchset=2

Comment by Amir Shehata (Inactive) [ 23/Apr/13 ]

Just a reminder that the fix is ready for review: http://review.whamcloud.com/#change,6092

Comment by Bob Glossman (Inactive) [ 08/Aug/13 ]

back port to b2_4: http://review.whamcloud.com/7277

Generated at Sat Feb 10 01:30:54 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.