Details
Bug
Resolution: Fixed
Minor
Lustre 2.1.4
None
3
7513
Description
An lst session node crashed when I tried to run an lnet selftest.
Backtrace from crash:
crash> bt
PID: 2961  TASK: ffff880bed544ae0  CPU: 8  COMMAND: "lst"
 #0 [ffff8808edeff980] machine_kexec at ffffffff81035b6b
 #1 [ffff8808edeff9e0] crash_kexec at ffffffff810c08d2
 #2 [ffff8808edeffab0] panic at ffffffff8150d3f3
 #3 [ffff8808edeffb30] lbug_with_loc at ffffffffa02c1e4b [libcfs]
 #4 [ffff8808edeffb50] lstcon_dstnodes_prep at ffffffffa059cc36 [lnet_selftest]
 #5 [ffff8808edeffbc0] lstcon_testrpc_prep at ffffffffa059eb9a [lnet_selftest]
 #6 [ffff8808edeffc20] lstcon_rpc_trans_ndlist at ffffffffa059f30f [lnet_selftest]
 #7 [ffff8808edeffc90] lstcon_test_add at ffffffffa059b2ce [lnet_selftest]
 #8 [ffff8808edeffd10] lst_test_add_ioctl at ffffffffa05a03e7 [lnet_selftest]
 #9 [ffff8808edeffda0] lstcon_ioctl_entry at ffffffffa05a3f95 [lnet_selftest]
#10 [ffff8808edeffdd0] libcfs_ioctl at ffffffffa02cac84 [libcfs]
#11 [ffff8808edeffe10] libcfs_ioctl at ffffffffa02c5ac4 [libcfs]
#12 [ffff8808edeffe60] vfs_ioctl at ffffffff81194acc
#13 [ffff8808edeffea0] do_vfs_ioctl at ffffffff81194c14
#14 [ffff8808edefff30] sys_ioctl at ffffffff81195191
#15 [ffff8808edefff80] system_call_fastpath at ffffffff8100b072
    RIP: 00002aaaaadada47  RSP: 00007fffffffe1b8  RFLAGS: 00010206
    RAX: 0000000000000010  RBX: ffffffff8100b072  RCX: 00002aaaaae1dac0
    RDX: 00007fffffffe0a0  RSI: 00000000c008653f  RDI: 0000000000000003
    RBP: 0000000000000000  R8:  0000000000000c26  R9:  0000000000000068
    R10: 00007fffffffdf40  R11: 0000000000000246  R12: 00000000c008653f
    R13: 00007fffffffe0a0  R14: 000000000061c2e0  R15: 0000000000000003
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b
Console panic message:
LustreError: 2961:0:(conrpc.c:738:lstcon_dstnodes_prep()) ASSERTION( grp->grp_nnode >= 1 ) failed:
LustreError: 2961:0:(conrpc.c:738:lstcon_dstnodes_prep()) LBUG
Pid: 2961, comm: lst
Call Trace:
 [<ffffffffa02c17e5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa02c1df7>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa059cc36>] lstcon_dstnodes_prep+0x236/0x280 [lnet_selftest]
 [<ffffffff8116041a>] ? alloc_pages_current+0xaa/0x110
 [<ffffffffa059eb9a>] lstcon_testrpc_prep+0xfa/0x310 [lnet_selftest]
 [<ffffffffa0596a40>] ? lstcon_testrpc_condition+0x0/0x1c0 [lnet_selftest]
 [<ffffffffa059f30f>] lstcon_rpc_trans_ndlist+0x24f/0x300 [lnet_selftest]
 [<ffffffffa059b2ce>] lstcon_test_add+0x4ce/0x930 [lnet_selftest]
 [<ffffffffa05a03e7>] lst_test_add_ioctl+0xaf7/0xc20 [lnet_selftest]
 [<ffffffffa05a3f95>] lstcon_ioctl_entry+0x505/0x5f0 [lnet_selftest]
 [<ffffffffa02cac84>] libcfs_ioctl+0x354/0x900 [libcfs]
 [<ffffffffa02c5ac4>] libcfs_ioctl+0x84/0x180 [libcfs]
 [<ffffffff81194acc>] vfs_ioctl+0x7c/0xa0
 [<ffffffff81194c14>] do_vfs_ioctl+0x84/0x580
 [<ffffffff81195191>] sys_ioctl+0x81/0xa0
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Kernel panic - not syncing: LBUG
Pid: 2961, comm: lst Tainted: G W --------------- 2.6.32-358.5chaos.ch5.1.x86_64 #1
Call Trace:
 [<ffffffff8150d3ec>] ? panic+0xa7/0x16f
 [<ffffffffa02c1e4b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
 [<ffffffffa059cc36>] ? lstcon_dstnodes_prep+0x236/0x280 [lnet_selftest]
 [<ffffffff8116041a>] ? alloc_pages_current+0xaa/0x110
 [<ffffffffa059eb9a>] ? lstcon_testrpc_prep+0xfa/0x310 [lnet_selftest]
 [<ffffffffa0596a40>] ? lstcon_testrpc_condition+0x0/0x1c0 [lnet_selftest]
 [<ffffffffa059f30f>] ? lstcon_rpc_trans_ndlist+0x24f/0x300 [lnet_selftest]
 [<ffffffffa059b2ce>] ? lstcon_test_add+0x4ce/0x930 [lnet_selftest]
 [<ffffffffa05a03e7>] ? lst_test_add_ioctl+0xaf7/0xc20 [lnet_selftest]
 [<ffffffffa05a3f95>] ? lstcon_ioctl_entry+0x505/0x5f0 [lnet_selftest]
 [<ffffffffa02cac84>] ? libcfs_ioctl+0x354/0x900 [libcfs]
 [<ffffffffa02c5ac4>] ? libcfs_ioctl+0x84/0x180 [libcfs]
 [<ffffffff81194acc>] ? vfs_ioctl+0x7c/0xa0
 [<ffffffff81194c14>] ? do_vfs_ioctl+0x84/0x580
 [<ffffffff81195191>] ? sys_ioctl+0x81/0xa0
 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
lst script (note that the servers NID was misspelled; owib100 was presumably meant to be o2ib100):
#!/bin/bash
export LST_SESSION=1234
lst new_session read/write
lst add_group servers 172.19.1.101@owib100
lst add_group readers 172.16.66.53@tcp
lst add_batch bulk_rw
lst add_test --batch bulk_rw --from readers --to servers brw read check=simple size=1M
lst run bulk_rw
# display server stats for 30 seconds
lst stat servers & sleep 30; kill $!
# tear down
lst end_session
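
Until the console handles this case gracefully, a script can guard against the typo itself. The following is a minimal sketch, not part of lst: nid_looks_valid and check_nids are hypothetical helper names, and the pattern assumes IPv4 NIDs on tcp or o2ib networks only (a deliberately short, assumed list of LNet network types; extend it for other LNDs).

```shell
#!/bin/bash
# Hypothetical helper, not part of lst: succeeds if the NID looks like
# <IPv4>@<net>[number], where <net> is tcp or o2ib. The list of network
# types is an assumption made for this sketch.
nid_looks_valid() {
    local nid=$1
    local re='^[0-9]{1,3}(\.[0-9]{1,3}){3}@(tcp|o2ib)[0-9]*$'
    [[ $nid =~ $re ]]
}

# Check every NID before any group is created, so that a typo stops the
# script before "lst add_test" is ever issued.
check_nids() {
    local nid
    for nid in "$@"; do
        if ! nid_looks_valid "$nid"; then
            echo "suspicious NID: $nid" >&2
            return 1
        fi
    done
}
```

In the script above, a line such as `check_nids 172.19.1.101@owib100 172.16.66.53@tcp || exit 1` placed before `lst new_session` would abort on the misspelled servers NID. This is only a syntactic check; it does not confirm that the node is reachable.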