[LU-10281] conf-sanity: test_54a hung at lnet_discover_peer_locked() Created: 27/Nov/17 Updated: 27/Nov/17 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Jinshan Xiong <jinshan.xiong@intel.com>

Please provide additional information about the failure here.

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/60aa1144-d2a7-11e7-9840-52540065bddc.

The console messages on the OSS:

[24240.470189] Lustre: srv-lustre-OST0000: No data found on store. Initialize space
[24284.799899] LNet: Service thread pid 2277 was inactive for 40.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[24284.801683] Pid: 2277, comm: ll_ost00_002
[24284.802095] Call Trace:
[24284.802511] [<ffffffff816a9569>] schedule+0x29/0x70
[24284.803038] [<ffffffffc077a9bb>] lnet_discover_peer_locked+0x10b/0x380 [lnet]
[24284.803764] [<ffffffff810b1920>] ? autoremove_wake_function+0x0/0x40
[24284.804550] [<ffffffffc077aca0>] LNetPrimaryNID+0x70/0x1a0 [lnet]
[24284.805230] [<ffffffffc0a3e35e>] ptlrpc_connection_get+0x3e/0x450 [ptlrpc]
[24284.806007] [<ffffffffc0a422a4>] ptlrpc_send_reply+0x394/0x840 [ptlrpc]
[24284.806762] [<ffffffffc0a482af>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[24284.807514] [<ffffffffc0a4281b>] ptlrpc_send_error+0x9b/0x1b0 [ptlrpc]
[24284.808270] [<ffffffffc0a42940>] ptlrpc_error+0x10/0x20 [ptlrpc]
[24284.808917] [<ffffffffc0aafb18>] tgt_request_handle+0x7d8/0x13b0 [ptlrpc]
[24284.809653] [<ffffffffc0a53eee>] ptlrpc_server_handle_request+0x24e/0xab0 [ptlrpc]
[24284.810445] [<ffffffffc0a50db8>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[24284.811199] [<ffffffff810c4832>] ? default_wake_function+0x12/0x20
[24284.811824] [<ffffffff810ba598>] ? __wake_up_common+0x58/0x90
[24284.812424] [<ffffffffc0a57692>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[24284.813143] [<ffffffff81029557>] ? __switch_to+0xd7/0x510
[24284.813685] [<ffffffff816a9000>] ? __schedule+0x370/0x8b0
[24284.814259] [<ffffffffc0a56c00>] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[24284.814975] [<ffffffff810b099f>] kthread+0xcf/0xe0
[24284.815450] [<ffffffff810b08d0>] ? kthread+0x0/0xe0
[24284.815961] [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90
[24284.816565] [<ffffffff810b08d0>] ? kthread+0x0/0xe0

My patch does not change this area of the code.
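
For reference, the trace shows the reply path (ptlrpc_send_reply() -> ptlrpc_connection_get() -> LNetPrimaryNID()) blocking inside lnet_discover_peer_locked() until peer discovery completes, so the OSS service thread sits in schedule() and trips the 40-second inactivity watchdog. Below is a minimal user-space sketch of that blocking-wait pattern, using plain pthreads and hypothetical names (discover_peer_locked, get_primary_nid, service_thread are stand-ins, not actual LNet/ptlrpc code); it illustrates the hang mechanism under these assumptions, it is not the real implementation.

/*
 * Sketch of the blocking pattern suggested by the trace: a "service
 * thread" sending a reply must first resolve the peer's primary NID,
 * which waits for peer discovery to finish.  If discovery stalls, the
 * service thread blocks and the watchdog reports it as inactive.
 * Hypothetical names only; this is not LNet code.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t peer_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  peer_cond = PTHREAD_COND_INITIALIZER;
static bool discovery_done;             /* never set in the hung case */

/* Stand-in for lnet_discover_peer_locked(): wait, lock held, until
 * discovery completes. */
static void discover_peer_locked(void)
{
        while (!discovery_done)
                pthread_cond_wait(&peer_cond, &peer_lock);
}

/* Stand-in for LNetPrimaryNID(): cannot answer before discovery. */
static void get_primary_nid(void)
{
        pthread_mutex_lock(&peer_lock);
        discover_peer_locked();
        pthread_mutex_unlock(&peer_lock);
}

/* Stand-in for the service thread's reply path. */
static void *service_thread(void *arg)
{
        (void)arg;
        printf("service thread: sending reply, resolving primary NID\n");
        get_primary_nid();              /* blocks here, as in the trace */
        printf("service thread: reply sent\n");
        return NULL;
}

int main(void)
{
        pthread_t tid;

        pthread_create(&tid, NULL, service_thread, NULL);

        /* Stand-in for the 40 s service watchdog: the thread is still
         * stuck waiting for discovery, so report it. */
        sleep(2);
        pthread_mutex_lock(&peer_lock);
        if (!discovery_done)
                printf("watchdog: service thread inactive, "
                       "stuck waiting for peer discovery\n");
        pthread_mutex_unlock(&peer_lock);

        /* Let discovery "complete" so this example terminates cleanly. */
        pthread_mutex_lock(&peer_lock);
        discovery_done = true;
        pthread_cond_broadcast(&peer_cond);
        pthread_mutex_unlock(&peer_lock);

        pthread_join(tid, NULL);
        return 0;
}

In the reported failure the "discovery finished" signal presumably never arrives (for example, if a multi-rail discovery ping is stuck), so the wait never returns and the thread stays blocked.
|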
| Comments |
| Comment by Jinshan Xiong (Inactive) [ 27/Nov/17 ] |
|
This is probably related to multi-rail issues. |