Details
Type: Bug
Resolution: Unresolved
Priority: Minor
Severity: 3
Description
This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/85009820-d429-41a1-8948-90b2f66d7f02
test_123f failed with the following error:
onyx-37vm9 crashed during sanity test_123f
LNetError: 191018:0:(socklnd_cb.c:1036:ksocknal_send()) ASSERTION( tx->tx_nkiov <= 256 ) failed:
LNetError: 191018:0:(socklnd_cb.c:1036:ksocknal_send()) LBUG
Pid: 191018, comm: mdt_out00_001 4.18.0-425.10.1.el8_lustre.x86_64 #1 SMP Wed May 3 16:22:26 UTC 2023
Call Trace TBD:
libcfs_call_trace+0x6f/0xa0 [libcfs]
lbug_with_loc+0x3f/0x70 [libcfs]
ksocknal_send+0x27a/0x320 [ksocklnd]
lnet_ni_send+0x4c/0xe0 [lnet]
lnet_send+0xae/0x1e0 [lnet]
LNetPut+0x318/0x940 [lnet]
ptl_send_buf+0x208/0x5a0 [ptlrpc]
ptlrpc_send_reply+0x2ad/0x8d0 [ptlrpc]
target_send_reply+0x328/0x7d0 [ptlrpc]
tgt_request_handle+0xe85/0x1920 [ptlrpc]
ptlrpc_server_handle_request+0x31d/0xbc0 [ptlrpc]
ptlrpc_main+0xc52/0x1510
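For readers unfamiliar with the assertion, the following is a minimal sketch of the kind of arithmetic behind a fragment-count limit such as tx_nkiov <= 256. It is not the socklnd code and not a diagnosis of this crash. It assumes one fragment per page touched by the buffer and a 4 KiB page size; the names PAGE_SIZE, MAX_KIOV, kiov_fragments, and fits_in_one_send are hypothetical and used only for illustration.

```python
# Illustrative only: counts how many page-sized fragments a buffer spans.
# Assumptions (not taken from the Lustre source): 4 KiB pages and one
# fragment per page touched by the buffer.
PAGE_SIZE = 4096   # assumed page size
MAX_KIOV = 256     # the limit that appears in the failed assertion


def kiov_fragments(offset: int, length: int, page_size: int = PAGE_SIZE) -> int:
    """Return the number of pages covered by [offset, offset + length)."""
    if length <= 0:
        return 0
    first_page = offset // page_size
    last_page = (offset + length - 1) // page_size
    return last_page - first_page + 1


def fits_in_one_send(offset: int, length: int) -> bool:
    """True if the buffer needs no more than MAX_KIOV fragments."""
    return kiov_fragments(offset, length) <= MAX_KIOV


if __name__ == "__main__":
    one_mib = 1 << 20
    # A page-aligned 1 MiB buffer spans exactly 256 pages.
    print(kiov_fragments(0, one_mib), fits_in_one_send(0, one_mib))
    # The same length at an unaligned offset spans one more page.
    print(kiov_fragments(1, one_mib), fits_in_one_send(1, one_mib))
```

Under these assumptions, a buffer only slightly larger than 1 MiB, or a 1 MiB buffer that is not page-aligned, would need more than 256 fragments. Whether that is what happens to the reply in test_123f is not established by this report.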
Test session details:
clients: https://build.whamcloud.com/job/lustre-reviews/94706 - 5.4.0-131-generic
servers: https://build.whamcloud.com/job/lustre-reviews/94706 - 4.18.0-425.10.1.el8_lustre.x86_64
I have seen this about 10 times since 2023-05-09, after patch https://review.whamcloud.com/46540 "LU-15550 ptlrpc: retry mechanism for overflowed batched RPCs" landed, but I'm not sure whether the patch is directly related.
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity test_123f - onyx-37vm9 crashed during sanity test_123f