Details
- Bug
- Resolution: Fixed
- Major
- None
- None
- fs/lustre-release-fe
- 3
- 9223372036854775807
Description
We are encountering the issue described above on clients running Lustre 2.8 with Omni-Path fabrics. We would like the patch backported to b2_8_fe.
Console log messages
2017-11-12 10:20:23 [763673.420307] LNetError: 6383:0:(o2iblnd_cb.c:1105:kiblnd_init_rdma()) RDMA has too many fragments for peer 192.168.134.10@o2ib27 (256), src idx/frags: 128/256 dst idx/frags: 128/256
2017-11-12 10:20:23 [763673.438365] LNetError: 6383:0:(o2iblnd_cb.c:434:kiblnd_handle_rx()) Can't setup rdma for PUT to 192.168.134.10@o2ib27: -90
2017-11-12 10:23:00 [763830.403553] Lustre: 8245:0:(client.c:2063:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1510510823/real 1510510823] req@ffff88102df6e300 x1583085670240648/t0(0) o4->lsh-OST0009-osc-ffff881035fb1000@172.19.3.26@o2ib600:6/4 lens 608/448 e 2 to 1 dl 1510510980 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
2017-11-12 10:23:00 [763830.435966] Lustre: lsh-OST0009-osc-ffff881035fb1000: Connection to lsh-OST0009 (at 172.19.3.26@o2ib600) was lost; in progress operations using this service will wait for recovery to complete
2017-11-12 10:23:00 [763830.455086] Lustre: Skipped 42 previous similar messages
2017-11-12 10:23:00 [763830.488005] Lustre: lsh-OST0009-osc-ffff881035fb1000: Connection restored to 172.19.3.26@o2ib600 (at 172.19.3.26@o2ib600)
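For triage, a small script along these lines could count how often the fragment error appears per peer in a saved console log. This is only a sketch: the regular expression is modeled on the single kiblnd_init_rdma() message shown above, and the names FRAG_ERR and count_fragment_errors are hypothetical helpers, not part of Lustre or LNet.

```python
import re
from collections import Counter

# Pattern modeled on the kiblnd_init_rdma() message in the console log above.
# Assumption: other Lustre versions may word this message differently.
FRAG_ERR = re.compile(
    r"RDMA has too many fragments for peer (?P<peer>\S+) \((?P<limit>\d+)\), "
    r"src idx/frags: (?P<src_idx>\d+)/(?P<src_frags>\d+) "
    r"dst idx/frags: (?P<dst_idx>\d+)/(?P<dst_frags>\d+)"
)


def count_fragment_errors(log_text):
    """Return a Counter mapping peer NID to the number of fragment errors seen."""
    # finditer scans the whole text, so it works whether the log entries are
    # one per line or run together on a single line.
    return Counter(m.group("peer") for m in FRAG_ERR.finditer(log_text))


# Sample input copied from the first console message in this ticket.
sample = (
    "2017-11-12 10:20:23 [763673.420307] LNetError: "
    "6383:0:(o2iblnd_cb.c:1105:kiblnd_init_rdma()) "
    "RDMA has too many fragments for peer 192.168.134.10@o2ib27 (256), "
    "src idx/frags: 128/256 dst idx/frags: 128/256"
)

if __name__ == "__main__":
    for peer, count in count_fragment_errors(sample).items():
        print(peer, count)
```

Running it against a full console log (read from a file into a string) would show which peers trigger the error most often.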
Attachments
Issue Links
- is related to LU-5718 RDMA too fragmented with router (Resolved)