Details
-
Bug
-
Resolution: Cannot Reproduce
-
Minor
-
Lustre 2.5.3
-
3
-
16270
Description
The following was with our local lustre version 2.5.3-1chaos (see github.com/chaos/lustre), using the zfs osd.
I am noticing that with this 2.5.3 release that every time I boot the entire server cluster, the OSS nodes all have the following block of console noise. We are always past the deadline by "-6".
I hadn't seen this noise with the 2.4 branch.
2014-10-23 18:36:30 LustreError: 13a-8: Failed to get MGS log params and no local copy. 2014-10-23 18:36:41 LustreError: 7401:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff8810142b3800 x1482806576283684/t0(0) o253->MGC10.1.1.169@o2ib9@10.1.1.1 69@o2ib9:26/25 lens 4768/4768 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1 2014-10-23 18:36:41 LustreError: 7401:0:(client.c:1083:ptlrpc_import_delay_req()) Skipped 1 previous similar message 2014-10-23 18:36:42 Lustre: lcy-OST0001: Will be in recovery for at least 5:00, or until 135 clients reconnect. 2014-10-23 18:37:01 LustreError: 137-5: lcy-OST0000_UUID: not available for connect from 192.168.121.102@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. 2014-10-23 18:37:01 LustreError: Skipped 27 previous similar messages 2014-10-23 18:37:33 Lustre: ost: This server is not able to keep up with request traffic (cpu-bound). 2014-10-23 18:37:33 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) earlyQ=2 reqQ=0 recA=0, svcEst=45, delay=0(jiff) 2014-10-23 18:37:33 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-6s), not sending early reply. Consider increasing at_early_margin (10)? req@ ffff88081a1cb850 x1482683500276896/t0(0) o400->c6423062-7cd6-6647-06fc-f8be419a9edc@192.168.121.111@o2ib2:0/0 lens 224/0 e 1 to 0 dl 1414114647 ref 2 fl Complete:H/c0/ffffffff rc 0/-1 2014-10-23 18:37:34 Lustre: ost: This server is not able to keep up with request traffic (cpu-bound). 2014-10-23 18:37:34 Lustre: 7599:0:(service.c:1512:ptlrpc_at_check_timed()) earlyQ=1 reqQ=0 recA=0, svcEst=45, delay=0(jiff) 2014-10-23 18:37:34 Lustre: 7599:0:(service.c:1309:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-6s), not sending early reply. Consider increasing at_early_margin (10)? req@ ffff881013dab850 x1482683530936232/t0(0) o400->393de519-5053-8c5e-d1f5-ab0fa701f469@192.168.121.140@o2ib2:0/0 lens 224/0 e 1 to 0 dl 1414114648 ref 2 fl Complete:H/c0/ffffffff rc 0/-1 2014-10-23 18:37:34 Lustre: 7599:0:(service.c:1309:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message 2014-10-23 18:37:36 Lustre: ost: This server is not able to keep up with request traffic (cpu-bound). 2014-10-23 18:37:36 Lustre: Skipped 1 previous similar message 2014-10-23 18:37:36 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) earlyQ=1 reqQ=0 recA=0, svcEst=45, delay=0(jiff) 2014-10-23 18:37:36 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) Skipped 1 previous similar message 2014-10-23 18:37:36 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-6s), not sending early reply. Consider increasing at_early_margin (10)? req@ ffff88081a1c0050 x1482683492672308/t0(0) o400->c3d0ea3a-8781-d75b-9934-fd211665f796@192.168.121.19@o2ib2:0/0 lens 224/0 e 1 to 0 dl 1414114650 ref 2 fl Complete:H/c0/ffffffff rc 0/-1 2014-10-23 18:37:36 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message 2014-10-23 18:37:39 Lustre: ost: This server is not able to keep up with request traffic (cpu-bound). 2014-10-23 18:37:39 Lustre: Skipped 2 previous similar messages 2014-10-23 18:37:39 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) earlyQ=1 reqQ=0 recA=0, svcEst=45, delay=0(jiff) 2014-10-23 18:37:39 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) Skipped 2 previous similar messages 2014-10-23 18:37:39 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-6s), not sending early reply. Consider increasing at_early_margin (10)? req@ ffff88081a1b5050 x1482683502391272/t0(0) o400->ea71043b-0040-df9e-f053-fc396ff5a07b@192.168.121.137@o2ib2:0/0 lens 224/0 e 1 to 0 dl 1414114653 ref 2 fl Complete:H/c0/ffffffff rc 0/-1 2014-10-23 18:37:39 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages 2014-10-23 18:37:43 Lustre: ost: This server is not able to keep up with request traffic (cpu-bound). 2014-10-23 18:37:43 Lustre: Skipped 5 previous similar messages 2014-10-23 18:37:43 Lustre: 7599:0:(service.c:1512:ptlrpc_at_check_timed()) earlyQ=1 reqQ=0 recA=0, svcEst=45, delay=0(jiff) 2014-10-23 18:37:43 Lustre: 7599:0:(service.c:1512:ptlrpc_at_check_timed()) Skipped 5 previous similar messages 2014-10-23 18:37:43 Lustre: 7599:0:(service.c:1309:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-6s), not sending early reply. Consider increasing at_early_margin (10)? req@ ffff881013d8a850 x1482683531137984/t0(0) o400->84c85d3d-b1c2-dfa0-5dad-84193a0250a9@192.168.121.151@o2ib2:0/0 lens 224/0 e 1 to 0 dl 1414114657 ref 2 fl Complete:H/c0/ffffffff rc 0/-1 2014-10-23 18:37:43 Lustre: 7599:0:(service.c:1309:ptlrpc_at_send_early_reply()) Skipped 7 previous similar messages 2014-10-23 18:37:52 Lustre: ost: This server is not able to keep up with request traffic (cpu-bound). 2014-10-23 18:37:52 Lustre: Skipped 28 previous similar messages 2014-10-23 18:37:52 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) earlyQ=5 reqQ=0 recA=0, svcEst=45, delay=0(jiff) 2014-10-23 18:37:52 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) Skipped 28 previous similar messages 2014-10-23 18:37:52 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-6s), not sending early reply. Consider increasing at_early_margin (10)? req@ ffff88081a20e050 x1482683498963872/t0(0) o400->770acbfa-1b07-454e-d9b5-c1becbe0bcee@192.168.121.87@o2ib2:0/0 lens 224/0 e 1 to 0 dl 1414114666 ref 2 fl Complete:H/c0/ffffffff rc 0/-1 2014-10-23 18:37:52 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) Skipped 84 previous similar messages
Attachments
Issue Links
- is related to
-
LU-5079 conf-sanity test_47 timeout
- Resolved