Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-5803

This server is not able to keep up with request traffic

    XMLWordPrintable

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Minor
    • Lustre 2.5.4
    • Lustre 2.5.3
    • 3
    • 16270

    Description

      The following was with our local lustre version 2.5.3-1chaos (see github.com/chaos/lustre), using the zfs osd.

      I am noticing that with this 2.5.3 release that every time I boot the entire server cluster, the OSS nodes all have the following block of console noise. We are always past the deadline by "-6".

      I hadn't seen this noise with the 2.4 branch.

      2014-10-23 18:36:30 LustreError: 13a-8: Failed to get MGS log params and no local copy.
      2014-10-23 18:36:41 LustreError: 7401:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired   req@ffff8810142b3800 x1482806576283684/t0(0) o253->MGC10.1.1.169@o2ib9@10.1.1.1
      69@o2ib9:26/25 lens 4768/4768 e 0 to 0 dl 0 ref 2 fl Rpc:W/0/ffffffff rc 0/-1
      2014-10-23 18:36:41 LustreError: 7401:0:(client.c:1083:ptlrpc_import_delay_req()) Skipped 1 previous similar message
      2014-10-23 18:36:42 Lustre: lcy-OST0001: Will be in recovery for at least 5:00, or until 135 clients reconnect.
      2014-10-23 18:37:01 LustreError: 137-5: lcy-OST0000_UUID: not available for connect from 192.168.121.102@o2ib2 (no target). If you are running an HA pair check that the target is mounted
       on the other server.
      2014-10-23 18:37:01 LustreError: Skipped 27 previous similar messages
      2014-10-23 18:37:33 Lustre: ost: This server is not able to keep up with request traffic (cpu-bound).
      2014-10-23 18:37:33 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) earlyQ=2 reqQ=0 recA=0, svcEst=45, delay=0(jiff)
      2014-10-23 18:37:33 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-6s), not sending early reply. Consider increasing at_early_margin (10)?  req@
      ffff88081a1cb850 x1482683500276896/t0(0) o400->c6423062-7cd6-6647-06fc-f8be419a9edc@192.168.121.111@o2ib2:0/0 lens 224/0 e 1 to 0 dl 1414114647 ref 2 fl Complete:H/c0/ffffffff rc 0/-1
      2014-10-23 18:37:34 Lustre: ost: This server is not able to keep up with request traffic (cpu-bound).
      2014-10-23 18:37:34 Lustre: 7599:0:(service.c:1512:ptlrpc_at_check_timed()) earlyQ=1 reqQ=0 recA=0, svcEst=45, delay=0(jiff)
      2014-10-23 18:37:34 Lustre: 7599:0:(service.c:1309:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-6s), not sending early reply. Consider increasing at_early_margin (10)?  req@
      ffff881013dab850 x1482683530936232/t0(0) o400->393de519-5053-8c5e-d1f5-ab0fa701f469@192.168.121.140@o2ib2:0/0 lens 224/0 e 1 to 0 dl 1414114648 ref 2 fl Complete:H/c0/ffffffff rc 0/-1
      2014-10-23 18:37:34 Lustre: 7599:0:(service.c:1309:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
      2014-10-23 18:37:36 Lustre: ost: This server is not able to keep up with request traffic (cpu-bound).
      2014-10-23 18:37:36 Lustre: Skipped 1 previous similar message
      2014-10-23 18:37:36 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) earlyQ=1 reqQ=0 recA=0, svcEst=45, delay=0(jiff)
      2014-10-23 18:37:36 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) Skipped 1 previous similar message
      2014-10-23 18:37:36 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-6s), not sending early reply. Consider increasing at_early_margin (10)?  req@
      ffff88081a1c0050 x1482683492672308/t0(0) o400->c3d0ea3a-8781-d75b-9934-fd211665f796@192.168.121.19@o2ib2:0/0 lens 224/0 e 1 to 0 dl 1414114650 ref 2 fl Complete:H/c0/ffffffff rc 0/-1
      2014-10-23 18:37:36 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) Skipped 1 previous similar message
      2014-10-23 18:37:39 Lustre: ost: This server is not able to keep up with request traffic (cpu-bound).
      2014-10-23 18:37:39 Lustre: Skipped 2 previous similar messages
      2014-10-23 18:37:39 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) earlyQ=1 reqQ=0 recA=0, svcEst=45, delay=0(jiff)
      2014-10-23 18:37:39 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) Skipped 2 previous similar messages
      2014-10-23 18:37:39 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-6s), not sending early reply. Consider increasing at_early_margin (10)?  req@
      ffff88081a1b5050 x1482683502391272/t0(0) o400->ea71043b-0040-df9e-f053-fc396ff5a07b@192.168.121.137@o2ib2:0/0 lens 224/0 e 1 to 0 dl 1414114653 ref 2 fl Complete:H/c0/ffffffff rc 0/-1
      2014-10-23 18:37:39 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages
      2014-10-23 18:37:43 Lustre: ost: This server is not able to keep up with request traffic (cpu-bound).
      2014-10-23 18:37:43 Lustre: Skipped 5 previous similar messages
      2014-10-23 18:37:43 Lustre: 7599:0:(service.c:1512:ptlrpc_at_check_timed()) earlyQ=1 reqQ=0 recA=0, svcEst=45, delay=0(jiff)
      2014-10-23 18:37:43 Lustre: 7599:0:(service.c:1512:ptlrpc_at_check_timed()) Skipped 5 previous similar messages
      2014-10-23 18:37:43 Lustre: 7599:0:(service.c:1309:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-6s), not sending early reply. Consider increasing at_early_margin (10)?  req@
      ffff881013d8a850 x1482683531137984/t0(0) o400->84c85d3d-b1c2-dfa0-5dad-84193a0250a9@192.168.121.151@o2ib2:0/0 lens 224/0 e 1 to 0 dl 1414114657 ref 2 fl Complete:H/c0/ffffffff rc 0/-1
      2014-10-23 18:37:43 Lustre: 7599:0:(service.c:1309:ptlrpc_at_send_early_reply()) Skipped 7 previous similar messages
      2014-10-23 18:37:52 Lustre: ost: This server is not able to keep up with request traffic (cpu-bound).
      2014-10-23 18:37:52 Lustre: Skipped 28 previous similar messages
      2014-10-23 18:37:52 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) earlyQ=5 reqQ=0 recA=0, svcEst=45, delay=0(jiff)
      2014-10-23 18:37:52 Lustre: 7555:0:(service.c:1512:ptlrpc_at_check_timed()) Skipped 28 previous similar messages
      2014-10-23 18:37:52 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-6s), not sending early reply. Consider increasing at_early_margin (10)?  req@
      ffff88081a20e050 x1482683498963872/t0(0) o400->770acbfa-1b07-454e-d9b5-c1becbe0bcee@192.168.121.87@o2ib2:0/0 lens 224/0 e 1 to 0 dl 1414114666 ref 2 fl Complete:H/c0/ffffffff rc 0/-1
      2014-10-23 18:37:52 Lustre: 7555:0:(service.c:1309:ptlrpc_at_send_early_reply()) Skipped 84 previous similar messages
      

      Attachments

        Issue Links

          Activity

            People

              yujian Jian Yu
              morrone Christopher Morrone (Inactive)
              Votes:
              1 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: