Lustre / LU-8867

Ignore timed-out TX on closing connection

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.10.0

    Description

      In a customer log, we found the following LNetError reports:

      Nov 13 17:06:58 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff8819375e0000) timed out 2 secs ago, resid: 0, wmem: 0
      Nov 18 02:22:37 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 1 secs ago, resid: 0, wmem: 3096232
      Nov 18 02:22:38 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 1 secs ago, resid: 0, wmem: 3096232
      Nov 18 02:22:39 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 2 secs ago, resid: 0, wmem: 3089888
      Nov 18 02:22:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 4 secs ago, resid: 0, wmem: 3090440
      Nov 18 02:22:45 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 8 secs ago, resid: 0, wmem: 3090440
      Nov 18 02:22:53 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 16 secs ago, resid: 0, wmem: 3090440
      Nov 18 02:23:09 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 32 secs ago, resid: 0, wmem: 3090440
      Nov 18 02:23:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 64 secs ago, resid: 0, wmem: 3090440
      Nov 18 02:24:45 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 128 secs ago, resid: 0, wmem: 3090440
      Nov 18 02:26:53 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 256 secs ago, resid: 0, wmem: 3090440
      Nov 18 02:31:09 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 512 secs ago, resid: 0, wmem: 3090440
      Nov 18 02:39:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 1024 secs ago, resid: 0, wmem: 3090440
      Nov 18 02:49:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 1624 secs ago, resid: 0, wmem: 3090440
      Nov 18 02:59:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 2224 secs ago, resid: 0, wmem: 3090440
      Nov 18 03:09:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 2824 secs ago, resid: 0, wmem: 3090440
      Nov 18 03:19:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 3424 secs ago, resid: 0, wmem: 3090440
      Nov 18 03:29:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 4024 secs ago, resid: 0, wmem: 3090440
      Nov 18 03:39:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 4624 secs ago, resid: 0, wmem: 3090440
      Nov 18 03:49:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 5224 secs ago, resid: 0, wmem: 3090440
      Nov 18 03:59:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 5824 secs ago, resid: 0, wmem: 3090440
      Nov 18 04:09:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 6424 secs ago, resid: 0, wmem: 3090440
      Nov 18 04:19:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 7024 secs ago, resid: 0, wmem: 3090440
      Nov 18 04:29:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 7624 secs ago, resid: 0, wmem: 3090440
      Nov 18 04:39:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 8224 secs ago, resid: 0, wmem: 3090440
      Nov 18 04:49:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 8824 secs ago, resid: 0, wmem: 3090440
      Nov 18 04:59:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 9424 secs ago, resid: 0, wmem: 3090440
      Nov 18 05:09:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 10024 secs ago, resid: 0, wmem: 3090440
      Nov 18 05:19:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 10624 secs ago, resid: 0, wmem: 3090440
      Nov 18 05:29:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 11224 secs ago, resid: 0, wmem: 3090440
      Nov 18 05:39:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 11824 secs ago, resid: 0, wmem: 3090440
      Nov 18 05:49:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 12424 secs ago, resid: 0, wmem: 3090440
      Nov 18 05:59:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 13024 secs ago, resid: 0, wmem: 3090440
      Nov 18 06:09:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 13624 secs ago, resid: 0, wmem: 3090440
      Nov 18 06:19:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 14224 secs ago, resid: 0, wmem: 3090440
      Nov 18 06:29:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 14824 secs ago, resid: 0, wmem: 3090440
      Nov 18 06:39:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 15424 secs ago, resid: 0, wmem: 3090440
      Nov 18 06:49:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 16024 secs ago, resid: 0, wmem: 3090440
      Nov 18 06:59:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 16624 secs ago, resid: 0, wmem: 3090440
      Nov 18 07:09:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 17224 secs ago, resid: 0, wmem: 3090440
      Nov 18 07:19:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 17824 secs ago, resid: 0, wmem: 3090440
      Nov 18 07:29:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 18424 secs ago, resid: 0, wmem: 3090440
      Nov 18 07:39:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 19024 secs ago, resid: 0, wmem: 3090440
      Nov 18 07:49:41 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 19624 secs ago, resid: 0, wmem: 3090440
      
      

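      It is abnormal for the same TX to be closed multiple times; here the same ZC_REQ (ffff881a2535a000) is reported as stale again and again for more than five hours.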
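
      To check a log for this pattern, the stale ZC_REQ reports can be grouped by peer and TX address. The following is only a minimal sketch for log triage, not part of the fix: the function name group_stale_reports and the regular expression are assumptions written against the message format quoted above, and the sample lines are copied from that excerpt.

```python
import re
from collections import defaultdict

# Assumed pattern, written against the ksocknal_check_peer_timeouts()
# message format shown in the log excerpt above.
STALE_RE = re.compile(
    r"Total (?P<count>\d+) stale ZC_REQs for peer (?P<peer>\S+) detected; "
    r"the oldest\((?P<tx>[0-9a-f]+)\) timed out (?P<secs>\d+) secs ago"
)


def group_stale_reports(lines):
    """Map (peer, TX address) -> list of reported 'timed out N secs ago' values.

    Hypothetical helper for log triage; lines that do not match are skipped.
    """
    reports = defaultdict(list)
    for line in lines:
        match = STALE_RE.search(line)
        if match:
            key = (match.group("peer"), match.group("tx"))
            reports[key].append(int(match.group("secs")))
    return dict(reports)


# Three lines copied from the excerpt above.
sample = [
    "Nov 13 17:06:58 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff8819375e0000) timed out 2 secs ago, resid: 0, wmem: 0",
    "Nov 18 02:22:37 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 1 secs ago, resid: 0, wmem: 3096232",
    "Nov 18 02:22:45 OSS04 kernel: LNetError: 3721:0:(socklnd_cb.c:2519:ksocknal_check_peer_timeouts()) Total 1 stale ZC_REQs for peer 192.235.3.34@tcp detected; the oldest(ffff881a2535a000) timed out 8 secs ago, resid: 0, wmem: 3090440",
]

if __name__ == "__main__":
    for (peer, tx), ages in group_stale_reports(sample).items():
        # A TX that appears many times with a growing age is the case described above.
        print(f"{peer} tx={tx} reports={len(ages)} max_age={max(ages)}s")
```

      A TX address that keeps reappearing with an ever-growing age, as ffff881a2535a000 does in the excerpt, is the case this issue describes.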

          People

            ys Yang Sheng