Lustre / LU-13912

lnet_check_routes sends pings too frequently

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Lustre 2.15.0
    • Labels:
      None
    • Severity:
      3

      Description

      lnet_check_routers() attempts to discover a router every alive_router_check_interval / (number of local nets) seconds. For example, this test node has three nets:

      sles15s01:~ # lctl list_nids
      192.168.2.30@tcp
      192.168.2.31@tcp
      192.168.2.30@tcp10
      192.168.2.31@tcp11
      sles15s01:~ #
      

      The default interval is 60 seconds:

      sles15s01:~ # cat /sys/module/lnet/parameters/alive_router_check_interval
      60
      sles15s01:~ #
      

      But each local net on the router (tcp10 and tcp11) is being discovered roughly every 15 seconds:

      00000400:00000200:3.0:1597692791.996167:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692793.020062:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597692806.332061:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692808.380070:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597692821.692041:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692823.740048:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597692836.028075:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692838.076090:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597692851.388127:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692853.436089:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597692866.748080:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692868.796055:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597692881.084093:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692883.132121:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597692896.444129:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692898.492085:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597692911.804178:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692913.852211:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597692926.140084:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692928.188112:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597692941.500096:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692943.548098:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597692956.860151:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692958.908200:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597692971.196694:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692973.244082:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597692986.556074:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597692988.604117:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597693001.916333:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597693003.964361:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597693016.252092:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597693018.300111:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597693031.612116:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597693033.660105:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597693046.972061:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597693049.020096:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597693061.308071:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597693064.380103:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597693076.668114:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597693079.740148:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597693091.004132:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597693094.076132:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597693106.364052:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597693109.436064:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597693121.724125:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597693124.796137:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597693136.060160:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597693139.132125:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597693151.420063:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597693154.492093:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597693166.780101:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597693169.852103:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
      00000400:00000200:3.0:1597693181.116105:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
      00000400:00000200:3.0:1597693184.188104:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
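
      The spacing in the log above can be checked mechanically. The following is a minimal sketch, not part of Lustre: the names check_gaps, LINE_RE, and SAMPLE are illustrative, and the regular expression assumes the debug log layout shown in the excerpt (timestamp in the fourth colon-separated field, net name after the router NID). SAMPLE holds the first six log lines from the excerpt.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the excerpt above:
#   mask:subsys:cpu:TIMESTAMP:...:(router.c:N:lnet_check_routers()) NID(ptr) NET(ptr) cpt = N
LINE_RE = re.compile(
    r"^\s*\w+:\w+:[\d.]+:(?P<ts>\d+\.\d+):.*lnet_check_routers\(\)\)\s+"
    r"(?P<nid>\S+?)\(\w+\)\s+(?P<net>\w+)\(\w+\)"
)


def check_gaps(lines):
    """Return {net: [seconds between consecutive checks on that net]}."""
    stamps = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if m:  # silently skip lines that are not lnet_check_routers() entries
            stamps[m.group("net")].append(float(m.group("ts")))
    return {
        net: [later - earlier for earlier, later in zip(ts, ts[1:])]
        for net, ts in stamps.items()
    }


# First six lines of the log excerpt, copied verbatim.
SAMPLE = """\
00000400:00000200:3.0:1597692791.996167:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
00000400:00000200:3.0:1597692793.020062:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
00000400:00000200:3.0:1597692806.332061:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
00000400:00000200:3.0:1597692808.380070:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
00000400:00000200:3.0:1597692821.692041:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp10(ffff93a5a43ea500) cpt = 1
00000400:00000200:3.0:1597692823.740048:0:10126:0:(router.c:1233:lnet_check_routers()) 192.168.2.32@tcp99(ffff93a57174e200) tcp11(ffff93a53c96fe00) cpt = 1
""".splitlines()

if __name__ == "__main__":
    for net, gaps in check_gaps(SAMPLE).items():
        print(net, [round(g, 2) for g in gaps])
```

      On these six lines the gaps for both nets fall between roughly 14.3 and 15.4 seconds, about a quarter of the 60-second alive_router_check_interval.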
      

            People

            Assignee:
            hornc Chris Horn
            Reporter:
            hornc Chris Horn
