  1. Lustre
  2. LU-10159

LNet: Ping issues with Multi-Rail routers talking to down-rev clients

Details

    • Bug
    • Resolution: Fixed
    • Major
    • None
    • Lustre 2.10.1
    • Servers/LNet routers: CentOS 7.4, MOFED 4.1-1.0.2.0, Lustre 2.10.1
      Clients: CentOS 6.8, MOFED ?, Lustre 2.5 (DDN ES 2.5.42.28-ddn14)
    • 3

    Description

      This case is being created on behalf of ANU/NCI.
      The filesystem is a new Lustre 2.10.1 ZFS-based system.

      The system has been built with Multi-Rail enabled LNet routers. Each LNet router has one bonded IB EDR interface on the 'filesystem' side and two EDR interfaces on the same o2ib network on the client side.
      e.g. :

      # lnetctl net show
      net:
          - net type: lo
            local NI(s):
              - nid: 0@lo
                status: up
          - net type: o2ib8
            local NI(s):
              - nid: 10.112.1.81@o2ib8
                status: up
                interfaces:
                    0: ibbond
          - net type: o2ib3
            local NI(s):
              - nid: 10.9.110.171@o2ib3
                status: up
                interfaces:
                    0: ib1
              - nid: 10.9.110.179@o2ib3
                status: up
                interfaces:
                    0: ib3
      
      

      On the clients, each router interface on the o2ib3 network is listed as a separate router. This was done because the clients are down-rev and do not support Multi-Rail.

      # lctl route_list
      net              o2ib8 hops 1 gw               10.9.110.180@o2ib3 up pri 0
      net              o2ib8 hops 1 gw               10.9.110.181@o2ib3 up pri 0
      net              o2ib8 hops 1 gw               10.9.110.184@o2ib3 up pri 0
      net              o2ib8 hops 1 gw               10.9.110.179@o2ib3 up pri 0
      net              o2ib8 hops 1 gw               10.9.110.186@o2ib3 up pri 0
      net              o2ib8 hops 1 gw               10.9.110.182@o2ib3 up pri 0
      net              o2ib8 hops 1 gw               10.9.110.185@o2ib3 up pri 0
      net              o2ib8 hops 1 gw               10.9.110.183@o2ib3 up pri 0
      
      

      The clients have been configured on the LNet routers as 'Multi-Rail: True' so that the Lustre code on the routers uses all available interfaces on the o2ib3 network. If the clients are not configured as Multi-Rail aware, the old code path, which chooses the first interface, is used. Because the clients are actually attached via an FDR fabric, both EDR interfaces on the client side of each LNet router must be used to reach the target performance. The same result could be achieved by running the LNet routers as VMs, but that carries performance penalties that native routers do not.

      Errors have been seen in the client logs, both on clients with the filesystem mounted and on clients without it mounted. All of these clients have the routes configured:

      2017-10-25 12:07:22 [59647.382295] LNetError: 291:0:(o2iblnd_cb.c:2638:kiblnd_rejected()) 10.9.110.185@o2ib3 rejected: consumer defined fatal error
      2017-10-25 12:07:22 [59647.395386] LNetError: 291:0:(o2iblnd_cb.c:2638:kiblnd_rejected()) Skipped 11 previous similar messages
      2017-10-25 12:17:34 [60259.426495] LNetError: 291:0:(o2iblnd_cb.c:2638:kiblnd_rejected()) 10.9.110.185@o2ib3 rejected: consumer defined fatal error
      2017-10-25 12:17:34 [60259.439527] LNetError: 291:0:(o2iblnd_cb.c:2638:kiblnd_rejected()) Skipped 11 previous similar messages
      2017-10-25 12:27:46 [60871.470741] LNetError: 291:0:(o2iblnd_cb.c:2638:kiblnd_rejected()) 10.9.110.185@o2ib3 rejected: consumer defined fatal error
      2017-10-25 12:27:46 [60871.483851] LNetError: 291:0:(o2iblnd_cb.c:2638:kiblnd_rejected()) Skipped 11 previous similar messages
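
To see which gateways are being rejected and how often, the log excerpt above can be tallied with a small script. This is only a sketch: the regular expressions are written against the line format shown in this ticket, and the names `REJECT_RE`, `SKIPPED_RE` and `tally_rejections` are made up for illustration. The "Skipped N" lines are counted separately because the log does not say which NID the skipped messages referred to.

```python
import re
from collections import defaultdict

# Patterns assume the exact console format quoted in this ticket.
REJECT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*kiblnd_rejected\(\)\) "
    r"(?P<nid>\S+@\S+) rejected: (?P<reason>.+)$"
)
SKIPPED_RE = re.compile(
    r"kiblnd_rejected\(\)\) Skipped (?P<n>\d+) previous similar messages"
)

def tally_rejections(text):
    """Return ({nid: [timestamps]}, total_skipped) for kiblnd_rejected lines."""
    per_nid = defaultdict(list)
    skipped = 0
    for raw in text.splitlines():
        line = raw.strip()
        m = REJECT_RE.search(line)
        if m:
            per_nid[m.group("nid")].append(m.group("ts"))
            continue
        s = SKIPPED_RE.search(line)
        if s:
            # Rate-limited messages; the NID they refer to is not logged.
            skipped += int(s.group("n"))
    return dict(per_nid), skipped

# Two lines copied from the client log above.
SAMPLE = """\
2017-10-25 12:07:22 [59647.382295] LNetError: 291:0:(o2iblnd_cb.c:2638:kiblnd_rejected()) 10.9.110.185@o2ib3 rejected: consumer defined fatal error
2017-10-25 12:07:22 [59647.395386] LNetError: 291:0:(o2iblnd_cb.c:2638:kiblnd_rejected()) Skipped 11 previous similar messages
"""

if __name__ == "__main__":
    per_nid, skipped = tally_rejections(SAMPLE)
    for nid, stamps in per_nid.items():
        print(nid, len(stamps), "logged rejection(s), first at", stamps[0])
    print("suppressed messages:", skipped)
```
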
      
      

      These errors coincide with loss of available lnet routers:

      net              o2ib8 hops 1 gw               10.9.110.180@o2ib3 down pri 0
      net              o2ib8 hops 1 gw               10.9.110.181@o2ib3 down pri 0
      net              o2ib8 hops 1 gw               10.9.110.184@o2ib3 down pri 0
      net              o2ib8 hops 1 gw               10.9.110.179@o2ib3 down pri 0
      net              o2ib8 hops 1 gw               10.9.110.186@o2ib3 down pri 0
      net              o2ib8 hops 1 gw               10.9.110.182@o2ib3 down pri 0
      net              o2ib8 hops 1 gw               10.9.110.185@o2ib3 down pri 0
      net              o2ib8 hops 1 gw               10.9.110.183@o2ib3 down pri 0
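
For monitoring this during testing, the `lctl route_list` output can be summarized per state. The following is a minimal sketch that assumes the column layout shown in this ticket; `ROUTE_RE`, `parse_routes` and `summarize` are illustrative names, and the sample text reuses one "up" line and one "down" line from the listings above.

```python
import re
from collections import Counter

# Matches one route line in the layout shown in this ticket.
ROUTE_RE = re.compile(
    r"net\s+(?P<net>\S+)\s+hops\s+(?P<hops>\d+)\s+gw\s+(?P<gw>\S+)\s+"
    r"(?P<state>up|down)\s+pri\s+(?P<pri>\d+)"
)

def parse_routes(text):
    """Return a list of dicts, one per recognised route line."""
    routes = []
    for line in text.splitlines():
        m = ROUTE_RE.search(line)
        if m:
            routes.append(m.groupdict())
    return routes

def summarize(routes):
    """Count routes per state and list the gateways currently down."""
    counts = Counter(r["state"] for r in routes)
    down = sorted(r["gw"] for r in routes if r["state"] == "down")
    return counts, down

# One line from each of the two listings in this ticket.
SAMPLE = """\
net              o2ib8 hops 1 gw               10.9.110.180@o2ib3 up pri 0
net              o2ib8 hops 1 gw               10.9.110.185@o2ib3 down pri 0
"""

if __name__ == "__main__":
    counts, down = summarize(parse_routes(SAMPLE))
    print("up:", counts["up"], "down:", counts["down"])
    print("gateways down:", ", ".join(down))
```

In practice the text would come from running `lctl route_list` on a client, for example through `subprocess.run`, rather than from a hard-coded sample.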
      
      

      The issue appears to be that replies to the LNet pings that clients send to verify their routes do not always come back from the NI that received the ping. This causes the down-rev clients to log the errors above and flag the route as down.
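
The suspected failure mode can be illustrated with a toy model. This is not LNet code and makes two loud assumptions that the ticket does not confirm: that a down-rev client treats a gateway as down when the response comes from any NID other than the one it pinged, and that a Multi-Rail router picks its source NI round-robin. The two router NIDs are taken from the `lnetctl net show` output above; all function names are hypothetical.

```python
from itertools import cycle

# The two o2ib3 NIs of one router, from the lnetctl output in this ticket.
ROUTER_NIDS = ["10.9.110.171@o2ib3", "10.9.110.179@o2ib3"]

def gateways_marked_down(pings, pick_source):
    """Return gateway NIDs a strict client would flag as down.

    pings:       sequence of gateway NIDs the client pings, in order.
    pick_source: callable(pinged_nid) -> NID the router responds from.
    Assumption: any mismatch between pinged NID and responding NID
    counts as a failed ping.
    """
    down = set()
    for pinged in pings:
        if pick_source(pinged) != pinged:
            down.add(pinged)
    return down

def same_ni(pinged):
    """Legacy-style behaviour: respond from the NI that was pinged."""
    return pinged

def make_round_robin(nids):
    """Assumed Multi-Rail behaviour: rotate over local NIs per message."""
    it = cycle(nids)
    return lambda pinged: next(it)

if __name__ == "__main__":
    # The client pings each gateway NID twice in a row.
    pings = [ROUTER_NIDS[0], ROUTER_NIDS[0], ROUTER_NIDS[1], ROUTER_NIDS[1]]
    print("same NI:    ", sorted(gateways_marked_down(pings, same_ni)))
    print("round robin:", sorted(
        gateways_marked_down(pings, make_round_robin(ROUTER_NIDS))))
```

Under these assumptions the same-NI policy never produces a mismatch, while the rotating policy eventually produces one for every gateway NID, which is consistent with all routes showing as down at once.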

      We are hoping a patch could be created to allow the above config (specifically the dual connections to the o2ib3 network) to be used in production.

      This system is not yet in production but is in final testing and there is some time pressure.

          People

            ashehata Amir Shehata (Inactive)
            mhaakddn Malcolm Haak - NCI (Inactive)