Lustre / LU-8225

router node: Failed to create FMR pool: -38

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.8.0
    • Severity: 3

    Description

      On a router node with both Omni-Path and Mellanox interfaces, starting lnet.service fails, and I see the following in the output of journalctl -xe:

      -- Unit lnet.service has begun starting up.
      kernel: LNet: Added LNI 192.168.128.187@o2ib18 [128/8192/0/180]
      kernel: fmr_pool: Device mlx5_0 does not support FMRs
      kernel: LNetError: 7963:0:(o2iblnd.c:1459:kiblnd_create_fmr_pool()) Failed to create FMR pool: -38
      kernel: LNetError: 7963:0:(o2iblnd.c:2096:kiblnd_net_init_pools()) Can't initialize FMR pool for CPT 0: -38
      kernel: LNetError: 7963:0:(o2iblnd.c:2895:kiblnd_startup()) Failed to initialize NI pools: -38
      kernel: LNetError: 105-4: Error -100 starting up LNI o2ib
      kernel: LNetError: 801:0:(o2iblnd_cb.c:2297:kiblnd_passive_connect()) Can't accept conn from 192.168.128.37@o2ib
      kernel: LNetError: 801:0:(o2iblnd_cb.c:2297:kiblnd_passive_connect()) Skipped 20 previous similar messages
      kernel: LNet: Removed LNI 192.168.128.187@o2ib18
      lnet[7960]: LNET configure error 100: Network is down
      systemd[1]: lnet.service: control process exited, code=exited status=1
      systemd[1]: Failed to start SYSV: Part of the lustre file system..
      

      I do not encounter this on the compute nodes, which have only Omni-Path, nor on the Lustre servers, which have only Mellanox.
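
The negative numbers in the log above are Linux errno values returned by the kernel. As a reading aid, here is a minimal Python sketch that extracts the reporting function and the error code from each line and maps the code to its Linux name. The lookup table is hardcoded for the two codes that appear in this report (38 and 100, using the Linux generic errno numbering); the helper name `decode_line` and the regular expressions are illustrative and not part of Lustre.

```python
import re

# Linux errno values that appear in this report (generic Linux numbering).
LINUX_ERRNO = {
    38: ("ENOSYS", "Function not implemented"),
    100: ("ENETDOWN", "Network is down"),
}

# Log lines copied from the journalctl output in the description.
LOG_LINES = [
    "kernel: LNetError: 7963:0:(o2iblnd.c:1459:kiblnd_create_fmr_pool()) Failed to create FMR pool: -38",
    "kernel: LNetError: 7963:0:(o2iblnd.c:2096:kiblnd_net_init_pools()) Can't initialize FMR pool for CPT 0: -38",
    "kernel: LNetError: 7963:0:(o2iblnd.c:2895:kiblnd_startup()) Failed to initialize NI pools: -38",
    "kernel: LNetError: 105-4: Error -100 starting up LNI o2ib",
]

# "(file.c:line:function())" as printed by the LNet error messages above.
FUNC_RE = re.compile(r"\([\w.]+:\d+:(\w+)\(\)\)")
# A negative number that is not glued to a preceding word, so "105-4" is skipped.
CODE_RE = re.compile(r"(?<![\w.])-(\d+)\b")


def decode_line(line):
    """Return (function, code, errno_name, errno_text), or None if no error code is found."""
    code_match = CODE_RE.search(line)
    if code_match is None:
        return None
    code = int(code_match.group(1))
    func_match = FUNC_RE.search(line)
    func = func_match.group(1) if func_match else None
    name, text = LINUX_ERRNO.get(code, ("unknown", "unknown"))
    return func, code, name, text


if __name__ == "__main__":
    for line in LOG_LINES:
        print(decode_line(line))
```

Read this way, the first three messages show the same -38 (ENOSYS) propagating up from kiblnd_create_fmr_pool() to kiblnd_startup(), and the -100 (ENETDOWN) matches the "Network is down" text that lnet prints afterwards.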

      Lustre 2.8 ships with /etc/modprobe.d/ko2iblnd.conf, which contains:

      alias ko2iblnd-opa ko2iblnd
      options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1
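
To make it easier to see which of these shipped settings relate to the failing FMR pool, here is a small Python sketch that parses the two lines above and picks out the FMR-related parameters. Treating `map_on_demand` and the `fmr_*` options as the relevant ones is an assumption made for illustration; this report does not itself establish the root cause. The function names `parse_modprobe` and `fmr_related` are hypothetical helpers, not Lustre or kmod tools.

```python
# Contents of /etc/modprobe.d/ko2iblnd.conf as quoted in the description.
CONF = """\
alias ko2iblnd-opa ko2iblnd
options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1
"""


def parse_modprobe(text):
    """Parse 'alias' and 'options' lines into two dictionaries."""
    aliases, options = {}, {}
    for raw in text.splitlines():
        fields = raw.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        if fields[0] == "alias" and len(fields) == 3:
            aliases[fields[1]] = fields[2]
        elif fields[0] == "options" and len(fields) >= 2:
            params = options.setdefault(fields[1], {})
            for item in fields[2:]:
                key, _, value = item.partition("=")
                params[key] = value
    return aliases, options


def fmr_related(params):
    """Select parameters assumed (for illustration) to affect FMR pool setup."""
    return {k: v for k, v in params.items()
            if k == "map_on_demand" or k.startswith("fmr_")}


if __name__ == "__main__":
    aliases, options = parse_modprobe(CONF)
    print(aliases)
    print(fmr_related(options["ko2iblnd-opa"]))
```

The options are attached to the alias ko2iblnd-opa, which resolves to the single ko2iblnd module. On a node with both interface types, checking whether these settings also reach the mlx5_0 device would be a reasonable next diagnostic step.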
      

          People

            dmiter Dmitry Eremin (Inactive)
            ofaaland Olaf Faaland