Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version: Lustre 2.10.0
    • Affects Versions: Lustre 2.7.0, Lustre 2.8.0, Lustre 2.9.0
    • 3
    • 16043

    Description

      Got an IOR failure on the soak cluster with the following errors:

      Oct  7 21:54:01 lola-23 kernel: LNetError: 3613:0:(o2iblnd_cb.c:1134:kiblnd_init_rdma()) RDMA too fragmented for 192.168.1.115@o2ib100 (256): 128/256 src 128/256 dst frags
      Oct  7 21:54:01 lola-23 kernel: LNetError: 3618:0:(o2iblnd_cb.c:428:kiblnd_handle_rx()) Can't setup rdma for PUT to 192.168.1.114@o2ib100: -90
      Oct  7 21:54:01 lola-23 kernel: LNetError: 3618:0:(o2iblnd_cb.c:428:kiblnd_handle_rx()) Skipped 7 previous similar messages
      

      Liang told me that this is a known issue with routing. That said, the IOR process is not killable and the only option is to reboot the client node. We should at least fail "gracefully" by returning the error to the application.

      Attachments

        Issue Links

          Activity

            [LU-5718] RDMA too fragmented with router
            spitzcor Cory Spitz added a comment -

            LUDOC-378 is linked to this issue.

            spitzcor Cory Spitz added a comment -

            Looks like we should have opened a LUDOC ticket to document wrq_sge.


            srcc Stanford Research Computing Center added a comment -

            Ah! Thanks for the clarification, Chris and Doug! I was a bit lost as the parameters changed along with the work done in this ticket. We'll test this right away.
            All the best,
            Stephane


            doug Doug Oucharek (Inactive) added a comment -

            Your router needs wrq_sge=2.

            hornc Chris Horn added a comment -

            You need to set wrq_sge=2 on the routers, too.

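For anyone landing here with the same symptom: wrq_sge is a ko2iblnd module parameter, so a minimal sketch of applying it persistently on a router looks like the following (paths assume a standard modprobe.d layout, and the parameter only exists on builds carrying the patches from this ticket and LU-9420 — verify against your distribution):

```shell
# Persist the setting; it takes effect on the next ko2iblnd module load.
echo "options ko2iblnd wrq_sge=2" > /etc/modprobe.d/ko2iblnd.conf

# Verify the running value once ko2iblnd is loaded.
cat /sys/module/ko2iblnd/parameters/wrq_sge
```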

            sthiell Stephane Thiell added a comment -

            Hi,
            Could you please explain what is required to make the patches that landed work? We have tried 2.9 FE + patches from both LU-5718 and LU-9420 but are still seeing the problem on the routers. We have set wrq_sge=2 on the clients and left the default wrq_sge=1 on the routers. We were not able to patch the servers at the moment (running IEEL3); see DELL-221.

            on the router with wrq_sge=1 (10.210.34.213@o2ib1 is an OSS not patched):

            [ 1111.504575] LNetError: 8688:0:(o2iblnd_cb.c:1093:kiblnd_init_rdma()) RDMA has too many fragments for peer 10.210.34.213@o2ib1 (256), src idx/frags: 128/147 dst idx/frags: 128/147
            [ 1111.522352] LNetError: 8688:0:(o2iblnd_cb.c:430:kiblnd_handle_rx()) Can't setup rdma for PUT to 10.210.34.213@o2ib1: -90
            

            Clients and routers are using mlx5, servers are using mlx4.

            Thanks,
            Stephane


            dmiter Dmitry Eremin (Inactive) added a comment -

            Uff. It looks like I was lucky to use the build without this fix.


            doug Doug Oucharek (Inactive) added a comment -

            This was addressed by a patch to LU-9420. I would have pulled this patch to fix it under this ticket, but the patch took 2 years to land and I was not about to pull it for fear it would take another 2 years to re-land :^(.


            dmiter Dmitry Eremin (Inactive) added a comment -

            After the last patch landed I got the following error:

            [4020251.265904] LNetError: 95052:0:(o2iblnd_cb.c:1086:kiblnd_init_rdma()) RDMA is too large for peer 192.168.213.235@o2ib (131072), src size: 1048576 dst size: 1048576
            [4020251.265941] LNetError: 95050:0:(o2iblnd_cb.c:1720:kiblnd_reply()) Can't setup rdma for GET from 192.168.213.235@o2ib: -90
            [4020251.265948] LustreError: 95050:0:(events.c:199:client_bulk_callback()) event type 1, status -5, desc ffff8816e0754c00
            ...
            [4020251.267492] Lustre: 95098:0:(client.c:2115:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1493833318/real 1493833318]  req@ffff8817e9e48000 x1566397691863184/t0(0) o4->
            [4020251.267503] Lustre: nvmelfs-OST000b-osc-ffff881c8e362000: Connection to nvmelfs-OST000b (at 192.168.213.236@o2ib) was lost; in progress operations using this service will wait for recovery to complete
            ...
            [4020251.267965] LustreError: 95050:0:(events.c:199:client_bulk_callback()) event type 1, status -5, desc ffff880223361400
            [4020251.268058] Lustre: nvmelfs-OST000b-osc-ffff881c8e362000: Connection restored to 192.168.213.236@o2ib (at 192.168.213.236@o2ib)
            ...
            [4020256.133400] LNetError: 95052:0:(o2iblnd_cb.c:1086:kiblnd_init_rdma()) RDMA is too large for peer 192.168.213.235@o2ib (131072), src size: 1048576 dst size: 1048576
            [4020256.133561] LNetError: 95049:0:(o2iblnd_cb.c:1720:kiblnd_reply()) Can't setup rdma for GET from 192.168.213.235@o2ib: -90
            [4020256.133564] LNetError: 95049:0:(o2iblnd_cb.c:1720:kiblnd_reply()) Skipped 159 previous similar messages
            [4020256.133569] LustreError: 95049:0:(events.c:199:client_bulk_callback()) event type 1, status -5, desc ffff88192932fe00
            [4020256.133630] Lustre: 95125:0:(client.c:2115:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1493833323/real 1493833323]  req@ffff882031360300 x1566397691866144/t0(0) o4->
            [4020256.133634] Lustre: 95125:0:(client.c:2115:ptlrpc_expire_one_request()) Skipped 39 previous similar messages
            [4020256.133654] Lustre: nvmelfs-OST000e-osc-ffff881c8e362000: Connection to nvmelfs-OST000e (at 192.168.213.235@o2ib) was lost; in progress operations using this service will wait for recovery to complete
            [4020256.133656] Lustre: Skipped 39 previous similar messages
            [4020256.134200] Lustre: nvmelfs-OST000e-osc-ffff881c8e362000: Connection restored to 192.168.213.235@o2ib (at 192.168.213.235@o2ib)
            [4020256.134202] Lustre: Skipped 39 previous similar messages
            

            The system is partially working. I'm able to see the list of files and open small files, but large bulk transfers don't work.


            simmonsja James A Simmons added a comment -

            Just to let you know, I'm in the process of testing this patch and the latest patch seems to be holding up. Good work, Doug.


            People

              Assignee: doug Doug Oucharek (Inactive)
              Reporter: johann Johann Lombardi (Inactive)
              Votes: 0
              Watchers: 39

              Dates

                Created:
                Updated:
                Resolved: