  1. Lustre
  2. LU-8368

Use kgnilnd_vzalloc() for copy buffer allocation

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.9.0
    • None
    • 3

    Description

      Description of the issue from James Shimek:

      The node was essentially out of memory. I didn't see any allocation failures, but running that close to the physical memory limit probably doesn't help it complete requests on time, especially since kgnilnd is a bit finicky about memory. And it looks like, instead of failing the IO, we spin indefinitely because the vmalloc in the RDMA path is not able to fail.

      crash> kmem -i
                      PAGES        TOTAL      PERCENTAGE
         TOTAL MEM  8247725      31.5 GB         ----
              FREE    72580     283.5 MB    0% of TOTAL MEM
              USED  8175145      31.2 GB   99% of TOTAL MEM
            SHARED  5493193        21 GB   66% of TOTAL MEM
           BUFFERS        0            0    0% of TOTAL MEM
            CACHED  5881794      22.4 GB   71% of TOTAL MEM
              SLAB   384907       1.5 GB    4% of TOTAL MEM

        TOTAL SWAP        0            0         ----
         SWAP USED        0            0  100% of TOTAL SWAP
         SWAP FREE        0            0    0% of TOTAL SWAP

      COMMIT LIMIT  4123862      15.7 GB         ----
         COMMITTED   515156         2 GB   12% of TOTAL LIMIT

      Most of the memory seems to be in use by the kdwfs system.
      crash> sys
            KERNEL: service_cle_6.1.DV00-build6.1.70DV_sles_12-created20160324.cpio/DEFAULT/boot/vmlinux-3.12.51-52.39.1_1.0000.9086-cray_ari_s
          DUMPFILE: c0-0c1s1n2-1603301901.cdump [PARTIAL DUMP]
              CPUS: 16
              DATE: Wed Mar 30 19:00:49 2016
            UPTIME: 1 days, 10:15:06
      LOAD AVERAGE: 84.00, 83.84, 69.30
             TASKS: 1398
          NODENAME: nid00070
           RELEASE: 3.12.51-52.39.1_1.0000.9086-cray_ari_s
           VERSION: #1 SMP Thu Mar 10 22:13:18 UTC 2016
           MACHINE: x86_64 (2600 Mhz)
            MEMORY: 32 GB
             PANIC: ""
      crash> bt 8715
      PID: 8715 TASK: ffff880442bbd180 CPU: 6 COMMAND: "kgnilnd_sd_04"
       #0 [ffff8800457734c8] schedule at ffffffff815a39e5
       #1 [ffff880045773548] schedule_timeout at ffffffff815a2391
       #2 [ffff8800457735e0] __down_common at ffffffff815a50a6
       #3 [ffff880045773640] __down at ffffffff815a5116
       #4 [ffff880045773650] down at ffffffff81088581
       #5 [ffff880045773670] dvsipc_send_ipc_request_common at ffffffffa053e5e5 [dvsipc]
       #6 [ffff8800457736d0] dvsipc_send_ipc_request at ffffffffa053edd4 [dvsipc]
       #7 [ffff8800457736e0] send_ipc_request at ffffffffa053767d [dvsipc]
       #8 [ffff880045773708] dvsnet_send_ipc_request at ffffffffa0c3c2ba [dvsnet_if]
       #9 [ffff880045773720] kdwfs_send_transaction at ffffffffa03ad268 [kdwfs]
      #10 [ffff880045773758] kdwfs_send_transaction_retry at ffffffffa03ade23 [kdwfs]
      #11 [ffff880045773780] kdwfs_send_transaction_namespace_retry at ffffffffa03af54b [kdwfs]
      #12 [ffff8800457737a0] kdwfs_send_unlink at ffffffffa03afddd [kdwfs]
      #13 [ffff8800457737d8] kdwfs_evict_inode at ffffffffa03a78df [kdwfs]
      #14 [ffff8800457737f8] evict at ffffffff81194e3c
      #15 [ffff880045773820] iput at ffffffff81195685
      #16 [ffff880045773850] __dentry_kill at ffffffff81191038
      #17 [ffff880045773878] shrink_dentry_list at ffffffff81191393
      #18 [ffff8800457738a8] prune_dcache_sb at ffffffff81192827
      #19 [ffff8800457738e0] super_cache_scan at ffffffff8117e9a6
      #20 [ffff880045773928] shrink_slab_node at ffffffff8112c6ac
      #21 [ffff8800457739b8] shrink_slab at ffffffff8112d355
      #22 [ffff880045773a00] do_try_to_free_pages at ffffffff8113071f
      #23 [ffff880045773aa0] try_to_free_pages at ffffffff8113093f
      #24 [ffff880045773b10] __alloc_pages_nodemask at ffffffff81124ba5
      #25 [ffff880045773c38] alloc_pages_current at ffffffff8116227a
      #26 [ffff880045773c80] __vmalloc_node_range at ffffffff8115430a
      #27 [ffff880045773cf0] vmalloc at ffffffff8115463b
      #28 [ffff880045773d10] kgnilnd_rdma at ffffffffa05d6bed [kgnilnd]
      #29 [ffff880045773d98] kgnilnd_send_mapped_tx at ffffffffa05dd64e [kgnilnd]
      #30 [ffff880045773dd0] kgnilnd_process_mapped_tx at ffffffffa05e17d6 [kgnilnd]
      #31 [ffff880045773e60] kgnilnd_scheduler at ffffffffa05e383a [kgnilnd]
      #32 [ffff880045773ed0] kthread at ffffffff81069ca0
      #33 [ffff880045773f50] ret_from_fork at ffffffff815ae788

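      This stack trace shows that kgnilnd is trying to send, but it's using a vmalloc in the send path, probably because the buffer to be sent is unaligned in some manner. So the vmalloc needs to be able to fail, but we aren't using the __vmalloc variant that does not try to do IO, so we get stuck: the allocation enters memory reclaim (frames #23 to #14), reclaim evicts a kdwfs inode, and that eviction blocks waiting on a network request of its own (frames #13 to #0).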
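
To make the failure mode above concrete, here is a small userspace model of it. This is only a sketch under stated assumptions: every name in it (`toy_node`, `toy_alloc`, `toy_send`, the `TOY_*` constants) is hypothetical, it does not use any kernel API, and it is not the actual change made for this ticket. It only illustrates why an allocation in the send path should be allowed to fail rather than enter reclaim that itself needs the network.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flags loosely modelling allocation behaviour; not kernel GFP bits. */
enum {
    TOY_ALLOW_IO = 1 << 0, /* reclaim may issue IO (e.g. evict inodes over the network) */
    TOY_ZERO     = 1 << 1  /* return zero-filled memory */
};

/* Possible outcomes of a modelled send. */
enum { TOY_SEND_OK = 0, TOY_SEND_FAILED = -1, TOY_SEND_STUCK = -2 };

struct toy_node {
    size_t free_bytes;        /* memory available without reclaim */
    size_t reclaimable_bytes; /* memory that only IO-issuing reclaim can recover */
    bool   in_send_path;      /* true while the network sender is allocating */
    bool   stuck;             /* set when reclaim would need the blocked sender */
};

/* Allocate from the modelled node; returns NULL when the request cannot be met. */
void *toy_alloc(struct toy_node *n, size_t size, unsigned flags)
{
    assert(size > 0);
    if (size > n->free_bytes) {
        if (!(flags & TOY_ALLOW_IO))
            return NULL; /* fail fast: the caller can fail or retry the IO */
        if (n->in_send_path) {
            /* Reclaim needs the network, but the sender is waiting on this
             * allocation. A real system would hang here; the model just
             * records the condition and gives up. */
            n->stuck = true;
            return NULL;
        }
        n->free_bytes += n->reclaimable_bytes;
        n->reclaimable_bytes = 0;
        if (size > n->free_bytes)
            return NULL;
    }
    void *p = (flags & TOY_ZERO) ? calloc(1, size) : malloc(size);
    if (p != NULL)
        n->free_bytes -= size;
    return p;
}

/* Model a send that needs a temporary copy buffer of len bytes. */
int toy_send(struct toy_node *n, size_t len, unsigned flags)
{
    n->in_send_path = true;
    void *buf = toy_alloc(n, len, flags);
    n->in_send_path = false;

    if (n->stuck)
        return TOY_SEND_STUCK;
    if (buf == NULL)
        return TOY_SEND_FAILED;

    free(buf);               /* the copy buffer is released after the send */
    n->free_bytes += len;
    return TOY_SEND_OK;
}
```

In this model, a send on a nearly full node that passes `TOY_ALLOW_IO` ends in `TOY_SEND_STUCK`, while the same send without that flag returns `TOY_SEND_FAILED`, which the caller can handle.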

          People

            wc-triage WC Triage
            hornc Chris Horn