Details
Type: Bug
Resolution: Won't Fix
Priority: Critical
Fix Version/s: None
Affects Version/s: Lustre 2.5.1
Component/s: None
Environment: 2.5.1 based Lustre code
Severity: 3
Rank: 17504
Description
While investigating an OOM on a node, we found a large number of allocations of size 532480 and 266240 bytes.
An example vm_struct for a memory region of size 266240:
crash> vm_struct ffff880019c542c0
struct vm_struct {
next = 0xffff880588f29900,
addr = 0xffffc904a626d000,
size = 266240,
flags = 4,
pages = 0x0,
nr_pages = 0,
phys_addr = 0,
caller = 0xffffffffa00b7136 <mlx4_buf_alloc+870>
}
99% of the memory regions of size 266240 and 532480 have caller = 0xffffffffa00b7136 <mlx4_buf_alloc+870>:
31042 of 31296 regions.
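For scale (assuming each of those 31042 regions is at least 266240 bytes, which both observed sizes satisfy), this caller alone pins on the order of 7.7 GiB of vmalloc space:
>>> 31042 * 266240
8264622080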
I also found strange backtraces in the kernel:
PID: 83859 TASK: ffff8807d64ca040 CPU: 0 COMMAND: "kiblnd_connd"
#0 [ffff8807b2835a90] schedule at ffffffff815253c0
#1 [ffff8807b2835b58] schedule_timeout at ffffffff815262a5
#2 [ffff8807b2835c08] wait_for_common at ffffffff81525f23
#3 [ffff8807b2835c98] wait_for_completion at ffffffff8152603d
#4 [ffff8807b2835ca8] synchronize_sched at ffffffff81096e88
#5 [ffff8807b2835cf8] mlx4_cq_free at ffffffffa00bf188 [mlx4_core]
#6 [ffff8807b2835d68] mlx4_ib_destroy_cq at ffffffffa04725f5 [mlx4_ib]
#7 [ffff8807b2835d88] ib_destroy_cq at ffffffffa043de99 [ib_core]
#8 [ffff8807b2835d98] kiblnd_destroy_conn at ffffffffa0acbafc [ko2iblnd]
#9 [ffff8807b2835dd8] kiblnd_connd at ffffffffa0ad5fe1 [ko2iblnd]
#10 [ffff8807b2835ee8] kthread at ffffffff8109ac66
#11 [ffff8807b2835f48] kernel_thread at ffffffff8100c20a
So the thread is blocked on something while destroying an IB connection.
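For context, frame #4 looks like the expensive part: synchronize_sched() does not return until an RCU-sched grace period has elapsed on all CPUs, so every CQ teardown pays that latency, and the CQ buffer (the mlx4_buf_alloc region seen above) presumably cannot be released until the wait finishes. A minimal illustration of the wait involved, not the actual mlx4 code:
    /* Blocks the caller until every CPU has passed through a
     * scheduling point; in this dump that is the ~2s block time
     * measured for the kiblnd_connd task below.                */
    synchronize_sched();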
Inspecting the task:
crash> p ((struct task_struct *)0xffff8807d64ca040)->se.cfs_rq->rq->clock
$25 = 230339336880160
crash> p ((struct task_struct *)0xffff8807d64ca040)->se.block_start
$26 = 230337329685261
>>> (230339336880160-230337329685261)/10**9
2
So the task has already been blocked for about 2 seconds (both values are in nanoseconds). But more interesting is the o2iblnd statistic I found:
crash> kib_net 0xffff8808325e9dc0
struct kib_net {
  ibn_list = {
    next = 0xffff8807b40a2f40,
    prev = 0xffff8807b40a2f40
  },
  ibn_incarnation = 1423478059211439,
  ibn_init = 2,
  ibn_shutdown = 0,
  ibn_npeers = {
    counter = 31042
  },
  ibn_nconns = {
    counter = 31041
  },
  ...
So ~31k peers - but the tests run on a cluster with 14 real clients and 5 server nodes, so no more than about 20 connections should exist.
So where are all these connections?
crash> p &kiblnd_data.kib_connd_zombies
$7 = (struct list_head *) 0xffffffffa0ae7e70 <kiblnd_data+112>
crash> list -H 0xffffffffa0ae7e70 -o kib_conn.ibc_list | wc -l
31030
So all the memory is consumed by zombie connections, each of which needs more than 2s to destroy.
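For reference, a rough sketch of why the zombies pile up, as I understand the flow from the backtrace above (simplified, not the actual ko2iblnd source; the kib_connd_lock name is assumed): kiblnd_connd drains kib_connd_zombies serially, so when each destroy blocks for ~2s in the CQ teardown, connections queued faster than ~0.5 per second can only accumulate.
    /* Simplified sketch of the connd zombie handling; lock and helper
     * names are illustrative, not copied from the ko2iblnd source.   */
    for (;;) {
            struct kib_conn *conn;

            spin_lock(&kiblnd_data.kib_connd_lock);
            while (!list_empty(&kiblnd_data.kib_connd_zombies)) {
                    conn = list_first_entry(&kiblnd_data.kib_connd_zombies,
                                            struct kib_conn, ibc_list);
                    list_del(&conn->ibc_list);
                    spin_unlock(&kiblnd_data.kib_connd_lock);

                    /* Blocks ~2s per connection in ib_destroy_cq() ->
                     * mlx4_cq_free() -> synchronize_sched(), while the
                     * connection's mlx4 buffers stay allocated.       */
                    kiblnd_destroy_conn(conn);

                    spin_lock(&kiblnd_data.kib_connd_lock);
            }
            spin_unlock(&kiblnd_data.kib_connd_lock);
            /* ... sleep until more zombies are queued ... */
    }
At that rate the backlog never drains: the 31030 zombies already on the list would need roughly 31030 × 2s ≈ 17 hours to destroy, while the vmalloc regions they hold keep growing until the node OOMs.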