Details
- Type: Bug
- Resolution: Fixed
- Priority: Major
- Affects Version/s: Lustre 2.1.0
- Environment: lustre 2.1.0-21chaos (github.com/chaos/lustre)
- 3
- 4697
Description
We have an OSS where two OST I/O threads are using significant CPU time and, I suspect, not making real progress:
29703 root      20   0    0    0    0 R 40.7  0.0 309:55.91 ll_ost_io_375
14028 root      20   0    0    0    0 R 39.7  0.0 311:25.93 ll_ost_io_75
When I can catch them between active cycles on the CPU, they show the following backtraces from "crash":
crash> foreach 14028 29703 bt
PID: 14028  TASK: ffff88041345ab00  CPU: 7   COMMAND: "ll_ost_io_75"
 #0 [ffff88041270f390] schedule at ffffffff814ee380
 #1 [ffff88041270f458] schedule_timeout at ffffffff814ef235
 #2 [ffff88041270f508] __down at ffffffff814f0152
 #3 [ffff88041270f558] down at ffffffff810968b1
 #4 [ffff88041270f588] ldlm_pools_shrink at ffffffffa05c9fd8 [ptlrpc]
 #5 [ffff88041270f5d8] ldlm_pools_srv_shrink at ffffffffa05ca213 [ptlrpc]
 #6 [ffff88041270f5e8] shrink_slab at ffffffff81129cba
 #7 [ffff88041270f648] zone_reclaim at ffffffff8112c364
 #8 [ffff88041270f728] get_page_from_freelist at ffffffff81122934
 #9 [ffff88041270f848] __alloc_pages_nodemask at ffffffff81123bb1
#10 [ffff88041270f968] kmem_getpages at ffffffff8115e0d2
#11 [ffff88041270f998] cache_grow at ffffffff8115e73f
#12 [ffff88041270fa08] cache_alloc_refill at ffffffff8115e992
#13 [ffff88041270fa78] __kmalloc at ffffffff8115f6b9
#14 [ffff88041270fac8] cfs_alloc at ffffffffa032a863 [libcfs]
#15 [ffff88041270faf8] ptlrpc_prep_bulk_exp at ffffffffa05d286d [ptlrpc]
#16 [ffff88041270fb58] ost_brw_write at ffffffffa0a459c1 [ost]
#17 [ffff88041270fcc8] ost_handle at ffffffffa0a49a19 [ost]
#18 [ffff88041270fdf8] ptlrpc_main at ffffffffa05eb181 [ptlrpc]
#19 [ffff88041270ff48] kernel_thread at ffffffff8100c14a

PID: 29703  TASK: ffff88030b1be100  CPU: 5   COMMAND: "ll_ost_io_375"
 #0 [ffff8802e04771f0] schedule at ffffffff814ee380
 #1 [ffff8802e04772b8] schedule_timeout at ffffffff814ef235
 #2 [ffff8802e0477368] __down at ffffffff814f0152
 #3 [ffff8802e04773b8] down at ffffffff810968b1
 #4 [ffff8802e04773e8] ldlm_pools_shrink at ffffffffa05c9fd8 [ptlrpc]
 #5 [ffff8802e0477438] ldlm_pools_srv_shrink at ffffffffa05ca213 [ptlrpc]
 #6 [ffff8802e0477448] shrink_slab at ffffffff81129cba
 #7 [ffff8802e04774a8] zone_reclaim at ffffffff8112c364
 #8 [ffff8802e0477588] get_page_from_freelist at ffffffff81122934
 #9 [ffff8802e04776a8] __alloc_pages_nodemask at ffffffff81123bb1
#10 [ffff8802e04777c8] kmem_getpages at ffffffff8115e0d2
#11 [ffff8802e04777f8] cache_grow at ffffffff8115e73f
#12 [ffff8802e0477868] cache_alloc_refill at ffffffff8115e992
#13 [ffff8802e04778d8] __kmalloc at ffffffff8115f6b9
#14 [ffff8802e0477928] cfs_alloc at ffffffffa032a863 [libcfs]
#15 [ffff8802e0477958] filter_alloc_iobuf at ffffffffa0aa799e [obdfilter]
#16 [ffff8802e04779a8] filter_iobuf_get at ffffffffa0a89c0e [obdfilter]
#17 [ffff8802e04779c8] filter_preprw_read at ffffffffa0a9ff7e [obdfilter]
#18 [ffff8802e0477af8] filter_preprw at ffffffffa0aa09b0 [obdfilter]
#19 [ffff8802e0477b38] obd_preprw at ffffffffa0a3e33e [ost]
#20 [ffff8802e0477ba8] ost_brw_read at ffffffffa0a44315 [ost]
#21 [ffff8802e0477cc8] ost_handle at ffffffffa0a48d27 [ost]
#22 [ffff8802e0477df8] ptlrpc_main at ffffffffa05eb181 [ptlrpc]
#23 [ffff8802e0477f48] kernel_thread at ffffffff8100c14a
So it appears that they are failing to make progress in the shrinker code.
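To make the pattern concrete, below is a minimal userspace sketch (plain pthreads, not Lustre code; the thread and lock names are invented) of what the two stacks suggest: every thread that drops into direct reclaim is routed through the same registered shrinker, and the backtraces show that shrinker blocking in down(), i.e. every caller funnels through one semaphore inside ldlm_pools_shrink(). With enough concurrent allocators, the threads spend their time queued on that lock and re-entering reclaim rather than finishing the allocations they came for.

/*
 * Userspace analogy only -- not Lustre code.  One global lock stands in
 * for the semaphore taken in ldlm_pools_shrink(); every "I/O thread"
 * that needs memory is forced through it, mimicking the convoy in the
 * backtraces above.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define NR_IO_THREADS 8

static pthread_mutex_t pools_sem = PTHREAD_MUTEX_INITIALIZER;
static unsigned long shrink_calls;

/* Plays the role of ldlm_pools_srv_shrink(): fully serialized. */
static void fake_pools_shrink(void)
{
        pthread_mutex_lock(&pools_sem);
        usleep(1000);                   /* pretend to scan the pools */
        shrink_calls++;
        pthread_mutex_unlock(&pools_sem);
}

/* Plays the role of an allocation that drops into direct reclaim:
 * __alloc_pages -> zone_reclaim -> shrink_slab -> shrinker callback. */
static void *fake_ost_io_thread(void *arg)
{
        int i;

        (void)arg;
        for (i = 0; i < 100; i++) {
                fake_pools_shrink();
                free(malloc(16 * 1024));  /* the allocation we actually wanted */
        }
        return NULL;
}

int main(void)
{
        pthread_t tid[NR_IO_THREADS];
        int i;

        for (i = 0; i < NR_IO_THREADS; i++)
                pthread_create(&tid[i], NULL, fake_ost_io_thread, NULL);
        for (i = 0; i < NR_IO_THREADS; i++)
                pthread_join(tid[i], NULL);

        printf("shrinker entered %lu times by %d threads\n",
               shrink_calls, NR_IO_THREADS);
        return 0;
}

The toy version (gcc -pthread) completes, of course; the point is only to show the single choke point that all of the allocating service threads share.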
The immediate symptom that users see is a client that is unable to reconnect to an OST on this OSS:
Lustre: 15630:0:(ldlm_lib.c:852:target_handle_connect()) lsd-OST01bb: refuse reconnection from 59381921-e835-2c32-ab83-d3f9e3fbad7f@192.168.120.181@o2ib3 to 0xffff88040d93ec00/1
(As an aside, we really need to improve these console messages. Brian had a set of patches, attached to bugzilla bug 16909, that greatly improved them and that LLNL carried for quite a while, but Sun/Oracle never landed them. I'll work on porting them forward and getting them into gerrit when I have a chance.)
That message means that one RPC is still outstanding for that export, and the client will not be allowed to reconnect until it is removed. We have seen the refused connections continue for hours, and I suspect the situation will never resolve itself without intervention. My assumption is that one of the runaway ll_ost_io threads is to blame for the outstanding RPC.
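For context on what that check is (and why the stuck I/O thread matters), here is a simplified standalone sketch of the idea behind the refusal in target_handle_connect(): the export keeps track of RPCs still being processed on its behalf, and a reconnect is rejected while that count is non-zero. The struct and field names below are my own illustration, not the actual Lustre source.

/*
 * Illustrative sketch only -- not target_handle_connect() itself.
 * Struct and field names are invented for the example.
 */
#include <stdbool.h>
#include <stdio.h>

struct fake_export {
        const char *client_uuid;
        int         rpcs_in_flight;   /* RPCs still being handled for this export */
};

static bool allow_reconnect(const struct fake_export *exp)
{
        if (exp->rpcs_in_flight > 0) {
                printf("refuse reconnection from %s: %d RPC(s) still in flight\n",
                       exp->client_uuid, exp->rpcs_in_flight);
                return false;
        }
        return true;
}

int main(void)
{
        struct fake_export exp = {
                .client_uuid    = "59381921-e835-2c32-ab83-d3f9e3fbad7f",
                .rpcs_in_flight = 1,   /* held by the stuck ll_ost_io thread */
        };

        allow_reconnect(&exp);         /* refused while the RPC is outstanding */
        exp.rpcs_in_flight = 0;        /* once the RPC finally completes...    */
        allow_reconnect(&exp);         /* ...the reconnect would go through    */
        return 0;
}

In our case the count presumably never drops, because the thread holding the RPC is parked in the shrinker path shown above.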
Memory does not appear to be particularly tight on the node at this time:
> cat /proc/meminfo
MemTotal:       16202948 kB
MemFree:         2805328 kB
Buffers:         7522692 kB
Cached:          1631292 kB
SwapCached:            0 kB
Active:          6426516 kB
Inactive:        2743512 kB
Active(anon):      58184 kB
Inactive(anon):   331836 kB
Active(file):    6368332 kB
Inactive(file):  2411676 kB
Unevictable:       31840 kB
Mlocked:           31840 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:              7844 kB
Writeback:           344 kB
AnonPages:         48572 kB
Mapped:            12160 kB
Shmem:            367384 kB
Slab:            3405396 kB
SReclaimable:     965160 kB
SUnreclaim:      2440236 kB
KernelStack:       13216 kB
PageTables:         4556 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     8101472 kB
Committed_AS:     584152 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      360960 kB
VmallocChunk:   34350502472 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        9856 kB
DirectMap2M:     2086912 kB
DirectMap1G:    14680064 kB
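One related thing worth double-checking, since the stacks go through zone_reclaim() despite ~2.8 GB of MemFree: on these RHEL6-era kernels that path is normally only taken when vm.zone_reclaim_mode is non-zero, so a single NUMA node can be pushed into reclaim (and into the ldlm shrinker) while the machine as a whole still has free pages. A trivial reader for that sysctl, nothing Lustre-specific:

/* Read /proc/sys/vm/zone_reclaim_mode and report whether per-node
 * (NUMA) reclaim is enabled.  Purely a convenience check. */
#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/proc/sys/vm/zone_reclaim_mode", "r");
        int mode;

        if (f == NULL) {
                perror("/proc/sys/vm/zone_reclaim_mode");
                return 1;
        }
        if (fscanf(f, "%d", &mode) != 1) {
                fclose(f);
                fprintf(stderr, "could not parse zone_reclaim_mode\n");
                return 1;
        }
        fclose(f);

        printf("vm.zone_reclaim_mode = %d (%s)\n", mode,
               mode != 0 ? "per-node reclaim enabled" : "disabled");
        return 0;
}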
Attachments
Issue Links
- is related to:
  - LU-7266 Fix LDLM pool to make LRUR working properly (Open)
  - LU-2924 shrink ldlm_poold workload (Resolved)
- Trackbacks:
  - Changelog 2.1: Changes from version 2.1.1 to version 2.1.2. Server support for kernels: 2.6.18-308.4.1.el5 (RHEL5), 2.6.32-220.17.1.el6 (RHEL6). Client support for unpatched kernels: 2.6.18-308.4.1.el5 (RHEL5), 2.6.32-220.17.1....
  - Changelog 2.2: version 2.2.0. Support for networks: o2iblnd (OFED 1.5.4). Server support for kernels: 2.6.32-220.4.2.el6 (RHEL6). Client support for unpatched kernels: 2.6.18-274.18.1.el5 (RHEL5), 2.6.32-220.4.2.el6 (RHEL6), 2.6.32.360....