LU-1128: Complete investigation of the LDLM pool shrinker and SLV handling


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.2.0, Lustre 2.1.2
    • Affects Version/s: Lustre 2.1.0
    • Environment: lustre 2.1.0-21chaos (github.com/chaos/lustre)
    • Severity: 3
    • 4697

    Description

      We have an OSS where two OST I/O threads are using significant CPU time, and I suspect they are not making real progress:

      29703 root      20   0     0    0    0 R 40.7  0.0 309:55.91 ll_ost_io_375
      14028 root      20   0     0    0    0 R 39.7  0.0 311:25.93 ll_ost_io_75
      

      When I can catch them between active cycles on the CPU, they show the following backtraces from "crash":

      crash> foreach 14028 29703 bt
      PID: 14028  TASK: ffff88041345ab00  CPU: 7   COMMAND: "ll_ost_io_75"
       #0 [ffff88041270f390] schedule at ffffffff814ee380
       #1 [ffff88041270f458] schedule_timeout at ffffffff814ef235
       #2 [ffff88041270f508] __down at ffffffff814f0152
       #3 [ffff88041270f558] down at ffffffff810968b1
       #4 [ffff88041270f588] ldlm_pools_shrink at ffffffffa05c9fd8 [ptlrpc]
       #5 [ffff88041270f5d8] ldlm_pools_srv_shrink at ffffffffa05ca213 [ptlrpc]
       #6 [ffff88041270f5e8] shrink_slab at ffffffff81129cba
       #7 [ffff88041270f648] zone_reclaim at ffffffff8112c364
       #8 [ffff88041270f728] get_page_from_freelist at ffffffff81122934
       #9 [ffff88041270f848] __alloc_pages_nodemask at ffffffff81123bb1
      #10 [ffff88041270f968] kmem_getpages at ffffffff8115e0d2
      #11 [ffff88041270f998] cache_grow at ffffffff8115e73f
      #12 [ffff88041270fa08] cache_alloc_refill at ffffffff8115e992
      #13 [ffff88041270fa78] __kmalloc at ffffffff8115f6b9
      #14 [ffff88041270fac8] cfs_alloc at ffffffffa032a863 [libcfs]
      #15 [ffff88041270faf8] ptlrpc_prep_bulk_exp at ffffffffa05d286d [ptlrpc]
      #16 [ffff88041270fb58] ost_brw_write at ffffffffa0a459c1 [ost]
      #17 [ffff88041270fcc8] ost_handle at ffffffffa0a49a19 [ost]
      #18 [ffff88041270fdf8] ptlrpc_main at ffffffffa05eb181 [ptlrpc]
      #19 [ffff88041270ff48] kernel_thread at ffffffff8100c14a
      
      PID: 29703  TASK: ffff88030b1be100  CPU: 5   COMMAND: "ll_ost_io_375"
       #0 [ffff8802e04771f0] schedule at ffffffff814ee380
       #1 [ffff8802e04772b8] schedule_timeout at ffffffff814ef235
       #2 [ffff8802e0477368] __down at ffffffff814f0152
       #3 [ffff8802e04773b8] down at ffffffff810968b1
       #4 [ffff8802e04773e8] ldlm_pools_shrink at ffffffffa05c9fd8 [ptlrpc]
       #5 [ffff8802e0477438] ldlm_pools_srv_shrink at ffffffffa05ca213 [ptlrpc]
       #6 [ffff8802e0477448] shrink_slab at ffffffff81129cba
       #7 [ffff8802e04774a8] zone_reclaim at ffffffff8112c364
       #8 [ffff8802e0477588] get_page_from_freelist at ffffffff81122934
       #9 [ffff8802e04776a8] __alloc_pages_nodemask at ffffffff81123bb1
      #10 [ffff8802e04777c8] kmem_getpages at ffffffff8115e0d2
      #11 [ffff8802e04777f8] cache_grow at ffffffff8115e73f
      #12 [ffff8802e0477868] cache_alloc_refill at ffffffff8115e992
      #13 [ffff8802e04778d8] __kmalloc at ffffffff8115f6b9
      #14 [ffff8802e0477928] cfs_alloc at ffffffffa032a863 [libcfs]
      #15 [ffff8802e0477958] filter_alloc_iobuf at ffffffffa0aa799e [obdfilter]
      #16 [ffff8802e04779a8] filter_iobuf_get at ffffffffa0a89c0e [obdfilter]
      #17 [ffff8802e04779c8] filter_preprw_read at ffffffffa0a9ff7e [obdfilter]
      #18 [ffff8802e0477af8] filter_preprw at ffffffffa0aa09b0 [obdfilter]
      #19 [ffff8802e0477b38] obd_preprw at ffffffffa0a3e33e [ost]
      #20 [ffff8802e0477ba8] ost_brw_read at ffffffffa0a44315 [ost]
      #21 [ffff8802e0477cc8] ost_handle at ffffffffa0a48d27 [ost]
      #22 [ffff8802e0477df8] ptlrpc_main at ffffffffa05eb181 [ptlrpc]
      #23 [ffff8802e0477f48] kernel_thread at ffffffff8100c14a
      

      So it appears that they are failing to make progress in the shrinker code.
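      For reference, the shape of the code involved: ldlm_pools_srv_shrink() is registered with the kernel as a slab shrinker, so it is invoked from shrink_slab() on every direct-reclaim pass, and the backtraces show it then blocking in down() on a semaphore shared by all callers. The fragment below is only a minimal sketch of that pattern, written against the 2.6.32-era shrinker API, and is NOT the actual Lustre code; pool_sem, pool_count, pool_shrink, and the module boilerplate are hypothetical names.

      /*
       * Sketch: a slab shrinker whose callback serializes every caller on a
       * single semaphore.  Any thread that enters direct reclaim while
       * allocating ends up queued behind every other allocating thread.
       * (Hypothetical example code; the shrinker callback signature shown
       * here is the 2.6.32 one and changed in later kernels.)
       */
      #include <linux/module.h>
      #include <linux/mm.h>
      #include <linux/semaphore.h>

      static struct semaphore pool_sem;
      static int pool_count;  /* number of cancellable objects; maintained elsewhere */

      /* Called from shrink_slab() by every thread that enters direct reclaim. */
      static int pool_shrink(int nr_to_scan, gfp_t gfp_mask)
      {
              if (nr_to_scan != 0) {
                      /* All allocating threads funnel through this one lock;
                       * if progress under it is slow, they burn CPU
                       * re-entering reclaim over and over. */
                      down(&pool_sem);
                      /* ... scan pools, cancel up to nr_to_scan unused locks ... */
                      up(&pool_sem);
              }
              /* The returned count drives shrink_slab()'s scan heuristics. */
              return pool_count;
      }

      static struct shrinker pool_shrinker = {
              .shrink = pool_shrink,
              .seeks  = DEFAULT_SEEKS,
      };

      static int __init pool_init(void)
      {
              sema_init(&pool_sem, 1);
              register_shrinker(&pool_shrinker);
              return 0;
      }

      static void __exit pool_exit(void)
      {
              unregister_shrinker(&pool_shrinker);
      }

      module_init(pool_init);
      module_exit(pool_exit);
      MODULE_LICENSE("GPL");

      With many ll_ost_io threads allocating bulk I/O buffers concurrently, every direct-reclaim pass pushes them all through that single lock, which matches the two threads above parked in down() underneath ldlm_pools_shrink().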

      The immediate symptom that users see is a client that is unable to reconnect to an OST on this OSS:

      Lustre: 15630:0:(ldlm_lib.c:852:target_handle_connect()) lsd-OST01bb: refuse reconnection from 59381921-e835-2c32-ab83-d3f9e3fbad7f@192.168.120.181@o2ib3 to 0xffff88040d93ec00/1
      

      (As an aside, we really need to improve these console messages. Brian had a set of patches, attached to bugzilla bug 16909, that greatly improved them and that LLNL carried for quite a while, but Sun/Oracle never landed them. I'll work on porting them forward and getting them into gerrit when I have a chance.)

      That message means that one RPC is outstanding for that export, and the client will not be allowed to reconnect until that RPC is removed. We have seen the refused connections continue for hours, and I suspect the situation will never resolve itself without intervention. My assumption is that one of the runaway ll_ost_io threads is responsible for the outstanding RPC.
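      For context, the refusal in target_handle_connect() (ldlm_lib.c:852 in the message above) amounts to a busy-export check: while a request is still accounted to the export, the reconnect is turned away. The snippet below is only an illustrative user-space model of that decision, with hypothetical names (export_state, rpc_count, handle_reconnect); it is not the Lustre source, and the count it prints is just this model's notion of in-flight RPCs.

      #include <errno.h>
      #include <stdio.h>

      /* Hypothetical stand-in for the per-client export; not the real layout. */
      struct export_state {
              const void *export;    /* the 0xffff... pointer in the console message */
              int         rpc_count; /* requests still being serviced for this client */
      };

      /*
       * Model of the reconnect decision: as long as a service thread still owns
       * a request on this export, the client is refused with -EBUSY.  A thread
       * stuck in the shrinker therefore pins the export indefinitely.
       */
      static int handle_reconnect(struct export_state *exp)
      {
              if (exp->rpc_count > 0) {
                      fprintf(stderr, "refuse reconnection to %p/%d\n",
                              exp->export, exp->rpc_count);
                      return -EBUSY;
              }
              return 0;
      }

      int main(void)
      {
              struct export_state exp = { .export = &exp, .rpc_count = 1 };
              return handle_reconnect(&exp) == 0 ? 0 : 1;
      }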

      Memory is not especially tight on the node at this time: roughly 2.7 GiB is free, and most of the ~9 GiB of Buffers/Cached is reclaimable file-backed page cache.

      > cat /proc/meminfo 
      MemTotal:       16202948 kB
      MemFree:         2805328 kB
      Buffers:         7522692 kB
      Cached:          1631292 kB
      SwapCached:            0 kB
      Active:          6426516 kB
      Inactive:        2743512 kB
      Active(anon):      58184 kB
      Inactive(anon):   331836 kB
      Active(file):    6368332 kB
      Inactive(file):  2411676 kB
      Unevictable:       31840 kB
      Mlocked:           31840 kB
      SwapTotal:             0 kB
      SwapFree:              0 kB
      Dirty:              7844 kB
      Writeback:           344 kB
      AnonPages:         48572 kB
      Mapped:            12160 kB
      Shmem:            367384 kB
      Slab:            3405396 kB
      SReclaimable:     965160 kB
      SUnreclaim:      2440236 kB
      KernelStack:       13216 kB
      PageTables:         4556 kB
      NFS_Unstable:          0 kB
      Bounce:                0 kB
      WritebackTmp:          0 kB
      CommitLimit:     8101472 kB
      Committed_AS:     584152 kB
      VmallocTotal:   34359738367 kB
      VmallocUsed:      360960 kB
      VmallocChunk:   34350502472 kB
      HardwareCorrupted:     0 kB
      AnonHugePages:         0 kB
      HugePages_Total:       0
      HugePages_Free:        0
      HugePages_Rsvd:        0
      HugePages_Surp:        0
      Hugepagesize:       2048 kB
      DirectMap4k:        9856 kB
      DirectMap2M:     2086912 kB
      DirectMap1G:    14680064 kB
      

People

    Assignee: Niu Yawei (Inactive)
    Reporter: Christopher Morrone (Inactive)