[LU-6823] Performance regression on servers with LU-5264 Created: 09/Jul/15  Updated: 10/Jul/15  Resolved: 10/Jul/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.3
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Bruno Travouillon (Inactive) Assignee: Bruno Faccini (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Environment:

RHEL 6.6 w/ Bull kernel 2.6.32-504.16.2.el6.Bull.74.x86_64
Lustre 2.5.3.90 with additional patches:
LU-6471 obdclass: fix llog_cat_cleanup() usage on client
LU-6392/LU-6389 llite: restart short read/write for normal IO
LU-5740 kernel upgrade [RHEL6.6 2.6.32-504.el6]
LU-4582 mgc: replace hard-coded MGC_ENQUEUE_LIMIT value
LU-5678 o2iblnd: connection refcount fix for kiblnd_post_rx
LU-5393 osd-ldiskfs: read i_size once to protect against race
LU-3727 nfs: fix ll_get_parent() LBUG caused by permission
LU-4528 llog: dont write llog in 3 steps
LU-5522 ldlm: remove expired lock from per-export list
LU-5264 obdclass: fix race during key quiescency
LU-6049 obdclass: Add synchro in lu_context_key_degister()
LU-6084 ptlrpc: prevent request timeout grow due to recovery
LU-5764 proc: crash of mds on apparent buffer overflow

1 MDT, 480 OSTs, 5000+ clients.


Attachments: dump_log.tgz, perf.report-dso, perf.report.gz
Issue Links:
Related
is related to LU-6800 Significant performance regression wi... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Since the introduction of LU-5264, we have hit a large performance regression on our filesystem with some user codes that do lots of IOPS. This has a huge impact on the MDS, and thereby on all the Lustre clients. On the MDS, the ptlrpcd, mdt, and ldlm threads are overloaded, waiting in _spin_lock().

The perf report recorded during a slow-down window of the FS is attached. Here is a sample from perf.report-dso:

#
# Overhead             Shared Object
# ........  ........................
#
    98.08%  [kernel.kallsyms]
            |
            --- _spin_lock
               |
               |--97.33%-- 0xffffffffa05c12dc
               |          |
               |          |--53.24%-- 0xffffffffa075b629
               |          |          kthread
               |          |          child_rip
               |          |
               |          |--45.63%-- 0xffffffffa075b676
               |          |          kthread
               |          |          child_rip
               |          |
               |           --1.12%-- 0xffffffffa076aa58
               |                     kthread
               |                     child_rip
               |

Callers:

crash> kmem 0xffffffffa05c12dc
ffffffffa05c12dc (t) lu_context_exit+188 [obdclass] 

   VM_STRUCT                 ADDRESS RANGE               SIZE
ffff881877c91a40  ffffffffa0571000 - ffffffffa06a5000  1261568

      PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea0055920d28 1872df3000                0  e4d1079  1 140000000000000
crash> kmem 0xffffffffa075b676
ffffffffa075b676 (t) ptlrpc_main+2806 [ptlrpc] ../debug/lustre-2.5.3.90/lustre/ptlrpc/service.c: 2356

   VM_STRUCT                 ADDRESS RANGE               SIZE
ffff881877c91400  ffffffffa06fb000 - ffffffffa0895000  1679360

      PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea004b685228 158b853000                0      110  1 140000000000000
crash> kmem 0xffffffffa075b629
ffffffffa075b629 (t) ptlrpc_main+2729 [ptlrpc] ../debug/lustre-2.5.3.90/lustre/ptlrpc/service.c: 2534

   VM_STRUCT                 ADDRESS RANGE               SIZE
ffff881877c91400  ffffffffa06fb000 - ffffffffa0895000  1679360

      PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea004b685228 158b853000                0      110  1 140000000000000

This _spin_lock() was introduced by LU-5264, see http://review.whamcloud.com/#/c/13103/. I can't find any backport to b2_5 in Gerrit, but it seems that our engineering team got agreement to use this patch with Lustre 2.5.
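
For illustration, here is a minimal sketch (hypothetical names, simplified; not the actual Lustre code) of the pattern behind this kind of contention: a single global spinlock taken on the context-exit path of every RPC, so all service threads serialize on one lock even though key quiescence, the event the lock protects against, is rare:

#include <linux/spinlock.h>

/* Hypothetical, simplified model of the contended pattern. */
static DEFINE_SPINLOCK(keys_guard);   /* one global guard */

void context_exit(void *ctx)
{
        /* Hot path: taken once per RPC by every ptlrpc/mdt/ldlm service
         * thread, which matches the _spin_lock() hot spot in the perf
         * call graph above. */
        spin_lock(&keys_guard);
        /* ... run per-key exit hooks while quiescence is excluded ... */
        spin_unlock(&keys_guard);
}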

For now, as a workaround, we have removed the patches from both LU-5264 and LU-6049.

My guess is that the same issue should occur on the OSSes as well.

Could you help us fix this? Are we missing some patches on top of 2.5.3.90? Is there any other conflicting patch?

I have attached some debug logs (dump_log.tgz) captured on the MDS during the observed issue, in case they can help.

If you need further information, please let me know.



 Comments   
Comment by Bruno Faccini (Inactive) [ 09/Jul/15 ]

Hello Bruno,
This is likely a duplicate of LU-6800, and my LU-5264 patch is the culprit here.
A first fix (changing the spin-lock into a rw-lock) is currently under testing, along with other possible ways to fix/improve this.
I will let you know ASAP how it goes.
Also, thanks for your "real-world" profiling info.
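
For context, a minimal sketch (hypothetical names, simplified; not the actual patch) of the fix direction described above: replacing the global spinlock with a rwlock, so the per-RPC hot path takes the lock in shared (read) mode and only the rare quiescence path takes it exclusively:

#include <linux/spinlock.h>

static DEFINE_RWLOCK(keys_guard);

void context_exit(void *ctx)
{
        read_lock(&keys_guard);    /* shared: many service threads in parallel */
        /* ... run per-key exit hooks ... */
        read_unlock(&keys_guard);
}

void key_quiesce(void *key)
{
        write_lock(&keys_guard);   /* exclusive: rare, during key teardown */
        /* ... mark the key quiescent and purge its values ... */
        write_unlock(&keys_guard);
}

Readers do not exclude each other, so the hot path no longer serializes; only quiescence, which is infrequent, takes the lock exclusively.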

Comment by Bruno Travouillon (Inactive) [ 09/Jul/15 ]

Indeed, this is a duplicate of LU-6800. Gaëtan is working on this issue with me; feel free to contact him if you need further information. A crash dump is available at the customer site if you need more "real-world" data.

Comment by Andreas Dilger [ 10/Jul/15 ]

Closing as a duplicate of LU-6800.
