[LU-9193] Multiple hangs observed with many open/getattr - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Critical
Fix Version/s: Lustre 2.13.0, Lustre 2.12.7
Affects Version/s: Lustre 2.7.0, Lustre 2.5.3, Lustre 2.8.0, Lustre 2.9.0
Labels:
None
Environment:
Centos 7.2
Centos 6.[7-8]
SELinux enforcing

Severity:
3
Rank (Obsolete):
9223372036854775807

Description

Tested (reproduced) on 2.5, 2.7, 2.8 and 2.9

MPI job on 300 nodes: 2/3 open and 1/3 stat on the same file => hang. The MDS service threads are idle and the load is close to 0; threads are waiting for a lock, but no thread holds an active lock. After a long time (15 to 30 minutes), the threads become responsive again and operations resume normally.
Same job with only stat => no problem
Same job with only open => no problem

Some of the logs were similar to LU-5497 and LU-4579, but the patches from those tickets did not fix the issue.
If all of the job's clients are evicted manually, Lustre recovers and returns to a normal state.
Lustre 2.7.2 with the LU-5781 patch (solve a race for LRU lock cancel) was also tested.

So far, what prevents the issue is disabling SELinux for Lustre (removing its fs_use_xattr entry from the policy):

cat policy-noxattr-lustre.patch
--- serefpolicy-3.13.1/policy/modules/kernel/filesystem.te.orig 2016-08-02 19:56:29.997519918 +0000
+++ serefpolicy-3.13.1/policy/modules/kernel/filesystem.te 2016-08-02 19:57:10.124519918 +0000
@@ -32,7 +32,8 @@ fs_use_xattr gfs2 gen_context(system_u:o
 fs_use_xattr gpfs gen_context(system_u:object_r:fs_t,s0);
 fs_use_xattr jffs2 gen_context(system_u:object_r:fs_t,s0);
 fs_use_xattr jfs gen_context(system_u:object_r:fs_t,s0);
-fs_use_xattr lustre gen_context(system_u:object_r:fs_t,s0);
+# Lustre is not supported Selinux correctly
+#fs_use_xattr lustre gen_context(system_u:object_r:fs_t,s0);
 fs_use_xattr ocfs2 gen_context(system_u:object_r:fs_t,s0);
 fs_use_xattr overlay gen_context(system_u:object_r:fs_t,s0);
 fs_use_xattr xfs gen_context(system_u:object_r:fs_t,s0);

Reproducer (127 clients: vm3 to vm130, vm0 is MDS, vm1 and 2 are OSS):
mkdir /lustre/testfs/testuser/testdir; sleep 4; clush -bw vm[3-130] 'seq 0 1000 | xargs -P 7 -I{} sh -c "(({}%3==0)) && touch /lustre/testfs/testuser/testdir/foo$(hostname -s | tr -d vm) || stat /lustre/testfs/testuser/testdir > /dev/null"'
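
For readability, the per-client part of the one-liner above can be unrolled roughly as follows. This is only a sketch of the same workload, not the exact command that was run: the function name run_client_workload and its three parameters (target directory, iteration count, client id) are hypothetical and were added so that the loop can be tried on a small scale. It uses POSIX arithmetic instead of the bash-only (( )) form.

```shell
# Sketch of the per-client workload from the reproducer, unrolled.
# run_client_workload and its parameters are hypothetical names,
# not part of the original command.
#   $1 = target directory (original: /lustre/testfs/testuser/testdir)
#   $2 = last sequence number (original: 1000)
#   $3 = client id (original: hostname with the "vm" characters removed)
run_client_workload() {
    testdir=$1
    count=${2:-1000}
    client_id=${3:-$(hostname -s | tr -d vm)}

    # 7 parallel workers per client, as in the original xargs -P 7.
    seq 0 "$count" | xargs -P 7 -I{} sh -c '
        if [ $(( $1 % 3 )) -eq 0 ]; then
            # every third iteration: open/create the per-client file
            touch "$2/foo$3"
        else
            # all other iterations: stat the shared parent directory
            stat "$2" > /dev/null
        fi' sh {} "$testdir" "$client_id"
}
```

Each client would call run_client_workload /lustre/testfs/testuser/testdir (for example through clush, as in the original command). Every client touches its own foo<N> file while all clients stat the same parent directory.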

Tested disabling statahead. No impact.

Traces of stuck processes on the MDS all look the same (could be related to DDN-366):
8631 TASK: ffff880732202280 CPU:
[ffff88071587f760] __schedule at ffffffff8163b6cd
[ffff88071587f7c8] schedule at ffffffff8163bd69
[ffff88071587f7d8] schedule_timeout at ffffffff816399c5
[ffff88071587f880] ldlm_completion_ast at ffffffffa08b7fe1
[ffff88071587f920] ldlm_cli_enqueue_local at ffffffffa08b9c20
[ffff88071587f9b8] mdt_object_local_lock at ffffffffa0f3c6d2
[ffff88071587fa60] mdt_object_lock_internal at ffffffffa0f3cffb
[ffff88071587faa0] mdt_getattr_name_lock at ffffffffa0f3ddf6
[ffff88071587fb28] mdt_intent_getattr at ffffffffa0f3f3f0
[ffff88071587fb68] mdt_intent_policy at ffffffffa0f42e7c
[ffff88071587fbd0] ldlm_lock_enqueue at ffffffffa089f1f7
[ffff88071587fc28] ldlm_handle_enqueue0 at ffffffffa08c4fb2
[ffff88071587fcb8] tgt_enqueue at ffffffffa09493f2
[ffff88071587fcd8] tgt_request_handle at ffffffffa094ddf5
[ffff88071587fd20] ptlrpc_server_handle_request at ffffffffa08f87cb
[ffff88071587fde8] ptlrpc_main at ffffffffa08fc0f0
[ffff88071587fec8] kthread at ffffffff810a5b8f
[ffff88071587ff50] ret_from_fork at ffffffff81646c98

I do not have the client traces yet, but the last call on the clients is mdc_enqueue.

Crash dump analysis has started (in progress). So far nothing obvious points to SELinux; the first analysis leads to ldlm_handle_enqueue0() and ldlm_lock_enqueue(). We see some processes idle for 280s. The system is not entirely stuck but very slow (we had to crash the VM to get a proper dump, because crash was very hard to use while the threads were not 100% stuck).

Attachments

logs_050717.tar.gz, 89.55 MB, 05/Jul/17 12:33 PM
logs-9193-patchset11-tests.tar.gz, 8.46 MB, 18/Sep/17 11:21 AM
LU-9193.tar.gz, 16.73 MB, 24/May/17 12:30 PM
LU-9193-patchset8.tar.gz, 71.05 MB, 15/Jun/17 12:36 PM
lustre-logs.tar.gz, 12.65 MB, 30/May/17 10:36 AM
lustre-logs-210617.tar.gz, 14.31 MB, 21/Jun/17 10:59 AM
lustre-LU9193-240817.tar.gz, 79.36 MB, 24/Aug/17 10:25 AM
vm0.tar.gz, 13.93 MB, 11/Jan/17 10:57 AM
vm105.tar.gz, 134 kB, 12/Jan/17 10:10 AM
vm62.tar.gz, 133 kB, 11/Jan/17 10:56 AM

Issue Links

is related to

LU-6784 Defects in SELinux support (Resolved)

is related to

LU-12212 Often requests timeouts during dbench run (Resolved)
LU-10262 Lock contention when doing creates for the same name (Resolved)
LU-5560 SELinux support on the client side (Resolved)
LU-10235 mkdir should check for directory existence on client before taking write lock (Resolved)

People

Assignee: Bruno Faccini (Inactive)
Reporter: Jean-Baptiste Riaux (Inactive)
Votes: 0
Watchers: 23

Dates

Created: 08/Dec/16 8:04 PM
Updated: 14/Dec/21 10:33 PM
Resolved: 01/Apr/19 12:30 PM