[LU-8118] very slow metadata performance with shared striped directory Created: 09/May/16 Updated: 07/Jun/16 |
|
| Status: | In Progress |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Olaf Faaland | Assignee: | Lai Siyao |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | llnl |
| Environment: |
lustre-2.8.0-2.6.32_573.22.1.1chaos.ch5.4.x86_64.x86_64
cb25ac6 Target building for RHEL7 under Koji.
675a140 LU-7841 doc: stop using python-docutils
b50a29a LU-7893 osd-zfs: calls dmu_objset_disown() with NULL
67fe716 LU-7198 clio: remove mtime check in vvp_io_fault_start()
80b4633 LLNL-0000 llapi: get OST count from proc
e2717c9 LU-5725 ofd: Expose OFD site_stats through proc
8d9a8f2 LU-4009 osd-zfs: Add tunables to disable sync (DEBUG)
699abe4 LU-8073 build: Eliminate lustre-source binary package
71ee38a LU-8072 build: Restore module debuginfo
7fb8959 LU-7962 build: Support builds w/ weak module ZFS
66579d9 LU-7961 build: Fix ldiskfs source autodetect for CentOS 6
52aa718 LU-7643 build: Remove Linux version string from RPM release field
445b063 LU-5614 build: use %kernel_module_package in rpm spec
f5b8fb1 LU-7699 build: Convert lustre_ver.h.in into a static .h file
333612e LU-7699 build: Eliminate lustre_build_version.h
a49b396 LU-7699 build: Replace version_tag.pl with LUSTRE-VERSION-GEN
6948075 LU-7518 build: Remove the Phi accelerator-specific packaging
ea79df5 New tag 2.8.0-RC5 |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
We are experiencing a severe metadata performance issue intermittently. So far it has occurred with a striped directory being altered …

Setup:

Symptoms:

In one case:
srun -N 10 -n 80 mdtest -d /p/lustre/faaland1/mdtest -n 128000 -F
The create rate started out at about 30,000 creates/second total across all 10 MDTs. After some time it dropped to 10-20 creates/second …

In the other case:
The workload involved 10 clients, each running 96 threads. Shell scripts were randomly invoking mkdir, touch, rmdir, or rm (the latter …
Create rate started at about 10,000/sec, concurrent with thousands of unlinks, mkdirs, rmdirs, and stats per second. All those operations …

strace -T output:
getdents(3, /* 1061 entries */, 32768) = 32760 <1887.749809>
getdents(3, /* 1102 entries */, 32768) = 32760 <1990.174707>
getdents(3, /* 1087 entries */, 32768) = 32752 <1994.781547>
getdents(3, /* 1056 entries */, 32768) = 32768 <1907.404333>
brk(0xcf0000) = 0xcf0000 <0.000030>
getdents(3, /* 1091 entries */, 32768) = 32752 <1860.720958> |
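For reference, a minimal sketch of how the directory striping can be inspected and how per-syscall timings like the getdents samples above can be captured. The setup commands were not captured above, so only the mdtest directory path from the srun line is reused here; everything else is standard lfs/strace usage:

lfs getdirstripe /p/lustre/faaland1/mdtest          # show the MDT stripe count/index of the directory
strace -T ls -f /p/lustre/faaland1/mdtest > /dev/null   # -T prints time spent in each syscall, e.g. the getdents calls shown above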
| Comments |
| Comment by Andreas Dilger [ 10/May/16 ] |
|
Are your workloads creating subdirectories? Have you tested without "setstripe -D" to see if that is the cause of the slowdown? |
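A rough sketch of such a comparison run, reusing the paths from the description (the mdtest-plain directory name below is made up):

lfs getdirstripe /p/lustre/faaland1/mdtest          # confirm how the existing directory is striped
mkdir /p/lustre/faaland1/mdtest-plain               # plain directory with no default stripe, created on a single MDT
srun -N 10 -n 80 mdtest -d /p/lustre/faaland1/mdtest-plain -n 128000 -F   # same workload, unstriped directory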
| Comment by Joseph Gmitter (Inactive) [ 10/May/16 ] |
|
Hi Lai, can you please advise on this? Thanks. |
| Comment by Olaf Faaland [ 10/May/16 ] |
|
Andreas, |
| Comment by Andreas Dilger [ 10/May/16 ] |
|
The "setstripe -D" should only affect subdirectory creation. That would appear to affect your second example where you wrote "Shell scripts were randomly invoking mkdir, touch, rmdir, or rm" but not the mdtest run, which is only creating files. As a starting point, it would be useful to collect full debug logs from one of the clients when it is in the "slow create" mode, to see if it is blocked locally or waiting on the MDS. If possible, collecting debug logs (at least +dlmtrace +rpctrace) from the MDSes during this slowdown would also be useful. Do you have any indication that one MDS is slower than the others (higher CPU load or load average) during this time? |