[LU-8641] speedup run_metabench(): make cleanup optional Created: 26/Sep/16  Updated: 23/Apr/17  Resolved: 23/Apr/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.10.0

Type: Improvement Priority: Minor
Reporter: Elena Gryaznova Assignee: WC Triage
Resolution: Fixed Votes: 0
Labels: patch

Issue Links:
Duplicate
Related
is related to LU-3308 large readdir chunk size slows unlink... Reopened
Rank (Obsolete): 9223372036854775807

 Description   

1. On real HW clusters we run metabench with these parameters:

mbench_NFILES=50000
mbench_THREADS=16
+ /test-tools/mpich2/metabench/src/metabench -w /mnt/fs1/d0.metabench -c 50000 -C -S -k -p /test-tools/mpich2/metabench/dictionary
+ chmod 0777 /mnt/fs1
drwxrwxrwx 5 root root 270336 Apr 25 10:02 /mnt/fs1
+ su mpiuser sh -c "/usr/bin/mpirun      -np 112 /test-tools/mpich2/metabench/src/metabench -w /mnt/fs1/d0.metabench -c 50000 -C -S -k -p /test-tools/mpich2/metabench/dictionary "

The test itself completed in about 20 minutes, with this creation rate:

------------- ------------ ---------- ---------- ----------
Total                         5600000   1112.045   5035.769
Elapsed                       5600000   1112.047   5035.758
------------- ------------ ---------- ---------- ----------
Average                         50000   1097.669     45.556
Std Dev                                   11.592      0.489 (   1.06%) (   1.07%)

[2016-04-0025 10:21:30] Leaving par_create_multidir
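
The totals above can be cross-checked with a little arithmetic. This is only a sketch: the table's column headers are not shown in the ticket, so it assumes the second column is elapsed seconds and the third is operations per second, which is consistent with the numbers.

```shell
#!/bin/bash
nprocs=112            # mpirun -np value from the command line above
files_per_proc=50000  # metabench -c value

# Total files created across all MPI processes.
total_files=$((nprocs * files_per_proc))
echo "total files: $total_files"

# Aggregate create rate, assuming 1112.045 is the elapsed time in seconds.
awk -v n="$total_files" -v t=1112.045 \
    'BEGIN { printf "creates/s: %.1f\n", n / t }'
```

Under that assumption, 1112 seconds is about 18.5 minutes, which matches the "about 20 minutes" figure.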

The test then runs rm -rf <dir> on the directory holding more than 5 million files, and this does not complete within 1.5 hours.

2. On a real cluster it makes sense to leave the files undeleted, so that the total number of files grows over time.
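
A minimal sketch of what an optional cleanup step could look like in a test script. The variable name CLEANUP_AFTER_RUN and the function cleanup_testdir are hypothetical, chosen for illustration. They are not taken from the landed patch.

```shell
#!/bin/bash
# Hypothetical knob: set CLEANUP_AFTER_RUN=false to keep the created files.
CLEANUP_AFTER_RUN=${CLEANUP_AFTER_RUN:-true}

# Remove the benchmark directory only when cleanup is enabled.
# $1 = test directory (for example /mnt/fs1/d0.metabench)
cleanup_testdir() {
    local testdir=$1

    if [ "$CLEANUP_AFTER_RUN" != "true" ]; then
        echo "leaving $testdir in place"
        return 0
    fi
    rm -rf -- "$testdir"
}
```

With the default, behaviour is unchanged. Exporting CLEANUP_AFTER_RUN=false before the run skips the slow removal and lets the file count accumulate across runs.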



 Comments   
Comment by Gerrit Updater [ 30/Sep/16 ]

Elena Gryaznova (elena.gryaznova@seagate.com) uploaded a new patch: http://review.whamcloud.com/22852
Subject: LU-8641 tests: tests: speedup metabench
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8e33d6aca1411f524e40493f0e39e44b02054e48

Comment by Andreas Dilger [ 11/Mar/17 ]

Instead of using "rm -r $dir" to empty directories, it is likely much faster to use "find $dir -type f -print0 | xargs -0 -n 1 unlink; rm -r $dir" to delete the files first, then the remaining directories (if any).

Ideally it would be better to actually fix the "rm -r" performance for large directories, rather than just fixing the test script, since this is a performance problem that also affects users. This is discussed in LU-3308, including options to reduce readdir() RPC chunksize (possibly dynamically when contention is present), and alternately keeping the directory contents cached on the client after the DLM lock has been cancelled (as allowed by POSIX) to avoid cache ping-pong.
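
The two-phase removal suggested in this comment could be sketched as follows. The function name fast_rmdir_tree is hypothetical, and no timing has been measured here. The sketch places -type f before -print0 so that only regular files are emitted, and it uses rm -f in place of unlink because unlink(1) accepts a single operand while xargs passes many. The -0 and -r options of xargs are extensions available in GNU findutils.

```shell
#!/bin/bash
# Delete all regular files under a directory first, then remove the
# remaining directory tree.
# $1 = directory to remove
fast_rmdir_tree() {
    local dir=$1

    # Phase 1: unlink regular files in batches; -r skips rm if none exist.
    find "$dir" -type f -print0 | xargs -0 -r rm -f --

    # Phase 2: remove whatever is left (directories, symlinks, and so on).
    rm -r -- "$dir"
}
```

Example call: fast_rmdir_tree /mnt/fs1/d0.metabench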

Comment by Gerrit Updater [ 23/Apr/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/22852/
Subject: LU-8641 tests: tests: speedup metabench
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: a9339b851789348348e77a02bff16f8d3af69091

Comment by Peter Jones [ 23/Apr/17 ]

Landed for 2.10
