[LU-14773] reduce run_one() overhead Created: 18/Jun/21 Updated: 23/Jul/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Minor |
| Reporter: | Andreas Dilger | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Some simple changes could reduce individual subtest time and unmount/mount/format times, which would help speed up every test session. Individual sanity subtests that do no more than "touch file; check if file exists" currently take 6-7 seconds because run_one() does a lot of different things in the background, with multiple "do_nodes" commands:
When sanity was first written, these subtests took a fraction of a second each (i.e. they would scroll quickly up the screen). While I think the above checks are useful, the overhead could be reduced. I think a large part of this slowness is that each of these checks runs as a separate ssh/mcmd command, to each remote VM in series, and each ssh invocation is relatively slow.

Speeding up the ssh invocation itself (via do_facet()/do_node()) would of course be desirable, but is not something I can control directly. Running the per-node checks in parallel would be a win (e.g. use real "pdsh" or "clush"), as would combining all of the checks into a single command that is run with a single ssh invocation to each node. The latter can be done directly in test-framework, and is the main target of this ticket. |
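A minimal sketch of the two ideas in the description (one combined command per node, and nodes handled in parallel) might look like the following. The names combine_checks(), run_checks_on_nodes(), local_runner() and the RUNNER variable are hypothetical and are not functions from test-framework.sh; the local stub stands in for whatever ssh/pdsh wrapper is actually used.

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; not part of test-framework.sh.

# Join several independent check commands into one script, so that each
# node needs only a single remote invocation. A failing check is reported
# by its position and does not stop the remaining checks from running.
combine_checks() {
    local script="rc=0;"
    local check
    local i=0

    for check in "$@"; do
        i=$((i + 1))
        script+=" { $check; } || { echo \"check $i failed\" >&2; rc=1; };"
    done
    echo "$script exit \$rc"
}

# Stand-in for a remote runner: executes the script locally and prefixes
# each output line with the node name. A real runner (assumption) would
# wrap ssh or similar and take the same "<node> <script>" arguments.
local_runner() {
    local node=$1
    local script=$2

    bash -c "$script" 2>&1 | sed "s/^/$node: /"
    return "${PIPESTATUS[0]}"
}

# Run the combined script once per node, all nodes in parallel, and
# return non-zero if any node reported a failed check.
run_checks_on_nodes() {
    local nodes=$1; shift
    local script
    local node pid rc=0
    local pids=()

    script=$(combine_checks "$@")
    for node in ${nodes//,/ }; do
        ${RUNNER:-local_runner} "$node" "$script" &
        pids+=($!)
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || rc=1
    done
    return $rc
}
```

With the local stub, `run_checks_on_nodes "n1,n2" "echo a" "echo b"` issues one invocation per node instead of one per check per node. The number of remote commands then scales with the node count rather than with nodes times checks, which is where the description expects the savings to come from.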
| Comments |
| Comment by Gerrit Updater [ 18/Jun/21 ] |
|
Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44033 |
| Comment by Andreas Dilger [ 18/Jun/21 ] |
|
Note that the 44033 patch is NOT the only thing that should be fixed, but is a simple patch that may produce immediate benefits (at a minimum it will avoid a lot of useless visual clutter in the subtest logs from the check_network() output). |
| Comment by Gerrit Updater [ 18/Jun/21 ] |
|
Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44034 |
| Comment by Gerrit Updater [ 18/Aug/21 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44033/ |
| Comment by Gerrit Updater [ 25/Aug/21 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44034/ |