[LU-13086] 3c7aca747 LU-12395 breaks compatibility mpi tests with mpich Created: 18/Dec/19 Updated: 17/Sep/21 Resolved: 17/Sep/21 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.15.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Elena Gryaznova | Assignee: | Elena Gryaznova |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
commit 3c7aca747
[mpiexec@fre1307] match_arg (./utils/args/args.c:160): unrecognized argument oversubscribe
[mpiexec@fre1307] HYDU_parse_array (./utils/args/args.c:175): argument matching returned error |
| Comments |
| Comment by Elena Gryaznova [ 18/Dec/19 ] |
|
Andreas, please advise. |
| Comment by Andreas Dilger [ 11/Jan/20 ] |
|
Added Minh, since he was the author of that patch. Elena, I don't mind making this optional. It seems we could also specify it as part of $MPIRUN_OPTIONS from the environment, but maybe this has some side effect that I'm not aware of? |
| Comment by Gerrit Updater [ 03/Apr/20 ] |
|
Elena Gryaznova (c17455@cray.com) uploaded a new patch: https://review.whamcloud.com/38130 |
| Comment by Cory Spitz [ 13/May/20 ] |
|
mdiep, we could use your assistance with some questions in the review of https://review.whamcloud.com/#/c/38130/. Thanks! |
| Comment by Cory Spitz [ 20/May/20 ] |
|
mdiep and jamesanunez can you get together and reconcile your opinions about --oversubscribe and where you set it? James, it sounds like you can't get --oversubscribe in your env with the patch as-is. Is that right? If so, will you and Minh both be happy if it is moved to cfg/local.sh? Please update and clarify your comments in the Gerrit review. Thanks! |
| Comment by Gerrit Updater [ 21/May/20 ] |
|
Elena Gryaznova (c17455@cray.com) uploaded a new patch: https://review.whamcloud.com/38689 |
| Comment by Cory Spitz [ 22/Jun/20 ] |
|
mdiep and jamesanunez, now we have options between https://review.whamcloud.com/#/c/38130/ and https://review.whamcloud.com/#/c/38689/, but reviews have gone stale. Can you please express what approach you want to proceed with and why? |
| Comment by Cory Spitz [ 09/Jul/20 ] |
|
mdiep, colmstea, and jamesanunez, with the activity at https://review.whamcloud.com/#/c/38689/ are we to assume that it is the preferred direction? And should https://review.whamcloud.com/#/c/38130/ be abandoned? |
| Comment by Cory Spitz [ 21/Jul/20 ] |
|
mdiep, colmstea, and jamesanunez, so https://review.whamcloud.com/#/c/38130/ should be abandoned? And in https://review.whamcloud.com/#/c/38689/ Elena has proposed to "totally get rid of --oversubscribe". Can you agree? |
| Comment by Andreas Dilger [ 22/Jul/20 ] |
|
spitzcor, I've abandoned 38130 and updated 38689 to address the minor defect therein. It needs a second review and testing to finish before it can land. It looks like a reasonable compromise to include --oversubscribe so that the testing works out-of-the-box for RHEL (which is far-and-away the most common distro used with Lustre), but still allow the external config file to specify different options based on the MPI version. |
| Comment by Gerrit Updater [ 23/Dec/20 ] |
|
Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41082 |
| Comment by Cory Spitz [ 04/Feb/21 ] |
|
I don't have permissions to view https://review.whamcloud.com/#/c/41082/. |
| Comment by Cory Spitz [ 22/Feb/21 ] |
|
Is https://review.whamcloud.com/#/c/41082/ supposed to be a replacement for https://review.whamcloud.com/#/c/38689/ ? |
| Comment by Gerrit Updater [ 02/Sep/21 ] |
|
"Charlie Olmstead <charlie@whamcloud.com>" merged in patch https://review.whamcloud.com/41082/ |
| Comment by Charlie Olmstead [ 02/Sep/21 ] |
|
AT Patch has been deployed |
| Comment by Gerrit Updater [ 17/Sep/21 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/38689/ |
| Comment by Peter Jones [ 17/Sep/21 ] |
|
Landed for 2.15 |
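
For illustration only, the compromise described in the 22/Jul/20 comment (pass --oversubscribe by default, but let the environment or an external config file supply different options) could be sketched roughly as below. This is not the code from the landed patches. The helper names `default_mpirun_options` and `resolve_mpirun_options` are hypothetical, the variable name follows the `$MPIRUN_OPTIONS` mentioned in the 11/Jan/20 comment, and matching on an "Open MPI" version banner is an assumption about how the launcher might be detected.

```shell
#!/bin/bash
# Sketch only: hypothetical helpers, not taken from the Lustre test framework.

# Pick a default from the launcher's version banner.
# Assumption: Open MPI's mpirun identifies itself with "Open MPI" in the
# banner and accepts --oversubscribe; any other launcher (for example
# MPICH's Hydra mpiexec, which rejected the flag in this ticket) gets none.
default_mpirun_options() {
    local banner="$1"
    case "$banner" in
    *"Open MPI"*) printf '%s\n' "--oversubscribe" ;;
    *) printf '%s\n' "" ;;
    esac
}

# A value of MPIRUN_OPTIONS that is already set (even to the empty string)
# takes precedence, so a config file or the environment can opt out.
resolve_mpirun_options() {
    local banner="$1"
    if [ -n "${MPIRUN_OPTIONS+set}" ]; then
        printf '%s\n' "$MPIRUN_OPTIONS"
    else
        default_mpirun_options "$banner"
    fi
}
```

A caller might then run something like `MPIRUN_OPTIONS=$(resolve_mpirun_options "$(mpirun --version 2>&1 | head -n 1)")` before launching the MPI tests. Treating an empty but set `MPIRUN_OPTIONS` as an explicit choice is a design decision of this sketch, so that sites using MPICH can disable the default without patching the scripts.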