Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Lustre 2.13.0
-
None
-
3
Description
This issue was created by maloo for Li Xi <pkuelelixi@gmail.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/2fe8e80e-8cce-11e9-abe3-52540065bddc
The test logs show that in test_13, right after running "lctl nodemap_del 48714_1", the test script immediately checks whether the nodemap has been deleted, in delete_nodemaps() of sanity-sec.sh. However, "lctl get_param nodemap.48714_1.id" still prints a result, which delete_nodemaps() does not expect. delete_nodemaps() therefore quits with an error, and test_13 is reported as failed.
test_14 and test_15 failed too, but that is a consequence of the test_13 failure. In test_13, delete_nodemaps() did not remove the remaining nodemaps after 48714_1, so the nodemap_add of 48714_2 fails.
I think we need improvements here. test_13, test_14 and test_15 are unrelated, so before running these test cases, delete_nodemaps() needs to delete any existing nodemaps to avoid failures.
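To illustrate the kind of cleanup I have in mind, here is a rough sketch, not the actual sanity-sec.sh code. The names wait_until_gone, delete_all_nodemaps and POLL_INTERVAL are made up for this example; only "lctl nodemap_del" and "lctl get_param nodemap.<name>.id" come from the logs above. It assumes the deletion may take a moment to become visible, so it polls instead of checking once, and it keeps going after a failure so that later nodemaps are still removed.

```shell
# Sketch only: wait_until_gone and delete_all_nodemaps are hypothetical names,
# not functions from sanity-sec.sh.

# Run the given command up to MAX_TRIES times; succeed as soon as it fails,
# which is taken to mean the object no longer exists.
wait_until_gone() {
	local max_tries=$1
	shift
	local i=0

	while [ "$i" -lt "$max_tries" ]; do
		"$@" > /dev/null 2>&1 || return 0
		sleep "${POLL_INTERVAL:-1}"
		i=$((i + 1))
	done
	return 1
}

# Delete every nodemap named on the command line. A failure on one nodemap
# is remembered but does not stop the loop, so later nodemaps still go away.
delete_all_nodemaps() {
	local rc=0
	local name

	for name in "$@"; do
		lctl nodemap_del "$name" || rc=1
		wait_until_gone 10 lctl get_param "nodemap.$name.id" || rc=1
	done
	return $rc
}
```

With something like this, a call such as "delete_all_nodemaps 48714_0 48714_1 48714_2" would still try 48714_2 even if 48714_1 is slow to disappear, and the caller can decide what to do with the non-zero return code.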
Hmm, comparing the test log from one of the recent failures (https://testing.whamcloud.com/test_sets/969517b8-c9a9-11e9-9fc9-52540065bddc) with the test log from patch https://review.whamcloud.com/35421/ when it passed Maloo (https://testing.whamcloud.com/sub_tests/7f7dee14-9f70-11e9-9e3d-52540065bddc), it appears that there is no message such as "On MGS 10.9.4.124, 40996_0.id = nodemap.40996_0.id=1" in the failure case.
This means wait_nm_sync did not do its job, possibly because the empty third parameter is not handled properly. I will push a patch to make that more robust.
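For reference, one way an empty positional parameter can get lost in shell is unquoted expansion. This is only a guess at the failure mode, not a confirmed diagnosis of wait_nm_sync. The function show_args and the variables value and opt below are hypothetical stand-ins for a helper with an optional third argument.

```shell
# Hypothetical stand-in for a helper whose third argument may be empty.
show_args() {
	echo "count=$# third=[${3-}] fourth=[${4-}]"
}

value=""     # empty on purpose
opt="flag"   # made-up fourth argument

# Unquoted: the empty $value expands to nothing, so "flag" slides into $3.
show_args name id $value $opt
# prints: count=3 third=[flag] fourth=[]

# Quoted: the empty string is kept as its own argument.
show_args name id "$value" "$opt"
# prints: count=4 third=[] fourth=[flag]
```

Quoting at the call site, plus an explicit check on the argument inside the helper, would be one way to make this kind of call more robust.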