[LU-5921] conf-sanity test_41c: unexpected concurent OST mounts result, rc=0 rc2=1 - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Minor
Fix Version/s: Lustre 2.8.0
Affects Version/s: None
Labels: None
Severity: 3
Rank (Obsolete): 16528

Description

This issue was created by maloo for nasf <fan.yong@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/078fe42c-6b6b-11e4-b1b4-5254006e85c2.

The sub-test test_41c failed with the following error:

unexpected concurent OST mounts result, rc=0 rc2=1

Please provide additional information about the failure here.

Info required for matching: conf-sanity 41c

Issue Links

is related to

LU-8168 conf-sanity test_41c: unexpected concurent MDT mounts result, rc=17 rc2=0 (Open)

LU-7442 conf-sanity test_41c: @@@@@@ FAIL: unexpected concurent MDT mounts rc=17 rc2=0 (Resolved)

is related to

LU-5299 osd_start() LBUG when doing parallel mount of the same target (Resolved)

Activity

Peter Jones added a comment - 09/Dec/15 4:51 AM

Landed for 2.8

Gerrit Updater added a comment - 09/Dec/15 2:24 AM

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/17302/
Subject: LU-5921 tests: enhance server target mount race testing
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 5023ca334069950ccece06db3c12104232b5ab71

Bruno Faccini (Inactive) added a comment - 20/Nov/15 2:47 PM

I have pushed patch #17302 to better set up a concurrent, racy situation when mounting the same server target, and to handle all mount errors rather than only EALREADY.
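
A minimal bash sketch of the idea described above: start two attempts on the same target in the background and collect both return codes. This is not the code from patch #17302. `fake_mount` and `run_concurrent` are hypothetical names, and an atomic `mkdir` stands in for the real server target mount purely for illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for a server target mount (NOT the real test code).
# mkdir is atomic, so only one of two concurrent callers can succeed.
fake_mount() {
    local lockdir=$1
    mkdir "$lockdir" 2>/dev/null
}

# Start two "mounts" of the same target concurrently and report both codes.
run_concurrent() {
    local tmp lockdir pid1 pid2 rc=0 rc2=0
    tmp=$(mktemp -d)
    lockdir=$tmp/target

    fake_mount "$lockdir" &
    pid1=$!
    fake_mount "$lockdir" &
    pid2=$!

    # Keep whatever non-zero code each attempt returns, not one specific errno.
    wait "$pid1" || rc=$?
    wait "$pid2" || rc2=$?

    rm -rf "$tmp"
    echo "rc=$rc rc2=$rc2"
}
```

Which of the two attempts wins is not deterministic, so a caller should accept either ordering of the success and the failure.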

Gerrit Updater added a comment - 20/Nov/15 2:39 PM

Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: http://review.whamcloud.com/17302
Subject: LU-5921 tests: enhance server target mount race testing
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ec9a588356c09d83e9cb3d9d385fb9ef200edf1e

Andreas Dilger added a comment - 15/Nov/14 7:31 AM

Sorry, I meant OBD_RACE() might be better than waiting in userspace. That depends on whether you actually want the two processes to race or just to have one in the middle of mounting when the second one starts.

Bruno Faccini (Inactive) added a comment - 14/Nov/14 10:58 PM

Hello Andreas,
I am sorry, but I am not completely sure what you mean by "this would be better with OBD_FAIL_RACE() instead of having the test clear the fail_loc before the second mount". Do you think I should use "OBD_RACE(OBD_FAIL_TGT_CONN_RACE);" at the beginning of target_handle_connect(), instead of setting and unsetting OBD_FAIL_TGT_DELAY_CONNECT to make the first mount stall for 10s and try to force a race?

OTOH, I could also change conf-sanity/test_41c to check only that exactly one of the two concurrent mounts fails, regardless of its return code?

Andreas Dilger added a comment - 14/Nov/14 6:19 PM

Bruno, it isn't clear that this should be considered a test failure. I think any result where one mount succeeds and the other fails should be considered a PASS?

Also, it seems to me that this would be better with OBD_FAIL_RACE() instead of having the test clear the fail_loc before the second mount. Since the test is running in a VM that may be quite slow, this opens up the possibility of the first mount completing, and the second one just refuses to mount in mount.lustre because it sees the first filesystem has finished mounting.
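
The pass criterion suggested in this comment (one mount succeeds, the other fails, whatever the error code) could be sketched in bash roughly as follows. `check_concurrent_mounts` is a hypothetical helper name, not a function from the Lustre test framework, and the message text is illustrative only.

```shell
#!/bin/bash
# Hypothetical helper: return 0 (PASS) when exactly one of the two mount
# return codes is zero, regardless of which non-zero value the loser got.
check_concurrent_mounts() {
    local rc=$1 rc2=$2

    if [ "$rc" -eq 0 ] && [ "$rc2" -ne 0 ]; then
        return 0
    fi
    if [ "$rc" -ne 0 ] && [ "$rc2" -eq 0 ]; then
        return 0
    fi

    # Both succeeded or both failed: neither outcome is expected.
    echo "unexpected concurrent mounts result, rc=$rc rc2=$rc2" >&2
    return 1
}
```

Under this rule, the `rc=0 rc2=1` result reported in this ticket and the `rc=17 rc2=0` result from the linked tickets would both count as a pass, while `rc=0 rc2=0` and two failures would not.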

People

Assignee: Bruno Faccini (Inactive)

Reporter: Maloo

Votes: 0

Watchers: 8

Dates

Created: 14/Nov/14 12:55 AM

Updated: 21/Apr/17 2:17 PM

Resolved: 09/Dec/15 4:51 AM