[LU-4582] After failing over Lustre MGS node to the secondary, client mount fails with -5 Created: 04/Feb/14  Updated: 14/Mar/18  Resolved: 23/Jun/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0, Lustre 2.2.0, Lustre 2.3.0, Lustre 2.4.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Cheng Shao (Inactive) Assignee: Cliff White (Inactive)
Resolution: Fixed Votes: 0
Labels: patch

Issue Links:
Duplicate
Related
Severity: 3
Rank (Obsolete): 12524

 Description   

The following steps reproduce the issue reliably:

1. adjust obd_timeout from default 100 to 300
    lctl conf_param <fsname>.sys.timeout=300
2. mount and umount the client
    mount -t lustre <primary MGS ip>:<secondary MGS ip>:/<fsname> /mnt/lustre
3. fail over the MGS node to the secondary
4. mount the client again using the same command as in step 2

Step 4 then fails with EIO (-5).



 Comments   
Comment by Cheng Shao (Inactive) [ 04/Feb/14 ]

Here is the timeline of events according to the Lustre debug log. The leading number on each line is the time in seconds relative to the start of the mount operation.

+0 Client sent MGS_CONNECT req to primary MGS node with timeout set to (obd_timeout/20 + adaptive_timeout), which was 20 seconds in our test case.
+0 Client sent LDLM_ENQUEUE req to MGS node with rq_delay_limit set to 5 seconds. This is for sptlrpc. The send is delayed because the import is still in the connecting state.
+5 The above req failed after the delayed send expired. This is not fatal.
+5 Client sent another LDLM_ENQUEUE req to MGS node with rq_delay_limit set to MGC_ENQUEUE_LIMIT, which is hard-coded to 50 seconds.
+20 MGS_CONNECT timed out.
+55 The second LDLM_ENQUEUE req failed after the delayed send expired. This fails the whole client mount with error -5.

The problem is that after MGS_CONNECT fails to connect to the primary MGS, the client does not get a chance to connect to the secondary before the mount fails. Selecting a different MGS node is triggered by the pinger, which runs at an interval of (obd_timeout/4). Since we increased obd_timeout to 300, the interval became 75 seconds, so the connection to the secondary does not happen before the second LDLM_ENQUEUE req fails at +55.
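The arithmetic behind this timeline can be checked with a small model. This is only a sketch, not Lustre code: the function name and dictionary keys are made up for illustration, the adaptive_timeout value of 5 seconds is inferred from the 20-second connect timeout reported above, and the model assumes the first attempt to reach the secondary MGS happens one full pinger interval after the mount starts.

```python
def mount_timeline(obd_timeout, adaptive_timeout=5, sptlrpc_delay=5, enqueue_limit=50):
    """Model the client mount timeline (seconds from mount start).

    All parameter defaults are assumptions taken from the timeline in
    this ticket; none of this is an actual Lustre interface.
    """
    # MGS_CONNECT timeout: obd_timeout/20 + adaptive_timeout
    connect_timeout = obd_timeout // 20 + adaptive_timeout
    # The pinger, which selects a different MGS node, runs every obd_timeout/4
    pinger_interval = obd_timeout // 4
    # The second LDLM_ENQUEUE starts once the first (sptlrpc) one gives up,
    # then waits for the hard-coded enqueue limit
    enqueue_deadline = sptlrpc_delay + enqueue_limit
    return {
        "connect_timeout": connect_timeout,
        "pinger_interval": pinger_interval,
        "enqueue_deadline": enqueue_deadline,
        # Simplification: the mount succeeds only if the pinger fires
        # before the second enqueue expires
        "mount_succeeds": pinger_interval < enqueue_deadline,
    }


for timeout in (100, 300):
    print(timeout, mount_timeline(timeout))
```

Under these assumptions, the default obd_timeout of 100 gives a pinger interval of 25 seconds, well inside the 55-second enqueue deadline, while obd_timeout=300 pushes the pinger interval to 75 seconds, past the deadline.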

The solution we propose is to redefine MGC_ENQUEUE_LIMIT relative to obd_timeout instead of using a hard-coded value. That way, the second LDLM_ENQUEUE waits long enough to go through once the connection to the secondary MGS node is established.
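To illustrate the idea, here is a sketch of one possible relative definition. The formula (connect timeout plus one pinger interval) is a hypothetical choice for illustration and is not taken from the patch under review; the function name and the adaptive_timeout and sptlrpc_delay values of 5 seconds are assumptions based on the timeline above.

```python
def relative_enqueue_limit(obd_timeout, adaptive_timeout=5):
    """Hypothetical enqueue limit that scales with obd_timeout.

    Covers the MGS_CONNECT timeout plus one pinger interval, so the
    enqueue is still pending when the pinger picks the secondary MGS.
    """
    connect_timeout = obd_timeout // 20 + adaptive_timeout
    pinger_interval = obd_timeout // 4
    return connect_timeout + pinger_interval


SPTLRPC_DELAY = 5  # assumed: second enqueue starts at +5, as in the timeline

for obd_timeout in (100, 300, 600):
    limit = relative_enqueue_limit(obd_timeout)
    deadline = SPTLRPC_DELAY + limit
    pinger_interval = obd_timeout // 4
    # With a relative limit the deadline always falls after the pinger fires
    print(obd_timeout, limit, deadline > pinger_interval)
```

With any limit of this shape the enqueue deadline grows together with the pinger interval, so raising obd_timeout no longer moves the failover attempt past the point where the mount gives up.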

Comment by Ryan Haasken [ 05/Feb/14 ]

Cheng, have you uploaded your patch to the whamcloud gerrit review site? If so, please post a link here. Thanks.

Comment by Cheng Shao (Inactive) [ 05/Feb/14 ]

Patch is up for review at http://review.whamcloud.com/#/c/9141/

Comment by Oleg Drokin [ 07/Feb/14 ]

I wonder why your patch is against b2_5 and not master? Is master not affected?
We generally prefer to land things to master first.

Comment by Cheng Shao (Inactive) [ 10/Feb/14 ]

Master is definitely affected as well. Will abandon this patch and submit a new one against master.

Comment by Cheng Shao (Inactive) [ 11/Feb/14 ]

New patch is at http://review.whamcloud.com/#/c/9217/.

Comment by Denis Kondratenko (Inactive) [ 25/Apr/14 ]

Unfortunately, Cheng left Xyratex, but we still need to get this landed.

Could someone review Cheng's patch?

Comment by Cliff White (Inactive) [ 09/May/14 ]

Reviewers have been assigned.

Comment by Ryan Haasken [ 29/May/14 ]

It has been a while since there has been any activity on this bug. Who is reviewing Cheng's patch?

Comment by Cliff White (Inactive) [ 30/May/14 ]

Very sorry about the delay, will investigate

Comment by Ryan Haasken [ 04/Jun/14 ]

Thanks, Cliff. http://review.whamcloud.com/#/c/9217/ has landed.

Comment by Cliff White (Inactive) [ 09/Jun/14 ]

Is it okay to close this issue?

Comment by Ryan Haasken [ 09/Jun/14 ]

Yes.

Comment by Gerrit Updater [ 10/Feb/15 ]

Jian Yu (jian.yu@intel.com) uploaded a new patch: http://review.whamcloud.com/13718
Subject: LU-4582 mgc: replace hard-coded MGC_ENQUEUE_LIMIT value
Project: fs/lustre-release
Branch: b2_5
Current Patch Set: 1
Commit: 6c815c12a2d6dfabf55b43563b4f1062c123db99
