[LU-6307] Interop 2.6.0<->2.7 recovery-small test_105: MGS refused the connection from different version MDT
Created: 01/Mar/15  Updated: 03/Mar/15  Resolved: 03/Mar/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: Lustre 2.7.0, Lustre 2.8.0

Type: Bug Priority: Blocker
Reporter: Maloo Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-6086 verify MDTs are running the same Lust... Resolved
Severity: 3
Rank (Obsolete): 17662

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/e154ad20-bfc4-11e4-881f-5254006e85c2.

The sub-test test_105 failed with the following error:

mount failed

MDS

Lustre: DEBUG MARKER: == recovery-small test 105: IR: NON IR clients support == 20:22:45 (1425154965)
Lustre: DEBUG MARKER: /usr/sbin/lctl list_param mgs.*.ir_timeout
Lustre: DEBUG MARKER: lctl set_param -n mgs.MGS.live.lustre=state=full
Lustre: MGS (2.7.0.0) refused the connection from different version MDT (2.6.0.0) 10.1.4.150@tcp 17c0ca90-6e3b-8642-1b7b-a19358c36e81
Lustre: MGS (2.7.0.0) refused the connection from different version MDT (2.6.0.0) 10.1.4.150@tcp 17c0ca90-6e3b-8642-1b7b-a19358c36e81
Lustre: MGS (2.7.0.0) refused the connection from different version MDT (2.6.0.0) 10.1.4.150@tcp 17c0ca90-6e3b-8642-1b7b-a19358c36e81
Lustre: MGS (2.7.0.0) refused the connection from different version MDT (2.6.0.0) 10.1.4.150@tcp 17c0ca90-6e3b-8642-1b7b-a19358c36e81
Lustre: DEBUG MARKER: /usr/sbin/lctl mark  recovery-small test_105: @@@@@@ FAIL: mount failed 


 Comments   
Comment by Andreas Dilger [ 02/Mar/15 ]

This is caused by http://review.whamcloud.com/13285, but that patch shouldn't check the versions between the MDS and MGS, only between MDS nodes.

Comment by nasf (Inactive) [ 02/Mar/15 ]

The cause is not the version check between MDS and MGS. Instead, it is because we reuse the flag "OBD_CONNECT_MDS_MDS" as "OBD_CONNECT_MNE_SWAB". A connection from the MGC sets OBD_CONNECT_MNE_SWAB as follows:

/* The MNE_SWAB flag is overloading the MDS_MDS bit only for the MGS
 * connection.  It is a temporary bug fix for Imperative Recovery interop
 * between 2.2 and 2.3 x86/ppc nodes, and can be removed when interop for
 * 2.2 clients/servers is no longer needed.  LU-1252/LU-1644. */
#define OBD_CONNECT_MNE_SWAB             OBD_CONNECT_MDS_MDS
...
lustre_start_mgc(...)
{
    ...
#if LUSTRE_VERSION_CODE < OBD_OCD_VERSION(3, 0, 53, 0)
        data->ocd_connect_flags |= OBD_CONNECT_MNE_SWAB;
#endif

        if (lmd_is_client(lsi->lsi_lmd) &&
            lsi->lsi_lmd->lmd_flags & LMD_FLG_NOIR)
                data->ocd_connect_flags &= ~OBD_CONNECT_IMP_RECOV;
    ...
}

Generally, on the server side, we can tell whether a connection comes from an MGC by checking OBD_CONNECT_IMP_RECOV. But in recovery-small test_105, because "noir" is specified, OBD_CONNECT_IMP_RECOV is dropped. The server then cannot tell whether the connection comes from an MGC or from an old MDT, so it refuses the connection and the mount fails.
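
The ambiguity described above can be sketched in a few lines of C. This is only an illustration, not the code from the patch: the bit values are placeholders rather than the real values from the Lustre headers, and mgc_connect_flags(), mdt_connect_flags(), classify_peer() and enum peer_kind are hypothetical names made up for this example. It also assumes that the MGC normally sets OBD_CONNECT_IMP_RECOV and that an MDT-to-MDT connection sets OBD_CONNECT_MDS_MDS without it.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit values for this sketch only; NOT the real Lustre values. */
#define OBD_CONNECT_MDS_MDS     (1ULL << 0)
#define OBD_CONNECT_IMP_RECOV   (1ULL << 1)
/* As in the quoted source: MNE_SWAB overloads the MDS_MDS bit. */
#define OBD_CONNECT_MNE_SWAB    OBD_CONNECT_MDS_MDS

/* Hypothetical classification used only in this example. */
enum peer_kind { PEER_OTHER, PEER_MGC, PEER_MDT_OR_NOIR_MGC };

/* Hypothetical helper: flags an MGC would send, mirroring the quoted
 * lustre_start_mgc() logic. Assumes IMP_RECOV is set by default. */
uint64_t mgc_connect_flags(int is_client, int noir)
{
        uint64_t flags = OBD_CONNECT_IMP_RECOV | OBD_CONNECT_MNE_SWAB;

        if (is_client && noir)
                flags &= ~OBD_CONNECT_IMP_RECOV;
        return flags;
}

/* Hypothetical helper: flags assumed for an MDT-to-MDT connection. */
uint64_t mdt_connect_flags(void)
{
        return OBD_CONNECT_MDS_MDS;
}

/* Server-side guess based on the two flags alone. */
enum peer_kind classify_peer(uint64_t flags)
{
        /* The two names are one bit, which is the root of the ambiguity. */
        assert(OBD_CONNECT_MNE_SWAB == OBD_CONNECT_MDS_MDS);

        if (!(flags & OBD_CONNECT_MDS_MDS))
                return PEER_OTHER;
        if (flags & OBD_CONNECT_IMP_RECOV)
                return PEER_MGC;
        /* IMP_RECOV is absent: an MDT and a "noir" MGC look identical. */
        return PEER_MDT_OR_NOIR_MGC;
}
```

Under these assumptions, mgc_connect_flags(1, 1) and mdt_connect_flags() return the same value, so a server that looks only at these two flags has nothing left to tell the two peers apart.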

Comment by Gerrit Updater [ 02/Mar/15 ]

Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/13927
Subject: LU-6307 obdclass: distinguish MGC/MDT connection properly
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 96913037db05898a29cc3cfbf21dffb5914606f0

Comment by nasf (Inactive) [ 02/Mar/15 ]

Sarah, would you please verify the above patch with a b2_6 client for recovery-small test_105? Thanks!

Comment by Gerrit Updater [ 02/Mar/15 ]

Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/13928
Subject: LU-6307 obdclass: distinguish MGC/MDT connection properly
Project: fs/lustre-release
Branch: b2_7
Current Patch Set: 1
Commit: 0aa708733ace578813bef17eda1a745e3dedad27

Comment by Sarah Liu [ 02/Mar/15 ]

Fan Yong, just letting you know that the patch works.

Comment by nasf (Inactive) [ 03/Mar/15 ]

Thanks Sarah!

Comment by Gerrit Updater [ 03/Mar/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13928/
Subject: LU-6307 obdclass: distinguish MGC/MDT connection properly
Project: fs/lustre-release
Branch: b2_7
Current Patch Set:
Commit: 0964ac7523ce816413b309f5b65a988713f607d0

Comment by Gerrit Updater [ 03/Mar/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13927/
Subject: LU-6307 obdclass: distinguish MGC/MDT connection properly
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 1bdc4fd0594e8948623bde18e7b5d691104d8808

Comment by Peter Jones [ 03/Mar/15 ]

Landed for 2.7 and 2.8
