[LU-3559] MGS formatted with Lustre 2.1 cannot be started with Lustre 2.4 Created: 05/Jul/13  Updated: 12/Aug/13  Resolved: 12/Aug/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0, Lustre 2.5.0
Fix Version/s: Lustre 2.4.1, Lustre 2.5.0

Type: Bug Priority: Blocker
Reporter: Sebastien Buisson (Inactive) Assignee: Bruno Faccini (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-2022 MGS fails to mount due to "has no ind... Closed
Severity: 3
Rank (Obsolete): 8964

 Description   

With Lustre 2.1, I format an MGS with the following command:

mkfs.lustre --reformat --quiet --fsname=migrate --mgs
--mkfsoptions="-m 0" --device-size=204800 /root/mgt

Before starting this MGS, tunefs.lustre gives:

# tunefs.lustre --print /root/mgt
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata

   Read previous values:
Target:     MGS
Index:      unassigned
Lustre FS:  migrate
Mount type: ldiskfs
Flags:      0x74
              (MGS needs_index first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:


   Permanent disk data:
Target:     MGS
Index:      unassigned
Lustre FS:  migrate
Mount type: ldiskfs
Flags:      0x74
              (MGS needs_index first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:

Still with Lustre 2.1, I successfully start this MGS, and afterwards tunefs.lustre gives:

# tunefs.lustre --print /root/mgt
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata

   Read previous values:
Target:     MGS
Index:      unassigned
Lustre FS:  migrate
Mount type: ldiskfs
Flags:      0x74
              (MGS needs_index first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:


   Permanent disk data:
Target:     MGS
Index:      unassigned
Lustre FS:  migrate
Mount type: ldiskfs
Flags:      0x74
              (MGS needs_index first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:

Then I update to Lustre 2.4 on my node, and when I try to start this MGS I get:

# mount -t lustre -o loop /root/mgt /mnt/migrate/mgt
mount.lustre.orig: /dev/loop0 has no index assigned (probably formatted with old mkfs)

In Lustre 2.4, it seems mount.lustre requires every target, including the MGT, to have an index. But with Lustre 2.1, MGTs can be formatted without specifying an index, and this is how all our customers formatted theirs. So if these customers migrate to Lustre 2.4, they will not be able to start their MGS.
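
For readers who want to see why the flags value matters here, the following is a minimal sketch that decodes 0x74 into the names printed by tunefs.lustre above. The bit values are written from memory of Lustre's lustre_disk.h and should be treated as assumptions to check against your source tree; ldd_flags_to_str is a hypothetical helper, not a Lustre function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed flag bits (from memory of lustre_disk.h; verify before use). */
#define LDD_F_SV_TYPE_MDT 0x0001
#define LDD_F_SV_TYPE_OST 0x0002
#define LDD_F_SV_TYPE_MGS 0x0004
#define LDD_F_NEED_INDEX  0x0010
#define LDD_F_VIRGIN      0x0020 /* shown as "first_time" by tunefs.lustre */
#define LDD_F_UPDATE      0x0040

/* Hypothetical helper: append the name of every set bit to out,
 * each followed by a space, truncating if out is too small. */
static void ldd_flags_to_str(unsigned int flags, char *out, size_t len)
{
	static const struct {
		unsigned int bit;
		const char *name;
	} names[] = {
		{ LDD_F_SV_TYPE_MDT, "MDT" },
		{ LDD_F_SV_TYPE_OST, "OST" },
		{ LDD_F_SV_TYPE_MGS, "MGS" },
		{ LDD_F_NEED_INDEX,  "needs_index" },
		{ LDD_F_VIRGIN,      "first_time" },
		{ LDD_F_UPDATE,      "update" },
	};
	size_t i;

	assert(out != NULL && len > 0);
	out[0] = '\0';
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (flags & names[i].bit) {
			strncat(out, names[i].name, len - strlen(out) - 1);
			strncat(out, " ", len - strlen(out) - 1);
		}
	}
}
```

Under these assumptions, 0x74 is the MGS bit plus needs_index, first_time and update, so the needs_index bit is still set on the MGT even after it has been started under 2.1.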

I will propose a patch for mount.lustre in Lustre 2.4 to address this issue.

Sebastien.



 Comments   
Comment by Sebastien Buisson (Inactive) [ 05/Jul/13 ]

The patch is at:
http://review.whamcloud.com/6904

TIA,
Sebastien.

Comment by Bruno Faccini (Inactive) [ 05/Jul/13 ]

Looks like a duplicate of LU-2022 at first reading; I will verify this and update soon.

Comment by Sebastien Buisson (Inactive) [ 05/Jul/13 ]

Yes, the issue described in LU-2022 is probably the same, but the solution adopted at that time (a patch in mkfs.lustre) is useless for a 'real' migration from 2.1 to 2.4, because we cannot afford to reformat the MGS on production clusters.

Comment by Bruno Faccini (Inactive) [ 05/Jul/13 ]

That is what I am trying to fully understand, and possibly advocate for! BTW, if that is the case, I think the test in your patch should rather be something like:

if ((IS_MDT(ldd) || IS_OST(ldd)) && (ldd->ldd_flags & LDD_F_NEED_INDEX))

instead of only :

if (!IS_MGS(ldd) && (ldd->ldd_flags & LDD_F_NEED_INDEX))

to handle the joint MGT/MDT case.
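
To make the difference between the two tests concrete, here is a small sketch that evaluates both on a stand-alone MGS, a joint MGS/MDT and an OST. The struct is a simplified stand-in for the real lustre_disk_data, the macros mirror the IS_MDT/IS_OST/IS_MGS names used above, and the bit values and function names are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed flag bits (from memory of lustre_disk.h; verify before use). */
#define LDD_F_SV_TYPE_MDT 0x0001
#define LDD_F_SV_TYPE_OST 0x0002
#define LDD_F_SV_TYPE_MGS 0x0004
#define LDD_F_NEED_INDEX  0x0010

/* Simplified stand-in for struct lustre_disk_data: flags only. */
struct ldd_sketch {
	unsigned int ldd_flags;
};

#define IS_MDT(d) ((d)->ldd_flags & LDD_F_SV_TYPE_MDT)
#define IS_OST(d) ((d)->ldd_flags & LDD_F_SV_TYPE_OST)
#define IS_MGS(d) ((d)->ldd_flags & LDD_F_SV_TYPE_MGS)

/* First form: skip the index check for anything carrying the MGS bit. */
static bool rejects_with_not_mgs_test(const struct ldd_sketch *ldd)
{
	assert(ldd != NULL);
	return !IS_MGS(ldd) && (ldd->ldd_flags & LDD_F_NEED_INDEX);
}

/* Suggested form: check the index whenever the target is an MDT or OST. */
static bool rejects_with_mdt_ost_test(const struct ldd_sketch *ldd)
{
	assert(ldd != NULL);
	return (IS_MDT(ldd) || IS_OST(ldd)) &&
	       (ldd->ldd_flags & LDD_F_NEED_INDEX);
}
```

Both forms let a stand-alone MGS with needs_index through and both reject an OST without an index. They differ only for a joint MGS/MDT that still has needs_index set: the first form lets it through because the MGS bit is present, while the suggested form still rejects it.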

Comment by Sebastien Buisson (Inactive) [ 05/Jul/13 ]

You are right about the joint MGS/MDT case, I have updated my patch accordingly.

Concerning your other question, I can say for sure that without this patch I simply cannot start my MGT.

Sebastien.

Comment by Bruno Faccini (Inactive) [ 08/Jul/13 ]

OK, so it seems that the original fix for LU-2022 (identical to yours!) was diverted for some reason (other missing-index issues, DNE, ...), leaving this interop bug unfixed.

So now I will ask for some other reviewers, and re-trigger the auto-tests for the change, since they failed due to a known test problem that is currently being worked on (LU-3560).

On the other hand, in case of absolute need without the fix in, you can make your 2.1-formatted MGT mountable as follows:

       _ "mount -t [ldiskfs,zfs]" your stand-alone MGT device
       _ binary-edit the underlying "CONFIGS/mountdata" file
       _ change the unset/"ffff" MGT index value, at offset 0x18/24 in the file, to "0000"
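
The byte change in the last step can be sketched as follows, working on an in-memory copy of the file contents. This is only an illustration of the edit described above: the helper name is hypothetical, it assumes the unset index shows up as two 0xff bytes at offset 0x18 as stated in the steps, and reading the file from the ldiskfs/zfs mount and writing it back are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of the index value quoted in the steps above (0x18 = 24). */
#define MOUNTDATA_INDEX_OFFSET 0x18

/* Hypothetical helper: if the two bytes at the index offset are ff ff,
 * overwrite them with 00 00 and return 0. Return -1 and leave the buffer
 * untouched if it is too short or the bytes are not ff ff. */
static int clear_unset_index(unsigned char *buf, size_t len)
{
	assert(buf != NULL);
	if (len < MOUNTDATA_INDEX_OFFSET + 2)
		return -1;
	if (buf[MOUNTDATA_INDEX_OFFSET] != 0xff ||
	    buf[MOUNTDATA_INDEX_OFFSET + 1] != 0xff)
		return -1;
	buf[MOUNTDATA_INDEX_OFFSET] = 0x00;
	buf[MOUNTDATA_INDEX_OFFSET + 1] = 0x00;
	return 0;
}
```

Refusing to write unless the existing bytes are ff ff keeps the sketch from touching a file that already has an index or that does not match the expected layout.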

Comment by Bruno Faccini (Inactive) [ 08/Jul/13 ]

It seems that the original fix for LU-2022 was lost in its last patch set.

Comment by Sebastien Buisson (Inactive) [ 08/Jul/13 ]

Gosh, the 2.1 MGT manipulation is not that easy...
This is why I would like to have the patch in the 2.4 release we will roll out to our customers :)

Comment by Jodi Levi (Inactive) [ 19/Jul/13 ]

Patch landed to Master. Can this ticket be closed or is more work needed?

Comment by Jodi Levi (Inactive) [ 22/Jul/13 ]

Patch landed to master, so closing ticket. If more work is needed, let me know and I will reopen.

Comment by Sebastien Buisson (Inactive) [ 30/Jul/13 ]

Hi,

I would like this ticket to be reopened as I have just pushed a b2_4 version of the patch:
http://review.whamcloud.com/7176

Thanks,
Sebastien.

Comment by Jodi Levi (Inactive) [ 30/Jul/13 ]

Reopened for b2_4 patch

Comment by Bruno Faccini (Inactive) [ 31/Jul/13 ]

Added Li Wei and myself as reviewers.

Comment by Peter Jones [ 12/Aug/13 ]

Landed for 2.4.1 and 2.5
