[LU-1526] interop 1.8,2.1 -> 2.4 "mkfs.lustre FATAL: The target index must be specified with --index"
Created: 15/Jun/12  Updated: 28/Oct/14  Resolved: 17/Jan/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0, Lustre 2.1.3, Lustre 2.1.4, Lustre 1.8.8
Fix Version/s: Lustre 2.4.0, Lustre 2.1.4, Lustre 1.8.9

Type: Bug
Priority: Blocker
Reporter: Maloo
Assignee: Jian Yu
Resolution: Fixed
Votes: 0
Labels: MB

Issue Links:
Duplicate
duplicates LU-2265 Interop 1.8/2.1<->2.4 Test failure on... Resolved
is duplicated by LU-2512 b1_8 and b2_1 test scripts fail again... Resolved
is duplicated by LU-2112 Interop issue Resolved
is duplicated by LU-2202 1.8.8/2.2<->master MDS fails to mount... Resolved
Related
is related to LU-2265 Interop 1.8/2.1<->2.4 Test failure on... Resolved
Severity: 3
Rank (Obsolete): 5602

 Description   

This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

This issue relates to the following test suite run:
https://maloo.whamcloud.com/test_sets/bcefb252-b038-11e1-9df1-52540035b04c.

The sub-test lustre-initialization_1 failed with the following error:

mkfs.lustre FATAL: The target index must be specified with --index

Info required for matching: lustre-initialization-1 lustre-initialization_1

The test-framework code for b1_8, and probably b2_1, needs to start using "--index" to specify the device index. This is not harmful for those earlier releases, but is required for orion (2.4).
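
As a rough illustration of the change described above, the sketch below builds a format command line that always carries "--index". The helper names facet_index and mkfs_cmd, the device paths, and the rule "index = facet number minus one" are assumptions made for this example and are not taken from test-framework.sh. Other options a real format would need (for example --mgs or --mgsnode) are left out, and the command is only printed, never run.

```shell
# Hypothetical helper (not the actual test-framework.sh code): derive a
# zero-based target index from a facet name such as "mds1" or "ost3".
facet_index() {
    local facet=$1
    local num=${facet##*[!0-9]}   # trailing digits, e.g. "3" for "ost3"
    echo $(( ${num:-1} - 1 ))     # a facet with no number maps to index 0
}

# Hypothetical helper: print the mkfs.lustre command line for a facet,
# always including --index. It only echoes the command; nothing is formatted.
mkfs_cmd() {
    local facet=$1 fsname=$2 dev=$3
    local type
    case $facet in
        mds*) type="--mdt" ;;
        ost*) type="--ost" ;;
        *) echo "unknown facet: $facet" >&2; return 1 ;;
    esac
    echo "mkfs.lustre $type --fsname=$fsname --index=$(facet_index "$facet") $dev"
}

# Example with placeholder device paths.
mkfs_cmd mds1 lustre /dev/mdsdev
mkfs_cmd ost3 lustre /dev/ostdev
```

Passing the index explicitly is accepted by the older releases as well, which is why the same command line can be used against both old and new servers.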



 Comments   
Comment by Andreas Dilger [ 15/Jun/12 ]

Not a super critical problem, but we'll need it for testing interop on master in the next couple of months. It would stop a steady flow of "lustre-initialization_1" failures for orion, and we can begin doing actual automated interop testing.

Comment by Andreas Dilger [ 17/Oct/12 ]
mount -t lustre -o user_xattr,acl  /dev/lvm-MDS/P1 /mnt/mds
mount.lustre: /dev/mapper/lvm--MDS-P1 has no index assigned (probably formatted with old mkfs)

I think it would be reasonable for mkfs_lustre.c to assume "--index=0" on an MDT if no index is specified. I think it will be some time before DNE is so prevalent that we need to require an index for the MDT. Also, it is misleading that even the 2.4 mkfs_lustre.c does not enforce an index for the MDT, but mount_lustre.c requires it.

It also makes sense to patch the b2_1 test-framework.sh to always specify an index for the MDT, so that future interop testing continues to work. We won't have 1.8->2.5 interop testing, so that probably isn't necessary.

Comment by James A Simmons [ 17/Oct/12 ]

What version was the index requirement for MDT introduced?

Comment by Andreas Dilger [ 17/Oct/12 ]

So far only in the master branch, targeted for the 2.4 release. There is a requirement for specifying --index for the OSTs also, but that is less of a surprise, and there is a warning in the 2.3 mke2fs that it will be required.

Comment by James A Simmons [ 17/Oct/12 ]

So an "if [ $(lustre_version_code $facet) -gt $(version_code 2.3.50) ]; then" check would be fine in the test suite?
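
For readers unfamiliar with this kind of check, here is a minimal sketch of how a version string can be packed into a comparable integer. It is written for this note and is not copied from test-framework.sh; the packing scheme (major << 16 | minor << 8 | patch) and the needs_index helper are assumptions, and the server version is passed in as a plain string instead of being queried from a facet as lustre_version_code would do.

```shell
# Sketch of a version_code-style helper (assumed packing scheme):
# "major.minor.patch" becomes (major << 16) | (minor << 8) | patch.
version_code() {
    local major minor patch rest
    # Split on "."; anything after the third field is ignored.
    IFS=. read -r major minor patch rest <<EOF
$1
EOF
    patch=${patch%%[!0-9]*}   # drop suffixes such as "-wc1"
    echo $(( (${major:-0} << 16) | (${minor:-0} << 8) | ${patch:-0} ))
}

# Hypothetical wrapper: succeeds when the given version is newer than 2.3.50.
needs_index() {
    [ "$(version_code "$1")" -gt "$(version_code 2.3.50)" ]
}

if needs_index 2.4.0; then
    echo "2.4.0: pass --index at format time"
fi
```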

Comment by Andreas Dilger [ 17/Oct/12 ]

Firstly, because the test-framework.sh script running on the client is from the old version (e.g. 1.8.8 or 2.1.3) which doesn't have that check, adding it into the master test-framework will not help.

Secondly, I think this is a bit of a usability case as well. It makes sense to allow the 2.4 mkfs_lustre.c to provide a default index = 0 for the MDT, but print a warning message (as was already added for 2.3 on OSTs) that an index will be required for MDT0000 in the future.
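
The proposed behaviour can be modelled in a few lines of shell. This is only a model of the policy described in this comment: the real logic would live in mkfs_lustre.c, the function name resolve_index is invented for the example, and the warning text is illustrative. Only the FATAL message is taken from the failure reported in this ticket.

```shell
# Hypothetical model of the proposed policy (not the mkfs_lustre.c code).
# Usage: resolve_index <mdt|ost> [index]
resolve_index() {
    local type=$1 index=$2
    if [ -n "$index" ]; then
        echo "$index"            # an explicit index always wins
        return 0
    fi
    case $type in
        mdt)
            # Default to 0 for the MDT, but warn that this will change.
            echo "warning: no --index given for MDT, assuming 0;" \
                 "an index will be required in the future" >&2
            echo 0
            ;;
        *)
            # Other targets still need an explicit index.
            echo "FATAL: The target index must be specified with --index" >&2
            return 1
            ;;
    esac
}

resolve_index mdt        # prints 0, with a warning on stderr
resolve_index ost 5      # prints 5
```

The model makes the asymmetry explicit: a default is safe for a single MDT, while there is no obvious default for an OST.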

Comment by James A Simmons [ 18/Oct/12 ]

Patch at http://review.whamcloud.com/#change,4293

Comment by Sarah Liu [ 29/Oct/12 ]

This issue blocks the interop testing on 1.8 and 2.1, change priority to major

Comment by Peter Jones [ 05/Nov/12 ]

A fix was landed to master a week ago for this issue. Can it now be marked as resolved or is further work required?

Comment by James A Simmons [ 05/Nov/12 ]

Andreas was hoping for a test to add to the test suite for this.

Comment by Jinshan Xiong (Inactive) [ 15/Nov/12 ]

Another occurrence is at LU-2265. Though the MDT's index can be assigned to 0 by default, it will be hard to decide the OST's index. Actually this is an issue of the test environment. I can't think of a way to solve it other than changing the test scripts for 1.8/2.1. Please let me know if you have any ideas.

Comment by Li Wei (Inactive) [ 19/Nov/12 ]

This issue should not block 2.4. If we want to do interoperability testing with clients from an older release and 2.4 servers, then the test scripts in the older release need to be updated to supply "--index" at format time.

Comment by Li Wei (Inactive) [ 19/Nov/12 ]

See also LU-1647, which covers other test changes needed to drive 2.4 servers.

Comment by Peter Jones [ 20/Nov/12 ]

Liwei

We do need to do interoperability testing with older releases. We are working on 2.1.4 at the moment and could also do a 1.8.9 release to include any changes necessary for testing to run smoothly. We also need to test interop with 2.3 but no maintenance releases are planned for that. What is the least impactful way to get as much of the interop testing as possible running routinely?

Thanks

Peter

Comment by Andreas Dilger [ 20/Nov/12 ]

Li Wei, we need to fix the automated interop testing in some manner. The easiest way is to add --index support to b2_1 and b1_8, but this will not be "active" until we make a release on those branches. Could you please work on porting or writing the patches needed to add --index support to those branches? The 2.1.4 release is planned to be out in the next month, so having a patch relatively soon will help speed the testing along.

Comment by Li Wei (Inactive) [ 20/Nov/12 ]

A simple patch adding "--index" support would diverge the b1_8 and b2_1 test framework from master. Also, it would not allow us to test older clients with new servers using ZFS-based targets. Are you sure this is OK?

(I'm not optimistic that I'll have cycles to work on this soon.)

Comment by Andreas Dilger [ 23/Nov/12 ]

Li Wei, this problem is causing a large number of tests to fail, and until we get interop testing to pass we are blocked from landing all new features onto master (DNE, etc.), so it needs to be addressed with a high priority.

"A simple patch adding "--index" support would diverge b1_8 and b2_1 test framework from master. Also, it would not allow us to test older clients with new servers using ZFS-based targets. Are you sure this is OK?"

I don't understand your concern; could you please explain further? Adding support to the test framework in b2_1 (to be included in the 2.1.4 release) so that t-f always supplies the --index would solve this problem. Explicitly supplying --index works in these older releases, and is in fact what users actually do.

I would be happy to discuss this on Skype if it will move it along more quickly.

Comment by Jian Yu [ 29/Nov/12 ]

Let me port the "--index" stuff from http://review.whamcloud.com/2907 to b2_1 and b1_8.

Comment by Jian Yu [ 29/Nov/12 ]

Patch for b2_1 branch to add "--index" support is in http://review.whamcloud.com/4710.

With the above patch, the auster test suite can be run on b2_1 clients with master servers.

However, during the testing, I found more test script interop issues, like:

  1. "ENABLE_QUOTA=yes" did not work (we need to port the patch from http://review.whamcloud.com/4031)
  2. "obdfilter" procfs entries were not found (we need to port the patch from http://review.whamcloud.com/2934)

and so on.

Comment by Li Wei (Inactive) [ 29/Nov/12 ]

To expect green results even just from LDiskFS testing, we might want to consider these in addition:

Comment by Jian Yu [ 30/Nov/12 ]

Thanks Li Wei for pointing out the necessary patches. Let me port them along with testing.

Comment by Jian Yu [ 06/Dec/12 ]

After the patch at http://review.whamcloud.com/4031 (test framework changes for new quota) was cherry-picked to the b2_1 branch, I found that version_code() and lustre_version_code() had not been added to the b2_1 branch. Here is the patch to fix that: http://review.whamcloud.com/4754.

The patch contains the fixes ported from:

  1. http://review.whamcloud.com/2441 (support version_code and lustre_version_code)
  2. http://review.whamcloud.com/4681 (read quota_type from MDS instead of MGS)

More porting and testing work is in progress.

Comment by Jian Yu [ 07/Dec/12 ]

It seems http://review.whamcloud.com/4040 (new sanity-quota tests) is also needed for 2.1.4<->2.4.0 interop testing.

The patch for b2_1 branch is in http://review.whamcloud.com/4767.

Comment by Jian Yu [ 10/Dec/12 ]

The patch of http://review.whamcloud.com/2934 (Handle OFD procfs changes) is being ported.

The patch for b2_1 branch is in http://review.whamcloud.com/4783.

Comment by Jian Yu [ 10/Dec/12 ]

We also need to port http://review.whamcloud.com/1425 (Fix OST index errors in test suite).

The patch for b2_1 branch is in http://review.whamcloud.com/4821.

Comment by Jian Yu [ 13/Dec/12 ]

The patch of http://review.whamcloud.com/2982 (OST_DESTROYs from MDS) is being ported.

The patch for b2_1 branch is in http://review.whamcloud.com/4832.

Comment by Jodi Levi (Inactive) [ 21/Dec/12 ]

Has everything landed for this and can this be closed?

Comment by Jian Yu [ 21/Dec/12 ]

"Has everything landed for this and can this be closed?"

Hi Jodi, the patches also need to be backported to the b1_8 branch.

Comment by Jodi Levi (Inactive) [ 26/Dec/12 ]

Is a new patch needed for b1_8 or can Oleg cherry pick to land this to b1_8?

Comment by Jian Yu [ 26/Dec/12 ]

"Is a new patch needed for b1_8 or can Oleg cherry pick to land this to b1_8?"

I've tried to apply the patches directly on the b1_8 branch but hit conflicts. So, backporting is still needed.

Comment by Jian Yu [ 27/Dec/12 ]

Here are the patches backported to b1_8 branch:
http://review.whamcloud.com/4893 (add --index support to the test framework)
http://review.whamcloud.com/4897 (t-f changes for new quota)
http://review.whamcloud.com/4915 (new sanity-quota tests)
http://review.whamcloud.com/4958 (Handle OFD procfs changes)
http://review.whamcloud.com/4959 (Support for MDS-initiated OST_DESTROYs)
http://review.whamcloud.com/4986 (Adapt oos to the new grant and osd-zfs behavior)

Comment by Jian Yu [ 17/Jan/13 ]

The patches have been landed on b2_1 and b1_8 branches separately.

Generated at Sat Feb 10 01:17:25 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.