Lustre / LU-2904

parallel-scale-nfsv3: FAIL: setup nfs failed!

Details


    Description

      The parallel-scale-nfsv3 test failed as follows:

      Mounting NFS clients (version 3)...
      CMD: client-12vm1,client-12vm2 mkdir -p /mnt/lustre
      CMD: client-12vm1,client-12vm2 mount -t nfs -o nfsvers=3,async                 client-12vm3:/mnt/lustre /mnt/lustre
      client-12vm2: mount.nfs: Connection timed out
      client-12vm1: mount.nfs: Connection timed out
       parallel-scale-nfsv3 : @@@@@@ FAIL: setup nfs failed! 
      

      The syslog on the Lustre MDS / Lustre client / NFS server node client-12vm3 showed:

      Mar  4 17:34:15 client-12vm3 mrshd[4254]: root@client-12vm1.lab.whamcloud.com as root: cmd='(PATH=$PATH:/usr/lib64/lustre/utils:/usr/lib64/lustre/tests:/sbin:/usr/sbin; cd /usr/lib64/lustre/tests; LUSTRE="/usr/lib64/lustre"  sh -c "exportfs -o rw,async,no_root_squash *:/mnt/lustre         && exportfs -v");echo XXRETCODE:$?'
      Mar  4 17:34:15 client-12vm3 xinetd[1640]: EXIT: mshell status=0 pid=4253 duration=0(sec)
      Mar  4 17:34:16 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:894 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:34:16 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:713 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:34:17 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:784 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:34:17 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:877 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:34:19 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:946 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:34:19 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:1013 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:34:23 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:797 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:34:23 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:701 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:34:31 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:719 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:34:31 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:941 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:34:41 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:943 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:34:41 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:810 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:34:51 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:849 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:34:51 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:740 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:35:01 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:846 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:35:01 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:667 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:35:11 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:955 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:35:11 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:1006 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:35:21 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:828 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:35:21 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:739 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:35:31 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:1011 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:35:31 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:994 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:35:41 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:847 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:35:41 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:756 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:35:51 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:892 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:35:51 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:749 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:36:01 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:1017 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:36:01 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:873 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:36:11 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:874 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:36:11 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:749 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:36:21 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.207:916 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:36:21 client-12vm3 rpc.mountd[4165]: authenticated mount request from 10.10.4.206:841 for /mnt/lustre (/mnt/lustre)
      Mar  4 17:36:21 client-12vm3 xinetd[1640]: START: mshell pid=4286 from=::ffff:10.10.4.206
      Mar  4 17:36:21 client-12vm3 mrshd[4287]: root@client-12vm1.lab.whamcloud.com as root: cmd='/usr/sbin/lctl mark "/usr/sbin/lctl mark  parallel-scale-nfsv3 : @@@@@@ FAIL: setup nfs failed! ";echo XXRETCODE:$?'
      Mar  4 17:36:21 client-12vm3 kernel: Lustre: DEBUG MARKER: /usr/sbin/lctl mark  parallel-scale-nfsv3 : @@@@@@ FAIL: setup nfs failed!
      

      Maloo report: https://maloo.whamcloud.com/test_sets/5cbf6978-853e-11e2-bfd3-52540035b04c


          Activity

            [LU-2904] parallel-scale-nfsv3: FAIL: setup nfs failed!

            yong.fan nasf (Inactive) added a comment -

            1) In theory, we can fill the low 32 bits of the FSID with s_dev and fill the high 32 bits with anything, such as a Lustre magic value. But regardless of what is filled in, we cannot control how the caller uses the returned FSID. And there is no clear advantage to replacing the current patch, since the 64-bit FSID is only generated at mount time.
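
            For illustration only, a minimal sketch of the fill described above, assuming it is done in the client statfs path; the constant LL_FSID_MAGIC and the function name are made up for this sketch, not taken from the actual patch:

                #include <linux/fs.h>
                #include <linux/statfs.h>

                /* Sketch only: s_dev in the low 32 bits, a constant "magic"
                 * value in the high 32 bits.  LL_FSID_MAGIC and the function
                 * name are illustrative assumptions. */
                #define LL_FSID_MAGIC 0x0BD00BD0U

                static void ll_fill_fsid_sketch(struct super_block *sb,
                                                struct kstatfs *buf)
                {
                        buf->f_fsid.val[0] = (u32)sb->s_dev; /* low 32 bits: device number */
                        buf->f_fsid.val[1] = LL_FSID_MAGIC;  /* high 32 bits: any constant */
                }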

            shadow Alexey Lyashkov added a comment -

            1) I understand (and agree) about returning an fs id from statfs, but I think we can use s_dev for it, since 32 bits is enough, and we could fill the high part of the fs id with a Lustre magic value if needed.

            2) Give me until Monday to look into the mountd code carefully.

            yong.fan nasf (Inactive) added a comment -

            The root issue is in the user-space nfs-utils.

            1) The FSID returned via statfs() to nfs-utils is 64 bits, whether the newly generated 64-bit FSID is used or the old 32-bit FSID is reused. If the old 32-bit FSID is used, then __kernel_fsid_t::val[1] (or val[0]) is zero. Before this patch was applied, Lustre did not return an FSID via statfs() at all.

            2) The root NFS handle contains the root inode number. The Lustre root inode number is 64 bits, but when nfs-utils parses the root handle it is truncated to 32 bits, so it cannot locate the right "inode". I have made a patch for that and sent it to the related kernel maintainers; I hope the patch can be accepted and land in the next nfs-utils release.

            diff --git a/utils/mountd/cache.c b/utils/mountd/cache.c
            index 517aa62..a7212e7 100644
            --- a/utils/mountd/cache.c
            +++ b/utils/mountd/cache.c
            @@ -388,7 +388,7 @@ struct parsed_fsid {
             	int fsidtype;
             	/* We could use a union for this, but it would be more
             	 * complicated; why bother? */
            -	unsigned int inode;
            +	uint64_t inode;
             	unsigned int minor;
             	unsigned int major;
             	unsigned int fsidnum;
            --
            1.7.1
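
            To see why the truncation matters, a small stand-alone illustration (the inode number below is a made-up value, not one from this test run):

                #include <stdint.h>
                #include <stdio.h>

                /* A 64-bit inode number stored into a 32-bit field loses its
                 * high bits, so mountd ends up looking up the wrong inode. */
                int main(void)
                {
                        uint64_t lustre_root_ino = 0x200000007ULL; /* hypothetical 64-bit inode# */
                        unsigned int truncated = (unsigned int)lustre_root_ino;

                        printf("64-bit inode#:           %#llx\n",
                               (unsigned long long)lustre_root_ino);
                        printf("after 32-bit truncation: %#x\n", truncated);
                        return 0;
                }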

            shadow Alexey Lyashkov added a comment -

            Anyway, most kernel filesystems use:

                    u64 id = huge_encode_dev(sb->s_bdev->bd_dev);
                    buf->f_fsid.val[0] = (u32)id;
                    buf->f_fsid.val[1] = (u32)(id >> 32);

            I don't understand why we don't use the same approach.

            shadow Alexey Lyashkov added a comment -

            Do you really think 2^32 Lustre mounts will exist on a single node? The fsid is just a unique id for a mount point.
            The reason it is 64 bits is 32 bits for the block device id and 32 bits for the slice number inside the block device; it just needs to identify a mount point correctly. In the Lustre case we don't have slices inside a device and don't need to fill that part, but the device id exactly identifies a mount point.

            As for the root node for an export, let me look, but as I remember nfs-utils also takes the NFS handle from the kernel.

            yong.fan nasf (Inactive) added a comment -

            Honestly, I am not sure whether 32 bits is enough for all kinds of statfs() users. It is true that in a mixed environment a new client will export a 64-bit FSID and an old client will export a 32-bit FSID; such a difference may cause issues if users want to access Lustre via different clients with the same handle. But I do not know whether anyone really wants to use Lustre that way. In the long run we need to upgrade the FSID to 64 bits; otherwise, if 32 bits were always enough, the statfs() API could be shrunk...

            As for the NFS handle with lu_fid, it works for objects under the export point, but the root NFS handle does not contain the lu_fid (it does not go down to Lustre). That is why we made this patch.
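
            For context, a simplified sketch of what a FID-carrying NFS handle might look like; the struct and field names here are illustrative assumptions, not the exact Lustre definitions:

                #include <stdint.h>

                /* Per-object NFS handles can carry full 128-bit FIDs, so they
                 * do not depend on a 32-bit inode#; the root handle built by
                 * nfsd/mountd carries no FID, which is where the truncation
                 * problem appears. */
                struct lu_fid_sketch {
                        uint64_t f_seq; /* sequence number */
                        uint32_t f_oid; /* object id within the sequence */
                        uint32_t f_ver; /* version */
                };

                struct lustre_nfs_fid_sketch {
                        struct lu_fid_sketch child;  /* FID of the object itself */
                        struct lu_fid_sketch parent; /* FID of its parent directory */
                };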


            shadow Alexey Lyashkov added a comment -

            The fsid has a single requirement: it should be the same across the cluster and unique on a node.
            I think a 32-bit id is enough to encode the fs id in statfs.
            But using a single fsid has an interoperability benefit: different nodes (with older and newer nfsd) have the same fs id in their NFS handles, so they may be used in a failover pair.

            I also have a question about the second patch: we have prepared our own NFS handle structure with lu_fid inside, and it should not have a 32-bit limitation. If we have lost a code path somewhere and the NFS handle is created in the wrong format, we need to investigate that.

            yong.fan nasf (Inactive) added a comment -

            A 32-bit uuid may work for this case, but since the POSIX API is 64 bits, and statfs() is used not only for re-exporting via NFS but also by others, we prefer to generate and return a 64-bit uuid as expected.

            shadow Alexey Lyashkov added a comment -

            The last patch
            http://git.whamcloud.com/?p=fs/lustre-release.git;a=commitdiff;h=8c4f4a47e051b097358818f4d3777d02124abbe7

            looks invalid - the Lustre client already has such code:

                    /* We set sb->s_dev equal on all lustre clients in order to support
                     * NFS export clustering.  NFSD requires that the FSID be the same
                     * on all clients. */
                    /* s_dev is also used in lt_compare() to compare two fs, but that is
                     * only a node-local comparison. */
                    uuid = obd_get_uuid(sbi->ll_md_exp);
                    if (uuid != NULL)
                            sb->s_dev = get_uuid2int(uuid->uuid, strlen(uuid->uuid));
                    sbi->ll_mnt = mnt;
            

            In that case, exporting s_dev via statfs should be enough.
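
            A minimal sketch of that suggestion, assuming the fill happens in Lustre's statfs path; the function name is made up for this sketch:

                #include <linux/fs.h>
                #include <linux/kdev_t.h>
                #include <linux/statfs.h>

                /* Sketch only: derive the FSID from the (cluster-wide,
                 * uuid-derived) sb->s_dev that the client code above already
                 * sets up for NFS export clustering. */
                static void ll_fill_fsid_from_s_dev(struct super_block *sb,
                                                    struct kstatfs *buf)
                {
                        u64 id = huge_encode_dev(sb->s_dev);

                        buf->f_fsid.val[0] = (u32)id;
                        buf->f_fsid.val[1] = (u32)(id >> 32);
                }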

            yujian Jian Yu added a comment - edited

            Patch for the Lustre b2_1 branch to add the "32bitapi" Lustre client mount option when exporting the Lustre client as an NFSv3 server: http://review.whamcloud.com/6457
            Patch for Lustre b1_8 branch: http://review.whamcloud.com/6663
            Patch for Lustre master branch: http://review.whamcloud.com/6649

            yujian Jian Yu added a comment -

            Lustre b2_1 client build: http://build.whamcloud.com/job/lustre-b2_1/204
            Lustre master server build: http://build.whamcloud.com/job/lustre-master/1508
            Distro/Arch: RHEL6.4/x86_64

            The issue still occurred: https://maloo.whamcloud.com/test_sets/b5a0c146-c624-11e2-9bf1-52540035b04c

            CMD: client-26vm3 exportfs -o rw,async,no_root_squash *:/mnt/lustre         && exportfs -v
            /mnt/lustre   	<world>(rw,async,wdelay,no_root_squash,no_subtree_check)
            
            Mounting NFS clients (version 3)...
            CMD: client-26vm5,client-26vm6.lab.whamcloud.com mkdir -p /mnt/lustre
            CMD: client-26vm5,client-26vm6.lab.whamcloud.com mount -t nfs -o nfsvers=3,async                 client-26vm3:/mnt/lustre /mnt/lustre
            client-26vm6: mount.nfs: Connection timed out
            client-26vm5: mount.nfs: Connection timed out
             parallel-scale-nfsv3 : @@@@@@ FAIL: setup nfs failed! 
            

            Somehow, this issue can be resolved by specifying "fsid=1" in the NFS export options (without the "32bitapi" Lustre mount option) when re-exporting Lustre via NFS (v3 or v4). For example: "/mnt/lustre 10.211.55.*(rw,no_root_squash,fsid=1)". (Verified on 2.6.32-358.2.1.el6.)

            We need a patch on Lustre b2_1 branch to resolve the interop issue.


            People

              yong.fan nasf (Inactive)
              yujian Jian Yu
              Votes: 0
              Watchers: 17
