test: lustre-rsync-test test 1 - setfattr <file path>: Operation not supported

Details

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Blocker
    • Fix Version/s: Lustre 2.5.0
    • Affects Version/s: Lustre 2.3.0, Lustre 2.4.0, Lustre 2.1.3
    • 3
    • 5985

    Description

      == lustre-rsync-test test 1: Simple Replication ====================================================== 23:39:05 (1357025945)
      lustre-MDT0000: Registered changelog user cl1
      Replication #1
      Lustre filesystem: lustre
      MDT device: lustre-MDT0000
      Source: /mnt/nbp0-1
      Target: /var/acc-sm/target
      Target: /var/acc-sm/target2
      Statuslog: /var/acc-sm/lustre_rsync.log
      Changelog registration: cl1
      Starting changelog record: 0
      Clear changelog after use: no
      Errors: 0
      lustre_rsync took 0 seconds
      Changelog records consumed: 20
      setfattr: /mnt/nbp0-1/d0.lustre-rsync-test/d1/file5: Operation not supported
      Replication #2
      Replication of operation failed(-17): 20 SLINK (4) [0x200000400:0xe:0x0] [0x200000400:0x3:0x0] link3
      Lustre filesystem: lustre
      MDT device: lustre-MDT0000
      Source: /mnt/nbp0-1
      Target: /var/acc-sm/target
      Target: /var/acc-sm/target2
      Statuslog: /var/acc-sm/lustre_rsync.log
      Changelog registration: cl1
      Starting changelog record: 20
      Clear changelog after use: no
      Errors: 1
      lustre_rsync took 0 seconds
      Changelog records consumed: 4
      /var/acc-sm/target/d0.lustre-rsync-test/d1/file5: user.foo: No such attribute
      /var/acc-sm/target2/d0.lustre-rsync-test/d1/file5: user.foo: No such attribute
      lustre-rsync-test test_1: @@@@@@ FAIL: Error in replicating xattrs.
      Trace dump:
      = /usr/lib64/lustre/tests/test-framework.sh:3643:error_noexit()
      = /usr/lib64/lustre/tests/test-framework.sh:3665:error()
      = /usr/lib64/lustre/tests/lustre-rsync-test.sh:193:test_1()
      = /usr/lib64/lustre/tests/test-framework.sh:3907:run_one()
      = /usr/lib64/lustre/tests/test-framework.sh:3937:run_one_logged()
      = /usr/lib64/lustre/tests/test-framework.sh:3808:run_test()
      = /usr/lib64/lustre/tests/lustre-rsync-test.sh:205:main()
      Dumping lctl log to /var/acc-sm/test_logs/lustre-rsync-test.test_1.*.1357025946.log
      FAIL 1 (3s)

        Activity

          [LU-2559] test: lustre-rsync-test test 1 - setfattr <file path>: Operation not supported
          bogl Bob Glossman (Inactive) added a comment - - edited

          The flags reported by mount differ from those shown in /proc/mounts. Here are both.

          2.3 client:

          # mount
          centos2:/lustre on /mnt/lustre type lustre (rw,user_xattr,flock)
          
          # cat /proc/mounts
          192.168.0.36@tcp:/lustre /mnt/lustre lustre rw,relatime,flock 0 0
          

          2.1 server:

          # mount
          /dev/sdb on /mnt/mds1 type lustre (rw)
          
          # cat /proc/mounts
          /dev/sdb /mnt/mds1 lustre ro 0 0
          

          I think I see where you are going with this. Looks like MGS/MDS gets mounted on the server without user_xattr set. Looking at the test scripts I see a difference between 2.1 and 2.3. 2.1 has MDS_MOUNT_OPTS explicitly defined with user_xattr in it in cfg/local.sh, 2.3 has empty opts.

          I'm guessing that in the 2.3 timeframe server mounts enable user_xattr by default and no longer require explicit flags. That breaks when the scripts and cfg files from 2.3 are used with 2.1 servers.

          Jay, can you check this out by adding explicit
          MDS_MOUNT_OPTS="-o user_xattr,acl"
          to your cfg file on clients?
          If you just copied or modified the cfg/local.sh from the build these are set empty.
          Setting MDS_MOUNT_OPTS as an environment variable should work too.

          Try the test with this change.
          You can do the failing test alone with:

          auster -rv lustre-rsync-test --only 1
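
The flag comparison in this comment can be sketched as a small shell check. This is only an illustration, assuming input in the /proc/mounts field layout (device, mount point, type, options, dump, pass); the function name has_mount_opt is hypothetical and is not part of the Lustre test framework. The two sample lines are the ones pasted above.

```shell
#!/bin/sh
# has_mount_opt LINE OPT   (hypothetical helper, not from the Lustre tree)
# LINE is one line in /proc/mounts format:
#   device mountpoint fstype options dump pass
# Returns 0 if OPT is one of the comma-separated entries in the options field.
has_mount_opt() {
    opts=$(printf '%s\n' "$1" | awk '{ print $4 }')
    # Wrap in commas so that only whole option names match
    # (user_xattr must not match nouser_xattr).
    case ",$opts," in
        *,"$2",*) return 0 ;;
        *) return 1 ;;
    esac
}

# /proc/mounts lines copied from the comment above.
client='192.168.0.36@tcp:/lustre /mnt/lustre lustre rw,relatime,flock 0 0'
server='/dev/sdb /mnt/mds1 lustre ro 0 0'

for line in "$client" "$server"; do
    if has_mount_opt "$line" user_xattr; then
        echo "user_xattr present: $line"
    else
        echo "user_xattr missing: $line"
    fi
done
```

On a live node the same function could be fed lines read from /proc/mounts instead of the pasted samples.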

          yong.fan nasf (Inactive) added a comment -

          Can you run "mount" on both the 2.1 server and the 2.3 client and paste the output, so we can check which flags were enabled when you hit "Lustre: Disabling user_xattr feature because it is not supported on the server"? Thanks.

          jaylan Jay Lan (Inactive) added a comment -

          Bob, forget about the 2.1.3 client I mentioned in my earlier post on Jan 10. That was a beta version from SUSE. Since Peter said Intel would not support that version, it is irrelevant.

          But I did experience the problem in 2.3.0, as I first reported.

          bogl Bob Glossman (Inactive) added a comment -

          Jay, are you sure you have a 2.1.3 client for sles11 sp2? I think the standard prebuilt rpms for sles11 are for sp1, not sp2. I haven't been able to build any 2.1 client for sp2. I've only succeeded in building the 2.3 and master clients for sp2.

          You could run lctl lustre_build_version on the client to double check.

          jaylan Jay Lan (Inactive) added a comment -

          I saw this error again when testing the SUSE version of lustre-2.1.3 client for sles11sp2. Note that my servers are still 2.1.3.

          bogl Bob Glossman (Inactive) added a comment -

          nasf, adding you to the watcher list as Peter asked. This bug seems to be due to version interop problems with connect flags. Do you know of anything in the 2.3 timeframe that might be related? It was suggested you might know or have worked in this area. Just looking at the 2.1 vs. 2.3 code I haven't been able to spot anything obvious; all the flag definitions and so forth look compatible.

          bogl Bob Glossman (Inactive) added a comment - - edited

          Looks like this is an interop problem between 2.3 clients and 2.1 servers. The key fact is the following error that shows up in client dmesg at mount time:

          Lustre: Disabling user_xattr feature because it is not supported on the server

          This happens with any 2.3 or 2.3+ client, sles11 or centos.
          Once the client is mounted with user_xattr disabled, any setfattr command attempted by the client will fail.

          Not sure exactly why the 2.3 client thinks the 2.1 server is incapable of doing user_xattr.
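
A quick way to tell whether a client is in this state is to look for the message quoted above in its kernel log. The sketch below is illustrative only: the function name xattr_disabled is hypothetical, and the sample text is the dmesg line from this comment rather than live output.

```shell
#!/bin/sh
# xattr_disabled   (hypothetical helper)
# Reads kernel log text on stdin and returns 0 if the Lustre client
# reported that it turned off user_xattr at mount time.
xattr_disabled() {
    grep -q 'Disabling user_xattr feature because it is not supported on the server'
}

# Sample input: the dmesg line quoted in the comment above.
sample='Lustre: Disabling user_xattr feature because it is not supported on the server'

if printf '%s\n' "$sample" | xattr_disabled; then
    echo "client mounted with user_xattr disabled; setfattr is expected to fail"
else
    echo "no user_xattr warning found"
fi
```

On a real client the check would read the kernel log instead, for example: dmesg | xattr_disabled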

          bogl Bob Glossman (Inactive) added a comment -

          So far I haven't been able to reproduce the failure. I have been trying to vary the client side only. It will take me a bit longer to bring up a 2.1 server in case it's the server side that causes the problem. I will keep trying to reproduce it.

          pjones Peter Jones added a comment -

          Bob

          Could you please look into this one?

          Thanks

          Peter

          People

            Assignee: bogl Bob Glossman (Inactive)
            Reporter: jaylan Jay Lan (Inactive)