[LU-18783] pjdfstest test chmod 2 fails with ZFS 2.3.0

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • Lustre 2.17.0
    • RHEL8 debug kernel running pjdfstest test
    • 3

    Description

      Oleg's RHEL8 debug setup sees the following failure.

      == run_pjdfstest test chmod_02: chmod returns ENAMETOOLONG if a component of a pathname exceeded {NAME_MAX} characters === 10:00:33 
      Run /usr/share/pjdfstest/chmod/02.t against ext4 filesystem
      prove -f /usr/share/pjdfstest/chmod/02.t &> /tmp/pjdfstest-ext4
      Run /usr/share/pjdfstest/chmod/02.t against lustre filesystem
      lfs mkdir: dirstripe error on '/mnt/lustre/pjdfstest': stripe already set
      lfs setdirstripe: cannot create dir '/mnt/lustre/pjdfstest': File exists
      prove -f /usr/share/pjdfstest/chmod/02.t &> /tmp/pjdfstest-lustre
      
      ext4 report
      /usr/share/pjdfstest/chmod/02.t .. ok
      All tests successful.
      Files=1, Tests=5, 3 wallclock secs ( 0.05 usr 0.02 sys + 0.10 cusr 1.08 csys = 1.25 CPU)
      Result: PASS
      
      lustre report
      /usr/share/pjdfstest/chmod/02.t ..
      not ok 5 - tried 'chmod 69e00cf29204f42d5271017e619f9e726949887fc30437c9d7bf6ef841d5675998586322f182567bd51832542773da1a6446af072a67c42e5ade58a5897ea7aa9fe967a44e8f89312bd45fcab7ee2f38ce8503b80cf3428771fd3fd9a43832190da80bd43682353e47aff2168f4422fd27647dcea3f1b85a7d22c21dd9b72e2x 0620', expected ENAMETOOLONG, got ENOENT 
      Failed 1/5 subtests
      
       Test Summary Report 
      ------------------- 
      /usr/share/pjdfstest/chmod/02.t (Wstat: 0 Tests: 5 Failed: 1) 
       Failed test: 5
      Files=1, Tests=5, 2 wallclock secs ( 0.04 usr 0.02 sys + 0.10 cusr 0.98 csys = 1.14 CPU)
      Result: FAIL
      run_pjdfstest test_chmod_02: @@@@@@ FAIL: /usr/share/pjdfstest/chmod/02.t against lustre failed
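
      The failing subtest boils down to calling chmod() on a path whose final name component is one byte longer than the limit the filesystem advertises, and expecting ENAMETOOLONG rather than ENOENT. Below is a minimal standalone sketch of that check (a hypothetical reproducer, not the pjdfstest script itself; the mount-point argument is an assumption):

      /* Hypothetical reproducer sketch: build a name component one byte
       * longer than the limit reported by pathconf() and verify that
       * chmod() fails with ENAMETOOLONG rather than ENOENT. */
      #include <errno.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <sys/stat.h>
      #include <unistd.h>

      int main(int argc, char **argv)
      {
              const char *dir = argc > 1 ? argv[1] : ".";   /* e.g. /mnt/lustre */
              long name_max = pathconf(dir, _PC_NAME_MAX);
              size_t dlen = strlen(dir);
              char *path;

              if (name_max <= 0)
                      name_max = 255;         /* fall back to POSIX NAME_MAX */

              /* "<dir>/" + (name_max + 1) 'x' characters + NUL */
              path = malloc(dlen + name_max + 3);
              if (path == NULL)
                      return 1;
              sprintf(path, "%s/", dir);
              memset(path + dlen + 1, 'x', name_max + 1);
              path[dlen + name_max + 2] = '\0';

              errno = 0;
              if (chmod(path, 0620) == 0)
                      printf("unexpected success\n");
              else if (errno == ENAMETOOLONG)
                      printf("got ENAMETOOLONG as expected\n");
              else
                      printf("expected ENAMETOOLONG, got %s\n", strerror(errno));

              free(path);
              return 0;
      }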
      

          Activity


            Judging by my patch https://review.whamcloud.com/58397 ("LU-18803 class: add namelen_max and maxbytes params"), ZFS is indeed returning namelen=256 to the client, and the client only uses the value from MDT0000 for ll_namelen. The Janitor testing shows both ZFS and ldiskfs results.
            https://testing.whamcloud.com/gerrit-janitor/49958/results.html

            llite.lustre-ffff8800b6339800.namelen_max=256
            lov.lustre-clilov-ffff8800b6339800.namelen_max=255
            mdc.lustre-MDT0000-mdc-ffff8800b6339800.namelen_max=256
            

            On ldiskfs it is always 255:

            llite.lustre-ffff8800b43eb800.namelen_max=255
            lov.lustre-clilov-ffff8800b43eb800.namelen_max=255
            mdc.lustre-MDT0000-mdc-ffff8800b43eb800.namelen_max=255
            

            It would also make sense for the MDT to return 255 for now, to help older clients along. I don't know why the change in MDD is not doing this...

            adilger Andreas Dilger added a comment
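
            As a rough illustration of the server-side clamping suggested in the comment above (a hypothetical sketch, not the actual mdd code or any posted patch; the struct and function names are made up), capping the advertised limit would look something like:

            /* Hypothetical sketch: cap the advertised maximum name length at
             * the POSIX NAME_MAX (255) before it is returned to clients, so a
             * ZFS backend reporting 256 does not leak a larger limit to older
             * clients. */
            #include <limits.h>                 /* NAME_MAX */

            struct statfs_sketch {              /* stand-in for the real statfs data */
                    unsigned int os_namelen;
            };

            static void clamp_namelen(struct statfs_sketch *osfs)
            {
                    if (osfs->os_namelen > NAME_MAX)
                            osfs->os_namelen = NAME_MAX;
            }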

            I filed LU-18803 to track the removal of NAME_MAX from the client and server code. I don't think it would be a lot of work, but I also don't think it has a high demand compared to lots of other improvements that could be made.

            However, if someone has a strong use case for filenames > 255 characters, I think it could be done in a few days of work.

            adilger Andreas Dilger added a comment

            Some further comments here:

            • the VFS itself doesn't actually impose a NAME_MAX limit, which is why ZFS can even have 1024-character filenames
            • what happens if there are different MDTs with different limits (e.g. two ZFS MDTs running different ZFS versions or pool configs during an upgrade)? Should the names be limited by the MDT a particular directory is on (which is the case today) or should there be a single limit across the whole filesystem to reduce confusion in userspace applications?

            I suspect the root of the problem is that mdd_statfs() limiting os_namelen to NAME_MAX doesn't actually affect any filesystem operations on the OSD or client. Patch http://review.whamcloud.com/8217 ("LU-4219 mdd: limit os_namelen to the max of NAME_MAX") fixes the statfs.f_namelen value returned to userspace and quieted the fpathconf test failures in that ticket, but the MDT is actually depending on the OSD layer to enforce the namelen limit during file creation and truncating os_namelen doesn't change this. Since the ZFS limit is 256, it would (currently) allow filenames up to that limit to be created.

            It likely makes sense that lmv_statfs() should find the minimum os_namelen returned from any MDT when aggregating osfs->os_namelen (capped at NAME_MAX on the client), and this value should be set in sbi->ll_namelen and returned by ll_statfs_internal().

            I don't think that >255-char filenames are a "problem" per se, just that pjdfstest depends on glibc and POSIX limits. That is really a problem in the test/spec in the end.

            I was going to suggest to add osd-zfs and llite tunables to allow changing the os_namelen limit (and depend only on the client to do the truncation), but it appears that the Lustre code is using NAME_MAX all over the place (e.g. llite, lmv, lfsck, lod, mdd, mdt should use os_namelen, fscrypt and ext4-filename-encode.patch should use EXT4_NAME_MAX, etc.), so increasing this limit would be a much larger project than fixing this one issue, and should get its own Jira ticket.

            adilger Andreas Dilger added a comment
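
            A minimal sketch of the minimum-aggregation described in the comment above (hypothetical struct and function names, not the actual lmv_statfs() code):

            /* Hypothetical sketch: when aggregating per-MDT statfs results,
             * keep the smallest advertised name length and cap it at NAME_MAX,
             * so the client-side limit (ll_namelen) reflects the most
             * restrictive MDT. */
            #include <limits.h>                 /* NAME_MAX */

            struct mdt_statfs_sketch {          /* stand-in for per-target statfs data */
                    unsigned int os_namelen;
            };

            static unsigned int aggregate_namelen(const struct mdt_statfs_sketch *tgts,
                                                  int count)
            {
                    unsigned int namelen = NAME_MAX;
                    int i;

                    for (i = 0; i < count; i++)
                            if (tgts[i].os_namelen && tgts[i].os_namelen < namelen)
                                    namelen = tgts[i].os_namelen;

                    return namelen;             /* would be stored in sbi->ll_namelen */
            }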

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/58336
            Subject: LU-18783 llite: ensure dentry name is not longer than NAME_MAX
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 701301ffd9aa73c63539dfe3d65a964eea5feee8

            gerrit Gerrit Updater added a comment
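
            The patch subject suggests a client-side length check along these lines (a hypothetical sketch only, not the contents of change 58336):

            /* Hypothetical sketch: reject a name component longer than
             * NAME_MAX with -ENAMETOOLONG before the request is sent to the
             * MDT, so applications see the POSIX error instead of ENOENT. */
            #include <errno.h>
            #include <limits.h>                 /* NAME_MAX */

            static int name_len_check_sketch(unsigned int namelen)
            {
                    if (namelen > NAME_MAX)
                            return -ENAMETOOLONG;
                    return 0;
            }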

            People

              Assignee: James A Simmons (simmonsja)
              Reporter: James A Simmons (simmonsja)
              Votes: 0
              Watchers: 3
