[LU-18783] pjdfstest test chmod 2 fails with ZFS 2.3.0

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • Lustre 2.17.0
    • RHEL8 debug kernel running pjdfstest test
    • 3

    Description

      Oleg's RHEL8 debug setup sees the following failure.

      == run_pjdfstest test chmod_02: chmod returns ENAMETOOLONG if a component of a pathname exceeded {NAME_MAX} characters === 10:00:33 
      Run /usr/share/pjdfstest/chmod/02.t against ext4 filesystem
      prove -f /usr/share/pjdfstest/chmod/02.t &> /tmp/pjdfstest-ext4
      Run /usr/share/pjdfstest/chmod/02.t against lustre filesystem
      lfs mkdir: dirstripe error on '/mnt/lustre/pjdfstest': stripe already set
      lfs setdirstripe: cannot create dir '/mnt/lustre/pjdfstest': File exists
      prove -f /usr/share/pjdfstest/chmod/02.t &> /tmp/pjdfstest-lustre
      
      ext4 report
      /usr/share/pjdfstest/chmod/02.t .. ok
      All tests successful.
      Files=1, Tests=5, 3 wallclock secs ( 0.05 usr 0.02 sys + 0.10 cusr 1.08 csys = 1.25 CPU)
      Result: PASS
      
      lustre report
      /usr/share/pjdfstest/chmod/02.t ..
      not ok 5 - tried 'chmod 69e00cf29204f42d5271017e619f9e726949887fc30437c9d7bf6ef841d5675998586322f182567bd51832542773da1a6446af072a67c42e5ade58a5897ea7aa9fe967a44e8f89312bd45fcab7ee2f38ce8503b80cf3428771fd3fd9a43832190da80bd43682353e47aff2168f4422fd27647dcea3f1b85a7d22c21dd9b72e2x 0620', expected ENAMETOOLONG, got ENOENT 
      Failed 1/5 subtests
      
       Test Summary Report 
      ------------------- 
      /usr/share/pjdfstest/chmod/02.t (Wstat: 0 Tests: 5 Failed: 1) 
       Failed test: 5
      Files=1, Tests=5, 2 wallclock secs ( 0.04 usr 0.02 sys + 0.10 cusr 0.98 csys = 1.14 CPU)
      Result: FAIL
      run_pjdfstest test_chmod_02: @@@@@@ FAIL: /usr/share/pjdfstest/chmod/02.t against lustre failed
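
      The failing subtest boils down to calling chmod() on a path whose final name component is one byte longer than the limit the filesystem advertises, and expecting ENAMETOOLONG rather than ENOENT. Below is a minimal standalone sketch of that check (a hypothetical reproducer, not the pjdfstest script itself; the mount-point argument is an assumption):

      /* Hypothetical reproducer sketch: build a name component one byte
       * longer than the limit reported by pathconf() and verify that
       * chmod() fails with ENAMETOOLONG rather than ENOENT. */
      #include <errno.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <sys/stat.h>
      #include <unistd.h>

      int main(int argc, char **argv)
      {
              const char *dir = argc > 1 ? argv[1] : ".";   /* e.g. /mnt/lustre */
              long name_max = pathconf(dir, _PC_NAME_MAX);
              size_t dlen = strlen(dir);
              char *path;

              if (name_max <= 0)
                      name_max = 255;         /* fall back to POSIX NAME_MAX */

              /* "<dir>/" + (name_max + 1) 'x' characters + NUL */
              path = malloc(dlen + name_max + 3);
              if (path == NULL)
                      return 1;
              sprintf(path, "%s/", dir);
              memset(path + dlen + 1, 'x', name_max + 1);
              path[dlen + name_max + 2] = '\0';

              errno = 0;
              if (chmod(path, 0620) == 0)
                      printf("unexpected success\n");
              else if (errno == ENAMETOOLONG)
                      printf("got ENAMETOOLONG as expected\n");
              else
                      printf("expected ENAMETOOLONG, got %s\n", strerror(errno));

              free(path);
              return 0;
      }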
      

          Activity


            Judging by my patch https://review.whamcloud.com/58397 ("LU-18803 class: add namelen_max and maxbytes params"), ZFS is indeed returning namelen=256 to the client, and the client only uses the value from MDT0000 for ll_namelen. The Janitor testing shows both ZFS and ldiskfs results.
            https://testing.whamcloud.com/gerrit-janitor/49958/results.html

            llite.lustre-ffff8800b6339800.namelen_max=256
            lov.lustre-clilov-ffff8800b6339800.namelen_max=255
            mdc.lustre-MDT0000-mdc-ffff8800b6339800.namelen_max=256
            

            On ldiskfs it is always 255:

            llite.lustre-ffff8800b43eb800.namelen_max=255
            lov.lustre-clilov-ffff8800b43eb800.namelen_max=255
            mdc.lustre-MDT0000-mdc-ffff8800b43eb800.namelen_max=255
            

            It would also make sense for the MDT to return 255 for now, to help older clients along. I don't know why the change in MDD is not doing this...

            adilger Andreas Dilger added a comment
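
            As a rough illustration of the server-side clamping suggested in the comment above (a hypothetical sketch, not the actual mdd code or any posted patch; the struct and function names are made up), capping the advertised limit would look something like:

            /* Hypothetical sketch: cap the advertised maximum name length at
             * the POSIX NAME_MAX (255) before it is returned to clients, so a
             * ZFS backend reporting 256 does not leak a larger limit to older
             * clients. */
            #include <limits.h>                 /* NAME_MAX */

            struct statfs_sketch {              /* stand-in for the real statfs data */
                    unsigned int os_namelen;
            };

            static void clamp_namelen(struct statfs_sketch *osfs)
            {
                    if (osfs->os_namelen > NAME_MAX)
                            osfs->os_namelen = NAME_MAX;
            }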

            I filed LU-18803 to track the removal of NAME_MAX from the client and server code. I don't think it would be a lot of work, but I also don't think it has a high demand compared to lots of other improvements that could be made.

            However, if someone has a strong use case for filenames > 255 characters, I think it could be done in a few days of work.

            adilger Andreas Dilger added a comment

            Some further comments here:

            • the VFS itself doesn't actually impose a NAME_MAX limit, which is why ZFS can even have 1024-character filenames
            • what happens if there are different MDTs with different limits (e.g. two ZFS MDTs running different ZFS versions or pool configs during an upgrade)? Should the names be limited by the MDT a particular directory is on (which is the case today) or should there be a single limit across the whole filesystem to reduce confusion in userspace applications?

            I suspect the root of the problem is that mdd_statfs() limiting os_namelen to NAME_MAX doesn't actually affect any filesystem operations on the OSD or client. Patch http://review.whamcloud.com/8217 ("LU-4219 mdd: limit os_namelen to the max of NAME_MAX") fixes the statfs.f_namelen value returned to userspace and quieted the fpathconf test failures in that ticket, but the MDT is actually depending on the OSD layer to enforce the namelen limit during file creation and truncating os_namelen doesn't change this. Since the ZFS limit is 256, it would (currently) allow filenames up to that limit to be created.

            It likely makes sense that lmv_statfs() should find the minimum os_namelen returned from any MDT when aggregating osfs->os_namelen (capped at NAME_MAX on the client), and this value should be set in sbi->ll_namelen and returned by ll_statfs_internal().

            I don't think that >255-char filenames are a "problem" per se, just that pjdfstest depends on glibc and POSIX limits. That is really a problem in the test/spec in the end.

            I was going to suggest to add osd-zfs and llite tunables to allow changing the os_namelen limit (and depend only on the client to do the truncation), but it appears that the Lustre code is using NAME_MAX all over the place (e.g. llite, lmv, lfsck, lod, mdd, mdt should use os_namelen, fscrypt and ext4-filename-encode.patch should use EXT4_NAME_MAX, etc.), so increasing this limit would be a much larger project than fixing this one issue, and should get its own Jira ticket.

            adilger Andreas Dilger added a comment
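
            A minimal sketch of the minimum-aggregation described in the comment above (hypothetical struct and function names, not the actual lmv_statfs() code):

            /* Hypothetical sketch: when aggregating per-MDT statfs results,
             * keep the smallest advertised name length and cap it at NAME_MAX,
             * so the client-side limit (ll_namelen) reflects the most
             * restrictive MDT. */
            #include <limits.h>                 /* NAME_MAX */

            struct mdt_statfs_sketch {          /* stand-in for per-target statfs data */
                    unsigned int os_namelen;
            };

            static unsigned int aggregate_namelen(const struct mdt_statfs_sketch *tgts,
                                                  int count)
            {
                    unsigned int namelen = NAME_MAX;
                    int i;

                    for (i = 0; i < count; i++)
                            if (tgts[i].os_namelen && tgts[i].os_namelen < namelen)
                                    namelen = tgts[i].os_namelen;

                    return namelen;             /* would be stored in sbi->ll_namelen */
            }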

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/58336
            Subject: LU-18783 llite: ensure dentry name is not longer than NAME_MAX
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 701301ffd9aa73c63539dfe3d65a964eea5feee8

            gerrit Gerrit Updater added a comment
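
            The patch subject suggests a client-side length check along these lines (a hypothetical sketch only, not the contents of change 58336):

            /* Hypothetical sketch: reject a name component longer than
             * NAME_MAX with -ENAMETOOLONG before the request is sent to the
             * MDT, so applications see the POSIX error instead of ENOENT. */
            #include <errno.h>
            #include <limits.h>                 /* NAME_MAX */

            static int name_len_check_sketch(unsigned int namelen)
            {
                    if (namelen > NAME_MAX)
                            return -ENAMETOOLONG;
                    return 0;
            }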

            People

              Assignee: James A Simmons (simmonsja)
              Reporter: James A Simmons (simmonsja)
              Votes: 0
              Watchers: 3
