[LU-15748] interop: sanity test_150b: fallocate failed, error Operation not supported, mode 0, offset 62914560, len 4194304 Created: 14/Apr/22 Updated: 27/Apr/23 Resolved: 29/Nov/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.15.0 |
| Fix Version/s: | Lustre 2.15.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Maloo | Assignee: | Arshad Hussain |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Cliff White <cwhite@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/1c15b404-f4af-4275-9a77-f5a50b30ca56

keep default fallocate mode: 0
fallocate failed, error Operation not supported, mode 0, offset 62914560, len 4194304
sanity test_150b: @@@@@@ FAIL: fallocate failed
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:6273:error()
= /usr/lib64/lustre/tests/sanity.sh:13441:test_150b()
= /usr/lib64/lustre/tests/test-framework.sh:6576:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:6623:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:6465:run_test()
= /usr/lib64/lustre/tests/sanity.sh:13443:main()
Dumping lctl log to /autotest/autotest-1/2022-04-03/lustre-master_full-part-2_4283_1_29_b51025b5-001f-45f2-9eb2-ebe47692e36c//sanity.test_150b.*.1649017567.log |
| Comments |
| Comment by Patrick Farrell [ 18/Apr/22 ] |
|
arshad512 , are you able to take a look at this? |
| Comment by Arshad Hussain [ 18/Apr/22 ] |
|
@Patrick Farrell, sorry, I missed this JIRA. I will have a look and update on this.
|
| Comment by Patrick Farrell [ 18/Apr/22 ] |
|
arshad512 - no worries, it's quite new. Thanks for taking a look. |
| Comment by Arshad Hussain [ 19/Apr/22 ] |
|
@Andreas, @Patrick This is an interop bug, seen with a 2.14 client and a 2.15.0-RC3 server. A new check was added in lustre/ofd/ofd_dev.c when fallocate(punch) was introduced:

    if ((oa->o_valid & (OBD_MD_FLSIZE | OBD_MD_FLBLOCKS)) !=
        (OBD_MD_FLSIZE | OBD_MD_FLBLOCKS)) {
            ...
    }

This was reproducible (without the patch) on a 2.14 client:

    # ./check_fallocate /mnt/lustre/a
    fallocate failed, error Operation not supported, mode 0, offset 62914560, len 4194304

After the patch, on a 2.14 client:

    # ./check_fallocate /mnt/lustre/a
    # ls -ali /mnt/lustre/a
    144115205272502273 -rwx------ 1 root root 125829120 Apr 19 05:56 /mnt/lustre/a

Since the check is valid, the patch must be back-ported to 2.14. The other thing to check is that, within check_fallocate.c, the function test_prealloc_nonsparse() always carries the correct o_valid flags, whereas test_prealloc_sparse() does not, and that is the case that was failing. |
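The flag test quoted in this comment can be illustrated in isolation. The following is only a minimal sketch: the `SKETCH_` constants and the function name are hypothetical, and the bit values are placeholders chosen for the example rather than values taken from the Lustre headers. It shows that the check passes only when both bits are present in `o_valid`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for this sketch only (assumption, not from Lustre). */
#define SKETCH_MD_FLSIZE   0x10ULL
#define SKETCH_MD_FLBLOCKS 0x20ULL

/*
 * Return true only if BOTH the size and blocks bits are set in o_valid.
 * Comparing the masked value against the full mask (rather than testing
 * for non-zero) is what rejects a request that sets just one of the bits.
 */
bool sketch_falloc_flags_ok(uint64_t o_valid)
{
	const uint64_t required = SKETCH_MD_FLSIZE | SKETCH_MD_FLBLOCKS;

	return (o_valid & required) == required;
}
```

Under these assumptions, a client that fills in the offsets but leaves either bit unset is rejected, which matches the "Operation not supported" failure reported above.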
| Comment by Gerrit Updater [ 19/Apr/22 ] |
|
|
| Comment by Andreas Dilger [ 19/Apr/22 ] |
|
It isn't really possible to retroactively fix older releases in this way. Even if this patch landed on b2_14 it would not help the sites that are running 2.14.0. Do you know if 2.14.0 correctly populated these fields, but just didn't set the flags?

In that case, one solution would be to disable the new flag check on master before 2.15.0 is released, and put it under:

    #if LUSTRE_VERSION_CODE > OBD_OCD_VERSION(2, 18, 52, 0)

and only check that the values are set for the next few releases. Also, we might (initially) make this check conditional on the use of FALLOC_FL_PUNCH_HOLE, since that is when the client started setting these flags.

Alternately, the server could check if the client version is >= 2.15.0 before enforcing the check, but this makes things complicated if backporting the punch feature to older releases (though at worst a sanity check is removed that might still be reasonably validated by checking that the o_size and o_blocks values are sane).

Alternately, it could be checked by an OBD_CONNECT2_* flag already added in 2.15.0, but I'm a bit reluctant to burn a new flag for this minor issue (though it wouldn't be the end of the world if others think that is needed).

It looks like the lack of OBD_CONNECT_TRUNCLOCK would indicate a newer client (it was set by 2.14.0 clients and sent to both MDTs and OSTs, but not by 2.15.0 clients), which would give us something like:

    /* was OBD_CONNECT_TRUNCLOCK 0x400ULL: locks on server for punch */
    /* temporary usage until 2.21.53 to indicate pre-2.15 client, see LU-15748 */
    #define OBD_CONNECT_OLD_FALLOC 0x400ULL /* missing o_valid flags */

    /*
     * fallocate() start and end are passed in o_size and o_blocks
     * on the wire. Clients 2.15.0 and newer should always set
     * the OBD_MD_FLSIZE and OBD_MD_FLBLOCKS valid flags, but some
     * older client versions did not. We permit older clients to
     * not set these flags, checking their version by proxy using
     * the presence of OBD_CONNECT_TRUNCLOCK to imply 2.14.0 and older.
     *
     * Return -EOPNOTSUPP to also work with older clients not
     * supporting newer server modes.
     */
    if ((oa->o_valid & (OBD_MD_FLSIZE | OBD_MD_FLBLOCKS)) !=
        (OBD_MD_FLSIZE | OBD_MD_FLBLOCKS)
    #if LUSTRE_VERSION_CODE < OBD_OCD_VERSION(2, 21, 53, 0)
        && !(tgt_conn_flags(tsi) & OBD_CONNECT_OLD_FALLOC)
    #endif
        )
            RETURN(-EOPNOTSUPP);

    start = oa->o_size;
    end = oa->o_blocks;
    /* verify arguments are sane (len <= 0 also denied by client VFS) */
    if (start >= end)
            RETURN(-EINVAL);

    /*
     * mode == 0 (which is standard prealloc) and PUNCH is supported
     * Rest of mode options are not supported yet.
     */
    mode = oa->o_falloc_mode;
    if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
            RETURN(-EOPNOTSUPP);

NOTE the same problem exists in mdt_fallocate_hdl() but has not even been fixed with the |
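The order of checks proposed in this comment can be exercised outside the kernel as a plain function. This is a hedged sketch only: `sk_falloc_check` and the `SK_` constants are hypothetical names, the `o_valid` bit values are placeholders, 0x400 is the value quoted in the comment, and the two mode bits use the Linux `FALLOC_FL_KEEP_SIZE` and `FALLOC_FL_PUNCH_HOLE` values. The version-gated `#if` is dropped, so the old-client exemption is always active here.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder o_valid bits for this sketch (assumption, not from Lustre). */
#define SK_MD_FLSIZE          0x10ULL
#define SK_MD_FLBLOCKS        0x20ULL
/* Connect flag value as quoted in the comment above. */
#define SK_CONNECT_OLD_FALLOC 0x400ULL
/* Same numeric values as Linux FALLOC_FL_KEEP_SIZE / FALLOC_FL_PUNCH_HOLE. */
#define SK_FL_KEEP_SIZE       0x01
#define SK_FL_PUNCH_HOLE      0x02

/*
 * Mirror the proposed server-side validation order:
 *  1. missing o_valid bits are fatal unless the client advertised the
 *     old connect flag (taken here as a proxy for a pre-2.15 client);
 *  2. the byte range must be non-empty;
 *  3. only mode 0, KEEP_SIZE and PUNCH_HOLE bits are accepted.
 * Returns 0 on success or a negative errno value.
 */
int sk_falloc_check(uint64_t o_valid, uint64_t conn_flags,
		    uint64_t start, uint64_t end, int mode)
{
	const uint64_t required = SK_MD_FLSIZE | SK_MD_FLBLOCKS;

	if ((o_valid & required) != required &&
	    !(conn_flags & SK_CONNECT_OLD_FALLOC))
		return -EOPNOTSUPP;

	if (start >= end)
		return -EINVAL;

	if (mode & ~(SK_FL_KEEP_SIZE | SK_FL_PUNCH_HOLE))
		return -EOPNOTSUPP;

	return 0;
}
```

With the failing request from the description (offset 62914560, length 4194304, so end 67108864, mode 0), a call with no `o_valid` bits returns `-EOPNOTSUPP` unless the old connect flag is passed, in which case it returns 0.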
| Comment by Gerrit Updater [ 20/Apr/22 ] |
|
"Arshad Hussain <arshad.hussain@aeoncomputing.com>" uploaded a new patch: https://review.whamcloud.com/47098 |
| Comment by Gerrit Updater [ 28/Apr/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/47098/ |
| Comment by Peter Jones [ 28/Apr/22 ] |
|
Landed for 2.15 |
| Comment by Andreas Dilger [ 06/Jun/22 ] |
|
I think the patch to fix the interop is backward and needs a minor fix. |
| Comment by Gerrit Updater [ 06/Jun/22 ] |
|
"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/47548 |
| Comment by Gerrit Updater [ 06/Jun/22 ] |
|
|
| Comment by Gerrit Updater [ 21/Aug/22 ] |
|
|
| Comment by Gerrit Updater [ 29/Nov/22 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/47548/ |
| Comment by Peter Jones [ 29/Nov/22 ] |
|
Landed for 2.15 |