[LU-16743] ZFS sanity test_316: lfs migrate -m1 failed: no such file or directory
Created: 16/Apr/23  Updated: 13/Sep/23  Resolved: 13/Sep/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: Lai Siyao
Resolution: Fixed
Votes: 0
Labels: zfs

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/473fe491-78ba-4552-a0ab-556025352754

test_316 failed with the following error:

lfs migrate -m1 failed

Test session details:
clients: https://build.whamcloud.com/job/lustre-master-patchless/768 - 4.18.0-240.22.1.el8_3.x86_64
servers: https://build.whamcloud.com/job/lustre-master-patchless/768 - 4.18.0-240.22.1.el8_3.x86_64

Started failing on 2023-04-13. Nothing in the logs except an ESTALE error:

ll_close_inode_openhandle()) lustre-clilmv-ffff991acdd0a800: inode [0x200005223:0x10470:0x0] mdc close failed: rc = -116
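
For readers decoding the return code: Lustre logs kernel-style negative errno values, so rc = -116 here is -ESTALE on Linux (where ESTALE is 116), which matches the ESTALE error noted above. The snippet below is only a minimal sketch of that lookup, not part of the test suite; the helper names parse_rc and errno_name are hypothetical, and the symbol returned depends on the platform the script runs on.

```python
import errno
import re


def parse_rc(log_line):
    """Return the integer that follows 'rc = ' in a log line, or None if absent."""
    match = re.search(r"rc = (-?\d+)", log_line)
    return int(match.group(1)) if match else None


def errno_name(rc):
    """Map a kernel-style negative return code to this platform's errno symbol."""
    return errno.errorcode.get(abs(rc), "UNKNOWN")


# Log line copied from the description above.
line = ("ll_close_inode_openhandle()) lustre-clilmv-ffff991acdd0a800: "
        "inode [0x200005223:0x10470:0x0] mdc close failed: rc = -116")

rc = parse_rc(line)
print(rc, errno_name(rc))
```

Run on the Linux client that produced the log, the lookup should resolve to the ESTALE symbol; on other platforms the numeric value of ESTALE differs, so the result is not portable.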

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity test_316 - lfs migrate -m1 failed



 Comments   
Comment by Serguei Smirnov [ 15/Jun/23 ]

+1 on master: https://testing.whamcloud.com/test_sets/84b4be3d-4ab1-4333-aa16-d415848972ec

Comment by Nikitas Angelinas [ 15/Jun/23 ]

+1 on master: https://testing.whamcloud.com/test_sets/38964588-9046-45fe-902a-bb7ac1660871

Comment by Andreas Dilger [ 22/Aug/23 ]

Lai, could you please take a look at this?  It failed 12x last week, which isn't a huge number given how often sanity.sh is run, but it looks like a real defect and not a test script issue.

Comment by Andreas Dilger [ 22/Aug/23 ]

Patches landed in that timeframe that might affect this test:

$ git log --oneline --after 2023-04-10 --before 2023-04-14
ae2ff5174d LU-16732 ldiskfs: update for ext4-delayed-iput for RHEL9.1
4b12f2b9ae LU-16646 krb: improve lookup of user's credentials
9784178eff LU-16646 krb: use system ccache for Lustre services
5731acd499 LU-16646 krb: get rid of MEMORY private cache for krb creds
3214d4d860 LU-16630 sec: improve Kerberos cross-realm trust remapping
0b00e318e2 LU-13444 tests: sanity to set PTLDEBUG
fce9af0b7f LU-16503 utils: add --hex-idx option for lfs getstripe
023a873c14 LU-16608 tests: Reduce lock count in 255c
d7e20133c3 LU-16221 kernel: RHEL 9.1 server support
dfb08bbf77 LU-16465 llite: fix LSOM blocks for ftruncate and close
2de1dbd440 LU-16350 osd-ldiskfs: no_llseek removed, dquot_transfer
0006eb3644 LU-16328 llite: migrate_folio, vfs_setxattr
133ed0cf6f LU-16327 llite: read_folio, release_folio, filler_t
ca59060c42 LU-11047 mdt: standardize mdt object locking
4435d0121f LU-14139 statahead: batched statahead processing
1c8a49bedf LU-11404 llite: only first sync to MDS matter
36a199db2b LU-10391 ptlrpc: change cc_nid in nrs to be struct lnet_nid
a6915dd9d8 LU-10391 obdclass: change class_match_nid to take lnet_nid
163331cb81 LU-10391 lustre: introduce class_parse_nid()
b4a28a3269 LU-10391 lustre: rename class_parse_nid to class_parse_nid4
16d84b0305 LU-10391 obdclass: change class_add/check_uuid to large nid
42b49afdc8 LU-10391 lnet: change LNetAddPeer() to take struct lnet_nid
6c0b4329d0 LU-16339 quota: notify OSTs until lge_qunit_nu is set
3be4258839 LU-16634 llite: move common ioctl code to ll_iocontrol()
1f4825eff0 LU-16634 obdclass: improve iocontrol error messages
4a1465577e LU-16634 misc: remove unnecessary ioctl typecasts

Comment by Lai Siyao [ 23/Aug/23 ]

A search of the Maloo test results shows this occurs on the ZFS backend only. It fails because the ORPHAN flag is read on the target stripe of the migrating directory; I'm afraid its inode attr may not be initialized correctly.

Comment by Lai Siyao [ 23/Aug/23 ]

00080000:00000040:1.0:1686430631.310794:0:327942:0:(osd_object.c:1034:osd_attr_get()) lustre-MDT0001: set orphan flag on [0x240002340:0x20:0x0] (0x8877f/0x0)

This shows that inode->i_flags has ORPHAN set, which should not happen, because this flag should be set on obj->oo_lma_flags (which is 0 here).

Comment by Gerrit Updater [ 23/Aug/23 ]

"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52052
Subject: LU-16743 mdt: clear attr->la_flags before migration
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 7a84550a7bf813d93ef1d9275b1164912e9579ce

Comment by Gerrit Updater [ 13/Sep/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52052/
Subject: LU-16743 lod: create stripe with correct attr
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 6be9476e790ceef71e874b2745a8280443d5c90b

Comment by Peter Jones [ 13/Sep/23 ]

Landed for 2.16

Generated at Sat Feb 10 03:29:37 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.