[LU-11907] sanity fails to cleanup d60g.sanity Created: 31/Jan/19  Updated: 04/Jan/20  Resolved: 29/May/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.2, Lustre 2.12.3
Fix Version/s: Lustre 2.13.0, Lustre 2.12.4

Type: Bug Priority: Minor
Reporter: Alex Zhuravlev Assignee: Lai Siyao
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-11681 sanity test 65i fails with 'find /mn... Resolved
Severity: 3

 Description   

== sanity test complete, duration 31 sec ============================================================= 14:20:26 (1548926426)
rm: cannot remove '/mnt/lustre/d60g.sanity': Directory not empty
sanity : @@@@@@ FAIL: remove sub-test dirs failed

This happens because test 60g leaves d60g.sanity in an inconsistent state:
striped directory creation failed to create one stripe, and then failed to remove the directory entry (due to the same OBD_FAIL_OSD_TXN_START).
The final rmdir then fails to load the inode with:
LustreError: 25350:0:(llite_lib.c:2414:ll_prep_inode()) new_inode -fatal: rc -2
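
The failure sequence described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Lustre code: ToyFS, InjectedFault, arm_fault_after and the order of the steps are all hypothetical names and simplifications, and the armed fault merely stands in for OBD_FAIL_OSD_TXN_START hitting both the stripe creation and the rollback.

```python
import errno


class InjectedFault(Exception):
    """Hypothetical stand-in for a transaction that fails to start."""


class ToyFS:
    """Toy namespace; NOT Lustre's real create path or data layout."""

    def __init__(self, stripe_count):
        self.stripe_count = stripe_count
        self.entries = {}         # name -> list of stripes (None = missing)
        self.ok_txns_left = None  # None means fault injection is disarmed

    def arm_fault_after(self, n):
        """Let n more transactions start, then fail every later one."""
        self.ok_txns_left = n

    def disarm(self):
        self.ok_txns_left = None

    def _txn_start(self):
        if self.ok_txns_left is None:
            return
        if self.ok_txns_left == 0:
            raise InjectedFault("transaction start failed")
        self.ok_txns_left -= 1

    def mkdir_striped(self, name):
        """Insert the name entry, then create each stripe; roll back on error."""
        self._txn_start()
        self.entries[name] = [None] * self.stripe_count
        try:
            for i in range(self.stripe_count):
                self._txn_start()
                self.entries[name][i] = f"{name}-stripe{i}"
        except InjectedFault:
            try:
                # The rollback needs its own transaction, so the same
                # fault can make it fail as well.
                self._txn_start()
                del self.entries[name]
            except InjectedFault:
                pass  # rollback lost: the entry stays with a missing stripe
            return False
        return True

    def rmdir(self, name):
        """Refuse to load a directory whose layout has a missing stripe."""
        if None in self.entries[name]:
            raise FileNotFoundError(errno.ENOENT, "stripe missing", name)
        self._txn_start()
        del self.entries[name]


fs = ToyFS(stripe_count=2)
fs.arm_fault_after(2)  # entry insert and first stripe succeed, the rest fail
created = fs.mkdir_striped("d60g")
fs.disarm()

try:
    fs.rmdir("d60g")
    rmdir_error = None
except FileNotFoundError as exc:
    rmdir_error = exc.errno  # ENOENT, analogous to "rc -2" in the log

print(created, fs.entries["d60g"], rmdir_error)
```

In this model the entry survives with one stripe missing, so a later rmdir cannot load it even after the fault is disarmed, which is the shape of the cleanup failure reported above.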

I'm not sure what the correct approach would be here, since in the field we'd suggest running LFSCK. Perhaps we should use something like lctl rmdir .. to clean up after 60g?



 Comments   
Comment by Andreas Dilger [ 31/Jan/19 ]

While fixing this with LFSCK should hopefully work, it is probably my least-preferred option. The next least-preferred option is using "lfs rmentry" (or whatever it is called).

The reason I don't like those is that they aren't what users will use, and users may not even know they exist. They will use "rmdir" or "rm -r", and that should work. The main reason we added "lfs rmentry" is to allow deleting entries on a remote MDT when the MDT is permanently lost. Normally, we don't want to allow a user to delete a remote directory when the MDT is just temporarily offline, because it may be full of files.

Comment by Alex Zhuravlev [ 31/Jan/19 ]

Right, I think this is a special case. I actually doubt such an outcome was intended by design. Another concern, though, is that the test leaves the filesystem inconsistent, and if we begin to run fsck/lfsck at some point (iirc, we discussed this possibility) then that would report an error.
Another thought: is it probably OK to let users access/remove striped directories with missing stripes? Currently this is impossible, AFAIU.

Comment by Andreas Dilger [ 31/Jan/19 ]

Yes, being able to read/unlink directories with missing stripes is very useful. We can't allow new files to be created therein, but we might consider allowing the entries to be migrated to another MDT (probably as a separate patch, if it is more complex than just allowing read/unlink).
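
The access rule proposed in this comment can be sketched as a small permission check. This is a hypothetical illustration only: Op and allowed_on_striped_dir are made-up names, a missing stripe is modeled as None, and migration is left out because it is only floated as a possible follow-up. It does not describe how the eventual patch implements the rule.

```python
from enum import Enum


class Op(Enum):
    READ = "read"
    UNLINK = "unlink"
    CREATE = "create"


def allowed_on_striped_dir(stripes, op):
    """Return True if op may run on a striped directory (toy policy).

    stripes is a list with one item per stripe; None marks a missing one.
    """
    broken = any(stripe is None for stripe in stripes)
    if not broken:
        return True
    # Broken layout: existing entries can still be read and removed,
    # but no new files may be created.
    return op in (Op.READ, Op.UNLINK)


healthy = ["stripe0", "stripe1"]
broken = ["stripe0", None]

for op in Op:
    print(op.value, allowed_on_striped_dir(healthy, op),
          allowed_on_striped_dir(broken, op))
```

Under this rule a plain "rmdir" or "rm -r" could still empty and remove a directory with a missing stripe, while creates are refused.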

Comment by Lai Siyao [ 01/Feb/19 ]

Yes, I agree. Once we support access to directories with missing stripes, migrating such a directory will be similar to migrating a directory with a bad hash.

Comment by Gerrit Updater [ 24/Apr/19 ]

Lai Siyao (lai.siyao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34750
Subject: LU-11907 dne: allow access to striped dir with broken layout
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 1717d19d1eefd6be6dcc78ea55283ceccc68e920

Comment by Gerrit Updater [ 29/May/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34750/
Subject: LU-11907 dne: allow access to striped dir with broken layout
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d2725563e7afa17a41a53aa65255a31380606d23

Comment by Peter Jones [ 29/May/19 ]

Landed for 2.13

Comment by Gerrit Updater [ 05/Dec/19 ]

Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36939
Subject: LU-11907 dne: allow access to striped dir with broken layout
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: a7ddb9c9e20a1593b46fd16862c4bfaea66c0f41

Comment by Gerrit Updater [ 03/Jan/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36939/
Subject: LU-11907 dne: allow access to striped dir with broken layout
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: 5b1ea58c21edd17c2cb1f4ecdbbeb5bbdaa1444b
