[LU-16335] "lfs rm_entry" failed to remove broken directories Created: 23/Nov/22  Updated: 08/Jan/24  Resolved: 19/Jan/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0, Lustre 2.15.3

Type: Bug Priority: Minor
Reporter: Lai Siyao Assignee: Lai Siyao
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocker
Related
is related to LU-16336 LFSCK should fix inconsistencies caus... Open
is related to LU-16159 remove update logs after recovery abort Reopened
is related to LU-16398 ost-pools: FAIL: remove sub-test dirs... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

In LU-16159, update logs are canceled upon recovery abort, which may leave some directories broken, and these directories can't be removed by "lfs rm_entry". This is because "lfs rm_entry" fails some sanity checks, which leaves the end user with a broken filesystem and no way to fix it.



 Comments   
Comment by Lai Siyao [ 06/Dec/22 ]

config.log shows this:

configure:39672: result: no
configure:39682: checking if ioctl IOC_REMOVE_ENTRY' is supported
configure:39696: gcc -c -g -O2 -Werror -I/home/laisiyao/lustre/lnet/include/uapi -I/home/laisiyao/lustre/lustre/include/uapi -I/home/laisiyao/lustre/libcfs/include -I/home/laisiyao/lustre/lnet/utils/ -I/home/laisiyao/lustre/lustre/include  conftest.c >&5
In file included from /home/laisiyao/lustre/lustre/include/uapi/linux/lustre/lustre_idl.h:74,
                 from /home/laisiyao/lustre/lustre/include/uapi/linux/lustre/lustre_ioctl.h:34,
                 from conftest.c:253:
/home/laisiyao/lustre/lustre/include/uapi/linux/lustre/lustre_user.h: In function 'changelog_rec_sname':
/home/laisiyao/lustre/lustre/include/uapi/linux/lustre/lustre_user.h:2012:9: error: implicit declaration of function 'strchrnul'; did you mean 'strchr'? [-Werror=implicit-function-declaration]
  return strchrnul(changelog_rec_name(rec), '\0') + 1;
         ^~~~~~~~~
         strchr
/home/laisiyao/lustre/lustre/include/uapi/linux/lustre/lustre_user.h:2012:50: error: returning 'int' from a function with return type 'char *' makes pointer from integer without a cast [-Werror=int-conversion]
  return strchrnul(changelog_rec_name(rec), '\0') + 1;
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
cc1: all warnings being treated as errors
configure:39696: $? = 1
configure: failed program was:
 

This check is from https://review.whamcloud.com/39207. Because the check fails to compile, "lfs rm_entry" always returns -ENOTSUP, which should be why unlinking the broken directories fails.
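
As a rough illustration of how a failed configure check can turn into a permanent -ENOTSUP, here is a minimal sketch. The macro SKETCH_HAVE_REMOVE_ENTRY and the function sketch_remove_entry() are hypothetical names invented for this example; they are not taken from the Lustre tree, and the real ioctl call is deliberately left out.

```c
#include <errno.h>

/*
 * SKETCH_HAVE_REMOVE_ENTRY is a hypothetical macro standing in for
 * whatever the configure check defines on success. It is left
 * undefined here to model the failed check shown in config.log.
 */
/* #define SKETCH_HAVE_REMOVE_ENTRY 1 */

/* Hypothetical wrapper; not the actual "lfs rm_entry" code path. */
static int sketch_remove_entry(const char *path)
{
	(void)path;
#ifdef SKETCH_HAVE_REMOVE_ENTRY
	/* The real code would issue the remove-entry ioctl here. */
	return 0;
#else
	/* Feature compiled out: every call reports "not supported". */
	return -ENOTSUP;
#endif
}
```

Under this assumption the caller sees -ENOTSUP regardless of the state of the directory, which matches the symptom described above.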

strchrnul() requires _GNU_SOURCE to be defined; adding that definition to the check solves the issue.
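
An alternative to defining _GNU_SOURCE is to avoid the GNU extension altogether. The sketch below shows one possible portable stand-in for the expression from the compiler error, strchrnul(name, '\0') + 1. The names sketch_strchrnul() and sketch_after_name() are hypothetical and this is not the code from any of the patches on this ticket.

```c
#include <string.h>

/*
 * Hypothetical stand-in for strchrnul(): returns a pointer to the first
 * occurrence of c in s, or to the terminating NUL if c is not found.
 */
static inline char *sketch_strchrnul(const char *s, int c)
{
	while (*s != '\0' && *s != (char)c)
		s++;
	return (char *)s;
}

/*
 * Hypothetical helper mirroring the failing expression: returns a
 * pointer to the byte just past the terminating NUL of name.
 */
static inline char *sketch_after_name(const char *name)
{
	return sketch_strchrnul(name, '\0') + 1;
}
```

Since the search character here is always '\0', name + strlen(name) + 1 would give the same pointer; the loop form is shown only to keep the strchrnul() semantics visible.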

Comment by Gerrit Updater [ 07/Dec/22 ]

"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49328
Subject: LU-16335 build: strchrnul() needs define _GNU_SOURCE
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: f5f31fa51477c7091a5af7fbb95382cd87d97938

Comment by Gerrit Updater [ 07/Dec/22 ]

"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49329
Subject: LU-16335 mdt: skip child check for rm_entry
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2cc5c2bfed10f2d63ee767cf90d77fb16e08e366

Comment by Gerrit Updater [ 07/Dec/22 ]

"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49334
Subject: LU-16335 header: implement native strchrnul()
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: bc03636059c5c2dc9a0f1030af83309aedc94ac0

Comment by Gerrit Updater [ 07/Dec/22 ]

"Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49335
Subject: LU-16335 test: add fail_abort_cleanup()
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 243ffa00d0a4353c78ee8b73bdce0e1c60bdc5da

Comment by James A Simmons [ 07/Dec/22 ]

So we disable various tests for the native client since lfs rm_dentry was removed upstream. It's considered a security risk. Does this mean lfsck can repair the file system?

Comment by Lai Siyao [ 07/Dec/22 ]

In theory, lfsck should fix these inconsistencies, but it is not fully tested, and the LU-16159 test results show that some are not fixed. This will be addressed in LU-16336.

Comment by Gerrit Updater [ 20/Dec/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49328/
Subject: LU-16335 build: remove _GNU_SOURCE dependency in lustre_user.h
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: efc5c8d4de60d394344506f7cfb188eaf04a4bac

Comment by Gerrit Updater [ 07/Jan/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49335/
Subject: LU-16335 test: add fail_abort_cleanup()
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d5fe41a02a6ed57bcbfc4a4c695bb509c9c7c313

Comment by Gerrit Updater [ 19/Jan/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49329/
Subject: LU-16335 mdt: skip target check for rm_entry
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ae98c5fdaaf37daeb328b7110cbcf42754752c9d

Comment by Peter Jones [ 19/Jan/23 ]

Landed for 2.16

Comment by Gerrit Updater [ 26/Jan/23 ]

"Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49776
Subject: LU-16335 build: remove _GNU_SOURCE dependency in lustre_user.h
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: 11d59e1f9407f8545172b00a23a34385366b4fe6

Comment by Gerrit Updater [ 08/Mar/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49776/
Subject: LU-16335 build: remove _GNU_SOURCE dependency in lustre_user.h
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: 8a8747d319aa3f91674b156c79d44cbc092ee175

Comment by Gerrit Updater [ 06/Jun/23 ]

"Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51237
Subject: LU-16335 test: add fail_abort_cleanup()
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: 2e35cdf5abe712084351d92f2b4feb99b98da08c
