[LU-16427] 'lfs rmfid' does not print anything on error Created: 22/Dec/22  Updated: 01/May/23  Resolved: 01/May/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.16.0

Type: Bug Priority: Minor
Reporter: Andreas Dilger Assignee: Arshad Hussain
Resolution: Fixed Votes: 0
Labels: easy

Issue Links:
Related
is related to LU-16618 rmfid miscellaneous fixes Open
is related to LU-14469 lmv_rmfid() does 128K kmalloc() Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The "lfs rmfid" command does not print anything useful or exit with a non-zero error code when passed bad arguments:

# lfs rmfid --help
# lfs rmfid foo
# echo $?
0

It would make sense for "--help" to print the usage message. The command should also print an error and the usage message if a bad filesystem name/mountpoint argument is given or if a bad FID is passed (though it should continue to process all remaining FIDs on the command line and exit with an error at the end).
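The "keep going, fail at the end" behaviour requested above could be sketched roughly as follows. This is only an illustration in plain C, not the lfs implementation: `struct fid_sketch`, `parse_fid_sketch()` and `process_fids_sketch()` are hypothetical names invented for this example, and the FID syntax accepted here is assumed from the examples in this ticket.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical FID holder for this sketch; NOT Lustre's struct lu_fid. */
struct fid_sketch {
    unsigned long long seq;
    unsigned int oid;
    unsigned int ver;
};

/* Parse "seq:oid:ver" with an optional leading '['.
 * Returns 0 on success or -EINVAL if three hex fields are not found. */
static int parse_fid_sketch(const char *str, struct fid_sketch *fid)
{
    assert(str != NULL && fid != NULL);

    if (*str == '[')
        str++;
    /* %x / %llx accept an optional "0x" prefix. */
    if (sscanf(str, "%llx:%x:%x", &fid->seq, &fid->oid, &fid->ver) != 3)
        return -EINVAL;
    return 0;
}

/* Examine every argument, report each bad FID on stderr, and return the
 * first error seen so the caller can exit non-zero after the loop. */
static int process_fids_sketch(int count, const char *const fids[])
{
    int rc = 0;

    for (int i = 0; i < count; i++) {
        struct fid_sketch fid;
        int rc2 = parse_fid_sketch(fids[i], &fid);

        if (rc2 < 0) {
            fprintf(stderr, "rmfid: unrecognized FID: %s\n", fids[i]);
            if (rc == 0)
                rc = rc2;   /* remember the first failure */
            continue;       /* but keep processing the rest */
        }
        /* A real tool would queue this FID for removal here. */
    }
    return rc;
}
```

A caller would turn a non-zero return from `process_fids_sketch()` into a non-zero exit status once all arguments have been handled.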



 Comments   
Comment by Thomas Bertschinger [ 21/Mar/23 ]

I have been looking into this and have found that `rmfid` also appears completely nonfunctional when used within the "lfs shell". Here's what I mean:

$ lfs
lfs > rmfid /mnt/mylustre [0x200000407:0xcc:0x0]
unrecognized FID: /mnt/mylustre 
lfs > rmfid [0x200000407:0xcc:0x0]
lfs rmfid: cannot remove FIDs: Bad file descriptor

I'm guessing that lfs is not commonly used in this way, but is this shell considered supported and should be working? Should this be fixed as part of this bug?

 

Comment by Thomas Bertschinger [ 21/Mar/23 ]

One challenge with printing out good error messages here is that I don't think the current API of llapi_rmfid() provides a good way to distinguish an error in opening the filesystem directory vs. an error with the ioctl().

A nonexistent filesystem path should result in llapi_rmfid() returning -ENOENT, I think (although currently it returns -EBADF), and while the ioctl() won't return -ENOENT from what I can tell, I don't know that it will never return -ENOENT.

One solution would be to create a function llapi_rmfid_at() that takes an FD open to the lustre mountpoint as the 1st argument so that the caller can validate the mountpoint, similar to lfs_fid2path() and llapi_fid2path_at(). (Or just modify llapi_rmfid(), although this would be a breaking API change.) This change would allow for clearer error messages from lfs rmfid, IMO. Do you have any thoughts on this idea?
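To make the idea concrete, here is a minimal sketch of the "open the mountpoint first, then operate on the FD" pattern. It uses only POSIX calls and is not liblustreapi code: `open_root_sketch()`, `run_at_root_sketch()` and `fake_rmfid_op()` are hypothetical names, and the callback merely stands in for whatever ioctl() an "_at" variant would issue.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: open the mountpoint directory.
 * Returns an open fd, or -errno (e.g. -ENOENT for a missing path). */
static int open_root_sketch(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY | O_DIRECTORY);
    return fd < 0 ? -errno : fd;
}

/* Stand-in type for an "_at" style operation on a validated root fd. */
typedef int (*root_op_sketch)(int root_fd, void *arg);

/* Placeholder operation for this sketch; a real one would issue the ioctl. */
static int fake_rmfid_op(int root_fd, void *arg)
{
    (void)arg;
    return root_fd >= 0 ? 0 : -EBADF;
}

/* Open the root, then run the operation. *open_failed tells the caller
 * which step produced the error, so the two cases get distinct messages. */
static int run_at_root_sketch(const char *path, root_op_sketch op,
                              void *arg, int *open_failed)
{
    int fd = open_root_sketch(path);
    int rc;

    *open_failed = fd < 0;
    if (fd < 0)
        return fd;

    rc = op(fd, arg);
    close(fd);
    return rc;
}
```

With this split, an -ENOENT reported while `*open_failed` is set can only mean a bad mountpoint path, whatever the operation itself might return.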

Comment by Arshad Hussain [ 23/Mar/23 ]

Andreas/Thomas,

I am already working on rmfid issues (see below). Do you want me to pick up this LU?

Thanks

https://review.whamcloud.com/c/fs/lustre-release/+/50367

 

LU-16618 lfs: rmfid miscellaneous fixes

This patch fixes:

01. Fix rmfid silently accepting a FID without an fsname
or Lustre root mount point. Make it correctly
fail if required arguments are not provided.

After Patch:
~~~~~~~~~~~~
$ lfs rmfid 0x200000402:0x1:0x0
lfs rmfid: missing <fsname|rootpath> or <fid>
Remove file(s) by FID(s)
usage: rmfid <fsname|rootpath> <fid> ...

Before Patch:
~~~~~~~~~~~~
$ lfs rmfid 0x200000402:0x1:0x0

Comment by Gerrit Updater [ 23/Mar/23 ]

"Arshad Hussain <arshad.hussain@aeoncomputing.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50388
Subject: LU-16427 lfs: rmfid does not print anything on error
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 83add47417cfca6600336ae7d890125a513cee94

Comment by Thomas Bertschinger [ 23/Mar/23 ]

@Arshad sorry, I did not realize you were already working on this one! I'll be happy to review your patch if desired.

Comment by Arshad Hussain [ 20/Apr/23 ]

@Thomas, no worries, and thanks for your review. I only realized much later that this ticket had been opened earlier by Andreas; I had missed it. I started working on rmfid when I accidentally stumbled upon its error.

Comment by Gerrit Updater [ 01/May/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50388/
Subject: LU-16427 lfs: rmfid does not print anything on error
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 5d930252407cc11604c4b9de2984784c62c43a4c

Comment by Peter Jones [ 01/May/23 ]

Landed for 2.16
