[LU-13031] store JobID of program that created file in inodes at create time Created: 28/Nov/19 Updated: 07/Dec/23 Resolved: 07/Aug/23 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.16.0, Lustre 2.15.4 |
| Type: | Improvement | Priority: | Minor |
| Reporter: | Andreas Dilger | Assignee: | Thomas Bertschinger |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | LTS15, llnl, lug23dd, medium | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||||||||||||||||||||||||||||||
| Description |
|
It would be useful to store the JobID that created a file in an xattr on the MDT inode. This would allow tracking the source of files after the fact, and can help users with the provenance of their file data (e.g. which run the files came from, which can then be used to determine runtime parameters used, whether the run was successful or not, etc.). It isn't clear whether this would be needed on the OST inodes as well, but I think initially the MDT inode would be enough, as this would be the only xattr visible to users. The OST xattr would only be accessible for low-level scanning tools. |
| Comments |
| Comment by Olaf Faaland [ 18/Dec/19 ] |
|
I agree this would be useful. If you're not in a hurry, we would like to try to implement it. |
| Comment by Andreas Dilger [ 18/Dec/19 ] |
|
Olaf, |
| Comment by Gerrit Updater [ 12/May/23 ] |
|
"Thomas Bertschinger <bertschinger@lanl.gov>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50982 |
| Comment by Thomas Bertschinger [ 12/May/23 ] |
|
@Patrick Farrell |
| Comment by Patrick Farrell [ 12/May/23 ] |
|
Thanks, Thomas. Could you lay out the implications of using user vs lustre namespaces as you see them? What do we gain, what we do lose, and/or what requirements can't we meet in certain settings? |
| Comment by Thomas Bertschinger [ 12/May/23 ] |
|
Requirements (or at least, Wants):
User namespace
Cons:
Lustre namespace:
Cons:
Trusted namespace: I think this would only be considered as backing the lustre namespace and wouldn't be used on its own. System namespace:
My opinion is that the user namespace is the best choice because every other choice fails to meet the goals above, but I will admit that not being very experienced with the kernel side of things, I can't say how severe the problem with the user namespace is. |
| Comment by Patrick Farrell [ 12/May/23 ] |
|
Thank you for that; I'm persuaded we should do the user xattr, the issues around interop and user tools look decisive. If we're forced to change tack in the future, we can deal with that then. (I think I was the only one actively objecting, Andreas indicated OK at LUG, so you should be good to go.) |
| Comment by Andreas Dilger [ 12/May/23 ] |
|
I'm inclined toward the user.job xattr, but it is never a good idea to unilaterally make policy decisions in the kernel that cannot be changed. As such, it might make sense to have a tunable parameter like "mdt.*.job_xattr=user.job" (or mdd.*.job_xattr is possibly better, though e.g. "mdt.*.job_cleanup_interval" already exists), and then this could be changed in the future if there is some conflict (e.g. some site already uses the "user.job" xattr for some other purpose). I don't think the job_xattr should allow totally arbitrary values (e.g. overwriting trusted.lov or trusted.lma or security.* would be bad). One option is to only allow a limited selection of valid xattr namespaces, and possibly names:
If we allow the xattr name portion to be specified (which I'm not sure about, but putting it out for completeness), it should have some reasonable limits:
Thoughts? |
| Comment by Andreas Dilger [ 12/May/23 ] |
|
For the implementation, any such restrictions/checking would be done in the sysfs write handler when the xattr is first set, and for actual xattr creation this would just copy the job_xattr string during actual file creation with no need for additional sanity checking, since it should always be consistent. For the "don't want user.job to be cp.12345" issue, I think that should be handled properly. Initially, the xattr would be set to cp.PID or SLURM_JOBID or whatever by the MDS internally at file creation time. However, if cp, rsync, tar, dcp, dsync ... is also copying xattrs, the initial value would be overwritten later in the copy process, which is what we would want I think. |
| Comment by Gerrit Updater [ 07/Aug/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50982/ |
| Comment by Peter Jones [ 07/Aug/23 ] |
|
Landed for 2.16 |
| Comment by James A Simmons [ 25/Aug/23 ] |
|
The new maloo test fail in interop. |
| Comment by Thomas Bertschinger [ 25/Aug/23 ] |
|
Does interop refer to some testing that mixes different client and server versions? Is the failure occurring when the server version is older than the client and the MDS doesn't have this new change? If it just needs a version check like: (( $MDS1_VERSION >= $(version_code 2.15.??) )) || skip ... that should be simple enough, I can submit a fix for that. The version I get when I check out this commit and run `lctl get_param version` is 2.15.57. Is that the correct code to put in the version check? Or do I need one higher than that (since I guess you could check out a commit newer than the 2.15.57 tag but older than this change and its version would pass...). If I need one higher than 2.15.57, do I use 2.15.58, 2.16.0, or something else? |
| Comment by Patrick Farrell [ 25/Aug/23 ] |
|
Yes, you've got this exactly right. That's what interop is and that check is what's needed. Generally you'd just use 2.15.57, the tag that was live when the patch was merged. We don't do interop testing between different development versions, just between dev versions and older full versions. It's also fine if in some narrow scenario we skip a test that would pass. So that MDS version check you suggested for 2.15.57 is what's needed. Oops on the reviewers (ie, including me) for not noting that at the time. |
| Comment by Andreas Dilger [ 25/Aug/23 ] |
|
The version_code helper supports 4-component version numbers, so since this is being done after the landing it is helpful to include the full version from "git describe" of the commit hash of the patch itself. If it were before landing it could use the 4-component version of the patch before landing, even though it is slightly inaccurate.. It is also helpful to include a brief description of why the test is being skipped. Please see some examples in patch https://review.whamcloud.com/52009 " |
| Comment by Gerrit Updater [ 25/Aug/23 ] |
|
"Thomas Bertschinger <bertschinger@lanl.gov>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52095 |
| Comment by Gerrit Updater [ 06/Sep/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52095/ |
| Comment by Gerrit Updater [ 10/Sep/23 ] |
|
"Xing Huang <hxing@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52332 |
| Comment by Andreas Dilger [ 27/Sep/23 ] |
|
Thomas, I just realized that this is storing the xattr on the MDT inode, but it would also be useful if the same xattr was also stored on the OST objects for debugging/scanning/recovery purposes. Is this something that you would be interested to work on? |
| Comment by Thomas Bertschinger [ 27/Sep/23 ] |
|
Hi Andreas, I'd be willing to work on adding that in. I'm pretty busy at the moment so I likely won't be able to start working on it for 2-3 weeks, but if it can wait that long I'm happy to give it a shot. |
| Comment by Andreas Dilger [ 28/Sep/23 ] |
|
No rush. For some reason I thought this was part of the original implementation, but upon looking at the patch again I saw it was only on the MDT inode. I think it makes sense to put this in ofd_attr_handle_id() where it is initializing the object ownership the first time the OST object is written to (in the "if (mask != 0)" block if it cleared the SUID/SGID bits). That allows a "one shot" setting of the user.job xattr without having to check each time whether it exists or not. The parameter obdfilter.*.jobid_xattr should be used like mdt.*.jobid_xattr to control the xattr name. |
| Comment by Gerrit Updater [ 13/Nov/23 ] |
|
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52332/ |
| Comment by Gerrit Updater [ 07/Dec/23 ] |
|
"Thomas Bertschinger <bertschinger@lanl.gov>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53367 |