[LU-17371] OFD ALR records should not be generated for maintenance operations Created: 15/Dec/23  Updated: 15/Dec/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0, Lustre 2.16.0
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The OFD access log reader generates records for files that were just created and are being written for the first time.

This can be confusing in the case of background filesystem maintenance operations, such as "lfs mirror *", "lfs migrate", and "lfs hsm *", because the resulting records incorrectly give the impression that the file is under active application usage when it is not.

There should be a mechanism under which these background maintenance operations do not inflate the heat of files.



 Comments   
Comment by Andreas Dilger [ 15/Dec/23 ]

One relatively simple solution would be to have a "quiet time" for newly created objects, so that writes during this initial period do not generate ALR records and do not count as "heat" for the file. Something like "obdfilter.*.access_log_quiet=300" set on the OSTs would skip ALR generation for the first 300s after the object is created. This could be used to differentiate

This would cover the write cases for "lfs mirror extend" and "lfs migrate" and "lfs hsm restore" where new OST objects are allocated for the file, or other tools that may be creating and writing files directly in a "cold" OST pool.
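The quiet-time check described above could be sketched roughly as follows. This is only an illustration and not existing Lustre code: the variable access_log_quiet stands in for the proposed "obdfilter.*.access_log_quiet" tunable, and the helper name alr_should_log() and its arguments are hypothetical.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the proposed obdfilter.*.access_log_quiet
 * tunable, in seconds. A value of 0 disables the quiet period.
 * This is not an existing Lustre symbol. */
static time_t access_log_quiet = 300;

/* Return true if an I/O arriving at time 'now' on an object created at
 * 'obj_create_time' should generate an access log record. */
static bool alr_should_log(time_t obj_create_time, time_t now)
{
	if (access_log_quiet == 0)
		return true;

	/* Object is still inside its initial quiet window: skip the record
	 * so that the first writes do not count as "heat". */
	if (now - obj_create_time < access_log_quiet)
		return false;

	return true;
}
```

With the default of 300, a write 10s after object creation would be skipped, while a write at 300s or later would be logged as usual.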

It would not cover the case for "lfs mirror resync", where writes are done to an existing object, nor the read traffic generated by "lfs mirror extend", "lfs mirror resync", "lfs migrate", or "lfs hsm archive". In order to handle these cases, there should be an interface (maybe llapi_ladvise(LU_LADVISE_NOALR)) that can set a flag on the file descriptor that will flag all OBD_BRW RPCs so that they do not generate ALR records. This is somewhat overlapping with the access_log_quiet functionality, but access_log_quiet has the benefit that it can be applied to old clients/utilities that are not able to use LU_LADVISE_NOALR.
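The per-descriptor flag idea could look something like the sketch below. All names here (RPC_FLAG_NOALR, struct client_fd, and the three helpers) are hypothetical and only model the flow: the client marks the descriptor once, every bulk RPC issued on it carries the flag, and the server skips ALR generation when the flag is present. It does not use or reflect the real llapi_ladvise() interface or RPC flag definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit carried in each bulk read/write RPC. */
#define RPC_FLAG_NOALR (1U << 0)

/* Hypothetical client-side per-descriptor state. */
struct client_fd {
	bool noalr;	/* set once, e.g. by a LU_LADVISE_NOALR-style call */
};

/* Client side: mark the descriptor as maintenance I/O. */
static void client_fd_set_noalr(struct client_fd *fd)
{
	fd->noalr = true;
}

/* Client side: compute the flags attached to each RPC issued on fd. */
static uint32_t client_rpc_flags(const struct client_fd *fd)
{
	return fd->noalr ? RPC_FLAG_NOALR : 0;
}

/* Server side: decide whether this RPC should generate an ALR record. */
static bool server_rpc_generates_alr(uint32_t rpc_flags)
{
	return (rpc_flags & RPC_FLAG_NOALR) == 0;
}
```

A tool such as "lfs mirror resync" would set the flag right after opening the file, so both its reads and its writes to existing objects are excluded.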
