Details
- New Feature
- Resolution: Unresolved
- Minor
Description
Pages that are dirty at the time of a client eviction are discarded, which may result in file corruption. Currently, the Lustre client writes a warning to the console log that identifies the fid of the file containing the discarded dirty page. It also tries to include the file name in the warning. This scheme has several problems:
1. Trying to get the file name can cause a deadlock. See LU-12522.
2. The console log is not accessible to application users.
3. The message has no link to the job or application that was affected by the warning.
The proposal is to add a debugfs variable (/sys/kernel/debug/lustre/llite.<fs>.discard_list) that lists the fids of files with discarded dirty pages. When a dirty page is discarded, the fid, jobid, and inode of the file are added to the list.
For example:
[root@vmcentos7 lustre-ffff8cf1aa3a4000]# lctl get_param llite.*.discard_list
llite.lustre-ffff8cf1aa3a4000.discard_list=
Jobs with discarded dirty pages:
timestamp, jobid, filesystem, fid, discarded page count, filename
1572294777.000402500 dmesg.0 lustre [0x200000401:0x4:0x0] 256 amk/testing/discards1
1572294777.000404336 dmesg.0 lustre [0x200000401:0x8:0x0] 256 amk/testing/discards5
1572294777.000406094 dmesg.0 lustre [0x200000401:0xc:0x0] 256 amk/testing/discards9
1572294777.000407841 dmesg.0 lustre [0x200000401:0x10:0x0] 256 amk/testing/discards13
1572294777.000409620 dmesg.0 lustre [0x200000401:0x14:0x0] 256 amk/testing/discards17
1572294777.000411458 dmesg.0 lustre [0x200000401:0x6:0x0] 256 amk/testing/discards3
A script run by an admin (or eventually by an application user), or a job manager, can examine the discard list on each client at the end of a job to determine whether any dirty pages were lost, and then inform the user.
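As an illustration of such an end-of-job check, here is a minimal Python sketch that parses discard_list output and totals the discarded pages per jobid. It assumes the record layout shown in the example above (timestamp, jobid, filesystem, fid, page count, filename, separated by whitespace); the names Discard, parse_discard_list, and pages_lost_by_job are hypothetical and not part of Lustre.

```python
import re
from collections import defaultdict
from typing import NamedTuple


class Discard(NamedTuple):
    timestamp: float
    jobid: str
    fsname: str
    fid: str
    pages: int
    filename: str  # empty if no name was reported


# Assumed record layout, taken from the example output:
#   timestamp jobid filesystem [fid] page-count filename
RECORD = re.compile(
    r"^(\d+\.\d+)\s+(\S+)\s+(\S+)\s+"
    r"(\[0x[0-9a-fA-F]+:0x[0-9a-fA-F]+:0x[0-9a-fA-F]+\])\s+(\d+)\s*(.*)$"
)


def parse_discard_list(text):
    """Return one Discard per record line; headers and blank lines are skipped."""
    records = []
    for line in text.splitlines():
        m = RECORD.match(line.strip())
        if m is None:
            continue  # parameter name, header, or blank line
        ts, jobid, fsname, fid, pages, name = m.groups()
        records.append(Discard(float(ts), jobid, fsname, fid, int(pages), name))
    return records


def pages_lost_by_job(records):
    """Sum the discarded page counts for each jobid."""
    totals = defaultdict(int)
    for r in records:
        totals[r.jobid] += r.pages
    return dict(totals)


SAMPLE = """\
llite.lustre-ffff8cf1aa3a4000.discard_list=
Jobs with discarded dirty pages:
timestamp, jobid, filesystem, fid, discarded page count, filename
1572294777.000402500 dmesg.0 lustre [0x200000401:0x4:0x0] 256 amk/testing/discards1
1572294777.000404336 dmesg.0 lustre [0x200000401:0x8:0x0] 256 amk/testing/discards5
"""

if __name__ == "__main__":
    for jobid, pages in pages_lost_by_job(parse_discard_list(SAMPLE)).items():
        print(f"job {jobid}: {pages} dirty pages discarded")
```

In practice the text would come from running lctl get_param llite.*.discard_list on each client rather than from SAMPLE.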
Note that the implementation limits the number of fids reported to 128 per file system. If more files than that have discarded dirty pages, the oldest entries in the discard_list are re-used. A discard_list can be cleared by writing anything to the debugfs variable (lctl set_param llite.*.discard_list=clear).
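The bounded behaviour can be modelled in a few lines of Python. This is only a user-space sketch of the semantics described above, not the kernel implementation: the class name DiscardList is hypothetical, and it assumes one entry per fid whose page count accumulates, which the example output suggests but the description does not state.

```python
from collections import OrderedDict

MAX_ENTRIES = 128  # per-file-system limit stated in the description


class DiscardList:
    """Toy model of a fixed-size discard list (assumption: one entry per fid)."""

    def __init__(self, max_entries=MAX_ENTRIES):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # fid -> discarded page count, oldest first

    def record(self, fid, pages=1):
        if fid in self.entries:
            # Same file again: accumulate its discarded page count.
            self.entries[fid] += pages
            return
        if len(self.entries) >= self.max_entries:
            # List is full: drop the oldest entry so its slot can be re-used.
            self.entries.popitem(last=False)
        self.entries[fid] = pages

    def clear(self):
        # Models writing anything to the discard_list variable.
        self.entries.clear()
```

Once the limit is reached, each new file silently displaces the oldest one, which is why a list that is never cleared can under-report.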
The warning message will still be issued to the console log when a dirty page is discarded. The message will now only contain the fid; no attempt will be made to fetch the file name. Thus LU-12522 is resolved.
Identified drawbacks of the design include:
- A user still needs root privileges to access the list of files with discarded pages. When a UI for multiple line output is defined, discard_list could be moved to /sys/fs/lustre where permissions can be set more flexibly.
- Output does not include the mount point in the file name. (Existing functions to retrieve the mount point are written to run in user space rather than kernel space. The mount point can easily be identified and added to the discard_list info using Linux utilities.)
- The file name will not be displayed if the file is deleted before discard_list is read.
- Lustre must be manually directed to clear the discard lists.
- A discard_list has a fixed size, so if the maximum is exceeded, not all files with discarded dirty pages will be reported in the list.
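For the mount point drawback, a user-space script can recover the full path from the mount table. The sketch below is one possible approach under stated assumptions: it expects text in /proc/mounts format, and it assumes a Lustre client mount lists its device as "<MGS NID>:/<fsname>" (mounts of a subdirectory would not match). The function names and the sample address and options are hypothetical.

```python
import os


def lustre_mount_points(mounts_text, fsname):
    """Return the mount points of Lustre client mounts for file system `fsname`."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        device, mount_point, fstype = fields[0], fields[1], fields[2]
        # Assumption: the client device string ends in ":/<fsname>".
        if fstype == "lustre" and device.rsplit(":/", 1)[-1] == fsname:
            points.append(mount_point)
    return points


def full_paths(mounts_text, fsname, relative_name):
    """Prefix a file name from the discard_list with each matching mount point."""
    return [
        os.path.join(mp, relative_name)
        for mp in lustre_mount_points(mounts_text, fsname)
    ]


# Hypothetical mount table used only for illustration.
SAMPLE_MOUNTS = """\
/dev/sda1 / xfs rw,relatime 0 0
192.168.1.10@tcp:/lustre /mnt/lustre lustre rw,flock 0 0
"""

if __name__ == "__main__":
    print(full_paths(SAMPLE_MOUNTS, "lustre", "amk/testing/discards1"))
```

On a real client, mounts_text would be read from /proc/mounts instead of SAMPLE_MOUNTS.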
Issue Links
- is related to LU-12522 Deadlock: ptlrpcd daemon blocked in osc_extent_wait (Open)
Patch: https://review.whamcloud.com/#/c/36607/