[LU-7799] mdt.*.hsm.actions skips some records Created: 19/Feb/16  Updated: 26/Sep/16  Resolved: 14/Mar/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0, Lustre 2.7.0, Lustre 2.8.0
Fix Version/s: Lustre 2.9.0

Type: Bug Priority: Minor
Reporter: John Hammond Assignee: John Hammond
Resolution: Fixed Votes: 0
Labels: hsm, llog

Issue Links:
Related
Severity: 3

 Description   

Note that this is surely responsible for a number of mysterious sanity-hsm test failures.

Due to a bug in (or misuse of) llog_cat_process(), the HSM actions proc file skips some records when read.

~# # mount and set up HSM
~# killall lhsmtool_posix
~# wc -l /proc/fs/lustre/mdt/lustre-MDT0000/hsm/actions
0 /proc/fs/lustre/mdt/lustre-MDT0000/hsm/actions
~# cd /mnt/lustre
lustre# for ((i = 0; i < 20; i++)); do
>   touch f$i
>   lfs hsm_archive f$i
> done

There should now be 20 records in the actions file, but there are only 19:

lustre# wc -l /proc/fs/lustre/mdt/lustre-MDT0000/hsm/actions
19 /proc/fs/lustre/mdt/lustre-MDT0000/hsm/actions

The missing record corresponds to f17:

lustre# lfs path2fid f17
[0x200000401:0x12:0x0]
lustre# grep '0x200000401:0x12:0x0' /proc/fs/lustre/mdt/lustre-MDT0000/hsm/actions
lustre# lfs hsm_action f17
f17: ARCHIVE waiting (from 0 to EOF)
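
The manual check above (path2fid, then grep) can be repeated for every file to find which records the actions file omits. A minimal sketch, assuming the FIDs have already been collected (for example from lfs path2fid) and the actions file has been read into a string. The helper name missing_fids and the sample text are hypothetical; the real layout of an actions line is not assumed, only that a present record contains the FID as a substring, as the grep does.

```python
def missing_fids(expected, actions_text):
    """Return the names whose FID does not occur anywhere in actions_text.

    expected maps a file name to its FID as printed by `lfs path2fid`,
    e.g. "[0x200000401:0x12:0x0]". The brackets are stripped so the
    comparison is a plain substring search, like the grep in the report.
    """
    missing = []
    for name, fid in expected.items():
        if fid.strip("[]") not in actions_text:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Placeholder text standing in for the contents of the actions file.
    # The FID for "other" is made up for illustration.
    sample = "fid=[0x200000401:0x11:0x0]\n"
    expected = {
        "other": "[0x200000401:0x11:0x0]",
        "f17": "[0x200000401:0x12:0x0]",
    }
    print(missing_fids(expected, sample))
```

Because the result comes from comparing against lfs path2fid output rather than counting lines, it names the missing file directly instead of only showing that the count is off by one.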

The problem is in how the startidx parameter to llog_cat_process() is handled (see mdt_hsm_actions_proc_show() and hsm_actions_show_cb()). startidx becomes lpd_startidx and then lpcd_first_idx, and llog_process_thread() skips the record at lpcd_first_idx instead of processing it.
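
The effect of a skipped start index can be illustrated with a toy model. This is a sketch under stated assumptions, not the Lustre code: process_from stands in for a processing loop that treats first_idx as already handled, and read_all stands in for a reader that shows records in several passes and resumes where it stopped. Both names, the 1-based record numbering, and the pass size are hypothetical. The pass size of 17 is arbitrary and was chosen only so that the toy loses the same record as the report.

```python
def process_from(records, first_idx, limit):
    """Toy processing loop: the record AT first_idx is skipped, so
    iteration effectively starts at first_idx + 1. At most `limit`
    records are returned per call."""
    out = []
    for idx, rec in records:
        if idx <= first_idx:
            continue
        if len(out) == limit:
            break
        out.append((idx, rec))
    return out


def read_all(records, limit, resume_at_next):
    """Toy reader that shows records in passes of `limit`.

    resume_at_next=True  passes "the next unread index" as the start
                         index (the misuse: that record is then skipped).
    resume_at_next=False passes "the last index already shown".
    """
    shown = []
    first_idx = 0
    while True:
        chunk = process_from(records, first_idx, limit)
        if not chunk:
            return shown
        shown.extend(rec for _, rec in chunk)
        last_shown = chunk[-1][0]
        first_idx = last_shown + 1 if resume_at_next else last_shown


if __name__ == "__main__":
    # 20 records numbered from 1, named like the files in the report.
    records = [(i + 1, "f%d" % i) for i in range(20)]
    print(len(read_all(records, 17, resume_at_next=True)))
    print(len(read_all(records, 17, resume_at_next=False)))
```

In this model the caller and the loop disagree about whether the start index is inclusive, so one record is lost at each resume point. Either side can be changed to fix it, which matches the "bug in (or misuse of)" wording above.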



 Comments   
Comment by Gerrit Updater [ 19/Feb/16 ]

John L. Hammond (john.hammond@intel.com) uploaded a new patch: http://review.whamcloud.com/18525
Subject: LU-7799 hsm: use correct record start index for actions
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: aee680aca3e7fcd5d04055039f6e8cea891354a6

Comment by Gerrit Updater [ 14/Mar/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/18525/
Subject: LU-7799 hsm: use correct record start index for actions
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 1ce2c6f33c104cafbf42828551e338d0c5e7602a
