[LU-13304] add testing for LFSCK DRYRUN Created: 28/Feb/20  Updated: 08/Dec/21

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Rank (Obsolete): 9223372036854775807

 Description   

I noticed in a patch review that there are places in lfsck_layout.c (and probably others) where lb_param & LPF_DRYRUN is not being checked before modifying the filesystem. For example, lfsck_layout_ins_dangling_rec(), lfsck_layout_del_dangling_rec(), lfsck_layout_refill_lovea(), __lfsck_layout_update_pfid(), lfsck_layout_recreate_parent(), etc. While it is possible that those functions are skipped at a higher level when DRYRUN is set, it would be prudent to add checks and skip any filesystem-modifying code at the level where those changes are being done, maybe with a CDEBUG(D_LFSCK, ...) message so that the potential fix appears in the debug log (LU-9548). That makes it very clear to the reader/developer that there is no chance those modifications would be done.

We should do a lot to improve the testing of DRYRUN during LFSCK to make sure that this is being handled correctly, and not change the filesystem when it is asked to only do a check.

In addition to adding checks for DRYRUN in all of the modifying functions (even if this is redundant), some suggestions to detect problems during testing would include:

  • run each test with "lctl lfsck_start --dryrun" first, then verify the problem was not fixed until the second (non-DRYRUN) run
  • set "lctl printk=+lfsck" during sanity-lfsck.sh and grab the D_LFSCK output from all nodes during each subtest, use "sed" to drop the run-unique output like timestamps, PID, FID, then compare the output with "expect" to ensure that the right thing is being done for each subtest
  • add a lfsck_declare_trans_start() wrapper for dt_declare_trans_start() (and/or lfsck_trans_start() for dt_trans_start()) that prints and returns an error if called when DRYRUN is set. That might be too much, if there are "empty" transactions created when DRYRUN is set, but it wouldn't be terrible to avoid that completely.
  • if the above is too broken, add lfsck_* wrappers for all of the dt_* modifying operations to return an error if called with DRYRUN set. That would handle the case of dt_trans_start() being called when --dryrun is set, but is not as good a solution as avoiding it completely, since there is still overhead from calling dt_trans_start() for empty transactions, and more danger for the filesystem if something is missed.


 Comments   
Comment by Andreas Dilger [ 08/Dec/21 ]

I hit a check in exactly this codepath while running "lfsck -r -A -t all --dryryn" on a test filesystem. The lfsck_trans_create() wrapper was added in patch https://review.whamcloud.com/37194 "LU-13124 scrub: check for multiple linked file" and was triggered during my run:

CPU: 1 PID: 28775 Comm: lfsck  3.10.0-1160.21.1.el7_lustre.ddn13.x86_64 #1
Call Trace:
 dump_stack+0x19/0x1b
 lfsck_trans_create.part.54+0x6c/0x75 [lfsck]
 lfsck_namespace_trace_update+0xc2a/0xe10 [lfsck]
 lfsck_namespace_exec_oit+0xa1f/0xc50 [lfsck]
 lfsck_master_oit_engine+0x745/0x12d0 [lfsck]
 lfsck_master_engine+0x9de/0x1420 [lfsck]
 kthread+0xd1/0xe0
Lustre: testfs-MDT0001-osd: namespace LFSCK add flags for [0x240000402:0x10a1:0x0] in the trace file, flags 1, old 0, new 1: rc = -22
LustreError: 28775:0:(lfsck_internal.h:1553:lfsck_trans_create()) testfs-MDT0001-osd: transaction is being created in DRYRUN mode!

This caused the system to be almost completely unusable because of the large number of stack traces being dumped.

Two fixes are needed here:

  • quiet the console messages/stack trace to be printed at most every 60s
  • fix lfsck_namespace_trace_update() to check the DRYRUN flag before trying to fix anything
Generated at Sat Feb 10 03:00:08 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.