[LU-17010] lfsck_trans_create shouldn't be called in dryrun mode
Created: 02/Aug/23  Updated: 02/Oct/23  Resolved: 19/Aug/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.16.0, Lustre 2.15.3
Fix Version/s: Lustre 2.16.0

Type: Bug
Priority: Major
Reporter: Hongchao Zhang
Assignee: Hongchao Zhang
Resolution: Fixed
Votes: 0
Labels: LTS15

Issue Links:
Related
is related to LU-13124 lfsck check for multiple linked file ... Resolved
Severity: 3

 Description   

When LFSCK runs with --dryrun, the stack trace below is dumped to the console log repeatedly for the duration of the scan. The volume of console messages causes significant load and slowdown on the MDT:

kernel: CPU: 8 PID: 21324 Comm: lfsck Tainted: G OE ------------ T 3.10.0-1160.59.1.el7_lustre.ddn16.x86_64 #1
kernel: Hardware name: DDN SFA400NVXE, BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
kernel: Call Trace:
kernel: [<ffffffff9d3865b9>] dump_stack+0x19/0x1b
kernel: [<ffffffffc133e164>] lfsck_trans_create.part.53+0x6c/0x75 [lfsck]
kernel: [<ffffffffc1302d39>] lfsck_namespace_trace_update+0x919/0xa10 [lfsck]
kernel: [<ffffffffc0611f02>] ? fld_local_lookup+0x62/0x270 [fld]
kernel: [<ffffffffc1307d60>] lfsck_namespace_exec_oit+0x6b0/0x880 [lfsck]
kernel: [<ffffffffc12f0871>] lfsck_exec_oit+0x81/0xb10 [lfsck]
kernel: [<ffffffffc12f1ae8>] lfsck_master_oit_engine+0x7e8/0x12f0 [lfsck]
kernel: [<ffffffff9cce6321>] ? put_prev_entity+0x31/0x400
kernel: [<ffffffffc12f2ee6>] lfsck_master_engine+0x8f6/0x13b0 [lfsck]
kernel: [<ffffffff9ccd4abe>] ? finish_task_switch+0x4e/0x1c0
kernel: [<ffffffff9ccdadf0>] ? wake_up_state+0x20/0x20
kernel: [<ffffffffc12f25f0>] ? lfsck_master_oit_engine+0x12f0/0x12f0 [lfsck]
kernel: [<ffffffff9ccc5e61>] kthread+0xd1/0xe0
kernel: [<ffffffff9ccc5d90>] ? insert_kthread_work+0x40/0x40
kernel: [<ffffffff9d399ddd>] ret_from_fork_nospec_begin+0x7/0x21
kernel: [<ffffffff9ccc5d90>] ? insert_kthread_work+0x40/0x40
kernel: CPU: 8 PID: 21324 Comm: lfsck Tainted: G OE ------------ T 3.10.0-1160.59.1.el7_lustre.ddn16.x86_64 #1
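
The trace shows lfsck_namespace_trace_update reaching lfsck_trans_create during a dry run, with a stack dump on each call. The following is only a minimal user-space sketch of the general idea behind the two patch subjects below (skip transaction creation in dry-run mode, and warn at most once if it is requested anyway). Every name in it (sketch_ctx, sketch_trans, sketch_trans_create, sketch_trace_update) is hypothetical and is not taken from the Lustre source or from the merged patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration only, not the Lustre types. */
struct sketch_ctx {
	bool dryrun;        /* true when the scan must not modify anything */
	bool warned;        /* set after the first misuse warning */
	int  warnings;      /* number of misuse warnings actually printed */
	int  trans_created; /* number of transactions opened so far */
};

struct sketch_trans {
	int id;
};

/*
 * Open a transaction in caller-provided storage.
 * In dry-run mode no transaction is opened: the function returns NULL
 * and prints a warning only on the first such call, so a hot loop
 * cannot flood the log.
 */
struct sketch_trans *sketch_trans_create(struct sketch_ctx *ctx,
					 struct sketch_trans *slot)
{
	assert(ctx != NULL && slot != NULL);

	if (ctx->dryrun) {
		if (!ctx->warned) {
			ctx->warned = true;
			ctx->warnings++;
			fprintf(stderr,
				"sketch: trans_create called in dry-run mode\n");
		}
		return NULL;
	}

	slot->id = ++ctx->trans_created;
	return slot;
}

/*
 * Record a trace update. The dry-run check comes first, so the
 * transaction path is never entered when nothing may be written.
 * Returns 0 on success or when skipped, -1 on failure.
 */
int sketch_trace_update(struct sketch_ctx *ctx)
{
	struct sketch_trans slot;
	struct sketch_trans *th;

	assert(ctx != NULL);

	if (ctx->dryrun)
		return 0; /* nothing to persist in a dry run */

	th = sketch_trans_create(ctx, &slot);
	if (th == NULL)
		return -1;

	/* A real implementation would write and commit here. */
	return 0;
}
```

In this sketch the caller-side check is what keeps the dry run quiet. The once-only warning inside sketch_trans_create is a second line of defence for callers that forget the check.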


 Comments   
Comment by Gerrit Updater [ 02/Aug/23 ]

"Hongchao Zhang <hongchao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51849
Subject: LU-17010 lfsck: don't create trans in dryrun mode
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: b0de6c377e4ee57a5d1af8db82958c8afe43db3b

Comment by Gerrit Updater [ 19/Aug/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51849/
Subject: LU-17010 lfsck: don't create trans in dryrun mode
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 441902fa3d445791a8c54026c130ab357f7469d7

Comment by Peter Jones [ 19/Aug/23 ]

Landed for 2.16

Comment by Gerrit Updater [ 13/Sep/23 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52356
Subject: LU-17010 lfsck: don't dump stack repeatedly
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: c4d8c5439c74fe71bc8c3f940bf259d00ed25753

Comment by Gerrit Updater [ 28/Sep/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52356/
Subject: LU-17010 lfsck: don't dump stack repeatedly
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: dc360cd3eff20618f243ab89097a62f8ecf2c929
