[LU-8217] Clarify interaction of lfsck_start -A and -M options Created: 30/May/16  Updated: 09/Jun/16  Resolved: 09/Jun/16

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Nathan Dauchy (Inactive) Assignee: nasf (Inactive)
Resolution: Not a Bug Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Looking for clarification on the intent of the "-A" option to "lctl lfsck_start".

The help text says:

options:
-M: device to start LFSCK/scrub on
-A: start LFSCK on all MDT devices

Which implies that one should either run lfsck on just one OR all devices. Yet, it seems that specifying -M is still required:

service322 ~ # lctl lfsck_start -A
Must specify device to start LFSCK.

So, I would guess that "-M" was never made non-mandatory in the arg parsing when -A was added. Or perhaps the help text is just misleading and all that is needed is a documentation update?

Should the -A and -M options be mutually-exclusive?

Under what circumstances would you NOT want to use -A? Should that just be the default, and then -M is optional to be more restrictive in lfsck operation, and no -A option is needed at all?



 Comments   
Comment by Peter Jones [ 30/May/16 ]

Fan Yong

Could you please advise?

Thanks

Peter

Comment by nasf (Inactive) [ 31/May/16 ]

Sorry for the confusion. The new LFSCK contains several functionalities: some of them are local-device based, such as OI scrub and linkEA and FID-in-dirent verification; some of them cross MDT-OST servers, such as layout LFSCK; some of them need all MDTs to be involved, such as namespace LFSCK for DNE. All these functionalities or components share the same LFSCK API.

The option "-M" specifies the device to which the LFSCK command is initially sent. For example, if you want to start OI scrub on MDT0002, you can specify "lctl lfsck_start -M $FSNAME-MDT0002 -t scrub"; the command will then be sent to MDT0002 only. If you want to start layout LFSCK on MDT0003, you can specify "lctl lfsck_start -M $FSNAME-MDT0003 -t layout"; the command will then be sent to MDT0003, and MDT0003 will send the LFSCK command to the OSTs.
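The two invocations above can be sketched as a small dry-run script. It only prints the command lines and never calls lctl; the file system name `testfs` and the helper `lfsck_start_cmd` are assumptions made for illustration and are not part of Lustre.

```shell
#!/bin/sh
# Hypothetical file system name; substitute your own.
FSNAME=testfs

# lfsck_start_cmd MDT_INDEX TYPE
# Print (not run) the lctl command that targets one MDT.
# The index is zero-padded to four hex digits, as in MDT0002.
lfsck_start_cmd() {
    printf 'lctl lfsck_start -M %s-MDT%04x -t %s\n' "$FSNAME" "$1" "$2"
}

# OI scrub is local: the command goes to MDT0002 only.
lfsck_start_cmd 2 scrub

# Layout LFSCK: sent to MDT0003, which then forwards it to the OSTs.
lfsck_start_cmd 3 layout
```

Printing the command first makes it easy to check which device will receive it before running anything on a production MDS.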

The option "-A" means we want to start/stop LFSCK on all MDT devices. In that case, you still need to specify the "-M" option. For example, if you have four MDTs in your system and you want to start layout LFSCK on all MDTs (and the related OSTs), you can specify "lctl lfsck_start -M $FSNAME-MDT0000 -t layout -A"; the command will then be sent to MDT0000, MDT0000 will send the LFSCK command to the other three MDTs, and each MDT will send the LFSCK command to the OSTs.
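The forwarding chain for the four-MDT example can be sketched as follows. This is only a model of the description in this comment, not of the LFSCK implementation; the function `fanout` and the file system name `testfs` are hypothetical.

```shell
#!/bin/sh
# fanout FSNAME FIRST COUNT
# Print who passes the LFSCK command to whom when -A is given:
# lctl contacts the MDT named by -M (index FIRST), and that MDT
# forwards the command to every other MDT. Each MDT then notifies
# the OSTs, which this sketch does not list.
fanout() {
    fsname=$1
    first=$2
    count=$3
    printf 'lctl -> %s-MDT%04x\n' "$fsname" "$first"
    i=0
    while [ "$i" -lt "$count" ]; do
        if [ "$i" -ne "$first" ]; then
            printf '%s-MDT%04x -> %s-MDT%04x\n' "$fsname" "$first" "$fsname" "$i"
        fi
        i=$((i + 1))
    done
}

# Four MDTs, command initially sent to MDT0000.
fanout testfs 0 4
```

The sketch shows why "-M" is still needed with "-A": some MDT has to be the entry point that receives the command from lctl.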

So "-M" and "-A" are NOT mutually exclusive. "-M" is mandatory whether or not "-A" is specified. "-A" is optional and disabled by default, but sometimes it is enabled automatically. For example, if you specify "-o" for layout LFSCK to handle orphan OST-objects, then even if you do not specify "-A", the LFSCK commands will be sent to all MDTs and OSTs, starting from the MDTxxxx specified by the "-M" option. On the other hand, if you want to start namespace LFSCK for the DNE case, you need to specify "-A" explicitly; otherwise it will start local namespace LFSCK for linkEA and FID-in-dirent only.
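These rules can be condensed into a small decision sketch. It encodes only what this comment states (the real lctl and LFSCK code may differ), and the function name `lfsck_scope` and its output strings are invented for illustration.

```shell
#!/bin/sh
# lfsck_scope DEVICE TYPE [-A] [-o]
# Print which devices an LFSCK run would involve, per the rules above.
lfsck_scope() {
    if [ "$#" -lt 2 ] || [ -z "$1" ]; then
        # -M is mandatory, with or without -A.
        echo "Must specify device to start LFSCK." >&2
        return 1
    fi
    device=$1
    type=$2
    shift 2
    all=no
    orphan=no
    for flag in "$@"; do
        case $flag in
            -A) all=yes ;;
            -o) orphan=yes ;;
        esac
    done
    # Orphan handling in layout LFSCK enables -A implicitly.
    if [ "$type" = layout ] && [ "$orphan" = yes ]; then
        all=yes
    fi
    if [ "$all" = yes ]; then
        echo "all MDTs, starting from $device"
    elif [ "$type" = layout ]; then
        echo "$device and its OSTs"
    else
        # scrub, or namespace without -A: local checks only.
        echo "$device only"
    fi
}

lfsck_scope testfs-MDT0000 layout -o
lfsck_scope testfs-MDT0000 namespace
lfsck_scope testfs-MDT0000 namespace -A
```

The second and third calls illustrate the DNE case: without "-A" the namespace check stays local to the named MDT.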

Comment by Nathan Dauchy (Inactive) [ 31/May/16 ]

That clarification helps a lot, thanks! (maybe some of it should be in the manual or help output?)

Just to be sure I understand... on a system with a single MDT, there is never really a need to use the -A option, since the other flags will control whether LFSCK commands are sent to the OSTs, right? Conversely, what is the downside to specifying -A, and why not just make that the default behaviour to simplify things for the system admins?

Comment by nasf (Inactive) [ 31/May/16 ]

For a single-MDT system, there is no need to specify the "-A" option. The MDT will dispatch the related LFSCK command to the OSTs automatically.
As for why "-A" is not enabled by default, there is no special reason, but consider some use cases such as OI scrub: it is local-device based and was introduced in Lustre 2.3, when there was no "-A" option yet. If we enabled "-A" by default, we would need another option for these non-"-A" cases.
