<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:16:53 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-8361] lctl lfsck_start --all does not start lfsck on all devices</title>
                <link>https://jira.whamcloud.com/browse/LU-8361</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;It appears that in lustre 2.8.0, the lctl lfsck_start --all command is broken:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@jet1:~]# lctl lfsck_start -A
Must specify device to start LFSCK.
[root@jet1:~]# lctl lfsck_start --all 
Must specify device to start LFSCK.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But the man page says (in the lfsck section):&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;         -A, --all&lt;br/&gt;
              Start LFSCK on all available MDT devices.&lt;/p&gt;&lt;/blockquote&gt;</description>
                <environment></environment>
        <key id="37953">LU-8361</key>
            <summary>lctl lfsck_start --all does not start lfsck on all devices</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="yong.fan">nasf</assignee>
                                    <reporter username="morrone">Christopher Morrone</reporter>
                        <labels>
                            <label>llnl</label>
                    </labels>
                <created>Fri, 1 Jul 2016 21:28:43 +0000</created>
                <updated>Tue, 28 Feb 2017 16:46:32 +0000</updated>
                            <resolved>Thu, 8 Sep 2016 03:34:47 +0000</resolved>
                                    <version>Lustre 2.8.0</version>
                                    <fixVersion>Lustre 2.9.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="157583" author="pjones" created="Sat, 2 Jul 2016 13:16:54 +0000"  >&lt;p&gt;Fan Yong&lt;/p&gt;

&lt;p&gt;Could you please advise on this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="157607" author="yong.fan" created="Mon, 4 Jul 2016 15:48:50 +0000"  >&lt;p&gt;Sorry for the confusing man page. As you can see via &quot;lctl lfsck_start --help&quot;:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# lctl lfsck_start --help
start LFSCK
usage:
lfsck_start &amp;lt;-M | --device {MDT,OST}_device&amp;gt;
	     [-A | --all] [-c | --create_ostobj [on | off]]
	     [-C | --create_mdtobj [on | off]]
	     [-e | --error {continue | abort}] [-h | --help]
	     [-n | --dryrun [on | off]] [-o | --orphan]
             [-r | --reset] [-s | --speed ops_per_sec_limit]
             [-t | --type check_type[,check_type...]]
	     [-w | --window_size size]
options:
...
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &quot;&amp;#45;M|&amp;#45;&amp;#45;device&quot; option must be specified for every LFSCK command except the &quot;help&quot; case. The &quot;&amp;#45;A|&amp;#45;&amp;#45;all&quot; option is meant for the DNE case. If there are multiple MDTs in your system, then in the default mode LFSCK is started only on the specified MDT and its related OSTs (for the layout LFSCK case). To start LFSCK on the whole system, you need to specify both the &quot;&amp;#45;M|&amp;#45;&amp;#45;device&quot; and the &quot;&amp;#45;A|&amp;#45;&amp;#45;all&quot; options; the specified MDT will then dispatch the LFSCK command to the other MDTs automatically.&lt;/p&gt;

&lt;p&gt;You may ask why the &quot;&amp;#45;M|&amp;#45;&amp;#45;device&quot; option must be specified. There are two main reasons:&lt;br/&gt;
1) It is not easy for the &quot;lctl&quot; command to determine the device name, because Lustre allows multiple Lustre instances to share the same physical node.&lt;br/&gt;
2) Historical compatibility. As you know, there have been several LFSCK releases since Lustre-2.3, but full LFSCK support for DNE was only released in Lustre-2.7. Before that, all LFSCK commands needed the &quot;&amp;#45;M|&amp;#45;&amp;#45;device&quot; option. Keeping the option makes things easier for existing LFSCK users/scripts.&lt;/p&gt;</comment>
                            <comment id="157697" author="morrone" created="Tue, 5 Jul 2016 18:25:29 +0000"  >&lt;p&gt;I don&apos;t buy the explanation that it is not easy to decide the device name from the lctl command.  If there are multiple Lustre instances, you could simply start lfsck check on all of them.  After all, it is the &quot;all&quot; command.  If you want to make it easier for people to select one filesystem and not another, allow them to specify a filesystem name rather than a device name.&lt;/p&gt;

&lt;p&gt;The historical consideration doesn&apos;t really make too much sense to me either.  You can support the older -M option without making it a required parameter in all instances.&lt;/p&gt;

&lt;p&gt;Yes, the man page definitely needs improvement as well.&lt;/p&gt;</comment>
                            <comment id="157746" author="yong.fan" created="Wed, 6 Jul 2016 01:06:03 +0000"  >&lt;p&gt;Here, the &quot;Start LFSCK on all available MDT devices&quot; wants to say that the user can start the LFSCK on all the available MDTs via single &quot;lctl lfsck_start&quot; command, instead of &quot;lctl lfsck_start&quot; on each MDT one by one. It is NOT exclusive with the &quot;-M&quot; option.&lt;/p&gt;

&lt;p&gt;As for specifying the filesystem name: to some degree, that is equivalent to splitting the &quot;-M&quot; option into several options: &quot;filesystem name&quot; + &quot;device role&quot; + &quot;device index&quot;. It is not impossible, but it means more options. It may be convenient in some cases and inconvenient in others. When LFSCK was first released in Lustre-2.3, the &quot;-M&quot; option was enough, and the subsequent LFSCK releases inherited it.&lt;/p&gt;

&lt;p&gt;Anyway, I will update the man page to describe things more clearly and avoid misleading users.&lt;/p&gt;</comment>
                            <comment id="157754" author="yong.fan" created="Wed, 6 Jul 2016 03:43:12 +0000"  >&lt;p&gt;Here is the patch:&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/#/c/19294/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/19294/&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="158450" author="morrone" created="Tue, 12 Jul 2016 01:29:54 +0000"  >&lt;blockquote&gt;&lt;p&gt;Here, the &quot;Start LFSCK on all available MDT devices&quot; wants to say that the user can start the LFSCK on all the available MDTs via single &quot;lctl lfsck_start&quot; command, instead of &quot;lctl lfsck_start&quot; on each MDT one by one. It is NOT exclusive with the &quot;-M&quot; option.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;I understand that.  I am saying that this design is not user friendly.  I would love for it to work like this:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;$ lctl lfsck start
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note that there is no underscore connecting &quot;lfsck&quot; and &quot;start&quot;.  And this would just start lfsck for all devices found for all filesystems found.  Probably 99% of Lustre installations will not have multiple filesystems served from the same servers, so we optimize for the common case and make that clean and simple.  Then we can add options to allow finer control for the remaining 1% of use cases.&lt;/p&gt;

&lt;p&gt;-M, which looks like an option but isn&apos;t optional, is not a clean user interface design.  And I don&apos;t agree that it is functionally equivalent to a filesystem name.  A sysadmin is more likely to know the filesystem&apos;s name off the top of their head.  An individual MDT name that is valid on the current node is probably something that they will always need to look up.  And what is the command to look that up?  &quot;lctl dl&quot;?  That command is terrible.&lt;/p&gt;</comment>
                            <comment id="158451" author="morrone" created="Tue, 12 Jul 2016 01:38:26 +0000"  >&lt;p&gt;Come to think of it, rather than making the filesystem name an option, and because single filesystems are the most common, you could just make &quot;lctl lfsck start&quot; return an error if there is more than one filesystem.  Something like this:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;$ lctl lfsck start
Error: more than one filesystem, please specify which one:
  lquake
  ltest
$ lctl lfsck start lquake
Starting lfsck on lquake
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But on a system with only one filesystem it would just do:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;$ lctl lfsck start
Starting lfsck on bigfs
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="159262" author="adilger" created="Tue, 19 Jul 2016 21:28:07 +0000"  >&lt;p&gt;Chris,&lt;br/&gt;
please note that &lt;tt&gt;-M &amp;lt;fsname&amp;gt;&lt;/tt&gt; does not need the full device UUID, just the target name like &lt;tt&gt;&amp;lt;fsname&amp;gt;-MDT0000&lt;/tt&gt; so it does not need to be looked up by &lt;tt&gt;lctl dl&lt;/tt&gt; each time.  That said, I agree there could be improvements in this area.&lt;/p&gt;

&lt;p&gt;Fan Yong,&lt;br/&gt;
I don&apos;t think it would be a problem to allow &lt;tt&gt;lctl lfsck_start -A&lt;/tt&gt; to run on all filesystems (MDTs and OSTs) that are connected to the current node.  Since all of the current users would be using &lt;tt&gt;-M &amp;lt;mdt_name&amp;gt;&lt;/tt&gt;, simply not requiring that option together with &lt;tt&gt;-A&lt;/tt&gt; would keep compatibility and also address Chris&apos; concerns.  It should also be relatively straightforward for &lt;tt&gt;-A -M &amp;lt;fsname&amp;gt;&lt;/tt&gt; to allow just specifying the fsname, which is no less specific than using &lt;tt&gt;-A -M &amp;lt;fsname&amp;gt;-MDT0000&lt;/tt&gt; since it is already checking all the devices in that filesystem. The device name(s) list can be found with something like:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;        glob_t param;

        rc = llapi_get_param_path(NULL, &lt;span class=&quot;code-quote&quot;&gt;&quot;mdd.*-MDT0000&quot;&lt;/span&gt;, FILTER_BY_EXACT, NULL, &amp;amp;param);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="159266" author="morrone" created="Tue, 19 Jul 2016 21:54:50 +0000"  >&lt;blockquote&gt;&lt;p&gt;please note that -M &amp;lt;fsname&amp;gt; does not need the full device UUID, just the target name like &amp;lt;fsname&amp;gt;-MDT0000 so it does not need to be looked up by lctl dl each time&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;I know.  But there are now more targets than just 0000.  It requires a lookup or some other external knowledge to know the mapping of MDTs to MDS nodes.  It is an added difficulty that is easily avoided with good command design.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;rc = llapi_get_param_path(NULL, &quot;mdd.*-MDT0000&quot;, FILTER_BY_EXACT, NULL, &amp;amp;param);&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;It can&apos;t be a literal &quot;0000&quot;.  Not all MDS nodes will be serving that particular index.&lt;/p&gt;</comment>
                            <comment id="159272" author="adilger" created="Tue, 19 Jul 2016 23:28:27 +0000"  >&lt;p&gt;I was thinking that &quot;lctl lfsck_start -A&quot; would be run on the primary MDS and then it doesn&apos;t matter where the rest of the MDTs are located since &quot;-A&quot; will start lfsck on all of the other MDS and OSS nodes. &lt;/p&gt;</comment>
                            <comment id="159273" author="morrone" created="Tue, 19 Jul 2016 23:34:07 +0000"  >&lt;p&gt;I would prefer not to make that restriction if it does not already exist.  From a usability standpoint, it would be nice if admins could start lfsck from the MGS too, and maybe even from clients in the future.&lt;/p&gt;

&lt;p&gt;If there are any restrictions on where the command can be run, they need to be clearly stated.&lt;/p&gt;

&lt;p&gt;And if the command can only be run on the node with the zero index MDT, then the error message needs to be very clear when the command is run on a node without that MDT.&lt;/p&gt;</comment>
                            <comment id="159277" author="adilger" created="Wed, 20 Jul 2016 00:07:26 +0000"  >&lt;p&gt;I&apos;m fine with allowing lfsck to be started on any MDT if that isn&apos;t already a restriction (I&apos;ve always started it on MDT0000 by virtue of only having one MDS on my home system). I&apos;m against allowing administrator tasks to be run from clients, since this adds an extra level of security issues, and IMHO doesn&apos;t provide any significant benefit. &lt;/p&gt;</comment>
                            <comment id="159326" author="yong.fan" created="Wed, 20 Jul 2016 16:07:16 +0000"  >&lt;p&gt;The current implementation supports starting LFSCK on any MDT as long as you specify the &quot;-M&quot; option properly; for example, &quot;-M ${fsname}-MDT0001&quot; also works. But it does NOT allow LFSCK to be started from a client, nor from a separate MGS node.&lt;/p&gt;</comment>
                            <comment id="160259" author="gerrit" created="Fri, 29 Jul 2016 02:44:57 +0000"  >&lt;p&gt;Fan Yong (fan.yong@intel.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/21596&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/21596&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8361&quot; title=&quot;lctl lfsck_start --all does not start lfsck on all devices&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8361&quot;&gt;&lt;del&gt;LU-8361&lt;/del&gt;&lt;/a&gt; lfsck: detect Lustre device automatically&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 5e2f4b5358e926ae20658f6e5ba313e253db32c0&lt;/p&gt;</comment>
                            <comment id="165229" author="gerrit" created="Thu, 8 Sep 2016 02:06:01 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/21596/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/21596/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8361&quot; title=&quot;lctl lfsck_start --all does not start lfsck on all devices&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8361&quot;&gt;&lt;del&gt;LU-8361&lt;/del&gt;&lt;/a&gt; lfsck: detect Lustre device automatically&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: a0f7174c4106104f45977eeec7338e8f7fd1dafa&lt;/p&gt;</comment>
                            <comment id="165250" author="yong.fan" created="Thu, 8 Sep 2016 03:34:47 +0000"  >&lt;p&gt;The patch has landed on master.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzyggn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>