<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:01:12 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-6553] Recurrence of LU-5299: obd_mount_server.c:1690:osd_start()) ASSERTION( obd ) failed</title>
                <link>https://jira.whamcloud.com/browse/LU-6553</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The patch for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5573&quot; title=&quot;Test timeout conf-sanity test_41c&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5573&quot;&gt;&lt;del&gt;LU-5573&lt;/del&gt;&lt;/a&gt; (&lt;a href=&quot;http://review.whamcloud.com/#/c/12353/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/12353/&lt;/a&gt;), which closed &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5299&quot; title=&quot;osd_start() LBUG when doing parallel mount of the same target&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5299&quot;&gt;&lt;del&gt;LU-5299&lt;/del&gt;&lt;/a&gt;, does not cover some cases.&lt;/p&gt;

&lt;p&gt;Specifically, the code which enables the combined MGT/MDT to start correctly also disables the race protection for a combined MGT/MDT.&lt;/p&gt;

&lt;p&gt;So racing multiple mount commands on a combined MGT/MDT can still cause this problem.&lt;/p&gt;

&lt;p&gt;I&apos;ve taken a look, and I don&apos;t see any easy way to fix this in the current context.  I can provide dumps if needed, and I&apos;ll attach a log now.&lt;/p&gt;

&lt;p&gt;Note the attempts to start MDT0000.  There are five, four of which start after the first one but before it has completed.&lt;/p&gt;</description>
                <environment>Combined MGT/MDT, racing multiple mount commands.</environment>
        <key id="29817">LU-6553</key>
            <summary>Recurrence of LU-5299: obd_mount_server.c:1690:osd_start()) ASSERTION( obd ) failed</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="bfaccini">Bruno Faccini</assignee>
                                    <reporter username="paf">Patrick Farrell</reporter>
                        <labels>
                    </labels>
                <created>Fri, 1 May 2015 14:59:00 +0000</created>
                <updated>Wed, 13 Oct 2021 02:27:01 +0000</updated>
                            <resolved>Wed, 13 Oct 2021 02:27:01 +0000</resolved>
                                    <version>Lustre 2.7.0</version>
                    <version>Lustre 2.5.4</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="114096" author="bfaccini" created="Sat, 2 May 2015 14:50:00 +0000"  >&lt;p&gt;Hello Patrick,&lt;br/&gt;
As the unfortunate author of both complementary patches for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5299&quot; title=&quot;osd_start() LBUG when doing parallel mount of the same target&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5299&quot;&gt;&lt;del&gt;LU-5299&lt;/del&gt;&lt;/a&gt; and &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5573&quot; title=&quot;Test timeout conf-sanity test_41c&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5573&quot;&gt;&lt;del&gt;LU-5573&lt;/del&gt;&lt;/a&gt;, I think I should assign this ticket to myself.&lt;br/&gt;
Thanks for the report and also the attached Lustre debug trace for the LBUG.&lt;br/&gt;
Looking at the trace, I think you are right that the race is still present, but only for concurrent mount/start commands where either the nosvc or the nomgs flag has been specified for a combined MDT/MGS device.&lt;br/&gt;
I will try to fix this case too, in a new complementary patch.&lt;/p&gt;</comment>
                            <comment id="114152" author="paf" created="Mon, 4 May 2015 18:35:15 +0000"  >&lt;p&gt;Thanks, Bruno - Good luck.  I couldn&apos;t find an easy way to do it, but I expect you know this code much better than me.&lt;/p&gt;</comment>
                            <comment id="127007" author="wang" created="Thu, 10 Sep 2015 23:09:26 +0000"  >&lt;p&gt;Hi Bruno, any progress on this one? Thanks.&lt;/p&gt;</comment>
                            <comment id="141957" author="bfaccini" created="Thu, 11 Feb 2016 10:41:36 +0000"  >&lt;p&gt;Patrick, Wally,&lt;br/&gt;
Sorry for the delay on this; I am back to working on a new add-on patch to fix it fully.&lt;br/&gt;
Can you provide your reproducer, or at least give some details on how these concurrent mount commands are generated?&lt;/p&gt;</comment>
                            <comment id="141967" author="paf" created="Thu, 11 Feb 2016 14:32:55 +0000"  >&lt;p&gt;Bruno -&lt;/p&gt;

&lt;p&gt;We don&apos;t have a specific reproducer.  It actually turned out we were doing concurrent mounts because our failover stuff was misconfigured on an internal system.&lt;br/&gt;
But I think it would be sufficient to simply use a bash script to spawn off multiple mount commands for a particular target.&lt;/p&gt;</comment>
                            <comment id="145520" author="wang" created="Tue, 15 Mar 2016 00:23:53 +0000"  >&lt;p&gt;Bruno,&lt;br/&gt;
Here is a simple reproducer:&lt;/p&gt;

&lt;p&gt;1. create and start a Lustre file system with mgt/mdt combo&lt;br/&gt;
2. umount the mgt and mdt&lt;br/&gt;
3. run the following &apos;test_mount&apos; script 5 times in parallel:&lt;/p&gt;

&lt;p&gt;cat test_mount&lt;br/&gt;
#!/bin/bash&lt;br/&gt;
mount -t lustre -o nosvc,abort_recov --verbose /dev/sdd /tmp/lustre/scratch/mgt&lt;br/&gt;
mount -t lustre -o nomgs,abort_recov --verbose /dev/sdd /tmp/lustre/scratch/mdt&lt;/p&gt;

&lt;p&gt;for ((i=0;i&amp;lt;5;i++));do ./test_mount &amp;amp; done;&lt;/p&gt;</comment>
                            <comment id="145526" author="paf" created="Tue, 15 Mar 2016 02:15:41 +0000"  >&lt;p&gt;Thanks, Wally!&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="26271">LU-5573</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="25445">LU-5299</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="17644" name="perses_20150430t095047_mds2.log.sort.gz" size="3906018" author="paf" created="Fri, 1 May 2015 14:59:00 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzxcc7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>