<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:08:55 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-631] IO errors when using automounter and Lustre</title>
                <link>https://jira.whamcloud.com/browse/LU-631</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Ever since we moved from Lustre 1.6.6 to 1.8 I&apos;ve seen issues with using&lt;br/&gt;
the automounter and Lustre.  I&apos;ve finally got around to looking at what&lt;br/&gt;
the issue is, but I&apos;m not quite sure what the correct way to resolve it&lt;br/&gt;
is.  I think the issue will remain in 2.0+ but I didn&apos;t look closely at&lt;br/&gt;
the code.  The issue is that lov_connect, which calls lov_connect_obd, performs&lt;br/&gt;
an asynchronous connect that does not wait for all OSCs to be connected&lt;br/&gt;
before returning.  As a result, lustre_fill_super can return before all&lt;br/&gt;
OSCs have been set active, so any file operation that triggered the&lt;br/&gt;
automount may return an error.  Many lov functions check that the&lt;br/&gt;
lov_tgt_desc ltd_active flag is 1 and return -EIO otherwise.&lt;/p&gt;

&lt;p&gt;Original email thread from lustre-devel:&lt;br/&gt;
&lt;a href=&quot;http://groups.google.com/group/lustre-devel-list/browse_thread/thread/4796d88cadf9d0e9/248ebf6e3f9877f3?lnk=gst&amp;amp;q=automount#248ebf6e3f9877f3&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://groups.google.com/group/lustre-devel-list/browse_thread/thread/4796d88cadf9d0e9/248ebf6e3f9877f3?lnk=gst&amp;amp;q=automount#248ebf6e3f9877f3&lt;/a&gt;&lt;/p&gt;</description>
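<!--
Editorial sketch of the race described in the report above, written as a toy Python model
rather than Lustre code. The names Target, Lov, connect_async, complete_connects, glimpse
and mount are hypothetical; only the behaviour (the connect returns before the targets are
active, and an operation that finds no active target gets -EIO) is taken from the report.

```python
import errno


class Target:
    """Hypothetical stand-in for one lov_tgt_desc entry."""

    def __init__(self, name):
        self.name = name
        self.active = False  # plays the role of ltd_active


class Lov:
    """Toy model of a LOV that owns several OSC targets."""

    def __init__(self, names):
        self.targets = [Target(n) for n in names]
        self.pending = []

    def connect_async(self):
        # Queue the connects and return immediately, without waiting.
        self.pending = list(self.targets)

    def complete_connects(self):
        # Simulates the connect replies arriving some time later.
        for tgt in self.pending:
            tgt.active = True
        self.pending = []

    def glimpse(self):
        # With no active target the operation fails with -EIO.
        if not any(t.active for t in self.targets):
            return -errno.EIO
        return 0


def mount(lov, wait_for_connect):
    """Model of a mount that may return before the targets are active."""
    lov.connect_async()
    if wait_for_connect:
        lov.complete_connects()
    return lov


# The operation that triggers the automount runs right after mount returns.
racy = mount(Lov(["OST0000", "OST0001"]), wait_for_connect=False)
first_result = racy.glimpse()
racy.complete_connects()
retry_result = racy.glimpse()
```

In this model the first operation after a non-waiting mount fails with -EIO and a retry
succeeds once the connects have finished, which matches the symptom described in the report.
-->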
                <environment>various</environment>
        <key id="11562">LU-631</key>
            <summary>IO errors when using automounter and Lustre</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="hongchao.zhang">Hongchao Zhang</assignee>
                                    <reporter username="jfilizetti">Jeremy Filizetti</reporter>
                        <labels>
                            <label>ptr</label>
                    </labels>
                <created>Wed, 24 Aug 2011 21:07:49 +0000</created>
                <updated>Tue, 9 Jul 2013 12:58:04 +0000</updated>
                            <resolved>Thu, 25 Apr 2013 17:14:07 +0000</resolved>
                                    <version>Lustre 1.8.6</version>
                                    <fixVersion>Lustre 2.4.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="22454" author="pjones" created="Thu, 3 Nov 2011 17:31:29 +0000"  >&lt;p&gt;Hongchao&lt;/p&gt;

&lt;p&gt;Can you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="22490" author="hongchao.zhang" created="Fri, 4 Nov 2011 06:03:09 +0000"  >&lt;p&gt;The -EIO caused by &quot;ls -l  /lustre/xen1/tmp/testfile&quot; comes from &quot;lov_enqueue&quot;: &quot;lov_prep_enqueue_set&quot; finds&lt;br/&gt;
no available OSC to send the glimpse request to, so the request set contains no lov_request and it returns -EIO.&lt;/p&gt;

&lt;p&gt;   in &quot;lov_prep_enqueue_set&quot;,&lt;br/&gt;
   ...&lt;br/&gt;
   if (!set-&amp;gt;set_count)&lt;br/&gt;
      GOTO(out_set, rc = -EIO);&lt;br/&gt;
   ...&lt;/p&gt;

&lt;p&gt;Here we could wait for these OSCs to be connected &amp;amp; activated, but that will take a long time if the OST is recovering.&lt;br/&gt;
Furthermore, there is still a problem in the current code:&lt;br/&gt;
if a file has more than one stripe and one OSC is activated but the other isn&apos;t, then only one glimpse request&lt;br/&gt;
is sent, and its A(CM)Time&amp;amp;Size is taken into account, but the second one&apos;s is not! The effect is the same if we don&apos;t&lt;br/&gt;
return &quot;-EIO&quot; in the above code snippet.&lt;/p&gt;</comment>
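<!--
Editorial sketch of the partial-glimpse problem raised in the comment above, as a toy Python
model rather than Lustre code. Stripe and glimpse_merge are hypothetical names, and summing
the object sizes is a deliberate simplification of how a striped file size is derived. The
point is only that attributes from an inactive stripe are silently left out of the result.

```python
import errno
from dataclasses import dataclass


@dataclass
class Stripe:
    active: bool   # whether the OSC backing this stripe is active
    obj_size: int  # size of the stripe object, in bytes
    mtime: int     # modification time reported for this stripe


def glimpse_merge(stripes):
    """Merge attributes from active stripes only (toy model)."""
    replies = [s for s in stripes if s.active]
    if not replies:
        # Mirrors: if (!set->set_count) GOTO(out_set, rc = -EIO);
        return -errno.EIO, None
    merged = {
        "size": sum(s.obj_size for s in replies),  # simplification
        "mtime": max(s.mtime for s in replies),
    }
    return 0, merged


# Two stripes, the second one behind an OSC that is not active yet.
stripes = [Stripe(True, 1024, 100), Stripe(False, 4096, 250)]
rc, attrs = glimpse_merge(stripes)
```

Here the call succeeds, but the merged size and mtime reflect only the first stripe; the
larger size and newer mtime held by the inactive stripe are not taken into account.
-->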
                            <comment id="22584" author="jfilizetti" created="Sun, 6 Nov 2011 23:03:19 +0000"  >&lt;p&gt;I think the easiest way to make a satisfactory fix (to me) is to make sure that nothing is queued to the OSC before it has been set active so that we don&apos;t return -EIO from lov_prep_enqueue_set on operations that might have triggered the mount from the automounter.&lt;/p&gt;

&lt;p&gt;The bug you mention, about not accounting for {a,c,m}time and size from all of the OSCs if some of them are down, should also be fixed.  Maybe that should be tracked under a separate bug.&lt;/p&gt;</comment>
                            <comment id="23345" author="pjones" created="Wed, 23 Nov 2011 10:00:45 +0000"  >&lt;p&gt;Bobi&lt;/p&gt;

&lt;p&gt;Hongchao is out for a while. Could you please investigate this issue in his absence?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="23629" author="hongchao.zhang" created="Fri, 2 Dec 2011 06:09:20 +0000"  >&lt;p&gt;the patch is tracked at &lt;a href=&quot;http://review.whamcloud.com/#change,2469&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,2469&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="57060" author="pjones" created="Thu, 25 Apr 2013 17:14:07 +0000"  >&lt;p&gt;Landed for 2.4&lt;/p&gt;</comment>
                            <comment id="61934" author="shadow" created="Tue, 9 Jul 2013 12:58:04 +0000"  >&lt;p&gt;Good patch for making the MDT hang if someone adds an OST that is unreachable at config change time.&lt;br/&gt;
A new creation calls statfs to obtain information about the new OST, so any new creation is blocked until the OST connection finishes.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvp2n:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>7892</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>