<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:35:12 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-3587] CR_MAXSIZE is too small</title>
                <link>https://jira.whamcloud.com/browse/LU-3587</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;One of our Lustre 2.4.0 based clients is hitting an assertion when we run robinhood.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;2013-07-09 14:35:15 vrelic8 login: LustreError: 5467:0:(mdc_request.c:1500:changelog_kuc_hdr()) ASSERTION( len &amp;lt;= cfs_size_round(2*255 + 1 + sizeof(struct changelog_rec)) ) failed:
2013-07-09 14:41:07 LustreError: 5467:0:(mdc_request.c:1500:changelog_kuc_hdr()) LBUG
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Backtrace of the crashed thread:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;PID: 5467   TASK: ffff885c8668eaa0  CPU: 17  COMMAND: &quot;mdc_clg_send_th&quot;
 #0 [ffff885c866779e8] machine_kexec at ffffffff81035bfb
 #1 [ffff885c86677a48] crash_kexec at ffffffff810c0932
 #2 [ffff885c86677b18] panic at ffffffff8150d943
 #3 [ffff885c86677b98] lbug_with_loc at ffffffffa042bf4b [libcfs]
 #4 [ffff885c86677bb8] changelog_kuc_hdr at ffffffffa096bf3e [mdc]
 #5 [ffff885c86677bc8] changelog_kkuc_cb at ffffffffa096d28c [mdc]
 #6 [ffff885c86677bf8] llog_process_thread at ffffffffa05e110b [obdclass]
 #7 [ffff885c86677ca8] llog_process_or_fork at ffffffffa05e2e1d [obdclass]
 #8 [ffff885c86677cf8] llog_cat_process_cb at ffffffffa05e560a [obdclass]
 #9 [ffff885c86677d58] llog_process_thread at ffffffffa05e110b [obdclass]
#10 [ffff885c86677e08] llog_process_or_fork at ffffffffa05e2e1d [obdclass]
#11 [ffff885c86677e58] llog_cat_process_or_fork at ffffffffa05e3ef9 [obdclass]
#12 [ffff885c86677ee8] llog_cat_process at ffffffffa05e41d9 [obdclass]
#13 [ffff885c86677f08] mdc_changelog_send_thread at ffffffffa097157b [mdc]
#14 [ffff885c86677f48] kernel_thread at ffffffff8100c0ca
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Digging into the crash dump, the values on the stack did not appear to contain the &quot;len&quot; variable calculated by changelog_kkuc_cb().  But the problem was easily reproducible, so I added a debug message to the assertion and tried again.  This told me:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;2013-07-15 11:27:32 LustreError: 5472:0:(mdc_request.c:1504:changelog_kuc_hdr()) ASSERTION( len &amp;lt;= CR_MAXSIZE ) failed: CR_MAXSIZE=576, len=588, rec=ffff885ce8c9a148, changelog_rec_size=96
2013-07-15 11:27:32 LustreError: 5472:0:(mdc_request.c:1504:changelog_kuc_hdr()) LBUG
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With a pointer to the struct llog_changelog_rec, I could print it out in crash:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;crash&amp;gt; print *(struct llog_changelog_rec *)0xffff885ce8c9a148
$4 = {
  cr_hdr = {
    lrh_len = 608,
    lrh_index = 29516,
    lrh_type = 275120128,
    padding = 0
  },
  cr = {
    cr_namelen = 484,
    cr_flags = 8192,
    cr_type = 8,
    cr_index = 80023354,
    cr_prev = 0,
    cr_time = 1474100508289501189,
    {
      cr_tfid = {
        f_seq = 0,
        f_oid = 0,
        f_ver = 0
      },
      cr_markerflags = 0
    },
    cr_pfid = {
      f_seq = 8589961859,
      f_oid = 28956,
      f_ver = 0
    },
    cr_name = 0xffff885ce8c9a198 &quot;\263j&quot;
  },
  cr_tail = {
    lrt_len = 27315,
    lrt_index = 2
  }
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So here is what I understand:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;cr_type tells us that this is a CL_RENAME&lt;/li&gt;
	&lt;li&gt;cr_flags tells us that cr is a struct changelog_ext_rec instead of the normal struct changelog_rec&lt;/li&gt;
	&lt;li&gt;cr_namelen is large at 484&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Note that cr_name looks corrupt, but is not.  crash simply does not know that we have an undeclared union here.  If I manually add 32 bytes to the cr_name pointer, I &lt;em&gt;am&lt;/em&gt; able to see a correct filename:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;crash&amp;gt; print (((struct llog_changelog_rec *)0xffff885ce8c9a148)-&amp;gt;cr).cr_name+32&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I won&apos;t share the user&apos;s file name here; it is sufficient to say that it is 245 characters long, &lt;em&gt;not&lt;/em&gt; including the terminating null character. If I then add another 246 bytes (the name plus its terminator), I can see a second filename:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;crash&amp;gt; print (((struct llog_changelog_rec *)0xffff885ce8c9a148)-&amp;gt;cr).cr_name+32+246&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This filename is nearly the same as the first one, but the trailing string of &quot;-failed&quot; was removed, resulting in a string 238 characters long, not including the terminating null character.  Doing the math:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;245+238+1 = 484&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;OK, that matches cr_namelen!  Now I think I see why CR_MAXSIZE starts out with &quot;2*NAME_MAX + 1&quot;.  But I don&apos;t fully understand it.  Why are we only accounting for one of the null characters?&lt;/p&gt;

&lt;p&gt;But the root of the bug appears to be that CR_MAXSIZE only accounts for the size of the smaller &quot;sizeof(struct changelog_rec)&quot;, when it should be using &quot;sizeof(struct changelog_ext_rec)&quot;.  Those extra 32 bytes would have been enough to store this structure, and avoid the bug.&lt;/p&gt;
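
&lt;p&gt;As a rough check of the arithmetic above, here is a minimal Python sketch.  It assumes that cfs_size_round() rounds up to a multiple of 8 (an assumption, though it does reproduce the CR_MAXSIZE=576 seen in the debug message) and derives the smaller record size as 96 - 32 = 64 bytes.  The names size_round and cr_maxsize are hypothetical helpers, not Lustre code.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
# Sizes come from the debug output in this report; the 8-byte rounding is an
# assumption about cfs_size_round() that reproduces the reported CR_MAXSIZE.
NAME_MAX = 255
EXT_REC_SIZE = 96               # changelog_rec_size printed by the debug message
REC_SIZE = EXT_REC_SIZE - 32    # the ext record is 32 bytes larger


def size_round(n, align=8):
    """Round n up to the next multiple of align (hypothetical helper)."""
    return (n + align - 1) // align * align


def cr_maxsize(rec_size):
    """Two names, one null delimiter, plus the record struct, rounded up."""
    return size_round(2 * NAME_MAX + 1 + rec_size)


failing_len = 588                       # len reported by the failed assertion
old_limit = cr_maxsize(REC_SIZE)        # limit based on struct changelog_rec
new_limit = cr_maxsize(EXT_REC_SIZE)    # limit based on struct changelog_ext_rec

# Negative headroom means the assertion fires.
print("old limit", old_limit, "headroom", old_limit - failing_len)
print("new limit", new_limit, "headroom", new_limit - failing_len)
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Under those assumptions the old limit comes out as 576, which is 12 bytes short of the 588-byte record, while the limit based on the ext record is 608 and leaves 20 bytes to spare.&lt;/p&gt;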

&lt;p&gt;I can change CR_MAXSIZE to do the right thing, and I will push a patch to do so soon.  I have not yet fully checked to see if changing CR_MAXSIZE could break anything else. &lt;/p&gt;</description>
                <environment></environment>
        <key id="19826">LU-3587</key>
            <summary>CR_MAXSIZE is too small</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bogl">Bob Glossman</assignee>
                                    <reporter username="morrone">Christopher Morrone</reporter>
                        <labels>
                    </labels>
                <created>Tue, 16 Jul 2013 00:55:51 +0000</created>
                <updated>Sun, 17 Nov 2013 14:57:24 +0000</updated>
                            <resolved>Tue, 8 Oct 2013 16:43:57 +0000</resolved>
                                    <version>Lustre 2.4.0</version>
                                    <fixVersion>Lustre 2.5.0</fixVersion>
                    <fixVersion>Lustre 2.4.2</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>7</watches>
                                                                            <comments>
                            <comment id="62340" author="morrone" created="Tue, 16 Jul 2013 01:03:29 +0000"  >&lt;p&gt;Pushed a possible solution in change &lt;a href=&quot;http://review.whamcloud.com/6993&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/6993&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I will test it some time tomorrow.&lt;/p&gt;</comment>
                            <comment id="62344" author="laisiyao" created="Tue, 16 Jul 2013 03:27:45 +0000"  >&lt;p&gt;CR_MAXSIZE starts out with &quot;2*NAME_MAX + 1&quot; because a null character is used as the delimiter between the two names.  The total length of the two names plus the delimiter is already stored in rec-&amp;gt;cr_namelen, and we always access the second name by length, so it needs no terminator of its own.&lt;/p&gt;
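
&lt;p&gt;To illustrate the layout described above, here is a minimal Python sketch.  It is not Lustre code: split_rename_names is a hypothetical helper, and the buffer is synthetic, built only to match the name lengths given in the description (245 and 238 characters).&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
def split_rename_names(name_buf, namelen):
    """Split the packed name area into (first, second) at the null delimiter.

    Only namelen bytes are considered, so the second name is bounded by
    length and does not need a terminator of its own.
    """
    packed = name_buf[:namelen]
    first, _, second = packed.partition(b"\0")
    return first, second


# Synthetic names with the lengths from the description.
first_name = b"a" * 238 + b"-failed"    # 245 bytes
second_name = b"a" * 238                # 238 bytes
buf = first_name + b"\0" + second_name  # one delimiter, no trailing null
namelen = len(buf)

name1, name2 = split_rename_names(buf, namelen)
print(namelen, len(name1), len(name2))
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With these inputs namelen is 245 + 1 + 238 = 484, the same value as the cr_namelen in the crash output.&lt;/p&gt;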

&lt;p&gt;The fix looks correct, and it will be great if a sanity test is added for this.&lt;/p&gt;</comment>
                            <comment id="62720" author="morrone" created="Mon, 22 Jul 2013 18:01:54 +0000"  >&lt;p&gt;FYI, change &lt;a href=&quot;http://review.whamcloud.com/6993&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;6993&lt;/a&gt; did address the problem, and we are using it on nodes where we run robinhood.&lt;/p&gt;</comment>
                            <comment id="64649" author="morrone" created="Tue, 20 Aug 2013 18:17:03 +0000"  >&lt;p&gt;Note that there is an additional problem, beyond the incorrectly defined CR_MAXSIZE.  The KUC code uses a size for a buffer that is basically sizeof(struct kuc_hdr) + CR_MAXSIZE, but the assertion does not take into account the struct kuc_hdr size, only CR_MAXSIZE.&lt;/p&gt;

&lt;p&gt;I have a second patch that I will push shortly.&lt;/p&gt;</comment>
                            <comment id="64688" author="morrone" created="Tue, 20 Aug 2013 23:02:18 +0000"  >&lt;p&gt;Second patch is:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/7406&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/7406&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="64690" author="morrone" created="Tue, 20 Aug 2013 23:03:51 +0000"  >&lt;p&gt;Also, as has already been mentioned, it would be very nice to have a new sanity test that renames a file with a 255-character name to another 255-character name, and then reads the resulting changelog.  I won&apos;t have time to work on that test.&lt;/p&gt;</comment>
                            <comment id="67370" author="pjones" created="Tue, 24 Sep 2013 15:29:20 +0000"  >&lt;p&gt;The patch has landed to master, but has a test been created yet?&lt;/p&gt;</comment>
                            <comment id="67409" author="pjones" created="Tue, 24 Sep 2013 18:03:34 +0000"  >&lt;p&gt;Bob&lt;/p&gt;

&lt;p&gt;Could you please look into creating a suitable test?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="67482" author="bogl" created="Tue, 24 Sep 2013 22:49:41 +0000"  >&lt;p&gt;added sanity test&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/7751&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/7751&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="68600" author="pjones" created="Tue, 8 Oct 2013 16:43:57 +0000"  >&lt;p&gt;Landed for 2.5&lt;/p&gt;</comment>
                            <comment id="70661" author="bogl" created="Mon, 4 Nov 2013 21:38:57 +0000"  >&lt;p&gt;backports for b2_4:&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/8170&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8170&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;http://review.whamcloud.com/8171&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/8171&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvvef:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9102</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>