<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:40:50 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary, append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4229] crash in NRS cleanup during mount failure</title>
                <link>https://jira.whamcloud.com/browse/LU-4229</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;I was running a memory-intensive workload on the same node and then mounted the MDS.  The mount failed an allocation during setup and then oopsed in the subsequent cleanup.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LDISKFS-fs (dm-9): mounted filesystem with ordered data mode. quota=on. Opts: 
mount.lustre: page allocation failure. order:1, mode:0x40
Pid: 6512, comm: mount.lustre Tainted: P      D W  ---------------    2.6.32-279.5.1.el6_lustre.g7f15218.x86_64 #1
Call Trace:
[&amp;lt;ffffffff811276cf&amp;gt;] ? __alloc_pages_nodemask+0x77f/0x940
[&amp;lt;ffffffff81161e92&amp;gt;] ? kmem_getpages+0x62/0x170
[&amp;lt;ffffffff81162aaa&amp;gt;] ? fallback_alloc+0x1ba/0x270
[&amp;lt;ffffffff811624ff&amp;gt;] ? cache_grow+0x2cf/0x320
[&amp;lt;ffffffff81162829&amp;gt;] ? ____cache_alloc_node+0x99/0x160
[&amp;lt;ffffffffa10116c1&amp;gt;] ? cfs_cpt_malloc+0x31/0x60 [libcfs]
[&amp;lt;ffffffff811636ef&amp;gt;] ? kmem_cache_alloc_node_notrace+0x6f/0x130
[&amp;lt;ffffffff8116392b&amp;gt;] ? __kmalloc_node+0x7b/0x100
[&amp;lt;ffffffffa10116c1&amp;gt;] ? cfs_cpt_malloc+0x31/0x60 [libcfs]
[&amp;lt;ffffffffa0a54f88&amp;gt;] ? ptlrpc_alloc_rqbd+0x1e8/0x6d0 [ptlrpc]
[&amp;lt;ffffffffa0a55555&amp;gt;] ? ptlrpc_grow_req_bufs+0xe5/0x2a0 [ptlrpc]
[&amp;lt;ffffffffa0a55d25&amp;gt;] ? ptlrpc_register_service+0x615/0x17c0 [ptlrpc]
[&amp;lt;ffffffffa0cee1a5&amp;gt;] ? mgs_init0+0x1285/0x1760 [mgs]
[&amp;lt;ffffffffa0a9bb90&amp;gt;] ? tgt_request_handle+0x0/0xe40 [ptlrpc]
[&amp;lt;ffffffffa0a6b610&amp;gt;] ? target_print_req+0x0/0xa0 [ptlrpc]
[&amp;lt;ffffffffa0ce74e9&amp;gt;] ? mgs_type_start+0x19/0x20 [mgs]
[&amp;lt;ffffffffa0cee78f&amp;gt;] ? mgs_device_alloc+0x10f/0x260 [mgs]
[&amp;lt;ffffffffa0901a2f&amp;gt;] ? obd_setup+0x1bf/0x290 [obdclass]
[&amp;lt;ffffffffa0901d08&amp;gt;] ? class_setup+0x208/0x870 [obdclass]
[&amp;lt;ffffffffa090954c&amp;gt;] ? class_process_config+0xc6c/0x1ad0 [obdclass]
[&amp;lt;ffffffffa090e3d3&amp;gt;] ? lustre_cfg_new+0x2d3/0x6e0 [obdclass]
[&amp;lt;ffffffffa090e929&amp;gt;] ? do_lcfg+0x149/0x480 [obdclass]
[&amp;lt;ffffffffa090ecf4&amp;gt;] ? lustre_start_simple+0x94/0x200 [obdclass]
[&amp;lt;ffffffffa0948479&amp;gt;] ? server_fill_super+0x1159/0x19ea [obdclass]
[&amp;lt;ffffffffa09148f8&amp;gt;] ? lustre_fill_super+0x1d8/0x530 [obdclass]
[&amp;lt;ffffffffa0914720&amp;gt;] ? lustre_fill_super+0x0/0x530 [obdclass]
[&amp;lt;ffffffff8117e16f&amp;gt;] ? get_sb_nodev+0x5f/0xa0
[&amp;lt;ffffffffa090c425&amp;gt;] ? lustre_get_sb+0x25/0x30 [obdclass]
[&amp;lt;ffffffff8117ddcb&amp;gt;] ? vfs_kern_mount+0x7b/0x1b0
[&amp;lt;ffffffff8117df72&amp;gt;] ? do_kern_mount+0x52/0x130
[&amp;lt;ffffffff8119c652&amp;gt;] ? do_mount+0x2d2/0x8d0
[&amp;lt;ffffffff8119cce0&amp;gt;] ? sys_mount+0x90/0xe0

LustreError: 6512:0:(service.c:156:ptlrpc_grow_req_bufs()) mgs: Can&apos;t allocate request buffer
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [&amp;lt;ffffffffa0a8ac5c&amp;gt;] ptlrpc_service_nrs_cleanup+0xec/0x440 [ptlrpc]
PGD 1b078067 PUD 20d38067 PMD 0 
Pid: 6512, comm: mount.lustre Tainted: P      D W  ---------------    2.6.32-279.5.1.el6_lustre.g7f15218.x86_64 #1 Dell Inc.                 Dell DXP051                  /0FJ030
RIP: 0010:[&amp;lt;ffffffffa0a8ac5c&amp;gt;]  [&amp;lt;ffffffffa0a8ac5c&amp;gt;] ptlrpc_service_nrs_cleanup+0xec/0x440 [ptlrpc]
RSP: 0018:ffff88001fc536c8  EFLAGS: 00010217
RAX: 0000000000000000 RBX: ffff8800709834e0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa0b29640
RBP: ffff88001fc53708 R08: 0000000000000002 R09: 0000000000000000
R10: ffff8800244cc000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8800adc70cc0 R14: ffff880070983618 R15: ffff8800709834e8
FS:  00007fb3066b0700(0000) GS:ffff880002280000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000053c91000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
Process mount.lustre (pid: 6512, threadinfo ffff88001fc52000, task ffff880017014080)
Stack:
ffff880070983400 00ff880017014080 ffff88001fc53708 ffff8800adc70cc0
&amp;lt;d&amp;gt; ffff880070983400 ffff880070983448 ffff880070983618 ffff880017014080
&amp;lt;d&amp;gt; ffff88001fc537b8 ffffffffa0a52583 ffff88001fc53728 ffff8800adc70cc0
Call Trace:
[&amp;lt;ffffffffa0a52583&amp;gt;] ptlrpc_unregister_service+0x673/0xff0 [ptlrpc]
[&amp;lt;ffffffffa0a556a1&amp;gt;] ? ptlrpc_grow_req_bufs+0x231/0x2a0 [ptlrpc]
[&amp;lt;ffffffffa0a55ee2&amp;gt;] ptlrpc_register_service+0x7d2/0x17c0 [ptlrpc]
[&amp;lt;ffffffffa0cee1a5&amp;gt;] mgs_init0+0x1285/0x1760 [mgs]
[rest of the stack is the same as above]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This resolves to:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;(gdb) list *(ptlrpc_service_nrs_cleanup+0xec)
0x90c8c is in ptlrpc_service_nrs_cleanup_locked (/usr/src/lustre-head/lustre/ptlrpc/nrs.c:1030).
1025
1026    again:
1027            nrs = nrs_svcpt2nrs(svcpt, hp);
1028            nrs-&amp;gt;nrs_stopping = 1;
1029
1030            cfs_list_for_each_entry_safe(policy, tmp, &amp;amp;nrs-&amp;gt;nrs_policy_list,
1031                                         pol_list) {
1032                    rc = nrs_policy_unregister(nrs, policy-&amp;gt;pol_desc-&amp;gt;pd_name);
1033                    LASSERT(rc == 0);
1034            }
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
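
&lt;p&gt;For illustration only, here is a user-space sketch of the kind of guard that would make this walk safe. All names below are hypothetical, not the actual Lustre fix: an index ring stands in for the kernel list, and an initialized flag stands in for whatever state check the cleanup needs before touching nrs_policy_list.&lt;/p&gt;

```c
/* User-space sketch of a guarded NRS cleanup (hypothetical names, not
 * actual Lustre code).  An index ring stands in for the kernel list:
 * next[HEAD] == HEAD means the list is empty.  A zeroed, never-initialized
 * head corresponds to the nrs_policy_list whose walk oopsed at nrs.c:1030,
 * and the 'initialized' flag is the check the ticket asks for. */
enum { HEAD = 0, MAX_ENTRIES = 8 };

struct nrs_sketch {
    int next[MAX_ENTRIES]; /* ring of indices; slot HEAD is the list head */
    int initialized;       /* set only once the policy list is set up     */
    int unregistered;      /* counts simulated policy unregistrations     */
};

void nrs_sketch_init(struct nrs_sketch *s)
{
    (*s).next[HEAD] = HEAD; /* empty ring */
    (*s).initialized = 1;
    (*s).unregistered = 0;
}

/* Safe even when setup failed before init: the guard short-circuits the
 * walk that would otherwise chase a zeroed/NULL next pointer. */
int nrs_sketch_cleanup(struct nrs_sketch *s)
{
    if (!(*s).initialized)
        return 0;                     /* nothing was ever registered */
    while ((*s).next[HEAD] != HEAD) { /* the list_for_each_safe walk */
        int cur = (*s).next[HEAD];
        (*s).next[HEAD] = (*s).next[cur]; /* unlink, i.e. unregister */
        (*s).unregistered += 1;
    }
    return 0;
}
```

&lt;p&gt;With the guard in place, calling the cleanup on a struct whose setup failed early returns immediately instead of dereferencing the zeroed list head, which is exactly the failure mode in the oops above.&lt;/p&gt;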

&lt;p&gt;It looks like nrs_policy_list isn&apos;t initialized by the time this cleanup is called.  The cleanup path needs a check for whether this struct was ever set up before it tries to walk the list.&lt;/p&gt;</description>
                <environment>Lustre master v2_5_50_0-3-g6229525&lt;br/&gt;
Single node test setup, 1 MDT, 3 OST, client&lt;br/&gt;
RHEL6.3 2.6.32-279.5.1&lt;br/&gt;
</environment>
        <key id="21935">LU-4229</key>
            <summary>crash in NRS cleanup during mount failure</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="adilger">Andreas Dilger</reporter>
                        <labels>
                    </labels>
                <created>Fri, 8 Nov 2013 12:06:09 +0000</created>
                <updated>Thu, 13 Feb 2014 22:08:43 +0000</updated>
                            <resolved>Thu, 14 Nov 2013 00:08:36 +0000</resolved>
                                    <version>Lustre 2.6.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="71215" author="niu" created="Mon, 11 Nov 2013 02:11:01 +0000"  >&lt;p&gt;Looks like a duplicate of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3772&quot; title=&quot;Crash in ptlrpc_service_nrs_cleanup() when out of memory&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3772&quot;&gt;&lt;del&gt;LU-3772&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="71488" author="adilger" created="Thu, 14 Nov 2013 00:08:36 +0000"  >&lt;p&gt;Duplicate of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3772&quot; title=&quot;Crash in ptlrpc_service_nrs_cleanup() when out of memory&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3772&quot;&gt;&lt;del&gt;LU-3772&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                            <comment id="77030" author="adilger" created="Thu, 13 Feb 2014 22:08:43 +0000"  >&lt;p&gt;The failed allocation shows mode:0x40 == __GFP_IO but is missing __GFP_WAIT; see &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4357&quot; title=&quot;page allocation failure. mode:0x40 caused by missing __GFP_WAIT flag&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4357&quot;&gt;&lt;del&gt;LU-4357&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="22373">LU-4357</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="20383">LU-3772</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw8fr:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>11519</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>