<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:49:08 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12038] conf-sanity test 59 memory leaks</title>
                <link>https://jira.whamcloud.com/browse/LU-12038</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Added support for consistent memory leak detection to my scripts, and sure enough, conf-sanity test 59 leaks memory in current master:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[ 5916.569153] Lustre: DEBUG MARKER: == conf-sanity test 59: writeconf mount option ======================================================= 15:41:23 (1551472883)
[ 5918.077007] Lustre: Setting parameter lustre-MDT0000.mdt.identity_upcall in log lustre-MDT0000
[ 5918.118690] Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space
[ 5918.119237] Lustre: Skipped 2 previous similar messages
[ 5918.132181] Lustre: lustre-MDT0000: new disk, initializing
[ 5918.132596] Lustre: Skipped 1 previous similar message
[ 5918.372703] Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400]:0:mdt
[ 5919.051560] Lustre: DEBUG MARKER: oleg15-server.localnet: executing set_default_debug -1 all 16
[ 5922.520329] Lustre: lustre-MDT0000: Connection restored to 192.168.10.168@tcp (at 0@lo)
[ 5922.521053] Lustre: Skipped 16 previous similar messages
[ 5922.540923] Lustre: lustre-OST0000: Imperative Recovery not enabled, recovery window 60-180
[ 5922.541528] Lustre: Skipped 3 previous similar messages
[ 5923.201728] Lustre: DEBUG MARKER: oleg15-server.localnet: executing set_default_debug -1 all 16
[ 5938.511155] Lustre: 8603:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1551472898/real 1551472898]  req@ffff8800bab07b40 x1626836967632352/t0(0) o251-&amp;gt;MGC192.168.10.168@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1551472904 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
[ 5938.513054] Lustre: 8603:0:(client.c:2134:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
[ 5938.544098] Lustre: server umount lustre-OST0000 complete
[ 5938.544517] Lustre: Skipped 3 previous similar messages
[ 5940.849292] Lustre: MGS: Logs for fs lustre were removed by user request.  All servers must be restarted in order to regenerate the logs: rc = 0
[ 5940.858390] Lustre: Setting parameter lustre-MDT0000.mdt.identity_upcall in log lustre-MDT0000
[ 5941.576567] Lustre: DEBUG MARKER: oleg15-server.localnet: executing set_default_debug -1 all 16
[ 5943.500363] LustreError: 13b-9: lustre-OST0000 claims to have registered, but this MGS does not know about it, preventing registration.
[ 5943.508686] LustreError: 13a-8: Failed to get MGS log lustre-OST0000 and no local copy.
[ 5943.509328] LustreError: 15c-8: MGC192.168.10.168@tcp: The configuration from log &apos;lustre-OST0000&apos; failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
[ 5943.510952] LustreError: 9571:0:(obd_mount_server.c:1386:server_start_targets()) failed to start server lustre-OST0000: -2
[ 5943.529731] LustreError: 9571:0:(obd_mount_server.c:1939:server_fill_super()) Unable to start targets: -2
[ 5943.530468] LustreError: 9571:0:(obd_mount_server.c:1589:server_put_super()) no obd lustre-OST0000
[ 5943.576890] LustreError: 9571:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-2)
[ 5944.443255] Lustre: MGS: Regenerating lustre-OST0000 log by user request: rc = 0
[ 5945.111938] Lustre: DEBUG MARKER: oleg15-server.localnet: executing set_default_debug -1 all 16
[ 5947.346104] Lustre: MGS: Regenerating lustre-OST0001 log by user request: rc = 0
[ 5948.013805] Lustre: DEBUG MARKER: oleg15-server.localnet: executing set_default_debug -1 all 16
[ 5952.788610] Lustre: lustre-OST0001-osc-MDT0000: Connection to lustre-OST0001 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
[ 5952.790296] Lustre: lustre-OST0001: Not available for connect from 0@lo (stopping)
[ 5957.780675] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_statfs to node 0@lo failed: rc = -107
[ 5957.797623] LustreError: 137-5: lustre-OST0001_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 5962.804735] Lustre: lustre-OST0000: Not available for connect from 0@lo (stopping)
[ 5962.805177] LustreError: 137-5: lustre-OST0001_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
[ 5962.807882] Lustre: Skipped 1 previous similar message
[ 5973.778356] Lustre: DEBUG MARKER: oleg15-server.localnet: executing /home/green/git/lustre-release/lustre/scripts/lustre_rmmod ldiskfs
[ 5974.249337] Key type lgssc unregistered
[ 5975.903115] LNet: Removed LNI 192.168.10.168@tcp
[ 5975.975610] LustreError: 11726:0:(class_obd.c:824:obdclass_exit()) obd_memory max: 366353522, leaked: 80
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Full report: &lt;a href=&quot;http://testing.linuxhacker.ru:3333/lustre-reports/122/testresults/conf-sanity-zfs-centos7_x86_64-centos7_x86_64/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://testing.linuxhacker.ru:3333/lustre-reports/122/testresults/conf-sanity-zfs-centos7_x86_64-centos7_x86_64/&lt;/a&gt;&lt;/p&gt;</description>
                <environment></environment>
        <key id="55042">LU-12038</key>
            <summary>conf-sanity test 59 memory leaks</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="5">Cannot Reproduce</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="green">Oleg Drokin</reporter>
                        <labels>
                    </labels>
                <created>Sat, 2 Mar 2019 03:05:29 +0000</created>
                <updated>Fri, 18 Jun 2021 20:16:41 +0000</updated>
                            <resolved>Fri, 18 Jun 2021 20:16:41 +0000</resolved>
                                    <version>Lustre 2.13.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="246689" author="green" created="Sat, 4 May 2019 05:14:21 +0000"  >&lt;p&gt;ok, the most easy way to see it is in zfs with conf-sanity test 57a it looks like.&lt;/p&gt;

&lt;p&gt;kmemleak checker seems to point at this path:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;unreferenced object 0xffff8800299f7500 (size 64):
  comm &quot;mount.lustre&quot;, pid 9866, jiffies 4295066912 (age 532.277s)
  hex dump (first 32 bytes):
    a0 18 03 4e 00 88 ff ff a0 18 03 4e 00 88 ff ff  ...N.......N....
    03 00 00 00 02 00 00 00 03 00 00 00 00 00 00 00  ................
  backtrace:
    [&amp;lt;ffffffff81211684&amp;gt;] kmem_cache_alloc_trace+0x134/0x620
    [&amp;lt;ffffffffa09fbf9b&amp;gt;] lustre_index_register+0x12b/0x5b0 [obdclass]
    [&amp;lt;ffffffffa0d0b157&amp;gt;] osd_ios_scan_one+0x1027/0x10d0 [osd_zfs]
    [&amp;lt;ffffffffa0d0bfdc&amp;gt;] osd_initial_OI_scrub+0x65c/0xce0 [osd_zfs]
    [&amp;lt;ffffffffa0d0ef78&amp;gt;] osd_scrub_setup+0x9b8/0xb50 [osd_zfs]
    [&amp;lt;ffffffffa0ce750a&amp;gt;] osd_mount+0xe7a/0x1310 [osd_zfs]
    [&amp;lt;ffffffffa0ce7efc&amp;gt;] osd_device_alloc+0x25c/0x3a0 [osd_zfs]
    [&amp;lt;ffffffffa09beb93&amp;gt;] obd_setup+0x113/0x280 [obdclass]
    [&amp;lt;ffffffffa09bf826&amp;gt;] class_setup+0x2a6/0x840 [obdclass]
    [&amp;lt;ffffffffa09c307a&amp;gt;] class_process_config+0x190a/0x2810 [obdclass]
    [&amp;lt;ffffffffa09c6eb8&amp;gt;] do_lcfg+0x258/0x4b0 [obdclass]
    [&amp;lt;ffffffffa09caf48&amp;gt;] lustre_start_simple+0x88/0x210 [obdclass]
    [&amp;lt;ffffffffa09f7e85&amp;gt;] server_fill_super+0xf55/0x1890 [obdclass]
    [&amp;lt;ffffffffa09ce898&amp;gt;] lustre_fill_super+0x3d8/0x8c0 [obdclass]
    [&amp;lt;ffffffff8123a47d&amp;gt;] mount_nodev+0x4d/0xb0
    [&amp;lt;ffffffffa09c6be8&amp;gt;] lustre_mount+0x38/0x60 [obdclass]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The libu structure allocated in that path is never freed. The size reported as leaked is 40 bytes, which is exactly the size of that struct, so I think it all matches.&lt;/p&gt;

&lt;p&gt;The freeing is supposed to happen via a call to lustre_index_restore(), which does not appear to happen.&lt;/p&gt;

&lt;p&gt;It looks like the problem is in osd_index_restore(). Whereas in ldiskfs we always call lustre_index_restore() unconditionally, in zfs the call is conditional on osd_idc_find_and_init_with_oid(); but no matter what that call returns, the item is removed from the list, and is therefore leaked if lustre_index_restore() was not called?&lt;/p&gt;
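
&lt;p&gt;A minimal standalone sketch of that suspected pattern (hypothetical names and a plain userspace list, not the actual osd_zfs code): the unit is unlinked from the list unconditionally, but freed only on the path where the lookup succeeds.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;/* Hypothetical model of the suspected leak, not the actual Lustre code. */
#include &amp;lt;stdlib.h&amp;gt;

struct index_backup_unit {
	struct index_backup_unit *next;
	int oid;			/* id of the backed-up index */
};

/* Unlink the first unit, then free it only when the lookup succeeds. */
static void restore_one(struct index_backup_unit **list, int lookup_ok)
{
	struct index_backup_unit *libu = *list;

	if (libu == NULL)
		return;

	*list = libu-&amp;gt;next;		/* removed from the list in every case */
	if (!lookup_ok)
		return;			/* restore skipped: libu is now unreachable and leaked */

	/* ... restore the index object from the backup ... */
	free(libu);			/* freed only on the successful path */
}

int main(void)
{
	struct index_backup_unit *list = calloc(1, sizeof(*list));

	if (list == NULL)
		return 1;
	list-&amp;gt;oid = 1;
	restore_one(&amp;amp;list, 0);	/* stand-in for the lookup failing */
	return 0;			/* the unit allocated above is never freed */
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;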

&lt;p&gt;this code was introduced with &lt;a href=&quot;https://review.whamcloud.com/30910&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/30910&lt;/a&gt; for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-10193&quot; title=&quot;Index object on-disk layout compatibility&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-10193&quot;&gt;&lt;del&gt;LU-10193&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="246717" author="fan.yong" created="Sun, 5 May 2019 03:24:15 +0000"  >&lt;p&gt;If the lustre_index_restore() does not handle/free the @libu, then it will be kept in the list osd_device::od_index_backup_list until the device umount via osd_index_backup(). So there should be not memleak unless some @libu is registered during or after the umount by race. But as I remember, the lustre_index_register() logic will avoid inserting new @libu if the device in-umount or has been umounted. So if such bad case happened, it is bug and should be fixed.&lt;/p&gt;</comment>
                            <comment id="304909" author="adilger" created="Fri, 18 Jun 2021 20:16:41 +0000"  >&lt;p&gt;Have not seen this since initial reports.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="49083">LU-10193</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00clz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>