<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:53:11 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12506] Client unable to mount filesystem with very large number of MDTs</title>
                <link>https://jira.whamcloud.com/browse/LU-12506</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Hello,&lt;br/&gt;
 There was a message on the lustre-discuss list about this issue back in May (&lt;a href=&quot;http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/2019-May/016475.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/2019-May/016475.html&lt;/a&gt;), and I&apos;ve managed to reproduce this error. I couldn&apos;t find an open ticket for it, however, so I wanted to create one.&lt;/p&gt;

&lt;p&gt;My environment is the following:&lt;/p&gt;

&lt;p&gt;Servers and clients are running upstream 2.12.2 and the same kernel version:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@dac-e-1 ~]# lfs --version
lfs 2.12.2
# Server kernel version
3.10.0-957.10.1.el7_lustre.x86_64
# Client kernel version (unpatched)
3.10.0-957.10.1.el7.x86_64
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;There are 24 servers, each containing 12x NVMe flash devices. For this test I am configuring the block-devices on each server identically, with 3 devices on each server partitioned into a 200G MDT and the remaining space as OST.&lt;/p&gt;

&lt;p&gt;Altogether this makes 72 MDTs, and 288 OSTs in the filesystem.&lt;/p&gt;

&lt;p&gt;Below are the syslog messages from the client and servers when attempting to mount the filesystem:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeHeader panelHeader&quot; style=&quot;border-bottom-width: 1px;&quot;&gt;&lt;b&gt;Client syslog - Nid: 10.47.21.72@o2ib1&lt;/b&gt;&lt;/div&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
-- Logs begin at Wed 2019-07-03 19:54:04 BST, end at Thu 2019-07-04 13:06:12 BST. --
Jul 04 12:59:43 cpu-e-1095 kernel: Lustre: DEBUG MARKER: Attempting client mount from 10.47.21.72@o2ib1
Jul 04 12:59:56 cpu-e-1095 kernel: LustreError: 94792:0:(mdc_request.c:2700:mdc_setup()) fs1-MDT0031-mdc-ffff9f4c85ad8000: failed to setup changelog char device: rc = -16
Jul 04 12:59:56 cpu-e-1095 kernel: LustreError: 94792:0:(obd_config.c:559:class_setup()) setup fs1-MDT0031-mdc-ffff9f4c85ad8000 failed (-16)
Jul 04 12:59:56 cpu-e-1095 kernel: LustreError: 94792:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.47.18.1@o2ib1: cfg command failed: rc = -16
Jul 04 12:59:56 cpu-e-1095 kernel: Lustre:    cmd=cf003 0:fs1-MDT0031-mdc  1:fs1-MDT0031_UUID  2:10.47.18.17@o2ib1  
Jul 04 12:59:56 cpu-e-1095 kernel: LustreError: 15c-8: MGC10.47.18.1@o2ib1: The configuration from log &apos;fs1-client&apos; failed (-16). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
Jul 04 12:59:56 cpu-e-1095 kernel: LustreError: 94774:0:(obd_config.c:610:class_cleanup()) Device 58 not setup
Jul 04 12:59:56 cpu-e-1095 kernel: Lustre: Unmounted fs1-client
Jul 04 12:59:56 cpu-e-1095 kernel: LustreError: 94774:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount  (-16)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeHeader panelHeader&quot; style=&quot;border-bottom-width: 1px;&quot;&gt;&lt;b&gt;Servers syslog&lt;/b&gt;&lt;/div&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
[root@xcat1 ~]# xdsh csd3-buff &apos;journalctl -a --since &quot;12:59&quot; _TRANSPORT=kernel&apos; | xdshbak -c
HOSTS -------------------------------------------------------------------------
dac-e-1
-------------------------------------------------------------------------------
-- Logs begin at Thu 2019-03-21 15:42:02 GMT, end at Thu 2019-07-04 13:04:24 BST. --
Jul 04 12:59:43 dac-e-1 kernel: Lustre: DEBUG MARKER: Attempting client mount from 10.47.21.72@o2ib1
Jul 04 12:59:55 dac-e-1 kernel: Lustre: MGS: Connection restored to 08925711-bdfa-621f-89ec-0364645c915c (at 10.47.21.72@o2ib1)
Jul 04 12:59:55 dac-e-1 kernel: Lustre: Skipped 2036 previous similar messages

HOSTS -------------------------------------------------------------------------
dac-e-10, dac-e-11, dac-e-12, dac-e-13, dac-e-14, dac-e-15, dac-e-16, dac-e-17, dac-e-18, dac-e-19, dac-e-2, dac-e-20, dac-e-21, dac-e-22, dac-e-23, dac-e-24, dac-e-3, dac-e-4, dac-e-5, dac-e-6, dac-e-7, dac-e-8, dac-e-9
-------------------------------------------------------------------------------
-- No entries --
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Attached are lustre debug logs from both the client and the dac-e-1 server which contains the MGT.&lt;/p&gt;

&lt;p&gt;I can provide debug logs from all 24 servers if that would help, just let me know.&lt;/p&gt;

&lt;p&gt;I&apos;ve successfully used the same configuration with 2x MDTs per server, so 48 MDTs in total, without problem, but I haven&apos;t confirmed what Scott mentioned on the mailing list about the failure starting at 56 MDTs.&lt;/p&gt;

&lt;p&gt;Thanks,&lt;br/&gt;
 Matt&lt;/p&gt;</description>
                <environment></environment>
        <key id="56268">LU-12506</key>
            <summary>Client unable to mount filesystem with very large number of MDTs</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="hongchao.zhang">Hongchao Zhang</assignee>
                                    <reporter username="mrb">Matt R&#225;s&#243;-Barnett</reporter>
                        <labels>
                    </labels>
                <created>Thu, 4 Jul 2019 12:22:01 +0000</created>
                <updated>Tue, 6 Apr 2021 12:03:31 +0000</updated>
                            <resolved>Fri, 23 Oct 2020 21:14:36 +0000</resolved>
                                    <version>Lustre 2.10.8</version>
                    <version>Lustre 2.12.3</version>
                                    <fixVersion>Lustre 2.14.0</fixVersion>
                    <fixVersion>Lustre 2.12.7</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>14</watches>
                                                                            <comments>
                            <comment id="250676" author="pjones" created="Thu, 4 Jul 2019 14:18:07 +0000"  >&lt;p&gt;Hongchao&lt;/p&gt;

&lt;p&gt;Can you please investigate?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="250886" author="hongchao.zhang" created="Tue, 9 Jul 2019 11:21:56 +0000"  >&lt;p&gt;In the Linux kernel, the number of dynamically allocated misc device minors is limited to 64:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;in drivers/char/misc.c
...
#define DYNAMIC_MINORS 64 /* like dynamic majors */
static DECLARE_BITMAP(misc_minors, DYNAMIC_MINORS);
...
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;When a Lustre client mounts the filesystem, one misc device is registered for the ChangeLog of each MDC:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;int mdc_changelog_cdev_init(struct obd_device *obd)
{
        ...
        entry-&amp;gt;ced_misc.minor = MISC_DYNAMIC_MINOR;
        entry-&amp;gt;ced_misc.name  = entry-&amp;gt;ced_name;
        entry-&amp;gt;ced_misc.fops  = &amp;amp;chlg_fops;
        ...

        /* Register new character device */
        rc = misc_register(&amp;amp;entry-&amp;gt;ced_misc);
        if (rc != 0)
                GOTO(out_unlock, rc);
        ...
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;tt&gt;misc_register()&lt;/tt&gt; returns -EBUSY (-16) once all 64 dynamic minors are in use, so the mount fails with more than 64 MDTs (or with fewer, if other modules already hold some misc minors).&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;in drivers/char/misc.c
...
#define DYNAMIC_MINORS 64 /* like dynamic majors */
static DECLARE_BITMAP(misc_minors, DYNAMIC_MINORS);
...
int misc_register(struct miscdevice * misc)
{
        ...
        if (misc-&amp;gt;minor == MISC_DYNAMIC_MINOR) {
                int i = find_first_zero_bit(misc_minors, DYNAMIC_MINORS);
                if (i &amp;gt;= DYNAMIC_MINORS) {
                        mutex_unlock(&amp;amp;misc_mtx);
                        return -EBUSY;
                }
                misc-&amp;gt;minor = DYNAMIC_MINORS - i - 1;
                set_bit(i, misc_minors);
        } else {
        ...
}
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="254835" author="adilger" created="Mon, 16 Sep 2019 23:35:50 +0000"  >&lt;p&gt;I&apos;d commented previously in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11626&quot; title=&quot;mdc: obd might go away while referenced by code in mdc_changelog&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11626&quot;&gt;&lt;del&gt;LU-11626&lt;/del&gt;&lt;/a&gt;, but that comment would be better here:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It makes more sense to multiplex a single character device across multiple MDTs, named &quot;&lt;tt&gt;/dev/lustre-changelog&lt;/tt&gt;&quot;. To track the MDT index on the open file handle (default = &lt;tt&gt;&amp;lt;onlyfs&amp;gt;-MDT0000&lt;/tt&gt;, which will work for many systems without any change) add an &lt;tt&gt;ioctl()&lt;/tt&gt; to specify the MDT name for that file handle if needed.&lt;/p&gt;

&lt;p&gt;That avoids the need to create so many character devices, avoids the need to share a single &lt;tt&gt;chlg_registered_dev&lt;/tt&gt; between multiple OBDs (one for each opener), and this interface change can be encapsulated inside the llapi code. This will also avoid the complexity in &lt;tt&gt;chlg_registered_dev_find_by_obd()&lt;/tt&gt; if we have only a single &lt;tt&gt;chlg_registered_dev&lt;/tt&gt; per OBD.&lt;/p&gt;

&lt;p&gt;There would need to be some small changes to &lt;tt&gt;liblustreapi_chlg.c&lt;/tt&gt; to open the &lt;tt&gt;lustre_changelog&lt;/tt&gt; device and call the ioctl() to change the MDT index instead of opening a different device for each MDT, with a fallback to the old behavior if the new device name doesn&apos;t exist. Probably the best is to change &lt;tt&gt;chlg_dev_path()&lt;/tt&gt; to &lt;tt&gt;chlg_dev_open()&lt;/tt&gt; and return the open file handle or an error instead of the &lt;tt&gt;pathname&lt;/tt&gt;.&lt;/p&gt;

&lt;p&gt;On the kernel side in &lt;tt&gt;mdc_changelog_cdev_init()&lt;/tt&gt;, we might consider still creating some limited number of &lt;tt&gt;/dev/changelog-$fsname-MDTnnnn&lt;/tt&gt; devices (maybe max 16?) for compatibility with userspace applications/libraries that are opening the old devices and are statically linked to &lt;tt&gt;liblustreapi.a&lt;/tt&gt; (under &lt;tt&gt;LUSTRE_VERSION_CODE&lt;/tt&gt; checks so they go away eventually). However, it shouldn&apos;t be an error if the compat devices cannot be created if there are many MDTs, since most clients will not be Changelog consumers.&lt;/p&gt;&lt;/blockquote&gt;
</comment>
                            <comment id="254902" author="gerrit" created="Tue, 17 Sep 2019 17:01:25 +0000"  >&lt;p&gt;Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/36213&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/36213&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12506&quot; title=&quot;Client unable to mount filesystem with very large number of MDTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12506&quot;&gt;&lt;del&gt;LU-12506&lt;/del&gt;&lt;/a&gt; mdc: Remove cdev_init&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: a6c1ad680f1dc5422bec4483f7c5569ed10793d6&lt;/p&gt;</comment>
                            <comment id="254903" author="pfarrell" created="Tue, 17 Sep 2019 17:03:53 +0000"  >&lt;p&gt;Matt,&lt;/p&gt;

&lt;p&gt;The above is &lt;b&gt;absolutely not&lt;/b&gt; a fix, it&apos;s just a quick hack, but as long as you&apos;re not using changelogs, that patch on the client should let you mount with &amp;gt; 64 MDTs.&lt;/p&gt;</comment>
                            <comment id="254955" author="mrb" created="Wed, 18 Sep 2019 08:44:04 +0000"  >&lt;p&gt;Thanks Patrick, that&apos;s great. I&apos;ll give this a test in a couple of weeks when I have a window to do some more benchmarking on this hardware - I was interested in just seeing how far we could scale DNE striped directories, so no changelogs on this system. I&apos;ll try this and report back then.&lt;/p&gt;

&lt;p&gt;Cheers,&lt;br/&gt;
Matt&lt;/p&gt;</comment>
                            <comment id="261931" author="adilger" created="Mon, 27 Jan 2020 23:06:33 +0000"  >&lt;p&gt;This issue was introduced with patch &lt;a href=&quot;https://review.whamcloud.com/18900&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/18900&lt;/a&gt; &quot;&lt;tt&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7659&quot; title=&quot;Replace KUC by more standard mechanisms&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7659&quot;&gt;LU-7659&lt;/a&gt; mdc: expose changelog through char devices&lt;/tt&gt;&quot; in commit &lt;tt&gt;v2_9_55_0-13-g1d40214d96&lt;/tt&gt;, so it affects both the 2.10 and 2.12 LTS releases. Please add that as a &lt;tt&gt;Fixes:&lt;/tt&gt; label in the patch commit message when fixing this issue.&lt;/p&gt;</comment>
                            <comment id="264255" author="gerrit" created="Fri, 28 Feb 2020 14:30:32 +0000"  >&lt;p&gt;Hongchao Zhang (hongchao@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/37759&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/37759&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12506&quot; title=&quot;Client unable to mount filesystem with very large number of MDTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12506&quot;&gt;&lt;del&gt;LU-12506&lt;/del&gt;&lt;/a&gt; changelog: support large number of MDT&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 4d1e03fd208504854fbbf3631547b00a32d8c62f&lt;/p&gt;</comment>
                            <comment id="264422" author="jhammond" created="Mon, 2 Mar 2020 21:54:54 +0000"  >&lt;p&gt;This could/should be solved by using dynamic devices instead of misc devices. See &lt;a href=&quot;https://review.whamcloud.com/#/c/37552/4/lustre/ofd/ofd_access_log.c@406&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/37552/4/lustre/ofd/ofd_access_log.c@406&lt;/a&gt; for an approach which should work here as well.&lt;/p&gt;</comment>
                            <comment id="264542" author="hongchao.zhang" created="Wed, 4 Mar 2020 14:12:01 +0000"  >&lt;p&gt;Hi John,&lt;br/&gt;
Thanks! Replacing the miscdevice with dynamic devices is a better solution, and I have updated the patch accordingly.&lt;/p&gt;</comment>
                            <comment id="265277" author="gerrit" created="Sat, 14 Mar 2020 03:24:29 +0000"  >&lt;p&gt;Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/37917&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/37917&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12506&quot; title=&quot;Client unable to mount filesystem with very large number of MDTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12506&quot;&gt;&lt;del&gt;LU-12506&lt;/del&gt;&lt;/a&gt; mdc: clean up code style for mdc_locks.c&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: d08b729acb70fba933da40e7699b621e2643355f&lt;/p&gt;</comment>
                            <comment id="265925" author="gerrit" created="Tue, 24 Mar 2020 05:16:18 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/37759/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/37759/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12506&quot; title=&quot;Client unable to mount filesystem with very large number of MDTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12506&quot;&gt;&lt;del&gt;LU-12506&lt;/del&gt;&lt;/a&gt; changelog: support large number of MDT&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: d0423abc1adc717b08de61be3556688cccd52ddf&lt;/p&gt;</comment>
                            <comment id="266060" author="gerrit" created="Wed, 25 Mar 2020 04:54:12 +0000"  >&lt;p&gt;Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/38058&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/38058&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12506&quot; title=&quot;Client unable to mount filesystem with very large number of MDTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12506&quot;&gt;&lt;del&gt;LU-12506&lt;/del&gt;&lt;/a&gt; tests: clean up MDT name generation&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: e5d323b7a9c1aa5969b90ef4fc3ec302a23d46e9&lt;/p&gt;</comment>
                            <comment id="268379" author="gerrit" created="Thu, 23 Apr 2020 16:49:04 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/37917/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/37917/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12506&quot; title=&quot;Client unable to mount filesystem with very large number of MDTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12506&quot;&gt;&lt;del&gt;LU-12506&lt;/del&gt;&lt;/a&gt; mdc: clean up code style for mdc_locks.c&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 0716f5a9d98a4fa299b2cfc7cfee236313e3dbcc&lt;/p&gt;</comment>
                            <comment id="274394" author="pjones" created="Fri, 3 Jul 2020 21:29:58 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/ViewProfile.jspa?name=mrb&quot; class=&quot;user-hover&quot; rel=&quot;mrb&quot;&gt;mrb&lt;/a&gt; have you ever re-tried this test with the fix in place?&lt;/p&gt;</comment>
                            <comment id="274956" author="mrb" created="Fri, 10 Jul 2020 10:38:27 +0000"  >&lt;p&gt;Hi Peter, I&apos;m afraid I haven&apos;t tested it, and I&apos;m unlikely to be able to do so for some time, as I&apos;m no longer actively working with this system.&lt;/p&gt;

&lt;p&gt;It might be something I get to look at again in Q3/Q4 this year as we will be installing more all-flash nodes to double the size of our current all-flash Lustre, so I imagine we will be doing some intensive work benchmarking it once the system integration is done. Indeed with the number of servers we&apos;ll have at that point, we will be getting close to needing it if we wanted to have an MDT on every server in the filesystem.&lt;/p&gt;

&lt;p&gt;Cheers,&lt;br/&gt;
Matt&lt;/p&gt;</comment>
                            <comment id="275161" author="pjones" created="Sat, 11 Jul 2020 15:35:30 +0000"  >&lt;p&gt;ok Matt fair enough. Let&apos;s engage again if/when you are ready to start raising the bar again &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;</comment>
                            <comment id="280704" author="alex.ku" created="Fri, 25 Sep 2020 23:16:25 +0000"  >&lt;p&gt;Has this patch been backported to b2_12?&lt;/p&gt;

&lt;p&gt;Can it be backported to the upcoming 2.12.6 release?&lt;/p&gt;

&lt;p&gt;The patch is required on clients but not on servers?&lt;/p&gt;

&lt;p&gt;I likely hit this issue when trying to mount two large Lustre filesystems with 40 MDTs each on the same client: the total MDT count is 2*40 = 80 &amp;gt; 64. I can mount each filesystem on its own, but not both at the same time.&lt;/p&gt;</comment>
                            <comment id="281352" author="spitzcor" created="Fri, 2 Oct 2020 15:12:35 +0000"  >&lt;p&gt;&amp;gt; The patch is required on clients but not on servers?&lt;br/&gt;
Yes, &lt;a href=&quot;https://review.whamcloud.com/#/c/37759/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/37759/&lt;/a&gt; only affects mdc.&lt;/p&gt;</comment>
                            <comment id="283170" author="pjones" created="Fri, 23 Oct 2020 21:14:36 +0000"  >&lt;p&gt;The fix itself has landed for 2.14. The creation of a test is being tracked under &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14058&quot; title=&quot;Create tests for large number of MDTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14058&quot;&gt;LU-14058&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="285353" author="alex.ku" created="Tue, 17 Nov 2020 17:59:56 +0000"  >&lt;p&gt;Peter,&lt;/p&gt;

&lt;p&gt;Is it possible to backport this patch to 2.12 and include it in the 2.12.6 release? This would simplify upgrades on nodes with the upstream client installed; otherwise I will have to fork.&lt;/p&gt;

&lt;p&gt;I have tested this patch on RHEL with 88 MDTs, using code built from the HPE source tree.&lt;/p&gt;</comment>
                            <comment id="291810" author="gerrit" created="Thu, 11 Feb 2021 22:26:46 +0000"  >&lt;p&gt;&lt;del&gt;Andreas Dilger (adilger@whamcloud.com) uploaded a new patch:&lt;/del&gt; &lt;a href=&quot;https://review.whamcloud.com/41485&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/41485&lt;/a&gt;&lt;br/&gt;
&lt;del&gt;Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12506&quot; title=&quot;Client unable to mount filesystem with very large number of MDTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12506&quot;&gt;&lt;del&gt;LU-12506&lt;/del&gt;&lt;/a&gt; tests: handle more MDTs in sanity.sh&lt;/del&gt;&lt;br/&gt;
&lt;del&gt;Project: fs/lustre-release&lt;/del&gt;&lt;br/&gt;
&lt;del&gt;Branch: master&lt;/del&gt;&lt;br/&gt;
&lt;del&gt;Current Patch Set: 1&lt;/del&gt;&lt;br/&gt;
&lt;del&gt;Commit: 28fa92e0552f0f9135256fa4611c68e5c6396773&lt;/del&gt;&lt;/p&gt;

&lt;p&gt;Pushed to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14058&quot; title=&quot;Create tests for large number of MDTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14058&quot;&gt;LU-14058&lt;/a&gt; instead.&lt;/p&gt;</comment>
                            <comment id="295384" author="gerrit" created="Thu, 18 Mar 2021 21:44:19 +0000"  >&lt;p&gt;Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/42087&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/42087&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12506&quot; title=&quot;Client unable to mount filesystem with very large number of MDTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12506&quot;&gt;&lt;del&gt;LU-12506&lt;/del&gt;&lt;/a&gt; changelog: support large number of MDT&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: b9380fe5ed814d91dac2d1d03ad817ffb0869766&lt;/p&gt;</comment>
                            <comment id="297898" author="gerrit" created="Tue, 6 Apr 2021 04:42:23 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/42087/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/42087/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12506&quot; title=&quot;Client unable to mount filesystem with very large number of MDTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12506&quot;&gt;&lt;del&gt;LU-12506&lt;/del&gt;&lt;/a&gt; changelog: support large number of MDT&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 0596a16841406b93ec1e348fcc9eecce62d9fe8b&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                                        </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="59411">LU-13620</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="34089">LU-7659</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="63336">LU-14523</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="59021">LU-13508</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="58254">LU-13321</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="53930">LU-11626</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="61295">LU-14058</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="33076" name="cpu-e-1095.20190704-1300.log.gz" size="526005" author="mrb" created="Thu, 4 Jul 2019 12:10:47 +0000"/>
                            <attachment id="33077" name="dac-e-1.20190704-1300.log.gz" size="24819840" author="mrb" created="Thu, 4 Jul 2019 12:10:49 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00j6v:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>