<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:21:08 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-8856] ZFS-MDT 100% full. Cannot delete files.</title>
                <link>https://jira.whamcloud.com/browse/LU-8856</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;End Customer: MSU (Michigan State Univ)&lt;/p&gt;

&lt;p&gt;A user generated tons of small files and exhausted the available inodes of the MDT (single MDT, no DNE). Any attempts at deleting files as root fail. &lt;/p&gt;

&lt;p&gt;I looked at &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8787&quot; title=&quot;zpool containing MDT0000 out of space&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8787&quot;&gt;&lt;del&gt;LU-8787&lt;/del&gt;&lt;/a&gt; and &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8714&quot; title=&quot;too many update logs during soak-test.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8714&quot;&gt;LU-8714&lt;/a&gt;, but neither seems to match this case closely enough.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;zdb -d ls15-mds-00.mdt/mdt
Dataset ls15-mds-00.mdt/mdt [ZPL], ID 66, cr_txg 20442, 2.82T, 280362968 objects

ls15-mds-00.mdt/mdt     2.82T      0  2.82T  /ls15-mds-00.mdt/mdt
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@lac-373 roth]# lfs df -i
UUID                      Inodes       IUsed       IFree IUse% Mounted on
ls15-MDT0000_UUID      280362968   280362968           0 100% /mnt/ls15[MDT:0]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But we can&apos;t remove any files:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@lac-000 1mk5_5998]# rm tor.mat
rm: cannot remove `tor.mat&apos;: No space left on device
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I&apos;m going to take a stab at deregistering the changelog, which might free up enough space for the MDT to process some file deletions. If anyone has any other &apos;best practices&apos;, please advise.&lt;/p&gt;
</description>
                <environment>CentOS 6.8 2.6.32_504.30.3.el6.x86_64, Lustre 2.8.0 (g0bcd520), ZFS 0.6.5.4-1</environment>
        <key id="41702">LU-8856</key>
            <summary>ZFS-MDT 100% full. Cannot delete files.</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bzzz">Alex Zhuravlev</assignee>
                                    <reporter username="aeonjeffj">Jeff Johnson</reporter>
                        <labels>
                            <label>llnl</label>
                    </labels>
                <created>Mon, 21 Nov 2016 18:43:32 +0000</created>
                <updated>Tue, 6 Feb 2024 06:44:07 +0000</updated>
                            <resolved>Thu, 15 Mar 2018 14:04:43 +0000</resolved>
                                    <version>Lustre 2.8.0</version>
                                    <fixVersion>Lustre 2.11.0</fixVersion>
                    <fixVersion>Lustre 2.10.4</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                            <comments>
                            <comment id="174513" author="adilger" created="Mon, 21 Nov 2016 19:32:11 +0000"  >&lt;p&gt;If you have ChangeLogs active without an active consumer, then this will definitely consume a lot of space that does not get freed until the ChangeLog is processed or removed.  Also, having an active ChangeLog means that some space is needed for the CL record at unlink time.&lt;/p&gt;

&lt;p&gt;Do you have any snapshots of this filesystem?  If yes, then deleting the oldest snapshot should also free up some space.&lt;/p&gt;

&lt;p&gt;It may be that mounting and unmounting the dataset (up to 4 times) will allow old committed transactions to free up space.&lt;/p&gt;

&lt;p&gt;If none of these options work, it may be possible to mount the filesystem locally as type &lt;tt&gt;zfs&lt;/tt&gt; and delete some specific files; however, we should discuss that before taking any such action.&lt;/p&gt;

&lt;p&gt;Finally, one option would be to add extra storage to the MDT zpool.  However, note that it will &lt;b&gt;not&lt;/b&gt; be possible to remove those devices after they are added, so if this is done they should be configured correctly as mirrored VDEV(s) to maintain reliability.&lt;/p&gt;
</comment>
                            <comment id="174517" author="aeonjeffj" created="Mon, 21 Nov 2016 19:52:53 +0000"  >&lt;p&gt;Draining changelogs or deregistering the changelog isn&apos;t working. For some reason the changelog doesn&apos;t have a user. The user &lt;b&gt;was&lt;/b&gt; cl1 and the logs of the robinhood server show it was processing using cl1.&lt;/p&gt;

&lt;p&gt;On MDS it appears that there are no unprocessed changelog entries but robinhood was running up until a few months ago so there should be unprocessed changes stored:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# cat /proc/fs/lustre/mdd/ls15-MDT0000/changelog_users
current index: 164447373
ID    index
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;From a client, as root: (produces lots of output)&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# lfs changelog ls15-MDT0000|head
160211907 12LYOUT 01:50:56.131136452 2015.11.01 0x0 t=[0x200002cf8:0x12e9f:0x0]
160211908 12LYOUT 01:50:56.131136452 2015.11.01 0x0 t=[0x200002d22:0x167e9:0x0]
160211909 13TRUNC 01:50:56.132136455 2015.11.01 0xe t=[0x200002ca8:0x8a69:0x0]
160211910 13TRUNC 01:50:56.132136455 2015.11.01 0xe t=[0x200002cb4:0x17a3d:0x0]
160211911 11CLOSE 01:50:56.132136455 2015.11.01 0x42 t=[0x200002c37:0x8577:0x0]
160211912 11CLOSE 01:50:56.133136458 2015.11.01 0x42 t=[0x200002ca8:0x8a69:0x0]
160211913 11CLOSE 01:50:56.133136458 2015.11.01 0x42 t=[0x200002cb4:0x17a3d:0x0]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Trying to clear as root from a client:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# lfs changelog_clear ls15-MDT0000 cl1 0
changelog_clear error: No such file or directory
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Trying to deregister from the MDS:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@ls15-mds-00.i ~]# lctl --device ls15-MDT0000 changelog_deregister cl1
error: changelog_deregister: No such file or directory

[root@ls15-mds-00.i ~]# lctl --device ls15-MDT0000 changelog_deregister cl0
error: changelog_deregister: expected id of the form cl&amp;lt;num&amp;gt; got &apos;cl0&apos;
deregister an existing changelog user
usage:	device &amp;lt;mdtname&amp;gt;
	changelog_deregister &amp;lt;id&amp;gt;
run &amp;lt;command&amp;gt; after connecting to device &amp;lt;devno&amp;gt;
--device &amp;lt;devno&amp;gt; &amp;lt;command [args ...]&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Logs from robinhood server showing consumption of changelogs using reader_id &apos;cl1&apos;:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;======== General statistics =========
Daemon start time: 2016/07/28 18:48:59
Started modules: log_reader
ChangeLog reader #0:
   fs_name    =   ls15
   mdt_name   =   MDT0000
   reader_id  =   cl1
   records read        = 4235467
   interesting records = 2823646
   suppressed records  = 1411821
   records pending     = 0
   last received            = 2016/07/28 19:29:26
   last read record time    = 2015/10/31 22:28:52.489794
   last read record id      = 164447373
   last pushed record id    = 164447370
   last committed record id = 164447370
   last cleared record id   = 164447370
   read speed               = 0.00 record/sec (0.00 incl. idle time)
   processing speed ratio   = 0.00
   ChangeLog stats:
   MARK: 0, CREAT: 0, MKDIR: 0, HLINK: 0, SLINK: 0, MKNOD: 0, UNLNK: 0, RMDIR: 0, RENME: 0
   RNMTO: 0, OPEN: 0, CLOSE: 1411823, LYOUT: 1411822, TRUNC: 1411822, SATTR: 0, XATTR: 0
   HSM: 0, MTIME: 0, CTIME: 0, ATIME: 0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="174519" author="aeonjeffj" created="Mon, 21 Nov 2016 20:01:46 +0000"  >&lt;p&gt;There are no snapshots in the MDT pool.&lt;/p&gt;

&lt;p&gt;I was hoping to figure out how to see the changelog file or directory using zdb but I can&apos;t seem to find which object ID it might be. With over 2T full there are lots of entries to try and poke at. Is there any sort of default object ID for the changelog file(s) or directory?&lt;/p&gt;

&lt;p&gt;By &apos;dataset&apos; are you referring to unmounting and remounting the LFS server-side targets? Basically take down and remount the server-side of the LFS 3-4 times?&lt;/p&gt;</comment>
                            <comment id="174526" author="aeonjeffj" created="Mon, 21 Nov 2016 20:36:15 +0000"  >&lt;p&gt;We threw hardware at it. Expanded the MDT pool by adding a mirrored vdev and the extra 320GB gave room to move around and delete files. &lt;/p&gt;

&lt;p&gt;I&apos;d still like to walk down this ticket so some best practices could be offered in the event a future occurrence doesn&apos;t have extra hardware at hand.&lt;/p&gt;
</comment>
                            <comment id="174545" author="adilger" created="Mon, 21 Nov 2016 21:59:11 +0000"  >&lt;p&gt;While we do try to reserve space in the MDT and OST zpools (&lt;tt&gt;OSD_STATFS_RESERVED_SIZE&lt;/tt&gt;), but I suspect we are not taking this into account when allocating files on the MDT, only on the OST.&lt;/p&gt;

&lt;p&gt;Separately, we need to look into how ChangeLogs are handled when the MDT is &quot;full&quot;.  The &quot;unused ChangeLog is filling MDT&quot; problem seems to be happening a lot. I think we need to handle this in an automatic manner: track how much space the ChangeLog consumes, and if the MDT is too full, the oldest ChangeLog user that hasn&apos;t been used in some time (a week?) should be unregistered (with a clear LCONSOLE() error message printed) and records purged up to the next CL user.  CL deregistration should be repeated in LRU order as needed until enough free space is available or no more unused CL users exist.  It shouldn&apos;t automatically deregister active CL users (e.g. used less than one day ago) since that could be used as a DOS to deactivate filesystem monitoring tools.&lt;/p&gt;

&lt;p&gt;A /proc tunable should be available to &lt;em&gt;disable&lt;/em&gt; automatic CL user deregistration, and when this is set users would get ENOSPC instead of success when trying to modify the MDT.  This should not be the default behaviour, however, and only used if it is more important to track every filesystem operation than it is to be able to use the filesystem.&lt;/p&gt;</comment>
                            <comment id="174604" author="pjones" created="Tue, 22 Nov 2016 05:27:56 +0000"  >&lt;p&gt;Lai&lt;/p&gt;

&lt;p&gt;Could you please assist with this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="180326" author="adilger" created="Tue, 10 Jan 2017 20:33:49 +0000"  >&lt;p&gt;Two things need to be done here to handle this problem automatically, since this problem of ChangeLogs filling the MDT has happened several times:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;if the MDT is too full and the ChangeLog consumes too much space (see also patch&#160;&lt;a href=&quot;https://review.whamcloud.com/16416&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/16416&lt;/a&gt; &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7156&quot; title=&quot;Provide size of changelogs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7156&quot;&gt;&lt;del&gt;LU-7156&lt;/del&gt;&lt;/a&gt; mdd: add changelog_size to procfs&quot;), and some ChangeLog user hasn&apos;t been used in some time (over a week?) it should be unregistered (with a clear&#160;&lt;tt&gt;LCONSOLE()&lt;/tt&gt; error message printed) and records purged up to the next CL user (this should happen automatically), repeat for the next CL user as needed. Possibly a&#160;&lt;tt&gt;/sys/fs/lustre&lt;/tt&gt; tunable to disable automatic CL user deregistration should be available, and further operations on the filesystem would get ENOSPC instead of success (as it does today), if it really is critical to track every operation, but that should not be the default behavior. The deregistration should not be done for recently active ChangeLog users (&amp;lt; 24h), since this would potentially allow users to disable the ChangeLogs just by filling the MDT, and there is little benefit to removing the CL user if it does not free up much space.&lt;/li&gt;
	&lt;li&gt;reserve more space for ZFS filesystems, and when the threshold is hit only allow files to be deleted. This needs to be done in conjunction with automatic removal of old ChangeLog, otherwise deleting files will free some space (assuming whole metadnode blocks are freed) but it will also consume space in the ChangeLog.&lt;/li&gt;
&lt;/ol&gt;
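&lt;p&gt;As a minimal sketch, the deregistration policy in step 1 could look roughly like this (illustrative Python only; the function names, thresholds, and the per-user space estimate are hypothetical, not Lustre code):&lt;/p&gt;

```python
# Hypothetical sketch of LRU-order ChangeLog user deregistration when the
# MDT is too full.  All names and thresholds here are invented for
# illustration; this is not Lustre code.

WEEK = 7 * 86400  # "hasn't been used in over a week" threshold (seconds)

def users_to_deregister(users, now, space_needed, space_per_user,
                        auto_dereg=True):
    """users: {user_id: last_used_timestamp}.  Returns user ids to
    deregister, oldest first, stopping once the estimated freed space
    covers space_needed.  Recently active users are never selected
    (the one-week idle test also excludes anyone active within 24h)."""
    if not auto_dereg:          # tunable off: callers get ENOSPC instead
        return []
    victims, freed = [], 0
    for uid, last_used in sorted(users.items(), key=lambda kv: kv[1]):
        if freed >= space_needed:
            break
        if now - last_used < WEEK:   # not idle long enough to touch
            continue
        victims.append(uid)
        freed += space_per_user.get(uid, 0)
    return victims
```

&lt;p&gt;With such a policy, a stale robinhood reader like the &lt;tt&gt;cl1&lt;/tt&gt; user in this ticket would be picked first, while an active consumer is left alone.&lt;/p&gt;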
</comment>
                            <comment id="180353" author="adilger" created="Tue, 10 Jan 2017 22:07:14 +0000"  >&lt;p&gt;One option for a very simple short-term solution for the ZFS space reservation is to have the MDS or OSD startup check the size of and/or write a 10MB &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/help_16.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt; file&#160;in the MDS root with a name like &lt;tt&gt;IN_CASE_OF_ENOSPC_TRUNCATE_THIS_FILE&lt;/tt&gt;.&#160; This would be large enough to ensure that truncating the file manually from a local mount if there is an emergency situation like this in the future releases enough space to start deleting file again.&#160; It isn&apos;t high-tech, but is definitely a robust way to reserve space for such an emergency, and in most cases that space won&apos;t be missed.&#160; The size of the file could be scaled down for small test filesystems below, say, 10GB or skipped completely.&#160; Some care would be needed to avoid refilling the file immediately after mount if the MDS is just being mounted after truncating the &lt;tt&gt;ICE&lt;/tt&gt; file and files are being deleted.&#160;&lt;/p&gt;

&lt;p&gt;However, it shouldn&apos;t delay too long in repopulating the file to avoid the situation where there is some runaway user job that continues to fill the filesystem and it gets back into the same situation again immediately.&#160; The benefit of this low-tech approach (vs. an in-memory reservation of space, and selectively blocking all but file/directory removal operations) is that this could be implemented quickly and potentially backported to existing releases with little risk.&lt;/p&gt;</comment>
                            <comment id="193786" author="gerrit" created="Thu, 27 Apr 2017 16:03:20 +0000"  >&lt;p&gt;Alex Zhuravlev (alexey.zhuravlev@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/26868&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/26868&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8856&quot; title=&quot;ZFS-MDT 100% full. Cannot delete files.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8856&quot;&gt;&lt;del&gt;LU-8856&lt;/del&gt;&lt;/a&gt; osd: reserve space in zfs pool for emergency&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 5920a86e0a1fad6669997483a28fcea95d9a2fce&lt;/p&gt;</comment>
                            <comment id="194256" author="bzzz" created="Wed, 3 May 2017 12:28:59 +0000"  >&lt;p&gt;Andreas, probably there is another solution for the problem.&lt;/p&gt;

&lt;p&gt;Basically ZFS reserves some space internally:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;/*
 * Normally, we don&apos;t allow the last 3.2% (1/(2^spa_slop_shift)) of space in
 * the pool to be consumed.  This ensures that we don&apos;t run the pool
 * completely out of space, due to unaccounted changes (e.g. to the MOS).
 * It also limits the worst-case time to allocate space.  If we have
 * less than this amount of free space, most ZPL operations (e.g. write,
 * create) will return ENOSPC.
 *
 * Certain operations (e.g. file removal, most administrative actions) can
 * use half the slop space.  They will only return ENOSPC if less than half
 * the slop space is free.  Typically, once the pool has less than the slop
 * space free, the user will use these operations to free up space in the pool.
 * These are the operations that call dsl_pool_adjustedsize() with the netfree
 * argument set to TRUE.
 */&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
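&lt;p&gt;As a minimal sketch, the arithmetic in that comment works out as follows (illustrative Python with made-up function names; the default spa_slop_shift is 5, i.e. 1/32, about 3.2% of the pool):&lt;/p&gt;

```python
# Illustrative arithmetic for the ZFS slop-space reservation described
# above.  Function names are invented here; this is not actual ZFS code.

def slop_space(pool_size, spa_slop_shift=5):
    """Space ZFS holds back: 1/(2^spa_slop_shift) of the pool (~3.2% by default)."""
    return pool_size >> spa_slop_shift

def gets_enospc(pool_size, free_space, netfree=False, spa_slop_shift=5):
    """Whether an operation would see ENOSPC.  'netfree' operations
    (file removal, most administrative actions) may dip into half of
    the slop space before hitting the same error."""
    slop = slop_space(pool_size, spa_slop_shift)
    limit = slop // 2 if netfree else slop
    return free_space < limit
```

&lt;p&gt;So once free space drops just below the slop line, a normal create already fails while a netfree removal still proceeds - which is why marking destroy transactions netfree helps in this ticket.&lt;/p&gt;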


&lt;p&gt;we can mark any transaction &quot;net free&quot; using dmu_tx_mark_netfree()&lt;/p&gt;

&lt;p&gt;so the very first thing would be to mark transactions involving object destroy.&lt;br/&gt;
then we could add a procfs tunable so that the sysadmin can turn that on for specific transactions (e.g. those originated by root).&lt;/p&gt;</comment>
                            <comment id="194260" author="gerrit" created="Wed, 3 May 2017 12:47:27 +0000"  >&lt;p&gt;Alex Zhuravlev (alexey.zhuravlev@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/26930&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/26930&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8856&quot; title=&quot;ZFS-MDT 100% full. Cannot delete files.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8856&quot;&gt;&lt;del&gt;LU-8856&lt;/del&gt;&lt;/a&gt; osd: mark specific transactions netfree&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 63b56104f21e8e1abfe962bd9ab6b749b67fed3a&lt;/p&gt;</comment>
                            <comment id="194418" author="bzzz" created="Thu, 4 May 2017 09:00:19 +0000"  >&lt;p&gt;the approach seems to work (in simple cases at least). here is the test:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;test_803() {
        mkdir $DIR/$tdir
        createmany -m $DIR/$tdir/f 10000000000 &amp;amp;&amp;amp; error &quot;too big device?&quot;
        rm -rf $DIR/$tdir || error &quot;rm should succeed after ENOSPC&quot;
}&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;== sanity test 803: OOS == 15:52:41 (1493902361)&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;create 10000 (time 1493902365.56 total 4.41 last 2269.00)&lt;/li&gt;
	&lt;li&gt;create 20000 (time 1493902371.51 total 10.35 last 1681.47)&lt;/li&gt;
	&lt;li&gt;create 30000 (time 1493902377.66 total 16.50 last 1625.92)&lt;/li&gt;
	&lt;li&gt;create 40000 (time 1493902384.95 total 23.80 last 1370.67)&lt;/li&gt;
	&lt;li&gt;create 44433 (time 1493902394.95 total 33.80 last 443.18)&lt;/li&gt;
	&lt;li&gt;create 46390 (time 1493902404.97 total 43.81 last 195.48)&lt;/li&gt;
	&lt;li&gt;create 47995 (time 1493902414.97 total 53.82 last 160.40)&lt;/li&gt;
	&lt;li&gt;create 49455 (time 1493902424.98 total 63.83 last 145.92)&lt;/li&gt;
	&lt;li&gt;create 50000 (time 1493902428.78 total 67.62 last 143.49)&lt;/li&gt;
	&lt;li&gt;create 51395 (time 1493902438.78 total 77.63 last 139.39)&lt;/li&gt;
	&lt;li&gt;create 52925 (time 1493902448.79 total 87.64 last 152.94)&lt;/li&gt;
	&lt;li&gt;create 54468 (time 1493902458.79 total 97.64 last 154.29)&lt;/li&gt;
	&lt;li&gt;create 56076 (time 1493902468.80 total 107.64 last 160.70)&lt;/li&gt;
	&lt;li&gt;create 57716 (time 1493902478.80 total 117.65 last 163.87)&lt;/li&gt;
	&lt;li&gt;create 59290 (time 1493902488.81 total 127.66 last 157.27)&lt;/li&gt;
	&lt;li&gt;create 60000 (time 1493902493.28 total 132.12 last 159.07)&lt;/li&gt;
	&lt;li&gt;create 61487 (time 1493902503.28 total 142.13 last 148.58)&lt;br/&gt;
mknod(/mnt/lustre/d803.sanity/f62098) error: No space left on device&lt;br/&gt;
total: 62098 create in 146.27 seconds: 424.54 ops/second&lt;br/&gt;
Resetting fail_loc on all nodes...done.&lt;br/&gt;
15:56:09 (1493902569) waiting for dual2 network 5 secs ...&lt;br/&gt;
15:56:09 (1493902569) network interface is UP&lt;br/&gt;
PASS 803 (208s)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;w/o the patch, rm fails.&lt;/p&gt;</comment>
                            <comment id="194448" author="bzzz" created="Thu, 4 May 2017 14:29:29 +0000"  >&lt;p&gt;unfortunately this capability was added in 0.7; it&apos;s not easily available on 0.6, though the majority of the required functionality is in place.&lt;/p&gt;</comment>
                            <comment id="194474" author="adilger" created="Thu, 4 May 2017 17:17:26 +0000"  >&lt;p&gt;I think the two approaches are complementary. We can use the reserved space file for now, and use the &quot;netfree&quot; functionality when it is available.&lt;/p&gt;

&lt;p&gt;The main question about &quot;netfree&quot; is whether this is actually true when we delete an inode on the MDT with ChangeLogs enabled? Even if the dnode is deleted, it may not actually release space (due to shared dnode blocks) and the added ChangeLog record will consume space. &lt;/p&gt;

&lt;p&gt;As a result, even if the netfree functionality is available I think it makes sense to keep the emergency space reservation file around. If we never need to delete it then that is fine too, the amount of space consumed is minimal. &lt;/p&gt;</comment>
                            <comment id="194475" author="bzzz" created="Thu, 4 May 2017 17:34:12 +0000"  >&lt;p&gt;I think this is true for &quot;reserved with writes&quot; as well - changelog/destroy logs can be quite big, so once that reserve is released we&apos;ll keep consuming space?&lt;br/&gt;
correct me if I&apos;m wrong, but I don&apos;t really see a big difference.&lt;/p&gt;</comment>
                            <comment id="195962" author="adilger" created="Tue, 16 May 2017 07:33:09 +0000"  >&lt;p&gt;I think in the &quot;reserved with writes&quot; case, since the admin needs to get involved they can hopefully fix the source of the problem that is consuming all the free space (e.g. stale ChangeLog consumer registered) when they delete the emergency file.&lt;/p&gt;</comment>
                            <comment id="195971" author="bzzz" created="Tue, 16 May 2017 12:37:32 +0000"  >&lt;p&gt;well, I guess we can mark any transaction originated from root with the netfree flag when a special tunable is set?&lt;br/&gt;
if no space can be released, the admin comes in, sets that variable, and does whatever may help with root&apos;s rights.&lt;/p&gt;</comment>
                            <comment id="218027" author="ofaaland" created="Thu, 11 Jan 2018 19:25:32 +0000"  >&lt;p&gt;We&apos;ve encountered this at LLNL, too.&lt;/p&gt;

&lt;p&gt;For the benefit of other sites that end up looking at this ticket and have Lustre versions without Alex&apos;s patches, I&apos;m working up a procedure which I&apos;ll put on wiki.lustre.org at &lt;a href=&quot;http://wiki.lustre.org/ZFS_MDT_ENOSPC_Recovery&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://wiki.lustre.org/ZFS_MDT_ENOSPC_Recovery&lt;/a&gt;.  It will work on any ZFS &amp;gt;= 0.6.5 using spa_slop_shift mentioned by Alex, above.&lt;/p&gt;</comment>
                            <comment id="221737" author="gerrit" created="Tue, 27 Feb 2018 03:41:50 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/26930/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/26930/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8856&quot; title=&quot;ZFS-MDT 100% full. Cannot delete files.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8856&quot;&gt;&lt;del&gt;LU-8856&lt;/del&gt;&lt;/a&gt; osd: mark specific transactions netfree&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 8d1639b5cf1edbc885876956dcd6189173c00955&lt;/p&gt;</comment>
                            <comment id="221768" author="pjones" created="Tue, 27 Feb 2018 04:24:52 +0000"  >&lt;p&gt;Landed for 2.11&lt;/p&gt;</comment>
                            <comment id="221837" author="gerrit" created="Tue, 27 Feb 2018 18:27:44 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/31442&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/31442&lt;/a&gt;&lt;br/&gt;
Subject: Revert &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8856&quot; title=&quot;ZFS-MDT 100% full. Cannot delete files.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8856&quot;&gt;&lt;del&gt;LU-8856&lt;/del&gt;&lt;/a&gt; osd: mark specific transactions netfree&quot;&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 83be173e2848e3b81d6fb2123d70d0cf614105a8&lt;/p&gt;</comment>
                            <comment id="221838" author="pjones" created="Tue, 27 Feb 2018 18:27:51 +0000"  >&lt;p&gt;Reopening due to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-10732&quot; title=&quot;sanity-lfsck test_9a: FAIL: (7) Failed to get expected &amp;#39;completed&amp;#39;&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-10732&quot;&gt;&lt;del&gt;LU-10732&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="221839" author="gerrit" created="Tue, 27 Feb 2018 18:27:54 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/31442/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/31442/&lt;/a&gt;&lt;br/&gt;
Subject: Revert &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8856&quot; title=&quot;ZFS-MDT 100% full. Cannot delete files.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8856&quot;&gt;&lt;del&gt;LU-8856&lt;/del&gt;&lt;/a&gt; osd: mark specific transactions netfree&quot;&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: c2caa40bd38e7645dc4ac90552e12e3fb7fde476&lt;/p&gt;</comment>
                            <comment id="221846" author="gerrit" created="Tue, 27 Feb 2018 19:17:14 +0000"  >&lt;p&gt;Minh Diep (minh.diep@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/31443&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/31443&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8856&quot; title=&quot;ZFS-MDT 100% full. Cannot delete files.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8856&quot;&gt;&lt;del&gt;LU-8856&lt;/del&gt;&lt;/a&gt; osd: mark specific transactions netfree&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_10&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: d47e3892f83d2d1bdb21653381a8d1ec0db68a4a&lt;/p&gt;</comment>
                            <comment id="221849" author="gerrit" created="Tue, 27 Feb 2018 19:39:24 +0000"  >&lt;p&gt;Alex Zhuravlev (alexey.zhuravlev@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/31444&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/31444&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8856&quot; title=&quot;ZFS-MDT 100% full. Cannot delete files.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8856&quot;&gt;&lt;del&gt;LU-8856&lt;/del&gt;&lt;/a&gt; osd: mark specific transactions netfree&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 8bc89e5b3aeda1bf15f2ff6fc53651e470c0a6c6&lt;/p&gt;</comment>
                            <comment id="223706" author="gerrit" created="Thu, 15 Mar 2018 13:54:18 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/31444/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/31444/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8856&quot; title=&quot;ZFS-MDT 100% full. Cannot delete files.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8856&quot;&gt;&lt;del&gt;LU-8856&lt;/del&gt;&lt;/a&gt; osd: mark specific transactions netfree&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 106abc184d8b57de560dc1874683ce5487dcf30a&lt;/p&gt;</comment>
                            <comment id="223725" author="pjones" created="Thu, 15 Mar 2018 14:04:43 +0000"  >&lt;p&gt;Landed for 2.11&lt;/p&gt;</comment>
                            <comment id="224404" author="gerrit" created="Fri, 23 Mar 2018 15:02:05 +0000"  >&lt;p&gt;Minh Diep (minh.diep@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/31751&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/31751&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8856&quot; title=&quot;ZFS-MDT 100% full. Cannot delete files.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8856&quot;&gt;&lt;del&gt;LU-8856&lt;/del&gt;&lt;/a&gt; osd: mark specific transactions netfree&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_10&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: aaafe65b47925dd83b2138d55843ac61af2967a8&lt;/p&gt;</comment>
                            <comment id="227216" author="gerrit" created="Thu, 3 May 2018 18:17:22 +0000"  >&lt;p&gt;John L. Hammond (john.hammond@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/31751/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/31751/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-8856&quot; title=&quot;ZFS-MDT 100% full. Cannot delete files.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-8856&quot;&gt;&lt;del&gt;LU-8856&lt;/del&gt;&lt;/a&gt; osd: mark specific transactions netfree&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_10&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: a8c7a32fd7fc54e9717e23f208b40c8ff93b81e4&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                    <issuelinktype id="10010">
                        <name>Duplicate</name>
                        <inwardlinks description="is duplicated by">
                        </inwardlinks>
                    </issuelinktype>
                    <issuelinktype id="10011">
                        <name>Related</name>
                        <outwardlinks description="is related to ">
                            <issuelink>
                                <issuekey id="32827">LU-7340</issuekey>
                            </issuelink>
                        </outwardlinks>
                        <inwardlinks description="is related to">
                            <issuelink>
                                <issuekey id="50974">LU-10732</issuekey>
                            </issuelink>
                        </inwardlinks>
                    </issuelinktype>
                </issuelinks>
                <attachments>
                </attachments>
                <subtasks>
                </subtasks>
                <customfields>
                    <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10040" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic</customfieldname>
                        <customfieldvalues>
                            <label>metadata</label>
                            <label>zfs</label>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10030" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic/Theme</customfieldname>
                        <customfieldvalues>
                            <label>zfs</label>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzyw7j:</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                    <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue key="10021"><![CDATA[2]]></customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>