<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:41:53 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4345] failed to update accounting ZAP for user</title>
                <link>https://jira.whamcloud.com/browse/LU-4345</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;We are using lustre &lt;a href=&quot;https://github.com/chaos/lustre/tree/2.4.0-19chaos&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;2.4.0-19chaos&lt;/a&gt; on our servers running with the ZFS OSD.  On some of the OSS nodes we are seeing messages like this:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Nov  6 00:06:29 stout8 kernel: LustreError: 14909:0:(osd_object.c:973:osd_attr_set()) fsrzb-OST0007: failed to update accounting ZAP for user 132245 (-2)
Nov  6 00:06:29 stout8 kernel: LustreError: 14909:0:(osd_object.c:973:osd_attr_set()) Skipped 5 previous similar messages
Nov  6 00:06:38 stout16 kernel: LustreError: 15266:0:(osd_object.c:973:osd_attr_set()) fsrzb-OST000f: failed to update accounting ZAP for user 122392 (-2)
Nov  6 00:06:38 stout16 kernel: LustreError: 15266:0:(osd_object.c:973:osd_attr_set()) Skipped 3 previous similar messages
Nov  6 00:06:40 stout12 kernel: LustreError: 15801:0:(osd_object.c:973:osd_attr_set()) fsrzb-OST000b: failed to update accounting ZAP for user 122708 (-2)
Nov  6 00:06:40 stout12 kernel: LustreError: 15801:0:(osd_object.c:973:osd_attr_set()) Skipped 4 previous similar messages
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Nov  7 00:31:36 porter31 kernel: LustreError: 7704:0:(osd_object.c:973:osd_attr_set()) lse-OST001f: failed to update accounting ZAP for user 54916 (-2)
Nov  7 02:53:05 porter19 kernel: LustreError: 9380:0:(osd_object.c:973:osd_attr_set()) lse-OST0013: failed to update accounting ZAP for user 7230 (-2)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Dec  3 12:01:21 stout7 kernel: Lustre: Skipped 3 previous similar messages
Dec  3 13:52:30 stout4 kernel: LustreError: 15806:0:(osd_object.c:967:osd_attr_set()) fsrzb-OST0003: failed to update accounting ZAP for user 1752876224 (-2)
Dec  3 13:52:30 stout4 kernel: LustreError: 15806:0:(osd_object.c:967:osd_attr_set()) Skipped 3 previous similar messages
Dec  3 13:52:30 stout1 kernel: LustreError: 15324:0:(osd_object.c:967:osd_attr_set()) fsrzb-OST0000: failed to update accounting ZAP for user 1752876224 (-2)
Dec  3 13:52:30 stout1 kernel: LustreError: 15784:0:(osd_object.c:967:osd_attr_set()) fsrzb-OST0000: failed to update accounting ZAP for user 1752876224 (-2)
Dec  3 13:52:30 stout14 kernel: LustreError: 16345:0:(osd_object.c:967:osd_attr_set()) fsrzb-OST000d: failed to update accounting ZAP for user 1752876224 (-2)
Dec  3 13:52:30 stout12 kernel: LustreError: 32355:0:(osd_object.c:967:osd_attr_set()) fsrzb-OST000b: failed to update accounting ZAP for user 1752876224 (-2)
Dec  3 13:52:30 stout2 kernel: LustreError: 15145:0:(osd_object.c:967:osd_attr_set()) fsrzb-OST0001: failed to update accounting ZAP for user 1752876224 (-2)
Dec  3 13:52:30 stout10 kernel: LustreError: 14570:0:(osd_object.c:967:osd_attr_set()) fsrzb-OST0009: failed to update accounting ZAP for user 1752876224 (-2)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;First of all, these messages are terrible.  If you look at osd_attr_set() there are &lt;em&gt;four&lt;/em&gt; exactly identical messages that are printed.  Ok, granted, we can look them up by line number.  But even better would be to make them unique.&lt;/p&gt;

&lt;p&gt;So looking them up by line numbers 967 and 973, it would appear that we have hit at least the first two of the &quot;failed to update accounting ZAP for user&quot; messages.&lt;/p&gt;

&lt;p&gt;Note that the UID numbers do not look correct to me.  Many of them are clearly not in the valid UID range.  But then I don&apos;t completely understand what is going on here yet.&lt;/p&gt;</description>
                <environment>Lustre 2.4.0-19chaos</environment>
        <key id="22341">LU-4345</key>
            <summary>failed to update accounting ZAP for user</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="niu">Niu Yawei</assignee>
                                    <reporter username="morrone">Christopher Morrone</reporter>
                        <labels>
                            <label>llnl</label>
                            <label>mn4</label>
                    </labels>
                <created>Wed, 4 Dec 2013 21:58:23 +0000</created>
                <updated>Wed, 6 Jul 2016 16:32:18 +0000</updated>
                            <resolved>Sun, 1 Jun 2014 20:36:54 +0000</resolved>
                                                    <fixVersion>Lustre 2.6.0</fixVersion>
                    <fixVersion>Lustre 2.5.3</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>13</watches>
                                                                            <comments>
                            <comment id="72856" author="morrone" created="Wed, 4 Dec 2013 22:04:11 +0000"  >&lt;p&gt;Can someone please explain the difference between &quot;la-&amp;gt;la_uid&quot; and &quot;obj-&amp;gt;oo_attr.la_uid&quot; in this function?  Both appear to be used as keys in the ZAP, and we increment the value associated with the former and decrement the value associated with the latter.&lt;/p&gt;

&lt;p&gt;Are either of these values a fake uid?  In other words, should these always be correct UID values as one would see in user space?  I am just wondering if I should really be assuming that the printed uid values are really incorrect or not.&lt;/p&gt;</comment>
                            <comment id="72858" author="pjones" created="Wed, 4 Dec 2013 22:07:56 +0000"  >&lt;p&gt;Alex&lt;/p&gt;

&lt;p&gt;Could you please comment?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="73172" author="bzzz" created="Tue, 10 Dec 2013 04:36:52 +0000"  >&lt;p&gt;Christopher, one is &quot;existing&quot; uid (which is to be decremented) and another is &quot;new&quot; (which is to be incremented). this way we move object from one user to another one. to clarify - 122392 doesn&apos;t look like a correct UID to you? I&apos;d say so, but I don&apos;t manage systems with lots of users.&lt;/p&gt;</comment>
                            <comment id="73207" author="morrone" created="Tue, 10 Dec 2013 17:30:53 +0000"  >&lt;p&gt;I don&apos;t understand.  What kind of accounting is this?&lt;/p&gt;

&lt;p&gt;And yes, we do not have UIDs outside of the unsigned 16bit range.   122392 is definitely not a valid UID.  In fact, none of the numbers given in the messages are valid UIDs (note that not all UIDs are in use in the passwd file).&lt;/p&gt;</comment>
                            <comment id="73210" author="bzzz" created="Tue, 10 Dec 2013 17:35:27 +0000"  >&lt;p&gt;this is quota accounting. given it&apos;s not easy to enable it on an existing system (with lots of objects to scan in flight, etc) we decided to have accounting enforced all the time (same for ldiskfs). IOW, for every backend we do track objects/space.&lt;/p&gt;</comment>
                            <comment id="74455" author="niu" created="Tue, 7 Jan 2014 02:40:08 +0000"  >&lt;p&gt;Could the accounting ZAP update be failing with -ENOENT because the invalid UID wasn&apos;t found in the ZAP?&lt;/p&gt;

&lt;p&gt;Is there a reproducer? That would help us debug and see where the invalid UID comes from.&lt;/p&gt;</comment>
                            <comment id="75389" author="morrone" created="Tue, 21 Jan 2014 22:14:01 +0000"  >&lt;p&gt;There is no reproducer known.  It just happens randomly in production.&lt;/p&gt;</comment>
                            <comment id="75415" author="niu" created="Wed, 22 Jan 2014 07:49:26 +0000"  >&lt;p&gt;Are there any files in the production system that have such a fake uid as owner? Could you apply the debug patch and see if we can get some clue as to where the fake uid comes from? Thanks.&lt;/p&gt;</comment>
                            <comment id="75468" author="morrone" created="Wed, 22 Jan 2014 23:38:19 +0000"  >&lt;p&gt;If any files exist with an UID of 1752876224, it is nearly certain that a lustre bug is responsible.&lt;/p&gt;

&lt;p&gt;I&apos;ll work the debug patch into the next update cycle.  Sadly, we just missed the last one.&lt;/p&gt;</comment>
                            <comment id="76433" author="morrone" created="Fri, 7 Feb 2014 05:00:30 +0000"  >&lt;p&gt;FYI, this patch went onto an ldiskfs-based filesystem and is tripping the ofs_write_attr_set() message constantly.  The debug patch is almost certainly going to have to be backed out.&lt;/p&gt;

&lt;p&gt;Can you work on something that is a bit more targeted?  Does only that one message need to come out or are the others going to be as bad as this?&lt;/p&gt;</comment>
                            <comment id="76434" author="morrone" created="Fri, 7 Feb 2014 05:04:35 +0000"  >&lt;p&gt;By the way, what was the goal of the &amp;gt; 7000 part in the file?  Did you think there were no valid UIDs greater than 7000?  That is certainly not the case.  7230 was not valid in the above example, but that is a &lt;em&gt;hole&lt;/em&gt; in the UID space.  We have UIDs into the 50000 range at least.&lt;/p&gt;</comment>
                            <comment id="76442" author="niu" created="Fri, 7 Feb 2014 06:52:24 +0000"  >&lt;p&gt;I updated the debug patch, this time error message will only be printed when uid is greater than 1700000000.&lt;/p&gt;</comment>
                            <comment id="80640" author="morrone" created="Mon, 31 Mar 2014 18:24:39 +0000"  >&lt;p&gt;It looks like we are hitting this on OSS nodes pretty frequently.  Here are some examples:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;2014-03-31 10:23:46 LustreError: 6113:0:(ofd_obd.c:888:ofd_setattr()) LU-4345 set uid: 632227776
2014-03-31 10:23:46 LustreError: 6119:0:(ofd_objects.c:418:ofd_attr_set()) LU-4345 set uid: 632227776
2014-03-31 10:23:46 LustreError: 6113:0:(ofd_obd.c:888:ofd_setattr()) Skipped 1 previous similar message
2014-03-31 10:23:46 LustreError: 6113:0:(ofd_objects.c:418:ofd_attr_set()) LU-4345 set uid: 632227776
2014-03-31 10:48:29 LustreError: 6229:0:(ofd_obd.c:888:ofd_setattr()) LU-4345 set uid: 857467328
2014-03-31 10:48:29 LustreError: 6229:0:(ofd_objects.c:418:ofd_attr_set()) LU-4345 set uid: 857467328
2014-03-31 10:48:35 LustreError: 6020:0:(ofd_obd.c:888:ofd_setattr()) LU-4345 set uid: 857467328
2014-03-31 10:48:35 LustreError: 6020:0:(ofd_objects.c:418:ofd_attr_set()) LU-4345 set uid: 857467328
2014-03-31 10:48:44 LustreError: 6224:0:(ofd_obd.c:888:ofd_setattr()) LU-4345 set uid: 857467328
2014-03-31 10:48:44 LustreError: 6159:0:(ofd_objects.c:418:ofd_attr_set()) LU-4345 set uid: 857467328
2014-03-31 10:48:44 LustreError: 6224:0:(ofd_obd.c:888:ofd_setattr()) Skipped 2 previous similar messages
2014-03-31 10:48:48 LustreError: 6159:0:(ofd_obd.c:888:ofd_setattr()) LU-4345 set uid: 857467328
2014-03-31 10:48:48 LustreError: 6159:0:(ofd_objects.c:418:ofd_attr_set()) LU-4345 set uid: 857467328
2014-03-31 10:48:48 LustreError: 6159:0:(ofd_objects.c:418:ofd_attr_set()) Skipped 2 previous similar messages
2014-03-31 10:48:52 LustreError: 6307:0:(ofd_obd.c:888:ofd_setattr()) LU-4345 set uid: 857467328
2014-03-31 10:48:52 LustreError: 6002:0:(ofd_objects.c:418:ofd_attr_set()) LU-4345 set uid: 857467328
2014-03-31 10:48:52 LustreError: 6307:0:(ofd_obd.c:888:ofd_setattr()) Skipped 1 previous similar message
2014-03-31 10:48:56 LustreError: 6002:0:(ofd_obd.c:888:ofd_setattr()) LU-4345 set uid: 857467328
2014-03-31 10:48:56 LustreError: 6002:0:(ofd_objects.c:418:ofd_attr_set()) LU-4345 set uid: 857467328
2014-03-31 10:48:56 LustreError: 6002:0:(ofd_objects.c:418:ofd_attr_set()) Skipped 1 previous similar message
2014-03-31 10:49:05 LustreError: 6159:0:(ofd_obd.c:888:ofd_setattr()) LU-4345 set uid: 857467328
2014-03-31 10:49:05 LustreError: 6159:0:(ofd_obd.c:888:ofd_setattr()) Skipped 3 previous similar messages
2014-03-31 10:49:05 LustreError: 6119:0:(ofd_objects.c:418:ofd_attr_set()) LU-4345 set uid: 857467328
2014-03-31 10:49:05 LustreError: 6119:0:(ofd_objects.c:418:ofd_attr_set()) Skipped 3 previous similar messages
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;2014-03-29 18:55:33 LustreError: 5973:0:(ofd_obd.c:888:ofd_setattr()) LU-4345 set uid: 938789056
2014-03-29 18:55:33 LustreError: 5968:0:(ofd_objects.c:418:ofd_attr_set()) LU-4345 set uid: 938789056
2014-03-29 18:55:33 LustreError: 5973:0:(ofd_obd.c:888:ofd_setattr()) Skipped 1 previous similar message
2014-03-29 18:55:33 LustreError: 5987:0:(ofd_obd.c:888:ofd_setattr()) LU-4345 set uid: 938789056
2014-03-29 18:55:33 LustreError: 5987:0:(ofd_obd.c:888:ofd_setattr()) Skipped 1 previous similar message
2014-03-29 18:55:33 LustreError: 5987:0:(ofd_objects.c:418:ofd_attr_set()) LU-4345 set uid: 938789056
2014-03-29 18:55:33 LustreError: 5968:0:(osd_object.c:974:osd_attr_set()) lse-OST0013: failed to update accounting ZAP for original user 938788544 (-2)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;Note that I had set the &quot;&amp;gt; 7000&quot; to &quot;&amp;gt; 70000&quot;, and also added &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4345&quot; title=&quot;failed to update accounting ZAP for user&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4345&quot;&gt;&lt;del&gt;LU-4345&lt;/del&gt;&lt;/a&gt;&quot; to the comments before you provided the updated patch that uses &quot;&amp;gt; 1700000000&quot;.&lt;/p&gt;</comment>
                            <comment id="80669" author="niu" created="Tue, 1 Apr 2014 02:52:50 +0000"  >&lt;p&gt;The messages show that all these incorrect ids come from owner/group changes (not from punch or internal owner/group setting on first write), so I think some problematic application may be changing owner/group unexpectedly. Could you check whether any files on the MDT were changed to such an invalid owner/group? (Hopefully they are not temporary files.)&lt;/p&gt;

&lt;p&gt;Another question is why the ZAP update failed. The ZAP should be updated successfully regardless of whether the uid/gid is invalid. I think we need a ZFS expert to enable debugging on ZFS and see what&apos;s going on in the ZAP update.&lt;/p&gt;</comment>
                            <comment id="80817" author="bzzz" created="Wed, 2 Apr 2014 09:44:37 +0000"  >&lt;p&gt;I wonder if this is a race in zap_increment().. there is no locking between zap_lookup() and zap_remove()/zap_update(). meaning in some case we can try to remove the record already removed concurrently? I&apos;d like to suggest &lt;a href=&quot;http://review.whamcloud.com/#/c/7157/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/7157/&lt;/a&gt; but unfortunately that one depends on specific DMU API to be exported.&lt;/p&gt;</comment>
                            <comment id="83162" author="niu" created="Mon, 5 May 2014 08:11:56 +0000"  >&lt;p&gt;I found that osp sometimes could set a random uid/gid to OST object. (when user set uid or gid only).&lt;/p&gt;

&lt;p&gt;in osp_sync_add_rec():&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;        &lt;span class=&quot;code-keyword&quot;&gt;case&lt;/span&gt; MDS_SETATTR64_REC:
                rc = fid_to_ostid(fid, &amp;amp;osi-&amp;gt;osi_oi);
                LASSERT(rc == 0);
                osi-&amp;gt;osi_hdr.lrh_len = sizeof(osi-&amp;gt;osi_setattr);
                osi-&amp;gt;osi_hdr.lrh_type = MDS_SETATTR64_REC;
                osi-&amp;gt;osi_setattr.lsr_oi  = osi-&amp;gt;osi_oi;
                LASSERT(attr);
                osi-&amp;gt;osi_setattr.lsr_uid = attr-&amp;gt;la_uid;
                osi-&amp;gt;osi_setattr.lsr_gid = attr-&amp;gt;la_gid;
                &lt;span class=&quot;code-keyword&quot;&gt;break&lt;/span&gt;;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Both uid and gid from attr are saved into the llog without checking if they are all valid. (if LA_UID &amp;amp; LA_GID are both present in attr-&amp;gt;la_valid)&lt;/p&gt;

&lt;p&gt;in osp_sync_new_setattr_job():&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;        body-&amp;gt;oa.o_oi = rec-&amp;gt;lsr_oi;
        body-&amp;gt;oa.o_uid = rec-&amp;gt;lsr_uid;
        body-&amp;gt;oa.o_gid = rec-&amp;gt;lsr_gid;
        body-&amp;gt;oa.o_valid = OBD_MD_FLGROUP | OBD_MD_FLID |
                           OBD_MD_FLUID | OBD_MD_FLGID;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;We send both the uid &amp;amp; gid from llog to OST, and tell OST that both uid &amp;amp; gid are valid. (OBD_MD_FLUID &amp;amp; OBD_MD_FLGID)&lt;/p&gt;

&lt;p&gt;This is probably the cause of the random ids on OST objects. I think we should store a flag in llog_setattr64_rec to specify which id is valid. Alex, what do you think?&lt;/p&gt;</comment>
                            <comment id="83163" author="bzzz" created="Mon, 5 May 2014 08:17:13 +0000"  >&lt;p&gt;we don&apos;t store &quot;validity&quot; in llog. so I guess the right fix would be to fill missing uid/gid in llog record with current value?&lt;/p&gt;</comment>
                            <comment id="83256" author="niu" created="Tue, 6 May 2014 02:55:21 +0000"  >&lt;blockquote&gt;
&lt;p&gt;we don&apos;t store &quot;validity&quot; in llog. so I guess the right fix would be to fill missing uid/gid in llog record with current value?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;You mean get the current ids in lod layer, and pass them to osp by &apos;attr&apos;? (the &apos;attr&apos; is &apos;const&apos;)&lt;/p&gt;</comment>
                            <comment id="83263" author="niu" created="Tue, 6 May 2014 05:27:25 +0000"  >&lt;p&gt;&lt;a href=&quot;http://review.whamcloud.com/10223&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/10223&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="83818" author="spimpale" created="Mon, 12 May 2014 10:22:42 +0000"  >&lt;p&gt;Is there a b2_4 backport of this patch?&lt;/p&gt;</comment>
                            <comment id="83839" author="pjones" created="Mon, 12 May 2014 14:13:11 +0000"  >&lt;p&gt;Swapnil&lt;/p&gt;

&lt;p&gt;Not yet. The usual practice is to finalize the form of the patch on master before back porting to earlier branches&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="84836" author="spimpale" created="Sat, 24 May 2014 04:56:09 +0000"  >&lt;p&gt;Peter,&lt;/p&gt;

&lt;p&gt;Could you please provide a b2_4 backport of this patch? We need it at one of our customer sites.&lt;/p&gt;

&lt;p&gt;Thanks!&lt;/p&gt;</comment>
                            <comment id="84840" author="pjones" created="Sat, 24 May 2014 13:47:16 +0000"  >&lt;p&gt;Swapnil&lt;/p&gt;

&lt;p&gt;Sorry if I was not clear previously. Yes, I understand that you would like a b2_4 version of this fix and as soon as we have finalized the form of the fix we will create one&lt;/p&gt;

&lt;p&gt;Regards&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="84995" author="niu" created="Wed, 28 May 2014 01:37:09 +0000"  >&lt;p&gt;b2_4: &lt;a href=&quot;http://review.whamcloud.com/#/c/10462/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#/c/10462/&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="84998" author="spimpale" created="Wed, 28 May 2014 04:29:24 +0000"  >&lt;p&gt;Thanks Niu&lt;/p&gt;</comment>
                            <comment id="85281" author="adilger" created="Fri, 30 May 2014 17:57:21 +0000"  >&lt;p&gt;The patch &lt;a href=&quot;http://review.whamcloud.com/7157&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/7157&lt;/a&gt; was landed to master and then reverted due to problems. That patch needs to be refreshed. &lt;/p&gt;</comment>
                            <comment id="85417" author="jlevi" created="Sun, 1 Jun 2014 20:36:54 +0000"  >&lt;p&gt;Follow on work is being tracked in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5129&quot; title=&quot;CLONE - failed to update accounting ZAP for user&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5129&quot;&gt;&lt;del&gt;LU-5129&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="85493" author="morrone" created="Mon, 2 Jun 2014 18:24:13 +0000"  >&lt;p&gt;So what is the actual state of the reported bug?  Is it now fixed, but because &lt;a href=&quot;http://review.whamcloud.com/7157&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/7157&lt;/a&gt; was reverted we now have a potential performance regression?  Or is the bug not yet fixed?&lt;/p&gt;</comment>
                            <comment id="85505" author="adilger" created="Mon, 2 Jun 2014 19:41:51 +0000"  >&lt;p&gt;Chris,&lt;br/&gt;
the patch 10223 was landed for master (2.6.0), which Niu believes to be the major source of inconsistent UID/GIDs on the OSTs for quota accounting.&lt;/p&gt;

&lt;p&gt;The 7157 patch is to be tracked under &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2600&quot; title=&quot;lustre metadata performance is very slow on zfs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2600&quot;&gt;&lt;del&gt;LU-2600&lt;/del&gt;&lt;/a&gt; where it was originally filed.  I mistakenly thought it was submitted under this ticket and needed a new patch to track it for landing.  It had accidentally landed to master for a very short time, but was reverted because it caused problems and Alex had only intended it for testing at this point.&lt;/p&gt;</comment>
                            <comment id="85552" author="bzzz" created="Tue, 3 Jun 2014 09:47:18 +0000"  >&lt;p&gt;2600 was doing OK on my local system, unfortunately it seems to fail on maloo sometimes. I asked Brian B. to help with understanding the root cause - somehow dnodes are still referenced when I use dsl_sync_task_nowait(). once this is sorted out (Brian, please help &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;) we can try that again.&lt;/p&gt;

&lt;p&gt;the important thing is that w/o 2600 accounting is still racy..&lt;/p&gt;</comment>
                            <comment id="86333" author="behlendorf" created="Wed, 11 Jun 2014 17:00:00 +0000"  >&lt;p&gt;Exactly what kind of failure are you seeing?  I don&apos;t understand what you mean by &apos;somehow dnodes are still referenced&apos;.  Can you point me at a maloo failure which shows the problem or better describe exactly what the issue is.  What you&apos;re trying to do in the patch looks reasonable to me on the surface, although the whole thing feels racy.&lt;/p&gt;</comment>
                            <comment id="86336" author="bzzz" created="Wed, 11 Jun 2014 17:19:33 +0000"  >&lt;p&gt;at umount dnodes storing object accounting are still referenced, so dnode_special_close() gets stuck because meta dnode is referenced by those.&lt;/p&gt;

&lt;p&gt;why do you think the whole thing is racy?&lt;/p&gt;</comment>
                            <comment id="86361" author="behlendorf" created="Wed, 11 Jun 2014 19:57:27 +0000"  >&lt;p&gt;Ahh OK.  I remember now.&lt;/p&gt;

&lt;p&gt;It looks to me like you&apos;re failing to call dmu_tx_commit() after dsl_sync_task_nowait().  The commit is responsible for dropping all the dnode holds and notifying any waiters.  Without it the holds are just going to accumulate and I&apos;d expect to see exactly what you&apos;re describing.&lt;/p&gt;

&lt;p&gt;My suggestion would be to take a good look at the spa_history_log() function in zfs.  It&apos;s a fairly nice example of how to go about this.  In this case they create a tx per history update since there aren&apos;t that many of them.  In the Lustre case I agree it&apos;s probably a good idea to batch them as you&apos;re doing.  However, for the commit callback I&apos;d still strongly suggest allowing a dedicated tx for this purpose.  I think it would make the code more readable and easy to verify that you&apos;ve constructed the tx properly.  There&apos;s a nice comment in include/sys/dmu.h describing what you can and cannot do when constructing a tx.  If you break any of those rules you&apos;re likely to see some strange problems.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;/*
 * You must create a transaction, then hold the objects which you will
 * (or might) modify as part of this transaction.  Then you must assign
 * the transaction to a transaction group.  Once the transaction has
 * been assigned, you can modify buffers which belong to held objects as
 * part of this transaction.  You can&apos;t modify buffers before the
 * transaction has been assigned; you can&apos;t modify buffers which don&apos;t
 * belong to objects which this transaction holds; you can&apos;t hold
 * objects once the transaction has been assigned.  You may hold an
 * object which you are going to free (with dmu_object_free()), but you
 * don&apos;t have to.
 *
 * You can abort the transaction before it has been assigned.
 *
 * Note that you may hold buffers (with dmu_buf_hold) at any time,
 * regardless of transaction state.
 */
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The racy comment was just a subjective feeling I had about the code.  There&apos;s so much non-private state you&apos;re depending on it&apos;s hard to easily look at the code and convince yourself it&apos;s safe.  A good example of this is the reuse of the ot_tx.  It&apos;s hard to know exactly what state the tx is in when you enter the function, and depending on what state that is there are certain things you must not do.  If you were to create a new tx here it would be much clearer.&lt;/p&gt;</comment>
                            <comment id="86363" author="bzzz" created="Wed, 11 Jun 2014 20:04:15 +0000"  >&lt;p&gt;hmm, what I&apos;m trying to do is to hit the same transaction (otherwise accounting is not atomic), but at the point when all the normal changes &lt;b&gt;within&lt;/b&gt; this txg are done - this is the purpose of dsl_sync_task_nowait(), right? and basically that whole state is per txg and it&apos;s supposed to be totally non-racy.&lt;/p&gt;

&lt;p&gt;yet another point is that this is exactly how space accounting works now? (or very similarly)&lt;/p&gt;</comment>
                            <comment id="86365" author="bzzz" created="Wed, 11 Jun 2014 20:17:31 +0000"  >&lt;p&gt;I had a quick look at spa_history_log() and not sure what&apos;s the difference. yes, it does create &quot;own&quot; tx, but how is it really different given all the logic is done in spa_history_log_sync() which is scheduled with dsl_sync_task_nowait() ?&lt;/p&gt;

&lt;p&gt;in my patch I do have a normal tx created the same normal way, I check this is the first time I see this specific txg, then allocate a structure (which is per-txg essentially) and schedule a function with dsl_sync_task_nowait() which updates a couple of ZAPs.. seems to be exactly the same?&lt;/p&gt;</comment>
                            <comment id="86367" author="behlendorf" created="Wed, 11 Jun 2014 20:20:25 +0000"  >&lt;p&gt;Exactly right.  You just need this tx to end up in the same txg.  As long as that&apos;s true you&apos;ll have the atomic guarantee.  On failure, either both will be on disk or neither will.  But you don&apos;t need to leverage an existing tx to achieve that.  If you know the txg you can use dmu_tx_create_assigned() to ensure your tx is part of that txg.&lt;/p&gt;</comment>
                            <comment id="86368" author="behlendorf" created="Wed, 11 Jun 2014 20:22:15 +0000"  >&lt;p&gt;The critical bit is the call to dmu_tx_commit() after the dsl_sync_task_nowait() call.  Where does this happen in the Lustre code for this tx?&lt;/p&gt;</comment>
                            <comment id="86370" author="bzzz" created="Wed, 11 Jun 2014 20:24:12 +0000"  >&lt;p&gt;hmm, osd_trans_stop() does this, otherwise we would block in the very first txg? the interesting thing is that the patch gets stuck quite rarely.. it actually passed many maloo runs and it was hard to hit the issue locally as well.&lt;/p&gt;</comment>
                            <comment id="86372" author="behlendorf" created="Wed, 11 Jun 2014 20:50:17 +0000"  >&lt;p&gt;Perhaps it&apos;s just me.  But I find it hard to walk the code and verify that the tx is being constructed correctly.  Are we perhaps taking extra holds now?  One thing you could try is to build ZFS with the --enable-debug-dmu-tx option.  This will enable a variety of checks to ensure that tx are constructed and managed properly.  If they&apos;re not you&apos;ll ASSERT.  They&apos;re somewhat expensive so they&apos;re disabled by default.&lt;/p&gt;

&lt;p&gt;As for doing things like the internal ZFS code, quota is managed more like I described above.  If you look at dsl_pool_sync() which is called in the sync&apos;ing context like a synctask you&apos;ll see that it creates a new tx for the correct txg with dmu_tx_create_assigned().  It then goes through the list of dirty dnodes per dataset and updates the zap&apos;s accordingly.  I don&apos;t see why Lustre couldn&apos;t do something similar.&lt;/p&gt;</comment>
                            <comment id="86405" author="bzzz" created="Thu, 12 Jun 2014 05:06:25 +0000"  >&lt;p&gt;OK, I&apos;ll have a closer look at the debug options, but notice that spa_history_log() and spa_history_log_sync don&apos;t call dmu_tx_create_assigned() and this still works?&lt;/p&gt;

&lt;p&gt;I&apos;ll also try to use dmu_tx_create_assigned(), thanks.&lt;/p&gt;</comment>
                            <comment id="86424" author="behlendorf" created="Thu, 12 Jun 2014 15:53:28 +0000"  >&lt;p&gt;&amp;gt; spa_history_log() and spa_history_log_sync don&apos;t call dmu_tx_create_assigned()&lt;/p&gt;

&lt;p&gt;They don&apos;t call it because there aren&apos;t any atomic requirements for the history.  But if you need to make sure all the tx&apos;s end up being atomic you&apos;ll need dmu_tx_create_assigned() which is what the quota code uses.  Even better just use the tx passed to the synctask.  This is created in dsl_pool_sync() and will be tied to the txg being synced.&lt;/p&gt;</comment>
                            <comment id="91560" author="yujian" created="Wed, 13 Aug 2014 21:04:04 +0000"  >&lt;p&gt;Here is the back-ported patch for Lustre b2_5 branch: &lt;a href=&quot;http://review.whamcloud.com/11435&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/11435&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="24982">LU-5129</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="25134">LU-5188</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="22781">LU-4504</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="14009" name="0001-LU-4345-quota-debug-patch.patch" size="4642" author="niu" created="Wed, 22 Jan 2014 07:52:03 +0000"/>
                            <attachment id="14068" name="debug_patch-v1.patch" size="4672" author="niu" created="Fri, 7 Feb 2014 06:52:24 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwap3:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>11907</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>