<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:10:49 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-7659] Replace KUC by more standard mechanisms</title>
                <link>https://jira.whamcloud.com/browse/LU-7659</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;The Kernel Userland Communication (KUC) subsystem is a Lustre-specific API for a relatively common task: delivering a stream of records from the kernel to userland and transmitting feedback from userland to the kernel. We propose to replace it with character devices.&lt;/p&gt;

&lt;p&gt;Besides being more standard, this can also increase performance significantly, because a process can read large chunks from the character device. Our proof of concept shows a 5-10x speedup when reading changelogs in blocks of 4k.&lt;/p&gt;

&lt;p&gt;I would like feedback and suggestions. The proposed implementation works as follows:&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;Register a misc char device at mdc_setup (e.g. /dev/changelog-lustre0-MDT0000). The minor number is associated with the corresponding OBD.&lt;/li&gt;
	&lt;li&gt;The .open handler starts a background thread that iterates over the llog and enqueues up to X records into a ring buffer.&lt;/li&gt;
	&lt;li&gt;The .read handler dequeues records from the ring buffer. It can be made blocking or non-blocking.&lt;/li&gt;
	&lt;li&gt;The .release handler stops the background thread and releases resources.&lt;/li&gt;
	&lt;li&gt;Changelog clear is not yet implemented. It can be either a .write or an .unlocked_ioctl handler. Which would be preferable?&lt;/li&gt;
&lt;/ul&gt;
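
&lt;p&gt;The handlers listed above can be sketched as a userland Python simulation. This is only an illustration of the producer/consumer idea, not the kernel implementation: the class name ChangelogReader, the capacity parameter (standing in for X), the None end-of-log marker and the 0.05 s retry interval are all assumptions made for the example.&lt;/p&gt;

```python
import queue
import threading


class ChangelogReader:
    """Hypothetical userland model of the open/read/release handlers."""

    def __init__(self, records, capacity=4):
        self._records = list(records)               # stands in for the llog
        self._ring = queue.Queue(maxsize=capacity)  # bounded buffer, capacity plays the role of X
        self._stop = threading.Event()
        self._thread = None

    def open(self):
        # .open: start the background thread that fills the buffer.
        self._thread = threading.Thread(target=self._producer, daemon=True)
        self._thread.start()

    def _put(self, item):
        # Retry with a timeout so that release() can interrupt a full buffer.
        # Returns True if the item was enqueued, False if we were stopped.
        while not self._stop.is_set():
            try:
                self._ring.put(item, timeout=0.05)
                return True
            except queue.Full:
                continue
        return False

    def _producer(self):
        for rec in self._records:
            if not self._put(rec):
                return
        self._put(None)  # assumed end-of-log marker

    def read(self, block=True):
        # .read: dequeue one record. None means end of log; in non-blocking
        # mode queue.Empty is raised when nothing is ready yet.
        return self._ring.get(block=block)

    def release(self):
        # .release: stop the background thread and wait for it to exit.
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
```

&lt;p&gt;A consumer would call open(), loop on read() until it returns None, then call release(). While the consumer processes one record, the producer thread keeps refilling the buffer in the background.&lt;/p&gt;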


&lt;p&gt;The implementation for the copytool has not been done yet but would work in a similar way.&lt;/p&gt;</description>
                <environment></environment>
        <key id="34089">LU-7659</key>
            <summary>Replace KUC by more standard mechanisms</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="4" iconUrl="https://jira.whamcloud.com/images/icons/statuses/reopened.png" description="This issue was once resolved, but the resolution was deemed incorrect. From here issues are either marked assigned or resolved.">Reopened</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="ypo">Yohan Pipereau</assignee>
                                    <reporter username="hdoreau">Henri Doreau</reporter>
                        <labels>
                            <label>cea</label>
                            <label>patch</label>
                    </labels>
                <created>Wed, 13 Jan 2016 12:00:05 +0000</created>
                <updated>Wed, 15 Dec 2021 08:51:22 +0000</updated>
                                            <version>Lustre 2.10.0</version>
                                    <fixVersion>Upstream</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>21</watches>
                                                                            <comments>
                            <comment id="138805" author="bfaccini" created="Wed, 13 Jan 2016 15:05:32 +0000"  >&lt;p&gt;Hello Henri,&lt;br/&gt;
This looks really promising, and much better suited than KUC is today to the growing volume of ChangeLogs and the bandwidth they require.&lt;br/&gt;
Can you detail why the background thread needs to be started during open?&lt;/p&gt;</comment>
                            <comment id="138809" author="hdoreau" created="Wed, 13 Jan 2016 15:13:06 +0000"  >&lt;p&gt;Thanks Bruno,&lt;/p&gt;

&lt;p&gt;this thread makes record retrieval asynchronous and speeds up the whole thing: while userland processes a batch of records, a new one gets retrieved in the background. Do you think it&apos;s overkill? &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;</comment>
                            <comment id="138811" author="hdoreau" created="Wed, 13 Jan 2016 15:15:53 +0000"  >&lt;p&gt;I should add that KUC can deliver ~100k records per second in my benchmarks, though we have a robinhood setup able to consume 70k/s. This motivated my work. It would also make it easier to read records from any language, and it would look nicer.&lt;/p&gt;</comment>
                            <comment id="138877" author="adilger" created="Thu, 14 Jan 2016 02:35:37 +0000"  >&lt;p&gt;Some comments and questions:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;would this new mechanism be able to handle multiple ChangeLog consumers?&lt;/li&gt;
	&lt;li&gt;my preference would be to use read and write for the interface, instead of ioctl, since this can be used even from scripts&lt;/li&gt;
	&lt;li&gt;I would have suggested a /proc file instead of a char device, but new /proc files are frowned upon, and /sys files are only one value per file.  The (minor) issue with a char device is the registration of the char major/minor, but it could use a misc char device?&lt;/li&gt;
	&lt;li&gt;the .llseek() operation should allow seeking to a specific record, so that if there are multiple consumers and old records are not yet cancelled the new records can be found easily&lt;/li&gt;
	&lt;li&gt;the char device should also have a .poll() method so that userspace can wait for new records efficiently instead of busy looping&lt;/li&gt;
&lt;/ul&gt;



&lt;p&gt;One issue that had come up with ChangeLogs in the past was that they are single-threaded in the kernel, which limits performance during metadata operations.  If we are changing the API in userspace, it might also be good to change the on-disk format to allow multiple ChangeLog files to be written in parallel.  Probably not one per core (that may become too many on large MDS nodes), but maybe 4-8 or so.  The records could be merge sorted in the kernel by the helper thread at read time.&lt;/p&gt;</comment>
                            <comment id="138878" author="adilger" created="Thu, 14 Jan 2016 02:52:47 +0000"  >&lt;p&gt;Also, in theory the user-kernel interface could be changed without changing the userspace API, though this may be less desirable because of the licensing.&lt;/p&gt;

&lt;p&gt;Also note that it may be possible to just change the existing pipe interface to allow reading multiple records at once, instead of the current implementation that does two read() calls per record (one for the header and one for the body).  It would be possible to read up to 64KB chunks from the pipe I think, and it could get as many full records as fit into the buffer and return a short read.&lt;/p&gt;</comment>
                            <comment id="138960" author="rread" created="Thu, 14 Jan 2016 20:05:00 +0000"  >&lt;p&gt;The inotify API could be a good model for this, particularly providing a file descriptor that can be used with select or poll. &lt;/p&gt;</comment>
                            <comment id="145430" author="gerrit" created="Mon, 14 Mar 2016 15:59:01 +0000"  >&lt;p&gt;Henri Doreau (henri.doreau@cea.fr) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/18900&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/18900&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7659&quot; title=&quot;Replace KUC by more standard mechanisms&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7659&quot;&gt;LU-7659&lt;/a&gt; mdc: expose changelog through char devices&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: d34ff747f04f30e37a6f13b970bbcaa6ffe9e813&lt;/p&gt;</comment>
                            <comment id="147100" author="adilger" created="Mon, 28 Mar 2016 20:40:33 +0000"  >&lt;p&gt;I was thinking that this might also be more useful to expose via &lt;tt&gt;.lustre/changelog/MDTxxxx&lt;/tt&gt; rather than /dev/XXX so that it is easily accessed by applications when multiple filesystems are mounted on the same node.&lt;/p&gt;

&lt;p&gt;Also, very similar to this would be exposing (probably only on the server?) the virtual &quot;all objects&quot; iterator for each target under similar &lt;tt&gt;.lustre/iterator/MDTxxxx&lt;/tt&gt; and &lt;tt&gt;.lustre/iterator/OSTxxxx&lt;/tt&gt; virtual files (or similar).  The MDTxxxx iterators are useful for listing all inodes in order so that they can efficiently be processed for initial RBH scans of all files.  The OSTxxxx iterators might be useful for e.g. migrating objects off OSTs, replication of file data, and other operations that touch every object on an &lt;em&gt;online&lt;/em&gt; OST, but could be implemented separately as needed.  The caveat is that this would only be easily accessed if the OST is online, unless it was handled virtually by traversing the MDT layouts when the OST is offline which would not be nearly as efficient.&lt;/p&gt;</comment>
                            <comment id="150120" author="simmonsja" created="Mon, 25 Apr 2016 23:13:31 +0000"  >&lt;p&gt;As I explore netlink, I wonder if the API could be used for this. In my research I discovered that it is used by the SCSI layer, which surprised me.&lt;/p&gt;</comment>
                            <comment id="152831" author="gerrit" created="Thu, 19 May 2016 14:34:39 +0000"  >&lt;p&gt;Quentin Bouget (quentin.bouget.ocre@cea.fr) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/20327&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/20327&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7659&quot; title=&quot;Replace KUC by more standard mechanisms&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7659&quot;&gt;LU-7659&lt;/a&gt; mdc: expose hsm requests through char devices&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 52935cd7190a1b4d4b5def5a9244ce1e5ca60c3a&lt;/p&gt;</comment>
                            <comment id="153966" author="gerrit" created="Mon, 30 May 2016 12:27:55 +0000"  >&lt;p&gt;Quentin Bouget (quentin.bouget.ocre@cea.fr) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/20501&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/20501&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7659&quot; title=&quot;Replace KUC by more standard mechanisms&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7659&quot;&gt;LU-7659&lt;/a&gt; mdc: add an ioctl call to the copytool char device&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 3d4f22eab430874560db611f9bd95fb31f63350f&lt;/p&gt;</comment>
                            <comment id="153967" author="gerrit" created="Mon, 30 May 2016 12:27:56 +0000"  >&lt;p&gt;Quentin Bouget (quentin.bouget.ocre@cea.fr) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/20502&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/20502&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7659&quot; title=&quot;Replace KUC by more standard mechanisms&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7659&quot;&gt;LU-7659&lt;/a&gt; mdc: revise copytool char device locking&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: ebae9fe2b1fdf458638cabbde8a02fc1522ebb75&lt;/p&gt;</comment>
                            <comment id="187090" author="hdoreau" created="Sun, 5 Mar 2017 16:16:33 +0000"  >&lt;p&gt;Andreas, I realize that I have not answered your questions, sorry for that. See below.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;would this new mechanism be able to handle multiple ChangeLog consumers?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Yes, multiple processes can open the char device. By default they start reading from the beginning of the llog and they can lseek to wherever they want in the log to start at a given record. Very similar to the existing implementation in this sense.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;my preference would be to use read and write for the interface, instead of ioctl, since this can be used even from scripts&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Done.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I would have suggested a /proc file instead of a char device, but new /proc files are frowned upon, and /sys files are only one value per file. The (minor) issue with a char device is the registration of the char major/minor, but it could use a misc char device?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;It is a misc char device.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;the .llseek() operation should allow seeking to a specific record, so that if there are multiple consumers and old records are not yet cancelled the new records can be found easily&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Done, using the record number as the offset to jump to.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;the char device should also have a .poll() method so that userspace can wait for new records efficiently instead of busy looping&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Done.&lt;/p&gt;
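&lt;p&gt;From the consumer side, the seek and poll behaviour described above might be used roughly as follows. This is a hedged sketch in Python using only standard os and select calls. The helper name read_block and its parameters are hypothetical, and the layout of the returned bytes is not described in this ticket, so the block is returned unparsed.&lt;/p&gt;

```python
import os
import select


def read_block(fd, start_record=None, block_size=4096, timeout_ms=1000):
    """Optionally seek, wait until fd is readable, then read one block."""
    if start_record is not None:
        # Per the discussion, the char device interprets the offset as a
        # record number. On a regular file it is a plain byte offset.
        os.lseek(fd, start_record, os.SEEK_SET)
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    if not poller.poll(timeout_ms):
        return None  # nothing became readable before the timeout
    return os.read(fd, block_size)
```

&lt;p&gt;With a descriptor obtained from os.open on the changelog device (for instance the /dev/changelog-lustre0-MDT0000 example name from the description), calling read_block in a loop waits for new records without busy looping.&lt;/p&gt;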
&lt;blockquote&gt;
&lt;p&gt;One issue that had come up with ChangeLogs in the past was that they are single-threaded in the kernel, which limits performance during metadata operations. If we are changing the API in userspace, it might also be good to change the on-disk format to allow multiple ChangeLog files to be written in parallel. Probably not one per core (that may become too many on large MDS nodes), but maybe 4-8 or so. The records could be merge sorted in the kernel by the helper thread at read time.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;I&apos;d love that. It is beyond the scope of this patch I&apos;d say, but I will keep it in mind. Maybe indexes instead of llog catalogs?&lt;/p&gt;</comment>
                            <comment id="187105" author="adilger" created="Mon, 6 Mar 2017 08:28:34 +0000"  >&lt;p&gt;Yes, we&apos;ve discussed changing llogs over to use an index instead of a flat file. The benefit of the llog file is that it can be written mostly sequentially, and record cancellation only needs to update the bitmap in the header.  The drawbacks are that updating the header is serialized, reserving space in the llog file is difficult if the record size is unknown, and there is added complexity when the order of the log records does not match the order in which transactions are completed.&lt;/p&gt;

&lt;p&gt;On a related note, did you look into connecting the LFSCK iterator to the new char interface to speed up the initial scanning for RBH?&lt;/p&gt;</comment>
                            <comment id="191002" author="gerrit" created="Thu, 6 Apr 2017 13:46:28 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/18900/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/18900/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7659&quot; title=&quot;Replace KUC by more standard mechanisms&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7659&quot;&gt;LU-7659&lt;/a&gt; mdc: expose changelog through char devices&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 1d40214d96dd6e36bd39a35f8419f753bae8d305&lt;/p&gt;</comment>
                            <comment id="191296" author="pjones" created="Sun, 9 Apr 2017 14:12:34 +0000"  >&lt;p&gt;Landed for 2.10&lt;/p&gt;</comment>
                            <comment id="231483" author="gerrit" created="Mon, 6 Aug 2018 06:38:25 +0000"  >&lt;p&gt;Yohan Pipereau (yohan.pipereau.ocre@cea.fr) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/32941&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/32941&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7659&quot; title=&quot;Replace KUC by more standard mechanisms&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7659&quot;&gt;LU-7659&lt;/a&gt; libcfs: Use netlink for KUC communication&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 31a839a7e846b7e53c0e846452435fffe83a0585&lt;/p&gt;</comment>
                            <comment id="241977" author="gerrit" created="Thu, 14 Feb 2019 16:35:05 +0000"  >&lt;p&gt;James Simmons (uja.ornl@yahoo.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/34258&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34258&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7659&quot; title=&quot;Replace KUC by more standard mechanisms&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7659&quot;&gt;LU-7659&lt;/a&gt; hsm: Use netlink for KUC communication&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: e3e1c2f7fa81dd149308b41c74ad190e32c858ae&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="67632">LU-15373</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="52043">LU-10968</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="56268">LU-12506</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="53930">LU-11626</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="46759">LU-9680</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="48818">LU-10141</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10490" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>End date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Mon, 30 May 2016 12:00:05 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                            <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzxy27:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                <customfield id="customfield_10493" key="com.atlassian.jira.plugin.system.customfieldtypes:datepicker">
                        <customfieldname>Start date</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>Wed, 13 Jan 2016 12:00:05 +0000</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                    </customfields>
    </item>
</channel>
</rss>