<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:15:40 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-1328] Failing customer&apos;s file creation test</title>
                <link>https://jira.whamcloud.com/browse/LU-1328</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Customer running a script that calls a Java program (see attachments).  Two clients panic&apos;d.&lt;/p&gt;

&lt;p&gt;2012-04-13 19:21:49 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26759.401564&amp;#93;&lt;/span&gt; -----------&lt;del&gt;[ cut here ]&lt;/del&gt;-----------&lt;br/&gt;
2012-04-13 19:21:49 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26759.424756&amp;#93;&lt;/span&gt; WARNING: at fs/libfs.c:363 simple_setattr+0x99/0xb0()&lt;br/&gt;
2012-04-13 19:21:49 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26759.455860&amp;#93;&lt;/span&gt; Hardware name: ProLiant BL460c G7&lt;br/&gt;
2012-04-13 19:21:49 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26759.477542&amp;#93;&lt;/span&gt; Modules linked in: lmv mgc lustre lquota lov osc mdc fid fld ksocklnd ptlrpc obdclass lnet lvfs libcfs ppdev 8021q garp bridge stp llc nfsd nfs lockd nfs_acl auth_rpcgss sunrpc netconsole configfs dm_crypt dm_mod crc32c ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iptable_filter be2net ip_tables x_tables hpilo joydev hpwdt rtc_cmos psmouse rtc_core lp rtc_lib evdev parport mac_hid loop serio_raw tcp_scalable fuse virtio_blk virtio virtio_ring xenfs ext4 mbcache jbd2 xfs usbhid exportfs raid1 mptspi mptsas mptscsih mptbase mpt2sas raid_class arcmsr aic94xx libsas libata scsi_transport_sas aic7xxx aic79xx scsi_transport_spi megaraid_sas cciss sd_mod sg hpsa scsi_mod uhci_hcd ehci_hcd &lt;span class=&quot;error&quot;&gt;&amp;#91;last unloaded: libcfs&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-04-13 19:21:49 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26759.830142&amp;#93;&lt;/span&gt; Pid: 11122, comm: java Not tainted 2.6.38.2-ts4 #11&lt;br/&gt;
2012-04-13 19:21:49 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26759.860423&amp;#93;&lt;/span&gt; Call Trace:&lt;br/&gt;
2012-04-13 19:21:49 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26759.872545&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8105d10f&amp;gt;&amp;#93;&lt;/span&gt; ? warn_slowpath_common+0x7f/0xc0&lt;br/&gt;
2012-04-13 19:21:49 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26759.903121&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8105d16a&amp;gt;&amp;#93;&lt;/span&gt; ? warn_slowpath_null+0x1a/0x20&lt;br/&gt;
2012-04-13 19:21:49 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26759.933790&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8115bea9&amp;gt;&amp;#93;&lt;/span&gt; ? simple_setattr+0x99/0xb0&lt;br/&gt;
2012-04-13 19:21:50 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26759.962234&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8115be10&amp;gt;&amp;#93;&lt;/span&gt; ? simple_setattr+0x0/0xb0&lt;br/&gt;
2012-04-13 19:21:50 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26759.990161&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa1200af6&amp;gt;&amp;#93;&lt;/span&gt; ? ll_md_setattr+0x3e6/0x840 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-04-13 19:21:50 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26760.024040&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa12011b4&amp;gt;&amp;#93;&lt;/span&gt; ? ll_setattr_raw+0x264/0xe40 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-04-13 19:21:50 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26760.056779&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0ba3126&amp;gt;&amp;#93;&lt;/span&gt; ? cfs_hash_del+0xa6/0x1d0 &lt;span class=&quot;error&quot;&gt;&amp;#91;libcfs&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-04-13 19:21:50 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26760.089867&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa1201ded&amp;gt;&amp;#93;&lt;/span&gt; ? ll_setattr+0x5d/0x100 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-04-13 19:21:50 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26760.120906&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81153761&amp;gt;&amp;#93;&lt;/span&gt; ? notify_change+0x161/0x2c0&lt;br/&gt;
2012-04-13 19:21:50 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26760.150011&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff811383d1&amp;gt;&amp;#93;&lt;/span&gt; ? do_truncate+0x61/0x90&lt;br/&gt;
2012-04-13 19:21:50 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26760.176623&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8113862d&amp;gt;&amp;#93;&lt;/span&gt; ? sys_ftruncate+0xdd/0xf0&lt;br/&gt;
2012-04-13 19:21:50 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26760.205178&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8100bfc2&amp;gt;&amp;#93;&lt;/span&gt; ? system_call_fastpath+0x16/0x1b&lt;br/&gt;
2012-04-13 19:21:50 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;26760.235981&amp;#93;&lt;/span&gt; --&lt;del&gt;[ end trace 402d40ca74c5ea86 ]&lt;/del&gt;--&lt;/p&gt;

&lt;p&gt;On one OSS, about an hour before the client crash, I saw this:&lt;/p&gt;

&lt;p&gt;Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: Lustre: Service thread pid 16292 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: Pid: 16292, comm: ll_ost_io_254&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel:&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: Call Trace:&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa026f1a9&amp;gt;&amp;#93;&lt;/span&gt; ? LNetGet+0x3e9/0x830 &lt;span class=&quot;error&quot;&gt;&amp;#91;lnet&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa02699c5&amp;gt;&amp;#93;&lt;/span&gt; ? LNetMDBind+0x135/0x490 &lt;span class=&quot;error&quot;&gt;&amp;#91;lnet&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8104fdef&amp;gt;&amp;#93;&lt;/span&gt; ? lock_timer_base+0x2b/0x4f&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8104febd&amp;gt;&amp;#93;&lt;/span&gt; ? try_to_del_timer_sync+0xaa/0xb7&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff814aa778&amp;gt;&amp;#93;&lt;/span&gt; schedule_timeout+0x1c6/0x1ee&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff810500ce&amp;gt;&amp;#93;&lt;/span&gt; ? process_timeout+0x0/0x10&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0221351&amp;gt;&amp;#93;&lt;/span&gt; cfs_waitq_timedwait+0x11/0x20 &lt;span class=&quot;error&quot;&gt;&amp;#91;libcfs&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03b417f&amp;gt;&amp;#93;&lt;/span&gt; target_bulk_io+0x3af/0xa40 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0221910&amp;gt;&amp;#93;&lt;/span&gt; ? cfs_alloc+0x30/0x60 &lt;span class=&quot;error&quot;&gt;&amp;#91;libcfs&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81038bc6&amp;gt;&amp;#93;&lt;/span&gt; ? default_wake_function+0x0/0x14&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa038752f&amp;gt;&amp;#93;&lt;/span&gt; ost_brw_write+0x12af/0x2040 &lt;span class=&quot;error&quot;&gt;&amp;#91;ost&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03ef7cc&amp;gt;&amp;#93;&lt;/span&gt; ? lustre_msg_set_timeout+0x9c/0x110 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff810c6edb&amp;gt;&amp;#93;&lt;/span&gt; ? free_hot_page+0x3f/0x44&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff810c7081&amp;gt;&amp;#93;&lt;/span&gt; ? __free_pages+0x5a/0x70&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03b19c0&amp;gt;&amp;#93;&lt;/span&gt; ? target_bulk_timeout+0x0/0x100 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa038e104&amp;gt;&amp;#93;&lt;/span&gt; ost_handle+0x2604/0x57e0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ost&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03ea29e&amp;gt;&amp;#93;&lt;/span&gt; ? lustre_msg_get_opc+0x8e/0xf0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03f735c&amp;gt;&amp;#93;&lt;/span&gt; ptlrpc_server_handle_request+0x4ec/0xfc0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81037e95&amp;gt;&amp;#93;&lt;/span&gt; ? enqueue_task+0x7c/0x8b&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa02212ae&amp;gt;&amp;#93;&lt;/span&gt; ? cfs_timer_arm+0xe/0x10 &lt;span class=&quot;error&quot;&gt;&amp;#91;libcfs&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa022deb0&amp;gt;&amp;#93;&lt;/span&gt; ? lc_watchdog_touch+0x70/0x150 &lt;span class=&quot;error&quot;&gt;&amp;#91;libcfs&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa022dfd7&amp;gt;&amp;#93;&lt;/span&gt; ? lc_watchdog_disable+0x47/0x120 &lt;span class=&quot;error&quot;&gt;&amp;#91;libcfs&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03fa714&amp;gt;&amp;#93;&lt;/span&gt; ? ptlrpc_wait_event+0xa4/0x2d0 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81030fda&amp;gt;&amp;#93;&lt;/span&gt; ? __wake_up+0x48/0x55&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03fae7f&amp;gt;&amp;#93;&lt;/span&gt; ptlrpc_main+0x53f/0x1670 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81003ada&amp;gt;&amp;#93;&lt;/span&gt; child_rip+0xa/0x20&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa03fa940&amp;gt;&amp;#93;&lt;/span&gt; ? ptlrpc_main+0x0/0x1670 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
Apr 13 18:00:53 ts-xxxxxxxx-04 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81003ad0&amp;gt;&amp;#93;&lt;/span&gt; ? child_rip+0x0/0x20&lt;/p&gt;

&lt;p&gt;I don&apos;t know if it is related, or not.&lt;/p&gt;

&lt;p&gt;Attached files:&lt;/p&gt;

&lt;p&gt;Reproduce.java:  Java program used when reproducing the problem&lt;br/&gt;
reproduce.sh:    Shell script that calls Reproduce.Java in a loop&lt;br/&gt;
messages-mds:    /var/log/messages from the MDS&lt;br/&gt;
messages-oss-1   from the first OSS&lt;br/&gt;
messages-oss-2   from the second OSS&lt;br/&gt;
users*:          Client output via netconsole.&lt;/p&gt;


</description>
                <environment>Lustre servers are running 2.6.32-220.el6, with Lustre 2.1.1.rc4.&lt;br/&gt;
Lustre clients are running 2.6.38.2, with special code created for this release, with &lt;a href=&quot;http://review.whamcloud.com/#change,2170&quot;&gt;http://review.whamcloud.com/#change,2170&lt;/a&gt;. (patch 8)</environment>
        <key id="14029">LU-1328</key>
            <summary>Failing customer&apos;s file creation test</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="laisiyao">Lai Siyao</assignee>
                                    <reporter username="rspellman">Roger Spellman</reporter>
                        <labels>
                    </labels>
                <created>Mon, 16 Apr 2012 17:03:59 +0000</created>
                <updated>Thu, 27 Sep 2012 17:49:58 +0000</updated>
                            <resolved>Sun, 1 Jul 2012 22:58:07 +0000</resolved>
                                    <version>Lustre 2.1.1</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="34850" author="pjones" created="Mon, 16 Apr 2012 19:52:36 +0000"  >&lt;p&gt;Lai&lt;/p&gt;

&lt;p&gt;Could you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="34856" author="laisiyao" created="Mon, 16 Apr 2012 21:11:47 +0000"  >&lt;p&gt;Roger, the first backtrace is only a warning; the panic information is lost (refer to usrs393.netconsole). Is it possible to set up kdump and get crash data, or to connect a serial console and capture the panic info? I&apos;ll try to reproduce this in my setup too.&lt;/p&gt;

&lt;p&gt;The second backtrace shows a slow thread on the OST, which should not be related to the panic.&lt;/p&gt;</comment>
                            <comment id="34945" author="rspellman" created="Tue, 17 Apr 2012 13:28:03 +0000"  >&lt;p&gt;/var/log/messages from system with the problem.&lt;/p&gt;</comment>
                            <comment id="34946" author="rspellman" created="Tue, 17 Apr 2012 13:28:21 +0000"  >&lt;p&gt;From the customer:&lt;/p&gt;

&lt;p&gt;Attached is the appropriate portion of /var/log/messages from that date - it had more info than the netconsole because apparently the problem was in the softirq and the network driver couldn&apos;t send anymore. Unfortunately the serial connections very often don&apos;t have anything on them for whatever reason and we haven&apos;t been able to get kdump to work in our environment. I currently have another client that is locked up on probably the same issue.&lt;/p&gt;</comment>
                            <comment id="35005" author="laisiyao" created="Wed, 18 Apr 2012 05:49:07 +0000"  >&lt;p&gt;Yes, the log shows some network I/O problems, but it&apos;s still only a warning. Is it possible to get the crash backtrace?&lt;/p&gt;</comment>
                            <comment id="35479" author="pcpiela" created="Thu, 26 Apr 2012 09:09:50 +0000"  >&lt;p&gt;The customer is unable to provide a crash backtrace. Are you able to reproduce the problem in-house with the provided test application?&lt;/p&gt;</comment>
                            <comment id="35495" author="laisiyao" created="Thu, 26 Apr 2012 11:01:59 +0000"  >&lt;p&gt;Not yet, I&apos;ll keep it running.&lt;/p&gt;</comment>
                            <comment id="35943" author="laisiyao" created="Wed, 2 May 2012 06:06:30 +0000"  >&lt;p&gt;This has been running for several days, but the problem has not reproduced in my test environment.&lt;/p&gt;

&lt;p&gt;I used `while true; do sh reproduce.sh || break; done`, and I saw a few mkdir failures (they appear to fail with -EEXIST, which may be okay); everything else looks fine.&lt;/p&gt;</comment>
                            <comment id="38224" author="rspellman" created="Mon, 7 May 2012 10:10:14 +0000"  >&lt;p&gt;Customer reports:&lt;/p&gt;

&lt;p&gt;I was able to verify I still see this problem this morning, additionally I think I have distilled it down to an even smaller test case with a single client.&lt;/p&gt;

&lt;p&gt;I have a directory (generated from the test case I sent you) that has 34853 files in it. If I execute the following:&lt;/p&gt;

&lt;p&gt;$ lfs find &amp;lt;dir&amp;gt; -type f | sh -c &apos;while read line; do stat ${line} &amp;lt; /dev/null &amp;gt; /dev/null &amp;amp; done&apos;&lt;/p&gt;

&lt;p&gt;This will attempt to stat the files in parallel and I see a bunch of errors that it can&apos;t find the files. However if I stat the files sequentially, everything is ok:&lt;/p&gt;

&lt;p&gt;$ lfs find &amp;lt;dir&amp;gt; -type f | sh -c &apos;while read line; do stat ${line} &amp;lt; /dev/null &amp;gt; /dev/null; done&apos;&lt;/p&gt;
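&lt;p&gt;As an illustration only (not the customer&apos;s script), the same parallel-versus-sequential comparison can be sketched in Python. The helper names and the worker count of 24 are assumptions made for this example. It also uses threads inside one process rather than one backgrounded stat process per file, so it may not stress the client in exactly the same way.&lt;/p&gt;

```python
import os
import sys
from concurrent.futures import ThreadPoolExecutor


def regular_files(root):
    """Yield the path of every non-directory entry under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def stat_failures(paths, workers):
    """Stat each path on a pool of `workers` threads.

    Returns a list of (path, errno) pairs for the stat calls that failed.
    """
    def try_stat(path):
        try:
            os.stat(path)
        except OSError as exc:
            return (path, exc.errno)
        return None

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(try_stat, paths))
    return [r for r in results if r is not None]


if __name__ == "__main__" and len(sys.argv) == 2:
    paths = list(regular_files(sys.argv[1]))
    # workers=1 approximates the sequential loop; 24 is a hypothetical
    # stand-in for the loop that backgrounds every stat.
    for workers in (1, 24):
        failed = stat_failures(paths, workers)
        print(workers, "worker(s):", len(failed), "of", len(paths), "stat calls failed")
```

&lt;p&gt;Run it with the test directory as the only argument. On a healthy file system both passes should report zero failures, so any failures that appear only in the parallel pass would match the behaviour described here.&lt;/p&gt;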


&lt;p&gt;Lai, please see if you can reproduce using this method.&lt;/p&gt;

&lt;p&gt;Regarding the EEXIST, customer responded:&lt;br/&gt;
As far as I know we don&apos;t see EEXIST. Every time I&apos;ve seen a problem it has been a FileNotFoundException caused by ENOENT. I ran the test case I gave you this morning using &quot;strace -f&quot; and here is a snippet of one of the logs that failed:&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; open(&quot;/mnt/lustre /filetest/usrs398/20120507-092516/d5/7/75.bin&quot;, O_WRONLY|O_CREAT|O_TRUNC, 0666)  = 79&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; fstat(79, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; fcntl(79, F_GETFD)          = 0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; fcntl(79, F_SETFD, FD_CLOEXEC) = 0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; write(79, &quot;K\304\215f\361R\351\226\340Hh\32i\303\233\243t\20\301\34\313A\243\254\224\n\262\211^\202\215\20&quot;..., 1024) = 1024&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; open(&quot;/mnt/lustre/ filetest/usrs398/20120507-092516/d5/7/76.bin&quot;, O_WRONLY|O_CREAT|O_TRUNC, 0666) = 80&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; fstat(80, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; fcntl(80, F_GETFD)          = 0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; fcntl(80, F_SETFD, FD_CLOEXEC) = 0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; write(80, &quot;K\304\215f\361R\351\226\340Hh\32i\303\233\243t\20\301\34\313A\243\254\224\n\262\211^\202\215\20&quot;..., 1024) = 1024&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; open(&quot;/mnt/lustre/ filetest/usrs398/20120507-092516/d5/7/77.bin&quot;, O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ENOENT (No such file or directory)&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; lseek(3, 49747007, SEEK_SET) = 49747007&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; read(3, &quot;PK\3\4\n\0\0\0\0\0I\nj?0\341^$j\2\0\0j\2\0\0#\0\0\0&quot;, 30) = 30&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; lseek(3, 49747072, SEEK_SET) = 49747072&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; read(3, &quot;\312\376\272\276\0\0\0001\0\&quot;\10\0\4\10\0\5\10\0\n\1\0\0\1\0\2 (\1\0\24()&quot;..., 618) = 618&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; write(2, &quot;Exception in thread \&quot;main\&quot; &quot;, 27Exception in thread &quot;main&quot; ) = 27&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; write(2, &quot;java.io.FileNotFoundException: /&quot;..., 127java.io.FileNotFoundException: /mnt/lustre/ filetest/usrs398/20120507-092516/d5/7/77.bin (No such file or directory)) = 127&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; write(2, &quot;\n&quot;, 1)           = 1&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; write(2, &quot;\tat java.io.FileOutputStream.ope&quot;..., 48	at java.io.FileOutputStream.open(Native Method)) = 48&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; write(2, &quot;\n&quot;, 1)           = 1&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; write(2, &quot;\tat java.io.FileOutputStream.&amp;lt;in&quot;..., 62	at java.io.FileOutputStream.&amp;lt;init&amp;gt;(FileOutputStream.java:194)) = 62&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; write(2, &quot;\n&quot;, 1)           = 1&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; write(2, &quot;\tat java.io.FileOutputStream.&amp;lt;in&quot;..., 62	at java.io.FileOutputStream.&amp;lt;init&amp;gt;(FileOutputStream.java:145)) = 62&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; write(2, &quot;\n&quot;, 1)           = 1&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; write(2, &quot;\tat Reproduce.main(Reproduce.jav&quot;..., 37	at Reproduce.main(Reproduce.java:31)) = 37&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;pid 22808&amp;#93;&lt;/span&gt; write(2, &quot;\n&quot;, 1)           = 1&lt;/p&gt;

&lt;p&gt;You can see that 2 opens succeeded with writes and the third failed. They were all writing to the same directory so that means the directory existed, just for some reason the third open returned ENOENT instead of succeeding. open(2) says:&lt;/p&gt;

&lt;p&gt;       ENOENT O_CREAT  is  not set and the named file does not exist.  Or, a directory component in pathname does not exist or is a dangling symbolic link.&lt;/p&gt;
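&lt;p&gt;The rule quoted from open(2) can be checked on any local POSIX file system with a small Python sketch. The helper name and the file names are hypothetical; the flags and mode are the ones shown in the strace output.&lt;/p&gt;

```python
import errno
import os
import tempfile

# Same flags as the failing call in the strace output.
FLAGS = os.O_WRONLY | os.O_CREAT | os.O_TRUNC


def open_errno(path):
    """Open path with O_WRONLY|O_CREAT|O_TRUNC and mode 0666.

    Returns 0 on success, otherwise the errno of the failure.
    """
    try:
        fd = os.open(path, FLAGS, 0o666)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    return 0


with tempfile.TemporaryDirectory() as parent:
    # The parent directory exists, so O_CREAT should create the file.
    existing_parent = open_errno(os.path.join(parent, "77.bin"))
    # A missing directory component is the documented ENOENT case.
    missing_parent = open_errno(os.path.join(parent, "no-such-dir", "77.bin"))
    print(existing_parent == 0, missing_parent == errno.ENOENT)
```

&lt;p&gt;With O_CREAT set, ENOENT is expected only in the second case, where a directory component is missing.&lt;/p&gt;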

&lt;p&gt;Since O_CREAT is given and all directory components exist since previous writes succeeded, there must be a bug in Lustre somewhere. What about the simple script where I tried to stat the files in parallel? Did you pass that on to Whamcloud, since (on the surface) it appears to be a similar problem?&lt;/p&gt;</comment>
                            <comment id="38398" author="laisiyao" created="Wed, 9 May 2012 11:06:40 +0000"  >&lt;p&gt;I can&apos;t reproduce with the &apos;stat&apos; script yet; maybe my test machine is not powerful enough, and there isn&apos;t a machine installed with fc15 in testlab.&lt;/p&gt;

&lt;p&gt;With a small tweak of your script, it can dump debug log upon error:&lt;/p&gt;

&lt;p&gt;lfs find &amp;lt;dir&amp;gt; -type f | sh -c &apos;while read line; do stat ${line} &amp;lt; /dev/null &amp;gt; /dev/null || lctl dk /tmp/`basename $line`.log &amp;amp; done&apos;&lt;/p&gt;
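&lt;p&gt;The per-file logic of that one-liner (stat the file and, only if that fails, dump the debug log to a file named after it) could look roughly like this in Python. This is a sketch, not a tested replacement: the function name and the injectable `run` parameter are assumptions added so the logic can be exercised on a machine without lctl.&lt;/p&gt;

```python
import os
import subprocess


def stat_or_dump(path, log_dir="/tmp", run=subprocess.run):
    """Stat `path`; if that fails, dump the Lustre debug log.

    Returns the log file name when a dump was requested, otherwise None.
    `run` defaults to subprocess.run and can be swapped for a stub.
    """
    try:
        os.stat(path)
    except OSError:
        log_file = os.path.join(log_dir, os.path.basename(path) + ".log")
        # Same command as in the shell version: lctl dk FILE
        run(["lctl", "dk", log_file], check=False)
        return log_file
    return None
```

&lt;p&gt;As in the shell version, the dump is taken right after the failing stat, while the relevant debug traces are still recent.&lt;/p&gt;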

&lt;p&gt;Could you test this script on your environment?&lt;/p&gt;</comment>
                            <comment id="38733" author="laisiyao" created="Mon, 14 May 2012 11:54:29 +0000"  >&lt;p&gt;I&apos;ve set up 3 nodes in the testlab: 1 server installed with el6 and 2 clients with kernel 2.6.38, both clients running stat and reproduce.sh in a loop. The test has run for half a day without reproducing the problem. I&apos;ll keep them running, and will add another client tomorrow if it still hasn&apos;t reproduced.&lt;/p&gt;</comment>
                            <comment id="38794" author="laisiyao" created="Mon, 14 May 2012 22:54:35 +0000"  >&lt;p&gt;I couldn&apos;t reproduce it overnight. There are now 300k files in total in the test system; could you tell me how many files were in your system when the error appeared? Could you also upload the messages files of both the client and the MDS from the time of the failure?&lt;/p&gt;</comment>
                            <comment id="38830" author="pcpiela" created="Tue, 15 May 2012 12:58:19 +0000"  >&lt;p&gt;Results from requested script change&lt;/p&gt;

&lt;p&gt;lfs find &amp;lt;dir&amp;gt; -type f | sh -c &apos;while read line; do stat ${line} &amp;lt; /dev/null &amp;gt; /dev/null || lctl dk /tmp/`basename $line`.log &amp;amp; done&apos;&lt;/p&gt;</comment>
                            <comment id="38899" author="laisiyao" created="Wed, 16 May 2012 04:20:37 +0000"  >&lt;p&gt;This looks to be an MDS issue with the OPEN lock. I have a patch at &lt;a href=&quot;http://review.whamcloud.com/#change,2800&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,2800&lt;/a&gt;; could you compile it against master, update the MDS and test again?&lt;/p&gt;</comment>
                            <comment id="38988" author="rspellman" created="Thu, 17 May 2012 10:51:10 +0000"  >&lt;p&gt;Customer asks:&lt;br/&gt;
&amp;gt; What version of the kernel is Whamcloud testing with? Are they using 2.6.38.2 or a newer version of 2.6.38?&lt;/p&gt;</comment>
                            <comment id="38993" author="laisiyao" created="Thu, 17 May 2012 11:04:09 +0000"  >&lt;p&gt;I&apos;m testing with 2.6.38.6-26.&lt;/p&gt;</comment>
                            <comment id="38999" author="rspellman" created="Thu, 17 May 2012 12:08:06 +0000"  >&lt;p&gt;Please verify that the patch only touches one file.&lt;/p&gt;</comment>
                            <comment id="39000" author="rspellman" created="Thu, 17 May 2012 12:10:16 +0000"  >&lt;p&gt;Please verify that this only affects the Lustre server, not the client.&lt;/p&gt;</comment>
                            <comment id="39045" author="laisiyao" created="Thu, 17 May 2012 20:21:53 +0000"  >&lt;p&gt;Yes, the patch affects Lustre MDS server only, and only touches one file.&lt;/p&gt;</comment>
                            <comment id="39133" author="rspellman" created="Mon, 21 May 2012 12:04:37 +0000"  >&lt;p&gt;We applied the patch.  The customer reports the following:&lt;/p&gt;

&lt;p&gt;$ lfs find . -type f | sh -c &apos;while read file; do stat $file &amp;amp; done &amp;gt;/dev/null&apos;&lt;br/&gt;
stat: cannot stat `./20120507-091222/d3/3/270.bin&apos;: No such file or directory&lt;br/&gt;
stat: cannot stat `./20120507-091222/d5/32/681.bin&apos;: No such file or directory&lt;br/&gt;
stat: cannot stat `./20120507-091222/d5/32/1525.bin&apos;: No such file or directory&lt;br/&gt;
stat: cannot stat `./20120507-091222/d12/0/284.bin&apos;: No such file or directory&lt;br/&gt;
stat: cannot stat `./20120507-091222/d12/0/1037.bin&apos;: No such file or directory&lt;br/&gt;
stat: cannot stat `./20120507-091222/d12/0/917.bin&apos;: No such file or directory&lt;br/&gt;
stat: cannot stat `./20120507-091222/d11/32/1807.bin&apos;: No such file or directory&lt;br/&gt;
stat: cannot stat `./20120507-091222/d10/6/2587.bin&apos;: No such file or directory&lt;br/&gt;
stat: cannot stat `./20120507-091222/d10/6/891.bin&apos;: No such file or directory&lt;br/&gt;
stat: cannot stat `./20120507-091222/d10/6/750.bin&apos;: No such file or directory&lt;br/&gt;
stat: cannot stat `./20120507-091222/d7/34/151.bin&apos;: No such file or directory&lt;/p&gt;

&lt;p&gt;Here are some additional details:&lt;/p&gt;

&lt;p&gt;  This is a single client with 24 cores running the test.&lt;/p&gt;

&lt;p&gt;  &quot;ls -l&quot; on each file succeeds. If I perform the stat sequentially it succeeds; it only fails when I launch the stat processes in the background, thus making them parallel.&lt;/p&gt;

&lt;p&gt;  Each time I run the test, I get different files that can&apos;t be found.&lt;/p&gt;</comment>
                            <comment id="39166" author="laisiyao" created="Mon, 21 May 2012 22:39:12 +0000"  >&lt;p&gt;Is there any error message on MDS server?&lt;/p&gt;</comment>
                            <comment id="39174" author="laisiyao" created="Tue, 22 May 2012 03:28:51 +0000"  >&lt;p&gt;I have no way to debug without debug logs and messages. It would be really useful to dump the debug log upon error on both the client and the MDS, and it&apos;s best to enable the &apos;info&apos; and &apos;vfstrace&apos; debug traces. You can check with `lctl get_param debug`; if the output doesn&apos;t contain &apos;info&apos; or &apos;vfstrace&apos;, you can enable them with `lctl set_param debug=+&quot;info vfstrace&quot;`.&lt;/p&gt;</comment>
                            <comment id="39191" author="rspellman" created="Tue, 22 May 2012 10:29:20 +0000"  >&lt;p&gt;This is the messages file from the MDS.&lt;/p&gt;</comment>
                            <comment id="39192" author="rspellman" created="Tue, 22 May 2012 10:30:27 +0000"  >&lt;p&gt;&amp;gt; I have no way to debug without debug logs and messages. It would be really useful to dump the debug log upon error on both the client and the MDS, and it&apos;s best to enable the &apos;info&apos; and &apos;vfstrace&apos; debug traces. You can check with `lctl get_param debug`; if the output doesn&apos;t contain &apos;info&apos; or &apos;vfstrace&apos;, you can enable them with `lctl set_param debug=+&quot;info vfstrace&quot;`.&lt;/p&gt;

&lt;p&gt;I will ask the customer to enable these.  Are these enabled on both the client and servers?&lt;/p&gt;</comment>
                            <comment id="39198" author="rspellman" created="Tue, 22 May 2012 11:17:16 +0000"  >&lt;p&gt;What command is run to view the debug data?&lt;/p&gt;</comment>
                            <comment id="39199" author="laisiyao" created="Tue, 22 May 2012 11:46:35 +0000"  >&lt;p&gt;Yes, it&apos;s better to enable debug trace on both client and server. Upon error, run `lctl dk /tmp/create.log`; the log file will then contain the recent debug traces. This is why the command should be run immediately after the error occurs, otherwise the old debug trace is dropped.&lt;/p&gt;</comment>
                            <comment id="39233" author="rspellman" created="Tue, 22 May 2012 15:51:28 +0000"  >&lt;p&gt;This file contains client.out and mds.out with the debug info you requested.&lt;/p&gt;</comment>
                            <comment id="39262" author="laisiyao" created="Tue, 22 May 2012 23:02:25 +0000"  >&lt;p&gt;When was the test conducted?&lt;/p&gt;

&lt;p&gt;I saw two suspicious error messages in messages.mds, but they are from May 21 23:00:01:&lt;br/&gt;
May 21 23:00:01 ts-xxxxxxxx-01 kernel: LustreError: 6386:0:(mdt_open.c:1472:mdt_reint_open()) open &lt;span class=&quot;error&quot;&gt;&amp;#91;0x200000400:0xa:0x0&amp;#93;&lt;/span&gt;/(logdates-&amp;gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;0x2003acec9:0x1:0x0&amp;#93;&lt;/span&gt;) cr_flag=01102 mode=0042775 msg_flag=0x0 failed: -21&lt;br/&gt;
May 21 23:00:01 ts-xxxxxxxx-01 kernel: LustreError: 6386:0:(mdt_open.c:1472:mdt_reint_open()) open &lt;span class=&quot;error&quot;&gt;&amp;#91;0x200000400:0xa:0x0&amp;#93;&lt;/span&gt;/(logdates-&amp;gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;0x2003acec9:0x2:0x0&amp;#93;&lt;/span&gt;) cr_flag=01102 mode=0042775 msg_flag=0x0 failed: -21&lt;/p&gt;
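
&lt;p&gt;For reading the two log lines above: the bracketed values are Lustre FIDs in [sequence:object id:version] form, and the trailing &quot;failed: -21&quot; is a negated errno. A minimal Python sketch, assuming Linux errno numbering; the helper names parse_fid and errno_name are hypothetical and not part of any Lustre tool:&lt;/p&gt;

```python
import errno
import re

# Matches a FID such as [0x2003acec9:0x1:0x0]: three hex fields in brackets.
FID_RE = re.compile(r"\[(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+)\]")


def parse_fid(text):
    """Return (sequence, object_id, version) as integers for one FID string."""
    match = FID_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError("not a FID: %r" % text)
    seq, oid, ver = (int(field, 16) for field in match.groups())
    return seq, oid, ver


def errno_name(return_code):
    """Map a negative kernel return code such as -21 to its errno symbol."""
    return errno.errorcode.get(abs(return_code), "UNKNOWN")


print(parse_fid("[0x2003acec9:0x1:0x0]"))
print(errno_name(-21))
```

&lt;p&gt;With these helpers, -21 decodes to EISDIR, which would fit the directory-type bit (040000) in mode=0042775.&lt;/p&gt;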

&lt;p&gt;Could you upload the script output from this test as well? You can also use `lfs fid2path &lt;span class=&quot;error&quot;&gt;&amp;#91;0x2003acec9:0x1:0x0&amp;#93;&lt;/span&gt;` and `lfs fid2path &lt;span class=&quot;error&quot;&gt;&amp;#91;0x2003acec9:0x2:0x0&amp;#93;&lt;/span&gt;` to check which two files failed to open.&lt;/p&gt;</comment>
                            <comment id="39282" author="rspellman" created="Wed, 23 May 2012 14:05:57 +0000"  >&lt;p&gt;I&apos;m moving back to this thread to continue the file issue since it had gotten merged in with the OST failure...&lt;/p&gt;

&lt;p&gt;I ran the test yesterday, so the lines from May 21 don&apos;t pertain to the test. The two fids Lai asks about (&lt;span class=&quot;error&quot;&gt;&amp;#91;0x2003acec9:0x1:0x0&amp;#93;&lt;/span&gt; &amp;amp; &lt;span class=&quot;error&quot;&gt;&amp;#91;0x2003acec9:0x2:0x0&amp;#93;&lt;/span&gt;) don&apos;t exist in the file system, but I assume they were located under the directory indicated by &lt;span class=&quot;error&quot;&gt;&amp;#91;0x200000400:0xa:0x0&amp;#93;&lt;/span&gt;. That directory is owned by one of our users who is more tolerant of random failures, so he hasn&apos;t been telling me about any issues he has had, since he knows it is &quot;use at your own risk&quot; at this time.&lt;/p&gt;

&lt;p&gt;I have uploaded a new tarball where I ran the test again. It contains the following files:&lt;/p&gt;

&lt;p&gt;client.out - debug_kernel output from the client &lt;br/&gt;
mds.out - debug_kernel output from the mds &lt;br/&gt;
fid_mapping - mapping of fids to files for the fids found in client.out&lt;br/&gt;
error_fid_mapping - mapping of paths to fids and OSTs for the files that weren&apos;t found&lt;br/&gt;
typescript - the output of &quot;script&quot;&lt;/p&gt;

&lt;p&gt;The tarball is enoent-20120523.tar.gz.&lt;/p&gt;</comment>
                            <comment id="39312" author="laisiyao" created="Thu, 24 May 2012 04:34:58 +0000"  >&lt;p&gt;The logs show that the files reported as -ENOENT never reach .lookup or .revalidate on either the client or the server side, so the -ENOENT error should happen during directory name lookup/revalidate. I&apos;m a bit suspicious of the &apos;CWD&apos; handling in Lustre: the test script calls &apos;stat&apos;, which eventually ends up in vfs_lstat(), and that starts pathname lookup from the CWD (Current Working Directory) if the filename doesn&apos;t start with &apos;/&apos;.&lt;/p&gt;

&lt;p&gt;One way to prove this is to test like this:&lt;br/&gt;
lfs find /mnt/lustre/filetest -type f | sh -c &apos;while read file; do stat $file &amp;amp; done &amp;gt;/dev/null&apos;&lt;/p&gt;

&lt;p&gt;Then every `stat` gets an absolute path and looks it up from ROOT instead of the CWD.&lt;/p&gt;

&lt;p&gt;BTW, please enable more debug trace in your test: `lctl set_param debug=+&quot;vfstrace info dlmtrace dentry inode&quot;`.&lt;/p&gt;</comment>
                            <comment id="39409" author="rspellman" created="Fri, 25 May 2012 10:30:08 +0000"  >&lt;p&gt;Customer reports:&lt;/p&gt;

&lt;p&gt;It didn&apos;t make a difference. I didn&apos;t think it would, since this seems to be an issue with parallel access: if I run the stats sequentially, it works.&lt;/p&gt;

&lt;p&gt;I have uploaded enoent-20120524.tar.gz:&lt;/p&gt;

&lt;p&gt;&quot;client.out&quot; and &quot;mds.out&quot; contain the debug output.&lt;br/&gt;
&quot;typescript&quot; contains the output from the run.&lt;br/&gt;
&quot;fids&quot; contains the file names and fids for the files that weren&apos;t found.&lt;/p&gt;</comment>
                            <comment id="39423" author="rspellman" created="Fri, 25 May 2012 16:56:58 +0000"  >&lt;p&gt;Customer reports:&lt;/p&gt;

&lt;p&gt;If I limit the number of CPUs it can run on, it reduces the number of failures to the point where running it on 1 cpu makes it effectively sequential:&lt;/p&gt;

&lt;p&gt;$ lfs find . -type f | taskset -c 0 sh -c &apos;while read file; do stat $file &amp;amp; done &amp;gt; /dev/null&apos;&lt;/p&gt;

&lt;p&gt;This command eventually can&apos;t fork any more processes (I reach that limit) but there are no ENOENT errors.&lt;/p&gt;

&lt;p&gt;I see the same scenario with &quot;taskset -c 0,1&quot;, and I see my first failure with &quot;taskset -c 0-2&quot;, but it takes some time. However using &quot;taskset -c 0,2&quot; after some time I do see the ENOENT error. CPU 0 and CPU 1 are in different sockets and CPU 0 and CPU 2 are in the same socket. I do not see the error with 0 &amp;amp; 12 which are the hyperthreaded pairs.&lt;/p&gt;
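
&lt;p&gt;The parallel `stat` loop above can also be approximated without forking one process per file. A minimal Python sketch, assuming a thread pool is an acceptable stand-in for the backgrounded `stat` processes (it may not exercise the same race); stat_one, parallel_stat and the workers parameter are hypothetical names:&lt;/p&gt;

```python
import os
from concurrent.futures import ThreadPoolExecutor


def stat_one(path):
    """Return the path if lstat() reports ENOENT, otherwise None."""
    try:
        os.lstat(path)
    except FileNotFoundError:
        return path
    return None


def parallel_stat(paths, workers=8):
    """lstat() every path concurrently and return those reported as missing."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(stat_one, paths))
    return [path for path in results if path is not None]


# Example call, with a path list gathered beforehand (for instance from
# the output of `lfs find . -type f`):
#     missing = parallel_stat(paths, workers=16)
```

&lt;p&gt;Launching the interpreter under `taskset -c 0,2`, as in the command above, should confine its threads to the same CPUs.&lt;/p&gt;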

&lt;p&gt;I can run more tests next week if you  want me to.&lt;/p&gt;</comment>
                            <comment id="39442" author="laisiyao" created="Sun, 27 May 2012 20:37:06 +0000"  >&lt;p&gt;Hmm, it looks to be a bug in ll_splice_alias():&lt;br/&gt;
Once ll_splice_alias() finds a reusable dentry for the current inode, it tries to reuse it and dput()s the newly allocated dentry. Before that, however, d_rehash() is called to hash the newly allocated dentry into the dcache with dentry-&amp;gt;d_inode set to NULL. This originates from a kernel bug (d_rehash() used to be called by d_splice_alias() too). Removing this call was already planned in another patch, but we only thought the line was redundant and didn&apos;t foresee that it would cause an error.&lt;/p&gt;

&lt;p&gt;I&apos;ll update FC15 patchless client patch later to reflect this change.&lt;/p&gt;</comment>
                            <comment id="39445" author="laisiyao" created="Sun, 27 May 2012 22:00:13 +0000"  >&lt;p&gt;Patch is updated: &lt;a href=&quot;http://review.whamcloud.com/#change,2170&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,2170&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Please apply it against the latest master branch. This change is for the client only.&lt;/p&gt;</comment>
                            <comment id="39562" author="laisiyao" created="Wed, 30 May 2012 04:37:02 +0000"  >&lt;p&gt;Due to a recent commit, please use this patch instead: &lt;a href=&quot;http://review.whamcloud.com/#change,1865&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,1865&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="39644" author="rspellman" created="Wed, 30 May 2012 16:23:07 +0000"  >&lt;p&gt;Is this change just for the clients, or the servers, or both?&lt;br/&gt;
Do I apply both the patches from 2170 and 1865, or just 1865?&lt;/p&gt;</comment>
                            <comment id="39684" author="laisiyao" created="Wed, 30 May 2012 20:22:51 +0000"  >&lt;p&gt;It&apos;s for client only. 2170 has been abandoned now, 1865 is the only remaining patch for FC15 patchless client support code.&lt;/p&gt;</comment>
                            <comment id="39691" author="rspellman" created="Wed, 30 May 2012 21:56:22 +0000"  >&lt;p&gt;The modules will not load correctly.  I get the following errors:&lt;/p&gt;

&lt;p&gt;Lustre: Lustre: Build Version: 2.2.54-g2fcae7b-CHANGED-2.6.32-220.el6.x86_64&lt;br/&gt;
ptlrpc: Unknown symbol lut_boot_epoch_update&lt;br/&gt;
ptlrpc: Unknown symbol lut_mod_exit&lt;br/&gt;
ptlrpc: Unknown symbol lut_mod_init&lt;/p&gt;

&lt;p&gt;grepping through the code, I find:&lt;/p&gt;

&lt;p&gt;lustre-release-with-patch-9/lustre/include/lu_target.h:void lut_boot_epoch_update(struct lu_target *lut);&lt;br/&gt;
lustre-release-with-patch-9/lustre/ldlm/ldlm_lib.c:        lut_boot_epoch_update(lut);&lt;br/&gt;
lustre-release-with-patch-9/lustre/ldlm/ldlm_lib.c:                lut_boot_epoch_update(lut);&lt;br/&gt;
lustre-release-with-patch-9/lustre/ptlrpc/target.c:void lut_boot_epoch_update(struct lu_target *lut)&lt;br/&gt;
lustre-release-with-patch-9/lustre/ptlrpc/target.c:EXPORT_SYMBOL(lut_boot_epoch_update);&lt;/p&gt;

&lt;p&gt;So, that function is called from ldlm_lib.c.&lt;/p&gt;

&lt;p&gt;BUT ... In the ptlrpc/Makefile, target.o is not included, i.e.&lt;/p&gt;

&lt;p&gt;#ptlrpc_objs += target.o&lt;/p&gt;
</comment>
                            <comment id="39696" author="laisiyao" created="Wed, 30 May 2012 22:59:03 +0000"  >&lt;p&gt;You need to update to the latest master branch, then apply the latest version of change 1865 (it&apos;s patch 21 now).&lt;/p&gt;</comment>
                            <comment id="39735" author="rspellman" created="Thu, 31 May 2012 12:25:56 +0000"  >&lt;p&gt;The problem still persists, i.e.&lt;/p&gt;

&lt;p&gt;May 31 12:05:21 compute-01-32 kernel: Lustre: Lustre: Build Version: 2.2.54-g9567e22-CHANGED-2.6.32-220.el6.x86_64&lt;br/&gt;
May 31 12:05:21 compute-01-32 kernel: ptlrpc: Unknown symbol lut_boot_epoch_update&lt;br/&gt;
May 31 12:05:21 compute-01-32 kernel: ptlrpc: Unknown symbol lut_mod_exit&lt;br/&gt;
May 31 12:05:21 compute-01-32 kernel: ptlrpc: Unknown symbol lut_mod_init&lt;/p&gt;

&lt;p&gt;I think I know the cause of the problem.&lt;/p&gt;

&lt;p&gt;In the file lustre/ptlrpc/Makefile.in is the line&lt;/p&gt;

&lt;p&gt;@SERVER_TRUE@ptlrpc_objs += target.o&lt;/p&gt;

&lt;p&gt;After running:  &lt;/p&gt;

&lt;p&gt;./configure --with-kernel=/usr/src/kernels/2.6.32-220.el6.x86_64 \&lt;br/&gt;
            --with-linux=/usr/src/kernels/2.6.32-220.el6.x86_64  \&lt;br/&gt;
            --disable-liblustre                                  \&lt;br/&gt;
            --without-sysio                                      \&lt;br/&gt;
            --disable-server&lt;/p&gt;

&lt;p&gt;The file lustre/ptlrpc/Makefile contains&lt;/p&gt;

&lt;p&gt;#ptlrpc_objs += target.o&lt;/p&gt;

&lt;p&gt;The file lustre/ldlm/ldlm_lib.c calls lut_boot_epoch_update(), which is defined in target.c.  Since ldlm_lib.c is compiled but target.c is not, the symbol is left unresolved.&lt;/p&gt;</comment>
                            <comment id="39744" author="rspellman" created="Thu, 31 May 2012 13:10:50 +0000"  >&lt;p&gt;The commands I ran to get the build are:&lt;/p&gt;

&lt;p&gt;git clone git://git.whamcloud.com/fs/lustre-release.git         # svn checkout url&lt;br/&gt;
cd lustre-release&lt;br/&gt;
git fetch &lt;a href=&quot;http://review.whamcloud.com/p/fs/lustre-release&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/p/fs/lustre-release&lt;/a&gt; refs/changes/65/1865/21 &amp;amp;&amp;amp; git checkout FETCH_HEAD&lt;/p&gt;</comment>
                            <comment id="39794" author="laisiyao" created="Fri, 1 Jun 2012 03:56:21 +0000"  >&lt;p&gt;I did the same as you, but all looks well.&lt;/p&gt;

&lt;p&gt;Also, the calls to lut_boot_epoch_update() in ldlm_lib.c are inside #ifdef HAVE_SERVER_SUPPORT, so that code won&apos;t be compiled in a client-only build and won&apos;t cause the unknown symbol problem.&lt;/p&gt;</comment>
                            <comment id="39802" author="rspellman" created="Fri, 1 Jun 2012 10:49:17 +0000"  >&lt;p&gt;My error.  I copied the wrong modules to the target system.  The modules load fine now.&lt;/p&gt;</comment>
                            <comment id="39912" author="pjones" created="Mon, 4 Jun 2012 05:50:47 +0000"  >&lt;p&gt;As I understand it, this code is being tracked for landing under &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-506&quot; title=&quot;FC15  patchless client support.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-506&quot;&gt;&lt;del&gt;LU-506&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="40137" author="rspellman" created="Wed, 6 Jun 2012 13:47:23 +0000"  >&lt;p&gt;Customer reports:&lt;br/&gt;
It appears that this patch has fixed the original bug, as the researcher was not able to replicate the problem. However, he was able to panic nine of the clients with an LBUG.  &lt;/p&gt;

&lt;p&gt;See attachment for the netconsole output.  The following is the output from one:&lt;/p&gt;

&lt;p&gt;2012-06-06 11:35:51 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4026925.936063&amp;#93;&lt;/span&gt; Lustre: MGC192.168.185.35@tcp: Reactivating import&lt;br/&gt;
2012-06-06 11:35:52 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4026926.153312&amp;#93;&lt;/span&gt; Lustre: Mounted xxxx-client&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.176640&amp;#93;&lt;/span&gt; LustreError: 8473:0:(cl_page.c:1026:cl_page_assume()) page@ffff880f6107ad80&lt;span class=&quot;error&quot;&gt;&amp;#91;4 ffff880d5f11e498:0 ^          (null)_ffff880f6107ae40 1 0 1 ffff88103efcc028           (null) 0x0&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.261214&amp;#93;&lt;/span&gt; LustreError: 8473:0:(cl_page.c:1026:cl_page_assume()) page@ffff880f6107ae40&lt;span class=&quot;error&quot;&gt;&amp;#91;1 ffff880fafbffda8:0 ^ffff880f6107ad80_          (null) 0 0 1           (null)           (null) 0x0&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.346779&amp;#93;&lt;/span&gt; LustreError: 8473:0:(cl_page.c:1026:cl_page_assume()) vvp-page@ffff880dbfdb9640(0:0:0) vm@ffffea0033482180 60000000000086d 4:0 ffff880f6107ad80 0 lru&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.419949&amp;#93;&lt;/span&gt; LustreError: 8473:0:(cl_page.c:1026:cl_page_assume()) lov-page@ffff880dbfdbdcf0&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.462875&amp;#93;&lt;/span&gt; LustreError: 8473:0:(cl_page.c:1026:cl_page_assume()) osc-page@ffff880f610795a8: 1&amp;lt; 0x845fed 2 0 - - - &amp;gt; 2&amp;lt; 0 0 51 0x0 0x400 |           (null) ffff880e893b1fb8 ffff88176f4b3e80 ffffffffa1088200 ffff880f610795a8 &amp;gt; 3&amp;lt; - ffff880967a58000 0 0 0 &amp;gt; 4&amp;lt; 0 0 48 532799488 - | - - + - &amp;gt; 5&amp;lt; - - - - | 0 - - | 0 - -&amp;gt;&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.604818&amp;#93;&lt;/span&gt; LustreError: 8473:0:(cl_page.c:1026:cl_page_assume()) end page@ffff880f6107ad80&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.647677&amp;#93;&lt;/span&gt; LustreError: 8473:0:(cl_page.c:1026:cl_page_assume()) pg-&amp;gt;cp_owner == NULL&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.688726&amp;#93;&lt;/span&gt; LustreError: 8473:0:(cl_page.c:1026:cl_page_assume()) ASSERTION( 0 ) failed:&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.731216&amp;#93;&lt;/span&gt; LustreError: 8473:0:(cl_page.c:1026:cl_page_assume()) LBUG&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.764574&amp;#93;&lt;/span&gt; Pid: 8473, comm: java&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.782718&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.782720&amp;#93;&lt;/span&gt; Call Trace:&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.803973&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0b77865&amp;gt;&amp;#93;&lt;/span&gt; libcfs_debug_dumpstack+0x55/0x80 &lt;span class=&quot;error&quot;&gt;&amp;#91;libcfs&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.841227&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0b77d97&amp;gt;&amp;#93;&lt;/span&gt; lbug_with_loc+0x47/0xc0 &lt;span class=&quot;error&quot;&gt;&amp;#91;libcfs&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.872993&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0d61b50&amp;gt;&amp;#93;&lt;/span&gt; cl_page_own0+0x0/0x2c0 &lt;span class=&quot;error&quot;&gt;&amp;#91;obdclass&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:32 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.906060&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa1196726&amp;gt;&amp;#93;&lt;/span&gt; ll_prepare_write+0x86/0x170 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.940052&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa11aa8e8&amp;gt;&amp;#93;&lt;/span&gt; ll_write_begin+0x88/0x160 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043242.973265&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa11a51cb&amp;gt;&amp;#93;&lt;/span&gt; ? ll_getxattr+0xfb/0x440 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.005911&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff810e75fe&amp;gt;&amp;#93;&lt;/span&gt; generic_file_buffered_write+0xfe/0x250&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.039902&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff810ea1a0&amp;gt;&amp;#93;&lt;/span&gt; __generic_file_aio_write+0x230/0x470&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.073917&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff810ea442&amp;gt;&amp;#93;&lt;/span&gt; generic_file_aio_write+0x62/0xd0&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.105326&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa11bb800&amp;gt;&amp;#93;&lt;/span&gt; vvp_io_write_start+0xb0/0x1e0 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.139984&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0d6a1d2&amp;gt;&amp;#93;&lt;/span&gt; cl_io_start+0x72/0x100 &lt;span class=&quot;error&quot;&gt;&amp;#91;obdclass&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.172847&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0d6d774&amp;gt;&amp;#93;&lt;/span&gt; cl_io_loop+0xd4/0x160 &lt;span class=&quot;error&quot;&gt;&amp;#91;obdclass&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.203854&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa1175d3e&amp;gt;&amp;#93;&lt;/span&gt; ll_file_io_generic+0x3be/0x4f0 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.238796&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa1175fa0&amp;gt;&amp;#93;&lt;/span&gt; ll_file_aio_write+0x130/0x1f0 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.272727&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa117634c&amp;gt;&amp;#93;&lt;/span&gt; ll_file_write+0x14c/0x250 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.306367&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff811390a8&amp;gt;&amp;#93;&lt;/span&gt; vfs_write+0xc8/0x190&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.332757&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff811398d1&amp;gt;&amp;#93;&lt;/span&gt; sys_write+0x51/0x90&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.359311&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8100bfc2&amp;gt;&amp;#93;&lt;/span&gt; system_call_fastpath+0x16/0x1b&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.389972&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.399097&amp;#93;&lt;/span&gt; Kernel panic - not syncing: LBUG&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.421020&amp;#93;&lt;/span&gt; Pid: 8473, comm: java Tainted: G        W   2.6.38.2-ts4 #11&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.455850&amp;#93;&lt;/span&gt; Call Trace:&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.468809&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8145a373&amp;gt;&amp;#93;&lt;/span&gt; ? panic+0x91/0x19c&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.494377&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0b77dfb&amp;gt;&amp;#93;&lt;/span&gt; ? lbug_with_loc+0xab/0xc0 &lt;span class=&quot;error&quot;&gt;&amp;#91;libcfs&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.527753&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0d61b50&amp;gt;&amp;#93;&lt;/span&gt; ? cl_page_own0+0x0/0x2c0 &lt;span class=&quot;error&quot;&gt;&amp;#91;obdclass&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.561570&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa1196726&amp;gt;&amp;#93;&lt;/span&gt; ? ll_prepare_write+0x86/0x170 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.595489&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa11aa8e8&amp;gt;&amp;#93;&lt;/span&gt; ? ll_write_begin+0x88/0x160 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.629850&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa11a51cb&amp;gt;&amp;#93;&lt;/span&gt; ? ll_getxattr+0xfb/0x440 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.661785&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff810e75fe&amp;gt;&amp;#93;&lt;/span&gt; ? generic_file_buffered_write+0xfe/0x250&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.697719&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff810ea1a0&amp;gt;&amp;#93;&lt;/span&gt; ? __generic_file_aio_write+0x230/0x470&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.734746&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff810ea442&amp;gt;&amp;#93;&lt;/span&gt; ? generic_file_aio_write+0x62/0xd0&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.767215&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa11bb800&amp;gt;&amp;#93;&lt;/span&gt; ? vvp_io_write_start+0xb0/0x1e0 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.803208&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0d6a1d2&amp;gt;&amp;#93;&lt;/span&gt; ? cl_io_start+0x72/0x100 &lt;span class=&quot;error&quot;&gt;&amp;#91;obdclass&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.836585&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0d6d774&amp;gt;&amp;#93;&lt;/span&gt; ? cl_io_loop+0xd4/0x160 &lt;span class=&quot;error&quot;&gt;&amp;#91;obdclass&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.870706&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa1175d3e&amp;gt;&amp;#93;&lt;/span&gt; ? ll_file_io_generic+0x3be/0x4f0 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:33 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.907105&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa1175fa0&amp;gt;&amp;#93;&lt;/span&gt; ? ll_file_aio_write+0x130/0x1f0 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.942842&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa117634c&amp;gt;&amp;#93;&lt;/span&gt; ? ll_file_write+0x14c/0x250 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043243.977049&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff811390a8&amp;gt;&amp;#93;&lt;/span&gt; ? vfs_write+0xc8/0x190&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.004041&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff811398d1&amp;gt;&amp;#93;&lt;/span&gt; ? sys_write+0x51/0x90&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.031568&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8100bfc2&amp;gt;&amp;#93;&lt;/span&gt; ? system_call_fastpath+0x16/0x1b&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.063216&amp;#93;&lt;/span&gt; -----------&lt;del&gt;[ cut here ]&lt;/del&gt;-----------&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.087602&amp;#93;&lt;/span&gt; WARNING: at arch/x86/kernel/smp.c:118 native_smp_send_reschedule+0x54/0x60()&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.129258&amp;#93;&lt;/span&gt; Hardware name: ProLiant BL460c G7&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.151613&amp;#93;&lt;/span&gt; Modules linked in: lmv mgc lustre lov osc mdc fid fld ksocklnd ptlrpc obdclass lnet lvfs libcfs parport_pc ppdev 8021q garp bridge stp llc nfsd netconsole configfs nfs lockd nfs_acl auth_rpcgss sunrpc dm_crypt dm_mod crc32c ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iptable_filter ip_tables x_tables hpwdt psmouse be2net joydev hpilo evdev mac_hid serio_raw rtc_cmos rtc_core rtc_lib parport loop tcp_scalable fuse virtio_blk virtio virtio_ring xenfs ext4 mbcache jbd2 xfs exportfs raid1 usbhid mptspi mptsas mptscsih mptbase mpt2sas raid_class arcmsr aic94xx libsas libata scsi_transport_sas aic7xxx aic79xx scsi_transport_spi megaraid_sas cciss sg sd_mod hpsa ehci_hcd scsi_mod uhci_hcd &lt;span class=&quot;error&quot;&gt;&amp;#91;last unloaded: libcfs&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.501423&amp;#93;&lt;/span&gt; Pid: 15289, comm: ptlrpcd_6 Tainted: G        W   2.6.38.2-ts4 #11&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.540816&amp;#93;&lt;/span&gt; Call Trace:&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.553799&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8105d10f&amp;gt;&amp;#93;&lt;/span&gt; ? warn_slowpath_common+0x7f/0xc0&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.585974&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8105d16a&amp;gt;&amp;#93;&lt;/span&gt; ? warn_slowpath_null+0x1a/0x20&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.616810&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81029c24&amp;gt;&amp;#93;&lt;/span&gt; ? native_smp_send_reschedule+0x54/0x60&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.651789&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81045f06&amp;gt;&amp;#93;&lt;/span&gt; ? resched_task+0x76/0x90&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.679720&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81056465&amp;gt;&amp;#93;&lt;/span&gt; ? check_preempt_wakeup+0x1b5/0x280&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.714069&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81045fc4&amp;gt;&amp;#93;&lt;/span&gt; ? check_preempt_curr+0x84/0xa0&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.745025&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81055ebb&amp;gt;&amp;#93;&lt;/span&gt; ? try_to_wake_up+0x7b/0x410&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.775300&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81056262&amp;gt;&amp;#93;&lt;/span&gt; ? default_wake_function+0x12/0x20&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.808168&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8107e046&amp;gt;&amp;#93;&lt;/span&gt; ? autoremove_wake_function+0x16/0x40&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.841761&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff810455c9&amp;gt;&amp;#93;&lt;/span&gt; ? __wake_up_common+0x59/0x90&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.873316&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8104d998&amp;gt;&amp;#93;&lt;/span&gt; ? __wake_up+0x48/0x70&lt;br/&gt;
2012-06-06 16:08:34 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.900189&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0b7834a&amp;gt;&amp;#93;&lt;/span&gt; ? cfs_waitq_signal+0x1a/0x20 &lt;span class=&quot;error&quot;&gt;&amp;#91;libcfs&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.935008&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa1012227&amp;gt;&amp;#93;&lt;/span&gt; ? ksocknal_queue_tx_locked+0x277/0x540 &lt;span class=&quot;error&quot;&gt;&amp;#91;ksocklnd&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043244.973972&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa100d033&amp;gt;&amp;#93;&lt;/span&gt; ? ksocknal_find_conn_locked+0xa3/0x230 &lt;span class=&quot;error&quot;&gt;&amp;#91;ksocklnd&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.013276&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa101263b&amp;gt;&amp;#93;&lt;/span&gt; ? ksocknal_launch_packet+0x14b/0x350 &lt;span class=&quot;error&quot;&gt;&amp;#91;ksocklnd&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.053190&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa10129be&amp;gt;&amp;#93;&lt;/span&gt; ? ksocknal_send+0x17e/0x410 &lt;span class=&quot;error&quot;&gt;&amp;#91;ksocklnd&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.087409&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0cce12b&amp;gt;&amp;#93;&lt;/span&gt; ? lnet_ni_send+0x4b/0x100 &lt;span class=&quot;error&quot;&gt;&amp;#91;lnet&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.119743&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0cd294b&amp;gt;&amp;#93;&lt;/span&gt; ? lnet_send+0x20b/0xa30 &lt;span class=&quot;error&quot;&gt;&amp;#91;lnet&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.150531&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0cce530&amp;gt;&amp;#93;&lt;/span&gt; ? lnet_prep_send+0x50/0xb0 &lt;span class=&quot;error&quot;&gt;&amp;#91;lnet&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.183883&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0cd3a49&amp;gt;&amp;#93;&lt;/span&gt; ? LNetPut+0x2a9/0x670 &lt;span class=&quot;error&quot;&gt;&amp;#91;lnet&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.214255&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0e9e4da&amp;gt;&amp;#93;&lt;/span&gt; ? ptl_send_buf+0x18a/0x440 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.247512&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0ea08a0&amp;gt;&amp;#93;&lt;/span&gt; ? ptl_send_rpc+0x4e0/0xb10 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.280878&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8104bb3a&amp;gt;&amp;#93;&lt;/span&gt; ? finish_task_switch+0x4a/0x100&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.311703&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0e97695&amp;gt;&amp;#93;&lt;/span&gt; ? ptlrpc_send_new_req+0x3e5/0x720 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.348610&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8145d2bf&amp;gt;&amp;#93;&lt;/span&gt; ? _raw_spin_lock_irqsave+0x2f/0x40&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.381490&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0e9a9a0&amp;gt;&amp;#93;&lt;/span&gt; ? ptlrpc_check_set+0x340/0x1750 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.417485&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8106ca3a&amp;gt;&amp;#93;&lt;/span&gt; ? del_timer_sync+0x3a/0x60&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.447321&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0ec4dcb&amp;gt;&amp;#93;&lt;/span&gt; ? ptlrpcd_check+0x52b/0x550 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.480734&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0ec50ab&amp;gt;&amp;#93;&lt;/span&gt; ? ptlrpcd+0x2bb/0x360 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.514762&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81056250&amp;gt;&amp;#93;&lt;/span&gt; ? default_wake_function+0x0/0x20&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.546436&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8100cde4&amp;gt;&amp;#93;&lt;/span&gt; ? kernel_thread_helper+0x4/0x10&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.578542&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa0ec4df0&amp;gt;&amp;#93;&lt;/span&gt; ? ptlrpcd+0x0/0x360 &lt;span class=&quot;error&quot;&gt;&amp;#91;ptlrpc&amp;#93;&lt;/span&gt;&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.609487&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8100cde0&amp;gt;&amp;#93;&lt;/span&gt; ? kernel_thread_helper+0x0/0x10&lt;br/&gt;
2012-06-06 16:08:35 +0000 &lt;span class=&quot;error&quot;&gt;&amp;#91;4043245.640837&amp;#93;&lt;/span&gt; ---[ end trace 70a7f3071bb3c3f8 ]---&lt;/p&gt;
</comment>
                            <comment id="40175" author="laisiyao" created="Thu, 7 Jun 2012 02:58:29 +0000"  >&lt;p&gt;Jinshan, could you help check why this ASSERT does not hold on the 2.6.38 kernel?&lt;/p&gt;</comment>
                            <comment id="40246" author="laisiyao" created="Thu, 7 Jun 2012 23:05:09 +0000"  >&lt;p&gt;As Jinshan pointed out, this looks to be the same issue as &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1421&quot; title=&quot;Client LBUG in ll_file_write after filesystem expansion&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1421&quot;&gt;&lt;del&gt;LU-1421&lt;/del&gt;&lt;/a&gt;; the fix is at &lt;a href=&quot;http://review.whamcloud.com/#change,3027&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/#change,3027&lt;/a&gt;. Roger, could you apply this patch and try again?&lt;/p&gt;</comment>
                            <comment id="40677" author="rspellman" created="Fri, 15 Jun 2012 15:38:09 +0000"  >&lt;p&gt;Customer reports:&lt;/p&gt;

&lt;p&gt;We have been running with the latest patch for several days with no problems other than the old slowpath warning:&lt;/p&gt;

&lt;p&gt;&amp;#91;609631.154485&amp;#93; ------------[ cut here ]------------&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154499&amp;#93;&lt;/span&gt; WARNING: at fs/libfs.c:363 simple_setattr+0x99/0xb0()&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154502&amp;#93;&lt;/span&gt; Hardware name: ProLiant BL460c G7&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154504&amp;#93;&lt;/span&gt; Modules linked in: lmv mgc lustre lov osc mdc fid fld ksocklnd ptlrpc obdclass lnet lvfs libcfs parport_pc ppdev 8021q garp bridge stp llc nfsd nfs lockd nfs_acl auth_rpcgss sunrpc netconsole configfs dm_crypt dm_mod crc32c ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iptable_filter ip_tables x_tables parport be2net psmouse evdev hpilo rtc_cmos joydev hpwdt serio_raw mac_hid loop rtc_core rtc_lib tcp_scalable fuse virtio_blk virtio virtio_ring xenfs ext4 mbcache jbd2 usbhid xfs exportfs raid1 mptspi mptsas mptscsih mptbase mpt2sas raid_class arcmsr aic94xx libsas libata scsi_transport_sas aic7xxx aic79xx scsi_transport_spi megaraid_sas cciss sd_mod sg hpsa scsi_mod ehci_hcd uhci_hcd &lt;span class=&quot;error&quot;&gt;&amp;#91;last unloaded: libcfs&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154578&amp;#93;&lt;/span&gt; Pid: 19765, comm: java Not tainted 2.6.38.2-ts4 #11&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154580&amp;#93;&lt;/span&gt; Call Trace:&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154588&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8105d10f&amp;gt;&amp;#93;&lt;/span&gt; ? warn_slowpath_common+0x7f/0xc0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154592&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8105d16a&amp;gt;&amp;#93;&lt;/span&gt; ? warn_slowpath_null+0x1a/0x20&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154596&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8115bea9&amp;gt;&amp;#93;&lt;/span&gt; ? simple_setattr+0x99/0xb0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154632&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa10f2cd6&amp;gt;&amp;#93;&lt;/span&gt; ? ll_md_setattr+0x3e6/0x840 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154652&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa10f3394&amp;gt;&amp;#93;&lt;/span&gt; ? ll_setattr_raw+0x264/0xe40 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154672&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffffa10f3fcd&amp;gt;&amp;#93;&lt;/span&gt; ? ll_setattr+0x5d/0x100 &lt;span class=&quot;error&quot;&gt;&amp;#91;lustre&amp;#93;&lt;/span&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154677&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81153761&amp;gt;&amp;#93;&lt;/span&gt; ? notify_change+0x161/0x2c0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154682&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff811383d1&amp;gt;&amp;#93;&lt;/span&gt; ? do_truncate+0x61/0x90&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154687&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff811beeec&amp;gt;&amp;#93;&lt;/span&gt; ? security_inode_permission+0x1c/0x30&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154692&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81144868&amp;gt;&amp;#93;&lt;/span&gt; ? finish_open+0x138/0x1b0&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154696&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81146003&amp;gt;&amp;#93;&lt;/span&gt; ? do_last+0x83/0x360&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154699&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81148706&amp;gt;&amp;#93;&lt;/span&gt; ? do_filp_open+0x3d6/0x830&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154704&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8110bc27&amp;gt;&amp;#93;&lt;/span&gt; ? handle_mm_fault+0x157/0x250&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154708&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8115493a&amp;gt;&amp;#93;&lt;/span&gt; ? alloc_fd+0x10a/0x150&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154713&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff811373a9&amp;gt;&amp;#93;&lt;/span&gt; ? do_sys_open+0x69/0x110&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154717&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff81137490&amp;gt;&amp;#93;&lt;/span&gt; ? sys_open+0x20/0x30&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154722&amp;#93;&lt;/span&gt;  &lt;span class=&quot;error&quot;&gt;&amp;#91;&amp;lt;ffffffff8100bfc2&amp;gt;&amp;#93;&lt;/span&gt; ? system_call_fastpath+0x16/0x1b&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;609631.154725&amp;#93;&lt;/span&gt; ---[ end trace e84ad085cd1d9abc ]---&lt;/p&gt;



&lt;p&gt;Question: Is the best way to address this to lower the number of OST threads, as documented in the Lustre manual?&lt;/p&gt;</comment>
                            <comment id="40723" author="laisiyao" created="Mon, 18 Jun 2012 01:54:46 +0000"  >&lt;p&gt;This is just a kernel warning, which is a bit excessive, and it has nothing to do with real I/O performance. Lustre uses the kernel-exported function simple_setattr(), which was originally intended for simple filesystems (ones that don&apos;t implement truncate), so the kernel emits a warning, but there is no real problem.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="11524" name="20120606-netconsole.tbz2" size="8029" author="rspellman" created="Wed, 6 Jun 2012 13:47:23 +0000"/>
                            <attachment id="11167" name="Reproduce.java" size="1347" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11433" name="debug.tar.gz" size="9878911" author="rspellman" created="Tue, 22 May 2012 15:51:28 +0000"/>
                            <attachment id="11445" name="enoent-20120523.tar.gz" size="10104258" author="rspellman" created="Wed, 23 May 2012 14:05:56 +0000"/>
                            <attachment id="11468" name="enoent-20120524.tar.bz2" size="5918331" author="rspellman" created="Fri, 25 May 2012 10:30:08 +0000"/>
                            <attachment id="11169" name="messages-mds" size="180744" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11170" name="messages-oss-1" size="43880" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11171" name="messages-oss-2" size="68663" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11425" name="messages.mds" size="110300" author="rspellman" created="Tue, 22 May 2012 10:29:20 +0000"/>
                            <attachment id="11168" name="reproduce.sh" size="670" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11372" name="staterrs.tar.gz" size="2227457" author="pcpiela" created="Tue, 15 May 2012 12:58:19 +0000"/>
                            <attachment id="11172" name="usrs388.netconsole" size="3429" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11173" name="usrs389.netconsole" size="3339" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11174" name="usrs390.netconsole" size="3339" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11175" name="usrs391.netconsole" size="3339" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11176" name="usrs392.netconsole" size="3340" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11189" name="usrs393.messages" size="20608" author="rspellman" created="Tue, 17 Apr 2012 13:28:03 +0000"/>
                            <attachment id="11177" name="usrs393.netconsole" size="1798" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11178" name="usrs394.netconsole" size="3340" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11179" name="usrs395.netconsole" size="3339" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11180" name="usrs396.netconsole" size="3898" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11181" name="usrs397.netconsole" size="3340" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11182" name="usrs398.netconsole" size="3339" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11183" name="usrs399.netconsole" size="3339" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                            <attachment id="11184" name="usrs400.netconsole" size="3340" author="rspellman" created="Mon, 16 Apr 2012 17:03:59 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvh1z:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6412</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10021"><![CDATA[2]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>