<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:37:37 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary, append 'field=key&field=summary' to the URL of your request (a full example URL is sketched in the comment below).
-->
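<!-- 
A minimal sketch of the field restriction described above. Only the 'field' parameters
come from the note in the comment above; the path below assumes the standard JIRA XML
issue view endpoint (si/jira.issueviews:issue-xml/KEY/KEY.xml), which is an assumption
about this instance rather than something stated here.

    https://jira.whamcloud.com/si/jira.issueviews:issue-xml/LU-10721/LU-10721.xml?field=key&field=summary
-->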
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-10721] Interop 2.10.3&lt;-&gt;master recovery-small test_115a: BAD READ CHECKSUM: should have changed on the client or in transit</title>
                <link>https://jira.whamcloud.com/browse/LU-10721</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;recovery-small test_115a - Timeout occurred after 212 mins, last suite running was recovery-small, restarting cluster to continue tests&lt;br/&gt;
^^^^^^^^^^^^^ DO NOT REMOVE LINE ABOVE ^^^^^^^^^^^^^&lt;/p&gt;

&lt;p&gt;This issue was created by maloo for sarah_lw &amp;lt;wei3.liu@intel.com&amp;gt;&lt;/p&gt;

&lt;p&gt;This issue relates to the following test suite run: &lt;br/&gt;
&lt;a href=&quot;https://testing.hpdd.intel.com/test_sets/4ee5d662-12aa-11e8-a6ad-52540065bddc&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sets/4ee5d662-12aa-11e8-a6ad-52540065bddc&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;test_115a failed with the following error:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Timeout occurred after 212 mins, last suite running was recovery-small, restarting cluster to continue tests
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;client: lustre-master tag-2.10.58&lt;br/&gt;
server: 2.10.3&lt;/p&gt;

&lt;p&gt;OST console&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[ 9177.783390] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == recovery-small test 115a: read: late REQ MDunlink and no bulk ===================================== 10:08:44 \(1518689324\)
[ 9177.969217] Lustre: DEBUG MARKER: == recovery-small test 115a: read: late REQ MDunlink and no bulk ===================================== 10:08:44 (1518689324)
[ 9178.139373] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2&amp;gt;/dev/null ||
				/usr/sbin/lctl lustre_build_version 2&amp;gt;/dev/null ||
				/usr/sbin/lctl --version 2&amp;gt;/dev/null | cut -d&apos; &apos; -f2
[ 9179.546836] Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0x8000051a
[ 9182.363552] Lustre: *** cfs_fail_loc=51a, val=0***
[ 9212.794207] Lustre: lustre-OST0000: Client 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp) reconnecting
[ 9213.793456] LustreError: 132-0: lustre-OST0000: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.2.8.124@tcp inode [0x200039611:0x9:0x0] object 0x0:59083 extent [0-4095], client returned csum 0 (type 4), server csum 98f94189 (type 4)
[ 9253.893995] LNet: Service thread pid 18623 was inactive for 40.10s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[ 9253.895896] Pid: 18623, comm: ll_ost_io00_012
[ 9253.896368] 
Call Trace:
[ 9253.896822]  [&amp;lt;ffffffff816ab6b9&amp;gt;] schedule+0x29/0x70
[ 9253.897357]  [&amp;lt;ffffffff816a9004&amp;gt;] schedule_timeout+0x174/0x2c0
[ 9253.898152]  [&amp;lt;ffffffff8109a6c0&amp;gt;] ? process_timeout+0x0/0x10
[ 9253.898933]  [&amp;lt;ffffffffc09d247e&amp;gt;] target_bulk_io+0x4ae/0xab0 [ptlrpc]
[ 9253.899635]  [&amp;lt;ffffffff810c6440&amp;gt;] ? default_wake_function+0x0/0x20
[ 9253.900476]  [&amp;lt;ffffffffc0a7c482&amp;gt;] tgt_brw_read+0xf32/0x1850 [ptlrpc]
[ 9253.901189]  [&amp;lt;ffffffffc0a16dc3&amp;gt;] ? lustre_pack_reply_v2+0x183/0x280 [ptlrpc]
[ 9253.902042]  [&amp;lt;ffffffffc0a16f2f&amp;gt;] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[ 9253.902824]  [&amp;lt;ffffffffc09cfe60&amp;gt;] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[ 9253.903575]  [&amp;lt;ffffffffc0a79da5&amp;gt;] tgt_request_handle+0x925/0x1370 [ptlrpc]
[ 9253.904421]  [&amp;lt;ffffffffc0a22b16&amp;gt;] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[ 9253.905276]  [&amp;lt;ffffffffc05d7bc7&amp;gt;] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[ 9253.906048]  [&amp;lt;ffffffffc0a26252&amp;gt;] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[ 9253.906800]  [&amp;lt;ffffffffc0a257c0&amp;gt;] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[ 9253.907481]  [&amp;lt;ffffffff810b252f&amp;gt;] kthread+0xcf/0xe0
[ 9253.908062]  [&amp;lt;ffffffff810b2460&amp;gt;] ? kthread+0x0/0xe0
[ 9253.908623]  [&amp;lt;ffffffff816b8798&amp;gt;] ret_from_fork+0x58/0x90
[ 9253.909214]  [&amp;lt;ffffffff810b2460&amp;gt;] ? kthread+0x0/0xe0

[ 9253.910003] LustreError: dumping log to /tmp/lustre-log.1518689399.18623
[ 9313.794986] LustreError: 18623:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk READ after 100+0s  req@ffff88006133d450 x1592451778374256/t0(0) o3-&amp;gt;8a9263ac-9afd-de65-c52c-eb00e5830c2e@10.2.8.124@tcp:170/0 lens 608/432 e 4 to 0 dl 1518689465 ref 1 fl Interpret:/0/0 rc 0/0
[ 9313.797912] Lustre: lustre-OST0000: Bulk IO read error with 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp), client will retry: rc -110
[ 9313.799353] LNet: Service thread pid 18623 completed after 100.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[ 9320.792161] Lustre: lustre-OST0000: Client 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp) reconnecting
[ 9320.794621] LustreError: 132-0: lustre-OST0000: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.2.8.124@tcp inode [0x200039611:0x9:0x0] object 0x0:59083 extent [0-4095], client returned csum 0 (type 4), server csum 98f94189 (type 4)
[ 9420.796985] LustreError: 18623:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk READ after 100+0s  req@ffff880060b31450 x1592451778374256/t0(0) o3-&amp;gt;8a9263ac-9afd-de65-c52c-eb00e5830c2e@10.2.8.124@tcp:277/0 lens 608/432 e 0 to 0 dl 1518689572 ref 1 fl Interpret:/2/0 rc 0/0
[ 9420.799665] Lustre: lustre-OST0000: Bulk IO read error with 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp), client will retry: rc -110
[ 9427.795775] LustreError: 132-0: lustre-OST0000: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.2.8.124@tcp inode [0x200039611:0x9:0x0] object 0x0:59083 extent [0-4095], client returned csum 0 (type 4), server csum 98f94189 (type 4)
[ 9527.797989] LustreError: 18623:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk READ after 100+0s  req@ffff88006027b850 x1592451778374256/t0(0) o3-&amp;gt;8a9263ac-9afd-de65-c52c-eb00e5830c2e@10.2.8.124@tcp:384/0 lens 608/432 e 0 to 0 dl 1518689679 ref 1 fl Interpret:/2/0 rc 0/0
[ 9527.800790] Lustre: lustre-OST0000: Bulk IO read error with 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp), client will retry: rc -110
[ 9534.798632] Lustre: lustre-OST0000: Client 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp) reconnecting
[ 9534.799948] Lustre: Skipped 1 previous similar message
[ 9534.800610] Lustre: lustre-OST0000: Connection restored to de427d50-d7c3-9c4d-0314-3a8b85bd9da6 (at 10.2.8.124@tcp)
[ 9534.801699] Lustre: Skipped 117 previous similar messages
[ 9534.803081] LustreError: 132-0: lustre-OST0000: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.2.8.124@tcp inode [0x200039611:0x9:0x0] object 0x0:59083 extent [0-4095], client returned csum 0 (type 4), server csum 98f94189 (type 4)
[ 9634.804986] LustreError: 18623:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk READ after 100+0s  req@ffff88006128b850 x1592451778374256/t0(0) o3-&amp;gt;8a9263ac-9afd-de65-c52c-eb00e5830c2e@10.2.8.124@tcp:491/0 lens 608/432 e 0 to 0 dl 1518689786 ref 1 fl Interpret:/2/0 rc 0/0
[ 9634.807683] Lustre: lustre-OST0000: Bulk IO read error with 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp), client will retry: rc -110
[ 9641.800411] LustreError: 132-0: lustre-OST0000: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.2.8.124@tcp inode [0x200039611:0x9:0x0] object 0x0:59083 extent [0-4095], client returned csum 0 (type 4), server csum 98f94189 (type 4)
[ 9741.802995] LustreError: 18623:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk READ after 100+0s  req@ffff88005d038c50 x1592451778374256/t0(0) o3-&amp;gt;8a9263ac-9afd-de65-c52c-eb00e5830c2e@10.2.8.124@tcp:598/0 lens 608/432 e 0 to 0 dl 1518689893 ref 1 fl Interpret:/2/0 rc 0/0
[ 9741.805670] Lustre: lustre-OST0000: Bulk IO read error with 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp), client will retry: rc -110
[ 9748.800033] LustreError: 132-0: lustre-OST0000: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.2.8.124@tcp inode [0x200039611:0x9:0x0] object 0x0:59083 extent [0-4095], client returned csum 0 (type 4), server csum 98f94189 (type 4)
[ 9848.801984] LustreError: 18623:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk READ after 100+0s  req@ffff880060638850 x1592451778374256/t0(0) o3-&amp;gt;8a9263ac-9afd-de65-c52c-eb00e5830c2e@10.2.8.124@tcp:705/0 lens 608/432 e 0 to 0 dl 1518690000 ref 1 fl Interpret:/2/0 rc 0/0
[ 9848.804674] Lustre: lustre-OST0000: Bulk IO read error with 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp), client will retry: rc -110
[ 9855.800100] Lustre: lustre-OST0000: Client 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp) reconnecting
[ 9855.801538] Lustre: Skipped 2 previous similar messages
[ 9855.803027] LustreError: 132-0: lustre-OST0000: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.2.8.124@tcp inode [0x200039611:0x9:0x0] object 0x0:59083 extent [0-4095], client returned csum 0 (type 4), server csum 98f94189 (type 4)
[ 9955.804985] LustreError: 18623:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk READ after 100+0s  req@ffff880060786050 x1592451778374256/t0(0) o3-&amp;gt;8a9263ac-9afd-de65-c52c-eb00e5830c2e@10.2.8.124@tcp:57/0 lens 608/432 e 0 to 0 dl 1518690107 ref 1 fl Interpret:/2/0 rc 0/0
[ 9955.807591] Lustre: lustre-OST0000: Bulk IO read error with 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp), client will retry: rc -110
[ 9962.801022] LustreError: 132-0: lustre-OST0000: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.2.8.124@tcp inode [0x200039611:0x9:0x0] object 0x0:59083 extent [0-4095], client returned csum 0 (type 4), server csum 98f94189 (type 4)
[10062.802989] LustreError: 18623:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk READ after 100+0s  req@ffff880068906050 x1592451778374256/t0(0) o3-&amp;gt;8a9263ac-9afd-de65-c52c-eb00e5830c2e@10.2.8.124@tcp:164/0 lens 608/432 e 0 to 0 dl 1518690214 ref 1 fl Interpret:/2/0 rc 0/0
[10062.805909] Lustre: lustre-OST0000: Bulk IO read error with 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp), client will retry: rc -110
[10069.802829] LustreError: 132-0: lustre-OST0000: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.2.8.124@tcp inode [0x200039611:0x9:0x0] object 0x0:59083 extent [0-4095], client returned csum 0 (type 4), server csum 98f94189 (type 4)
[10169.804997] LustreError: 18623:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk READ after 100+0s  req@ffff88006133e050 x1592451778374256/t0(0) o3-&amp;gt;8a9263ac-9afd-de65-c52c-eb00e5830c2e@10.2.8.124@tcp:271/0 lens 608/432 e 0 to 0 dl 1518690321 ref 1 fl Interpret:/2/0 rc 0/0
[10169.807940] Lustre: lustre-OST0000: Bulk IO read error with 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp), client will retry: rc -110
[10176.807295] Lustre: lustre-OST0000: Connection restored to de427d50-d7c3-9c4d-0314-3a8b85bd9da6 (at 10.2.8.124@tcp)
[10176.808453] Lustre: Skipped 5 previous similar messages
[10283.808108] LustreError: 132-0: lustre-OST0000: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.2.8.124@tcp inode [0x200039611:0x9:0x0] object 0x0:59083 extent [0-4095], client returned csum 0 (type 4), server csum 98f94189 (type 4)
[10283.810562] LustreError: Skipped 1 previous similar message
[10383.810991] LustreError: 18623:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk READ after 100+0s  req@ffff88005c641c50 x1592451778374256/t0(0) o3-&amp;gt;8a9263ac-9afd-de65-c52c-eb00e5830c2e@10.2.8.124@tcp:485/0 lens 608/432 e 0 to 0 dl 1518690535 ref 1 fl Interpret:/2/0 rc 0/0
[10383.813599] LustreError: 18623:0:(ldlm_lib.c:3226:target_bulk_io()) Skipped 1 previous similar message
[10383.814654] Lustre: lustre-OST0000: Bulk IO read error with 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp), client will retry: rc -110
[10383.816086] Lustre: Skipped 1 previous similar message
[10390.807059] Lustre: lustre-OST0000: Client 8a9263ac-9afd-de65-c52c-eb00e5830c2e (at 10.2.8.124@tcp) reconnecting
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment></environment>
        <key id="50957">LU-10721</key>
            <summary>Interop 2.10.3&lt;-&gt;master recovery-small test_115a: BAD READ CHECKSUM: should have changed on the client or in transit</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="maloo">Maloo</reporter>
                        <labels>
                    </labels>
                <created>Mon, 26 Feb 2018 17:51:45 +0000</created>
                <updated>Mon, 22 Oct 2018 21:59:23 +0000</updated>
                                            <version>Lustre 2.11.0</version>
                    <version>Lustre 2.12.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="224287" author="sarah" created="Thu, 22 Mar 2018 17:20:11 +0000"  >&lt;p&gt;saw similar trace in interop testing between 2.10.3 server and 2.11 client(2.11.rc1) in replay-dual test_28&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://testing.hpdd.intel.com/test_sets/4f6324e4-2cbf-11e8-b74b-52540065bddc&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sets/4f6324e4-2cbf-11e8-b74b-52540065bddc&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;OSS trace&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
[16542.791311] Lustre: DEBUG MARKER: dmesg
[16552.842570] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == replay-dual test 28: lock replay should be ordered: waiting after granted ========================= 12:21:45 \(1521548505\)
[16553.045521] Lustre: DEBUG MARKER: == replay-dual test 28: lock replay should be ordered: waiting after granted ========================= 12:21:45 (1521548505)
[16553.238972] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0x80000324
[16555.581452] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0x32a
[16555.938376] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost1&apos; &apos; /proc/mounts || true
[16556.281981] Lustre: DEBUG MARKER: umount -d /mnt/lustre-ost1
[16556.464887] Lustre: Failing over lustre-OST0000
[16556.487576] Lustre: *** cfs_fail_loc=32a, val=0***
[16556.489453] LustreError: 26562:0:(client.c:1166:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff88001d3b4900 x1595442006207008/t0(0) o105-&amp;gt;lustre-OST0000@10.9.6.38@tcp:15/16 lens 360/224 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
[16557.691330] Lustre: lustre-OST0000: Not available for connect from 10.9.6.39@tcp (stopping)
[16557.693679] Lustre: Skipped 2 previous similar messages
[16558.538911] Lustre: server umount lustre-OST0000 complete
[16558.781868] Lustre: DEBUG MARKER: lsmod | grep lnet &amp;gt; /dev/null &amp;amp;&amp;amp;
[16558.781868] lctl dl | grep &apos; ST &apos; || true
[16559.031236] LustreError: 137-5: lustre-OST0000_UUID: not available for connect from 10.9.6.41@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
[16559.035632] LustreError: Skipped 11 previous similar messages
[16559.137050] Lustre: DEBUG MARKER: modprobe dm-flakey;
[16559.137050] dmsetup targets | grep -q flakey
[16569.529187] Lustre: DEBUG MARKER: hostname
[16569.929319] Lustre: DEBUG MARKER: modprobe dm-flakey;
[16569.929319] dmsetup targets | grep -q flakey
[16570.273958] Lustre: DEBUG MARKER: dmsetup status /dev/mapper/ost1_flakey &amp;gt;/dev/null 2&amp;gt;&amp;amp;1
[16570.617265] Lustre: DEBUG MARKER: dmsetup status /dev/mapper/ost1_flakey 2&amp;gt;&amp;amp;1
[16570.984247] Lustre: DEBUG MARKER: test -b /dev/mapper/ost1_flakey
[16571.345293] Lustre: DEBUG MARKER: e2label /dev/mapper/ost1_flakey
[16571.696415] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-ost1; mount -t lustre /dev/mapper/ost1_flakey /mnt/lustre-ost1
[16571.897992] LDISKFS-fs (dm-10): file extents enabled, maximum tree depth=5
[16571.912201] LDISKFS-fs (dm-10): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[16571.977281] Lustre: lustre-OST0000: Imperative Recovery not enabled, recovery window 60-180
[16572.180231] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
[16572.532387] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
[16572.689287] Lustre: lustre-OST0000: Will be in recovery for at least 1:00, or until 5 clients reconnect
[16573.193865] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-47vm4.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[16573.195629] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-47vm4.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[16573.407400] Lustre: DEBUG MARKER: trevis-47vm4.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[16573.415822] Lustre: DEBUG MARKER: trevis-47vm4.trevis.hpdd.intel.com: executing set_default_debug -1 all 4
[16573.639098] Lustre: DEBUG MARKER: e2label /dev/mapper/ost1_flakey 2&amp;gt;/dev/null | grep -E &apos;:[a-zA-Z]\{3}[0-9]\{4}&apos;
[16573.987794] Lustre: DEBUG MARKER: e2label /dev/mapper/ost1_flakey 2&amp;gt;/dev/null | grep -E &apos;:[a-zA-Z]\{3}[0-9]\{4}&apos;
[16574.145277] Lustre: *** cfs_fail_loc=32a, val=0***
[16574.288413] Lustre: lustre-OST0000: Recovery over after 0:01, of 5 clients 5 recovered and 0 were evicted.
[16574.307211] Lustre: lustre-OST0000: deleting orphan objects from 0x0:119724 to 0x0:119753
[16574.439183] Lustre: DEBUG MARKER: e2label /dev/mapper/ost1_flakey 2&amp;gt;/dev/null
[16575.455488] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-47vm3.trevis.hpdd.intel.com: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-ffff*.ost_server_uuid
[16575.471606] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-47vm2.trevis.hpdd.intel.com: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-ffff*.ost_server_uuid
[16575.701143] Lustre: DEBUG MARKER: trevis-47vm3.trevis.hpdd.intel.com: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-ffff*.ost_server_uuid
[16575.757454] Lustre: DEBUG MARKER: trevis-47vm2.trevis.hpdd.intel.com: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-ffff*.ost_server_uuid
[16576.026776] Lustre: DEBUG MARKER: /usr/sbin/lctl mark osc.lustre-OST0000-osc-ffff*.ost_server_uuid in FULL state after 0 sec
[16576.080737] Lustre: DEBUG MARKER: /usr/sbin/lctl mark osc.lustre-OST0000-osc-ffff*.ost_server_uuid in FULL state after 0 sec
[16576.243891] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-ffff*.ost_server_uuid in FULL state after 0 sec
[16576.372631] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-ffff*.ost_server_uuid in FULL state after 0 sec
[16614.367033] LNet: Service thread pid 5969 was inactive for 40.06s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[16614.371988] LNet: Skipped 1 previous similar message
[16614.374152] Pid: 5969, comm: ll_ost_io00_042
[16614.376225] 
[16614.376225] Call Trace:
[16614.379927] [&amp;lt;ffffffff816ab6b9&amp;gt;] schedule+0x29/0x70
[16614.382154] [&amp;lt;ffffffff816a9004&amp;gt;] schedule_timeout+0x174/0x2c0
[16614.384389] [&amp;lt;ffffffff8109a6c0&amp;gt;] ? process_timeout+0x0/0x10
[16614.386629] [&amp;lt;ffffffffc0a9147e&amp;gt;] target_bulk_io+0x4ae/0xab0 [ptlrpc]
[16614.388888] [&amp;lt;ffffffff810c6440&amp;gt;] ? default_wake_function+0x0/0x20
[16614.391163] [&amp;lt;ffffffffc0b3cf43&amp;gt;] tgt_brw_write+0x11a3/0x17c0 [ptlrpc]
[16614.393426] [&amp;lt;ffffffffc0a8ee60&amp;gt;] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[16614.395780] [&amp;lt;ffffffffc0b38da5&amp;gt;] tgt_request_handle+0x925/0x1370 [ptlrpc]
[16614.398086] [&amp;lt;ffffffffc0ae1b16&amp;gt;] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[16614.400490] [&amp;lt;ffffffffc0ae5252&amp;gt;] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[16614.402714] [&amp;lt;ffffffffc0ae47c0&amp;gt;] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[16614.404927] [&amp;lt;ffffffff810b252f&amp;gt;] kthread+0xcf/0xe0
[16614.406954] [&amp;lt;ffffffff810b2460&amp;gt;] ? kthread+0x0/0xe0
[16614.409011] [&amp;lt;ffffffff816b8798&amp;gt;] ret_from_fork+0x58/0x90
[16614.411056] [&amp;lt;ffffffff810b2460&amp;gt;] ? kthread+0x0/0xe0
[16614.413024] 
[16614.414668] LustreError: dumping log to /tmp/lustre-log.1521548568.5969
[16614.481244] Pid: 5896, comm: ll_ost_io00_009
[16614.483130] 
[16614.483130] Call Trace:
[16614.486250] [&amp;lt;ffffffff816ab6b9&amp;gt;] schedule+0x29/0x70
[16614.488242] [&amp;lt;ffffffff816a9004&amp;gt;] schedule_timeout+0x174/0x2c0
[16614.490222] [&amp;lt;ffffffff8109a6c0&amp;gt;] ? process_timeout+0x0/0x10
[16614.492124] [&amp;lt;ffffffffc0a9147e&amp;gt;] target_bulk_io+0x4ae/0xab0 [ptlrpc]
[16614.494226] [&amp;lt;ffffffff810c6440&amp;gt;] ? default_wake_function+0x0/0x20
[16614.496290] [&amp;lt;ffffffffc0b3cf43&amp;gt;] tgt_brw_write+0x11a3/0x17c0 [ptlrpc]
[16614.498217] [&amp;lt;ffffffffc0a8ee60&amp;gt;] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[16614.500397] [&amp;lt;ffffffffc0b38da5&amp;gt;] tgt_request_handle+0x925/0x1370 [ptlrpc]
[16614.502336] [&amp;lt;ffffffffc0ae1b16&amp;gt;] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[16614.504490] [&amp;lt;ffffffffc0ae5252&amp;gt;] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[16614.506352] [&amp;lt;ffffffffc0ae47c0&amp;gt;] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[16614.508328] [&amp;lt;ffffffff810b252f&amp;gt;] kthread+0xcf/0xe0
[16614.510082] [&amp;lt;ffffffff810b2460&amp;gt;] ? kthread+0x0/0xe0
[16614.511781] [&amp;lt;ffffffff816b8798&amp;gt;] ret_from_fork+0x58/0x90
[16614.513386] [&amp;lt;ffffffff810b2460&amp;gt;] ? kthread+0x0/0xe0
[16614.515083] 
[16614.516396] Pid: 6019, comm: ll_ost_io00_055
[16614.517950] 
[16614.517950] Call Trace:
[16614.520321] [&amp;lt;ffffffff816ab6b9&amp;gt;] schedule+0x29/0x70
[16614.521827] [&amp;lt;ffffffff816a9004&amp;gt;] schedule_timeout+0x174/0x2c0
[16614.523393] [&amp;lt;ffffffff8109a6c0&amp;gt;] ? process_timeout+0x0/0x10
[16614.524959] [&amp;lt;ffffffffc0a9147e&amp;gt;] target_bulk_io+0x4ae/0xab0 [ptlrpc]
[16614.526403] [&amp;lt;ffffffff810c6440&amp;gt;] ? default_wake_function+0x0/0x20
[16614.528061] [&amp;lt;ffffffffc0b3cf43&amp;gt;] tgt_brw_write+0x11a3/0x17c0 [ptlrpc]
[16614.529528] [&amp;lt;ffffffffc0a8ee60&amp;gt;] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[16614.531221] [&amp;lt;ffffffffc0b38da5&amp;gt;] tgt_request_handle+0x925/0x1370 [ptlrpc]
[16614.532753] [&amp;lt;ffffffffc0ae1b16&amp;gt;] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[16614.534595] [&amp;lt;ffffffffc0ae5252&amp;gt;] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[16614.536121] [&amp;lt;ffffffffc0ae47c0&amp;gt;] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[16614.540706] [&amp;lt;ffffffff810b252f&amp;gt;] kthread+0xcf/0xe0
[16614.542088] [&amp;lt;ffffffff810b2460&amp;gt;] ? kthread+0x0/0xe0
[16614.543618] [&amp;lt;ffffffff816b8798&amp;gt;] ret_from_fork+0x58/0x90
[16614.545129] [&amp;lt;ffffffff810b2460&amp;gt;] ? kthread+0x0/0xe0
[16614.546565] 
[16614.547744] Pid: 5897, comm: ll_ost_io00_010

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzztcv:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                </customfields>
    </item>
</channel>
</rss>