<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:24:34 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-2362] MDS crash in llog_process_thread </title>
                <link>https://jira.whamcloud.com/browse/LU-2362</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;On reboot of the MDS this morning, we hit the following crash:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Lustre: lstest-MDT0000: Recovery over after 5:00, of 449 clients 194 recovered and 255 were evicted.
Lustre: lstest-MDT0000: Client a449214e-338a-17fc-d926-5a3a273629af (at 172.20.17.191@o2ib500) reconnecting
Lustre: lstest-MDT0000: Client cbe0af34-1d34-affd-7602-97be89da80f8 (at 172.20.17.89@o2ib500) reconnecting
Lustre: Skipped 23 previous similar messages
Lustre: lstest-MDT0000: Client 3f7011c2-9552-065c-9d01-46c12930b0a3 (at 172.20.17.63@o2ib500) reconnecting
Lustre: Skipped 100 previous similar messages
LustreError: 11-0: lstest-OST0197-osc-MDT0000: Communicating with 172.20.3.8@o2ib500, operation ost_connect failed with -19
LustreError: Skipped 33 previous similar messages
LustreError: 11-0: lstest-MDT0000-osp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11
LustreError: Skipped 63 previous similar messages
Lustre: lstest-OST01fd-osc-MDT0000: Connection restored to lstest-OST01fd (at 172.20.3.109@o2ib500)
Lustre: lstest-OST01a3-osc-MDT0000: Connection restored to lstest-OST01a3 (at 172.20.3.19@o2ib500)
Lustre: lstest-OST01ca-osc-MDT0000: Connection restored to lstest-OST01ca (at 172.20.3.58@o2ib500)
Lustre: Skipped 1 previous similar message
Lustre: lstest-OST01cf-osc-MDT0000: Connection restored to lstest-OST01cf (at 172.20.3.63@o2ib500)
Lustre: Skipped 2 previous similar messages
Lustre: lstest-OST01ec-osc-MDT0000: Connection restored to lstest-OST01ec (at 172.20.3.92@o2ib500)
Lustre: Skipped 11 previous similar messages
Lustre: 32886:0:(llog.c:92:llog_free_handle()) Still busy: 2: 0x8d58:0x1:0: 64767 43517 1801984 1
Pid: 32886, comm: osp-syn-461


Call Trace:
 [&amp;lt;ffffffffa05be965&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [&amp;lt;ffffffffa0716e81&amp;gt;] llog_free_handle+0xb1/0x3c0 [obdclass]
 [&amp;lt;ffffffffa07171ed&amp;gt;] llog_close+0x5d/0x190 [obdclass]
 [&amp;lt;ffffffffa071cdf7&amp;gt;] llog_cat_cancel_records+0x137/0x360 [obdclass]
 [&amp;lt;ffffffffa0d30408&amp;gt;] osp_sync_process_committed+0x238/0x760 [osp]
 [&amp;lt;ffffffffa09259a7&amp;gt;] ? ptlrpcd_add_req+0x187/0x2e0 [ptlrpc]
 [&amp;lt;ffffffffa0d309c4&amp;gt;] osp_sync_process_queues+0x94/0x11c0 [osp]
 [&amp;lt;ffffffff8105ea30&amp;gt;] ? default_wake_function+0x0/0x20
 [&amp;lt;ffffffffa0718ecb&amp;gt;] llog_process_thread+0x8fb/0xe10 [obdclass]
 [&amp;lt;ffffffffa0d30930&amp;gt;] ? osp_sync_process_queues+0x0/0x11c0 [osp]
 [&amp;lt;ffffffffa071ac0d&amp;gt;] llog_process_or_fork+0x12d/0x660 [obdclass]
 [&amp;lt;ffffffffa071cc13&amp;gt;] llog_cat_process_cb+0x2d3/0x380 [obdclass]
 [&amp;lt;ffffffffa0718ecb&amp;gt;] llog_process_thread+0x8fb/0xe10 [obdclass]
 [&amp;lt;ffffffffa071c940&amp;gt;] ? llog_cat_process_cb+0x0/0x380 [obdclass]
 [&amp;lt;ffffffffa071ac0d&amp;gt;] llog_process_or_fork+0x12d/0x660 [obdclass]
 [&amp;lt;ffffffffa071b749&amp;gt;] llog_cat_process_or_fork+0x89/0x290 [obdclass]
 [&amp;lt;ffffffff8104cab9&amp;gt;] ? __wake_up_common+0x59/0x90
 [&amp;lt;ffffffffa0d30930&amp;gt;] ? osp_sync_process_queues+0x0/0x11c0 [osp]
 [&amp;lt;ffffffffa071b969&amp;gt;] llog_cat_process+0x19/0x20 [obdclass]
 [&amp;lt;ffffffffa05bf83a&amp;gt;] ? cfs_waitq_signal+0x1a/0x20 [libcfs]
 [&amp;lt;ffffffffa0d32aaa&amp;gt;] osp_sync_thread+0x22a/0x700 [osp]
 [&amp;lt;ffffffffa0d32880&amp;gt;] ? osp_sync_thread+0x0/0x700 [osp]
 [&amp;lt;ffffffff8100c14a&amp;gt;] child_rip+0xa/0x20
 [&amp;lt;ffffffffa0d32880&amp;gt;] ? osp_sync_thread+0x0/0x700 [osp]
 [&amp;lt;ffffffffa0d32880&amp;gt;] ? osp_sync_thread+0x0/0x700 [osp]
 [&amp;lt;ffffffff8100c140&amp;gt;] ? child_rip+0x0/0x20

general protection fault: 0000 [#1] SMP 
last sysfs file: /sys/devices/pci0000:80/0000:80:02.2/0000:83:00.0/host7/port-7:1/expander-7:1/port-7:1:10/end_device-7:1:10/target7:0:30/7:0:30:0/state
CPU 18 
Modules linked in: osp(U) mdt(U) mdd(U) fid(U) fld(U) lod(U) mgs(U) mgc(U) osd_zfs(U) lquota(U) ptlrpc(U) obdclass(U) lvfs(U) acpi_cpufreq freq_table mperf ko2iblnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ib_sa mlx4_ib ib_mad ib_core dm_mirror dm_region_hash dm_log dm_round_robin dm_multipath dm_mod vhost_net macvtap macvlan tun kvm zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate sg ses enclosure sd_mod crc_t10dif isci libsas wmi mpt2sas scsi_transport_sas raid_class sb_edac edac_core i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ahci ioatdma shpchp ipv6 nfs lockd fscache nfs_acl auth_rpcgss sunrpc mlx4_en mlx4_core igb dca [last unloaded: cpufreq_ondemand]

Pid: 32886, comm: osp-syn-461 Tainted: P        W  ----------------   2.6.32-220.23.1.2chaos.ch5.x86_64 #1 appro 2620x-in/S2600GZ
RIP: 0010:[&amp;lt;ffffffffa071889c&amp;gt;]  [&amp;lt;ffffffffa071889c&amp;gt;] llog_process_thread+0x2cc/0xe10 [obdclass]
RSP: 0018:ffff881fa1b6db50  EFLAGS: 00010206
RAX: 5a5a5a5a5a5a5a5a RBX: 000000000000aa01 RCX: 0000000000000000
RDX: ffff881017306940 RSI: ffff881fa1c28000 RDI: ffff880f81308000
RBP: ffff881fa1b6dbf0 R08: ffff880fffeed800 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880f7f1bc058
R13: 000000000000fcff R14: ffff880f81306000 R15: ffff881fa1b6de80
FS:  00007ffff7fdc700(0000) GS:ffff880060740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaafa4450 CR3: 0000000001a85000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process osp-syn-461 (pid: 32886, threadinfo ffff881fa1b6c000, task ffff881fa1988ae0)
Stack:
 ffff880f00002000 0000000101e5b370 ffff880f00000000 ffff880f81306001
&amp;lt;0&amp;gt; 0000a9fe00000000 0000000000000000 0000000000000000 00000000002aa000
&amp;lt;0&amp;gt; 0000fd00a1b6dbd0 ffff880f81308000 ffff881017306940 ffff880fffeed800
Call Trace:
 [&amp;lt;ffffffffa0d30930&amp;gt;] ? osp_sync_process_queues+0x0/0x11c0 [osp]
 [&amp;lt;ffffffffa071ac0d&amp;gt;] llog_process_or_fork+0x12d/0x660 [obdclass]
 [&amp;lt;ffffffffa071cc13&amp;gt;] llog_cat_process_cb+0x2d3/0x380 [obdclass]
 [&amp;lt;ffffffffa0718ecb&amp;gt;] llog_process_thread+0x8fb/0xe10 [obdclass]
 [&amp;lt;ffffffffa071c940&amp;gt;] ? llog_cat_process_cb+0x0/0x380 [obdclass]
 [&amp;lt;ffffffffa071ac0d&amp;gt;] llog_process_or_fork+0x12d/0x660 [obdclass]
 [&amp;lt;ffffffffa071b749&amp;gt;] llog_cat_process_or_fork+0x89/0x290 [obdclass]
 [&amp;lt;ffffffff8104cab9&amp;gt;] ? __wake_up_common+0x59/0x90
 [&amp;lt;ffffffffa0d30930&amp;gt;] ? osp_sync_process_queues+0x0/0x11c0 [osp]
 [&amp;lt;ffffffffa071b969&amp;gt;] llog_cat_process+0x19/0x20 [obdclass]
 [&amp;lt;ffffffffa05bf83a&amp;gt;] ? cfs_waitq_signal+0x1a/0x20 [libcfs]
 [&amp;lt;ffffffffa0d32aaa&amp;gt;] osp_sync_thread+0x22a/0x700 [osp]
 [&amp;lt;ffffffffa0d32880&amp;gt;] ? osp_sync_thread+0x0/0x700 [osp]
 [&amp;lt;ffffffff8100c14a&amp;gt;] child_rip+0xa/0x20
 [&amp;lt;ffffffffa0d32880&amp;gt;] ? osp_sync_thread+0x0/0x700 [osp]
 [&amp;lt;ffffffffa0d32880&amp;gt;] ? osp_sync_thread+0x0/0x700 [osp]
 [&amp;lt;ffffffff8100c140&amp;gt;] ? child_rip+0x0/0x20
Code: 78 0c 00 01 00 00 00 e8 33 6d eb ff 48 83 7d b8 00 0f 84 f8 03 00 00 4c 8b 45 b8 49 8b 80 b0 00 00 00 48 85 c0 0f 84 e4 03 00 00 &amp;lt;48&amp;gt; 8b 40 08 48 85 c0 0f 84 67 02 00 00 4d 89 f1 c7 04 24 00 20 
RIP  [&amp;lt;ffffffffa071889c&amp;gt;] llog_process_thread+0x2cc/0xe10 [obdclass]
 RSP &amp;lt;ffff881fa1b6db50&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
crash&amp;gt; bt
PID: 32886  TASK: ffff881fa1988ae0  CPU: 18  COMMAND: &quot;osp-syn-461&quot;
 #0 [ffff881fa1b6d8e0] machine_kexec at ffffffff8103216b
 #1 [ffff881fa1b6d940] crash_kexec at ffffffff810b8d12
 #2 [ffff881fa1b6da10] oops_end at ffffffff814f2c00
 #3 [ffff881fa1b6da40] die at ffffffff8100f26b
 #4 [ffff881fa1b6da70] do_general_protection at ffffffff814f2792
 #5 [ffff881fa1b6daa0] general_protection at ffffffff814f1f65
    [exception RIP: llog_process_thread+716]
    RIP: ffffffffa071889c  RSP: ffff881fa1b6db50  RFLAGS: 00010206
    RAX: 5a5a5a5a5a5a5a5a  RBX: 000000000000aa01  RCX: 0000000000000000
    RDX: ffff881017306940  RSI: ffff881fa1c28000  RDI: ffff880f81308000
    RBP: ffff881fa1b6dbf0   R8: ffff880fffeed800   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: ffff880f7f1bc058
    R13: 000000000000fcff  R14: ffff880f81306000  R15: ffff881fa1b6de80
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #6 [ffff881fa1b6dbf8] llog_process_or_fork at ffffffffa071ac0d [obdclass]
 #7 [ffff881fa1b6dc48] llog_cat_process_cb at ffffffffa071cc13 [obdclass]
 #8 [ffff881fa1b6dc88] llog_process_thread at ffffffffa0718ecb [obdclass]
 #9 [ffff881fa1b6dd38] llog_process_or_fork at ffffffffa071ac0d [obdclass]
#10 [ffff881fa1b6dd88] llog_cat_process_or_fork at ffffffffa071b749 [obdclass]
#11 [ffff881fa1b6de18] llog_cat_process at ffffffffa071b969 [obdclass]
#12 [ffff881fa1b6de38] osp_sync_thread at ffffffffa0d32aaa [osp]
#13 [ffff881fa1b6df48] kernel_thread at ffffffff8100c14a
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;(gdb) l *llog_process_thread+716
0x58cc is in llog_process_thread (/builddir/build/BUILD/lustre-2.3.56/lustre/include/lustre_log.h:662).
657     /builddir/build/BUILD/lustre-2.3.56/lustre/include/lustre_log.h: No such file or directory.
        in /builddir/build/BUILD/lustre-2.3.56/lustre/include/lustre_log.h
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lustre/include/lustre_log.h:

649 static inline int llog_next_block(const struct lu_env *env,                     
650                                   struct llog_handle *loghandle, int *cur_idx,  
651                                   int next_idx, __u64 *cur_offset, void *buf,   
652                                   int len)                                      
653 {                                                                               
654         struct llog_operations *lop;                                            
655         int rc;                                                                 
656                                                                                 
657         ENTRY;                                                                  
658                                                                                 
659         rc = llog_handle2ops(loghandle, &amp;amp;lop);                                  
660         if (rc)                                                                 
661                 RETURN(rc);                                                     
662         if (lop-&amp;gt;lop_next_block == NULL)                                        
663                 RETURN(-EOPNOTSUPP);                                            
664                                                                                 
665         rc = lop-&amp;gt;lop_next_block(env, loghandle, cur_idx, next_idx,             
666                                  cur_offset, buf, len);                         
667         RETURN(rc);                                                             
668 }     
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is probably related to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2109&quot; title=&quot;__llog_process_thread() GPF&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2109&quot;&gt;&lt;del&gt;LU-2109&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</description>
                <environment></environment>
        <key id="16730">LU-2362</key>
            <summary>MDS crash in llog_process_thread </summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="6" iconUrl="https://jira.whamcloud.com/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="3">Duplicate</resolution>
                                        <assignee username="liwei">Li Wei</assignee>
                                    <reporter username="prakash">Prakash Surya</reporter>
                        <labels>
                            <label>topsequoia</label>
                    </labels>
                <created>Tue, 20 Nov 2012 14:40:47 +0000</created>
                <updated>Tue, 20 Nov 2012 21:32:56 +0000</updated>
                            <resolved>Tue, 20 Nov 2012 21:32:47 +0000</resolved>
                                    <version>Lustre 2.4.0</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="48156" author="liwei" created="Tue, 20 Nov 2012 21:32:47 +0000"  >&lt;p&gt;Right, this should be a symptom of &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2109&quot; title=&quot;__llog_process_thread() GPF&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2109&quot;&gt;&lt;del&gt;LU-2109&lt;/del&gt;&lt;/a&gt;.  I&apos;m closing it as dup.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="14687">LU-2109</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvckf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>5624</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>