<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:31:07 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-9994] sanity-scrub test_5: test failed to respond and timed out</title>
                <link>https://jira.whamcloud.com/browse/LU-9994</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;This issue was created by maloo for Bob Glossman &amp;lt;bob.glossman@intel.com&amp;gt;&lt;/p&gt;

&lt;p&gt;This issue relates to the following test suite run: &lt;a href=&quot;https://testing.hpdd.intel.com/test_sets/35ecbbdc-99c2-11e7-b778-5254006e85c2&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.hpdd.intel.com/test_sets/35ecbbdc-99c2-11e7-b778-5254006e85c2&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The sub-test test_5 failed with the following error:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;test failed to respond and timed out
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This failure is a panic on MDS.  It has a different stack trace than &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-5449&quot; title=&quot;Test failure on test suite sanity-scrub, subtest test_8&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-5449&quot;&gt;&lt;del&gt;LU-5449&lt;/del&gt;&lt;/a&gt; so I&apos;m creating a new ticket.&lt;br/&gt;
Panic seen:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;20:49:39:[15096.775030] LustreError: 1321:0:(llog_cat.c:269:llog_cat_id2handle()) lustre-MDT0000-osp-MDT0002: error opening log id [0x2:0x402:0x2]:0: rc = -2
20:49:39:[15096.776495] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
20:49:39:[15096.777310] IP: [&amp;lt;ffffffffc07d002a&amp;gt;] llog_process_thread+0x3a/0x1460 [obdclass]
20:49:39:[15096.778071] PGD 78ab5067 PUD 78ab4067 PMD 0 
20:49:39:[15096.778551] Oops: 0000 [#1] SMP 
20:49:39:[15096.778921] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core iosf_mbi crc32_pclmul ghash_clmulni_intel nfsd aesni_intel lrw gf128mul ppdev glue_helper ablk_helper cryptd i2c_piix4 nfs_acl lockd joydev pcspkr i2c_core virtio_balloon auth_rpcgss grace parport_pc parport sunrpc ip_tables ata_generic pata_acpi ext4 mbcache jbd2 ata_piix virtio_blk libata 8139too crct10dif_pclmul crct10dif_common crc32c_intel serio_raw virtio_pci 8139cp virtio_ring virtio mii floppy
20:49:39:[15096.788032] CPU: 1 PID: 1321 Comm: lod0002_rec0000 Tainted: G           OE  ------------   3.10.0-693.2.2.el7_lustre.x86_64 #1
20:49:39:[15096.789136] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
20:49:39:[15096.789694] task: ffff880067595ee0 ti: ffff8800606bc000 task.ti: ffff8800606bc000
20:49:39:[15096.790413] RIP: 0010:[&amp;lt;ffffffffc07d002a&amp;gt;]  [&amp;lt;ffffffffc07d002a&amp;gt;] llog_process_thread+0x3a/0x1460 [obdclass]
20:49:39:[15096.791385] RSP: 0018:ffff8800606bfb28  EFLAGS: 00010246
20:49:39:[15096.791910] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8800606bffd8
20:49:39:[15096.792588] RDX: ffff8800606bfbb8 RSI: 0000000000000000 RDI: ffff88006797f180
20:49:39:[15096.793279] RBP: ffff8800606bfbe0 R08: 0000000000019be0 R09: ffff88007d001a00
20:49:39:[15096.793977] R10: ffffffffc07d14a6 R11: 000000000000000f R12: 0000000000000000
20:49:39:[15096.794662] R13: ffff8800667ebc00 R14: ffff88006797f180 R15: 0000000000000000
20:49:39:[15096.795352] FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
20:49:39:[15096.796149] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
20:49:39:[15096.796704] CR2: 0000000000000060 CR3: 0000000077c09000 CR4: 00000000000406e0
20:49:39:[15096.797387] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
20:49:39:[15096.798084] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
20:49:39:[15096.798778] Stack:
20:49:39:[15096.798987]  ffff8800606bfe58 ffff8800606bfb90 ffffffffc05eeba7 ffff880000000010
20:49:39:[15096.799777]  ffff8800606bfba0 ffff8800606bfb60 0000000075f08900 ffff88007a318000
20:49:39:[15096.800574]  ffff88006797f180 0000000000000001 0000000000000001 0000000000000402
20:49:39:[15096.801374] Call Trace:
20:49:39:[15096.801634]  [&amp;lt;ffffffffc05eeba7&amp;gt;] ? libcfs_debug_msg+0x57/0x80 [libcfs]
20:49:39:[15096.802295]  [&amp;lt;ffffffffc07d5ccf&amp;gt;] ? llog_cat_cleanup+0x15f/0x380 [obdclass]
20:49:39:[15096.802990]  [&amp;lt;ffffffffc0eea6b0&amp;gt;] ? lodname2mdt_index+0x2f0/0x2f0 [lod]
20:49:39:[15096.803639]  [&amp;lt;ffffffffc07d150c&amp;gt;] llog_process_or_fork+0xbc/0x450 [obdclass]
20:49:39:[15096.804337]  [&amp;lt;ffffffffc07d6a3a&amp;gt;] llog_cat_process_cb+0x20a/0x220 [obdclass]
20:49:39:[15096.805040]  [&amp;lt;ffffffffc07d0865&amp;gt;] llog_process_thread+0x875/0x1460 [obdclass]
20:49:39:[15096.805752]  [&amp;lt;ffffffffc07d6830&amp;gt;] ? llog_cat_process_common+0x440/0x440 [obdclass]
20:49:39:[15096.806488]  [&amp;lt;ffffffffc07d150c&amp;gt;] llog_process_or_fork+0xbc/0x450 [obdclass]
20:49:39:[15096.807184]  [&amp;lt;ffffffffc07d6830&amp;gt;] ? llog_cat_process_common+0x440/0x440 [obdclass]
20:49:39:[15096.807936]  [&amp;lt;ffffffffc07d59b9&amp;gt;] llog_cat_process_or_fork+0x199/0x2a0 [obdclass]
20:49:39:[15096.808673]  [&amp;lt;ffffffffc0f1ae2a&amp;gt;] ? lod_sub_prep_llog+0x24a/0x783 [lod]
20:49:39:[15096.809322]  [&amp;lt;ffffffffc0eea6b0&amp;gt;] ? lodname2mdt_index+0x2f0/0x2f0 [lod]
20:49:39:[15096.809976]  [&amp;lt;ffffffffc07d5aee&amp;gt;] llog_cat_process+0x2e/0x30 [obdclass]
20:49:39:[15096.810613]  [&amp;lt;ffffffffc0ee6a89&amp;gt;] lod_sub_recovery_thread+0x439/0xc80 [lod]
20:49:39:[15096.811300]  [&amp;lt;ffffffffc0ee6650&amp;gt;] ? lod_trans_stop+0x340/0x340 [lod]
20:49:39:[15096.811920]  [&amp;lt;ffffffff810b098f&amp;gt;] kthread+0xcf/0xe0
20:49:39:[15096.812393]  [&amp;lt;ffffffff810b08c0&amp;gt;] ? insert_kthread_work+0x40/0x40
20:49:39:[15096.812987]  [&amp;lt;ffffffff816b4f18&amp;gt;] ret_from_fork+0x58/0x90
20:49:39:[15096.813516]  [&amp;lt;ffffffff810b08c0&amp;gt;] ? insert_kthread_work+0x40/0x40
20:49:39:[15096.814113] Code: 41 54 53 48 81 ec 90 00 00 00 4c 8b 27 48 8b 47 18 65 48 8b 34 25 28 00 00 00 48 89 75 d0 31 f6 f6 05 d2 d5 e3 ff 01 48 89 7d 88 &amp;lt;4d&amp;gt; 8b 6c 24 60 48 89 45 80 c7 45 c4 00 00 00 00 74 0d f6 05 b9 
20:49:39:[15096.817178] RIP  [&amp;lt;ffffffffc07d002a&amp;gt;] llog_process_thread+0x3a/0x1460 [obdclass]
20:49:39:[15096.817925]  RSP &amp;lt;ffff8800606bfb28&amp;gt;
20:49:39:[15096.818270] CR2: 0000000000000060
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;Info required for matching: sanity-scrub 5&lt;/p&gt;</description>
                <environment></environment>
        <key id="48329">LU-9994</key>
            <summary>sanity-scrub test_5: test failed to respond and timed out</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bfaccini">Bruno Faccini</assignee>
                                    <reporter username="maloo">Maloo</reporter>
                        <labels>
                    </labels>
                <created>Fri, 15 Sep 2017 06:31:06 +0000</created>
                <updated>Tue, 7 Nov 2017 07:51:13 +0000</updated>
                            <resolved>Tue, 7 Nov 2017 07:37:15 +0000</resolved>
                                                    <fixVersion>Lustre 2.11.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>7</watches>
                                                                            <comments>
<comment id="208486" author="aboyko" created="Fri, 15 Sep 2017 13:49:56 +0000"  >&lt;p&gt;I can reproduce this issue with the following test, which I created for &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9714&quot; title=&quot;Changelog consumer test reports &amp;#39;Local llog found corrupted&amp;#39;&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9714&quot;&gt;&lt;del&gt;LU-9714&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[ 1557.907952] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
[ 1557.907976] IP: [&amp;lt;ffffffffa07ceff7&amp;gt;] llog_process_thread+0x37/0x1180 [obdclass]
[ 1557.908039] PGD 53d17067 PUD 54b44067 PMD 0 
[ 1557.908054] Oops: 0000 [#1] SMP 
[ 1557.908065] Modules linked in: loop lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) mbcache lquota(OE) lfsck(OE) jbd2 obdecho(OE) mgc(OE) lov(OE) osc(OE) mdc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) nfsv3 nfs fscache snd_seq_midi snd_seq_midi_event coretemp kvm_intel kvm crc32_pclmul ghash_clmulni_intel snd_ens1371 snd_rawmidi snd_ac97_codec aesni_intel ppdev ac97_bus lrw snd_seq gf128mul glue_helper ablk_helper cryptd snd_seq_device snd_pcm vmw_balloon pcspkr snd_timer uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev snd soundcore sg shpchp vmw_vmci i2c_piix4 parport_pc parport nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod sr_mod crc_t10dif cdrom crct10dif_generic
[ 1557.908340]  ata_generic pata_acpi crct10dif_pclmul crct10dif_common crc32c_intel vmwgfx serio_raw drm_kms_helper ttm drm mptspi e1000 scsi_transport_spi ata_piix mptscsih mptbase libata i2c_core dm_mirror dm_region_hash dm_log dm_mod
[ 1557.910750] CPU: 2 PID: 13785 Comm: lctl Tainted: G           OE  ------------   3.10.0-327.10.1.el7_lustreb_neo_stable_us_MRP_3474.x86_64 #1
[ 1557.913468] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[ 1557.915940] task: ffff8801143fe780 ti: ffff88011566c000 task.ti: ffff88011566c000
[ 1557.916708] RIP: 0010:[&amp;lt;ffffffffa07ceff7&amp;gt;]  [&amp;lt;ffffffffa07ceff7&amp;gt;] llog_process_thread+0x37/0x1180 [obdclass]
[ 1557.918992] RSP: 0018:ffff88011566f968  EFLAGS: 00010246
[ 1557.920400] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88011566ffd8
[ 1557.921723] RDX: ffff88011566f9e8 RSI: 0000000000000000 RDI: ffff8800549e2a20
[ 1557.922901] RBP: ffff88011566fa10 R08: 0000000000017540 R09: ffff88013a007a00
[ 1557.924062] R10: ffff8800549e2a20 R11: 207865646e692074 R12: 0000000000000000
[ 1557.925257] R13: ffff88011566fcc0 R14: ffff8800549e2a20 R15: 0000000000000000
[ 1557.926446] FS:  00007f80aaab8740(0000) GS:ffff88013a640000(0000) knlGS:0000000000000000
[ 1557.927565] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1557.928445] CR2: 0000000000000060 CR3: 0000000050b6a000 CR4: 00000000001407e0
[ 1557.929127] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1557.929833] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1557.930501] Stack:
[ 1557.931120]  ffffffffa0657b57 ffff880100000010 ffff88011566f9d0 ffff88011566f990
[ 1557.931794]  000000005700bf62 ffff880118562000 ffff8800bb7e2c00 ffff8800549e2a20
[ 1557.932417]  0000000000000005 000000000000000a 0000000000000000 ffff88011566fa00
[ 1557.933021] Call Trace:
[ 1557.933638]  [&amp;lt;ffffffffa0657b57&amp;gt;] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[ 1557.934245]  [&amp;lt;ffffffffa07d49bf&amp;gt;] ? llog_cat_cleanup+0x15f/0x380 [obdclass]
[ 1557.934880]  [&amp;lt;ffffffffa0fd8820&amp;gt;] ? mdd_changelog_fini+0x1c0/0x1c0 [mdd]
[ 1557.935544]  [&amp;lt;ffffffffa07d01fc&amp;gt;] llog_process_or_fork+0xbc/0x450 [obdclass]
[ 1557.936105]  [&amp;lt;ffffffffa07d572a&amp;gt;] llog_cat_process_cb+0x20a/0x220 [obdclass]
[ 1557.936718]  [&amp;lt;ffffffffa07cf565&amp;gt;] llog_process_thread+0x5a5/0x1180 [obdclass]
[ 1557.937342]  [&amp;lt;ffffffffa07d5520&amp;gt;] ? llog_cat_process_common+0x440/0x440 [obdclass]
[ 1557.938130]  [&amp;lt;ffffffffa07d01fc&amp;gt;] llog_process_or_fork+0xbc/0x450 [obdclass]
[ 1557.938726]  [&amp;lt;ffffffffa07d5520&amp;gt;] ? llog_cat_process_common+0x440/0x440 [obdclass]
[ 1557.939324]  [&amp;lt;ffffffffa07d46a9&amp;gt;] llog_cat_process_or_fork+0x199/0x2a0 [obdclass]
[ 1557.939894]  [&amp;lt;ffffffffa07d02c5&amp;gt;] ? llog_process_or_fork+0x185/0x450 [obdclass]
[ 1557.940414]  [&amp;lt;ffffffffa0fd8820&amp;gt;] ? mdd_changelog_fini+0x1c0/0x1c0 [mdd]
[ 1557.941010]  [&amp;lt;ffffffffa07d47de&amp;gt;] llog_cat_process+0x2e/0x30 [obdclass]
[ 1557.941984]  [&amp;lt;ffffffffa0fd7a18&amp;gt;] llog_changelog_cancel+0x58/0x1d0 [mdd]
[ 1557.942541]  [&amp;lt;ffffffffa0fd94b7&amp;gt;] ? mdd_changelog_write_header+0x67/0x470 [mdd]
[ 1557.943062]  [&amp;lt;ffffffffa07d6334&amp;gt;] llog_cancel+0x54/0x240 [obdclass]
[ 1557.943574]  [&amp;lt;ffffffffa0fd9a9d&amp;gt;] mdd_changelog_llog_cancel+0xbd/0x250 [mdd]
[ 1557.944058]  [&amp;lt;ffffffffa0fdd286&amp;gt;] mdd_iocontrol+0x12c6/0x16b0 [mdd]
[ 1557.944517]  [&amp;lt;ffffffffa08147f4&amp;gt;] ? keys_fill+0xf4/0x170 [obdclass]
[ 1557.944992]  [&amp;lt;ffffffffa10486be&amp;gt;] mdt_iocontrol+0x5de/0xaf0 [mdt]
[ 1557.945448]  [&amp;lt;ffffffffa07f8439&amp;gt;] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[ 1557.945892]  [&amp;lt;ffffffffa07e3c13&amp;gt;] class_handle_ioctl+0x1933/0x1dc0 [obdclass]
[ 1557.946376]  [&amp;lt;ffffffff811970c8&amp;gt;] ? handle_mm_fault+0x5b8/0xf50
[ 1557.946890]  [&amp;lt;ffffffff812856b8&amp;gt;] ? security_capable+0x18/0x20
[ 1557.947420]  [&amp;lt;ffffffffa07c95d2&amp;gt;] obd_class_ioctl+0xd2/0x170 [obdclass]
[ 1557.947974]  [&amp;lt;ffffffff811f1ec5&amp;gt;] do_vfs_ioctl+0x2e5/0x4c0
[ 1557.948400]  [&amp;lt;ffffffff8164119d&amp;gt;] ? __do_page_fault+0x16d/0x450
[ 1557.948841]  [&amp;lt;ffffffff811f2141&amp;gt;] SyS_ioctl+0xa1/0xc0
[ 1557.949276]  [&amp;lt;ffffffff81645c89&amp;gt;] system_call_fastpath+0x16/0x1b
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;test&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;#!/bin/bash

NUM=100

dmesg -c &amp;gt; /dev/null
#register users
for (( i = 1; i &amp;lt; $NUM; i++)); do
        cl_users[$i]=`lctl --device lustre-MDT0000 changelog_register -n`
done

first=`lctl get_param  mdd.lustre-MDT0000.changelog_users | tail -n1 | cut -d&quot; &quot; -f2`
./lustre/tests/createmany -m /mnt/lustre/test_file_ 10000
./lustre/tests/unlinkmany /mnt/lustre/test_file_ 10000

for (( i = 1; i &amp;lt; $NUM; i++)); do
        lfs changelog_clear  lustre-MDT0000 ${cl_users[$i]} $((first+20000)) &amp;amp;
        lfs changelog_clear  lustre-MDT0000 ${cl_users[1]} $((first+20000)) &amp;amp;

done


#unregister users
for (( i = 1; i &amp;lt; $NUM; i++)); do
        lctl --device lustre-MDT0000 changelog_deregister ${cl_users[$i]} &amp;gt; /dev/null &amp;amp;
        lctl get_param  mdd.lustre-MDT0000.changelog_users &amp;gt; /dev/null &amp;amp;
done

sleep 5

race_str=`dmesg | tail -n50 | grep &apos;modify index mismatch&apos;`

echo $race_str
[[ !  -z  $race_str  ]] &amp;amp;&amp;amp; echo &apos;Race happened&apos;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It needs to be run in a loop, something like:&lt;/p&gt;

&lt;p&gt;while true; do sh ../mrp-4455/test_changelog.sh; done&lt;/p&gt;</comment>
<comment id="208487" author="aboyko" created="Fri, 15 Sep 2017 13:58:29 +0000"  >&lt;p&gt;I guess that the last line below is what failed; loghandle is NULL&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;static int llog_process_thread(void *arg)
{
        struct llog_process_info        *lpi = arg;
        struct llog_handle              *loghandle = lpi-&amp;gt;lpi_loghandle;
        struct llog_log_hdr             *llh = loghandle-&amp;gt;lgh_hdr;   &amp;lt;====

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
</comment>
<comment id="210613" author="bfaccini" created="Mon, 9 Oct 2017 15:17:20 +0000"  >&lt;p&gt;Hmm, it looks like the patch from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9153&quot; title=&quot;improve FID/config name specification for lctl llog_print and llog_info&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9153&quot;&gt;&lt;del&gt;LU-9153&lt;/del&gt;&lt;/a&gt; (&quot;llog: consolidate common error checking&quot;) has introduced a regression in the llog_cat_id2handle() error handling path (see the error message just preceding the crash) by adding a call to the new common routine llog_cat_process_common() in the following sequence:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;static int llog_cat_process_common(const struct lu_env *env,
                                   struct llog_handle *cat_llh,
                                   struct llog_rec_hdr *rec,
                                   struct llog_handle **llhp)
{
        struct llog_logid_rec *lir = container_of(rec, typeof(*lir), lid_hdr);
        struct llog_log_hdr *hdr;
        int rc;
         
        ENTRY;
        if (rec-&amp;gt;lrh_type != le32_to_cpu(LLOG_LOGID_MAGIC)) {
                rc = -EINVAL;
                CWARN(&quot;%s: invalid record in catalog &quot;DFID&quot;:%x: rc = %d\n&quot;,
                      cat_llh-&amp;gt;lgh_ctxt-&amp;gt;loc_obd-&amp;gt;obd_name,
                      PFID(&amp;amp;cat_llh-&amp;gt;lgh_id.lgl_oi.oi_fid),
                      cat_llh-&amp;gt;lgh_id.lgl_ogen, rc);
                RETURN(rc);
        } 
        CDEBUG(D_HA, &quot;processing log &quot;DFID&quot;:%x at index %u of catalog &quot;DFID&quot;\n&quot;,
               PFID(&amp;amp;lir-&amp;gt;lid_id.lgl_oi.oi_fid), lir-&amp;gt;lid_id.lgl_ogen,
               le32_to_cpu(rec-&amp;gt;lrh_index),
               PFID(&amp;amp;cat_llh-&amp;gt;lgh_id.lgl_oi.oi_fid));
    
        rc = llog_cat_id2handle(env, cat_llh, llhp, &amp;amp;lir-&amp;gt;lid_id);
        if (rc) {
                /* After a server crash, a stub of index record in catlog could
                 * be kept, because plain log destroy + catlog index record
                 * deletion are not atomic. So we end up with an index but no
                 * actual record. Destroy the index and move on. */
                if (rc == -ENOENT || rc == -ESTALE)
                        rc = llog_cat_cleanup(env, cat_llh, NULL,  &amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt; rc can be cleared!!!
                                              rec-&amp;gt;lrh_index);
                if (rc)
                        CWARN(&quot;%s: can&apos;t find llog handle &quot;DFID&quot;:%x: rc = %d\n&quot;,
                              cat_llh-&amp;gt;lgh_ctxt-&amp;gt;loc_obd-&amp;gt;obd_name,
                              PFID(&amp;amp;lir-&amp;gt;lid_id.lgl_oi.oi_fid),
                              lir-&amp;gt;lid_id.lgl_ogen, rc);

                RETURN(rc);
        }
.......................

static int llog_cat_process_cb(const struct lu_env *env,
                               struct llog_handle *cat_llh,
                               struct llog_rec_hdr *rec, void *data)
{
        struct llog_process_data *d = data;
        struct llog_handle *llh = NULL;
        int rc;

        ENTRY;
        rc = llog_cat_process_common(env, cat_llh, rec, &amp;amp;llh);
        if (rc) &amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt; no error handling??
                GOTO(out, rc);

        if (rec-&amp;gt;lrh_index &amp;lt; d-&amp;gt;lpd_startcat) {
                /* Skip processing of the logs until startcat */
                rc = 0;
        } else if (d-&amp;gt;lpd_startidx &amp;gt; 0) {
                struct llog_process_cat_data cd;
        
                cd.lpcd_first_idx = d-&amp;gt;lpd_startidx;
                cd.lpcd_last_idx = 0;
                rc = llog_process_or_fork(env, llh, d-&amp;gt;lpd_cb, d-&amp;gt;lpd_data,  &amp;lt;&amp;lt;&amp;lt; can be called with llh==NULL 
                                          &amp;amp;cd, false);
                /* Continue processing the next log from idx 0 */
                d-&amp;gt;lpd_startidx = 0;
        } else {
                rc = llog_process_or_fork(env, llh, d-&amp;gt;lpd_cb, d-&amp;gt;lpd_data, &amp;lt;&amp;lt;&amp;lt; can be called with llh==NULL
                                          NULL, false);
        }
    
............................

&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Will try to cook a patch to fix this asap.&lt;/p&gt;
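&lt;p&gt;To make the failure mode concrete, here is a minimal model (a sketch in Python, not the real Lustre code; the function names mirror the kernel ones but are hypothetical stand-ins): when the plain log named by a catalog index record is gone, llog_cat_cleanup() succeeds and its return value of 0 overwrites rc, so the caller sees success while the handle is still NULL and dereferences it. The fix is to treat a cleared rc with a NULL handle as &quot;skip this record&quot;.&lt;/p&gt;

```python
import errno

def cat_cleanup():
    # Models llog_cat_cleanup(): removing the stale catalog index succeeds.
    return 0

def process_common(open_rc):
    # Models llog_cat_process_common(): returns (rc, handle). On -ENOENT the
    # stale index is cleaned up and rc is overwritten by the cleanup result,
    # while the handle is still unset -- this is the buggy path.
    handle = None
    if open_rc == -errno.ENOENT:
        return cat_cleanup(), handle  # rc cleared to 0, handle still None
    return open_rc, handle

def process_cb(open_rc):
    # Models a fixed llog_cat_process_cb(): a None handle with rc == 0 means
    # the record was a stale stub that got cleaned up, so skip it rather than
    # dereference it (the NULL pointer dereference at offset 0x60 above).
    rc, handle = process_common(open_rc)
    if rc != 0:
        return rc
    if handle is None:
        return 0  # skip the cleaned-up record
    return 1  # would process the plain log via the handle here

print(process_cb(-errno.ENOENT))  # prints 0: record skipped, no crash
```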

</comment>
                            <comment id="210624" author="gerrit" created="Mon, 9 Oct 2017 16:11:35 +0000"  >&lt;p&gt;Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/29370&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/29370&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9994&quot; title=&quot;sanity-scrub test_5: test failed to respond and timed out&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9994&quot;&gt;&lt;del&gt;LU-9994&lt;/del&gt;&lt;/a&gt; obdclass: fix llog_cat_id2handle() error handling&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: e9c7d9a668f01b4c8db741a6a92f037e5bfa1982&lt;/p&gt;</comment>
<comment id="210931" author="bfaccini" created="Thu, 12 Oct 2017 13:07:42 +0000"  >&lt;p&gt;BTW, at the moment there is a running series of 57 identical failures/timeouts of sanity-scrub/test_5 being reported in Maloo, starting on July 19th (the date when &quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9153&quot; title=&quot;improve FID/config name specification for lctl llog_print and llog_info&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9153&quot;&gt;&lt;del&gt;LU-9153&lt;/del&gt;&lt;/a&gt; llog: consolidate common error checking&quot; landed on master!).&lt;/p&gt;</comment>
                            <comment id="210932" author="bfaccini" created="Thu, 12 Oct 2017 13:09:32 +0000"  >&lt;p&gt;Upgrading priority to Major as it appears to have some impact on auto-tests.&lt;/p&gt;</comment>
                            <comment id="211107" author="gerrit" created="Sat, 14 Oct 2017 11:33:25 +0000"  >&lt;p&gt;Fan Yong (fan.yong@intel.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/29616&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/29616&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9994&quot; title=&quot;sanity-scrub test_5: test failed to respond and timed out&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9994&quot;&gt;&lt;del&gt;LU-9994&lt;/del&gt;&lt;/a&gt; obdclass: skip llog_process for empty handle&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: bed20ef1b81873a2d7991f84c7fc7a165ecd8b8c&lt;/p&gt;</comment>
<comment id="211108" author="yong.fan" created="Sat, 14 Oct 2017 11:36:58 +0000"  >&lt;p&gt;I made the patch because most of the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-7585&quot; title=&quot;Implement OI Scrub for ZFS&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-7585&quot;&gt;&lt;del&gt;LU-7585&lt;/del&gt;&lt;/a&gt; patches failed on Maloo. I am not sure whether this patch will resolve the issue, but let&apos;s try.&lt;/p&gt;</comment>
                            <comment id="212685" author="bogl" created="Thu, 2 Nov 2017 18:46:48 +0000"  >&lt;p&gt;I see two proposed fixes here:&lt;br/&gt;
&lt;a href=&quot;https://review.whamcloud.com/29370&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/29370&lt;/a&gt;&lt;br/&gt;
&lt;a href=&quot;https://review.whamcloud.com/29616&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/29616&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Are both needed? If not, is one preferred over the other?&lt;/p&gt;</comment>
                            <comment id="212721" author="gerrit" created="Fri, 3 Nov 2017 04:25:58 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/29370/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/29370/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9994&quot; title=&quot;sanity-scrub test_5: test failed to respond and timed out&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9994&quot;&gt;&lt;del&gt;LU-9994&lt;/del&gt;&lt;/a&gt; obdclass: fix llog_cat_id2handle() error handling&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 97fe84143da81d42770ae29298403aad80103f94&lt;/p&gt;</comment>
                            <comment id="212958" author="adilger" created="Tue, 7 Nov 2017 07:37:15 +0000"  >&lt;p&gt;Only one of the two patches needed to land.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="48726">LU-10113</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="44075">LU-9153</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzzk9b:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>