Set CL_CLOSE in default changelog mask

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.8.0
    • Lustre 2.7.0, Lustre 2.5.3
    • 17225

    Description

      There's no point in ignoring CL_CLOSE by default in changelogs. Robinhood needs these events, or else its database quickly goes out of sync. Their absence has caused problems for our users who forgot to enable them. So let's enable CL_CLOSE by default, which is IMO a sensible choice.

      Note that CL_CLOSE is only issued when the file was opened for writing.

          Activity

            [LU-6159] Set CL_CLOSE in default changelog mask
            pjones Peter Jones added a comment -

            Landed for 2.8

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13526/
            Subject: LU-6159 hsm: add CL_CLOSE to default changelog mask
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: f27a9e92b2507a18872008f49f9bf0e09009f1cc

            fzago Frank Zago (Inactive) added a comment -

            The same crash was already reported as LU-5938. So setting CL_CLOSE just made it reproducible, and didn't introduce it. Yeah!

            gerrit Gerrit Updater added a comment -

            frank zago (fzago@cray.com) uploaded a new patch: http://review.whamcloud.com/13619
            Subject: LU-6159 mdd: fixed oops when dereferencing structure
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 06928d07ddc3281f61cbd7340178417267c0099f

            fzago Frank Zago (Inactive) added a comment - edited

            v4 of this patch causes the MDS to crash in test_232.
            In mdd_changelog_data_store(), uc is NULL, so dereferencing it causes the oops.
            I will paper over it in the next version, but I don't know whether that's the correct fix.

            Note that running test_232 on its own doesn't cause the oops. It takes a combination of some
            of the previous tests, plus the umount_client in test_232, to trigger the issue.

            <1>BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
            <1>IP: [<ffffffffa0fe1f0f>] mdd_changelog_data_store+0x1cf/0x390 [mdd]
            <4>PGD 7c4d8067 PUD 7c4d9067 PMD 0 
            <4>Oops: 0000 [#1] SMP 
            <4>last sysfs file: /sys/devices/system/cpu/online
            <4>CPU 2 
            <4>Modules linked in: lustre(U) ofd(U) osp(U) lod(U) ost(U) mdt(U) mdd(U) mgs(U) osd_ldiskfs(U) ldiskfs(U) lquota(U) lfsck(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) ptlrpc_gss(U) ptlrpc(U) obdclass(U) ksocklnd(U) lnet(U) libcfs(U) ext2 exportfs jbd sunrpc sha512_generic sha256_generic ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge stp llc autofs4 ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate vhost_net macvtap macvlan tun microcode sg virtio_balloon virtio_net snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 i2c_core ext4 jbd2 mbcache virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: obdecho]
            <4>
            <4>Pid: 10317, comm: mdt00_000 Tainted: P           ---------------    2.6.32-431.20.3.el6_lustre.x86_64 #1 Red Hat KVM
            <4>RIP: 0010:[<ffffffffa0fe1f0f>]  [<ffffffffa0fe1f0f>] mdd_changelog_data_store+0x1cf/0x390 [mdd]
            <4>RSP: 0018:ffff88007b3a5b00  EFLAGS: 00010202
            <4>RAX: 0000000000005042 RBX: ffff88007bd30000 RCX: 0000000000000000
            <4>RDX: 0000000000001042 RSI: 0000000000000046 RDI: ffffffffa0ff457c
            <4>RBP: ffff88007b3a5b60 R08: 0000000000000000 R09: 0720072007200720
            <4>R10: 0720072007200720 R11: 0720072007200720 R12: 000000000000000b
            <4>R13: ffff88007b3a5c70 R14: ffff88004db29960 R15: 0000000000000000
            <4>FS:  0000000000000000(0000) GS:ffff880002280000(0000) knlGS:0000000000000000
            <4>CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
            <4>CR2: 0000000000000048 CR3: 000000007ab7b000 CR4: 00000000000006e0
            <4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
            <4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
            <4>Process mdt00_000 (pid: 10317, threadinfo ffff88007b3a4000, task ffff88001d4a0ae0)
            <4>Stack:
            <4> 0000000000000000 ffff880036326990 ffff88007b3a5b50 ffff88004db0d738
            <4><d> ffff88004ab2a5c0 000000424db29960 ffff88007b3a5c70 ffff88004db29960
            <4><d> ffff88007b3a5c70 0000000000000000 ffff88004ab2a5c0 ffff880036326990
            <4>Call Trace:
            <4> [<ffffffffa0fe6afe>] mdd_close+0x34e/0xc50 [mdd]
            <4> [<ffffffffa10686f1>] mdt_mfd_close+0x3f1/0xac0 [mdt]
            <4> [<ffffffffa103500e>] ? mdt_ctxt_add_dirty_flag+0x13e/0x190 [mdt]
            <4> [<ffffffffa10353f2>] mdt_obd_disconnect+0x392/0x510 [mdt]
            <4> [<ffffffffa08a54f1>] target_handle_disconnect+0x1b1/0x480 [ptlrpc]
            <4> [<ffffffffa0947ec9>] tgt_disconnect+0x39/0x160 [ptlrpc]
            <4> [<ffffffffa0948d9e>] tgt_request_handle+0x8be/0x1000 [ptlrpc]
            <4> [<ffffffffa08f8891>] ptlrpc_main+0xe41/0x1960 [ptlrpc]
            <4> [<ffffffffa08f7a50>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
            <4> [<ffffffff8109abf6>] kthread+0x96/0xa0
            <4> [<ffffffff8100c20a>] child_rip+0xa/0x20
            <4> [<ffffffff8109ab60>] ? kthread+0x0/0xa0
            <4> [<ffffffff8100c200>] ? child_rip+0x0/0x20
            <4>Code: bf 31 00 4c 89 fe 48 c7 c7 6f 45 ff a0 31 c0 e8 6c 67 54 e0 8b 45 cc 48 c7 c7 7c 45 ff a0 25 ff 0f 00 00 89 c2 80 cc 50 80 ce 10 <41> 80 7f 48 00 0f 44 c2 89 45 cc 31 c0 e8 43 67 54 e0 8b 45 cc 
            <1>RIP  [<ffffffffa0fe1f0f>] mdd_changelog_data_store+0x1cf/0x390 [mdd]
            <4> RSP <ffff88007b3a5b00>
            <4>CR2: 0000000000000048
            

            gerrit Gerrit Updater added a comment -

            frank zago (fzago@cray.com) uploaded a new patch: http://review.whamcloud.com/13526
            Subject: LU-6159 hsm: add CL_CLOSE to default changelog mask
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 56c9f07aae9229c83d6655bc5e96362a6958be85

            People

              wc-triage WC Triage
              fzago Frank Zago (Inactive)