Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-6722

sanity-lfsck test_1a: FAIL: (3) Fail to start LFSCK for namespace!

Details

    • 3
    • 9223372036854775807

    Description

      sanity-lfsck test 1a failed as follows:

      CMD: shadow-5vm12 /usr/sbin/lctl set_param fail_loc=0x1501
      CMD: shadow-5vm12 /usr/sbin/lctl set_param fail_loc=0
      10.1.4.53@tcp:/lustre /mnt/lustre lustre rw,flock,user_xattr 0 0
      CMD: shadow-5vm1.shadow.whamcloud.com grep -c /mnt/lustre' ' /proc/mounts
      Stopping client shadow-5vm1.shadow.whamcloud.com /mnt/lustre (opts:)
      CMD: shadow-5vm1.shadow.whamcloud.com lsof -t /mnt/lustre
      CMD: shadow-5vm1.shadow.whamcloud.com umount  /mnt/lustre 2>&1
      CMD: shadow-5vm12 /usr/sbin/lctl lfsck_start -M lustre-MDT0000 -t namespace -r
      shadow-5vm12: Fail to start LFSCK: Read-only file system
       sanity-lfsck test_1a: @@@@@@ FAIL: (3) Fail to start LFSCK for namespace! 
      

      Maloo report: https://testing.hpdd.intel.com/test_sets/8520310e-108d-11e5-a2d3-5254006e85c2

      Attachments

        Issue Links

          Activity

            [LU-6722] sanity-lfsck test_1a: FAIL: (3) Fail to start LFSCK for namespace!

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/19732/
            Subject: LU-6722 ldiskfs: fix credits at ldiskfs_delete_inode
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: d36f00b3b7c977acf1a092d188535d34d079c57a

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/19732/ Subject: LU-6722 ldiskfs: fix credits at ldiskfs_delete_inode Project: fs/lustre-release Branch: master Current Patch Set: Commit: d36f00b3b7c977acf1a092d188535d34d079c57a

            Alexander Boyko (alexander.boyko@seagate.com) uploaded a new patch: http://review.whamcloud.com/19732
            Subject: LU-6722 ldiskfs: fix credits at ldiskfs_delete_inode
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 881c189401b7dd0d3bcc1f7846e77d7d95ebc9c7

            gerrit Gerrit Updater added a comment - Alexander Boyko (alexander.boyko@seagate.com) uploaded a new patch: http://review.whamcloud.com/19732 Subject: LU-6722 ldiskfs: fix credits at ldiskfs_delete_inode Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 881c189401b7dd0d3bcc1f7846e77d7d95ebc9c7

            The new issue will be resolved in LU-7045

            yong.fan nasf (Inactive) added a comment - The new issue will be resolved in LU-7045

            The latest MDS corruption is different. The transaction was aborted because the osp_write_local_file() did not declare enough credits for write, the original failure was caused by less credits when delete/destroy. I have open a new ticket LU-7045 for the new issue.

            yong.fan nasf (Inactive) added a comment - The latest MDS corruption is different. The transaction was aborted because the osp_write_local_file() did not declare enough credits for write, the original failure was caused by less credits when delete/destroy. I have open a new ticket LU-7045 for the new issue.
            yujian Jian Yu added a comment -

            On master branch, in the following report:
            https://testing.hpdd.intel.com/sub_tests/ebb78d16-1044-11e5-8fc4-5254006e85c2

            Dmesg on MDS 1 (shadow-15vm8):

            [12947.228164] ------------[ cut here ]------------
            [12947.228652] WARNING: at /var/lib/jenkins/workspace/lustre-master/arch/x86_64/build_type/server/distro/el7/ib_stack/inkernel/BUILD/BUILD/lustre-2.7.54/ldiskfs/ext4_jbd2.c:260 __ldiskfs_handle_dirty_metadata+0x1c2/0x220 [ldiskfs]()
            [12947.230373] Modules linked in: osp(OF) mdd(OF) lod(OF) mdt(OF) lfsck(OF) mgs(OF) mgc(OF) osd_ldiskfs(OF) lquota(OF) fid(OF) fld(OF) ksocklnd(OF) ptlrpc(OF) obdclass(OF) lnet(OF) libcfs(OF) ldiskfs(OF) sha512_generic dm_mod nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd fscache xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ppdev pcspkr serio_raw parport_pc virtio_balloon i2c_piix4 parport ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm 8139too virtio_pci 8139cp virtio_ring virtio mii drm ata_piix libata i2c_core floppy [last unloaded: libcfs]
            
            [12947.237641] CPU: 1 PID: 18882 Comm: llog_process_th Tainted: GF       W  O--------------   3.10.0-229.4.2.el7_lustre.g1fee634.x86_64 #1
            [12947.238718] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
            [12947.239582]  0000000000000000 000000005e4d00bc ffff88006d6a77f0 ffffffff816050da
            [12947.240672]  ffff88006d6a7828 ffffffff8106e34b ffff88000083fc98 ffff88007b5877e0
            [12947.241691]  ffff88005c362708 ffffffffa062745c 0000000000000325 ffff88006d6a7838
            [12947.242719] Call Trace:
            [12947.243043]  [<ffffffff816050da>] dump_stack+0x19/0x1b
            [12947.243689]  [<ffffffff8106e34b>] warn_slowpath_common+0x6b/0xb0
            [12947.244632]  [<ffffffff8106e49a>] warn_slowpath_null+0x1a/0x20
            [12947.245577]  [<ffffffffa05d26b2>] __ldiskfs_handle_dirty_metadata+0x1c2/0x220 [ldiskfs]
            [12947.246666]  [<ffffffffa05e30c1>] ldiskfs_getblk+0x131/0x200 [ldiskfs]
            [12947.247508]  [<ffffffffa05e31b7>] ldiskfs_bread+0x27/0xc0 [ldiskfs]
            [12947.248258]  [<ffffffffa0bca919>] osd_ldiskfs_write_record+0x169/0x350 [osd_ldiskfs]
            [12947.248974]  [<ffffffffa0bcabdf>] osd_write+0xdf/0x200 [osd_ldiskfs]
            [12947.249558]  [<ffffffffa0765399>] dt_record_write+0x39/0x120 [obdclass]
            [12947.250194]  [<ffffffffa0eed261>] osp_write_local_file.isra.16+0x284/0x30c [osp]
            [12947.250845]  [<ffffffffa0eed7ca>] osp_last_used_init+0x1e7/0xaa6 [osp]
            [12947.251417]  [<ffffffffa0ecef84>] osp_init0.isra.17+0x1b44/0x2060 [osp]
            [12947.252042]  [<ffffffffa0ecf549>] osp_device_alloc+0xa9/0x1a0 [osp]
            [12947.252606]  [<ffffffffa074b6c4>] obd_setup+0x114/0x2a0 [obdclass]
            [12947.253147]  [<ffffffffa074f430>] class_setup+0x2a0/0x820 [obdclass]
            [12947.253735]  [<ffffffffa0753b0c>] class_process_config+0x224c/0x2e50 [obdclass]
            [12947.254369]  [<ffffffffa0571967>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
            [12947.254979]  [<ffffffffa0754f6f>] class_config_llog_handler+0x85f/0x16e0 [obdclass]
            [12947.255645]  [<ffffffffa071945a>] llog_process_thread+0x7aa/0xe90 [obdclass]
            [12947.256291]  [<ffffffffa071a4f0>] ? llog_backup+0x500/0x500 [obdclass]
            [12947.256883]  [<ffffffffa071a53c>] llog_process_thread_daemonize+0x4c/0x80 [obdclass]
            [12947.257551]  [<ffffffff8109739f>] kthread+0xcf/0xe0
            [12947.258000]  [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140
            [12947.258565]  [<ffffffff81614f7c>] ret_from_fork+0x7c/0xb0
            [12947.259066]  [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140
            [12947.259689] ---[ end trace ddd376db706c5663 ]---
            [12947.260092] LDISKFS-fs: ldiskfs_getblk:805: aborting transaction: error 28 in __ldiskfs_handle_dirty_metadata
            [12947.260934] LDISKFS-fs error (device dm-0): ldiskfs_getblk:805: inode #132: block 16827: comm llog_process_th: journal_dirty_metadata failed: handle type 0 started at line 967, credits 6/0, errcode -28
            [12947.275334] Aborting journal on device dm-0-8.
            [12947.287404] LDISKFS-fs (dm-0): Remounting filesystem read-only
            [12947.288619] LustreError: 18882:0:(osd_io.c:1645:osd_ldiskfs_write_record()) dm-0: error reading offset 84152 (block 20): rc = -28
            [12947.290466] LDISKFS-fs error (device dm-0) in osd_trans_stop:1052: error 28
            [12947.290556] LustreError: 18458:0:(osd_handler.c:829:osd_trans_commit_cb()) transaction @0xffff8800700b43c0 commit error: 2
            [12947.309035] LustreError: 18882:0:(osd_handler.c:1055:osd_trans_stop()) lustre-MDT0000-osd: failed to stop transaction: rc = -28
            [12947.310070] LustreError: 18882:0:(osp_dev.c:283:osp_init_last_objid()) lustre-OST2917-osc-MDT0000: can't initialize lov_objid: rc = -28
            [12947.311186] LustreError: 18882:0:(osp_dev.c:369:osp_last_used_init()) lustre-OST2917-osc-MDT0000: Can not get ids -28 from old objid!
            [12947.312480] LustreError: 18882:0:(obd_config.c:558:class_setup()) setup lustre-OST2917-osc-MDT0000 failed (-28)
            [12947.313339] LustreError: 18882:0:(obd_config.c:1601:class_config_llog_handler()) MGC10.1.4.179@tcp: cfg command failed: rc = -28
            [12947.314362] Lustre:    cmd=cf003 0:lustre-OST2917-osc-MDT0000  1:lustre-OST2917_UUID  2:10.1.4.178@tcp  
            
            [12947.315420] LustreError: 18477:0:(mgc_request.c:527:do_requeue()) failed processing log: -28
            [12948.576155] LustreError: 18880:0:(tgt_lastrcvd.c:591:tgt_client_new()) lustre-MDT0000: Failed to write client lcd at idx 3, rc -30
            [12953.576890] LustreError: 18480:0:(tgt_lastrcvd.c:591:tgt_client_new()) lustre-MDT0000: Failed to write client lcd at idx 3, rc -30
            
            yujian Jian Yu added a comment - On master branch, in the following report: https://testing.hpdd.intel.com/sub_tests/ebb78d16-1044-11e5-8fc4-5254006e85c2 Dmesg on MDS 1 (shadow-15vm8): [12947.228164] ------------[ cut here ]------------ [12947.228652] WARNING: at /var/lib/jenkins/workspace/lustre-master/arch/x86_64/build_type/server/distro/el7/ib_stack/inkernel/BUILD/BUILD/lustre-2.7.54/ldiskfs/ext4_jbd2.c:260 __ldiskfs_handle_dirty_metadata+0x1c2/0x220 [ldiskfs]() [12947.230373] Modules linked in: osp(OF) mdd(OF) lod(OF) mdt(OF) lfsck(OF) mgs(OF) mgc(OF) osd_ldiskfs(OF) lquota(OF) fid(OF) fld(OF) ksocklnd(OF) ptlrpc(OF) obdclass(OF) lnet(OF) libcfs(OF) ldiskfs(OF) sha512_generic dm_mod nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd fscache xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ppdev pcspkr serio_raw parport_pc virtio_balloon i2c_piix4 parport ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm 8139too virtio_pci 8139cp virtio_ring virtio mii drm ata_piix libata i2c_core floppy [last unloaded: libcfs] [12947.237641] CPU: 1 PID: 18882 Comm: llog_process_th Tainted: GF W O-------------- 3.10.0-229.4.2.el7_lustre.g1fee634.x86_64 #1 [12947.238718] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 [12947.239582] 0000000000000000 000000005e4d00bc ffff88006d6a77f0 ffffffff816050da [12947.240672] ffff88006d6a7828 ffffffff8106e34b ffff88000083fc98 ffff88007b5877e0 [12947.241691] ffff88005c362708 ffffffffa062745c 0000000000000325 ffff88006d6a7838 [12947.242719] Call Trace: [12947.243043] [<ffffffff816050da>] dump_stack+0x19/0x1b [12947.243689] [<ffffffff8106e34b>] warn_slowpath_common+0x6b/0xb0 [12947.244632] [<ffffffff8106e49a>] warn_slowpath_null+0x1a/0x20 [12947.245577] [<ffffffffa05d26b2>] __ldiskfs_handle_dirty_metadata+0x1c2/0x220 [ldiskfs] [12947.246666] [<ffffffffa05e30c1>] ldiskfs_getblk+0x131/0x200 [ldiskfs] [12947.247508] [<ffffffffa05e31b7>] ldiskfs_bread+0x27/0xc0 [ldiskfs] [12947.248258] [<ffffffffa0bca919>] osd_ldiskfs_write_record+0x169/0x350 [osd_ldiskfs] [12947.248974] [<ffffffffa0bcabdf>] osd_write+0xdf/0x200 [osd_ldiskfs] [12947.249558] [<ffffffffa0765399>] dt_record_write+0x39/0x120 [obdclass] [12947.250194] [<ffffffffa0eed261>] osp_write_local_file.isra.16+0x284/0x30c [osp] [12947.250845] [<ffffffffa0eed7ca>] osp_last_used_init+0x1e7/0xaa6 [osp] [12947.251417] [<ffffffffa0ecef84>] osp_init0.isra.17+0x1b44/0x2060 [osp] [12947.252042] [<ffffffffa0ecf549>] osp_device_alloc+0xa9/0x1a0 [osp] [12947.252606] [<ffffffffa074b6c4>] obd_setup+0x114/0x2a0 [obdclass] [12947.253147] [<ffffffffa074f430>] class_setup+0x2a0/0x820 [obdclass] [12947.253735] [<ffffffffa0753b0c>] class_process_config+0x224c/0x2e50 [obdclass] [12947.254369] [<ffffffffa0571967>] ? libcfs_debug_msg+0x57/0x80 [libcfs] [12947.254979] [<ffffffffa0754f6f>] class_config_llog_handler+0x85f/0x16e0 [obdclass] [12947.255645] [<ffffffffa071945a>] llog_process_thread+0x7aa/0xe90 [obdclass] [12947.256291] [<ffffffffa071a4f0>] ? llog_backup+0x500/0x500 [obdclass] [12947.256883] [<ffffffffa071a53c>] llog_process_thread_daemonize+0x4c/0x80 [obdclass] [12947.257551] [<ffffffff8109739f>] kthread+0xcf/0xe0 [12947.258000] [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140 [12947.258565] [<ffffffff81614f7c>] ret_from_fork+0x7c/0xb0 [12947.259066] [<ffffffff810972d0>] ? kthread_create_on_node+0x140/0x140 [12947.259689] ---[ end trace ddd376db706c5663 ]--- [12947.260092] LDISKFS-fs: ldiskfs_getblk:805: aborting transaction: error 28 in __ldiskfs_handle_dirty_metadata [12947.260934] LDISKFS-fs error (device dm-0): ldiskfs_getblk:805: inode #132: block 16827: comm llog_process_th: journal_dirty_metadata failed: handle type 0 started at line 967, credits 6/0, errcode -28 [12947.275334] Aborting journal on device dm-0-8. [12947.287404] LDISKFS-fs (dm-0): Remounting filesystem read-only [12947.288619] LustreError: 18882:0:(osd_io.c:1645:osd_ldiskfs_write_record()) dm-0: error reading offset 84152 (block 20): rc = -28 [12947.290466] LDISKFS-fs error (device dm-0) in osd_trans_stop:1052: error 28 [12947.290556] LustreError: 18458:0:(osd_handler.c:829:osd_trans_commit_cb()) transaction @0xffff8800700b43c0 commit error: 2 [12947.309035] LustreError: 18882:0:(osd_handler.c:1055:osd_trans_stop()) lustre-MDT0000-osd: failed to stop transaction: rc = -28 [12947.310070] LustreError: 18882:0:(osp_dev.c:283:osp_init_last_objid()) lustre-OST2917-osc-MDT0000: can't initialize lov_objid: rc = -28 [12947.311186] LustreError: 18882:0:(osp_dev.c:369:osp_last_used_init()) lustre-OST2917-osc-MDT0000: Can not get ids -28 from old objid! [12947.312480] LustreError: 18882:0:(obd_config.c:558:class_setup()) setup lustre-OST2917-osc-MDT0000 failed (-28) [12947.313339] LustreError: 18882:0:(obd_config.c:1601:class_config_llog_handler()) MGC10.1.4.179@tcp: cfg command failed: rc = -28 [12947.314362] Lustre: cmd=cf003 0:lustre-OST2917-osc-MDT0000 1:lustre-OST2917_UUID 2:10.1.4.178@tcp [12947.315420] LustreError: 18477:0:(mgc_request.c:527:do_requeue()) failed processing log: -28 [12948.576155] LustreError: 18880:0:(tgt_lastrcvd.c:591:tgt_client_new()) lustre-MDT0000: Failed to write client lcd at idx 3, rc -30 [12953.576890] LustreError: 18480:0:(tgt_lastrcvd.c:591:tgt_client_new()) lustre-MDT0000: Failed to write client lcd at idx 3, rc -30

            We are going to need a patch for SLES12 now that it is supported.

            simmonsja James A Simmons added a comment - We are going to need a patch for SLES12 now that it is supported.
            pjones Peter Jones added a comment -

            Landed for 2.8

            pjones Peter Jones added a comment - Landed for 2.8

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15334/
            Subject: LU-6722 ldiskfs: declare credits for quota when destroy inode
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: e065d04c3cdcf40ae4d4fd4aee0b2548aca4d045

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15334/ Subject: LU-6722 ldiskfs: declare credits for quota when destroy inode Project: fs/lustre-release Branch: master Current Patch Set: Commit: e065d04c3cdcf40ae4d4fd4aee0b2548aca4d045

            Andreas Dilger (andreas.dilger@intel.com) merged in patch http://review.whamcloud.com/15401/
            Subject: LU-6722 jbd: double minimum journal size for RHEL7
            Project: tools/e2fsprogs
            Branch: master-lustre
            Current Patch Set:
            Commit: 15d2f58cd493a73f860a4415cac0da48c932e72a

            gerrit Gerrit Updater added a comment - Andreas Dilger (andreas.dilger@intel.com) merged in patch http://review.whamcloud.com/15401/ Subject: LU-6722 jbd: double minimum journal size for RHEL7 Project: tools/e2fsprogs Branch: master-lustre Current Patch Set: Commit: 15d2f58cd493a73f860a4415cac0da48c932e72a
            gerrit Gerrit Updater added a comment - - edited

            nevermind

            gerrit Gerrit Updater added a comment - - edited nevermind

            People

              yong.fan nasf (Inactive)
              yujian Jian Yu
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: