
sanity-lfsck test_1a: FAIL: (3) Fail to start LFSCK for namespace!

    Description

      sanity-lfsck test 1a failed as follows:

      CMD: shadow-5vm12 /usr/sbin/lctl set_param fail_loc=0x1501
      CMD: shadow-5vm12 /usr/sbin/lctl set_param fail_loc=0
      10.1.4.53@tcp:/lustre /mnt/lustre lustre rw,flock,user_xattr 0 0
      CMD: shadow-5vm1.shadow.whamcloud.com grep -c /mnt/lustre' ' /proc/mounts
      Stopping client shadow-5vm1.shadow.whamcloud.com /mnt/lustre (opts:)
      CMD: shadow-5vm1.shadow.whamcloud.com lsof -t /mnt/lustre
      CMD: shadow-5vm1.shadow.whamcloud.com umount  /mnt/lustre 2>&1
      CMD: shadow-5vm12 /usr/sbin/lctl lfsck_start -M lustre-MDT0000 -t namespace -r
      shadow-5vm12: Fail to start LFSCK: Read-only file system
       sanity-lfsck test_1a: @@@@@@ FAIL: (3) Fail to start LFSCK for namespace! 
      

      Maloo report: https://testing.hpdd.intel.com/test_sets/8520310e-108d-11e5-a2d3-5254006e85c2

          Activity

            [LU-6722] sanity-lfsck test_1a: FAIL: (3) Fail to start LFSCK for namespace!

            gerrit Gerrit Updater added a comment -

            Andreas Dilger (andreas.dilger@intel.com) merged in patch http://review.whamcloud.com/15401/
            Subject: LU-6722 jbd: double minimum journal size for RHEL7
            Project: tools/e2fsprogs
            Branch: master-lustre
            Current Patch Set:
            Commit: 15d2f58cd493a73f860a4415cac0da48c932e72a

            adilger Andreas Dilger added a comment -

            Fan Yong,
            there is also a generic issue: RHEL7 only allows a single transaction to be half the size that RHEL6 allows. So in addition to your patch to fix the transaction credits for setxattr, a second patch is needed to double the minimum journal size from 4MB to 8MB when running on kernels 3.10 and later.
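
            The arithmetic behind doubling the journal can be sketched as below. This is only an illustration: it assumes that jbd2 caps one transaction at a quarter of the journal blocks and that the filesystem uses 4096-byte blocks, and the helper names (journal_blocks, max_transaction_buffers, handle_credit_limit) are hypothetical, not taken from Lustre or e2fsprogs.

```c
#include <stdio.h>

/* Hypothetical helper: journal size in MB -> number of journal blocks. */
unsigned int journal_blocks(unsigned int journal_mb, unsigned int block_size)
{
	return (unsigned int)(((unsigned long long)journal_mb << 20) / block_size);
}

/* Assumption: jbd2 limits one transaction to a quarter of the journal. */
unsigned int max_transaction_buffers(unsigned int nblocks)
{
	return nblocks / 4;
}

/* Credits a single handle may request. "halved" models the RHEL7 / 3.10+
 * behaviour described in the comment above, where only half of the
 * per-transaction limit is usable. */
unsigned int handle_credit_limit(unsigned int journal_mb,
				 unsigned int block_size, int halved)
{
	unsigned int max = max_transaction_buffers(
				journal_blocks(journal_mb, block_size));

	return halved ? max / 2 : max;
}

/* Print the limit for a 4MB and an 8MB journal, with and without halving. */
void print_limits(void)
{
	unsigned int mb;

	for (mb = 4; mb <= 8; mb *= 2)
		printf("%uMB journal: full limit %u, halved limit %u\n", mb,
		       handle_credit_limit(mb, 4096, 0),
		       handle_credit_limit(mb, 4096, 1));
}
```

            Under these assumptions an 8MB journal on a halving kernel gives the same per-handle limit that a 4MB journal gives without halving, which is the point of the proposed second patch.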

            gerrit Gerrit Updater added a comment -

            Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/15334
            Subject: LU-6722 ldiskfs: more credits to destroy inode with large EA
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 4f54dc215328cc9dbc2cab1236db84de29afe6ad

            yong.fan nasf (Inactive) added a comment - - edited

            Patch http://review.whamcloud.com/10376 mainly causes OSD-ldiskfs to complain that someone declared too many credits, but that is not fatal. The root cause of this failure is inside ldiskfs itself: quota modification is not taken into account in ldiskfs_xattr_delete_inode().

            sarah Sarah Liu added a comment -

            This issue caused many failures on EL7 client/server:

            https://testing.hpdd.intel.com/test_sessions/9d7790da-135d-11e5-b4b0-5254006e85c2

            adilger Andreas Dilger added a comment - - edited

            The reason it is complaining is because of a change to the underlying JBD2 transaction limits:

            static int osd_param_is_not_sane(const struct osd_device *dev,
                                             const struct thandle *th)
            {               
                    struct osd_thandle *oh = container_of(th, typeof(*oh), ot_super);
                            
                    return oh->ot_credits > osd_transaction_size(dev);
            }
            
            #ifdef LDISKFS_HT_MISC
            # define osd_transaction_size(dev) (osd_journal(dev)->j_max_transaction_buffers / 2)
            #else
            # define osd_transaction_size(dev) (osd_journal(dev)->j_max_transaction_buffers)
            #endif
            
            static int osd_trans_start(const struct lu_env *env, struct dt_device *d,
                                       struct thandle *th)
            {
                    :
                    :
                            CWARN("%.16s: too many transaction credits (%d > %d)\n",
                                  LDISKFS_SB(osd_sb(dev))->s_es->s_volume_name,
                                  oh->ot_credits,   
                                  osd_journal(dev)->j_max_transaction_buffers);
            

            The CWARN() message is misleading: it should print osd_transaction_size(dev) instead of accessing j_max_transaction_buffers directly. This was missed in http://review.whamcloud.com/10376 originally.

            I expect that LDISKFS_HT_MISC is only defined for RHEL 7 (the original upstream patch 8f7d89f36 is in 3.10 and later), and this cuts the maximum transaction size in half. I don't think this will be a problem in real usage, only when the journal is very small, since this is a single-transaction limit.

            An easy solution would be to increase the minimum journal size from 4MB to 8MB for RHEL7. Does the test environment specify the journal size explicitly? It seems that the default journal size should be 4096 blocks for filesystems over 128MB (according to mke2fs ext2fs_default_journal_size()). According to lustre/tests/cfg/local.sh, the default OSTSIZE and MDSSIZE are 200MB.
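
            A minimal user-space model of the check and of the suggested message fix follows. It is a sketch, not the osd_ldiskfs code: struct journal_model, the halved flag (standing in for LDISKFS_HT_MISC) and format_warning() are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the one journal field the check reads. */
struct journal_model {
	int j_max_transaction_buffers;
};

/* Models osd_transaction_size(): half of the journal limit when the
 * LDISKFS_HT_MISC case applies, the full limit otherwise. */
int transaction_size(const struct journal_model *j, bool halved)
{
	return halved ? j->j_max_transaction_buffers / 2
		      : j->j_max_transaction_buffers;
}

/* Models osd_param_is_not_sane(): true when the request exceeds the limit. */
bool credits_not_sane(const struct journal_model *j, bool halved, int credits)
{
	return credits > transaction_size(j, halved);
}

/* Formats the warning with the limit that was actually compared against,
 * rather than the raw j_max_transaction_buffers value. */
int format_warning(char *buf, size_t len, const char *name,
		   const struct journal_model *j, bool halved, int credits)
{
	return snprintf(buf, len,
			"%.16s: too many transaction credits (%d > %d)",
			name, credits, transaction_size(j, halved));
}
```

            With j_max_transaction_buffers set to 256 and halving enabled, a request for 185 credits fails the check because the effective limit is 128, and the formatted message reports 128 as the limit instead of 256.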


            adilger Andreas Dilger added a comment -

            How often is this failing?

            green Oleg Drokin added a comment -

            Right before the remount r/o we are getting a transaction credits overflow:

            [18669.823920] Lustre: 2374:0:(osd_handler.c:918:osd_trans_start()) lustre-MDT0000: too many transaction credits (185 > 256)
            [18669.828228] Lustre: 2374:0:(osd_handler.c:923:osd_trans_start())   create: 2/8, destroy: 1/4
            [18669.831572] Lustre: 2374:0:(osd_handler.c:928:osd_trans_start())   attr_set: 2/2, xattr_set: 2/15
            [18669.835192] Lustre: 2374:0:(osd_handler.c:935:osd_trans_start())   write: 7/29, punch: 0/0, quota 2/2
            [18669.836831] Lustre: 2374:0:(osd_handler.c:940:osd_trans_start())   insert: 4/67, delete: 2/5
            [18669.838251] Lustre: 2374:0:(osd_handler.c:945:osd_trans_start())   ref_add: 1/1, ref_del: 2/2
            [18669.839654] Pid: 2374, comm: mdt00_001
            [18669.840329] 
            Call Trace:
            [18669.841044]  [<ffffffffa0620843>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
            [18669.842210]  [<ffffffffa0ba8af2>] osd_trans_start+0x642/0x670 [osd_ldiskfs]
            [18669.861939]  [<ffffffffa0a2362e>] top_trans_start+0x60e/0x830 [ptlrpc]
            [18669.872487]  [<ffffffffa0e1b421>] lod_trans_start+0x31/0x40 [lod]
            [18669.873091]  [<ffffffffa0ea9394>] mdd_trans_start+0x14/0x20 [mdd]
            [18669.873892]  [<ffffffffa0e99235>] mdd_unlink+0x4b5/0xa50 [mdd]
            [18669.874503]  [<ffffffffa0763eb2>] ? dt_version_get+0x72/0x1f0 [obdclass]
            [18669.875302]  [<ffffffffa0d6a41b>] mdt_reint_unlink+0xa7b/0x11c0 [mdt]
            [18669.876120]  [<ffffffffa07811ae>] ? lu_ucred+0x1e/0x30 [obdclass]
            [18669.884730]  [<ffffffffa0d6d920>] mdt_reint_rec+0x80/0x210 [mdt]
            [18669.885320]  [<ffffffffa0d511ac>] mdt_reint_internal+0x58c/0x780 [mdt]
            [18669.886148]  [<ffffffffa0d5a597>] mdt_reint+0x67/0x140 [mdt]
            [18669.886709]  [<ffffffffa0a10235>] tgt_request_handle+0x6d5/0x1060 [ptlrpc]
            [18669.887388]  [<ffffffffa09c020b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
            [18669.888540]  [<ffffffffa09bdd88>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
            [18669.889192]  [<ffffffff810a9662>] ? default_wake_function+0x12/0x20
            [18669.890231]  [<ffffffff810a0898>] ? __wake_up_common+0x58/0x90
            [18669.890908]  [<ffffffffa09c43f8>] ptlrpc_main+0xaf8/0x1ea0 [ptlrpc]
            [18669.891495]  [<ffffffff810ad8b6>] ? __dequeue_entity+0x26/0x40
            [18669.892495]  [<ffffffffa09c3900>] ? ptlrpc_main+0x0/0x1ea0 [ptlrpc]
            [18669.893083]  [<ffffffff8109739f>] kthread+0xcf/0xe0
            [18669.893725]  [<ffffffff810972d0>] ? kthread+0x0/0xe0
            [18669.894181]  [<ffffffff81614f7c>] ret_from_fork+0x7c/0xb0
            [18669.894869]  [<ffffffff810972d0>] ? kthread+0x0/0xe0
            

            It's strange that the message claims 185 is bigger than 256 when it's not.

            And then we get:

            [18680.472086] Lustre: DEBUG MARKER: /usr/sbin/lctl lfsck_start -M lustre-MDT0000 -t namespace -r
            [18680.583847] ------------[ cut here ]------------
            [18680.585330] WARNING: at /var/lib/jenkins/workspace/lustre-master/arch/x86_64/build_type/server/distro/el7/ib_stack/inkernel/BUILD/BUILD/lustre-2.7.54/ldiskfs/ext4_jbd2.c:260 __ldiskfs_handle_dirty_metadata+0x1c2/0x220 [ldiskfs]()
            [18680.607767] Modules linked in: osp(OF) mdd(OF) lod(OF) mdt(OF) lfsck(OF) mgs(OF) mgc(OF) osd_ldiskfs(OF) lquota(OF) fid(OF) fld(OF) ksocklnd(OF) ptlrpc(OF) obdclass(OF) lnet(OF) sha512_generic libcfs(OF) ldiskfs(OF) dm_mod nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd fscache xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm floppy ib_sa ib_mad i2c_piix4 virtio_balloon ib_core ppdev pcspkr serio_raw ib_addr parport_pc parport ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk cirrus syscopyarea sysfillrect sysimgblt 8139too drm_kms_helper ttm virtio_pci virtio_ring virtio drm 8139cp mii ata_piix libata i2c_core [last unloaded: obdecho]
            
            [18680.614887] CPU: 1 PID: 3261 Comm: lctl Tainted: GF       W  O--------------   3.10.0-229.4.2.el7_lustre.g1fee634.x86_64 #1
            [18680.615845] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
            [18680.616350]  0000000000000000 0000000069f92699 ffff8800649af7c8 ffffffff816050da
            [18680.617033]  ffff8800649af800 ffffffff8106e34b ffff8800610cb410 ffff88007bbe34b0
            [18680.617731]  0000000000000000 ffffffffa05bd9c0 00000000000013f2 ffff8800649af810
            [18680.618420] Call Trace:
            [18680.618645]  [<ffffffff816050da>] dump_stack+0x19/0x1b
            [18680.619098]  [<ffffffff8106e34b>] warn_slowpath_common+0x6b/0xb0
            [18680.619622]  [<ffffffff8106e49a>] warn_slowpath_null+0x1a/0x20
            [18680.620131]  [<ffffffffa05686b2>] __ldiskfs_handle_dirty_metadata+0x1c2/0x220 [ldiskfs]
            [18680.620825]  [<ffffffffa057eec4>] ? ldiskfs_dirty_inode+0x54/0x60 [ldiskfs]
            [18680.621456]  [<ffffffffa058b556>] ldiskfs_free_blocks+0x5e6/0xb90 [ldiskfs]
            [18680.622082]  [<ffffffffa057fe95>] ldiskfs_xattr_release_block+0x275/0x330 [ldiskfs]
            [18680.622733]  [<ffffffffa05830cb>] ldiskfs_xattr_delete_inode+0x2bb/0x300 [ldiskfs]
            [18680.623388]  [<ffffffffa057d9f5>] ldiskfs_evict_inode+0x1b5/0x610 [ldiskfs]
            [18680.623976]  [<ffffffff811e23d7>] evict+0xa7/0x170
            [18680.624410]  [<ffffffff811e2c15>] iput+0xf5/0x180
            [18680.624822]  [<ffffffffa0ba7b23>] osd_object_delete+0x1d3/0x300 [osd_ldiskfs]
            [18680.625477]  [<ffffffffa075f04d>] lu_object_free.isra.30+0x9d/0x1a0 [obdclass]
            [18680.626111]  [<ffffffffa075f212>] lu_object_put+0xc2/0x320 [obdclass]
            [18680.626676]  [<ffffffffa075f486>] lu_object_put_nocache+0x16/0x20 [obdclass]
            [18680.627307]  [<ffffffffa0742674>] local_object_unlink+0x374/0xc10 [obdclass]
            [18680.627913]  [<ffffffffa0c92afd>] lfsck_namespace_load_one_trace_file.isra.66+0x2d/0x70 [lfsck]
            [18680.628680]  [<ffffffffa0c988ad>] lfsck_namespace_reset+0x14d/0x510 [lfsck]
            [18680.629338]  [<ffffffffa0c85a76>] lfsck_start+0x2f6/0x1410 [lfsck]
            [18680.629876]  [<ffffffff811acba3>] ? __kmalloc+0x1f3/0x230
            [18680.630394]  [<ffffffffa05b0002>] ? ldiskfs_move_extents+0x532/0x990 [ldiskfs]
            [18680.631222]  [<ffffffff811abe3e>] ? kmem_cache_alloc_trace+0x1ce/0x1f0
            [18680.632151]  [<ffffffffa0edb060>] ? osp_key_init+0x20/0x140 [osp]
            [18680.632764]  [<ffffffffa0e840a9>] mdd_iocontrol+0x89/0xa60 [mdd]
            [18680.633317]  [<ffffffffa076150f>] ? lu_context_init+0xff/0x260 [obdclass]
            [18680.633961]  [<ffffffffa0d5843d>] mdt_iocontrol+0x1cd/0x990 [mdt]
            [18680.634521]  [<ffffffffa072b6cc>] class_handle_ioctl+0x1b3c/0x22b0 [obdclass]
            [18680.635200]  [<ffffffff812693a8>] ? security_capable+0x18/0x20
            [18680.635747]  [<ffffffffa07115e2>] obd_class_ioctl+0xd2/0x170 [obdclass]
            [18680.636341]  [<ffffffff811da2c5>] do_vfs_ioctl+0x2e5/0x4c0
            [18680.636848]  [<ffffffff811e4e17>] ? __fd_install+0x47/0x60
            [18680.637336]  [<ffffffff811da541>] SyS_ioctl+0xa1/0xc0
            [18680.637813]  [<ffffffff81615029>] system_call_fastpath+0x16/0x1b
            [18680.638346] ---[ end trace 3f466fc704c541d0 ]---
            [18680.638793] LDISKFS-fs: ldiskfs_free_blocks:5106: aborting transaction: error 28 in __ldiskfs_handle_dirty_metadata
            [18680.639699] LDISKFS: jbd2_journal_dirty_metadata failed: handle type 5 started at line 240, credits 3/0, errcode -28
            [18680.640631] LDISKFS-fs error (device dm-0) in ldiskfs_free_blocks:5118: error 28
            [18680.660378] Aborting journal on device dm-0-8.
            [18680.672709] LDISKFS-fs (dm-0): Remounting filesystem read-only
            [18680.673249] LDISKFS-fs warning (device dm-0): ldiskfs_evict_inode:274: couldn't extend journal (err -30)
            [18680.674068] LDISKFS-fs error (device dm-0) in ldiskfs_evict_inode:303: error 28
            [18680.674093] LustreError: 2353:0:(osd_handler.c:829:osd_trans_commit_cb()) transaction @0xffff88007b6430c0 commit error: 2
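
            For readers decoding the numeric errors in the log above, the small sketch below maps them to errno names. It assumes Linux errno numbering, and kernel_err_name() and describe_kernel_err() are hypothetical helpers, not part of Lustre. Reading the -28 from jbd2_journal_dirty_metadata as the handle running out of credits is an interpretation that fits the "credits 3/0" field, not something the log states outright.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Kernel messages print errno values either as positive numbers
 * ("error 28") or as negative return codes ("errcode -28", "err -30").
 * Normalize the sign and map the two codes seen in this log. */
const char *kernel_err_name(int code)
{
	switch (abs(code)) {
	case ENOSPC:
		return "ENOSPC";
	case EROFS:
		return "EROFS";
	default:
		return "other";
	}
}

/* Print a one-line decoding that can be pasted next to a log excerpt. */
void describe_kernel_err(int code)
{
	printf("%d -> %s (%s)\n", code, kernel_err_name(code),
	       strerror(abs(code)));
}
```

            On Linux, 28 is ENOSPC and 30 is EROFS. The latter matches the "Read-only file system" text that lctl lfsck_start reported once the journal had been aborted.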
            

            People

              yong.fan nasf (Inactive)
              yujian Jian Yu