Lustre · LU-6435

ldiskfs bug in __ldiskfs_handle_dirty_metadata

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical
    • Affects Version/s: Lustre 2.8.0
    • Fix Version/s: Lustre 2.8.0
    • Component/s: None
    • Environment: el7.1
    • Severity: 2

    Description

      New bug, so far seen only in autotest runs of the el7.1 server code; it was not seen in previous test runs on el7.0. After the failure the Lustre filesystem becomes read-only, leading to many more failures.

      Yang Sheng reports:

      [4/4/15, 8:46:51 PM] yang sheng: I also encountered this error in my test environment. I found it is caused by the journal space not being enough to handle the dirty data, so setting MDS_FS_MKFS_OPTS='-J size=xxx' works well.
      [4/4/15, 8:57:22 PM] yang sheng: I'll do more investigation to reveal the root cause.
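      Yang Sheng's workaround above amounts to formatting the MDT with a larger ldiskfs journal. A sketch of how that option is applied, assuming a hypothetical target device /dev/sdb and an illustrative 400 MB journal (the ticket does not state what size was actually used):

```shell
# Enlarge the ldiskfs journal at format time so a single transaction
# handle is less likely to exhaust it. --mkfsoptions passes the quoted
# string through to the underlying mke2fs, where '-J size=' sets the
# journal size in megabytes.
# NOTE: /dev/sdb and the 400 MB figure are placeholders, not values
# taken from this ticket.
mkfs.lustre --mdt --fsname=lustre --index=0 \
            --mkfsoptions='-J size=400' /dev/sdb
```

      In the Lustre test framework the same mke2fs options are injected through the MDS_FS_MKFS_OPTS environment variable mentioned in the quote above.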

      Seen in sanity test_102i: https://testing.hpdd.intel.com/test_sets/b849c698-da5b-11e4-8289-5254006e85c2

      [ 2666.355166] -----------[ cut here ]-----------
      [ 2666.356994] WARNING: at /var/lib/jenkins/tmp/lustre_el7_topdir/BUILD/BUILD/lustre-2.7.51/ldiskfs/ext4_jbd2.c:260 __ldiskfs_handle_dirty_metadata+0x1c2/0x220 [ldiskfs]()
      [ 2666.361168] Modules linked in: osp(OF) mdd(OF) lod(OF) mdt(OF) lfsck(OF) mgs(OF) mgc(OF) osd_ldiskfs(OF) lquota(OF) fid(OF) fld(OF) ksocklnd(OF) ptlrpc(OF) obdclass(OF) lnet(OF) sha512_generic libcfs(OF) ldiskfs(OF) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd fscache xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ppdev ib_cm parport_pc iw_cm parport ib_sa ib_mad virtio_balloon pcspkr i2c_piix4 ib_core serio_raw ib_addr ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm 8139too virtio_pci virtio_ring virtio ata_piix floppy drm i2c_core libata 8139cp mii [last unloaded: llog_test]

      [ 2666.378809] CPU: 0 PID: 11066 Comm: mdt00_002 Tainted: GF W O-------------- 3.10.0-229.1.2.el7_lustre.g5f2eb1d.x86_64 #1
      [ 2666.382889] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
      [ 2666.384948] 0000000000000000 00000000e532c5a7 ffff880077c677f8 ffffffff81604d2a
      [ 2666.387191] ffff880077c67830 ffffffff8106e34b ffff880035b8bdd0 ffff88007cc414b0
      [ 2666.389386] 0000000000000000 ffffffffa05ba540 00000000000013f2 ffff880077c67840
      [ 2666.391661] Call Trace:
      [ 2666.393487] [] dump_stack+0x19/0x1b
      [ 2666.395480] [] warn_slowpath_common+0x6b/0xb0
      [ 2666.397550] [] warn_slowpath_null+0x1a/0x20
      [ 2666.399640] [] __ldiskfs_handle_dirty_metadata+0x1c2/0x220 [ldiskfs]
      [ 2666.401911] [] ? ldiskfs_dirty_inode+0x54/0x60 [ldiskfs]
      [ 2666.404128] [] ldiskfs_free_blocks+0x5e6/0xb90 [ldiskfs]
      [ 2666.406246] [] ldiskfs_xattr_release_block+0x275/0x330 [ldiskfs]
      [ 2666.408443] [] ldiskfs_xattr_delete_inode+0x2bb/0x300 [ldiskfs]
      [ 2666.410567] [] ldiskfs_evict_inode+0x1b5/0x610 [ldiskfs]
      [ 2666.412594] [] evict+0xa7/0x170
      [ 2666.414443] [] iput+0xf5/0x180
      [ 2666.416275] [] osd_object_delete+0x1d3/0x300 [osd_ldiskfs]
      [ 2666.418308] [] lu_object_free.isra.30+0x9d/0x1a0 [obdclass]
      [ 2666.420350] [] lu_object_put+0xc2/0x320 [obdclass]
      [ 2666.422389] [] mdt_reint_unlink+0x796/0x1150 [mdt]
      [ 2666.424396] [] mdt_reint_rec+0x80/0x210 [mdt]
      [ 2666.426508] [] mdt_reint_internal+0x58c/0x780 [mdt]
      [ 2666.428542] [] mdt_reint+0x67/0x140 [mdt]
      [ 2666.430616] [] tgt_request_handle+0x635/0xfd0 [ptlrpc]
      [ 2666.432746] [] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
      [ 2666.434929] [] ? ptlrpc_wait_event+0x98/0x330 [ptlrpc]
      [ 2666.436900] [] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      [ 2666.438920] [] ptlrpc_main+0xaf8/0x1ea0 [ptlrpc]
      [ 2666.440866] [] ? __dequeue_entity+0x26/0x40
      [ 2666.442788] [] ? ptlrpc_register_service+0xf00/0xf00 [ptlrpc]
      [ 2666.444755] [] kthread+0xcf/0xe0
      [ 2666.446590] [] ? kthread_create_on_node+0x140/0x140
      [ 2666.448452] [] ret_from_fork+0x7c/0xb0
      [ 2666.450308] [] ? kthread_create_on_node+0x140/0x140
      [ 2666.452177] --[ end trace 53ab1a0dad30f568 ]--
      [ 2666.453923] LDISKFS-fs: ldiskfs_free_blocks:5106: aborting transaction: error 28 in __ldiskfs_handle_dirty_metadata
      [ 2666.456082] LDISKFS: jbd2_journal_dirty_metadata failed: handle type 5 started at line 240, credits 3/0, errcode -28
      [ 2666.456945] LDISKFS-fs error (device dm-0) in ldiskfs_free_blocks:5118: error 28
      [ 2666.469566] Aborting journal on device dm-0-8.
      [ 2666.516889] LDISKFS-fs (dm-0): Remounting filesystem read-only


      People

      Assignee: Yang Sheng
      Reporter: Bob Glossman (Inactive)
      Votes: 0
      Watchers: 8
