Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-716

MDT crashed at: Kernel BUG at fs/jbd2/transaction.c:982

    XMLWordPrintable

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • None
    • None
    • 3
    • 6552

    Description

      This crash was found at ORNL when they were testing IR, please take a look at http://jira.whamcloud.com/browse/ORNL-8 for comments on Sep 23.

      Sep 23 10:57:56 barry-mds1 kernel: ----------- [cut here ] --------- [please bite here ] ---------
      Sep 23 10:57:56 barry-mds1 kernel: Kernel BUG at fs/jbd2/transaction.c:982
      Sep 23 10:57:56 barry-mds1 kernel: invalid opcode: 0000 [1] SMP
      Sep 23 10:57:56 barry-mds1 kernel: last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
      Sep 23 10:57:56 barry-mds1 kernel: CPU 2
      Sep 23 10:57:56 barry-mds1 kernel: Modules linked in: ipmi_si cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) exportfs mgc(U) ldiskfs(U) jbd2 crc16 lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) dell_rbu autofs4 ib_srp(U) ipmi_devintf ipmi_msghandler ipt_REJECT xt_tcpudp xt_state ip_conntrack nfnetlink iptable_filter ip_tables x_tables be2iscsi iscsi_tcp bnx2i cnic uio libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi rdma_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ipoib_helper(U) ib_cm(U) ib_sa(U) ipv6 xfrm_nalgo crypto_api ib_uverbs(U) ib_umad(U) iw_nes(U) iw_cxgb3(U) cxgb3(U) ib_qib(U) dca mlx4_ib(U) mlx4_en(U) mlx4_core(U) dm_mirror dm_log dm_multipath scsi_dh dm_mod raid0 raid1 video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport sr_mod cdrom sd_mod sg joydev ata_piix libata mptsas mptscsih mptbase ib_mthca(U) pcspkr shpchp tpm_tis scsi_transport_sas tpm scsi_mod uhci_hcd ib_mad(U) i5000_edac ehci_hcd serio_raw edac_mc tpm_bios ib_core(U) nfs nfs_acl lockd fscache sunrpc bnx2
      Sep 23 10:57:56 barry-mds1 kernel: Pid: 5472, comm: tgt_recov Tainted: G 2.6.18-238.19.1.el5.head #1
      Sep 23 10:57:56 barry-mds1 kernel: RIP: 0010:[<ffffffff88d4097a>] [<ffffffff88d4097a>] :jbd2:jbd2_journal_dirty_metadata+0x5f/0x11f
      Sep 23 10:57:56 barry-mds1 kernel: RSP: 0018:ffff8103fe016d20 EFLAGS: 00010246
      Sep 23 10:57:56 barry-mds1 kernel: RAX: 0000000000000000 RBX: ffff8103fb525df0 RCX: ffff8103fb525df0
      Sep 23 10:57:56 barry-mds1 kernel: RDX: ffff810411f4a5c0 RSI: ffff8103fb525df0 RDI: ffff810400007438
      Sep 23 10:57:56 barry-mds1 kernel: RBP: ffff8103fe016d40 R08: 0000000000000000 R09: 0000000000000000
      Sep 23 10:57:56 barry-mds1 kernel: R10: 0000000b00000040 R11: 0000000043e41c7c R12: ffff81041928c430
      Sep 23 10:57:56 barry-mds1 kernel: R13: ffff810400007438 R14: ffff8104202a5800 R15: 0000000000002280
      Sep 23 10:57:56 barry-mds1 kernel: FS: 00002b0d6d50c6e0(0000) GS:ffff81010eeef3c0(0000) knlGS:0000000000000000
      Sep 23 10:57:56 barry-mds1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      Sep 23 10:57:56 barry-mds1 kernel: CR2: 0000000041d0dfe8 CR3: 00000004190cc000 CR4: 00000000000006e0
      Sep 23 10:57:56 barry-mds1 kernel: Process tgt_recov (pid: 5472, threadinfo ffff8103fe016000, task ffff810421e127e0)
      Sep 23 10:57:56 barry-mds1 kernel: Stack: ffff810400007438 ffff8103fb525df0 0000000000000000 ffffffff88de1650
      Sep 23 10:57:56 barry-mds1 kernel: ffff8103fe016d70 ffffffff88d5c03b ffff8103fb60e280 0000000000000040
      Sep 23 10:57:56 barry-mds1 kernel: ffff8103fb525df0 0000000000000040 ffff8103fe016e20 ffffffff88dd9598
      Sep 23 10:57:56 barry-mds1 kernel: Call Trace:
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88d5c03b>] :ldiskfs:__ldiskfs_handle_dirty_metadata+0xdb/0x100
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88dd9598>] :fsfilt_ldiskfs:fsfilt_ldiskfs_write_handle+0x1b8/0x2d0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff888a0e59>] :libcfs:libcfs_debug_vmsg2+0x7e9/0xad0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88dd97fd>] :fsfilt_ldiskfs:fsfilt_ldiskfs_write_record+0x14d/0x190
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff8893d88c>] :obdclass:llog_lvfs_write_blob+0x10c/0x420
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88d617b0>] :ldiskfs:ldiskfs_file_open+0x0/0x120
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff8893e5ae>] :obdclass:llog_lvfs_write_rec+0x5fe/0xda0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff888966fb>] :libcfs:cfs_alloc+0x2b/0x60
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff8006873d>] __down_read+0x17/0xa0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff8893ad98>] :obdclass:llog_cat_current_log+0xae8/0xf20
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff8893eb4b>] :obdclass:llog_lvfs_write_rec+0xb9b/0xda0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff80049578>] try_to_wake_up+0x472/0x484
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff8893cd9d>] :obdclass:llog_cat_add_rec+0xbd/0x750
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff8002f7b2>] __wake_up+0x43/0x50
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88944258>] :obdclass:llog_obd_origin_add+0xa8/0x170
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88943936>] :obdclass:llog_add+0x326/0x3d0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88942897>] :obdclass:__llog_ctxt_put+0x27/0x270
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88c24c78>] :lov:lov_llog_origin_add+0x368/0x590
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88943936>] :obdclass:llog_add+0x326/0x3d0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff8001b396>] vsnprintf+0x611/0x659
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff80073304>] do_gettimeofday+0x41/0x91
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88df9a21>] :mds:mds_llog_origin_add+0x161/0x2d0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88943936>] :obdclass:llog_add+0x326/0x3d0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88df7a71>] :mds:mds_llog_add_unlink+0x301/0x540
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88ed66bd>] :osd_ldiskfs:osd_attr_get+0x8d/0x130
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88df844d>] :mds:mds_log_op_orphan+0x10d/0x190
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e3ecfb>] :mdd:md_capainfo+0x1b/0x20
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88dffb52>] :mds:mds_lov_update_objids+0x7c2/0xc50
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff80022ec7>] __up_read+0x7a/0x83
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e23f2d>] :mdd:mdd_lov_objid_update+0xd/0x10
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e3b995>] :mdd:mdd_create+0x1dd5/0x26e0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff8898329c>] :obdclass:htable_lookup+0xbc/0x190
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a50103>] :ptlrpc:lustre_msg_get_flags+0x23/0x80
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e3537e>] :mdd:mdd_lookup+0x8e/0x110
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88efe2b7>] :cmm:cml_create+0x197/0x210
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e9bcdf>] :mdt:mdt_reint_open+0x265f/0x3250
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a51fd0>] :ptlrpc:lustre_msg_string+0x0/0x280
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a50530>] :ptlrpc:lustre_msg_buf+0x0/0x80
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a7a9b7>] :ptlrpc:__req_capsule_get+0x2c7/0x5d0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e84ce3>] :mdt:mdt_open_unpack+0x573/0x620
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e3ed42>] :mdd:md_ucred+0x42/0x50
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e3ed42>] :mdd:md_ucred+0x42/0x50
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e876f3>] :mdt:mdt_reint_rec+0x83/0x100
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a50103>] :ptlrpc:lustre_msg_get_flags+0x23/0x80
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e6afb5>] :mdt:mdt_reint_internal+0x8e5/0x990
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e6b0ac>] :mdt:mdt_intent_fixup_resent+0x4c/0x230
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff8005f5c6>] cache_alloc_refill+0x116/0x19f
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e78e85>] :mdt:mdt_intent_reint+0x1f5/0x5e0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff888a5d7d>] :libcfs:cfs_hash_hd_hnode_add+0xd/0x40
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e796d7>] :mdt:mdt_intent_policy+0x467/0x650
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a0f209>] :ptlrpc:ldlm_lock_enqueue+0x179/0xa00
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a2b390>] :ptlrpc:ldlm_export_lock_get+0x10/0x20
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff888a3521>] :libcfs:cfs_hash_bd_add_locked+0x71/0x80
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff888a6edd>] :libcfs:cfs_hash_add+0x17d/0x190
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a2d69b>] :ptlrpc:ldlm_handle_enqueue0+0x9eb/0xff0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e6ca89>] :mdt:mdt_unpack_req_pack_rep+0x89/0x6a0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e78b32>] :mdt:mdt_enqueue+0x72/0x100
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e6e286>] :mdt:mdt_handle_common+0x11e6/0x1750
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e71fe0>] :mdt:mdt_recovery_handle+0x0/0x130
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e71fe0>] :mdt:mdt_recovery_handle+0x0/0x130
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88e72092>] :mdt:mdt_recovery_handle+0xb2/0x130
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a1a9e0>] :ptlrpc:check_for_next_transno+0x0/0x540
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a14e10>] :ptlrpc:exp_req_replay_healthy+0x0/0x20
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a1581b>] :ptlrpc:target_recovery_overseer+0x2b/0x1d0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a15dd0>] :ptlrpc:handle_recovery_req+0x200/0x2f0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a4ff42>] :ptlrpc:lustre_msg_get_transno+0x22/0xe0
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a187e4>] :ptlrpc:target_recovery_thread+0x374/0xe10
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a18470>] :ptlrpc:target_recovery_thread+0x0/0xe10
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff80061fb1>] child_rip+0xa/0x11
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a18470>] :ptlrpc:target_recovery_thread+0x0/0xe10
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff88a18470>] :ptlrpc:target_recovery_thread+0x0/0xe10
      Sep 23 10:57:56 barry-mds1 kernel: [<ffffffff80061fa7>] child_rip+0x0/0x11
      Sep 23 10:57:56 barry-mds1 kernel:
      Sep 23 10:57:56 barry-mds1 kernel:
      Sep 23 10:57:56 barry-mds1 kernel: Code: 0f 0b 68 92 68 d4 88 c2 d6 03 ff c8 41 89 45 08 49 39 54 24
      Sep 23 10:57:56 barry-mds1 kernel: RIP [<ffffffff88d4097a>] :jbd2:jbd2_journal_dirty_metadata+0x5f/0x11f
      Sep 23 10:57:56 barry-mds1 kernel: RSP <ffff8103fe016d20>
      Sep 23 10:57:56 barry-mds1 kernel: <0>Kernel panic - not syncing: Fatal exception

      Attachments

        Activity

          People

            bobijam Zhenyu Xu
            jay Jinshan Xiong (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: