Lustre / LU-10194

recovery-small test_111: kernel panic on MDS mount

Details

    • Bug
    • Resolution: Duplicate
    • Minor

    Description

      This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/414aabd6-bfb7-11e7-88ab-52540065bddc.

      The sub-test test_111 failed with the following error:

      Timeout occurred after 630 mins, last suite running was recovery-small, restarting cluster to continue tests
      

      The following is from the MDS1 console log:

      [34228.547539] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
      [34228.549353] IP: [<ffffffffc09c804a>] llog_process_thread+0x3a/0x1460 [obdclass]
      [34228.549353] PGD 0 
      [34228.553202] Oops: 0000 [#1] SMP 
      [34228.553202] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) loop dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel ppdev nfsd lrw gf128mul glue_helper ablk_helper cryptd joydev pcspkr virtio_balloon i2c_piix4 parport_pc parport nfs_acl auth_rpcgss lockd grace sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_blk 8139too ata_piix crct10dif_pclmul crct10dif_common libata crc32c_intel serio_raw virtio_pci virtio_ring virtio 8139cp mii i2c_core floppy [last unloaded: libcfs]
      [34228.553202] CPU: 1 PID: 25256 Comm: lod0000_rec0001 Tainted: G        W  OE  ------------   3.10.0-693.5.2.el7_lustre.x86_64 #1
      [34228.553202] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
      [34228.553202] task: ffff88006963bf40 ti: ffff88004fb78000 task.ti: ffff88004fb78000
      [34228.553202] RIP: 0010:[<ffffffffc09c804a>]  [<ffffffffc09c804a>] llog_process_thread+0x3a/0x1460 [obdclass]
      [34228.553202] RSP: 0018:ffff88004fb7bb28  EFLAGS: 00010202
      [34228.553202] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [34228.553202] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88007ad57780
      [34228.553202] RBP: ffff88004fb7bbe0 R08: 000000000000000a R09: 000000000000000a
      [34228.553202] R10: 0000000000000000 R11: 000000000000000f R12: 0000000000000000
      [34228.553202] R13: ffff8800663f0780 R14: ffff88007ad57780 R15: 0000000000000000
      [34228.553202] FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
      [34228.610695] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [34228.610695] CR2: 0000000000000060 CR3: 00000000019f2000 CR4: 00000000000406e0
      [34228.610695] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [34228.610695] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [34228.610695] Stack:
      [34228.610695]  00000000410f2c08 0000000000000246 ffff88007d001a00 0000000000000000
      [34228.610695]  ffff88007d001a00 ffffffffc09c94c6 0000000000000000 ffffffffc0f7a6b0
      [34228.610695]  ffff88007ad57780 ffff88007ad57780 0000000000000000 ffff88004fb7bbe0
      [34228.610695] Call Trace:
      [34228.610695]  [<ffffffffc09c94c6>] ? llog_process_or_fork+0x56/0x450 [obdclass]
      [34228.610695]  [<ffffffffc0f7a6b0>] ? lodname2mdt_index+0x2f0/0x2f0 [lod]
      [34228.610695]  [<ffffffffc06f1ba7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      [34228.610695]  [<ffffffffc0f7a6b0>] ? lodname2mdt_index+0x2f0/0x2f0 [lod]
      [34228.610695]  [<ffffffffc09c952c>] llog_process_or_fork+0xbc/0x450 [obdclass]
      [34228.610695]  [<ffffffffc09cea5a>] llog_cat_process_cb+0x20a/0x220 [obdclass]
      [34228.610695]  [<ffffffffc09c8885>] llog_process_thread+0x875/0x1460 [obdclass]
      [34228.610695]  [<ffffffffc09ce850>] ? llog_cat_process_common+0x440/0x440 [obdclass]
      [34228.610695]  [<ffffffffc09c952c>] llog_process_or_fork+0xbc/0x450 [obdclass]
      [34228.610695]  [<ffffffffc09ce850>] ? llog_cat_process_common+0x440/0x440 [obdclass]
      [34228.610695]  [<ffffffffc09cd9d9>] llog_cat_process_or_fork+0x199/0x2a0 [obdclass]
      [34228.610695]  [<ffffffffc0f7a6b0>] ? lodname2mdt_index+0x2f0/0x2f0 [lod]
      [34228.610695]  [<ffffffffc0f7a6b0>] ? lodname2mdt_index+0x2f0/0x2f0 [lod]
      [34228.610695]  [<ffffffffc09cdb0e>] llog_cat_process+0x2e/0x30 [obdclass]
      [34228.610695]  [<ffffffffc0f76a89>] lod_sub_recovery_thread+0x439/0xc80 [lod]
      [34228.610695]  [<ffffffffc0f76650>] ? lod_trans_stop+0x340/0x340 [lod]
      [34228.610695]  [<ffffffff810b099f>] kthread+0xcf/0xe0
      [34228.610695]  [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40
      [34228.610695]  [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90
      [34228.610695]  [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40
      [34228.610695] Code: 41 54 53 48 81 ec 90 00 00 00 4c 8b 27 48 8b 47 18 65 48 8b 34 25 28 00 00 00 48 89 75 d0 31 f6 f6 05 b2 85 d4 ff 01 48 89 7d 88 <4d> 8b 6c 24 60 48 89 45 80 c7 45 c4 00 00 00 00 74 0d f6 05 99 
      [34228.610695] RIP  [<ffffffffc09c804a>] llog_process_thread+0x3a/0x1460 [obdclass]
      [34228.610695]  RSP <ffff88004fb7bb28>
      [34228.610695] CR2: 0000000000000060
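
As a triage aid, the register dump can be cross-checked against the fault address. The sketch below is illustrative only and was not part of the original report: `parse_oops` and `null_registers` are hypothetical helper names, and the embedded excerpt is copied from the console log above. Hand-decoding the marked bytes in the Code: line (`4d 8b 6c 24 60`) gives `mov 0x60(%r12),%r13`, which is consistent with R12 = 0 and CR2 = 0x60. The earlier bytes `4c 8b 27` decode to `mov (%rdi),%r12`, so the NULL pointer appears to be the first field of the structure passed as the function's argument.

```python
import re

# Excerpt of the MDS1 console log from this report (subset of lines only).
OOPS = """\
[34228.547539] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
[34228.549353] IP: [<ffffffffc09c804a>] llog_process_thread+0x3a/0x1460 [obdclass]
[34228.553202] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[34228.553202] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88007ad57780
[34228.553202] RBP: ffff88004fb7bbe0 R08: 000000000000000a R09: 000000000000000a
[34228.553202] R10: 0000000000000000 R11: 000000000000000f R12: 0000000000000000
[34228.553202] R13: ffff8800663f0780 R14: ffff88007ad57780 R15: 0000000000000000
[34228.610695] CR2: 0000000000000060 CR3: 00000000019f2000 CR4: 00000000000406e0
"""


def parse_oops(text):
    """Return (fault_address, symbol, offset, registers) parsed from oops text."""
    fault = int(re.search(r"dereference at ([0-9a-f]+)", text).group(1), 16)
    sym, off = re.search(
        r"IP: \[<[0-9a-f]+>\] (\w+)\+0x([0-9a-f]+)/", text
    ).groups()
    # Collect general-purpose and control registers printed as 16 hex digits.
    regs = {
        name: int(value, 16)
        for name, value in re.findall(
            r"\b(R[A-Z0-9]{2}|CR\d): ([0-9a-f]{16})", text
        )
    }
    return fault, sym, int(off, 16), regs


def null_registers(regs):
    """Names of general-purpose registers (not CRn) that held zero at the fault."""
    return sorted(n for n, v in regs.items() if not n.startswith("CR") and v == 0)


if __name__ == "__main__":
    fault, sym, off, regs = parse_oops(OOPS)
    print(f"fault at {fault:#x} in {sym}+{off:#x}")
    print("CR2 matches reported address:", regs["CR2"] == fault)
    print("zero-valued registers:", ", ".join(null_registers(regs)))
```

The script only confirms that the reported address, CR2, and a zero base register agree; identifying which structure member was NULL still requires the matching source and debug symbols.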
      

      Info required for matching: recovery-small 111

            People

              wc-triage WC Triage
              maloo Maloo
