Details
Bug
Resolution: Fixed
Major
None
None
3
9223372036854775807
Description
This issue was created by maloo for Bob Glossman <bob.glossman@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/35ecbbdc-99c2-11e7-b778-5254006e85c2.
The sub-test test_5 failed with the following error:
test failed to respond and timed out
This failure is a panic on the MDS. Its stack trace differs from the one in LU-5449, so I'm creating a new ticket.
Panic seen:
20:49:39:[15096.775030] LustreError: 1321:0:(llog_cat.c:269:llog_cat_id2handle()) lustre-MDT0000-osp-MDT0002: error opening log id [0x2:0x402:0x2]:0: rc = -2
20:49:39:[15096.776495] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
20:49:39:[15096.777310] IP: [<ffffffffc07d002a>] llog_process_thread+0x3a/0x1460 [obdclass]
20:49:39:[15096.778071] PGD 78ab5067 PUD 78ab4067 PMD 0
20:49:39:[15096.778551] Oops: 0000 [#1] SMP
20:49:39:[15096.778921] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core iosf_mbi crc32_pclmul ghash_clmulni_intel nfsd aesni_intel lrw gf128mul ppdev glue_helper ablk_helper cryptd i2c_piix4 nfs_acl lockd joydev pcspkr i2c_core virtio_balloon auth_rpcgss grace parport_pc parport sunrpc ip_tables ata_generic pata_acpi ext4 mbcache jbd2 ata_piix virtio_blk libata 8139too crct10dif_pclmul crct10dif_common crc32c_intel serio_raw virtio_pci 8139cp virtio_ring virtio mii floppy
20:49:39:[15096.788032] CPU: 1 PID: 1321 Comm: lod0002_rec0000 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.x86_64 #1
20:49:39:[15096.789136] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
20:49:39:[15096.789694] task: ffff880067595ee0 ti: ffff8800606bc000 task.ti: ffff8800606bc000
20:49:39:[15096.790413] RIP: 0010:[<ffffffffc07d002a>] [<ffffffffc07d002a>] llog_process_thread+0x3a/0x1460 [obdclass]
20:49:39:[15096.791385] RSP: 0018:ffff8800606bfb28 EFLAGS: 00010246
20:49:39:[15096.791910] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8800606bffd8
20:49:39:[15096.792588] RDX: ffff8800606bfbb8 RSI: 0000000000000000 RDI: ffff88006797f180
20:49:39:[15096.793279] RBP: ffff8800606bfbe0 R08: 0000000000019be0 R09: ffff88007d001a00
20:49:39:[15096.793977] R10: ffffffffc07d14a6 R11: 000000000000000f R12: 0000000000000000
20:49:39:[15096.794662] R13: ffff8800667ebc00 R14: ffff88006797f180 R15: 0000000000000000
20:49:39:[15096.795352] FS: 0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
20:49:39:[15096.796149] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
20:49:39:[15096.796704] CR2: 0000000000000060 CR3: 0000000077c09000 CR4: 00000000000406e0
20:49:39:[15096.797387] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
20:49:39:[15096.798084] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
20:49:39:[15096.798778] Stack:
20:49:39:[15096.798987] ffff8800606bfe58 ffff8800606bfb90 ffffffffc05eeba7 ffff880000000010
20:49:39:[15096.799777] ffff8800606bfba0 ffff8800606bfb60 0000000075f08900 ffff88007a318000
20:49:39:[15096.800574] ffff88006797f180 0000000000000001 0000000000000001 0000000000000402
20:49:39:[15096.801374] Call Trace:
20:49:39:[15096.801634] [<ffffffffc05eeba7>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
20:49:39:[15096.802295] [<ffffffffc07d5ccf>] ? llog_cat_cleanup+0x15f/0x380 [obdclass]
20:49:39:[15096.802990] [<ffffffffc0eea6b0>] ? lodname2mdt_index+0x2f0/0x2f0 [lod]
20:49:39:[15096.803639] [<ffffffffc07d150c>] llog_process_or_fork+0xbc/0x450 [obdclass]
20:49:39:[15096.804337] [<ffffffffc07d6a3a>] llog_cat_process_cb+0x20a/0x220 [obdclass]
20:49:39:[15096.805040] [<ffffffffc07d0865>] llog_process_thread+0x875/0x1460 [obdclass]
20:49:39:[15096.805752] [<ffffffffc07d6830>] ? llog_cat_process_common+0x440/0x440 [obdclass]
20:49:39:[15096.806488] [<ffffffffc07d150c>] llog_process_or_fork+0xbc/0x450 [obdclass]
20:49:39:[15096.807184] [<ffffffffc07d6830>] ? llog_cat_process_common+0x440/0x440 [obdclass]
20:49:39:[15096.807936] [<ffffffffc07d59b9>] llog_cat_process_or_fork+0x199/0x2a0 [obdclass]
20:49:39:[15096.808673] [<ffffffffc0f1ae2a>] ? lod_sub_prep_llog+0x24a/0x783 [lod]
20:49:39:[15096.809322] [<ffffffffc0eea6b0>] ? lodname2mdt_index+0x2f0/0x2f0 [lod]
20:49:39:[15096.809976] [<ffffffffc07d5aee>] llog_cat_process+0x2e/0x30 [obdclass]
20:49:39:[15096.810613] [<ffffffffc0ee6a89>] lod_sub_recovery_thread+0x439/0xc80 [lod]
20:49:39:[15096.811300] [<ffffffffc0ee6650>] ? lod_trans_stop+0x340/0x340 [lod]
20:49:39:[15096.811920] [<ffffffff810b098f>] kthread+0xcf/0xe0
20:49:39:[15096.812393] [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
20:49:39:[15096.812987] [<ffffffff816b4f18>] ret_from_fork+0x58/0x90
20:49:39:[15096.813516] [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
20:49:39:[15096.814113] Code: 41 54 53 48 81 ec 90 00 00 00 4c 8b 27 48 8b 47 18 65 48 8b 34 25 28 00 00 00 48 89 75 d0 31 f6 f6 05 d2 d5 e3 ff 01 48 89 7d 88 <4d> 8b 6c 24 60 48 89 45 80 c7 45 c4 00 00 00 00 74 0d f6 05 b9
20:49:39:[15096.817178] RIP [<ffffffffc07d002a>] llog_process_thread+0x3a/0x1460 [obdclass]
20:49:39:[15096.817925] RSP <ffff8800606bfb28>
20:49:39:[15096.818270] CR2: 0000000000000060
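For triage, the key facts in the panic above (the llog open error code, the faulting address, and the faulting function) can be pulled out of the console text mechanically. The following is only a rough sketch: the helper name `summarize` and the dictionary keys are hypothetical and not part of Lustre or the test framework, and the embedded log is a trimmed excerpt of the output above.

```python
import re

# Trimmed excerpt of the console log from this ticket.
LOG = """\
20:49:39:[15096.775030] LustreError: 1321:0:(llog_cat.c:269:llog_cat_id2handle()) lustre-MDT0000-osp-MDT0002: error opening log id [0x2:0x402:0x2]:0: rc = -2
20:49:39:[15096.776495] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
20:49:39:[15096.777310] IP: [<ffffffffc07d002a>] llog_process_thread+0x3a/0x1460 [obdclass]
20:49:39:[15096.803639] [<ffffffffc07d150c>] llog_process_or_fork+0xbc/0x450 [obdclass]
20:49:39:[15096.804337] [<ffffffffc07d6a3a>] llog_cat_process_cb+0x20a/0x220 [obdclass]
"""


def summarize(log):
    """Extract the error code, fault address and faulting symbol.

    Hypothetical helper for illustration; returns None for any field
    whose pattern is not found in the text.
    """
    rc = re.search(r"rc = (-?\d+)", log)
    fault = re.search(r"NULL pointer dereference at ([0-9a-f]+)", log)
    ip = re.search(
        r"IP: \[<[0-9a-f]+>\] (\w+)\+(0x[0-9a-f]+)/0x[0-9a-f]+ \[(\w+)\]", log
    )
    return {
        # Negative errno reported by llog_cat_id2handle() (-2 is -ENOENT on Linux).
        "rc": int(rc.group(1)) if rc else None,
        # Address that faulted; a small value suggests a field read through a NULL pointer.
        "fault_offset": int(fault.group(1), 16) if fault else None,
        "function": ip.group(1) if ip else None,
        "insn_offset": int(ip.group(2), 16) if ip else None,
        "module": ip.group(3) if ip else None,
    }


if __name__ == "__main__":
    print(summarize(LOG))
```

Run against the excerpt, the sketch reports rc = -2 followed by a NULL dereference at offset 0x60 in llog_process_thread (obdclass), which is the sequence recorded in the log above.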
Info required for matching: sanity-scrub 5