[LU-6988] MDS and OST mount crashes with kernel panic Created: 12/Aug/15 Updated: 14/Sep/15 Resolved: 14/Sep/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Aditya Pandit (Inactive) | Assignee: | WC Triage |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Panic on MDS while mounting Lustre:

DISKFS-fs (vdb): mounted filesystem with ordered data mode. quota=on. Opts:
Lustre: Setting parameter lustre-MDT0000-mdtlov.lov.stripesize in log lustre-MDT0000
Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space
Lustre: lustre-MDT0000: new disk, initializing
Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400):0:mdt
------------[ cut here ]------------
kernel BUG at block/blk-core.c:2627!
invalid opcode: 0000 [#1] SMP
Pid: 6022, comm: ldiskfslazyinit Not tainted 2.6.32-431.29.2.el6_lustreb_neo_stable_6698_6 #1 Red Hat KVM
RIP: 0010:[<ffffffff812698c2>] [<ffffffff812698c2>] __blk_end_request_all+0x32/0x60
Process ldiskfslazyinit (pid: 6022, threadinfo ffff88011bb80000, task ffff88011933f500)
Call Trace:
<IRQ>
[<ffffffffa005a22a>] blk_done+0x4a/0x110 [virtio_blk]
[<ffffffff810ecb14>] ? __rcu_process_callbacks+0x54/0x350
[<ffffffffa004e2ac>] vring_interrupt+0x3c/0xd0 [virtio_ring]
[<ffffffff810e7090>] handle_IRQ_event+0x60/0x170
[<ffffffff8107a64f>] ? __do_softirq+0x11f/0x1e0
[<ffffffff810e99ee>] handle_edge_irq+0xde/0x180
[<ffffffff8100faf9>] handle_irq+0x49/0xa0
[<ffffffff81532dbc>] do_IRQ+0x6c/0xf0
[<ffffffff8100b9d3>] ret_from_intr+0x0/0x11
<EOI>
[<ffffffff811222e5>] mempool_alloc_slab+0x15/0x20
[<ffffffff81122483>] mempool_alloc+0x63/0x140
[<ffffffff811c4ed2>] bvec_alloc_bs+0xe2/0x110
[<ffffffff811c4fb2>] bio_alloc_bioset+0xb2/0xf0
[<ffffffff811c5095>] bio_alloc+0x15/0x30
[<ffffffff812705a8>] blkdev_issue_zeroout+0x88/0x180
[<ffffffffa02b1c64>] ldiskfs_init_inode_table+0x154/0x290 [ldiskfs]
[<ffffffffa02dbdcb>] ldiskfs_lazyinit_thread+0x15b/0x2f0 [ldiskfs]
[<ffffffff8109abf6>] kthread+0x96/0xa0 |
| Comments |
| Comment by Andreas Dilger [ 13/Aug/15 ] |
|
It looks like this BUG might be:

void __blk_end_request_all(struct request *rq, int error)
{
	bool pending;
	unsigned int bidi_bytes = 0;

	if (unlikely(blk_bidi_rq(rq)))
		bidi_bytes = blk_rq_bytes(rq->next_rq);

	pending = __blk_end_bidi_request(rq, error, blk_rq_bytes(rq), bidi_bytes);
	BUG_ON(pending);
}
EXPORT_SYMBOL(__blk_end_request_all); |
| Comment by Andreas Dilger [ 13/Aug/15 ] |
|
The bug is happening during lazy inode table initialization after the initial filesystem format. Could you check whether this oops is avoided by formatting the filesystem with mkfs.lustre --mkfsoptions='-E lazy_itable_init=0'? Are you using any non-default options for formatting or mounting your MDT or OST filesystems? What version of e2fsprogs are you using? |
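The suggested workaround could look like the following sketch. The device path, fsname, and index are hypothetical placeholders, not values from this ticket:

```shell
# Illustrative sketch only: /dev/vdb, fsname and index are assumptions.
# lazy_itable_init=0 tells mke2fs to zero the inode tables at format time,
# so the ldiskfslazyinit kernel thread (which crashed here) never runs
# after mount.
mkfs.lustre --mdt --fsname=lustre --index=0 \
    --mkfsoptions="-E lazy_itable_init=0" /dev/vdb
```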
| Comment by Andreas Dilger [ 13/Aug/15 ] |
|
What kernel is this? RHEL6.3? Does it happen with the stock Lustre RHEL6.6 kernel? Are there any other kernel or ldiskfs patches applied? |
| Comment by Aditya Pandit (Inactive) [ 14/Aug/15 ] |
|
e2fsprogs: e2fsprogs-1.42.7.x1.mrp.128-8.el6.src.rpm
We have not applied any extra patches, and we have not used any non-standard options for formatting or mounting.
It is Scientific Linux release 6.5 (Carbon).
There are no extra kernel or ldiskfs patches applied.
We will try with the stock RHEL 6.6 kernel and let you know the results. |
| Comment by Aditya Pandit (Inactive) [ 27/Aug/15 ] |
|
I tried it on the stock kernel with Lustre patches and it crashed there as well: kernel BUG at block/blk-core.c:2627! I have not seen the crash on Oracle VirtualBox. |
| Comment by Aditya Pandit (Inactive) [ 14/Sep/15 ] |
|
This bug is a duplicate of https://jira.hpdd.intel.com/browse/LU-6974. |
| Comment by Peter Jones [ 14/Sep/15 ] |
|
OK, I will close the ticket as a duplicate. Thanks for letting us know. |