[LU-6974] RHEL 7.1 lustre-initialization-1: MDS crashed while lustre mount Created: 10/Aug/15 Updated: 12/May/17 Resolved: 31/Aug/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Yang Sheng |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Environment: |
3.10.0-229.11.1.el7_lustre.g853ea39.x86_64 |
||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for ys <yang.sheng@intel.com> This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/cebbc5a2-3e51-11e5-8006-5254006e85c2. The sub-test lustre-initialization_1 failed with the following error: Test system failed to start single suite, so abandoning all hope and giving up Please provide additional information about the failure here. Info required for matching: lustre-initialization-1 lustre-initialization_1 MDS crashed: 03:40:55:[ 2331.721214] ldiskfs: module verification failed: signature and/or required key missing - tainting kernel 03:40:55:[ 2331.740707] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro 03:40:55:[ 2352.224410] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro 03:40:55:[ 2352.319932] BUG: unable to handle kernel paging request at 0000000000001f50 03:40:55:[ 2352.320900] IP: [<ffffffff81155b02>] page_waitqueue+0x62/0x80 03:40:55:[ 2352.320900] PGD 36560067 PUD 7bb3a067 PMD 0 03:40:55:[ 2352.320900] Oops: 0000 [#1] SMP 03:40:55:[ 2352.320900] Modules linked in: ldiskfs(OF) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd fscache xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ppdev ib_cm iw_cm serio_raw pcspkr virtio_balloon i2c_piix4 parport_pc parport ib_sa ib_mad ib_core ib_addr ext4 mbcache jbd2 ata_generic pata_acpi cirrus syscopyarea sysfillrect virtio_blk sysimgblt drm_kms_helper 8139too floppy ttm ata_piix drm libata 8139cp mii virtio_pci virtio_ring i2c_core virtio 03:40:55:[ 2352.320900] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GF O-------------- 3.10.0-229.11.1.el7_lustre.g853ea39.x86_64 #1 03:40:55:[ 2352.320900] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 03:40:55:[ 2352.320900] task: ffff88007c052220 ti: ffff88007c080000 task.ti: ffff88007c080000 03:40:55:[ 2352.320900] RIP: 0010:[<ffffffff81155b02>] [<ffffffff81155b02>] page_waitqueue+0x62/0x80 03:40:55:[ 2352.320900] RSP: 0018:ffff88007fd03d50 EFLAGS: 00010086 03:40:55:[ 2352.320900] RAX: 0000000000001800 RBX: ffff88007c163100 RCX: 0000000000000040 03:40:55:[ 2352.320900] RDX: 97fd97a7b8163100 RSI: 0000000000000000 RDI: 0000000000000000 03:40:55:[ 2352.320900] RBP: ffff88007fd03d50 R08: 8800000000000000 R09: 2001f058c4000000 03:40:55:[ 2352.320900] R10: dffd97a7b8163100 R11: ffffffffa001a265 R12: ffff880035ea3500 03:40:55:[ 2352.320900] R13: 0000000000000001 R14: 0000000000000000 R15: ffff88003631e6c0 03:40:55:[ 2352.320900] FS: 0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000 03:40:55:[ 2352.320900] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b 03:40:55:[ 2352.320900] CR2: 0000000000001f50 CR3: 000000003652c000 CR4: 00000000000006e0 03:40:55:[ 2352.320900] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 03:40:55:[ 2352.320900] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 03:40:55:[ 2352.320900] Stack: 03:40:55:[ 2352.320900] ffff88007fd03d68 ffffffff81156f8e ffff880035ea3568 ffff88007fd03d90 03:40:55:[ 2352.320900] ffffffff8120652e ffff880035ea3500 0000000000000000 0000000000000000 03:40:55:[ 2352.320900] ffff88007fd03da0 ffffffff811fd9dd ffff88007fd03de0 ffffffff812adf60 03:40:55:[ 2352.320900] Call Trace: 03:40:56:[ 2352.320900] <IRQ> 03:40:56:[ 2352.320900] [<ffffffff81156f8e>] unlock_page+0x1e/0x30 03:40:56:[ 
2352.320900] [<ffffffff8120652e>] mpage_end_io+0x3e/0xb0 03:40:56:[ 2352.320900] [<ffffffff811fd9dd>] bio_endio+0x1d/0x30 03:40:56:[ 2352.320900] [<ffffffff812adf60>] blk_update_request+0x90/0x350 03:40:56:[ 2352.320900] [<ffffffff812b708a>] blk_mq_end_request+0x1a/0x70 03:40:56:[ 2352.320900] [<ffffffffa01202f2>] virtblk_request_done+0x32/0x80 [virtio_blk] 03:40:56:[ 2352.320900] [<ffffffff812b783d>] __blk_mq_complete_request+0x7d/0x100 03:40:56:[ 2352.320900] [<ffffffff812b78e1>] blk_mq_complete_request+0x21/0x30 03:40:56:[ 2352.320900] [<ffffffffa0120076>] virtblk_done+0x76/0x100 [virtio_blk] 03:40:56:[ 2352.320900] [<ffffffffa001a4a8>] vring_interrupt+0x38/0x90 [virtio_ring] 03:40:56:[ 2352.320900] [<ffffffff8110b84e>] handle_irq_event_percpu+0x3e/0x1e0 03:40:56:[ 2352.320900] [<ffffffff8110ba2d>] handle_irq_event+0x3d/0x60 03:40:56:[ 2352.320900] [<ffffffff8110e6c7>] handle_edge_irq+0x77/0x130 03:40:56:[ 2352.320900] [<ffffffff81015cff>] handle_irq+0xbf/0x150 03:40:57:[ 2352.320900] [<ffffffff81610b4a>] ? atomic_notifier_call_chain+0x1a/0x20 03:40:57:[ 2352.320900] [<ffffffff816175ef>] do_IRQ+0x4f/0xf0 03:40:57:[ 2352.320900] [<ffffffff8160c82d>] common_interrupt+0x6d/0x6d 03:40:57:[ 2352.320900] <EOI> 03:40:57:[ 2352.320900] [<ffffffff8109b938>] ? hrtimer_start+0x18/0x20 03:40:57:[ 2352.320900] [<ffffffff81052dd6>] ? native_safe_halt+0x6/0x10 03:40:57:[ 2352.320900] [<ffffffff8101c93f>] default_idle+0x1f/0xc0 03:40:57:[ 2352.320900] [<ffffffff8101d236>] arch_cpu_idle+0x26/0x30 03:40:57:[ 2352.320900] [<ffffffff810c6985>] cpu_startup_entry+0xf5/0x290 03:40:57:[ 2352.320900] [<ffffffff810423ca>] start_secondary+0x1ba/0x230 03:40:57:[ 2352.320900] Code: 48 03 04 d5 80 31 a2 81 48 89 fa 48 c1 e1 39 48 c1 e2 36 48 c1 e7 3f 48 89 e5 4c 01 d2 4c 29 c2 48 01 f2 48 29 ca b9 40 00 00 00 <2b> 88 50 07 00 00 48 01 d7 48 8b 80 40 07 00 00 5d 48 d3 ef 48 03:40:57:[ 2352.320900] RIP [<ffffffff81155b02>] page_waitqueue+0x62/0x80 03:40:57:[ 2352.320900] RSP <ffff88007fd03d50> 03:40:57:[ 2352.320900] CR2: 0000000000001f50 03:40:57:[ 2352.320900] ------------[ cut here ]------------ 03:40:57:[ 2352.320900] kernel BUG at mm/vmalloc.c:1339! 
03:40:57:[ 2352.320900] invalid opcode: 0000 [#2] SMP 03:40:57:[ 2352.320900] Modules linked in: ldiskfs(OF) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd fscache xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ppdev ib_cm iw_cm serio_raw pcspkr virtio_balloon i2c_piix4 parport_pc parport ib_sa ib_mad ib_core ib_addr ext4 mbcache jbd2 ata_generic pata_acpi cirrus syscopyarea sysfillrect virtio_blk sysimgblt drm_kms_helper 8139too floppy ttm ata_piix drm libata 8139cp mii virtio_pci virtio_ring i2c_core virtio 03:40:57:[ 2352.320900] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GF O-------------- 3.10.0-229.11.1.el7_lustre.g853ea39.x86_64 #1 03:40:57:[ 2352.320900] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 03:40:57:[ 2352.320900] task: ffff88007c052220 ti: ffff88007c080000 task.ti: ffff88007c080000 03:40:57:[ 2352.320900] RIP: 0010:[<ffffffff8119060e>] [<ffffffff8119060e>] __get_vm_area_node+0x1ce/0x1d0 03:40:57:[ 2352.320900] RSP: 0018:ffff88007fd032b0 EFLAGS: 00010006 03:40:57:[ 2352.320900] RAX: ffff88007c083fd8 RBX: 00000000ffffffff RCX: ffffc90000000000 03:40:57:[ 2352.320900] RDX: 0000000000000022 RSI: 0000000000000001 RDI: 0000000000002000 03:40:57:[ 2352.320900] RBP: ffff88007fd03310 R08: ffffe8ffffffffff R09: 00000000ffffffff 03:40:57:[ 2352.320900] R10: ffff880077e6f480 R11: ffff8800362c48d0 R12: ffffffffa00d42c9 03:40:57:[ 2352.320900] R13: 0000000000001200 R14: 00000000000080d2 R15: ffffea0000d8e940 03:40:57:[ 2352.320900] FS: 0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000 03:40:57:[ 2352.320900] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b 03:40:57:[ 2352.320900] CR2: 0000000000001f50 CR3: 000000003652c000 CR4: 00000000000006e0 03:40:57:[ 2352.320900] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 03:40:57:[ 2352.320900] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 03:40:57:[ 2352.320900] Stack: 03:40:57:[ 2352.320900] ffffffff81191d7d 00000000000080d2 ffffffffa00d42c9 8000000000000163 03:40:57:[ 2352.320900] 000080d200000000 0000000000000000 681e1a9a428ae35b ffff880077e6f480 03:40:57:[ 2352.320900] ffff8800362c48b0 0000000000240000 0000000000000080 ffffea0000d8e940 03:40:57:[ 2352.320900] Call Trace: 03:40:57:[ 2352.320900] <IRQ> 03:40:57:[ 2352.320900] [<ffffffff81191d7d>] ? __vmalloc_node_range+0x7d/0x270 03:40:57:[ 2352.320900] [<ffffffffa00d42c9>] ? ttm_tt_init+0x69/0xb0 [ttm] 03:40:58:[ 2352.320900] [<ffffffff81191fb1>] __vmalloc+0x41/0x50 03:40:58:[ 2352.320900] [<ffffffffa00d42c9>] ? ttm_tt_init+0x69/0xb0 [ttm] 03:40:58:[ 2352.320900] [<ffffffffa00d42c9>] ttm_tt_init+0x69/0xb0 [ttm] 03:40:58:[ 2352.320900] [<ffffffffa013b4e8>] cirrus_ttm_tt_create+0x58/0x90 [cirrus] 03:40:58:[ 2352.320900] [<ffffffffa00d4a7d>] ttm_bo_add_ttm+0x8d/0xc0 [ttm] 03:40:58:[ 2352.320900] [<ffffffffa00d60f1>] ttm_bo_handle_move_mem+0x571/0x5b0 [ttm] 03:40:58:[ 2352.320900] [<ffffffff81601ae4>] ? __slab_free+0x10e/0x277 03:40:58:[ 2352.320900] [<ffffffffa00d674a>] ? ttm_bo_mem_space+0x10a/0x310 [ttm] 03:40:58:[ 2352.320900] [<ffffffffa00d6e17>] ttm_bo_validate+0x247/0x260 [ttm] 03:40:58:[ 2352.320900] [<ffffffff81059e69>] ? iounmap+0x79/0xa0 03:40:58:[ 2352.320900] [<ffffffff81050069>] ? 
kgdb_arch_late+0xe9/0x180 03:40:58:[ 2352.320900] [<ffffffffa013bac2>] cirrus_bo_push_sysram+0x82/0xe0 [cirrus] 03:40:58:[ 2352.320900] [<ffffffffa0139c84>] cirrus_crtc_do_set_base.isra.8.constprop.10+0x84/0x430 [cirrus] 03:40:58:[ 2352.320900] [<ffffffffa013a479>] cirrus_crtc_mode_set+0x449/0x4d0 [cirrus] 03:40:58:[ 2352.320900] [<ffffffffa0107939>] drm_crtc_helper_set_mode+0x2e9/0x520 [drm_kms_helper] 03:40:58:[ 2352.320900] [<ffffffffa01086bf>] drm_crtc_helper_set_config+0x87f/0xaa0 [drm_kms_helper] 03:40:58:[ 2352.320900] [<ffffffff816092eb>] ? __ww_mutex_lock+0x1b/0xa0 03:40:58:[ 2352.320900] [<ffffffffa0094711>] drm_mode_set_config_internal+0x61/0xe0 [drm] 03:40:58:[ 2352.320900] [<ffffffffa0110a94>] drm_fb_helper_pan_display+0x94/0xf0 [drm_kms_helper] 03:40:58:[ 2352.320900] [<ffffffff81326309>] fb_pan_display+0xc9/0x190 03:40:58:[ 2352.320900] [<ffffffff81335390>] bit_update_start+0x20/0x50 03:40:58:[ 2352.320900] [<ffffffff81334dbd>] fbcon_switch+0x39d/0x5a0 03:40:58:[ 2352.320900] [<ffffffff813a35f9>] redraw_screen+0x1a9/0x270 03:40:58:[ 2352.320900] [<ffffffff8132650e>] ? fb_blank+0xae/0xc0 03:40:58:[ 2352.320900] [<ffffffff813322da>] fbcon_blank+0x22a/0x2f0 03:40:58:[ 2352.320900] [<ffffffff81070384>] ? wake_up_klogd+0x34/0x50 04:40:23:********** Timeout by autotest system ********** |
| Comments |
| Comment by Yang Sheng [ 10/Aug/15 ] |
|
Hi YuJian, I found that this issue only exists in the autotest system. Could you tell me whether I can access the test nodes to figure out what is happening? |
| Comment by Bob Glossman (Inactive) [ 10/Aug/15 ] |
|
Am I correct in thinking this panic is only seen in tests of the new 3.10.0-229.11.1.el7 kernel version? It seems possible this is a TEI issue and not a lustre one. I haven't seen any such panics during local build & test of the new kernel version. |
| Comment by Yang Sheng [ 10/Aug/15 ] |
|
This is not only on the 229.11.1 kernel; it also happens with the 229.7.2 kernel: https://testing.hpdd.intel.com/test_sets/83040eee-3d20-11e5-9684-5254006e85c2 I agree it looks like a TEI issue rather than an LU issue. Please feel free to change it if appropriate. |
| Comment by Andreas Dilger [ 12/Aug/15 ] |
|
Yang Sheng, if this is a kernel crash, I don't think it is correct to assign it to the TEI team. Is the crash stack always the same? It looks like a problem in the block layer. How often is this bug hit? It is currently marked minor priority. |
| Comment by Yang Sheng [ 13/Aug/15 ] |
|
Hi Andreas, this issue has only occurred in the autotest system and has never hit my local box, so I think it may be related to the autotest environment. It is hit very frequently in recent el7.1 tests. |
| Comment by Bob Glossman (Inactive) [ 13/Aug/15 ] |
|
Another instance seen in an el7 test on master: |
| Comment by Andreas Dilger [ 13/Aug/15 ] |
|
It does appear that the oops is the same in each case. In particular:
BUG: unable to handle kernel paging request at 0000000000001f50
03:40:55:[ 2352.320900] IP: [<ffffffff81155b02>] page_waitqueue+0x62/0x80
is always the same. Yang Sheng, can you please see what is being done at that code, and what pointer is bad? Even if the failure is happening only on autotest nodes and not on your local test system, this is something that the Lustre team needs to debug. You could reserve a node and use loadjenkinsbuild to set up a similar test environment if needed. Maybe it depends on a difference between the RHEL7 guest VM and the specific host VM version? |
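For reference, a minimal sketch of roughly how page_waitqueue() looks in 3.10-era kernels (paraphrased from that generation of mm/filemap.c, not copied from this exact RHEL build):
static wait_queue_head_t *page_waitqueue(struct page *page)
{
	/* derive the zone from the struct page, then index the per-zone
	 * hash table of wait queues for that page */
	const struct zone *zone = page_zone(page);

	return &zone->wait_table[hash_ptr(page, zone->wait_table_bits)];
}
If the struct page pointer handed up from the bio completion path (mpage_end_io() -> unlock_page()) is garbage, page_zone() yields a bogus zone pointer, and the subsequent read of the zone's wait-table fields faults at a small offset, which would be consistent with the faulting address 0000000000001f50 in the oops above.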
| Comment by Yang Sheng [ 13/Aug/15 ] |
|
Hi Andreas, I suspect this is related to something special in the test environment. It looks like Onyx is fully reserved, so I have also requested an account for the Shadow cluster (TEI-3782) so that I can take a closer look at this issue. |
| Comment by James Nunez (Inactive) [ 18/Aug/15 ] |
|
Similar (same?) failure on master on shadow at 2015-08-17 21:09:48 - https://testing.hpdd.intel.com/test_sets/2f886b10-452d-11e5-afca-5254006e85c2 21:12:00:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts: 21:12:00:BUG: unable to handle kernel NULL pointer dereference at (null) 21:12:00:IP: [<ffffffff811ca1b0>] mpage_end_io_read+0x30/0x90 21:12:00:PGD 6a0eb067 PUD 5f9bd067 PMD 0 21:12:00:Oops: 0002 [#1] SMP 21:12:00:last sysfs file: /sys/devices/pci0000:00/0000:00:04.0/virtio0/block/vda/queue/scheduler 21:12:00:CPU 1 21:12:00:Modules linked in: ldiskfs(U) jbd2 nfsd exportfs nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib] 21:12:00: 21:12:00:Pid: 3109, comm: modprobe Not tainted 2.6.32-431.29.2.el6_lustre.gca29247.x86_64 #1 Red Hat KVM 21:12:00:RIP: 0010:[<ffffffff811ca1b0>] [<ffffffff811ca1b0>] mpage_end_io_read+0x30/0x90 21:12:00:RSP: 0018:ffff880002303d58 EFLAGS: 00010002 21:12:00:RAX: ffff880057915ec0 RBX: ffff880057915ea0 RCX: 0000000000000000 21:12:00:RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 21:12:00:RBP: ffff880002303d78 R08: ffff880057947e60 R09: 0000000000000000 21:12:00:R10: ffff880037d2b640 R11: 0000000000000001 R12: ffff880065275c00 21:12:00:R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000 21:12:00:FS: 00007fcb045a5700(0000) GS:ffff880002300000(0000) knlGS:0000000000000000 21:12:00:CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b 21:12:00:CR2: 0000000000000000 CR3: 0000000079aa3000 CR4: 00000000000006e0 21:12:00:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 21:12:00:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 21:12:00:Process modprobe (pid: 3109, threadinfo ffff880079454000, task ffff880079160040) 21:12:00:Stack: 21:12:00: ffff880037da2fa8 ffff880065275c00 0000000000000000 ffff880057947e60 21:12:00:<d> ffff880002303d88 ffffffff811c314d ffff880002303db8 ffffffff812661fb 21:12:00:<d> ffff880002303da8 ffff880065275c00 0000000000000000 0000000000000000 21:12:00:Call Trace: 21:12:00: <IRQ> 21:12:00: [<ffffffff811c314d>] bio_endio+0x1d/0x40 21:12:00: [<ffffffff812661fb>] req_bio_endio+0x9b/0xe0 21:12:00: [<ffffffff81267777>] blk_update_request+0x117/0x490 21:12:00: [<ffffffff81095211>] ? __queue_work+0x41/0x50 21:12:00: [<ffffffff81267b17>] blk_update_bidi_request+0x27/0xa0 21:12:00: [<ffffffff812698be>] __blk_end_request_all+0x2e/0x60 21:12:00: [<ffffffffa005a22a>] blk_done+0x4a/0x110 [virtio_blk] 21:12:00: [<ffffffff810ecb14>] ? __rcu_process_callbacks+0x54/0x350 21:12:00: [<ffffffffa004e2ac>] vring_interrupt+0x3c/0xd0 [virtio_ring] 21:12:00: [<ffffffff810e7090>] handle_IRQ_event+0x60/0x170 21:12:00: [<ffffffff8107a64f>] ? __do_softirq+0x11f/0x1e0 21:12:00: [<ffffffff810e99ee>] handle_edge_irq+0xde/0x180 21:12:00: [<ffffffff8100faf9>] handle_irq+0x49/0xa0 21:12:00: [<ffffffff81532dbc>] do_IRQ+0x6c/0xf0 21:12:00: [<ffffffff8100b9d3>] ret_from_intr+0x0/0x11 21:12:00: <EOI> 21:12:00: [<ffffffff8116ff8a>] ? kmem_cache_alloc+0xda/0x190 21:12:00: [<ffffffff811222e5>] ? 
mempool_alloc_slab+0x15/0x20 21:12:00: [<ffffffff81122483>] mempool_alloc+0x63/0x140 21:12:00: [<ffffffff811c4f3e>] bio_alloc_bioset+0x3e/0xf0 21:12:00: [<ffffffff811c5095>] bio_alloc+0x15/0x30 21:12:00: [<ffffffff811ca065>] mpage_alloc+0x35/0xa0 21:12:00: [<ffffffff811ca5de>] do_mpage_readpage+0x35e/0x5f0 21:12:00: [<ffffffff8114252e>] ? __inc_zone_page_state+0x2e/0x30 21:12:00: [<ffffffff81136b00>] ? __lru_cache_add+0x40/0x90 21:12:00: [<ffffffff811ca9c9>] mpage_readpages+0xe9/0x130 21:12:00: [<ffffffffa008c140>] ? ext3_get_block+0x0/0x120 [ext3] 21:12:00: [<ffffffffa008c140>] ? ext3_get_block+0x0/0x120 [ext3] 21:12:00: [<ffffffff81167cca>] ? alloc_pages_current+0xaa/0x110 21:12:00: [<ffffffffa0089ccd>] ext3_readpages+0x1d/0x20 [ext3] 21:12:00: [<ffffffff81135795>] __do_page_cache_readahead+0x185/0x210 21:12:00: [<ffffffff81135841>] ra_submit+0x21/0x30 21:12:00: [<ffffffff81135bb5>] ondemand_readahead+0x115/0x240 21:12:00: [<ffffffff81135dd3>] page_cache_sync_readahead+0x33/0x50 21:12:00: [<ffffffff81121868>] generic_file_aio_read+0x558/0x700 21:12:00: [<ffffffff811890ba>] do_sync_read+0xfa/0x140 21:12:00: [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40 21:12:00: [<ffffffff811519d5>] ? do_mmap_pgoff+0x335/0x380 21:12:00: [<ffffffff81227366>] ? security_file_permission+0x16/0x20 21:12:00: [<ffffffff81189a75>] vfs_read+0xb5/0x1a0 21:12:00: [<ffffffff81189bb1>] sys_read+0x51/0x90 21:12:00: [<ffffffff810e202e>] ? __audit_syscall_exit+0x25e/0x290 21:12:00: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b 21:12:00:Code: 55 41 54 53 48 83 ec 08 0f 1f 44 00 00 0f b7 57 28 4c 8b 6f 18 48 8b 47 48 49 89 fc 41 83 e5 01 48 c1 e2 04 48 8d 5c 02 f0 eb 13 <f0> 80 0f 08 e8 67 5a f5 ff 49 8b 44 24 48 48 39 c3 72 2e 48 8b 21:12:00:RIP [<ffffffff811ca1b0>] mpage_end_io_read+0x30/0x90 21:12:00: RSP <ffff880002303d58> 21:12:00:CR2: 0000000000000000 |
| Comment by Bob Glossman (Inactive) [ 18/Aug/15 ] |
|
Another instance on master, on el7.1: |
| Comment by Yang Sheng [ 19/Aug/15 ] |
|
Hi Minh, I have reserved Shadow node shadow-57 and tried to reproduce this issue, but I have gotten nothing so far. Could you point out any special configuration that may help to hit an instance? I just noticed that autotest uses an LVM volume as the block device. Any comment is welcome. |
| Comment by Yang Sheng [ 21/Aug/15 ] |
|
I have reproduced this issue on the Shadow cluster. The problem is related to the block device type: the Shadow VMs use the virtio storage type, and the crash happens whether LVM or a raw disk is used as the Lustre backing storage, but I did not encounter the issue when using a loopback device. So could you please check our VM version or choose another storage type? I'll continue to follow up on this issue. |
| Comment by James Nunez (Inactive) [ 24/Aug/15 ] |
|
We hit this issue again at: |
| Comment by Sarah Liu [ 24/Aug/15 ] |
|
More instances on the master branch. This issue blocks the 2.8 testing. |
| Comment by Bob Glossman (Inactive) [ 25/Aug/15 ] |
|
While not hard evidence, I see that the 3 test failures from James' last comment are all on different kernels, which makes me suspect a TEI cause is more likely. |
| Comment by Andreas Dilger [ 25/Aug/15 ] |
|
Sarah, if you are able to hit this problem repeatedly, could you please try the patch http://review.whamcloud.com/16035 from |
| Comment by Yang Sheng [ 25/Aug/15 ] |
|
Of the links in James's comment, only 2015-08-25 01:45:31 - https://testing.hpdd.intel.com/test_sets/b0d4981c-4ad5-11e5-aa52-5254006e85c2 is likely the same issue; the other two are different issues. |
| Comment by Sarah Liu [ 25/Aug/15 ] |
|
Andreas, sure, I will try and update the ticket. |
| Comment by Bob Glossman (Inactive) [ 26/Aug/15 ] |
|
I believe noting this as RHEL 7.1 in the title is incorrect; panics with similar stack traces can be seen in tests of sles11sp4 and sles12 too. |
| Comment by Andreas Dilger [ 26/Aug/15 ] |
|
Based on Fan Yong's comments in |
| Comment by Bob Glossman (Inactive) [ 26/Aug/15 ] |
|
This was seen in past test failures on sles11sp3 too; I didn't know what to look for to recognize this problem until more recently. |
| Comment by Andreas Dilger [ 26/Aug/15 ] |
|
Yang Sheng, Sarah, if you are now able to reproduce this problem easily on the Shadow VM, can you please run tests to see if it can be isolated to a specific patch landing on master? There are a number of patches that landed on Aug 3. The last patch that has Test-Parameters: mdsdistro=el7 and passed testing appears to be http://review.whamcloud.com/15737 (that testing was based on 6af80537ef30).
If this problem can be reproduced consistently via autotest, it would be easiest to submit several independent patches that are based on the patch just before a likely candidate (using ~1, which means "one patch before the named one" in git), and then push each one to autotest with fortestonly and specific to RHEL7.1 so it doesn't cause a lot of useless testing overhead, something like:
$ git checkout 50c79ee142be~1
$ echo "testing $(git describe) before 50c79ee142be" >> README
$ git commit -S -a -F -
LU-6974 test: testing RHEL7.1 before patch 50c79ee142be
Testing RHEL 7.1 before patch 50c79ee142be to see if it works.
Test-Parameters: fortestonly mdsdistro=el7 ossdistro=el7 \
  clientdistro=el7 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
  testgroup=review-ldiskfs
^D
$ git push ssh://ys@review/fs/lustre-release HEAD:refs/for/master
Since these patches will run only minimal tests, they can all be pushed and tested in parallel rather than doing a bisection. It is also worthwhile to test against some previous patches to see that they are passing. The patches of interest are below, and you should test the patch one before each of them to see if that resolves the problem:
bad49e39e301 LU-6096 ldiskfs: mark dir's inode dirty
71b3cc331511 LU-6544 mkfs: Improve MDT inode size calculation
af46754e1e9c LU-6904 mdd: prepare linkea before declare
bf0b93a88479 LU-6874 out: missing noop in out_update_ops
0484a16883a9 LU-5888 utils: limit max_sectors_kb tunable setting
bcef61a80ab4 LU-6768 osd: unmap reallocated blocks
11b048aaffd6 LU-6427 osd: race between destroy and load LMV
Without the
Strangely, I do see that patch http://review.whamcloud.com/15130 passed RHEL7.1 testing with a master tree based on c45c8ad26004. |
| Comment by Yang Sheng [ 27/Aug/15 ] |
|
Hi Andreas, what I have found is that this issue can be reproduced stably on the Shadow cluster, and I have narrowed it down to the failure point. I am sure this issue exists only on the master branch. The system crashed in the following code path:
/* send back the whole mti in the reply */
rep_mti = req_capsule_server_get(tsi->tsi_pill, &RMF_MGS_TARGET_INFO);
*rep_mti = *mti;
/* Flush logs to disk */
dt_sync(tsi->tsi_env, mgs->mgs_bottom); <<<----- crashed at this point.
RETURN(rc);
}
I'll continue digging for the root cause, but I am limited by poor network speed; accessing the Shadow cluster from China is a painful experience. Also, this issue only reproduces on Shadow cluster VM nodes; I cannot hit it on my local VM box. |
| Comment by Oleg Drokin [ 27/Aug/15 ] |
|
Another good idea is to create a patch similar to http://review.whamcloud.com/#/c/4139/, only for RHEL7.1, and force rhel7.1 testing; we might be able to catch the party doing the bad write red-handed too. YuJian, any chance you can do this in addition to Andreas' suggestion? |
| Comment by Andreas Dilger [ 27/Aug/15 ] |
|
Unfortunately, this code path hasn't changed in several years. It seems that there is some corruption of requests in the IO path, and when dt_sync->osd_sync() is called it flushes the corrupt IO requests to disk and the crash happens in the queue. What we need to do is find where the corruption happens, which is why I'd suggested testing with different Lustre patches reverted until the problem goes away. Unfortunately, the test system is very backed up right now, so it seems like this will go faster by manually testing the different Lustre builds on Shadow using git bisect. Oleg also suggested to enable debug options in the kernel to see if they can detect where the corruption is hit. |
| Comment by Jian Yu [ 27/Aug/15 ] |
Sure, let me do this. |
| Comment by Yang Sheng [ 27/Aug/15 ] |
|
This issue is related to commit 0484a16883a951f27c30ee926c9a1c305f7cf489. It looks like 32 * 1024 - 1 is the proper value to set. |
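For illustration, a minimal sketch of the clamp being discussed (paraphrased from the mount_utils snippet quoted in Vinayak's comments below; the helper name and surrounding code are simplified assumptions, not the exact Lustre source):
/* Simplified sketch of how mount.lustre derives the value it writes to
 * /sys/block/<dev>/queue/max_sectors_kb: start from max_hw_sectors_kb
 * and cap it. The proposed fix lowers the cap by one KB. */
unsigned long long clamp_max_sectors_kb(unsigned long long newval)
{
	if (newval > 32 * 1024)
		newval = 32 * 1024 - 1;	/* previously 32 * 1024 (32768) */

	return newval;	/* formatted with snprintf() and written to sysfs */
}
With the old cap, a RHEL7 virtio disk reporting a huge max_hw_sectors_kb ends up with max_sectors_kb = 32768, the value implicated in the crashes above; the one-less cap keeps it at 32767.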
| Comment by Jian Yu [ 27/Aug/15 ] |
|
Here are three for-test-only patches to verify Yang Sheng's finding: http://review.whamcloud.com/16114, http://review.whamcloud.com/16113 and http://review.whamcloud.com/16115 |
| Comment by Bob Glossman (Inactive) [ 27/Aug/15 ] |
|
While I'm not doubting Yang Sheng's pinpointing of a root cause, I'm having trouble understanding why it's not showing up everywhere. The same utils code applies and is used on all builds and distros, including el6.6 and el6.7, where the problem isn't seen. |
| Comment by Gerrit Updater [ 27/Aug/15 ] |
|
Yang Sheng (yang.sheng@intel.com) uploaded a new patch: http://review.whamcloud.com/16116 |
| Comment by James A Simmons [ 27/Aug/15 ] |
|
Really, a one off error. I wonder if that was the reason the |
| Comment by Andreas Dilger [ 28/Aug/15 ] |
|
I still think this is a block layer bug. If userspace can set the max_sectors_kb=32768 then this shouldn't crash the kernel. If that value is incorrect for some reason then the kernel shouldn't allow it. Setting it one less to make it work seems like voodoo to me, and also has the side-effect of not having an IO size that is an even multiple of 1MB. If we're going to go down that route then it would be better to make it 16384 KB so that the IO submitted to the disk is a power-of-two size to align with the RAID. |
| Comment by Vinayak (Inactive) [ 28/Aug/15 ] |
|
We have also encountered this issue in our test environment in every single run on master in recent days; I would say it is 100% reproducible. We reverted the patch for commit 0484a16883a951f27c30ee926c9a1c305f7cf489
Author: Andreas Dilger <andreas.dilger@intel.com>
AuthorDate: Wed Apr 1 12:35:29 2015 -0400
Commit: Oleg Drokin <oleg.drokin@intel.com>
CommitDate: Mon Aug 3 01:57:17 2015 +0000
LU-5888 utils: limit max_sectors_kb tunable setting
Properly limit the max_sectors_kb tunable if larger than 32MB, as
added in commit 73bd456e896 (http://review.whamcloud.com/12723).
It was using the old value in "buf" instead of the updated "newval".
Change write_file() to use a file descriptor directly instead of
streams, since it only writes a single value before closing again.
and ran the test; the mount process was successful. If I am not wrong, this problem is not specific to RHEL7.1. Please correct me if I am wrong. |
| Comment by Andreas Dilger [ 28/Aug/15 ] |
|
Vinayak, is the problem also gone if the |
| Comment by Vinayak (Inactive) [ 29/Aug/15 ] |
|
Hello Andreas, we have started a build with the new patch you recently submitted through http://review.whamcloud.com/16116 and will update the ticket once we get the results. BTW, I am also trying to understand how |
| Comment by Vinayak (Inactive) [ 29/Aug/15 ] |
|
We did not face this issue with your new patch on our test system. Coming back to the behavior of
So I just wanted to cross-verify the
The investigation on this issue so far is making me focus on kvm virtio. We found a related (hopefully related) link, http://sourceforge.net/p/kvm/bugs/516/, which talks about limiting too-large iovecs. Please correct me if I am wrong. |
| Comment by Vinayak (Inactive) [ 29/Aug/15 ] |
|
Mount console logs on the virtual box (mount successful on the virtual box):
Checking servers environments
Checking clients cli-3,cli-4 environments
Loading modules from /usr/lib64/lustre/tests/..
detected 2 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
debug=vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck
subsystem_debug=all -lnet -lnd -pinger
Setup mgs, mdt, osts
Starting mgs: /dev/sdc /mnt/mgs
cli-1: mount.lustre: increased /sys/block/sdc/queue/max_sectors_kb from 1024 to 32767
Started MGS
Starting mds1: /dev/sdb /mnt/mds1
cli-1: mount.lustre: increased /sys/block/sdb/queue/max_sectors_kb from 1024 to 32767
Started lustre-MDT0000
Starting ost1: /dev/sdb /mnt/ost1
cli-2: mount.lustre: increased /sys/block/sdb/queue/max_sectors_kb from 1024 to 32767
Started lustre-OST0000
Starting ost2: /dev/sdc /mnt/ost2
cli-2: mount.lustre: increased /sys/block/sdc/queue/max_sectors_kb from 1024 to 32767
Started lustre-OST0001
Starting client: cli-4: -o user_xattr,flock cli-1@tcp:/lustre /mnt/lustre
Starting client cli-3,cli-4: -o user_xattr,flock cli-1@tcp:/lustre /mnt/lustre
Started clients cli-3,cli-4:
cli-1@tcp:/lustre on /mnt/lustre type lustre (rw,user_xattr,flock)
cli-1@tcp:/lustre on /mnt/lustre type lustre (rw,user_xattr,flock)
Using TIMEOUT=20
seting jobstats to procname_uid
Setting lustre.sys.jobid_var from disable to procname_uid
Waiting 90 secs for update
Updated after 7s: wanted 'procname_uid' got 'procname_uid'
disable quota as required
Please note this: in the above case mount.lustre did not enter
if (newval > 32 * 1024) {
newval = 32 * 1024;
snprintf(buf, sizeof(buf), "%llu", newval);
}
I feel the fix by Yang Sheng is fine, i.e.
if (newval > 32 * 1024) {
- newval = 32 * 1024;
+ newval = 32 * 1024 - 1;
snprintf(buf, sizeof(buf), "%llu", newval);
}
I did not go into much detail about the handling of this in the block layer, but it looks like the maximum value should be calculated as 2^15 - 1. In other words, the range starts from 0 to 32767 (2^15 - 1); if we consider 32768, then it might be an overflow case. We tested this in our test environment and it works. The fact is that mount.lustre set max_sectors_kb to 2147483647. The unexpectedly large value (2147483647) might be because of
but the definition of strtoull() tells that
/* Not currently supported: leading whitespace, sign, 0x prefix, zero base */
Please refer to strtoull. I agree that max_sectors_kb should also be handled in the block layer, but unfortunately that handling might be missing. Please correct me if I am wrong. |
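To make the overflow argument above concrete, here is a small illustrative snippet (standalone C; the 16-bit sector-count limit is an assumption suggested by the 32767 KB boundary, not a verified property of the virtio code):
#include <limits.h>
#include <stdio.h>

/* max_sectors_kb is expressed in KB, while requests are counted in
 * 512-byte sectors: 32768 KB = 65536 sectors, which no longer fits in
 * an unsigned 16-bit field, whereas 32767 KB = 65534 sectors still does. */
int main(void)
{
	unsigned long long kb_values[] = { 16384, 32767, 32768 };

	for (int i = 0; i < 3; i++) {
		unsigned long long sectors = kb_values[i] * 1024 / 512;

		printf("%llu KB -> %llu sectors (%s 16-bit range)\n",
		       kb_values[i], sectors,
		       sectors <= USHRT_MAX ? "within" : "exceeds");
	}
	return 0;
}
With these numbers, 32768 KB maps to 65536 sectors, just past USHRT_MAX (65535), which would be consistent with 32767 being the largest value the VM accepts.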
| Comment by Vinayak (Inactive) [ 30/Aug/15 ] |
|
When I tried to set max_sectors_kb on the virtual box to 32768 it resulted in "Invalid argument", while it works well if we set max_sectors_kb to 32767. It looks like 32767 is the maximum value we can set (a limitation). Is this behavior the same on kvm also?
# echo 32768 > /sys/block/sdb/queue/max_sectors_kb
bash: echo: write error: Invalid argument
# cat /sys/block/sdb/queue/max_sectors_kb
512
# echo 32767 > /sys/block/sdb/queue/max_sectors_kb
# cat /sys/block/sdb/queue/max_sectors_kb
32767 |
| Comment by Yang Sheng [ 31/Aug/15 ] |
|
Hi Vinayak, please take note of max_hw_sectors_kb: the kernel doesn't permit max_sectors_kb to exceed max_hw_sectors_kb. That is why the issue is rarely hit on rhel6. rhel7 is able to handle large sector values, so max_hw_sectors_kb can be set as high as 2147483647 (UINT_MAX/2). |
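A minimal sketch of the acceptance rule described above (a simplified model, not the actual block-layer source), showing why the same write succeeds where the device reports a huge max_hw_sectors_kb but fails where the hardware limit is small:
#include <errno.h>
#include <stdio.h>

/* Simplified model of writing /sys/block/<dev>/queue/max_sectors_kb:
 * the kernel rejects values above max_hw_sectors_kb, so the effective
 * upper bound depends entirely on what the device driver reports. */
static int set_max_sectors_kb(unsigned long long requested_kb,
			      unsigned long long max_hw_sectors_kb)
{
	if (requested_kb == 0 || requested_kb > max_hw_sectors_kb)
		return -EINVAL;	/* "write error: Invalid argument" */
	return 0;		/* value accepted */
}

int main(void)
{
	/* device reporting a small hardware limit (el6-style): rejected */
	printf("hw=512 KB, request 32768 KB -> %d\n",
	       set_max_sectors_kb(32768, 512));
	/* rhel7 virtio-blk reporting 2147483647 KB: accepted */
	printf("hw=2147483647 KB, request 32768 KB -> %d\n",
	       set_max_sectors_kb(32768, 2147483647ULL));
	return 0;
}
Under this model, on el6 the large value is simply rejected (or never attempted, since mount.lustre starts from the small max_hw_sectors_kb), while on rhel7 virtio the 32768 KB setting is accepted and then apparently trips the device stack, matching Bob's observation that el6.6/el6.7 never see the crash.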
| Comment by James A Simmons [ 31/Aug/15 ] |
|
Vinayak, does reducing max_sectors_kb to 16384 work for you? |
| Comment by Vinayak (Inactive) [ 31/Aug/15 ] |
|
>> Vinayak does reducing max_sectors_kb to 16384 work for you? |
| Comment by Gerrit Updater [ 31/Aug/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16116/ |
| Comment by Joseph Gmitter (Inactive) [ 31/Aug/15 ] |
|
Landed for 2.8. |