Details
Bug
Resolution: Unresolved
Major
None
Lustre 2.4.0
None
3
8389
Description
Running replay-single in a loop, I hit this oops today:
[352448.539528] Lustre: DEBUG MARKER: == replay-single test 3b: replay failed open -ENOMEM == 15:27:17 (1369337237)
[352449.238483] LustreError: 539:0:(osd_handler.c:1134:osd_ro()) *** setting lustre-MDT0000 read-only ***
[352449.238948] Turning device loop0 (0x700000) read-only
[352449.349409] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
[352449.367296] Lustre: DEBUG MARKER: local REPLAY BARRIER on lustre-MDT0000
[352449.530689] Lustre: *** cfs_fail_loc=114, val=0***
[352450.067118] LustreError: 27045:0:(client.c:1048:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff880081bc77f0 x1435853793530172/t0(0) o13->lustre-OST0001-osc-MDT0000@0@lo:7/4 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1
[352450.031173] BUG: unable to handle kernel paging request at ffff8800893ef9d0
[352450.031173] IP: [<ffffffff812872cf>] _raw_spin_lock+0xdf/0x180
[352450.031173] PGD 1a26063 PUD 501067 PMD 54b067 PTE 80000000893ef060
[352450.031173] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[352450.031173] last sysfs file: /sys/devices/system/cpu/possible
[352450.031173] CPU 7
[352450.031173] Modules linked in: lustre ofd osp lod ost mdt osd_ldiskfs fsfilt_ldiskfs ldiskfs mdd mgs lquota obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass lvfs ksocklnd lnet libcfs exportfs jbd sha512_generic sha256_generic ext4 mbcache jbd2 virtio_balloon virtio_console i2c_piix4 i2c_core virtio_net virtio_blk pata_acpi ata_generic ata_piix virtio_pci virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
[352450.031173]
[352450.031173] Pid: 0, comm: swapper Not tainted 2.6.32-rhe6.4-debug #2 Bochs Bochs
[352450.031173] RIP: 0010:[<ffffffff812872cf>] [<ffffffff812872cf>] _raw_spin_lock+0xdf/0x180
[352450.031173] RSP: 0018:ffff8800063c3d80 EFLAGS: 00010016
[352450.031173] RAX: 000000006ec735fa RBX: ffff8800893ef9d0 RCX: 000000006ec735fa
[352450.031173] RDX: 0000000000000361 RSI: 0000000000000003 RDI: 0000000000000001
[352450.031173] RBP: ffff8800063c3dc0 R08: ffff880096082608 R09: 0000000000000000
[352450.031173] R10: 0000000000000800 R11: ffffffff81d5fb40 R12: 00000000ca3244a0
[352450.031173] R13: ffff8800bac08440 R14: ffff8800bac08ae0 R15: 000000000008295c
[352450.031173] FS: 0000000000000000(0000) GS:ffff8800063c0000(0000) knlGS:0000000000000000
[352450.031173] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[352450.031173] CR2: ffff8800893ef9d0 CR3: 0000000001a25000 CR4: 00000000000006e0
[352450.031173] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[352450.031173] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[352450.031173] Process swapper (pid: 0, threadinfo ffff8800bac0a000, task ffff8800bac08440)
[352450.031173] Stack:
[352450.031173] ffff8800063d6880 0000000100000286 ffff8800063c3db0 0000000000000282
[352450.031173] <d> ffff8800893efc48 0000000000000003 0000000000000001 0000000000000000
[352450.031173] <d> ffff8800063c3de0 ffffffff814fe204 0000000700000001 ffff8800893ef9d0
[352450.031173] Call Trace:
[352450.031173] <IRQ>
[352450.031173] [<ffffffff814fe204>] _spin_lock_irqsave+0x24/0x30
[352450.031173] [<ffffffff810545f2>] __wake_up+0x32/0x70
[352450.031173] [<ffffffffa09095f0>] ? osp_statfs_timer_cb+0x0/0x50 [osp]
[352450.031173] [<ffffffffa0df078a>] cfs_waitq_signal+0x1a/0x20 [libcfs]
[352450.031173] [<ffffffffa090960a>] osp_statfs_timer_cb+0x1a/0x50 [osp]
[352450.031173] [<ffffffff8107f3f2>] run_timer_softirq+0x192/0x330
[352450.031173] [<ffffffff8102e2fd>] ? lapic_next_event+0x1d/0x30
[352450.031173] [<ffffffff810749d1>] __do_softirq+0xc1/0x1e0
[352450.031173] [<ffffffff8109940a>] ? hrtimer_interrupt+0x14a/0x270
[352450.031173] [<ffffffff8100c20c>] call_softirq+0x1c/0x30
[352450.031173] [<ffffffff8100de45>] do_softirq+0x65/0xa0
[352450.031173] [<ffffffff810747b5>] irq_exit+0x85/0x90
[352450.031173] [<ffffffff81504ed0>] smp_apic_timer_interrupt+0x70/0x9b
[352450.031173] [<ffffffff8100bbd3>] apic_timer_interrupt+0x13/0x20
[352450.031173] <EOI>
[352450.031173] [<ffffffff8103a86b>] ? native_safe_halt+0xb/0x10
[352450.031173] [<ffffffff81014a0d>] default_idle+0x4d/0xb0
[352450.031173] [<ffffffff81009a26>] cpu_idle+0xb6/0x110
[352450.031173] [<ffffffff814f4289>] start_secondary+0x2ac/0x2ef
[352450.031173] Code: 85 7a 00 fa 00 00 00 c7 45 cc 01 00 00 00 4d 8d b5 a0 06 00 00 eb 0c 0f 1f 44 00 00 8b 45 cc 85 c0 75 63 45 31 ff 4d 39 fc 76 f1 <8b> 03 89 c2 c1 c0 10 39 c2 8d 90 00 00 01 00 75 04 f0 0f b1 13
The base code is 2.4.0-RC1; my code branch is master-20130523.
The crash dump and modules are in /exports/crashdumps/192.168.10.221-2013-05-23-15\:27\:24