Details
Bug
Resolution: Cannot Reproduce
Major
None
Lustre 2.5.0
None
3
9371
Description
Just hit this running recent master:
<4>[113366.463322] Lustre: DEBUG MARKER: == replay-single test 0c: check replay-barrier == 10:40:38 (1374676838)
<3>[113367.223979] LustreError: 22867:0:(osd_handler.c:1191:osd_ro()) *** setting lustre-MDT0000 read-only ***
<4>[113367.225361] Turning device loop0 (0x700000) read-only
<4>[113367.303612] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
<4>[113367.345628] Lustre: DEBUG MARKER: local REPLAY BARRIER on lustre-MDT0000
<4>[113367.526705] Lustre: Unmounted lustre-client
<4>[113367.907894] Lustre: Failing over lustre-MDT0000
<1>[113368.092078] BUG: unable to handle kernel paging request at ffff8800b6966c68
<1>[113368.092813] IP: [<ffffffffa0936019>] osp_key_exit+0x9/0x20 [osp]
<4>[113368.093486] PGD 1a26063 PUD 501067 PMD 6b6067 PTE 80000000b6966060
<4>[113368.094166] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
<4>[113368.094774] last sysfs file: /sys/devices/system/cpu/possible
<4>[113368.095424] CPU 2
<4>[113368.095515] Modules linked in:
<3>[113368.096298] LustreError: 11-0: lustre-MDT0000-lwp-OST0001: Communicating with 0@lo, operation obd_ping failed with -107.
<4>[113368.096304] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
<4>[113368.096034] lustre ofd osp lod ost mdt osd_ldiskfs fsfilt_ldiskfs ldiskfs mdd
<3>[113368.116346] LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 0@lo (no target)
<4>[113368.096034] mgs lquota lfsck obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass lvfs ksocklnd lnet libcfs exportfs jbd sha512_generic sha256_generic ext4 mbcache jbd2 virtio_balloon i2c_piix4 i2c_core virtio_console virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
<4>[113368.096034]
<4>[113368.096034] Pid: 22149, comm: mdt00_002 Not tainted 2.6.32-rhe6.4-debug #2 Red Hat KVM
<4>[113368.096034] RIP: 0010:[<ffffffffa0936019>] [<ffffffffa0936019>] osp_key_exit+0x9/0x20 [osp]
<4>[113368.096034] RSP: 0018:ffff8800973dfe10 EFLAGS: 00010282
<4>[113368.096034] RAX: ffffffffa0936010 RBX: 00000000000000c8 RCX: 0000000000000000
<4>[113368.096034] RDX: ffff8800b6966bf0 RSI: ffffffffa095ac00 RDI: ffff8800b7121610
<4>[113368.096034] RBP: ffff8800973dfe10 R08: 0000000000000001 R09: 0000000000000000
<4>[113368.096034] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800b7121610
<4>[113368.096034] R13: ffff88008a49af30 R14: ffff88006bbafef0 R15: ffff8800b52aac20
<4>[113368.096034] FS: 0000000000000000(0000) GS:ffff880006280000(0000) knlGS:0000000000000000
<4>[113368.096034] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
<4>[113368.096034] CR2: ffff8800b6966c68 CR3: 0000000001a25000 CR4: 00000000000006e0
<4>[113368.096034] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[113368.096034] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>[113368.096034] Process mdt00_002 (pid: 22149, threadinfo ffff8800973de000, task ffff8800973dc380)
<4>[113368.096034] Stack:
<4>[113368.096034] ffff8800973dfe30 ffffffffa0ff4488 ffff8800b69677f0 ffff8800b52aabf0
<4>[113368.096034] <d> ffff8800973dfed0 ffffffffa1190899 ffff8800973dfe50 ffff880000000000
<4>[113368.096034] <d> ffff8800b52aac80 00000000973dffd8 ffff88006bbafef0 00000000973dc938
<4>[113368.096034] Call Trace:
<4>[113368.096034] [<ffffffffa0ff4488>] lu_context_exit+0x58/0xa0 [obdclass]
<4>[113368.096034] [<ffffffffa1190899>] ptlrpc_main+0x9d9/0x1650 [ptlrpc]
<4>[113368.096034] [<ffffffffa118fec0>] ? ptlrpc_main+0x0/0x1650 [ptlrpc]
<4>[113368.096034] [<ffffffff81094606>] kthread+0x96/0xa0
<4>[113368.096034] [<ffffffff8100c10a>] child_rip+0xa/0x20
<4>[113368.096034] [<ffffffff81094570>] ? kthread+0x0/0xa0
<4>[113368.096034] [<ffffffff8100c100>] ? child_rip+0x0/0x20
<4>[113368.096034] Code: <48> c7 42 78 00 00 00 00 c9 c3 66 66 66 66 2e 0f 1f 84 00 00 00 00
Crashdump and modules are in /exports/crashdumps/192.168.10.221-2013-07-24-10\:40\:42
Source branch in my tree: master-20130723
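A quick sanity check on the oops above, as a rough sketch rather than a diagnosis. The values are copied from the log; the helper name `decode_page_fault_error` and the constant names are made up for illustration. The faulting bytes `48 c7 42 78 00 00 00 00` decode as `movq $0x0,0x78(%rdx)`, so the store target should be RDX + 0x78, and the error code bits follow the standard x86 page-fault layout. Reading this as a write into a page that DEBUG_PAGEALLOC had already unmapped (possible use-after-free) is an assumption, not something the log proves.

```python
# Values copied from the oops in this ticket.
OOPS_ERROR_CODE = 0x0002           # "Oops: 0002"
RDX = 0xFFFF8800B6966BF0           # base register of the faulting store
CR2 = 0xFFFF8800B6966C68           # faulting address reported by the CPU
PTE = 0x80000000B6966060           # page table entry from the "PGD ... PTE" line
DISP8 = 0x78                       # displacement in "movq $0x0,0x78(%rdx)"


def decode_page_fault_error(code):
    """Decode the x86 page-fault error code bits (hypothetical helper)."""
    return {
        "protection_violation": bool(code & 0x01),  # 0 means page not present
        "write": bool(code & 0x02),
        "user_mode": bool(code & 0x04),
        "reserved_bit": bool(code & 0x08),
        "instruction_fetch": bool(code & 0x10),
    }


flags = decode_page_fault_error(OOPS_ERROR_CODE)
fault_addr = RDX + DISP8
pte_present = bool(PTE & 0x1)      # bit 0 of a PTE is the Present bit

print("flags:", flags)
print("computed store target: %#x" % fault_addr)
print("matches CR2:", fault_addr == CR2)
print("PTE present bit set:", pte_present)
```

If the computed target matches CR2 and the PTE present bit is clear, the fault is a kernel-mode write to an unmapped page from `osp_key_exit`, which is at least consistent with the per-thread key data having been freed before `lu_context_exit` ran.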