Details
-
Bug
-
Resolution: Cannot Reproduce
-
Major
-
None
-
Lustre 2.4.0, Lustre 2.8.0
-
None
-
3
-
8054
Description
While running racer on a recent master build, the node crashed after about 21 hours with:
[76336.978485] BUG: unable to handle kernel paging request at ffff880079ed5ea8
[76336.978811] IP: [<ffffffffa0dc22f8>] lu_object_put+0x1d8/0x330 [obdclass]
[76336.979138] PGD 1a26063 PUD 300067 PMD 4d0067 PTE 8000000079ed5060
[76336.979443] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[76336.979704] last sysfs file: /sys/devices/system/cpu/possible
[76336.979980] CPU 3
[76336.980018] Modules linked in: lustre ofd osp lod ost mdt osd_ldiskfs fsfilt_ldiskfs ldiskfs mdd mgs lquota obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass lvfs ksocklnd lnet libcfs exportfs jbd sha512_generic sha256_generic ext4 mbcache jbd2 virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
[76336.982446]
[76336.982446] Pid: 5799, comm: mdt00_008 Not tainted 2.6.32-rhe6.4-debug #2 Bochs Bochs
[76336.982446] RIP: 0010:[<ffffffffa0dc22f8>] [<ffffffffa0dc22f8>] lu_object_put+0x1d8/0x330 [obdclass]
[76336.982446] RSP: 0018:ffff880082f49a00 EFLAGS: 00010246
[76336.982446] RAX: 0000000000000000 RBX: ffff880079ed5ea8 RCX: 0000000000000002
[76336.982446] RDX: 0000000000000002 RSI: ffffc900015ca000 RDI: 0000000000000001
[76336.982446] RBP: ffff880082f49a60 R08: 0000000000000400 R09: 0000000000000ffa
[76336.982446] R10: 0000000000000693 R11: cc00000000000000 R12: ffff880010703668
[76336.982446] R13: ffff880079ed5f00 R14: ffff8800b738c168 R15: ffff880082f49a20
[76336.982446] FS: 00007fd3d883b700(0000) GS:ffff8800062c0000(0000) knlGS:0000000000000000
[76336.982446] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[76336.982446] CR2: ffff880079ed5ea8 CR3: 000000008ca47000 CR4: 00000000000006e0
[76336.982446] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[76336.982446] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[76336.982446] Process mdt00_008 (pid: 5799, threadinfo ffff880082f48000, task ffff880096508040)
[76336.982446] Stack:
[76336.982446] ffffc90008672f78 ffff88004ce3af30 ffffc900015da028 ffffc900015ca000
[76336.982446] <d> ffffc900015ca000 0000000000000967 ffff880082f49a60 ffff880079ed5ea8
[76336.982446] <d> ffff880010703668 00000000fffffffe 0000000200010001 0000000000000000
[76336.982446] Call Trace:
[76336.982446] [<ffffffffa070df4d>] mdt_object_unlock_put+0x3d/0x110 [mdt]
[76336.982446] [<ffffffffa074019f>] mdt_reint_open+0x95f/0x20c0 [mdt]
[76336.982446] [<ffffffffa0cb9b3f>] ? upcall_cache_get_entry+0x3bf/0x870 [libcfs]
[76336.982446] [<ffffffffa115c78c>] ? lustre_msg_add_version+0x6c/0xc0 [ptlrpc]
[76336.982446] [<ffffffffa0de21f0>] ? lu_ucred+0x20/0x30 [obdclass]
[76336.982446] [<ffffffffa072b621>] mdt_reint_rec+0x41/0xe0 [mdt]
[76336.982446] [<ffffffffa0724ae3>] mdt_reint_internal+0x4e3/0x7d0 [mdt]
[76336.982446] [<ffffffffa072509d>] mdt_intent_reint+0x1ed/0x520 [mdt]
[76336.982446] [<ffffffffa0720c6e>] mdt_intent_policy+0x3ae/0x750 [mdt]
[76336.982446] [<ffffffffa111470a>] ldlm_lock_enqueue+0x2ea/0x870 [ptlrpc]
[76336.982446] [<ffffffffa113ae67>] ldlm_handle_enqueue0+0x4f7/0x10b0 [ptlrpc]
[76336.982446] [<ffffffffa0721146>] mdt_enqueue+0x46/0x110 [mdt]
[76336.982446] [<ffffffffa0712d18>] mdt_handle_common+0x648/0x1660 [mdt]
[76336.982446] [<ffffffffa074ede5>] mds_regular_handle+0x15/0x20 [mdt]
[76336.982446] [<ffffffffa116c898>] ptlrpc_server_handle_request+0x3a8/0xc70 [ptlrpc]
[76336.982446] [<ffffffffa0c9d5ee>] ? cfs_timer_arm+0xe/0x10 [libcfs]
[76336.982446] [<ffffffffa0caee9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
[76336.982446] [<ffffffffa1163fe1>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc]
[76336.982446] [<ffffffff81054613>] ? __wake_up+0x53/0x70
[76336.982446] [<ffffffffa116db95>] ptlrpc_main+0xa35/0x1640 [ptlrpc]
[76336.982446] [<ffffffffa116d160>] ? ptlrpc_main+0x0/0x1640 [ptlrpc]
[76336.982446] [<ffffffff8100c10a>] child_rip+0xa/0x20
[76336.982446] [<ffffffffa116d160>] ? ptlrpc_main+0x0/0x1640 [ptlrpc]
[76336.982446] [<ffffffffa116d160>] ? ptlrpc_main+0x0/0x1640 [ptlrpc]
[76336.982446] [<ffffffff8100c100>] ? child_rip+0x0/0x20
[76336.982446] Code: b0 48 8b 70 10 48 83 c2 08 e8 75 56 4c e0 49 8b 06 be 01 00 00 00 48 8b 7d c0 48 8b 40 20 ff 50 18 e9 da fe ff ff 0f 1f 44 00 00 <f6> 03 01 0f 84 cc fe ff ff 48 8b 7d b0 48 83 c7 18 e8 22 b4 ed
[76336.982446] RIP [<ffffffffa0dc22f8>] lu_object_put+0x1d8/0x330 [obdclass]
[76336.982446] RSP <ffff880082f49a00>
[76336.982446] CR2: ffff880079ed5ea8
Crashdump and modules are in /exports/crashdumps/192.168.10.220-2013-04-30-16:12:30
lu_object_put+0x1d8 is lustre/obdclass/lu_object.c:107
107 if (lu_object_is_dying(top)) {
Tag in my tree is master-20130430
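For anyone repeating the lookup from lu_object_put+0x1d8 to a source line, here is a minimal sketch of pulling the symbol, offset, and module out of the RIP line and turning them into a gdb "list" command. The helper names (parse_frame, gdb_list_command) are made up for illustration, and the sketch assumes gdb is started on an obdclass module built with debug info from the same tree; it is not necessarily how the line number above was obtained.

```python
import re

# Matches a frame such as:
#   RIP [<ffffffffa0dc22f8>] lu_object_put+0x1d8/0x330 [obdclass]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_frame(line):
    """Return address, symbol, offset, size and module for one oops frame."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "addr": int(m.group("addr"), 16),
        "symbol": m.group("sym"),
        "offset": int(m.group("off"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("mod"),  # None for core kernel symbols
    }


def gdb_list_command(frame):
    """Build the gdb command that shows the source around symbol+offset."""
    return "list *({symbol}+0x{offset:x})".format(**frame)


frame = parse_frame(
    "RIP [<ffffffffa0dc22f8>] lu_object_put+0x1d8/0x330 [obdclass]"
)
print(frame["module"], gdb_list_command(frame))
```

The printed command can then be pasted into a gdb session opened on the module named in the first field.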