LU-3252: MDT crash in lu_object_put+0x1d8

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Major
    • None
    • Lustre 2.4.0, Lustre 2.8.0
    • None
    • 3
    • 8054

    Description

      Running racer on a recent master crashed the MDT after about 21 hours with:

      [76336.978485] BUG: unable to handle kernel paging request at ffff880079ed5ea8
      [76336.978811] IP: [<ffffffffa0dc22f8>] lu_object_put+0x1d8/0x330 [obdclass]
      [76336.979138] PGD 1a26063 PUD 300067 PMD 4d0067 PTE 8000000079ed5060
      [76336.979443] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      [76336.979704] last sysfs file: /sys/devices/system/cpu/possible
      [76336.979980] CPU 3 
      [76336.980018] Modules linked in: lustre ofd osp lod ost mdt osd_ldiskfs fsfilt_ldiskfs ldiskfs mdd mgs lquota obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass lvfs ksocklnd lnet libcfs exportfs jbd sha512_generic sha256_generic ext4 mbcache jbd2 virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
      [76336.982446] 
      [76336.982446] Pid: 5799, comm: mdt00_008 Not tainted 2.6.32-rhe6.4-debug #2 Bochs Bochs
      [76336.982446] RIP: 0010:[<ffffffffa0dc22f8>]  [<ffffffffa0dc22f8>] lu_object_put+0x1d8/0x330 [obdclass]
      [76336.982446] RSP: 0018:ffff880082f49a00  EFLAGS: 00010246
      [76336.982446] RAX: 0000000000000000 RBX: ffff880079ed5ea8 RCX: 0000000000000002
      [76336.982446] RDX: 0000000000000002 RSI: ffffc900015ca000 RDI: 0000000000000001
      [76336.982446] RBP: ffff880082f49a60 R08: 0000000000000400 R09: 0000000000000ffa
      [76336.982446] R10: 0000000000000693 R11: cc00000000000000 R12: ffff880010703668
      [76336.982446] R13: ffff880079ed5f00 R14: ffff8800b738c168 R15: ffff880082f49a20
      [76336.982446] FS:  00007fd3d883b700(0000) GS:ffff8800062c0000(0000) knlGS:0000000000000000
      [76336.982446] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [76336.982446] CR2: ffff880079ed5ea8 CR3: 000000008ca47000 CR4: 00000000000006e0
      [76336.982446] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [76336.982446] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [76336.982446] Process mdt00_008 (pid: 5799, threadinfo ffff880082f48000, task ffff880096508040)
      [76336.982446] Stack:
      [76336.982446]  ffffc90008672f78 ffff88004ce3af30 ffffc900015da028 ffffc900015ca000
      [76336.982446] <d> ffffc900015ca000 0000000000000967 ffff880082f49a60 ffff880079ed5ea8
      [76336.982446] <d> ffff880010703668 00000000fffffffe 0000000200010001 0000000000000000
      [76336.982446] Call Trace:
      [76336.982446]  [<ffffffffa070df4d>] mdt_object_unlock_put+0x3d/0x110 [mdt]
      [76336.982446]  [<ffffffffa074019f>] mdt_reint_open+0x95f/0x20c0 [mdt]
      [76336.982446]  [<ffffffffa0cb9b3f>] ? upcall_cache_get_entry+0x3bf/0x870 [libcfs]
      [76336.982446]  [<ffffffffa115c78c>] ? lustre_msg_add_version+0x6c/0xc0 [ptlrpc]
      [76336.982446]  [<ffffffffa0de21f0>] ? lu_ucred+0x20/0x30 [obdclass]
      [76336.982446]  [<ffffffffa072b621>] mdt_reint_rec+0x41/0xe0 [mdt]
      [76336.982446]  [<ffffffffa0724ae3>] mdt_reint_internal+0x4e3/0x7d0 [mdt]
      [76336.982446]  [<ffffffffa072509d>] mdt_intent_reint+0x1ed/0x520 [mdt]
      [76336.982446]  [<ffffffffa0720c6e>] mdt_intent_policy+0x3ae/0x750 [mdt]
      [76336.982446]  [<ffffffffa111470a>] ldlm_lock_enqueue+0x2ea/0x870 [ptlrpc]
      [76336.982446]  [<ffffffffa113ae67>] ldlm_handle_enqueue0+0x4f7/0x10b0 [ptlrpc]
      [76336.982446]  [<ffffffffa0721146>] mdt_enqueue+0x46/0x110 [mdt]
      [76336.982446]  [<ffffffffa0712d18>] mdt_handle_common+0x648/0x1660 [mdt]
      [76336.982446]  [<ffffffffa074ede5>] mds_regular_handle+0x15/0x20 [mdt]
      [76336.982446]  [<ffffffffa116c898>] ptlrpc_server_handle_request+0x3a8/0xc70 [ptlrpc]
      [76336.982446]  [<ffffffffa0c9d5ee>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      [76336.982446]  [<ffffffffa0caee9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
      [76336.982446]  [<ffffffffa1163fe1>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc]
      [76336.982446]  [<ffffffff81054613>] ? __wake_up+0x53/0x70
      [76336.982446]  [<ffffffffa116db95>] ptlrpc_main+0xa35/0x1640 [ptlrpc]
      [76336.982446]  [<ffffffffa116d160>] ? ptlrpc_main+0x0/0x1640 [ptlrpc]
      [76336.982446]  [<ffffffff8100c10a>] child_rip+0xa/0x20
      [76336.982446]  [<ffffffffa116d160>] ? ptlrpc_main+0x0/0x1640 [ptlrpc]
      [76336.982446]  [<ffffffffa116d160>] ? ptlrpc_main+0x0/0x1640 [ptlrpc]
      [76336.982446]  [<ffffffff8100c100>] ? child_rip+0x0/0x20
      [76336.982446] Code: b0 48 8b 70 10 48 83 c2 08 e8 75 56 4c e0 49 8b 06 be 01 00 00 00 48 8b 7d c0 48 8b 40 20 ff 50 18 e9 da fe ff ff 0f 1f 44 00 00 <f6> 03 01 0f 84 cc fe ff ff 48 8b 7d b0 48 83 c7 18 e8 22 b4 ed 
      [76336.982446] RIP  [<ffffffffa0dc22f8>] lu_object_put+0x1d8/0x330 [obdclass]
      [76336.982446]  RSP <ffff880082f49a00>
      [76336.982446] CR2: ffff880079ed5ea8
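
      The register and page-table values in the Oops above can be decoded mechanically. The following is a minimal sketch, not part of the original report: the helper names `decode_pte` and `decode_fault_error` are made up for illustration, the bit meanings are the standard x86-64 ones, and `PAGE_OFFSET` is an assumption about the direct-map base of this 2.6.32 x86-64 kernel. It shows that the fault was a kernel-mode read of a not-present page, that the faulting address (CR2) is the pointer held in RBX, and that the PTE still records the physical frame while its present bit is clear.

```python
# Sketch: decode the page-fault details quoted in the Oops.
# Helper names are hypothetical; values are copied from the log.

# Standard x86-64 page-table-entry flag bits.
PTE_FLAGS = {
    0: "PRESENT",
    1: "RW",
    2: "USER",
    5: "ACCESSED",
    6: "DIRTY",
    63: "NX",
}

# Assumption: direct-map base for this 2.6.32 x86-64 kernel.
PAGE_OFFSET = 0xFFFF880000000000
PAGE_MASK = ~0xFFF


def decode_pte(pte):
    """Return the names of the flag bits that are set in a PTE value."""
    return [name for bit, name in PTE_FLAGS.items() if pte & (1 << bit)]


def decode_fault_error(code):
    """Decode the low three bits of the x86 page-fault error code."""
    return {
        "page_was_present": bool(code & 0x1),
        "write_access": bool(code & 0x2),
        "user_mode": bool(code & 0x4),
    }


pte = 0x8000000079ED5060         # "PTE 8000000079ed5060"
cr2 = 0xFFFF880079ED5EA8         # faulting address
rbx = 0xFFFF880079ED5EA8         # RBX at the time of the fault
error_code = 0x0000              # "Oops: 0000"

pte_flags = decode_pte(pte)
fault = decode_fault_error(error_code)

# Physical frame stored in the PTE (bits 12..51) versus the frame
# implied by the faulting direct-map address.
pte_frame = pte & 0x000FFFFFFFFFF000
direct_map_frame = (cr2 - PAGE_OFFSET) & PAGE_MASK

print("PTE flags:", pte_flags)
print("fault:", fault)
print("CR2 == RBX:", cr2 == rbx)
print("frames match:", pte_frame == direct_map_frame)
```

      The faulting opcode bytes marked in the Code line (`f6 03 01`) appear to decode as `testb $0x1,(%rbx)`, a one-byte flag test through RBX. Taken together with `DEBUG_PAGEALLOC` in the Oops header, a not-present PTE for an otherwise valid direct-map address is consistent with a read from memory that had already been freed, although this sketch alone does not establish where the object was released.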
      

      Crashdump and modules are in /exports/crashdumps/192.168.10.220-2013-04-30-16:12:30

      lu_object_put+0x1d8 is lustre/obdclass/lu_object.c:107

      107	                if (lu_object_is_dying(top)) {
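
      The `symbol+offset/size [module]` notation used in the trace and in the line above can be split into its parts before it is handed to a debugger. This is a minimal sketch under stated assumptions: `parse_frame` and `FRAME_RE` are hypothetical names introduced here, and the pattern only targets the frame format seen in this Oops.

```python
import re

# Matches frames such as:
#   [<ffffffffa0dc22f8>] lu_object_put+0x1d8/0x330 [obdclass]
#   [<ffffffff81054613>] ? __wake_up+0x53/0x70
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frame(line):
    """Split one Oops frame into address, symbol, offset, size and module.

    Returns None when the line does not contain a frame.
    """
    m = FRAME_RE.search(line)
    if m is None:
        return None
    address = int(m.group("addr"), 16)
    offset = int(m.group("offset"), 16)
    return {
        "address": address,
        "symbol": m.group("symbol"),
        "offset": offset,
        "size": int(m.group("size"), 16),
        "module": m.group("module"),       # None for core-kernel symbols
        "reliable": m.group("guess") is None,  # '?' marks an unverified frame
        "function_start": address - offset,
    }


ip = parse_frame("IP: [<ffffffffa0dc22f8>] lu_object_put+0x1d8/0x330 [obdclass]")
guess = parse_frame(" [<ffffffffa0cb9b3f>] ? upcall_cache_get_entry+0x3bf/0x870 [libcfs]")
core = parse_frame(" [<ffffffff81054613>] ? __wake_up+0x53/0x70")

print(ip["symbol"], hex(ip["offset"]), ip["module"], hex(ip["function_start"]))
```

      With a module object built with debug info loaded into gdb, `list *(lu_object_put+0x1d8)` is one way to map the parsed symbol and offset back to a source line such as the one quoted above.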
      

      Tag in my tree is master-20130430

          People

            bfaccini Bruno Faccini (Inactive)
            green Oleg Drokin
