Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-3270

ptlrpcd strnlen crash trying to log a message

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.7.0
    • None
    • None
    • 3
    • 8107

    Description

      Just hit this (use after free, I think) after a few days of recovery-small:

      [307783.003678] Lustre: DEBUG MARKER: == recovery-small test complete, duration 1814 sec == 08:30:02 (1367584202)
      [307783.153758] BUG: unable to handle kernel paging request at ffff880097114fa8
      [307783.154227] IP: [<ffffffff8127cc39>] strnlen+0x9/0x40
      [307783.154604] PGD 1a26063 PUD 501067 PMD 5ba067 PTE 8000000097114163
      [307783.155028] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      [307783.155408] last sysfs file: /sys/devices/system/cpu/possible
      [307783.155791] CPU 0 
      [307783.155847] Modules linked in: lustre ofd osp lod ost mdt osd_ldiskfs fsfilt_ldiskfs ldiskfs mdd mgs lquota obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass lvfs ksocklnd lnet libcfs exportfs jbd sha512_generic sha256_generic ext4 mbcache jbd2 virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
      [307783.156905] 
      [307783.156905] Pid: 17616, comm: ptlrpcd_3 Not tainted 2.6.32-rhe6.4-debug #2 Bochs Bochs
      [307783.156905] RIP: 0010:[<ffffffff8127cc39>]  [<ffffffff8127cc39>] strnlen+0x9/0x40
      [307783.156905] RSP: 0018:ffff8800af8e9980  EFLAGS: 00010206
      [307783.156905] RAX: ffffffff817a6cc1 RBX: ffff8800b00e1000 RCX: 0000000000000001
      [307783.156905] RDX: ffff880097114fa8 RSI: 0000000000000017 RDI: ffff880097114fa8
      [307783.156905] RBP: ffff8800af8e9980 R08: 000000000000002e R09: 0000000000000053
      [307783.156905] R10: 0000000000000001 R11: 0000000000000104 R12: ffff8800b00e0a4b
      [307783.156905] R13: ffff880097114fa8 R14: 00000000ffffffff R15: 0000000000000000
      [307783.156905] FS:  00007fbed06dc700(0000) GS:ffff880006200000(0000) knlGS:0000000000000000
      [307783.156905] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [307783.156905] CR2: ffff880097114fa8 CR3: 0000000001a25000 CR4: 00000000000006f0
      [307783.156905] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [307783.156905] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [307783.156905] Process ptlrpcd_3 (pid: 17616, threadinfo ffff8800af8e8000, task ffff8800b08aa5c0)
      [307783.156905] Stack:
      [307783.156905]  ffff8800af8e99c0 ffffffff8127def0 ffff880099facdf0 ffff8800b00e0a4b
      [307783.156905] <d> ffffffffa13328e0 ffffffffa13328df ffff8800af8e9b40 ffff8800b00e1000
      [307783.156905] <d> ffff8800af8e9a60 ffffffff8127f338 0000000000000004 0000000affffffff
      [307783.156905] Call Trace:
      [307783.156905]  [<ffffffff8127def0>] string+0x40/0x100
      [307783.156905]  [<ffffffff8127f338>] vsnprintf+0x218/0x5e0
      [307783.156905]  [<ffffffffa0a9127b>] ? cfs_set_ptldebug_header+0x2b/0xc0 [libcfs]
      [307783.156905]  [<ffffffffa0aa1b6b>] libcfs_debug_vmsg2+0x2fb/0xab0 [libcfs]
      [307783.156905]  [<ffffffffa0aa2361>] libcfs_debug_msg+0x41/0x50 [libcfs]
      [307783.156905]  [<ffffffffa0a9cde8>] ? libcfs_log_return+0x28/0x40 [libcfs]
      [307783.156905]  [<ffffffffa132c2e5>] mdc_finish_intent_lock+0x535/0x7f0 [mdc]
      [307783.156905]  [<ffffffffa0a9cde8>] ? libcfs_log_return+0x28/0x40 [libcfs]
      [307783.156905]  [<ffffffffa132d388>] ? mdc_finish_enqueue+0x208/0x1180 [mdc]
      [307783.156905]  [<ffffffffa132e626>] mdc_intent_getattr_async_interpret+0x326/0x4c0 [mdc]
      [307783.156905]  [<ffffffffa0ea46b2>] ptlrpc_check_set+0x2b2/0x1da0 [ptlrpc]
      [307783.156905]  [<ffffffffa0ed1e1b>] ptlrpcd_check+0x55b/0x590 [ptlrpc]
      [307783.156905]  [<ffffffffa0ed2363>] ptlrpcd+0x233/0x390 [ptlrpc]
      [307783.156905]  [<ffffffff8105ad10>] ? default_wake_function+0x0/0x20
      [307783.156905]  [<ffffffffa0ed2130>] ? ptlrpcd+0x0/0x390 [ptlrpc]
      [307783.156905]  [<ffffffff8100c10a>] child_rip+0xa/0x20
      [307783.156905]  [<ffffffffa0ed2130>] ? ptlrpcd+0x0/0x390 [ptlrpc]
      [307783.156905]  [<ffffffffa0ed2130>] ? ptlrpcd+0x0/0x390 [ptlrpc]
      [307783.156905]  [<ffffffff8100c100>] ? child_rip+0x0/0x20
      [307783.156905] Code: 66 90 48 83 c2 01 80 3a 00 75 f7 48 89 d0 48 29 f8 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 85 f6 48 89 e5 74 2e <80> 3f 00 74 29 48 83 ee 01 48 89 f8 eb 12 66 0f 1f 84 00 00 00 
      

      Crashdump and modules are in /exports/crashdumps/192.168.10.217-2013-05-03-08\:30\:08/
      source tag is 20130430

      Attachments

        Issue Links

          Activity

            [LU-3270] ptlrpcd strnlen crash trying to log a message

            Lai Siyao (lai.siyao@intel.com) uploaded a new patch: http://review.whamcloud.com/12901
            Subject: LU-3270 statahead: refactor statahead code
            Project: fs/lustre-release
            Branch: b2_5
            Current Patch Set: 1
            Commit: 2cccaaee80e60ca7cf1fcd701fe7705198099164

            gerrit Gerrit Updater added a comment - Lai Siyao (lai.siyao@intel.com) uploaded a new patch: http://review.whamcloud.com/12901 Subject: LU-3270 statahead: refactor statahead code Project: fs/lustre-release Branch: b2_5 Current Patch Set: 1 Commit: 2cccaaee80e60ca7cf1fcd701fe7705198099164

            Patches landed to Master.

            jlevi Jodi Levi (Inactive) added a comment - Patches landed to Master.

            Last patch has been merged. This ticket can be closed.

            simmonsja James A Simmons added a comment - Last patch has been merged. This ticket can be closed.

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/9667/
            Subject: LU-3270 statahead: small fixes and cleanup
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 672ab0e00d6148330561d21ccff9331124914d35

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/9667/ Subject: LU-3270 statahead: small fixes and cleanup Project: fs/lustre-release Branch: master Current Patch Set: Commit: 672ab0e00d6148330561d21ccff9331124914d35

            Many patches have been landed. Only one patch left at http://review.whamcloud.com/#/c/9667. Please inspect.

            simmonsja James A Simmons added a comment - Many patches have been landed. Only one patch left at http://review.whamcloud.com/#/c/9667 . Please inspect.

            Refreshed http://review.whamcloud.com/#/c/9663. Patch is ready for inspection.

            simmonsja James A Simmons added a comment - Refreshed http://review.whamcloud.com/#/c/9663 . Patch is ready for inspection.

            Has this work been delayed until 2.8?

            simmonsja James A Simmons added a comment - Has this work been delayed until 2.8?
            laisiyao Lai Siyao added a comment -

            Thanks Ryan, and as is pointed out in LU-4985, this has been fixed by LU-3498, which has been landed to master, and waiting to land on 2.5.

            laisiyao Lai Siyao added a comment - Thanks Ryan, and as is pointed out in LU-4985 , this has been fixed by LU-3498 , which has been landed to master, and waiting to land on 2.5.
            haasken Ryan Haasken added a comment -

            Lai, I have created LU-4985 for the bug about incorrect handling of the kthread_run return value in start_statahead_thread. That bug is also for other calls to kthread_run which are handled incorrectly.

            haasken Ryan Haasken added a comment - Lai, I have created LU-4985 for the bug about incorrect handling of the kthread_run return value in start_statahead_thread. That bug is also for other calls to kthread_run which are handled incorrectly.
            laisiyao Lai Siyao added a comment -

            Thank you, I'll fix it in related patch soon.

            laisiyao Lai Siyao added a comment - Thank you, I'll fix it in related patch soon.

            People

              laisiyao Lai Siyao
              green Oleg Drokin
              Votes:
              1 Vote for this issue
              Watchers:
              18 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: