    Description

      OSS crashed with Protection Fault at lqe64_hash_keycmp

      See attached files.
      It looks like we are crashing here:

      libcfs/libcfs/hash.c
      static cfs_hlist_node_t *
      cfs_hash_bd_lookup_intent(cfs_hash_t *hs, cfs_hash_bd_t *bd,
      .
      .
      .
              cfs_hlist_for_each(ehnode, hhead) {    <<<<<<<< CRASH
                      if (!cfs_hash_keycmp(hs, key, ehnode))
                              continue;

      cfs_hlist_for_each is defined as hlist_for_each, but hlist_for_each doesn't seem to be defined anywhere.
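
      For reference, hlist_for_each is a macro from the Linux kernel's include/linux/list.h, not from the Lustre tree. Below is a minimal userspace sketch of the kind of bucket walk shown above. It assumes simplified stand-in types: the kernel's hlist_node also carries a pprev pointer, and struct entry, entry_of, bucket_add and bucket_lookup are hypothetical names for illustration, not libcfs code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel hlist types (assumption: the real
 * hlist_node also has a pprev back-pointer, omitted here). */
struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

/* Same shape as the kernel macro, without the prefetch hint that older
 * kernels add inside the loop condition. */
#define hlist_for_each(pos, head) \
	for ((pos) = (head)->first; (pos) != NULL; (pos) = (pos)->next)

/* Hypothetical hashed object: a 64-bit key with the list node embedded. */
struct entry {
	uint64_t key;
	struct hlist_node node;
};

/* Recover the enclosing entry from its embedded list node. */
static struct entry *entry_of(struct hlist_node *n)
{
	return (struct entry *)((char *)n - offsetof(struct entry, node));
}

/* Link an entry at the head of a bucket chain. */
static void bucket_add(struct hlist_head *head, struct entry *e)
{
	e->node.next = head->first;
	head->first = &e->node;
}

/* Walk the bucket chain and compare keys, skipping non-matching nodes. */
static struct entry *bucket_lookup(struct hlist_head *head, uint64_t key)
{
	struct hlist_node *pos;

	assert(head != NULL);
	hlist_for_each(pos, head) {
		/* Every node on the chain is dereferenced here. */
		if (entry_of(pos)->key != key)
			continue;
		return entry_of(pos);
	}
	return NULL;
}
```

      Because the walk dereferences every node on the chain, an entry that was freed while still linked into the bucket could fault inside the key comparison. That would be consistent with the RIP in lqe64_hash_keycmp, but it is an inference, not something the backtrace confirms.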

      See attached backtrace info.

      Attachments

        1. debug.tgz
          344 kB
        2. service188.nov13.2013.tgz
          11 kB
        3. syslog.tgz
          54 kB

          Activity

            [LU-4249] exception RIP: lqe64_hash_keycmp+12
            pjones Peter Jones added a comment - Landed for 2.7
            niu Niu Yawei (Inactive) added a comment (edited) - b2_4: http://review.whamcloud.com/11019, b2_5: http://review.whamcloud.com/11020
            niu Niu Yawei (Inactive) added a comment - http://review.whamcloud.com/10988

            niu Niu Yawei (Inactive) added a comment - Hi Ihara, the ENOLCK error message problem is addressed in another ticket; please see LU-4920. This debug patch collects information on the lqe refcount when the system crashes on "exception RIP: ...". Please keep the debug patch applied until the crash is reproduced. Thanks.

            ihara Shuichi Ihara (Inactive) added a comment - Niu, we applied the patches, and the same error messages are still showing up. Attached are recent debug logs and syslog messages taken after applying the patches.

            niu Niu Yawei (Inactive) added a comment - In reply to "Hi Niu, so, finally, exactly which patches should we apply against b2_4 if we want the debug patches enabled?": the fix for LU-3460 (http://review.whamcloud.com/#/c/8169/) and the debug patch (http://review.whamcloud.com/9833).

            ihara Shuichi Ihara (Inactive) added a comment - Hi Niu, so, finally, exactly which patches should we apply against b2_4 if we want the debug patches enabled?

            javed javed shaikh (Inactive) added a comment - thanks niu. deleted from here.

            niu Niu Yawei (Inactive) added a comment - Hi Javed, it looks like your post isn't related to this bug. Could you open another ticket to track it? Thanks.

            jaylan Jay Lan (Inactive) added a comment - Hi Niu, I cherry-picked LU-3460 and then #9833. It applies cleanly now. Thanks!

            niu Niu Yawei (Inactive) added a comment - Sorry, I mistakenly made the patch against master. Here is the patch for b2_4: http://review.whamcloud.com/9833. This new patch doesn't address the conflict in lustre/quota/qsd_writeback.c, because a quota fix was introduced in 2.4.2 (LU-3460, http://review.whamcloud.com/#/c/8169/). I suggest you apply that fix first.

            People

              niu Niu Yawei (Inactive)
              mhanafi Mahmoud Hanafi
