Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.5.0, Lustre 2.4.2
    • Lustre 2.4.0
    • None
    • 3
    • 9102

    Description

      One of our Lustre 2.4.0 based clients is hitting an assertion when we run robinhood.

      2013-07-09 14:35:15 vrelic8 login: LustreError: 5467:0:(mdc_request.c:1500:changelog_kuc_hdr()) ASSERTION( len <= cfs_size_round(2*255 + 1 + sizeof(struct changelog_rec)) ) failed:
      2013-07-09 14:41:07 LustreError: 5467:0:(mdc_request.c:1500:changelog_kuc_hdr()) LBUG
      

      Backtrace of the crashed thread:

      PID: 5467   TASK: ffff885c8668eaa0  CPU: 17  COMMAND: "mdc_clg_send_th"
       #0 [ffff885c866779e8] machine_kexec at ffffffff81035bfb
       #1 [ffff885c86677a48] crash_kexec at ffffffff810c0932
       #2 [ffff885c86677b18] panic at ffffffff8150d943
       #3 [ffff885c86677b98] lbug_with_loc at ffffffffa042bf4b [libcfs]
       #4 [ffff885c86677bb8] changelog_kuc_hdr at ffffffffa096bf3e [mdc]
       #5 [ffff885c86677bc8] changelog_kkuc_cb at ffffffffa096d28c [mdc]
       #6 [ffff885c86677bf8] llog_process_thread at ffffffffa05e110b [obdclass]
       #7 [ffff885c86677ca8] llog_process_or_fork at ffffffffa05e2e1d [obdclass]
       #8 [ffff885c86677cf8] llog_cat_process_cb at ffffffffa05e560a [obdclass]
       #9 [ffff885c86677d58] llog_process_thread at ffffffffa05e110b [obdclass]
      #10 [ffff885c86677e08] llog_process_or_fork at ffffffffa05e2e1d [obdclass]
      #11 [ffff885c86677e58] llog_cat_process_or_fork at ffffffffa05e3ef9 [obdclass]
      #12 [ffff885c86677ee8] llog_cat_process at ffffffffa05e41d9 [obdclass]
      #13 [ffff885c86677f08] mdc_changelog_send_thread at ffffffffa097157b [mdc]
      #14 [ffff885c86677f48] kernel_thread at ffffffff8100c0ca
      

      Digging into the crash dump, the values on the stack did not appear to contain either the "len" variable calculated by changelog_kkuc_cb(). But the problem was easily reproducible, so I added a debug message and tried again. This told me:

      2013-07-15 11:27:32 LustreError: 5472:0:(mdc_request.c:1504:changelog_kuc_hdr()) ASSERTION( len <= CR_MAXSIZE ) failed: CR_MAXSIZE=576, len=588, rec=ffff885ce8c9a148, changelog_rec_size=96
      2013-07-15 11:27:32 LustreError: 5472:0:(mdc_request.c:1504:changelog_kuc_hdr()) LBUG
      

      With a pointer to the llog_changelog_rec function, I could print it out in crash:

      crash> print *(struct llog_changelog_rec *)0xffff885ce8c9a148
      $4 = {
        cr_hdr = {
          lrh_len = 608,
          lrh_index = 29516,
          lrh_type = 275120128,
          padding = 0
        },
        cr = {
          cr_namelen = 484,
          cr_flags = 8192,
          cr_type = 8,
          cr_index = 80023354,
          cr_prev = 0,
          cr_time = 1474100508289501189,
          {
            cr_tfid = {
              f_seq = 0,
              f_oid = 0,
              f_ver = 0
            },
            cr_markerflags = 0
          },
          cr_pfid = {
            f_seq = 8589961859,
            f_oid = 28956,
            f_ver = 0
          },
          cr_name = 0xffff885ce8c9a198 "\263j"
        },
        cr_tail = {
          lrt_len = 27315,
          lrt_index = 2
        }
      }
      

      So here is what I understand:

      • cr_type tells us that this is a CL_RENAME
      • cr_flags tells us that cr is a struct changelog_ext_rec instead of the normal struct changelog_rec
      • cr_namelen is large at 484

      Note that cr_name looks corrupt, but is not. It is merely because crash does not know that we have an undeclared union here. If I manually add 32 bytes to the cr_namelen pointer, I am able to see a correct filename:

      crash> print (((struct llog_changelog_rec *)0xffff885ce8c9a148)->cr).cr_name+32

      I won't share the users's file name here, it is sufficient to say that it is 245 characters long, not including the terminating NULL character. If I then add 246 characters, I can see a second filename:

      crash> print (((struct llog_changelog_rec *)0xffff885ce8c9a148)->cr).cr_name+32+246

      This filename is nearly the same as the first one, but the trailing string of "-failed" was removed, resulting in a string 238 characters long, not including the terminating null character. Doing the math:

      245+238+1 = 484

      OK, that matches cr_len! Now I think I see why CR_MAXSIZE starts out with "2*LEN_MAX + 1". But I don't understand it. Why are we only accounting for one of the null characters??

      But the root of the bug appears to be that CR_MAXSIZE only accounts for the size of the smaller "sizeof(struct changelog_rec)", when it should be using "sizeof(struct changelog_ext_rec)". Those extra 32 bytes would have been enough to store this structure, and avoid the bug.

      I can change CR_MAXSIZE to do the right thing, and I will push a patch to do so soon. I have not yet fully checked to see if changing CR_MAXSIZE could break anything else.

      Attachments

        Issue Links

          Activity

            [LU-3587] CR_MAXSIZE is too small
            bogl Bob Glossman (Inactive) added a comment - backports for b2_4: http://review.whamcloud.com/8170 http://review.whamcloud.com/8171
            pjones Peter Jones added a comment -

            Landed for 2.5

            pjones Peter Jones added a comment - Landed for 2.5
            bogl Bob Glossman (Inactive) added a comment - added sanity test http://review.whamcloud.com/7751
            pjones Peter Jones added a comment -

            Bob

            Could you please look into creating a suitable test?

            Thanks

            Peter

            pjones Peter Jones added a comment - Bob Could you please look into creating a suitable test? Thanks Peter
            pjones Peter Jones added a comment -

            The patch has landed to master, but has a test been created yet?

            pjones Peter Jones added a comment - The patch has landed to master, but has a test been created yet?

            Also, as has already been mentioned, it would be very nice to have a new sanity test that tests a rename of a file name 255 characters long to another 255 character long file name, and then read the changelog that results. I won't have time to work on that test.

            morrone Christopher Morrone (Inactive) added a comment - Also, as has already been mentioned, it would be very nice to have a new sanity test that tests a rename of a file name 255 characters long to another 255 character long file name, and then read the changelog that results. I won't have time to work on that test.
            morrone Christopher Morrone (Inactive) added a comment - Second patch is: http://review.whamcloud.com/7406

            Note that there is an additional problem, beyond the incorrectly defined CR_MAXSIZE. The KUC code uses a size for a buffer that is basically sizeof(struct kuc_hdr) + CR_MAXSIZE, but the assertion does not take into account the struct kuc_hdr size, only CR_MAXSIZE.

            I have a second patch that I will push shortly.

            morrone Christopher Morrone (Inactive) added a comment - Note that there is an additional problem, beyond the incorrectly defined CR_MAXSIZE. The KUC code uses a size for a buffer that is basically sizeof(struct kuc_hdr) + CR_MAXSIZE, but the assertion does not take into account the struct kuc_hdr size, only CR_MAXSIZE. I have a second patch that I will push shortly.

            FYI, change 6993 did address the problem, and we are using it on nodes where we run robinhood.

            morrone Christopher Morrone (Inactive) added a comment - FYI, change 6993 did address the problem, and we are using it on nodes where we run robinhood.
            laisiyao Lai Siyao added a comment -

            CR_MAXSIZE starts out with "2*NAME_MAX +1" is because a null character is used as delimiter between two names, and we've already stored the total name length of the two names plus the delimiter in rec->cr_namelen, and we always access the second name with length.

            The fix looks correct, and it will be great if a sanity test is added for this.

            laisiyao Lai Siyao added a comment - CR_MAXSIZE starts out with "2*NAME_MAX +1" is because a null character is used as delimiter between two names, and we've already stored the total name length of the two names plus the delimiter in rec->cr_namelen, and we always access the second name with length. The fix looks correct, and it will be great if a sanity test is added for this.

            People

              bogl Bob Glossman (Inactive)
              morrone Christopher Morrone (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: