[LU-3587] CR_MAXSIZE is too small Created: 16/Jul/13 Updated: 17/Nov/13 Resolved: 08/Oct/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.5.0, Lustre 2.4.2 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Christopher Morrone | Assignee: | Bob Glossman (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 9102 |
| Description |
|
One of our Lustre 2.4.0 based clients is hitting an assertion when we run robinhood. 2013-07-09 14:35:15 vrelic8 login: LustreError: 5467:0:(mdc_request.c:1500:changelog_kuc_hdr()) ASSERTION( len <= cfs_size_round(2*255 + 1 + sizeof(struct changelog_rec)) ) failed: 2013-07-09 14:41:07 LustreError: 5467:0:(mdc_request.c:1500:changelog_kuc_hdr()) LBUG Backtrace of the crashed thread: PID: 5467 TASK: ffff885c8668eaa0 CPU: 17 COMMAND: "mdc_clg_send_th" #0 [ffff885c866779e8] machine_kexec at ffffffff81035bfb #1 [ffff885c86677a48] crash_kexec at ffffffff810c0932 #2 [ffff885c86677b18] panic at ffffffff8150d943 #3 [ffff885c86677b98] lbug_with_loc at ffffffffa042bf4b [libcfs] #4 [ffff885c86677bb8] changelog_kuc_hdr at ffffffffa096bf3e [mdc] #5 [ffff885c86677bc8] changelog_kkuc_cb at ffffffffa096d28c [mdc] #6 [ffff885c86677bf8] llog_process_thread at ffffffffa05e110b [obdclass] #7 [ffff885c86677ca8] llog_process_or_fork at ffffffffa05e2e1d [obdclass] #8 [ffff885c86677cf8] llog_cat_process_cb at ffffffffa05e560a [obdclass] #9 [ffff885c86677d58] llog_process_thread at ffffffffa05e110b [obdclass] #10 [ffff885c86677e08] llog_process_or_fork at ffffffffa05e2e1d [obdclass] #11 [ffff885c86677e58] llog_cat_process_or_fork at ffffffffa05e3ef9 [obdclass] #12 [ffff885c86677ee8] llog_cat_process at ffffffffa05e41d9 [obdclass] #13 [ffff885c86677f08] mdc_changelog_send_thread at ffffffffa097157b [mdc] #14 [ffff885c86677f48] kernel_thread at ffffffff8100c0ca Digging into the crash dump, the values on the stack did not appear to contain either the "len" variable calculated by changelog_kkuc_cb(). But the problem was easily reproducible, so I added a debug message and tried again. This told me: 2013-07-15 11:27:32 LustreError: 5472:0:(mdc_request.c:1504:changelog_kuc_hdr()) ASSERTION( len <= CR_MAXSIZE ) failed: CR_MAXSIZE=576, len=588, rec=ffff885ce8c9a148, changelog_rec_size=96 2013-07-15 11:27:32 LustreError: 5472:0:(mdc_request.c:1504:changelog_kuc_hdr()) LBUG With a pointer to the llog_changelog_rec function, I could print it out in crash: crash> print *(struct llog_changelog_rec *)0xffff885ce8c9a148
$4 = {
cr_hdr = {
lrh_len = 608,
lrh_index = 29516,
lrh_type = 275120128,
padding = 0
},
cr = {
cr_namelen = 484,
cr_flags = 8192,
cr_type = 8,
cr_index = 80023354,
cr_prev = 0,
cr_time = 1474100508289501189,
{
cr_tfid = {
f_seq = 0,
f_oid = 0,
f_ver = 0
},
cr_markerflags = 0
},
cr_pfid = {
f_seq = 8589961859,
f_oid = 28956,
f_ver = 0
},
cr_name = 0xffff885ce8c9a198 "\263j"
},
cr_tail = {
lrt_len = 27315,
lrt_index = 2
}
}
So here is what I understand:
Note that cr_name looks corrupt, but is not. It is merely because crash does not know that we have an undeclared union here. If I manually add 32 bytes to the cr_namelen pointer, I am able to see a correct filename: crash> print (((struct llog_changelog_rec *)0xffff885ce8c9a148)->cr).cr_name+32 I won't share the users's file name here, it is sufficient to say that it is 245 characters long, not including the terminating NULL character. If I then add 246 characters, I can see a second filename: crash> print (((struct llog_changelog_rec *)0xffff885ce8c9a148)->cr).cr_name+32+246 This filename is nearly the same as the first one, but the trailing string of "-failed" was removed, resulting in a string 238 characters long, not including the terminating null character. Doing the math: 245+238+1 = 484 OK, that matches cr_len! Now I think I see why CR_MAXSIZE starts out with "2*LEN_MAX + 1". But I don't understand it. Why are we only accounting for one of the null characters?? But the root of the bug appears to be that CR_MAXSIZE only accounts for the size of the smaller "sizeof(struct changelog_rec)", when it should be using "sizeof(struct changelog_ext_rec)". Those extra 32 bytes would have been enough to store this structure, and avoid the bug. I can change CR_MAXSIZE to do the right thing, and I will push a patch to do so soon. I have not yet fully checked to see if changing CR_MAXSIZE could break anything else. |
| Comments |
| Comment by Christopher Morrone [ 16/Jul/13 ] |
|
Pushed a possible solution in change http://review.whamcloud.com/6993. I will test it some time tomorrow. |
| Comment by Lai Siyao [ 16/Jul/13 ] |
|
CR_MAXSIZE starts out with "2*NAME_MAX +1" is because a null character is used as delimiter between two names, and we've already stored the total name length of the two names plus the delimiter in rec->cr_namelen, and we always access the second name with length. The fix looks correct, and it will be great if a sanity test is added for this. |
| Comment by Christopher Morrone [ 22/Jul/13 ] |
|
FYI, change 6993 did address the problem, and we are using it on nodes where we run robinhood. |
| Comment by Christopher Morrone [ 20/Aug/13 ] |
|
Note that there is an additional problem, beyond the incorrectly defined CR_MAXSIZE. The KUC code uses a size for a buffer that is basically sizeof(struct kuc_hdr) + CR_MAXSIZE, but the assertion does not take into account the struct kuc_hdr size, only CR_MAXSIZE. I have a second patch that I will push shortly. |
| Comment by Christopher Morrone [ 20/Aug/13 ] |
|
Second patch is: |
| Comment by Christopher Morrone [ 20/Aug/13 ] |
|
Also, as has already been mentioned, it would be very nice to have a new sanity test that tests a rename of a file name 255 characters long to another 255 character long file name, and then read the changelog that results. I won't have time to work on that test. |
| Comment by Peter Jones [ 24/Sep/13 ] |
|
The patch has landed to master, but has a test been created yet? |
| Comment by Peter Jones [ 24/Sep/13 ] |
|
Bob Could you please look into creating a suitable test? Thanks Peter |
| Comment by Bob Glossman (Inactive) [ 24/Sep/13 ] |
|
added sanity test |
| Comment by Peter Jones [ 08/Oct/13 ] |
|
Landed for 2.5 |
| Comment by Bob Glossman (Inactive) [ 04/Nov/13 ] |
|
backports for b2_4: |