[LU-5024] (mdc_lib.c:163:mdc_pack_name()) ASSERTION( cpy_len == name_len && lu_name_is_valid_2(buf, cpy_len) ) failed: Created: 07/May/14 Updated: 17/Sep/19 Resolved: 22/Nov/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0, Lustre 2.9.0 |
| Fix Version/s: | Lustre 2.11.0, Lustre 2.10.4 |
| Type: | Bug | Priority: | Minor |
| Reporter: | John Hammond | Assignee: | Lai Siyao |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | llite, mdc |
| Severity: | 3 |
| Rank (Obsolete): | 13904 |
| Description |
|
Testing http://review.whamcloud.com/#/c/10198/, Oleg got it to crash under racer. This can be easily reproduced using:

    llmount.sh
    cp /bin/true /mnt/lustre/TRUE
    cd /mnt/lustre
    while true; do ./TRUE; done &
    while true; do mv TRUE TRUE_XXX; mv TRUE_XXX TRUE; done

Message from syslogd@u at May 7 11:13:49 ...
kernel:[491063.276112] LustreError: 14609:0:(mdc_lib.c:163:mdc_pack_name()) ASSERTION( cpy_len == name_len && lu_name_is_valid_2(buf, cpy_len) ) failed:
Message from syslogd@u at May 7 11:13:49 ...
kernel:[491063.279161] LustreError: 14609:0:(mdc_lib.c:163:mdc_pack_name()) LBUG
Message from syslogd@u at May 7 11:13:49 ...
kernel:[491063.317026] Kernel panic - not syncing: LBUG
crash> bt
PID: 14609 TASK: ffff88011b6006c0 CPU: 4 COMMAND: "bash"
#0 [ffff880110c6f550] machine_kexec at ffffffff81039950
#1 [ffff880110c6f5b0] crash_kexec at ffffffff810d4372
#2 [ffff880110c6f680] panic at ffffffff81550d83
#3 [ffff880110c6f700] lbug_with_loc at ffffffffa079df1b [libcfs]
#4 [ffff880110c6f720] mdc_pack_name at ffffffffa0991d25 [mdc]
#5 [ffff880110c6f760] mdc_open_pack at ffffffffa0992789 [mdc]
#6 [ffff880110c6f7c0] mdc_enqueue at ffffffffa099699e [mdc]
#7 [ffff880110c6f900] mdc_intent_lock at ffffffffa0997d4e [mdc]
#8 [ffff880110c6f9c0] lmv_intent_open at ffffffffa095df35 [lmv]
#9 [ffff880110c6fa60] lmv_intent_lock at ffffffffa095e88b [lmv]
#10 [ffff880110c6faf0] ll_intent_file_open at ffffffffa06508ed [lustre]
#11 [ffff880110c6fb80] ll_file_open at ffffffffa0651a15 [lustre]
#12 [ffff880110c6fc80] __dentry_open at ffffffff8119fa5a
#13 [ffff880110c6fce0] nameidata_to_filp at ffffffff8119fdc4
#14 [ffff880110c6fd00] do_filp_open at ffffffff811b5640
#15 [ffff880110c6fe70] open_exec at ffffffff811ac200
#16 [ffff880110c6fec0] do_execve at ffffffff811ac39f
#17 [ffff880110c6ff20] sys_execve at ffffffff810095ea
#18 [ffff880110c6ff50] stub_execve at ffffffff8100b54a
RIP: 000000377fead047 RSP: 00007fff66ccc718 RFLAGS: 00000246
RAX: 000000000000003b RBX: 00000000015b9490 RCX: ffffffffffffffff
RDX: 00000000015623b0 RSI: 00000000015b9530 RDI: 00000000015b9490
RBP: 00000000015b9490 R8: 000000378018fee8 R9: 0000000000000001
R10: 0000000000000010 R11: 0000000000000246 R12: 0000000000000001
R13: 00000000015b9530 R14: 00000000015623b0 R15: 0000000001537280
ORIG_RAX: 000000000000003b CS: 0033 SS: 002b
Looking at the stack and debug logs, I see that execve() is called on ./TRUE, but TRUE_XXX is being packed into the request (with the length of "TRUE"). f_dentry is probably not stable here and should not be accessed the way it is in ll_intent_file_open(). There have been patches to drop the name (see |
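To make the failure mode concrete, here is a minimal userspace sketch of how a length sampled before a rename can disagree with the bytes copied after it. It is not the Lustre code: sketch_strlcpy(), sketch_pack_name() and sketch_race() are hypothetical names, the "rename" is simulated in a single thread, and the assumption that the copy reports the full source length (strlcpy-style) is inferred from the cpy_len == name_len check in the assertion message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* strlcpy-style stand-in (hypothetical helper): copies at most size - 1
 * bytes, NUL-terminates, and returns strlen(src), i.e. the length of the
 * source rather than the number of bytes actually copied. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t src_len = strlen(src);

	if (size != 0) {
		size_t n = src_len < size - 1 ? src_len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return src_len;
}

/* Hypothetical model of the check in the assertion: returns 1 when the
 * reported copy length matches the length the buffer was sized for,
 * 0 when the name changed underneath the caller. */
static int sketch_pack_name(char *buf, size_t buf_size,
			    const char *name, size_t name_len)
{
	size_t cpy_len = sketch_strlcpy(buf, name, buf_size);

	return cpy_len == name_len;
}

/* Deterministic stand-in for the race: the length is sampled from the
 * old name, then the name is replaced before the bytes are copied. */
static int sketch_race(const char *before, const char *after)
{
	char live_name[64];
	char buf[64];
	size_t name_len;

	assert(strlen(before) < sizeof(live_name));

	sketch_strlcpy(live_name, before, sizeof(live_name));
	name_len = strlen(live_name);		/* length sampled first */
	sketch_strlcpy(live_name, after, sizeof(live_name)); /* "rename" */

	/* Buffer is sized for the old length plus the terminating NUL. */
	return sketch_pack_name(buf, name_len + 1, live_name, name_len);
}
```

In this model sketch_race("TRUE", "TRUE_XXX") returns 0, the case that would trip the assertion, while sketch_race("TRUE", "TRUE") returns 1. One general way to avoid the mismatch is to take the name and its length from a single stable snapshot; whether the eventual fix does that is not stated here.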
| Comments |
| Comment by Saurabh Tandan (Inactive) [ 08/Nov/16 ] |
|
An instance for Interop - 2.8.0 EL7.2 Server/EL7.2 Client |
| Comment by Åke Sandgren [ 28/Feb/17 ] |
|
We've just been bitten by this assert on a production system. We would really like to see a fix for this. |
| Comment by Åke Sandgren [ 28/Feb/17 ] |
|
We've seen it 3 times on different nodes in a couple of hours... |
| Comment by Gerrit Updater [ 22/Sep/17 ] |
|
Lai Siyao (lai.siyao@intel.com) uploaded a new patch: https://review.whamcloud.com/29161 |
| Comment by Gerrit Updater [ 22/Nov/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/29161/ |
| Comment by Peter Jones [ 22/Nov/17 ] |
|
Landed for 2.11 |
| Comment by Gerrit Updater [ 04/Dec/17 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/30355 |
| Comment by Gerrit Updater [ 12/Mar/18 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/30355/ |