Details
-
Bug
-
Resolution: Fixed
-
Major
-
Lustre 1.8.6, Lustre 1.8.x (1.8.0 - 1.8.5)
-
CentOS 5 with 1.8.6-WC1 clients and 1.8.4 servers.
-
3
-
4028
Description
While running some tests, we found that the client hangs. Further investigation shows that the client is looping forever in ll_readdir_page(). The first ll_dir_entry on the page is correct, but some of the following ll_dir_entry records are entirely zero.
crash> struct ll_dir_entry 0xffff8105f5818000
struct ll_dir_entry {
lde_inode = 748257583,
lde_rec_len = 12,
lde_name_len = 1 '\001',
lde_file_type = 2 '\002',
lde_name = ".\000\000\000\200\202\230,\f\000\002\002..\000\0000\201\231,\024\000\n\001ssciohb.nrat1\201\231,\020\000\005\001krsni8552\201\231,\020\000\005\001nticpemc3\201\231,\020\000\b\001crn.wole4\201\231,\f\000\003\001lsrt5\201\231,\024\000\n\001loita.hdal2.6\201\231,\024\000\n\001feekg.vsri9\0007\201\231,\024\000\f\001lgaumt.ggesd8\201\231,\024\000\v\001eilltn.ncsr.9\201\231,\020\000\b\001sai.blol:\201\231,\024\000\t\001rnbtmru.sing;\201\231,\020\000\006\001eta.fo64<\201\231,\f\000\003\001rkr.=\201\231,\020\000\005\001aco.d13"
}
crash> struct ll_dir_entry 0xffff8105f5818a19
struct ll_dir_entry {
lde_inode = 0,
lde_rec_len = 0,
lde_name_len = 0 '\0',
lde_file_type = 0 '\0',
lde_name = "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
}
After applying the changes below, the tests pass smoothly and the debug message is printed many times.
diff --git a/lustre/llite/dir.c b/lustre/llite/dir.c
index 3154d32..3b9779b 100644
--- a/lustre/llite/dir.c
+++ b/lustre/llite/dir.c
@@ -327,6 +327,12 @@ static int ll_readdir_page(char *addr, __u64 base, unsigned *offset,
de = ll_entry_at(addr, *offset);
end = addr + CFS_PAGE_SIZE - ll_dir_rec_len(1);
for (nr = 0 ;(char*)de <= end; de = ll_dir_next_entry(de)) {
+ if (de->lde_rec_len == 0)
if (de->lde_inode != 0) {
nr++;
*offset = (char *)de - addr;
It may not be the right fix as I didn't figure out why the page is partially zeroed.
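To illustrate why a zeroed record makes the loop spin, here is a minimal user-space sketch. It is not the Lustre code: `sketch_dir_entry`, `sketch_walk_page`, `sketch_put_entry` and `SKETCH_PAGE_SIZE` are hypothetical names, and the record header layout is only assumed from the crash output above (32-bit inode, 16-bit record length, two one-byte fields). The walker advances by `rec_len`, so a record with `rec_len == 0` never moves the offset; the sketch reports that case with -1 instead of looping. What the real fix should do on such a record is a separate question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for the sketch */

/* Hypothetical stand-in for the record header seen in the crash dump. */
struct sketch_dir_entry {
    uint32_t inode;
    uint16_t rec_len;    /* distance in bytes to the next record */
    uint8_t  name_len;
    uint8_t  file_type;
    /* name bytes follow the header */
};

/*
 * Walk the records in one page.
 * Returns the number of live entries (inode != 0), or -1 if a record
 * with rec_len == 0 is found. Without that check the offset would
 * never advance and the loop would never terminate.
 */
int sketch_walk_page(const unsigned char *page)
{
    const size_t hdr = sizeof(struct sketch_dir_entry);
    size_t off = 0;
    int nr = 0;

    while (off + hdr <= SKETCH_PAGE_SIZE) {
        struct sketch_dir_entry de;

        /* memcpy avoids alignment assumptions about the buffer */
        memcpy(&de, page + off, hdr);
        if (de.rec_len == 0)
            return -1;          /* zeroed record: cannot advance */
        if (de.inode != 0)
            nr++;
        off += de.rec_len;
    }
    return nr;
}

/* Test helper: write one record at off and return the next offset. */
size_t sketch_put_entry(unsigned char *page, size_t off, uint32_t inode,
                        uint16_t rec_len, const char *name)
{
    struct sketch_dir_entry de;
    size_t name_len = strlen(name);

    assert(name_len <= UINT8_MAX);
    assert(rec_len >= sizeof(de) + name_len);
    assert(off + rec_len <= SKETCH_PAGE_SIZE);

    de.inode = inode;
    de.rec_len = rec_len;
    de.name_len = (uint8_t)name_len;
    de.file_type = 0;
    memcpy(page + off, &de, sizeof(de));
    memcpy(page + off + sizeof(de), name, name_len);
    return off + rec_len;
}
```

A page holding a few valid records followed by zeros, like the one in the dump, makes `sketch_walk_page` return -1 at the first zeroed record. A page whose last record extends to the end of the page is walked completely.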
Attachments
Activity
Fix Version/s | New: Lustre 1.8.9 [ 10204 ]
Fix Version/s | Original: Lustre 1.8.8 [ 10131 ]
Fix Version/s | New: Lustre 1.8.8 [ 10131 ]
Resolution | New: Fixed [ 1 ]
Status | Original: Open [ 1 ] | New: Resolved [ 5 ]
Assignee | Original: WC Triage [ wc-triage ] | New: Keith Mannthey [ keith ]
Labels | Original: emc | New: emc patch