[LU-3717] Kernel panic in ll_encode_fh() while testing file handle syscalls on FC18 client Created: 07/Aug/13 Updated: 04/Dec/13 Resolved: 04/Dec/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.1, Lustre 2.5.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Swapnil Pimpale (Inactive) | Assignee: | Jian Yu |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9574 |
| Description |
|
Hit a kernel panic while testing the new file handle syscalls (name_to_handle_at()/open_by_handle_at()). To reproduce, follow these steps: The following is the stack trace of the panic: crash> bt -l |
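For readers unfamiliar with the two syscalls named above, the following is a minimal userspace sketch of how they are typically called through the glibc wrappers. It is not the reproducer from this ticket. The helper names `encode_handle()` and `reopen_handle()` are hypothetical, and the two-call sizing pattern follows the documented `name_to_handle_at()` convention (a first call with `handle_bytes = 0` fails with `EOVERFLOW` and reports the required size).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the ticket): ask the kernel to encode a
 * file handle for `path`. Returns a malloc()ed handle that the caller
 * must free(), or NULL with errno set.
 */
struct file_handle *encode_handle(const char *path, int *mount_id)
{
    struct file_handle *fh;
    unsigned int bytes;
    int saved;

    /* Probe call: handle_bytes = 0 asks the kernel for the needed size. */
    fh = malloc(sizeof(*fh));
    if (fh == NULL)
        return NULL;
    fh->handle_bytes = 0;
    if (name_to_handle_at(AT_FDCWD, path, fh, mount_id, 0) == 0)
        return fh; /* unlikely: the filesystem produced an empty handle */
    if (errno != EOVERFLOW) {
        saved = errno;
        free(fh);
        errno = saved;
        return NULL;
    }

    /* Real call: allocate room for the opaque f_handle payload. */
    bytes = fh->handle_bytes;
    free(fh);
    fh = malloc(sizeof(*fh) + bytes);
    if (fh == NULL)
        return NULL;
    fh->handle_bytes = bytes;
    if (name_to_handle_at(AT_FDCWD, path, fh, mount_id, 0) == -1) {
        saved = errno;
        free(fh);
        errno = saved;
        return NULL;
    }
    return fh;
}

/*
 * Hypothetical helper: reopen a file from its handle. mount_fd must be an
 * open descriptor on the same mount, and the caller normally needs
 * CAP_DAC_READ_SEARCH.
 */
int reopen_handle(int mount_fd, struct file_handle *fh)
{
    return open_by_handle_at(mount_fd, fh, O_RDONLY);
}
```

On a client affected by this bug, the encoding step is the one that would be expected to reach ll_encode_fh(); on other filesystems the call may simply fail with an errno such as EOPNOTSUPP.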
| Comments |
| Comment by Oleg Drokin [ 08/Aug/13 ] |
|
The LBUG is triggered because ll_inode2fid() asserts that the inode is not NULL, yet a NULL inode is somehow being passed in to ll_encode_fh().
| Comment by Jian Yu [ 24/Oct/13 ] |
|
On FC18 client node, I set panic_on_lbug=0 and got the lctl debug log as follows:

00000080:00000001:2.0:1382612125.266070:0:21111:0:(llite_nfs.c:187:ll_encode_fh()) Process entered

In ll_encode_fh():

static int ll_encode_fh(struct inode *inode, __u32 *fh, int *plen,
                        struct inode *parent)
{
        //......
        CDEBUG(D_INFO, "encoding for (%lu,"DFID") maxlen=%d minlen=%d\n",
               inode->i_ino, PFID(ll_inode2fid(inode)), *plen,
               (int)sizeof(struct lustre_nfs_fid));
        //......
        nfs_fid->lnf_child = *ll_inode2fid(inode);
        nfs_fid->lnf_parent = *ll_inode2fid(parent);   <------ parent was NULL, which caused the ASSERTION failure
        //......
}

Need to dig out why "parent" passed from exportfs_encode_fh() to ll_encode_fh() was NULL. |
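The failure mode described in this comment can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the struct layouts are simplified stand-ins rather than the real Lustre definitions, `encode_fh_guarded()` is a hypothetical name, and zeroing the parent FID is one plausible way to handle a missing parent, not necessarily what the patches referenced in this ticket do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins; the field layouts are illustrative assumptions. */
struct lu_fid {
    unsigned long long f_seq;
    unsigned int f_oid;
    unsigned int f_ver;
};

struct inode {
    struct lu_fid fid;
};

struct lustre_nfs_fid {
    struct lu_fid lnf_child;
    struct lu_fid lnf_parent;
};

/* Models the behaviour reported above: a NULL inode trips an assertion. */
struct lu_fid *ll_inode2fid(struct inode *inode)
{
    assert(inode != NULL);
    return &inode->fid;
}

/*
 * Hypothetical guarded encoder: only dereference the parent when one was
 * supplied, and store an all-zero parent FID otherwise.
 */
void encode_fh_guarded(struct inode *inode, struct inode *parent,
                       struct lustre_nfs_fid *nfs_fid)
{
    nfs_fid->lnf_child = *ll_inode2fid(inode);
    if (parent != NULL)
        nfs_fid->lnf_parent = *ll_inode2fid(parent);
    else
        memset(&nfs_fid->lnf_parent, 0, sizeof(nfs_fid->lnf_parent));
}
```

With this guard, calling `encode_fh_guarded(&child, NULL, &out)` fills in the child FID and leaves the parent FID zeroed, whereas the unguarded assignment shown in the comment would hit the assertion.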
| Comment by Jian Yu [ 25/Oct/13 ] |
|
In Linux kernel 3.6.10-4 used by FC18, exportfs_encode_fh() was called from do_sys_name_to_handle() as follows:

static long do_sys_name_to_handle(struct path *path,
                                  struct file_handle __user *ufh,
                                  int __user *mnt_id)
{
        //......
        /* we ask for a non connected handle */
        retval = exportfs_encode_fh(path->dentry,
                                    (struct fid *)handle->f_handle,
                                    &handle_dwords, 0);   <------ Here, 0 was passed to exportfs_encode_fh().
        //......
}

In exportfs_encode_fh(), the code is:

int exportfs_encode_fh(struct dentry *dentry, struct fid *fid, int *max_len,
                       int connectable)
{
        //......
        struct inode *inode = dentry->d_inode, *parent = NULL;

        if (connectable && !S_ISDIR(inode->i_mode)) {   <------ Here, connectable was 0.
                p = dget_parent(dentry);
                //......
                parent = p->d_inode;
        }
        if (nop->encode_fh)
                error = nop->encode_fh(inode, fid->raw, max_len, parent);   <------ Here, parent was NULL.
        //......
}

So exportfs_encode_fh() ended up passing the "parent" parameter as NULL to ll_encode_fh(). |
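The control flow traced in this comment can be condensed into a small userspace model that shows when the encoder receives a NULL parent. This is a sketch, not kernel code: the structs are reduced to the fields the decision depends on, `is_dir` stands in for the S_ISDIR(inode->i_mode) check, and `parent_for_encode()` is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-ins for the kernel structures (illustrative only). */
struct inode {
    bool is_dir;
};

struct dentry {
    struct inode *d_inode;
    struct dentry *d_parent;
};

/*
 * Returns the parent inode that would be handed to ->encode_fh().
 * The parent is looked up only for connectable handles on
 * non-directories; in every other case it stays NULL.
 */
struct inode *parent_for_encode(const struct dentry *dentry, int connectable)
{
    struct inode *inode = dentry->d_inode;
    struct inode *parent = NULL;

    if (connectable && !inode->is_dir)
        parent = dentry->d_parent->d_inode;
    return parent;
}
```

Because do_sys_name_to_handle() passes 0 for connectable, the model returns NULL for every dentry on that path, so any encode_fh implementation reached this way has to tolerate a NULL parent.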
| Comment by Jian Yu [ 25/Oct/13 ] |
|
Patch for master branch is in http://review.whamcloud.com/8072. |
| Comment by Dmitry Eremin (Inactive) [ 26/Nov/13 ] |
|
It's a similar issue to |
| Comment by James A Simmons [ 03/Dec/13 ] |
|
Just tested the http://review.whamcloud.com/8347 patch and I get this:

[root@spoon46 ~]# /usr/lib64/lustre/tests/check_fhandle_syscalls temp-file /lustre/barry/

It appears to work correctly. |
| Comment by Jian Yu [ 04/Dec/13 ] |
|
The patch in http://review.whamcloud.com/8347 resolves the failure. Let's close this ticket as a duplicate of |