[LU-3952] llite_nfs.c:349:ll_get_parent()) ASSERTION( body->valid & (0x00000001ULL) ) failed Created: 16/Sep/13 Updated: 29/Jul/14 Resolved: 04/Jan/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.6.0, Lustre 2.5.1 |
| Type: | Bug | Priority: | Major |
| Reporter: | Daire Byrne (Inactive) | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | mn4 | ||
| Environment: |
Lustre client: 2.6.32-358.6.2.el6_lustre.g230b174, exporting via NFS. |
||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 10510 | ||||
| Description |
|
I couldn't find this exact stack in JIRA so I'm reporting it. Finally had to mount with abort_recov to stop the machine from hitting the same LBUG. kdump available upon request.

<0>LustreError: 3436:0:(llite_nfs.c:349:ll_get_parent()) ASSERTION( body->valid & (0x00000001ULL) ) failed:
<0>LustreError: 3436:0:(llite_nfs.c:349:ll_get_parent()) LBUG
<4>Pid: 3436, comm: nfsd
<4>
<4>Call Trace:
<4> [<ffffffffa0432895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa0432e97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa0ac0ff3>] ll_get_parent+0x7b3/0x820 [lustre]
<4> [<ffffffffa02f53b0>] reconnect_path+0x160/0x310 [exportfs]
<4> [<ffffffffa02f57aa>] exportfs_decode_fh+0xea/0x2bc [exportfs]
<4> [<ffffffffa0bfa810>] ? nfsd_acceptable+0x0/0x120 [nfsd]
<4> [<ffffffffa034d5b0>] ? cache_check+0x60/0x360 [sunrpc]
<4> [<ffffffffa0c00702>] ? exp_find_key+0x62/0xb0 [nfsd]
<4> [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
<4> [<ffffffffa0bfad0a>] fh_verify+0x32a/0x640 [nfsd]
<4> [<ffffffffa0c059ac>] nfsd3_proc_getattr+0x6c/0xe0 [nfsd]
<4> [<ffffffffa0bf743e>] nfsd_dispatch+0xfe/0x240 [nfsd]
<4> [<ffffffffa0343604>] svc_process_common+0x344/0x640 [sunrpc]
<4> [<ffffffff81063310>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa0343c40>] svc_process+0x110/0x160 [sunrpc]
<4> [<ffffffffa0bf7b62>] nfsd+0xc2/0x160 [nfsd]
<4> [<ffffffffa0bf7aa0>] ? nfsd+0x0/0x160 [nfsd]
<4> [<ffffffff81096936>] kthread+0x96/0xa0
<4> [<ffffffff8100c0ca>] child_rip+0xa/0x20
<4> [<ffffffff810968a0>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
<4>
<0>Kernel panic - not syncing: LBUG
<4>Pid: 3436, comm: nfsd Tainted: G --------------- T 2.6.32-358.6.2.el6_lustre.g230b174.x86_64 #1
<4>Call Trace:
<4> [<ffffffff8150d878>] ? panic+0xa7/0x16f
<4> [<ffffffffa0432eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
<4> [<ffffffffa0ac0ff3>] ? ll_get_parent+0x7b3/0x820 [lustre]
<4> [<ffffffffa02f53b0>] ? reconnect_path+0x160/0x310 [exportfs]
<4> [<ffffffffa02f57aa>] ? exportfs_decode_fh+0xea/0x2bc [exportfs]
<4> [<ffffffffa0bfa810>] ? nfsd_acceptable+0x0/0x120 [nfsd]
<4> [<ffffffffa034d5b0>] ? cache_check+0x60/0x360 [sunrpc]
<4> [<ffffffffa0c00702>] ? exp_find_key+0x62/0xb0 [nfsd]
<4> [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
<4> [<ffffffffa0bfad0a>] ? fh_verify+0x32a/0x640 [nfsd]
<4> [<ffffffffa0c059ac>] ? nfsd3_proc_getattr+0x6c/0xe0 [nfsd]
<4> [<ffffffffa0bf743e>] ? nfsd_dispatch+0xfe/0x240 [nfsd]
<4> [<ffffffffa0343604>] ? svc_process_common+0x344/0x640 [sunrpc]
<4> [<ffffffff81063310>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa0343c40>] ? svc_process+0x110/0x160 [sunrpc]
<4> [<ffffffffa0bf7b62>] ? nfsd+0xc2/0x160 [nfsd]
<4> [<ffffffffa0bf7aa0>] ? nfsd+0x0/0x160 [nfsd]
<4> [<ffffffff81096936>] ? kthread+0x96/0xa0
<4> [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
<4> [<ffffffff810968a0>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 |
| Comments |
| Comment by Andreas Dilger [ 16/Sep/13 ] |
|
Daire, it appears from the stack that this is on a client which is re-exporting the Lustre filesystem via NFS? |
| Comment by Daire Byrne (Inactive) [ 16/Sep/13 ] |
|
Yes, that is correct, although the NFS access would have been very minimal at the time. |
| Comment by Daire Byrne (Inactive) [ 02/Oct/13 ] |
|
Another instance. Again we have the kdump vmcore if that is useful. Minimal to zero NFS access.

<3>LustreError: 4858:0:(llite_nfs.c:107:search_inode_for_lustre()) can't get object attrs, fid [0x200007260:0x19409:0x0], rc -2
<0>LustreError: 4855:0:(llite_nfs.c:349:ll_get_parent()) ASSERTION( body->valid & (0x00000001ULL) ) failed:
<0>LustreError: 4855:0:(llite_nfs.c:349:ll_get_parent()) LBUG
<4>Pid: 4855, comm: nfsd
<4>
<4>Call Trace:
<4> [<ffffffffa0486895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa0486e97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa0b08ff3>] ll_get_parent+0x7b3/0x820 [lustre]
<4> [<ffffffffa01223b0>] reconnect_path+0x160/0x310 [exportfs]
<4> [<ffffffffa01227aa>] exportfs_decode_fh+0xea/0x2bc [exportfs]
<4> [<ffffffffa043d810>] ? nfsd_acceptable+0x0/0x120 [nfsd]
<4> [<ffffffffa034d5b0>] ? cache_check+0x60/0x360 [sunrpc]
<4> [<ffffffffa0443702>] ? exp_find_key+0x62/0xb0 [nfsd]
<4> [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
<4> [<ffffffffa043dd0a>] fh_verify+0x32a/0x640 [nfsd]
<4> [<ffffffffa04489ac>] nfsd3_proc_getattr+0x6c/0xe0 [nfsd]
<4> [<ffffffffa043a43e>] nfsd_dispatch+0xfe/0x240 [nfsd]
<4> [<ffffffffa0343604>] svc_process_common+0x344/0x640 [sunrpc]
<4> [<ffffffff81063310>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa0343c40>] svc_process+0x110/0x160 [sunrpc]
<4> [<ffffffffa043ab62>] nfsd+0xc2/0x160 [nfsd]
<4> [<ffffffffa043aaa0>] ? nfsd+0x0/0x160 [nfsd]
<4> [<ffffffff81096936>] kthread+0x96/0xa0
<4> [<ffffffff8100c0ca>] child_rip+0xa/0x20
<4> [<ffffffff810968a0>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
<4>
<0>Kernel panic - not syncing: LBUG
<4>Pid: 4855, comm: nfsd Tainted: G --------------- T 2.6.32-358.6.2.el6_lustre.g230b174.x86_64 #1
<4>Call Trace:
<4> [<ffffffff8150d878>] ? panic+0xa7/0x16f
<4> [<ffffffffa0486eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
<4> [<ffffffffa0b08ff3>] ? ll_get_parent+0x7b3/0x820 [lustre]
<4> [<ffffffffa01223b0>] ? reconnect_path+0x160/0x310 [exportfs]
<4> [<ffffffffa01227aa>] ? exportfs_decode_fh+0xea/0x2bc [exportfs]
<4> [<ffffffffa043d810>] ? nfsd_acceptable+0x0/0x120 [nfsd]
<4> [<ffffffffa034d5b0>] ? cache_check+0x60/0x360 [sunrpc]
<4> [<ffffffffa0443702>] ? exp_find_key+0x62/0xb0 [nfsd]
<4> [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
<4> [<ffffffffa043dd0a>] ? fh_verify+0x32a/0x640 [nfsd]
<4> [<ffffffffa04489ac>] ? nfsd3_proc_getattr+0x6c/0xe0 [nfsd]
<4> [<ffffffffa043a43e>] ? nfsd_dispatch+0xfe/0x240 [nfsd]
<4> [<ffffffffa0343604>] ? svc_process_common+0x344/0x640 [sunrpc]
<4> [<ffffffff81063310>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa0343c40>] ? svc_process+0x110/0x160 [sunrpc]
<4> [<ffffffffa043ab62>] ? nfsd+0xc2/0x160 [nfsd]
<4> [<ffffffffa043aaa0>] ? nfsd+0x0/0x160 [nfsd]
<4> [<ffffffff81096936>] ? kthread+0x96/0xa0
<4> [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
<4> [<ffffffff810968a0>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 |
| Comment by Andreas Dilger [ 02/Oct/13 ] |
|
Looks like this is being hit in the case where an NFS file handle is being accessed for a disconnected directory dentry, and NFS wants to connect it to the dcache (in reconnect_path()). In ll_get_parent() the client sends an RPC to the MDS to do a lookup for ".." in the directory, but this doesn't return the expected result to the client. I can see how this might be a rare case, since it requires long-running NFS client processes doing a lookup on a directory that is not fully connected in the Lustre client cache (e.g. after the Lustre client remounted the filesystem). I also found some strange things when looking through the code.

I suspect that this is an "always fails" case, but we don't have a proper test for it, and it only happens in rare cases when NFS-exporting the filesystem from a client. Daire, could you confirm or deny whether this happens after the NFS-exporting client has reconnected to the MDS, either after it was restarted or was evicted by the MDS? It has been a long time since I looked at the code in question, and the server-side code was reworked in the 2.4 release, so I don't know for sure that the above actually is the source of the defect, but it is a starting point. |
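For readers unfamiliar with this code path, here is a minimal sketch of what ll_get_parent() does, as described in the comment above. It is not the verbatim Lustre source: the names follow the 2.4-era client (ll_prep_md_op_data(), md_getattr_name(), and so on), the error handling is trimmed, and the final parent-dentry lookup is elided.

    /*
     * Simplified sketch of ll_get_parent(). nfsd reaches this via
     * exportfs_decode_fh() -> reconnect_path() when it has to
     * reconnect a disconnected dentry to the dcache.
     */
    static struct dentry *ll_get_parent_sketch(struct dentry *dchild)
    {
            struct inode          *dir = dchild->d_inode;
            struct ll_sb_info     *sbi = ll_i2sbi(dir);
            struct ptlrpc_request *req = NULL;
            struct md_op_data     *op_data;
            struct mdt_body       *body;
            int                    rc;

            /* Build op_data for a lookup of ".." in this directory.
             * Note that op_valid is not explicitly set here, which is
             * what the later fix addresses. */
            op_data = ll_prep_md_op_data(NULL, dir, NULL, "..", strlen(".."),
                                         0, LUSTRE_OPC_ANY, NULL);
            if (IS_ERR(op_data))
                    return ERR_CAST(op_data);

            /* Send a getattr-by-name RPC to the MDS for ".." */
            rc = md_getattr_name(sbi->ll_md_exp, op_data, &req);
            ll_finish_md_op_data(op_data);
            if (rc)
                    return ERR_PTR(rc);

            body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);

            /* This is the assertion that fires in this ticket: the reply
             * is expected to carry a valid FID (OBD_MD_FLID == 0x1ULL). */
            LASSERT(body->valid & OBD_MD_FLID);

            /* ... normally the parent inode is then found or instantiated
             * from the returned FID and a connected dentry returned ... */
            ptlrpc_req_finished(req);
            return NULL; /* parent dentry lookup elided in this sketch */
    }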
| Comment by Daire Byrne (Inactive) [ 04/Oct/13 ] |
|
Looking through the logs (client and MDS), it doesn't look like the client was evicted. The only thing accessing the filesystem through NFS at the moment is a script that runs every 10 minutes and simply stats a bunch of directories. These directories are often deleted by another Lustre client, so it is possible a directory is removed at the same time. |
| Comment by Zhenyu Xu [ 09/Nov/13 ] |
|
patch tracking at http://review.whamcloud.com/8459 |
| Comment by Daire Byrne (Inactive) [ 19/Nov/13 ] |
|
We have applied the patch to an NFS gateway that was consistently crashing 1-2 times a week. I will report back in a week or so. |
| Comment by Daire Byrne (Inactive) [ 20/Nov/13 ] |
|
Well, the addition of that patch (+ op_data->op_valid = OBD_MD_FLID) did not stop the same LBUG from happening again. Do you want the new vmcore? |
| Comment by Zhenyu Xu [ 22/Nov/13 ] |
|
Yes, please upload the client vmcore (plus supporting files, i.e. system.map and vmlinux) as well as the MDS Lustre logs. |
| Comment by Daire Byrne (Inactive) [ 22/Nov/13 ] |
|
kdump vmcore: https://dl.dropboxusercontent.com/u/24821368/llite_nfs_vmcore.tar.bz2

The system.map and vmlinuz are the ones from the Whamcloud 2.4.1 rpms. I only recompiled the lustre-modules package with the patch and installed them. Unfortunately I don't have any MDS debug logs to go with the client crash. As far as the MDS syslog is concerned, the client just timed out:

Nov 20 12:06:59 bmds1 kernel: Lustre: 10702:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1384949212/real 0] req@ffff8823c5fbc000 x1450793259087600/t0(0) o104->bravo-MDT0000@10.21.20.161@tcp:15/16 lens 296/224 e 0 to 1 dl 1384949219 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
Nov 20 12:06:59 bmds1 kernel: Lustre: 10702:0:(client.c:1868:ptlrpc_expire_one_request()) Skipped 52 previous similar messages
Nov 20 12:06:59 bmds1 kernel: LustreError: 138-a: bravo-MDT0000: A client on nid 10.21.20.161@tcp was evicted due to a lock blocking callback time out: rc -107
Nov 20 12:08:11 bmds1 kernel: Lustre: 3360:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1384949280/real 1384949280] req@ffff881e91c8fc00 x1450793259664088/t0(0) o104->bravo-MDT0000@10.21.20.161@tcp:15/16 lens 296/224 e 0 to 1 dl 1384949291 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Nov 20 12:08:11 bmds1 kernel: LustreError: 138-a: bravo-MDT0000: A client on nid 10.21.20.161@tcp was evicted due to a lock blocking callback time out: rc -107
Nov 20 12:09:36 bmds1 kernel: Lustre: MGS: haven't heard from client 579e7714-62da-0239-1436-a6aaef6c3c2e (at 10.21.20.161@tcp) in 228 seconds. I think it's dead, and I am evicting it. exp ffff88023ee9f400, cur 1384949376 expire 1384949226 last 1384949148
Nov 20 12:09:36 bmds1 kernel: Lustre: Skipped 5 previous similar messages
Nov 20 12:11:30 bmds1 kernel: Lustre: MGS: haven't heard from client bfce87ed-5f06-a61e-66b2-29c25777910c (at 10.21.20.161@tcp) in 228 seconds. I think it's dead, and I am evicting it. exp ffff8840099f5400, cur 1384949490 expire 1384949340 last 1384949262
Nov 20 12:12:55 bmds1 rshd[15712]: root@tidworth.dhcp.dneg.com as root: cmd='mount | egrep -w -q tidworth && echo 1'
Nov 20 12:12:59 bmds1 kernel: Lustre: MGS: haven't heard from client e329d5cb-9323-ba9b-75eb-08e4bb28f5d6 (at 10.21.20.161@tcp) in 228 seconds. I think it's dead, and I am evicting it. exp ffff8821e9671000, cur 1384949579 expire 1384949429 last 1384949351
Nov 20 12:15:06 bmds1 kernel: Lustre: bravo-MDT0000: haven't heard from client 859d5542-b1cd-fc79-f151-e2a1d616d3b6 (at 10.21.20.161@tcp) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88222b768000, cur 1384949706 expire 1384949556 last 1384949479
Nov 20 12:15:06 bmds1 kernel: Lustre: Skipped 3 previous similar messages
Nov 20 12:17:03 bmds1 kernel: Lustre: MGS: haven't heard from client c7f8980e-3768-eb52-c409-dd0e8c0262b8 (at 10.21.20.161@tcp) in 229 seconds. I think it's dead, and I am evicting it. exp ffff882143efd000, cur 1384949823 expire 1384949673 last 1384949594
Nov 20 12:17:03 bmds1 kernel: Lustre: Skipped 3 previous similar messages
Nov 20 12:17:59 bmds1 kernel: Lustre: 30381:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1384949868/real 1384949868] req@ffff884009987800 x1450793263165844/t0(0) o104->bravo-MDT0000@10.21.20.161@tcp:15/16 lens 296/224 e 0 to 1 dl 1384949879 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Nov 20 12:17:59 bmds1 kernel: LustreError: 138-a: bravo-MDT0000: A client on nid 10.21.20.161@tcp was evicted due to a lock blocking callback time out: rc -107
Nov 20 12:19:17 bmds1 kernel: Lustre: MGS: haven't heard from client 4414062d-c26f-f3be-2991-56b4465da4d2 (at 10.21.20.161@tcp) in 230 seconds. I think it's dead, and I am evicting it. exp ffff88222c64d000, cur 1384949957 expire 1384949807 last 1384949727
Nov 20 12:19:17 bmds1 kernel: Lustre: Skipped 3 previous similar messages
Nov 20 12:22:18 bmds1 kernel: Lustre: bravo-MDT0000: haven't heard from client ab18aa85-db83-fe4a-ee80-c5b97ce1b739 (at 10.21.20.161@tcp) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8822262fd000, cur 1384950138 expire 1384949988 last 1384949911 |
| Comment by Zhenyu Xu [ 02/Dec/13 ] |
|
The Lustre client does not initialize op_valid in ll_get_parent(); mdc_pack_body() will set it only if the inode's fid is valid (please refer to ll_prep_md_op_data(): { ... op_data->op_fid1 = *ll_inode2fid(i1); }). Setting op_valid in ll_get_parent() asks the MDS to retrieve the parent's fid by name, even when the inode's fid is not available. If that still does not stop the LBUG, then something wrong on the MDS, which fails to find the parent's fid, could be causing the issue. |
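To make the change concrete, here is a minimal sketch of the fix being described, based on the "+ op_data->op_valid = OBD_MD_FLID" hunk quoted earlier in this ticket. The surrounding lines are simplified and not the verbatim source; the authoritative change is the patch at http://review.whamcloud.com/8459.

    /* Inside ll_get_parent(), after building op_data for ".."
     * (simplified sketch; see http://review.whamcloud.com/8459 for
     * the real patch). */
    op_data = ll_prep_md_op_data(NULL, dir, NULL, "..", strlen(".."),
                                 0, LUSTRE_OPC_ANY, NULL);
    if (IS_ERR(op_data))
            return ERR_CAST(op_data);

    /* Explicitly request the FID, so the MDS is asked to look the
     * parent up by name instead of relying on op_valid being left
     * uninitialized. */
    op_data->op_valid = OBD_MD_FLID;

    rc = md_getattr_name(sbi->ll_md_exp, op_data, &req);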
| Comment by Zhenyu Xu [ 03/Dec/13 ] |
|
patch updated at http://review.whamcloud.com/8459 |
| Comment by Daire Byrne (Inactive) [ 05/Dec/13 ] |
|
patch applied and so far so good. |
| Comment by Bob Glossman (Inactive) [ 02/Jan/14 ] |
|
backport to b2_5 |
| Comment by Peter Jones [ 04/Jan/14 ] |
|
Landed for 2.6 |
| Comment by Eric Kolb [ 17/Jan/14 ] |
|
What are the options for 2.4.2? We seem to be suffering from this bug but wish to stay on the stable release. |
| Comment by Shuichi Ihara (Inactive) [ 02/Feb/14 ] |
|
We hit the same problem on 2.4.2 and have just posted backport patches for b2_4: http://review.whamcloud.com/9092. Please review them. |
| Comment by Karsten Weiss [ 05/Mar/14 ] |
|
Shuichi, we have been testing your 2.4.2 backport since 2014-02-21 and so far it seems to have fixed this problem for us. The fix should probably be applied to b2_4, too. |
| Comment by Stephen Champion [ 12/Jun/14 ] |
|
backport to b1_8 |
| Comment by Lawrence Wright [ 29/Jul/14 ] |
|
Could you confirm whether the backported 2.4 fix ever made it into 2.4.3? We've just seen an instance of this on a 2.4.3-based NFS gateway. Thanks! |
| Comment by Zhenyu Xu [ 29/Jul/14 ] |
|
No, it hasn't been landed in b2_4 yet (and therefore not in 2.4.3 either). |