[LU-3952] llite_nfs.c:349:ll_get_parent()) ASSERTION( body->valid & (0x00000001ULL) ) failed Created: 16/Sep/13  Updated: 29/Jul/14  Resolved: 04/Jan/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.6.0, Lustre 2.5.1

Type: Bug Priority: Major
Reporter: Daire Byrne (Inactive) Assignee: Zhenyu Xu
Resolution: Fixed Votes: 0
Labels: mn4
Environment:

Lustre client: 2.6.32-358.6.2.el6_lustre.g230b174, exporting via NFS.


Issue Links:
Related
Severity: 3
Rank (Obsolete): 10510

 Description   

I couldn't find this exact stack in JIRA, so I'm reporting it. I finally had to mount with abort_recov to stop the machine from hitting the same LBUG. The kdump is available upon request.

<0>LustreError: 3436:0:(llite_nfs.c:349:ll_get_parent()) ASSERTION( body->valid & (0x00000001ULL) ) failed: 
<0>LustreError: 3436:0:(llite_nfs.c:349:ll_get_parent()) LBUG
<4>Pid: 3436, comm: nfsd
<4>
<4>Call Trace:
<4> [<ffffffffa0432895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa0432e97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa0ac0ff3>] ll_get_parent+0x7b3/0x820 [lustre]
<4> [<ffffffffa02f53b0>] reconnect_path+0x160/0x310 [exportfs]
<4> [<ffffffffa02f57aa>] exportfs_decode_fh+0xea/0x2bc [exportfs]
<4> [<ffffffffa0bfa810>] ? nfsd_acceptable+0x0/0x120 [nfsd]
<4> [<ffffffffa034d5b0>] ? cache_check+0x60/0x360 [sunrpc]
<4> [<ffffffffa0c00702>] ? exp_find_key+0x62/0xb0 [nfsd]
<4> [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
<4> [<ffffffffa0bfad0a>] fh_verify+0x32a/0x640 [nfsd]
<4> [<ffffffffa0c059ac>] nfsd3_proc_getattr+0x6c/0xe0 [nfsd]
<4> [<ffffffffa0bf743e>] nfsd_dispatch+0xfe/0x240 [nfsd]
<4> [<ffffffffa0343604>] svc_process_common+0x344/0x640 [sunrpc]
<4> [<ffffffff81063310>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa0343c40>] svc_process+0x110/0x160 [sunrpc]
<4> [<ffffffffa0bf7b62>] nfsd+0xc2/0x160 [nfsd]
<4> [<ffffffffa0bf7aa0>] ? nfsd+0x0/0x160 [nfsd]
<4> [<ffffffff81096936>] kthread+0x96/0xa0
<4> [<ffffffff8100c0ca>] child_rip+0xa/0x20
<4> [<ffffffff810968a0>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
<4>
<0>Kernel panic - not syncing: LBUG
<4>Pid: 3436, comm: nfsd Tainted: G           ---------------  T 2.6.32-358.6.2.el6_lustre.g230b174.x86_64 #1
<4>Call Trace:
<4> [<ffffffff8150d878>] ? panic+0xa7/0x16f
<4> [<ffffffffa0432eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
<4> [<ffffffffa0ac0ff3>] ? ll_get_parent+0x7b3/0x820 [lustre]
<4> [<ffffffffa02f53b0>] ? reconnect_path+0x160/0x310 [exportfs]
<4> [<ffffffffa02f57aa>] ? exportfs_decode_fh+0xea/0x2bc [exportfs]
<4> [<ffffffffa0bfa810>] ? nfsd_acceptable+0x0/0x120 [nfsd]
<4> [<ffffffffa034d5b0>] ? cache_check+0x60/0x360 [sunrpc]
<4> [<ffffffffa0c00702>] ? exp_find_key+0x62/0xb0 [nfsd]
<4> [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
<4> [<ffffffffa0bfad0a>] ? fh_verify+0x32a/0x640 [nfsd]
<4> [<ffffffffa0c059ac>] ? nfsd3_proc_getattr+0x6c/0xe0 [nfsd]
<4> [<ffffffffa0bf743e>] ? nfsd_dispatch+0xfe/0x240 [nfsd]
<4> [<ffffffffa0343604>] ? svc_process_common+0x344/0x640 [sunrpc]
<4> [<ffffffff81063310>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa0343c40>] ? svc_process+0x110/0x160 [sunrpc]
<4> [<ffffffffa0bf7b62>] ? nfsd+0xc2/0x160 [nfsd]
<4> [<ffffffffa0bf7aa0>] ? nfsd+0x0/0x160 [nfsd]
<4> [<ffffffff81096936>] ? kthread+0x96/0xa0
<4> [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
<4> [<ffffffff810968a0>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20


 Comments   
Comment by Andreas Dilger [ 16/Sep/13 ]

Daire, it appears from the stack that this is on a client which is re-exporting the Lustre filesystem via NFS?

Comment by Daire Byrne (Inactive) [ 16/Sep/13 ]

Yes, that is correct, although the NFS access would have been very minimal at the time.

Comment by Daire Byrne (Inactive) [ 02/Oct/13 ]

Another instance. Again we have the kdump vmcore if that is useful. Minimal to zero NFS access.

<3>LustreError: 4858:0:(llite_nfs.c:107:search_inode_for_lustre()) can't get object attrs, fid [0x200007260:0x19409:0x0], rc -2
<0>LustreError: 4855:0:(llite_nfs.c:349:ll_get_parent()) ASSERTION( body->valid & (0x00000001ULL) ) failed: 
<0>LustreError: 4855:0:(llite_nfs.c:349:ll_get_parent()) LBUG
<4>Pid: 4855, comm: nfsd
<4>
<4>Call Trace:
<4> [<ffffffffa0486895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa0486e97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa0b08ff3>] ll_get_parent+0x7b3/0x820 [lustre]
<4> [<ffffffffa01223b0>] reconnect_path+0x160/0x310 [exportfs]
<4> [<ffffffffa01227aa>] exportfs_decode_fh+0xea/0x2bc [exportfs]
<4> [<ffffffffa043d810>] ? nfsd_acceptable+0x0/0x120 [nfsd]
<4> [<ffffffffa034d5b0>] ? cache_check+0x60/0x360 [sunrpc]
<4> [<ffffffffa0443702>] ? exp_find_key+0x62/0xb0 [nfsd]
<4> [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
<4> [<ffffffffa043dd0a>] fh_verify+0x32a/0x640 [nfsd]
<4> [<ffffffffa04489ac>] nfsd3_proc_getattr+0x6c/0xe0 [nfsd]
<4> [<ffffffffa043a43e>] nfsd_dispatch+0xfe/0x240 [nfsd]
<4> [<ffffffffa0343604>] svc_process_common+0x344/0x640 [sunrpc]
<4> [<ffffffff81063310>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa0343c40>] svc_process+0x110/0x160 [sunrpc]
<4> [<ffffffffa043ab62>] nfsd+0xc2/0x160 [nfsd]
<4> [<ffffffffa043aaa0>] ? nfsd+0x0/0x160 [nfsd]
<4> [<ffffffff81096936>] kthread+0x96/0xa0
<4> [<ffffffff8100c0ca>] child_rip+0xa/0x20
<4> [<ffffffff810968a0>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
<4>
<0>Kernel panic - not syncing: LBUG
<4>Pid: 4855, comm: nfsd Tainted: G           ---------------  T 2.6.32-358.6.2.el6_lustre.g230b174.x86_64 #1
<4>Call Trace:
<4> [<ffffffff8150d878>] ? panic+0xa7/0x16f
<4> [<ffffffffa0486eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
<4> [<ffffffffa0b08ff3>] ? ll_get_parent+0x7b3/0x820 [lustre]
<4> [<ffffffffa01223b0>] ? reconnect_path+0x160/0x310 [exportfs]
<4> [<ffffffffa01227aa>] ? exportfs_decode_fh+0xea/0x2bc [exportfs]
<4> [<ffffffffa043d810>] ? nfsd_acceptable+0x0/0x120 [nfsd]
<4> [<ffffffffa034d5b0>] ? cache_check+0x60/0x360 [sunrpc]
<4> [<ffffffffa0443702>] ? exp_find_key+0x62/0xb0 [nfsd]
<4> [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
<4> [<ffffffffa043dd0a>] ? fh_verify+0x32a/0x640 [nfsd]
<4> [<ffffffffa04489ac>] ? nfsd3_proc_getattr+0x6c/0xe0 [nfsd]
<4> [<ffffffffa043a43e>] ? nfsd_dispatch+0xfe/0x240 [nfsd]
<4> [<ffffffffa0343604>] ? svc_process_common+0x344/0x640 [sunrpc]
<4> [<ffffffff81063310>] ? default_wake_function+0x0/0x20
<4> [<ffffffffa0343c40>] ? svc_process+0x110/0x160 [sunrpc]
<4> [<ffffffffa043ab62>] ? nfsd+0xc2/0x160 [nfsd]
<4> [<ffffffffa043aaa0>] ? nfsd+0x0/0x160 [nfsd]
<4> [<ffffffff81096936>] ? kthread+0x96/0xa0
<4> [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
<4> [<ffffffff810968a0>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Comment by Andreas Dilger [ 02/Oct/13 ]

Looks like this is being hit in the case where an NFS file handle is being accessed for a disconnected directory dentry, and NFS wants to connect it to the dcache (in reconnect_path()). In ll_get_parent() the client sends an RPC to the MDS to do a lookup for ".." in the directory, but this doesn't return the expected result to the client. I can see how this might be a rare case, since it requires long-running NFS client processes that are doing a lookup on a directory that is not fully connected in the Lustre client cache (e.g. after the Lustre client remounted the filesystem).
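
For context on how nfsd reaches this code path: the Lustre client wires ll_get_parent() into the kernel's exportfs layer through an export_operations table, and reconnect_path() calls .get_parent repeatedly to walk a disconnected dentry up to an ancestor that is already in the dcache. The sketch below is quoted from memory of llite_nfs.c, so the exact struct and callback names should be treated as approximate:

/* Approximate sketch of the NFS export hooks registered by the Lustre
 * client (names from memory, not verified against this release). */
struct export_operations lustre_export_operations = {
        .get_parent   = ll_get_parent,
        .encode_fh    = ll_encode_fh,
        .get_name     = ll_get_name,
        .fh_to_dentry = ll_fh_to_dentry,
        .fh_to_parent = ll_fh_to_parent,
};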

Some strange things I found when looking in the code:

  • the Lustre client does not initialize op_valid in ll_get_parent(), which means mdc_getattr_name() calls mdc_pack_body() and sets body->valid = 0
  • in mdt_lookup_raw() if body->valid does not have OBD_MD_FLID set the whole function is skipped
  • the client is LASSERTing that the MDS_GETATTR_NAME reply has OBD_MD_FLID set (which should always be set for the getattr reply, we can't possibly get attributes for an object without knowing its FID)
  • IOC_MDC_LOOKUP sets OBD_MD_FLID itself before calling mdc_getattr_name()

I suspect that this is an "always fails" case, but we don't have a proper test for it, and it only happens in rare cases when NFS-exporting the filesystem from a client. Daire, could you confirm or deny whether this happens after the NFS-exporting client has reconnected to the MDS, either after it was restarted or after it was evicted by the MDS?

It has been a long time since I looked at the code in question, and the server-side code was reworked in the 2.4 release, so I don't know for sure that the above actually is the source of the defect, but it is a starting point.
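
To make the first bullet concrete, the client-side change implied above would be for ll_get_parent() to request OBD_MD_FLID explicitly before issuing the getattr-by-name RPC, so the MDT does not skip the raw ".." lookup. A rough excerpt of that idea follows; it is not the actual patch, and the helper names and signatures are quoted from memory of the 2.4 llite code, so treat them as assumptions:

/* Hypothetical excerpt only -- not the change under review. */
op_data = ll_prep_md_op_data(NULL, dir, NULL, dotdot, strlen(dotdot),
                             0, LUSTRE_OPC_ANY, NULL);
if (IS_ERR(op_data))
        RETURN((void *)op_data);

/* Without this the request's valid mask stays 0, the MDT skips the raw
 * ".." lookup, and the reply never carries OBD_MD_FLID. */
op_data->op_valid = OBD_MD_FLID;

rc = md_getattr_name(sbi->ll_md_exp, op_data, &req);
ll_finish_md_op_data(op_data);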

Comment by Daire Byrne (Inactive) [ 04/Oct/13 ]

Looking through the logs (client and MDS) it doesn't look like the client was evicted.

The only thing accessing the filesystem through NFS at the moment is a script that runs every 10 minutes and simply stats a bunch of directories. These directories are often deleted by another Lustre client, so it is possible a directory is removed at the same time.

Comment by Zhenyu Xu [ 09/Nov/13 ]

patch tracking at http://review.whamcloud.com/8459

Comment by Daire Byrne (Inactive) [ 19/Nov/13 ]

We have applied the patch to an NFS gateway that was consistently crashing 1-2 times a week. I will report back in a week or so.

Comment by Daire Byrne (Inactive) [ 20/Nov/13 ]

Well, the addition of that patch (which sets op_data->op_valid = OBD_MD_FLID) alone hasn't resolved the issue. I just got the same ASSERTION/LBUG. Even with abort_recov on the mountpoint, the client/NFS gateway kept crashing over and over. vmcores are available on request.

Comment by Zhenyu Xu [ 22/Nov/13 ]

Yes, please upload the client vmcore (plus supporting files, i.e. System.map and vmlinux) as well as the MDS Lustre logs.

Comment by Daire Byrne (Inactive) [ 22/Nov/13 ]

kdump vmcore:

https://dl.dropboxusercontent.com/u/24821368/llite_nfs_vmcore.tar.bz2

The System.map and vmlinuz are the ones from the Whamcloud 2.4.1 RPMs. I only recompiled the lustre-modules package with the patch and installed it. Unfortunately, I don't have any MDS debug logs to go with the client crash. As far as the MDS syslog is concerned, the client just timed out:

Nov 20 12:06:59 bmds1 kernel: Lustre: 10702:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1384949212/real 0]  req@ffff8823c5fbc000 x1450793259087600/t0(0) o104->bravo-MDT0000@10.21.20.161@tcp:15/16 lens 296/224 e 0 to 1 dl 1384949219 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
Nov 20 12:06:59 bmds1 kernel: Lustre: 10702:0:(client.c:1868:ptlrpc_expire_one_request()) Skipped 52 previous similar messages
Nov 20 12:06:59 bmds1 kernel: LustreError: 138-a: bravo-MDT0000: A client on nid 10.21.20.161@tcp was evicted due to a lock blocking callback time out: rc -107
Nov 20 12:08:11 bmds1 kernel: Lustre: 3360:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1384949280/real 1384949280]  req@ffff881e91c8fc00 x1450793259664088/t0(0) o104->bravo-MDT0000@10.21.20.161@tcp:15/16 lens 296/224 e 0 to 1 dl 1384949291 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Nov 20 12:08:11 bmds1 kernel: LustreError: 138-a: bravo-MDT0000: A client on nid 10.21.20.161@tcp was evicted due to a lock blocking callback time out: rc -107
Nov 20 12:09:36 bmds1 kernel: Lustre: MGS: haven't heard from client 579e7714-62da-0239-1436-a6aaef6c3c2e (at 10.21.20.161@tcp) in 228 seconds. I think it's dead, and I am evicting it. exp ffff88023ee9f400, cur 1384949376 expire 1384949226 last 1384949148
Nov 20 12:09:36 bmds1 kernel: Lustre: Skipped 5 previous similar messages
Nov 20 12:11:30 bmds1 kernel: Lustre: MGS: haven't heard from client bfce87ed-5f06-a61e-66b2-29c25777910c (at 10.21.20.161@tcp) in 228 seconds. I think it's dead, and I am evicting it. exp ffff8840099f5400, cur 1384949490 expire 1384949340 last 1384949262
Nov 20 12:12:55 bmds1 rshd[15712]: root@tidworth.dhcp.dneg.com as root: cmd='mount | egrep -w -q tidworth && echo 1'
Nov 20 12:12:59 bmds1 kernel: Lustre: MGS: haven't heard from client e329d5cb-9323-ba9b-75eb-08e4bb28f5d6 (at 10.21.20.161@tcp) in 228 seconds. I think it's dead, and I am evicting it. exp ffff8821e9671000, cur 1384949579 expire 1384949429 last 1384949351
Nov 20 12:15:06 bmds1 kernel: Lustre: bravo-MDT0000: haven't heard from client 859d5542-b1cd-fc79-f151-e2a1d616d3b6 (at 10.21.20.161@tcp) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88222b768000, cur 1384949706 expire 1384949556 last 1384949479
Nov 20 12:15:06 bmds1 kernel: Lustre: Skipped 3 previous similar messages
Nov 20 12:17:03 bmds1 kernel: Lustre: MGS: haven't heard from client c7f8980e-3768-eb52-c409-dd0e8c0262b8 (at 10.21.20.161@tcp) in 229 seconds. I think it's dead, and I am evicting it. exp ffff882143efd000, cur 1384949823 expire 1384949673 last 1384949594
Nov 20 12:17:03 bmds1 kernel: Lustre: Skipped 3 previous similar messages
Nov 20 12:17:59 bmds1 kernel: Lustre: 30381:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1384949868/real 1384949868]  req@ffff884009987800 x1450793263165844/t0(0) o104->bravo-MDT0000@10.21.20.161@tcp:15/16 lens 296/224 e 0 to 1 dl 1384949879 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Nov 20 12:17:59 bmds1 kernel: LustreError: 138-a: bravo-MDT0000: A client on nid 10.21.20.161@tcp was evicted due to a lock blocking callback time out: rc -107
Nov 20 12:19:17 bmds1 kernel: Lustre: MGS: haven't heard from client 4414062d-c26f-f3be-2991-56b4465da4d2 (at 10.21.20.161@tcp) in 230 seconds. I think it's dead, and I am evicting it. exp ffff88222c64d000, cur 1384949957 expire 1384949807 last 1384949727
Nov 20 12:19:17 bmds1 kernel: Lustre: Skipped 3 previous similar messages
Nov 20 12:22:18 bmds1 kernel: Lustre: bravo-MDT0000: haven't heard from client ab18aa85-db83-fe4a-ee80-c5b97ce1b739 (at 10.21.20.161@tcp) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8822262fd000, cur 1384950138 expire 1384949988 last 1384949911
Comment by Zhenyu Xu [ 02/Dec/13 ]

The Lustre client does not initialize op_valid in ll_get_parent(); mdc_pack_body() will only set it if the inode's FID is valid. Please refer to ll_prep_md_op_data(), which does:

{ ... op_data->op_fid1 = *ll_inode2fid(i1); }

Setting op_valid in ll_get_parent() asks the MDS to retrieve the parent's FID by name, regardless of whether the inode's FID is available.

If that still does not stop the LBUG, then something wrong on the MDS that makes it fail to find the parent's FID could be causing the issue.
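
To illustrate the second half of the comment above: if the MDS genuinely cannot find the parent FID (for example because the directory was unlinked by another client, as suggested by the rc = -2 message earlier in this ticket), one defensive option is to turn the LASSERT into an error return so nfsd gets a lookup failure instead of the whole client panicking. This is only a sketch of that idea, not a quote of the updated patch at http://review.whamcloud.com/8459; the helper and macro names follow the Lustre client code as remembered:

/* Sketch only -- not the actual patch.  Fail the export lookup gracefully
 * instead of asserting on the reply. */
body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
if (!(body->valid & OBD_MD_FLID)) {
        CERROR("parent FID for "DFID" is not packed in the MDS reply\n",
               PFID(ll_inode2fid(dir)));
        ptlrpc_req_finished(req);
        RETURN(ERR_PTR(-ENOENT));
}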

Comment by Zhenyu Xu [ 03/Dec/13 ]

patch updated at http://review.whamcloud.com/8459

Comment by Daire Byrne (Inactive) [ 05/Dec/13 ]

patch applied and so far so good.

Comment by Bob Glossman (Inactive) [ 02/Jan/14 ]

backport to b2_5
http://review.whamcloud.com/8706

Comment by Peter Jones [ 04/Jan/14 ]

Landed for 2.6

Comment by Eric Kolb [ 17/Jan/14 ]

What are the options for 2.4.2? We seem to be suffering from this bug but wish to stay on the stable release.

Comment by Shuichi Ihara (Inactive) [ 02/Feb/14 ]

We hit the same problem on 2.4.2 and have just posted a backport patch for b2_4: http://review.whamcloud.com/9092. Please review.

Comment by Karsten Weiss [ 05/Mar/14 ]

Shuichi, we have been testing your 2.4.2 backport since 2014-02-21, and so far it seems to have fixed this problem for us. The fix should probably be applied to b2_4, too.

Comment by Stephen Champion [ 12/Jun/14 ]

backport to b1_8
http://review.whamcloud.com/#/c/10691/

Comment by Lawrence Wright [ 29/Jul/14 ]

Could you confirm whether the backported 2.4 fix ever made it into 2.4.3? We've just seen an instance of this on a 2.4.3-based NFS gateway. Thanks!

Comment by Zhenyu Xu [ 29/Jul/14 ]

No, it hasn't landed on b2_4 yet (nor in 2.4.3).
