[LU-4400] Another LBUG with NFS reexport mainline 3.12 client Created: 19/Dec/13  Updated: 31/Jul/14  Resolved: 31/Jul/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.1
Fix Version/s: None

Type: Bug
Priority: Major
Reporter: Roland Fehrenbacher
Assignee: Dmitry Eremin (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Environment:
  • Client mainline kernel 3.12.5 with patches (mentioned below) / lustre-utils 2.4.0
  • Servers lustre 2.4.0/ZFS OSDs
  • ko2iblnd
  • reexport works with 2.6.32/2.4.0 client

Attachments: File 0001-QL-Include-lustre-b2_5-patch-6784a15c6ee32019237151a.patch    
Issue Links:
Related
is related to LU-3486 LBUG when exporting Lustre 2.4 via NF... Resolved
is related to LU-4416 support for 3.12 linux kernel Resolved
Severity: 4
Rank (Obsolete): 12080

 Description   

This is a follow-up to https://jira.hpdd.intel.com/browse/LU-4231. With the patch mentioned in LU-4231 applied, I can
mount the NFS export, but as soon as I put some load on the NFS directory, I get the following LBUG
almost immediately:

LustreError: 8102:0:(dcache.c:223:ll_dops_init()) ASSERTION( de->d_op == &ll_d_ops ) failed:
[ 669.579419] LustreError: 8102:0:(dcache.c:223:ll_dops_init()) LBUG
[ 669.586643] Kernel panic - not syncing: LBUG
[ 669.590984] CPU: 1 PID: 8102 Comm: nfsd Tainted: P C O 3.12.5-ql-generic-14 #1
[ 669.599015] Hardware name: Supermicro X8DTT-H/X8DTT-H, BIOS 080016 03/08/2010
[ 669.606354] 0000000000000000 ffff88061064f9c8 ffffffff814fdd82 ffff880333c2dcf8
[ 669.613968] ffffffffa032c6bd ffff88061064fa48 ffffffff814fb31f ffffffff815ec722
[ 669.621568] ffffffff00000008 ffff88061064fa58 ffff88061064f9f8 ffffffffa09b0b60
[ 669.629169] Call Trace:
[ 669.631687] [<ffffffff814fdd82>] dump_stack+0x46/0x58
[ 669.636892] [<ffffffff814fb31f>] panic+0xb6/0x1c6
[ 669.641762] [<ffffffffa0310b50>] lbug_with_loc+0xb0/0xb0 [libcfs]
[ 669.648006] [<ffffffffa0956cc5>] ll_dops_init+0x215/0x3e0 [lustre]
[ 669.654337] [<ffffffff8114ea41>] ? d_obtain_alias+0x41/0x1b0
[ 669.660153] [<ffffffffa09870cb>] ll_iget_for_nfs+0xbb/0x200 [lustre]
[ 669.666664] [<ffffffffa09874e8>] ll_get_parent+0x2d8/0x540 [lustre]
[ 669.673091] [<ffffffffa027c458>] reconnect_path+0x118/0x290 [exportfs]
[ 669.679771] [<ffffffffa02834d0>] ? _fh_update.isra.9.part.10+0x50/0x50 [nfsd]
[ 669.687118] [<ffffffffa027c6b9>] exportfs_decode_fh+0xe9/0x300 [exportfs]
[ 669.694060] [<ffffffffa0289205>] ? exp_find+0x105/0x1c0 [nfsd]
[ 669.700044] [<ffffffff8108c8ab>] ? getboottime+0x2b/0x30
[ 669.705519] [<ffffffffa02838e6>] fh_verify+0x2f6/0x5a0 [nfsd]
[ 669.711415] [<ffffffff81486905>] ? svcauth_unix_set_client+0x4f5/0x5a0
[ 669.718101] [<ffffffffa0291912>] nfsd3_proc_getacl+0x62/0x1e0 [nfsd]
[ 669.724611] [<ffffffffa0280ca1>] nfsd_dispatch+0xa1/0x1b0 [nfsd]
[ 669.730774] [<ffffffff81481d3f>] svc_process_common+0x2ef/0x5a0
[ 669.736846] [<ffffffff8148233f>] svc_process+0xff/0x150
[ 669.742222] [<ffffffffa02806ef>] nfsd+0xbf/0x130 [nfsd]
[ 669.747598] [<ffffffffa0280630>] ? nfsd_destroy+0x80/0x80 [nfsd]
[ 669.753754] [<ffffffff8106013b>] kthread+0xbb/0xc0
[ 669.758692] [<ffffffff81060080>] ? kthread_freezable_should_stop+0x70/0x70
[ 669.765721] [<ffffffff8150a8bc>] ret_from_fork+0x7c/0xb0
[ 669.771187] [<ffffffff81060080>] ? kthread_freezable_should_stop+0x70/0x70



 Comments   
Comment by Roland Fehrenbacher [ 19/Dec/13 ]

I just saw that this is already fixed in master and b2_5 by commit 6784a15c6ee32019237151a63755103c68ff51dd.
I tested the mainline client with this patch and it seems to work (stress tested for 30 minutes without error).
Please apply this patch to b2_4 and the mainline client as well.

Comment by Roland Fehrenbacher [ 19/Dec/13 ]

Patch for mainline kernel.

Comment by Dmitry Eremin (Inactive) [ 23/Dec/13 ]

This is the same issue as LU-3486. The fix should be backported upstream.

Comment by James A Simmons [ 12/Jun/14 ]

This patch was merged upstream in commit 3ea8f3bcabe422c6b5778089ae0929c1028e58f8. This ticket can be closed.
