[LU-3727] LBUG (llite_nfs.c:281:ll_get_parent()) ASSERTION(body->valid & OBD_MD_FLID) failed Created: 08/Aug/13 Updated: 14/Jun/18 Resolved: 10/Feb/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.5, Lustre 1.8.9, Lustre 2.4.1 |
| Fix Version/s: | Lustre 2.7.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Oz Rentas | Assignee: | Emoly Liu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch |
| Attachments: |
|
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9597 |
| Description |
|
At GE Global Research, we ran into an LBUG with a 1.8.9 client that is re-exporting 2.1.5 Lustre: Jul 31 10:26:46 scinfra3 kernel: Installing knfsd (copyright (C) 1996 okir@monad.swb.de). It appears to be easily reproducible; we are going to try to get a core dump, but I was wondering if there is anything obvious from this trace or any other JIRA tickets I might have missed. Also, is there any other information that might be useful? Thanks.
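For context, the assertion fires in the client's NFS export code when it looks up '..' of an exported directory; roughly this path (a simplified sketch based on the 2.x llite_nfs.c - exact names differ between versions, and error handling/cleanup are omitted):

static struct dentry *ll_get_parent(struct dentry *dchild)
{
        struct ptlrpc_request *req = NULL;
        struct inode *dir = dchild->d_inode;
        struct md_op_data *op_data;
        struct mdt_body *body;
        static char dotdot[] = "..";
        int rc;

        /* ask the MDT to look up ".." inside the child directory */
        op_data = ll_prep_md_op_data(NULL, dir, NULL, dotdot, strlen(dotdot),
                                     0, LUSTRE_OPC_ANY, NULL);
        rc = md_getattr_name(ll_i2sbi(dir)->ll_md_exp, op_data, &req);
        if (rc)
                return ERR_PTR(rc);

        body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
        /* the LBUG: the RPC reported success but the reply carries no FID */
        LASSERT(body->valid & OBD_MD_FLID);
        /* ...then find or allocate the dentry for the parent FID... */
}

(The 1.8 client uses the older mdc API, but asserts the same OBD_MD_FLID flag, per the LBUG above.) |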
| Comments |
| Comment by Peter Jones [ 08/Aug/13 ] |
|
Thanks for the report, Kit! |
| Comment by Peter Jones [ 09/Aug/13 ] |
|
Emoly, what do you suggest here? Peter |
| Comment by Emoly Liu [ 12/Aug/13 ] |
|
Kit, could you please show how to reproduce this LBUG in detail? A core dump file would also be helpful. Thanks! |
| Comment by Kit Westneat (Inactive) [ 12/Aug/13 ] |
|
Hi Emoly, The customer is currently unable to reproduce after setting up kdump, so we are in a holding pattern. I will ask what they were doing to reproduce before. Thanks. |
| Comment by Kit Westneat (Inactive) [ 13/Aug/13 ] |
|
Hi Emoly, We were able to capture a crash dump from a different client node. Strangely, this client gets a null pointer dereference when trying to print the LBUG stack trace. Here is the vmcore: This is the kernel debuginfo for it:

crash> bt
PID: 23553 TASK: ffff881019a32080 CPU: 10 COMMAND: "nfsd"
#0 [ffff88100a241520] machine_kexec at ffffffff8103281b
#1 [ffff88100a241580] crash_kexec at ffffffff810ba792
#2 [ffff88100a241650] text_poke at ffffffff815016f0
#3 [ffff88100a241680] no_context at ffffffff81043bab
#4 [ffff88100a2416d0] __bad_area_nosemaphore at ffffffff81043e35
#5 [ffff88100a241720] bad_area_nosemaphore at ffffffff81043f03
#6 [ffff88100a241730] __do_page_fault at ffffffff81044661
#7 [ffff88100a241850] debugfs_kprobe_init at ffffffff815036ce
#8 [ffff88100a241880] do_debug at ffffffff81500a85
#9 [ffff88100a2419d8] libcfs_debug_dumpstack at ffffffffa01d78f5 [libcfs]
#10 [ffff88100a2419f8] lbug_with_loc at ffffffffa01d7f25 [libcfs]
#11 [ffff88100a241a48] libcfs_assertion_failed at ffffffffa01e0696 [libcfs]
#12 [ffff88100a241a98] ll_get_parent at ffffffffa08836f8 [lustre]
#13 [ffff88100a241b38] reconnect_path at ffffffffa021e3b0 [exportfs]
#14 [ffff88100a241ba8] exportfs_decode_fh at ffffffffa021e7aa [exportfs]
#15 [ffff88100a241d18] fh_verify at ffffffffa064abea [nfsd]
#16 [ffff88100a241da8] nfsd3_proc_getattr at ffffffffa0655b6c [nfsd]
#17 [ffff88100a241dd8] nfsd_dispatch at ffffffffa064743e [nfsd]
#18 [ffff88100a241e18] svc_process_common at ffffffffa05fb5d4 [sunrpc]
#19 [ffff88100a241e98] svc_process at ffffffffa05fbc10 [sunrpc]
#20 [ffff88100a241eb8] nfsd at ffffffffa0647b62 [nfsd]
#21 [ffff88100a241ee8] kthread at ffffffff81091d66
#22 [ffff88100a241f48] kernel_thread at ffffffff8100c14a
The customer uses NFS for staging files onto the clusters. So it should just be a lot of copies, removes, that sort of thing. There shouldn't be any jobs, but it's possible there are some desktop applications that run against it. It happens pretty frequently, though they don't have a reproducer. Let me know if there is any other data I can get. Thanks. |
| Comment by Li Xi (Inactive) [ 14/Aug/13 ] |
|
We hit the same problem. Here is how we reproduce it: After some investigation, we found that the cause is that the NFS daemon user does not have permission to access the ".." directory. We will upload the fix patch soon. |
| Comment by Li Xi (Inactive) [ 14/Aug/13 ] |
|
Here is the patch which tries to fix the problem. |
| Comment by Emoly Liu [ 14/Aug/13 ] |
|
Thanks for your patch! |
| Comment by Li Xi (Inactive) [ 16/Aug/13 ] |
|
Here is a patch which tries to fix the problem in another way: The previous patch (http://review.whamcloud.com/7327) fixes the problem by skipping the permission check on the MDT. This new patch avoids the permission error by pretending the client is sending the RPC as the root user. It is useful for us because it is a client-side-only patch, and it is difficult for us to get downtime at the customer site. I am not sure which patch is better. Maybe we should think more about the security implications?
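The client-side idea is roughly the following (a sketch of the approach, not the literal patch; the op_data field names are from the 2.x client tree):

        /* in the ll_get_parent() path: issue the ".." getattr as root so
         * the MDT-side permission check cannot fail */
        op_data = ll_prep_md_op_data(NULL, dir, NULL, dotdot, strlen(dotdot),
                                     0, LUSTRE_OPC_ANY, NULL);
        op_data->op_fsuid = 0;  /* pretend the RPC comes from root */
        op_data->op_fsgid = 0;
        rc = md_getattr_name(ll_i2sbi(dir)->ll_md_exp, op_data, &req);

This only changes the credentials the client claims for this one RPC, which is why it may raise security concerns. |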
| Comment by nasf (Inactive) [ 18/Aug/13 ] |
|
I am not sure whether I understand your case correctly or not. But consider the following case: 1) root user "mkdir /tmp/test" The result is "ls: cannot open directory .: Permission denied". Back to the |
| Comment by Li Xi (Inactive) [ 18/Aug/13 ] |
|
Sorry, maybe my former explanation was not accurate. Here is how to reproduce the problem. Please note that at step 7) we are under "/mnt/lustre/export" and have permission for that operation. |
| Comment by Shuichi Ihara (Inactive) [ 26/Aug/13 ] |
|
Hi Fan Yong and Emoly, would you please review Li's patch quickly? Thanks! |
| Comment by nasf (Inactive) [ 26/Aug/13 ] |
|
I think the patch (http://review.whamcloud.com/#/c/7349/) is something of a hack and will introduce a security hole. There may be UID/GID mapping on the MDT side, so even if you send the special case as the root user, it could still be mapped to another user in the future. It is up to the MDT to decide how to handle such a case. |
| Comment by Li Xi (Inactive) [ 26/Aug/13 ] |
|
Thank you very much, Fan Yong! I guess the first patch (http://review.whamcloud.com/7327) is more acceptable, right? |
| Comment by nasf (Inactive) [ 26/Aug/13 ] |
|
I think 7327 is better than 7349, although the former can still be improved. |
| Comment by Alexey Lyashkov [ 01/Oct/13 ] |
|
Li Xi, I have some questions about your script to reproduce the bug. If (2) is true, then the question is why 'md_getattr_name' doesn't return an error as expected. Could you attach a full Lustre debug log from that crash? I think you use a single-node configuration for the NFS server, so we will see all operations in the logs. |
| Comment by Li Xi (Inactive) [ 01/Oct/13 ] |
|
The trace output when this bug happens. Please note that this log was traced on Lustre 2.1, so no LBUG happens, but it is basically the same problem. |
| Comment by Li Xi (Inactive) [ 01/Oct/13 ] |
|
Hi Alexey, Yeah, the NFS client has permission to access '/mnt/lustre/export/' but no permission to access '/mnt/lustre/export/dir'. However, when the NFS daemon restarts (which is not common, and that is when ll_get_parent() is called), the NFS client needs to get the attributes of '/mnt/lustre/export/dir/../', which makes the Lustre client (the NFS server) hit the LBUG. Normally, the NFS client can access '/mnt/lustre/export/' without any problem even though it has no permission to access '/mnt/lustre/export/dir', but ll_get_parent() is different. Since there is no reason to require that a user doing 'ls -l /mnt/lustre/export/' has permission to access '/mnt/lustre/export/dir', I think the best way to fix this is to avoid the permission check on '/mnt/lustre/export/dir' in this case. I've posted the Lustre debug file as 'log.txt'. Sorry, I should have posted it earlier. |
| Comment by Patrick Farrell (Inactive) [ 23/Oct/13 ] |
|
We encountered this assertion while running a set of tests from the Linux Test Project over NFS. We did not do any unmounting/remounting of NFS or Lustre as part of this, but we are consistently able to hit the bug with this test. We hit it with both a SLES11SP1 Lustre client and a CentOS 6.4 Lustre client, in both cases against CentOS 6.4 servers running 2.4.1. I'll attach the source file for this particular test (the LTP is a GPL'ed project). The test set is called unlink08, and it is a series of tests of the unlink system call. We're planning to test the patch in http://review.whamcloud.com/#/c/7327/ today; I'll report back with results. |
| Comment by Patrick Farrell (Inactive) [ 23/Oct/13 ] |
|
With the patch from 7327 applied, the 'unlink08' test no longer hits this bug. |
| Comment by Alexey Lyashkov [ 23/Oct/13 ] |
|
[root@rhel6-64 WC-review]# gcc unlink08.c |
| Comment by Patrick Farrell (Inactive) [ 23/Oct/13 ] |
|
Sorry, Alexey - I hadn't intended that to be buildable/usable by itself; I just posted it for reference. It's part of LTP, which can be downloaded from here: You'll find it in testcases/kernel/syscalls/unlink/ in the untarred package, but you'll have to figure out how to build and run that specific test. |
| Comment by Alexey Lyashkov [ 23/Oct/13 ] |
|
I checked about two days ago with some older LTP code and was not able to replicate that assertion. After the hang finished:

[root@rhel6-64 ltp]# export TMPDIR=/mnt/lustre2/; /Users/shadow/work/lustre/work/ltp/testcases/kernel/syscalls/unlink/unlink08
unlink08    1  TPASS  :  unlink(<unwritable directory>) returned 0
unlink08    2  TPASS  :  unlink(<unsearchable directory>) returned 0
unlink08    3  TPASS  :  unlink(<directory>) Failed, errno=21

(Without fsid I have issues exporting a subdirectory, while the Lustre root dir exports fine.) PS: same for the latest LTP from git. |
| Comment by Patrick Farrell (Inactive) [ 23/Oct/13 ] |
|
Interesting. I confirmed that we were able to cause the assertion both by running only unlink08 and by running unlink08 as part of the larger test suite. By the way... |
| Comment by Alexey Lyashkov [ 23/Oct/13 ] |
|
The question about fsid is a long story. Lustre puts the native FID together with the parent FID into the NFS file handle; the FSID may be any number, and it is currently the same as the block device id (for compatibility with older servers in a pair). But the main question is why I am able to export /mnt/lustre without fsid set, while /mnt/lustre/export needs fsid. As for the assert: were you able to take a crashdump and extract the Lustre debug log from the core file?
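For reference, the handle payload on the client side is roughly (a sketch from the 2.x llite code; the 1.8 layout may differ in details):

/* what a Lustre client packs into the opaque NFS file handle */
struct lustre_nfs_fid {
        struct lu_fid lnf_child;   /* FID of the file/directory itself */
        struct lu_fid lnf_parent;  /* FID of its parent directory */
}; |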
| Comment by Patrick Farrell (Inactive) [ 23/Oct/13 ] |
|
Re FSID: Re the assertion: we could probably extract logs from the dump we have, but the logs only had the default debugging options, i.e.: If you're still interested, I could probably get those for you tomorrow. |
| Comment by Alexey Lyashkov [ 23/Oct/13 ] |
|
That's not enough. I tried to replicate with the v2_4_1 tag, but without success. |
| Comment by Alexey Lyashkov [ 23/Oct/13 ] |
|
Li, the attached log doesn't have any information about the RPCs and isn't complete - it has just the last step, where we hit the error. |
| Comment by Li Xi (Inactive) [ 24/Oct/13 ] |
|
I reproduced the problem with the following steps. Message from syslogd@server1 at Oct 24 12:08:53 ... Message from syslogd@server1 at Oct 24 12:08:53 ... |
| Comment by Li Xi (Inactive) [ 24/Oct/13 ] |
|
Alexey, 'lustre.log' is the trace log which I got when reproducing the problem. Hope it helps. |
| Comment by Alexey Lyashkov [ 24/Oct/13 ] |
|
Li, thanks! |
| Comment by Alexey Lyashkov [ 24/Oct/13 ] |
|
Li, thanks again. The devil is in the details: we need an additional directory created inside the exported dir. |
| Comment by Alexey Lyashkov [ 24/Oct/13 ] |
|
As I said before, the MDT generates the error:

00000004:00000001:1.0:1382635559.670116:0:15672:0:(mdd_permission.c:309:__mdd_permission_internal()) Process leaving (rc=18446744073709551603 : -13 : fffffffffffffff3)
00000004:00000001:1.0:1382635559.670117:0:15672:0:(mdd_dir.c:90:__mdd_lookup()) Process leaving (rc=18446744073709551603 : -13 : fffffffffffffff3)
00000004:00000001:1.0:1382635559.670117:0:15672:0:(mdd_dir.c:115:mdd_lookup()) Process leaving (rc=18446744073709551603 : -13 : fffffffffffffff3)

but that error isn't returned to the caller:

00000004:00000001:1.0:1382635559.670119:0:15672:0:(mdt_handler.c:1273:mdt_getattr_name_lock()) Process leaving (rc=0 : 0 : 0)

In that case the client "correctly" triggers the panic: the RPC reports no error, but the reply isn't filled in correctly. |
| Comment by Alexey Lyashkov [ 24/Oct/13 ] |
|
Li, could you look into the MDT code to verify why that error isn't returned correctly to the client? There is this disabled block in mdt_raw_lookup():

#if 0
        /* XXX is raw_lookup possible as intent operation? */
        if (rc != 0) {
                if (rc == -ENOENT)
                        mdt_set_disposition(info, ldlm_rep, DISP_LOOKUP_NEG);
                RETURN(rc);
        } else
                mdt_set_disposition(info, ldlm_rep, DISP_LOOKUP_POS);

        repbody = req_capsule_server_get(info->mti_pill, &RMF_MDT_BODY);
#endif

Or we need to replace the 'RETURN(1);' at the end of mdt_raw_lookup() with 'RETURN(rc);'.
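I.e., the tail of mdt_raw_lookup() would become something like this (a sketch of the idea only, not a tested patch; names are as in the 2.x mdt_handler.c, and the -ENOENT case may still need the old behaviour):

        struct md_object *next = mdt_object_child(info->mti_object);
        struct lu_fid *child_fid = &info->mti_tmp_fid1;
        struct mdt_body *repbody;
        int rc;

        rc = mdo_lookup(info->mti_env, next, lname, child_fid,
                        &info->mti_spec);
        if (rc == 0) {
                /* found: fill the reply so the client gets a valid FID */
                repbody = req_capsule_server_get(info->mti_pill,
                                                 &RMF_MDT_BODY);
                repbody->fid1 = *child_fid;
                repbody->valid = OBD_MD_FLID;
                RETURN(1);
        }
        RETURN(rc);     /* was RETURN(1): -EACCES etc. now reaches the client */

That way mdt_getattr_name_lock() would see the failure instead of sending back a success reply with an unfilled body. |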
| Comment by Patrick Farrell (Inactive) [ 24/Oct/13 ] |
|
At Alexey's request, we reproduced this. Here's the procedure from our test engineer:

2) Start nfsserver daemon on NFS server
3) Export nfs (sudo /usr/sbin/exportfs -i -o rw,insecure,no_root_squash,no_subtree_check,fsid=538 *:/extlus)
4) Mount NFS on client (sudo mount perses-esl3:/extlus /tmp/lus)
5) Run test using /tmp/lus

Attaching logs shortly. |
| Comment by Patrick Farrell (Inactive) [ 24/Oct/13 ] |
|
MDS log during the test. The client LBUGged doing the unlink08 test from LTP as described earlier. |
| Comment by Li Xi (Inactive) [ 25/Oct/13 ] |
|
Hi Alexey, I agree that mdt_raw_lookup() should not return 1 all the time. The following patch tries to fix that too. |
| Comment by Alexey Lyashkov [ 25/Oct/13 ] |
|
Hi Li, the main question is: do we need to set the intent disposition in the reply? Could you check how it is sent from the client - via mdc_intent_lock() or some other way? |
| Comment by Patrick Farrell (Inactive) [ 28/Oct/13 ] |
|
It might be worth noting that we hit this on 2.4.1. The ticket only lists 1.8.9/2.1.5. |
| Comment by Alexey Lyashkov [ 14/Nov/13 ] |
|
Any chance of an answer? |
| Comment by Li Xi (Inactive) [ 14/Nov/13 ] |
|
Hi Alexey, I am sorry; maybe because I lack some background knowledge, I don't understand the question well. Would you please explain it a little? And do you have any specific concerns about the patch? |
| Comment by Shuichi Ihara (Inactive) [ 14/Feb/14 ] |
|
Alexey, can you please describe your question in detail here? |
| Comment by Frederik Ferner (Inactive) [ 21/Aug/14 ] |
|
Looks like we've just hit this as well on an NFS server/Lustre client which is still running 1.8.9, after upgrading one file system to 2.5.2. We intend to upgrade the client to 2.5.2 as well ASAP but need to upgrade all file systems first. Is there any indication that this might be fixed in 2.5.2? |
| Comment by Patrick Farrell (Inactive) [ 22/Aug/14 ] |
|
Frederik - there's no movement towards a fix at the moment. If you're building your own Lustre, there's an option: Alexey and Oleg dislike http://review.whamcloud.com/#/c/7327/, but it does avoid the bug, and we've been running it at Cray for a while. |
| Comment by Li Xi (Inactive) [ 01/Dec/14 ] |
|
Patch of ' |
| Comment by Gerrit Updater [ 26/Dec/14 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/7327/ |
| Comment by Gerrit Updater [ 07/Jan/15 ] |
|
Lai Siyao (lai.siyao@intel.com) uploaded a new patch: http://review.whamcloud.com/13270 |
| Comment by Peter Jones [ 10/Feb/15 ] |
|
Landed for 2.7 |
| Comment by Gerrit Updater [ 20/Apr/15 ] |
|
Lai Siyao (lai.siyao@intel.com) uploaded a new patch: http://review.whamcloud.com/14498 |