[LU-6227] Master testing: (osc_request.c:1219:osc_brw_prep_request()) ASSERTION( i == 0 || pg->off > pg_prev->off) Created: 09/Feb/15 Updated: 01/Jul/16 Resolved: 07/Jul/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Patrick Farrell (Inactive) | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Environment: |
CentOS 6.5 servers & clients, current master (tag 2.6.94). |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 17428 | ||||||||
| Description |
|
Spawning multiple copies of diotest1 from LTP in the same directory causes the assertion from I was able to replicate with full debug enabled and will make the dump & KO files available momentarily. I'll also include the diotest1 binary, but note that it is unchanged from LTP. Stack trace: Given the fact that this problem has reoccurred, it seems sensible to add a test for this to the test suite. |
| Comments |
| Comment by Patrick Farrell (Inactive) [ 09/Feb/15 ] |
|
On ftp.whamcloud.com (uploading now): uploads/ Dump, log, vmlinux and ko files: Test script and test: |
| Comment by Patrick Farrell (Inactive) [ 09/Feb/15 ] |
|
Transfer died... Trying again: 150209_full_debug2.tar.gz |
| Comment by Patrick Farrell (Inactive) [ 09/Feb/15 ] |
|
I've also made it available for non-Intel folks on the Cray FTP. ftp.cray.com outbound/xyratex/lu-6227 |
| Comment by Alexander Boyko [ 10/Feb/15 ] |
|
Patrick, the test was added with patch - test_241 sanity.sh dio vs bio. And we saw regression for it without the fix. Yeap, the issue looks like |
| Comment by Andreas Dilger [ 24/Feb/15 ] |
|
It looks like the client is not invalidating the buffered page from cache before submitting the dio page. There are a couple of solutions to that. One us to actually drop the buffered page, the other is to disallow merging these into a single RPC. |
| Comment by Patrick Farrell (Inactive) [ 24/Feb/15 ] |
|
My understanding is that http://review.whamcloud.com/#/c/10930 took the second route - Disallowing merging - but it appears that it's not working any more. (It fixed the problem originally, at least in 2.5 for Cray. I'm not sure we tested it on master at the time.) |
| Comment by Alexander Boyko [ 18/Mar/15 ] |
|
Base on the assert message <0>LustreError: 7700:0:(osc_request.c:1219:osc_brw_prep_request()) ASSERTION( i == 0 || pg->off > pg_prev->off ) failed: i 3 p_c 10 pg ffffea00017a5208 [pri 0 ind 2771] off 16384 prev_pg ffffea00017a51d0 [pri 0 ind 2256] off 16384 The type of both pages is direct io, and they has the same offset. So this is not I have found the commit which introduce this regression. commit e6f592fc57b048223e2676408232b6662aad712d
Author: Prakash Surya <surya1@llnl.gov>
Date: Wed Oct 2 17:16:51 2013 -0700
LU-1669 vvp: Use lockless __generic_file_aio_write
Testing multi-threaded single shard file write performance has shown
the inode mutex to be a limiting factor when using the
generic_file_aio_write function. To work around this bottle neck, this
change replaces the locked version of that call with the lock less
version, specifically, __generic_file_aio_write.
|
| Comment by Patrick Farrell (Inactive) [ 18/Mar/15 ] |
|
Hmmm. Unfortunately, that makes some sense... The purpose of that patch was to allow multiple threads on a single client to write to one file at the same time, so there are some possible concurrency issues. I wonder if this issue is only in the context of direct IO... Obviously this crash is with direct IO pages, but I wonder about the underlying problem. |
| Comment by Oleg Drokin [ 24/Mar/15 ] |
|
I think the parllel io is still guarded with some sort of range lock, so should we better understand why did that not help? |
| Comment by Patrick Farrell (Inactive) [ 24/Mar/15 ] |
|
I looked at this the other day, and the range lock is taken in ll_file_io_generic, which is not used in the direct IO path. The two conflicting pages in this instance are created by calls in to ll_direct_IO_26, then the assert fires while pages are being written out after, somehow in the process of an ll_file_read call. The ll_file_read call uses the range lock, but two conflicting pages are already present at that time. The direct IO path does not use the range lock. Should it? I actually don't see how the patch called out by Alex Boyko above changes the behavior of the direct IO path in a way that would have prevented this issue - It only takes the inode mutex on the read path. What about the write path? (I'm not arguing the patch isn't source of the problem, just saying I can't see how. It seems there's something, perhaps several somethings, I don't understand here. |
| Comment by Patrick Farrell (Inactive) [ 07/Apr/15 ] |
|
Looking closer... 00000008:00040000:3.0:1423457026.801250:0:7700:0:(osc_request.c:1219:osc_brw_prep_request()) ASSERTION( i == 0 || pg->off > pg_prev->off ) failed: i 3 p_c 10 pg ffffea00017a5208 [pri 0 ind 2771] off 16384 prev_pg ffffea00017a51d0 [pri 0 ind 2256] off 16384 00000080:00000001:2.0:1423457026.799792:0:1562:0:(file.c:1285:ll_file_aio_read()) Process entered 00000080:00000001:3.0:1423457026.800382:0:1915:0:(file.c:1285:ll_file_aio_read()) Process entered Reads do not take the range lock, but for direct IO, this can mean they try to read the same page at the same time. This doesn't happen for non direct IO reads - Presumably this happens for direct IO because some of the other locking machinery is disabled/not used. I'll submit a patch to use the range lock for direct IO reads... But if someone else feels this concurrency should be handled at another layer, please let me know where and how. |
| Comment by Gerrit Updater [ 07/Apr/15 ] |
|
Patrick Farrell (paf@cray.com) uploaded a new patch: http://review.whamcloud.com/14385 |
| Comment by Patrick Farrell (Inactive) [ 07/Apr/15 ] |
|
The patch submitted above passes testing with Cray's reproducer for this bug. |
| Comment by Patrick Farrell (Inactive) [ 07/Apr/15 ] |
|
The bug was introduced specifically in this part of the change highlighted by Alexander: The comment there does not mention that DIO reads must also be kept separate from one another, so it appears safe to remove the concurrency control, as the range locking covers the cases described in that comment. |
| Comment by Jinshan Xiong (Inactive) [ 13/Apr/15 ] |
|
Hi Patrick, I didn't read all the comments above but the patch. Isn't it a better idea to take inode mutex in direct IO path? |
| Comment by Patrick Farrell (Inactive) [ 13/Apr/15 ] |
|
Jinshan - I think not, for the same reason we don't take it (any more) in the non-direct IO path. It allows parallelism between multiple threads doing direct IO. Why do you think we should take the mutex? Also, if the old comment removed by
|
| Comment by Gerrit Updater [ 28/Apr/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/14385/ |
| Comment by Peter Jones [ 07/Jul/15 ] |
|
Landed for 2.8 |
| Comment by James A Simmons [ 20/Jan/16 ] |
|
ORNL just ran into this for our 2.7 clients. Looks like a back port is needed. |
| Comment by Patrick Farrell (Inactive) [ 20/Jan/16 ] |
|
James, the affects version is 2.7.0. So, of course it's needed there. The patch shouldn't need any particular porting, though. |