[LU-4856] osc_lru_reserve()) ASSERTION( atomic_read(cli->cl_lru_left) >= 0 ) failed Created: 03/Apr/14 Updated: 01/Oct/15 Resolved: 30/Sep/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.0, Lustre 2.6.0, Lustre 2.4.2 |
| Fix Version/s: | Lustre 2.7.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Stephen Champion | Assignee: | Jian Yu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 13394 | ||||
| Description |
|
The atomic_t used to count LRU entries is overflowing on systems with large memory configurations:

LustreError: 22141:0:(osc_page.c:892:osc_lru_reserve()) ASSERTION(atomic_read(cli->cl_lru_left) >= 0 ) failed:
PID: 54214 TASK: ffff88fdef4e4100 CPU: 40 COMMAND: "cat"

In this case, the atomic_t (signed int) held:

We've triggered this specific problem with configurations down to 11TB of physmem. A 10.5TB system can cat a small file without crashing.

I noticed several other cases where page counts are handled using a signed int, and suspect anything more than 4TB is problematic. The kernel itself is consistently using unsigned long for page counts on all architectures. |
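To make the overflow concrete, here is a minimal user-space sketch of the page-count arithmetic. It assumes 4 KiB pages, a 32-bit `int`, and reads "TB" as 2^40 bytes; the helper names are hypothetical and are not Lustre code. How `cl_lru_left` is actually initialized from the physical page count is not shown in this ticket, so the exact memory size at which the assertion fires may differ from the thresholds below.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on typical x86_64 kernels. */
#define PAGE_SHIFT_ASSUMED 12

/* Number of pages in `bytes` of physical memory, kept in 64 bits. */
static uint64_t pages_for_bytes(uint64_t bytes)
{
	return bytes >> PAGE_SHIFT_ASSUMED;
}

/* Returns 1 if the page count is representable in a signed int, else 0. */
static int fits_in_int(uint64_t pages)
{
	return pages <= (uint64_t)INT_MAX;
}

static void demo_page_counts(void)
{
	const uint64_t tib = 1ULL << 40;

	/* 4 TiB -> 2^30 pages: still representable in a signed int. */
	assert(fits_in_int(pages_for_bytes(4 * tib)));
	/* 11 TiB -> 2,952,790,016 pages: exceeds INT_MAX (2,147,483,647). */
	assert(!fits_in_int(pages_for_bytes(11 * tib)));
}
```

A counter stored as `unsigned long` (64 bits on x86_64) holds either value without trouble, which is why the kernel's own page counts do not hit this limit.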
| Comments |
| Comment by Jinshan Xiong (Inactive) [ 03/Apr/14 ] |
|
we can use atomic64_t instead. |
| Comment by Stephen Champion [ 10/Apr/14 ] |
|
I've been digging at this, trying to identify the changes required. To support large memory systems, all global accounting of pages needs to be done with 64-bit types. Just tracing usage of cfs_num_physpages (which cl_lru_left is derived from), the problem snowballs quickly, and affects almost every subsystem in Lustre. Some casting will be required, but it should not be a problem to use 32-bit counters for page vectors. I doubt any networks support 8 TB transactions yet. |
| Comment by Stephen Champion [ 07/May/14 ] |
|
I have been working on a patch against master to address easily identified overflow hazards. This cascaded into lock management as well. I am about to give it a whirl on internal systems to make sure I didn't break anything, then allocate time on a system with 16T of memory to make sure it addresses the problem. I won't be able to run acceptance on the large system anytime soon, but will do some basic functionality testing. I will also need to clean up the patch to meet coding standards. |
| Comment by Stephen Champion [ 31/May/14 ] |
| Comment by Stephen Champion [ 03/Jun/14 ] |
|
I set up an i686 build environment and worked through the initial errors. The kernel does not implement atomic64_add_unless on this arch, so I'll have to find a way around this problem. I will push the updated patch for feedback, but there will certainly be additional revisions, possibly major. |
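One common way to get "add unless" semantics when no dedicated primitive exists is a compare-and-swap loop. The sketch below uses user-space C11 atomics purely for illustration; `add_unless_64` is a hypothetical name, and the ticket does not say whether the patch resolved the i686 build problem this way.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Atomically add `a` to `*v` unless `*v` currently equals `u`.
 * Returns true if the addition was performed.
 */
static bool add_unless_64(_Atomic int64_t *v, int64_t a, int64_t u)
{
	int64_t old = atomic_load(v);

	while (old != u) {
		/* On failure, `old` is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			return true;
	}
	return false;
}

static void demo_add_unless(void)
{
	_Atomic int64_t counter = 5;
	_Atomic int64_t empty = 0;

	/* Counter is not 0, so the add goes through. */
	assert(add_unless_64(&counter, -1, 0));
	assert(atomic_load(&counter) == 4);

	/* Counter is already 0, so it is left untouched. */
	assert(!add_unless_64(&empty, -1, 0));
	assert(atomic_load(&empty) == 0);
}
```

On 32-bit user-space targets, 64-bit C11 atomics may need to be linked against libatomic; kernel code would use the kernel's own atomic primitives rather than `<stdatomic.h>`.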
| Comment by John Fuchs-Chesney (Inactive) [ 25/Jul/14 ] |
|
Hello Stephen, Many thanks, |
| Comment by Stephen Champion [ 25/Jul/14 ] |
|
Yes please. The patch needs to have i686 build problems addressed, and I need to sync up with everyone who offered comments. I expect to get back to it during the week of Aug 5. |
| Comment by Stephen Champion [ 27/Aug/14 ] |
|
I pushed a new revision of the patch this morning. I expected tests to start automatically - do I need to add Test-Parameters? I will be testing on my own x86_64 / IB test environment today, but do not have a means to test i686. |
| Comment by Peter Jones [ 27/Aug/14 ] |
|
Hi Steve It's started testing now. Just higher than usual load on the test system. Peter |
| Comment by Stephen Champion [ 04/Sep/14 ] |
|
Revision 4 of http://review.whamcloud.com/#/c/10537/ is tested, working.
|
| Comment by Stephen Champion [ 10/Sep/14 ] |
|
http://review.whamcloud.com/#/c/10537/5 is confirmed as resolving this problem on a 32TB system. |
| Comment by John Fuchs-Chesney (Inactive) [ 16/Sep/14 ] |
|
Stephen, Thanks, |
| Comment by Stephen Champion [ 16/Sep/14 ] |
|
In the middle of it right now. Had to rebase to master again. I am hesitant to simply #define the lprocfs_.._long functions to _u64 functions, as sign conversion hazards might catch unsuspecting users. Seems like a great way to introduce very obscure bugs. I think I can eliminate the introduction of the long function by using the _64 functions in the cases my patch was using them. |
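The sign conversion hazard mentioned above can be illustrated in a few lines. This is a generic sketch with hypothetical helper names, not the lprocfs code: it shows what happens when a signed `long` is passed through an interface that only understands unsigned 64-bit values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical unchecked path: the value is reinterpreted as unsigned. */
static uint64_t as_u64(long val)
{
	/* Conversion is modulo 2^64, so -1 becomes UINT64_MAX. */
	return (uint64_t)val;
}

/* Hypothetical guarded path: reject negatives instead of wrapping. */
static bool as_u64_checked(long val, uint64_t *out)
{
	if (val < 0)
		return false;
	*out = (uint64_t)val;
	return true;
}

static void demo_sign_conversion(void)
{
	uint64_t out = 0;

	/* A caller passing -1 silently ends up with a huge count. */
	assert(as_u64(-1L) == UINT64_MAX);

	/* The guarded variant surfaces the problem to the caller. */
	assert(!as_u64_checked(-1L, &out));
	assert(as_u64_checked(42L, &out) && out == 42);
}
```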
| Comment by Stephen Champion [ 18/Sep/14 ] |
|
http://review.whamcloud.com/#/c/10537/7 eliminates the lprocfs_..long functions entirely. |
| Comment by John Fuchs-Chesney (Inactive) [ 18/Sep/14 ] |
|
Made it through autotest. |
| Comment by Peter Jones [ 30/Sep/14 ] |
|
Landed for 2.7 |
| Comment by Jian Yu [ 10/Oct/14 ] |
|
Here is the patch for master branch to resolve the issue that if "val" is larger than 2^32 on a 32-bit system, the code in proc_max_dirty_pages_in_mb() may truncate "val" when assigning it to obd_max_dirty_pages: http://review.whamcloud.com/12269/ |
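The truncation described above is easy to reproduce in isolation. The sketch below uses `uint32_t` as a stand-in for a 32-bit `unsigned long`; that type choice and the helper names are assumptions for illustration, and this is not the code from the linked patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a 32-bit `unsigned long`; an assumption for this sketch. */
typedef uint32_t narrow_pages_t;

/* Unchecked narrowing: the upper 32 bits are silently dropped. */
static narrow_pages_t assign_truncating(uint64_t val)
{
	return (narrow_pages_t)val;
}

/* Checked narrowing: refuse values that do not fit. */
static bool assign_checked(uint64_t val, narrow_pages_t *dst)
{
	if (val > UINT32_MAX)
		return false;
	*dst = (narrow_pages_t)val;
	return true;
}

static void demo_truncation(void)
{
	narrow_pages_t pages = 0;
	const uint64_t big = (1ULL << 32) + 5;

	/* 2^32 + 5 collapses to 5 when stored in 32 bits. */
	assert(assign_truncating(big) == 5);

	/* The checked variant rejects it and leaves the target alone. */
	assert(!assign_checked(big, &pages));
	assert(pages == 0);
	assert(assign_checked(1024, &pages) && pages == 1024);
}
```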
| Comment by Manish Patel (Inactive) [ 10/Jul/15 ] |
|
Hi,

We are seeing the same issue with SLES 11 SP3. We are using the Lustre 2.4.3 client, installed on a 2048-core SGI UV1000 running:

cat /etc/SuSE-release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 3

hungabee:~ # lsb_release -a
LSB Version: core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86_64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0-amd64:desktop-4.0-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch
Distributor ID: SUSE LINUX
Description: SUSE Linux Enterprise Server 11 (x86_64)
Release: 11
Codename: n/a

hungabee:~ # rpm -qa | egrep "(lustre|ofed)"
lustre-client-modules-2.4.3-3.0.101_0.29_default
ofed-doc-1.5.4.1-0.11.5
ofed-1.5.4.1-0.11.5
ofed-kmp-trace-1.5.4.1_3.0.76_0.11-0.11.5
ofed-kmp-default-1.5.4.1_3.0.76_0.11-0.11.5
lustre-client-2.4.3-3.0.101_0.29_default

Can we have a backport patch for the Lustre client v2.4.3? Is this patch included in the 2.5.x branches? If not, can we have a backport patch for the 2.5 branch too?

Thank You, |
| Comment by Gerrit Updater [ 01/Oct/15 ] |
|
Grégoire Pichon (gregoire.pichon@bull.net) uploaded a new patch: http://review.whamcloud.com/16697 |