[LU-185] LBUG: (cl_page.c:1362:cl_page_completion()) !(pg->cp_flags & CPF_READ_COMPLETED) ASSERTION(0) failed Created: 01/Apr/11 Updated: 17/May/11 Resolved: 04/May/11 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.0.0 |
| Fix Version/s: | Lustre 2.1.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Sebastien Buisson (Inactive) | Assignee: | Jinshan Xiong (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Bugzilla ID: | 19352 |
| Rank (Obsolete): | 5056 |
| Description |
|
Hi, CEA has 'special' client nodes dedicated to file exchange between two clusters. These nodes frequently crash with the following message in the syslog: LustreError: 8142:0:(osc_request.c:773:osc_announce_cached()) dirty 1807 - 1807 > system dirty_max 8650752
Analyzing the crash dump, we can see the following stack: In the crash dump we also see that the cl_page struct in question has only CPF_READ_COMPLETED set. Searching the Lustre Bugzilla database for similar issues, I found bug 19352. To me this is exactly the same bug, but a fix for that bug already landed in 2.0, and I have verified that our sources include it. At CEA, the problem seems to have started when the copy tool running on these nodes was modified to do O_DIRECT I/O. |
| Comments |
| Comment by Peter Jones [ 01/Apr/11 ] |
|
Jay, could you look at this one please? Thanks, Peter |
| Comment by Jinshan Xiong (Inactive) [ 01/Apr/11 ] |
|
Looking into this. |
| Comment by Jinshan Xiong (Inactive) [ 01/Apr/11 ] |
|
Does the customer do regular read/write I/O on the same file while doing direct I/O? |
| Comment by Jinshan Xiong (Inactive) [ 05/Apr/11 ] |
|
The patch is at: http://review.whamcloud.com/#change,404 The root cause of this problem is that the application is doing regular and direct I/O at the same time. This causes a readahead (ra) page to be submitted twice for read. |
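The failure mode described above can be illustrated with a toy model. This is only a sketch of why completing the same page twice trips a check shaped like the one in the LBUG title; it is not Lustre code. `ToyPage`, `complete_read`, `run_completions`, and the flag value are hypothetical names and values chosen for illustration.

```python
# Toy model only: names and the flag value are illustrative,
# not Lustre's real definitions.
CPF_READ_COMPLETED = 0x1  # assumed bit value for illustration


class ToyPage:
    """Stand-in for a cached page with a flags word."""

    def __init__(self, index):
        self.index = index
        self.flags = 0


def complete_read(page):
    """Mark a read as completed, refusing a page already marked completed."""
    if page.flags & CPF_READ_COMPLETED:
        raise AssertionError(f"page {page.index}: read already completed")
    page.flags |= CPF_READ_COMPLETED


def run_completions(queue):
    """Complete each queued read; return indices of pages that tripped the check."""
    tripped = []
    for page in queue:
        try:
            complete_read(page)
        except AssertionError:
            tripped.append(page.index)
    return tripped


if __name__ == "__main__":
    page = ToyPage(index=7)
    # The same page is queued twice, standing in for a double read submission.
    print(run_completions([page, page]))  # prints [7]
```

In this model the first completion sets the flag and the second one fails, which is consistent with the crash dump showing only CPF_READ_COMPLETED set on the page.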
| Comment by Build Master (Inactive) [ 05/Apr/11 ] |
|
Integrated in Jinshan Xiong : 84631daa279bc77599d6cd69eef413b50b64f5e8
|
| Comment by Build Master (Inactive) [ 06/Apr/11 ] |
|
Integrated in Jinshan Xiong : 3bab56b54457cbdf287ed89e07314fd988bdea4a
|
| Comment by Build Master (Inactive) [ 07/Apr/11 ] |
|
Integrated in Jinshan Xiong : 74029e4c5e8d0b212962a480727d52371260e527
|
| Comment by Peter Jones [ 21/Apr/11 ] |
|
Patch to be rolled into production at CEA next week |
| Comment by Peter Jones [ 26/Apr/11 ] |
|
As per Bull, this fix is now in production at CEA |
| Comment by Peter Jones [ 03/May/11 ] |
|
Update from Bull: no recurrences of this issue since the fix was rolled into production |
| Comment by Build Master (Inactive) [ 03/May/11 ] |
|
Integrated in Oleg Drokin : 119036444bc48381b2d5ca3333438000c409046a
|
| Comment by Peter Jones [ 04/May/11 ] |
|
Patch landed for 2.1. Please reopen if this issue recurs with the patch in place. |
| Comment by Sebastien Buisson (Inactive) [ 17/May/11 ] |
|
The customer has been testing a backport of this patch on 2.0.0.1 for several weeks and now considers the problem fixed. |