[LU-5471] Assertion at cl_page_assume for HSM Created: 11/Aug/14 Updated: 30/Jan/15 Resolved: 11/Aug/14 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Jinshan Xiong (Inactive) | Assignee: | Jinshan Xiong (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 15242 |
| Description |
|
The stack trace is as follows:

LustreError: 9876:0:(cl_page.c:698:cl_page_assume()) page@ffff88062850a7e0[2 ffff880bf2500e18 0 1834972009 1 ffff880633900065 0000000000000003 0x0]
LustreError: 9876:0:(cl_page.c:698:cl_page_assume()) vvp-page@ffff88062850a880(0:0:0) vm@ffffea00154cfbd8 40000000000801 4:0 ffff88062850a7e0 0 lru
LustreError: 9876:0:(cl_page.c:698:cl_page_assume()) lov-page@ffff88062850a8d8, raid0
LustreError: 9876:0:(cl_page.c:698:cl_page_assume()) osc-page@ffff88062851a700 0: 1< 0x845fed 0 1 - - > 2< 0 0 0 0x10000 0x100 | 0001000000000000 ffff880c090f3620 ffff88063267cea8 > 3< - (null) 0 18446462603027842665 0 > 4< 0 0 8 10481664 - | - - - - > 5< - - - - | 0 - | 0 - ->
LustreError: 9876:0:(cl_page.c:698:cl_page_assume()) end page@ffff88062850a7e0
LustreError: 9876:0:(cl_page.c:698:cl_page_assume()) pg->cp_owner == NULL
LustreError: 9876:0:(cl_page.c:698:cl_page_assume()) ASSERTION( 0 ) failed:
Aug 5 13:23:35 LustreError: 9876:0:(cl_page.c:698:cl_page_assume()) LBUG
Pid: 9876, comm: cat
Call Trace:
[<ffffffffa0839895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa0839e97>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa0999002>] cl_page_assume+0x212/0x220 [obdclass]
[<ffffffffa0f03f73>] ll_readpage+0x83/0x1a0 [lustre]
[<ffffffff81120f6c>] generic_file_aio_read+0x1fc/0x700
[<ffffffffa0f2a48f>] ? cl_glimpse_lock+0x1df/0x490 [lustre]
[<ffffffffa0f34f99>] vvp_io_read_start+0x259/0x470 [lustre]
[<ffffffffa09a199a>] cl_io_start+0x6a/0x140 [obdclass]
[<ffffffffa09a5b24>] cl_io_loop+0xb4/0x1b0 [obdclass]
[<ffffffffa0ed6f06>] ll_file_io_generic+0x2b6/0x710 [lustre]
[<ffffffffa0ed7faf>] ll_file_aio_read+0x13f/0x2c0 [lustre]
[<ffffffffa0ed829c>] ll_file_read+0x16c/0x2a0 [lustre]
[<ffffffff81189365>] vfs_read+0xb5/0x1a0
[<ffffffff811894a1>] sys_read+0x51/0x90
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

The root cause is that the page size is not adjusted after a file loses its layout; as a result, the page size keeps increasing and eventually overflows the u16 integer that records the current page size. The issue is easy to reproduce by releasing and restoring an HSM file in a loop. |
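The overflow described above can be illustrated with a small standalone sketch. This is not the Lustre code: `struct obj_header`, `page_bufsize`, the two size constants, and all function names are hypothetical stand-ins, chosen only to show how a 16-bit size that grows on every layout attach and is never shrunk on detach eventually wraps around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an object header that remembers the
 * per-page buffer size in a 16-bit field. */
struct obj_header {
    uint16_t page_bufsize;
};

#define BASE_SIZE  128u  /* assumed size of the fixed part of a page */
#define SLICE_SIZE 96u   /* assumed size a layout adds when attached */

static void header_init(struct obj_header *h)
{
    assert(h != NULL);
    h->page_bufsize = BASE_SIZE;
}

/* Attaching a layout (e.g. on restore) grows the recorded page size.
 * The cast makes the 16-bit wraparound explicit. */
static void layout_attach(struct obj_header *h)
{
    assert(h != NULL);
    h->page_bufsize = (uint16_t)(h->page_bufsize + SLICE_SIZE);
}

/* Buggy detach (e.g. on release): the size is left untouched. */
static void layout_detach_buggy(struct obj_header *h)
{
    assert(h != NULL);
}

/* Fixed detach: the size is adjusted back when the layout is lost. */
static void layout_detach_fixed(struct obj_header *h)
{
    assert(h != NULL);
    h->page_bufsize = BASE_SIZE;
}

/* Run attach/detach cycles and return the cycle on which the size
 * wrapped around, or 0 if it never wrapped within max_cycles. */
static unsigned cycles_until_wrap(void (*detach)(struct obj_header *),
                                  unsigned max_cycles)
{
    struct obj_header h;
    unsigned i;

    header_init(&h);
    for (i = 1; i <= max_cycles; i++) {
        uint16_t before = h.page_bufsize;

        layout_attach(&h);
        if (h.page_bufsize < before)
            return i;  /* the 16-bit field wrapped */
        detach(&h);
    }
    return 0;
}
```

With these assumed constants, calling `cycles_until_wrap(layout_detach_buggy, 100000)` reports a wrap after a few hundred cycles, while the same loop with `layout_detach_fixed` never wraps. That matches the reproduction pattern of repeated release and restore.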
| Comments |
| Comment by Andrew Moe [ 29/Jan/15 ] |
|
Jinshan, is there a patch associated with this fix? I have encountered this bug as well. Should I expect this to be fixed in a particular Lustre release? |
| Comment by Frank Zago (Inactive) [ 30/Jan/15 ] |
|
Andy, it's probably the same bug as |