[LU-3524] Lustre 2.1.3: lov_io.c:212:lov_sub_get()) ASSERTION( stripe < lio->lis_stripe_count ) failed Created: 28/Jun/13 Updated: 18/Sep/14 Resolved: 18/Sep/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Patrick Valentin (Inactive) | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Rank (Obsolete): | 8869 |
| Description |
|
At the TGCC site, which is currently running Lustre 2.1.3, the customer gets crashes from time to time with the following assertion:
LustreError: 23580:0:(lov_io.c:212:lov_sub_get()) ASSERTION( stripe < lio->lis_stripe_count ) failed:
LustreError: 23580:0:(lov_io.c:212:lov_sub_get()) LBUG
Pid: 23580, comm: IMB-IO
Call Trace:
[<ffffffffa034d7f5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa034de07>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa0917a8f>] lov_sub_get+0x47f/0x6f0 [lov]
[<ffffffffa0913ca2>] lov_sublock_env_get+0xd2/0x140 [lov]
[<ffffffffa0914e61>] lov_sublock_alloc+0xf1/0x470 [lov]
[<ffffffffa09162fc>] lov_lock_init_raid0+0x3dc/0xe30 [lov]
[<ffffffffa090eab4>] lov_lock_init+0x54/0xe0 [lov]
[<ffffffffa049215c>] cl_lock_hold_mutex+0x37c/0x6b0 [obdclass]
[<ffffffffa04925ee>] cl_lock_request+0x5e/0x1c0 [obdclass]
[<ffffffffa09ee9bf>] cl_glimpse_lock+0x16f/0x410 [lustre]
[<ffffffffa09f2f0a>] ccc_prep_size+0x10a/0x290 [lustre]
[<ffffffffa09f8425>] vvp_io_read_start+0xb5/0x3e0 [lustre]
[<ffffffffa04938da>] cl_io_start+0x6a/0x140 [obdclass]
[<ffffffffa0497bbc>] cl_io_loop+0xcc/0x190 [obdclass]
[<ffffffffa09a7f07>] ll_file_io_generic+0x3a7/0x560 [lustre]
[<ffffffffa09a81f9>] ll_file_aio_read+0x139/0x2c0 [lustre]
[<ffffffffa09a86b9>] ll_file_read+0x169/0x2a0 [lustre]
[<ffffffff81163a15>] vfs_read+0xb5/0x1a0
[<ffffffff81163b51>] sys_read+0x51/0x90
[<ffffffff81487d7e>] ? do_device_not_available+0xe/0x10
[<ffffffff810030f2>] system_call_fastpath+0x16/0x1b
After some investigation, it seems to be |
| Comments |
| Comment by Peter Jones [ 28/Jun/13 ] |
|
Bruno is looking into this one |
| Comment by Bruno Faccini (Inactive) [ 28/Jun/13 ] |
|
Patrick, |
| Comment by Lustre Bull [ 28/Jun/13 ] |
|
Hi Bruno, I don't have any more information about this LBUG. I am forwarding your questions to the Bull support team to get more details. |
| Comment by Alexandre Louvet [ 01/Jul/13 ] |
|
I guess it is a standard IMB-IO, but with a Lustre-aware MPI-IO library. I have asked the end user to provide finer details and will keep you updated. Alex. |
| Comment by Bruno Faccini (Inactive) [ 04/Jul/13 ] |
|
On my side, in the meantime, I am investigating patches from |
| Comment by Bruno Faccini (Inactive) [ 12/Jul/13 ] |
|
To help me dig deeper into this issue, would it be possible to get the full stacks out of the crash dump? And maybe more, such as the relevant data structures, if I ask you later? |
| Comment by Sebastien Buisson (Inactive) [ 18/Sep/14 ] |
|
As we are unable to provide the requested information, this ticket can be closed. Thank you, |
| Comment by Peter Jones [ 18/Sep/14 ] |
|
ok thanks Sebastien |