Details
Bug
Resolution: Fixed
Major
Lustre 2.4.0, Lustre 2.4.1
Cray Lustre 2.4 clients running on SLES11 SP2.
3
8552
Description
During a test run using Lustre 2.4, one of our clients hit the following LBUG.
[2013-05-29 18:39:47][c7-1c1s3n3]LustreError:16573:0:(osc_lock.c:1165:osc_lock_enqueue()) ASSERTION( ols->ols_state == OLS_NEW ) failed: Impossible state: 4
[2013-05-29 18:39:47][c7-1c1s3n3]LustreError: 16573:0:(osc_lock.c:1165:osc_lock_enqueue()) LBUG
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff81006451>] try_stack_unwind+0x161/0x1a0
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff81004ca9>] dump_trace+0x89/0x440
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa013a897>] libcfs_debug_dumpstack+0x57/0x80 [libcfs]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa013ade7>] lbug_with_loc+0x47/0xc0 [libcfs]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa0639f55>] osc_lock_enqueue+0x725/0x8b0 [osc]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa032d4eb>] cl_enqueue_try+0xfb/0x320 [obdclass]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa06ccded>] lov_lock_enqueue+0x1fd/0x880 [lov]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa032d4eb>] cl_enqueue_try+0xfb/0x320 [obdclass]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa032e3bf>] cl_enqueue_locked+0x7f/0x1f0 [obdclass]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa032efbe>] cl_lock_request+0x7e/0x270 [obdclass]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa0334274>] cl_io_lock+0x394/0x5c0 [obdclass]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa033453a>] cl_io_loop+0x9a/0x1a0 [obdclass]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa074b90f>] ll_file_io_generic+0x33f/0x5f0 [lustre]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa074c08b>] ll_file_aio_read+0x23b/0x290 [lustre]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa074d002>] ll_file_read+0x1f2/0x280 [lustre]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff81135548>] vfs_read+0xc8/0x180
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff8113b799>] kernel_read+0x49/0x60
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff8113b885>] prepare_binprm+0xd5/0x100
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff8113c670>] do_execve_common+0x1c0/0x300
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff8113c83f>] do_execve+0x3f/0x50
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff8100ae1e>] sys_execve+0x4e/0x80
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff81316a3c>] stub_execve+0x6c/0xc0
[2013-05-29 18:39:47][c7-1c1s3n3] [<0000000020176437>] 0x20176437
[2013-05-29 18:39:47][c7-1c1s3n3]Kernel panic - not syncing: LBUG
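For anyone comparing this trace against other reports, here is a minimal sketch of pulling the symbol and module out of each frame so the call path is easier to read. The regular expression and the `parse_frames` helper are illustrative assumptions based only on the line layout shown above; they are not part of Lustre or any Cray tooling, and frames without a bracketed module are simply labeled "kernel".

```python
import re

# Assumed frame layout, taken from the console log above:
# [timestamp][node] [<address>] symbol+0xoff/0xsize [module]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return (symbol, module) pairs for each resolvable stack frame.

    Frames with no bracketed module name are reported as "kernel".
    Lines that carry only a raw address (no symbol) are skipped.
    """
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("symbol"), match.group("module") or "kernel"))
    return frames


# A few frames copied from the trace in this ticket.
sample = """\
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa0639f55>] osc_lock_enqueue+0x725/0x8b0 [osc]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffffa032d4eb>] cl_enqueue_try+0xfb/0x320 [obdclass]
[2013-05-29 18:39:47][c7-1c1s3n3] [<ffffffff81135548>] vfs_read+0xc8/0x180
[2013-05-29 18:39:47][c7-1c1s3n3] [<0000000020176437>] 0x20176437
"""

for symbol, module in parse_frames(sample):
    print(f"{module:10s} {symbol}")
```

Read bottom-up, the full trace shows the read being issued from the execve path (`prepare_binprm` calling `kernel_read`) and reaching `osc_lock_enqueue` through the lov and obdclass lock layers.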
Patrick - I understand that. We hit the bug described in this ticket, not the bug caused by the fix for this ticket. We would like to cherry-pick the fix so that our clients do not hit this assertion, but if the patch needed to be reverted, it is not yet safe for me to do so.