Details
-
Bug
-
Resolution: Duplicate
-
Critical
-
None
-
Lustre 2.4.3
-
None
-
2.5.54
-
3
-
12901
Description
The assertion failed on our system. The system are running Lustre-2.5.54, so patch from LU-3889 is already merged.
<3>LustreError: 109372:0:(file.c:3087:ll_inode_revalidate_fini()) home2: revalidate FID [0x2000048a2:0x17:0x0] error: rc = -116
<0>LustreError: 237806:0:(osc_lock.c:1204:osc_lock_enqueue()) ASSERTION( ols->ols_state == OLS_NEW ) failed: Impossible state: 6
<0>LustreError: 237806:0:(osc_lock.c:1204:osc_lock_enqueue()) LBUG
<4>Pid: 237806, comm: tail
<4>
<4>Call Trace:
<4> [<ffffffffa054c895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
<4> [<ffffffffa054ce97>] lbug_with_loc+0x47/0xb0 [libcfs]
<4> [<ffffffffa0c620f0>] ? osc_lock_enqueue+0x0/0x8c0 [osc]
<4> [<ffffffffa0c62820>] osc_lock_enqueue+0x730/0x8c0 [osc]
<4> [<ffffffffa069a337>] ? cl_lock_state_signal+0x87/0x160 [obdclass]
<4> [<ffffffffa069dcfc>] cl_enqueue_try+0xfc/0x300 [obdclass]
<4> [<ffffffffa0a5c42a>] lov_lock_enqueue+0x22a/0x850 [lov]
<4> [<ffffffffa069dcfc>] cl_enqueue_try+0xfc/0x300 [obdclass]
<4> [<ffffffffa069ef4f>] cl_enqueue_locked+0x6f/0x1f0 [obdclass]
<4> [<ffffffffa069fb9e>] cl_lock_request+0x7e/0x270 [obdclass]
<4> [<ffffffffa06a4b2c>] cl_io_lock+0x3cc/0x560 [obdclass]
<4> [<ffffffffa06a4d62>] cl_io_loop+0xa2/0x1b0 [obdclass]
<4> [<ffffffffa0addca6>] ll_file_io_generic+0x2b6/0x710 [lustre]
<4> [<ffffffffa0694ce9>] ? cl_env_get+0x29/0x350 [obdclass]
<4> [<ffffffffa0ade23f>] ll_file_aio_read+0x13f/0x2c0 [lustre]
<4> [<ffffffffa0ade6fc>] ll_file_read+0x16c/0x2a0 [lustre]
<4> [<ffffffff811816c5>] vfs_read+0xb5/0x1a0
<4> [<ffffffff81181801>] sys_read+0x51/0x90
<4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
<4>
<0>Kernel panic - not syncing: LBUG
<4>Pid: 237806, comm: tail Tainted: G W --------------- 2.6.32-358.el6.x86_64 #1
<4>Call Trace:
<4> [<ffffffff8150cfc8>] ? panic+0xa7/0x16f
<4> [<ffffffffa054ceeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
<4> [<ffffffffa0c620f0>] ? osc_lock_enqueue+0x0/0x8c0 [osc]
<4> [<ffffffffa0c62820>] ? osc_lock_enqueue+0x730/0x8c0 [osc]
<4> [<ffffffffa069a337>] ? cl_lock_state_signal+0x87/0x160 [obdclass]
<4> [<ffffffffa069dcfc>] ? cl_enqueue_try+0xfc/0x300 [obdclass]
<4> [<ffffffffa0a5c42a>] ? lov_lock_enqueue+0x22a/0x850 [lov]
<4> [<ffffffffa069dcfc>] ? cl_enqueue_try+0xfc/0x300 [obdclass]
<4> [<ffffffffa069ef4f>] ? cl_enqueue_locked+0x6f/0x1f0 [obdclass]
<4> [<ffffffffa069fb9e>] ? cl_lock_request+0x7e/0x270 [obdclass]
<4> [<ffffffffa06a4b2c>] ? cl_io_lock+0x3cc/0x560 [obdclass]
<4> [<ffffffffa06a4d62>] ? cl_io_loop+0xa2/0x1b0 [obdclass]
<4> [<ffffffffa0addca6>] ? ll_file_io_generic+0x2b6/0x710 [lustre]
<4> [<ffffffffa0694ce9>] ? cl_env_get+0x29/0x350 [obdclass]
<4> [<ffffffffa0ade23f>] ? ll_file_aio_read+0x13f/0x2c0 [lustre]
<4> [<ffffffffa0ade6fc>] ? ll_file_read+0x16c/0x2a0 [lustre]
<4> [<ffffffff811816c5>] ? vfs_read+0xb5/0x1a0
<4> [<ffffffff81181801>] ? sys_read+0x51/0x90
<4> [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
Attachments
Issue Links
- is duplicated by
-
LU-4443 Failure on test suite parallel-scale test_write_disjoint: ASSERTION( ols->ols_state == OLS_NEW ) failed
-
- Resolved
-
- is related to
-
LU-4591 Related cl_lock failures on master/2.5
-
- Resolved
-
- is related to
-
LU-5492 osc_lock.c:1165:osc_lock_enqueue()) ASSERTION( ols->ols_state == OLS_NEW ) failed: Impossible state: 6
-
- Resolved
-
We hit the same problem shortly after upgrading to 2.4.2 clients. Have not seen this on 2.4.1
crash> bt
PID: 18546 TASK: ffff8809f1aa8040 CPU: 0 COMMAND: "turbostream"
#0 [ffff8809f0cf98a0] machine_kexec at ffffffff81038f3b
#1 [ffff8809f0cf9900] crash_kexec at ffffffff810c5da2
#2 [ffff8809f0cf99d0] panic at ffffffff8152721a
#3 [ffff8809f0cf9a50] lbug_with_loc at ffffffffa0f47eeb [libcfs]
#4 [ffff8809f0cf9a70] lovsub_lock_state at ffffffffa14b7d51 [lov]
#5 [ffff8809f0cf9ac0] cl_lock_state_signal at ffffffffa10aedd8 [obdclass]
#6 [ffff8809f0cf9b00] cl_lock_state_set at ffffffffa10aef25 [obdclass]
#7 [ffff8809f0cf9b30] cl_enqueue_try at ffffffffa10b2c29 [obdclass]
#8 [ffff8809f0cf9b80] lov_lock_enqueue at ffffffffa14b00da [lov]
#9 [ffff8809f0cf9c20] cl_enqueue_try at ffffffffa10b2bdc [obdclass]
#10 [ffff8809f0cf9c70] cl_enqueue_locked at ffffffffa10b3fcf [obdclass]
#11 [ffff8809f0cf9cb0] cl_lock_request at ffffffffa10b4c3e [obdclass]
#12 [ffff8809f0cf9d10] cl_io_lock at ffffffffa10b9edc [obdclass]
#13 [ffff8809f0cf9d70] cl_io_loop at ffffffffa10ba112 [obdclass]
#14 [ffff8809f0cf9da0] ll_file_io_generic at ffffffffa152ca90 [lustre]
#15 [ffff8809f0cf9e20] ll_file_aio_write at ffffffffa152d9d2 [lustre]
#16 [ffff8809f0cf9e80] ll_file_write at ffffffffa152dcbc [lustre]
#17 [ffff8809f0cf9ef0] vfs_write at ffffffff81188f88
#18 [ffff8809f0cf9f30] sys_write at ffffffff81189881
#19 [ffff8809f0cf9f80] system_call_fastpath at ffffffff8100b072
RIP: 00007fdf7eeea6fd RSP: 00007fffad51c8b8 RFLAGS: 00000202
RAX: 0000000000000001 RBX: ffffffff8100b072 RCX: 0000000000000001
RDX: 00000000000009c4 RSI: 0000000008b32154 RDI: 000000000000001b
RBP: 0000000013252a00 R8: 00007fdf8069d6a8 R9: 00000000000009c4
R10: 00007fffad512d3c R11: 0000000000000293 R12: 000000004c00010d
R13: 00000000000009c4 R14: 0000000000e6d964 R15: 000000000000001b
ORIG_RAX: 0000000000000001 CS: 0033 SS: 002b
crash> sys
SYSTEM MAP: /boot/System.map-2.6.32-431.1.2.el6.hpcs.x86_64
DEBUG KERNEL: /usr/lib/debug/lib/modules/2.6.32-431.1.2.el6.hpcs.x86_64/vmlinux (2.6.32-431.1.2.el6.hpcs.x86_64)
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 12
DATE: Wed Mar 12 16:17:17 2014
UPTIME: 1 days, 23:37:03
LOAD AVERAGE: 1.99, 1.97, 1.91
TASKS: 449
NODENAME: tesla14
RELEASE: 2.6.32-431.1.2.el6.hpcs.x86_64
VERSION: #1 SMP Sun Dec 15 04:51:47 GMT 2013
MACHINE: x86_64 (2599 Mhz)
MEMORY: 64 GB
PANIC: "Kernel panic - not syncing: LBUG"